* Expanding instructions with condition codes inter-deps @ 2011-10-17 15:58 Paulo J. Matos 2011-10-17 17:23 ` Andrew Pinski 2011-10-19 5:22 ` Richard Henderson 0 siblings, 2 replies; 14+ messages in thread From: Paulo J. Matos @ 2011-10-17 15:58 UTC (permalink / raw) To: gcc Hi, To negate a double word (HImode) register, I used to take the instruction all the way to assembly generation and then expand into three assembly instructions like so: xor %t0, #ffff ; invert bits in top word of op0 nadd %b0, #0 ; negate bottom bits of op0 addc %t0, #0 ; add carry to top bits of op0 The interesting thing about this sequence is that the carry added to the top bits of op0, is the carry generated by the previous instruction. If I instead of taking the neghi rule all the way to assembly, and instead expand it I get: (define_expand "neghi2" [(set (match_operand:HI 0 "register_operand") (neg:HI (match_operand:HI 1 "general_operand"))) (clobber (reg:CC RCC))] "" { rtx op0_low = gen_lowpart(QImode, operands[0]); rtx op0_high = gen_highpart(QImode, operands[0]); emit_move_insn(operands[0], operands[1]); emit_insn(gen_xorqi3(op0_high, op0_high, GEN_INT(-1))); emit_insn(gen_negqi2(op0_low, op0_low)); emit_insn(gen_addc_internal(op0_high, op0_high, const0_rtx)); DONE; }) addc_internal looks like: (define_insn "addc_internal" [(set (match_operand:QI 0 "nonimmediate_operand" "=c") (plus:QI (plus:QI (ltu:QI (reg:CC RCC) (const_int 0)) (match_operand:QI 1 "nonimmediate_operand" "%0")) (match_operand:QI 2 "general_operand" "cwmi"))) (clobber (reg:CC RCC))] "" "addc\\t%0,%f2") The problem is that GCC sometimes moves nadd and addc around. It might swap their order or insert instructions in-between that set the carry flag. This ends up generating code which won't work properly. What's the best way to instruct GCC during neghi2 expansion that the carry read by addc internal is the one generated by negqi2 and so no instructions setting carry should be inserted between them and that their order should not be changed? (for example, it would be ok to output negqi2, xorqi3 and addc_internal since xorqi3 only sets N and Z, not the Carry bit). Cheers, -- PMatos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-17 15:58 Expanding instructions with condition codes inter-deps Paulo J. Matos @ 2011-10-17 17:23 ` Andrew Pinski 2011-10-18 13:44 ` Paulo J. Matos 2011-10-19 5:22 ` Richard Henderson 1 sibling, 1 reply; 14+ messages in thread From: Andrew Pinski @ 2011-10-17 17:23 UTC (permalink / raw) To: Paulo J. Matos; +Cc: gcc On Mon, Oct 17, 2011 at 3:50 AM, Paulo J. Matos <paulo@matos-sorge.com> wrote: > addc_internal looks like: > (define_insn "addc_internal" > [(set (match_operand:QI 0 "nonimmediate_operand" "=c") > (plus:QI > (plus:QI > (ltu:QI (reg:CC RCC) (const_int 0)) > (match_operand:QI 1 "nonimmediate_operand" "%0")) > (match_operand:QI 2 "general_operand" "cwmi"))) > (clobber (reg:CC RCC))] > "" > "addc\\t%0,%f2") Try adding (use (reg:CC RCC)) to the pattern above. Thanks, Andrew Pinski > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-17 17:23 ` Andrew Pinski @ 2011-10-18 13:44 ` Paulo J. Matos 0 siblings, 0 replies; 14+ messages in thread From: Paulo J. Matos @ 2011-10-18 13:44 UTC (permalink / raw) To: gcc On 17/10/11 17:20, Andrew Pinski wrote: > On Mon, Oct 17, 2011 at 3:50 AM, Paulo J. Matos<paulo@matos-sorge.com> wrote: > >> addc_internal looks like: >> (define_insn "addc_internal" >> [(set (match_operand:QI 0 "nonimmediate_operand" "=c") >> (plus:QI >> (plus:QI >> (ltu:QI (reg:CC RCC) (const_int 0)) >> (match_operand:QI 1 "nonimmediate_operand" "%0")) >> (match_operand:QI 2 "general_operand" "cwmi"))) >> (clobber (reg:CC RCC))] >> "" >> "addc\\t%0,%f2") > > Try adding (use (reg:CC RCC)) to the pattern above. > Currently compiling that change into the code. I have a feeling that it will either confuse compare-elim.c or that it will be too strict and remove any possibility of optimization since with (use (reg:CC RCC)) we are not really saying that we are only interested in the carry flag instead of the whole thing. It will avoid anything clobbering RCC which is most of the instructions. I will give it a spin and see what actually happens. Thanks, -- PMatos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-17 15:58 Expanding instructions with condition codes inter-deps Paulo J. Matos 2011-10-17 17:23 ` Andrew Pinski @ 2011-10-19 5:22 ` Richard Henderson 2011-10-19 5:45 ` Paul_Koning 2011-10-20 12:46 ` Paulo J. Matos 1 sibling, 2 replies; 14+ messages in thread From: Richard Henderson @ 2011-10-19 5:22 UTC (permalink / raw) To: Paulo J. Matos; +Cc: gcc On 10/17/2011 03:50 AM, Paulo J. Matos wrote: > Hi, > > To negate a double word (HImode) register, I used to take the instruction all the way to assembly generation and then expand into three assembly instructions like so: > xor %t0, #ffff ; invert bits in top word of op0 > nadd %b0, #0 ; negate bottom bits of op0 > addc %t0, #0 ; add carry to top bits of op0 > > The interesting thing about this sequence is that the carry added to the top bits of op0, is the carry generated by the previous instruction. > > If I instead of taking the neghi rule all the way to assembly, and instead expand it I get: > (define_expand "neghi2" > [(set (match_operand:HI 0 "register_operand") > (neg:HI (match_operand:HI 1 "general_operand"))) > (clobber (reg:CC RCC))] > "" > { > rtx op0_low = gen_lowpart(QImode, operands[0]); > rtx op0_high = gen_highpart(QImode, operands[0]); > emit_move_insn(operands[0], operands[1]); > emit_insn(gen_xorqi3(op0_high, op0_high, GEN_INT(-1))); > emit_insn(gen_negqi2(op0_low, op0_low)); > emit_insn(gen_addc_internal(op0_high, op0_high, const0_rtx)); > DONE; > }) > > > addc_internal looks like: > (define_insn "addc_internal" > [(set (match_operand:QI 0 "nonimmediate_operand" "=c") > (plus:QI > (plus:QI > (ltu:QI (reg:CC RCC) (const_int 0)) > (match_operand:QI 1 "nonimmediate_operand" "%0")) > (match_operand:QI 2 "general_operand" "cwmi"))) > (clobber (reg:CC RCC))] > "" > "addc\\t%0,%f2") > > > The problem is that GCC sometimes moves nadd and addc around. The thing that's almost certainly missing is that the NAND pattern must SET your flags register, not simply clobber it. Otherwise the dependency between the ADDC and the NAND will never be created properly. > (for example, it would be ok to output negqi2, xorqi3 and > addc_internal since xorqi3 only sets N and Z, not the Carry bit) For that you'd have to model all of the flags bits independently. I don't believe any target has found that level of complexity to be worth the trouble. So, almost certainly, you don't. r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: Expanding instructions with condition codes inter-deps 2011-10-19 5:22 ` Richard Henderson @ 2011-10-19 5:45 ` Paul_Koning [not found] ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com> 2011-10-21 21:42 ` Paulo J. Matos 2011-10-20 12:46 ` Paulo J. Matos 1 sibling, 2 replies; 14+ messages in thread From: Paul_Koning @ 2011-10-19 5:45 UTC (permalink / raw) To: rth, paulo; +Cc: gcc >From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Richard Henderson >On 10/17/2011 03:50 AM, Paulo J. Matos wrote: >>... >> (for example, it would be ok to output negqi2, xorqi3 and >> addc_internal since xorqi3 only sets N and Z, not the Carry bit) > >For that you'd have to model all of the flags bits independently. >I don't believe any target has found that level of complexity to be worth the trouble. Something like that shows up in the pdp11, where "mov" does not touch C. And C matters for multi-word arithmetic, and also for unsigned compares. So I think a CCmode implementation there would model C separately from the other three flag bits. So not 4 separate elements but two. Right now it's a cc0 target but I figure on changing that at some point. The reasons Paulo mentioned are one of the main reasons for that. paul ^ permalink raw reply [flat|nested] 14+ messages in thread
[parent not found: <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com>]
* Re: Expanding instructions with condition codes inter-deps [not found] ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com> @ 2011-10-21 20:57 ` Richard Henderson 0 siblings, 0 replies; 14+ messages in thread From: Richard Henderson @ 2011-10-21 20:57 UTC (permalink / raw) To: Peter Bigot; +Cc: Paul_Koning, paulo, gcc On 10/21/2011 09:13 AM, Peter Bigot wrote: > Are there any existing machine descriptions that do model the carry > flag separately, specifically to model rotate operations that use the > carry flag as the destination and source for the shifted bit? Or is > the best I can do for that is to have the value be an unspec, with a > use and clobber of the CC register (which currently blows up in > simplify_subreg, but I'm still experimenting). I don't believe there are any extant with that behaviour. r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-19 5:45 ` Paul_Koning [not found] ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com> @ 2011-10-21 21:42 ` Paulo J. Matos 2011-10-22 0:13 ` Richard Henderson 1 sibling, 1 reply; 14+ messages in thread From: Paulo J. Matos @ 2011-10-21 21:42 UTC (permalink / raw) To: gcc On 19/10/11 01:48, Paul_Koning@Dell.com wrote: >> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Richard Henderson >> On 10/17/2011 03:50 AM, Paulo J. Matos wrote: >>> ... >>> (for example, it would be ok to output negqi2, xorqi3 and >>> addc_internal since xorqi3 only sets N and Z, not the Carry bit) >> >> For that you'd have to model all of the flags bits independently. >> I don't believe any target has found that level of complexity to be worth the trouble. > > Something like that shows up in the pdp11, where "mov" does not touch C. And C matters for multi-word arithmetic, and also for unsigned compares. So I think a CCmode implementation there would model C separately from the other three flag bits. So not 4 separate elements but two. Right now it's a cc0 target but I figure on changing that at some point. The reasons Paulo mentioned are one of the main reasons for that. > > paul > > I have several CC modes depending on which flags are set by each instruction: CC - all flags set CC_C - Carry flag set CC_NZ - Negative and Zero flags set For neghi2 mode which transforms into an xor, nadd and addc in my arch I need to establish that the addc adds the carry produced by the nadd. So I have implemented the nadd and addc as: (define_insn "negqi2" [(set (match_operand:QI 0 "register_operand" "=c") (neg:QI (match_operand:QI 1 "register_operand" "0"))) (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0))) (clobber (reg:CC RCC))] "" { operands[2] = const0_rtx; return "nadd\\t%0,%2"; }) (define_insn "*negqi2_flags" [(set (match_operand:QI 0 "register_operand" "=c") (neg:QI (match_operand:QI 1 "register_operand" "0"))) (set (reg RCC) (compare (neg:QI (match_dup 1)) (const_int 0)))] "reload_completed && xap_match_ccmode(insn, CCmode)" { operands[2] = const0_rtx; return "nadd\\t%0,%2"; }) (define_insn "addc_internal" [(set (match_operand:QI 0 "nonimmediate_operand" "=c") (plus:QI (plus:QI (ltu:QI (reg:CC RCC) (const_int 0)) (match_operand:QI 1 "nonimmediate_operand" "%0")) (match_operand:QI 2 "general_operand" "cwmi"))) (use (reg:CC_C RCC)) (clobber (reg:CC RCC))] "" "addc\\t%0,%f2") (define_insn "*addc_internal_flags" [(set (match_operand:QI 0 "nonimmediate_operand" "=c") (plus:QI (plus:QI (ltu:QI (reg:CC RCC) (const_int 0)) (match_operand:QI 1 "nonimmediate_operand" "%0")) (match_operand:QI 2 "general_operand" "cwmi"))) (use (reg:CC_C RCC)) (set (reg RCC) (compare (plus:QI (plus:QI (ltu:QI (reg:CC RCC) (const_int 0)) (match_dup 1)) (match_dup 2)) (const_int 0)))] "reload_completed && xap_match_ccmode(insn, CCmode)" "addc\\t%0,%f2") A couple of things to note: * negqi (which generates the nadd x, y equivalent to -x + y) has a set RCC in C mode followed by a clobber. The set in C mode doesn't show up in the _flags variant which is used only for the compare-elim since it doesn't really matter and it already contains a set RCC anyway. * the addc internal specifically says that it uses the RCC in C mode Now, some things still remain a mistery: * is this enough for GCC to understand that anything that clobbers RCC or specifically touches the RCC in C mode shouldn't go in between these two instructions? Also, do I need to specify in the RCC clobber, exactly which flags are clobbered, or should I use a set instead? * in the case of using sets, it was easy in the case of the negqi of findind the source of the set RCC, however, it's not so easy for the general case. Is unspec the answer? Is unspec the way of saying: "hey, I am setting RCC in Cmode here, you shouldn't bother about the value that I put there. Just know that RCC is going to be set." -- PMatos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-21 21:42 ` Paulo J. Matos @ 2011-10-22 0:13 ` Richard Henderson 2011-10-22 5:13 ` Peter Bigot ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Richard Henderson @ 2011-10-22 0:13 UTC (permalink / raw) To: Paulo J. Matos; +Cc: gcc On 10/21/2011 10:15 AM, Paulo J. Matos wrote: > So I have implemented the nadd and addc as: > > (define_insn "negqi2" > [(set (match_operand:QI 0 "register_operand" "=c") > (neg:QI (match_operand:QI 1 "register_operand" "0"))) > (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0))) > (clobber (reg:CC RCC))] > "" > { > operands[2] = const0_rtx; > return "nadd\\t%0,%2"; > }) There are lots of parts of the compiler that don't optimize well when an insn has more than one output. For the normal insn, just clobber the flags; don't include a second SET. > (define_insn "addc_internal" > [(set (match_operand:QI 0 "nonimmediate_operand" "=c") > (plus:QI > (plus:QI > (ltu:QI (reg:CC RCC) (const_int 0)) > (match_operand:QI 1 "nonimmediate_operand" "%0")) > (match_operand:QI 2 "general_operand" "cwmi"))) > (use (reg:CC_C RCC)) > (clobber (reg:CC RCC))] > "" > "addc\\t%0,%f2") You don't need the USE, because you mention RCC inside the LTU. > (define_insn "*addc_internal_flags" Likewise. > A couple of things to note: > * negqi (which generates the nadd x, y equivalent to -x + y) has a > set RCC in C mode followed by a clobber. The set in C mode doesn't > show up in the _flags variant which is used only for the compare-elim > since it doesn't really matter and it already contains a set RCC > anyway. Surely the NADD insn is simply a normal subtract (with reversed operands). You shouldn't *need* to implement NEG at all, as the middle-end will let NEG expand via MINUS. Just so you know... > * is this enough for GCC to understand that anything that clobbers > RCC or specifically touches the RCC in C mode shouldn't go in between > these two instructions? Yes. > Also, do I need to specify in the RCC > clobber, exactly which flags are clobbered, or should I use a set > instead? No, the compiler will assume the entire register is changed, no matter what CCmode you place there. > * in the case of using sets, it was easy in the case of the negqi of > findind the source of the set RCC, however, it's not so easy for the > general case. Is unspec the answer? Is unspec the way of saying: > "hey, I am setting RCC in Cmode here, you shouldn't bother about the > value that I put there. Just know that RCC is going to be set." You can often just use (compare x y) as well, assuming that the flags are set "appropriately". GCC doesn't assume anything about the contents of the intermediate CCmode object, but does assume that (lt (compare x y) (const_int 0)) produces the same value as (lt x y) But, yes, if there's no obvious comparison, then unspec is ok. r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-22 0:13 ` Richard Henderson @ 2011-10-22 5:13 ` Peter Bigot 2011-10-22 5:21 ` Paul_Koning 2011-10-24 12:07 ` Paulo J. Matos 2 siblings, 0 replies; 14+ messages in thread From: Peter Bigot @ 2011-10-22 5:13 UTC (permalink / raw) To: Richard Henderson; +Cc: Paulo J. Matos, gcc On Fri, Oct 21, 2011 at 4:41 PM, Richard Henderson <rth@redhat.com> wrote: > On 10/21/2011 10:15 AM, Paulo J. Matos wrote: >> So I have implemented the nadd and addc as: >> >> (define_insn "negqi2" >> [(set (match_operand:QI 0 "register_operand" "=c") >> (neg:QI (match_operand:QI 1 "register_operand" "0"))) >> (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0))) >> (clobber (reg:CC RCC))] >> "" >> { >> operands[2] = const0_rtx; >> return "nadd\\t%0,%2"; >> }) > > There are lots of parts of the compiler that don't optimize well when an > insn has more than one output. For the normal insn, just clobber the flags; > don't include a second SET. In my similar case, I'm implementing a multi-word rotate by using an arithmetic shift that overflows into carry, followed by a second rotate that shifts in from carry for the upper word: (define_insn "rla<mode>1" [(set (match_operand:INTRegModes 0 "nonimmediate_operand" "+rm") (ashift:<MODE> (match_dup 0) (const_int 1))) (set (reg:CC_C REGNO_SR) (unspec:CC_C [(match_dup 0)] UNSPEC_ROTC))] "" { return msp430_output_template (<MODE>mode, 0, "rla", NULL); } ) (define_insn "rlc<mode>1" [(set (match_operand:INTRegModes 0 "nonimmediate_operand" "+rm") (unspec:<MODE> [(match_dup 0)] UNSPEC_RLC)) (use (reg:CC_C REGNO_SR)) (set (reg:CC_C REGNO_SR) (unspec:CC_C [(match_dup 0)] UNSPEC_ROTC))] "" { return msp430_output_template (<MODE>mode, 0, "rlc", NULL); } ) I found that if I used (clobber (reg:CC_C REGNO_SR)) instead of set that the dependency was not preserved, and the rlc ended up after the rla in the final instruction stream. This is consistent with your previous comment in this thread: > The thing that's almost certainly missing is that the NAND pattern > must SET your flags register, not simply clobber it. Otherwise the > dependency between the ADDC and the NAND will never be created properly. Is there a better way to solve this that doesn't introduce multiple outputs in one insn? Peter > >> (define_insn "addc_internal" >> [(set (match_operand:QI 0 "nonimmediate_operand" "=c") >> (plus:QI >> (plus:QI >> (ltu:QI (reg:CC RCC) (const_int 0)) >> (match_operand:QI 1 "nonimmediate_operand" "%0")) >> (match_operand:QI 2 "general_operand" "cwmi"))) >> (use (reg:CC_C RCC)) >> (clobber (reg:CC RCC))] >> "" >> "addc\\t%0,%f2") > > You don't need the USE, because you mention RCC inside the LTU. > >> (define_insn "*addc_internal_flags" > > Likewise. > >> A couple of things to note: >> * negqi (which generates the nadd x, y equivalent to -x + y) has a >> set RCC in C mode followed by a clobber. The set in C mode doesn't >> show up in the _flags variant which is used only for the compare-elim >> since it doesn't really matter and it already contains a set RCC >> anyway. > > Surely the NADD insn is simply a normal subtract (with reversed operands). > You shouldn't *need* to implement NEG at all, as the middle-end will let > NEG expand via MINUS. > > Just so you know... > >> * is this enough for GCC to understand that anything that clobbers >> RCC or specifically touches the RCC in C mode shouldn't go in between >> these two instructions? > > Yes. > >> Also, do I need to specify in the RCC >> clobber, exactly which flags are clobbered, or should I use a set >> instead? > > No, the compiler will assume the entire register is changed, no matter > what CCmode you place there. > >> * in the case of using sets, it was easy in the case of the negqi of >> findind the source of the set RCC, however, it's not so easy for the >> general case. Is unspec the answer? Is unspec the way of saying: >> "hey, I am setting RCC in Cmode here, you shouldn't bother about the >> value that I put there. Just know that RCC is going to be set." > > You can often just use (compare x y) as well, assuming that the flags > are set "appropriately". GCC doesn't assume anything about the > contents of the intermediate CCmode object, but does assume that > > (lt (compare x y) (const_int 0)) > > produces the same value as > > (lt x y) > > But, yes, if there's no obvious comparison, then unspec is ok. > > > r~ > ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: Expanding instructions with condition codes inter-deps 2011-10-22 0:13 ` Richard Henderson 2011-10-22 5:13 ` Peter Bigot @ 2011-10-22 5:21 ` Paul_Koning 2011-10-24 7:04 ` Richard Henderson 2011-10-24 12:07 ` Paulo J. Matos 2 siblings, 1 reply; 14+ messages in thread From: Paul_Koning @ 2011-10-22 5:21 UTC (permalink / raw) To: rth, paulo; +Cc: gcc >There are lots of parts of the compiler that don't optimize well when an insn has more than one output. For the normal insn, just clobber the flags; don't include a second SET. Yes, but... isn't the whole point of CC modeling that you can take advantage of the CC left around by an instruction? Typically in machines with condition codes, you can eliminate test instructions (compare with zero) if the previous instruction has that variable as its output. But if we're discouraged from writing insns with CC outputs as normal practice, and if the compiler doesn't handle such constructs well in optimization, what then? Is cc0 any better here? In cc0 style condition code handling, the condition codes output is implicit rather than explicitly stated. Does that help, or hurt, or make no difference for the point you mentioned? paul ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-22 5:21 ` Paul_Koning @ 2011-10-24 7:04 ` Richard Henderson 2011-10-24 12:12 ` Paulo J. Matos 0 siblings, 1 reply; 14+ messages in thread From: Richard Henderson @ 2011-10-24 7:04 UTC (permalink / raw) To: Paul_Koning; +Cc: paulo, gcc On 10/21/2011 05:49 PM, Paul_Koning@Dell.com wrote: >> There are lots of parts of the compiler that don't optimize well when an insn has more than one output. For the normal insn, just clobber the flags; don't include a second SET. > > Yes, but... isn't the whole point of CC modeling that you can take advantage of the CC left around by an instruction? Typically in machines with condition codes, you can eliminate test instructions (compare with zero) if the previous instruction has that variable as its output. But if we're discouraged from writing insns with CC outputs as normal practice, and if the compiler doesn't handle such constructs well in optimization, what then? The solution is to have *two* insn patterns, one with a set of the flags and one with only a clobber. Have a look through i386.md and how the flags register is handled there. r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-24 7:04 ` Richard Henderson @ 2011-10-24 12:12 ` Paulo J. Matos 0 siblings, 0 replies; 14+ messages in thread From: Paulo J. Matos @ 2011-10-24 12:12 UTC (permalink / raw) To: gcc On 23/10/11 22:21, Richard Henderson wrote: > On 10/21/2011 05:49 PM, Paul_Koning@Dell.com wrote: >>> There are lots of parts of the compiler that don't optimize well when an insn has more than one output. For the normal insn, just clobber the flags; don't include a second SET. >> >> Yes, but... isn't the whole point of CC modeling that you can take advantage of the CC left around by an instruction? Typically in machines with condition codes, you can eliminate test instructions (compare with zero) if the previous instruction has that variable as its output. But if we're discouraged from writing insns with CC outputs as normal practice, and if the compiler doesn't handle such constructs well in optimization, what then? > > The solution is to have *two* insn patterns, one with a set of the flags > and one with only a clobber. Have a look through i386.md and > how the flags register is handled there. > In version 4.6.1, i386.md, I see things like: (define_insn "addqi3_cc" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:QI 1 "nonimmediate_operand" "%0,0") (match_operand:QI 2 "general_operand" "qn,qm")] UNSPEC_ADD_CARRY)) (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q") (plus:QI (match_dup 1) (match_dup 2)))] "ix86_binary_operator_ok (PLUS, QImode, operands)" "add{b}\t{%2, %0|%0, %2}" [(set_attr "type" "alu") (set_attr "mode" "QI")]) This seems to be exactly what we are doing. I can't see where there are separate rules for the flags register. -- PMatos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-22 0:13 ` Richard Henderson 2011-10-22 5:13 ` Peter Bigot 2011-10-22 5:21 ` Paul_Koning @ 2011-10-24 12:07 ` Paulo J. Matos 2 siblings, 0 replies; 14+ messages in thread From: Paulo J. Matos @ 2011-10-24 12:07 UTC (permalink / raw) To: gcc On 21/10/11 22:41, Richard Henderson wrote: > On 10/21/2011 10:15 AM, Paulo J. Matos wrote: >> So I have implemented the nadd and addc as: >> >> (define_insn "negqi2" >> [(set (match_operand:QI 0 "register_operand" "=c") >> (neg:QI (match_operand:QI 1 "register_operand" "0"))) >> (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0))) >> (clobber (reg:CC RCC))] >> "" >> { >> operands[2] = const0_rtx; >> return "nadd\\t%0,%2"; >> }) > > There are lots of parts of the compiler that don't optimize well when an > insn has more than one output. For the normal insn, just clobber the flags; > don't include a second SET. > But this case is not a normal insn per se, I did this to negqi2 because I need GCC to know that this instruction explicitly changes RCC and that the following instruction will use the carry flag (addc). The reason I say it is not a normal insn is because it comes often in a pair negqi2 / addc_internal, like for example addqi3 / addc_internal or subqi3 / subc_internal. >> (define_insn "addc_internal" >> [(set (match_operand:QI 0 "nonimmediate_operand" "=c") >> (plus:QI >> (plus:QI >> (ltu:QI (reg:CC RCC) (const_int 0)) >> (match_operand:QI 1 "nonimmediate_operand" "%0")) >> (match_operand:QI 2 "general_operand" "cwmi"))) >> (use (reg:CC_C RCC)) >> (clobber (reg:CC RCC))] >> "" >> "addc\\t%0,%f2") > > You don't need the USE, because you mention RCC inside the LTU. > >> (define_insn "*addc_internal_flags" > > Likewise. > Got it, thanks. >> A couple of things to note: >> * negqi (which generates the nadd x, y equivalent to -x + y) has a >> set RCC in C mode followed by a clobber. The set in C mode doesn't >> show up in the _flags variant which is used only for the compare-elim >> since it doesn't really matter and it already contains a set RCC >> anyway. > > Surely the NADD insn is simply a normal subtract (with reversed operands). > You shouldn't *need* to implement NEG at all, as the middle-end will let > NEG expand via MINUS. > > Just so you know... > But it is not exactly the same thing in this arch because: subqi3 generates a sub <register>, <data> == <register> = <register> - <data> to represent negqi2 of register R with a nadd I just do: nadd R,#0 to represent it using a sub I require more moves: ld R1, #0 sub R1, @R ; @R is memory mapped R ld R, @R1 >> * is this enough for GCC to understand that anything that clobbers >> RCC or specifically touches the RCC in C mode shouldn't go in between >> these two instructions? > > Yes. > >> Also, do I need to specify in the RCC >> clobber, exactly which flags are clobbered, or should I use a set >> instead? > > No, the compiler will assume the entire register is changed, no matter > what CCmode you place there. > Got it, so the only way to deal with the carry flag by itself would be to represent the Carry flag as a separate flags register. Although that would require more than one flags register and it feels messy. -- PMatos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Expanding instructions with condition codes inter-deps 2011-10-19 5:22 ` Richard Henderson 2011-10-19 5:45 ` Paul_Koning @ 2011-10-20 12:46 ` Paulo J. Matos 1 sibling, 0 replies; 14+ messages in thread From: Paulo J. Matos @ 2011-10-20 12:46 UTC (permalink / raw) To: gcc On 19/10/11 00:10, Richard Henderson wrote: > > The thing that's almost certainly missing is that the NAND pattern > must SET your flags register, not simply clobber it. Otherwise the > dependency between the ADDC and the NAND will never be created properly. > I understand that there's a missing SET of RCC in the NAND. However, what's the set source, we are really not sure what we are setting RCC to. Also, I am following the pattern set by rx and mn10300 of using a pseudo register for the flags. So my nadd pattern is: (define_insn "negqi2" [(set (match_operand:QI 0 "register_operand" "=c") (neg:QI (match_operand:QI 1 "register_operand" "0"))) (clobber (reg:CC RCC))] "" { operands[2] = const0_rtx; return "nadd\\t%0,%2"; }) (define_insn "*negqi2_flags" [(set (match_operand:QI 0 "register_operand" "=c") (neg:QI (match_operand:QI 1 "register_operand" "0"))) (set (reg RCC) (compare (neg:QI (match_dup 1)) (const_int 0)))] "reload_completed && xap_match_ccmode(insn, CCmode)" { operands[2] = const0_rtx; return "nadd\\t%0,%2"; }) It doesn't make sense to me, to add a (set (reg:C RCC) (neg:QI (match_dup 1))) into the parallel since this is going to class with the already existing _flags version of negqi which sets RCC. >> (for example, it would be ok to output negqi2, xorqi3 and >> addc_internal since xorqi3 only sets N and Z, not the Carry bit) > > For that you'd have to model all of the flags bits independently. > I don't believe any target has found that level of complexity to > be worth the trouble. > > So, almost certainly, you don't. > Well, if I am only interested in th Carry flag, I can as well just bother about that. It wouldn't be much of a trouble since I already have the mode CC_Cmode to model instruction which only set this flag. -- PMatos ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-10-24 9:50 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-10-17 15:58 Expanding instructions with condition codes inter-deps Paulo J. Matos 2011-10-17 17:23 ` Andrew Pinski 2011-10-18 13:44 ` Paulo J. Matos 2011-10-19 5:22 ` Richard Henderson 2011-10-19 5:45 ` Paul_Koning [not found] ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com> 2011-10-21 20:57 ` Richard Henderson 2011-10-21 21:42 ` Paulo J. Matos 2011-10-22 0:13 ` Richard Henderson 2011-10-22 5:13 ` Peter Bigot 2011-10-22 5:21 ` Paul_Koning 2011-10-24 7:04 ` Richard Henderson 2011-10-24 12:12 ` Paulo J. Matos 2011-10-24 12:07 ` Paulo J. Matos 2011-10-20 12:46 ` Paulo J. Matos
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).