public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Expanding instructions with condition codes inter-deps
@ 2011-10-17 15:58 Paulo J. Matos
  2011-10-17 17:23 ` Andrew Pinski
  2011-10-19  5:22 ` Richard Henderson
  0 siblings, 2 replies; 14+ messages in thread
From: Paulo J. Matos @ 2011-10-17 15:58 UTC (permalink / raw)
  To: gcc

Hi,

To negate a double word (HImode) register, I used to take the 
instruction all the way to assembly generation and then expand into 
three assembly instructions like so:
   xor  %t0, #ffff ; invert bits in top word of op0
   nadd %b0, #0    ; negate bottom bits of op0
   addc %t0, #0    ; add carry to top bits of op0

The interesting thing about this sequence is that the carry added to the 
top bits of op0, is the carry generated by the previous instruction.

If I instead of taking the neghi rule all the way to assembly, and 
instead expand it I get:
(define_expand "neghi2"
   [(set (match_operand:HI 0 "register_operand")
         (neg:HI (match_operand:HI 1 "general_operand")))
    (clobber (reg:CC RCC))]
   ""
{
     rtx op0_low = gen_lowpart(QImode, operands[0]);
     rtx op0_high = gen_highpart(QImode, operands[0]);
     emit_move_insn(operands[0], operands[1]);
     emit_insn(gen_xorqi3(op0_high, op0_high, GEN_INT(-1)));
     emit_insn(gen_negqi2(op0_low, op0_low));
     emit_insn(gen_addc_internal(op0_high, op0_high, const0_rtx));
     DONE;
})


addc_internal looks like:
(define_insn "addc_internal"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
         (plus:QI
           (plus:QI
             (ltu:QI (reg:CC RCC) (const_int 0))
             (match_operand:QI 1 "nonimmediate_operand" "%0"))
           (match_operand:QI 2 "general_operand" "cwmi")))
    (clobber (reg:CC RCC))]
   ""
   "addc\\t%0,%f2")


The problem is that GCC sometimes moves nadd and addc around. It might 
swap their order or insert instructions in-between that set the carry 
flag. This ends up generating code which won't work properly.

What's the best way to instruct GCC during neghi2 expansion that the 
carry read by addc internal is the one generated by negqi2 and so no 
instructions setting carry should be inserted between them and that 
their order should not be changed? (for example, it would be ok to 
output negqi2, xorqi3 and addc_internal since xorqi3 only sets N and Z, 
not the Carry bit).

Cheers,

-- 
PMatos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-17 15:58 Expanding instructions with condition codes inter-deps Paulo J. Matos
@ 2011-10-17 17:23 ` Andrew Pinski
  2011-10-18 13:44   ` Paulo J. Matos
  2011-10-19  5:22 ` Richard Henderson
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Pinski @ 2011-10-17 17:23 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

On Mon, Oct 17, 2011 at 3:50 AM, Paulo J. Matos <paulo@matos-sorge.com> wrote:

> addc_internal looks like:
> (define_insn "addc_internal"
>  [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>        (plus:QI
>          (plus:QI
>            (ltu:QI (reg:CC RCC) (const_int 0))
>            (match_operand:QI 1 "nonimmediate_operand" "%0"))
>          (match_operand:QI 2 "general_operand" "cwmi")))
>   (clobber (reg:CC RCC))]
>  ""
>  "addc\\t%0,%f2")

Try adding (use (reg:CC RCC)) to the pattern above.

Thanks,
Andrew Pinski
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-17 17:23 ` Andrew Pinski
@ 2011-10-18 13:44   ` Paulo J. Matos
  0 siblings, 0 replies; 14+ messages in thread
From: Paulo J. Matos @ 2011-10-18 13:44 UTC (permalink / raw)
  To: gcc

On 17/10/11 17:20, Andrew Pinski wrote:
> On Mon, Oct 17, 2011 at 3:50 AM, Paulo J. Matos<paulo@matos-sorge.com>  wrote:
>
>> addc_internal looks like:
>> (define_insn "addc_internal"
>>   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>>         (plus:QI
>>           (plus:QI
>>             (ltu:QI (reg:CC RCC) (const_int 0))
>>             (match_operand:QI 1 "nonimmediate_operand" "%0"))
>>           (match_operand:QI 2 "general_operand" "cwmi")))
>>    (clobber (reg:CC RCC))]
>>   ""
>>   "addc\\t%0,%f2")
>
> Try adding (use (reg:CC RCC)) to the pattern above.
>

Currently compiling that change into the code. I have a feeling that it 
will either confuse compare-elim.c or that it will be too strict and 
remove any possibility of optimization since with (use (reg:CC RCC)) we 
are not really saying that we are only interested in the carry flag 
instead of the whole thing. It will avoid anything clobbering RCC which 
is most of the instructions.

I will give it a spin and see what actually happens.

Thanks,

-- 
PMatos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-17 15:58 Expanding instructions with condition codes inter-deps Paulo J. Matos
  2011-10-17 17:23 ` Andrew Pinski
@ 2011-10-19  5:22 ` Richard Henderson
  2011-10-19  5:45   ` Paul_Koning
  2011-10-20 12:46   ` Paulo J. Matos
  1 sibling, 2 replies; 14+ messages in thread
From: Richard Henderson @ 2011-10-19  5:22 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

On 10/17/2011 03:50 AM, Paulo J. Matos wrote:
> Hi,
> 
> To negate a double word (HImode) register, I used to take the instruction all the way to assembly generation and then expand into three assembly instructions like so:
>   xor  %t0, #ffff ; invert bits in top word of op0
>   nadd %b0, #0    ; negate bottom bits of op0
>   addc %t0, #0    ; add carry to top bits of op0
> 
> The interesting thing about this sequence is that the carry added to the top bits of op0, is the carry generated by the previous instruction.
> 
> If I instead of taking the neghi rule all the way to assembly, and instead expand it I get:
> (define_expand "neghi2"
>   [(set (match_operand:HI 0 "register_operand")
>         (neg:HI (match_operand:HI 1 "general_operand")))
>    (clobber (reg:CC RCC))]
>   ""
> {
>     rtx op0_low = gen_lowpart(QImode, operands[0]);
>     rtx op0_high = gen_highpart(QImode, operands[0]);
>     emit_move_insn(operands[0], operands[1]);
>     emit_insn(gen_xorqi3(op0_high, op0_high, GEN_INT(-1)));
>     emit_insn(gen_negqi2(op0_low, op0_low));
>     emit_insn(gen_addc_internal(op0_high, op0_high, const0_rtx));
>     DONE;
> })
> 
> 
> addc_internal looks like:
> (define_insn "addc_internal"
>   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>         (plus:QI
>           (plus:QI
>             (ltu:QI (reg:CC RCC) (const_int 0))
>             (match_operand:QI 1 "nonimmediate_operand" "%0"))
>           (match_operand:QI 2 "general_operand" "cwmi")))
>    (clobber (reg:CC RCC))]
>   ""
>   "addc\\t%0,%f2")
> 
> 
> The problem is that GCC sometimes moves nadd and addc around.

The thing that's almost certainly missing is that the NAND pattern
must SET your flags register, not simply clobber it.  Otherwise the
dependency between the ADDC and the NAND will never be created properly.

> (for example, it would be ok to output negqi2, xorqi3 and
> addc_internal since xorqi3 only sets N and Z, not the Carry bit)

For that you'd have to model all of the flags bits independently.
I don't believe any target has found that level of complexity to
be worth the trouble.

So, almost certainly, you don't.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: Expanding instructions with condition codes inter-deps
  2011-10-19  5:22 ` Richard Henderson
@ 2011-10-19  5:45   ` Paul_Koning
       [not found]     ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com>
  2011-10-21 21:42     ` Paulo J. Matos
  2011-10-20 12:46   ` Paulo J. Matos
  1 sibling, 2 replies; 14+ messages in thread
From: Paul_Koning @ 2011-10-19  5:45 UTC (permalink / raw)
  To: rth, paulo; +Cc: gcc

>From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Richard Henderson
>On 10/17/2011 03:50 AM, Paulo J. Matos wrote:
>>...
>> (for example, it would be ok to output negqi2, xorqi3 and 
>> addc_internal since xorqi3 only sets N and Z, not the Carry bit)
>
>For that you'd have to model all of the flags bits independently.
>I don't believe any target has found that level of complexity to be worth the trouble.

Something like that shows up in the pdp11, where "mov" does not touch C.  And C matters for multi-word arithmetic, and also for unsigned compares.  So I think a CCmode implementation there would  model C separately from the other three flag bits.  So not 4 separate elements but two.  Right now it's a cc0 target but I figure on changing that at some point.  The reasons Paulo mentioned are one of the main reasons for that.

	paul

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-19  5:22 ` Richard Henderson
  2011-10-19  5:45   ` Paul_Koning
@ 2011-10-20 12:46   ` Paulo J. Matos
  1 sibling, 0 replies; 14+ messages in thread
From: Paulo J. Matos @ 2011-10-20 12:46 UTC (permalink / raw)
  To: gcc

On 19/10/11 00:10, Richard Henderson wrote:
>
> The thing that's almost certainly missing is that the NAND pattern
> must SET your flags register, not simply clobber it.  Otherwise the
> dependency between the ADDC and the NAND will never be created properly.
>

I understand that there's a missing SET of RCC in the NAND. However, 
what's the set source, we are really not sure what we are setting RCC to.

Also, I am following the pattern set by rx and mn10300 of using a pseudo 
register for the flags. So my nadd pattern is:
(define_insn "negqi2"
   [(set (match_operand:QI 0 "register_operand" "=c")
         (neg:QI (match_operand:QI 1 "register_operand" "0")))
    (clobber (reg:CC RCC))]
   ""
{
     operands[2] = const0_rtx;
     return  "nadd\\t%0,%2";
})

(define_insn "*negqi2_flags"
   [(set (match_operand:QI 0 "register_operand" "=c")
         (neg:QI (match_operand:QI 1 "register_operand" "0")))
    (set (reg RCC)
         (compare (neg:QI (match_dup 1))
                  (const_int 0)))]
   "reload_completed && xap_match_ccmode(insn, CCmode)"
{
     operands[2] = const0_rtx;
     return  "nadd\\t%0,%2";
})


It doesn't make sense to me, to add a
(set (reg:C RCC) (neg:QI (match_dup 1)))
into the parallel since this is going to class with the already existing 
_flags version of negqi which sets RCC.

>> (for example, it would be ok to output negqi2, xorqi3 and
>> addc_internal since xorqi3 only sets N and Z, not the Carry bit)
>
> For that you'd have to model all of the flags bits independently.
> I don't believe any target has found that level of complexity to
> be worth the trouble.
>
> So, almost certainly, you don't.
>

Well, if I am only interested in th Carry flag, I can as well just 
bother about that. It wouldn't be much of a trouble since I already have 
the mode CC_Cmode to model instruction which only set this flag.

-- 
PMatos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
       [not found]     ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com>
@ 2011-10-21 20:57       ` Richard Henderson
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2011-10-21 20:57 UTC (permalink / raw)
  To: Peter Bigot; +Cc: Paul_Koning, paulo, gcc

On 10/21/2011 09:13 AM, Peter Bigot wrote:
> Are there any existing machine descriptions that do model the carry
> flag separately, specifically to model rotate operations that use the
> carry flag as the destination and source for the shifted bit?  Or is
> the best I can do for that is to have the value be an unspec, with a
> use and clobber of the CC register (which currently blows up in
> simplify_subreg, but I'm still experimenting).

I don't believe there are any extant with that behaviour.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-19  5:45   ` Paul_Koning
       [not found]     ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com>
@ 2011-10-21 21:42     ` Paulo J. Matos
  2011-10-22  0:13       ` Richard Henderson
  1 sibling, 1 reply; 14+ messages in thread
From: Paulo J. Matos @ 2011-10-21 21:42 UTC (permalink / raw)
  To: gcc

On 19/10/11 01:48, Paul_Koning@Dell.com wrote:
>> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Richard Henderson
>> On 10/17/2011 03:50 AM, Paulo J. Matos wrote:
>>> ...
>>> (for example, it would be ok to output negqi2, xorqi3 and
>>> addc_internal since xorqi3 only sets N and Z, not the Carry bit)
>>
>> For that you'd have to model all of the flags bits independently.
>> I don't believe any target has found that level of complexity to be worth the trouble.
>
> Something like that shows up in the pdp11, where "mov" does not touch C.  And C matters for multi-word arithmetic, and also for unsigned compares.  So I think a CCmode implementation there would  model C separately from the other three flag bits.  So not 4 separate elements but two.  Right now it's a cc0 target but I figure on changing that at some point.  The reasons Paulo mentioned are one of the main reasons for that.
>
> 	paul
>
>

I have several CC modes depending on which flags are set by each 
instruction:
CC - all flags set
CC_C - Carry flag set
CC_NZ - Negative and Zero flags set

For neghi2 mode which transforms into an xor, nadd and addc in my arch I 
need to establish that the addc adds the carry produced by the nadd.

So I have implemented the nadd and addc as:

(define_insn "negqi2"
   [(set (match_operand:QI 0 "register_operand" "=c")
         (neg:QI (match_operand:QI 1 "register_operand" "0")))
    (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0)))
    (clobber (reg:CC RCC))]
   ""
{
     operands[2] = const0_rtx;
     return  "nadd\\t%0,%2";
})

(define_insn "*negqi2_flags"
   [(set (match_operand:QI 0 "register_operand" "=c")
         (neg:QI (match_operand:QI 1 "register_operand" "0")))
    (set (reg RCC)
         (compare (neg:QI (match_dup 1))
                  (const_int 0)))]
   "reload_completed && xap_match_ccmode(insn, CCmode)"
{
     operands[2] = const0_rtx;
     return  "nadd\\t%0,%2";
})

(define_insn "addc_internal"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
         (plus:QI
           (plus:QI
             (ltu:QI (reg:CC RCC) (const_int 0))
             (match_operand:QI 1 "nonimmediate_operand" "%0"))
           (match_operand:QI 2 "general_operand" "cwmi")))
    (use (reg:CC_C RCC))
    (clobber (reg:CC RCC))]
   ""
   "addc\\t%0,%f2")

(define_insn "*addc_internal_flags"
   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
         (plus:QI
           (plus:QI
             (ltu:QI (reg:CC RCC) (const_int 0))
             (match_operand:QI 1 "nonimmediate_operand" "%0"))
           (match_operand:QI 2 "general_operand" "cwmi")))
    (use (reg:CC_C RCC))
    (set (reg RCC)
         (compare
           (plus:QI
             (plus:QI
               (ltu:QI (reg:CC RCC) (const_int 0))
               (match_dup 1))
             (match_dup 2))
           (const_int 0)))]
   "reload_completed && xap_match_ccmode(insn, CCmode)"
   "addc\\t%0,%f2")

A couple of things to note:
* negqi (which generates the nadd x, y equivalent to -x + y) has a set 
RCC in C mode followed by a clobber. The set in C mode doesn't show up 
in the _flags variant which is used only for the compare-elim since it 
doesn't really matter and it already contains a set RCC anyway.
* the addc internal specifically says that it uses the RCC in C mode

Now, some things still remain a mistery:
* is this enough for GCC to understand that anything that clobbers RCC 
or specifically touches the RCC in C mode shouldn't go in between these 
two instructions? Also, do I need to specify in the RCC clobber, exactly 
which flags are clobbered, or should I use a set instead?
* in the case of using sets, it was easy in the case of the negqi of 
findind the source of the set RCC, however, it's not so easy for the 
general case. Is unspec the answer? Is unspec the way of saying: "hey, I 
am setting RCC in Cmode here, you shouldn't bother about the value that 
I put there. Just know that RCC is going to be set."

-- 
PMatos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-21 21:42     ` Paulo J. Matos
@ 2011-10-22  0:13       ` Richard Henderson
  2011-10-22  5:13         ` Peter Bigot
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Richard Henderson @ 2011-10-22  0:13 UTC (permalink / raw)
  To: Paulo J. Matos; +Cc: gcc

On 10/21/2011 10:15 AM, Paulo J. Matos wrote:
> So I have implemented the nadd and addc as:
> 
> (define_insn "negqi2"
>   [(set (match_operand:QI 0 "register_operand" "=c")
>         (neg:QI (match_operand:QI 1 "register_operand" "0")))
>    (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0)))
>    (clobber (reg:CC RCC))]
>   ""
> {
>     operands[2] = const0_rtx;
>     return  "nadd\\t%0,%2";
> })

There are lots of parts of the compiler that don't optimize well when an
insn has more than one output.  For the normal insn, just clobber the flags;
don't include a second SET.

> (define_insn "addc_internal"
>   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>         (plus:QI
>           (plus:QI
>             (ltu:QI (reg:CC RCC) (const_int 0))
>             (match_operand:QI 1 "nonimmediate_operand" "%0"))
>           (match_operand:QI 2 "general_operand" "cwmi")))
>    (use (reg:CC_C RCC))
>    (clobber (reg:CC RCC))]
>   ""
>   "addc\\t%0,%f2")

You don't need the USE, because you mention RCC inside the LTU.

> (define_insn "*addc_internal_flags"

Likewise.

> A couple of things to note:
> * negqi (which generates the nadd x, y equivalent to -x + y) has a
> set RCC in C mode followed by a clobber. The set in C mode doesn't
> show up in the _flags variant which is used only for the compare-elim
> since it doesn't really matter and it already contains a set RCC
> anyway.

Surely the NADD insn is simply a normal subtract (with reversed operands).
You shouldn't *need* to implement NEG at all, as the middle-end will let
NEG expand via MINUS.

Just so you know...

> * is this enough for GCC to understand that anything that clobbers
> RCC or specifically touches the RCC in C mode shouldn't go in between
> these two instructions?

Yes.

> Also, do I need to specify in the RCC
> clobber, exactly which flags are clobbered, or should I use a set
> instead?

No, the compiler will assume the entire register is changed, no matter
what CCmode you place there.

> * in the case of using sets, it was easy in the case of the negqi of
> findind the source of the set RCC, however, it's not so easy for the
> general case. Is unspec the answer? Is unspec the way of saying:
> "hey, I am setting RCC in Cmode here, you shouldn't bother about the
> value that I put there. Just know that RCC is going to be set."

You can often just use (compare x y) as well, assuming that the flags
are set "appropriately".  GCC doesn't assume anything about the 
contents of the intermediate CCmode object, but does assume that

  (lt (compare x y) (const_int 0))

produces the same value as

  (lt x y)

But, yes, if there's no obvious comparison, then unspec is ok.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-22  0:13       ` Richard Henderson
@ 2011-10-22  5:13         ` Peter Bigot
  2011-10-22  5:21         ` Paul_Koning
  2011-10-24 12:07         ` Paulo J. Matos
  2 siblings, 0 replies; 14+ messages in thread
From: Peter Bigot @ 2011-10-22  5:13 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paulo J. Matos, gcc

On Fri, Oct 21, 2011 at 4:41 PM, Richard Henderson <rth@redhat.com> wrote:
> On 10/21/2011 10:15 AM, Paulo J. Matos wrote:
>> So I have implemented the nadd and addc as:
>>
>> (define_insn "negqi2"
>>   [(set (match_operand:QI 0 "register_operand" "=c")
>>         (neg:QI (match_operand:QI 1 "register_operand" "0")))
>>    (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0)))
>>    (clobber (reg:CC RCC))]
>>   ""
>> {
>>     operands[2] = const0_rtx;
>>     return  "nadd\\t%0,%2";
>> })
>
> There are lots of parts of the compiler that don't optimize well when an
> insn has more than one output.  For the normal insn, just clobber the flags;
> don't include a second SET.

In my similar case, I'm implementing a multi-word rotate by using an
arithmetic shift that overflows into carry, followed by a second rotate
that shifts in from carry for the upper word:

(define_insn "rla<mode>1"
  [(set (match_operand:INTRegModes 0 "nonimmediate_operand" "+rm")
	(ashift:<MODE> (match_dup 0) (const_int 1)))
   (set (reg:CC_C REGNO_SR) (unspec:CC_C [(match_dup 0)] UNSPEC_ROTC))]
  ""
  { return msp430_output_template (<MODE>mode, 0, "rla", NULL); }
  )

(define_insn "rlc<mode>1"
  [(set (match_operand:INTRegModes 0 "nonimmediate_operand" "+rm")
	(unspec:<MODE> [(match_dup 0)] UNSPEC_RLC))
   (use (reg:CC_C REGNO_SR))
   (set (reg:CC_C REGNO_SR) (unspec:CC_C [(match_dup 0)] UNSPEC_ROTC))]
  ""
  { return msp430_output_template (<MODE>mode, 0, "rlc", NULL); }
  )

I found that if I used (clobber (reg:CC_C REGNO_SR)) instead of set that the
dependency was not preserved, and the rlc ended up after the rla in the
final instruction stream.  This is consistent with your previous comment in
this thread:

> The thing that's almost certainly missing is that the NAND pattern
> must SET your flags register, not simply clobber it.  Otherwise the
> dependency between the ADDC and the NAND will never be created properly.

Is there a better way to solve this that doesn't introduce multiple outputs in
one insn?

Peter

>
>> (define_insn "addc_internal"
>>   [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>>         (plus:QI
>>           (plus:QI
>>             (ltu:QI (reg:CC RCC) (const_int 0))
>>             (match_operand:QI 1 "nonimmediate_operand" "%0"))
>>           (match_operand:QI 2 "general_operand" "cwmi")))
>>    (use (reg:CC_C RCC))
>>    (clobber (reg:CC RCC))]
>>   ""
>>   "addc\\t%0,%f2")
>
> You don't need the USE, because you mention RCC inside the LTU.
>
>> (define_insn "*addc_internal_flags"
>
> Likewise.
>
>> A couple of things to note:
>> * negqi (which generates the nadd x, y equivalent to -x + y) has a
>> set RCC in C mode followed by a clobber. The set in C mode doesn't
>> show up in the _flags variant which is used only for the compare-elim
>> since it doesn't really matter and it already contains a set RCC
>> anyway.
>
> Surely the NADD insn is simply a normal subtract (with reversed operands).
> You shouldn't *need* to implement NEG at all, as the middle-end will let
> NEG expand via MINUS.
>
> Just so you know...
>
>> * is this enough for GCC to understand that anything that clobbers
>> RCC or specifically touches the RCC in C mode shouldn't go in between
>> these two instructions?
>
> Yes.
>
>> Also, do I need to specify in the RCC
>> clobber, exactly which flags are clobbered, or should I use a set
>> instead?
>
> No, the compiler will assume the entire register is changed, no matter
> what CCmode you place there.
>
>> * in the case of using sets, it was easy in the case of the negqi of
>> findind the source of the set RCC, however, it's not so easy for the
>> general case. Is unspec the answer? Is unspec the way of saying:
>> "hey, I am setting RCC in Cmode here, you shouldn't bother about the
>> value that I put there. Just know that RCC is going to be set."
>
> You can often just use (compare x y) as well, assuming that the flags
> are set "appropriately".  GCC doesn't assume anything about the
> contents of the intermediate CCmode object, but does assume that
>
>  (lt (compare x y) (const_int 0))
>
> produces the same value as
>
>  (lt x y)
>
> But, yes, if there's no obvious comparison, then unspec is ok.
>
>
> r~
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: Expanding instructions with condition codes inter-deps
  2011-10-22  0:13       ` Richard Henderson
  2011-10-22  5:13         ` Peter Bigot
@ 2011-10-22  5:21         ` Paul_Koning
  2011-10-24  7:04           ` Richard Henderson
  2011-10-24 12:07         ` Paulo J. Matos
  2 siblings, 1 reply; 14+ messages in thread
From: Paul_Koning @ 2011-10-22  5:21 UTC (permalink / raw)
  To: rth, paulo; +Cc: gcc

>There are lots of parts of the compiler that don't optimize well when an insn has more than one output.  For the normal insn, just clobber the flags; don't include a second SET.

Yes, but... isn't the whole point of CC modeling that you can take advantage of the CC left around by an instruction?  Typically in machines with condition codes, you can eliminate test instructions (compare with zero) if the previous instruction has that variable as its output.  But if we're discouraged from writing insns with CC outputs as normal practice, and if the compiler doesn't handle such constructs well in optimization, what then?

Is cc0 any better here?  In cc0 style condition code handling, the condition codes output is implicit rather than explicitly stated.  Does that help, or hurt, or make no difference for the point you mentioned?

	paul

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-22  5:21         ` Paul_Koning
@ 2011-10-24  7:04           ` Richard Henderson
  2011-10-24 12:12             ` Paulo J. Matos
  0 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2011-10-24  7:04 UTC (permalink / raw)
  To: Paul_Koning; +Cc: paulo, gcc

On 10/21/2011 05:49 PM, Paul_Koning@Dell.com wrote:
>> There are lots of parts of the compiler that don't optimize well when an insn has more than one output.  For the normal insn, just clobber the flags; don't include a second SET.
> 
> Yes, but... isn't the whole point of CC modeling that you can take advantage of the CC left around by an instruction?  Typically in machines with condition codes, you can eliminate test instructions (compare with zero) if the previous instruction has that variable as its output.  But if we're discouraged from writing insns with CC outputs as normal practice, and if the compiler doesn't handle such constructs well in optimization, what then?

The solution is to have *two* insn patterns, one with a set of the flags
and one with only a clobber.  Have a look through i386.md and
how the flags register is handled there.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-22  0:13       ` Richard Henderson
  2011-10-22  5:13         ` Peter Bigot
  2011-10-22  5:21         ` Paul_Koning
@ 2011-10-24 12:07         ` Paulo J. Matos
  2 siblings, 0 replies; 14+ messages in thread
From: Paulo J. Matos @ 2011-10-24 12:07 UTC (permalink / raw)
  To: gcc

On 21/10/11 22:41, Richard Henderson wrote:
> On 10/21/2011 10:15 AM, Paulo J. Matos wrote:
>> So I have implemented the nadd and addc as:
>>
>> (define_insn "negqi2"
>>    [(set (match_operand:QI 0 "register_operand" "=c")
>>          (neg:QI (match_operand:QI 1 "register_operand" "0")))
>>     (set (reg:CC_C RCC) (eq (match_dup 1) (const_int 0)))
>>     (clobber (reg:CC RCC))]
>>    ""
>> {
>>      operands[2] = const0_rtx;
>>      return  "nadd\\t%0,%2";
>> })
>
> There are lots of parts of the compiler that don't optimize well when an
> insn has more than one output.  For the normal insn, just clobber the flags;
> don't include a second SET.
>

But this case is not a normal insn per se, I did this to negqi2 because 
I need GCC to know that this instruction explicitly changes RCC and that 
the following instruction will use the carry flag (addc).

The reason I say it is not a normal insn is because it comes often in a 
pair negqi2 / addc_internal, like for example addqi3 / addc_internal or 
subqi3 / subc_internal.

>> (define_insn "addc_internal"
>>    [(set (match_operand:QI 0 "nonimmediate_operand" "=c")
>>          (plus:QI
>>            (plus:QI
>>              (ltu:QI (reg:CC RCC) (const_int 0))
>>              (match_operand:QI 1 "nonimmediate_operand" "%0"))
>>            (match_operand:QI 2 "general_operand" "cwmi")))
>>     (use (reg:CC_C RCC))
>>     (clobber (reg:CC RCC))]
>>    ""
>>    "addc\\t%0,%f2")
>
> You don't need the USE, because you mention RCC inside the LTU.
>
>> (define_insn "*addc_internal_flags"
>
> Likewise.
>

Got it, thanks.

>> A couple of things to note:
>> * negqi (which generates the nadd x, y equivalent to -x + y) has a
>> set RCC in C mode followed by a clobber. The set in C mode doesn't
>> show up in the _flags variant which is used only for the compare-elim
>> since it doesn't really matter and it already contains a set RCC
>> anyway.
>
> Surely the NADD insn is simply a normal subtract (with reversed operands).
> You shouldn't *need* to implement NEG at all, as the middle-end will let
> NEG expand via MINUS.
>
> Just so you know...
>

But it is not exactly the same thing in this arch because:
subqi3 generates a
sub <register>, <data> == <register> = <register> - <data>

to represent negqi2 of register R with a nadd I just do:
nadd R,#0

to represent it using a sub I require more moves:
ld R1, #0
sub R1, @R ; @R is memory mapped R
ld R, @R1

>> * is this enough for GCC to understand that anything that clobbers
>> RCC or specifically touches the RCC in C mode shouldn't go in between
>> these two instructions?
>
> Yes.
>
>> Also, do I need to specify in the RCC
>> clobber, exactly which flags are clobbered, or should I use a set
>> instead?
>
> No, the compiler will assume the entire register is changed, no matter
> what CCmode you place there.
>

Got it, so the only way to deal with the carry flag by itself would be 
to represent the Carry flag as a separate flags register. Although that 
would require more than one flags register and it feels messy.

-- 
PMatos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Expanding instructions with condition codes inter-deps
  2011-10-24  7:04           ` Richard Henderson
@ 2011-10-24 12:12             ` Paulo J. Matos
  0 siblings, 0 replies; 14+ messages in thread
From: Paulo J. Matos @ 2011-10-24 12:12 UTC (permalink / raw)
  To: gcc

On 23/10/11 22:21, Richard Henderson wrote:
> On 10/21/2011 05:49 PM, Paul_Koning@Dell.com wrote:
>>> There are lots of parts of the compiler that don't optimize well when an insn has more than one output.  For the normal insn, just clobber the flags; don't include a second SET.
>>
>> Yes, but... isn't the whole point of CC modeling that you can take advantage of the CC left around by an instruction?  Typically in machines with condition codes, you can eliminate test instructions (compare with zero) if the previous instruction has that variable as its output.  But if we're discouraged from writing insns with CC outputs as normal practice, and if the compiler doesn't handle such constructs well in optimization, what then?
>
> The solution is to have *two* insn patterns, one with a set of the flags
> and one with only a clobber.  Have a look through i386.md and
> how the flags register is handled there.
>

In version 4.6.1, i386.md,  I see things like:

(define_insn "addqi3_cc"
   [(set (reg:CC FLAGS_REG)
         (unspec:CC
           [(match_operand:QI 1 "nonimmediate_operand" "%0,0")
            (match_operand:QI 2 "general_operand" "qn,qm")]
           UNSPEC_ADD_CARRY))
    (set (match_operand:QI 0 "nonimmediate_operand" "=qm,q")
         (plus:QI (match_dup 1) (match_dup 2)))]
   "ix86_binary_operator_ok (PLUS, QImode, operands)"
   "add{b}\t{%2, %0|%0, %2}"
   [(set_attr "type" "alu")
    (set_attr "mode" "QI")])


This seems to be exactly what we are doing. I can't see where there are 
separate rules for the flags register.

-- 
PMatos

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2011-10-24  9:50 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-17 15:58 Expanding instructions with condition codes inter-deps Paulo J. Matos
2011-10-17 17:23 ` Andrew Pinski
2011-10-18 13:44   ` Paulo J. Matos
2011-10-19  5:22 ` Richard Henderson
2011-10-19  5:45   ` Paul_Koning
     [not found]     ` <CAPOJ94M2XrqM_kG98v1dC1=K2fEkHpuNLSkZBrQyzJ9ncmaQXg@mail.gmail.com>
2011-10-21 20:57       ` Richard Henderson
2011-10-21 21:42     ` Paulo J. Matos
2011-10-22  0:13       ` Richard Henderson
2011-10-22  5:13         ` Peter Bigot
2011-10-22  5:21         ` Paul_Koning
2011-10-24  7:04           ` Richard Henderson
2011-10-24 12:12             ` Paulo J. Matos
2011-10-24 12:07         ` Paulo J. Matos
2011-10-20 12:46   ` Paulo J. Matos

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).