public inbox for gcc-patches@gcc.gnu.org
* [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
@ 2016-09-16  8:50 Kyrill Tkachov
  2016-09-16  9:04 ` Richard Biener
  2016-09-16 11:02 ` Bernd Schmidt
  0 siblings, 2 replies; 12+ messages in thread
From: Kyrill Tkachov @ 2016-09-16  8:50 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1233 bytes --]

Hi all,

Currently the functions:
int f1(int x, int t)
{
   if (x == -1 || x == -2)
     t = 1;
   return t;
}

int f2(int x, int t)
{
   if (x == -1 || x == -2)
     return 1;
   return t;
}

generate different code on AArch64 even though they have identical functionality:
f1:
         add     w0, w0, 2
         cmp     w0, 1
         csinc   w0, w1, wzr, hi
         ret

f2:
         cmn     w0, #2
         csinc   w0, w1, wzr, cc
         ret

The problem is that f2 performs the comparison (LTU w0 -2)
whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to simplify the f1 form
to the f2 form with the simplify-rtx.c rule added in this patch. With this patch the
codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN, CSINC).
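
As a quick sanity check of the identity itself (a sketch that is not part of the patch), the two forms can be compared exhaustively in an 8-bit mode, where every (a, C) pair is cheap to enumerate. The function names below are made up for illustration; C == 0 is skipped because the patch excludes it.

```c
#include <assert.h>
#include <stdint.h>

/* (GTU (PLUS a C) (C - 1)) evaluated in an 8-bit mode.  */
int
gtu_plus_form (uint8_t a, uint8_t c)
{
  uint8_t sum = (uint8_t) (a + c);    /* addition wraps modulo 256 */
  uint8_t bound = (uint8_t) (c - 1);
  return sum > bound;
}

/* (LTU a -C) evaluated in the same 8-bit mode.  */
int
ltu_neg_form (uint8_t a, uint8_t c)
{
  uint8_t neg_c = (uint8_t) (0u - c); /* two's complement negation */
  return a < neg_c;
}

/* Count the (a, C) pairs with C != 0 on which the two forms disagree.  */
unsigned
count_mismatches (void)
{
  unsigned mismatches = 0;
  for (unsigned c = 1; c < 256; c++)
    for (unsigned a = 0; a < 256; a++)
      if (gtu_plus_form ((uint8_t) a, (uint8_t) c)
          != ltu_neg_form ((uint8_t) a, (uint8_t) c))
        mismatches++;
  return mismatches;
}

/* Abort if any pair disagrees.  */
void
check_identity (void)
{
  assert (count_mismatches () == 0);
}
```

Both forms are true exactly when a + C does not wrap, which is why the comparison against C - 1 can be replaced by a comparison of a against -C.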

Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu, x86_64.
What do you think? Is this a correct generalisation of this issue?
If so, ok for trunk?

Thanks,
Kyrill

2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * simplify-rtx.c (simplify_relational_operation_1): Add transformation
     (GTU (PLUS a C) (C - 1)) --> (LTU a -C).

2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: New test.

[-- Attachment #2: simplify-gtu.patch --]
[-- Type: text/x-patch, Size: 1524 bytes --]

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 14302ea06eccc099ef356ab6c63ac020dd083b0c..4153c7335680068ed3ce08410400ac6abaf30c89 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -4663,6 +4663,19 @@ simplify_relational_operation_1 (enum rtx_code code, machine_mode mode,
 				      cmp_mode, XEXP (op0, 0), new_cmp);
     }
 
+  /* (GTU (PLUS a C) (C - 1)) where C is a non-zero constant can be
+     transformed into (LTU a -C).  */
+  if (code == GTU && GET_CODE (op0) == PLUS && CONST_INT_P (op1)
+      && CONST_INT_P (XEXP (op0, 1))
+      && (UINTVAL (op1) == UINTVAL (XEXP (op0, 1)) - 1)
+      && XEXP (op0, 1) != const0_rtx)
+    {
+      rtx new_cmp
+	= simplify_gen_unary (NEG, cmp_mode, XEXP (op0, 1), cmp_mode);
+      return simplify_gen_relational (LTU, mode, cmp_mode,
+				       XEXP (op0, 0), new_cmp);
+    }
+
   /* Canonicalize (LTU/GEU (PLUS a b) b) as (LTU/GEU (PLUS a b) a).  */
   if ((code == LTU || code == GEU)
       && GET_CODE (op0) == PLUS
diff --git a/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_1.c b/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..81c536c90afe38932c48ed0af24f55e73eeff80e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+f1 (int x, int t)
+{
+  if (x == -1 || x == -2)
+    t = 1;
+
+  return t;
+}
+
+/* { dg-final { scan-assembler-times "cmn\\tw\[0-9\]+, #2" 1 } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16  8:50 [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C) Kyrill Tkachov
@ 2016-09-16  9:04 ` Richard Biener
  2016-09-16  9:40   ` Kyrill Tkachov
  2016-09-16 11:02 ` Bernd Schmidt
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-09-16  9:04 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: GCC Patches

On Fri, Sep 16, 2016 at 10:40 AM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
> Hi all,
>
> Currently the functions:
> int f1(int x, int t)
> {
>   if (x == -1 || x == -2)
>     t = 1;
>   return t;
> }
>
> int f2(int x, int t)
> {
>   if (x == -1 || x == -2)
>     return 1;
>   return t;
> }
>
> generate different code on AArch64 even though they have identical
> functionality:
> f1:
>         add     w0, w0, 2
>         cmp     w0, 1
>         csinc   w0, w1, wzr, hi
>         ret
>
> f2:
>         cmn     w0, #2
>         csinc   w0, w1, wzr, cc
>         ret
>
> The problem is that f2 performs the comparison (LTU w0 -2)
> whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to simplify
> the f1 form
> to the f2 form with the simplify-rtx.c rule added in this patch. With this
> patch the
> codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN, CSINC).
>
> Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu,
> x86_64.
> What do you think? Is this a correct generalisation of this issue?
> If so, ok for trunk?

Do you see a difference on the GIMPLE level?  If so, this kind of
transform looks
appropriate there, too.

Richard.

> Thanks,
> Kyrill
>
> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * simplify-rtx.c (simplify_relational_operation_1): Add transformation
>     (GTU (PLUS a C) (C - 1)) --> (LTU a -C).
>
> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: New test.


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16  9:04 ` Richard Biener
@ 2016-09-16  9:40   ` Kyrill Tkachov
  2016-09-16 10:02     ` Bin.Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Kyrill Tkachov @ 2016-09-16  9:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches


On 16/09/16 10:02, Richard Biener wrote:
> On Fri, Sep 16, 2016 at 10:40 AM, Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> Hi all,
>>
>> Currently the functions:
>> int f1(int x, int t)
>> {
>>    if (x == -1 || x == -2)
>>      t = 1;
>>    return t;
>> }
>>
>> int f2(int x, int t)
>> {
>>    if (x == -1 || x == -2)
>>      return 1;
>>    return t;
>> }
>>
>> generate different code on AArch64 even though they have identical
>> functionality:
>> f1:
>>          add     w0, w0, 2
>>          cmp     w0, 1
>>          csinc   w0, w1, wzr, hi
>>          ret
>>
>> f2:
>>          cmn     w0, #2
>>          csinc   w0, w1, wzr, cc
>>          ret
>>
>> The problem is that f2 performs the comparison (LTU w0 -2)
>> whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to simplify
>> the f1 form
>> to the f2 form with the simplify-rtx.c rule added in this patch. With this
>> patch the
>> codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN, CSINC).
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu,
>> x86_64.
>> What do you think? Is this a correct generalisation of this issue?
>> If so, ok for trunk?
> Do you see a difference on the GIMPLE level?  If so, this kind of
> transform looks
> appropriate there, too.

The GIMPLE for the two functions looks almost identical:
f1 (intD.7 xD.3078, intD.7 tD.3079)
{
   intD.7 x_4(D) = xD.3078;
   intD.7 t_5(D) = tD.3079;
   unsigned int x.0_1;
   unsigned int _2;
   x.0_1 = (unsigned int) x_4(D);

   _2 = x.0_1 + 2;
   if (_2 <= 1)
     goto <bb 3>;
   else
     goto <bb 4>;
;;   basic block 3, loop depth 0, count 0, freq 3977, maybe hot
;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot

   # t_3 = PHI <t_5(D)(2), 1(3)>
   return t_3;
}

f2 (intD.7 xD.3082, intD.7 tD.3083)
{
   intD.7 x_4(D) = xD.3082;
   intD.7 t_5(D) = tD.3083;
   unsigned int x.1_1;
   unsigned int _2;
   intD.7 _3;

   x.1_1 = (unsigned int) x_4(D);

   _2 = x.1_1 + 2;
   if (_2 <= 1)
     goto <bb 4>;
   else
     goto <bb 3>;

;;   basic block 3, loop depth 0, count 0, freq 6761, maybe hot
;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
   # _3 = PHI <1(2), t_5(D)(3)>
   return _3;

}

So at GIMPLE level we see a (x + 2 <=u 1) in both cases but with slightly
different CFG.  RTL-level transformations (ce1) bring it to the pre-combine RTL
where one does (LTU w0 -2) and the other does (GTU (PLUS w0 2) 1).
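
For reference, the range-check form in the dumps above corresponds to the source condition as in the following sketch. The function names are hypothetical and the sample list is only illustrative, not an exhaustive proof.

```c
#include <assert.h>
#include <limits.h>

/* The condition as written in f1 and f2.  */
int
source_form (int x)
{
  return x == -1 || x == -2;
}

/* The condition as it appears in the GIMPLE dumps.  */
int
gimple_form (int x)
{
  unsigned int ux = (unsigned int) x;  /* x.0_1 = (unsigned int) x_4(D) */
  unsigned int biased = ux + 2;        /* _2 = x.0_1 + 2 */
  return biased <= 1;                  /* if (_2 <= 1) */
}

/* Return 1 if both forms agree on a handful of boundary values.  */
int
forms_agree_on_samples (void)
{
  static const int samples[] = { INT_MIN, -3, -2, -1, 0, 1, 2, INT_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (source_form (samples[i]) != gimple_form (samples[i]))
      return 0;
  return 1;
}

/* Abort if the sample check fails.  */
void
check_forms (void)
{
  assert (forms_agree_on_samples ());
}
```

Adding 2 maps -2 and -1 onto 0 and 1, so a single unsigned comparison against 1 covers both equalities.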

The differences therefore start at the RTL level, so I think we need this transformation there.
However, for the testcase:
unsigned int
foo (unsigned int a, unsigned int b)
{
   return (a + 2) > 1;
}

The differences do appear at GIMPLE level, so I think a match.pd pattern would help here.
I'll look into adding one there as well, but that would be independent of this patch.

Thanks,
Kyrill

> Richard.
>
>> Thanks,
>> Kyrill
>>
>> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * simplify-rtx.c (simplify_relational_operation_1): Add transformation
>>      (GTU (PLUS a C) (C - 1)) --> (LTU a -C).
>>
>> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: New test.


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16  9:40   ` Kyrill Tkachov
@ 2016-09-16 10:02     ` Bin.Cheng
  2016-09-16 10:05       ` Kyrill Tkachov
  0 siblings, 1 reply; 12+ messages in thread
From: Bin.Cheng @ 2016-09-16 10:02 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Richard Biener, GCC Patches

On Fri, Sep 16, 2016 at 10:20 AM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
> On 16/09/16 10:02, Richard Biener wrote:
>>
>> On Fri, Sep 16, 2016 at 10:40 AM, Kyrill Tkachov
>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>
>>> Hi all,
>>>
>>> Currently the functions:
>>> int f1(int x, int t)
>>> {
>>>    if (x == -1 || x == -2)
>>>      t = 1;
>>>    return t;
>>> }
>>>
>>> int f2(int x, int t)
>>> {
>>>    if (x == -1 || x == -2)
>>>      return 1;
>>>    return t;
>>> }
>>>
>>> generate different code on AArch64 even though they have identical
>>> functionality:
>>> f1:
>>>          add     w0, w0, 2
>>>          cmp     w0, 1
>>>          csinc   w0, w1, wzr, hi
>>>          ret
>>>
>>> f2:
>>>          cmn     w0, #2
>>>          csinc   w0, w1, wzr, cc
>>>          ret
>>>
>>> The problem is that f2 performs the comparison (LTU w0 -2)
>>> whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to
>>> simplify
>>> the f1 form
>>> to the f2 form with the simplify-rtx.c rule added in this patch. With
>>> this
>>> patch the
>>> codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN, CSINC).
>>>
>>> Bootstrapped and tested on arm-none-linux-gnueabihf,
>>> aarch64-none-linux-gnu,
>>> x86_64.
>>> What do you think? Is this a correct generalisation of this issue?
>>> If so, ok for trunk?
>>
>> Do you see a difference on the GIMPLE level?  If so, this kind of
>> transform looks
>> appropriate there, too.
>
>
> The GIMPLE for the two functions looks almost identical:
> f1 (intD.7 xD.3078, intD.7 tD.3079)
> {
>   intD.7 x_4(D) = xD.3078;
>   intD.7 t_5(D) = tD.3079;
>   unsigned int x.0_1;
>   unsigned int _2;
>   x.0_1 = (unsigned int) x_4(D);
>
>   _2 = x.0_1 + 2;
>   if (_2 <= 1)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
> ;;   basic block 3, loop depth 0, count 0, freq 3977, maybe hot
> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>
>   # t_3 = PHI <t_5(D)(2), 1(3)>
>   return t_3;
> }
>
> f2 (intD.7 xD.3082, intD.7 tD.3083)
> {
>   intD.7 x_4(D) = xD.3082;
>   intD.7 t_5(D) = tD.3083;
>   unsigned int x.1_1;
>   unsigned int _2;
>   intD.7 _3;
>
>   x.1_1 = (unsigned int) x_4(D);
>
>   _2 = x.1_1 + 2;
>   if (_2 <= 1)
>     goto <bb 4>;
>   else
>     goto <bb 3>;
>
> ;;   basic block 3, loop depth 0, count 0, freq 6761, maybe hot
> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>   # _3 = PHI <1(2), t_5(D)(3)>
>   return _3;
>
> }
>
> So at GIMPLE level we see a (x + 2 <=u 1) in both cases but with slightly
> different CFG.  RTL-level transformations (ce1) bring it to the pre-combine
> RTL
> where one does (LTU w0 -2) and the other does (GTU (PLUS w0 2) 1).
>
> So the differences start at RTL level, so I think we need this
> transformation there.
> However, for the testcase:
> unsigned int
> foo (unsigned int a, unsigned int b)
> {
>   return (a + 2) > 1;
> }
>
> The differences do appear at GIMPLE level, so I think a match.pd pattern
> would help here.
Hi, may I ask which function this one is being compared against?

Thanks,
bin


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16 10:02     ` Bin.Cheng
@ 2016-09-16 10:05       ` Kyrill Tkachov
  2016-09-16 10:10         ` Bin.Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Kyrill Tkachov @ 2016-09-16 10:05 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Richard Biener, GCC Patches


On 16/09/16 10:50, Bin.Cheng wrote:
> On Fri, Sep 16, 2016 at 10:20 AM, Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> On 16/09/16 10:02, Richard Biener wrote:
>>> On Fri, Sep 16, 2016 at 10:40 AM, Kyrill Tkachov
>>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>> Hi all,
>>>>
>>>> Currently the functions:
>>>> int f1(int x, int t)
>>>> {
>>>>     if (x == -1 || x == -2)
>>>>       t = 1;
>>>>     return t;
>>>> }
>>>>
>>>> int f2(int x, int t)
>>>> {
>>>>     if (x == -1 || x == -2)
>>>>       return 1;
>>>>     return t;
>>>> }
>>>>
>>>> generate different code on AArch64 even though they have identical
>>>> functionality:
>>>> f1:
>>>>           add     w0, w0, 2
>>>>           cmp     w0, 1
>>>>           csinc   w0, w1, wzr, hi
>>>>           ret
>>>>
>>>> f2:
>>>>           cmn     w0, #2
>>>>           csinc   w0, w1, wzr, cc
>>>>           ret
>>>>
>>>> The problem is that f2 performs the comparison (LTU w0 -2)
>>>> whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to
>>>> simplify
>>>> the f1 form
>>>> to the f2 form with the simplify-rtx.c rule added in this patch. With
>>>> this
>>>> patch the
>>>> codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN, CSINC).
>>>>
>>>> Bootstrapped and tested on arm-none-linux-gnueabihf,
>>>> aarch64-none-linux-gnu,
>>>> x86_64.
>>>> What do you think? Is this a correct generalisation of this issue?
>>>> If so, ok for trunk?
>>> Do you see a difference on the GIMPLE level?  If so, this kind of
>>> transform looks
>>> appropriate there, too.
>>
>> The GIMPLE for the two functions looks almost identical:
>> f1 (intD.7 xD.3078, intD.7 tD.3079)
>> {
>>    intD.7 x_4(D) = xD.3078;
>>    intD.7 t_5(D) = tD.3079;
>>    unsigned int x.0_1;
>>    unsigned int _2;
>>    x.0_1 = (unsigned int) x_4(D);
>>
>>    _2 = x.0_1 + 2;
>>    if (_2 <= 1)
>>      goto <bb 3>;
>>    else
>>      goto <bb 4>;
>> ;;   basic block 3, loop depth 0, count 0, freq 3977, maybe hot
>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>
>>    # t_3 = PHI <t_5(D)(2), 1(3)>
>>    return t_3;
>> }
>>
>> f2 (intD.7 xD.3082, intD.7 tD.3083)
>> {
>>    intD.7 x_4(D) = xD.3082;
>>    intD.7 t_5(D) = tD.3083;
>>    unsigned int x.1_1;
>>    unsigned int _2;
>>    intD.7 _3;
>>
>>    x.1_1 = (unsigned int) x_4(D);
>>
>>    _2 = x.1_1 + 2;
>>    if (_2 <= 1)
>>      goto <bb 4>;
>>    else
>>      goto <bb 3>;
>>
>> ;;   basic block 3, loop depth 0, count 0, freq 6761, maybe hot
>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>    # _3 = PHI <1(2), t_5(D)(3)>
>>    return _3;
>>
>> }
>>
>> So at GIMPLE level we see a (x + 2 <=u 1) in both cases but with slightly
>> different CFG.  RTL-level transformations (ce1) bring it to the pre-combine
>> RTL
>> where one does (LTU w0 -2) and the other does (GTU (PLUS w0 2) 1).
>>
>> So the differences start at RTL level, so I think we need this
>> transformation there.
>> However, for the testcase:
>> unsigned int
>> foo (unsigned int a, unsigned int b)
>> {
>>    return (a + 2) > 1;
>> }
>>
>> The differences do appear at GIMPLE level, so I think a match.pd pattern
>> would help here.
> Hi, may I ask what the function looks like to which this one is different to?

Hi Bin,
I meant to say that the unsigned greater-than comparison is retained at the GIMPLE level,
so it could be optimised there.

Kyrill

> Thanks,
> bin


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16 10:05       ` Kyrill Tkachov
@ 2016-09-16 10:10         ` Bin.Cheng
  2016-09-16 10:15           ` Kyrill Tkachov
  0 siblings, 1 reply; 12+ messages in thread
From: Bin.Cheng @ 2016-09-16 10:10 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Richard Biener, GCC Patches

On Fri, Sep 16, 2016 at 10:53 AM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
> On 16/09/16 10:50, Bin.Cheng wrote:
>>
>> On Fri, Sep 16, 2016 at 10:20 AM, Kyrill Tkachov
>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>
>>> On 16/09/16 10:02, Richard Biener wrote:
>>>>
>>>> On Fri, Sep 16, 2016 at 10:40 AM, Kyrill Tkachov
>>>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Currently the functions:
>>>>> int f1(int x, int t)
>>>>> {
>>>>>     if (x == -1 || x == -2)
>>>>>       t = 1;
>>>>>     return t;
>>>>> }
>>>>>
>>>>> int f2(int x, int t)
>>>>> {
>>>>>     if (x == -1 || x == -2)
>>>>>       return 1;
>>>>>     return t;
>>>>> }
>>>>>
>>>>> generate different code on AArch64 even though they have identical
>>>>> functionality:
>>>>> f1:
>>>>>           add     w0, w0, 2
>>>>>           cmp     w0, 1
>>>>>           csinc   w0, w1, wzr, hi
>>>>>           ret
>>>>>
>>>>> f2:
>>>>>           cmn     w0, #2
>>>>>           csinc   w0, w1, wzr, cc
>>>>>           ret
>>>>>
>>>>> The problem is that f2 performs the comparison (LTU w0 -2)
>>>>> whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to
>>>>> simplify
>>>>> the f1 form
>>>>> to the f2 form with the simplify-rtx.c rule added in this patch. With
>>>>> this
>>>>> patch the
>>>>> codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN, CSINC).
>>>>>
>>>>> Bootstrapped and tested on arm-none-linux-gnueabihf,
>>>>> aarch64-none-linux-gnu,
>>>>> x86_64.
>>>>> What do you think? Is this a correct generalisation of this issue?
>>>>> If so, ok for trunk?
>>>>
>>>> Do you see a difference on the GIMPLE level?  If so, this kind of
>>>> transform looks
>>>> appropriate there, too.
>>>
>>>
>>> The GIMPLE for the two functions looks almost identical:
>>> f1 (intD.7 xD.3078, intD.7 tD.3079)
>>> {
>>>    intD.7 x_4(D) = xD.3078;
>>>    intD.7 t_5(D) = tD.3079;
>>>    unsigned int x.0_1;
>>>    unsigned int _2;
>>>    x.0_1 = (unsigned int) x_4(D);
>>>
>>>    _2 = x.0_1 + 2;
>>>    if (_2 <= 1)
>>>      goto <bb 3>;
>>>    else
>>>      goto <bb 4>;
>>> ;;   basic block 3, loop depth 0, count 0, freq 3977, maybe hot
>>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>>
>>>    # t_3 = PHI <t_5(D)(2), 1(3)>
>>>    return t_3;
>>> }
>>>
>>> f2 (intD.7 xD.3082, intD.7 tD.3083)
>>> {
>>>    intD.7 x_4(D) = xD.3082;
>>>    intD.7 t_5(D) = tD.3083;
>>>    unsigned int x.1_1;
>>>    unsigned int _2;
>>>    intD.7 _3;
>>>
>>>    x.1_1 = (unsigned int) x_4(D);
>>>
>>>    _2 = x.1_1 + 2;
>>>    if (_2 <= 1)
>>>      goto <bb 4>;
>>>    else
>>>      goto <bb 3>;
>>>
>>> ;;   basic block 3, loop depth 0, count 0, freq 6761, maybe hot
>>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>>    # _3 = PHI <1(2), t_5(D)(3)>
>>>    return _3;
>>>
>>> }
>>>
>>> So at GIMPLE level we see a (x + 2 <=u 1) in both cases but with slightly
>>> different CFG.  RTL-level transformations (ce1) bring it to the
>>> pre-combine
>>> RTL
>>> where one does (LTU w0 -2) and the other does (GTU (PLUS w0 2) 1).
>>>
>>> So the differences start at RTL level, so I think we need this
>>> transformation there.
>>> However, for the testcase:
>>> unsigned int
>>> foo (unsigned int a, unsigned int b)
>>> {
>>>    return (a + 2) > 1;
>>> }
>>>
>>> The differences do appear at GIMPLE level, so I think a match.pd pattern
>>> would help here.
>>
>> Hi, may I ask what the function looks like to which this one is different
>> to?
>
>
> Hi Bin,
> I meant to say that the unsigned greater than comparison is retained at the
> GIMPLE level
> so could be optimised there.
In this case the resulting GIMPLE refers to a huge unsigned constant, and
whether that constant can be encoded is target-dependent.  AArch64 has CMN
for this, but I'm not sure about other targets, and even AArch64 supports
only a small range of such constants.  It may be better to leave this to
RTL, where we know better whether the resulting code is optimal.
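
To make the concern concrete, here is a small sketch of what the negated constant looks like in a 32-bit mode. The helper `fits_imm12` is a hypothetical, simplified model of an add/sub-style immediate (12 bits, optionally shifted left by 12); it is an assumption for illustration and not GCC's actual predicate.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned value of -C in a 32-bit mode.  */
uint32_t
negated_constant (uint32_t c)
{
  return 0u - c;
}

/* Hypothetical, simplified immediate model: a 12-bit value,
   optionally shifted left by 12 bits.  */
int
fits_imm12 (uint32_t v)
{
  return (v & ~UINT32_C (0xfff)) == 0
	 || (v & ~UINT32_C (0xfff000)) == 0;
}

/* For C == 2 the negated constant is 0xfffffffe, which does not fit
   the model, while C itself does.  */
void
check_constants (void)
{
  assert (negated_constant (2) == UINT32_C (0xfffffffe));
  assert (!fits_imm12 (negated_constant (2)));
  assert (fits_imm12 (2));
}
```

Under this model the comparison against -2 is only cheap when the target can compare against the negated immediate, as CMN does with #2.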

Thanks,
bin


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16 10:10         ` Bin.Cheng
@ 2016-09-16 10:15           ` Kyrill Tkachov
  2016-09-16 10:29             ` Bin.Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Kyrill Tkachov @ 2016-09-16 10:15 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Richard Biener, GCC Patches


On 16/09/16 11:05, Bin.Cheng wrote:
> On Fri, Sep 16, 2016 at 10:53 AM, Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> On 16/09/16 10:50, Bin.Cheng wrote:
>>> On Fri, Sep 16, 2016 at 10:20 AM, Kyrill Tkachov
>>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>> On 16/09/16 10:02, Richard Biener wrote:
>>>>> On Fri, Sep 16, 2016 at 10:40 AM, Kyrill Tkachov
>>>>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Currently the functions:
>>>>>> int f1(int x, int t)
>>>>>> {
>>>>>>      if (x == -1 || x == -2)
>>>>>>        t = 1;
>>>>>>      return t;
>>>>>> }
>>>>>>
>>>>>> int f2(int x, int t)
>>>>>> {
>>>>>>      if (x == -1 || x == -2)
>>>>>>        return 1;
>>>>>>      return t;
>>>>>> }
>>>>>>
>>>>>> generate different code on AArch64 even though they have identical
>>>>>> functionality:
>>>>>> f1:
>>>>>>            add     w0, w0, 2
>>>>>>            cmp     w0, 1
>>>>>>            csinc   w0, w1, wzr, hi
>>>>>>            ret
>>>>>>
>>>>>> f2:
>>>>>>            cmn     w0, #2
>>>>>>            csinc   w0, w1, wzr, cc
>>>>>>            ret
>>>>>>
>>>>>> The problem is that f2 performs the comparison (LTU w0 -2)
>>>>>> whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to
>>>>>> simplify
>>>>>> the f1 form
>>>>>> to the f2 form with the simplify-rtx.c rule added in this patch. With
>>>>>> this
>>>>>> patch the
>>>>>> codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN, CSINC).
>>>>>>
>>>>>> Bootstrapped and tested on arm-none-linux-gnueabihf,
>>>>>> aarch64-none-linux-gnu,
>>>>>> x86_64.
>>>>>> What do you think? Is this a correct generalisation of this issue?
>>>>>> If so, ok for trunk?
>>>>> Do you see a difference on the GIMPLE level?  If so, this kind of
>>>>> transform looks
>>>>> appropriate there, too.
>>>>
>>>> The GIMPLE for the two functions looks almost identical:
>>>> f1 (intD.7 xD.3078, intD.7 tD.3079)
>>>> {
>>>>     intD.7 x_4(D) = xD.3078;
>>>>     intD.7 t_5(D) = tD.3079;
>>>>     unsigned int x.0_1;
>>>>     unsigned int _2;
>>>>     x.0_1 = (unsigned int) x_4(D);
>>>>
>>>>     _2 = x.0_1 + 2;
>>>>     if (_2 <= 1)
>>>>       goto <bb 3>;
>>>>     else
>>>>       goto <bb 4>;
>>>> ;;   basic block 3, loop depth 0, count 0, freq 3977, maybe hot
>>>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>>>
>>>>     # t_3 = PHI <t_5(D)(2), 1(3)>
>>>>     return t_3;
>>>> }
>>>>
>>>> f2 (intD.7 xD.3082, intD.7 tD.3083)
>>>> {
>>>>     intD.7 x_4(D) = xD.3082;
>>>>     intD.7 t_5(D) = tD.3083;
>>>>     unsigned int x.1_1;
>>>>     unsigned int _2;
>>>>     intD.7 _3;
>>>>
>>>>     x.1_1 = (unsigned int) x_4(D);
>>>>
>>>>     _2 = x.1_1 + 2;
>>>>     if (_2 <= 1)
>>>>       goto <bb 4>;
>>>>     else
>>>>       goto <bb 3>;
>>>>
>>>> ;;   basic block 3, loop depth 0, count 0, freq 6761, maybe hot
>>>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>>>     # _3 = PHI <1(2), t_5(D)(3)>
>>>>     return _3;
>>>>
>>>> }
>>>>
>>>> So at GIMPLE level we see a (x + 2 <=u 1) in both cases but with slightly
>>>> different CFG.  RTL-level transformations (ce1) bring it to the
>>>> pre-combine
>>>> RTL
>>>> where one does (LTU w0 -2) and the other does (GTU (PLUS w0 2) 1).
>>>>
>>>> So the differences start at RTL level, so I think we need this
>>>> transformation there.
>>>> However, for the testcase:
>>>> unsigned int
>>>> foo (unsigned int a, unsigned int b)
>>>> {
>>>>     return (a + 2) > 1;
>>>> }
>>>>
>>>> The differences do appear at GIMPLE level, so I think a match.pd pattern
>>>> would help here.
>>> Hi, may I ask what the function looks like to which this one is different
>>> to?
>>
>> Hi Bin,
>> I meant to say that the unsigned greater than comparison is retained at the
>> GIMPLE level
>> so could be optimised there.
> In this case, the resulting gimple code refers to a huge unsigned
> constant.  It's target dependent if that constant can be encoded.
> AArch64 has CMN to do that, not sure what other targets' case.  And
> AArch64 only supports small range of such constants.  May be better to
> leave it for RTL where we know better if result code is optimal.

Well, we are saving a PLUS operation, so the resulting GIMPLE is simpler IMO,
which is match.pd's goal.

Thanks,
Kyrill

> Thanks,
> bin


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16 10:15           ` Kyrill Tkachov
@ 2016-09-16 10:29             ` Bin.Cheng
  0 siblings, 0 replies; 12+ messages in thread
From: Bin.Cheng @ 2016-09-16 10:29 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Richard Biener, GCC Patches

On Fri, Sep 16, 2016 at 11:07 AM, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
> On 16/09/16 11:05, Bin.Cheng wrote:
>>
>> On Fri, Sep 16, 2016 at 10:53 AM, Kyrill Tkachov
>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>
>>> On 16/09/16 10:50, Bin.Cheng wrote:
>>>>
>>>> On Fri, Sep 16, 2016 at 10:20 AM, Kyrill Tkachov
>>>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>>>
>>>>> On 16/09/16 10:02, Richard Biener wrote:
>>>>>>
>>>>>> On Fri, Sep 16, 2016 at 10:40 AM, Kyrill Tkachov
>>>>>> <kyrylo.tkachov@foss.arm.com> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Currently the functions:
>>>>>>> int f1(int x, int t)
>>>>>>> {
>>>>>>>      if (x == -1 || x == -2)
>>>>>>>        t = 1;
>>>>>>>      return t;
>>>>>>> }
>>>>>>>
>>>>>>> int f2(int x, int t)
>>>>>>> {
>>>>>>>      if (x == -1 || x == -2)
>>>>>>>        return 1;
>>>>>>>      return t;
>>>>>>> }
>>>>>>>
>>>>>>> generate different code on AArch64 even though they have identical
>>>>>>> functionality:
>>>>>>> f1:
>>>>>>>            add     w0, w0, 2
>>>>>>>            cmp     w0, 1
>>>>>>>            csinc   w0, w1, wzr, hi
>>>>>>>            ret
>>>>>>>
>>>>>>> f2:
>>>>>>>            cmn     w0, #2
>>>>>>>            csinc   w0, w1, wzr, cc
>>>>>>>            ret
>>>>>>>
>>>>>>> The problem is that f2 performs the comparison (LTU w0 -2)
>>>>>>> whereas f1 performs (GTU (PLUS w0 2) 1). I think it is possible to
>>>>>>> simplify
>>>>>>> the f1 form
>>>>>>> to the f2 form with the simplify-rtx.c rule added in this patch. With
>>>>>>> this
>>>>>>> patch the
>>>>>>> codegen for both f1 and f2 on aarch64 at -O2 is identical (CMN,
>>>>>>> CSINC).
>>>>>>>
>>>>>>> Bootstrapped and tested on arm-none-linux-gnueabihf,
>>>>>>> aarch64-none-linux-gnu,
>>>>>>> x86_64.
>>>>>>> What do you think? Is this a correct generalisation of this issue?
>>>>>>> If so, ok for trunk?
>>>>>>
>>>>>> Do you see a difference on the GIMPLE level?  If so, this kind of
>>>>>> transform looks
>>>>>> appropriate there, too.
>>>>>
>>>>>
>>>>> The GIMPLE for the two functions looks almost identical:
>>>>> f1 (intD.7 xD.3078, intD.7 tD.3079)
>>>>> {
>>>>>     intD.7 x_4(D) = xD.3078;
>>>>>     intD.7 t_5(D) = tD.3079;
>>>>>     unsigned int x.0_1;
>>>>>     unsigned int _2;
>>>>>     x.0_1 = (unsigned int) x_4(D);
>>>>>
>>>>>     _2 = x.0_1 + 2;
>>>>>     if (_2 <= 1)
>>>>>       goto <bb 3>;
>>>>>     else
>>>>>       goto <bb 4>;
>>>>> ;;   basic block 3, loop depth 0, count 0, freq 3977, maybe hot
>>>>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>>>>
>>>>>     # t_3 = PHI <t_5(D)(2), 1(3)>
>>>>>     return t_3;
>>>>> }
>>>>>
>>>>> f2 (intD.7 xD.3082, intD.7 tD.3083)
>>>>> {
>>>>>     intD.7 x_4(D) = xD.3082;
>>>>>     intD.7 t_5(D) = tD.3083;
>>>>>     unsigned int x.1_1;
>>>>>     unsigned int _2;
>>>>>     intD.7 _3;
>>>>>
>>>>>     x.1_1 = (unsigned int) x_4(D);
>>>>>
>>>>>     _2 = x.1_1 + 2;
>>>>>     if (_2 <= 1)
>>>>>       goto <bb 4>;
>>>>>     else
>>>>>       goto <bb 3>;
>>>>>
>>>>> ;;   basic block 3, loop depth 0, count 0, freq 6761, maybe hot
>>>>> ;;   basic block 4, loop depth 0, count 0, freq 10000, maybe hot
>>>>>     # _3 = PHI <1(2), t_5(D)(3)>
>>>>>     return _3;
>>>>>
>>>>> }
>>>>>
>>>>> So at GIMPLE level we see a (x + 2 <=u 1) in both cases but with
>>>>> slightly
>>>>> different CFG.  RTL-level transformations (ce1) bring it to the
>>>>> pre-combine
>>>>> RTL
>>>>> where one does (LTU w0 -2) and the other does (GTU (PLUS w0 2) 1).
>>>>>
>>>>> So the differences start at RTL level, so I think we need this
>>>>> transformation there.
>>>>> However, for the testcase:
>>>>> unsigned int
>>>>> foo (unsigned int a, unsigned int b)
>>>>> {
>>>>>     return (a + 2) > 1;
>>>>> }
>>>>>
>>>>> The differences do appear at GIMPLE level, so I think a match.pd
>>>>> pattern
>>>>> would help here.
>>>>
>>>> Hi, may I ask what the function looks like to which this one is
>>>> different
>>>> to?
>>>
>>>
>>> Hi Bin,
>>> I meant to say that the unsigned greater than comparison is retained at
>>> the
>>> GIMPLE level
>>> so could be optimised there.
>>
>> In this case, the resulting gimple code refers to a huge unsigned
>> constant.  It's target dependent if that constant can be encoded.
>> AArch64 has CMN to do that, not sure what other targets' case.  And
>> AArch64 only supports small range of such constants.  May be better to
>> leave it for RTL where we know better if result code is optimal.
>
>
> Well, we are saving a PLUS operation, so the resulting GIMPLE is simpler
Ah, yes, right.

Thanks,
bin


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16  8:50 [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C) Kyrill Tkachov
  2016-09-16  9:04 ` Richard Biener
@ 2016-09-16 11:02 ` Bernd Schmidt
  2016-09-16 11:36   ` Kyrill Tkachov
  1 sibling, 1 reply; 12+ messages in thread
From: Bernd Schmidt @ 2016-09-16 11:02 UTC (permalink / raw)
  To: Kyrill Tkachov, GCC Patches

On 09/16/2016 10:40 AM, Kyrill Tkachov wrote:
>
> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * simplify-rtx.c (simplify_relational_operation_1): Add transformation
>     (GTU (PLUS a C) (C - 1)) --> (LTU a -C).
>
> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: New test.

Ok. Don't know if you want to add more variants of the input code to the 
testcase to make sure they're all covered.


Bernd


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16 11:02 ` Bernd Schmidt
@ 2016-09-16 11:36   ` Kyrill Tkachov
  2016-09-19 14:47     ` Kyrill Tkachov
  0 siblings, 1 reply; 12+ messages in thread
From: Kyrill Tkachov @ 2016-09-16 11:36 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches


On 16/09/16 11:45, Bernd Schmidt wrote:
> On 09/16/2016 10:40 AM, Kyrill Tkachov wrote:
>>
>> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * simplify-rtx.c (simplify_relational_operation_1): Add transformation
>>     (GTU (PLUS a C) (C - 1)) --> (LTU a -C).
>>
>> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: New test.
>
> Ok. Don't know if you want to add more variants of the input code to the testcase to make sure they're all covered.
>

Thanks.
I'm having trouble writing testcases for variations of the original testcase, as GCC really wants to convert
everything to a comparison against 1 at the RTL level, so only the x == -2 || x == -1 condition seems to trigger this.
However, testcases of the form:
unsigned int
foo (unsigned int a, unsigned int b)
{
   return (a + 10) > 9;
}

seem to trigger it, so I can add some of this form. However, these will be optimised by a match.pd version
of this transformation that I'm working on.
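
As a sanity check on the identity behind the rule, here is a minimal standalone sketch (not part of the patch). It compares (GTU (PLUS a C) (C - 1)) against (LTU a -C) for every 8-bit a and every non-zero 8-bit C. The 8-bit width and the names gtu_form, ltu_form, count_mismatches and check_identity are assumptions chosen for illustration; the same wraparound argument should carry over to wider modes.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: (GTU (PLUS a C) (C - 1)), evaluated in 8-bit unsigned
   arithmetic so that the addition wraps modulo 256.  */
static int
gtu_form (uint8_t a, uint8_t c)
{
  return (uint8_t) (a + c) > (uint8_t) (c - 1);
}

/* Simplified form: (LTU a -C), with -C taken modulo 256.  */
static int
ltu_form (uint8_t a, uint8_t c)
{
  return a < (uint8_t) -c;
}

/* Count the (a, C) pairs with C != 0 on which the two forms disagree.  */
static unsigned int
count_mismatches (void)
{
  unsigned int mismatches = 0;

  for (unsigned int c = 1; c < 256; c++)
    for (unsigned int a = 0; a < 256; a++)
      if (gtu_form ((uint8_t) a, (uint8_t) c)
          != ltu_form ((uint8_t) a, (uint8_t) c))
        mismatches++;

  return mismatches;
}

/* Abort if any pair disagrees.  */
static void
check_identity (void)
{
  assert (count_mismatches () == 0);
}
```

Calling check_identity () from a main function exercises all 255 * 256 pairs. The intuition is that a + C exceeds C - 1 exactly when the addition does not wrap, which is the case when a is below -C in unsigned terms.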

Kyrill

>
> Bernd


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-16 11:36   ` Kyrill Tkachov
@ 2016-09-19 14:47     ` Kyrill Tkachov
  2016-09-19 16:26       ` Bernd Schmidt
  0 siblings, 1 reply; 12+ messages in thread
From: Kyrill Tkachov @ 2016-09-19 14:47 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1720 bytes --]


On 16/09/16 12:12, Kyrill Tkachov wrote:
>
> On 16/09/16 11:45, Bernd Schmidt wrote:
>> On 09/16/2016 10:40 AM, Kyrill Tkachov wrote:
>>>
>>> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>     * simplify-rtx.c (simplify_relational_operation_1): Add transformation
>>>     (GTU (PLUS a C) (C - 1)) --> (LTU a -C).
>>>
>>> 2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>>
>>>     * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: New test.
>>
>> Ok. Don't know if you want to add more variants of the input code to the testcase to make sure they're all covered.
>>
>
> Thanks.
> I'm having trouble writing testcases for variations of the original testcase as GCC really really wants to convert
> everything to a comparison against 1 at RTL level, so only the x == -2 || x == -1 condition seems to trigger this.
> However, testcases of the form:
> unsigned int
> foo (unsigned int a, unsigned int b)
> {
>   return (a + 10) > 9;
> }
>
> seem to trigger it, so I can add some of this form. However, these will be optimised by a match.pd version
> of this transformation that I'm working on.
>
Here's the patch with that test added as well.  The simplify-rtx transformation catches it, but if we end up
adding the match.pd form, it will get caught earlier at the GIMPLE level. The test will pass regardless of
where this transformation is done.
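
For reference, the source-level equivalence that the second test relies on can be sketched as below. This is an illustration only and not part of the patch; foo_gtu, foo_ltu and check_boundaries are hypothetical names, and the sample values are picked around the wraparound point 0xfffffff6 (that is, -10 in 32 bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as foo in the new test: (GTU (PLUS a 10) 9).  */
static unsigned int
foo_gtu (uint32_t a)
{
  return (uint32_t) (a + 10u) > 9u;
}

/* Simplified shape: (LTU a -10), where -10 wraps to 0xfffffff6.  */
static unsigned int
foo_ltu (uint32_t a)
{
  return a < (uint32_t) -10;
}

/* Compare both shapes on values near zero and near the wraparound point.  */
static void
check_boundaries (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 9u, 10u, 0xfffffff5u, 0xfffffff6u, 0xfffffff7u, 0xffffffffu
  };

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (foo_gtu (samples[i]) == foo_ltu (samples[i]));
}
```

Both functions return 1 for a up to 0xfffffff5 and 0 from 0xfffffff6 onwards, which is why a single compare against -10 (CMN with #10 on AArch64) is enough and no separate ADD is needed.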

Ok?

Thanks,
Kyrill

2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * simplify-rtx.c (simplify_relational_operation_1): Add transformation
     (GTU (PLUS a C) (C - 1)) --> (LTU a -C).

2016-09-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/gtu_to_ltu_cmp_1.c: New test.
     * gcc.target/aarch64/gtu_to_ltu_cmp_2.c: New test.


[-- Attachment #2: simplify-gtu.patch --]
[-- Type: text/x-patch, Size: 2107 bytes --]

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 054c0f9d41664f8a4d11765dfb501647cfbc728f..63e864a237a05d250e3d8a3510775585fe8002db 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -4663,6 +4663,19 @@ simplify_relational_operation_1 (enum rtx_code code, machine_mode mode,
 				      cmp_mode, XEXP (op0, 0), new_cmp);
     }
 
+  /* (GTU (PLUS a C) (C - 1)) where C is a non-zero constant can be
+     transformed into (LTU a -C).  */
+  if (code == GTU && GET_CODE (op0) == PLUS && CONST_INT_P (op1)
+      && CONST_INT_P (XEXP (op0, 1))
+      && (UINTVAL (op1) == UINTVAL (XEXP (op0, 1)) - 1)
+      && XEXP (op0, 1) != const0_rtx)
+    {
+      rtx new_cmp
+	= simplify_gen_unary (NEG, cmp_mode, XEXP (op0, 1), cmp_mode);
+      return simplify_gen_relational (LTU, mode, cmp_mode,
+				       XEXP (op0, 0), new_cmp);
+    }
+
   /* Canonicalize (LTU/GEU (PLUS a b) b) as (LTU/GEU (PLUS a b) a).  */
   if ((code == LTU || code == GEU)
       && GET_CODE (op0) == PLUS
diff --git a/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_1.c b/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..81c536c90afe38932c48ed0af24f55e73eeff80e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+f1 (int x, int t)
+{
+  if (x == -1 || x == -2)
+    t = 1;
+
+  return t;
+}
+
+/* { dg-final { scan-assembler-times "cmn\\tw\[0-9\]+, #2" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_2.c b/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..e0e999f9df39c29bb79d8a8f7d9a17f213bd115b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/gtu_to_ltu_cmp_2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned int
+foo (unsigned int a, unsigned int b)
+{
+  return (a + 10) > 9;
+}
+
+/* { dg-final { scan-assembler-times "cmn\\tw\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-not "add\\tw\[0-9\]+" } } */


* Re: [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C)
  2016-09-19 14:47     ` Kyrill Tkachov
@ 2016-09-19 16:26       ` Bernd Schmidt
  0 siblings, 0 replies; 12+ messages in thread
From: Bernd Schmidt @ 2016-09-19 16:26 UTC (permalink / raw)
  To: Kyrill Tkachov, GCC Patches

On 09/19/2016 04:43 PM, Kyrill Tkachov wrote:
> Ok?

Sure.


Bernd


end of thread, other threads:[~2016-09-19 16:08 UTC | newest]

Thread overview: 12+ messages
2016-09-16  8:50 [PATCH][simplify-rtx] (GTU (PLUS a C) (C - 1)) --> (LTU a -C) Kyrill Tkachov
2016-09-16  9:04 ` Richard Biener
2016-09-16  9:40   ` Kyrill Tkachov
2016-09-16 10:02     ` Bin.Cheng
2016-09-16 10:05       ` Kyrill Tkachov
2016-09-16 10:10         ` Bin.Cheng
2016-09-16 10:15           ` Kyrill Tkachov
2016-09-16 10:29             ` Bin.Cheng
2016-09-16 11:02 ` Bernd Schmidt
2016-09-16 11:36   ` Kyrill Tkachov
2016-09-19 14:47     ` Kyrill Tkachov
2016-09-19 16:26       ` Bernd Schmidt
