public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
@ 2010-07-22  7:11 Carrot Wei
  2010-07-26  6:40 ` Carrot Wei
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Carrot Wei @ 2010-07-22  7:11 UTC (permalink / raw)
  To: gcc-patches

In thumb2 "and r0, r0, #255" is 32 bit, uxtb is 16 bit and does the same
operation. This patch simply detect the situation in pattern "andsi3" and call
gen_thumb2_zero_extendqisi2_v6.

Tested on arm qemu.

ChangeLog:
2010-07-22  Wei Guozhi  <carrot@google.com>

        PR target/44999
        * config/arm/arm.md (andsi3): Change to zero extension if possible.
        * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Change the name.

ChangeLog:
2010-07-22  Wei Guozhi  <carrot@google.com>

        PR target/44999
        * gcc.target/arm/pr44999.c: New testcase.

		
Index: thumb2.md
===================================================================
--- thumb2.md   (revision 162396)
+++ thumb2.md   (working copy)
@@ -970,7 +970,7 @@
    (set_attr "neg_pool_range" "*,250")]
 )

-(define_insn "*thumb2_zero_extendqisi2_v6"
+(define_insn "thumb2_zero_extendqisi2_v6"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
        (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
   "TARGET_THUMB2 && arm_arch6"


Index: arm.md
===================================================================
--- arm.md      (revision 162396)
+++ arm.md      (working copy)
@@ -1933,9 +1933,16 @@
     {
       if (GET_CODE (operands[2]) == CONST_INT)
         {
-          arm_split_constant (AND, SImode, NULL_RTX,
-                             INTVAL (operands[2]), operands[0],
-                             operands[1], optimize && can_create_pseudo_p ());
+         if (INTVAL (operands[2]) == 255 && arm_arch6)
+           {
+             operands[1] = gen_rtx_SUBREG (QImode, operands[1], 0);
+             gen_thumb2_zero_extendqisi2_v6 (operands[0], operands[1]);
+           }
+         else
+           arm_split_constant (AND, SImode, NULL_RTX,
+                               INTVAL (operands[2]), operands[0],
+                               operands[1],
+                               optimize && can_create_pseudo_p ());

           DONE;
         }

		
Index: pr44999.c
===================================================================
--- pr44999.c   (revision 0)
+++ pr44999.c   (revision 0)
@@ -0,0 +1,7 @@
+/* { dg-options "-march=armv7-a -mthumb -Os" }  */
+/* { dg-final { scan-assembler "uxtb" } } */
+
+int tp(int x, int y)
+{
+  return (x & 0xff) - (y & 0xffff);
+}

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-07-22  7:11 [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2 Carrot Wei
@ 2010-07-26  6:40 ` Carrot Wei
  2010-07-26  7:23   ` Paolo Bonzini
  2010-07-30  7:08 ` Carrot Wei
  2010-08-09 10:38 ` Richard Earnshaw
  2 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-07-26  6:40 UTC (permalink / raw)
  To: gcc-patches

ping

On Thu, Jul 22, 2010 at 3:11 PM, Carrot Wei <carrot@google.com> wrote:
> In thumb2 "and r0, r0, #255" is 32 bit, uxtb is 16 bit and does the same
> operation. This patch simply detect the situation in pattern "andsi3" and call
> gen_thumb2_zero_extendqisi2_v6.
>
> Tested on arm qemu.
>
> ChangeLog:
> 2010-07-22  Wei Guozhi  <carrot@google.com>
>
>        PR target/44999
>        * config/arm/arm.md (andsi3): Change to zero extension if possible.
>        * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Change the name.
>
> ChangeLog:
> 2010-07-22  Wei Guozhi  <carrot@google.com>
>
>        PR target/44999
>        * gcc.target/arm/pr44999.c: New testcase.
>
>
> Index: thumb2.md
> ===================================================================
> --- thumb2.md   (revision 162396)
> +++ thumb2.md   (working copy)
> @@ -970,7 +970,7 @@
>    (set_attr "neg_pool_range" "*,250")]
>  )
>
> -(define_insn "*thumb2_zero_extendqisi2_v6"
> +(define_insn "thumb2_zero_extendqisi2_v6"
>   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>        (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
>   "TARGET_THUMB2 && arm_arch6"
>
>
> Index: arm.md
> ===================================================================
> --- arm.md      (revision 162396)
> +++ arm.md      (working copy)
> @@ -1933,9 +1933,16 @@
>     {
>       if (GET_CODE (operands[2]) == CONST_INT)
>         {
> -          arm_split_constant (AND, SImode, NULL_RTX,
> -                             INTVAL (operands[2]), operands[0],
> -                             operands[1], optimize && can_create_pseudo_p ());
> +         if (INTVAL (operands[2]) == 255 && arm_arch6)
> +           {
> +             operands[1] = gen_rtx_SUBREG (QImode, operands[1], 0);
> +             gen_thumb2_zero_extendqisi2_v6 (operands[0], operands[1]);
> +           }
> +         else
> +           arm_split_constant (AND, SImode, NULL_RTX,
> +                               INTVAL (operands[2]), operands[0],
> +                               operands[1],
> +                               optimize && can_create_pseudo_p ());
>
>           DONE;
>         }
>
>
> Index: pr44999.c
> ===================================================================
> --- pr44999.c   (revision 0)
> +++ pr44999.c   (revision 0)
> @@ -0,0 +1,7 @@
> +/* { dg-options "-march=armv7-a -mthumb -Os" }  */
> +/* { dg-final { scan-assembler "uxtb" } } */
> +
> +int tp(int x, int y)
> +{
> +  return (x & 0xff) - (y & 0xffff);
> +}
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-07-26  6:40 ` Carrot Wei
@ 2010-07-26  7:23   ` Paolo Bonzini
  2010-07-26  9:09     ` Carrot Wei
  0 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2010-07-26  7:23 UTC (permalink / raw)
  To: Carrot Wei; +Cc: gcc-patches

On 07/26/2010 08:39 AM, Carrot Wei wrote:
> ping

After four days?  People are often on holiday in July.

The patch may be a reasonable approach, however I'd also check whether 
combine attempts this change, and if so why does it fail.

Paolo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-07-26  7:23   ` Paolo Bonzini
@ 2010-07-26  9:09     ` Carrot Wei
  2010-07-26  9:22       ` Paolo Bonzini
  0 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-07-26  9:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: gcc-patches

I'm not an expert of combine, and I find the following comments in combine.c:

   We try to combine each pair of insns joined by a logical link.
   We also try to combine triples of insns A, B and C when
   C has a link back to B and B has a link back to A.

Only one insn is involved in the change, so combine should not check it.

thanks
Guozhi

On Mon, Jul 26, 2010 at 3:23 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 07/26/2010 08:39 AM, Carrot Wei wrote:
>>
>> ping
>
> After four days?  People are often on holiday in July.
>
> The patch may be a reasonable approach, however I'd also check whether
> combine attempts this change, and if so why does it fail.
>
> Paolo
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-07-26  9:09     ` Carrot Wei
@ 2010-07-26  9:22       ` Paolo Bonzini
  0 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2010-07-26  9:22 UTC (permalink / raw)
  To: Carrot Wei; +Cc: gcc-patches

On 07/26/2010 11:08 AM, Carrot Wei wrote:
> I'm not an expert of combine, and I find the following comments in combine.c:
>
>     We try to combine each pair of insns joined by a logical link.
>     We also try to combine triples of insns A, B and C when
>     C has a link back to B and B has a link back to A.
>
> Only one insn is involved in the change, so combine should not check it.

Right, of course.  I check the i386 backend and it is also doing the 
same optimization as your patch does, i.e. without the help of the 
middle-end.

(It actually does it using a constraint instead of RTL, but that doesn't 
really matter and would be a bit harder to do for ARM).

Paolo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-07-22  7:11 [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2 Carrot Wei
  2010-07-26  6:40 ` Carrot Wei
@ 2010-07-30  7:08 ` Carrot Wei
  2010-08-09  4:00   ` Carrot Wei
  2010-08-09 10:38 ` Richard Earnshaw
  2 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-07-30  7:08 UTC (permalink / raw)
  To: gcc-patches

ARM maintainers, is this OK?

thanks
Guozhi

On Thu, Jul 22, 2010 at 3:11 PM, Carrot Wei <carrot@google.com> wrote:
> In thumb2 "and r0, r0, #255" is 32 bit, uxtb is 16 bit and does the same
> operation. This patch simply detect the situation in pattern "andsi3" and call
> gen_thumb2_zero_extendqisi2_v6.
>
> Tested on arm qemu.
>
> ChangeLog:
> 2010-07-22  Wei Guozhi  <carrot@google.com>
>
>        PR target/44999
>        * config/arm/arm.md (andsi3): Change to zero extension if possible.
>        * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Change the name.
>
> ChangeLog:
> 2010-07-22  Wei Guozhi  <carrot@google.com>
>
>        PR target/44999
>        * gcc.target/arm/pr44999.c: New testcase.
>
>
> Index: thumb2.md
> ===================================================================
> --- thumb2.md   (revision 162396)
> +++ thumb2.md   (working copy)
> @@ -970,7 +970,7 @@
>    (set_attr "neg_pool_range" "*,250")]
>  )
>
> -(define_insn "*thumb2_zero_extendqisi2_v6"
> +(define_insn "thumb2_zero_extendqisi2_v6"
>   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>        (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
>   "TARGET_THUMB2 && arm_arch6"
>
>
> Index: arm.md
> ===================================================================
> --- arm.md      (revision 162396)
> +++ arm.md      (working copy)
> @@ -1933,9 +1933,16 @@
>     {
>       if (GET_CODE (operands[2]) == CONST_INT)
>         {
> -          arm_split_constant (AND, SImode, NULL_RTX,
> -                             INTVAL (operands[2]), operands[0],
> -                             operands[1], optimize && can_create_pseudo_p ());
> +         if (INTVAL (operands[2]) == 255 && arm_arch6)
> +           {
> +             operands[1] = gen_rtx_SUBREG (QImode, operands[1], 0);
> +             gen_thumb2_zero_extendqisi2_v6 (operands[0], operands[1]);
> +           }
> +         else
> +           arm_split_constant (AND, SImode, NULL_RTX,
> +                               INTVAL (operands[2]), operands[0],
> +                               operands[1],
> +                               optimize && can_create_pseudo_p ());
>
>           DONE;
>         }
>
>
> Index: pr44999.c
> ===================================================================
> --- pr44999.c   (revision 0)
> +++ pr44999.c   (revision 0)
> @@ -0,0 +1,7 @@
> +/* { dg-options "-march=armv7-a -mthumb -Os" }  */
> +/* { dg-final { scan-assembler "uxtb" } } */
> +
> +int tp(int x, int y)
> +{
> +  return (x & 0xff) - (y & 0xffff);
> +}
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-07-30  7:08 ` Carrot Wei
@ 2010-08-09  4:00   ` Carrot Wei
  2010-08-10  9:10     ` Nick Clifton
  0 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-08-09  4:00 UTC (permalink / raw)
  To: gcc-patches, Richard Earnshaw, Nick Clifton, Paul Brook

ping

On Fri, Jul 30, 2010 at 2:31 PM, Carrot Wei <carrot@google.com> wrote:
> ARM maintainers, is this OK?
>
> thanks
> Guozhi
>
> On Thu, Jul 22, 2010 at 3:11 PM, Carrot Wei <carrot@google.com> wrote:
>> In thumb2 "and r0, r0, #255" is 32 bit, uxtb is 16 bit and does the same
>> operation. This patch simply detect the situation in pattern "andsi3" and call
>> gen_thumb2_zero_extendqisi2_v6.
>>
>> Tested on arm qemu.
>>
>> ChangeLog:
>> 2010-07-22  Wei Guozhi  <carrot@google.com>
>>
>>        PR target/44999
>>        * config/arm/arm.md (andsi3): Change to zero extension if possible.
>>        * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Change the name.
>>
>> ChangeLog:
>> 2010-07-22  Wei Guozhi  <carrot@google.com>
>>
>>        PR target/44999
>>        * gcc.target/arm/pr44999.c: New testcase.
>>
>>
>> Index: thumb2.md
>> ===================================================================
>> --- thumb2.md   (revision 162396)
>> +++ thumb2.md   (working copy)
>> @@ -970,7 +970,7 @@
>>    (set_attr "neg_pool_range" "*,250")]
>>  )
>>
>> -(define_insn "*thumb2_zero_extendqisi2_v6"
>> +(define_insn "thumb2_zero_extendqisi2_v6"
>>   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>>        (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
>>   "TARGET_THUMB2 && arm_arch6"
>>
>>
>> Index: arm.md
>> ===================================================================
>> --- arm.md      (revision 162396)
>> +++ arm.md      (working copy)
>> @@ -1933,9 +1933,16 @@
>>     {
>>       if (GET_CODE (operands[2]) == CONST_INT)
>>         {
>> -          arm_split_constant (AND, SImode, NULL_RTX,
>> -                             INTVAL (operands[2]), operands[0],
>> -                             operands[1], optimize && can_create_pseudo_p ());
>> +         if (INTVAL (operands[2]) == 255 && arm_arch6)
>> +           {
>> +             operands[1] = gen_rtx_SUBREG (QImode, operands[1], 0);
>> +             gen_thumb2_zero_extendqisi2_v6 (operands[0], operands[1]);
>> +           }
>> +         else
>> +           arm_split_constant (AND, SImode, NULL_RTX,
>> +                               INTVAL (operands[2]), operands[0],
>> +                               operands[1],
>> +                               optimize && can_create_pseudo_p ());
>>
>>           DONE;
>>         }
>>
>>
>> Index: pr44999.c
>> ===================================================================
>> --- pr44999.c   (revision 0)
>> +++ pr44999.c   (revision 0)
>> @@ -0,0 +1,7 @@
>> +/* { dg-options "-march=armv7-a -mthumb -Os" }  */
>> +/* { dg-final { scan-assembler "uxtb" } } */
>> +
>> +int tp(int x, int y)
>> +{
>> +  return (x & 0xff) - (y & 0xffff);
>> +}
>>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-07-22  7:11 [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2 Carrot Wei
  2010-07-26  6:40 ` Carrot Wei
  2010-07-30  7:08 ` Carrot Wei
@ 2010-08-09 10:38 ` Richard Earnshaw
  2010-08-11  1:16   ` Carrot Wei
  2 siblings, 1 reply; 22+ messages in thread
From: Richard Earnshaw @ 2010-08-09 10:38 UTC (permalink / raw)
  To: Carrot Wei; +Cc: gcc-patches


On Thu, 2010-07-22 at 15:11 +0800, Carrot Wei wrote:
> In thumb2 "and r0, r0, #255" is 32 bit, uxtb is 16 bit and does the same
> operation. This patch simply detect the situation in pattern "andsi3" and call
> gen_thumb2_zero_extendqisi2_v6.

> Index: arm.md
> ===================================================================
> --- arm.md      (revision 162396)
> +++ arm.md      (working copy)
> @@ -1933,9 +1933,16 @@
>      {
>        if (GET_CODE (operands[2]) == CONST_INT)
>          {
> -          arm_split_constant (AND, SImode, NULL_RTX,
> -                             INTVAL (operands[2]), operands[0],
> -                             operands[1], optimize && can_create_pseudo_p ());
> +         if (INTVAL (operands[2]) == 255 && arm_arch6)
> +           {
> +             operands[1] = gen_rtx_SUBREG (QImode, operands[1], 0);

No, never generate SUBREGs directly like this (you get the wrong byte
for a big-endian core).

Instead, you should use convert_to_mode().

>+/* { dg-options "-march=armv7-a -mthumb -Os" }  */
>+/* { dg-final { scan-assembler "uxtb" } } */

This is also wrong and will break if people are testing other CPU
permutations.  See thumb2-cmpneg2add-1.c for an example of how to do
this correctly.

R.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-08-09  4:00   ` Carrot Wei
@ 2010-08-10  9:10     ` Nick Clifton
  2010-08-10 14:10       ` Richard Earnshaw
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Clifton @ 2010-08-10  9:10 UTC (permalink / raw)
  To: Carrot Wei; +Cc: gcc-patches

Hi Carrot,

>> ARM maintainers, is this OK?

>>> ChangeLog:
>>> 2010-07-22  Wei Guozhi<carrot@google.com>
>>>
>>>         PR target/44999
>>>         * config/arm/arm.md (andsi3): Change to zero extension if possible.
>>>         * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Change the name.
>>>
>>> ChangeLog:
>>> 2010-07-22  Wei Guozhi<carrot@google.com>
>>>
>>>         PR target/44999
>>>         * gcc.target/arm/pr44999.c: New testcase.

Approved - please apply.

Cheers
   Nick

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-08-10  9:10     ` Nick Clifton
@ 2010-08-10 14:10       ` Richard Earnshaw
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Earnshaw @ 2010-08-10 14:10 UTC (permalink / raw)
  To: Nick Clifton; +Cc: Carrot Wei, gcc-patches


On Tue, 2010-08-10 at 10:08 +0100, Nick Clifton wrote:
> Hi Carrot,
> 
> >> ARM maintainers, is this OK?
> 
> >>> ChangeLog:
> >>> 2010-07-22  Wei Guozhi<carrot@google.com>
> >>>
> >>>         PR target/44999
> >>>         * config/arm/arm.md (andsi3): Change to zero extension if possible.
> >>>         * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Change the name.
> >>>
> >>> ChangeLog:
> >>> 2010-07-22  Wei Guozhi<carrot@google.com>
> >>>
> >>>         PR target/44999
> >>>         * gcc.target/arm/pr44999.c: New testcase.
> 
> Approved - please apply.

Sorry, no it's not OK.  See my earlier review.

R.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-08-09 10:38 ` Richard Earnshaw
@ 2010-08-11  1:16   ` Carrot Wei
  2010-08-11  8:59     ` Richard Earnshaw
  0 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-08-11  1:16 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches

Hi Richard

The following patch has been modified as your suggestion. And it
passed the testing on qemu.


Index: pr44999.c
===================================================================
--- pr44999.c   (revision 0)
+++ pr44999.c   (revision 0)
@@ -0,0 +1,9 @@
+/* Use UXTB to extract the lowest byte.  */
+/* { dg-options "-mthumb -Os" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler "uxtb" } } */
+
+int tp(int x, int y)
+{
+  return (x & 0xff) - (y & 0xffff);
+}

Index: thumb2.md
===================================================================
--- thumb2.md   (revision 163046)
+++ thumb2.md   (working copy)
@@ -907,7 +907,7 @@
    (set_attr "neg_pool_range" "*,250")]
 )

-(define_insn "*thumb2_zero_extendqisi2_v6"
+(define_insn "thumb2_zero_extendqisi2_v6"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
        (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
   "TARGET_THUMB2 && arm_arch6"
Index: arm.md
===================================================================
--- arm.md      (revision 163046)
+++ arm.md      (working copy)
@@ -1947,9 +1947,17 @@
     {
       if (GET_CODE (operands[2]) == CONST_INT)
         {
-          arm_split_constant (AND, SImode, NULL_RTX,
-                             INTVAL (operands[2]), operands[0],
-                             operands[1], optimize && can_create_pseudo_p ());
+         if (INTVAL (operands[2]) == 255 && arm_arch6)
+           {
+             operands[1] = convert_to_mode (QImode, operands[1], 1);
+             emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
+                                                        operands[1]));
+           }
+         else
+           arm_split_constant (AND, SImode, NULL_RTX,
+                               INTVAL (operands[2]), operands[0],
+                               operands[1],
+                               optimize && can_create_pseudo_p ());

           DONE;
         }



On Mon, Aug 9, 2010 at 6:30 PM, Richard Earnshaw <rearnsha@arm.com> wrote:
>
> On Thu, 2010-07-22 at 15:11 +0800, Carrot Wei wrote:
>> In thumb2 "and r0, r0, #255" is 32 bit, uxtb is 16 bit and does the same
>> operation. This patch simply detect the situation in pattern "andsi3" and call
>> gen_thumb2_zero_extendqisi2_v6.
>
>> Index: arm.md
>> ===================================================================
>> --- arm.md      (revision 162396)
>> +++ arm.md      (working copy)
>> @@ -1933,9 +1933,16 @@
>>      {
>>        if (GET_CODE (operands[2]) == CONST_INT)
>>          {
>> -          arm_split_constant (AND, SImode, NULL_RTX,
>> -                             INTVAL (operands[2]), operands[0],
>> -                             operands[1], optimize && can_create_pseudo_p ());
>> +         if (INTVAL (operands[2]) == 255 && arm_arch6)
>> +           {
>> +             operands[1] = gen_rtx_SUBREG (QImode, operands[1], 0);
>
> No, never generate SUBREGs directly like this (you get the wrong byte
> for a big-endian core).
>
> Instead, you should use convert_to_mode().
>
>>+/* { dg-options "-march=armv7-a -mthumb -Os" }  */
>>+/* { dg-final { scan-assembler "uxtb" } } */
>
> This is also wrong and will break if people are testing other CPU
> permutations.  See thumb2-cmpneg2add-1.c for an example of how to do
> this correctly.
>
> R.
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in  thumb2
  2010-08-11  1:16   ` Carrot Wei
@ 2010-08-11  8:59     ` Richard Earnshaw
  2010-11-03 13:09       ` Paul Brook
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Earnshaw @ 2010-08-11  8:59 UTC (permalink / raw)
  To: Carrot Wei; +Cc: gcc-patches


On Wed, 2010-08-11 at 09:13 +0800, Carrot Wei wrote:
> Hi Richard
> 
> The following patch has been modified as your suggestion. And it
> passed the testing on qemu.
> 
> 
> Index: pr44999.c
> ===================================================================
> --- pr44999.c   (revision 0)
> +++ pr44999.c   (revision 0)
> @@ -0,0 +1,9 @@
> +/* Use UXTB to extract the lowest byte.  */
> +/* { dg-options "-mthumb -Os" } */
> +/* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-final { scan-assembler "uxtb" } } */
> +
> +int tp(int x, int y)
> +{
> +  return (x & 0xff) - (y & 0xffff);
> +}
> 
> Index: thumb2.md
> ===================================================================
> --- thumb2.md   (revision 163046)
> +++ thumb2.md   (working copy)
> @@ -907,7 +907,7 @@
>     (set_attr "neg_pool_range" "*,250")]
>  )
> 
> -(define_insn "*thumb2_zero_extendqisi2_v6"
> +(define_insn "thumb2_zero_extendqisi2_v6"
>    [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>         (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
>    "TARGET_THUMB2 && arm_arch6"
> Index: arm.md
> ===================================================================
> --- arm.md      (revision 163046)
> +++ arm.md      (working copy)
> @@ -1947,9 +1947,17 @@
>      {
>        if (GET_CODE (operands[2]) == CONST_INT)
>          {
> -          arm_split_constant (AND, SImode, NULL_RTX,
> -                             INTVAL (operands[2]), operands[0],
> -                             operands[1], optimize && can_create_pseudo_p ());
> +         if (INTVAL (operands[2]) == 255 && arm_arch6)
> +           {
> +             operands[1] = convert_to_mode (QImode, operands[1], 1);
> +             emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
> +                                                        operands[1]));
> +           }
> +         else
> +           arm_split_constant (AND, SImode, NULL_RTX,
> +                               INTVAL (operands[2]), operands[0],
> +                               operands[1],
> +                               optimize && can_create_pseudo_p ());
> 
>            DONE;
>          }
> 


This is OK.

R.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with  uxtb in  thumb2
  2010-08-11  8:59     ` Richard Earnshaw
@ 2010-11-03 13:09       ` Paul Brook
  2010-11-07 14:21         ` Carrot Wei
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Brook @ 2010-11-03 13:09 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Earnshaw, Carrot Wei

> > +             operands[1] = convert_to_mode (QImode, operands[1], 1);
> > +             emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
> > +                                                        operands[1]));
> > +           }
> >          
> >          }
> 
> This is OK.

No it's not. This code is inside if (TARGET_32BIT). Using thumb specific 
patterns in ARM mode is just plain wrong.

Even ignoring that, I don't think this is the right fix. Why do we need this 
for Thumb mode, and not ARM mode? Why extendqi and not extendhi?
Previous toolchains got this right, and still get this right in ARM mode. I 
suspect you're actually papering over a different bug. 

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-03 13:09       ` Paul Brook
@ 2010-11-07 14:21         ` Carrot Wei
  2010-11-08 12:02           ` Paul Brook
  0 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-11-07 14:21 UTC (permalink / raw)
  To: Paul Brook; +Cc: gcc-patches, Richard Earnshaw

On Wed, Nov 3, 2010 at 9:04 PM, Paul Brook <paul@codesourcery.com> wrote:
>> > +             operands[1] = convert_to_mode (QImode, operands[1], 1);
>> > +             emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
>> > +                                                        operands[1]));
>> > +           }
>> >
>> >          }
>>
>> This is OK.
>
> No it's not. This code is inside if (TARGET_32BIT). Using thumb specific
> patterns in ARM mode is just plain wrong.
>
You are right. Following is additional patch to fix it. Tested with
regression testing in qemu.

> Even ignoring that, I don't think this is the right fix. Why do we need this
> for Thumb mode, and not ARM mode? Why extendqi and not extendhi?

In Thumb mode, and with a constant is 32 bit, and uxtb is 16 bit.
In ARM mode, both are 32 bit, so it is not beneficial.
The test case shows that and with 0xFFFF is already converted to uxth,
so it is not handled here.


ChangeLog:
2010-11-07  Wei Guozhi  <carrot@google.com>

        * config/arm/arm.md (andsi3): Call gen_thumb2_zero_extendqisi2_v6 only
        when the target is TARGET_THUMB2.


Index: arm.md
===================================================================
--- arm.md	(revision 165492)
+++ arm.md	(working copy)
@@ -2015,7 +2015,7 @@
     {
       if (GET_CODE (operands[2]) == CONST_INT)
         {
-	  if (INTVAL (operands[2]) == 255 && arm_arch6)
+	  if (INTVAL (operands[2]) == 255 && TARGET_THUMB2 && arm_arch6)
 	    {
 	      operands[1] = convert_to_mode (QImode, operands[1], 1);
 	      emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-07 14:21         ` Carrot Wei
@ 2010-11-08 12:02           ` Paul Brook
  2010-11-09  7:17             ` Carrot Wei
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Brook @ 2010-11-08 12:02 UTC (permalink / raw)
  To: Carrot Wei; +Cc: gcc-patches, Richard Earnshaw

> > Even ignoring that, I don't think this is the right fix. Why do we need
> > this for Thumb mode, and not ARM mode? Why extendqi and not extendhi?
> 
> In Thumb mode, and with a constant is 32 bit, and uxtb is 16 bit.
> In ARM mode, both are 32 bit, so it is not beneficial.
> The test case shows that and with 0xFFFF is already converted to uxth,
> so it is not handled here.

I'm still not convinced. Why is HImode different to QImode? Why has this 
suddenly changed? Why does gcc generate uxtb in ARM mode but not Thumb mode? 
i.e. I've little confidence that this is actually fixing the problem rather than 
papering over a bug elsewhere, especially given we have a perfectly good 
zero_extendqisi2 pattern+expander.

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-08 12:02           ` Paul Brook
@ 2010-11-09  7:17             ` Carrot Wei
  2010-11-09 11:19               ` Paul Brook
  0 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-11-09  7:17 UTC (permalink / raw)
  To: Paul Brook; +Cc: gcc-patches, Richard Earnshaw

1. Why is HImode different to QImode?

Actually it is not HImode VS QImode, it is 65535 VS 255. The actual
processing of and with constant is handled by function
arm_gen_constant. In this function it finds that 255 can be directly
encoded in an and instruction, so it generates a simple and
instruction. On the other hand, 65535 can't be  directly encoded in an
and instruction, it must find some other ways to handle it, then it
find this and can be accomplished with two shifts. In the later
combine pass, the two shifts are combined into a single uxth.

2. Why do we need it for Thumb mode, and not for ARM mode?

In thumb mode, instruction and with constant is 32 bit, uxtb is 16
bit, with this enhancement we can save 2 bytes.
In ARM mode, all instructions are 32 bit, convert and to uxtb doesn't
help us, so we don't care which instruction is used.
So we need it for Thumb mode only.

thanks
Carrot

On Mon, Nov 8, 2010 at 7:55 PM, Paul Brook <paul@codesourcery.com> wrote:
>> > Even ignoring that, I don't think this is the right fix. Why do we need
>> > this for Thumb mode, and not ARM mode? Why extendqi and not extendhi?
>>
>> In Thumb mode, and with a constant is 32 bit, and uxtb is 16 bit.
>> In ARM mode, both are 32 bit, so it is not beneficial.
>> The test case shows that and with 0xFFFF is already converted to uxth,
>> so it is not handled here.
>
> I'm still not convinced. Why is HImode different to QImode? Why has this
> suddenly changed? Why does gcc generate uxtb in ARM mode but not Thumb mode?
> i.e. I've little confidence that this is actually fixing the problem rather than
> papering over a bug elsewhere, especially given we have a perfectly good
> zero_extendqisi2 pattern+expander.
>
> Paul
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-09  7:17             ` Carrot Wei
@ 2010-11-09 11:19               ` Paul Brook
  2010-11-16  5:04                 ` Carrot Wei
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Brook @ 2010-11-09 11:19 UTC (permalink / raw)
  To: Carrot Wei; +Cc: gcc-patches, Richard Earnshaw

> 2. Why do we need it for Thumb mode, and not for ARM mode?
> 
> In thumb mode, instruction and with constant is 32 bit, uxtb is 16
> bit, with this enhancement we can save 2 bytes.
> In ARM mode, all instructions are 32 bit, convert and to uxtb doesn't
> help us, so we don't care which instruction is used.
> So we need it for Thumb mode only.

No. Look at the output.
In ARM mode we generate uxtb without your patch.
In thumb mode gcc 4.5 generated uxtb, so this is a recent regression. What 
caused this regression?

The fact that this used to work (without andsi expander hacks), and still 
works in ARM mode, suggests there's something you're missing. Given there's 
clearly code somewhere capable of dong this transformation, it seems 
surprising that it's not already undoing you expander hack, and may do so in 
the future.

Paul

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-09 11:19               ` Paul Brook
@ 2010-11-16  5:04                 ` Carrot Wei
  2010-11-16  5:57                   ` Paolo Bonzini
  0 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-11-16  5:04 UTC (permalink / raw)
  To: Paul Brook; +Cc: gcc-patches, Richard Earnshaw

On Tue, Nov 9, 2010 at 3:16 AM, Paul Brook <paul@codesourcery.com> wrote:
>> 2. Why do we need it for Thumb mode, and not for ARM mode?
>>
>> In thumb mode, instruction and with constant is 32 bit, uxtb is 16
>> bit, with this enhancement we can save 2 bytes.
>> In ARM mode, all instructions are 32 bit, convert and to uxtb doesn't
>> help us, so we don't care which instruction is used.
>> So we need it for Thumb mode only.
>
> No. Look at the output.
> In ARM mode we generate uxtb without your patch.
> In thumb mode gcc 4.5 generated uxtb, so this is a recent regression. What
> caused this regression?
>
I've built a new gcc 4.5 compiler, it actually can't generate uxtb in
thumb mode. I've also tried the compiler for each month from 2009 Jan
to now, all of them didn't generate uxtb in thumb mode. And all of
them generated uxtb in arm mode.

> The fact that this used to work (without andsi expander hacks), and still
> works in ARM mode, suggests there's something you're missing. Given there's
> clearly code somewhere capable of dong this transformation, it seems
> surprising that it's not already undoing you expander hack, and may do so in
> the future.
>

For arm mode, the rtl before combine is:

(insn 2 5 3 2 src/tp.c:2 (set (reg/v:SI 134 [ x ])
        (reg:SI 0 r0 [ x ])) 167 {*arm_movsi_insn} (expr_list:REG_DEAD
(reg:SI 0 r0 [ x ])
        (nil)))

(insn 3 2 4 2 src/tp.c:2 (set (reg/v:SI 135 [ y ])
        (reg:SI 1 r1 [ y ])) 167 {*arm_movsi_insn} (expr_list:REG_DEAD
(reg:SI 1 r1 [ y ])
        (nil)))

(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)

(insn 7 4 8 2 src/tp.c:2 (set (reg:SI 137)
        (and:SI (reg/v:SI 134 [ x ])
            (const_int 255 [0xff]))) 67 {*arm_andsi3_insn}
(expr_list:REG_DEAD (reg/v:SI 134 [ x ])
        (nil)))

(insn 8 7 9 2 src/tp.c:2 (set (reg:SI 139)
        (const_int 65535 [0xffff])) 167 {*arm_movsi_insn} (nil))

(insn 9 8 10 2 src/tp.c:2 (set (reg:SI 138)
        (and:SI (reg/v:SI 135 [ y ])
            (reg:SI 139))) 67 {*arm_andsi3_insn} (expr_list:REG_DEAD
(reg:SI 139)
        (expr_list:REG_DEAD (reg/v:SI 135 [ y ])
            (expr_list:REG_EQUAL (and:SI (reg/v:SI 135 [ y ])
                    (const_int 65535 [0xffff]))
                (nil)))))

...

At combine pass, insn 2 and insn 7 are combined into a single unsigned
extend instruction, similarly insns 3,8,9 are combined into another
unsigned extend instruction, the result is:

(insn 7 4 8 2 src/tp.c:2 (set (reg:SI 137)
        (zero_extend:SI (reg:QI 0 r0 [ x ]))) 149
{*arm_zero_extendqisi2_v6} (expr_list:REG_DEAD (reg:SI 0 r0 [ x ])
        (nil)))

(note 8 7 9 2 NOTE_INSN_DELETED)

(insn 9 8 10 2 src/tp.c:2 (set (reg:SI 138)
        (zero_extend:SI (reg:HI 1 r1 [ y ]))) 144
{*arm_zero_extendhisi2_v6} (expr_list:REG_DEAD (reg:SI 1 r1 [ y ])
        (nil)))
...

In thumb mode, insn 2 and 7 are failed to be combined because function
cant_combine_insn_p returns true for insn 2, which is caused by
CLASS_LIKELY_SPILLED_P returns true for register r0. The
implementation of CLASS_LIKELY_SPILLED_P is

static bool
arm_class_likely_spilled_p (reg_class_t rclass)
{
  if ((TARGET_THUMB && rclass == LO_REGS)
      || rclass  == CC_REG)
    return true;

  return false;
}

The problem is the condition (TARGET_THUMB && rclass == LO_REGS), in
thumb2 we can use the same number of registers as arm mode, so it
should not be likely spilled. The condition should be changed to

    (TARGET_THUMB1 && rclass == LO_REGS)

After this change, gcc now can generate uxtb instruction.

But it is still not general enough because the combine needs two or
more instructions, so it actually depends on the existence of the
parameter

(insn 2 5 3 2 src/tp.c:2 (set (reg/v:SI 134 [ x ])
        (reg:SI 0 r0 [ x ])) 167 {*arm_movsi_insn} (expr_list:REG_DEAD
(reg:SI 0 r0 [ x ])
        (nil)))

If I modify the test case a little

int tp(int x, int y)
{
  return ((x + 3) & 0xff) - (y & 0xffff);
}

The compiler always generates "and" instruction for both arm and thumb
modes. But the enhanced expander can still handle this case.

thanks
Carrot

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-16  5:04                 ` Carrot Wei
@ 2010-11-16  5:57                   ` Paolo Bonzini
  2010-11-19  3:46                     ` Carrot Wei
  0 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2010-11-16  5:57 UTC (permalink / raw)
  To: Carrot Wei; +Cc: Paul Brook, gcc-patches, Richard Earnshaw

On 11/16/2010 02:29 AM, Carrot Wei wrote:
> The compiler always generates "and" instruction for both arm and thumb
> modes. But the enhanced expander can still handle this case.

If anything, a splitter seems preferrable to hacking the expander. 
Within an expander you cannot be sure that passes such as combine won't 
undo your change.

Paolo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-16  5:57                   ` Paolo Bonzini
@ 2010-11-19  3:46                     ` Carrot Wei
  2010-12-02  9:07                       ` Richard Earnshaw
  0 siblings, 1 reply; 22+ messages in thread
From: Carrot Wei @ 2010-11-19  3:46 UTC (permalink / raw)
  To: Paolo Bonzini, Paul Brook, gcc-patches, Richard Earnshaw

Hi

This patch reverts my last patch and implements the enhancement as a
new split rule. Additionally it modifies arm_class_likely_spilled_p so
that thumb2 registers are not likely spilled.

Testing has been run on qemu without regression.

thanks
Guozhi

ChangeLog:
2010-11-18  Wei Guozhi  <carrot@google.com>

        PR target/44999
        * config/arm/arm.md (andsi3): Revert it.
        * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Revert it.
        (split andsi3): New split to convert and with 0xFF to uxtb.
        * config/arm/arm.c (arm_class_likely_spilled_p): Remove thumb2 from the
        likely spill cases.

ChangeLog:
2010-11-18  Wei Guozhi  <carrot@google.com>

        PR target/44999
        * gcc.target/arm/pr44999.c: Update it to more general.


Index: thumb2.md
===================================================================
--- thumb2.md	(revision 165462)
+++ thumb2.md	(working copy)
@@ -585,7 +585,7 @@
    (set_attr "neg_pool_range" "*,250")]
 )

-(define_insn "thumb2_zero_extendqisi2_v6"
+(define_insn "*thumb2_zero_extendqisi2_v6"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
   "TARGET_THUMB2 && arm_arch6"
@@ -1118,3 +1118,16 @@
   "
   operands[2] = GEN_INT (32 - INTVAL (operands[2]));
   ")
+
+(define_split
+  [(set (match_operand:SI 0 "s_register_operand" "")
+	(and:SI (match_operand:SI 1 "s_register_operand" "")
+		(match_operand:SI 2 "const_int_operand" "")))]
+  "TARGET_THUMB2 && arm_arch6 && INTVAL (operands[2]) == 255"
+  [(set (match_dup 0)
+	(zero_extend:SI (match_dup 1)))]
+  "
+  operands[1] = convert_to_mode (QImode, operands[1], 1);
+  "
+)
+
Index: arm.c
===================================================================
--- arm.c	(revision 165462)
+++ arm.c	(working copy)
@@ -22046,7 +22046,7 @@ arm_preferred_simd_mode (enum machine_mo
 static bool
 arm_class_likely_spilled_p (reg_class_t rclass)
 {
-  if ((TARGET_THUMB && rclass == LO_REGS)
+  if ((TARGET_THUMB1 && rclass == LO_REGS)
       || rclass  == CC_REG)
     return true;

Index: arm.md
===================================================================
--- arm.md	(revision 165462)
+++ arm.md	(working copy)
@@ -2015,17 +2015,9 @@
     {
       if (GET_CODE (operands[2]) == CONST_INT)
         {
-	  if (INTVAL (operands[2]) == 255 && arm_arch6)
-	    {
-	      operands[1] = convert_to_mode (QImode, operands[1], 1);
-	      emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
-							 operands[1]));
-	    }
-	  else
-	    arm_split_constant (AND, SImode, NULL_RTX,
-				INTVAL (operands[2]), operands[0],
-				operands[1],
-				optimize && can_create_pseudo_p ());
+	  arm_split_constant (AND, SImode, NULL_RTX,
+			      INTVAL (operands[2]), operands[0],
+			      operands[1], optimize && can_create_pseudo_p ());

           DONE;
         }

Index: pr44999.c
===================================================================
--- pr44999.c	(revision 165462)
+++ pr44999.c	(working copy)
@@ -5,5 +5,5 @@

 int tp(int x, int y)
 {
-  return (x & 0xff) - (y & 0xffff);
+  return ((x+3) & 0xff) - (y & 0xffff);
 }


On Mon, Nov 15, 2010 at 6:07 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 11/16/2010 02:29 AM, Carrot Wei wrote:
>>
>> The compiler always generates "and" instruction for both arm and thumb
>> modes. But the enhanced expander can still handle this case.
>
> If anything, a splitter seems preferrable to hacking the expander. Within an
> expander you cannot be sure that passes such as combine won't undo your
> change.
>
> Paolo
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-11-19  3:46                     ` Carrot Wei
@ 2010-12-02  9:07                       ` Richard Earnshaw
  2010-12-03 23:08                         ` Carrot Wei
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Earnshaw @ 2010-12-02  9:07 UTC (permalink / raw)
  To: Carrot Wei; +Cc: Paolo Bonzini, Paul Brook, gcc-patches, Richard Earnshaw

Please don't try to mix patches for different things. Firstly it makes reviewing harder.  Secondly if there's a problem found after the patch is committed the all the changes may get revered rather than just the change that caused the problem. If one part of a patch is dependent on a separate change then please say so in your submission. 

Now on to the details. 

Uxtb and uxth may generate smaller code, but on cortex-a9 they take longer to execute than AND.  So we most likely only want to do this if either optimising for size or if we can't hoist a constant out of a loop.

What's the justification for the class_likely_spilled change?  I can't immediately see why this constraint is any less likely to be true on thumb2 than on thumb1, since the contents of the class is the same (getting this wrong can cause ICEs). 

R. 





On 19 Nov 2010, at 02:40, "Carrot Wei" <carrot@google.com> wrote:

> Hi
> 
> This patch reverts my last patch and implements the enhancement as a
> new split rule. Additionally it modifies arm_class_likely_spilled_p so
> that thumb2 registers are not likely spilled.
> 
> Testing has been run on qemu without regression.
> 
> thanks
> Guozhi
> 
> ChangeLog:
> 2010-11-18  Wei Guozhi  <carrot@google.com>
> 
>        PR target/44999
>        * config/arm/arm.md (andsi3): Revert it.
>        * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Revert it.
>        (split andsi3): New split to convert and with 0xFF to uxtb.
>        * config/arm/arm.c (arm_class_likely_spilled_p): Remove thumb2 from the
>        likely spill cases.
> 
> ChangeLog:
> 2010-11-18  Wei Guozhi  <carrot@google.com>
> 
>        PR target/44999
>        * gcc.target/arm/pr44999.c: Update it to more general.
> 
> 
> Index: thumb2.md
> ===================================================================
> --- thumb2.md    (revision 165462)
> +++ thumb2.md    (working copy)
> @@ -585,7 +585,7 @@
>    (set_attr "neg_pool_range" "*,250")]
> )
> 
> -(define_insn "thumb2_zero_extendqisi2_v6"
> +(define_insn "*thumb2_zero_extendqisi2_v6"
>   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>    (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
>   "TARGET_THUMB2 && arm_arch6"
> @@ -1118,3 +1118,16 @@
>   "
>   operands[2] = GEN_INT (32 - INTVAL (operands[2]));
>   ")
> +
> +(define_split
> +  [(set (match_operand:SI 0 "s_register_operand" "")
> +    (and:SI (match_operand:SI 1 "s_register_operand" "")
> +        (match_operand:SI 2 "const_int_operand" "")))]
> +  "TARGET_THUMB2 && arm_arch6 && INTVAL (operands[2]) == 255"
> +  [(set (match_dup 0)
> +    (zero_extend:SI (match_dup 1)))]
> +  "
> +  operands[1] = convert_to_mode (QImode, operands[1], 1);
> +  "
> +)
> +
> Index: arm.c
> ===================================================================
> --- arm.c    (revision 165462)
> +++ arm.c    (working copy)
> @@ -22046,7 +22046,7 @@ arm_preferred_simd_mode (enum machine_mo
> static bool
> arm_class_likely_spilled_p (reg_class_t rclass)
> {
> -  if ((TARGET_THUMB && rclass == LO_REGS)
> +  if ((TARGET_THUMB1 && rclass == LO_REGS)
>       || rclass  == CC_REG)
>     return true;
> 
> Index: arm.md
> ===================================================================
> --- arm.md    (revision 165462)
> +++ arm.md    (working copy)
> @@ -2015,17 +2015,9 @@
>     {
>       if (GET_CODE (operands[2]) == CONST_INT)
>         {
> -      if (INTVAL (operands[2]) == 255 && arm_arch6)
> -        {
> -          operands[1] = convert_to_mode (QImode, operands[1], 1);
> -          emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
> -                             operands[1]));
> -        }
> -      else
> -        arm_split_constant (AND, SImode, NULL_RTX,
> -                INTVAL (operands[2]), operands[0],
> -                operands[1],
> -                optimize && can_create_pseudo_p ());
> +      arm_split_constant (AND, SImode, NULL_RTX,
> +                  INTVAL (operands[2]), operands[0],
> +                  operands[1], optimize && can_create_pseudo_p ());
> 
>           DONE;
>         }
> 
> Index: pr44999.c
> ===================================================================
> --- pr44999.c    (revision 165462)
> +++ pr44999.c    (working copy)
> @@ -5,5 +5,5 @@
> 
> int tp(int x, int y)
> {
> -  return (x & 0xff) - (y & 0xffff);
> +  return ((x+3) & 0xff) - (y & 0xffff);
> }
> 
> 
> On Mon, Nov 15, 2010 at 6:07 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
>> On 11/16/2010 02:29 AM, Carrot Wei wrote:
>>> 
>>> The compiler always generates "and" instruction for both arm and thumb
>>> modes. But the enhanced expander can still handle this case.
>> 
>> If anything, a splitter seems preferrable to hacking the expander. Within an
>> expander you cannot be sure that passes such as combine won't undo your
>> change.
>> 
>> Paolo
>> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2
  2010-12-02  9:07                       ` Richard Earnshaw
@ 2010-12-03 23:08                         ` Carrot Wei
  0 siblings, 0 replies; 22+ messages in thread
From: Carrot Wei @ 2010-12-03 23:08 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: Paolo Bonzini, Paul Brook, gcc-patches, Richard Earnshaw

On Wed, Dec 1, 2010 at 4:19 PM, Richard Earnshaw
<richard.earnshaw@buzzard.freeserve.co.uk> wrote:
> Please don't try to mix patches for different things. Firstly it makes reviewing harder.  Secondly if there's a problem found after the patch is committed the all the changes may get revered rather than just the change that caused the problem. If one part of a patch is dependent on a separate change then please say so in your submission.
>
> Now on to the details.
>
> Uxtb and uxth may generate smaller code, but on cortex-a9 they take longer to execute than AND.  So we most likely only want to do this if either optimising for size or if we can't hoist a constant out of a loop.
>
Added the condition checking.

> What's the justification for the class_likely_spilled change?  I can't immediately see why this constraint is any less likely to be true on thumb2 than on thumb1, since the contents of the class is the same (getting this wrong can cause ICEs).
>
On thumb1 only low registers can be used in normal alu instructions,
but on thumb2 nearly all registers can be used in normal alu
instructions. The register usage on thumb2 is more like that on arm
instruction set. So I think it should also have the same
class_likely_spilled behavior as arm. But anyway I removed it from the
patch.

thanks
Guozhi


ChangeLog:
2010-12-03  Wei Guozhi  <carrot@google.com>

        PR target/44999
        * config/arm/arm.md (andsi3): Revert it.
        * config/arm/thumb2.md (thumb2_zero_extendqisi2_v6): Revert it.
        (split andsi3): New split to convert and with 0xFF to uxtb.

ChangeLog:
2010-12-03  Wei Guozhi  <carrot@google.com>

        PR target/44999
        * gcc.target/arm/pr44999.c: Update it to more general.

Index: thumb2.md
===================================================================
--- thumb2.md	(revision 165462)
+++ thumb2.md	(working copy)
@@ -585,7 +585,7 @@
    (set_attr "neg_pool_range" "*,250")]
 )

-(define_insn "thumb2_zero_extendqisi2_v6"
+(define_insn "*thumb2_zero_extendqisi2_v6"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
   "TARGET_THUMB2 && arm_arch6"
@@ -1118,3 +1118,17 @@
   "
   operands[2] = GEN_INT (32 - INTVAL (operands[2]));
   ")
+
+(define_split
+  [(set (match_operand:SI 0 "s_register_operand" "")
+	(and:SI (match_operand:SI 1 "s_register_operand" "")
+		(match_operand:SI 2 "const_int_operand" "")))]
+  "TARGET_THUMB2 && arm_arch6 && INTVAL (operands[2]) == 255
+   && (optimize_size || !arm_tune_cortex_a9)"
+  [(set (match_dup 0)
+	(zero_extend:SI (match_dup 1)))]
+  "
+  operands[1] = convert_to_mode (QImode, operands[1], 1);
+  "
+)
+
Index: arm.md
===================================================================
--- arm.md	(revision 165462)
+++ arm.md	(working copy)
@@ -2015,17 +2015,9 @@
     {
       if (GET_CODE (operands[2]) == CONST_INT)
         {
-	  if (INTVAL (operands[2]) == 255 && arm_arch6)
-	    {
-	      operands[1] = convert_to_mode (QImode, operands[1], 1);
-	      emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
-							 operands[1]));
-	    }
-	  else
-	    arm_split_constant (AND, SImode, NULL_RTX,
-				INTVAL (operands[2]), operands[0],
-				operands[1],
-				optimize && can_create_pseudo_p ());
+	  arm_split_constant (AND, SImode, NULL_RTX,
+			      INTVAL (operands[2]), operands[0],
+			      operands[1], optimize && can_create_pseudo_p ());

           DONE;
         }

Index: pr44999.c
===================================================================
--- pr44999.c	(revision 165462)
+++ pr44999.c	(working copy)
@@ -5,5 +5,5 @@

 int tp(int x, int y)
 {
-  return (x & 0xff) - (y & 0xffff);
+  return ((x+3) & 0xff) - (y & 0xffff);
 }

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2010-12-03 23:08 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-22  7:11 [PATCH: PR target/44999] Replace "and r0, r0, #255" with uxtb in thumb2 Carrot Wei
2010-07-26  6:40 ` Carrot Wei
2010-07-26  7:23   ` Paolo Bonzini
2010-07-26  9:09     ` Carrot Wei
2010-07-26  9:22       ` Paolo Bonzini
2010-07-30  7:08 ` Carrot Wei
2010-08-09  4:00   ` Carrot Wei
2010-08-10  9:10     ` Nick Clifton
2010-08-10 14:10       ` Richard Earnshaw
2010-08-09 10:38 ` Richard Earnshaw
2010-08-11  1:16   ` Carrot Wei
2010-08-11  8:59     ` Richard Earnshaw
2010-11-03 13:09       ` Paul Brook
2010-11-07 14:21         ` Carrot Wei
2010-11-08 12:02           ` Paul Brook
2010-11-09  7:17             ` Carrot Wei
2010-11-09 11:19               ` Paul Brook
2010-11-16  5:04                 ` Carrot Wei
2010-11-16  5:57                   ` Paolo Bonzini
2010-11-19  3:46                     ` Carrot Wei
2010-12-02  9:07                       ` Richard Earnshaw
2010-12-03 23:08                         ` Carrot Wei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).