* [PATCH] Canonicalize compares in combine [3/3] ARM backend part
@ 2011-04-22 16:23 Chung-Lin Tang
2011-05-09 16:45 ` Ping " Chung-Lin Tang
2011-06-15 13:58 ` Richard Earnshaw
0 siblings, 2 replies; 8+ messages in thread
From: Chung-Lin Tang @ 2011-04-22 16:23 UTC
To: gcc-patches; +Cc: Richard Earnshaw
[-- Attachment #1: Type: text/plain, Size: 729 bytes --]
Hi Richard, this part's for you.
The ARM backend changes needed after the prior patches are very small:
basically just a case in arm_canonicalize_comparison() to detect
(zero_extend:SI (subreg:QI (reg:SI ...) 0)) and rewrite it as (and:SI
(reg:SI) #255).
Had we not tried the combine modifications, this testcase could probably
also have been solved by implementing another version of the corresponding
*andsi3_compare0/_scratch patterns, with ZERO_EXTEND in the body and
"ands" in the output assembly. Maybe that's an acceptable solution too...
About the (ab)use of CANONICALIZE_COMPARISON, if it really should be
another macro/hook, then this ARM patch will need updating, but the code
should be similar.
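The rewrite relies on zero-extending the low byte of a 32-bit register giving the same value as masking that register with 255, so a comparison against zero cannot tell the two forms apart. A minimal sketch of that equivalence in plain C follows. It does not use GCC internals, and the function names are made up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extend:SI (subreg:QI (reg:SI x) <lowpart>)):
   keep the low byte, then widen it back to 32 bits with zeros.  */
static uint32_t
zero_extend_low_byte (uint32_t x)
{
  return (uint32_t) (uint8_t) x;
}

/* Models (and:SI (reg:SI x) (const_int 255)).  */
static uint32_t
and_255 (uint32_t x)
{
  return x & 255u;
}

/* Returns 1 if both forms agree on the value and on the "== 0" test.  */
static int
forms_agree (uint32_t x)
{
  uint32_t a = zero_extend_low_byte (x);
  uint32_t b = and_255 (x);
  return a == b && (a == 0) == (b == 0);
}

/* Spot-check a few values, including ones with high bits set.  */
static void
check_samples (void)
{
  static const uint32_t samples[] = { 0u, 1u, 0xffu, 0x100u, 0xdeadbeefu };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (forms_agree (samples[i]));
}
```

Calling check_samples () from a test driver exercises the equivalence. The point of preferring the AND form is that it can match the existing flag-setting AND patterns, whereas the ZERO_EXTEND form would need patterns of its own.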
Thanks,
Chung-Lin
[-- Attachment #2: 3-arm-parts.diff --]
[-- Type: text/plain, Size: 2028 bytes --]
Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c (revision 172860)
+++ config/arm/arm.c (working copy)
@@ -3276,6 +3276,19 @@
return code;
}
+ /* If *op0 is (zero_extend:SI (subreg:QI (reg:SI) 0)) and comparing
+ with const0_rtx, change it to (and:SI (reg:SI) (const_int 255)),
+ to facilitate possible combining with a cmp into 'ands'. */
+ if (mode == SImode
+ && GET_CODE (*op0) == ZERO_EXTEND
+ && GET_CODE (XEXP (*op0, 0)) == SUBREG
+ && GET_MODE (XEXP (*op0, 0)) == QImode
+ && GET_MODE (SUBREG_REG (XEXP (*op0, 0))) == SImode
+ && SUBREG_BYTE (XEXP (*op0, 0)) == 0
+ && *op1 == const0_rtx)
+ *op0 = gen_rtx_AND (SImode, SUBREG_REG (XEXP (*op0, 0)),
+ GEN_INT (255));
+
/* Comparisons smaller than DImode. Only adjust comparisons against
an out-of-range constant. */
if (GET_CODE (*op1) != CONST_INT
Index: testsuite/gcc.target/arm/combine-movs.c
===================================================================
--- testsuite/gcc.target/arm/combine-movs.c (revision 0)
+++ testsuite/gcc.target/arm/combine-movs.c (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-options "-O" } */
+
+void foo (unsigned long r[], unsigned int d)
+{
+ int i, n = d / 32;
+ for (i = 0; i < n; ++i)
+ r[i] = 0;
+}
+
+/* { dg-final { scan-assembler "movs\tr\[0-9\]" } } */
Index: testsuite/gcc.target/arm/unsigned-extend-2.c
===================================================================
--- testsuite/gcc.target/arm/unsigned-extend-2.c (revision 0)
+++ testsuite/gcc.target/arm/unsigned-extend-2.c (revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=armv6" } */
+
+unsigned short foo (unsigned short x)
+{
+ unsigned char i = 0;
+ for (i = 0; i < 8; i++)
+ {
+ x >>= 1;
+ x &= 0x7fff;
+ }
+ return x;
+}
+
+/* { dg-final { scan-assembler "ands" } } */
+/* { dg-final { scan-assembler-not "uxtb" } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
* Ping Re: [PATCH] Canonicalize compares in combine [3/3] ARM backend part
2011-04-22 16:23 [PATCH] Canonicalize compares in combine [3/3] ARM backend part Chung-Lin Tang
@ 2011-05-09 16:45 ` Chung-Lin Tang
2011-06-02 4:59 ` Chung-Lin Tang
2011-06-15 13:58 ` Richard Earnshaw
1 sibling, 1 reply; 8+ messages in thread
From: Chung-Lin Tang @ 2011-05-09 16:45 UTC
To: gcc-patches; +Cc: Richard Earnshaw
Ping.
* Ping Re: [PATCH] Canonicalize compares in combine [3/3] ARM backend part
2011-05-09 16:45 ` Ping " Chung-Lin Tang
@ 2011-06-02 4:59 ` Chung-Lin Tang
0 siblings, 0 replies; 8+ messages in thread
From: Chung-Lin Tang @ 2011-06-02 4:59 UTC
To: gcc-patches; +Cc: Richard Earnshaw, Ramana Radhakrishnan
Ping.
* Re: [PATCH] Canonicalize compares in combine [3/3] ARM backend part
2011-04-22 16:23 [PATCH] Canonicalize compares in combine [3/3] ARM backend part Chung-Lin Tang
2011-05-09 16:45 ` Ping " Chung-Lin Tang
@ 2011-06-15 13:58 ` Richard Earnshaw
2011-07-18 8:32 ` Chung-Lin Tang
1 sibling, 1 reply; 8+ messages in thread
From: Richard Earnshaw @ 2011-06-15 13:58 UTC
To: Chung-Lin Tang; +Cc: gcc-patches
On 22/04/11 16:21, Chung-Lin Tang wrote:
> Hi Richard, this part's for you.
>
> The ARM backend changes needed after the prior patches are very small:
> basically just a case in arm_canonicalize_comparison() to detect
> (zero_extend:SI (subreg:QI (reg:SI ...) 0)) and rewrite it as (and:SI
> (reg:SI) #255).
>
> Had we not tried the combine modifications, this testcase could probably
> also have been solved by implementing another version of the corresponding
> *andsi3_compare0/_scratch patterns, with ZERO_EXTEND in the body and
> "ands" in the output assembly. Maybe that's an acceptable solution too...
>
> About the (ab)use of CANONICALIZE_COMPARISON, if it really should be
> another macro/hook, then this ARM patch will need updating, but the code
> should be similar.
>
> Thanks,
> Chung-Lin
>
>
> 3-arm-parts.diff
>
>
> Index: config/arm/arm.c
> ===================================================================
> --- config/arm/arm.c (revision 172860)
> +++ config/arm/arm.c (working copy)
> @@ -3276,6 +3276,19 @@
> return code;
> }
>
> + /* If *op0 is (zero_extend:SI (subreg:QI (reg:SI) 0)) and comparing
> + with const0_rtx, change it to (and:SI (reg:SI) (const_int 255)),
> + to facilitate possible combining with a cmp into 'ands'. */
> + if (mode == SImode
> + && GET_CODE (*op0) == ZERO_EXTEND
> + && GET_CODE (XEXP (*op0, 0)) == SUBREG
> + && GET_MODE (XEXP (*op0, 0)) == QImode
> + && GET_MODE (SUBREG_REG (XEXP (*op0, 0))) == SImode
> + && SUBREG_BYTE (XEXP (*op0, 0)) == 0
> + && *op1 == const0_rtx)
> + *op0 = gen_rtx_AND (SImode, SUBREG_REG (XEXP (*op0, 0)),
> + GEN_INT (255));
> +
This is wrong for big-endian code. You should use subreg_lowpart_p to
check the subreg expression (after you've checked that you do have a
subreg, of course).
R.
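The endianness problem comes from SUBREG_BYTE being a byte offset in memory order: the least significant byte of a 4-byte value sits at offset 0 on little-endian targets but at offset 3 on big-endian ones. The sketch below is a simplified model of that rule in plain C, not the GCC implementation. The helper names are hypothetical, and the real subreg_lowpart_p also has to handle cases this model ignores (for example, targets where word and byte ordering differ).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: byte offset of the least significant INNER_SIZE
   bytes inside an OUTER_SIZE-byte value, counted in memory order.  */
static unsigned
lowpart_byte_offset (unsigned outer_size, unsigned inner_size,
                     bool big_endian)
{
  assert (inner_size <= outer_size);
  return big_endian ? outer_size - inner_size : 0;
}

/* The check as first posted: accepts only offset 0, which is the low
   part on little-endian targets only.  */
static bool
is_lowpart_naive (unsigned offset)
{
  return offset == 0;
}

/* Endian-aware check, in the spirit of subreg_lowpart_p.  */
static bool
is_lowpart (unsigned outer_size, unsigned inner_size, unsigned offset,
            bool big_endian)
{
  return offset == lowpart_byte_offset (outer_size, inner_size, big_endian);
}
```

For a QImode piece of an SImode register on a big-endian target, is_lowpart_naive (0) still returns true, although offset 0 names the most significant byte there; is_lowpart (4, 1, 0, true) rejects it and is_lowpart (4, 1, 3, true) accepts the real low byte.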
* Re: [PATCH] Canonicalize compares in combine [3/3] ARM backend part
2011-06-15 13:58 ` Richard Earnshaw
@ 2011-07-18 8:32 ` Chung-Lin Tang
0 siblings, 0 replies; 8+ messages in thread
From: Chung-Lin Tang @ 2011-07-18 8:32 UTC
To: Richard Earnshaw; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2236 bytes --]
On 2011/6/15 09:12 PM, Richard Earnshaw wrote:
> On 22/04/11 16:21, Chung-Lin Tang wrote:
>> Hi Richard, this part's for you.
>>
>> The ARM backend changes needed after the prior patches are very small:
>> basically just a case in arm_canonicalize_comparison() to detect
>> (zero_extend:SI (subreg:QI (reg:SI ...) 0)) and rewrite it as (and:SI
>> (reg:SI) #255).
>>
>> Had we not tried the combine modifications, this testcase could probably
>> also have been solved by implementing another version of the corresponding
>> *andsi3_compare0/_scratch patterns, with ZERO_EXTEND in the body and
>> "ands" in the output assembly. Maybe that's an acceptable solution too...
>>
>> About the (ab)use of CANONICALIZE_COMPARISON, if it really should be
>> another macro/hook, then this ARM patch will need updating, but the code
>> should be similar.
>>
>> Thanks,
>> Chung-Lin
>>
>>
>> 3-arm-parts.diff
>>
>>
>> Index: config/arm/arm.c
>> ===================================================================
>> --- config/arm/arm.c (revision 172860)
>> +++ config/arm/arm.c (working copy)
>> @@ -3276,6 +3276,19 @@
>> return code;
>> }
>>
>> + /* If *op0 is (zero_extend:SI (subreg:QI (reg:SI) 0)) and comparing
>> + with const0_rtx, change it to (and:SI (reg:SI) (const_int 255)),
>> + to facilitate possible combining with a cmp into 'ands'. */
>> + if (mode == SImode
>> + && GET_CODE (*op0) == ZERO_EXTEND
>> + && GET_CODE (XEXP (*op0, 0)) == SUBREG
>> + && GET_MODE (XEXP (*op0, 0)) == QImode
>> + && GET_MODE (SUBREG_REG (XEXP (*op0, 0))) == SImode
>> + && SUBREG_BYTE (XEXP (*op0, 0)) == 0
>> + && *op1 == const0_rtx)
>> + *op0 = gen_rtx_AND (SImode, SUBREG_REG (XEXP (*op0, 0)),
>> + GEN_INT (255));
>> +
>
> This is wrong for big-endian code. You should use subreg_lowpart_p to
> check the subreg expression (after you've checked that you do have a
> subreg, of course).
>
> R.
>
Hi Richard, thanks for catching that. I've updated the patch, and
cross-tested again under both arm/armeb-Linux.
I changed the testcase to use -march=armv6t2 instead of armv6, as the
latter makes the testcase FAIL when configured as --with-mode=thumb.
Is this now okay?
Thanks,
Chung-Lin
[-- Attachment #2: uxtb-cmp.diff --]
[-- Type: text/plain, Size: 2055 bytes --]
Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c (revision 176385)
+++ config/arm/arm.c (working copy)
@@ -3172,6 +3172,19 @@
return code;
}
+ /* If *op0 is (zero_extend:SI (subreg:QI (reg:SI) 0)) and comparing
+ with const0_rtx, change it to (and:SI (reg:SI) (const_int 255)),
+ to facilitate possible combining with a cmp into 'ands'. */
+ if (mode == SImode
+ && GET_CODE (*op0) == ZERO_EXTEND
+ && GET_CODE (XEXP (*op0, 0)) == SUBREG
+ && GET_MODE (XEXP (*op0, 0)) == QImode
+ && GET_MODE (SUBREG_REG (XEXP (*op0, 0))) == SImode
+ && subreg_lowpart_p (XEXP (*op0, 0))
+ && *op1 == const0_rtx)
+ *op0 = gen_rtx_AND (SImode, SUBREG_REG (XEXP (*op0, 0)),
+ GEN_INT (255));
+
/* Comparisons smaller than DImode. Only adjust comparisons against
an out-of-range constant. */
if (GET_CODE (*op1) != CONST_INT
Index: testsuite/gcc.target/arm/combine-movs.c
===================================================================
--- testsuite/gcc.target/arm/combine-movs.c (revision 0)
+++ testsuite/gcc.target/arm/combine-movs.c (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+void foo (unsigned long r[], unsigned int d)
+{
+ int i, n = d / 32;
+ for (i = 0; i < n; ++i)
+ r[i] = 0;
+}
+
+/* { dg-final { scan-assembler "movs\tr\[0-9\]" } } */
Index: testsuite/gcc.target/arm/unsigned-extend-2.c
===================================================================
--- testsuite/gcc.target/arm/unsigned-extend-2.c (revision 0)
+++ testsuite/gcc.target/arm/unsigned-extend-2.c (revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=armv6t2" } */
+
+unsigned short foo (unsigned short x)
+{
+ unsigned char i = 0;
+ for (i = 0; i < 8; i++)
+ {
+ x >>= 1;
+ x &= 0x7fff;
+ }
+ return x;
+}
+
+/* { dg-final { scan-assembler "ands" } } */
+/* { dg-final { scan-assembler-not "uxtb" } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
* Re: [PATCH] Canonicalize compares in combine [3/3] ARM backend part
@ 2011-07-18 9:37 Richard Earnshaw
2011-07-18 14:30 ` Chung-Lin Tang
0 siblings, 1 reply; 8+ messages in thread
From: Richard Earnshaw @ 2011-07-18 9:37 UTC
To: Chung-Lin Tang; +Cc: gcc-patches
On 18 Jul 2011, at 07:15, "Chung-Lin Tang" <cltang@codesourcery.com> wrote:
> On 2011/6/15 09:12 PM, Richard Earnshaw wrote:
>> On 22/04/11 16:21, Chung-Lin Tang wrote:
>>> Hi Richard, this part's for you.
>>>
>>> The ARM backend changes needed after the prior patches are very small:
>>> basically just a case in arm_canonicalize_comparison() to detect
>>> (zero_extend:SI (subreg:QI (reg:SI ...) 0)) and rewrite it as (and:SI
>>> (reg:SI) #255).
>>>
>>> Had we not tried the combine modifications, this testcase could probably
>>> also have been solved by implementing another version of the corresponding
>>> *andsi3_compare0/_scratch patterns, with ZERO_EXTEND in the body and
>>> "ands" in the output assembly. Maybe that's an acceptable solution too...
>>>
>>> About the (ab)use of CANONICALIZE_COMPARISON, if it really should be
>>> another macro/hook, then this ARM patch will need updating, but the code
>>> should be similar.
>>>
>>> Thanks,
>>> Chung-Lin
>>>
>>>
>>> 3-arm-parts.diff
>>>
>>>
>>> Index: config/arm/arm.c
>>> ===================================================================
>>> --- config/arm/arm.c (revision 172860)
>>> +++ config/arm/arm.c (working copy)
>>> @@ -3276,6 +3276,19 @@
>>> return code;
>>> }
>>>
>>> + /* If *op0 is (zero_extend:SI (subreg:QI (reg:SI) 0)) and comparing
>>> + with const0_rtx, change it to (and:SI (reg:SI) (const_int 255)),
>>> + to facilitate possible combining with a cmp into 'ands'. */
>>> + if (mode == SImode
>>> + && GET_CODE (*op0) == ZERO_EXTEND
>>> + && GET_CODE (XEXP (*op0, 0)) == SUBREG
>>> + && GET_MODE (XEXP (*op0, 0)) == QImode
>>> + && GET_MODE (SUBREG_REG (XEXP (*op0, 0))) == SImode
>>> + && SUBREG_BYTE (XEXP (*op0, 0)) == 0
>>> + && *op1 == const0_rtx)
>>> + *op0 = gen_rtx_AND (SImode, SUBREG_REG (XEXP (*op0, 0)),
>>> + GEN_INT (255));
>>> +
>>
>> This is wrong for big-endian code. You should use subreg_lowpart_p to
>> check the subreg expression (after you've checked that you do have a
>> subreg, of course).
>>
>> R.
>>
>
> Hi Richard, thanks for catching that. I've updated the patch, and
> cross-tested again under both arm/armeb-Linux.
>
> I changed the testcase to use -march=armv6t2 instead of armv6, as the
> latter makes the testcase FAIL when configured as --with-mode=thumb.
>
> Is this now okay?
>
The patch to arm.c is OK, but the change to the test is not, as it will cause problems with multilib testing. A better fix is to skip the test if the target is thumb1.
The other test needs a similar check, as it seems to expect a movs instruction.
R.
> Thanks,
> Chung-Lin
> <uxtb-cmp.diff>
* Re: [PATCH] Canonicalize compares in combine [3/3] ARM backend part
2011-07-18 9:37 Richard Earnshaw
@ 2011-07-18 14:30 ` Chung-Lin Tang
2011-07-18 18:40 ` Richard Earnshaw
0 siblings, 1 reply; 8+ messages in thread
From: Chung-Lin Tang @ 2011-07-18 14:30 UTC
To: Richard Earnshaw; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 586 bytes --]
On 2011/7/18 04:46 PM, Richard Earnshaw wrote:
> The patch to arm.c is OK, but the change to the test is not, as it will cause problems with multilib testing. A better fix is to skip the test if the target is thumb1.
>
> The other test needs a similar check, as it seems to expect a movs instruction.
>
> R.
Yes, it seems more logical to skip for thumb1, at least for the movs one.
For the uxtb test, I think probably using "dg-require-effective-target
arm_thumb2_ok" would be more suitable wrt multilib testing.
Updated patch for the testcase parts, is this okay?
Thanks,
Chung-Lin
[-- Attachment #2: testcase.diff --]
[-- Type: text/plain, Size: 1039 bytes --]
Index: combine-movs.c
===================================================================
--- combine-movs.c (revision 0)
+++ combine-movs.c (revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { arm_thumb1 } } */
+/* { dg-options "-O" } */
+
+void foo (unsigned long r[], unsigned int d)
+{
+ int i, n = d / 32;
+ for (i = 0; i < n; ++i)
+ r[i] = 0;
+}
+
+/* { dg-final { scan-assembler "movs\tr\[0-9\]" } } */
Index: unsigned-extend-2.c
===================================================================
--- unsigned-extend-2.c (revision 0)
+++ unsigned-extend-2.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-O" } */
+
+unsigned short foo (unsigned short x)
+{
+ unsigned char i = 0;
+ for (i = 0; i < 8; i++)
+ {
+ x >>= 1;
+ x &= 0x7fff;
+ }
+ return x;
+}
+
+/* { dg-final { scan-assembler "ands" } } */
+/* { dg-final { scan-assembler-not "uxtb" } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
* Re: [PATCH] Canonicalize compares in combine [3/3] ARM backend part
2011-07-18 14:30 ` Chung-Lin Tang
@ 2011-07-18 18:40 ` Richard Earnshaw
0 siblings, 0 replies; 8+ messages in thread
From: Richard Earnshaw @ 2011-07-18 18:40 UTC
To: Chung-Lin Tang; +Cc: gcc-patches
On 18/07/11 14:10, Chung-Lin Tang wrote:
> On 2011/7/18 04:46 PM, Richard Earnshaw wrote:
>> The patch to arm.c is OK, but the change to the test is not, as it will cause problems with multilib testing. A better fix is to skip the test if the target is thumb1.
>>
>> The other test needs a similar check, as it seems to expect a movs instruction.
>>
>> R.
>
> Yes, it seems more logical to skip for thumb1, at least for the movs one.
> For the uxtb test, I think probably using "dg-require-effective-target
> arm_thumb2_ok" would be more suitable wrt multilib testing.
>
> Updated patch for the testcase parts, is this okay?
>
> Thanks,
> Chung-Lin
>
OK.
R.
>
> testcase.diff
>
>
> Index: combine-movs.c
> ===================================================================
> --- combine-movs.c (revision 0)
> +++ combine-movs.c (revision 0)
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { arm_thumb1 } } */
> +/* { dg-options "-O" } */
> +
> +void foo (unsigned long r[], unsigned int d)
> +{
> + int i, n = d / 32;
> + for (i = 0; i < n; ++i)
> + r[i] = 0;
> +}
> +
> +/* { dg-final { scan-assembler "movs\tr\[0-9\]" } } */
> Index: unsigned-extend-2.c
> ===================================================================
> --- unsigned-extend-2.c (revision 0)
> +++ unsigned-extend-2.c (revision 0)
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-options "-O" } */
> +
> +unsigned short foo (unsigned short x)
> +{
> + unsigned char i = 0;
> + for (i = 0; i < 8; i++)
> + {
> + x >>= 1;
> + x &= 0x7fff;
> + }
> + return x;
> +}
> +
> +/* { dg-final { scan-assembler "ands" } } */
> +/* { dg-final { scan-assembler-not "uxtb" } } */
> +/* { dg-final { scan-assembler-not "cmp" } } */
Thread overview: 8+ messages
2011-04-22 16:23 [PATCH] Canonicalize compares in combine [3/3] ARM backend part Chung-Lin Tang
2011-05-09 16:45 ` Ping " Chung-Lin Tang
2011-06-02 4:59 ` Chung-Lin Tang
2011-06-15 13:58 ` Richard Earnshaw
2011-07-18 8:32 ` Chung-Lin Tang
2011-07-18 9:37 Richard Earnshaw
2011-07-18 14:30 ` Chung-Lin Tang
2011-07-18 18:40 ` Richard Earnshaw