* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
@ 2015-02-27 9:42 Uros Bizjak
0 siblings, 0 replies; 8+ messages in thread
From: Uros Bizjak @ 2015-02-27 9:42 UTC (permalink / raw)
To: gcc-patches; +Cc: Aldy Hernandez, Richard Henderson, Jakub Jelinek
Hello!
> This is actually Jakub's patch from the PR, with a few minor tweaks that were needed to bootstrap
> and pass the regression suite. The splitter was using operand 0 without setting it first. It should've
> been operand 2. Also, there was a division by zero that was causing an invalid insn; fixed by
> changing the INTVAL + 2 into INTVAL - 2. Finally, a reload_completed was added at Jakub's
> request.
PR rtl-optimization/65220
* config/i386/i386.md (*udivmod<mode>4_pow2): New.
+ "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
+ && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
+ "#"
+ "reload_completed"
This should be "&& reload_completed", so it will also include the
above condition in the split condition.
OK for mainline with the above change.
Thanks,
Uros.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
2015-03-02 18:55 ` Aldy Hernandez
@ 2015-03-02 21:40 ` Richard Henderson
0 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 21:40 UTC (permalink / raw)
To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek
On 03/02/2015 10:55 AM, Aldy Hernandez wrote:
>
>>> Do you actually need the reload completed? Couldn't this be legitimately split
>>> before reload (and then parts of it DCE'd as necessary)?
>>>
>>> Otherwise, the split condition does need "&&" as mentioned by Uros.
>>
>> Oh, I forgot -- even if you can exclude reload_completed,
>> you'd then need to use "&& 1" to effectively copy the first
>> condition.
>
> And *this* is why I hate rtl. All these secret magic knobs. Arghh...
>
> The following patch still fixes the problem in the PR, but without
> reload_completed.
>
> I am doing a full round of tests.
>
> OK pending tests?
>
Yes.
r~
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
2015-03-02 18:38 ` Richard Henderson
@ 2015-03-02 18:55 ` Aldy Hernandez
2015-03-02 21:40 ` Richard Henderson
0 siblings, 1 reply; 8+ messages in thread
From: Aldy Hernandez @ 2015-03-02 18:55 UTC (permalink / raw)
To: Richard Henderson, gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 569 bytes --]
>> Do you actually need the reload completed? Couldn't this be legitimately split
>> before reload (and then parts of it DCE'd as necessary)?
>>
>> Otherwise, the split condition does need "&&" as mentioned by Uros.
>
> Oh, I forgot -- even if you can exclude reload_completed,
> you'd then need to use "&& 1" to effectively copy the first
> condition.
And *this* is why I hate rtl. All these secret magic knobs. Arghh...
The following patch still fixes the problem in the PR, but without
reload_completed.
I am doing a full round of tests.
OK pending tests?
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 713 bytes --]
commit ee167c8a716799af9eb0e2275156061d76719dac
Author: Aldy Hernandez <aldyh@redhat.com>
Date: Mon Mar 2 10:53:29 2015 -0800
* config/i386/i386.md (*udivmod<mode>4_pow2): Remove
reload_completed.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8495284..8a80415 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7343,7 +7343,7 @@
"UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
&& (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
"#"
- "&& reload_completed"
+ "&& 1"
[(set (match_dup 1) (match_dup 2))
(parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
(clobber (reg:CC FLAGS_REG))])
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
2015-03-02 17:55 ` Richard Henderson
2015-03-02 18:25 ` Aldy Hernandez
@ 2015-03-02 18:38 ` Richard Henderson
2015-03-02 18:55 ` Aldy Hernandez
1 sibling, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 18:38 UTC (permalink / raw)
To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek
On 03/02/2015 09:55 AM, Richard Henderson wrote:
> On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
>> +;; Optimize division or modulo by constant power of 2, if the constant
>> +;; materializes only after expansion.
>> +(define_insn_and_split "*udivmod<mode>4_pow2"
>> + [(set (match_operand:SWI48 0 "register_operand" "=r")
>> + (udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
>> + (match_operand:SWI48 3 "const_int_operand" "n")))
>> + (set (match_operand:SWI48 1 "register_operand" "=r")
>> + (umod:SWI48 (match_dup 2) (match_dup 3)))
>> + (clobber (reg:CC FLAGS_REG))]
>> + "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
>> + && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
>> + "#"
>> + "reload_completed"
>> + [(set (match_dup 1) (match_dup 2))
>> + (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
>> + (clobber (reg:CC FLAGS_REG))])
>> + (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
>> + (clobber (reg:CC FLAGS_REG))])]
>> +{
>
> Do you actually need the reload completed? Couldn't this be legitimately split
> before reload (and then parts of it DCE'd as necessary)?
>
> Otherwise, the split condition does need "&&" as mentioned by Uros.
Oh, I forgot -- even if you can exclude reload_completed,
you'd then need to use "&& 1" to effectively copy the first
condition.
r~
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
2015-03-02 18:25 ` Aldy Hernandez
@ 2015-03-02 18:34 ` Richard Henderson
0 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 18:34 UTC (permalink / raw)
To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek
On 03/02/2015 10:25 AM, Aldy Hernandez wrote:
> On 03/02/2015 09:55 AM, Richard Henderson wrote:
>> On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
>>> +;; Optimize division or modulo by constant power of 2, if the constant
>>> +;; materializes only after expansion.
>>> +(define_insn_and_split "*udivmod<mode>4_pow2"
>>> + [(set (match_operand:SWI48 0 "register_operand" "=r")
>>> + (udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
>>> + (match_operand:SWI48 3 "const_int_operand" "n")))
>>> + (set (match_operand:SWI48 1 "register_operand" "=r")
>>> + (umod:SWI48 (match_dup 2) (match_dup 3)))
>>> + (clobber (reg:CC FLAGS_REG))]
>>> + "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
>>> + && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
>>> + "#"
>>> + "reload_completed"
>>> + [(set (match_dup 1) (match_dup 2))
>>> + (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup
>>> 4)))
>>> + (clobber (reg:CC FLAGS_REG))])
>>> + (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
>>> + (clobber (reg:CC FLAGS_REG))])]
>>> +{
>>
>> Do you actually need the reload completed? Couldn't this be legitimately split
>> before reload (and then parts of it DCE'd as necessary)?
>
> I think Jakub mentioned we needed it, otherwise we needed to check for a
> division by zero for the INTVAL. But I could be misunderstanding.
The UINTVAL - 2 < MODE_BIT_SIZE condition means that 0 & 1 are excluded. So I
don't see how division by zero applies here.
r~
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
2015-03-02 17:55 ` Richard Henderson
@ 2015-03-02 18:25 ` Aldy Hernandez
2015-03-02 18:34 ` Richard Henderson
2015-03-02 18:38 ` Richard Henderson
1 sibling, 1 reply; 8+ messages in thread
From: Aldy Hernandez @ 2015-03-02 18:25 UTC (permalink / raw)
To: Richard Henderson, gcc-patches; +Cc: Jakub Jelinek
On 03/02/2015 09:55 AM, Richard Henderson wrote:
> On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
>> +;; Optimize division or modulo by constant power of 2, if the constant
>> +;; materializes only after expansion.
>> +(define_insn_and_split "*udivmod<mode>4_pow2"
>> + [(set (match_operand:SWI48 0 "register_operand" "=r")
>> + (udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
>> + (match_operand:SWI48 3 "const_int_operand" "n")))
>> + (set (match_operand:SWI48 1 "register_operand" "=r")
>> + (umod:SWI48 (match_dup 2) (match_dup 3)))
>> + (clobber (reg:CC FLAGS_REG))]
>> + "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
>> + && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
>> + "#"
>> + "reload_completed"
>> + [(set (match_dup 1) (match_dup 2))
>> + (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
>> + (clobber (reg:CC FLAGS_REG))])
>> + (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
>> + (clobber (reg:CC FLAGS_REG))])]
>> +{
>
> Do you actually need the reload completed? Couldn't this be legitimately split
> before reload (and then parts of it DCE'd as necessary)?
I think Jakub mentioned we needed it, otherwise we needed to check for a
division by zero for the INTVAL. But I could be misunderstanding.
Aldy
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
2015-02-27 2:10 Aldy Hernandez
@ 2015-03-02 17:55 ` Richard Henderson
2015-03-02 18:25 ` Aldy Hernandez
2015-03-02 18:38 ` Richard Henderson
0 siblings, 2 replies; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 17:55 UTC (permalink / raw)
To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek
On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
> +;; Optimize division or modulo by constant power of 2, if the constant
> +;; materializes only after expansion.
> +(define_insn_and_split "*udivmod<mode>4_pow2"
> + [(set (match_operand:SWI48 0 "register_operand" "=r")
> + (udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
> + (match_operand:SWI48 3 "const_int_operand" "n")))
> + (set (match_operand:SWI48 1 "register_operand" "=r")
> + (umod:SWI48 (match_dup 2) (match_dup 3)))
> + (clobber (reg:CC FLAGS_REG))]
> + "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
> + && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
> + "#"
> + "reload_completed"
> + [(set (match_dup 1) (match_dup 2))
> + (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
> + (clobber (reg:CC FLAGS_REG))])
> + (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
> + (clobber (reg:CC FLAGS_REG))])]
> +{
Do you actually need the reload completed? Couldn't this be legitimately split
before reload (and then parts of it DCE'd as necessary)?
Otherwise, the split condition does need "&&" as mentioned by Uros.
r~
^ permalink raw reply [flat|nested] 8+ messages in thread
* [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
@ 2015-02-27 2:10 Aldy Hernandez
2015-03-02 17:55 ` Richard Henderson
0 siblings, 1 reply; 8+ messages in thread
From: Aldy Hernandez @ 2015-02-27 2:10 UTC (permalink / raw)
To: gcc-patches, Richard Henderson; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 462 bytes --]
This is actually Jakub's patch from the PR, with a few minor tweaks that
were needed to bootstrap and pass the regression suite.
The splitter was using operand 0 without setting it first. It should've
been operand 2. Also, there was a division by zero that was causing an
invalid insn; fixed by changing the INTVAL + 2 into INTVAL - 2.
Finally, a reload_completed was added at Jakub's request.
Bootstrapped and tested on x86-64 Linux.
OK for mainline?
[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 2107 bytes --]
PR rtl-optimization/65220
* config/i386/i386.md (*udivmod<mode>4_pow2): New.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2d3d075..ad02156 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7331,6 +7331,32 @@
[(set_attr "type" "multi")
(set_attr "mode" "<MODE>")])
+;; Optimize division or modulo by constant power of 2, if the constant
+;; materializes only after expansion.
+(define_insn_and_split "*udivmod<mode>4_pow2"
+ [(set (match_operand:SWI48 0 "register_operand" "=r")
+ (udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
+ (match_operand:SWI48 3 "const_int_operand" "n")))
+ (set (match_operand:SWI48 1 "register_operand" "=r")
+ (umod:SWI48 (match_dup 2) (match_dup 3)))
+ (clobber (reg:CC FLAGS_REG))]
+ "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
+ && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
+ "#"
+ "reload_completed"
+ [(set (match_dup 1) (match_dup 2))
+ (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
+ (clobber (reg:CC FLAGS_REG))])
+ (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
+ (clobber (reg:CC FLAGS_REG))])]
+{
+ int v = exact_log2 (UINTVAL (operands[3]));
+ operands[4] = GEN_INT (v);
+ operands[5] = GEN_INT ((HOST_WIDE_INT_1U << v) - 1);
+}
+ [(set_attr "type" "multi")
+ (set_attr "mode" "<MODE>")])
+
(define_insn "*udivmod<mode>4_noext"
[(set (match_operand:SWIM248 0 "register_operand" "=a")
(udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
diff --git a/gcc/testsuite/gcc.target/i386/pr65520.c b/gcc/testsuite/gcc.target/i386/pr65520.c
new file mode 100644
index 0000000..8a62c39
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65520.c
@@ -0,0 +1,20 @@
+/* PR target/65520 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int foo (void *);
+
+void
+bar (void)
+{
+ unsigned s = 128;
+ while (1)
+ {
+ unsigned b[s];
+ if (foo (b))
+ break;
+ s *= 2;
+ }
+}
+
+/* { dg-final { scan-assembler-not "div\[^\n\r]*%" } } */
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-03-02 21:40 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-27 9:42 [patch] PR rtl-optimization/65220: avoid integer division in stack alignment Uros Bizjak
-- strict thread matches above, loose matches on Subject: below --
2015-02-27 2:10 Aldy Hernandez
2015-03-02 17:55 ` Richard Henderson
2015-03-02 18:25 ` Aldy Hernandez
2015-03-02 18:34 ` Richard Henderson
2015-03-02 18:38 ` Richard Henderson
2015-03-02 18:55 ` Aldy Hernandez
2015-03-02 21:40 ` Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).