public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
@ 2015-02-27  9:42 Uros Bizjak
  0 siblings, 0 replies; 8+ messages in thread
From: Uros Bizjak @ 2015-02-27  9:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: Aldy Hernandez, Richard Henderson, Jakub Jelinek

Hello!

> This is actually Jakub's patch from the PR, with a few minor tweaks that were needed to bootstrap
> and pass the regression suite. The splitter was using operand 0 without setting it first. It should've
> been operand 2. Also, there was a division by zero that was causing an invalid insn; fixed by
> changing the INTVAL + 2 into INTVAL - 2. Finally, a reload_completed was added at Jakub's
> request.

    PR rtl-optimization/65220
    * config/i386/i386.md (*udivmod<mode>4_pow2): New.

+  "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
+   && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
+  "#"
+  "reload_completed"

This should be "&& reload_completed", so it will also include the
above condition in the split condition.

OK for mainline with the above change.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
  2015-03-02 18:55     ` Aldy Hernandez
@ 2015-03-02 21:40       ` Richard Henderson
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 21:40 UTC (permalink / raw)
  To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek

On 03/02/2015 10:55 AM, Aldy Hernandez wrote:
> 
>>> Do you actually need the reload completed?  Couldn't this be legitimately split
>>> before reload (and then parts of it DCE'd as necessary)?
>>>
>>> Otherwise, the split condition does need "&&" as mentioned by Uros.
>>
>> Oh, I forgot -- even if you can exclude reload_completed,
>> you'd then need to use "&& 1" to effectively copy the first
>> condition.
> 
> And *this* is why I hate rtl.  All these secret magic knobs.  Arghh...
> 
> The following patch still fixes the problem in the PR, but without
> reload_completed.
> 
> I am doing a full round of tests.
> 
> OK pending tests?
> 

Yes.


r~

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
  2015-03-02 18:38   ` Richard Henderson
@ 2015-03-02 18:55     ` Aldy Hernandez
  2015-03-02 21:40       ` Richard Henderson
  0 siblings, 1 reply; 8+ messages in thread
From: Aldy Hernandez @ 2015-03-02 18:55 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 569 bytes --]


>> Do you actually need the reload completed?  Couldn't this be legitimately split
>> before reload (and then parts of it DCE'd as necessary)?
>>
>> Otherwise, the split condition does need "&&" as mentioned by Uros.
>
> Oh, I forgot -- even if you can exclude reload_completed,
> you'd then need to use "&& 1" to effectively copy the first
> condition.

And *this* is why I hate rtl.  All these secret magic knobs.  Arghh...

The following patch still fixes the problem in the PR, but without 
reload_completed.

I am doing a full round of tests.

OK pending tests?


[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 713 bytes --]

commit ee167c8a716799af9eb0e2275156061d76719dac
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Mon Mar 2 10:53:29 2015 -0800

    	* config/i386/i386.md (*udivmod<mode>4_pow2): Remove
    	reload_completed.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8495284..8a80415 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7343,7 +7343,7 @@
   "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
    && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
   "#"
-  "&& reload_completed"
+  "&& 1"
   [(set (match_dup 1) (match_dup 2))
    (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
 	      (clobber (reg:CC FLAGS_REG))])

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
  2015-03-02 17:55 ` Richard Henderson
  2015-03-02 18:25   ` Aldy Hernandez
@ 2015-03-02 18:38   ` Richard Henderson
  2015-03-02 18:55     ` Aldy Hernandez
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 18:38 UTC (permalink / raw)
  To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek

On 03/02/2015 09:55 AM, Richard Henderson wrote:
> On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
>> +;; Optimize division or modulo by constant power of 2, if the constant
>> +;; materializes only after expansion.
>> +(define_insn_and_split "*udivmod<mode>4_pow2"
>> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
>> +	(udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
>> +		    (match_operand:SWI48 3 "const_int_operand" "n")))
>> +   (set (match_operand:SWI48 1 "register_operand" "=r")
>> +	(umod:SWI48 (match_dup 2) (match_dup 3)))
>> +   (clobber (reg:CC FLAGS_REG))]
>> +  "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
>> +   && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
>> +  "#"
>> +  "reload_completed"
>> +  [(set (match_dup 1) (match_dup 2))
>> +   (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
>> +	      (clobber (reg:CC FLAGS_REG))])
>> +   (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
>> +	      (clobber (reg:CC FLAGS_REG))])]
>> +{
> 
> Do you actually need the reload completed?  Couldn't this be legitimately split
> before reload (and then parts of it DCE'd as necessary)?
> 
> Otherwise, the split condition does need "&&" as mentioned by Uros.

Oh, I forgot -- even if you can exclude reload_completed,
you'd then need to use "&& 1" to effectively copy the first
condition.


r~

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
  2015-03-02 18:25   ` Aldy Hernandez
@ 2015-03-02 18:34     ` Richard Henderson
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 18:34 UTC (permalink / raw)
  To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek

On 03/02/2015 10:25 AM, Aldy Hernandez wrote:
> On 03/02/2015 09:55 AM, Richard Henderson wrote:
>> On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
>>> +;; Optimize division or modulo by constant power of 2, if the constant
>>> +;; materializes only after expansion.
>>> +(define_insn_and_split "*udivmod<mode>4_pow2"
>>> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
>>> +    (udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
>>> +            (match_operand:SWI48 3 "const_int_operand" "n")))
>>> +   (set (match_operand:SWI48 1 "register_operand" "=r")
>>> +    (umod:SWI48 (match_dup 2) (match_dup 3)))
>>> +   (clobber (reg:CC FLAGS_REG))]
>>> +  "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
>>> +   && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
>>> +  "#"
>>> +  "reload_completed"
>>> +  [(set (match_dup 1) (match_dup 2))
>>> +   (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup
>>> 4)))
>>> +          (clobber (reg:CC FLAGS_REG))])
>>> +   (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
>>> +          (clobber (reg:CC FLAGS_REG))])]
>>> +{
>>
>> Do you actually need the reload completed?  Couldn't this be legitimately split
>> before reload (and then parts of it DCE'd as necessary)?
> 
> I think Jakub mentioned we needed it, otherwise we needed to check for a
> division by zero for the INTVAL.  But I could be misunderstanding.

The UINTVAL - 2 < MODE_BIT_SIZE condition means that 0 & 1 are excluded.  So I
don't see how division by zero applies here.


r~

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
  2015-03-02 17:55 ` Richard Henderson
@ 2015-03-02 18:25   ` Aldy Hernandez
  2015-03-02 18:34     ` Richard Henderson
  2015-03-02 18:38   ` Richard Henderson
  1 sibling, 1 reply; 8+ messages in thread
From: Aldy Hernandez @ 2015-03-02 18:25 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches; +Cc: Jakub Jelinek

On 03/02/2015 09:55 AM, Richard Henderson wrote:
> On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
>> +;; Optimize division or modulo by constant power of 2, if the constant
>> +;; materializes only after expansion.
>> +(define_insn_and_split "*udivmod<mode>4_pow2"
>> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
>> +	(udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
>> +		    (match_operand:SWI48 3 "const_int_operand" "n")))
>> +   (set (match_operand:SWI48 1 "register_operand" "=r")
>> +	(umod:SWI48 (match_dup 2) (match_dup 3)))
>> +   (clobber (reg:CC FLAGS_REG))]
>> +  "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
>> +   && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
>> +  "#"
>> +  "reload_completed"
>> +  [(set (match_dup 1) (match_dup 2))
>> +   (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
>> +	      (clobber (reg:CC FLAGS_REG))])
>> +   (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
>> +	      (clobber (reg:CC FLAGS_REG))])]
>> +{
>
> Do you actually need the reload completed?  Couldn't this be legitimately split
> before reload (and then parts of it DCE'd as necessary)?

I think Jakub mentioned we needed it, otherwise we needed to check for a 
division by zero for the INTVAL.  But I could be misunderstanding.

Aldy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
  2015-02-27  2:10 Aldy Hernandez
@ 2015-03-02 17:55 ` Richard Henderson
  2015-03-02 18:25   ` Aldy Hernandez
  2015-03-02 18:38   ` Richard Henderson
  0 siblings, 2 replies; 8+ messages in thread
From: Richard Henderson @ 2015-03-02 17:55 UTC (permalink / raw)
  To: Aldy Hernandez, gcc-patches; +Cc: Jakub Jelinek

On 02/26/2015 05:46 PM, Aldy Hernandez wrote:
> +;; Optimize division or modulo by constant power of 2, if the constant
> +;; materializes only after expansion.
> +(define_insn_and_split "*udivmod<mode>4_pow2"
> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
> +	(udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
> +		    (match_operand:SWI48 3 "const_int_operand" "n")))
> +   (set (match_operand:SWI48 1 "register_operand" "=r")
> +	(umod:SWI48 (match_dup 2) (match_dup 3)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
> +   && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
> +  "#"
> +  "reload_completed"
> +  [(set (match_dup 1) (match_dup 2))
> +   (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
> +	      (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
> +	      (clobber (reg:CC FLAGS_REG))])]
> +{

Do you actually need the reload completed?  Couldn't this be legitimately split
before reload (and then parts of it DCE'd as necessary)?

Otherwise, the split condition does need "&&" as mentioned by Uros.


r~

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [patch] PR rtl-optimization/65220: avoid integer division in stack alignment
@ 2015-02-27  2:10 Aldy Hernandez
  2015-03-02 17:55 ` Richard Henderson
  0 siblings, 1 reply; 8+ messages in thread
From: Aldy Hernandez @ 2015-02-27  2:10 UTC (permalink / raw)
  To: gcc-patches, Richard Henderson; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 462 bytes --]

This is actually Jakub's patch from the PR, with a few minor tweaks that 
were needed to bootstrap and pass the regression suite.

The splitter was using operand 0 without setting it first.  It should've 
been operand 2.  Also, there was a division by zero that was causing an 
invalid insn; fixed by changing the INTVAL + 2 into INTVAL - 2. 
Finally, a reload_completed was added at Jakub's request.

Bootstrapped and tested on x86-64 Linux.

OK for mainline?


[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 2107 bytes --]

    	PR rtl-optimization/65220
    	* config/i386/i386.md (*udivmod<mode>4_pow2): New.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2d3d075..ad02156 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7331,6 +7331,32 @@
   [(set_attr "type" "multi")
    (set_attr "mode" "<MODE>")])
 
+;; Optimize division or modulo by constant power of 2, if the constant
+;; materializes only after expansion.
+(define_insn_and_split "*udivmod<mode>4_pow2"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+	(udiv:SWI48 (match_operand:SWI48 2 "register_operand" "0")
+		    (match_operand:SWI48 3 "const_int_operand" "n")))
+   (set (match_operand:SWI48 1 "register_operand" "=r")
+	(umod:SWI48 (match_dup 2) (match_dup 3)))
+   (clobber (reg:CC FLAGS_REG))]
+  "UINTVAL (operands[3]) - 2 < <MODE_SIZE> * BITS_PER_UNIT
+   && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
+  "#"
+  "reload_completed"
+  [(set (match_dup 1) (match_dup 2))
+   (parallel [(set (match_dup 0) (lshiftrt:<MODE> (match_dup 2) (match_dup 4)))
+	      (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 1) (and:<MODE> (match_dup 1) (match_dup 5)))
+	      (clobber (reg:CC FLAGS_REG))])]
+{
+  int v = exact_log2 (UINTVAL (operands[3]));
+  operands[4] = GEN_INT (v);
+  operands[5] = GEN_INT ((HOST_WIDE_INT_1U << v) - 1);
+}
+  [(set_attr "type" "multi")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "*udivmod<mode>4_noext"
   [(set (match_operand:SWIM248 0 "register_operand" "=a")
 	(udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
diff --git a/gcc/testsuite/gcc.target/i386/pr65520.c b/gcc/testsuite/gcc.target/i386/pr65520.c
new file mode 100644
index 0000000..8a62c39
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65520.c
@@ -0,0 +1,20 @@
+/* PR target/65520 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int foo (void *);
+
+void
+bar (void)
+{
+  unsigned s = 128;
+  while (1)
+    {
+      unsigned b[s];
+      if (foo (b))
+	break;
+      s *= 2;
+    }
+}
+
+/* { dg-final { scan-assembler-not "div\[^\n\r]*%" } } */

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-03-02 21:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-27  9:42 [patch] PR rtl-optimization/65220: avoid integer division in stack alignment Uros Bizjak
  -- strict thread matches above, loose matches on Subject: below --
2015-02-27  2:10 Aldy Hernandez
2015-03-02 17:55 ` Richard Henderson
2015-03-02 18:25   ` Aldy Hernandez
2015-03-02 18:34     ` Richard Henderson
2015-03-02 18:38   ` Richard Henderson
2015-03-02 18:55     ` Aldy Hernandez
2015-03-02 21:40       ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).