public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/29789] New: Missed optimization in libquantum
@ 2006-11-09 21:54 rguenth at gcc dot gnu dot org
2006-11-09 22:07 ` [Bug tree-optimization/29789] " rguenth at gcc dot gnu dot org
` (8 more replies)
0 siblings, 9 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2006-11-09 21:54 UTC (permalink / raw)
To: gcc-bugs
In SPEC 2k6 libquantum we can find the following (simplified) hot loop
void
quantum_toffoli(int control1, int control2, int target, unsigned long *state,
int size)
{
int i;
for(i=0; i<size; i++)
{
if (state[i] & ((unsigned long) 1 << control1))
if (state[i] & ((unsigned long) 1 << control2))
state[i] ^= ((unsigned long) 1 << target);
}
}
where we miss that
- The masks of the form ((unsigned long) 1 << X) are loop invariant
- We can combine the two ifs to
if (state[i] & C == C)
with C = ((unsigned long) 1 << control1) | ((unsigned long) 1 << control2)
The first missed-optimization is also caused by fold at
/* If this is an EQ or NE comparison with zero and ARG0 is
(1 << foo) & bar, convert it to (bar >> foo) & 1. Both require
two operations, but the latter can be done in one less insn
on machines that have only two-operand insns or on which a
constant cannot be the first operand. */
which causes the bit-tests to be written with a not loop-invariant shift.
--
Summary: Missed optimization in libquantum
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at gcc dot gnu dot org
OtherBugsDependingO 15357,26163
nThis:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed optimization in libquantum
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
@ 2006-11-09 22:07 ` rguenth at gcc dot gnu dot org
2006-11-10 1:56 ` pinskia at gcc dot gnu dot org
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2006-11-09 22:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2006-11-09 22:07 -------
If the fold transformation is disabled we can force lim to move the shifts out
of the loop by specifying --param lim-expensive=1 (it considers the shifts
being of cost one). The default of the param is 20 though.
PRE thinks the shifts are loop induction variables, so it doesn't move them out
of the loop.
With the above "fixes" we get
.L4:
movl (%edi,%ecx,4), %edx
testl %edx, %ebx
je .L5
testl %edx, %esi
je .L5
xorl %eax, %edx
movl %edx, (%edi,%ecx,4)
.p2align 4,,7
.L5:
addl $1, %ecx
cmpl 24(%ebp), %ecx
jne .L4
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed optimization in libquantum
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
2006-11-09 22:07 ` [Bug tree-optimization/29789] " rguenth at gcc dot gnu dot org
@ 2006-11-10 1:56 ` pinskia at gcc dot gnu dot org
2006-11-10 9:21 ` rguenth at gcc dot gnu dot org
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-11-10 1:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2006-11-10 01:55 -------
> (it considers the shifts being of cost one)
depends on the target. On some targets (Cell), shifts with a constant shifter
is cheap (same as a normal add or subtract) while with a non constant one, it
is microcoded and every expensive.
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2006-11-10 01:55:58
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed optimization in libquantum
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
2006-11-09 22:07 ` [Bug tree-optimization/29789] " rguenth at gcc dot gnu dot org
2006-11-10 1:56 ` pinskia at gcc dot gnu dot org
@ 2006-11-10 9:21 ` rguenth at gcc dot gnu dot org
2007-04-20 12:52 ` [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts rguenth at gcc dot gnu dot org
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2006-11-10 9:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from rguenth at gcc dot gnu dot org 2006-11-10 09:21 -------
I suppose we should look at RTL loop invariant motion to hoist the shift
because we should have more precise information there.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rakdver at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
` (2 preceding siblings ...)
2006-11-10 9:21 ` rguenth at gcc dot gnu dot org
@ 2007-04-20 12:52 ` rguenth at gcc dot gnu dot org
2007-04-20 14:34 ` rguenth at gcc dot gnu dot org
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-04-20 12:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from rguenth at gcc dot gnu dot org 2007-04-20 13:51 -------
Mine. I have a tree if-conversion patch that produces
<L0>:;
D.1993 = MEM[base: state, index: ivtmp.32, step: 8];
if (pretmp.25 == (pretmp.25 & D.1993)) goto <L1>; else goto <L3>;
<L1>:;
MEM[base: state, index: ivtmp.32, step: 8] = 1 << target ^ D.1993;
<L3>:;
ivtmp.32 = ivtmp.32 + 1;
if (size > (int) ivtmp.32) goto <L0>; else goto <L6>;
and finally
.L4:
movq (%r11,%r9,8), %rdi
movq %rsi, %rax
andq %rdi, %rax
cmpq %rax, %rsi
jne .L5
xorq %rdx, %rdi
movq %rdi, (%r11,%r9,8)
.L5:
addq $1, %r9
cmpl %r9d, %r8d
jg .L4
for the inner loop.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|unassigned at gcc dot gnu |rguenth at gcc dot gnu dot
|dot org |org
Status|NEW |ASSIGNED
Last reconfirmed|2006-11-10 01:55:58 |2007-04-20 13:51:52
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
` (3 preceding siblings ...)
2007-04-20 12:52 ` [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts rguenth at gcc dot gnu dot org
@ 2007-04-20 14:34 ` rguenth at gcc dot gnu dot org
2007-04-20 15:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-04-20 14:34 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from rguenth at gcc dot gnu dot org 2007-04-20 15:33 -------
Note that this does fix the loop invariant motion only in the case of two
ifs can be merged (because that re-instantiates the A & (1 << B) form). The
following parts are still not resolved:
void quantum_cnot(int control, int target, unsigned long *state, int size)
{
int i;
for(i=0; i<size; i++)
{
if (state[i] & ((unsigned long) 1 << control))
state[i] ^= ((unsigned long) 1 << target);
}
}
(and more similar loops in libquantum). It would be nice if rtl loop-invariant
motion could detect this form:
(insn 23 22 24 4 (parallel [
(set (reg:DI 67)
(lshiftrt:DI (reg:DI 62 [ D.1992 ])
(subreg:QI (reg/v:SI 63 [ control ]) 0)))
(clobber (reg:CC 17 flags))
]) 470 {*lshrdi3_1_rex64} (nil)
(nil))
(insn 24 23 25 4 (parallel [
(set (reg:SI 68)
(and:SI (subreg:SI (reg:DI 67) 0)
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
]) 301 {*andsi_1} (nil)
(nil))
and move the invariant (1 << control). It does move the (1 << target) which
looks like
(insn 30 28 31 5 (set (reg:DI 70)
(const_int 1 [0x1])) 82 {*movdi_1_rex64} (nil)
(nil))
(insn 31 30 32 5 (parallel [
(set (reg:DI 69)
(ashift:DI (reg:DI 70)
(subreg:QI (reg/v:SI 64 [ target ]) 0)))
(clobber (reg:CC 17 flags))
]) 411 {*ashldi3_1_rex64} (nil)
(expr_list:REG_EQUAL (ashift:DI (const_int 1 [0x1])
(subreg:QI (reg/v:SI 64 [ target ]) 0))
(nil)))
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
` (4 preceding siblings ...)
2007-04-20 14:34 ` rguenth at gcc dot gnu dot org
@ 2007-04-20 15:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2007-04-20 16:00 ` rguenth at gcc dot gnu dot org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2007-04-20 15:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from rakdver at atrey dot karlin dot mff dot cuni dot cz 2007-04-20 16:09 -------
Subject: Re: Missed invariant out of the loop with conditionals and shifts
> void quantum_cnot(int control, int target, unsigned long *state, int size)
> {
> int i;
>
> for(i=0; i<size; i++)
> {
> if (state[i] & ((unsigned long) 1 << control))
> state[i] ^= ((unsigned long) 1 << target);
> }
> }
>
> (and more similar loops in libquantum). It would be nice if rtl loop-invariant
> motion could detect this form:
>
> (insn 23 22 24 4 (parallel [
> (set (reg:DI 67)
> (lshiftrt:DI (reg:DI 62 [ D.1992 ])
> (subreg:QI (reg/v:SI 63 [ control ]) 0)))
> (clobber (reg:CC 17 flags))
> ]) 470 {*lshrdi3_1_rex64} (nil)
> (nil))
>
> (insn 24 23 25 4 (parallel [
> (set (reg:SI 68)
> (and:SI (subreg:SI (reg:DI 67) 0)
> (const_int 1 [0x1])))
> (clobber (reg:CC 17 flags))
> ]) 301 {*andsi_1} (nil)
> (nil))
>
> and move the invariant (1 << control).
how exactly do you imagine this transformation should work? The insns
you show essentially are
tmp = x >> control;
z = tmp & 1;
I do not see how to transform just this pattern profitably (without also
seeing that z is only used in condition).
I could hack something in, however handling just this single pattern
specially seems a bit weird to me.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
` (5 preceding siblings ...)
2007-04-20 15:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2007-04-20 16:00 ` rguenth at gcc dot gnu dot org
2007-04-22 11:27 ` rguenth at gcc dot gnu dot org
2007-04-22 11:35 ` rguenth at gcc dot gnu dot org
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-04-20 16:00 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from rguenth at gcc dot gnu dot org 2007-04-20 16:59 -------
I posted a patch for the tree level im instead to handle this special case
right
after the reciprocal special case. I agree it's a special case, but as it
happens
in spec 2k6...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
` (6 preceding siblings ...)
2007-04-20 16:00 ` rguenth at gcc dot gnu dot org
@ 2007-04-22 11:27 ` rguenth at gcc dot gnu dot org
2007-04-22 11:35 ` rguenth at gcc dot gnu dot org
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-04-22 11:27 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from rguenth at gcc dot gnu dot org 2007-04-22 12:27 -------
Subject: Bug 29789
Author: rguenth
Date: Sun Apr 22 12:26:49 2007
New Revision: 124042
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124042
Log:
2007-04-22 Richard Guenther <rguenther@suse.de>
PR tree-optimization/29789
* tree-ssa-loop-im.c (stmt_cost): Adjust cost of shifts.
(rewrite_reciprocal): New helper split out from
determine_invariantness_stmt.
(rewrite_bittest): Likewise.
(determine_invariantness_stmt): Rewrite (A >> B) & 1 to
A & (1 << B) if (1 << B) is loop invariant but (A >> B)
is not.
* gcc.dg/tree-ssa/ssa-lim-1.c: New testcase.
* gcc.dg/tree-ssa/ssa-lim-2.c: Likewise.
Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-1.c
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-im.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
` (7 preceding siblings ...)
2007-04-22 11:27 ` rguenth at gcc dot gnu dot org
@ 2007-04-22 11:35 ` rguenth at gcc dot gnu dot org
8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-04-22 11:35 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from rguenth at gcc dot gnu dot org 2007-04-22 12:35 -------
Fixed. The combining the bit-test issue is split to PR31657.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
Target Milestone|--- |4.3.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29789
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2007-04-22 11:35 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-11-09 21:54 [Bug tree-optimization/29789] New: Missed optimization in libquantum rguenth at gcc dot gnu dot org
2006-11-09 22:07 ` [Bug tree-optimization/29789] " rguenth at gcc dot gnu dot org
2006-11-10 1:56 ` pinskia at gcc dot gnu dot org
2006-11-10 9:21 ` rguenth at gcc dot gnu dot org
2007-04-20 12:52 ` [Bug tree-optimization/29789] Missed invariant out of the loop with conditionals and shifts rguenth at gcc dot gnu dot org
2007-04-20 14:34 ` rguenth at gcc dot gnu dot org
2007-04-20 15:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2007-04-20 16:00 ` rguenth at gcc dot gnu dot org
2007-04-22 11:27 ` rguenth at gcc dot gnu dot org
2007-04-22 11:35 ` rguenth at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).