* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
@ 2006-01-18 4:45 ` pinskia at gcc dot gnu dot org
2006-01-20 15:48 ` tony dot linthicum at amd dot com
` (6 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-01-18 4:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from pinskia at gcc dot gnu dot org 2006-01-18 04:45 -------
The problem here is that we don't split up the subregister early, before
register allocation.
If we split it up before combine, we would be able to combine the or and get
more optimal results.
A patch like
http://gcc.gnu.org/ml/gcc-patches/2005-05/msg00554.html
should help.
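
To illustrate what the split buys, here is a minimal C sketch (not GCC source; the names nonzero_wide, nonzero_split and forms_agree are made up for illustration) of a 64-bit nonzero test written once on the whole value and once on its two 32-bit halves. Assuming a 32-bit target, the second form is a single OR of the halves, which is presumably the "or" that combine could then merge with the loads:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* The test as written in the source: one 64-bit (DImode) value. */
static int nonzero_wide(uint64_t x)
{
    return x != 0;
}

/* The same test once the value is split into 32-bit (SImode) halves:
   x is nonzero exactly when the OR of its halves is nonzero. */
static int nonzero_split(uint32_t lo, uint32_t hi)
{
    return (lo | hi) != 0;
}

/* Returns 1 when both forms give the same answer for x. */
static int forms_agree(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    return nonzero_wide(x) == nonzero_split(lo, hi);
}
```

Calling forms_agree on values such as 0, 1 and 0x100000000 exercises the cases where neither half, only the low half, or only the high half is set.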
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
From: tony dot linthicum at amd dot com @ 2006-01-20 15:48 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from tony dot linthicum at amd dot com 2006-01-20 15:48 -------
I've been looking at this a bit, and tried the patch. It does indeed fix the
problem in test1 above, but it does not appear to be the complete solution.
The load of 'x' in test1 is actually split fairly early, and from what I can
tell, the superfluous move is actually the result of the register allocator
doing a poor job of live range analysis when confronted with subregs. I
suspect this is why most things (i.e. those things other than branches) are not
split into subregs until after reload. Unfortunately, the subreg lowering
won't touch a subreg if it's seen a reference to the "inner" register so we get
the same unnecessary move if the code looks like:
    void gh(void);

    void foo(long long y, long long z)
    {
        unsigned long long x;
        x = y + z;
        if (x) gh();
    }
I'm going to experiment with moving where the subreg lowering code occurs and
moving up the splitting into subregs and see if I can get the desired results.
I'm pretty new to GCC, so if any of the above seems like I'm off in the weeds
then please let me know.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
From: pinskia at gcc dot gnu dot org @ 2006-01-20 15:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from pinskia at gcc dot gnu dot org 2006-01-20 15:52 -------
(In reply to comment #4)
> I'm going to experiment with moving where the subreg lowering code occurs and
> moving up the splitting into subregs and see if I can get the desired results.
> I'm pretty new to GCC, so if any of the above seems like I'm off in the weeds
> then please let me know.
This seems right, but the other issue is that the register allocator allocates
a DImode value as two consecutive registers treated as a single unit (that
might be only part of the cause).
--
pinskia at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tony dot linthicum at amd
                   |                            |dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
From: ian at airs dot com @ 2006-02-02 18:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from ian at airs dot com 2006-02-02 18:18 -------
With the version of RTH's subreg lowering pass which I am working on, I get
identical code for both functions:
test1:
        movl    8(%esp), %eax
        orl     4(%esp), %eax
        jne     .L7
        ret
        .p2align 4,,7
.L7:
        jmp     gh
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
From: tony dot linthicum at amd dot com @ 2006-02-06 17:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from tony dot linthicum at amd dot com 2006-02-06 17:13 -------
So do I, at least for the original code (i.e. test and test1). I'm curious,
though, whether you've tried the example that I listed above (foo). I still get
subregs with that one, though I honestly don't recall at the moment whether
it makes the register allocator screw up (I *think* it does, but I'd have to
check). Either way, the presence of the subregs provides the needed fodder
for RA badness, so I'm curious whether it's present in what you're working on.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
From: ian at airs dot com @ 2006-02-07 0:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from ian at airs dot com 2006-02-07 00:30 -------
Yes, I still get an unnecessary move in your test case which uses addition.
One reason this happens is that the addition cannot be split until after
the reload pass is complete. That is because the add relies on the condition
code registers, but reload can clobber the condition code registers between any
arbitrary pair of instructions.
Another reason this happens is that the compiler knows how to set the condition
flags using a bitwise or, but it does so using a scratch register to hold the
destination of the bitwise or. The register allocator is not clever enough to
see that, if a DImode pair of registers dies in the insn, it can use the
second register of that pair as the scratch register. If the register
allocator saw that, it could avoid allocating a new scratch register and
copying the value into it.
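
As a rough C model of the two points above (not GCC source; sum_is_nonzero is a hypothetical name used only for illustration), the 64-bit add can be written as two 32-bit adds linked by a carry, followed by a nonzero test that ORs the halves. The carry variable stands in for the condition code flag that ties the two adds together, and writing the OR result back into the high half models reusing the dying register as the scratch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: a 64-bit add done as two
   32-bit adds with a carry, followed by a nonzero test on the result. */
static int sum_is_nonzero(uint32_t y_lo, uint32_t y_hi,
                          uint32_t z_lo, uint32_t z_hi)
{
    uint32_t lo = y_lo + z_lo;          /* low add; wraps on overflow */
    uint32_t carry = lo < y_lo;         /* stands in for the carry flag */
    uint32_t hi = y_hi + z_hi + carry;  /* high add consumes the carry */

    /* Nonzero test: OR the halves. The result goes back into hi, which
       is dead afterwards, so no separate scratch value is needed. */
    hi |= lo;
    return hi != 0;
}
```

For example, adding 1 and -1 (low half 0xFFFFFFFF, high half 0xFFFFFFFF) wraps both halves to zero, so the function returns 0; adding 0xFFFFFFFF and 1 carries into the high half, so it returns 1.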
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
From: ian at airs dot com @ 2006-02-07 8:23 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from ian at airs dot com 2006-02-07 08:23 -------
I now have a reasonably simple reload patch which eliminates the unnecessary
move. For the test case in comment #4, I get this code with -O2
-momit-leaf-frame-pointer:
foo:
        movl    12(%esp), %eax
        movl    16(%esp), %edx
        addl    4(%esp), %eax
        adcl    8(%esp), %edx
        orl     %eax, %edx
        jne     .L7
        rep ; ret
        .p2align 4,,7
.L7:
        jmp     gh
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
From: rask at gcc dot gnu dot org @ 2007-11-10 0:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from rask at gcc dot gnu dot org 2007-11-10 00:15 -------
This was fixed in 4.3.0.
--
rask at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
           Keywords|                            |ra
      Known to fail|                            |4.1.2 4.2.0 4.2.1 4.2.2
      Known to work|                            |4.3.0
         Resolution|                            |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792