public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
@ 2006-01-18 4:45 ` pinskia at gcc dot gnu dot org
2006-01-20 15:48 ` tony dot linthicum at amd dot com
` (6 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-01-18 4:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from pinskia at gcc dot gnu dot org 2006-01-18 04:45 -------
The problem here is that we don't split up the subregister early before
register allocation.
If we split it up before combine, we would be able to combine the or and get
more optimal results.
A patch like
http://gcc.gnu.org/ml/gcc-patches/2005-05/msg00554.html
should help.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
2006-01-18 4:45 ` [Bug rtl-optimization/15792] missed subreg optimization pinskia at gcc dot gnu dot org
@ 2006-01-20 15:48 ` tony dot linthicum at amd dot com
2006-01-20 15:52 ` pinskia at gcc dot gnu dot org
` (5 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: tony dot linthicum at amd dot com @ 2006-01-20 15:48 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from tony dot linthicum at amd dot com 2006-01-20 15:48 -------
I've been looking at this a bit, and tried the patch. It does indeed fix the
problem in test1 above, but it does not appear to be the complete solution.
The load of 'x' in test1 is actually split fairly early, and from what I can
tell, the superfluous move is actually the result of the register allocator
doing a poor job of live range analysis when confronted with subregs. I
suspect this is why most things (i.e. those things other than branches) are not
split into subregs until after reload. Unfortunately, the subreg lowering
won't touch a subreg if it's seen a reference to the "inner" register so we get
the same unnecessary move if the code looks like:
void gh(void);
void foo(long long y, long long z)
{
  unsigned long long x;
  x = y + z;
  if (x) gh();
}
I'm going to experiment with moving where the subreg lowering code occurs and
moving up the splitting into subregs and see if I can get the desired results.
I'm pretty new to GCC, so if any of the above seems like I'm off in the weeds
then please let me know.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
2006-01-18 4:45 ` [Bug rtl-optimization/15792] missed subreg optimization pinskia at gcc dot gnu dot org
2006-01-20 15:48 ` tony dot linthicum at amd dot com
@ 2006-01-20 15:52 ` pinskia at gcc dot gnu dot org
2006-02-02 18:18 ` ian at airs dot com
` (4 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2006-01-20 15:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from pinskia at gcc dot gnu dot org 2006-01-20 15:52 -------
(In reply to comment #4)
> I'm going to experiment with moving where the subreg lowering code occurs and
> moving up the splitting into subregs and see if I can get the desired results.
> I'm pretty new to GCC, so if any of the above seems like I'm off in the weeds
> then please let me know.
This seems right, but the other issue is that the register allocator allocates
a DImode value as two consecutive registers treated as one unit (that might be
only part of the cause).
--
pinskia at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |tony dot linthicum at amd
| |dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2006-01-20 15:52 ` pinskia at gcc dot gnu dot org
@ 2006-02-02 18:18 ` ian at airs dot com
2006-02-06 17:13 ` tony dot linthicum at amd dot com
` (3 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: ian at airs dot com @ 2006-02-02 18:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from ian at airs dot com 2006-02-02 18:18 -------
With the version of RTH's subreg lowering pass which I am working on, I get
identical code for both functions:
test1:
movl 8(%esp), %eax
orl 4(%esp), %eax
jne .L7
ret
.p2align 4,,7
.L7:
jmp gh
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2006-02-02 18:18 ` ian at airs dot com
@ 2006-02-06 17:13 ` tony dot linthicum at amd dot com
2006-02-07 0:30 ` ian at airs dot com
` (2 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: tony dot linthicum at amd dot com @ 2006-02-06 17:13 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from tony dot linthicum at amd dot com 2006-02-06 17:13 -------
So do I, at least for the original code (i.e. test and test1). I'm curious,
though, if you've tried the example that I listed above (foo). I still get
subregs with that one, though I honestly don't recall at the moment whether
it makes the register allocator screw up (I *think* it does, but I'd
have to check). Either way, though, the presence of the subregs provides the
needed fodder for RA badness so I'm curious if it's present in what you're
working on.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2006-02-06 17:13 ` tony dot linthicum at amd dot com
@ 2006-02-07 0:30 ` ian at airs dot com
2006-02-07 8:23 ` ian at airs dot com
2007-11-10 0:15 ` rask at gcc dot gnu dot org
7 siblings, 0 replies; 12+ messages in thread
From: ian at airs dot com @ 2006-02-07 0:30 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from ian at airs dot com 2006-02-07 00:30 -------
Yes, I still get an unnecessary move in your test case which uses addition.
One reason this happens is because the addition can not be split until after
the reload pass is complete. That is because the add relies on the condition
code registers, but reload can clobber the condition code registers between any
arbitrary pair of instructions.
Another reason this happens is that the compiler knows how to set the condition
flags using a bitwise or, but it does so using a scratch register to hold the
destination of the bitwise or. The register allocator is not clever enough to
see that, when a DImode pair of registers dies in the insn, it can use the
second register of the pair as the scratch register. Doing so would avoid
allocating a new scratch register and copying the value into it.
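To illustrate the bitwise-or trick described above, here is a minimal C sketch,
not GCC code. The helper names (low_word, high_word, is_nonzero_wide,
is_nonzero_halves) are hypothetical and chosen only for this example. It
assumes a 64-bit value held as two 32-bit words, as on 32-bit x86, and shows
that testing the whole value against zero is the same as or-ing the two words
and testing the result, where the or result is only a scratch value:

```c
#include <stdint.h>

/* Hypothetical helpers: the two 32-bit words of a 64-bit value, as a
   32-bit target would hold a DImode value in a register pair. */
static uint32_t low_word(uint64_t x)  { return (uint32_t)x; }
static uint32_t high_word(uint64_t x) { return (uint32_t)(x >> 32); }

/* Nonzero test written on the whole 64-bit value. */
int is_nonzero_wide(uint64_t x)
{
    return x != 0;
}

/* The same test on the halves: one OR decides the result, and the OR
   value itself is never needed afterwards (it is only a scratch). */
int is_nonzero_halves(uint64_t x)
{
    uint32_t scratch = low_word(x) | high_word(x);
    return scratch != 0;
}
```

Because the or result is discarded, the scratch can overwrite either word of
the pair once the pair is dead, which is the observation the register
allocator was missing.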
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
` (5 preceding siblings ...)
2006-02-07 0:30 ` ian at airs dot com
@ 2006-02-07 8:23 ` ian at airs dot com
2007-11-10 0:15 ` rask at gcc dot gnu dot org
7 siblings, 0 replies; 12+ messages in thread
From: ian at airs dot com @ 2006-02-07 8:23 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from ian at airs dot com 2006-02-07 08:23 -------
I now have a reasonably simple reload patch which eliminates the unnecessary
move. For the test case in comment #4, I get this code with -O2
-momit-leaf-frame-pointer:
foo:
movl 12(%esp), %eax
movl 16(%esp), %edx
addl 4(%esp), %eax
adcl 8(%esp), %edx
orl %eax, %edx
jne .L7
rep ; ret
.p2align 4,,7
.L7:
jmp gh
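For readers less used to x86 assembly, the sequence above can be modelled
roughly in C as follows. This is only a sketch under stated assumptions: the
struct pair32 type and the functions add_pair and sum_is_nonzero are
hypothetical names for this illustration and do not come from GCC or from the
patch. It mirrors addl/adcl (add the low words, then the high words plus the
carry) followed by orl/jne (the sum is nonzero if the or of its words is
nonzero):

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit value held as two 32-bit words. */
struct pair32 { uint32_t lo, hi; };

/* addl + adcl: add the low words, then the high words plus the carry. */
struct pair32 add_pair(struct pair32 y, struct pair32 z)
{
    struct pair32 x;
    x.lo = y.lo + z.lo;
    uint32_t carry = x.lo < y.lo;   /* unsigned wraparound means a carry out */
    x.hi = y.hi + z.hi + carry;
    return x;
}

/* orl + jne: the sum is nonzero iff the OR of its two words is nonzero. */
int sum_is_nonzero(uint64_t y, uint64_t z)
{
    struct pair32 a = { (uint32_t)y, (uint32_t)(y >> 32) };
    struct pair32 b = { (uint32_t)z, (uint32_t)(z >> 32) };
    struct pair32 x = add_pair(a, b);
    return (x.lo | x.hi) != 0;
}
```

In this model, sum_is_nonzero(y, z) plays the role of the `if (x)` test in foo
from comment #4; the or reuses one word of the sum as its destination, so no
extra copy is needed.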
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-6528@http.gcc.gnu.org/bugzilla/>
` (6 preceding siblings ...)
2006-02-07 8:23 ` ian at airs dot com
@ 2007-11-10 0:15 ` rask at gcc dot gnu dot org
7 siblings, 0 replies; 12+ messages in thread
From: rask at gcc dot gnu dot org @ 2007-11-10 0:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from rask at gcc dot gnu dot org 2007-11-10 00:15 -------
This was fixed in 4.3.0.
--
rask at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Keywords| |ra
Known to fail| |4.1.2 4.2.0 4.2.1 4.2.2
Known to work| |4.3.0
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-4@http.gcc.gnu.org/bugzilla/>
2021-10-15 3:00 ` gabravier at gmail dot com
@ 2023-05-15 5:34 ` pinskia at gcc dot gnu.org
1 sibling, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-15 5:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |
--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Gabriel Ravier from comment #11)
> Seems like the issue is present again, except it's test1 that gets the
> better asm now. Perhaps this should be re-opened ?
This bug was about 32-bit x86, and the code looks good in GCC 9, 10, 11, and 12
and on the trunk.
If you were testing on x86_64, you need to use __int128_t to see what the
original issue was about:
void gh();
void test(__int128_t x) {
  long g = (long)x|((long)(x>>64));
  if (g) gh();
}
void test1(__int128_t x) {
  if (x) gh();
}
GCC 4.8+ produces:
test1:
.cfi_startproc
orq %rdi, %rsi
jne .L7
rep ret
For both. There was an extra mov in GCC 4.5.0-4.7.0 for test though. In GCC
4.4.0, test1 was two compare and jumps (ok). GCC 4.1.2 had the bad code
generation which was mentioned in comment #0.
* [Bug rtl-optimization/15792] missed subreg optimization
[not found] <bug-15792-4@http.gcc.gnu.org/bugzilla/>
@ 2021-10-15 3:00 ` gabravier at gmail dot com
2023-05-15 5:34 ` pinskia at gcc dot gnu.org
1 sibling, 0 replies; 12+ messages in thread
From: gabravier at gmail dot com @ 2021-10-15 3:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
Gabriel Ravier <gabravier at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |gabravier at gmail dot com
--- Comment #11 from Gabriel Ravier <gabravier at gmail dot com> ---
Seems like the issue is present again, except it's test1 that gets the better
asm now. Perhaps this should be re-opened ?
* [Bug rtl-optimization/15792] missed subreg optimization
2004-06-03 4:27 [Bug rtl-optimization/15792] New: " pinskia at gcc dot gnu dot org
2004-06-15 20:07 ` [Bug rtl-optimization/15792] " bangerth at dealii dot org
@ 2004-08-20 18:47 ` dann at godzilla dot ics dot uci dot edu
1 sibling, 0 replies; 12+ messages in thread
From: dann at godzilla dot ics dot uci dot edu @ 2004-08-20 18:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dann at godzilla dot ics dot uci dot edu 2004-08-20 18:47 -------
(In reply to comment #1)
> Indeed. In test1, we get a completely bogus sequence:
> movl 12(%ebp), %edx
> movl 8(%ebp), %eax
> movl %edx, %ecx
> orl %eax, %ecx
> What is the compiler thinking, moving data first into edx just to move
> it further into ecx the next moment?
This is a regression from gcc-3.0, the mov is not generated there:
movl 16(%esp), %eax
movl 20(%esp), %edx
orl %edx, %eax
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792
* [Bug rtl-optimization/15792] missed subreg optimization
2004-06-03 4:27 [Bug rtl-optimization/15792] New: " pinskia at gcc dot gnu dot org
@ 2004-06-15 20:07 ` bangerth at dealii dot org
2004-08-20 18:47 ` dann at godzilla dot ics dot uci dot edu
1 sibling, 0 replies; 12+ messages in thread
From: bangerth at dealii dot org @ 2004-06-15 20:07 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From bangerth at dealii dot org 2004-06-15 20:06 -------
Indeed. In test1, we get a completely bogus sequence:
movl 12(%ebp), %edx
movl 8(%ebp), %eax
movl %edx, %ecx
orl %eax, %ecx
What is the compiler thinking, moving data first into edx just to move
it further into ecx the next moment?
W.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2004-06-15 20:07:00
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15792