public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/30829] New: extra register move to self
@ 2007-02-17 4:50 dean at arctic dot org
2007-11-09 0:58 ` [Bug target/30829] extra register zero extends dje at google dot com
2007-11-09 11:08 ` rask at gcc dot gnu dot org
0 siblings, 2 replies; 3+ messages in thread
From: dean at arctic dot org @ 2007-02-17 4:50 UTC (permalink / raw)
To: gcc-bugs
this may or may not be the same as #29775.
% cat extra-mov.c <<EOF
unsigned *table;
unsigned shift;
unsigned foo1(unsigned p)
{
unsigned block = p >> shift;
unsigned value = table[block];
return (value >> 8);
}
EOF
% gcc -g -O3 -Wall -c -o extra-mov.o extra-mov.c
% objdump -dr extra-mov.o
extra-mov.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <foo1>:
0: 8b 0d 00 00 00 00 mov 0(%rip),%ecx # 6 <foo1+0x6>
2: R_X86_64_PC32 shift+0xfffffffffffffffc
6: 48 8b 05 00 00 00 00 mov 0(%rip),%rax # d <foo1+0xd>
9: R_X86_64_PC32 table+0xfffffffffffffffc
d: d3 ef shr %cl,%edi
f: 89 ff mov %edi,%edi
11: 8b 04 b8 mov (%rax,%rdi,4),%eax
14: c1 e8 08 shr $0x8,%eax
17: c3 retq
the "mov %edi,%edi" is presumably there as a 32->64 extend... but it's not
necessary because %edi was written in previous instruction.
% gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure --prefix=/home/odo/gcc --disable-multilib
--disable-biarch x86_64-unknown-linux-gnu --enable-languages=c
Thread model: posix
gcc version 4.3.0 20070217 (experimental)
i've seen it in 4.1.1 too.
-dean
--
Summary: extra register move to self
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: dean at arctic dot org
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Bug target/30829] extra register zero extends
2007-02-17 4:50 [Bug target/30829] New: extra register move to self dean at arctic dot org
@ 2007-11-09 0:58 ` dje at google dot com
2007-11-09 11:08 ` rask at gcc dot gnu dot org
1 sibling, 0 replies; 3+ messages in thread
From: dje at google dot com @ 2007-11-09 0:58 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from dje at google dot com 2007-11-09 00:57 -------
I looked into what's going on here.
This is a problem in the i386.md lshr+zext combiner patterns (or a problem in
the combine pass, depending on one's point of view). There are patterns in
i386.md that are supposed to handle this case but they're not being used.
lshrsi3_1_one_bit_zext doesn't handle the lshr(1) case. I don't understand why
the lshr is outside of the zero_extend in its definition. Looking at the dump
of the combine pass I see it trying to match a zero_extract, and if I code
lshrsi3_1_one_bit_zext to use a zero_extract then I get the expected code (i.e.
the superfluous move is gone).
lshrsi3_1_zext doesn't handle the lshr(n) case. It intuitively matches but
combine tries to "simplify" this case to (and:DI (subreg:DI (lshr ...) 0)
0xffffffff) and there's no pattern that matches this. If I add a pattern to
match this it still doesn't work because the rtx_cost of the two separate insns
lshr,zext (= 4 + 1) is less than the rtx cost of the combined pattern (= 4 + 3
+ 4) so combine rejects the change because it thinks it's more expensive.
ix86_rtx_costs has code to treat zero_extend SI->DI as a cheap case but it
doesn't get used here unfortunately because the zero_extend gets converted to
an AND (which isn't to say that's bad for the general case).
There are two other related patterns that I also don't understand:
lshrsi3_cmp_one_bit_zext and lshrsi3_cmp_zext. It's not clear to me that they
will match anything.
It's unfortunate that the intuitive pattern can't work here. i.e. for the one
bit shift one needs to use a zero_extract and for the N bit shift one needs to
use AND and hack ix86_rtx_costs. Maybe if the "simplified" pattern doesn't
work combine could try a "less simplified" pattern, or maybe combine could be
told not to simplify this particular case (either via target dependent means or
by detecting the specific case of SI->DI). Maybe I'm missing something.
--
dje at google dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dje at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Bug target/30829] extra register zero extends
2007-02-17 4:50 [Bug target/30829] New: extra register move to self dean at arctic dot org
2007-11-09 0:58 ` [Bug target/30829] extra register zero extends dje at google dot com
@ 2007-11-09 11:08 ` rask at gcc dot gnu dot org
1 sibling, 0 replies; 3+ messages in thread
From: rask at gcc dot gnu dot org @ 2007-11-09 11:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from rask at gcc dot gnu dot org 2007-11-09 11:08 -------
It's not unusual to need more than one instruction pattern for the same machine
instruction. See <URL:http://gcc.gnu.org/ml/gcc-patches/2007-10/msg01318.html>
and the followup for a recent example and what you can do about it.
--
rask at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2007-11-09 11:08 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-17 4:50 [Bug target/30829] New: extra register move to self dean at arctic dot org
2007-11-09 0:58 ` [Bug target/30829] extra register zero extends dje at google dot com
2007-11-09 11:08 ` rask at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).