public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/30829]  New: extra register move to self
@ 2007-02-17  4:50 dean at arctic dot org
  2007-11-09  0:58 ` [Bug target/30829] extra register zero extends dje at google dot com
  2007-11-09 11:08 ` rask at gcc dot gnu dot org
  0 siblings, 2 replies; 3+ messages in thread
From: dean at arctic dot org @ 2007-02-17  4:50 UTC (permalink / raw)
  To: gcc-bugs

this may or may not be the same as #29775.

% cat extra-mov.c <<EOF
unsigned *table;
unsigned shift;

unsigned foo1(unsigned p)
{
  unsigned block = p >> shift;
  unsigned value = table[block];
  return (value >> 8);
}
EOF
% gcc -g -O3 -Wall   -c -o extra-mov.o extra-mov.c
% objdump -dr extra-mov.o

extra-mov.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <foo1>:
   0:   8b 0d 00 00 00 00       mov    0(%rip),%ecx        # 6 <foo1+0x6>
                        2: R_X86_64_PC32        shift+0xfffffffffffffffc
   6:   48 8b 05 00 00 00 00    mov    0(%rip),%rax        # d <foo1+0xd>
                        9: R_X86_64_PC32        table+0xfffffffffffffffc
   d:   d3 ef                   shr    %cl,%edi
   f:   89 ff                   mov    %edi,%edi
  11:   8b 04 b8                mov    (%rax,%rdi,4),%eax
  14:   c1 e8 08                shr    $0x8,%eax
  17:   c3                      retq

the "mov %edi,%edi" is presumably there as a 32->64 extend... but it's not
necessary because %edi was written in previous instruction.

% gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure --prefix=/home/odo/gcc --disable-multilib
--disable-biarch x86_64-unknown-linux-gnu --enable-languages=c
Thread model: posix
gcc version 4.3.0 20070217 (experimental)

i've seen it in 4.1.1 too.

-dean


-- 
           Summary: extra register move to self
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dean at arctic dot org
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug target/30829] extra register zero extends
  2007-02-17  4:50 [Bug target/30829] New: extra register move to self dean at arctic dot org
@ 2007-11-09  0:58 ` dje at google dot com
  2007-11-09 11:08 ` rask at gcc dot gnu dot org
  1 sibling, 0 replies; 3+ messages in thread
From: dje at google dot com @ 2007-11-09  0:58 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from dje at google dot com  2007-11-09 00:57 -------
I looked into what's going on here.

This is a problem in the i386.md lshr+zext combiner patterns (or a problem in
the combine pass, depending on one's point of view).  There are patterns in
i386.md that are supposed to handle this case but they're not being used.

lshrsi3_1_one_bit_zext doesn't handle the lshr(1) case. I don't understand why
the lshr is outside of the zero_extend in its definition.  Looking at the dump
of the combine pass I see it trying to match a zero_extract, and if I code
lshrsi3_1_one_bit_zext to use a zero_extract then I get the expected code (i.e.
the superfluous move is gone).

lshrsi3_1_zext doesn't handle the lshr(n) case.  It intuitively matches but
combine tries to "simplify" this case to (and:DI (subreg:DI (lshr ...) 0)
0xffffffff) and there's no pattern that matches this.  If I add a pattern to
match this it still doesn't work because the rtx_cost of the two separate insns
lshr,zext (= 4 + 1) is less than the rtx cost of the combined pattern (= 4 + 3
+ 4) so combine rejects the change because it thinks it's more expensive. 
ix86_rtx_costs has code to treat zero_extend SI->DI as a cheap case but it
doesn't get used here unfortunately because the zero_extend gets converted to
an AND (which isn't to say that's bad for the general case).

There are two other related patterns that I also don't understand:
lshrsi3_cmp_one_bit_zext and lshrsi3_cmp_zext.  It's not clear to me that they
will match anything.

It's unfortunate that the intuitive pattern can't work here.  i.e. for the one
bit shift one needs to use a zero_extract and for the N bit shift one needs to
use AND and hack ix86_rtx_costs.  Maybe if the "simplified" pattern doesn't
work combine could try a "less simplified" pattern, or maybe combine could be
told not to simplify this particular case (either via target dependent means or
by detecting the specific case of SI->DI).  Maybe I'm missing something.


-- 

dje at google dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dje at google dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug target/30829] extra register zero extends
  2007-02-17  4:50 [Bug target/30829] New: extra register move to self dean at arctic dot org
  2007-11-09  0:58 ` [Bug target/30829] extra register zero extends dje at google dot com
@ 2007-11-09 11:08 ` rask at gcc dot gnu dot org
  1 sibling, 0 replies; 3+ messages in thread
From: rask at gcc dot gnu dot org @ 2007-11-09 11:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from rask at gcc dot gnu dot org  2007-11-09 11:08 -------
It's not unusual to need more than one instruction pattern for the same machine
instruction. See <URL:http://gcc.gnu.org/ml/gcc-patches/2007-10/msg01318.html>
and the followup for a recent example and what you can do about it.


-- 

rask at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30829


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2007-11-09 11:08 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-17  4:50 [Bug target/30829] New: extra register move to self dean at arctic dot org
2007-11-09  0:58 ` [Bug target/30829] extra register zero extends dje at google dot com
2007-11-09 11:08 ` rask at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).