public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/53639] New: x86_64: redundant 64-bit operations on 32-bit integers
@ 2012-06-11 22:49 mikpe at it dot uu.se
  2012-06-12  7:40 ` [Bug target/53639] " jakub at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: mikpe at it dot uu.se @ 2012-06-11 22:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

             Bug #: 53639
           Summary: x86_64: redundant 64-bit operations on 32-bit integers
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: mikpe@it.uu.se


Created attachment 27605
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27605
test case

> cat redundant-64-bit-ops.c
struct tlb_entry {
    unsigned int tag;
    unsigned int va_pa_off;
};

struct core {
    struct tlb_entry tlb[1 << 10];
    unsigned char *hmem;
};

unsigned char read_byte_slow(struct core *core, unsigned int va);

static inline unsigned char read_byte(struct core *core, unsigned int va)
{
    unsigned int vpn;
    struct tlb_entry *te;

    vpn = va >> 12;
    te = &core->tlb[vpn & ((1 << 10) - 1)];
    if (__builtin_expect(te->tag != vpn, 0))
        return read_byte_slow(core, va);
    return *(core->hmem + va + te->va_pa_off);
}

unsigned char foo(struct core *core, unsigned int *q)
{
    return read_byte(core, *q);
}
> /tmp/objdir/gcc/xgcc -B/tmp/objdir/gcc -O3 -S redundant-64-bit-ops.c
> cat redundant-64-bit-ops.s
        .file   "redundant-64-bit-ops.c"
        .text
        .p2align 4,,15
        .globl  foo
        .type   foo, @function
foo:
.LFB1:
        .cfi_startproc
        movl    (%rsi), %esi (G)
        movl    %esi, %edx
        shrl    $12, %edx (C)
        movq    %rdx, %rcx (A)
        andl    $1023, %ecx (B)
        leaq    (%rdi,%rcx,8), %rcx
        cmpl    (%rcx), %edx
        jne     .L4
        movl    %esi, %eax (D)
        movl    4(%rcx), %edx
        addq    8192(%rdi), %rax (E)
        movzbl  (%rax,%rdx), %eax (F)
        ret
.L4:
        jmp     read_byte_slow
        .cfi_endproc
.LFE1:
        .size   foo, .-foo
        .ident  "GCC: (GNU) 4.8.0 20120610 (experimental)"
        .section        .note.GNU-stack,"",@progbits

1. Instruction (A) does a 64-bit move, however instruction (B) shows that only
the low 32 bits of the destination are live, and instruction (C) shows that the
source is already in zero-extended form.  Therefore instruction (A) should just
be a 32-bit 'movl %edx, %ecx'.

2. Instruction (D) is either a zero-extension, or a redundant move due to poor
register allocation.  The destination does need to be in zero-extended form to
work in instructions (E) and (F), but the source is already in zero-extended
form since instruction (G), so %rax should have been replaced with %rsi in (E)
and (F), and (D) should have been deleted.

The above was with gcc-4.8-20120610, but gcc-4.7-20120605 has the same problem.
gcc-4.6.3 has the first problem but not the second, so the likely path is one
instruction shorter there.  Unfortunately gcc-4.6.3 chose %eax as the
destination for the first load from %rsi, forcing it to insert a compensation
move back to %esi before the tailcall in the unlikely path.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers
  2012-06-11 22:49 [Bug target/53639] New: x86_64: redundant 64-bit operations on 32-bit integers mikpe at it dot uu.se
@ 2012-06-12  7:40 ` jakub at gcc dot gnu.org
  2012-06-12 10:41 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-06-12  7:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-06-12 07:40:26 UTC ---
Created attachment 27606
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27606
gcc48-pr53639.patch

The first problem is that combiner combines:
(insn 9 8 10 2 (parallel [
            (set (reg:SI 74 [ D.1765 ])
                (and:SI (reg/v:SI 60 [ vpn ])
                    (const_int 1023 [0x3ff])))
            (clobber (reg:CC 17 flags))
        ]) pr53639.c:19 378 {*andsi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

(insn 10 9 11 2 (set (reg:DI 75 [ D.1765 ])
        (zero_extend:DI (reg:SI 74 [ D.1765 ]))) pr53639.c:19 112
{*zero_extendsidi2_rex64}
     (expr_list:REG_DEAD (reg:SI 74 [ D.1765 ])
        (nil)))

into:
(insn 10 9 11 2 (parallel [
            (set (reg:DI 75 [ D.1765 ])
                (and:DI (subreg:DI (reg/v:SI 60 [ vpn ]) 0)
                    (const_int 1023 [0x3ff])))
            (clobber (reg:CC 17 flags))
        ]) pr53639.c:19 377 {*anddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(expand_compound_operation in particular).  But the presence of the DImode
paradoxical subreg leads the RA to do the move in 64-bit rather than 32-bit.

The attached untested patch cures that by splitting *anddi_1 into *andsi_1_zext
so that the zero extension from SImode to DImode is done only on the result of
the and.

The second problem looks like RA decision, initially the SI 59 register (read
from *q) and DI 80 register (zero_extend:DI (reg:SI 59)) are given the eax/rax
register:
      Popping a2(r80,l0)  -- assign reg 0
...
      Popping a5(r59,l0)  -- assign reg 0
but there is also an esi = (reg:SI 59) assignment in another code branch (set
up of parameters for the tail call), so in the end IRA decides to put SI 59
into %esi register, but doesn't reconsider that the corresponding DI 80
register could be very well moved to that register as well:
Assigning 4 to a5r59
Disposition:
    5:r59  l0     4    6:r60  l0     1    4:r62  l0     2    0:r70  l0     0
    3:r72  l0     5    8:r73  l0     4    7:r75  l0     2    1:r78  l0     1
    2:r80  l0     0
Nothing afterwards fixes this up then.  The REE pass does nothing, as the
zero_extend uses different registers (%rax = zext (%esi)), so it doesn't
eliminate the extension, and supposedly because of that other passes don't
consider it worthwhile to rename the regs.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers
  2012-06-11 22:49 [Bug target/53639] New: x86_64: redundant 64-bit operations on 32-bit integers mikpe at it dot uu.se
  2012-06-12  7:40 ` [Bug target/53639] " jakub at gcc dot gnu.org
@ 2012-06-12 10:41 ` jakub at gcc dot gnu.org
  2012-06-12 12:21 ` ubizjak at gmail dot com
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-06-12 10:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #27606|0                           |1
        is obsolete|                            |

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-06-12 10:41:18 UTC ---
Created attachment 27608
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27608
gcc48-pr53639.patch

Unfortunately that patch regressed pr49095.c testcase.  So, either we limit the
splitter to the paradoxical subreg that is created by the combiner when seeing
SImode and followed by zero_extend to DImode of the result (done in this
patch), or we'd need to add new peepholes for the a = mem; a &= const; mem = a;
if (a)
cases where a &= const has been transformed into andsi_1_zext.  Uros, any
preference?


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers
  2012-06-11 22:49 [Bug target/53639] New: x86_64: redundant 64-bit operations on 32-bit integers mikpe at it dot uu.se
  2012-06-12  7:40 ` [Bug target/53639] " jakub at gcc dot gnu.org
  2012-06-12 10:41 ` jakub at gcc dot gnu.org
@ 2012-06-12 12:21 ` ubizjak at gmail dot com
  2012-06-14 18:27 ` jakub at gcc dot gnu.org
  2024-03-28  6:25 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: ubizjak at gmail dot com @ 2012-06-12 12:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

--- Comment #3 from Uros Bizjak <ubizjak at gmail dot com> 2012-06-12 12:21:07 UTC ---
(In reply to comment #2)

> Unfortunately that patch regressed pr49095.c testcase.  So, either we limit the
> splitter to the paradoxical subreg that is created by the combiner when seeing
> SImode and followed by zero_extend to DImode of the result (done in this
> patch), or we'd need to add new peepholes for the a = mem; a &= const; mem = a;
> if (a)
> cases where a &= const has been transformed into andsi_1_zext.  Uros, any
> preference?

The splitter, since the scheduler can break interesting sequence by inserting
unrelated instructions.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers
  2012-06-11 22:49 [Bug target/53639] New: x86_64: redundant 64-bit operations on 32-bit integers mikpe at it dot uu.se
                   ` (2 preceding siblings ...)
  2012-06-12 12:21 ` ubizjak at gmail dot com
@ 2012-06-14 18:27 ` jakub at gcc dot gnu.org
  2024-03-28  6:25 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-06-14 18:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-06-14 18:27:00 UTC ---
Author: jakub
Date: Thu Jun 14 18:26:53 2012
New Revision: 188629

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188629
Log:
    PR target/53639
    * config/i386/i386.md (*anddi_1 into *andsi_1_zext splitter): New.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.md


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/53639] x86_64: redundant 64-bit operations on 32-bit integers
  2012-06-11 22:49 [Bug target/53639] New: x86_64: redundant 64-bit operations on 32-bit integers mikpe at it dot uu.se
                   ` (3 preceding siblings ...)
  2012-06-14 18:27 ` jakub at gcc dot gnu.org
@ 2024-03-28  6:25 ` pinskia at gcc dot gnu.org
  4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-28  6:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53639

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |4.8.0
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed a long time ago.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-03-28  6:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-11 22:49 [Bug target/53639] New: x86_64: redundant 64-bit operations on 32-bit integers mikpe at it dot uu.se
2012-06-12  7:40 ` [Bug target/53639] " jakub at gcc dot gnu.org
2012-06-12 10:41 ` jakub at gcc dot gnu.org
2012-06-12 12:21 ` ubizjak at gmail dot com
2012-06-14 18:27 ` jakub at gcc dot gnu.org
2024-03-28  6:25 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).