public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/92180] Missed optimization on casting __builtin_ia32_rdtsc result to int32
       [not found] <bug-92180-4@http.gcc.gnu.org/bugzilla/>
@ 2020-11-16 23:57 ` cvs-commit at gcc dot gnu.org
  2021-08-14  9:50 ` pinskia at gcc dot gnu.org
  2023-09-28 10:39 ` stefan.sakalik at gmail dot com
  2 siblings, 0 replies; 3+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-11-16 23:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92180

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jeff Law <law@gcc.gnu.org>:

https://gcc.gnu.org/g:be39636d9f68c437c8a2c2e7a225c4aed4663e78

commit r11-5066-gbe39636d9f68c437c8a2c2e7a225c4aed4663e78
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Mon Nov 16 16:55:29 2020 -0700

    Improve code generation for x86_64 [PR 92180]

    This patch catches a missed optimization opportunity where GCC currently
    generates worse code than LLVM.  The issue, as nicely analyzed in bugzilla,
    boils down to the following three insns in combine:

    (insn 6 5 7 2 (parallel [
                (set (reg:DI 85)
                    (ashift:DI (reg:DI 85)
                        (const_int 32 [0x20])))
                (clobber (reg:CC 17 flags))
            ]) "pr92180.c":4:10 564 {*ashldi3_1}
         (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil)))
    (insn 7 6 14 2 (parallel [
                (set (reg:DI 84)
                    (ior:DI (reg:DI 84)
                        (reg:DI 85)))
                (clobber (reg:CC 17 flags))
            ]) "pr92180.c":4:10 454 {*iordi_1}
         (expr_list:REG_DEAD (reg:DI 85)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil))))
    (insn 14 7 15 2 (set (reg/i:SI 0 ax)
            (subreg:SI (reg:DI 84) 0)) "pr92180.c":5:1 67 {*movsi_internal}
         (expr_list:REG_DEAD (reg:DI 84)
            (nil)))

    Normally, combine/simplify-rtx would notice that insns 6 and 7
    (which update highpart bits) are unnecessary as the final insn 14
    only requires to lowpart bits.  The complication is that insn 14
    sets a hard register in targetm.class_likely_spilled_p which
    prevents combine from performing its simplifications, and removing
    the redundant instructions.

    At first glance a fix would appear to require changes to combine,
    potentially affecting code generation on all small register class
    targets...  An alternate (and I think clever) solution is to spot
    that this problematic situation can be avoided by the backend.

    At RTL expansion time, the middle-end has a clear separation between
    pseudos and hard registers, so the RTL initially contains:

    (insn 9 8 10 2 (set (reg:SI 86)
            (subreg:SI (reg:DI 82 [ _1 ]) 0)) "pr92180.c":6:10 -1
         (nil))
    (insn 10 9 14 2 (set (reg:SI 83 [ <retval> ])
            (reg:SI 86)) "pr92180.c":6:10 -1
         (nil))
    (insn 14 10 15 2 (set (reg/i:SI 0 ax)
            (reg:SI 83 [ <retval> ])) "pr92180.c":7:1 -1
         (nil))

    which can be optimized without problems by combine; it is only the
    intervening passes (initially fwprop1) that propagate computations
    into sets of hard registers, and disable those opportunities.

    The solution proposed here is to have the x86 backend/recog prevent
    early RTL passes composing instructions (that set likely_spilled hard
    registers) that they (combine) can't simplify, until after reload.
    We allow sets from pseudo registers, immediate constants and memory
    accesses, but anything more complicated is performed via a temporary
    pseudo.  Not only does this simplify things for the register allocator,
    but any remaining register-to-register moves are easily cleaned up
    by the late optimization passes after reload, such as peephole2 and
    cprop_hardreg.

    This patch has been tested on x86_64-pc-linux-gnu with a
    "make bootstrap" and a "make -k check" with no new failures.
    Ok for mainline?

    gcc
            PR rtl-optimization/92180
            * config/i386/i386.c (ix86_hardreg_mov_ok): New function to
            determine whether (set DST SRC) should be allowed at this point.
            * config/i386/i386-protos.h (ix86_hardreg_mov_ok): Prototype here.
            * config/i386/i386-expand.c (ix86_expand_move): Check whether
            this is a complex set of a likely spilled hard register, and if
            so place the value in a pseudo, and load the hard reg from it.
            * config/i386/i386.md (*movdi_internal, *movsi_internal)
            (*movhi_internal, *movqi_internal): Make these instructions
            conditional on ix86_hardreg_mov_ok.
            (*lea<mode>): Make this define_insn_and_split conditional on
            ix86_hardreg_mov_ok.

    gcc/testsuite
            PR rtl-optimization/92180
            * gcc.target/i386/pr92180.c: New test.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug rtl-optimization/92180] Missed optimization on casting __builtin_ia32_rdtsc result to int32
       [not found] <bug-92180-4@http.gcc.gnu.org/bugzilla/>
  2020-11-16 23:57 ` [Bug rtl-optimization/92180] Missed optimization on casting __builtin_ia32_rdtsc result to int32 cvs-commit at gcc dot gnu.org
@ 2021-08-14  9:50 ` pinskia at gcc dot gnu.org
  2023-09-28 10:39 ` stefan.sakalik at gmail dot com
  2 siblings, 0 replies; 3+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-14  9:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92180

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug rtl-optimization/92180] Missed optimization on casting __builtin_ia32_rdtsc result to int32
       [not found] <bug-92180-4@http.gcc.gnu.org/bugzilla/>
  2020-11-16 23:57 ` [Bug rtl-optimization/92180] Missed optimization on casting __builtin_ia32_rdtsc result to int32 cvs-commit at gcc dot gnu.org
  2021-08-14  9:50 ` pinskia at gcc dot gnu.org
@ 2023-09-28 10:39 ` stefan.sakalik at gmail dot com
  2 siblings, 0 replies; 3+ messages in thread
From: stefan.sakalik at gmail dot com @ 2023-09-28 10:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92180

Stefan <stefan.sakalik at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stefan.sakalik at gmail dot com

--- Comment #7 from Stefan <stefan.sakalik at gmail dot com> ---
This was not completely fixed, for example this code:
https://godbolt.org/z/7W9nqTsjE

#include <bits/stdc++.h>
#include <x86intrin.h>

uint32_t rdtsc32() { return static_cast<uint32_t>(__rdtsc()); }

uint64_t rdtsc_delta(uint64_t x) {
    return rdtsc32() - rdtsc32();
}

Produces 

rdtsc_delta(unsigned long):
        rdtsc
        mov     rcx, rax
        sal     rdx, 32
        or      rcx, rdx
        rdtsc
        sub     ecx, eax
        mov     rax, rcx
        ret

as opposed to clang version

rdtsc_delta(unsigned long):
        rdtsc
        mov     rcx, rax
        rdtsc
        sub     ecx, eax
        mov     rax, rcx
        ret

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-09-28 10:39 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-92180-4@http.gcc.gnu.org/bugzilla/>
2020-11-16 23:57 ` [Bug rtl-optimization/92180] Missed optimization on casting __builtin_ia32_rdtsc result to int32 cvs-commit at gcc dot gnu.org
2021-08-14  9:50 ` pinskia at gcc dot gnu.org
2023-09-28 10:39 ` stefan.sakalik at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).