public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift
@ 2021-02-03 17:42 gabravier at gmail dot com
  2021-02-04  7:52 ` [Bug target/98957] [11 Regression] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: gabravier at gmail dot com @ 2021-02-03 17:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

            Bug ID: 98957
           Summary: [x86] Odd code generation for 8-bit left shift
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

bool f(uint8_t m)
{
    return m >> 7;
}

With -O3, LLVM outputs this:

f(unsigned char):
  mov eax, edi
  shr al, 7
  ret

With -march=[some-amd-machine-type], GCC outputs this:

f(unsigned char):
  mov eax, edi
  shr ax, 7
  and eax, 1
  ret

Otherwise, it outputs the same thing as LLVM. I took a long look at
x86-tune.def to see if there was anything related to this triggered by either
from m_AMD_MULTIPLE or m_ZNVER, but couldn't find anything. Also, even if this
is normal (8-bit shr is bad and 16-bit shr is better?? Since when ?? I've
searched for a while and found nothing about this), GCC 10 outputs this:

f(unsigned char):
  movzx eax, dil
  shr ax, 7
  ret

making this look at the very least like a regression to me.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit left shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
@ 2021-02-04  7:52 ` rguenth at gcc dot gnu.org
  2021-02-04  9:13 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-02-04  7:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
            Summary|[x86] Odd code generation   |[11 Regression] [x86] Odd
                   |for 8-bit left shift        |code generation for 8-bit
                   |                            |left shift
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-02-04
   Target Milestone|---                         |11.0
           Keywords|                            |needs-bisection

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Indeed odd.  -mtune=core-avx2 outputs just

        movl    %edi, %eax
        shrb    $7, %al
        ret

which is also what generic tuning produces.  This must be some partial
reg stall stuff, not sure which.

Maybe there's again some flags aliasing going on in the backend.  Marking as
regression for investigation.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit left shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
  2021-02-04  7:52 ` [Bug target/98957] [11 Regression] " rguenth at gcc dot gnu.org
@ 2021-02-04  9:13 ` jakub at gcc dot gnu.org
  2021-02-04  9:19 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-04  9:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r11-5066-gbe39636d9f68c437c8a2c2e7a225c4aed4663e78

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit left shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
  2021-02-04  7:52 ` [Bug target/98957] [11 Regression] " rguenth at gcc dot gnu.org
  2021-02-04  9:13 ` jakub at gcc dot gnu.org
@ 2021-02-04  9:19 ` jakub at gcc dot gnu.org
  2021-02-04  9:43 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-04  9:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The first change is during cse1:
 (insn 10 9 11 2 (parallel [
             (set (reg:HI 90)
                 (and:HI (subreg:HI (reg:QI 89) 0)
                     (const_int 1 [0x1])))
             (clobber (reg:CC 17 flags))
         ]) "pr98957.c":3:14 489 {*andhi_1}
      (expr_list:REG_DEAD (reg:QI 89)
         (expr_list:REG_UNUSED (reg:CC 17 flags)
             (nil))))
 (insn 11 10 12 2 (set (reg:QI 86)
         (subreg:QI (reg:HI 90) 0)) "pr98957.c":3:14 77 {*movqi_internal}
      (expr_list:REG_DEAD (reg:HI 90)
         (nil)))
 (insn 12 11 16 2 (set (reg:QI 83 [ <retval> ])
         (subreg:QI (reg:HI 90) 0)) "pr98957.c":3:14 77 {*movqi_internal}
      (expr_list:REG_DEAD (reg:QI 86)
         (nil)))
 (insn 16 12 17 2 (set (reg/i:QI 0 ax)
-        (subreg:QI (reg:HI 90) 0)) "pr98957.c":4:1 77 {*movqi_internal}
+        (reg:QI 86)) "pr98957.c":4:1 77 {*movqi_internal}
      (expr_list:REG_DEAD (reg:QI 83 [ <retval> ])
         (nil)))
and that then changes how combine handles it.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit left shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
                   ` (2 preceding siblings ...)
  2021-02-04  9:19 ` jakub at gcc dot gnu.org
@ 2021-02-04  9:43 ` jakub at gcc dot gnu.org
  2021-02-04 10:07 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-04  9:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org,
                   |                            |sayle at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The thing is that during combine that change allows one further optimization.
After successfully optimizing the and away:
Trying 8 -> 10:
    8: {r88:HI=r87:HI 0>>0x7;clobber flags:CC;}
      REG_DEAD r87:HI
      REG_UNUSED flags:CC
   10: {r90:HI=r88:HI&0x1;clobber flags:CC;}
      REG_DEAD r88:HI
      REG_UNUSED flags:CC
Successfully matched this instruction:
(parallel [
        (set (reg:HI 90)
            (lshiftrt:HI (reg:HI 87 [ m ])
                (const_int 7 [0x7])))
        (clobber (reg:CC 17 flags))
    ])
it adds it back again:
Trying 7, 10 -> 11:
    7: r87:HI=zero_extend(r91:SI#0)
      REG_DEAD r91:SI
   10: {r90:HI=r87:HI 0>>0x7;clobber flags:CC;}
      REG_DEAD r87:HI
      REG_UNUSED flags:CC
   11: r86:QI=r90:HI#0
      REG_DEAD r90:HI
Failed to match this instruction:
(set (subreg:HI (reg:QI 86) 0)
    (zero_extract:HI (subreg:HI (reg:SI 91) 0)
        (const_int 1 [0x1])
        (const_int 7 [0x7])))
Failed to match this instruction:
(set (subreg:HI (reg:QI 86) 0)
    (and:HI (lshiftrt:HI (subreg:HI (reg:SI 91) 0)
            (const_int 7 [0x7]))
        (const_int 1 [0x1])))
Successfully matched this instruction:
(set (reg:HI 90)
    (lshiftrt:HI (subreg:HI (reg:SI 91) 0)
        (const_int 7 [0x7])))
Successfully matched this instruction:
(set (subreg:HI (reg:QI 86) 0)
    (and:HI (reg:HI 90)
        (const_int 1 [0x1])))
allowing combination of insns 7, 10 and 11
original costs 4 + 4 + 4 = 12
replacement costs 4 + 4 = 8
deferring deletion of insn with uid = 7.
modifying insn i2    10: {r90:HI=r91:SI#0 0>>0x7;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_DEAD r91:SI
deferring rescan insn with uid = 10.
modifying insn i3    11: {r86:QI#0=r90:HI&0x1;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_DEAD r90:HI
deferring rescan insn with uid = 11.
in a 3 to 2 combination.  It is unclear why the
(insn 11 10 16 2 (set (reg:QI 86)
        (subreg:QI (reg:HI 90) 0)) "pr98957.c":3:14 77 {*movqi_internal}
     (expr_list:REG_DEAD (reg:HI 90)
        (nil)))
insn is considered to have any cost at all though...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit left shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
                   ` (3 preceding siblings ...)
  2021-02-04  9:43 ` jakub at gcc dot gnu.org
@ 2021-02-04 10:07 ` jakub at gcc dot gnu.org
  2021-02-04 11:34 ` [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit right shift jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-04 10:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Though, in this case it seems like the best fix is to:
2021-02-04  Jakub Jelinek  <jakub@redhat.com>

        PR target/98957
        * config/i386/x86-tune.def (X86_TUNE_BRANCH_PREDICTION_HINTS,
        X86_TUNE_PROMOTE_QI_REGS): Use HOST_WIDE_INT_0U instead of 0U.
        (X86_TUNE_QIMODE_MATH): Use ~HOST_WIDE_INT_0U instead of ~0U.

--- gcc/config/i386/x86-tune.def.jj     2021-01-04 10:25:45.175162012 +0100
+++ gcc/config/i386/x86-tune.def        2021-02-04 10:56:20.031489884 +0100
@@ -580,15 +580,16 @@ DEF_TUNE (X86_TUNE_AVOID_VECTOR_DECODE,
    on simulation result. But after P4 was made, no performance benefit
    was observed with branch hints.  It also increases the code size.
    As a result, icc never generates branch hints.  */
-DEF_TUNE (X86_TUNE_BRANCH_PREDICTION_HINTS, "branch_prediction_hints", 0U)
+DEF_TUNE (X86_TUNE_BRANCH_PREDICTION_HINTS, "branch_prediction_hints",
+         HOST_WIDE_INT_0U)

 /* X86_TUNE_QIMODE_MATH: Enable use of 8bit arithmetic.  */
-DEF_TUNE (X86_TUNE_QIMODE_MATH, "qimode_math", ~0U)
+DEF_TUNE (X86_TUNE_QIMODE_MATH, "qimode_math", ~HOST_WIDE_INT_0U)

 /* X86_TUNE_PROMOTE_QI_REGS: This enables generic code that promotes all 8bit
    arithmetic to 32bit via PROMOTE_MODE macro.  This code generation scheme
    is usually used for RISC targets.  */
-DEF_TUNE (X86_TUNE_PROMOTE_QI_REGS, "promote_qi_regs", 0U)
+DEF_TUNE (X86_TUNE_PROMOTE_QI_REGS, "promote_qi_regs", HOST_WIDE_INT_0U)

 /* X86_TUNE_EMIT_VZEROUPPER: This enables vzeroupper instruction insertion
    before a transfer of control flow out of the function.  */

because disabling QImode math on all the PROCESSOR_* tunings which happen to be
>= 32 seems unintended.  E.g. in GCC 8 that was just
PROCESSOR_BTVER2 and PROCESSOR_ZNVER1, in GCC <= 7 none, in GCC 9
PROCESSOR_BDVER{2,3,4}, PROCESSOR_BTVER{1,2} and PROCESSOR_ZNVER{1,2},
GCC 10 added PROCESSOR_AMDFAM10 to the GCC 9 set and trunk adds
PROCESSOR_ATHLON, PROCESSOR_K8 and PROCESSOR_ZNVER3 to that set (set of tunings
that disable QImode math).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit right shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
                   ` (4 preceding siblings ...)
  2021-02-04 10:07 ` jakub at gcc dot gnu.org
@ 2021-02-04 11:34 ` jakub at gcc dot gnu.org
  2021-02-05  9:40 ` cvs-commit at gcc dot gnu.org
  2021-02-05  9:46 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-04 11:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
            Summary|[11 Regression] [x86] Odd   |[11 Regression] [x86] Odd
                   |code generation for 8-bit   |code generation for 8-bit
                   |left shift                  |right shift
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit right shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
                   ` (5 preceding siblings ...)
  2021-02-04 11:34 ` [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit right shift jakub at gcc dot gnu.org
@ 2021-02-05  9:40 ` cvs-commit at gcc dot gnu.org
  2021-02-05  9:46 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-02-05  9:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:37876976b0511ec96741f638f160874f2added0e

commit r11-7121-g37876976b0511ec96741f638f160874f2added0e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Feb 5 10:39:03 2021 +0100

    i386: Fix up TARGET_QIMODE_MATH for many AMD CPU tunings [PR98957]

    As written in the PR, TARGET_QIMODE_MATH was meant to be set for all
    tunings and it was the case for GCC <= 7, but as the number of
    PROCESSOR_* enumerators grew, some AMD tunings (which are at the end
    of the list) over time got enumerators with values >= 32 and
    TARGET_QIMODE_MATH became disabled for them, in GCC 8 for 2
    tunings, in GCC 9 for 7 tunings, in GCC 10 for 8 tunings, and
    on the trunk for 11 tunings.

    The following patch fixes it by using uhwis rather than uints
    and gives them also symbolic names.

    2021-02-05  Jakub Jelinek  <jakub@redhat.com>

            PR target/98957
            * config/i386/i386-options.c (m_NONE, m_ALL): Define.
            * config/i386/x86-tune.def (X86_TUNE_BRANCH_PREDICTION_HINTS,
            X86_TUNE_PROMOTE_QI_REGS): Use m_NONE instead of 0U.
            (X86_TUNE_QIMODE_MATH): Use m_ALL instead of ~0U.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit right shift
  2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
                   ` (6 preceding siblings ...)
  2021-02-05  9:40 ` cvs-commit at gcc dot gnu.org
@ 2021-02-05  9:46 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-05  9:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98957

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-02-05  9:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-03 17:42 [Bug target/98957] New: [x86] Odd code generation for 8-bit left shift gabravier at gmail dot com
2021-02-04  7:52 ` [Bug target/98957] [11 Regression] " rguenth at gcc dot gnu.org
2021-02-04  9:13 ` jakub at gcc dot gnu.org
2021-02-04  9:19 ` jakub at gcc dot gnu.org
2021-02-04  9:43 ` jakub at gcc dot gnu.org
2021-02-04 10:07 ` jakub at gcc dot gnu.org
2021-02-04 11:34 ` [Bug target/98957] [11 Regression] [x86] Odd code generation for 8-bit right shift jakub at gcc dot gnu.org
2021-02-05  9:40 ` cvs-commit at gcc dot gnu.org
2021-02-05  9:46 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).