public inbox for gcc-bugs@sourceware.org
* [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os
@ 2022-03-23  6:50 crazylht at gmail dot com
  2022-03-23 11:01 ` [Bug target/105034] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: crazylht at gmail dot com @ 2022-03-23  6:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

            Bug ID: 105034
           Summary: [10/11/12 regression]Suboptimal codegen for min/max
                    with -Os
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

#define max(a,b) (((a) > (b))? (a) : (b))
#define min(a,b) (((a) < (b))? (a) : (b))

int foo(int x)
{
  return max(x,0);
}

int bar(int x)
{
  return min(x,0);
}

unsigned int baz(unsigned int x)
{
  return min(x,1);
}

gcc10/11/12 -Os -msse4.1

foo(int):
        movd    xmm0, edi
        xorps   xmm1, xmm1
        pmaxsd  xmm0, xmm1
        movd    eax, xmm0
        ret
bar(int):
        movd    xmm0, edi
        xorps   xmm1, xmm1
        pminsd  xmm0, xmm1
        movd    eax, xmm0
        ret
baz(unsigned int):
        xor     eax, eax
        test    edi, edi
        setne   al
        ret

gcc9.4 -Os -msse4.1

foo(int):
        test    edi, edi
        mov     eax, 0
        cmovns  eax, edi
        ret
bar(int):
        test    edi, edi
        mov     eax, 0
        cmovle  eax, edi
        ret
baz(unsigned int):
        xor     eax, eax
        test    edi, edi
        setne   al
        ret

Os size 
   text    data     bss     dec     hex filename
    178       0       0     178      b2 Os.o


O2 size
   text    data     bss     dec     hex filename
    176       0       0     176      b0 O2.o

https://godbolt.org/z/1sYxdTcKz

* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
@ 2022-03-23 11:01 ` rguenth at gcc dot gnu.org
  2022-03-28  2:49 ` wwwhhhyyy333 at gmail dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-03-23 11:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-03-23
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Target Milestone|---                         |10.4
           Priority|P3                          |P2

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
With -mavx it gets worse since vpxor is one byte larger than xorps.  Not sure
if STV is tuned for -Os very well.

* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
  2022-03-23 11:01 ` [Bug target/105034] " rguenth at gcc dot gnu.org
@ 2022-03-28  2:49 ` wwwhhhyyy333 at gmail dot com
  2022-04-14  8:38 ` roger at nextmovesoftware dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: wwwhhhyyy333 at gmail dot com @ 2022-03-28  2:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

--- Comment #2 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
For -O2, STV doesn't do this transform:
Computing gain for chain #1...
  Instruction gain 8 for     7: {r84:SI=smax(r85:SI,0);clobber flags:CC;}
      REG_DEAD r85:SI
      REG_UNUSED flags:CC
  Instruction conversion gain: 8
  Registers conversion cost: 12
  Total gain: -4

This is because the sse->integer reg move cost is 6 in the generic cost table.

But for -Os the cost is 3, so the conversion is considered profitable:
Computing gain for chain #1...
  Instruction gain 8 for     7: {r84:SI=smax(r85:SI,0);clobber flags:CC;}
      REG_DEAD r85:SI
      REG_UNUSED flags:CC
  Instruction conversion gain: 8
  Registers conversion cost: 6
  Total gain: 2

FWIW, the solution would be to either adjust the ix86_size cost or block
optimize_size in the STV gate.
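
The arithmetic in the two dumps above can be reproduced with a small sketch.
The function names are hypothetical and are not taken from the GCC sources.
The count of two register moves is an inference from the dumps
(12 = 2 * 6 and 6 = 2 * 3), and the "convert only when the total gain is
positive" rule is an assumption that matches the behaviour described here.

```c
/* Hypothetical helper mirroring the numbers in the STV dump:
   total gain = instruction conversion gain - registers conversion cost.
   The registers conversion cost is modelled as n_moves * move_cost,
   which is an assumption inferred from the dump, not GCC's actual code. */
int stv_total_gain(int insn_gain, int n_moves, int move_cost)
{
    return insn_gain - n_moves * move_cost;
}

/* Assumed decision rule: the chain is converted only if the gain is > 0. */
int stv_would_convert(int insn_gain, int n_moves, int move_cost)
{
    return stv_total_gain(insn_gain, n_moves, move_cost) > 0;
}
```

With an instruction gain of 8 and two moves, a move cost of 6 gives -4 (the
-O2 case, chain left alone) and a move cost of 3 gives 2 (the -Os case, chain
converted).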

* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
  2022-03-23 11:01 ` [Bug target/105034] " rguenth at gcc dot gnu.org
  2022-03-28  2:49 ` wwwhhhyyy333 at gmail dot com
@ 2022-04-14  8:38 ` roger at nextmovesoftware dot com
  2022-04-14  9:00 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: roger at nextmovesoftware dot com @ 2022-04-14  8:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com

--- Comment #3 from Roger Sayle <roger at nextmovesoftware dot com> ---
Hi Hongtao,
Note that -mstv is a net win on the code size benchmark CSiBE, so gating the
entire pass on optimize_size is not an ideal solution.  Instead the gain
function needs to choose which chains to transform based on optimize_size aware
costs.

* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
                   ` (2 preceding siblings ...)
  2022-04-14  8:38 ` roger at nextmovesoftware dot com
@ 2022-04-14  9:00 ` rguenth at gcc dot gnu.org
  2022-04-14  9:20 ` roger at nextmovesoftware dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-04-14  9:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Example that we don't transform but could:

typedef int v4si __attribute__((vector_size(16)));

#define min(a,b) ((a)<(b)?(a):(b))

v4si foo (v4si a, v4si b)
{
  a[0] = min (a[0], b[0]);
  return a;
}

here the scalar code is

        movd    %xmm0, %edx
        movd    %xmm1, %eax
        cmpl    %edx, %eax
        cmovg   %edx, %eax
        pinsrd  $0, %eax, %xmm0

where we could use something like

        movq %xmm0, %xmm2
        pminsd %xmm2, %xmm1
        <some pack/unpack/palign or whatever>

A testcase variant could return the scalar minimum.  For both cases it's
likely a win even for -Os.
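
One possible reading of the "variant that returns the scalar minimum" is
sketched here.  The function name foo_scalar_min is hypothetical.  The code
relies on the GCC/Clang vector extension already used in the testcase and
makes no claim about what code any particular GCC version generates for it.

```c
/* GCC/Clang vector extension, same typedef as in the testcase. */
typedef int v4si __attribute__((vector_size(16)));

#define min(a,b) ((a)<(b)?(a):(b))

/* Hypothetical variant: both inputs already live in vector registers,
   but the result is the scalar minimum of lane 0 rather than a vector
   with lane 0 updated. */
int foo_scalar_min (v4si a, v4si b)
{
  return min (a[0], b[0]);
}
```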

* [Bug target/105034] [10/11/12 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
                   ` (3 preceding siblings ...)
  2022-04-14  9:00 ` rguenth at gcc dot gnu.org
@ 2022-04-14  9:20 ` roger at nextmovesoftware dot com
  2022-06-28 10:48 ` [Bug target/105034] [10/11/12/13 " jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: roger at nextmovesoftware dot com @ 2022-04-14  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

--- Comment #5 from Roger Sayle <roger at nextmovesoftware dot com> ---
The latest CSiBE results on x86_64-pc-linux-gnu:  With -Os the total size is
3696263, and with -Os -mno-stv the total size is 3696887, i.e. 624 bytes
larger.  The worst regression from -mno-stv is
teem-1.6.0-src/src/nrrd/parseNrrd which is 402 bytes larger, and the best
improvement from -mno-stv is linux-2.4.23-pre3-testplatform/net/ipv4/route
which is 134 bytes smaller.  So I think this is a fine tuning problem.

cmp/cmov is much shorter than a pmax or a pmin, so SImode MAX/MIN should have
negative gain with -Os.  Likewise for const0_rtx.
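
To make the size difference concrete, the sketch below tallies the two foo
listings from the original report.  The byte counts are my own tally of the
usual non-VEX x86-64 encodings (shown in the comments) and do not come from
the thread.  The struct and function names are hypothetical.

```c
#include <stddef.h>

/* One instruction and its encoded length in bytes. */
struct insn { const char *text; int bytes; };

/* gcc 9.4 -Os: integer test/cmov sequence for foo. */
static const struct insn scalar_foo[] = {
    { "test   edi, edi",   2 },  /* 85 FF */
    { "mov    eax, 0",     5 },  /* B8 imm32 */
    { "cmovns eax, edi",   3 },  /* 0F 49 C7 */
    { "ret",               1 },  /* C3 */
};

/* gcc 10/11/12 -Os -msse4.1: STV-converted sequence for foo. */
static const struct insn sse_foo[] = {
    { "movd   xmm0, edi",  4 },  /* 66 0F 6E C7 */
    { "xorps  xmm1, xmm1", 3 },  /* 0F 57 C9 */
    { "pmaxsd xmm0, xmm1", 5 },  /* 66 0F 38 3D C1 */
    { "movd   eax, xmm0",  4 },  /* 66 0F 7E C0 */
    { "ret",               1 },  /* C3 */
};

/* Sum the encoded lengths of a sequence. */
static int total_bytes(const struct insn *seq, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += seq[i].bytes;
    return sum;
}

int scalar_foo_bytes(void)
{
    return total_bytes(scalar_foo, sizeof scalar_foo / sizeof scalar_foo[0]);
}

int sse_foo_bytes(void)
{
    return total_bytes(sse_foo, sizeof sse_foo / sizeof sse_foo[0]);
}
```

Under these assumptions the integer sequence is 11 bytes and the SSE sequence
is 17 bytes, so a size-aware gain for this chain would be negative.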

* [Bug target/105034] [10/11/12/13 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
                   ` (4 preceding siblings ...)
  2022-04-14  9:20 ` roger at nextmovesoftware dot com
@ 2022-06-28 10:48 ` jakub at gcc dot gnu.org
  2023-07-07 10:42 ` [Bug target/105034] [11/12/13/14 " rguenth at gcc dot gnu.org
  2023-11-01  3:54 ` crazylht at gmail dot com
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-06-28 10:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.

* [Bug target/105034] [11/12/13/14 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
                   ` (5 preceding siblings ...)
  2022-06-28 10:48 ` [Bug target/105034] [10/11/12/13 " jakub at gcc dot gnu.org
@ 2023-07-07 10:42 ` rguenth at gcc dot gnu.org
  2023-11-01  3:54 ` crazylht at gmail dot com
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |11.5

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.

* [Bug target/105034] [11/12/13/14 regression]Suboptimal codegen for min/max with -Os
  2022-03-23  6:50 [Bug target/105034] New: [10/11/12 regression]Suboptimal codegen for min/max with -Os crazylht at gmail dot com
                   ` (6 preceding siblings ...)
  2023-07-07 10:42 ` [Bug target/105034] [11/12/13/14 " rguenth at gcc dot gnu.org
@ 2023-11-01  3:54 ` crazylht at gmail dot com
  7 siblings, 0 replies; 9+ messages in thread
From: crazylht at gmail dot com @ 2023-11-01  3:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105034

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
Looks like it's fixed in latest trunk.
