public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/57193] New: suboptimal register allocation for SSE registers
From: vermaelen.wouter at gmail dot com @ 2013-05-07  9:36 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

             Bug #: 57193
           Summary: suboptimal register allocation for SSE registers
    Classification: Unclassified
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: vermaelen.wouter@gmail.com


This bug _might_ be related to PR56339, although that report talks about a
regression compared to 4.7, while this bug seems to be a regression compared to
4.4.

I was converting some hand-written asm code to SSE intrinsics, but
unfortunately the version using intrinsics generates worse code: it contains
two unnecessary 'movdqa' instructions.

I managed to reduce my test to this routine:

//--------------------------------------------------------------
#include <emmintrin.h>

void test1(const __m128i* in1, const __m128i* in2, __m128i* out,
           __m128i f, __m128i zero)
{
    __m128i c = _mm_avg_epu8(*in1, *in2);
    __m128i l = _mm_unpacklo_epi8(c, zero);
    __m128i h = _mm_unpackhi_epi8(c, zero);
    __m128i m = _mm_mulhi_epu16(l, f);
    __m128i n = _mm_mulhi_epu16(h, f);
    *out = _mm_packus_epi16(m, n);
}
//--------------------------------------------------------------
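
For readers who want to check what the routine computes, here is a rough scalar model in plain C. It is only a sketch: it assumes the `zero` argument really is the all-zero vector (the report does not say so explicitly), and the names `test1_scalar` and `test1_scalar_example`, as well as the input values, are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of test1(), ASSUMING `zero` is the all-zero vector.
   f holds the eight 16-bit lanes of the `f` argument. */
static void test1_scalar(const uint8_t in1[16], const uint8_t in2[16],
                         uint8_t out[16], const uint16_t f[8])
{
    for (int i = 0; i < 16; i++) {
        /* pavgb: unsigned byte average, rounded up */
        uint32_t c = ((uint32_t)in1[i] + in2[i] + 1u) >> 1;
        /* punpcklbw/punpckhbw with zero widen c to 16 bits; pmulhuw keeps
           the high half of the 16x16 product. Both halves use the same
           eight lanes of f, hence the i % 8. */
        uint32_t m = (c * f[i % 8]) >> 16;
        /* packuswb: m <= 254 here, so the saturation never triggers */
        out[i] = (uint8_t)m;
    }
}

/* Hypothetical self-check with made-up input values. */
static int test1_scalar_example(void)
{
    uint8_t in1[16], in2[16], out[16];
    uint16_t f[8];

    for (int i = 0; i < 16; i++) {
        in1[i] = 200;
        in2[i] = 101;
    }
    for (int i = 0; i < 8; i++)
        f[i] = 0x8000;              /* scale by 1/2 */

    test1_scalar(in1, in2, out, f);

    for (int i = 0; i < 16; i++)
        assert(out[i] == 75);       /* avg = 151, (151 * 0x8000) >> 16 = 75 */
    return out[0];
}
```

Comparing the output of the intrinsics version against a model like this is one way to confirm that both instruction sequences shown in this report are functionally equivalent and differ only in register usage.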

A GCC snapshot from a few days ago generates the following code. Versions 4.5,
4.6 and 4.7 generate similar code:

   0:   66 0f 6f 17             movdqa (%rdi),%xmm2
   4:   66 0f e0 16             pavgb  (%rsi),%xmm2
   8:   66 0f 6f da             movdqa %xmm2,%xmm3
   c:   66 0f 68 d1             punpckhbw %xmm1,%xmm2
  10:   66 0f 60 d9             punpcklbw %xmm1,%xmm3
  14:   66 0f e4 d0             pmulhuw %xmm0,%xmm2
  18:   66 0f 6f cb             movdqa %xmm3,%xmm1
  1c:   66 0f e4 c8             pmulhuw %xmm0,%xmm1
  20:   66 0f 6f c1             movdqa %xmm1,%xmm0
  24:   66 0f 67 c2             packuswb %xmm2,%xmm0
  28:   66 0f 7f 02             movdqa %xmm0,(%rdx)
  2c:   c3                      retq

GCC versions 4.3 and 4.4 (and clang) generate the following optimal(?) code:

   0:   66 0f 6f 17             movdqa (%rdi),%xmm2
   4:   66 0f e0 16             pavgb  (%rsi),%xmm2
   8:   66 0f 6f da             movdqa %xmm2,%xmm3
   c:   66 0f 68 d1             punpckhbw %xmm1,%xmm2
  10:   66 0f 60 d9             punpcklbw %xmm1,%xmm3
  14:   66 0f e4 d8             pmulhuw %xmm0,%xmm3
  18:   66 0f e4 c2             pmulhuw %xmm2,%xmm0
  1c:   66 0f 67 d8             packuswb %xmm0,%xmm3
  20:   66 0f 7f 1a             movdqa %xmm3,(%rdx)
  24:   c3                      retq



* [Bug rtl-optimization/57193] suboptimal register allocation for SSE registers
From: rguenth at gcc dot gnu.org @ 2013-05-07 11:47 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization, ra
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-05-07
                 CC|                            |vmakarov at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> 2013-05-07 11:47:25 UTC ---
Confirmed.



* [Bug rtl-optimization/57193] [4.5/4.6/4.7/4.8/4.9 Regression] suboptimal register allocation for SSE registers
From: hjl.tools at gmail dot com @ 2013-05-07 18:17 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com
            Summary|suboptimal register         |[4.5/4.6/4.7/4.8/4.9
                   |allocation for SSE          |Regression] suboptimal
                   |registers                   |register allocation for SSE
                   |                            |registers

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> 2013-05-07 18:17:11 UTC ---
It is caused by revision 156641:

http://gcc.gnu.org/ml/gcc-cvs/2010-02/msg00222.html



* [Bug rtl-optimization/57193] [4.7/4.8/4.9 Regression] suboptimal register allocation for SSE registers
From: rguenth at gcc dot gnu.org @ 2013-05-08  8:59 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |4.4.6
   Target Milestone|---                         |4.7.4
            Summary|[4.5/4.6/4.7/4.8/4.9        |[4.7/4.8/4.9 Regression]
                   |Regression] suboptimal      |suboptimal register
                   |register allocation for SSE |allocation for SSE
                   |registers                   |registers
      Known to fail|                            |4.5.3, 4.6.4



* [Bug rtl-optimization/57193] [4.7/4.8/4.9 Regression] suboptimal register allocation for SSE registers
From: rguenth at gcc dot gnu.org @ 2013-10-30 12:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
      Known to fail|                            |4.9.0

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Re-confirmed on trunk.



* [Bug rtl-optimization/57193] [4.7/4.8/4.9 Regression] suboptimal register allocation for SSE registers
From: rth at gcc dot gnu.org @ 2014-02-12 20:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Richard Henderson <rth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2013-05-07 00:00:00         |2014-2-12
                 CC|                            |rth at gcc dot gnu.org

--- Comment #4 from Richard Henderson <rth at gcc dot gnu.org> ---
It seems like incomplete reload inheritance:

(insn 19 16 21 2 (set (reg:V8HI 107)
  (truncate:V8HI
    (lshiftrt:V8SI
      (mult:V8SI (zero_extend:V8SI (subreg:V8HI (reg:V16QI 105) 0))
                 (zero_extend:V8SI (subreg:V8HI (reg/v:V2DI 101 [ f ]) 0)))
      (const_int 16 [0x10]))))
  include/emmintrin.h:1362 2134 {*umulv8hi3_highpart}
  (expr_list:REG_DEAD (reg:V16QI 105) (nil)))

      Creating newreg=111 from oldreg=107, assigning class SSE_REGS to r111
   19: r111:V8HI=trunc(zero_extend(r111:V8HI)*zero_extend(r101:V2DI#0) 0>>0x10)
      REG_DEAD r105:V16QI
    Inserting insn reload before:
   31: r111:V8HI=r105:V16QI#0
    Inserting insn reload after:
   32: r107:V8HI=r111:V8HI

The new register r111 does wind up inheriting from r107, but the inheritance
does not extend transitively to r105.  Thus we wind up leaving the copy in
insn 31 in place.
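
The effect can be illustrated with a toy model. The sketch below is not GCC's LRA code; every name in it (`copy_insn`, `find_class`, `tie`, `remaining_copies`) is hypothetical. It makes the simplifying assumption that a copy disappears whenever its source and destination end up tied to the same hard register, and it ignores live-range checks that a real allocator needs (here r105 is marked REG_DEAD at insn 19, so tying it to r111 is plausible).

```c
#include <assert.h>

/* Toy pseudo-register indices (hypothetical, for illustration only). */
enum { R105, R107, R111, NREGS };

struct copy_insn { int uid, dst, src; };

/* Follow parent links to the representative of x's class. */
static int find_class(const int *parent, int x)
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* Put a and b in the same class, i.e. give them the same hard register. */
static void tie(int *parent, int a, int b)
{
    parent[find_class(parent, a)] = find_class(parent, b);
}

/* Count the reload copies that survive in this toy model. */
static int remaining_copies(int transitive)
{
    /* The two copies from the dump: insn 31 and insn 32. */
    static const struct copy_insn copies[] = {
        { 31, R111, R105 },
        { 32, R107, R111 },
    };
    int parent[NREGS];
    int left = 0;

    for (int i = 0; i < NREGS; i++)
        parent[i] = i;

    tie(parent, R111, R107);        /* r111 inherits from r107 */
    if (transitive)
        tie(parent, R111, R105);    /* ...and, transitively, from r105 */

    for (unsigned i = 0; i < sizeof copies / sizeof copies[0]; i++)
        if (find_class(parent, copies[i].dst) !=
            find_class(parent, copies[i].src))
            left++;                 /* copy between different registers stays */
    return left;
}

/* Hypothetical usage: one copy survives without transitivity, none with it. */
static void remaining_copies_example(void)
{
    assert(remaining_copies(0) == 1);
    assert(remaining_copies(1) == 0);
}
```

In this model, tying r111 only to r107 removes insn 32 but leaves insn 31, which matches the behaviour described above; tying all three pseudos together removes both copies.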



* [Bug rtl-optimization/57193] [4.7/4.8/4.9/4.10 Regression] suboptimal register allocation for SSE registers
From: rguenth at gcc dot gnu.org @ 2014-06-12 13:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.7.4                       |4.8.4

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
The 4.7 branch is being closed, moving target milestone to 4.8.4.



* [Bug rtl-optimization/57193] [4.8/4.9/5 Regression] suboptimal register allocation for SSE registers
From: jakub at gcc dot gnu.org @ 2014-12-19 13:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.4                       |4.8.5

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.8.4 has been released.



* [Bug rtl-optimization/57193] [4.8/4.9/5/6 Regression] suboptimal register allocation for SSE registers
From: rguenth at gcc dot gnu.org @ 2015-06-23  8:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.5                       |4.9.3

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.



* [Bug rtl-optimization/57193] [4.9/5/6 Regression] suboptimal register allocation for SSE registers
From: jakub at gcc dot gnu.org @ 2015-06-26 20:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.9.3 has been released.



* [Bug rtl-optimization/57193] [4.9/5/6 Regression] suboptimal register allocation for SSE registers
From: jakub at gcc dot gnu.org @ 2015-06-26 20:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57193

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.9.3                       |4.9.4

