* [Bug target/54593] New: [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: sgunderson at bigfoot dot com @ 2012-09-15 16:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

             Bug #: 54593
           Summary: [missed-optimization] Move from SSE to integer
                    register goes through the stack without -march=native
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: sgunderson@bigfoot.com


Hi,

I have reproduced this on 4.4, 4.6, 4.7 and 4.8 (Debian 20120820-1, trunk
version 190537). Given the following code:

  #include <x86intrin.h>

  int test1(__m128i v) {
     return _mm_cvtsi128_si32(v);
  }

GCC generates

   0:    66 0f 7e 44 24 f4        movd   %xmm0,-0xc(%rsp)
   6:    8b 44 24 f4              mov    -0xc(%rsp),%eax
   a:    c3                       retq   

Shouldn't it go directly to %eax instead of through the stack? Granted, on
Netburst this takes ten cycles or so, but this is x86-64. It appears to be some
sort of tuning issue, since if I use -mtune=native (I am on an Atom) I get:

   0:    66 0f 7e c0              movd   %xmm0,%eax
   4:    90                       nop
   5:    90                       nop
   6:    90                       nop
   7:    90                       nop
   8:    90                       nop
   9:    90                       nop
   a:    c3                       retq   

which is sort of what I expect. Well, the NOPs are a bit weird, but... :-)
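
For anyone who wants to reproduce the comparison, here is a minimal sketch
built around the function from the report. The helper low_lane_of is a
hypothetical name added only so the result can be checked from plain C; it is
not part of the original test case.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_cvtsi128_si32, _mm_set_epi32 */

/* Same function as in the report: return the low 32 bits of the vector. */
int test1(__m128i v)
{
    return _mm_cvtsi128_si32(v);
}

/* Hypothetical helper (not in the report): build a vector from four ints
   and extract lane 0. _mm_set_epi32 takes the highest lane first, so e0
   is the value that test1 returns. */
int low_lane_of(int e3, int e2, int e1, int e0)
{
    return test1(_mm_set_epi32(e3, e2, e1, e0));
}
```

Compiling this with -O2 once with -mtune=generic and once with -mtune=atom,
then disassembling test1, should show whether the move goes through the
stack. The exact instructions depend on the GCC version, so treat the listings
in this report as what was observed in 2012 rather than a guaranteed result.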



* [Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: pinskia at gcc dot gnu.org @ 2012-09-15 16:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-09-15 16:35:43 UTC ---
This depends on the actual x86 processor.  On AMD processors, it is better to
go through memory than to move directly between the register files.
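
As a source-level illustration of the two data paths being compared, here is
a small sketch. The function names are hypothetical, and the compiler is free
to emit the same instructions for both functions; the point is only that the
two routes produce the same value, so the choice between them is purely a
tuning decision.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Direct path: ask for the low 32 bits of the SSE register as an int. */
int extract_direct(__m128i v)
{
    return _mm_cvtsi128_si32(v);
}

/* Memory path: spill the vector to a stack buffer, then reload lane 0
   with an ordinary integer load. */
int extract_via_memory(__m128i v)
{
    int lanes[4];
    _mm_storeu_si128((__m128i *)lanes, v);  /* unaligned 128-bit store */
    return lanes[0];
}
```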



* [Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: sgunderson at bigfoot dot com @ 2012-09-15 16:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #2 from sgunderson at bigfoot dot com 2012-09-15 16:38:34 UTC ---
Interesting. So it's a conscious choice that “generic” does this?



* [Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: pinskia at gcc dot gnu.org @ 2012-09-15 16:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-09-15 16:50:31 UTC ---
(In reply to comment #2)
> Interesting. So it's a conscious choice that “generic” does this?

Yes:
  /* X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY: In the Generic model we have a
     conflict here in between PPro/Pentium4 based chips that thread 128bit
     SSE registers as single units versus K8 based chips that divide SSE
     registers to two 64bit halves.  This knob promotes all store destinations
     to be 128bit to allow register renaming on 128bit SSE units, but usually
     results in one extra microop on 64bit SSE units.  Experimental results
     shows that disabling this option on P4 brings over 20% SPECfp regression,
     while enabling it on K8 brings roughly 2.4% regression that can be partly
     masked by careful scheduling of moves.  */
  m_PPRO | m_P4_NOCONA | m_CORE2I7 | m_ATOM | m_AMDFAM10 | m_BDVER | m_GENERIC,



* [Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: sgunderson at bigfoot dot com @ 2012-09-15 16:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #4 from sgunderson at bigfoot dot com 2012-09-15 16:54:28 UTC ---
I'm not sure if I understand the comment very well; it talks about Pentium 4,
but none of them run 64-bit code, do they?



* [Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: hjl.tools at gmail dot com @ 2012-09-15 20:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> 2012-09-15 20:17:24 UTC ---
(In reply to comment #4)
> I'm not sure if I understand the comment very well; it talks about Pentium 4,
> but none of them run 64-bit code, do they?

Wrong quote.  It should be

  /* X86_TUNE_INTER_UNIT_MOVES */
  ~(m_AMD_MULTIPLE | m_GENERIC),
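
To make the mask easier to read: each m_* constant is a bit set of
processors, and a tuning knob applies when the bit for the selected -mtune
target is set in the knob's mask. The sketch below models that with a
deliberately reduced, hypothetical processor list; the enum, the M() macro,
the contents of m_AMD_MULTIPLE and the function name are assumptions for
illustration, not the definitions from GCC's i386 backend.

```c
#include <stdbool.h>

/* Hypothetical, reduced processor list. The real GCC enumeration is
   longer and ordered differently. */
enum sketch_cpu { CPU_ATOM, CPU_CORE2, CPU_K8, CPU_AMDFAM10, CPU_GENERIC };

/* One bit per processor. */
#define M(cpu)          (1u << (cpu))
#define m_AMD_MULTIPLE  (M(CPU_K8) | M(CPU_AMDFAM10))  /* reduced for the sketch */
#define m_GENERIC       M(CPU_GENERIC)

/* Mask quoted in comment #5: every processor except the AMD group
   and the generic model. */
static const unsigned int inter_unit_moves_mask = ~(m_AMD_MULTIPLE | m_GENERIC);

/* True when direct SSE <-> integer register moves are allowed for the
   given tuning target in this model. */
bool inter_unit_moves_enabled(enum sketch_cpu tune)
{
    return (inter_unit_moves_mask & M(tune)) != 0;
}
```

In this model, tuning for Atom leaves the knob on (direct movd), while the
generic and AMD targets turn it off, which matches the two listings in the
original report.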



* [Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: sgunderson at bigfoot dot com @ 2012-09-15 20:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

--- Comment #6 from sgunderson at bigfoot dot com 2012-09-15 20:28:02 UTC ---
Ah. So basically the direct move hurts AMD enough (and going through memory
doesn't hurt Intel enough) that the generic tuning was made to behave the same
way. Well, as long as it's a deliberate choice, I assume it's a reasonable
tradeoff, so thanks for the enlightenment. :-)



* [Bug target/54593] [missed-optimization] Move from SSE to integer register goes through the stack without -march=native
From: pinskia at gcc dot gnu.org @ 2020-04-29  1:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54593

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gabravier at gmail dot com

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 94837 has been marked as a duplicate of this bug. ***
