public inbox for gcc-bugs@sourceware.org
* [Bug target/44578]  New: GCC generates MMX instructions but fails to generate "emms"
@ 2010-06-18 10:27 stephen dot dolan at havok dot com
From: stephen dot dolan at havok dot com @ 2010-06-18 10:27 UTC (permalink / raw)
  To: gcc-bugs

I'm using GCC to compile some code which uses SSE intrinsics. The code is being
compiled at -O3 -mfpmath=sse.

GCC decides to use MMX instructions for some of the operations (zeroing some
memory). There are no MMX intrinsics in the source, but an SSE _mm_setzero_ps
gets compiled into a pair of movq stores from %mm0.

No emms instruction is emitted afterwards, possibly because gcc "knows" the FPU
is not in use because of the -mfpmath=sse switch. This leaves the FPU in MMX
mode.

However, the platform ABI specifies that floating-point values be returned in
FPU registers, so gcc moves return values from SSE registers to the FPU when
returning from a function. Since the FPU is in an invalid state because of the
missing emms, this corrupts the floating-point values.
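
As a possible workaround until this is fixed, the MMX state can be cleared by
hand before any x87 use. The following is only a minimal sketch and not part of
the testcase: clearMmxState, retFloatAfterClear and checkReturnPath are
hypothetical names, and it assumes that _mm_empty() from <mmintrin.h> (which
emits emms) is available whenever the compiler defines __MMX__.

```cpp
#include <cassert>
#if defined(__MMX__)
#include <mmintrin.h>   // _mm_empty(), which compiles to emms
#endif

// Hypothetical helper: mark all x87 registers as empty again so that
// later floating-point loads do not hit a "full" register stack.
// Compiles to nothing when MMX is not enabled for the target.
inline void clearMmxState()
{
#if defined(__MMX__)
  _mm_empty();
#endif
}

// Hypothetical stand-in for retFloat(): clear the MMX state first,
// then return a float through the normal ABI return path.
float retFloatAfterClear()
{
  clearMmxState();
  volatile float three = 3;   // volatile: keep the value from being folded away
  return three;
}

// Hypothetical self-check of the return path.
inline void checkReturnPath()
{
  assert(retFloatAfterClear() == 3.0f);
}
```

In the testcase, the equivalent would be a call to clearMmxState() at the end
of calc(), or in main() between calc(x) and the printf. That should hide the
symptom, but it does not address gcc emitting the MMX stores in the first
place.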

This behaviour seems to be very dependent on the exact version of gcc and the
exact source.



Here's a testcase:

#include <stdio.h>
#include <emmintrin.h>

// Since this is all in one file, we need to mark some functions
// noinline so that the interesting parts don't get compiled away
#define NOINLINE __attribute__((noinline))


struct OutputData
{
  __m128 a, b;
};

struct InputData
{
  float a, b;
};

// Something that uses an OutputData that won't be compiled away
__m128 ga, gb;
NOINLINE void doSomethingWith(const OutputData& d){
  ga = d.a;
  gb = d.b;
}


NOINLINE void calc(const InputData& in)
{
  OutputData out;

  // the next two lines are where the bug manifests
  // gcc decides to use MMX instructions to write
  // some zeros, but fails to clean up afterwards.
  out.a = _mm_setr_ps(in.a, in.b, 0, 0);
  out.b = _mm_setzero_ps();

  // ensure the above is not optimised away
  doSomethingWith(out);
}


NOINLINE float retFloat()
{
  return 3;
}


int main()
{
  InputData x = {3.4, 42};

  // GCC emits MMX instructions for this function, but emits no emms
  calc(x);

  // This uses the FPU to return the value, which gets corrupted
  printf("%f\n", retFloat());

  return 0;
}


On my machine, this generates (for the function "calc"):

.globl _Z4calcRK9InputData
        .type   _Z4calcRK9InputData, @function
_Z4calcRK9InputData:
.LFB530:
        .cfi_startproc
        .cfi_personality 0x0,__gxx_personality_v0
        pushl   %ebp
        .cfi_def_cfa_offset 8
        pxor    %mm0, %mm0
        movl    %esp, %ebp
        .cfi_offset 5, -8
        .cfi_def_cfa_register 5
        subl    $60, %esp
        movl    8(%ebp), %eax
        movss   4(%eax), %xmm0
        movss   (%eax), %xmm1
        movq    %mm0, -48(%ebp) # MMX instruction
        movq    %mm0, -24(%ebp) # MMX instruction
        leal    -40(%ebp), %eax
        movq    %mm0, -16(%ebp) # MMX instruction
        movl    %eax, (%esp)
        unpcklps        %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        xorps   %xmm1, %xmm1
        movlhps %xmm1, %xmm0
        movlps  %xmm0, -40(%ebp)
        movhps  %xmm0, -32(%ebp)
        call    _Z15doSomethingWithRK10OutputData
        leave
        ret     # FPU stack still in MMX mode and unusable for floating point
        .cfi_endproc
.LFE530:
        .size   _Z4calcRK9InputData, .-_Z4calcRK9InputData
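
One way to confirm at run time that calc leaves the x87 unit in this state is
to do a small x87 computation right after the call and check the result. The
following is a rough sketch under two assumptions: that long double arithmetic
is done on the x87 (the default for GCC on i386 and x86-64, not on other
targets), and that a load onto a register stack still marked as full yields a
NaN when exceptions are masked. x87LooksUsable and assertX87Usable are
hypothetical helper names, not part of the testcase.

```cpp
#include <cassert>

// Hypothetical helper, not part of the testcase above.
// Returns true if a simple long double addition gives the expected
// result. On x86 this goes through the x87 register stack; if the tag
// word still marks every register as in use (MMX mode, no emms), the
// loads are expected to fault and produce NaN, so the comparison fails.
bool x87LooksUsable()
{
  // volatile keeps the compiler from folding the sum at compile time
  volatile long double a = 1.5L;
  volatile long double b = 2.25L;
  long double sum = a + b;
  return sum == 3.75L;
}

// Hypothetical wrapper: abort (in builds without NDEBUG) if the check fails.
void assertX87Usable()
{
  assert(x87LooksUsable());
}
```

Calling assertX87Usable() immediately after calc(x) in main would be expected
to fail on an affected build and to pass once an emms is emitted.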


System:
Ubuntu Lucid (gcc 4:4.4.3-1ubuntu1)

GCC:
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.3-4ubuntu5'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared
--enable-multiarch --enable-linker-build-id --with-system-zlib
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-plugin --enable-objc-gc
--enable-targets=all --disable-werror --with-arch-32=i486 --with-tune=generic
--enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu
--target=i486-linux-gnu
Thread model: posix
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5)


-- 
           Summary: GCC generates MMX instructions but fails to generate
                    "emms"
           Product: gcc
           Version: 4.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: stephen dot dolan at havok dot com
  GCC host triplet: i486-linux-gnu
GCC target triplet: i486-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44578

