public inbox for gcc-bugs@sourceware.org
* [Bug c/27869]  New: "-O -fregmove" handles SSE scalar instructions incorrectly
@ 2006-06-01 23:35 tijl at ulyssis dot org
  2006-06-02  0:48 ` [Bug target/27869] " pinskia at gcc dot gnu dot org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: tijl at ulyssis dot org @ 2006-06-01 23:35 UTC
  To: gcc-bugs

Consider the following C program using SSE intrinsics:

//----------
#include <stdio.h>
#include <xmmintrin.h>

int main(int argc, const char **argv) {
        __m128 v;
        v = _mm_setr_ps( 1.0f, 2.0f, 3.0f, 4.0f );

        v = _mm_rsqrt_ss( v );
        v = _mm_add_ss( v, _mm_movehl_ps( v, v ));
        v = _mm_add_ss( v, _mm_shuffle_ps( v, v, _MM_SHUFFLE( 0, 0, 0, 1 )));

        printf( "%e %e %e %e\n", ((float *)&v)[0], ((float *)&v)[1],
                ((float *)&v)[2], ((float *)&v)[3] );
        return 0;
}
//----------

Compiling and running this gives different results depending on whether
-fregmove is specified or not.

tijl@kalimero regmove% gcc41 -Wall -O -fno-regmove -march=pentium4m -o test main.c
tijl@kalimero regmove% ./test
5.999756e+00 2.000000e+00 3.000000e+00 4.000000e+00
tijl@kalimero regmove% gcc41 -Wall -O -fregmove -march=pentium4m -o test main.c
tijl@kalimero regmove% ./test
7.999756e+00 4.000000e+00 3.000000e+00 4.000000e+00

The first case (-fno-regmove) is the correct one.

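Comparing the assembly output of the two cases shows that an
"addss %xmm1, %xmm0" has been changed to "addss %xmm0, %xmm1". This is
incorrect. The addss instruction is not commutative as a whole: it adds only
the lowest elements and leaves the upper three elements of the destination
register unchanged, so swapping the operands changes which vector supplies the
upper elements (unlike addps, which adds all four elements).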
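
To make the difference concrete, here is a minimal plain-C sketch that models
the lanes by hand, without any SSE. The names vec4, model_add_ss, model_movehl,
model_shuffle_0001 and model_run are hypothetical helpers invented for this
illustration, and rsqrtss is modeled as returning exactly 1.0f for an input of
1.0f (the real instruction only approximates it, hence the ...9756 digits in
the output above). Under those assumptions, exchanging the operands of the
first add alone reproduces the shape of the wrong result; that the first add
is the affected one is an inference from the printed values, not something
read off the assembly.

```c
#include <assert.h>

/* Hypothetical 4-lane vector used only for this model. */
typedef struct { float f[4]; } vec4;

/* Model of addss / _mm_add_ss(a, b): lane 0 is a0 + b0, lanes 1..3 come from a. */
static vec4 model_add_ss(vec4 a, vec4 b) {
    vec4 r = a;
    r.f[0] = a.f[0] + b.f[0];
    return r;
}

/* Model of _mm_movehl_ps(a, b): { b2, b3, a2, a3 }. */
static vec4 model_movehl(vec4 a, vec4 b) {
    vec4 r = { { b.f[2], b.f[3], a.f[2], a.f[3] } };
    return r;
}

/* Model of _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 1)): { v1, v0, v0, v0 }. */
static vec4 model_shuffle_0001(vec4 v) {
    vec4 r = { { v.f[1], v.f[0], v.f[0], v.f[0] } };
    return r;
}

/* Runs the sequence from the report. If swap_first is nonzero, the operands
 * of the first add are exchanged, as a swapped addss would do. */
static vec4 model_run(int swap_first) {
    vec4 v = { { 1.0f, 2.0f, 3.0f, 4.0f } };
    vec4 hl;

    v.f[0] = 1.0f; /* assumption: rsqrtss(1.0f) modeled as exactly 1.0f */
    hl = model_movehl(v, v);
    v = swap_first ? model_add_ss(hl, v) : model_add_ss(v, hl);
    v = model_add_ss(v, model_shuffle_0001(v));
    return v;
}

/* Self-check: lane 0 and lane 1 differ between the two orderings. */
static void model_self_check(void) {
    vec4 ok = model_run(0);
    vec4 bad = model_run(1);
    assert(ok.f[0] == 6.0f && ok.f[1] == 2.0f);
    assert(bad.f[0] == 8.0f && bad.f[1] == 4.0f);
}
```

In this model, model_run(0) yields { 6, 2, 3, 4 } and model_run(1) yields
{ 8, 4, 3, 4 }, which line up with the -fno-regmove and -fregmove outputs
respectively once the rsqrtss approximation error is taken into account.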

The same problem occurs when _mm_add_ss in the code above is replaced by
_mm_mul_ss (mulss instruction), but not with _mm_sub_ss (subss), since
subtraction is never treated as commutative. I suppose this can be fixed by
handling addss and mulss the same way as subss.
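
As a rough check of that claim with the real intrinsics, the sketch below
compares each scalar operation against its operand-swapped form. The helper
names same_lanes, add_ss_swap_safe, mul_ss_swap_safe and swap_examples are
made up for this illustration; only the _mm_* calls come from <xmmintrin.h>.
It assumes an x86 target with SSE enabled (on 32-bit x86 something like -msse
is needed).

```c
#include <assert.h>
#include <xmmintrin.h>

/* Returns 1 if a and b are equal in all four lanes, 0 otherwise. */
static int same_lanes(__m128 a, __m128 b) {
    float x[4], y[4];
    int i;

    _mm_storeu_ps(x, a);
    _mm_storeu_ps(y, b);
    for (i = 0; i < 4; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}

/* 1 if swapping the operands of _mm_add_ss leaves the full vector unchanged. */
static int add_ss_swap_safe(__m128 a, __m128 b) {
    return same_lanes(_mm_add_ss(a, b), _mm_add_ss(b, a));
}

/* Same check for _mm_mul_ss. */
static int mul_ss_swap_safe(__m128 a, __m128 b) {
    return same_lanes(_mm_mul_ss(a, b), _mm_mul_ss(b, a));
}

/* Example inputs: lane 0 commutes, but the upper lanes follow the first operand. */
static void swap_examples(void) {
    __m128 a = _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f);
    __m128 b = _mm_setr_ps(10.0f, 20.0f, 30.0f, 40.0f);

    /* The low element is the same either way... */
    assert(_mm_cvtss_f32(_mm_add_ss(a, b)) == _mm_cvtss_f32(_mm_add_ss(b, a)));
    assert(_mm_cvtss_f32(_mm_mul_ss(a, b)) == _mm_cvtss_f32(_mm_mul_ss(b, a)));
    /* ...but the full vectors differ, so the swap is not safe in general. */
    assert(!add_ss_swap_safe(a, b));
    assert(!mul_ss_swap_safe(a, b));
}
```

The swap is harmless only when both operands already agree in their upper
three elements, which a register allocation pass cannot assume in general.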

I suppose other instructions could be affected too.


-- 
           Summary: "-O -fregmove" handles SSE scalar instructions
                    incorrectly
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: critical
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tijl at ulyssis dot org
 GCC build triplet: i386-portbld-freebsd6.1
  GCC host triplet: i386-portbld-freebsd6.1
GCC target triplet: i386-portbld-freebsd6.1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27869



Thread overview: 17+ messages
2006-06-01 23:35 [Bug c/27869] New: "-O -fregmove" handles SSE scalar instructions incorrectly tijl at ulyssis dot org
2006-06-02  0:48 ` [Bug target/27869] " pinskia at gcc dot gnu dot org
2006-06-02 11:02 ` tijl at ulyssis dot org
2007-04-04 11:17 ` steven at gcc dot gnu dot org
2007-04-04 11:35 ` rguenth at gcc dot gnu dot org
2007-04-04 11:47 ` steven at gcc dot gnu dot org
2007-04-05 22:56 ` echristo at apple dot com
2007-04-06 15:08 ` hubicka at ucw dot cz
2007-04-06 15:43 ` stevenb dot gcc at gmail dot com
2007-04-06 16:01 ` hubicka at ucw dot cz
2007-04-06 19:32 ` echristo at apple dot com
2007-04-09 23:06 ` hubicka at gcc dot gnu dot org
2007-04-10  0:57 ` echristo at apple dot com
2007-04-10 10:02 ` mark at gcc dot gnu dot org
2007-04-10 17:05 ` mmitchel at gcc dot gnu dot org
2007-04-16 16:07 ` hubicka at gcc dot gnu dot org
2008-02-03 14:32 ` steven at gcc dot gnu dot org
