* [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps)
@ 2004-06-21 11:42 djp at volny dot cz
  2004-06-21 15:09 ` [Bug target/16111] " bangerth at dealii dot org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: djp at volny dot cz @ 2004-06-21 11:42 UTC (permalink / raw)
  To: gcc-bugs

My project stopped working after I switched from 3.3.3 to 3.4.0. I traced the
problem to the SSE code: in some register-intensive code the compiler generates
movdqa (an SSE2 packed-integer move) instead of movaps (a packed-float move),
which produces an invalid result (NaN).

I've created a simple test case to demonstrate the problem:

--------------------------------------------------------------------
#include <math.h>
#include <xmmintrin.h>
#include <stdio.h>

static inline __m128 xmm_dot4(__m128 a, __m128 b) {
	__m128 v0 = _mm_mul_ps(a, b);
	__m128 v1 = _mm_movehl_ps(v1, v0);
	v0 = _mm_add_ps(v0, v1);
	v1 = _mm_shuffle_ps(v0, v0, _MM_SHUFFLE(0,0,0,1));
	return _mm_add_ss(v0, v1);
}

/*
 * gcc-3.4.0+ generates invalid movdqa instruction;
 * works well if you replace movdqa by movaps in asm.
 */
void foo(float* boxCenter, float* boxExtents) 
{
	unsigned int MASK = 0x80000000;
	__m128 mask = _mm_set1_ps((float&)MASK);
	__m128 center = _mm_loadu_ps(boxCenter);
	__m128 extents = _mm_loadu_ps(boxExtents);
	center = _mm_andnot_ps(mask, center); // common code for doing abs
	extents = _mm_xor_ps(mask, extents); // common code for doing neg
	center = xmm_dot4(center, extents);
	_mm_storeu_ps(boxCenter, center);
	_mm_storeu_ps(boxExtents, extents);
}


float center[] = { 1, 1, 1, 1 };
float extents[] = { 27.5f, 27.5f, 0, 0 };

int main() 
{
	foo(center, extents);
	printf("extents (%f %f %f %f)\n", extents[0], extents[1], extents[2],
extents[3]); // prints NaN

	return 0;
}

--------------------------------------------------------------------

I've tried both the 3.4.0 release and the latest CVS snapshot:

/opt/gcc-3.4.0/bin/g++-3.4 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -o "test" -L/opt/gcc-3.4.0/lib
Reading specs from /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../configure --prefix=/opt/gcc-3.4.0 --program-suffix=-3.4
--enable-languages=c,c++,java --with-gcc --with-gnu-as --with-gnu-ld
--enable-shared --enable-threads=posix --enable-libgcj --disable-java-awt
--without-x --enable-java-gc=boehm --disable-debug --disable-libgcj-debug
--disable-interpreter --disable-x --enable-hash-synchronization
Thread model: posix
gcc version 3.4.1 20040618 (prerelease)
 /opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -E -quiet -v
-D_GNU_SOURCE test.cxx -msse -mfpmath=sse -mtune=pentiumpro -fomit-frame-pointer
-finline-limit=2000 -O3 -o test.ii
ignoring nonexistent directory
"/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/i686-pc-linux-gnu
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/backward
 /usr/local/include
 /opt/gcc-3.4.0/include
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/include
 /usr/include
End of search list.
 /opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -fpreprocessed
test.ii -quiet -dumpbase test.cxx -msse -mfpmath=sse -mtune=pentiumpro -auxbase
test -O3 -version -fomit-frame-pointer -finline-limit=2000 -o test.s
GNU C++ version 3.4.1 20040618 (prerelease) (i686-pc-linux-gnu)
	compiled by GNU C version 3.3.3 (Debian 20040321).
GGC heuristics: --param ggc-min-expand=90 --param ggc-min-heapsize=113152
 as -V -Qy -o test.o test.s
GNU assembler version 2.14.90.0.7 (i386-linux) using BFD version 2.14.90.0.7
20031029 Debian GNU/Linux
 /opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/collect2 --eh-frame-hdr -m
elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test /usr/lib/crt1.o
/usr/lib/crti.o /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/crtbegin.o
-L/opt/gcc-3.4.0/lib -L/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1
-L/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../.. test.o -lstdc++ -lm
-lgcc_s -lgcc -lc -lgcc_s -lgcc
/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/crtend.o /usr/lib/crtn.o
/opt/gcc-3.4.0/bin/g++-3.4 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -S -o "test.S"
-L/opt/gcc-3.4.0/lib
Reading specs from /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/specs
Configured with: ../configure --prefix=/opt/gcc-3.4.0 --program-suffix=-3.4
--enable-languages=c,c++,java --with-gcc --with-gnu-as --with-gnu-ld
--enable-shared --enable-threads=posix --enable-libgcj --disable-java-awt
--without-x --enable-java-gc=boehm --disable-debug --disable-libgcj-debug
--disable-interpreter --disable-x --enable-hash-synchronization
Thread model: posix
gcc version 3.4.1 20040618 (prerelease)
 /opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -E -quiet -v
-D_GNU_SOURCE test.cxx -msse -mfpmath=sse -mtune=pentiumpro -fomit-frame-pointer
-finline-limit=2000 -O3 -o test.ii
ignoring nonexistent directory
"/opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../i686-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/i686-pc-linux-gnu
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/../../../../include/c++/3.4.1/backward
 /usr/local/include
 /opt/gcc-3.4.0/include
 /opt/gcc-3.4.0/lib/gcc/i686-pc-linux-gnu/3.4.1/include
 /usr/include
End of search list.
 /opt/gcc-3.4.0/libexec/gcc/i686-pc-linux-gnu/3.4.1/cc1plus -fpreprocessed
test.ii -quiet -dumpbase test.cxx -msse -mfpmath=sse -mtune=pentiumpro
-auxbase-strip test.S -O3 -version -fomit-frame-pointer -finline-limit=2000 -o
test.S
GNU C++ version 3.4.1 20040618 (prerelease) (i686-pc-linux-gnu)
	compiled by GNU C version 3.3.3 (Debian 20040321).
GGC heuristics: --param ggc-min-expand=90 --param ggc-min-heapsize=113152

--------------------------------------------------------------------
And the result:

	.file	"test.cxx"
.globl extents
	.data
	.align 4
	.type	extents, @object
	.size	extents, 16
extents:
	.long	1104936960
	.long	1104936960
	.long	0
	.long	0
.globl center
	.align 4
	.type	center, @object
	.size	center, 16
center:
	.long	1065353216
	.long	1065353216
	.long	1065353216
	.long	1065353216
	.text
	.align 2
	.p2align 4,,15
.globl _Z3fooPfS_
	.type	_Z3fooPfS_, @function
_Z3fooPfS_:
.LFB312:
	subl	$4, %esp
.LCFI0:
	movl	8(%esp), %eax
	movl	$0x80000000, (%esp)
	movl	12(%esp), %edx
	movss	(%esp), %xmm1
	movups	(%eax), %xmm6
	movups	(%edx), %xmm5
	shufps	$0, %xmm1, %xmm1
	movdqa	%xmm1, %xmm4
	andnps	%xmm6, %xmm4
	xorps	%xmm5, %xmm1
	movaps	%xmm4, %xmm0
	mulps	%xmm1, %xmm0
	movhlps	%xmm0, %xmm3
	addps	%xmm3, %xmm0
	movaps	%xmm0, %xmm2
	shufps	$1, %xmm0, %xmm2
	addss	%xmm2, %xmm0
	movups	%xmm0, (%eax)
	movups	%xmm1, (%edx)
	popl	%eax
	ret
.LFE312:
	.size	_Z3fooPfS_, .-_Z3fooPfS_
	.section	.rodata.str1.1,"aMS",@progbits,1
.LC0:
	.string	"extents (%f %f %f %f)\n"
	.text
	.align 2
	.p2align 4,,15
.globl main
	.type	main, @function
main:
.LFB313:
	pushl	%ebp
.LCFI1:
	movl	%esp, %ebp
.LCFI2:
	subl	$40, %esp
.LCFI3:
	movl	$0x80000000, -4(%ebp)
	movups	extents, %xmm5
	andl	$-16, %esp
	subl	$16, %esp
	movss	-4(%ebp), %xmm1
	movups	center, %xmm6
	movl	$.LC0, (%esp)
	shufps	$0, %xmm1, %xmm1
	movdqa	%xmm1, %xmm4
	xorps	%xmm5, %xmm1
	andnps	%xmm6, %xmm4
	movaps	%xmm4, %xmm0
	movups	%xmm1, extents
	mulps	%xmm1, %xmm0
	movhlps	%xmm0, %xmm3
	flds	extents+12
	addps	%xmm3, %xmm0
	movaps	%xmm0, %xmm2
	shufps	$1, %xmm0, %xmm2
	addss	%xmm2, %xmm0
	fstpl	28(%esp)
	flds	extents+8
	movups	%xmm0, center
	fstpl	20(%esp)
	flds	extents+4
	fstpl	12(%esp)
	flds	extents
	fstpl	4(%esp)
	call	printf
	leave
	xorl	%eax, %eax
	ret
.LFE313:
	.size	main, .-main
	.section	.eh_frame,"a",@progbits
.Lframe1:
	.long	.LECIE1-.LSCIE1
.LSCIE1:
	.long	0x0
	.byte	0x1
	.string	"zP"
	.uleb128 0x1
	.sleb128 -4
	.byte	0x8
	.uleb128 0x5
	.byte	0x0
	.long	__gxx_personality_v0
	.byte	0xc
	.uleb128 0x4
	.uleb128 0x4
	.byte	0x88
	.uleb128 0x1
	.align 4
.LECIE1:
.LSFDE3:
	.long	.LEFDE3-.LASFDE3
.LASFDE3:
	.long	.LASFDE3-.Lframe1
	.long	.LFB313
	.long	.LFE313-.LFB313
	.uleb128 0x0
	.byte	0x4
	.long	.LCFI1-.LFB313
	.byte	0xe
	.uleb128 0x8
	.byte	0x85
	.uleb128 0x2
	.byte	0x4
	.long	.LCFI2-.LCFI1
	.byte	0xd
	.uleb128 0x5
	.align 4
.LEFDE3:
	.section	.note.GNU-stack,"",@progbits
	.ident	"GCC: (GNU) 3.4.1 20040618 (prerelease)"

--------------------------------------------------------------------

Note that replacing movdqa with movaps (or using gcc 3.3.3 ;-) fixes the problem.

Hope it helps.

-- 
           Summary: generates invalid SSE movdqa instruction (instead of
                    movaps)
           Product: gcc
           Version: 3.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: translation
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: djp at volny dot cz
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
@ 2004-06-21 15:09 ` bangerth at dealii dot org
  2004-06-21 16:34 ` pinskia at gcc dot gnu dot org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: bangerth at dealii dot org @ 2004-06-21 15:09 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|translation                 |target


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
  2004-06-21 15:09 ` [Bug target/16111] " bangerth at dealii dot org
@ 2004-06-21 16:34 ` pinskia at gcc dot gnu dot org
  2004-06-21 17:18 ` djp at volny dot cz
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-21 16:34 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-06-21 16:34 -------
Hmm, this works for me on the mainline, 3.4.0, and 3.3.3:
tin:~/src/gnu/gcctest>g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000 
pr16111.c
tin:~/src/gnu/gcctest>./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)
tin:~/src/gnu/gcctest>~/ia32_linux_gcc3_4/bin/g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer 
-finline-limit=2000 pr16111.c
tin:~/src/gnu/gcctest>./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)
tin:~/src/gnu/gcctest>~/ia32_linux_gcc3_3/bin/g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer 
-finline-limit=2000 pr16111.c
tin:~/src/gnu/gcctest>./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
  2004-06-21 15:09 ` [Bug target/16111] " bangerth at dealii dot org
  2004-06-21 16:34 ` pinskia at gcc dot gnu dot org
@ 2004-06-21 17:18 ` djp at volny dot cz
  2004-06-21 17:40 ` pinskia at gcc dot gnu dot org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: djp at volny dot cz @ 2004-06-21 17:18 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From djp at volny dot cz  2004-06-21 17:18 -------
Did you run the test on AMD or Intel?
My results and more info:

GCC 3.4.0 (mainline)
/opt/gcc-3.4.0/bin/g++-3.4 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -o "test" -L/opt/gcc-3.4.0/lib
==>
LD_LIBRARY_PATH="/opt/gcc-3.3.3/lib:$LD_LIBRARY_PATH" ./test
extents (-27.500000 -27.500000 -0.000000 nan)

GCC 3.3.3 (mainline)
/opt/gcc-3.3.3/bin/g++-3.3 -v -save-temps -O3 -msse -mfpmath=sse
-fomit-frame-pointer -finline-limit=2000 "test.cxx" -o "test" -L/opt/gcc-3.3.3/lib
==>
LD_LIBRARY_PATH="/opt/gcc-3.4.0/lib:$LD_LIBRARY_PATH" ./test
extents (-27.500000 -27.500000 -0.000000 -0.000000)

root@vox:/proc# cat cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 6
model name      : AMD Athlon(tm) XP 2100+
stepping        : 2
cpu MHz         : 1737.340
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips        : 3432.44

test.s from 3.4.0
-----------------
	.type	_Z3fooPfS_, @function
_Z3fooPfS_:
.LFB312:
	subl	$4, %esp
.LCFI0:
	movl	8(%esp), %eax
	movl	$0x80000000, (%esp)
	movl	12(%esp), %edx
	movss	(%esp), %xmm1
	movups	(%eax), %xmm6
	movups	(%edx), %xmm5
	shufps	$0, %xmm1, %xmm1
	movdqa	%xmm1, %xmm4
	andnps	%xmm6, %xmm4
	xorps	%xmm5, %xmm1
	movaps	%xmm4, %xmm0
	mulps	%xmm1, %xmm0
	movhlps	%xmm0, %xmm3
	addps	%xmm3, %xmm0
	movaps	%xmm0, %xmm2
	shufps	$1, %xmm0, %xmm2
	addss	%xmm2, %xmm0
	movups	%xmm0, (%eax)
	movups	%xmm1, (%edx)
	popl	%eax
	ret

test.s from 3.3.3
-----------------

.type	_Z3fooPfS_, @function
_Z3fooPfS_:
.LFB314:
	subl	$4, %esp
.LCFI0:
	movl	8(%esp), %edx
	movl	$0x80000000, (%esp)
	movl	12(%esp), %ecx
	movss	(%esp), %xmm5
	movups	(%edx), %xmm4
	movups	(%ecx), %xmm6
	shufps	$0, %xmm5, %xmm5
	movaps	%xmm5, %xmm2
	andnps	%xmm4, %xmm2
	xorps	%xmm6, %xmm5
	movaps	%xmm2, %xmm1
	mulps	%xmm5, %xmm1
	movhlps	%xmm1, %xmm3
	addps	%xmm3, %xmm1
	movaps	%xmm1, %xmm0
	shufps	$1, %xmm1, %xmm0
	addss	%xmm0, %xmm1
	movups	%xmm1, (%edx)
	movups	%xmm5, (%ecx)
	popl	%eax
	ret


As you can see, apart from register allocation, the ONLY difference is that
3.4.0 emits the following where 3.3.3 emits movaps:

	movdqa	%xmm1, %xmm4


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (2 preceding siblings ...)
  2004-06-21 17:18 ` djp at volny dot cz
@ 2004-06-21 17:40 ` pinskia at gcc dot gnu dot org
  2004-07-25 17:35 ` drober32 at fau dot edu
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-21 17:40 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-06-21 17:40 -------
Mine was a pure Intel machine:
tin:~/src/gnu/gcctest>cat /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.00GHz
stepping        : 4
cpu MHz         : 1994.146
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts 
acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 3971.48

So it is likely a bug in AMD's SSE implementation.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (3 preceding siblings ...)
  2004-06-21 17:40 ` pinskia at gcc dot gnu dot org
@ 2004-07-25 17:35 ` drober32 at fau dot edu
  2004-09-21 12:53 ` coyote at coyotegulch dot com
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: drober32 at fau dot edu @ 2004-07-25 17:35 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From drober32 at fau dot edu  2004-07-25 17:35 -------
movdqa is an SSE2 instruction, which is not supported by Pentium 3 or Athlon XP.
The instruction should not be generated unless at least one of -msse2 or
-march=(pentium4, pentium-m, athlon64, etc.) is given.
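
The two /proc/cpuinfo listings earlier in this thread show the difference: the
Athlon XP flags contain sse but not sse2, while the Pentium 4 flags contain
both. A minimal sketch of checking such a flags line follows; the helper name
has_cpu_flag is hypothetical and not part of GCC or the kernel. It matches
whole tokens only, because "sse" is a prefix of "sse2" and a plain substring
search would give the wrong answer.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): return true if `name` appears
// as a whole whitespace-separated token in a /proc/cpuinfo "flags" value.
inline bool has_cpu_flag(const std::string& flags, const std::string& name) {
    std::istringstream in(flags);
    std::string token;
    while (in >> token) {
        if (token == name) {
            return true;  // exact token match, so "sse" does not match "sse2"
        }
    }
    return false;
}
```

Called with the Athlon XP flags line and "sse2" it returns false, which is
consistent with movdqa not being usable on that machine.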

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (4 preceding siblings ...)
  2004-07-25 17:35 ` drober32 at fau dot edu
@ 2004-09-21 12:53 ` coyote at coyotegulch dot com
  2004-12-21  3:41 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: coyote at coyotegulch dot com @ 2004-09-21 12:53 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From coyote at coyotegulch dot com  2004-09-21 12:53 -------
Is anyone working on this, or should I feel free to tackle it?

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |coyote at coyotegulch dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (5 preceding siblings ...)
  2004-09-21 12:53 ` coyote at coyotegulch dot com
@ 2004-12-21  3:41 ` pinskia at gcc dot gnu dot org
  2004-12-21  6:42 ` uros at kss-loka dot si
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-21  3:41 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-12-21 03:41 -------
Hmm, I wonder if this is fixed now.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (6 preceding siblings ...)
  2004-12-21  3:41 ` pinskia at gcc dot gnu dot org
@ 2004-12-21  6:42 ` uros at kss-loka dot si
  2004-12-21 15:25 ` uros at kss-loka dot si
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: uros at kss-loka dot si @ 2004-12-21  6:42 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2004-12-21 06:42 -------
Mainline does not generate movdqa insn anymore. However:

g++ -O1 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000 pr16111.cpp 
./a.out
extents (-27.500000 -27.500000 -0.000000 -0.000000)

g++ -O2 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000 pr16111.cpp
./a.out
extents (0.000000 0.000000 2.018096 2.018096)

g++ -O3 -msse -mfpmath=sse -fomit-frame-pointer -finline-limit=2000 pr16111.cpp 
./a.out
extents (0.000000 0.000000 36.658997 36.658997)

The result differs depending on the optimization level.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (7 preceding siblings ...)
  2004-12-21  6:42 ` uros at kss-loka dot si
@ 2004-12-21 15:25 ` uros at kss-loka dot si
  2004-12-28  6:33 ` uros at kss-loka dot si
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: uros at kss-loka dot si @ 2004-12-21 15:25 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2004-12-21 15:25 -------
Does life analysis eat RTLs with -O2?

This part is from pr16111_.cpp.15.cse2:

...

(note 8 7 13 0 NOTE_INSN_FUNCTION_BEG)

(insn 13 8 16 0 (set (mem/i:SI (plus:SI (reg/f:SI 20 frame)
                (const_int -4 [0xfffffffc])) [5 MASK+0 S4 A32])
        (const_int -2147483648 [0x80000000])) 35 {*movsi_1} (nil)
    (nil))

(insn 16 13 17 0 (set (reg:SF 73)
        (mem:SF (plus:SI (reg/f:SI 20 frame)
                (const_int -4 [0xfffffffc])) [7 S4 A32])) 60 {*movsf_1} (nil)
    (nil))

(insn 17 16 21 0 (set (mem/i:SF (plus:SI (reg/f:SI 20 frame)
                (const_int -8 [0xfffffff8])) [7 __F+0 S4 A32])
        (reg:SF 73)) 60 {*movsf_1} (nil)
    (nil))

...

And in pr16111_.cpp.16.life, (insn 13) is just missing. There is no
NOTE_INSN_DELETED, just plain nothing:

...

(note 8 7 16 0 NOTE_INSN_FUNCTION_BEG)

(insn 16 8 17 0 (set (reg:SF 73)
        (mem:SF (plus:SI (reg/f:SI 20 frame)
                (const_int -4 [0xfffffffc])) [7 S4 A32])) 60 {*movsf_1} (nil)
    (nil))

(insn 17 16 21 0 (set (mem/i:SF (plus:SI (reg/f:SI 20 frame)
                (const_int -8 [0xfffffff8])) [7 __F+0 S4 A32])
        (reg:SF 73)) 60 {*movsf_1} (insn_list:REG_DEP_TRUE 16 (nil))
    (expr_list:REG_DEAD (reg:SF 73)
        (nil)))

...

Uros.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2004-12-21 15:25:19
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (8 preceding siblings ...)
  2004-12-21 15:25 ` uros at kss-loka dot si
@ 2004-12-28  6:33 ` uros at kss-loka dot si
  2004-12-29  0:32 ` rth at gcc dot gnu dot org
  2004-12-29  0:35 ` pinskia at gcc dot gnu dot org
  11 siblings, 0 replies; 13+ messages in thread
From: uros at kss-loka dot si @ 2004-12-28  6:33 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From uros at kss-loka dot si  2004-12-28 06:33 -------
The original bug is fixed for 4.0.0. The problem described in comment #8 looks
like a problem with aliasing (http://gcc.gnu.org/ml/gcc/2004-12/msg01096.html).

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |4.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (9 preceding siblings ...)
  2004-12-28  6:33 ` uros at kss-loka dot si
@ 2004-12-29  0:32 ` rth at gcc dot gnu dot org
  2004-12-29  0:35 ` pinskia at gcc dot gnu dot org
  11 siblings, 0 replies; 13+ messages in thread
From: rth at gcc dot gnu dot org @ 2004-12-29  0:32 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rth at gcc dot gnu dot org  2004-12-29 00:32 -------
The problem mentioned in comment 8 is not a bug.

	(float&)MASK

has the exact same semantics as

        *(float *)&MASK

which, as we all ought to know by now, is undefined.  Open another PR for
the missing diagnostic with -Wstrict-aliasing if you like, but this one's
fixed.
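
For reference, a minimal sketch of one well-defined way to build the same bit
pattern, assuming a 32-bit IEEE 754 float. The helper names bits_to_float and
float_to_bits are hypothetical and do not come from the test case. Copying the
bytes with memcpy avoids reading an unsigned int object through a float lvalue.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the thread): reinterpret the bits of a
// 32-bit integer as a float without violating the aliasing rules.
inline float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);  // byte copy, no type-punned access
    return f;
}

// Inverse of bits_to_float: read back the raw bit pattern of a float.
inline std::uint32_t float_to_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

In the test case this would replace the cast, for example
_mm_set1_ps(bits_to_float(0x80000000u)) instead of _mm_set1_ps((float&)MASK).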

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111



* [Bug target/16111] generates invalid SSE movdqa instruction (instead of movaps)
  2004-06-21 11:42 [Bug translation/16111] New: generates invalid SSE movdqa instruction (instead of movaps) djp at volny dot cz
                   ` (10 preceding siblings ...)
  2004-12-29  0:32 ` rth at gcc dot gnu dot org
@ 2004-12-29  0:35 ` pinskia at gcc dot gnu dot org
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-29  0:35 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-12-29 00:35 -------
(In reply to comment #10)
> which, as we all ought to know by now, is undefined.  Open another PR for
> the missing diagnostic with -Wstrict-aliasing if you like, but this one's
> fixed.
PR 14024 is the PR about the aliasing warnings not working in C++.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16111

