public inbox for gcc-bugs@sourceware.org
* [Bug c++/38525]  New: sse2(int16) code fails with -O3
From: leonid at volnitsky dot com @ 2008-12-14 16:49 UTC (permalink / raw)
  To: gcc-bugs


When the attached unit test is compiled with -O3, the sse2 test fails.  With
any other optimization level, including -O2 plus:
   -fgcse-after-reload
   -finline-functions
   -fipa-cp-clone
   -fpredictive-commoning
   -ftree-vectorize
   -funswitch-loops
which should be the same as -O3, the unit test passes successfully.

Gentoo x86_64, with  gcc-4.4 and 4.3.2
Project page: http://volnitsky/project/lvvlib
Source: http://github.com/lvv/lvvlib
u-array.ii.gz attached. 

---------------------------------------------------------
g++      u-array.cc -o u-array     -Wall
-DID='"081214_183302-g++-4.4.0-OPTIMIZE-0e541970++M"'   -I ~/p/  -I
/usr/local/include -DNDEBUG  -DGSL_RANGE_CHECK_OFF -DNOCHECK   -pipe
-Wno-reorder -Wno-sign-compare  -O3 -march=native  -fwhole-program --combine 
-fopenmp -fomit-frame-pointer -funsafe-loop-optimizations  -DGCC_BUG -O3 -v  
-Wno-unused-variable -Wno-unused-variable  
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ./configure : (reconfigured) ./configure --disable-bootstrap
--enable-languages=c,c++,fortran --enable-multilib
Thread model: posix
gcc version 4.4.0 20081208 (experimental) (GCC) 
COLLECT_GCC_OPTIONS='-o' 'u-array' '-Wall'
'-DID="081214_183302-g++-4.4.0-OPTIMIZE-0e541970++M"' '-I' '/home/lvv/p/'
'-I' '/usr/local/include' '-DNDEBUG' '-DGSL_RANGE_CHECK_OFF' '-DNOCHECK'
'-pipe' '-Wno-reorder' '-Wno-sign-compare' '-O3'  '-fwhole-program' '-combine'
'-fopenmp' '-fomit-frame-pointer' '-funsafe-loop-optimizations' '-DGCC_BUG'
'-O3' '-v' '-Wno-unused-variable' '-shared-libgcc' '-pthread'
 /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.4.0/cc1plus -quiet -v -I
/home/lvv/p/ -I /usr/local/include -D_GNU_SOURCE -D_REENTRANT
-DID="081214_183302-g++-4.4.0-OPTIMIZE-0e541970++M" -DNDEBUG
-DGSL_RANGE_CHECK_OFF -DNOCHECK -DGCC_BUG u-array.cc -march=core2 -mcx16 -msahf
--param l1-cache-size=32 --param l1-cache-line-size=64 --param
l2-cache-size=4096 -mtune=core2 -quiet -dumpbase u-array.cc -auxbase u-array
-O3 -O3 -Wall -Wno-reorder -Wno-sign-compare -Wno-unused-variable -version
-fwhole-program -fopenmp -fomit-frame-pointer -funsafe-loop-optimizations -o -
|
 as -V -Qy -o /tmp/ccO8IZrX.o -
GNU assembler version 2.18 (x86_64-pc-linux-gnu) using BFD version (GNU
Binutils) 2.18
ignoring nonexistent directory
"/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../x86_64-unknown-linux-gnu/include"
ignoring duplicate directory "/usr/local/include"
  as it is a non-system directory that duplicates a system directory
ignoring duplicate directory "/usr/local/include"
  as it is a non-system directory that duplicates a system directory
#include "..." search starts here:
#include <...> search starts here:
 /home/lvv/p/

/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0

/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/x86_64-unknown-linux-gnu

/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../include/c++/4.4.0/backward
 /usr/local/include
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include-fixed
 /usr/include
End of search list.
GNU C++ (GCC) version 4.4.0 20081208 (experimental) (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.4.0 20081115 (experimental), GMP version
4.2.4, MPFR version 2.3.2.
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: f0165b2e8081c04e24b530c99a2b0b81
COMPILER_PATH=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.4.0/:/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.4.0/:/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/usr/local/lib/../lib64/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:./:/usr/local/lib/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../:/lib/:/usr/lib/
Reading specs from /usr/local/lib/../lib64/libgomp.spec
COLLECT_GCC_OPTIONS='-o' 'u-array' '-Wall'
'-DID="081214_183302-g++-4.4.0-OPTIMIZE-0e541970++M"' '-I' '/home/lvv/p/'
'-I' '/usr/local/include' '-DNDEBUG' '-DGSL_RANGE_CHECK_OFF' '-DNOCHECK'
'-pipe' '-Wno-reorder' '-Wno-sign-compare' '-O3'  '-fwhole-program' '-combine'
'-fopenmp' '-fomit-frame-pointer' '-funsafe-loop-optimizations' '-DGCC_BUG'
'-O3' '-v' '-Wno-unused-variable' '-shared-libgcc' '-pthread'
 /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.4.0/collect2 --eh-frame-hdr
-m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o u-array
/usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o
/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/crtbegin.o
-L/usr/local/lib/../lib64 -L/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0
-L/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../lib64
-L/lib/../lib64 -L/usr/lib/../lib64 -L. -L/usr/local/lib
-L/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../.. /tmp/ccO8IZrX.o
-lstdc++ -lm -lgomp -lgcc_s -lgcc -lpthread -lc -lgcc_s -lgcc
/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/crtend.o
/usr/lib/../lib64/crtn.o


-- 
           Summary: sse2(int16) code fails with -O3
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: leonid at volnitsky dot com
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38525



* [Bug c++/38525] sse2(int16) code fails with -O3
From: leonid at volnitsky dot com @ 2008-12-14 16:54 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from leonid at volnitsky dot com  2008-12-14 16:52 -------
Created an attachment (id=16907)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16907&action=view)
u-array.ii  



* [Bug c++/38525] sse2(int16) code fails with -O3
From: ubizjak at gmail dot com @ 2008-12-14 19:30 UTC (permalink / raw)
  To: gcc-bugs




------- Comment #2 from ubizjak at gmail dot com  2008-12-14 19:29 -------
I can't compile the attachment:

g++ -O3 -fpreprocessed u-array.ii 
In file included from /home/lvv/p/lvv/sse.h:23,
                 from /home/lvv/p/lvv/array.h:35,
                 from u-array.cc:10:
/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include/pmmintrin.h: In
function ‘float __vector__ _mm_addsub_ps(float __vector__, float __vector__)’:
[...]

However, since you have already found a failing test, please reduce your source
to a short, self-contained runtime testcase (in plain C if possible) that fails
with certain compile flags.



* [Bug c++/38525] sse2(int16) code fails with -O3
From: ubizjak at gmail dot com @ 2008-12-14 19:32 UTC (permalink / raw)
  To: gcc-bugs




------- Comment #3 from ubizjak at gmail dot com  2008-12-14 19:31 -------
(In reply to comment #2)

> /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include/pmmintrin.h: In
> function ‘float __vector__ _mm_addsub_ps(float __vector__, float __vector__)’:

g++ -O3 -fpreprocessed u-array.ii 
In file included from /home/lvv/p/lvv/sse.h:23,
                 from /home/lvv/p/lvv/array.h:35,
                 from u-array.cc:10:
/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include/pmmintrin.h: In
function ‘float __vector__ _mm_addsub_ps(float __vector__, float __vector__)’:
/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include/pmmintrin.h:53:
error: ‘__builtin_ia32_addsubps’ was not declared in this scope
[...]



* [Bug c++/38525] sse2(int16) code fails with -O3
From: hjl dot tools at gmail dot com @ 2008-12-14 19:58 UTC (permalink / raw)
  To: gcc-bugs




------- Comment #4 from hjl dot tools at gmail dot com  2008-12-14 19:57 -------
(In reply to comment #3)
> (In reply to comment #2)
> 
> > /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include/pmmintrin.h: In
> > function ‘float __vector__ _mm_addsub_ps(float __vector__, float __vector__)’:
> 
> g++ -O3 -fpreprocessed u-array.ii 
> In file included from /home/lvv/p/lvv/sse.h:23,
>                  from /home/lvv/p/lvv/array.h:35,
>                  from u-array.cc:10:
> /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include/pmmintrin.h: In
> function ‘float __vector__ _mm_addsub_ps(float __vector__, float __vector__)’:
> /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/include/pmmintrin.h:53:
> error: ‘__builtin_ia32_addsubps’ was not declared in this scope
> [...]
> 

You need to use -march=core2 to enable SSSE3.
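
For context: _mm_addsub_ps is declared in pmmintrin.h, the SSE3 header, and
-march=core2 enables SSE3 along with SSSE3.  The sketch below shows one way
code can guard such an intrinsic so that it still builds when SSE3 is not
enabled; GCC defines __SSE3__ when that instruction set is turned on.  The
wrapper name addsub_ps and the plain-SSE fallback are illustrative assumptions,
not something taken from lvvlib.

```cpp
#include <xmmintrin.h>   // SSE: __m128, _mm_add_ps, _mm_xor_ps, _mm_set_ps
#ifdef __SSE3__
#include <pmmintrin.h>   // SSE3: _mm_addsub_ps
#endif

// Hypothetical wrapper (not from the thread).
// Result lanes: a0-b0, a1+b1, a2-b2, a3+b3.
__m128 addsub_ps(__m128 a, __m128 b) {
#ifdef __SSE3__
    // SSE3 is enabled (e.g. -msse3 or -march=core2): use the intrinsic.
    return _mm_addsub_ps(a, b);
#else
    // Fallback for plain SSE: flip the sign bit of b in lanes 0 and 2,
    // then add.  _mm_set_ps takes its arguments from lane 3 down to lane 0.
    const __m128 sign = _mm_set_ps(0.0f, -0.0f, 0.0f, -0.0f);
    return _mm_add_ps(a, _mm_xor_ps(b, sign));
#endif
}
```

Built without any -m flags on x86_64 this takes the fallback path; with
-march=core2 it takes the intrinsic path.  Both paths are meant to give the
same lanes.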


-- 

hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl dot tools at gmail dot
                   |                            |com



* [Bug c++/38525] sse2(int16) code fails with -O3
From: hjl dot tools at gmail dot com @ 2008-12-14 20:00 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from hjl dot tools at gmail dot com  2008-12-14 19:59 -------
(In reply to comment #1)
> Created an attachment (id=16907)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16907&action=view)
> u-array.ii  
> 

Please extract the failing tests to create a smaller testcase.



* [Bug c++/38525] sse2(int16) code fails with -O3
From: leonid at volnitsky dot com @ 2008-12-14 21:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from leonid at volnitsky dot com  2008-12-14 21:00 -------
Created an attachment (id=16910)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16910&action=view)
Trimmed unit test.  Only the one test relevant to the bug.


-- 

leonid at volnitsky dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #16907|0                           |1
        is obsolete|                            |



* [Bug c++/38525] sse2(int16) code fails with -O3
From: leonid at volnitsky dot com @ 2008-12-14 22:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from leonid at volnitsky dot com  2008-12-14 22:32 -------
(In reply to comment #2)
> ... please reduce your source to a
> short, self-contained runtime testcase (in plain C if possible) that fails with
> certain compile flags.

I've tried to do that (see below).  Unfortunately, it does not exhibit the same
behavior: it fails on any optimized build, with a warning about aliasing,
which is fixable by adding -fno-strict-aliasing.  I don't know whether this is
related or not.

---------------------------------------------------------
#include <immintrin.h>
#include <stdint.h>   /* int16_t */
#include <stdio.h>

int main(int argc, char *argv[]) {

        int16_t  volatile A[2000];    
        for (int i=0; i<(2000-2); i+=2) { A[i]=1;  A[i+1]=2; };  A[333] = 3; 

         #define mk_m128i(x) *(__m128i*)&(x)
        __m128i m1 = mk_m128i(A[0]);
        __m128i m2 = mk_m128i(A[8]);

        for (int i= 16;  i < 2000-16; i+=16) {                  // SSE
                 m1 = _mm_max_epi16(m1, mk_m128i(A[i]) ); 
                 m2 = _mm_max_epi16(m2, mk_m128i(A[i+8]) ); 
        }  

        m1 = _mm_max_epi16(m1, m2);

        int16_t* ip  = (int16_t*)&m1;  
        printf("%hi %hi %hi %hi  %hi %hi %hi %hi \n", *ip++, *ip++, *ip++,
*ip++, *ip++, *ip++, *ip++, *ip);       

return 0;                                                                       
 }
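
Note on the reduced test above: a minimal sketch of the same lane-wise maximum,
written with explicit unaligned loads instead of the mk_m128i cast.
Dereferencing a __m128i* assumes 16-byte alignment, which a plain int16_t array
does not promise in general, so _mm_loadu_si128 is used here.  The function
name lanewise_max_epi16 and the requirement that n be a non-zero multiple of 8
are assumptions made for this illustration; they are not part of the original
test.

```cpp
#include <emmintrin.h>   // SSE2: __m128i, _mm_loadu_si128, _mm_max_epi16
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns a vector whose lane k holds the maximum of
// a[k], a[k+8], a[k+16], ... over the first n elements.
// Assumption: n is a non-zero multiple of 8.
__m128i lanewise_max_epi16(const std::int16_t* a, std::size_t n) {
    assert(n >= 8 && n % 8 == 0);
    // Unaligned load: no alignment requirement on the source buffer.
    __m128i m = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    for (std::size_t i = 8; i + 8 <= n; i += 8) {
        const __m128i chunk =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        m = _mm_max_epi16(m, chunk);   // signed 16-bit maximum per lane
    }
    return m;
}
```

With the data from the test (alternating 1 and 2, plus A[333] = 3) the value 3
should end up in lane 5, since 333 % 8 == 5.  A single lane can be read back
with _mm_extract_epi16(m, 5), which needs no pointer cast of the vector.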





* [Bug c++/38525] sse2(int16) code fails with -O3
From: pinskia at gcc dot gnu dot org @ 2008-12-16  0:05 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from pinskia at gcc dot gnu dot org  2008-12-16 00:03 -------
        int16_t* ip  = (int16_t*)&m1;  
        printf("%hi %hi %hi %hi  %hi %hi %hi %hi \n", *ip++, *ip++, *ip++,
*ip++, *ip++, *ip++, *ip++, *ip);       

That violates the C/C++ aliasing rules: even though __m128i is marked with
may_alias, the type you are accessing it through (int16_t) is not marked as
such.
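
To illustrate the point about the accessing type, here is a sketch that marks
the 16-bit type itself with may_alias, so that reads through it are permitted
to alias the __m128i object.  may_alias is a GCC extension (Clang accepts it
too); the typedef name int16_alias_t and the helper lane_via_may_alias are
hypothetical names chosen for this example.

```cpp
#include <emmintrin.h>   // SSE2: __m128i
#include <cassert>
#include <cstdint>

// GCC extension: lvalues of this type may alias objects of any other type,
// in the same way char lvalues may.  (Hypothetical typedef name.)
typedef std::int16_t __attribute__((__may_alias__)) int16_alias_t;

// Hypothetical helper: read 16-bit lane k (0..7) of v through the
// may_alias type instead of through a plain int16_t*.
std::int16_t lane_via_may_alias(const __m128i& v, int k) {
    assert(k >= 0 && k < 8);
    const int16_alias_t* p = reinterpret_cast<const int16_alias_t*>(&v);
    return p[k];
}
```

Separately from aliasing, the quoted printf modifies ip several times within
one argument list; C and C++ do not order those modifications, so reading the
lanes one at a time with an index, as above, also sidesteps that problem.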


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID



* [Bug c++/38525] sse2(int16) code fails with -O3
From: leonid at volnitsky dot com @ 2008-12-16 13:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from leonid at volnitsky dot com  2008-12-16 13:21 -------
(In reply to comment #8)
>>         int16_t* ip  = (int16_t*)&m1;  
> That is violating C/C++ aliasing rules

The code that you quote is not part of the reported bug; it was a quick and
dirty hack.  But aliasing was indeed the cause: adding -fno-strict-aliasing
fixed the unit tests.

That raises a question, though.  The full code attached to the bug does not
give any warning about an aliasing violation with -Wall and -O3.  Is the
missing warning a bug?
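
On the warning: the GCC manual describes -Wstrict-aliasing as not catching all
cases, so avoiding the cast altogether may be more robust than relying on the
diagnostic.  A minimal sketch, assuming the goal is only to read the eight
16-bit lanes of a result vector: store the register into an int16_t array with
_mm_storeu_si128 and index that array.  The helper name lanes_of is
hypothetical and does not come from lvvlib.

```cpp
#include <emmintrin.h>   // SSE2: __m128i, _mm_storeu_si128
#include <array>
#include <cstdint>

// Hypothetical helper: copy the eight 16-bit lanes of v into a plain array.
// The store intrinsic writes the bytes of v into out; the __m128i object
// itself is never read through an int16_t lvalue.
std::array<std::int16_t, 8> lanes_of(__m128i v) {
    std::array<std::int16_t, 8> out;
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data()), v);
    return out;
}
```

A caller can then print the lanes in an ordinary loop over the returned array,
which should behave the same with or without -fno-strict-aliasing.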






