From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-202794-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 28388 invoked by alias); 18 Nov 2006 15:18:08 -0000
Received: (qmail 28326 invoked by uid 48); 18 Nov 2006 15:17:59 -0000
Date: Sat, 18 Nov 2006 15:18:00 -0000
Message-ID: <20061118151759.28325.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References: <bug-29884-13529@http.gcc.gnu.org/bugzilla/>
Subject: [Bug target/29884] gcc-4.1.1, gcc-4.0.1 generate segfaulting SSE code
In-Reply-To: <bug-29884-13529@http.gcc.gnu.org/bugzilla/>
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "sergstesh at yahoo dot com" <gcc-bugzilla@gcc.gnu.org>
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2006-11/txt/msg01604.txt.bz2
List-Id: <gcc-bugs.sourceware.org>


------- Comment #5 from sergstesh at yahoo dot com  2006-11-18 15:17 -------
IIRC, misaligned data should cause a performance penalty, not a segmentation fault.

Look at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29818 , at the case when
there is no segfault:

"
when the code runs fine (i.e. compiled by gcc-4.1.1), the screen output is:

"
checkpoint 1
&inp_array_1[0]=80498e0
checkpoint 2
bn=1
&inp_array_1[1]=80498e4
checkpoint 3
checkpoint 4
...
"

- as you can see, '&inp_array_1[1]=80498e4', and this corresponds to lines 31..35 in:

     29   while(half_nos - bn >= NUMBER_OF_ELEMENTS_IN_VECTOR)
     30     {
     31     fprintf(stderr, "bn=%u\n", bn);
     32     fprintf(stderr, "&inp_array_1[%u]=%0lx\n", bn, (unsigned long)&inp_array_1[bn]);
     33
     34     fprintf(stderr, "checkpoint 3\n");
     35     vtmp1.v = *(vFloat *)&inp_array_1[bn];
     36     fprintf(stderr, "checkpoint 4\n");
     37
     38     bn += NUMBER_OF_ELEMENTS_IN_VECTOR;
     39     } // while(half_nos - bn >= NUMBER_OF_ELEMENTS_IN_VECTOR)
.

In this case the address is 80498e4, i.e. not a multiple of 16; still, the
code does not segfault, even though a misaligned operation:

     35     vtmp1.v = *(vFloat *)&inp_array_1[bn];

is executed.
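
To make the address arithmetic explicit, here is a minimal sketch. The helper
name misalignment is made up for illustration and is not part of the test
program; it only reports how far an address lies past the previous multiple of
a power-of-two boundary:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the test program: number of bytes by which
   addr lies past the previous multiple of boundary.
   boundary must be a power of two (16 for a 128-bit SSE vector). */
size_t misalignment(uintptr_t addr, size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (size_t)(addr & (uintptr_t)(boundary - 1));
}
```

Applied to the two addresses printed above, misalignment(0x80498e0, 16) gives 0
and misalignment(0x80498e4, 16) gives 4, so &inp_array_1[1] is 4 bytes past a
16-byte boundary.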

This is what I found in the documentation:

http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options
:

"
-mpreferred-stack-boundary=num
    Attempt to keep the stack boundary aligned to a 2 raised to num byte
boundary. If -mpreferred-stack-boundary is not specified, the default is 4 (16
bytes or 128 bits), except when optimizing for code size (-Os), in which case
the default is the minimum correct alignment (4 bytes for x86, and 8 bytes for
x86-64).

    On Pentium and PentiumPro, double and long double values should be aligned
to an 8 byte boundary (see -malign-double) or suffer significant run time
performance penalties. On Pentium III, the Streaming SIMD Extension (SSE) data
type __m128 suffers similar penalties if it is not 16 byte aligned.
...
"

- from the above I expected to "suffer significant run time performance
penalties", but not a segfault.
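
For comparison, a sketch of a load that makes no 16-byte alignment assumption
at all. This is not what the test program does (it dereferences a cast
pointer, see line 35 above); the type vec4f and the function
load4_any_alignment are hypothetical names introduced only for this example,
and vec4f is merely assumed to resemble vFloat, whose definition is not shown
here:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a four-float vector type. */
typedef struct { float f[4]; } vec4f;

/* Copy four consecutive floats starting at src into a vec4f.
   memcpy only needs src to be a valid float pointer; it does not
   require the source address to be a multiple of 16. */
vec4f load4_any_alignment(const float *src)
{
    vec4f out;
    assert(src != NULL);
    memcpy(out.f, src, sizeof out.f);
    return out;
}
```

With a float array a, load4_any_alignment(&a[1]) returns a[1]..a[4] regardless
of where &a[1] falls relative to a 16-byte boundary. Whether this costs
performance compared to an aligned load is a separate question.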

Could you please:

1) point me to the documentation which says that misaligned SSE data will
cause a segfault;

2) if such a document does not exist, update the documentation, preferably
pointing also to Intel documentation stating that misaligned SSE data causes
a segmentation fault;

3) explain how/why the code in

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29818

does not segfault even though it has the same misalignment as here?

Thanks in advance.


-- 

sergstesh at yahoo dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29884