From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 15221 invoked by alias); 23 Aug 2006 08:04:39 -0000
Received: (qmail 15183 invoked by uid 48); 23 Aug 2006 08:04:29 -0000
Date: Wed, 23 Aug 2006 08:04:00 -0000
Message-ID: <20060823080429.15182.qmail@sourceware.org>
X-Bugzilla-Reason: CC
Subject: [Bug target/27537] XMM alignment fault when compiling for i386 with -Os
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "agner at agner dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2006-08/txt/msg01921.txt.bz2

------- Comment #11 from agner at agner dot org  2006-08-23 08:04 -------
This problem wouldn't have happened if the ABI had been better
maintained. Somebody decides to change the calling convention without
properly documenting the change, and somebody else makes another change
that is incompatible because the alignment requirement has never made it
into the ABI documents.

Let me help you make a decision on this issue by summarizing the pros
and cons of 16-byte stack alignment in 32-bit x86 Linux/BSD.

Advantages of enforcing 16-byte stack alignment:
------------------------------------------------
1. The use of XMM code is becoming more common now that all new
computers support the SSE2 instruction set or higher. The necessary
alignment of XMM variables can be implemented more efficiently when the
stack is aligned.

2. Variables of type double (double precision floating point) are
accessed more efficiently when aligned by 8. This is easily achieved
when the stack is aligned.

3. Function parameters of type double will automatically get the optimal
alignment, unless the parameter is preceded by an odd number of smaller
parameters (including any 'this' pointer and return pointer).
This means that more than 50% of function parameters of type double will
be optimally aligned, versus 50% without stack alignment. The C/C++
programmer will be able to ensure optimal alignment by manipulating the
order of function parameters.

4. Functions that need to align local variables can do so without using
EBP as a stack frame pointer. This frees EBP for other purposes. General
purpose registers are a scarce resource in 32-bit mode.

5. 16-byte stack alignment is officially enforced in Intel-based Mac OS
X. It is desirable to have identical ABIs for Linux, BSD and Mac. This
makes it possible to use the same compilers and the same function
libraries for all three platforms (except for the object file format,
which can be converted).

6. The stack alignment requires no extra instructions in leaf functions,
which are more likely to contain the critical innermost loop than
non-leaf functions.

7. The stack alignment requires no extra instructions in a non-leaf
function if the function adjusts the stack pointer anyway for the sake
of allocating local storage.

8. Stack alignment is already implemented in GCC and existing code
relies on it.

Disadvantages of enforcing 16-byte stack alignment:
---------------------------------------------------
1. A non-leaf function without any stack space allocated for local
storage needs one or two extra instructions to conform to the stack
alignment requirement.

2. The alignment requirement results in unused space in the stack. This
takes up to 12 bytes of extra space in the data cache for each function
calling level except the innermost. Assuming that only the innermost
three function levels matter in terms of speed, and that the number of
unused bytes is 8 on average for all but the innermost function, the
total amount of space wasted in the data cache is 16 bytes (two levels
at 8 bytes each).

3. The Intel compiler does not enforce stack alignment. However, the
Intel people are ready to change this as soon as you GNU people make a
decision on this issue.
I am in contact with the Intel people about this issue.

4. Stack alignment is not enforced in 32-bit Windows. Compatibility with
the Windows ABI might be desirable.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27537
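
To make the arithmetic behind advantage 3 and disadvantage 2 concrete,
here is a minimal sketch in Python. It is only an illustration, not
anything taken from GCC or from the ABI documents. It assumes that the
argument area of a call starts on a 16-byte boundary and that parameters
are packed in order into 4-byte stack slots. The function names are
hypothetical.

```python
STACK_ALIGN = 16  # alignment boundary discussed in the comment
WORD = 4          # size of one stack slot in 32-bit mode


def param_offsets(sizes):
    """Byte offset of each parameter from the start of the argument area.

    Assumption: parameters are packed in order, each rounded up to a
    whole number of 4-byte slots.
    """
    offsets = []
    offset = 0
    for size in sizes:
        offsets.append(offset)
        offset += -(-size // WORD) * WORD  # round up to whole slots
    return offsets


def double_is_aligned(sizes, index):
    """True if the parameter at `index` lands on an 8-byte boundary.

    Assumption: the argument area itself starts on a 16-byte boundary.
    """
    return param_offsets(sizes)[index] % 8 == 0


def padding_for_frame(frame_bytes):
    """Unused bytes needed to round a frame up to a 16-byte multiple."""
    return -frame_bytes % STACK_ALIGN
```

With these assumptions, double_is_aligned([4, 8], 1) is False (one
4-byte parameter precedes the double), while double_is_aligned([4, 4, 8], 2)
is True. For frame sizes that are multiples of 4, padding_for_frame
returns 0, 4, 8 or 12, which is where the "up to 12 bytes" figure comes
from.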