From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 14085 invoked by alias); 15 Jan 2004 14:52:43 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 14077 invoked by uid 48); 15 Jan 2004 14:52:42 -0000
Date: Thu, 15 Jan 2004 14:52:00 -0000
Message-ID: <20040115145242.14076.qmail@sources.redhat.com>
From: "roger at eyesopen dot com"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20040107211538.13608.roger@eyesopen.com>
References: <20040107211538.13608.roger@eyesopen.com>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug optimization/13608] [3.3 regression] Incorrect code with -O3 -ffast-math
X-Bugzilla-Reason: CC
X-SW-Source: 2004-01/txt/msg01744.txt.bz2

------- Additional Comments From roger at eyesopen dot com  2004-01-15 14:52 -------
Well, the -40(%ebp) appears to be created legitimately (on gcc-3.3.3) by a
call to assign_stack_local from get_secondary_mem in reload.c.

There seems to be some poor interaction between alloca and the x86 stack
frame. The prologue looks like:

_Z16FindBestRotationPKfS0_jPf:
        pushl   %ebp
        movl    %esp, %ebp              // %sp == %ebp
        pushl   %edi                    // %sp == %ebp - 4
        pushl   %esi                    // %sp == %ebp - 8
        pushl   %ebx                    // %sp == %ebp - 12
        subl    $28, %esp               // %sp == %ebp - 40
        // calculate argument to alloca
        movl    16(%ebp), %edi          // %edi = n
        leal    (%edi,%edi,2), %edx     // %edx = 3*n
        leal    15(,%edx,4), %esi       // %esi = 3*n*sizeof(float)+15
        andl    $-16, %esi              // round to multiple of 16
        subl    %esi, %esp
        leal    4(%esp), %edx           // %edx result from alloca
        ...
        fstps   -40(%ebp)               // same address as %edx + %esi - 4

Changing the "subl $28, %esp" to "subl $32, %esp" resolves the failure.

So it looks like there's a poor interaction between the stack slot
allocation routines and alloca, i.e.
the last four bytes of the first alloca allocation (where the allocation is
a multiple of 16 and therefore has no padding) overlap with the last four
bytes of cfun->x_frame_offset, i.e. the last allocated stack slot.

Can someone more familiar with i386's stack frames take it from here? I
don't know whether the x86's push is predecrement or postdecrement, so I
don't know whether the "%sp == %ebp - 40", i.e. having sp point to the
problematic stack slot, is the cause of the problem or not.

Internally, the stack_frame_size is "28", FRAME_GROWS_DOWNWARDS, and the
last/problematic stack slot is the last four bytes of these 28, i.e.

        (MEM (PLUS (REG frame_pointer_rtx) (CONST_INT -28)))

which once "eliminate_regs" has had its way becomes

        (MEM (PLUS (REG %ebp) (CONST_INT -40)))

I hope this helps.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13608
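
For anyone who wants to check the arithmetic in the prologue without stepping through the assembly, here is a minimal sketch in Python that replays the offsets relative to %ebp. It is only a model under stated assumptions, not anything taken from GCC: the names overlap_bytes and alloca_rounded_size are made up for illustration, the alloca block is assumed to occupy exactly 3*n*4 bytes starting at the value left in %edx, and the slot is assumed to be the 4 bytes written by fstps at -40(%ebp).

```python
# Offsets are relative to %ebp, so %ebp itself is 0 and the stack grows
# towards more negative numbers.
CALLEE_SAVED_BYTES = 12                        # pushl %edi, %esi, %ebx
FRAME_SLOT = -28                               # (PLUS frame_pointer_rtx -28)
SLOT_OFFSET = FRAME_SLOT - CALLEE_SAVED_BYTES  # -40(%ebp) after elimination
SLOT_SIZE = 4                                  # fstps stores a 4-byte float


def alloca_rounded_size(n, elem_size=4):
    """Mirror the leal/andl pair: 3*n*elem_size rounded up to a multiple of 16."""
    return (3 * n * elem_size + 15) & -16


def overlap_bytes(n, sub_amount=28):
    """Return the %ebp-relative byte offsets shared by the alloca block and
    the -40(%ebp) slot (hypothetical helper, an assumption-laden model)."""
    esp = -CALLEE_SAVED_BYTES - sub_amount     # after "subl $sub_amount, %esp"
    esp -= alloca_rounded_size(n)              # after "subl %esi, %esp"
    edx = esp + 4                              # "leal 4(%esp), %edx"
    # Assumption: the caller uses exactly 3*n floats starting at %edx.
    alloca_block = set(range(edx, edx + 3 * n * 4))
    slot = set(range(SLOT_OFFSET, SLOT_OFFSET + SLOT_SIZE))
    return sorted(alloca_block & slot)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 8):
        print(n, overlap_bytes(n), overlap_bytes(n, sub_amount=32))
```

In this model the overlap only appears when 3*n*sizeof(float) is already a multiple of 16 (n = 4, 8, ...), which matches the "no padding" case described above, and it disappears when the prologue subtracts 32 instead of 28.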