From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 1566 invoked by alias); 25 May 2007 01:23:00 -0000 Received: (qmail 1519 invoked by uid 22791); 25 May 2007 01:22:56 -0000 X-Spam-Check-By: sourceware.org Received: from smtp-out.google.com (HELO smtp-out.google.com) (216.239.45.13) by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 25 May 2007 01:22:53 +0000 Received: from zps78.corp.google.com (zps78.corp.google.com [172.25.146.78]) by smtp-out.google.com with ESMTP id l4P1MiqH026521; Thu, 24 May 2007 18:22:44 -0700 Received: from smtp.corp.google.com (spacemonkey3.corp.google.com [192.168.120.116]) by zps78.corp.google.com with ESMTP id l4P1MbLT014183 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Thu, 24 May 2007 18:22:37 -0700 Received: from localhost.localdomain.google.com (152.sub-75-209-60.myvzw.com [75.209.60.152]) (authenticated bits=0) by smtp.corp.google.com (8.13.8/8.13.8) with ESMTP id l4P1MWCU020384 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Thu, 24 May 2007 18:22:35 -0700 To: Uros Bizjak Cc: gcc-patches@gcc.gnu.org, Kenneth Zadeck , seongbae.park@gmail.com Subject: Re: Dataflow miscompiles soft-fp/floatuntitf.c on x86_64 References: <46547B14.5010009@gmail.com> From: Ian Lance Taylor Date: Fri, 25 May 2007 01:30:00 -0000 In-Reply-To: <46547B14.5010009@gmail.com> Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2007-05/txt/msg01678.txt.bz2 Uros Bizjak writes: > Latest dataflow branch segfaults for: > > FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c -O0 execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c -O1 execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c -O2 execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c -O3 > -fomit-frame-pointer execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c -O3 -g > execution test > FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c -Os execution test > > (The tests were run on FC6, x86_64-pc-linux-gnu). > > This is due to segfault in soft-fp/floatuntitf.c at __floatuntitf+202, > where %rsp is not aligned to 16 bytes for movdqa: > > 0x0000000000402c10 <__floatuntitf+192>: mov %ax,0xfffffffffffffffe(%rsp) > 0x0000000000402c15 <__floatuntitf+197>: andb > $0x7f,0xffffffffffffffff(%rsp) > 0x0000000000402c1a <__floatuntitf+202>: movdqa > 0xfffffffffffffff0(%rsp),%xmm0 <<< > 0x0000000000402c20 <__floatuntitf+208>: pop %rbx > 0x0000000000402c21 <__floatuntitf+209>: pop %rbp > 0x0000000000402c22 <__floatuntitf+210>: retq Uros. Seongbae tracked this down. It's an interesting problem. In the .lreg dump we see this: ;; Register 79 in 5. ... (insn:HI 4 3 5 2 ../../../gccDBaseline/libgcc/../gcc/config/soft-fp/floatuntitf.c:35 (set (reg:TI 81 [ i ]) (subreg:TI (reg:DI 79 [ i ]) 0)) 87 {*movti_rex64} (expr_list:REG_DEAD (reg:DI 79 [ i ]) (nil))) Note that paradoxical subeg applied to reg 79. Note that pseudo reg 79 is put in hard reg 5 (%rdi). When reload starts, it calls mark_home_live to call df_set_regs_ever_live for each hard register being used. Since reg 79 is DImode, and has been put in hard reg 5 this marks hard reg 5 as live (this is x86_64 so hard reg 5 is DImode). reload proceeds as usual. At the end, reload calls cleanup_subreg_operands. That function fixes up any subregs of hard registers to refer directly to the register in question. In this case, though, it turns the paradoxical subreg into a TImode reference to hard reg 5, giving us this in the .greg dump: (insn:HI 4 7 5 2 ../../../gccDBaseline/libgcc/../gcc/config/soft-fp/floatuntitf.c:35 (set (reg:TI 37 r8 [orig:81 i ] [81]) (reg:TI 5 di [orig:79 i ] [79])) 87 {*movti_rex64} (nil)) So now we have a TImode reference to hard reg 5, which means hard regs 5 and 6. But when we set regs_ever_live, we only marked 5. As it happens, reg 6 was not otherwise referenced in the function. And, as it happens, reg 6 is %rbp, which is callee saved. All of the above happens on both mainline and dataflow branch. On mainline we simply carry on, with a reference to hard reg 6 but with that bit clear in regs_ever_live. Since this is paradoxical subreg, hard reg 6 is never actually changed, and indeed all the references to it drop out as well. So mainline is fine. Dataflow branch, however, recomputes regs_ever_live after reload is finished. When it does this, it sees the TImode reference to reg 5, and therefore marks both reg 5 and 6 as live. But it only does this after reload is finished. The effect is that the value of regs_ever_live is not the same during and after reload. reg 6 is clear in regs_ever_live during reload, and set after reload. This is bad. It's bad because the x86_64 backend uses regs_ever_live when determining the frame layout, which is done when computing register elimination during reload. And the x86_64 backend uses regs_ever_live again when deciding which registers to save and restore in the prologue and epilogue, after reload. When the frame layout computation changes, it means that the stack usage changes, and the register elimination computations are incorrect. Fortunately the fix turns out to be relatively simple. I have appended the untested patch. Note that this means that dataflow branch will generate worse code for this test case: it will save %rbp unnecessarily. In general it is risky to change regs_ever_live after reload is complete. Changing whether a callee saved register is live will tend to break the assumptions made by register elimination and prologue and epilogue threading. I think it would be appropriate to add some asserts to detect any case where this happens, and either fix it or make sure it is OK. Ian 2007-05-24 Ian Lance Taylor * reload1.c (mark_home_live_1): New static function, broken out of mark_home_live. (mark_home_live): Call mark_home_live_1. (scan_paradoxical_subregs): Call mark_home_live_1. Index: gcc/reload1.c =================================================================== --- gcc/reload1.c (revision 125036) +++ gcc/reload1.c (working copy) @@ -2141,21 +2141,31 @@ alter_reg (int i, int from_reg) } } -/* Mark the slots in regs_ever_live for the hard regs - used by pseudo-reg number REGNO. */ +/* Mark the slots in regs_ever_live for the hard regs used by + pseudo-reg number REGNO, accessed in MODE. */ -void -mark_home_live (int regno) +static void +mark_home_live_1 (int regno, enum machine_mode mode) { int i, lim; i = reg_renumber[regno]; if (i < 0) return; - lim = i + hard_regno_nregs[i][PSEUDO_REGNO_MODE (regno)]; + lim = i + hard_regno_nregs[i][mode]; while (i < lim) df_set_regs_ever_live(i++, true); } + +/* Mark the slots in regs_ever_live for the hard regs + used by pseudo-reg number REGNO. */ + +void +mark_home_live (int regno) +{ + if (reg_renumber[regno] >= 0) + mark_home_live_1 (regno, PSEUDO_REGNO_MODE (regno)); +} /* This function handles the tracking of elimination offsets around branches. @@ -3905,8 +3915,11 @@ scan_paradoxical_subregs (rtx x) if (REG_P (SUBREG_REG (x)) && (GET_MODE_SIZE (GET_MODE (x)) > reg_max_ref_width[REGNO (SUBREG_REG (x))])) - reg_max_ref_width[REGNO (SUBREG_REG (x))] - = GET_MODE_SIZE (GET_MODE (x)); + { + reg_max_ref_width[REGNO (SUBREG_REG (x))] + = GET_MODE_SIZE (GET_MODE (x)); + mark_home_live_1 (REGNO (SUBREG_REG (x)), GET_MODE (x)); + } return; default: