From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 98854 invoked by alias); 19 Apr 2016 09:12:20 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 98838 invoked by uid 89); 19 Apr 2016 09:12:18 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_40,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=Strip, INSN_CODE, set_dest, SET_SRC X-HELO: e06smtp16.uk.ibm.com Received: from e06smtp16.uk.ibm.com (HELO e06smtp16.uk.ibm.com) (195.75.94.112) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Tue, 19 Apr 2016 09:12:08 +0000 Received: from localhost by e06smtp16.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 19 Apr 2016 10:12:04 +0100 Received: from d06dlp01.portsmouth.uk.ibm.com (9.149.20.13) by e06smtp16.uk.ibm.com (192.168.101.146) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 19 Apr 2016 10:12:02 +0100 X-IBM-Helo: d06dlp01.portsmouth.uk.ibm.com X-IBM-MailFrom: krebbel@linux.vnet.ibm.com X-IBM-RcptTo: gcc-patches@gcc.gnu.org Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id DD36F17D854C for ; Tue, 19 Apr 2016 10:03:22 +0100 (BST) Received: from d06av02.portsmouth.uk.ibm.com (d06av02.portsmouth.uk.ibm.com [9.149.37.228]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u3J92Zkr2031954 for ; Tue, 19 Apr 2016 09:02:35 GMT Received: from d06av02.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av02.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u3J92ZRl013335 for ; Tue, 19 Apr 2016 03:02:35 -0600 Received: from maggie.boeblingen.de.ibm.com (dyn-9-152-212-123.boeblingen.de.ibm.com [9.152.212.123]) by d06av02.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id u3J92Y6g013271 (version=TLSv1/SSLv3 cipher=AES256-SHA256 bits=256 verify=NO); Tue, 19 Apr 2016 03:02:35 -0600 From: Andreas Krebbel To: gcc-patches@gcc.gnu.org Cc: uweigand@de.ibm.com Subject: [PATCH] PR70674: S/390: Add memory barrier to stack pointer restore from fpr. Date: Tue, 19 Apr 2016 09:12:00 -0000 Message-Id: <1461056554-18150-1-git-send-email-krebbel@linux.vnet.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16041909-0025-0000-0000-000012AB0F09 X-IsSubscribed: yes X-SW-Source: 2016-04/txt/msg00910.txt.bz2 This patches fixes a problem with stack variable accesses being scheduled after the stack pointer restore instructions. In the testcase this happened with the stack variable 'a' accessed through the frame pointer. The existing stack_tie we have in the backend is basically useless when trying to block stack variable accesses from being scheduled across an insn. The alias set of stack variables and the frame alias set usually differ and hence aren't in conflict with each other. The solution appears to be a magic MEM term with a scratch register which is handled as a full memory barrier when analyzing scheduling dependencies. With the patch a (clobber (mem:BLK (scratch))) is being added to the restore instruction in order to prevent any memory operations to be scheduled across the insn. The patch does that only for the one case where the stack pointer is restored from an FPR. Theoretically this might happen also in the case where the stack pointer gets restored using a load multiple. However, triggering that problem with load-multiple appears to be much harder since the load-multiple will restore the frame pointer as well. So in order to see the problem a different call-clobbered register would need to be used as temporary stack pointer. Another case which needs to be handled some day is the stack pointer allocation part. It needs to be a memory barrier as well. I'll post the patches for the other two parts when gcc 7 entered stage 1 again. Bootstrapped and regression tested with --with-arch z196 and z13 on s390 and s390x. This needs to go into 4.9/5/6 branches. -Andreas- gcc/ChangeLog: 2016-04-19 Andreas Krebbel PR target/70674 * config/s390/s390.c (s390_restore_gprs_from_fprs): Pick the new stack_restore_from_fpr pattern when restoring r15. (s390_optimize_prologue): Strip away the memory barrier in the parallel when trying to get rid of restore insns. * config/s390/s390.md ("stack_restore_from_fpr"): New insn definition for loading the stack pointer from an FPR. Compared to the normal move insn this pattern includes a full memory barrier. gcc/testsuite/ChangeLog: 2016-04-19 Andreas Krebbel PR target/70674 * gcc.target/s390/pr70674.c: New test. --- gcc/config/s390/s390.c | 91 +++++++++++++++++++-------------- gcc/config/s390/s390.md | 10 ++++ gcc/testsuite/gcc.target/s390/pr70674.c | 13 +++++ 3 files changed, 76 insertions(+), 38 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/pr70674.c diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 1134d0f..e969542 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -10538,19 +10538,25 @@ s390_restore_gprs_from_fprs (void) for (i = 6; i < 16; i++) { - if (FP_REGNO_P (cfun_gpr_save_slot (i))) - { - rtx_insn *insn = - emit_move_insn (gen_rtx_REG (DImode, i), - gen_rtx_REG (DImode, cfun_gpr_save_slot (i))); - df_set_regs_ever_live (i, true); - add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DImode, i)); - if (i == STACK_POINTER_REGNUM) - add_reg_note (insn, REG_CFA_DEF_CFA, - plus_constant (Pmode, stack_pointer_rtx, - STACK_POINTER_OFFSET)); - RTX_FRAME_RELATED_P (insn) = 1; - } + rtx_insn *insn; + + if (!FP_REGNO_P (cfun_gpr_save_slot (i))) + continue; + + if (i == STACK_POINTER_REGNUM) + insn = emit_insn (gen_stack_restore_from_fpr ( + gen_rtx_REG (DImode, cfun_gpr_save_slot (i)))); + else + insn = + emit_move_insn (gen_rtx_REG (DImode, i), + gen_rtx_REG (DImode, cfun_gpr_save_slot (i))); + df_set_regs_ever_live (i, true); + add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DImode, i)); + if (i == STACK_POINTER_REGNUM) + add_reg_note (insn, REG_CFA_DEF_CFA, + plus_constant (Pmode, stack_pointer_rtx, + STACK_POINTER_OFFSET)); + RTX_FRAME_RELATED_P (insn) = 1; } } @@ -13032,37 +13038,46 @@ s390_optimize_prologue (void) /* Remove ldgr/lgdr instructions used for saving and restore GPRs if possible. */ - if (TARGET_Z10 - && GET_CODE (pat) == SET - && GET_MODE (SET_SRC (pat)) == DImode - && REG_P (SET_SRC (pat)) - && REG_P (SET_DEST (pat))) + if (TARGET_Z10) { - int src_regno = REGNO (SET_SRC (pat)); - int dest_regno = REGNO (SET_DEST (pat)); - int gpr_regno; - int fpr_regno; + rtx tmp_pat = pat; - if (!((GENERAL_REGNO_P (src_regno) && FP_REGNO_P (dest_regno)) - || (FP_REGNO_P (src_regno) && GENERAL_REGNO_P (dest_regno)))) - continue; + if (INSN_CODE (insn) == CODE_FOR_stack_restore_from_fpr) + tmp_pat = XVECEXP (pat, 0, 0); - gpr_regno = GENERAL_REGNO_P (src_regno) ? src_regno : dest_regno; - fpr_regno = FP_REGNO_P (src_regno) ? src_regno : dest_regno; + if (GET_CODE (tmp_pat) == SET + && GET_MODE (SET_SRC (tmp_pat)) == DImode + && REG_P (SET_SRC (tmp_pat)) + && REG_P (SET_DEST (tmp_pat))) + { + int src_regno = REGNO (SET_SRC (tmp_pat)); + int dest_regno = REGNO (SET_DEST (tmp_pat)); + int gpr_regno; + int fpr_regno; + + if (!((GENERAL_REGNO_P (src_regno) + && FP_REGNO_P (dest_regno)) + || (FP_REGNO_P (src_regno) + && GENERAL_REGNO_P (dest_regno)))) + continue; - /* GPR must be call-saved, FPR must be call-clobbered. */ - if (!call_really_used_regs[fpr_regno] - || call_really_used_regs[gpr_regno]) - continue; + gpr_regno = GENERAL_REGNO_P (src_regno) ? src_regno : dest_regno; + fpr_regno = FP_REGNO_P (src_regno) ? src_regno : dest_regno; - /* It must not happen that what we once saved in an FPR now - needs a stack slot. */ - gcc_assert (cfun_gpr_save_slot (gpr_regno) != SAVE_SLOT_STACK); + /* GPR must be call-saved, FPR must be call-clobbered. */ + if (!call_really_used_regs[fpr_regno] + || call_really_used_regs[gpr_regno]) + continue; - if (cfun_gpr_save_slot (gpr_regno) == SAVE_SLOT_NONE) - { - remove_insn (insn); - continue; + /* It must not happen that what we once saved in an FPR now + needs a stack slot. */ + gcc_assert (cfun_gpr_save_slot (gpr_regno) != SAVE_SLOT_STACK); + + if (cfun_gpr_save_slot (gpr_regno) == SAVE_SLOT_NONE) + { + remove_insn (insn); + continue; + } } } diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 5a9f1c8..2b4e8f4 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -299,6 +299,8 @@ (BASE_REGNUM 13) ; Return address register. (RETURN_REGNUM 14) + ; Stack pointer register. + (STACK_REGNUM 15) ; Condition code register. (CC_REGNUM 33) ; Thread local storage pointer register. @@ -10387,6 +10389,14 @@ [(set_attr "length" "0")]) +(define_insn "stack_restore_from_fpr" + [(set (reg:DI STACK_REGNUM) + (match_operand:DI 0 "register_operand" "f")) + (clobber (mem:BLK (scratch)))] + "TARGET_Z10" + "lgdr\t%%r15,%0" + [(set_attr "op_type" "RRE")]) + ; ; Data prefetch patterns ; diff --git a/gcc/testsuite/gcc.target/s390/pr70674.c b/gcc/testsuite/gcc.target/s390/pr70674.c new file mode 100644 index 0000000..13bf271 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/pr70674.c @@ -0,0 +1,13 @@ +/* Test case for PR/70674. */ + +/* { dg-do compile { target s390x-*-* } } */ +/* { dg-options "-march=z10 -mtune=z196 -O2 -fno-omit-frame-pointer -fno-asynchronous-unwind-tables" } */ + +void +foo (void) +{ + volatile int a = 5; + (void) a; +} + +/* { dg-final { scan-assembler-not "^.*lgdr.*\n.*\\(%r11\\)" } } */ -- 1.9.1