From: Wilco Dijkstra
To: GCC Patches
CC: nd
Subject: Re: [PATCH][AArch64 - v2] Simplify eh_return implementation
Date: Thu, 01 Sep 2016 12:33:00 -0000

Ping

I noticed it would still be a good idea to add an extra barrier in the epilog
as the scheduler doesn't appear to
handle aliases of frame accesses properly.

This patch simplifies the handling of the EH return value.  We force the use
of the frame pointer so the return location is always at FP + 8.  This means
we can emit a simple volatile access in EH_RETURN_HANDLER_RTX without needing
md patterns, splitters and frame offset calculations.  The new implementation
also fixes various bugs in aarch64_final_eh_return_addr, which does not work
with -fomit-frame-pointer, alloca or outgoing arguments.

Bootstrap OK, GCC Regression OK, OK for trunk?  Would it be useful to backport
this to GCC6.x?

ChangeLog:
2016-08-10  Wilco Dijkstra

gcc/
        * config/aarch64/aarch64.md (eh_return): Remove pattern and splitter.
        * config/aarch64/aarch64.h (AARCH64_EH_STACKADJ_REGNUM): Remove.
        (EH_RETURN_HANDLER_RTX): New define.
        * config/aarch64/aarch64.c (aarch64_frame_pointer_required):
        Force frame pointer in EH return functions.
        (aarch64_expand_epilogue): Add barrier for eh_return.
        (aarch64_final_eh_return_addr): Remove.
        (aarch64_eh_return_handler_rtx): New function.
        * config/aarch64/aarch64-protos.h (aarch64_final_eh_return_addr):
        Remove.
        (aarch64_eh_return_handler_rtx): New prototype.
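
For readers less familiar with the AArch64 frame layout, the "FP + 8" location
described in the message can be modelled in plain C.  This is only an
illustrative sketch and not part of the patch: the names frame_record,
set_eh_handler and return_address_offset are hypothetical, and the struct
assumes the usual frame record of two 64-bit slots (saved FP, then saved LR).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an AArch64 frame record: the two 64-bit slots
   that the frame pointer (x29) points at once a frame has been set up.  */
struct frame_record
{
  uint64_t saved_fp;  /* Caller's frame pointer, at FP + 0.  */
  uint64_t saved_lr;  /* Return address, at FP + 8.  */
};

/* Illustrative stand-in for the store that EH_RETURN_HANDLER_RTX describes:
   overwrite the saved return address through a volatile access so the
   compiler cannot treat the store as dead.  */
static void
set_eh_handler (struct frame_record *fp, uint64_t handler)
{
  volatile uint64_t *slot = &fp->saved_lr;
  *slot = handler;
}

/* Byte offset of the return-address slot from the frame pointer.  */
static size_t
return_address_offset (void)
{
  return offsetof (struct frame_record, saved_lr);
}
```

Because the slot sits at a fixed offset from FP, no frame-size or saved-register
arithmetic is needed to find it, which is the simplification the patch relies on.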
--
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 3cdd69b8af1089a839e5d45cda94bc70a15cd777..327c0a97f6f687604afef249b79ac22628418070 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -358,7 +358,7 @@ int aarch64_hard_regno_mode_ok (unsigned, machine_mode);
 int aarch64_hard_regno_nregs (unsigned, machine_mode);
 int aarch64_uxt_size (int, HOST_WIDE_INT);
 int aarch64_vec_fpconst_pow_of_2 (rtx);
-rtx aarch64_final_eh_return_addr (void);
+rtx aarch64_eh_return_handler_rtx (void);
 rtx aarch64_mask_from_zextract_ops (rtx, rtx);
 const char *aarch64_output_move_struct (rtx *operands);
 rtx aarch64_return_addr (int, rtx);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 003fec87e41db618570663f28cc2387a87e8252a..fa81e4b853dafcccc08842955288861ec7e7acca 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -400,9 +400,9 @@ extern unsigned aarch64_architecture_version;
 #define ASM_DECLARE_FUNCTION_NAME(STR, NAME, DECL)      \
   aarch64_declare_function_name (STR, NAME, DECL)
 
-/* The register that holds the return address in exception handlers.  */
-#define AARCH64_EH_STACKADJ_REGNUM     (R0_REGNUM + 4)
-#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, AARCH64_EH_STACKADJ_REGNUM)
+/* For EH returns X4 contains the stack adjustment.  */
+#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, R4_REGNUM)
+#define EH_RETURN_HANDLER_RTX  aarch64_eh_return_handler_rtx ()
 
 /* Don't use __builtin_setjmp until we've defined it.  */
 #undef DONT_USE_BUILTIN_SETJMP
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5a25fce17785af9f9dc12e0f2a9609af09af0b35..bb8baff1e7a06942c8b8f51c1d6b341673401ef9 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2739,6 +2739,10 @@ aarch64_frame_pointer_required (void)
       && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))
     return true;
 
+  /* Force a frame pointer for EH returns so the return address is at FP+8.  */
+  if (crtl->calls_eh_return)
+    return true;
+
   return false;
 }
 
@@ -3298,7 +3302,8 @@ aarch64_expand_epilogue (bool for_sibcall)
                          + cfun->machine->frame.saved_varargs_size) != 0;
 
   /* Emit a barrier to prevent loads from a deallocated stack.  */
-  if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca)
+  if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca
+      || crtl->calls_eh_return)
     {
       emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
       need_barrier_p = false;
@@ -3366,52 +3371,15 @@ aarch64_expand_epilogue (bool for_sibcall)
     emit_jump_insn (ret_rtx);
 }
 
-/* Return the place to copy the exception unwinding return address to.
-   This will probably be a stack slot, but could (in theory be the
-   return register).  */
+/* Implement EH_RETURN_HANDLER_RTX.  The return address is stored at FP + 8.
+   The access needs to be volatile to prevent it from being removed.  */
 rtx
-aarch64_final_eh_return_addr (void)
+aarch64_eh_return_handler_rtx (void)
 {
-  HOST_WIDE_INT fp_offset;
-
-  aarch64_layout_frame ();
-
-  fp_offset = cfun->machine->frame.frame_size
-             - cfun->machine->frame.hard_fp_offset;
-
-  if (cfun->machine->frame.reg_offset[LR_REGNUM] < 0)
-    return gen_rtx_REG (DImode, LR_REGNUM);
-
-  /* DSE and CSELIB do not detect an alias between sp+k1 and fp+k2.  This can
-     result in a store to save LR introduced by builtin_eh_return () being
-     incorrectly deleted because the alias is not detected.
-     So in the calculation of the address to copy the exception unwinding
-     return address to, we note 2 cases.
-     If FP is needed and the fp_offset is 0, it means that SP = FP and hence
-     we return a SP-relative location since all the addresses are SP-relative
-     in this case.  This prevents the store from being optimized away.
-     If the fp_offset is not 0, then the addresses will be FP-relative and
-     therefore we return a FP-relative location.  */
-
-  if (frame_pointer_needed)
-    {
-      if (fp_offset)
-        return gen_frame_mem (DImode,
-                             plus_constant (Pmode, hard_frame_pointer_rtx, UNITS_PER_WORD));
-      else
-        return gen_frame_mem (DImode,
-                             plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD));
-    }
-
-  /* If FP is not needed, we calculate the location of LR, which would be
-     at the top of the saved registers block.  */
-
-  return gen_frame_mem (DImode,
-                       plus_constant (Pmode,
-                                      stack_pointer_rtx,
-                                      fp_offset
-                                      + cfun->machine->frame.saved_regs_size
-                                      - 2 * UNITS_PER_WORD));
+  rtx tmp = gen_frame_mem (Pmode,
+    plus_constant (Pmode, hard_frame_pointer_rtx, UNITS_PER_WORD));
+  MEM_VOLATILE_P (tmp) = true;
+  return tmp;
 }
 
 /* Output code to add DELTA to the first argument, and then jump
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 21f5a6aba74d28f04b9391ba917453a4cd7de1af..7d86aa7c0cb2fdb30889badc172d2270eeadb1e5 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -591,25 +591,6 @@
   [(set_attr "type" "branch")]
 )
 
-(define_insn "eh_return"
-  [(unspec_volatile [(match_operand:DI 0 "register_operand" "r")]
-    UNSPECV_EH_RETURN)]
-  ""
-  "#"
-  [(set_attr "type" "branch")]
-
-)
-
-(define_split
-  [(unspec_volatile [(match_operand:DI 0 "register_operand" "")]
-    UNSPECV_EH_RETURN)]
-  "reload_completed"
-  [(set (match_dup 1) (match_dup 0))]
-  {
-    operands[1] = aarch64_final_eh_return_addr ();
-  }
-)
-
 (define_insn "*cb1"
   [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r")
                                 (const_int 0))