From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 10926 invoked by alias); 11 Oct 2011 09:27:45 -0000
Received: (qmail 10916 invoked by uid 22791); 11 Oct 2011 09:27:43 -0000
X-SWARE-Spam-Status: No, hits=-1.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW
X-Spam-Check-By: sourceware.org
Received: from service87.mimecast.com (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.43rc1) with SMTP; Tue, 11 Oct 2011 09:27:17 +0000
Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Tue, 11 Oct 2011 10:27:14 +0100
Received: from [10.1.79.40] ([10.1.255.212]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.0); Tue, 11 Oct 2011 10:27:12 +0100
Subject: [RFA/ARM][Patch 03/05]: STRD generation instead of PUSH in A15 Thumb2 prologue.
From: Sameera Deshpande
To: "gcc-patches@gcc.gnu.org"
Cc: "nickc@redhat.com" , Richard Earnshaw , "paul@codesourcery.com" , Ramana Radhakrishnan
In-Reply-To: <1318324138.2186.40.camel@e102549-lin.cambridge.arm.com>
References: <1318324138.2186.40.camel@e102549-lin.cambridge.arm.com>
Date: Tue, 11 Oct 2011 09:53:00 -0000
Message-ID: <1318325232.2186.58.camel@e102549-lin.cambridge.arm.com>
Mime-Version: 1.0
X-MC-Unique: 111101110271400901
Content-Type: multipart/mixed; boundary="=-lQunIQnh48jbXVNi9z9R"
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
X-SW-Source: 2011-10/txt/msg00863.txt.bz2

--=-lQunIQnh48jbXVNi9z9R
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Content-length: 1109

Hi!

This patch generates STRD instructions instead of PUSH in Thumb-2 mode
for A15.  When optimize_size is set, the original prologue is still
generated for A15.
The work involves defining new functions, predicates and patterns.

The patch was tested with check-gcc, check-gdb and bootstrap, with no
regressions.

Changelog entries for the patch for STRD generation for a15-thumb2:

2011-10-11  Sameera Deshpande

        * config/arm/arm.c (thumb2_emit_strd_push): New static function.
        (arm_expand_prologue): Update.
        * config/arm/ldmstm.md (thumb2_strd): New pattern.
        (thumb2_strd_base): Likewise.

-- 

--=-lQunIQnh48jbXVNi9z9R
Content-Type: text/x-patch; name=a15_thumb2_strd_prologue-6Oct.patch; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="a15_thumb2_strd_prologue-6Oct.patch"
Content-length: 7363

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3eba510..fd8c31d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -15095,6 +15095,125 @@ arm_output_function_epilogue (FILE *file ATTRIBUTE_UNUSED,
     }
 }
 
+/* Generate and emit a pattern that will be recognized as STRD pattern.  If even
+   number of registers are being pushed, multiple STRD patterns are created for
+   all register pairs.  If odd number of registers are pushed, first register is
+   stored by using STR pattern.  */
+static void
+thumb2_emit_strd_push (unsigned long saved_regs_mask)
+{
+  int num_regs = 0;
+  int i, j;
+  rtx par = NULL_RTX;
+  rtx insn = NULL_RTX;
+  rtx dwarf = NULL_RTX;
+  rtx tmp, reg, tmp1;
+
+  for (i = 0; i <= LAST_ARM_REGNUM; i++)
+    if (saved_regs_mask & (1 << i))
+      num_regs++;
+
+  gcc_assert (num_regs && num_regs <= 16);
+
+  /* Pre-decrement the stack pointer, based on there being num_regs 4-byte
+     registers to push.  */
+  tmp = gen_rtx_SET (VOIDmode,
+                     stack_pointer_rtx,
+                     plus_constant (stack_pointer_rtx, -4 * num_regs));
+  RTX_FRAME_RELATED_P (tmp) = 1;
+  insn = emit_insn (tmp);
+
+  /* Create sequence for DWARF info.  */
+  dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (num_regs + 1));
+
+  /* RTLs cannot be shared, hence create new copy for dwarf.  */
+  tmp1 = gen_rtx_SET (VOIDmode,
+                      stack_pointer_rtx,
+                      plus_constant (stack_pointer_rtx, -4 * num_regs));
+  RTX_FRAME_RELATED_P (tmp1) = 1;
+  XVECEXP (dwarf, 0, 0) = tmp1;
+
+  for (i = num_regs - 1, j = LAST_ARM_REGNUM; i >= (num_regs % 2); j--)
+    /* Var j iterates over all the registers to gather all the registers in
+       saved_regs_mask.  Var i gives index of register R_j in stack frame.
+       A PARALLEL RTX of register-pair is created here, so that pattern for
+       STRD can be matched.  If num_regs is odd, 1st register will be pushed
+       using STR and remaining registers will be pushed with STRD in pairs.
+       If num_regs is even, all registers are pushed with STRD in pairs.
+       Hence, skip first element for odd num_regs.  */
+    if (saved_regs_mask & (1 << j))
+      {
+        gcc_assert (j != SP_REGNUM);
+        gcc_assert (j != PC_REGNUM);
+
+        /* Create RTX for store.  New RTX is created for dwarf as
+           they are not sharable.  */
+        reg = gen_rtx_REG (SImode, j);
+        tmp = gen_rtx_SET (SImode,
+                           gen_frame_mem
+                           (SImode,
+                            plus_constant (stack_pointer_rtx, 4 * i)),
+                           reg);
+
+        tmp1 = gen_rtx_SET (SImode,
+                            gen_frame_mem
+                            (SImode,
+                             plus_constant (stack_pointer_rtx, 4 * i)),
+                            reg);
+        RTX_FRAME_RELATED_P (tmp) = 1;
+        RTX_FRAME_RELATED_P (tmp1) = 1;
+
+        if (((i - (num_regs % 2)) % 2) == 1)
+          /* When (i - (num_regs % 2)) is odd, the RTX to be emitted is yet to
+             be created.  Hence create it first.  The STRD pattern we are
+             generating is :
+             [ (SET (MEM (PLUS (SP) (NUM))) (reg_t1))
+               (SET (MEM (PLUS (SP) (NUM + 4))) (reg_t2)) ]
+             where target registers need not be consecutive.  */
+          par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2));
+
+        /* Register R_j is added in PARALLEL RTX.  If (i - (num_regs % 2)) is
+           even, the reg_j is added as 0th element and if it is odd, reg_j is
+           added as 1st element of STRD pattern shown above.  */
+        XVECEXP (par, 0, ((i - (num_regs % 2)) % 2)) = tmp;
+        XVECEXP (dwarf, 0, (i + 1)) = tmp1;
+
+        if (((i - (num_regs % 2)) % 2) == 0)
+          /* When (i - (num_regs % 2)) is even, RTXs for both the registers
+             to be stored are generated in above given STRD pattern, and the
+             pattern can be emitted now.  */
+          emit_insn (par);
+
+        i--;
+      }
+
+  if ((num_regs % 2) == 1)
+    {
+      /* If odd number of registers are pushed, generate STR pattern to store
+         lone register.  */
+      for (; (saved_regs_mask & (1 << j)) == 0; j--);
+
+      tmp1 = gen_frame_mem (SImode, plus_constant (stack_pointer_rtx, 4 * i));
+      reg = gen_rtx_REG (SImode, j);
+      tmp = gen_rtx_SET (SImode, tmp1, reg);
+      RTX_FRAME_RELATED_P (tmp) = 1;
+
+      emit_insn (tmp);
+
+      tmp1 = gen_rtx_SET (SImode,
+                          gen_frame_mem
+                          (SImode,
+                           plus_constant (stack_pointer_rtx, 4 * i)),
+                          reg);
+      RTX_FRAME_RELATED_P (tmp1) = 1;
+      XVECEXP (dwarf, 0, (i + 1)) = tmp1;
+    }
+
+  add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
+  RTX_FRAME_RELATED_P (insn) = 1;
+  return;
+}
+
 /* Generate and emit an insn that we will recognize as a push_multi.
    Unfortunately, since this insn does not reflect very well the actual
    semantics of the operation, we need to annotate the insn for the benefit
@@ -16307,8 +16426,16 @@ arm_expand_prologue (void)
               saved_regs += frame;
             }
         }
-      insn = emit_multi_reg_push (live_regs_mask);
-      RTX_FRAME_RELATED_P (insn) = 1;
+
+      if (TARGET_THUMB2 && current_tune->prefer_ldrd_strd && !optimize_size)
+        {
+          thumb2_emit_strd_push (live_regs_mask);
+        }
+      else
+        {
+          insn = emit_multi_reg_push (live_regs_mask);
+          RTX_FRAME_RELATED_P (insn) = 1;
+        }
     }
 
   if (! IS_VOLATILE (func_type))
diff --git a/gcc/config/arm/ldmstm.md b/gcc/config/arm/ldmstm.md
index 21d2815..e3dcd4f 100644
--- a/gcc/config/arm/ldmstm.md
+++ b/gcc/config/arm/ldmstm.md
@@ -47,6 +47,32 @@
   [(set_attr "type" "load2")
    (set_attr "predicable" "yes")])
 
+(define_insn "*thumb2_strd_base"
+  [(set (mem:SI (match_operand:SI 0 "s_register_operand" "rk"))
+        (match_operand:SI 1 "register_operand" "r"))
+   (set (mem:SI (plus:SI (match_dup 0)
+                         (const_int 4)))
+        (match_operand:SI 2 "register_operand" "r"))]
+  "(TARGET_THUMB2 && current_tune->prefer_ldrd_strd
+     && (!bad_reg_pair_for_thumb_ldrd_strd (operands[1], operands[2])))"
+  "strd%?\t%1, %2, [%0]"
+  [(set_attr "type" "store2")
+   (set_attr "predicable" "yes")])
+
+(define_insn "*thumb2_strd"
+  [(set (mem:SI (plus:SI (match_operand:SI 0 "s_register_operand" "rk")
+                         (match_operand:SI 1 "ldrd_immediate_operand" "Pz")))
+        (match_operand:SI 2 "register_operand" "r"))
+   (set (mem:SI (plus:SI (match_dup 0)
+                         (match_operand:SI 3 "const_int_operand" "")))
+        (match_operand:SI 4 "register_operand" "r"))]
+  "(TARGET_THUMB2 && current_tune->prefer_ldrd_strd
+     && ((INTVAL (operands[1]) + 4) == INTVAL (operands[3]))
+     && (!bad_reg_pair_for_thumb_ldrd_strd (operands[2], operands[4])))"
+  "strd%?\t%2, %4, [%0, %1]"
+  [(set_attr "type" "store2")
+   (set_attr "predicable" "yes")])
+
 (define_insn "*ldm4_ia"
   [(match_parallel 0 "load_multiple_operation"
     [(set (match_operand:SI 1 "arm_hard_register_operand" "")

--=-lQunIQnh48jbXVNi9z9R--