From mboxrd@z Thu Jan 1 00:00:00 1970
References: <1283354531.25967.50.camel@e102346-lin.cambridge.arm.com> <201010131201.18075.paul@codesourcery.com>
Date: Tue, 14 Dec 2010 22:58:00 -0000
Subject: Re: [PATCH: ARM] PR 45335 Use ldrd and strd to access two consecutive words
From: Carrot Wei
To: Paul Brook, Richard Earnshaw, Nick Clifton
Cc: gcc-patches@gcc.gnu.org, ramana.radhakrishnan@arm.com
X-SW-Source: 2010-12/txt/msg01139.txt.bz2

ping

On Mon, Nov 29, 2010 at 2:32 PM, Carrot Wei wrote:
> ping
>
> On Mon, Nov 22, 2010 at 3:16 PM, Carrot Wei wrote:
>> ping
>>
>> On Sun, Oct 31, 2010 at 2:22 AM, Carrot Wei wrote:
>>> Ping
>>>
>>> On Sun, Oct 24, 2010 at 9:46 PM, Carrot Wei wrote:
>>>> Ping
>>>>
>>>> On Sat, Oct 16, 2010 at 8:27 PM, Carrot Wei wrote:
>>>>> On Wed, Oct 13, 2010 at 7:01 PM, Paul Brook wrote:
>>>>>>> ChangeLog:
>>>>>>> 2010-09-04  Wei Guozhi
>>>>>>>
>>>>>>>         PR target/45335
>>>>>>>         * gcc/config/arm/thumb2.md (thumb2_ldrd, thumb2_ldrd_reg1,
>>>>>>>         thumb2_ldrd_reg2 and peephole2): New insn pattern and related
>>>>>>>         peephole2.
>>>>>>>         (thumb2_strd, thumb2_strd_reg1, thumb2_strd_reg2 and peephole2):
>>>>>>>         New insn pattern and related peephole2.
>>>>>>>         * gcc/config/arm/arm.c (thumb2_legitimate_ldrd_p): New function.
>>>>>>>         (thumb2_check_ldrd_operands): New function.
>>>>>>>         (thumb2_prefer_ldmstm): New function.
>>>>>>>         * gcc/config/arm/arm-protos.h (thumb2_legitimate_ldrd_p): New
>>>>>>> prototype. (thumb2_check_ldrd_operands): New prototype.
>>>>>>>         (thumb2_prefer_ldmstm): New prototype.
>>>>>>>         * gcc/config/arm/ldmstm.md (ldm2_ia, stm2_ia, ldm2_db, stm2_db):
>>>>>>>         Change the ldm/stm patterns with 2 words to ARM only.
>>>>>>>         * gcc/config/arm/constraints.md (Py): New thumb2 constant
>>>>>>> constraint suitable to ldrd/strd instructions.
>>>>>>
>>>>>> Not ok.
>>>>>>
>>>>>> Why is this restricted to Thumb mode? The ARM variant of ldrd isn't quite as
>>>>>> flexible, but still provides a useful improvement over ldm.
>>>>>>
>>>>> I agree the ARM version is also useful. But it brings much less
>>>>> benefit with too much complexity (due to more restriction and insn
>>>>> pattern conflict with ldm). So I will leave it as a future
>>>>> improvement.
>>>>>
>>>>>> This transformation is only valid on ARMv7 cores. On earlier hardware
>>>>>> (depending on system configuration) it may cause undefined behavior or an
>>>>>> alignment trap.
>>>>>>
>>>>> done.
>>>>>
>>>>>> The range on -1020 to +1024 is used in several places, but without any
>>>>>> apparent explanation of why it's different to the range of an ldrd
>>>>>> instruction.  I figured it out eventually, but it deserves a comment.
>>>>>>
>>>>> Comments added.
>>>>>
>>>>>>> +  "TARGET_THUMB2 && thumb2_check_ldrd_operands (operands[0], operands[1],
>>>>>>> +                                           operands[2], 0, operands[3], 1)"
>>>>>>
>>>>>> Passed operands do not match expected types. Specifically "0" is not an rtx
>>>>>> (should be "NULL_RTX"), and "1" is not a boolean value (should be "true").
>>>>>> Many other occurrences.
>>>>>>
>>>>> Fixed.
>>>>>
>>>>>>> +(define_constraint "Py"
>>>>>>> +  "@internal In Thumb-2 state a constant that is a multiple of 4 in the
>>>>>>> +   range -1020 to 1024"
>>>>>>
>>>>>> This comment seems particularly pointless. You should mention why this
>>>>>> exists/where it is used.
>>>>>>
>>>>>> I think you're better off enforcing this in the insn condition, and remove
>>>>>> this constraint. At least half the uses (the -reg[12] insns) are incorrect,
>>>>>> and you already need the condition to enforce the dependency between the
>>>>>> operands.
>>>>>>
>>>>> I removed this constraint and add the check to insn condition.
>>>>>
>>>>>>> +thumb2_check_ldrd_operands (rtx reg1, rtx reg2, rtx base,
>>>>>>>...
>>>>>>> +  if (ldrd && (reg1 == reg2))
>>>>>>> +    return false;
>>>>>>
>>>>>> This function is part of the instruction condition.  Instruction conditions
>>>>>> must not be used to enforce register allocation.
>>>>>>
>>>>> removed.
>>>>>
>>>>>>> +thumb2_legitimate_ldrd_p (
>>>>>>>...
>>>>>>> +  if (ldrd && ((reg1 == reg2) || (reg1 == base1)))
>>>>>>> +    return false;
>>>>>>
>>>>>> You're incorrectly assuming offset1 < offset2, which might not be true at this
>>>>>> point.
>>>>>>
>>>>> The following check assumes offset1 < offset2
>>>>> +  if ((offset1 + 4) == offset2)
>>>>> +    return true;
>>>>>
>>>>> And another check assumes offset2 < offset1, so both cases are covered.
>>>>> +  if ((offset2 + 4) == offset1)
>>>>> +    return true;
>>>>>
>>>>>>> +  /* Now ldm/stm is possible. Check for special cases ldm/stm has lower
>>>>>>> +     cost.  */
>>>>>>> +  return false;
>>>>>>
>>>>>> Code clearly doesn't match the comment.  In fact this function always returns
>>>>>> false.
>>>>>>
>>>>> Richard mentioned that in some cases (specifically cortex A9) ldm has
>>>>> less cost than ldrd and we should model this in the insn pattern. This
>>>>> function is used for this. But I don't know the cortex A9 architecture
>>>>> detail, so it should be filled by somebody with more knowledge about
>>>>> it in future.
>>>>>
>>>>> Wei Guozhi
>>>>>
>>>>>
>>>>> ChangeLog:
>>>>> 2010-10-16  Wei Guozhi
>>>>>
>>>>>        PR target/45335
>>>>>        * gcc/config/arm/thumb2.md (thumb2_ldrd, thumb2_ldrd_reg1,
>>>>>        thumb2_ldrd_reg2 and peephole2): New insn pattern and related
>>>>>        peephole2.
>>>>>        (thumb2_strd, thumb2_strd_reg1, thumb2_strd_reg2 and peephole2):
>>>>>        New insn pattern and related peephole2.
>>>>>        * gcc/config/arm/arm.c (thumb2_legitimate_ldrd_p): New function.
>>>>>        (thumb2_check_ldrd_operands): New function.
>>>>>        (thumb2_prefer_ldmstm): New function.
>>>>>        * gcc/config/arm/arm-protos.h (thumb2_legitimate_ldrd_p): New prototype.
>>>>>        (thumb2_check_ldrd_operands): New prototype.
>>>>>        (thumb2_prefer_ldmstm): New prototype.
>>>>>        * gcc/config/arm/ldmstm.md (ldm2_ia, stm2_ia, ldm2_db, stm2_db):
>>>>>        Change the ldm/stm patterns with 2 words to ARM only.
>>>>>
>>>>>
>>>>> 2010-10-16  Wei Guozhi
>>>>>
>>>>>        PR target/45335
>>>>>        * gcc.target/arm/pr45335.c: New test.
>>>>>        * gcc.target/arm/pr40457-1.c: Changed to load 3 words.
>>>>>        * gcc.target/arm/pr40457-2.c: Changed to store 3 words.
>>>>>        * gcc.target/arm/pr40457-3.c: Changed to store 3 words.
>>>>>
>>>>>
>>>>> Index: thumb2.md
>>>>> ===================================================================
>>>>> --- thumb2.md   (revision 165492)
>>>>> +++ thumb2.md   (working copy)
>>>>> @@ -1118,3 +1118,228 @@ (define_peephole2
>>>>>   "
>>>>>   operands[2] = GEN_INT (32 - INTVAL (operands[2]));
>>>>>   ")
>>>>> +
>>>>> +(define_insn "*thumb2_ldrd"
>>>>> +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
>>>>> +                  (mem:SI (plus:SI
>>>>> +                               (match_operand:SI 2 "s_register_operand" "rk")
>>>>> +                               (match_operand:SI 3 "const_int_operand" ""))))
>>>>> +             (set (match_operand:SI 1 "s_register_operand" "")
>>>>> +                  (mem:SI (plus:SI (match_dup 2)
>>>>> +                            (match_operand:SI 4 "const_int_operand" ""))))])]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_check_ldrd_operands (operands[3], operands[4])"
>>>>> +  "*
>>>>> +  {
>>>>> +    HOST_WIDE_INT offset1 = INTVAL (operands[3]);
>>>>> +    HOST_WIDE_INT offset2 = INTVAL (operands[4]);
>>>>> +    if (offset1 > offset2)
>>>>> +      {
>>>>> +       /* Swap the operands so that memory [base+offset1] is loaded into
>>>>> +          operands[0].  */
>>>>> +       rtx tmp = operands[0];
>>>>> +       operands[0] = operands[1];
>>>>> +       operands[1] = tmp;
>>>>> +       tmp = operands[3];
>>>>> +       operands[3] = operands[4];
>>>>> +       operands[4] = tmp;
>>>>> +       offset1 = INTVAL (operands[3]);
>>>>> +       offset2 = INTVAL (operands[4]);
>>>>> +      }
>>>>> +    if (thumb2_prefer_ldmstm (operands[0], operands[1],
>>>>> +                             operands[2], operands[3], operands[4], true))
>>>>> +      return \"ldmdb\\t%2, {%0, %1}\";
>>>>> +    else if (fix_cm3_ldrd && (operands[2] == operands[0]))
>>>>> +      {
>>>>> +       if (offset1 <= -256)
>>>>> +         {
>>>>> +           output_asm_insn (\"sub\\t%2, %2, %n3\", operands);
>>>>> +           output_asm_insn (\"ldr\\t%1, [%2, #4]\", operands);
>>>>> +           output_asm_insn (\"ldr\\t%0, [%2]\", operands);
>>>>> +         }
>>>>> +       else
>>>>> +         {
>>>>> +           output_asm_insn (\"ldr\\t%1, [%2, %4]\", operands);
>>>>> +           output_asm_insn (\"ldr\\t%0, [%2, %3]\", operands);
>>>>> +         }
>>>>> +       return \"\";
>>>>> +      }
>>>>> +    else
>>>>> +      return \"ldrd\\t%0, %1, [%2, %3]\";
>>>>> +  }"
>>>>> +)
>>>>> +
>>>>> +(define_insn "*thumb2_ldrd_reg1"
>>>>> +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
>>>>> +                  (mem:SI (match_operand:SI 2 "s_register_operand" "rk")))
>>>>> +             (set (match_operand:SI 1 "s_register_operand" "")
>>>>> +                  (mem:SI (plus:SI (match_dup 2)
>>>>> +                            (match_operand:SI 3 "const_int_operand" ""))))])]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_check_ldrd_operands (NULL_RTX, operands[3])"
>>>>> +  "*
>>>>> +  {
>>>>> +    HOST_WIDE_INT offset2 = INTVAL (operands[3]);
>>>>> +    if (offset2 == 4)
>>>>> +      {
>>>>> +       if (thumb2_prefer_ldmstm (operands[0], operands[1],
>>>>> +                                 operands[2], NULL_RTX, operands[3], true))
>>>>> +         return \"ldmia\\t%2, {%0, %1}\";
>>>>> +       if (fix_cm3_ldrd && (operands[2] == operands[0]))
>>>>> +         {
>>>>> +           output_asm_insn (\"ldr\\t%1, [%2, %3]\", operands);
>>>>> +           output_asm_insn (\"ldr\\t%0, [%2]\", operands);
>>>>> +           return \"\";
>>>>> +         }
>>>>> +       return \"ldrd\\t%0, %1, [%2]\";
>>>>> +      }
>>>>> +    else
>>>>> +      {
>>>>> +       if (fix_cm3_ldrd && (operands[2] == operands[1]))
>>>>> +         {
>>>>> +           output_asm_insn (\"ldr\\t%0, [%2]\", operands);
>>>>> +           output_asm_insn (\"ldr\\t%1, [%2, %3]\", operands);
>>>>> +         }
>>>>> +       return \"ldrd\\t%1, %0, [%2, %3]\";
>>>>> +      }
>>>>> +  }"
>>>>> +)
>>>>> +
>>>>> +(define_insn "*thumb2_ldrd_reg2"
>>>>> +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
>>>>> +                  (mem:SI (plus:SI
>>>>> +                               (match_operand:SI 2 "s_register_operand" "rk")
>>>>> +                               (match_operand:SI 3 "const_int_operand" ""))))
>>>>> +             (set (match_operand:SI 1 "s_register_operand" "")
>>>>> +                  (mem:SI (match_dup 2)))])]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_check_ldrd_operands (operands[3], NULL_RTX)"
>>>>> +  "*
>>>>> +  {
>>>>> +    HOST_WIDE_INT offset1 = INTVAL (operands[3]);
>>>>> +    if (offset1 == -4)
>>>>> +      {
>>>>> +       if (fix_cm3_ldrd && (operands[2] == operands[0]))
>>>>> +         {
>>>>> +           output_asm_insn (\"ldr\\t%1, [%2]\", operands);
>>>>> +           output_asm_insn (\"ldr\\t%0, [%2, %3]\", operands);
>>>>> +           return \"\";
>>>>> +         }
>>>>> +       return \"ldrd\\t%0, %1, [%2, %3]\";
>>>>> +      }
>>>>> +    else
>>>>> +      {
>>>>> +       if (thumb2_prefer_ldmstm (operands[0], operands[1],
>>>>> +                                 operands[2], operands[3], NULL_RTX, true))
>>>>> +         return \"ldmia\\t%2, {%1, %0}\";
>>>>> +       if (fix_cm3_ldrd && (operands[2] == operands[1]))
>>>>> +         {
>>>>> +           output_asm_insn (\"ldr\\t%0, [%2, %3]\", operands);
>>>>> +           output_asm_insn (\"ldr\\t%1, [%2]\", operands);
>>>>> +           return \"\";
>>>>> +         }
>>>>> +       return \"ldrd\\t%1, %0, [%2]\";
>>>>> +      }
>>>>> +  }"
>>>>> +)
>>>>> +
>>>>> +(define_peephole2
>>>>> +  [(set (match_operand:SI 0 "s_register_operand" "")
>>>>> +       (match_operand:SI 2 "memory_operand" ""))
>>>>> +   (set (match_operand:SI 1 "s_register_operand" "")
>>>>> +       (match_operand:SI 3 "memory_operand" ""))]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_legitimate_ldrd_p (operands[0], operands[1],
>>>>> +                               operands[2], operands[3], true)"
>>>>> +  [(parallel [(set (match_operand:SI 0 "s_register_operand" "")
>>>>> +                  (match_operand:SI 2 "memory_operand" ""))
>>>>> +             (set (match_operand:SI 1 "s_register_operand" "")
>>>>> +                  (match_operand:SI 3 "memory_operand" ""))])]
>>>>> +  ""
>>>>> +)
>>>>> +
>>>>> +(define_insn "*thumb2_strd"
>>>>> +  [(parallel [(set (mem:SI
>>>>> +                       (plus:SI (match_operand:SI 2 "s_register_operand" "rk")
>>>>> +                                (match_operand:SI 3 "const_int_operand" "")))
>>>>> +                  (match_operand:SI 0 "s_register_operand" ""))
>>>>> +             (set (mem:SI (plus:SI (match_dup 2)
>>>>> +                                (match_operand:SI 4 "const_int_operand" "")))
>>>>> +                  (match_operand:SI 1 "s_register_operand" ""))])]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_check_ldrd_operands (operands[3], operands[4])"
>>>>> +  "*
>>>>> +  {
>>>>> +    HOST_WIDE_INT offset1 = INTVAL (operands[3]);
>>>>> +    HOST_WIDE_INT offset2 = INTVAL (operands[4]);
>>>>> +    if (thumb2_prefer_ldmstm (operands[0], operands[1],
>>>>> +                             operands[2], operands[3], operands[4], false))
>>>>> +      return \"stmdb\\t%2, {%0, %1}\";
>>>>> +    if (offset1 < offset2)
>>>>> +      return \"strd\\t%0, %1, [%2, %3]\";
>>>>> +    else
>>>>> +      return \"strd\\t%1, %0, [%2, %4]\";
>>>>> +  }"
>>>>> +)
>>>>> +
>>>>> +(define_insn "*thumb2_strd_reg1"
>>>>> +  [(parallel [(set (mem:SI (match_operand:SI 2 "s_register_operand" "rk"))
>>>>> +                  (match_operand:SI 0 "s_register_operand" ""))
>>>>> +             (set (mem:SI (plus:SI (match_dup 2)
>>>>> +                               (match_operand:SI 3 "const_int_operand" "")))
>>>>> +                  (match_operand:SI 1 "s_register_operand" ""))])]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_check_ldrd_operands (NULL_RTX, operands[3])"
>>>>> +  "*
>>>>> +  {
>>>>> +    HOST_WIDE_INT offset2 = INTVAL (operands[3]);
>>>>> +    if (offset2 == 4)
>>>>> +      {
>>>>> +       if (thumb2_prefer_ldmstm (operands[0], operands[1],
>>>>> +                                 operands[2], NULL_RTX, operands[3], false))
>>>>> +         return \"stmia\\t%2, {%0, %1}\";
>>>>> +       return \"strd\\t%0, %1, [%2]\";
>>>>> +      }
>>>>> +    else
>>>>> +      return \"strd\\t%1, %0, [%2, %3]\";
>>>>> +  }"
>>>>> +)
>>>>> +
>>>>> +(define_insn "*thumb2_strd_reg2"
>>>>> +  [(parallel [(set (mem:SI (plus:SI
>>>>> +                               (match_operand:SI 2 "s_register_operand" "rk")
>>>>> +                               (match_operand:SI 3 "const_int_operand" "")))
>>>>> +                  (match_operand:SI 0 "s_register_operand" ""))
>>>>> +             (set (mem:SI (match_dup 2))
>>>>> +                  (match_operand:SI 1 "s_register_operand" ""))])]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_check_ldrd_operands (operands[3], NULL_RTX)"
>>>>> +  "*
>>>>> +  {
>>>>> +    HOST_WIDE_INT offset1 = INTVAL (operands[3]);
>>>>> +    if (offset1 == -4)
>>>>> +      return \"strd\\t%0, %1, [%2, %3]\";
>>>>> +    else
>>>>> +      {
>>>>> +       if (thumb2_prefer_ldmstm (operands[0], operands[1],
>>>>> +                                 operands[2], operands[3], NULL_RTX, false))
>>>>> +         return \"stmia\\t%2, {%1, %0}\";
>>>>> +       return \"strd\\t%1, %0, [%2]\";
>>>>> +      }
>>>>> +  }"
>>>>> +)
>>>>> +
>>>>> +(define_peephole2
>>>>> +  [(set (match_operand:SI 2 "memory_operand" "")
>>>>> +       (match_operand:SI 0 "s_register_operand" ""))
>>>>> +   (set (match_operand:SI 3 "memory_operand" "")
>>>>> +       (match_operand:SI 1 "s_register_operand" ""))]
>>>>> +  "TARGET_THUMB2 && arm_arch7
>>>>> +   && thumb2_legitimate_ldrd_p (operands[0], operands[1],
>>>>> +                               operands[2], operands[3], false)"
>>>>> +  [(parallel [(set (match_operand:SI 2 "memory_operand" "")
>>>>> +                  (match_operand:SI 0 "s_register_operand" ""))
>>>>> +             (set (match_operand:SI 3 "memory_operand" "")
>>>>> +                  (match_operand:SI 1 "s_register_operand" ""))])]
>>>>> +  ""
>>>>> +)
>>>>> Index: arm.c
>>>>> ===================================================================
>>>>> --- arm.c       (revision 165492)
>>>>> +++ arm.c       (working copy)
>>>>> @@ -23254,4 +23254,134 @@ arm_builtin_support_vector_misalignment
>>>>>                                                       is_packed);
>>>>>  }
>>>>>
>>>>> +/* Check the validity of operands in an ldrd/strd instruction.  */
>>>>> +bool
>>>>> +thumb2_check_ldrd_operands (rtx off1, rtx off2)
>>>>> +{
>>>>> +  HOST_WIDE_INT offset1 = 0;
>>>>> +  HOST_WIDE_INT offset2 = 0;
>>>>> +
>>>>> +  if (off1 != NULL_RTX)
>>>>> +    offset1 = INTVAL (off1);
>>>>> +  if (off2 != NULL_RTX)
>>>>> +    offset2 = INTVAL (off2);
>>>>> +
>>>>> +  /* The offset range of LDRD is [-1020, 1020]. Here we check if both
>>>>> +     offsets lie in the range [-1020, 1024]. If one of the offsets is
>>>>> +     1024, the following condition ((offset1 + 4) == offset2) will ensure
>>>>> +     offset1 to be 1020, suitable for instruction LDRD.  */
>>>>> +  if ((offset1 > 1024) || (offset1 < -1020) || ((offset1 & 3) != 0))
>>>>> +    return false;
>>>>> +  if ((offset2 > 1024) || (offset2 < -1020) || ((offset2 & 3) != 0))
>>>>> +    return false;
>>>>> +
>>>>> +  if ((offset1 + 4) == offset2)
>>>>> +    return true;
>>>>> +  if ((offset2 + 4) == offset1)
>>>>> +    return true;
>>>>> +
>>>>> +  return false;
>>>>> +}
>>>>> +
>>>>> +/* Check if the two memory accesses can be merged to an ldrd/strd instruction.
>>>>> +   That is they use the same base register, and the gap between constant
>>>>> +   offsets should be 4.  */
>>>>> +bool
>>>>> +thumb2_legitimate_ldrd_p (rtx reg1, rtx reg2, rtx mem1, rtx mem2, bool ldrd)
>>>>> +{
>>>>> +  rtx base1, base2, op1;
>>>>> +  rtx addr1 = XEXP (mem1, 0);
>>>>> +  rtx addr2 = XEXP (mem2, 0);
>>>>> +  HOST_WIDE_INT offset1 = 0;
>>>>> +  HOST_WIDE_INT offset2 = 0;
>>>>> +
>>>>> +  if (MEM_VOLATILE_P (mem1) || MEM_VOLATILE_P (mem2))
>>>>> +    return false;
>>>>> +
>>>>> +  if (REG_P (addr1))
>>>>> +    base1 = addr1;
>>>>> +  else if (GET_CODE (addr1) == PLUS)
>>>>> +    {
>>>>> +      base1 = XEXP (addr1, 0);
>>>>> +      op1 = XEXP (addr1, 1);
>>>>> +      if (!REG_P (base1) || (GET_CODE (op1) != CONST_INT))
>>>>> +       return false;
>>>>> +      offset1 = INTVAL (op1);
>>>>> +    }
>>>>> +  else
>>>>> +    return false;
>>>>> +
>>>>> +  if (REG_P (addr2))
>>>>> +    base2 = addr2;
>>>>> +  else if (GET_CODE (addr2) == PLUS)
>>>>> +    {
>>>>> +      base2 = XEXP (addr2, 0);
>>>>> +      op1 = XEXP (addr2, 1);
>>>>> +      if (!REG_P (base2) || (GET_CODE (op1) != CONST_INT))
>>>>> +       return false;
>>>>> +      offset2 = INTVAL (op1);
>>>>> +    }
>>>>> +  else
>>>>> +    return false;
>>>>> +
>>>>> +  if (base1 != base2)
>>>>> +    return false;
>>>>> +
>>>>> +  /* The offset range of LDRD is [-1020, 1020]. Here we check if both
>>>>> +     offsets lie in the range [-1020, 1024]. If one of the offsets is
>>>>> +     1024, the following condition ((offset1 + 4) == offset2) will ensure
>>>>> +     offset1 to be 1020, suitable for instruction LDRD.  */
>>>>> +  if ((offset1 > 1024) || (offset1 < -1020) || ((offset1 & 3) != 0))
>>>>> +    return false;
>>>>> +  if ((offset2 > 1024) || (offset2 < -1020) || ((offset2 & 3) != 0))
>>>>> +    return false;
>>>>> +
>>>>> +  if (ldrd && ((reg1 == reg2) || (reg1 == base1)))
>>>>> +    return false;
>>>>> +
>>>>> +  if ((offset1 + 4) == offset2)
>>>>> +    return true;
>>>>> +  if ((offset2 + 4) == offset1)
>>>>> +    return true;
>>>>> +
>>>>> +  return false;
>>>>> +}
>>>>> +
>>>>> +/* Check if the insn can be expressed as ldm/stm with less cost.  */
>>>>> +bool
>>>>> +thumb2_prefer_ldmstm (rtx reg1, rtx reg2, rtx base,
>>>>> +                     rtx off1, rtx off2, bool ldrd)
>>>>> +{
>>>>> +  HOST_WIDE_INT offset1 = 0;
>>>>> +  HOST_WIDE_INT offset2 = 0;
>>>>> +
>>>>> +  if (off1 != NULL_RTX)
>>>>> +    offset1 = INTVAL (off1);
>>>>> +  if (off2 != NULL_RTX)
>>>>> +    offset2 = INTVAL (off2);
>>>>> +
>>>>> +  if (offset1 > offset2)
>>>>> +    {
>>>>> +      rtx tmp;
>>>>> +      HOST_WIDE_INT t = offset1;
>>>>> +      offset1 = offset2;
>>>>> +      offset2 = t;
>>>>> +      tmp = reg1;
>>>>> +      reg1 = reg2;
>>>>> +      reg2 = tmp;
>>>>> +    }
>>>>> +
>>>>> +  /* The offset of ldmdb is -8, the offset of ldmia is 0.  */
>>>>> +  if ((offset1 != -8) && (offset1 != 0))
>>>>> +    return false;
>>>>> +
>>>>> +  /* Lower register corresponds to lower memory.  */
>>>>> +  if (REGNO (reg1) > REGNO (reg2))
>>>>> +    return false;
>>>>> +
>>>>> +  /* Now ldm/stm is possible. Check for special cases ldm/stm has lower
>>>>> +     cost.  */
>>>>> +  return false;
>>>>> +}
>>>>> +
>>>>>  #include "gt-arm.h"
>>>>> Index: arm-protos.h
>>>>> ===================================================================
>>>>> --- arm-protos.h        (revision 165492)
>>>>> +++ arm-protos.h        (working copy)
>>>>> @@ -150,6 +150,9 @@ extern void arm_expand_sync (enum machin
>>>>>  extern const char *arm_output_memory_barrier (rtx *);
>>>>>  extern const char *arm_output_sync_insn (rtx, rtx *);
>>>>>  extern unsigned int arm_sync_loop_insns (rtx , rtx *);
>>>>> +extern bool thumb2_check_ldrd_operands (rtx, rtx);
>>>>> +extern bool thumb2_legitimate_ldrd_p (rtx, rtx, rtx, rtx, bool);
>>>>> +extern bool thumb2_prefer_ldmstm (rtx, rtx, rtx, rtx, rtx, bool);
>>>>>
>>>>>  #if defined TREE_CODE
>>>>>  extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
>>>>> Index: ldmstm.md
>>>>> ===================================================================
>>>>> --- ldmstm.md   (revision 165492)
>>>>> +++ ldmstm.md   (working copy)
>>>>> @@ -852,7 +852,7 @@ (define_insn "*ldm2_ia"
>>>>>      (set (match_operand:SI 2 "arm_hard_register_operand" "")
>>>>>           (mem:SI (plus:SI (match_dup 3)
>>>>>                   (const_int 4))))])]
>>>>> -  "TARGET_32BIT && XVECLEN (operands[0], 0) == 2"
>>>>> +  "TARGET_ARM && XVECLEN (operands[0], 0) == 2"
>>>>>   "ldm%(ia%)\t%3, {%1, %2}"
>>>>>   [(set_attr "type" "load2")
>>>>>    (set_attr "predicable" "yes")])
>>>>> @@ -901,7 +901,7 @@ (define_insn "*stm2_ia"
>>>>>           (match_operand:SI 1 "arm_hard_register_operand" ""))
>>>>>      (set (mem:SI (plus:SI (match_dup 3) (const_int 4)))
>>>>>           (match_operand:SI 2 "arm_hard_register_operand" ""))])]
>>>>> -  "TARGET_32BIT && XVECLEN (operands[0], 0) == 2"
>>>>> +  "TARGET_ARM && XVECLEN (operands[0], 0) == 2"
>>>>>   "stm%(ia%)\t%3, {%1, %2}"
>>>>>   [(set_attr "type" "store2")
>>>>>    (set_attr "predicable" "yes")])
>>>>> @@ -1041,7 +1041,7 @@ (define_insn "*ldm2_db"
>>>>>      (set (match_operand:SI 2 "arm_hard_register_operand" "")
>>>>>           (mem:SI (plus:SI (match_dup 3)
>>>>>                   (const_int -4))))])]
>>>>> -  "TARGET_32BIT && XVECLEN (operands[0], 0) == 2"
>>>>> +  "TARGET_ARM && XVECLEN (operands[0], 0) == 2"
>>>>>   "ldm%(db%)\t%3, {%1, %2}"
>>>>>   [(set_attr "type" "load2")
>>>>>    (set_attr "predicable" "yes")])
>>>>> @@ -1067,7 +1067,7 @@ (define_insn "*stm2_db"
>>>>>           (match_operand:SI 1 "arm_hard_register_operand" ""))
>>>>>      (set (mem:SI (plus:SI (match_dup 3) (const_int -4)))
>>>>>           (match_operand:SI 2 "arm_hard_register_operand" ""))])]
>>>>> -  "TARGET_32BIT && XVECLEN (operands[0], 0) == 2"
>>>>> +  "TARGET_ARM && XVECLEN (operands[0], 0) == 2"
>>>>>   "stm%(db%)\t%3, {%1, %2}"
>>>>>   [(set_attr "type" "store2")
>>>>>    (set_attr "predicable" "yes")])
>>>>>
>>>>>
>>>>> Index: pr40457-3.c
>>>>> ===================================================================
>>>>> --- pr40457-3.c (revision 165492)
>>>>> +++ pr40457-3.c (working copy)
>>>>> @@ -5,6 +5,7 @@ void foo(int* p)
>>>>>  {
>>>>>   p[0] = 1;
>>>>>   p[1] = 0;
>>>>> +  p[2] = 2;
>>>>>  }
>>>>>
>>>>>  /* { dg-final { scan-assembler "stm" } } */
>>>>> Index: pr40457-1.c
>>>>> ===================================================================
>>>>> --- pr40457-1.c (revision 165492)
>>>>> +++ pr40457-1.c (working copy)
>>>>> @@ -1,9 +1,9 @@
>>>>> -/* { dg-options "-Os" }  */
>>>>> +/* { dg-options "-O2" }  */
>>>>>  /* { dg-do compile } */
>>>>>
>>>>>  int bar(int* p)
>>>>>  {
>>>>> -  int x = p[0] + p[1];
>>>>> +  int x = p[0] + p[1] + p[2];
>>>>>   return x;
>>>>>  }
>>>>>
>>>>> Index: pr40457-2.c
>>>>> ===================================================================
>>>>> --- pr40457-2.c (revision 165492)
>>>>> +++ pr40457-2.c (working copy)
>>>>> @@ -5,6 +5,7 @@ void foo(int* p)
>>>>>  {
>>>>>   p[0] = 1;
>>>>>   p[1] = 0;
>>>>> +  p[2] = 2;
>>>>>  }
>>>>>
>>>>>  /* { dg-final { scan-assembler "stm" } } */
>>>>> Index: pr45335.c
>>>>> ===================================================================
>>>>> --- pr45335.c   (revision 0)
>>>>> +++ pr45335.c   (revision 0)
>>>>> @@ -0,0 +1,22 @@
>>>>> +/* { dg-options "-mthumb -O2" } */
>>>>> +/* { dg-require-effective-target arm_thumb2_ok } */
>>>>> +/* { dg-final { scan-assembler "ldrd" } } */
>>>>> +/* { dg-final { scan-assembler "strd" } } */
>>>>> +
>>>>> +struct S
>>>>> +{
>>>>> +    void* p1;
>>>>> +    void* p2;
>>>>> +    void* p3;
>>>>> +    void* p4;
>>>>> +};
>>>>> +
>>>>> +extern printf(char*, ...);
>>>>> +
>>>>> +void foo1(struct S* fp, struct S* otherSaveArea)
>>>>> +{
>>>>> +    struct S* saveA = fp - 1;
>>>>> +    printf("StackSaveArea for fp %p [%p/%p]:\n", fp, saveA, otherSaveArea);
>>>>> +    printf("prevFrame=%p savedPc=%p meth=%p curPc=%p fp[0]=0x%08x\n",
>>>>> +        saveA->p1, saveA->p2, saveA->p3, saveA->p4, *(unsigned int*)fp);
>>>>> +}
>>>>>
>>>>
>>>
>>
>
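
For readers following the offset discussion: the test that thumb2_check_ldrd_operands and thumb2_legitimate_ldrd_p share in the quoted patch can be tried outside GCC. What follows is a minimal standalone sketch, not code from the patch. It assumes plain long offsets in place of rtx operands, and the names offset_in_window and ldrd_offsets_ok are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper (not in the patch): true if OFFSET is a multiple
   of 4 inside the widened window [-1020, 1024] that the quoted patch
   checks for each of the two offsets.  */
bool
offset_in_window (long offset)
{
  return offset >= -1020 && offset <= 1024 && (offset % 4) == 0;
}

/* Hypothetical helper (not in the patch): true if the two offsets are
   both inside the window and name adjacent words, in either order.
   Because the pair must be exactly 4 apart, an offset of 1024 is only
   accepted together with 1020, so the lower offset of an accepted pair
   never exceeds 1020.  */
bool
ldrd_offsets_ok (long offset1, long offset2)
{
  if (!offset_in_window (offset1) || !offset_in_window (offset2))
    return false;

  return offset1 + 4 == offset2 || offset2 + 4 == offset1;
}
```

Under these assumptions the pair (1020, 1024) is accepted, since the lower offset 1020 is the one an ldrd or strd would encode, while (1024, 1028) is rejected by the window check and (0, 8) is rejected because the words are not adjacent.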