From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH, ARM] PR47855 Compute attr length for thumb2 insns, 3/3 (issue4475042)
From: Richard Earnshaw
To: Guozhi Wei
Cc: reply@codereview.appspotmail.com, gcc-patches@gcc.gnu.org
In-Reply-To: <20110505065105.BBA6820A24@guozhiwei.sha.corp.google.com>
References: <20110505065105.BBA6820A24@guozhiwei.sha.corp.google.com>
Date: Thu, 05 May 2011 09:43:00 -0000
Message-Id: <1304588525.19537.67.camel@e102346-lin.cambridge.arm.com>
X-SW-Source: 2011-05/txt/msg00391.txt.bz2

On Thu, 2011-05-05 at 14:51 +0800, Guozhi Wei wrote:
> Hi
>
> This is the third part of the fixing for
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
>
> This patch contains the length computation/refinement for insn patterns
> "*thumb2_movsi_insn", "*thumb2_cbz" and "*thumb2_cbnz".
>
> At the same time this patch revealed two bugs.  The first is the maximum
> offset of cbz/cbnz: it should be 126, but it is 128 in patterns
> "*thumb2_cbz" and "*thumb2_cbnz".  The second is that only the 2-register
> form of shift instructions can be 16 bit, but the 3-register form is
> allowed in "*thumb2_shiftsi3_short" and the related peephole2.  The fix is
> also contained in this patch.
>
> The patch has been tested on arm qemu.
>
> thanks
> Carrot
>
>
> 2011-05-05  Guozhi Wei
>
> 	PR target/47855
> 	* config/arm/thumb2.md (thumb2_movsi_insn): Add length attribute.
> 	(thumb2_shiftsi3_short and peephole2): Remove 3-register case.
> 	(thumb2_cbz): Refine length computation.
> 	(thumb2_cbnz): Likewise.
>

Hmm, although these changes are all related to length calculations, they
are really three patches that are unrelated to each other.  It would be
easier to review this if they were kept separate.

1) thumb2_shiftsi3_short

This appears to be a straight bug: we are putting out a 32-bit instruction
while claiming it to be only 16 bits.  This part is OK.

2) thumb2_movsi_insn

There are two things here.

a) Thumb-2 has a 16-bit move instruction for all core register-to-register
transfers, so the separation of alternatives 1 and 2 is unnecessary -- just
code these as "rk".

b) The ldm form does not support unaligned memory accesses.  I'm aware that
work is being done to add unaligned support to GCC for ARM, so I need to
find out whether this patch will interfere with those changes.  I'll try to
find out what the situation is here and get back to you.

3) thumb2_cbz and thumb2_cbnz

The range calculations look wrong here.  Remember that the 'pc', as far as
GCC is concerned, is the address of the start of the insn.  So for a
backwards branch you need to account for all the bytes in the insn pattern
that occur before the branch instruction itself, and secondly you also have
to remember that the 'pc' the CPU uses is the address of the branch
instruction plus 4.  All these conspire to reduce the backwards range of a
short branch to several bytes less than the 256 that you currently have
coded.
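
To make the arithmetic concrete, here is a rough, illustrative sketch (not
the exact fix) for the narrow-beq fallback of the first alternative,
assuming the cmp in front of the branch is the 16-bit encoding: the cmp
occupies the first two bytes of the pattern, so the beq sits at GCC's pc +
2 and the PC the CPU sees for it is GCC's pc + 6.  A hardware branch range
of [-256, 254] then corresponds roughly to the following, measured from
GCC's pc:

  ;; Illustrative numbers only -- the exact bounds still need checking,
  ;; and the alternative with a 32-bit cmp loses a further 2 bytes.
  (and (ge (minus (match_dup 1) (pc)) (const_int -250))
       (le (minus (match_dup 1) (pc)) (const_int 254)))

(The forward bound of 254 is simply left conservative here.)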
R.

> Index: config/arm/thumb2.md
> ===================================================================
> --- config/arm/thumb2.md	(revision 173350)
> +++ config/arm/thumb2.md	(working copy)
> @@ -165,25 +165,49 @@
>  ;; regs.  The high register alternatives are not taken into account when
>  ;; choosing register preferences in order to reflect their expense.
>  (define_insn "*thumb2_movsi_insn"
> -  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*hk,m,*m")
> -	(match_operand:SI 1 "general_operand"	     "rk ,I,K,j,mi,*mi,l,*hk"))]
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=l,rk,r,r,r,l ,*rk,Uu,*m")
> +	(match_operand:SI 1 "general_operand"	     "l ,rk,I,K,j,Uu,*mi,l ,*rk"))]
>    "TARGET_THUMB2 && ! TARGET_IWMMXT
>     && !(TARGET_HARD_FLOAT && TARGET_VFP)
>     && (   register_operand (operands[0], SImode)
>         || register_operand (operands[1], SImode))"
> -  "@
> -   mov%?\\t%0, %1
> -   mov%?\\t%0, %1
> -   mvn%?\\t%0, #%B1
> -   movw%?\\t%0, %1
> -   ldr%?\\t%0, %1
> -   ldr%?\\t%0, %1
> -   str%?\\t%1, %0
> -   str%?\\t%1, %0"
> -  [(set_attr "type" "*,*,*,*,load1,load1,store1,store1")
> +  "*
> +  switch (which_alternative)
> +    {
> +    case 0: return \"mov%?\\t%0, %1\";
> +    case 1: return \"mov%?\\t%0, %1\";
> +    case 2: return \"mov%?\\t%0, %1\";
> +    case 3: return \"mvn%?\\t%0, #%B1\";
> +    case 4: return \"movw%?\\t%0, %1\";
> +
> +    case 5:
> +      if (GET_CODE (XEXP (operands[1], 0)) == POST_INC)
> +	{
> +	  operands[1] = XEXP (XEXP (operands[1], 0), 0);
> +	  return \"ldm%(ia%)\t%1!, {%0}\";
> +	}
> +      else
> +	return \"ldr%?\\t%0, %1\";
> +
> +    case 6: return \"ldr%?\\t%0, %1\";
> +
> +    case 7:
> +      if (GET_CODE (XEXP (operands[0], 0)) == POST_INC)
> +	{
> +	  operands[0] = XEXP (XEXP (operands[0], 0), 0);
> +	  return \"stm%(ia%)\t%0!, {%1}\";
> +	}
> +      else
> +	return \"str%?\\t%1, %0\";
> +
> +    case 8: return \"str%?\\t%1, %0\";
> +    default: gcc_unreachable ();
> +    }"
> +  [(set_attr "type" "*,*,*,*,*,load1,load1,store1,store1")
>     (set_attr "predicable" "yes")
> -   (set_attr "pool_range" "*,*,*,*,1020,4096,*,*")
> -   (set_attr "neg_pool_range" "*,*,*,*,0,0,*,*")]
> +   (set_attr "length" "2,4,4,4,4,2,4,2,4")
> +   (set_attr "pool_range" "*,*,*,*,*,1020,4096,*,*")
> +   (set_attr "neg_pool_range" "*,*,*,*,*,0,0,*,*")]
>  )
>  
>  (define_insn "tls_load_dot_plus_four"
> @@ -685,7 +709,8 @@
>    "TARGET_THUMB2
>     && peep2_regno_dead_p(0, CC_REGNUM)
>     && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
> -       || REG_P(operands[2]))"
> +       || REG_P(operands[2]))
> +   && (CONSTANT_P (operands[2]) || (operands[0] == operands[1]))"
>   [(parallel
>     [(set (match_dup 0)
> 	   (match_op_dup 3
> @@ -696,10 +721,10 @@
>  )
>  
>  (define_insn "*thumb2_shiftsi3_short"
> -  [(set (match_operand:SI 0 "low_register_operand" "=l")
> +  [(set (match_operand:SI 0 "low_register_operand" "=l,l")
> 	(match_operator:SI 3 "shift_operator"
> -	 [(match_operand:SI 1 "low_register_operand" "l")
> -	  (match_operand:SI 2 "low_reg_or_int_operand" "lM")]))
> +	 [(match_operand:SI 1 "low_register_operand" "0,l")
> +	  (match_operand:SI 2 "low_reg_or_int_operand" "l,M")]))
>    (clobber (reg:CC CC_REGNUM))]
>   "TARGET_THUMB2 && reload_completed
>    && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
> @@ -707,7 +732,7 @@
>    "* return arm_output_shift(operands, 2);"
>    [(set_attr "predicable" "yes")
>     (set_attr "shift" "1")
> -   (set_attr "length" "2")
> +   (set_attr "length" "2,2")
>     (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
> 		      (const_string "alu_shift")
> 		      (const_string "alu_shift_reg")))]
> @@ -965,13 +990,23 @@
>      else
>        return \"cmp\\t%0, #0\;beq\\t%l1\";
>    "
> -  [(set (attr "length")
> -	(if_then_else
> -	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
> -		 (le (minus (match_dup 1) (pc)) (const_int 128))
> -		 (eq (symbol_ref ("which_alternative")) (const_int 0)))
> -	    (const_int 2)
> -	    (const_int 8)))]
> +  [(set (attr "length")
> +	(if_then_else
> +	    (eq (symbol_ref ("which_alternative")) (const_int 0))
> +	    (if_then_else
> +		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
> +		     (le (minus (match_dup 1) (pc)) (const_int 126)))
> +		(const_int 2)
> +		(if_then_else
> +		    (and (ge (minus (match_dup 1) (pc)) (const_int -256))
> +			 (le (minus (match_dup 1) (pc)) (const_int 254)))
> +		    (const_int 4)
> +		    (const_int 6)))
> +	    (if_then_else
> +		(and (ge (minus (match_dup 1) (pc)) (const_int -256))
> +		     (le (minus (match_dup 1) (pc)) (const_int 254)))
> +		(const_int 6)
> +		(const_int 8))))]
>  )
>  
>  (define_insn "*thumb2_cbnz"
> @@ -988,13 +1023,23 @@
>      else
>        return \"cmp\\t%0, #0\;bne\\t%l1\";
>    "
> -  [(set (attr "length")
> -	(if_then_else
> -	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
> -		 (le (minus (match_dup 1) (pc)) (const_int 128))
> -		 (eq (symbol_ref ("which_alternative")) (const_int 0)))
> -	    (const_int 2)
> -	    (const_int 8)))]
> +  [(set (attr "length")
> +	(if_then_else
> +	    (eq (symbol_ref ("which_alternative")) (const_int 0))
> +	    (if_then_else
> +		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
> +		     (le (minus (match_dup 1) (pc)) (const_int 126)))
> +		(const_int 2)
> +		(if_then_else
> +		    (and (ge (minus (match_dup 1) (pc)) (const_int -256))
> +			 (le (minus (match_dup 1) (pc)) (const_int 254)))
> +		    (const_int 4)
> +		    (const_int 6)))
> +	    (if_then_else
> +		(and (ge (minus (match_dup 1) (pc)) (const_int -256))
> +		     (le (minus (match_dup 1) (pc)) (const_int 254)))
> +		(const_int 6)
> +		(const_int 8))))]
>  )
>  
>  ;; 16-bit complement
>
> --
> This patch is available for review at http://codereview.appspot.com/4475042