* Autoincrement examples
From: Bernd Schmidt @ 1999-09-22 3:09 UTC
To: gcc

I'm playing with a patch to improve the generation of auto-increment
addressing modes, e.g. by generating POST_MODIFY and PRE_MODIFY rtxs for
targets where this is possible.

If someone has (preferably small) example code (for any target) which
shows how the compiler generates auto-increments either very well or very
badly, I'd like to get a copy of these test cases so I can make sure I'm not
pessimizing anything.

Please don't reply to the list; there's no reason to spam it with all these
test cases.

Bernd
* Re: Autoincrement examples
From: Michael Hayes @ 1999-09-22 4:32 UTC
To: Bernd Schmidt; +Cc: gcc

Bernd Schmidt writes:
> I'm playing with a patch to improve the generation of auto-increment
> addressing modes, e.g. by generating POST_MODIFY and PRE_MODIFY rtxs for
> targets where this is possible.

I am interested in what your patch does.

I've also been tidying up a patch that I'm about to submit for a
separate autoincrement pass that is run as part of flow optimization.
This collects lists of register references within a basic block and
uses these lists to look for a sequence of memory references to merge
with an increment insn. I found that this approach worked better than
scanning def-use chains.

This pass also generates {PRE,POST}_MODIFY rtxs (well, it did
until the recent reload changes broke this aspect).

I also have patches for the rest of the gcc infrastructure to handle
{PRE,POST}_MODIFY if you are interested.

> If someone has (preferably small) example code (for any target) which
> shows how the compiler generates auto-increments either very well or very
> badly, I'd like to get a copy of these test cases so I can make sure I'm not
> pessimizing anything.

Most of the problems I've seen with autoincrement generation are due to
poor giv combination during strength reduction. I'll post some of my
testcases separately...

Michael.
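
The idea of scanning the register references of a basic block and merging a memory reference with a later increment can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_insn`, `toy_kind` and `toy_merge_postinc` are hypothetical names invented here, not GCC's RTL or the code in the patch, and the model ignores liveness, access size and target constraints.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified insn kinds; this is not GCC's RTL. */
enum toy_kind {
    TOY_MEMREF,          /* *REG                     */
    TOY_ADD,             /* REG = REG + INC          */
    TOY_MEMREF_POSTINC,  /* *REG, then REG += INC    */
    TOY_DELETED          /* insn removed by a merge  */
};

struct toy_insn {
    enum toy_kind kind;
    int reg;  /* register used as the address (MEMREF) or updated (ADD) */
    int inc;  /* increment amount for ADD and MEMREF_POSTINC */
};

/* Scan one basic block.  For each memory reference through REG, look at
   the next insn that references REG; if it is REG = REG + INC, fold the
   increment into the memory reference and delete the add.
   Returns the number of merges performed.  */
size_t toy_merge_postinc(struct toy_insn *bb, size_t n)
{
    size_t merged = 0;

    for (size_t i = 0; i < n; i++) {
        if (bb[i].kind != TOY_MEMREF)
            continue;

        for (size_t j = i + 1; j < n; j++) {
            /* Skip insns that do not reference our register. */
            if (bb[j].kind == TOY_DELETED || bb[j].reg != bb[i].reg)
                continue;

            if (bb[j].kind == TOY_ADD) {
                bb[i].kind = TOY_MEMREF_POSTINC;
                bb[i].inc = bb[j].inc;
                bb[j].kind = TOY_DELETED;
                merged++;
            }
            /* The first later reference to REG decides either way. */
            break;
        }
    }
    return merged;
}
```

For a block holding a memory reference through register 46, an add of 4 to register 46, and an unrelated memory reference, the first two insns collapse into a single post-increment reference and the third is left alone.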
* Re: Autoincrement examples
From: Bernd Schmidt @ 1999-09-22 4:39 UTC
To: Michael Hayes; +Cc: gcc

On Wed, 22 Sep 1999, Michael Hayes wrote:
> Bernd Schmidt writes:
> > I'm playing with a patch to improve the generation of auto-increment
> > addressing modes, e.g. by generating POST_MODIFY and PRE_MODIFY rtxs for
> > targets where this is possible.
>
> I am interested in what your patch does.

Currently, it only moves the generation of auto-increments after reload.
This is so we don't have to worry about how to reload other forms like
P{RE,OST}_MODIFY (or autoincs of any form).

The thing that worries me at the moment is that I had to delete a ton of
code in regmove, and I want to make sure we're not losing any optimizations.

> I've also been tidying up a patch that I'm about to submit for a
> separate autoincrement pass that is run as part of flow optimization.
> This collects lists of register references within a basic block and
> uses these lists to look for a sequence of memory references to merge
> with an increment insn. I found that this approach worked better than
> scanning def-use chains.
>
> This pass also generates {PRE,POST}_MODIFY rtxs as well (well it did
> until the recent reload changes broke this aspect).
>
> I also have patches for the rest of the gcc infrastructure to handle
> {PRE,POST}_MODIFY if you are interested.

I am.

Bernd
* Re: Autoincrement examples 1999-09-22 4:39 ` Bernd Schmidt @ 1999-09-22 4:57 ` Michael Hayes 1999-09-30 18:02 ` Michael Hayes 1999-09-22 5:00 ` Michael Hayes 1999-09-30 18:02 ` Bernd Schmidt 2 siblings, 1 reply; 94+ messages in thread From: Michael Hayes @ 1999-09-22 4:57 UTC (permalink / raw) To: Bernd Schmidt; +Cc: Michael Hayes, gcc Bernd Schmidt writes: > > I also have patches for the rest of the gcc infrastructure to handle > > {PRE,POST}_MODIFY if you are interested. > > I am. Here goes... begin 644 postmod.patch.gz M'XL("(3#Z#<``W!O<W1M;V0N<&%T8V@`Y#UI<]M&LI_I7S&J5-FB!4D`;\IQ MMAB)COFBJT0ZZVPJA8+`H8@U"?`!H$1%J__^NN?`#"Y2LA5Y[:=*)!P],SW= M/3U]#3SPQW1U0-R([KDOWG[YSXN+PR&9>#-Z0/;IE1OMNTX8!_Z^>QWATX@_ MO'+=?3:D<?TBI''HT6O/OR(A_(F\P"?67K?^8NQ-)F37);MULKL@NR$^%(B^ M?OV:7U6L;K>[;W;W:R8QNP?-[D&C7F&M=W=WLR`U8ED'S=J!6<,.]!_B.G[@ MVU,GFI+ME4'FP9A6V2BUFF4:\*M)$([U*A^U"=SMOB#LQW4B2LXO^O;@]/`@ M]>QL.+*/^@4/&>1.NO7)V='@W>_IQP@KG^M]I+L\/#33][WCXX/</"?!;&R' M\8K-TO,CG\^RV>DVC&;7-/DL":E4>%?_^0_9?M_[K6\C<C"+B_Y)_W3$``#B MY4M"5XX;V[/@JD:V=\G@=/1;[YALNX$?Q;837EG5*OGI+3&KA7T*THA.MQ[= M:95W>QE2Y],;N,0[/L;^:W(8S!?+F))X2HD+_"3+B(Y)',#-?!%$\@7TZ?AQ MM$?(NR#$D>>+&348HX$J+:1*2S+ZVZ`*:Z+(X/@X<^+XMV0[FCNS6369-;GQ MXBE9A)3`U($F,30M_8$UX4UN]\CK?6T`P#HA!)=1^V@P/"=%($J.&<S?R+RL MU/M!3.TYG=LWH1?'U"?;SG@<<M%O=[MU`WYUI>C?O1`XG"^C*6HE1IO%`B]Q M5!C2_01KY]J9>6,GIA'Y]S**M5>+P/-C&@)20"KLRYN0[>U?^B/[\.RH+\8F M;]\2(3U(H9*WH"(42TK`A+BL>PW=5%_(;D"2%-S'_L=S#FW@>L(6%_U?=%BX M/3TK!!R.>H>_`EM!$OL7-L)].)'#W+$%A&0U.J9:0-\G;3=VDRR.=1UQB&?G MTX`;`'2U")_;`N!CEIH`5KM68`/`4XDL+E]^66`&-"J\!Y3#+%!B"#1RJF(> M7%/[\M9>>-0%\=N.`X-,PF!ND!GU#0)R>27V3*O>KAM6O5.7BH,0$%EG#Z') M6];HC7KJ+&//=^TX$(1_2S2IC0,[*YII2<D"@$R6"%T*4LCG!@CHK)K'%/$O MPA6?;\`V#X+XDD)\,[#K5U0!L$1>X0_,GGFN%]MR%H"_^:;D/>PH["U*B62H M9M=]LPQ]2&>)5GKHR$)#?<NR\C#@390IZ%NCS1>*8LXUF5$GS*JDK#*JU1KH MDS3-C#)BG<;!-R.WZP@H"92\A;V"AA$5"*R=2&X37Z,%T:30^Z_RT8+))*(Q M#`.TUY"`._5,^(>-!O(BJT>^-5[\W3KD:['Y4?-Z`ID0%M;55XBQ7*T/LC2+ 
M#*QF3:**BN4J%V:Q&@?-^D'3JK#V*/!9F,2\:N:46;`((WOINU/'OZ)C>Y'$ M(4";73O>S%X(XZK6[!A6K55781?YJ/G5PBX$:!<O0Y]+9S86(]@<4C>X>FX^ MBT%+&=THBJ8UZ@FR2')QG3>EFV:%]8!,R`&MXS5;'+%S.:-L9=$H0H9'@*`; M+WB0S2"W@N&=EFG`KY;:P`BYEV3&Y7?+X@#IA9YZIML"J7>:%Y=^J/G%.<Z" MDSH"MU.L<,"?CLF4AI1X$7&G001+W(F89SIW5MY\.9>@\=2)N<"**:7BA,\P MI>(6*1\TW2;M>3XQ)7+F#`9M0L?S[6!!0\<?1U(DA"53[U@&_&II`==ZIP:/ MNET5AT/J*=4=+)C6/NF?)`$U\(@U`.X5!XO$)Y;$92&[U#:0!Y624@7U@)W? M>*C=K3?L5H2O=@C>:1KAU=VK`P90BNG.PS$5#'H<`O=/A8`F(C)@F6"0CN`E M*O#*G3E1]/Q:4(Y;J@A;9H$B;)DZREP7RML"=8@['_3#U6$>;HU&1.T9CFUH M9;-F:,W[MC.+(X/@>H`_[)>(+YBPW]4M?0ODCVI-/2!=PHO_'^*X%@.2QJ#_ M"C?R'[P)2.F$7/1[QW;O8C!Z?](?#;@F`$UWXL3NE`7+)[/`B5%\QL$2MK`D MVFR0RV5,`G]V"SJ\C,5ROP-61\S&8?PV2.0Z,Y%9LMJ-IF&UI<O&^-MNMN%1 MNZ[O&'S/X#\Z>_D/H'S$\4,%[,T700A(NJBNB4,6$5V.`Q13+XIIR#0RJFVP MLT,ZIWY,QXQ/04C&-'D$:'K8A1>3FV`Y&Y/8^03]WP2$KD!U,WLM8NV`F0!$ M48DO%_"<(7$3!D"UQ<QQ*8_3EEE5:XTP4DQ*+@\K%`>#_-P;]C&@:1\>]X9# M@]3(:T'A-TDO*&\(<B[;)CU85?R?"]&:L3+P!A,R+EB#TZ/^QTT(Z/*8ID/. M9EUKQV8E#9S'@#E,P#FF4L"V@C]Z\M+JF&B!6*9N5+$5-J?S(+S5S3)NC5U1 M'].#=N(E;I_S%]`S3[=I)@+3.<S:0"(7):\>.R*:-(\=D1/M,P84/O7F\;@Q M5ZL#*>N9I?ELI`0IZQT].ST?/^J#B9K8"J!FQ\]O*?!12^V$3I'#U*DK=+FN MYS<E-D)'NDQ9J,1":.<6]<3ST3Y81LQKXDW)]@+6.9@&RSC91]CX[4;#:#>Z M:O]@#YI68AW@5JKMI9ZO)8XJDJ4:0#@;_^']N2?@$G^"V1DI([D`$`,K.P\! M%/&BC;`JM+1Q?,V8WP2'83Y]]B>]T>'[_E"J^:0!WV"@8?4+>=2T@"6=I*X" MARVE>#EG-K,DRXMRB,2O7`.P@4R2/@I][AH@TN(94`([,W&(+4*=<';KSH++ M2QI*QX]M6`FPUH$TC^W(^XO^`8/%54[3/[%'"WL<GO2.CW'G'0PQJ<FVW_ZP MRE=!LVZT6[5D%3P;R?,B72#W7\R7]0#*E__F>)==9XME-$V65[*X/'\6N.R: M_86^V3KK-+M&I]51A@YN:&"0;N$T`#.<DKS.$0[<#WR.&QC]WZ4SP]F)`:LJ M4RZWP;0'HQ,UFY>J9$+,Q;!RK3$G1FZDB!(;Y40B:A#='I5<A([>/!@S)6$; M$=-T`).&,L201&G,&)<RJ-U+,ZK3JAN==D,94=\4FQX,JZW"_T*N/A14F\53 MR4!VC<L8H#U'OUOD`T1(N&8V:P;\ZJ2\ETOJ.DM1`(8AE#ELPB%S286SFOBV MW$</P)/=DXW/N(<:P*^03&%@`UQ]5OFF6C%,:`0^\=CSG1"\?+14P3Y\H?B! MG:1;>3P""A-?8'2`3XN[O]*\+@K>;@A'Y\F3\G]O!7$Q0@O>^2*D"4(1<3!" 
MRRNIOHQV+`0+;`!>M!HI]^?[YD7N66&<_+^"18];5BT369G*M"2$U/"3=/1\ MG8RRP0#9AQD`)V)^'B-&38:.HF!.X=4,D(^(3WGMYC@@T8*ZGC.3G<18!!B1 MFRD5O4V=Q8+Z49I92DW=EF=Z;W5EN)%'*ZFO;KG&^P>ID0.5\XAFP<V!$/Q6 M"ZB5"<E]7]32$T9%(`^4^P?1-"NI8RHK>K=70CZM%J@:JZW%O>&N"8^Z#<VF M%VC`XF&1[WL]*)HFPV?684)GE3OVNW@+YC66EM@NCS\,5<0;C)1T'2:"Z/NC M;(OO4LT.STZ'H][I*!NN5`U$Q)*WN4NN"%)B[Q)#AN1M9O0W&:@H=D+<SG=E M#7GI2-F6U!\3[/_1#2-GPA`35:H2U]+Z5+V]SFGY])Y=W2<<+S&6$LXRNC)+ MYHZ+#[XI(%2)KX\^2(3V$18JA)3%MM$5&=LS>DUGD<$EM]NI&_"KHR2W;IH- MHV[6Z@_*V!2E;`I2)FSA:7D3E3C9(BJ"C3A3]-!2$`6I%=;=`Q,\>GZE4LFA M?/]U4%Z?$LKFA%B:A\4QI1(#E8H:2"""FMP/8KU^0=:;IR;[&\LG\;E.@M"5 ML:#2B!&3HB1V:LG8*6:'`C^F*Y8L)TVKTS2:5E?+"S5K-=-HULU$BHH+4CXS MSW+'$B#Q"JAA)DMB)16'>&/I;ZRJS/!E^6Q5T7U#;9CVX.2;D\'IAZ%0>0IW MWA<>^UB&(>RE,]A!;B@W<*+E`G-:Y`Y0-W!:]U*'\TJ"I1M'0H7>L*H$AS`M MF.2\@)MWL,L:L-/<RPP7N;R5)TABM,1$!Y@S\[E!FC0/0@1VU)D9T$-+%(C1 M=!D9V$M$97-9T##'0PML7Y0),$8HI.^6)"-0A`DP;^M<PARWJQHI+L#%]9A] M"!:O0F<;\)&H5*'W8QJ_$J<DT(;F>4"!3]*(L5YDY^846GISNK>W1P;<C('M MX`HK.V:P0TM28'9IEQ_YR<P_1O/C!FG'4H.PBKA"Q`BZS#AF$6#GC#RP>'%2 M0$"TE]0A),ZV!=8`\L6F2);*W#&2R8P=5_5;;$.QSWZUWYU=V#P?=\Z:J<,0 M2;-DJUV_'E.MX'^#O$P_XAE`UE.P\)=SD(+;17H_X&?K]%62GXF98'0G$5-" M"W2!&S]0>V:ZY1O91/X%D>FM3?3B&4=4:Y<JE;TEI5/NG'S(G]Z2=X,+4"#G MP_Z'H[,D5,?FC0$8>F4S1MJRHS]8PS]9Z$8S:)A4D^U"9-]S:77R.&LR`F:P M(W).9!:X3NP%ONH>),=SI]J\!"MAKF-8.B[HD20!K4]1("^@2W!G/&'*+Z98 MMSUW/K$S9(!&QL5C[:OIF@<VQ7=>&,4B_R/67FHFL'`%#GN\)3;^)WH"_JN8 M'=KK'1U=V*/?S[$\%42LRNJNC,1)A%4HO`75'H^X45"![B<V)--6Q)G$3*<Y M8URE7FQ`1^AX0`>.&R]AY=_*^@O>X8TJ&H@2'58I6C=Z(`@HA2:M7$)PRRBD MK1<BUY)\EUI!*<#"U:23]WP9<S<(WGIC%"6?WBB)W\53/X`@$%K-8*6B6/K& MM0)$U#3P#O'3,OT%;7!+TUOQ>]5.H)K10;E)A($K'+Y$4!S@CQ.)X@K<HAS/ M!Q"U5/;T=<N-X<PJ!BX!62]I**7[I[1T2]U2`*H+,B#X3]Q*^5:4W5E1&'$; MR2@1H9.1*E(QLT*-<Z%?TJ2NYM9<3ENNE*X$P%1:(*44#=UVT;E'4AH<`5^F M5G`*-%-4DGJ7EI'TG8G_I:"%:%_TC\]Z1_;9Z'W_@O._`D)-BZ8#^)Q^P.3( MZ*-!6'Z#W9Z/+IX?225;%?%'8ITR5Q28E"S-JN./UE2\%%9IRS*80C.:ZVX0 M)+)]%3@S63TN<J[,>&-F=,MLMXV6V:DI,[H%1K4!OY*"W$JV[*:2PTJ!L`<[ 
M.H@RI"LYXUJ/43#KO#*F$V<YB\4KX9$43A&+>IQYL(0-D)?'NY0[%:35-)M& MJVE9>CYY\\'?2DE(0O1,L/Z)82J%Q1X._M77-?L*-=>68+7P/R9S#"(@#(@K M+O.3WHA'5D7,$YPB\)DUF./^Z2^C]P*&[()[1CRFF>#O[B[/WK9:X.NT6O66 MGK]]YBD2@J*^]FQQ<NHE4SY<'$O:2G1^Z:E>/9"4`G]$)&E#.T:+--`#XTU; M&#AAA4)H=S\L`+2E+P&/_$A,\@^RZY$#XO&7]YS:?Z,\I6J,K*]39&2M.^'< M:A6=<&ZU-)1?)X5&5FFE$?:C2HVLPEJC3D[7T)DW]WPGIJI*%;^1P)T@]6V0 M6K?1-&I=O4Y5/JKEC^*4U346'L41*C7]<$/40L@N%P:Z$#9,,I<W!)[]2%ZF M'OYQ^N'$[A\/3@:GO9^/V<8Y_!,A=W9XCG:"7>W^%`?,R'N;"H(([7+W^?1K MMH%8K8;4V3]0'YC^0O-)"HZ!9!-4JJ*#(*;BA,7.6X)U%F_DHF8[=!'@K@0D M?#?70DCHJ(`#`'9G&"ROIFB)@K5+0\\E2Y9S8V$-]%<8,[BYQZ6@V86)M9-Z ML[]Y8KD>BW)T:CLHH\!.IL?MXB,RA9D_6:->I,97N61``JI)4U9_K_0T0-*@ M3"UGJI:+)EFDF5>:6GXB`5B_&,"$L5G.,ATRYPNBWF@VC'JC54\5Q64V^2%< M#2\.P62?C6V8FJ)L50_C,W^#R,^%V/Q<H1SUY7EO-.I?G/)[(`#XA_9E,.;Y M*2E:,/W!A!]L\STVRXGCH=,9Q4'($['H5V([/!=PX]SN:7OH`,-O?A#.G1D2 M;NI<TUQ?2W!#F;.DM1N#5P[PE]Y?3A@FD0$9P(C>"%V.&\:-!\R9>"M$92YS M?:R?]/Q`%<H))BDQ9<W@DS7P+%G1:%G`F+:9*IW[:HQAK!GFF4"OJ2\SL%E2 MT_$>KX][-%\J3\.2)V5*=I%Q!&PWHG;DS1<S;W*KG:)36TZW5K,,^*6=H8.[ M#CRJJQ0J]UE>F:\.9$+#4I<U=5E7EPV1\>!W3?6BI2[;ZK*C+KNIEG/UXD=U M^9.Z_$U=!N+<4)(3TW)-1.NSK]Z\4Y>_J,OW*10B]<)3EWX*9J!>_(^Z_%5= M'K_2COU&P3)TG_V$MQJWW,IL%!F9#1UE+ESRML#*K%58/]S(S,,E5F8W)[-S M)_R$>L*6[;BE!'=H)-EC&K&B3HY#VZP9;;.M5;3C`RM==538928B&[&XO:E5 MT7%'))<G>\#A[YW/&79SH\S^+YJ;G]7<*FDN)YT.Q?RK?W%F]S^.+GJ'HX-' M4C5AF4[;M4VSN,D%$\\<WYD]^WJ1PY:?E2_Z*$*CIB',5HN\2Q:!U256^\`T M#TQTR1K\TPAYL'5K!</8-IU,J!NSLS4K>5*PWC#@5U<_*<@>-:S"CR&4?_<@ M^S#G@3WN&Y3\(Y/:@P^GP_/^H?W;V7%O-#CNLW?[K_D[D+=S>_#N@(Q8T#[" M)`?[N`VYI3'?1(5L3!UOXNQ&[I0^^\F@U-#E*M4J_#295<N@CMQ+/5%B8)%: M[:#6.6BT*[P_9&L)K!09R\R+#,+:*&.W?U&[EOFN9[U5:QGU5MW2*E+$HVY: MIZ;[L50_&S7H(RH-]E^+&@/\O<.J"I+CH5D4:AGMDZ2`-@%;FX!3DRO5E%JX MEO\DA3KX<Z^IL;WILZNPO6FY:-8[1:)9[PA4A?+:FZ:^Z5(_:'8.ZOCI7&PO M-)<.DXB@E7?]5C$-?9Y4)F#HAT1\Y8%_=Q,);?O.G/(=OE4S.F9-NGX@$18K M28B(0X:_G_Q\AB=#WL%"B,@EI;QD8>9=ANB'3I:^RXQWSR=T[L6V>&%CZ:0L MT?D!&.?Y5.O,_C#L'VU?C#Y6P=G'/[L_X==$J]P'Q@/4O(4#^Y@;!N@#S(+@ 
M$XLP88H%PPPXD\C@S,?CWQ'Z$UA&00`=:#:%#B)1R0`HWX(1?TWQBZ1>%$O, MP$5@I[69W@O\V//1X4!=.(/6N[OP[)JNB.L2_,00SK.Q9XFJWUD@'>\M/,%. MMODLQT6'-JOHL.7?)T=)B]Z+#6(]@.H!?31)Z'>#TR-;T,D^/1OU$Y__"@@N M4L97-E(0Z"\</@%OL)0@CS#_P!RB!_9KZN$FSD58D)X+OB9R@97P\"^2,-8P M><)#_1$6)&"9SQ*XK&IWA9,'2TE4[`XFR')6A>0Z/G&N`V\,LDW=3[C<<I*1 ML)A\1RQ6:ZGW873&V,`/%NJ$EY]3*#B+K5B9?YEC8*X?_0O(N7Z2EQOZ2<TY MUU'J[0-Z*D<I]3;7$Y>*(19Y.R'(8PSV'8L]C`-6:,B"$U@&/:,Q*RK?Q_,, M2CSWDV\S<.'T1-T;R@]8W9CV#'FQ''Z]E%>38Q0Z(F!+>J['*NI8Q1J-*%=R M$>^)E7R1980Z+UZB#H1UH98%"P.RR)^*^C(%7D\B_U];@>^4K17]2]/%JT5` MP$(N6PT;^M!`L!,,R:;$@BV;Y+BSD(B=-3AO7)X*H'@P+.;71WKF[:UD6AO5 M4O'&HJ:55SO?O[[7A*>`;)+;9:\PL?(@#;[S.`G9^7P)D8*/.C6%SLX3FQ([ MPI1X8+]F9L4\9$>K/-&.5GFJ':WR9#M:9?..ME/0DZ8JLYHI^SY#\6QO:=V= MZRSS#PVL[RNMH=<A!J_UO@IFF=HQUJ"5Z8EA]=U8`%_D^W7J=3`=M"@6G@;O MFBI_+"@*V[T]')U=9%;&-M;I5`'GPG]ZHXAQV!$K:ULCHJ4PF>$*>B@0.]5; MJ1B7@>2G][#A!*76S[`8Z+.GJ+BS<<BGF6268.DE6`)1.IAHO7:L4M51!E). MS/7#Y>B4'ZX(Y#,GEQ>$TN$^:WI$J;N>ZX*-%801-R1ZI[_@=OGN3!I/6'>0 MO+@;CGH7(Z-_>G3/;0U9.896'0=A$.QEUE50?;/=G4-N#TZ'IU7R\9`%!O&& MI2`4;/Z<EM!F6`,$^@__)8@Q=>)II5(YOS@;G3&;`XS$4`;_XOE>#!;.\X;_ M_F]02W'?G&6*[>8L4[AC01Z%LA%S%Z8*1H96)H96AJ"Y"Z`)H-(971%B#!#S M_BQ'!6<%Q'I[^%:.O/R\JE10BQ+6-`?;;F)D8*9C8F2(-"H-%3*&UP4.P#HR MMP)[41^<F9N9`ZQJ0,D*LEL+V$I/*0:W\U$:]L"DZ)`&V4"%JX0#VX-3%MHP M!K?H(6UCZ+H#6#\#NCNM&'P"`6R_%,(10"/`6Y^0MD^E9!:#U\*`JE<]2/Y$ M=2-&F8CL1`Q)$EQ8E$H=!V)D?1QA")&D0A"B;=:#'&T)WPN!/PPQ78@N1WD( M$G:?`MQ]\)5=H*T]KL'!\0&PQ(Y%2D'#H2RQJ+JB%M25PY'%#&'.#0J)4(`J 5!XDG(F^PRTS.X.("`(321;KK;P`` ` end ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples
From: Michael Hayes @ 1999-09-22 5:00 UTC
To: Bernd Schmidt; +Cc: Michael Hayes, gcc

Bernd Schmidt writes:
> Currently, it only moves the generation of auto-increments after reload.
> This is so we don't have to worry about how to reload other forms like
> P{RE,OST}_MODIFY (or autoincs of any form).

Interesting, I'll have to think about this. I'll run some tests with
my code to see what the effect of deferring autoincs until after reload
is. I've got a gut feeling that there was a very good reason that
I can't remember.

Michael.
* Re: Autoincrement examples
From: Michael Hayes @ 1999-09-22 23:23 UTC
To: Michael Hayes; +Cc: Bernd Schmidt, gcc

Michael Hayes writes:
> Interesting, I'll have to think about this. I'll run some tests with
> my code to see what effect deferring autoincs until after reload is
> like. I've got a gut feeling that there was a very good reason that
> I can't remember.

Following up to my own post, there are two reasons why I prefer
autoinc generation before reload.

o more compact addressing modes can be generated

o the combiner can do a better job. For example, given the following
  insns:

    (set (reg:QI 43) (mem/s:QI (reg:QI 46) 2))

    (set (reg:QI 46) (plus:QI (reg:QI 46) (const_int 4 [0x4])))

    (set (reg/v:QI 42) (plus:QI (reg/v:QI 42) (reg:QI 43)))

  the combiner will not combine the first and the third insns unless the
  second insn is combined with the first insn using autoincrement
  addressing.

Michael.
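
For readers who want a source-level picture of that three-insn sequence, here is a small sketch of the kind of C loop that could plausibly give rise to it. The function names and the use of `int` are assumptions made for illustration; the RTL a given compiler version and target actually produce may differ, and the register numbers in the comments only point back at the insns quoted above.

```c
#include <stddef.h>

/* Load through the pointer, advance it by 4 elements as a separate
   step, then accumulate.  The array must hold at least 4 * count
   elements so the pointer never moves more than one past the end.  */
int sum_stride4(const int *p, size_t count)
{
    int sum = 0;

    while (count-- > 0) {
        int tmp = *p;  /* like (set (reg 43) (mem (reg 46)))          */
        p = p + 4;     /* like (set (reg 46) (plus (reg 46) 4))       */
        sum += tmp;    /* like (set (reg 42) (plus (reg 42) (reg 43))) */
    }
    return sum;
}

/* Unit-stride variant written with the post-increment idiom, the
   classic candidate for a POST_INC address on targets that have one.  */
int sum_unit(const int *p, size_t count)
{
    int sum = 0;

    while (count-- > 0)
        sum += *p++;
    return sum;
}
```

In `sum_stride4` the pointer update sits between the load and the add that consumes it, which is the situation described above: folding the update into the load's address leaves the load adjacent to its use.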
* Re: Autoincrement examples
From: Bernd Schmidt @ 1999-09-23 1:50 UTC
To: Michael Hayes; +Cc: gcc

On Thu, 23 Sep 1999, Michael Hayes wrote:
> Following up to my own post, there are two reasons why I prefer
> autoinc generation before reload.
>
> o more compact addressing modes can be generated

How so?

> o the combiner can do a better job. For example, given the following
> insns:
>
> (set (reg:QI 43) (mem/s:QI (reg:QI 46) 2))
>
> (set (reg:QI 46) (plus:QI (reg:QI 46) (const_int 4 [0x4])))
>
> (set (reg/v:QI 42) (plus:QI (reg/v:QI 42) (reg:QI 43)))
>
> the combiner will not combine the first and the third insns unless the
> second insn is combined with the first insn using autoincrement
> addressing.

That's a good point. I've noticed similar cases in the past, but never
thought about how this problem might apply to autoinc generation.

We could try to improve the combiner to try to move any instructions that are
between i1/i2/i3 away if that's possible; that way, we'd probably expose
quite a few additional combination opportunities.

Bernd
* Re: Autoincrement examples
From: Michael Hayes @ 1999-09-23 4:37 UTC
To: Bernd Schmidt; +Cc: Michael Hayes, gcc

Bernd Schmidt writes:
> > o more compact addressing modes can be generated
>
> How so?

Oh, in terms of the number of bits required to encode the addressing
mode since the displacement is no longer required.

Michael.
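
The encoding point can be pictured at source level with two equivalent ways of reading three consecutive elements. This is a hedged sketch with hypothetical function names; whether a compiler really emits base-plus-displacement addresses for the first form and autoincrement addresses for the second depends on the target and the optimization passes involved.

```c
/* Indexed form: the second and third accesses are naturally expressed
   as base + displacement addresses.  */
int sum3_indexed(const int *p)
{
    return p[0] + p[1] + p[2];
}

/* Stepped form: every access is a plain indirect reference and the
   pointer is bumped as a side effect, so no displacement is needed.  */
int sum3_stepped(const int *p)
{
    int s = *p++;

    s += *p++;
    s += *p;
    return s;
}
```

Both functions return the same value for the same input; only the shape of the address computations differs.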
* Re: Autoincrement examples 1999-09-22 4:32 ` Michael Hayes 1999-09-22 4:39 ` Bernd Schmidt @ 1999-09-24 4:35 ` Michael Hayes 1999-09-30 18:02 ` Michael Hayes 1999-11-18 0:22 ` Jeffrey A Law 1999-09-30 18:02 ` Michael Hayes 2 siblings, 2 replies; 94+ messages in thread From: Michael Hayes @ 1999-09-24 4:35 UTC (permalink / raw) To: Michael Hayes; +Cc: Bernd Schmidt, gcc Michael Hayes writes: > I've also been tidying up a patch that I'm about to submit for a > separate autoincrement pass that is run as part of flow optimization. > This collects lists of register references within a basic block and > uses these lists to look for a sequence of memory references to merge > with an increment insn. I found that this approach worked better than > scanning def-use chains. Here are the patches I was referring to. There are three new files: autoinc.c, ref.c, ref.h. I've written this as a bolt-on to life_analysis_1 since it completely replaces the autoinc code in flow.c. I'd be interested in folks' opinions as to whether this is the best approach or whether I'm barking up the wrong tree.... For starters, here are the docs from autoinc.c: There are a number of transformations which can be made by this optimization: case A: *REG1; REG1 = REG1 + INC => REG1 = REG1; *REG1++ case B: *REG1; REG2 = REG1 + INC => *REG1++; REG2 = REG1 where REG1 dies in the add insn. case C: (very uncommon) *REG1; REG2 = REG1 + INC => REG2 = REG1; *REG2++ where REG1 is live after the add insn and where REG2 is not used between the first memref and the add insn. This case requires a new move insn to be inserted before the first memref which makes REG2 live earlier. However, this won't affect autoinc processing for REG2 since REG2 is not operand 1 of an add insn. case D: REG1 = REG1 + INC; *REG1 => REG1 = REG1; *++REG1 case E: REG2 = REG1 + INC; *REG1 => *++REG1; REG2 = REG1 where REG1 dies in the last memref and REG2 is not used between the add insn and last memref. 
case F: *REG1; *(REG1 + INC) => *REG1; *++REG1 where REG1 dies in the last memref. This latter case is useful for DSP architectures which can handle multiple autoincrement addresses better than multiple indirect addresses. This case could be handled by a separate, optional, scan of the register ref list. Note that strength_reduce in loop.c performs the following transformation which we try to undo: *R; R = R + 1; *R; R = R + 1 => *R; *(R + 1); R = R + 2 However, the following is not transformed: R = R + 1; *R; R = R + 1; *R begin 644 autoinc.patch.gz M'XL("--?ZS<``V%U=&]I;F,N<&%T8V@`W%S[5]M(EO[9.?M'%.FS:1O$PZ;I M!!C28XRAV>6UMIET=J:/CBR5L099\DHRM#N3_WWOHTJJD@PDO=-S^BSG)-AR M/>ZKOOO5K3+G<2!_.1"3*'G<\E\=_=]_7@UZ0S$)(WD@MN6=GVW[7IHG\;;_ MD.'3C!_>^?XVS^D\O$IEGH;R(8SO1`J_LC")17NK_7W[51!.)F+3%YN[8G,N M-E-ZJH5=7U]7+QOM_?W][9W][<ZN:.\=[+P]Z.PU>(3-S<U:H^]$NPTM#K[K MX!CFCSA.,A#6$9==L=-IM]N;[=V=MXZX'7:WA%C?ICG;N[M.>W=?8`<:GQY\ MUQ;P>O.5$-^$L1\M`BE>Y\D\D@];T]?6TU3ZR5WU81AG\>8D\NXR_&3#^,1; MY`F\X0[PB=4M&6>YY]^K#^&30$["6`KUW/6GB_C>]:(H\<4O,_I=TQE:YJ$O MPC@7<9+,W5GR(-UYH]&X&5R/KIO--/^EU3HDU7?;N\YN6ZDN=,^')`R@*[R4 M+D\[!XESF<)0X,'0AU;@4AI.T'B.4(/:8Z!ETL!]2")X%DD7C9)5Q+![S+ST MWDWE7>9&(4CMY:Z,@Z('/)<Y=EJS.D7A!)K&7K3,PLQMFXHZ:`;ZKSY7&(>Y MRV.Z#]+/DU3U%'HJL<[]H6.#?K(\7?BY]H98KP\Z3Y.Y=^?ETAV#;^X;MNQD M)_4?"4;QMMO><W8[1;S]O_7"Z?E%7YGT#^J-ZE*R=!+-B2-B[.H0($)/26LK MD+28.KMOOW,ZN^]V]'(28GM=?)#BT8.EZ$6AEXEBK#">).F,70BO!(CG12*0 M7@`*)*D4,@IG84P-&*IP/#(2C61(15J#,T35!:;`A:A>X/I)(+D3C3D1S37X M/$KHHQE`7"Z#%GTF.!3\)`:KQKD[6<0^B8R38NB"SM^CSAT=O'\DG2M.^N=I M7@V3E>M%-'&9<&3L[;]U.M_O[)61(52H9^&O$BP4Z/?C92[!9&`,+\H2@6$_ ME:G4YK#C&PSP-2Y7BF'`B\FA?D=Y`GM83ZJ=\<-/K_2GDS#-<G?N99D*@[W] M?53P71D&_P(%G_#OBVIN\!.&(QSD-VK^+%H\XPN.B;=O]YS.6XT69,6W;]_A MHSUM100Y,>B?N<?=X7G//;ZX[OVG:(8M<<1/\;U[=G%]W+TXY.9*C\\<Y*BH MHAMN,L]A<8$OFDJDV!U[6>@S,BIKM@Z9D-`JODH>13Z5I):U=F$!>KZ_2`%7 MP6V7WCWP$^`H,US":!D!JJ9">OY4VY5F$C23R!-8\KC2C)4/T_NIA/&TN&`O MB0`2!/!A!DQ2C\21A`E-3$&^F1<O!2@&(87S89"%&>1(E'&1R<#!%P3U@#=! 
M*+."]EF>PY3(SIF+)F2#6,H`.P,^16YR[V`>EEE+<:;][R!=[Q0P_TTX`98F MNK>C:_?\JN>>]'O:A.<3(JN"P.\AN9>!\"8H'H..(Q[!;(LL%SD:4?X"LB-K MUD-II>4O<S`#F!Z'`?N!]9-%G+,N:PK)GL*Q3WH4PMRF.&1EU"\(I9_Z/]U` M5.`[1[1;V*_Q21$36-:@ZEY)A?\0JK;%/_[Q)&[_1GW1M=N!?-B.%U'4^`#B M7WI+(;X7.^\.=CL'G;<"C/&.S*(9O-\X34,QE',!6Y#.+FY!.N^PV7XUP&AT M8Z6WG;W]73;J!AKO6B_.\5*`0F/(>6`;6A+XPEX4,\"2=`GZ3P`W8U]F#&BH M[=G5+?:?PU).P62C*7HC@T69`DU2#E)`H+,L=^XE\V4:WDUST>RU2`5QFDHI MALDD?T2P/@4O!-3'$>>@N^X6PU9OO`#SH^B7H3_U9"1NML2/WA*,W9QM3?'% MGV'!@[T\I*_C1;K<\ORM^-<6@LT&2XGH8XJ*FO1Z6]R"W^#'$Q0J4T(=BF6R M@#4:@RT"B">61(0YHL0VF(.(\A(&@$<@/@0CXAG(,,MP#GR#0Y_)6*;`1&X6 MXPA@Z@(8=YP!%($T^"2;DG8P#'9XRBR'0H;P>2H`^VC'VR',:7HY2IF2V9,8 M54;(`BI>-JUI62J#O)BFG29SD'P*HX$NCV$4B;%$A)LL(@?Z0EOQX7STX_7M M2'2O/HH/W<&@>S7Z>`AM\VD"G\H'R2.%L%Y"&!CD3\$A2[`$#'#9'_1^A![= MX_.+\]%'%/WT?'35'P[%Z?5`=,5-=S`Z[]U>=`?BYG9P<SWL0_`,)0HEE?!/ MV'%"GD@1[W,OC#*E[D?P70:B18#DWH-$5)?`G@`V((3GRY<=A*:,$E@>J**R MWB&`/8G$`=6[OOEX?G4&D@(\P8H'%$I#")$\><&;CMC;%R.)T")N(L^78E,, M%]AU=W<'#?Y288%4W%ZG93)"@L.41\2+V1@\C\J!];,BJ6;B<0JKAZ(9/#OS M`LDQ)T!."`ESU1[PZ`(:@WF[!_1Z'1A!^Q!Y09OI05ML"`!6(8[>FT\/N>7& MAC7(<7603GT0U<_^F+H]DH+T'G.LCEG(WI19&2S*R7H'0C0A]I>P*`&M9KPL M7IC=>,HJ=$@%:W*P$Q)PE7A,$8@W%"T[V!+"@2@"#3*6^:-4"X0H'J(L("SU MLW11J$J*I/)_%F&J`!A\*X&0`/?A&2%YC>FE3`D>Y0370&T"=OL,DB(/0^*1 M%M)+8:$BCO^8/,+R31T.A<<D_A8@;C*!7;).#;C3]153VE#)@$:")[ZM,P!) MBEJU,0@AVDPOE3XZ.=#2V.&DPF=%5&ULX(N*J_L'I5)/#:-Z6HY_+JPBSW9/ MU:&%,]7JJ42!T=U6^=1:`^O-4MH6\PF0M?C4T/9E,8MY*'0`^S$\:4ZFJ0#B MY+&3X0W`A#\%G/'S!026`0I3D#V2-,IL$>4A(M-*L@R]P`"\`*!?T3B,`XA5 M/^=@U6VM>/8)C2%J>3+*Z1X@*B1ER%>.RF%>Y(@,1**!%$H7O!N=`BDS5RI3 MFZLD5[D+TIJ,[_(I;):#A8\RB2A)YEL0OS)%+,QX@201,!4=RS90*I,`F\P! 
M0&"1`60GVG,00QA`X#4"">.M0C#R*[YOE9]UM)S&,C-DT*%5B"$#%=1/3(5O MBQ3P;U;MUT_B27A7K0AG2S#=K/HTS:/J(UV$J#ZG[=4F;:^J'U$=>O6TN`VL M/U.U;/OAA!_A4U601M[..X#CVS.QH_.=>/"B,,""&S#!^$Y2Z"9QM%0A!?GR M)Q>)QEG?A=WK4'`S16'9M4B`:)(`*1AV&/1[UV?N]4T?.,W)4*R+/9UDJ^(, M^_]UV[_J]5WH)IJK^[:*T`1YAU1#7"`R)T"RHH!VNL(;(V="='R6>C.1P&W+ M+$1R04C0NS[IHQ(818/1A<`B`BUO@B)K/,)BXAL\S&7_4O=<,97BLP[!V/G5 M\$JW!?\"LXKU@M&)BBRDBJ0XKPL3PY-/U$@"$\$2"=4X!!<Z-E35!.8NW^!@ MY3N@IG-\]_G0M.)`@@EC)<O+VM*R!;R!5K_*-($]'<L=8QD!-^S`M+QQA-6% M#X/3B^X9K\%XDUKCX)ZV#_$YH&%L<QZ&S(=B*'!B$';$]>GIL#_21DLF$RP^ M`01Q28H;\PBZ!QNZ1^T#V):F1,BU/K0]G8:`&:6QJ71EV19IJJK!9+@-@BW6 M8XHG0PX)Z2A!L`3MMW1M"DM-W*HH5]&@,YPPEBZI-RN<IGH8(QT^XV@`JAWN MAYVH0'<DSOHC%ZWH#L__NP\;-ZZ4*F3$33>+HR7\I.>ESVZ'T&]T/>B[L#$8 MX7Y^T+_L7XW4..+-&VWN#37?$0J)S_'7>[&#XS:4<'J,0C<A(\R5]8E.^O9$ M.(CJ4DZYN6K*/\&4HCHG#/?\G/#O&=V>5XK[OCA^5:4GQ_]3?7Q#_L_L^RB3 M3SKLXKI[\J_P5SE/3;??S5\\Y^_FKF+X?ZJW4D91C<6?383M)3%0DYR0*(-M M!Z4$@*RKVTN%5AD#(Q.[DI`!+^4::Z!S`R,S;/Z7HGMQ/ARI0IL)U<A*J?!4 M':T+PZG:#UI68RF1:\Y^BF_;W1Z]K,1-&:@1.&MDQIXX6_BXBT$^;,M3`5B0 MV(!53+Y8_YRXF+L=W&G?8<D=_O.0BSHX`Z%B6@"L2HNZDUC7KPJ/8[;#D:P' MYOM*9A7K-)D%R#"OU1U%*)$96X1E:H6@L;-N9N$ORB+6CE"$.@ACA*BM-1=+ M<FEO1:FD8@4%/"1O*LMJG;PT=XL`:[:,!0!6YKL$-&:SL++1AJ2&\+Z3.1\S M6R.`D#_\\`.60ZS]@N(W:+Q,5X2PN#5''AGP>8$7$510I0<B##3%S39-82L@ MXV"5^)^U):DF'%(&%"&L2O20"#<VZB95_G+AMV&$5<F4_/[7\.<MO7(MJ.5& MO.0AP9Z??L3J=?$4<8P?J^REYB0KQO2*AG#$#=,%KPQLG!,)&:G94&#_A4,8 M?JER]F8Q.!K8$6^X8EX\!;QQ`,X</0]6T6N&YL-5;SZ/EFI<]RY-%G-PBK:U MPKN=+P]S5`6V2O/#BI<@'K4;=ECJLH5QRB7&8VC'QV:GUP.7.'03F[<LMQFK MZ0-$6O(H[O&_1_DMUUV#!,&1BC&L&FR($UH@)86$'QVBS:Q%3#)("!TYL#%N MRXTQ,5^+KF;%,%3MQ?EIK>!ZP.7+A:E%IK??+A)J@C)5^]%58F-Q^DF*%8!Z M10E;TBY-596J60;&+N4IQM<;<2K?_AW/=K`L`/LV>TG2DD.OB2,%:^Q#H9Y= M]7\:*5?@$PJ/!AT5H\]<Z*X0!S]UP(>VLPB7:%Q7U=C(HYDCZH[%`%OI_LWW M4SR,/&()28)GVU7PV4`W#!C)L+6(HS"^5PZ::+LY*F:PD*_V08C##BU=*E!A MIS#GM*PZ<W*OXC73!OH/9N\&7*K$HV'<M/!Y%XQ!)Z-6/BT'>0D/<;"KZU%_ M6$&%(I&6(.,"2@Q<8A1-)8-&JR='L;<8:W1G`D9$\A%)%]\T*6T7F`$;<V!! 
MS;(?*'Y)VV+>U$7&!FU3)[\,SSR6G`.-!>I%*3ASN6DP$\3"!84\GMGH+%6: MBZ6L`2:.`#@)NTMW..CQS0^M^\ZSLI]/&`"H`$Q'5^"WS62.B#U?,LP8Z9)B M(LQA(SDI/*!.TZ%OEM#I$->'YW/II:H*[8G7$&>O*;S@-8CWFM$F+);U:EW1 MK8`#7H0'XZC>21_=6^IG@/G$94G4:BWY&'FP4%JAOJ)&%LD=8K6(KQQD>97= M`G&]H>!2Z^CJFEPK)0,1BP$PFNDS%)\I,Z_%&IW\27-7S"#(*Q0Q1N$-@LM' M;B4OYHG1CI#/<8@VVHF:SO$$*Y!9'<1!$^JW)<Q*24E[M8K/2_PT`<XD5HU- MDX-G8OHU:=/_'0?%PY?Y5U!@O@X#(U4Y+\(9-F\_]4''&@.F7MD06N85.LR5 M"S6U]8ZW57;=HN3+)5.W:+I)N@M*[FK*KNX"562W'A;LGG%K59WQYPJ&H?IE M6@"N-^B?NHB`E[<7HW/WAOS4:9DMUHHVW<O^I6ZBW/@,7P+-Z,#D5.4H=(FB M8I:>NN*C%N"=..*AQ?LC<7H^``IZ,^S?GERC"+#$^@,UXP_<#.^V(<+_E=[] M+`X*R$<>T%2LDH4UP>V#HNMJKTB;1$47:B="NGJ*F0LO&&GSZ!MXY2)TRE.? M&>QAQ[SGS+R9@9,>K=3VEAX%]QR4!22==&0BP0M,.&7,(ZDAVW1,6XS7Y!/6 MPE=)V:6MSP4J)XZEJMSYN,6GS\5Q$?H-]CDY<B[JR^UZK<(V>KM&LS#=ZQCG MAWJ45>>4,262ZF:>MDTX"HB"'$)!7Q7ZZ8H.=J^5;GD(XV2;8QT##]F;"KPB MHM^\*>+Y!'[?U%K6FAKABR2L?%3M^/36MXLB(2A#_QTNU_!)ILU&O2+\,5*_ M"2?V44;)6N>+/&L"<L$B>$T3G/PM?GV@7G?A-9#^/)`II[9O8..IR]A&\:TP M$]L"JT85$]05.:XKPLR!SW+U/2H*%<\Z*VOH:*#B>N5DCLD!$G0(VC'?OK-- M@[*"PL1^:4:F*B_9B"UR_*Q%JG:WK?2D*7I/^!3OGES(_%M@!8`Y:L7RN2J9 MH;"4IN+^5.+50TSFEE&J![;4&ZQ7G"]]17STC?CHO6`-T]0FK%NV,OA6C0]K MQ[1-FYV$&1Z;J)TIRK&(`6?0$/,4JT&T^(MS<@)<TIA8(]9=[OA$!,WQZ"VW MMNP`,26M!;R!"F<8Y5PTP-B_OH6,HL*>;MRMWMACU;"^L3>+2[4SD2=/L8K/ MGCXP,7:%7#^P<*ACU$=X4M6"$CGD:%6QKS5&+8I+AK;VG1;N!LJF,RX?Z9.7 M>O/ZYE6)`JB)D0>_[,,#+B1Q5C8.1`ADR2DH7:M:B.9>*TK=Y6QK*^8K*U>U M">GGN5G+OBNF71,KU2R*[O]D+<WIUNH3_AYJKE[VYE*^!>$^(66?)UG^F;_R MLS3N0W-.-_:%*E.38'2A'A=OF:XI-'6!3(EG+Q>K?*8J'"L^A0RC5H(9J2L: M8F&RUK)MAS1N`"@I&!L&C3)?7MUO&UM,_"G0"$^F#9YY<W$[W+X\O[H=(DNK M;NI[UU?#4?=J](78M8J,PT3G5Z._="^>':&R_0=P1M3U!$M&)%@S03SSAHR= MQ"%^4V8PNN"\AWB-%^U-?$8E$$EZ*Y`$)L4,2A-H^5G<3;7%XHT!/S,?@8@C MOF8S"8%ARCA9W$V)DTM_D2-[U7M7RJW5G;>Q":R=,.'F'?:RF:K<U([7*S4J MI#4J4O!_^@\><`*V6&(+]B?X`!;]7_C!4\5],\!79R`5Q#N_1Q:R@'W5'E&A MS%/[0W-[V*B%(\M:3VRMPP)L&D2SJ%@4!,"90#9)]5]RSM\7LWEVB,(A&\$M MRQRV+'@I)PX4_/$5?/RB!-Y\EM_BA@AY`_F6OM$U]GPN@'H1P)2NK]5I9QFY 
M5!;$:/V/V\L;$OMI_9[)M954J\]D34]^85I_0DX">SK!`50I$E#YN3HS@5:T MZ+$E(0RH-"*-"I%LR"AZ.+9Z_-S4R1*,M_-K1ZI`9>2B6M41?UZ@'BW[U*Y' M93D(@=JE)%SL5`,M2E9E-?4'R\OZK.S9JS"\FHNK!P1(!]8%&4N+%ZZNE-;Q M=<[^Q)["!$NG,"(!%I]#1--%ORKT-+BW>3U`0^?F45&`:I0Y77^Z87SZ64MB M6G14O6$8,'4/%&/GE86_0GB\!':>WF?"N_/"F.9!2/P$`.<@Q?BL.$9]6>TP MIUFA_JD7XFQX3RNE,C-=5P40GO/A,>ZE9G.T!.(_^&4;B4A!>Q0?H?NAND+C MX)3>`PR,FCAT80VK"8NY*C9P5[RQYX=5&Y>WBZC5IY)@K;CVH^C8R?GPQKAW MD9I^8E4;1=Q5J!A^HK!=E[)P'1>U+"899_TK7*X<`=OD4EX9#?(J"?BY'@$K MA*]1R3^,[*4+?)/8,A7861'?)A(W"JJEE36";35D$SVTBI//\M%*RV>X::7E M*IY:HZF5/LJJ^HXC_UAM-C96TW7F<<39F-3@-VSX.*&H6JK#(OMTH/%TJ;\T M*CGC2$6'BJZ2.BL??2UMIE!HK%+2H-.V@U?<[^538P2R#"M,]'U,=:>7`2!` M+(%F/@=0IFLQX81)`U(&^L,&>!/\'I&(KC'!,/B=H6H%`JUA5[;?'ZV\_+NR MLF)=L]JI'3_=\'<HOO($"L7&)SR'/C7/S',=DQNK(W(:`R`U"\?J5O_7'/BH M;WM@7;YVZO-;SW?*0Y@\R;VH?GB@SD^^YO1$/W3)G$>%+)OO]4-UI,"]5K!] M:F0Q?IOKK[Z/JAD=%N-7L66UAFPI\\,"#O$LDL_4D,`^>BG[J_;-2R[5JYMY M20'[RDV9D>-BRYY?<G"G%FBC8>]E@`,ZX@V>XZEU7P+XFC6+JJS;IJKF)U3T M/ISS-[:1="43=3(4TBD';YWYA,;01N4!"W^JME]5^6_Q'<Y&O>[_7-F?-]T- MPKAZ!)1>TPJQ4<%A7^.XQBJO$31^K>?RPG/8W1:6BA6F]PI0*V=A9?4*R(N& M*FU75R=PS<J:+"BGA7>5?L^BWQ5Z<*@KP?XB3?&^>_$W,-"DYE?ZCP$EU3<8 ME%$1Q_4M_M76M[\:``F=UB)^$[+X4C(D5?H;#<;E?P?HXWAQ=\=TW?J3!,#D MYPNZ4E.2RY7@J?\2@CL>F_Z+7?Z["&.Z+$I_#>'KD)0'L!Z-S0N@YM^;,.'6 M.&$OT-<ZL,4OB+-R*Y6W:T"FX"4H3N;`,'/UL2->_RT^-OSW[\$!G5>@N`9' MHBL=B]E<]ZJCM[;:RM*;#JC*'V:`F,)P;-+)#:HRQ56J/R_NH_`^R/P3`GB0 MBC>L;F0Z]>89$BYU6S1=Q/0]-0B^@`ICA>/T.5"6S"3_F0@L=>B;L47(J0RM MKY[^;WM7WMM&<NS_IC[%V(!C<DTYU@,"[)-66>B@-D)D29"HK/.R`3$627NT M%*EPR(C"QM_]=1W=7=7'B+(WP":P@%U+,WU-=75U=1V_IA1\;[Y"6VLYG6%^ M-VWJWY-/&0YDF+8^G[W'@QJFM=Z6<S-)Y>2U5Q+1`4TRPW5CSEI#.!V94Q2J M47"0JH:C.0&G!?2`"$2;5$K>&Q^S(M9B9#4#6;X;@2684_8;<[).^OX]`_!6 M\)UE;?K;!X<1J[[:74\O69>?C>"X-11\R/-US,TOAHG8\1J9&@<9A1_ATR^1 M@<@3IQ*9Y?*W*`'3H@_VFP84F$>DFJ[SJ'SS0C`MWAZ?W3T:_K815)+7S7!? 
M",D!L^TVU?"[A)SRJP-&5NP&98M-2.'$5W_$L$CX=7,SEJEQ8+%$Z0ED:;B% MJ-TCO\&85G]GB\ICF(N4D]OP;8GG)PJDA^3NR03+W-6CY7#F"(4T\JP8NGC' M`SBB!)OC^_<4^4J_C:;#+FMD^/>'B9%^$P=^A4D$306@OMIOH_#K*!(P3ABH M*?`1OAB!A"@9`#_NTDA[CG^H1Q3=K!W;"(92D=6OZ?`M@AH1'8AS&AKHDOX2 MO06B^(Y%MSP`XV>$V9,>`<+\<3>;+Y;3:E&%,Q@)Y?7U'35V*(EP*ZY..LA_ M_:4;.("<E^BGZ;J"NG@*3`^D+8<0/5O_:TJM"='S[1^^%1`]1T;2^J45+"G) ME6B^8%0&.6=$HJ\H/5]1>KZB]/PF4'J<XN=9^WHV@3FODPN;%>Y2RFQP655& M6:TFP'EBZ;/0]K`LKXM,XAM)#+M#3.&;X2LA"P2]:F;G^DCGIK(8SI9PR.!W MN#=:@!K,2S&S@0W!&UA0-:=^8NTI)WU6`!%-HD#%PE$\.H2(RD\PS?B/P&7, MV13DOK()*B*)3WT/-@U3(:V[L3H,7Z%C`_@,9N7C?5DC@9?3(1^"D%G-"BQ! M(%IC#T*>H(D5-2$&&!UO0D"DD7K5U/)U;BX<,.7U?%;7EM(8OX!!>(KX,"2@ M.Q%SR&&XT!=2NI:D1L_%U?&A0UQAHSDMS$0^I,OE,E\D`(7^L:RN?YX\<'(1 MC&TL\Q8J5HO,B_8,N(;@ASK\V38@:./+4$ET?0%+HE\PPHA^Z"&N]?,&$).Y M!2#QC\!HL&G:WZQ'BPB<I`DM.U<*]RSX'S7%EJ,`8A@T(?Y])RID4QK&`V>W M;B@S7*.,X217QIU1.8.DA!`UY`^9U8LY)M+T1=D08_02`F.VM8\@X0&(\B=, M-:&4_<JY`_BVO,?(&"X2V0M@W-[7"9_=]H$+D(,DTY!9;>1!?TI2[N#D;'^_ M=_$TZEV;,PSDZ_SG4]!^_N=1\8"!6C&]U8LK4`0>[D9H[^[_];S',"T4N:RH MS)9B/@5AV!$(,'+3`KEF8T2W@9T(4QJK6O3#.9$@=JD=UD#\05'MJ[)C4(6< M09`/F-2&/(55H*B"S@?X-$,'!X7)%DH;,-H=P#PO[XJ20+74/FS'%#KR$-;X M%7(5PT:/`2H8';-&`X3^*.L:7@!%':>%^`-(*JBAGBC7.`6F<4/NEZSY60H@ M(/L@Z<^[,^3Q;Z")9344#CQ;TW!<6];K6(%+`KG].R%-NQB@,!NW7;\V!(G= M247;O9!9!,*_+]\+LGB?DBS@J>3X->C`DZH0T82RC#+0`Z?J!0%X4LP+3BM* M"QMK9W7!6WX=6EKSD))>4^]ADJ,+9HGR#?")3^VU7^7>F'IZWA.=!F76(@"H M1C,T?'A:\!)G"A@6PE"XR].!49)$4KX-;V,>WO7A>H>]H]@H!R_-$>GX-)Y/ MN^/^S?3U]\#+(5]%1&A,56GHT&[?B0[5JTR'*:FK*%RGA2ZJWO:$_B3!RR<< M.3-YJ?6;D%G-J\<'1GZ7WF@3T0,>D\7^:82\=Z/`OZ^*/^U='`ZPN\$I>@G: M[/W5.VHJ6-,EW-,W8-*][<%EWK?"[6'='9]=S^E)68.;G[`M95G5.E&L^=D9 MZ!VL'UM+6#<`3#9$$7GW"-,-:4RKK@-I<!RT2O+3+Y8]^&Q%S*@RJU;*+86& MXT4Y-V?NNKA#BT(-YP;>_Q"2$_4#^T4N[L>I3!C@C"O3'HOJUYD@?6`1C"#> MN]@[.>F=V`#BM^KM_LF?;<I).%<>%(!X-F:R=W_I'9ST3GDBWW3(N5&Q9Z,B MMT:KB&@,]3#LF.H9BG<$GH;\X,@F?/\1+"F)S[R\VC?LB_W]ZU]%_/[_>A=G M`[./7>P=]/.E+H]_.%VC5/_B^*`_.#G[<0`F,9^T\<$%5#-%E!LJ;LB,N)-< 
M'=S`RJX+N2ME_8J/+HD[S&&<KKT>ZB]8#A?]=_2QUSX8G#Y^I:EBL8I@H0KH M(CY&)!-9WYKQR34.H44!R$+L:4DM\EAF:5`E7CI/6AUR?`X>MQXMZN8AQHMJ ME5]2'&,5D=8MK!4O*QL(M!ZA6RE:16T*VJE8H#5Y4OJGGL20`!-4M'F[6/F] M(L&AL1+@>3:O`6186G&RG6JXA>Z!D'HS#&Z.-6;"(8KK.B%@,7_V9&^_=V+6 M-6-#\]/+O[[=/XL?N_R.^&FBW.'9U?Y)3[TX/U!_[AT>7@S,W,8/#X^/CN0; M*8Q53\0YOI@+508[)ULTT`D*T+-=O,@(/1^>&82BR-F'R(8EH$Q#36"6IGR> M=XXS.0FM]U9K.,0THEB7!;1Z!/]ML$<9?J*<G8;M25+$E)=$4V-8V9ZY3]D' M!/"OV86%P]132WB2VU'!PUY<,'KF$P"BLOIQ&(P>(AB/?+P$P468@HAM9IU6 MA`]`4PQF\/<DNXWH8VZQC@.\YZ5)BH>3&M#NO3G0_!R2CE0$]3$_HCIQ/WHY M'V%N9+=P02PNS5T/`G#OJ"%2EU>13_YX2MT!4U-)!XX_-DO!N2V*(4(IV0Z; MV'S5X>0K*?W7%8P[MKAEIRCL'N+@QR2IYYB"Z0/C)?U"XH'GSW]/00%LZ".A MQ0O&\_0$HH:S>DSWSRT!LX'Y<2`Q,)D28\!#W;L%&S0S3;T0=I46L3)XL"C8 M:3$C*46S4L/]3^!%#B31/L25V5AL+"0A29-RJZ7GTHQ3B2H1,1W,Z!!"((G+ M\==0:H22*=4&4]AAAZU$&U>7B=J.25Q(L,OFO'2T"C5@NAOCYY$@)W$&MX#; M/:)CF(*@?]QBG`'CA""^(4-M(#)=<7C9?\U5P:MMGM08=(,<=H^^KX7KD\!D M6<E@3Q34O)1CX86(5N/W%=RT-@(PMV#8T(B18]7<YB.IP%(48AKWY&7-F&[Z MR@^HB@+0?3"_IZN_'&=$IQG+'`%][404A3Z0V.+1X::A;'3$:2J+5`O9U('B MT=X>&N'9&PPIN5-"BAV.1$?#V<C1"K4!-F<#N)[-WX551N%5..6B-D[^;!PQ M&@.N$.)AV)P/N$\O17]V;&%&(X*,4+JG%/50O%/@S23\%I\(N`:WCF1*6Z'D MS]:.YU"666J)B[7GI=2"@WA*3GRFC'>QE<Q'U\MY7?US-'F0S@?HA/B3)3+Z MZY5@2I`C;3&P;X7%H,6,DV@#V";DFZ`[(9-4JG-PZ`Z+.9P!G*Q0%52"CI5" MS\+P@Q*1=PHNB32GE`HY5C=M+F="JA]RI]%HH/\..4P__K>D@G/1ZU]=G&XW M%]J[?.LNK%#ZWM7IY7GO8/"7LY.]_G%P=C#2XGQPK`\DT-#QZ?E5N!DCJD,Y MK.@V%PQ:L+<#FPWR5H?&(]@4@3I@=#ICF[K5`[6M;Q0T%!W.W\5G%"ZWZ4\5 M"$0)H+-C;H5"0L!#-Z,>^7NP9/#=_B*)6FPF!SP^/*=7F))\#=J/^C**\[?Y MB>#7NUL6<X@JL0A:E!L/OD<.N^#H)T;-@M0!,W2*#L%%CR%;AAFK!47Q,U8N MM.0_&[2.ZQ(A>A>\M7ZH2#]WL+RF+HX$=?1Z.3&JF?^Z[Y]]7UQ!N/=B"2)[ M\D"'-93O$=%!RT%H.>S23&%9@^2QHYK/WD]&M[6[I<,)9W3,E0N,.1FJ]NE+ M7,/<$MW9:=J'*QHAN`:Q@J=P8V]53HS&#`)^7F$(_'TYAY@<.6.7%-]BCA]= M@@^6QT_,[P`YB<`C9A.1J\*R7JW4.,?D!0=LXM!E/7$+Y;PD'R\0@.Z;AL%7 M4XB@YRNR:JOA_$@C`1;`<2I-')-8*='$GJCN?2+)="P4#HZX<LO2FC+@@!T, M$[&5S68)?4*@LID62_6%7+JUF4<(!2+4O@JH.;\U+Y9U^6&DMA%K39(=1?+? 
MJ.(W4D:BS>N&X(%OBN]49?J&P4GO](?^GU!(%S?LOJ!)B(_7<75\?&,/]WC= M;4;DHJ[KY*67MOR'$?FE63,)T:H*8AA(O!.[*;>PN'P1::WRJZQIR"T8A#X` M>3(OOAG?+MC&!.:HH[.+MWM]MBSMA/64EXG_<>9%VX0E+#:1M#(*D8Z1V+<` MK`QS_'+T,J$2]DM$EN"OQRUBNZ@91M#FM<"5M\7$"+5)I)95/@E=]-Q:.05I MY=2CUH?98D8FN!U?WAU7TENO:Z3JY([(V(*S`HL/[D4?K(A]LY/C:&G,K1YC M83M087"MB'WSPY5Z062$Y9"DVW)5W2YO79"@N\(.O98HC4_V\*:.Z\D2)B^9 MX$B#70V6U;"-\2I=1/50=E=\KDRI4$1[5\6=%-;.:M?.P>SV#J*W[8#!;1]` M^A`?$TX#C\+BB\/#9[O4HP4G$NCOA`B^$;EDS4P'@0%&Q:]PEJI<T$"0I)"I MC\WGFB`BT`15D4L'/0BA[5P'"$7.]-!>3C"QD6U<&[>U_\(>C]#D;7CB,L). M0H7Z9?6RV2_C#V0(+@_S1!'*<*)#;%P':8NLC/P/)WVP(R)N-1N17;"MS90G MY0YV04#MQ3,BM@#G(<(D]V"YV.IL:H.(Z>;`)HOB^5Z_W[NPO!(9%9L_=5FG MLIW$\CZ].CD!RG:+L)_TKA1TF@&R.C!'-0]D94V#9IX!H)\%!?T?EPX\+42M MP='5Z4'_^,PPZ.7>#[;A'5D;?[B>O]V:+K=^E<2D"0]E5)Y/8^8C@_.4M-'; MHAI`Q)^U@@8U[50=?:#Z)"EAYJ]7XLV79H7QZ8)<5?!DT]NB45-$K;Y$+!$/ M440Y9V!/&E<K5!+]=1NH^?..MTG'[SF!G)6N?MQ1EV^?EQSYL:SM/0+F]Z&L M_[&T45!T]:'"/`KO74AGXKIX$#MO>"$]:)28O`6;WX:SB7"2';_PS_'S[>-H M]^:@"`QOM5%F,MJUX@T]6(BV2G@LCN<1,Y\Q>OIN,><+3X=%^^/,:*J8.-$A M&"E@>.52B`WH?GTR'ZG(YLO^WL&?!^=GQZ>&=!B=<_76<6E6K73C/#"DK1%1 M'!T?/HY-6,"8I_SIUGUVC6&I#\B%"'HS$;OTY\^WGM.\8N(IHTA2J56:H4': M0YS-,MU'2!S3-7F"M(;2Y10:=Q5ETQ:9V!83J:="D^D6.L6T4`FECC@JL]1* MY_4N_,JK1TSU#^`-4Z/(O31C<J\:X"$*L]-<'.]?]7N#JU,S,X>9P"4?(R=4 M-'*=H,;G-(0/U]<VJG8`)W$=:"OS*WUH9QBJ^XTE&V=-V'SQXAN.TPTB>WDW M-)H!0E3+IC$D(:B=J$SUZ5-8#-F_E"Z+%^P*A13"3M'S,)_-;NT->/9TK.Q8 M`(C%&9!ED."$/SJ_=EK,82D;5:BL=0R7'Z3][97[[??%MXJ^P[7I:UM8G\!# M2>"P>H[",O;TWS$HV_:3!Z5Q18)PYYV@F/@"V6E8;*A;DQDWEGV<DA@D]-J+ MY!?L<'(A_5(2E@Z<P[L4R=U1+R&TADQ*:$W0"!VVUUM3$7(CO;2TZ0UP61)I M?W0-5>_T,'"WRV,6':A"U=.<M*`9%(M&$<`F3XY/>P,CP#']0Q4+>O/JK3VH M0:BX.JA1D=Z[W@$(K^.C`>B`QPB;`;\6;26=(3!);D:_*%70:B&4J!+L83NZ MJ-[YN%;71_71GTY!M14SNQ^__B0/?6KKHZEF8S#HF04`R13CN9$W(YN-:*9P M2DF>V>G9R1(R.JBE#H@JAH%``:$U/F.C9I"U>*W'Y;B;K</G(F:QA#R#X8BY M/<WE\#FU:H1Y/L'L`D$C.RK73@+O1MH=?J45X8<4KXG(>"&8:(V%P5K,>DOC M,04]6A]Y15WQO+)J8*:U5]G&"F*&_`5:F@%D[/6\>F^3[XN"9\U>JT[Y-T=G 
M"64O0)=82U.C16+U&VI!)Q*A*GS>O]`QK/$>XS`K3!NIUSN)NG:GR=1UKU-U MA\W]#G6_R3WQ3<-&&+T;)NNE6TNW$[:@..60;IU#YK?>O&HA,_42$&5B\K-W MQ0ECV'HZNS:7)>$HXR2U;+Z1E9E7='DC?!;*+ROTTL%O`6BEG@'*_X$7$L.2 MDX;FC2"6&L.^Y5.]@I=)5$3=5``$Z1L($1IW`_3\U(X24@=5GR>01J=&_5>1 MAA=&'$,.SC*_&W5QC^$6<CN1W(@.>R>]?N]05+@\N[HXZ`T0=\M5D/>\)1FQ M28ZH`JGE;K<&`D)B[(DJA"VP&X;R8A"I\JZ,6#:D(9=2GHTO/=SS>5L=L^$O M3(Q4/A#D9JE9D/KUN&JARS7I%I1JK)0+JJQ3E1!/.)FQW*2%6A\,:TC\SGUG MO-R2\K30,M4]<^TDU%Q5L%G\VK6YGJ,#\=/3O"XM[C#>(>Z`_QCJBR`VG`D0 MD:DMCX/^%%KPODC$IPSQ_F8/<3M<1\/`BU%O[?`SI^MGX=R?N7JIOL6%\_P$ MC"ED;\<)9!!N%SV#8E[>'H<@(S0:%_WP0%'GRRFA5<HK10PECVRP!\3)4$U[ M=RTN<C3-UW@7C@A2+Q<$:H*WU?7IQMI6B_+OZ19%P':;5^Z(/EHMYN5T-%N: MX\VL))>^2%UHR>//:+PH)A0EB;$W-D1C,IO=F<^8SR:`T475O@?PT<-9?*T< M(1V(._=0$Z5*"&WNEV32!>-.'<[GPY-L&#MY(3(?&AV.,MX^I6_G%>^?N4D$ M[:^&FWRU>RM@M\>TLP37Y;.,#@&L$N*:%@"Y8DAF02J#&\;G:=@&:C8%S2&W MB^7M'4&..IS:.@1#U^"4D60#T:9MLP&F1U:MG`Y*,("%(.FI9RDP=4P["YZA M82-XALLO1,U\1JL2+PB"WVR&/CZ0N'Q!E'X(TW=A:/=BN%T\CX!#M-"S'3P) MC]UE>Z6O8W#EZ!OI@H5(0Q.`XE!1WS`:R$V\F;%H/T=4R.<I:$-H%8<4>F1] M*`M'>KC\T)9/%O,+E]K(B5=_8;%=]>+68OM4)B6U+"&`>8@027FQ4C&S_B(A MC$9Z`]8!4=4%RX3U?3`O_21"=5<R0#<HR\GV84&WZL2X7)FM3C`P)7=XWG*W M,5&C+;ANT]#G?J=X7FSCK_,=-<N)W5'2MHR(ZW]+WR?D`ZQYOK,30U^<N.NI M%:ZX%W7[Q1#@:SLP^`WMHXR^W'ZS^V)7Q=X=Q3T3C$I(Y?PH'0^%XRO,`.<H M%7Z%03*K-(XQ.PJSZ7[)$*+.9,2U7X&<L,1EK`2!Y:MYRW(1B&W%1.EF1?ZE M;EK?U.=O07`'!)'L:#Z*C7W,\V`^_+)QP=K?5D\P9G\[Q]A.XHB5OPK7NUW' MJ?@2%J?)B]`H+"^8>G#/M7G:@QO2?%MNO2>X/-6O8_=LC\3SU.N&])P+89?I M7S1,W(,-&O;MQ/)I*BX3$JI4;KK"A(`UV33=V-5E[S/X4;$CP((]PHI1WR(^ M-QB^:<U0J;F56-/T1.=M-YQ/&F0<XN"TIZF[)X5WA1R*\7>[N[O/O0;A%#36 MRC9YW>TD=:N?IJRZ@2AE;0]^Y39>#)_SYSUWUP51`=86X5>K8;X8(E`R5Z#N MN]Q[E\?59?VARTS6=5^7L.E\CFY.1()S(B/DN^-4RI*CKX_P0.R?K9?[,(I? 
MG'I\L[.1T8J;E.`@$-C><7#C[S?('RYNUJ`FQ/^7>&,F^@I_.-QOP`!CFB&V MOX,7>01F4(V.[AKO*B._5.<_:ZR/S+`;;6T#1#)7!^@!YP8K,/H_/17H^Z,$ M^M[:VMYZL_T_;]8$^MYZ\X>O0-]?@;Z_`GW_EP)]"]!C#2WSBS`M!#'>$?:' M5TKB\\>G'2=9#^B>F,D#P^KQK7\EG#"Z]CS0)1P#T,*[I'EC:A3C\]E8SJ)] MC[EE:(]$&!X,NX0<8[@,AASO=YBEW1%?Z;?/C1"0UJBS')`;RS,!XN+\(%3^ M&(R`US/,\*.KB<:Z+&'Y4.&S);>*"52^G"8\XOE@>0P2@;LZ(83Z]\@.OE*( M.DI53D>KA1Z$+#<U;ZG<N:E1@14X6Q81P76CL,"EV\Y.;>!AVDC9!=U-ASM$ M-H?]C;MP>@C?##-U;"I0IIJ+>7NDFF#-MYBUBGD8U^Z6([,3$.2Z@+[RSCP7 MQN%&L>'QLWD9P(FA4[3Q'XA6P$M\91DX3>A"UM,D2^%RT<6<B4Z6`YA+7<P" M3LI28)#4I8`OPE+@HM*E@,O"4NC%T<60<3H)RAJ9:K8S5*H4(6&,8HT&)"33 M)G?A`6GQ[QCP-*P,]PN?R_&1T3^^A[&3JBV,*(\.P)7--437'*_5#A3--2-; MB`P]^.HG=PI,_QCE/QZ7)T*BWSW7<;?@W`'J18U"]P_^F=#B($S<4%):4,6H M-21#<TV\W%E_<,)6FVTC,X*G-Q,:;^+&E)EDK0$!F9-LN1?P96Y2]ORL^"+" M.Q"R=[8OS[I?TI58`*F>Z#[:7">"3@%K!6]S;"&XO[DN(71X+GXZ$X>SW5`# M\D6>L%ZV,NMEW8K>N)@4+U<G_6,K8(CT(LK'DMX;P?2[^);:CJ4BK&@5.)1H MR+]+-=2AS61/*6>OK?.;(288I8@N7^%L!M``[>82[RQ"(OM-`?>,Z.M2!,,& MSBXPAC64R];UMI9(3HCF8&MB%7V%@)-@8D!URZ8$G5^<]<_:[2C41^<4(DXE M_0_3<?R_U93^Y\)HPH[P,)OK)5M+>LJS0X01/=:"B'9J;.>1QL"^PO7)MM;5 /BJ.FP<;_`V4#/7TPP@`` ` end ^ permalink raw reply [flat|nested] 94+ messages in thread
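Editorial illustration of cases A and D from the autoinc.c notes in the message above: a minimal sketch, not the code in the attached patch. The tuple-based insn format, the opcode names (`load`, `add`, `load_postinc`, `load_preinc`) and the functions `merge_autoinc` and `run` are all hypothetical, invented only for this example. Unlike the pass described in the message, the sketch only merges insns that are directly adjacent and does no liveness analysis, so cases B, C, E and F are out of scope.

```python
# Toy model (hypothetical): ("load", dst, base) means dst = *base,
# ("add", dst, src, inc) means dst = src + inc.

def merge_autoinc(insns):
    """Fold adjacent load/add pairs on one register into a single autoinc insn."""
    out = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        # case A: *REG1; REG1 = REG1 + INC  =>  *REG1++
        if (nxt and cur[0] == "load" and nxt[0] == "add"
                and nxt[1] == nxt[2] == cur[2] and cur[1] != cur[2]):
            out.append(("load_postinc", cur[1], cur[2], nxt[3]))
            i += 2
        # case D: REG1 = REG1 + INC; *REG1  =>  *++REG1
        elif (nxt and cur[0] == "add" and nxt[0] == "load"
                and cur[1] == cur[2] == nxt[2] and nxt[1] != nxt[2]):
            out.append(("load_preinc", nxt[1], nxt[2], cur[3]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


def run(insns, regs, mem):
    """Interpret the toy insns; returns the final register file as a dict."""
    regs = dict(regs)
    for op, dst, src, *rest in insns:
        if op == "load":
            regs[dst] = mem[regs[src]]
        elif op == "add":
            regs[dst] = regs[src] + rest[0]
        elif op == "load_postinc":
            regs[dst] = mem[regs[src]]   # use the old address, then bump it
            regs[src] += rest[0]
        elif op == "load_preinc":
            regs[src] += rest[0]         # bump the address, then use it
            regs[dst] = mem[regs[src]]
    return regs


# One case A pair followed by one case D pair.
before = [("load", "a", "r1"), ("add", "r1", "r1", 1),
          ("add", "r1", "r1", 1), ("load", "b", "r1")]
after = merge_autoinc(before)
mem = [10, 20, 30]
```

Running both sequences through `run` with the same starting registers and memory gives the same final register state, which is the property any such merge has to preserve. The guard that the loaded register differs from the address register is an assumption of this sketch, added because a load that overwrites its own base register cannot be folded this way.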
* Re: Autoincrement examples 1999-09-24 4:35 ` Michael Hayes @ 1999-09-30 18:02 ` Michael Hayes 1999-11-18 0:22 ` Jeffrey A Law 1 sibling, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-09-30 18:02 UTC (permalink / raw) To: Michael Hayes; +Cc: Bernd Schmidt, gcc Michael Hayes writes: > I've also been tidying up a patch that I'm about to submit for a > separate autoincrement pass that is run as part of flow optimization. > This collects lists of register references within a basic block and > uses these lists to look for a sequence of memory references to merge > with an increment insn. I found that this approach worked better than > scanning def-use chains. Here are the patches I was referring to. There are three new files: autoinc.c, ref.c, ref.h. I've written this as a bolt-on to life_analysis_1 since it completely replaces the autoinc code in flow.c. I'd be interested in folks' opinions as to whether this is the best approach or whether I'm barking up the wrong tree.... For starters, here are the docs from autoinc.c: There are a number of transformations which can be made by this optimization: case A: *REG1; REG1 = REG1 + INC => REG1 = REG1; *REG1++ case B: *REG1; REG2 = REG1 + INC => *REG1++; REG2 = REG1 where REG1 dies in the add insn. case C: (very uncommon) *REG1; REG2 = REG1 + INC => REG2 = REG1; *REG2++ where REG1 is live after the add insn and where REG2 is not used between the first memref and the add insn. This case requires a new move insn to be inserted before the first memref which makes REG2 live earlier. However, this won't affect autoinc processing for REG2 since REG2 is not operand 1 of an add insn. case D: REG1 = REG1 + INC; *REG1 => REG1 = REG1; *++REG1 case E: REG2 = REG1 + INC; *REG1 => *++REG1; REG2 = REG1 where REG1 dies in the last memref and REG2 is not used between the add insn and last memref. case F: *REG1; *(REG1 + INC) => *REG1; *++REG1 where REG1 dies in the last memref. 
This latter case is useful for DSP architectures which can handle multiple autoincrement addresses better than multiple indirect addresses. This case could be handled by a separate, optional, scan of the register ref list. Note that strength_reduce in loop.c performs the following transformation which we try to undo: *R; R = R + 1; *R; R = R + 1 => *R; *(R + 1); R = R + 2 However, the following is not transformed: R = R + 1; *R; R = R + 1; *R begin 644 autoinc.patch.gz M'XL("--?ZS<``V%U=&]I;F,N<&%T8V@`W%S[5]M(EO[9.?M'%.FS:1O$PZ;I M!!C28XRAV>6UMIET=J:/CBR5L099\DHRM#N3_WWOHTJJD@PDO=-S^BSG)-AR M/>ZKOOO5K3+G<2!_.1"3*'G<\E\=_=]_7@UZ0S$)(WD@MN6=GVW[7IHG\;;_ MD.'3C!_>^?XVS^D\O$IEGH;R(8SO1`J_LC")17NK_7W[51!.)F+3%YN[8G,N M-E-ZJH5=7U]7+QOM_?W][9W][<ZN:.\=[+P]Z.PU>(3-S<U:H^]$NPTM#K[K MX!CFCSA.,A#6$9==L=-IM]N;[=V=MXZX'7:WA%C?ICG;N[M.>W=?8`<:GQY\ MUQ;P>O.5$-^$L1\M`BE>Y\D\D@];T]?6TU3ZR5WU81AG\>8D\NXR_&3#^,1; MY`F\X0[PB=4M&6>YY]^K#^&30$["6`KUW/6GB_C>]:(H\<4O,_I=TQE:YJ$O MPC@7<9+,W5GR(-UYH]&X&5R/KIO--/^EU3HDU7?;N\YN6ZDN=,^')`R@*[R4 M+D\[!XESF<)0X,'0AU;@4AI.T'B.4(/:8Z!ETL!]2")X%DD7C9)5Q+![S+ST MWDWE7>9&(4CMY:Z,@Z('/)<Y=EJS.D7A!)K&7K3,PLQMFXHZ:`;ZKSY7&(>Y MRV.Z#]+/DU3U%'HJL<[]H6.#?K(\7?BY]H98KP\Z3Y.Y=^?ETAV#;^X;MNQD M)_4?"4;QMMO><W8[1;S]O_7"Z?E%7YGT#^J-ZE*R=!+-B2-B[.H0($)/26LK MD+28.KMOOW,ZN^]V]'(28GM=?)#BT8.EZ$6AEXEBK#">).F,70BO!(CG12*0 M7@`*)*D4,@IG84P-&*IP/#(2C61(15J#,T35!:;`A:A>X/I)(+D3C3D1S37X M/$KHHQE`7"Z#%GTF.!3\)`:KQKD[6<0^B8R38NB"SM^CSAT=O'\DG2M.^N=I M7@V3E>M%-'&9<&3L[;]U.M_O[)61(52H9^&O$BP4Z/?C92[!9&`,+\H2@6$_ ME:G4YK#C&PSP-2Y7BF'`B\FA?D=Y`GM83ZJ=\<-/K_2GDS#-<G?N99D*@[W] M?53P71D&_P(%G_#OBVIN\!.&(QSD-VK^+%H\XPN.B;=O]YS.6XT69,6W;]_A MHSUM100Y,>B?N<?=X7G//;ZX[OVG:(8M<<1/\;U[=G%]W+TXY.9*C\\<Y*BH MHAMN,L]A<8$OFDJDV!U[6>@S,BIKM@Z9D-`JODH>13Z5I):U=F$!>KZ_2`%7 MP6V7WCWP$^`H,US":!D!JJ9">OY4VY5F$C23R!-8\KC2C)4/T_NIA/&TN&`O MB0`2!/!A!DQ2C\21A`E-3$&^F1<O!2@&(87S89"%&>1(E'&1R<#!%P3U@#=! 
M*+."]EF>PY3(SIF+)F2#6,H`.P,^16YR[V`>EEE+<:;][R!=[Q0P_TTX`98F MNK>C:_?\JN>>]'O:A.<3(JN"P.\AN9>!\"8H'H..(Q[!;(LL%SD:4?X"LB-K MUD-II>4O<S`#F!Z'`?N!]9-%G+,N:PK)GL*Q3WH4PMRF.&1EU"\(I9_Z/]U` M5.`[1[1;V*_Q21$36-:@ZEY)A?\0JK;%/_[Q)&[_1GW1M=N!?-B.%U'4^`#B M7WI+(;X7.^\.=CL'G;<"C/&.S*(9O-\X34,QE',!6Y#.+FY!.N^PV7XUP&AT M8Z6WG;W]73;J!AKO6B_.\5*`0F/(>6`;6A+XPEX4,\"2=`GZ3P`W8U]F#&BH M[=G5+?:?PU).P62C*7HC@T69`DU2#E)`H+,L=^XE\V4:WDUST>RU2`5QFDHI MALDD?T2P/@4O!-3'$>>@N^X6PU9OO`#SH^B7H3_U9"1NML2/WA*,W9QM3?'% MGV'!@[T\I*_C1;K<\ORM^-<6@LT&2XGH8XJ*FO1Z6]R"W^#'$Q0J4T(=BF6R M@#4:@RT"B">61(0YHL0VF(.(\A(&@$<@/@0CXAG(,,MP#GR#0Y_)6*;`1&X6 MXPA@Z@(8=YP!%($T^"2;DG8P#'9XRBR'0H;P>2H`^VC'VR',:7HY2IF2V9,8 M54;(`BI>-JUI62J#O)BFG29SD'P*HX$NCV$4B;%$A)LL(@?Z0EOQX7STX_7M M2'2O/HH/W<&@>S7Z>`AM\VD"G\H'R2.%L%Y"&!CD3\$A2[`$#'#9'_1^A![= MX_.+\]%'%/WT?'35'P[%Z?5`=,5-=S`Z[]U>=`?BYG9P<SWL0_`,)0HEE?!/ MV'%"GD@1[W,OC#*E[D?P70:B18#DWH-$5)?`G@`V((3GRY<=A*:,$E@>J**R MWB&`/8G$`=6[OOEX?G4&D@(\P8H'%$I#")$\><&;CMC;%R.)T")N(L^78E,, M%]AU=W<'#?Y288%4W%ZG93)"@L.41\2+V1@\C\J!];,BJ6;B<0JKAZ(9/#OS M`LDQ)T!."`ESU1[PZ`(:@WF[!_1Z'1A!^Q!Y09OI05ML"`!6(8[>FT\/N>7& MAC7(<7603GT0U<_^F+H]DH+T'G.LCEG(WI19&2S*R7H'0C0A]I>P*`&M9KPL M7IC=>,HJ=$@%:W*P$Q)PE7A,$8@W%"T[V!+"@2@"#3*6^:-4"X0H'J(L("SU MLW11J$J*I/)_%F&J`!A\*X&0`/?A&2%YC>FE3`D>Y0370&T"=OL,DB(/0^*1 M%M)+8:$BCO^8/,+R31T.A<<D_A8@;C*!7;).#;C3]153VE#)@$:")[ZM,P!) MBEJU,0@AVDPOE3XZ.=#2V.&DPF=%5&ULX(N*J_L'I5)/#:-Z6HY_+JPBSW9/ MU:&%,]7JJ42!T=U6^=1:`^O-4MH6\PF0M?C4T/9E,8MY*'0`^S$\:4ZFJ0#B MY+&3X0W`A#\%G/'S!026`0I3D#V2-,IL$>4A(M-*L@R]P`"\`*!?T3B,`XA5 M/^=@U6VM>/8)C2%J>3+*Z1X@*B1ER%>.RF%>Y(@,1**!%$H7O!N=`BDS5RI3 MFZLD5[D+TIJ,[_(I;):#A8\RB2A)YEL0OS)%+,QX@201,!4=RS90*I,`F\P! 
M0&"1`60GVG,00QA`X#4"">.M0C#R*[YOE9]UM)S&,C-DT*%5B"$#%=1/3(5O MBQ3P;U;MUT_B27A7K0AG2S#=K/HTS:/J(UV$J#ZG[=4F;:^J'U$=>O6TN`VL M/U.U;/OAA!_A4U601M[..X#CVS.QH_.=>/"B,,""&S#!^$Y2Z"9QM%0A!?GR M)Q>)QEG?A=WK4'`S16'9M4B`:)(`*1AV&/1[UV?N]4T?.,W)4*R+/9UDJ^(, M^_]UV[_J]5WH)IJK^[:*T`1YAU1#7"`R)T"RHH!VNL(;(V="='R6>C.1P&W+ M+$1R04C0NS[IHQ(818/1A<`B`BUO@B)K/,)BXAL\S&7_4O=<,97BLP[!V/G5 M\$JW!?\"LXKU@M&)BBRDBJ0XKPL3PY-/U$@"$\$2"=4X!!<Z-E35!.8NW^!@ MY3N@IG-\]_G0M.)`@@EC)<O+VM*R!;R!5K_*-($]'<L=8QD!-^S`M+QQA-6% M#X/3B^X9K\%XDUKCX)ZV#_$YH&%L<QZ&S(=B*'!B$';$]>GIL#_21DLF$RP^ M`01Q28H;\PBZ!QNZ1^T#V):F1,BU/K0]G8:`&:6QJ71EV19IJJK!9+@-@BW6 M8XHG0PX)Z2A!L`3MMW1M"DM-W*HH5]&@,YPPEBZI-RN<IGH8(QT^XV@`JAWN MAYVH0'<DSOHC%ZWH#L__NP\;-ZZ4*F3$33>+HR7\I.>ESVZ'T&]T/>B[L#$8 MX7Y^T+_L7XW4..+-&VWN#37?$0J)S_'7>[&#XS:4<'J,0C<A(\R5]8E.^O9$ M.(CJ4DZYN6K*/\&4HCHG#/?\G/#O&=V>5XK[OCA^5:4GQ_]3?7Q#_L_L^RB3 M3SKLXKI[\J_P5SE/3;??S5\\Y^_FKF+X?ZJW4D91C<6?383M)3%0DYR0*(-M M!Z4$@*RKVTN%5AD#(Q.[DI`!+^4::Z!S`R,S;/Z7HGMQ/ARI0IL)U<A*J?!4 M':T+PZG:#UI68RF1:\Y^BF_;W1Z]K,1-&:@1.&MDQIXX6_BXBT$^;,M3`5B0 MV(!53+Y8_YRXF+L=W&G?8<D=_O.0BSHX`Z%B6@"L2HNZDUC7KPJ/8[;#D:P' MYOM*9A7K-)D%R#"OU1U%*)$96X1E:H6@L;-N9N$ORB+6CE"$.@ACA*BM-1=+ M<FEO1:FD8@4%/"1O*LMJG;PT=XL`:[:,!0!6YKL$-&:SL++1AJ2&\+Z3.1\S M6R.`D#_\\`.60ZS]@N(W:+Q,5X2PN#5''AGP>8$7$510I0<B##3%S39-82L@ MXV"5^)^U):DF'%(&%"&L2O20"#<VZB95_G+AMV&$5<F4_/[7\.<MO7(MJ.5& MO.0AP9Z??L3J=?$4<8P?J^REYB0KQO2*AG#$#=,%KPQLG!,)&:G94&#_A4,8 M?JER]F8Q.!K8$6^X8EX\!;QQ`,X</0]6T6N&YL-5;SZ/EFI<]RY-%G-PBK:U MPKN=+P]S5`6V2O/#BI<@'K4;=ECJLH5QRB7&8VC'QV:GUP.7.'03F[<LMQFK MZ0-$6O(H[O&_1_DMUUV#!,&1BC&L&FR($UH@)86$'QVBS:Q%3#)("!TYL#%N MRXTQ,5^+KF;%,%3MQ?EIK>!ZP.7+A:E%IK??+A)J@C)5^]%58F-Q^DF*%8!Z M10E;TBY-596J60;&+N4IQM<;<2K?_AW/=K`L`/LV>TG2DD.OB2,%:^Q#H9Y= M]7\:*5?@$PJ/!AT5H\]<Z*X0!S]UP(>VLPB7:%Q7U=C(HYDCZH[%`%OI_LWW M4SR,/&()28)GVU7PV4`W#!C)L+6(HS"^5PZ::+LY*F:PD*_V08C##BU=*E!A MIS#GM*PZ<W*OXC73!OH/9N\&7*K$HV'<M/!Y%XQ!)Z-6/BT'>0D/<;"KZU%_ M6$&%(I&6(.,"2@Q<8A1-)8-&JR='L;<8:W1G`D9$\A%)%]\T*6T7F`$;<V!! 
MS;(?*'Y)VV+>U$7&!FU3)[\,SSR6G`.-!>I%*3ASN6DP$\3"!84\GMGH+%6: MBZ6L`2:.`#@)NTMW..CQS0^M^\ZSLI]/&`"H`$Q'5^"WS62.B#U?,LP8Z9)B M(LQA(SDI/*!.TZ%OEM#I$->'YW/II:H*[8G7$&>O*;S@-8CWFM$F+);U:EW1 MK8`#7H0'XZC>21_=6^IG@/G$94G4:BWY&'FP4%JAOJ)&%LD=8K6(KQQD>97= M`G&]H>!2Z^CJFEPK)0,1BP$PFNDS%)\I,Z_%&IW\27-7S"#(*Q0Q1N$-@LM' M;B4OYHG1CI#/<8@VVHF:SO$$*Y!9'<1!$^JW)<Q*24E[M8K/2_PT`<XD5HU- MDX-G8OHU:=/_'0?%PY?Y5U!@O@X#(U4Y+\(9-F\_]4''&@.F7MD06N85.LR5 M"S6U]8ZW57;=HN3+)5.W:+I)N@M*[FK*KNX"562W'A;LGG%K59WQYPJ&H?IE M6@"N-^B?NHB`E[<7HW/WAOS4:9DMUHHVW<O^I6ZBW/@,7P+-Z,#D5.4H=(FB M8I:>NN*C%N"=..*AQ?LC<7H^``IZ,^S?GERC"+#$^@,UXP_<#.^V(<+_E=[] M+`X*R$<>T%2LDH4UP>V#HNMJKTB;1$47:B="NGJ*F0LO&&GSZ!MXY2)TRE.? M&>QAQ[SGS+R9@9,>K=3VEAX%]QR4!22==&0BP0M,.&7,(ZDAVW1,6XS7Y!/6 MPE=)V:6MSP4J)XZEJMSYN,6GS\5Q$?H-]CDY<B[JR^UZK<(V>KM&LS#=ZQCG MAWJ45>>4,262ZF:>MDTX"HB"'$)!7Q7ZZ8H.=J^5;GD(XV2;8QT##]F;"KPB MHM^\*>+Y!'[?U%K6FAKABR2L?%3M^/36MXLB(2A#_QTNU_!)ILU&O2+\,5*_ M"2?V44;)6N>+/&L"<L$B>$T3G/PM?GV@7G?A-9#^/)`II[9O8..IR]A&\:TP M$]L"JT85$]05.:XKPLR!SW+U/2H*%<\Z*VOH:*#B>N5DCLD!$G0(VC'?OK-- M@[*"PL1^:4:F*B_9B"UR_*Q%JG:WK?2D*7I/^!3OGES(_%M@!8`Y:L7RN2J9 MH;"4IN+^5.+50TSFEE&J![;4&ZQ7G"]]17STC?CHO6`-T]0FK%NV,OA6C0]K MQ[1-FYV$&1Z;J)TIRK&(`6?0$/,4JT&T^(MS<@)<TIA8(]9=[OA$!,WQZ"VW MMNP`,26M!;R!"F<8Y5PTP-B_OH6,HL*>;MRMWMACU;"^L3>+2[4SD2=/L8K/ MGCXP,7:%7#^P<*ACU$=X4M6"$CGD:%6QKS5&+8I+AK;VG1;N!LJF,RX?Z9.7 M>O/ZYE6)`JB)D0>_[,,#+B1Q5C8.1`ADR2DH7:M:B.9>*TK=Y6QK*^8K*U>U M">GGN5G+OBNF71,KU2R*[O]D+<WIUNH3_AYJKE[VYE*^!>$^(66?)UG^F;_R MLS3N0W-.-_:%*E.38'2A'A=OF:XI-'6!3(EG+Q>K?*8J'"L^A0RC5H(9J2L: M8F&RUK)MAS1N`"@I&!L&C3)?7MUO&UM,_"G0"$^F#9YY<W$[W+X\O[H=(DNK M;NI[UU?#4?=J](78M8J,PT3G5Z._="^>':&R_0=P1M3U!$M&)%@S03SSAHR= MQ"%^4V8PNN"\AWB-%^U-?$8E$$EZ*Y`$)L4,2A-H^5G<3;7%XHT!/S,?@8@C MOF8S"8%ARCA9W$V)DTM_D2-[U7M7RJW5G;>Q":R=,.'F'?:RF:K<U([7*S4J MI#4J4O!_^@\><`*V6&(+]B?X`!;]7_C!4\5],\!79R`5Q#N_1Q:R@'W5'E&A MS%/[0W-[V*B%(\M:3VRMPP)L&D2SJ%@4!,"90#9)]5]RSM\7LWEVB,(A&\$M MRQRV+'@I)PX4_/$5?/RB!-Y\EM_BA@AY`_F6OM$U]GPN@'H1P)2NK]5I9QFY 
M5!;$:/V/V\L;$OMI_9[)M954J\]D34]^85I_0DX">SK!`50I$E#YN3HS@5:T MZ+$E(0RH-"*-"I%LR"AZ.+9Z_-S4R1*,M_-K1ZI`9>2B6M41?UZ@'BW[U*Y' M93D(@=JE)%SL5`,M2E9E-?4'R\OZK.S9JS"\FHNK!P1(!]8%&4N+%ZZNE-;Q M=<[^Q)["!$NG,"(!%I]#1--%ORKT-+BW>3U`0^?F45&`:I0Y77^Z87SZ64MB M6G14O6$8,'4/%&/GE86_0GB\!':>WF?"N_/"F.9!2/P$`.<@Q?BL.$9]6>TP MIUFA_JD7XFQX3RNE,C-=5P40GO/A,>ZE9G.T!.(_^&4;B4A!>Q0?H?NAND+C MX)3>`PR,FCAT80VK"8NY*C9P5[RQYX=5&Y>WBZC5IY)@K;CVH^C8R?GPQKAW MD9I^8E4;1=Q5J!A^HK!=E[)P'1>U+"899_TK7*X<`=OD4EX9#?(J"?BY'@$K MA*]1R3^,[*4+?)/8,A7861'?)A(W"JJEE36";35D$SVTBI//\M%*RV>X::7E M*IY:HZF5/LJJ^HXC_UAM-C96TW7F<<39F-3@-VSX.*&H6JK#(OMTH/%TJ;\T M*CGC2$6'BJZ2.BL??2UMIE!HK%+2H-.V@U?<[^538P2R#"M,]'U,=:>7`2!` M+(%F/@=0IFLQX81)`U(&^L,&>!/\'I&(KC'!,/B=H6H%`JUA5[;?'ZV\_+NR MLF)=L]JI'3_=\'<HOO($"L7&)SR'/C7/S',=DQNK(W(:`R`U"\?J5O_7'/BH M;WM@7;YVZO-;SW?*0Y@\R;VH?GB@SD^^YO1$/W3)G$>%+)OO]4-UI,"]5K!] M:F0Q?IOKK[Z/JAD=%N-7L66UAFPI\\,"#O$LDL_4D,`^>BG[J_;-2R[5JYMY M20'[RDV9D>-BRYY?<G"G%FBC8>]E@`,ZX@V>XZEU7P+XFC6+JJS;IJKF)U3T M/ISS-[:1="43=3(4TBD';YWYA,;01N4!"W^JME]5^6_Q'<Y&O>[_7-F?-]T- MPKAZ!)1>TPJQ4<%A7^.XQBJO$31^K>?RPG/8W1:6BA6F]PI0*V=A9?4*R(N& M*FU75R=PS<J:+"BGA7>5?L^BWQ5Z<*@KP?XB3?&^>_$W,-"DYE?ZCP$EU3<8 ME%$1Q_4M_M76M[\:``F=UB)^$[+X4C(D5?H;#<;E?P?HXWAQ=\=TW?J3!,#D MYPNZ4E.2RY7@J?\2@CL>F_Z+7?Z["&.Z+$I_#>'KD)0'L!Z-S0N@YM^;,.'6 M.&$OT-<ZL,4OB+-R*Y6W:T"FX"4H3N;`,'/UL2->_RT^-OSW[\$!G5>@N`9' MHBL=B]E<]ZJCM[;:RM*;#JC*'V:`F,)P;-+)#:HRQ56J/R_NH_`^R/P3`GB0 MBC>L;F0Z]>89$BYU6S1=Q/0]-0B^@`ICA>/T.5"6S"3_F0@L=>B;L47(J0RM MKY[^;WM7WMM&<NS_IC[%V(!C<DTYU@,"[)-66>B@-D)D29"HK/.R`3$627NT M%*EPR(C"QM_]=1W=7=7'B+(WP":P@%U+,WU-=75U=1V_IA1\;[Y"6VLYG6%^ M-VWJWY-/&0YDF+8^G[W'@QJFM=Z6<S-)Y>2U5Q+1`4TRPW5CSEI#.!V94Q2J M47"0JH:C.0&G!?2`"$2;5$K>&Q^S(M9B9#4#6;X;@2684_8;<[).^OX]`_!6 M\)UE;?K;!X<1J[[:74\O69>?C>"X-11\R/-US,TOAHG8\1J9&@<9A1_ATR^1 M@<@3IQ*9Y?*W*`'3H@_VFP84F$>DFJ[SJ'SS0C`MWAZ?W3T:_K815)+7S7!? 
M",D!L^TVU?"[A)SRJP-&5NP&98M-2.'$5W_$L$CX=7,SEJEQ8+%$Z0ED:;B% MJ-TCO\&85G]GB\ICF(N4D]OP;8GG)PJDA^3NR03+W-6CY7#F"(4T\JP8NGC' M`SBB!)OC^_<4^4J_C:;#+FMD^/>'B9%^$P=^A4D$306@OMIOH_#K*!(P3ABH M*?`1OAB!A"@9`#_NTDA[CG^H1Q3=K!W;"(92D=6OZ?`M@AH1'8AS&AKHDOX2 MO06B^(Y%MSP`XV>$V9,>`<+\<3>;+Y;3:E&%,Q@)Y?7U'35V*(EP*ZY..LA_ M_:4;.("<E^BGZ;J"NG@*3`^D+8<0/5O_:TJM"='S[1^^%1`]1T;2^J45+"G) ME6B^8%0&.6=$HJ\H/5]1>KZB]/PF4'J<XN=9^WHV@3FODPN;%>Y2RFQP655& M6:TFP'EBZ;/0]K`LKXM,XAM)#+M#3.&;X2LA"P2]:F;G^DCGIK(8SI9PR.!W MN#=:@!K,2S&S@0W!&UA0-:=^8NTI)WU6`!%-HD#%PE$\.H2(RD\PS?B/P&7, MV13DOK()*B*)3WT/-@U3(:V[L3H,7Z%C`_@,9N7C?5DC@9?3(1^"D%G-"BQ! M(%IC#T*>H(D5-2$&&!UO0D"DD7K5U/)U;BX<,.7U?%;7EM(8OX!!>(KX,"2@ M.Q%SR&&XT!=2NI:D1L_%U?&A0UQAHSDMS$0^I,OE,E\D`(7^L:RN?YX\<'(1 MC&TL\Q8J5HO,B_8,N(;@ASK\V38@:./+4$ET?0%+HE\PPHA^Z"&N]?,&$).Y M!2#QC\!HL&G:WZQ'BPB<I`DM.U<*]RSX'S7%EJ,`8A@T(?Y])RID4QK&`V>W M;B@S7*.,X217QIU1.8.DA!`UY`^9U8LY)M+T1=D08_02`F.VM8\@X0&(\B=, M-:&4_<JY`_BVO,?(&"X2V0M@W-[7"9_=]H$+D(,DTY!9;>1!?TI2[N#D;'^_ M=_$TZEV;,PSDZ_SG4]!^_N=1\8"!6C&]U8LK4`0>[D9H[^[_];S',"T4N:RH MS)9B/@5AV!$(,'+3`KEF8T2W@9T(4QJK6O3#.9$@=JD=UD#\05'MJ[)C4(6< M09`/F-2&/(55H*B"S@?X-$,'!X7)%DH;,-H=P#PO[XJ20+74/FS'%#KR$-;X M%7(5PT:/`2H8';-&`X3^*.L:7@!%':>%^`-(*JBAGBC7.`6F<4/NEZSY60H@ M(/L@Z<^[,^3Q;Z")9344#CQ;TW!<6];K6(%+`KG].R%-NQB@,!NW7;\V!(G= M247;O9!9!,*_+]\+LGB?DBS@J>3X->C`DZH0T82RC#+0`Z?J!0%X4LP+3BM* M"QMK9W7!6WX=6EKSD))>4^]ADJ,+9HGR#?")3^VU7^7>F'IZWA.=!F76(@"H M1C,T?'A:\!)G"A@6PE"XR].!49)$4KX-;V,>WO7A>H>]H]@H!R_-$>GX-)Y/ MN^/^S?3U]\#+(5]%1&A,56GHT&[?B0[5JTR'*:FK*%RGA2ZJWO:$_B3!RR<< M.3-YJ?6;D%G-J\<'1GZ7WF@3T0,>D\7^:82\=Z/`OZ^*/^U='`ZPN\$I>@G: M[/W5.VHJ6-,EW-,W8-*][<%EWK?"[6'='9]=S^E)68.;G[`M95G5.E&L^=D9 MZ!VL'UM+6#<`3#9$$7GW"-,-:4RKK@-I<!RT2O+3+Y8]^&Q%S*@RJU;*+86& MXT4Y-V?NNKA#BT(-YP;>_Q"2$_4#^T4N[L>I3!C@C"O3'HOJUYD@?6`1C"#> MN]@[.>F=V`#BM^KM_LF?;<I).%<>%(!X-F:R=W_I'9ST3GDBWW3(N5&Q9Z,B MMT:KB&@,]3#LF.H9BG<$GH;\X,@F?/\1+"F)S[R\VC?LB_W]ZU]%_/[_>A=G M`[./7>P=]/.E+H]_.%VC5/_B^*`_.#G[<0`F,9^T\<$%5#-%E!LJ;LB,N)-< 
M'=S`RJX+N2ME_8J/+HD[S&&<KKT>ZB]8#A?]=_2QUSX8G#Y^I:EBL8I@H0KH M(CY&)!-9WYKQR34.H44!R$+L:4DM\EAF:5`E7CI/6AUR?`X>MQXMZN8AQHMJ ME5]2'&,5D=8MK!4O*QL(M!ZA6RE:16T*VJE8H#5Y4OJGGL20`!-4M'F[6/F] M(L&AL1+@>3:O`6186G&RG6JXA>Z!D'HS#&Z.-6;"(8KK.B%@,7_V9&^_=V+6 M-6-#\]/+O[[=/XL?N_R.^&FBW.'9U?Y)3[TX/U!_[AT>7@S,W,8/#X^/CN0; M*8Q53\0YOI@+508[)ULTT`D*T+-=O,@(/1^>&82BR-F'R(8EH$Q#36"6IGR> M=XXS.0FM]U9K.,0THEB7!;1Z!/]ML$<9?J*<G8;M25+$E)=$4V-8V9ZY3]D' M!/"OV86%P]132WB2VU'!PUY<,'KF$P"BLOIQ&(P>(AB/?+P$P468@HAM9IU6 MA`]`4PQF\/<DNXWH8VZQC@.\YZ5)BH>3&M#NO3G0_!R2CE0$]3$_HCIQ/WHY M'V%N9+=P02PNS5T/`G#OJ"%2EU>13_YX2MT!4U-)!XX_-DO!N2V*(4(IV0Z; MV'S5X>0K*?W7%8P[MKAEIRCL'N+@QR2IYYB"Z0/C)?U"XH'GSW]/00%LZ".A MQ0O&\_0$HH:S>DSWSRT!LX'Y<2`Q,)D28\!#W;L%&S0S3;T0=I46L3)XL"C8 M:3$C*46S4L/]3^!%#B31/L25V5AL+"0A29-RJZ7GTHQ3B2H1,1W,Z!!"((G+ M\==0:H22*=4&4]AAAZU$&U>7B=J.25Q(L,OFO'2T"C5@NAOCYY$@)W$&MX#; M/:)CF(*@?]QBG`'CA""^(4-M(#)=<7C9?\U5P:MMGM08=(,<=H^^KX7KD\!D M6<E@3Q34O)1CX86(5N/W%=RT-@(PMV#8T(B18]7<YB.IP%(48AKWY&7-F&[Z MR@^HB@+0?3"_IZN_'&=$IQG+'`%][404A3Z0V.+1X::A;'3$:2J+5`O9U('B MT=X>&N'9&PPIN5-"BAV.1$?#V<C1"K4!-F<#N)[-WX551N%5..6B-D[^;!PQ M&@.N$.)AV)P/N$\O17]V;&%&(X*,4+JG%/50O%/@S23\%I\(N`:WCF1*6Z'D MS]:.YU"666J)B[7GI=2"@WA*3GRFC'>QE<Q'U\MY7?US-'F0S@?HA/B3)3+Z MZY5@2I`C;3&P;X7%H,6,DV@#V";DFZ`[(9-4JG-PZ`Z+.9P!G*Q0%52"CI5" MS\+P@Q*1=PHNB32GE`HY5C=M+F="JA]RI]%HH/\..4P__K>D@G/1ZU]=G&XW M%]J[?.LNK%#ZWM7IY7GO8/"7LY.]_G%P=C#2XGQPK`\DT-#QZ?E5N!DCJD,Y MK.@V%PQ:L+<#FPWR5H?&(]@4@3I@=#ICF[K5`[6M;Q0T%!W.W\5G%"ZWZ4\5 M"$0)H+-C;H5"0L!#-Z,>^7NP9/#=_B*)6FPF!SP^/*=7F))\#=J/^C**\[?Y MB>#7NUL6<X@JL0A:E!L/OD<.N^#H)T;-@M0!,W2*#L%%CR%;AAFK!47Q,U8N MM.0_&[2.ZQ(A>A>\M7ZH2#]WL+RF+HX$=?1Z.3&JF?^Z[Y]]7UQ!N/=B"2)[ M\D"'-93O$=%!RT%H.>S23&%9@^2QHYK/WD]&M[6[I<,)9W3,E0N,.1FJ]NE+ M7,/<$MW9:=J'*QHAN`:Q@J=P8V]53HS&#`)^7F$(_'TYAY@<.6.7%-]BCA]= M@@^6QT_,[P`YB<`C9A.1J\*R7JW4.,?D!0=LXM!E/7$+Y;PD'R\0@.Z;AL%7 M4XB@YRNR:JOA_$@C`1;`<2I-')-8*='$GJCN?2+)="P4#HZX<LO2FC+@@!T, M$[&5S68)?4*@LID62_6%7+JUF4<(!2+4O@JH.;\U+Y9U^6&DMA%K39(=1?+? 
MJ.(W4D:BS>N&X(%OBN]49?J&P4GO](?^GU!(%S?LOJ!)B(_7<75\?&,/]WC= M;4;DHJ[KY*67MOR'$?FE63,)T:H*8AA(O!.[*;>PN'P1::WRJZQIR"T8A#X` M>3(OOAG?+MC&!.:HH[.+MWM]MBSMA/64EXG_<>9%VX0E+#:1M#(*D8Z1V+<` MK`QS_'+T,J$2]DM$EN"OQRUBNZ@91M#FM<"5M\7$"+5)I)95/@E=]-Q:.05I MY=2CUH?98D8FN!U?WAU7TENO:Z3JY([(V(*S`HL/[D4?K(A]LY/C:&G,K1YC M83M087"MB'WSPY5Z062$Y9"DVW)5W2YO79"@N\(.O98HC4_V\*:.Z\D2)B^9 MX$B#70V6U;"-\2I=1/50=E=\KDRI4$1[5\6=%-;.:M?.P>SV#J*W[8#!;1]` M^A`?$TX#C\+BB\/#9[O4HP4G$NCOA`B^$;EDS4P'@0%&Q:]PEJI<T$"0I)"I MC\WGFB`BT`15D4L'/0BA[5P'"$7.]-!>3C"QD6U<&[>U_\(>C]#D;7CB,L). M0H7Z9?6RV2_C#V0(+@_S1!'*<*)#;%P':8NLC/P/)WVP(R)N-1N17;"MS90G MY0YV04#MQ3,BM@#G(<(D]V"YV.IL:H.(Z>;`)HOB^5Z_W[NPO!(9%9L_=5FG MLIW$\CZ].CD!RG:+L)_TKA1TF@&R.C!'-0]D94V#9IX!H)\%!?T?EPX\+42M MP='5Z4'_^,PPZ.7>#[;A'5D;?[B>O]V:+K=^E<2D"0]E5)Y/8^8C@_.4M-'; MHAI`Q)^U@@8U[50=?:#Z)"EAYJ]7XLV79H7QZ8)<5?!DT]NB45-$K;Y$+!$/ M440Y9V!/&E<K5!+]=1NH^?..MTG'[SF!G)6N?MQ1EV^?EQSYL:SM/0+F]Z&L M_[&T45!T]:'"/`KO74AGXKIX$#MO>"$]:)28O`6;WX:SB7"2';_PS_'S[>-H M]^:@"`QOM5%F,MJUX@T]6(BV2G@LCN<1,Y\Q>OIN,><+3X=%^^/,:*J8.-$A M&"E@>.52B`WH?GTR'ZG(YLO^WL&?!^=GQZ>&=!B=<_76<6E6K73C/#"DK1%1 M'!T?/HY-6,"8I_SIUGUVC6&I#\B%"'HS$;OTY\^WGM.\8N(IHTA2J56:H4': M0YS-,MU'2!S3-7F"M(;2Y10:=Q5ETQ:9V!83J:="D^D6.L6T4`FECC@JL]1* MY_4N_,JK1TSU#^`-4Z/(O31C<J\:X"$*L]-<'.]?]7N#JU,S,X>9P"4?(R=4 M-'*=H,;G-(0/U]<VJG8`)W$=:"OS*WUH9QBJ^XTE&V=-V'SQXAN.TPTB>WDW M-)H!0E3+IC$D(:B=J$SUZ5-8#-F_E"Z+%^P*A13"3M'S,)_-;NT->/9TK.Q8 M`(C%&9!ED."$/SJ_=EK,82D;5:BL=0R7'Z3][97[[??%MXJ^P[7I:UM8G\!# M2>"P>H[",O;TWS$HV_:3!Z5Q18)PYYV@F/@"V6E8;*A;DQDWEGV<DA@D]-J+ MY!?L<'(A_5(2E@Z<P[L4R=U1+R&TADQ*:$W0"!VVUUM3$7(CO;2TZ0UP61)I M?W0-5>_T,'"WRV,6':A"U=.<M*`9%(M&$<`F3XY/>P,CP#']0Q4+>O/JK3VH M0:BX.JA1D=Z[W@$(K^.C`>B`QPB;`;\6;26=(3!);D:_*%70:B&4J!+L83NZ MJ-[YN%;71_71GTY!M14SNQ^__B0/?6KKHZEF8S#HF04`R13CN9$W(YN-:*9P M2DF>V>G9R1(R.JBE#H@JAH%``:$U/F.C9I"U>*W'Y;B;K</G(F:QA#R#X8BY M/<WE\#FU:H1Y/L'L`D$C.RK73@+O1MH=?J45X8<4KXG(>"&8:(V%P5K,>DOC M,04]6A]Y15WQO+)J8*:U5]G&"F*&_`5:F@%D[/6\>F^3[XN"9\U>JT[Y-T=G 
M"64O0)=82U.C16+U&VI!)Q*A*GS>O]`QK/$>XS`K3!NIUSN)NG:GR=1UKU-U MA\W]#G6_R3WQ3<-&&+T;)NNE6TNW$[:@..60;IU#YK?>O&HA,_42$&5B\K-W MQ0ECV'HZNS:7)>$HXR2U;+Z1E9E7='DC?!;*+ROTTL%O`6BEG@'*_X$7$L.2 MDX;FC2"6&L.^Y5.]@I=)5$3=5``$Z1L($1IW`_3\U(X24@=5GR>01J=&_5>1 MAA=&'$,.SC*_&W5QC^$6<CN1W(@.>R>]?N]05+@\N[HXZ`T0=\M5D/>\)1FQ M28ZH`JGE;K<&`D)B[(DJA"VP&X;R8A"I\JZ,6#:D(9=2GHTO/=SS>5L=L^$O M3(Q4/A#D9JE9D/KUN&JARS7I%I1JK)0+JJQ3E1!/.)FQW*2%6A\,:TC\SGUG MO-R2\K30,M4]<^TDU%Q5L%G\VK6YGJ,#\=/3O"XM[C#>(>Z`_QCJBR`VG`D0 MD:DMCX/^%%KPODC$IPSQ_F8/<3M<1\/`BU%O[?`SI^MGX=R?N7JIOL6%\_P$ MC"ED;\<)9!!N%SV#8E[>'H<@(S0:%_WP0%'GRRFA5<HK10PECVRP!\3)4$U[ M=RTN<C3-UW@7C@A2+Q<$:H*WU?7IQMI6B_+OZ19%P':;5^Z(/EHMYN5T-%N: MX\VL))>^2%UHR>//:+PH)A0EB;$W-D1C,IO=F<^8SR:`T475O@?PT<-9?*T< M(1V(._=0$Z5*"&WNEV32!>-.'<[GPY-L&#MY(3(?&AV.,MX^I6_G%>^?N4D$ M[:^&FWRU>RM@M\>TLP37Y;.,#@&L$N*:%@"Y8DAF02J#&\;G:=@&:C8%S2&W MB^7M'4&..IS:.@1#U^"4D60#T:9MLP&F1U:MG`Y*,("%(.FI9RDP=4P["YZA M82-XALLO1,U\1JL2+PB"WVR&/CZ0N'Q!E'X(TW=A:/=BN%T\CX!#M-"S'3P) MC]UE>Z6O8W#EZ!OI@H5(0Q.`XE!1WS`:R$V\F;%H/T=4R.<I:$-H%8<4>F1] M*`M'>KC\T)9/%O,+E]K(B5=_8;%=]>+68OM4)B6U+"&`>8@027FQ4C&S_B(A MC$9Z`]8!4=4%RX3U?3`O_21"=5<R0#<HR\GV84&WZL2X7)FM3C`P)7=XWG*W M,5&C+;ANT]#G?J=X7FSCK_,=-<N)W5'2MHR(ZW]+WR?D`ZQYOK,30U^<N.NI M%:ZX%W7[Q1#@:SLP^`WMHXR^W'ZS^V)7Q=X=Q3T3C$I(Y?PH'0^%XRO,`.<H M%7Z%03*K-(XQ.PJSZ7[)$*+.9,2U7X&<L,1EK`2!Y:MYRW(1B&W%1.EF1?ZE M;EK?U.=O07`'!)'L:#Z*C7W,\V`^_+)QP=K?5D\P9G\[Q]A.XHB5OPK7NUW' MJ?@2%J?)B]`H+"^8>G#/M7G:@QO2?%MNO2>X/-6O8_=LC\3SU.N&])P+89?I M7S1,W(,-&O;MQ/)I*BX3$JI4;KK"A(`UV33=V-5E[S/X4;$CP((]PHI1WR(^ M-QB^:<U0J;F56-/T1.=M-YQ/&F0<XN"TIZF[)X5WA1R*\7>[N[O/O0;A%#36 MRC9YW>TD=:N?IJRZ@2AE;0]^Y39>#)_SYSUWUP51`=86X5>K8;X8(E`R5Z#N MN]Q[E\?59?VARTS6=5^7L.E\CFY.1()S(B/DN^-4RI*CKX_P0.R?K9?[,(I? 
MG'I\L[.1T8J;E.`@$-C><7#C[S?('RYNUJ`FQ/^7>&,F^@I_.-QOP`!CFB&V MOX,7>01F4(V.[AKO*B._5.<_:ZR/S+`;;6T#1#)7!^@!YP8K,/H_/17H^Z,$ M^M[:VMYZL_T_;]8$^MYZ\X>O0-]?@;Z_`GW_EP)]"]!C#2WSBS`M!#'>$?:' M5TKB\\>G'2=9#^B>F,D#P^KQK7\EG#"Z]CS0)1P#T,*[I'EC:A3C\]E8SJ)] MC[EE:(]$&!X,NX0<8[@,AASO=YBEW1%?Z;?/C1"0UJBS')`;RS,!XN+\(%3^ M&(R`US/,\*.KB<:Z+&'Y4.&S);>*"52^G"8\XOE@>0P2@;LZ(83Z]\@.OE*( M.DI53D>KA1Z$+#<U;ZG<N:E1@14X6Q81P76CL,"EV\Y.;>!AVDC9!=U-ASM$ M-H?]C;MP>@C?##-U;"I0IIJ+>7NDFF#-MYBUBGD8U^Z6([,3$.2Z@+[RSCP7 MQN%&L>'QLWD9P(FA4[3Q'XA6P$M\91DX3>A"UM,D2^%RT<6<B4Z6`YA+7<P" M3LI28)#4I8`OPE+@HM*E@,O"4NC%T<60<3H)RAJ9:K8S5*H4(6&,8HT&)"33 M)G?A`6GQ[QCP-*P,]PN?R_&1T3^^A[&3JBV,*(\.P)7--437'*_5#A3--2-; MB`P]^.HG=PI,_QCE/QZ7)T*BWSW7<;?@W`'J18U"]P_^F=#B($S<4%):4,6H M-21#<TV\W%E_<,)6FVTC,X*G-Q,:;^+&E)EDK0$!F9-LN1?P96Y2]ORL^"+" M.Q"R=[8OS[I?TI58`*F>Z#[:7">"3@%K!6]S;"&XO[DN(71X+GXZ$X>SW5`# M\D6>L%ZV,NMEW8K>N)@4+U<G_6,K8(CT(LK'DMX;P?2[^);:CJ4BK&@5.)1H MR+]+-=2AS61/*6>OK?.;(288I8@N7^%L!M``[>82[RQ"(OM-`?>,Z.M2!,,& MSBXPAC64R];UMI9(3HCF8&MB%7V%@)-@8D!URZ8$G5^<]<_:[2C41^<4(DXE M_0_3<?R_U93^Y\)HPH[P,)OK)5M+>LJS0X01/=:"B'9J;.>1QL"^PO7)MM;5 /BJ.FP<;_`V4#/7TPP@`` ` end ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-09-24 4:35 ` Michael Hayes 1999-09-30 18:02 ` Michael Hayes @ 1999-11-18 0:22 ` Jeffrey A Law 1999-11-18 0:45 ` Michael Hayes ` (3 more replies) 1 sibling, 4 replies; 94+ messages in thread
From: Jeffrey A Law @ 1999-11-18 0:22 UTC (permalink / raw)
To: Michael Hayes; +Cc: gcc, amylaar

In message <14315.24602.150585.362111@ongaonga.elec.canterbury.ac.nz> you write:
> Michael Hayes writes:
> > I've also been tidying up a patch that I'm about to submit for a
> > separate autoincrement pass that is run as part of flow optimization.
> > This collects lists of register references within a basic block and
> > uses these lists to look for a sequence of memory references to merge
> > with an increment insn. I found that this approach worked better than
> > scanning def-use chains.
>
> Here's the patches I was referring to. There are three new files:
> autoinc.c, ref.c, ref.h. I've written this as a bolt-on to
> life_analysis_1 since it completely replaces the autoinc code in
> flow.c.
>
> I'd be interested in folks' opinions as to whether this is the best
> approach or whether I'm barking up the wrong tree....
>
> For starters, here are the docs from autoinc.c:
>
> There are a number of transformations which can be made by
> this optimization:
>
> case A:
>   *REG1; REG1 = REG1 + INC  =>  REG1 = REG1; *REG1++
>
> case B:
>   *REG1; REG2 = REG1 + INC  =>  *REG1++; REG2 = REG1
>   where REG1 dies in the add insn.
>
> case C: (very uncommon)
>   *REG1; REG2 = REG1 + INC  =>  REG2 = REG1; *REG2++
>   where REG1 is live after the add insn and where REG2 is not used
>   between the first memref and the add insn. This case requires
>   a new move insn to be inserted before the first memref which makes
>   REG2 live earlier. However, this won't affect autoinc processing
>   for REG2 since REG2 is not operand 1 of an add insn.
>
> case D:
>   REG1 = REG1 + INC; *REG1  =>  REG1 = REG1; *++REG1
>
> case E:
>   REG2 = REG1 + INC; *REG1  =>  *++REG1; REG2 = REG1
>   where REG1 dies in the last memref and REG2 is not used between
>   the add insn and last memref.
>
> case F:
>   *REG1; *(REG1 + INC)  =>  *REG1; *++REG1
>   where REG1 dies in the last memref.
>
> This latter case is useful for DSP architectures which can handle
> multiple autoincrement addresses better than multiple indirect
> addresses. This case could be handled by a separate, optional, scan
> of the register ref list.
>
> Note that strength_reduce in loop.c performs the following
> transformation which we try to undo:
>   *R; R = R + 1; *R; R = R + 1  =>  *R; *(R + 1); R = R + 2
>
> However, the following is not transformed:
>   R = R + 1; *R; R = R + 1; *R

This looks very similar to something Cygnus did for a customer but hasn't
had the time to contribute.

Our implementation sat inside regmove and I believe performed similar
transformations.

What I would like to do is have a "cook off" between the two implementations.
ie, I want us to evaluate the two hunks of code both from a standpoint of which
is more effective at optimizing sequences that can use autoinc to remove
instructions and from a cleanliness/long term maintainability standpoint.

I do _not_ want to ultimately have two hunks of code that basically do the
same thing. That's dumb.

Joern -- can you get a patch for the regmove changes put together and submit
it to the list?

Michael -- can you look at Joern's implementation and compare it to your own?

Joern -- can you do the same with Michael's implementation?

It would be nice if some other folks could try these two implementations on
targets that have autoincrement addresses to see what effect they have.

The two key issues are which optimizes better and which is more maintainable
long term.

jeff

^ permalink raw reply [flat|nested] 94+ messages in thread
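[Editorial sketch, not part of the original message.] The quoted case list is written in terms of RTL insns, but the shapes it describes can be illustrated at the C source level. The functions below are only an illustration under stated assumptions: the names `sum_post`, `sum_pre` and `sum_reduced` are hypothetical, and nothing here claims that a particular GCC version or target actually emits autoincrement addressing for them. They show the "use then add" shape of case A, the "add then use" shape of case D, and the `*R; *(R + 1); R = R + 2` form that the message says strength_reduce produces.

```c
#include <stddef.h>

/* Case A shape: the address is used, then incremented.
   A post-increment address (*p++) could fold the add into the load. */
int sum_post(const int *p, size_t n)
{
    int s = 0;
    while (n-- > 0) {
        s += *p;       /* *REG1             */
        p = p + 1;     /* REG1 = REG1 + INC */
    }
    return s;
}

/* Case D shape: the address is incremented, then used.
   A pre-increment address (*++p) could fold the add into the load. */
int sum_pre(const int *p, size_t n)
{
    int s;
    if (n == 0)
        return 0;
    s = *p;            /* first element is read without a preceding add */
    while (--n > 0) {
        p = p + 1;     /* REG1 = REG1 + INC */
        s += *p;       /* *REG1             */
    }
    return s;
}

/* The form the message attributes to strength_reduce:
   two references at offsets 0 and 1, then a single add of 2. */
int sum_reduced(const int *r, size_t n)
{
    int s = 0;
    for (; n >= 2; n -= 2) {
        s += *r;          /* *R        */
        s += *(r + 1);    /* *(R + 1)  */
        r = r + 2;        /* R = R + 2 */
    }
    if (n == 1)           /* leftover element when n is odd */
        s += *r;
    return s;
}
```

All three functions compute the same sum; they differ only in how the address arithmetic is arranged around the memory references, which is the property an autoincrement pass has to recognise.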
* Re: Autoincrement examples 1999-11-18 0:22 ` Jeffrey A Law @ 1999-11-18 0:45 ` Michael Hayes 1999-11-18 7:33 ` Joern Rennecke ` (2 more replies) 1999-11-18 7:29 ` Joern Rennecke ` (2 subsequent siblings) 3 siblings, 3 replies; 94+ messages in thread
From: Michael Hayes @ 1999-11-18 0:45 UTC (permalink / raw)
To: law; +Cc: Michael Hayes, gcc, amylaar

Jeffrey A Law writes:
> This looks very similar to something Cygnus did for a customer but hasn't
> had the time to contribute.
>
> Our implementation sat inside regmove and I believe performed similar
> transformations.

> What I would like to do is have a "cook off" between the two implementations.
> ie, I want us to evaluate the two hunks of code both from a standpoint of which
> is more effective at optimizing sequences that can use autoinc to remove
> instructions and from a cleanliness/long term maintainability standpoint.

What were the chief advantages of doing this during regmove? I'm not
familiar with this pass but I feel the transformations are too late.
For simple cases it should make no difference, but for more complex
cases, autoincrement optimisation needs to run before instruction
combination.

> Joern -- can you do the same with Michael's implementation?

I'll resubmit my patches tomorrow. There have been a few mods since
my previous submission to properly fix {post,pre}_modify addressing
modes and to make the code more robust.

For the cook off, I've got a testsuite of 25 vector/matrix
manipulation routines I have written and 40 assorted testcases that I
can contribute.

Michael.

^ permalink raw reply [flat|nested] 94+ messages in thread
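[Editorial sketch, not part of the original message.] The message mentions vector/matrix routines and {post,pre}_modify addressing modes without showing either. The function below is a hypothetical routine of that general kind, and is not one of Michael's test cases. It walks one column of a row-major matrix, so the address advances by a run-time stride before each load after the first. The assumption being illustrated is that a PRE_MODIFY address can update its base register by a register amount rather than by the fixed access size; whether any given target or compiler version uses one here is not claimed.

```c
#include <stddef.h>

/* Sum one column of a row-major matrix.
   m      points at the first element of the column,
   rows   is the number of elements to visit,
   stride is the distance (in elements) between consecutive rows. */
long column_sum(const int *m, size_t rows, size_t stride)
{
    long s;
    if (rows == 0)
        return 0;
    s = *m;                /* first element: plain indirect load */
    while (--rows > 0) {
        m += stride;       /* bump the address by a register amount ... */
        s += *m;           /* ... then use it: the pre-modify shape     */
    }
    return s;
}
```

The pointer is only advanced immediately before a load, so it never steps past the last element visited.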
* Re: Autoincrement examples 1999-11-18 0:45 ` Michael Hayes @ 1999-11-18 7:33 ` Joern Rennecke 1999-11-18 16:25 ` Jeffrey A Law 1999-11-30 23:37 ` Joern Rennecke 1999-11-18 17:00 ` Michael Hayes 1999-11-30 23:37 ` Michael Hayes 2 siblings, 2 replies; 94+ messages in thread
From: Joern Rennecke @ 1999-11-18 7:33 UTC (permalink / raw)
To: Michael Hayes; +Cc: law, m.hayes, gcc, amylaar

> What were the chief advantages of doing this during regmove? I'm not
> familiar with this pass but I feel the transformations are too late.
> For simple cases it should make no difference, but for more complex
> cases, autoincrement optimisation needs to run before instruction
> combination.

My loop patches recognize a number of cases where autoincrement considerations
require a different pattern of giv combination.

However, we are not supposed to generate autoincrement before flow. The
point of doing this in regmove is that we already have constraint information
there - using autoincrement instead of three-address adds is much more
useful when you don't actually have a three-address add to start with.

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 7:33 ` Joern Rennecke @ 1999-11-18 16:25 ` Jeffrey A Law 1999-11-30 23:37 ` Jeffrey A Law 1999-12-13 15:28 ` Joern Rennecke 1999-11-30 23:37 ` Joern Rennecke 1 sibling, 2 replies; 94+ messages in thread
From: Jeffrey A Law @ 1999-11-18 16:25 UTC (permalink / raw)
To: Joern Rennecke; +Cc: Michael Hayes, gcc, amylaar

In message <199911181532.PAA09078@phal.cygnus.co.uk> you write:
> However, we are not supposed to generate autoincrement before flow. The
> point of doing this in regmove is that we already have constraint information
> there - using autoincrement instead of three-address adds is much more
> useful when you don't actually have a three-address add to start with.

It seems to me that we can get this information during flow/life analysis
just as easily as in regmove.c.

The interactions between loop & autoinc are clearly something we need to
deal with. Both of you have noted the fact that loop opts tend to disguise
autoinc opportunities.

But I believe we can attack the problems separately -- ie, we can improve
autoinc detection separately from making the loop optimizer generate code that
is more autoinc friendly. The biggest obstacle in doing that is deciding
precisely what sequences we want to detect for autoinc opts and which ones
we're going to depend on loop generating autoinc friendly code.

jeff

^ permalink raw reply [flat|nested] 94+ messages in thread
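[Editorial sketch, not part of the original message.] The interaction between loop optimization and autoincrement detection discussed here can be pictured with a small source-level pair. This is a hedged illustration only: the function names are hypothetical, and the comments describe the general idea of address induction variables ("givs" in the thread's terminology), not the exact behaviour of any GCC release. In the indexed form, the three addresses are all derived from the single counter `i`, and how the loop optimizer chooses to combine or split them determines what an autoincrement pass sees afterwards. The pointer form states the autoinc-friendly arrangement directly: each address lives in its own register and is stepped right after its use.

```c
#include <stddef.h>

/* Indexed form: a + i, b + i and c + i are three address
   induction variables derived from the one counter i. */
void vadd_indexed(int *c, const int *a, const int *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Pointer form: each address is incremented immediately after
   its memory reference, the shape a post-increment mode matches. */
void vadd_pointers(int *c, const int *a, const int *b, size_t n)
{
    while (n-- > 0)
        *c++ = *a++ + *b++;
}
```

Both functions store the same results; comparing the code generated for the two on an autoincrement target is one way to see how much the loop pass has helped or hidden the opportunity.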
* Re: Autoincrement examples 1999-11-18 16:25 ` Jeffrey A Law 1999-11-30 23:37 ` Jeffrey A Law @ 1999-12-13 15:28 ` Joern Rennecke 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 1 reply; 94+ messages in thread
From: Joern Rennecke @ 1999-12-13 15:28 UTC (permalink / raw)
To: law; +Cc: m.hayes, gcc

> But I believe we can attack the problems separately -- ie, we can improve
> autoinc detection separately from making the loop optimizer generate code that
> is more autoinc friendly. The biggest obstacle in doing that is deciding
> precisely what sequences we want to detect for autoinc opts and which ones
> we're going to depend on loop generating autoinc friendly code.

Sorry, but for the example I just posted there is no chance for decent
autoinc generation after the current mainline loop optimizer reduced tons
of givs separately.

We may discuss the patches separately, but we have to test the autoinc patches
with the loop patches in place to get meaningful results.

With loop & regmove patches, the example I posted takes 62 insns for the SH.
With loop & flow patches, 64.
With the flow patches alone, 185.

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 0:45 ` Michael Hayes 1999-11-18 7:33 ` Joern Rennecke @ 1999-11-18 17:00 ` Michael Hayes 1999-11-18 18:02 ` Joern Rennecke ` (3 more replies) 1999-11-30 23:37 ` Michael Hayes 2 siblings, 4 replies; 94+ messages in thread From: Michael Hayes @ 1999-11-18 17:00 UTC (permalink / raw) To: Michael Hayes; +Cc: law, gcc, amylaar Michael Hayes writes: > I'll resubmit my patches tomorrow. There have been a few mods since > my previous submission to properly fix {post,pre}_modify addressing > modes and to make the code more robust. OK, here goes. Michael. begin 644 autoinc.patch.gz M'XL(")"@-#@``V%U=&]I;F,N<&%T8V@`W#SY<QI'NC_COZ*]6Y6`#;(.V[&D M=?P00C9E!"J.V-Y4:FK$-##6,,.;&22(X_W;]SNZ9WH.D'*]EUU58D&?WWUU MMSJ^(]<G8NH%=WN31Z]__\^C06LHIJXG3\2SR6WT;#:9/).S"7_@;>JWCT(9 MAZZ\=?V9".%7Y`:^.-@[>/GJD>-.IZ(Q$8TCT5B*1DBM&KXG3YZHCY6#X^/C M9P?[\)\X/#HY>GYR\++"*S0:C<*@`W%X>'+X\F3_.US#_!%G010'?EU<-L7^ MX<'!0>/@:/^[NA@/FWM"/'E&>QX</:\?/-\7.('6YX9#`9\;CX3XN^M/O)4C MQ=_B8.G)V[WYWS*MH9P$LWRCZT=^8^K9LPA[GAH]]BH.X`M/@)[,M.`ZBNW) MC>J$'D=.75\*U6Y-YBO_QK(]+YB(]8)^%W"&D;$[$;>!ZP@_@(_2XLE+V#>6 MH;4(@`\N$_SHZ+OZT?,C1EYDYB)>H6/=!AZT>=)"E*+*U:`_ZE>K8;RNU4YS M,Q9V>&.%<E:I),/D+))Q7>P<'EF>>RLM.[:D[V1GXJ3'F4F>.X6AONUM(C>R M#BH&0'4!^-$_Q;TFMC=9`2+2FGG!M>VE^^H%HFLW7MC+ND@^E"ZT#(.E/<.% MKH'X-V68\C\PKX(_@G]TOPDB"MO1T3$PX(46MO\2!M3%1:?;%D_^0QB15Z$, MCJ(ZK0L?I]7)]L&J<A$`O1Q)*G3X_,5!'?[Y3BN1$*3XXK4`D*ZLBTZOV3VE M=G<JJH_3Z;9C30)'UAXQ9#SKF]?B7U6:.&PU>]9YNWENM?KG;?$++_>^T^VF MK<0>D6=*!F1<E:B7`<$+:/<%6+18.AH$DHA)X`.A_=B:KOQ)C-:[JH05L#Q" M5(^UN/X%464F_0$(YX6B5%M$%96$Y>#5*Y"#5\?[J1PH8;,B]V<I;-_1WZ\W ML8R$'4*C%P4"FL1<AI)]$L%M2#I@?2]['VGACM=B>JJ_@73SL$P+S<"6+]0Z MF=NAB&S`B'"3M^`A$,$?+SJ#X<BZ&K;'YWUKT'[;&8[:@Y]X*1CJ1N!+:$%7 M"<>K5T>(_U$J'/]7^"N>_P8J/.46ME:XR!],F)V6I923)$K'X!4.CY\_3X,2 M^':(38GJ:8M6"@U8MKHQYHN`#NNL.>RTK+-NO_5>5-T:*"VUXG?K;;=_!JHK MOK+B(%E4H&(%R]A=(`.K"E#?NK8C=\)V-]6WIZQOSYZ(7G`GXKDD9($0TR!< MV*168$SMR605@M4&7E_:-Q#90'2S"$`2EG84"3`6H9#V9*Y913L)VDG$@7`D 
M:J]`>P(B`M-@^TDH83T-+E@<":2W'0<Z(XA!]4HL?NA1Q1S@6]C^1@!B((>X MG\&X2*PBZ=3Q`SF*(!2.*Z,D8,SP$WTRF[>EJ(+#\:5T<#*X.<\*;NH8@4G% MU:/GQX?@ZX]?:0/Q=W<*\9UHCD=]J]-K@:EK:1)VIA3F(A"N?QO<2$?84P2/ M#5E=W`'95E$L8B2B7`/L&&_KI332<KT$,@#I<1F@'U`_6/DQX_)86<=MMO&+ M7D4`!T55G#(RZA>(S\?VQRN0"OQ6%P<UG%?YPE'-\V.(*U_L'Z5!]%\"U0/Q MRR];?<%OQ+?43Z"5`^W6XH`24A<4($60CD3NS'=9)HY?0/QW_/*%E@D=IJ!` M6KY<QQ9(XX_PQ0]^`A#VE5]3PX!\[Z5<BCB$`%\$4W$W=T&8#:U)Q5O8RZ6T M0R0/$Z5A+(3$44Z9?2^:AD[OHD](5KZ8D/'H@D4A&,&JF&9EW'O?ZW_HU5@H MCE\^1UR5"4/TGSGR]IF_\KS*!V#[I;T1XJ78?W5R!`G==P(R/,[X0CF%A&\T M7X%IN14'KS`K/#@^>?$*AQSG.4`KF_E<_?CX@/=\BA2[`'L$MBF*(Z38,I(K M)TCU'_8"5^1/@.,H!LJL"&4%R8ZQU\#>M[VQ0#D""Q@"44=S%.((;%D8X]HD MU^;,/65;12M8;D)W-H]%M54C+,1%**48!M/X#AWC!0BO0W/JH@/IHI[F0VY] MO0*I%=<;<0G,MJ4GKO;$.WL#$%<7>W/\\#]@)R=[$QOSO>M5N-FS)WO^SS6T MT4\92C3:)JB(2:NUQR/X"W9/$:A(`74J-L$*3)L/1'*`6@R)<&,TKL^`')19 M;F`!:`+P@9KH!@"&!9$:O^#2;Z4O0QO`7EU[(*==2%']""PX0(,MT9RP@V5P MPC:RG`KI0G\HP&50B>&03'75CA'*D,@>^(@R6GK,.)*A!2Q39!S0#MIV'BP! M\CFL!KC<N9XGKB4ZANG*J\-<&"L^=$;O^N.1:/8^B0_-P:#9&WTZA;'Q/(!> MB!=X)1?,C`L+`_PA,&0#E(`%+MN#UCN8T3SK=#NC3PCZ16?4:P^'XJ(_$$UQ MU1R,.JUQMSD05^/!57_8!N$92@1**N"WT'%*G`C13<:VZT4*W4_`NPA`\\`! 
M0E2#SE!".`/6%D1XN;F?04A*+P#3BR@JZIV"CR206*!:_:M/G=Y;@!2L.AA* M,-ZA"R(2!_=PLRX@J1A)M,CBRK,G4C3$<(53CX[VD>`[*SF,8$:T)X&'/(]* M%1L1`#[;IJ&L0T;I@NUV/90\0_551`'YYH0C"E)TN=5B:'OK(\Z(I>?ZZ-3B M8"9)8&EG)P#:;G0?6B.*4$CT,+H!;M!"V(,*%7&\0_()"HBRM('/CERS*8A- MB/S5XIH,4M-$`99)D2`UCE`V-P"UYZF0+<&%03+QH:61%5X0W.`*)&>0X(4; MDP2(1;)EA$$;&,AKUY?:/M[9$1%XA>@&J%8DK*"!-AI$6!5D`RF!``*4/NX% M'`)YA!"B`3J(H3DXTCW-\W)>).G%)`P@M%24IM@`"9`E/H*$=&=B8JB">.!> M1.G()#7Z<#'NG$<:(ZS#W45*,8N@4`C,$N7S9$;[?U?NY`;`6/D:-M9`',(L MYW`7DLP`I2:2(48K"FUPX%KJT_HA9+)3=Y8O.48;`&B1;PUC#YJR\W4"7.C` MR+_0F%8KL^VD50W2JL*FX,ES39!F.0U8OP$Q$_=A]P,*G]M&D<_"?W@IE4J" M?*TFL1Z*W+'4Y]/"(.B$\`&'X/\H'[O&.`\8`Y*4C.'_(!`9R'@5HC$8#]LD M'XGL0`35ZW.4EJR)62V!%*]Q.1UPZ106$TYJP2V^4"O/F)TF7V":`H#B2X@G M:8;X_G5Y&JG6?L/#,`_&W74T>B)FTB=P8#R!8X7V'9:7=<!:YXFU9%.$^W4R M#=&NOFV/K$LL^N`"-9I14Q`S>1307TLIUX*\]:P]^'74FWC!-=C'_P(*:O1_ M&Q7'9'B`7!>@+@''Q.AX@$Y1CGQ4B'E*%&1SA9J!FTTMM&ZXX30A9"K]U"F> MZ$^GQ1'4F1*<,XL+ZVK0_H&6K^E5L;77_C@J]%,=0_=A"^\B/8I8.+%[*G(L MYN)'KT\S&M]K<B4#@3IDVA%"^`43,/(!F8&`T/31RF<2Z33)=/[&../B;!_2 M+*X,7D&\22F0]IL48+QS_6K%E"8%3I]SW60GI]69`]HS'/PG,7BW!"4\`)C? 
MO'E#OEV%&%B40>^+.$PHBKUF*^\8M<,<VA<N%7VF!?.@(\!.;]A#.B#Y1I^N MVJG4I^`KL0='XQ1!GNF4'AOCS5(^@$;4K8=E+(SZCBLF#1*"N63UY(.A,2#2 M*]?A\25D9ZL!K'U-V%H0N(@J[I`2VU5(T0ZO69@PAS]O7VAT<,77ADQKG_<C MK/Q30>'RH[7W2T9K'@^E'4)TB>S1FA3XR4<*\U*MHJK,*?:<JAT04,BB.KVL MHACZKI1)F=<IE2?0;G_S3<)SLR/E826QG(J.6D,UZ",%)(2SF.EP0+NWEX*K MIO?&W6Y!'[L00BNA9.%$``TY!-F.[Z1D`3V@6!0_'0J*FB)(VQX@J99>I$QB M#_C7X1\ON7KE!TMP5N2WB3!QGZ+GU[S-*<?2CU\+-(46J3(-.*RIKD)/(B$H M%]542N_1;5,?DB6*\K&+X2U5I!:^O&,Q!]^!Y$@XOLU$]<<01G#-`'X0EYQK M+I$`51*O$@[!"O.V;6:JC(LTX[=;)!,@0"&R$@IE;!7]H_N!$55S7BV)Z3GF MKWYC!.Q80/U9`@6KR?HUY44394\Z.$1+>YF>F7X#7QS!\F(.2-%/C49V@Y0& MB44E@2M8))"%IN-H,5`)(N2#.J#`U!,D`<.\)`,WF<X_6^UY:O9V&/3*%S9P MAOG,8U.P[Q51;(>Q)G\K7Q,`E"]XP$8%UZ`VRK1OV^BKJ7?FB#]-]\@4VPX& M'U.L6L`29ORD]`\&*`-,@W;H83:N^D.5,6?+<W'8;B5-TJ"\DC[0NI@Y4C;8 MGAF*RB%L3B+R44E1A=)8UYQ:C(`3A6O-)1X?8C%W]*XSM#"$I*#^+HGEL^K% M=P:*>4@E-$H":3B?I9\I_[N2EN*L[)B2(5_S@5;&KI@)"^(5^-[&#*D0:#"= M@5C@09H;;T'^H3B6LV'?A/7/5T,^*=RMA3SF/TD)$_E_;`!F7P=A3#=3?K]: MEFM?65I:SN5$$+)3"G*236939_UH>T:;#LJMME.8E",M%2F;Z[TR(UMU!H&R M2J"2Z_`I!HL:L7RJ_2[ZZUCI5NJPM?+P.JH48.35YI&`N3&>XKBXQIVD\K8J M&XOL70<7XTH\K@*<0@=9R6<L812;B]/!%%Y#6BV%'?,R^?($E;>WEW/4M:8I M'E?__\2+18=S7Y#$\<R%U>T,1SM*?SO<SV_PTGFC6*RR**',@!V5RR0=JNBS MUU\EET;EXGZF_G58NI43[.:Q\Q_E)=22K)[O5IE?00?T#OS[J7C7')Q;M)W5 M@U]#M4U=9&NE1JB0?.!<4Z\%V23`IG>`;T^?)L&`J3W;:[GN3T#\&K*V4JEL M-\W:;6YQ\;]"6[<ZX2L^`,0CJ\R1(9XUX5%*<OZM3.:P/>*;M!_O$3:'85)7 M7#*RM2Z5(UT.U*=ET.6`<7N-6T*B`HI=76?J@<-@`;I@AS,91V))9\01G@0I M+[H*9<1F4V,$7$PVYA3AUO96K)'ZH"M*;04*(HH&W4"M`BQ4$[IJ#IK=;KN+ M%:-4<'3O6??]@JZU%GF5O0?XM$2X/O[0;G7;/5JM+O9KHB&PJ(%G"?OPN]%( MZE!9&N,\NHG$\TBT1"X#9(0+=:N[.9Z-EZ`Y')^!V-)^O_PBBOW_;`_Z%OCG M0;,UVCYJV'G;>\"HT:#3&EG=_@<++SEHXC'W/QJ892N4Q84`XEJI=J@%UF4% MG(+5?K!*+.T8FOU4'RZ`DY!$*)/M@G'`(V=_8RS!;RWP]'\>!JO9/'>I`.U[ M0TZG<G*OBP9@HM^A7X/11Z8>7L,&.J?47&?)S/VDAL@^_56=-)46-"X!/M-H MX'U>Y:;T>78QQ2BS&D4CF`%)Z^*O4C<3OL7*B]TE0QCM!K&HI>OM.JK*'`72 M)IJZ5GI:4Q6.AQ&Z4D:KPIH&[3*%D0<*N8HTRB5\ET#"1!!(Y7_69K!4D-!B 
M-)'*[#VY4U&D,Y*L68UOSC8GU+]%P"';`H9#U%6=E%CLB0VRUFV>M;L86)Z8 MK<-/EV?]8G.KWQMB;7E4;"T9=]X?GW7;F8ZK5N9K\_Q\8`%OBXWGG8L+L\>T M[IF=6'+287RY]D[251AUZ(V7:6QQV;ZLTT75G+DR(DYU@YK$T([$M<29*"S% M4HDA\(EDULA"PS[94(F%QAA65Q8_TX3_4["D-M'V&Y:SAJ/^H+W#WYD4@?$F MT3(PK/7.:D]SCVZ_>?[`+=AQ9FC^@9SLG?PVE'0T61?)%7I*(W%>EHAKH3TP M!X^&2=9\]'D[Y`R/Y`H6Y0]VPL&Z<`+_VSC9<!>OUC4\.U$^5)NPAVIW4JG5 M-*E\S4&,1V%3-C?L^=*S,9-^>>+A#<<4'Q'0C3FZ"\82B)>$RGT)^?WU?1'Q M5CZV1RD<1(SM$6F%SC$B:7'WOEJKPOJ&]_004KII1HK&/(GPV0'>E<TIT]DJ MAA'*W-,@0_W*5:]2'@]I;5,(E`@]A469$*N^4[G*UE#T1:(,!RVD2;K&>%@R M.Q$1]DO4JLB5G(/2S3@=%65$FNNV"WN37L!5D3TNM...YBY*Z5@7?B"N-P88 M6''XWQT/R1)E(MG\J,M.;SRL&2OF+6*&6,HV:LT3++%*SM0H0Z02&E'00M18 MRA`?U1"5;*?!UZX;=-4V07Q[J)\-OS74\/.PR'_'V$+\OVLL6;&\O&+E1QLX M$@-*\8`Z](3,"20;O94RA8'G<%*7H%VT<[3A8S/%06;2^X5AYY_(3</RXN": M^%ZDO=122[%)]?X@D?$MJ4LER;Q(#']/=EE1=-R2!>7)B-LI0`W0MRNRMF[; M-/E/,@7\DWZZ!G&^R5OF07LT'O1.=@]J#B^M_E5[T.R=#S,1U+@WO&JWK!_Z MW>:HDXO`0$ZOK$XVK,.%.KVK<=X;X/E*:#LN%@ULC\X?],-LL-$+Q)3*$/0` MB9XK7>,E=(JG0KK]C9*CK1;,UI<0T47B%=@T"JM3&[]+::2Q&1UY0`>=@^`J M?#$*Z\D![ZCPH9$YO"$BD*'-%8^4GRT%'YD7EQ[!3M#]9C`CBS,!#SR3?*XT M7:Y$B+==\'6=`@7$%"OEZGZS>F:`FDK7N^F5`5_#I@B(WD:`C((WI$,JE":U M4HHV.KZ)C9H.PQ;T5F;FWDKE*H(PA*09#[@0$HGDBR"[VQ,I=F\>OQ%C?'T8 MKWP[EMZ&0UZ*@`M$1T>+KSL%;0DLM".\9J.A"H-K3RXBXZ:2\DYT/<&.R84Y MF?49DV1AM1*?TF!A8.7'>(L=S=S*=WV0+-L#(X<&+73M:PPA[1`OOYL<&_)% M<JPZB,_(<S.(CS#6QR,%O,^".9VI%5KTHDPDD0BY4"^C"'1SGO'^+K3Y1`() M<`O4A]$`O.LO5S'+E^\0K+C/!X8$18#@S(2"&!5CL6!":_FXPYV^TH=7V!%" MM8YZVI"HI4X(,4W)@4E/WYR`]@0"N,`63?785-T(^(A7$&]H=Q>I&8*7`;K8 M,YD)''1.;FY4,+40=WPV;215#CY37"@^BW]D)C,.5K?=>SMZ1WY>?%;5Y%(3 M#>%L<3HU?]8I$B1)6TTNA5N)O4RMK?KBR*D-.E-B6C,#Z;[U9!6B/H#"Z4.K MA.5\T(1J34\P(W[GID)TG6`G"D./[?EQ]9/I(E:9.B;U%_W!97.D\O/3_+Q, MT5_]2HHT>@E-6%JBM%9CF'1Z\[B(?W1_0AY_*[\M"49&MNL!#`I[<A$G]"@< MK&3RUP+PL:_PP*AYA3#$Q:7W#0;3XI5U$BZLDV"A,@OB@`L9I^GX)&(N=[W) M(FYM6XY&*R2U-`/A=@'A#+$_GVZ3:+,DYMXGPAI0HVSELOAN!]>,"PJE+'7W M?V&OW<5JD;S&2:Y,TB$26>-N$SZ4W)BD<I9+#Z(8V+6%%ZJJ=+Q:%Y[-41-C 
M@W]+`-LS%2D<DCWM4J)IEJNT\K2"Q1+?26J(<:],T)J]W*B@<)G>^HXC[7C? MM4;CB`Q8G;L>!E&M2VQRMU\=2W1^QWQ:?ML23`3FD%LHM5,A-E^"S)YG%PXW M\V5'(D&U4&+,U@BS96"=#U#E$(1B:(;13$!2"/?;W>7M!&*JQ1.?^"T@GM+C MWS50UU%P+`G_]2HFH<;T%%_DZ5I<\JSMVI[<X&LWCN[0#=H^W<N)U)O)BGJU MG5_9N"5=_O3`.#.X:HY&[8&6E4)=*R,S.;I001HR%!*V3)T(2(X/X972\K]` MJS:>:Y,]5/$MEYRQI:%:)/\9$(XKD1JP3F(]^(\"X0W(J;NF,*47Q.HA+L6> MRN8V2(%P(?JC$LG\XD9U]9<?3)+,;00+<^<-?';,^7-;WS6`:?A$.C9L>N)O M]OE,N/2\.CT@UD2E/P:!,0W])1$TOV3O(>=[+(R_@@0=:3NAKYL+_D,=V=-+ M)G7])_.PR54N)2<)>DH^,4O<3,)'O-W.#^66<?(W,41U'D"L1&]D:QB9V20< MF=.38@T1[T.CZNEX)8&6'K$-1\W6>^NJW^D!Z>BX?GRI3\JW!S8)G"T@;435 M(7KJF=X6P9!878!1,I7F5PG:$5WCV9`4+FQ'4J#^^_F=Y>EVUYA2)D.2Y*8` MS=F5D/,3%R#!AO.`DI,=D&3ZRRIV.%O1B]P\>JC%PM!RZV+<:XTZ?;#MP^9; M;0A.36VG'S4O_9L8_"<QU(`OZ<BR$P(>KPIA@%DM*ZWF*8$>:M)$&$6+W(+9 M`D1F3E;F2>+S1>NBJ4=0MEC7/!/S=K:<<WKOU#%2V5L'U>8EL#/0!6+LO]N[ M]N8V;B3_-UGW(<;..BM9I"S*L9-0J[ADBW:\95L^6K[-U66+-2)',C<2R9TA M;6L?WWW1+P`-8(:4G:WD[J)*Q>00@V>CT=WH_C5%/FG!IL,A[J2$PU_#P9HX M3.$,&IV=^2$/GOS3R03DRG"!TOM:S"8=J^ID,R3R3G8V6UW=(#JL6:CB60:D M*-V+NA]-G^Q/B/:$O5*/H(.969_A\\=O3P>CMZ_,HAS7N*$X3R=/L(./+"A: MN>)B/):0A!$H\#I*P7>"=LYC8:##79DVCFHVI(R]S^[:(`<5%\$T9.0)F($M MOVJ\#P[>3KQ,[[/,>U@O`6<[9.$4,18\\BKPP2GG\RLRVD^L4JW,7\75=,D( M)7D`0(!_/B(*2#(E\%\C0.65]LAQG91/._;3O>P;-;^3C>=7:MA\@B?^!(>O MU\UPZ04Q_#LZ)77?N%,Z*%6DA\+%L/O%O!'XC8;%)KHV/R)>R,>RU@!PA\UN M+-^!+RL?VSXG1,@(_'.7882C4:W@QH<L46B$\%Y#C#%J]<J\"!&\CEO"R9AG MD]75U36&;`4^(*QI.9XT^&'P!%C'\Z<C.'R>OT+>#LX;6XHW[B'VFCMW_J[. 
M(!'<*(P[./8/=%%]SO!;HO+;K_9DE!?5L2.GDZK[G[ZRI@X?FFRVXH)XCA!I M%#-<"%Z'F<29=\6FM5?24ZVF"M[=S9IJ2K%3M]\22P6B`O(G%!9J356;T1F> M)YM0FN<#EF<+HU,43&]I.H/A5*H2IKHUY+8!=?%!O!E]K5,,(B(+%80&PM$A MU@"VX22/\\S'^R-KN=Z49D:K<3D]$XRG+.,%82=.#L%[>I*061#9PPHLFPD< M1&ER3%,-.I@0A_KZ=!@'1&M6*>U!':F?#Q+O"L.L>=?^G'IWTMSN1+>;9.U[ M#?P\^FV2?"]=6[J>L(84!H$+WS%$,UWZ414>Z206GV$)B$TXH559@C83/;6M M*!&=DXI4;0ZA=V@:,"QD`L(Y\`8#P0VK,)B];(RJAQ_61+I[\U(VHRF$O1,X MA4V[IN,T?]:N,6'$?IAP5>*,46:CGIP.F-KAX^C%\U>#D=&1$`&%[57T"R[3 M\>#%X)2D?'[AS<G;X9/!"-%>[0M[MB\U"]&TCU2!%+D+:T3P;H'XFH;H4,(P ME0V;)K+>D!WO#00BI9;2:MW/IJ.QVJ2T)?AF3LCW(V4`9]0&9P&@,QSC]LV: MP.D"W@_Q@NIRM*0>X*0S8')8E#*+T\LZ?@"DG'1T59,H(P9XF@N1:^PX8X$F MR4\RS5/L,UM/0E92!9O93Y;=Q,H-\UU#Z[Z-%_H[P1/@KQ,_LM,WOV%DK=#X MQ$.3:?T<+"YE2KKE>?,?'8]>>Z6IXZW,[S4YZ;2\N\U6R]ZIB7>.U"SOI=H& MZC;*EV<"0U\EM'7C`J(+P+7SG4`V"X9^\H)@+#?JC;W[OB8LQM7LI]G\PPSC M`3(7#R`!!N@E06\R`C)M<C2+@T88NLLA=AP&]D)0,+V)L8+5$OY_85XIIU;3 M*CXNRWQ6S%=&1I[G=*'KN?^V?!FZ.%]FE]?HXX">%W)!?SF?+\PPRCEX:^W2 M:X\`PN=X#O<:`(2+6(!(,!25"5H/]P`E,7J)LU2HF]3TW<0MSVK)BPS.?#`? 
MAQR*,$)UV6H>7`2(AYWRR(:(B0[<[[?L(L+150&TL[;M!>2V3CI)4%V]I_YQ M<;:Z`*^6)2#;F2DSG\#!`J>I^.M*XES+=(@I59M"0/./B]75`NQ$P"<IH<%Y MQ4AC'0!)/9M7+CS0QT:/3@^"M].V-AOX9Y]PE1O(7+-1#D8.QV[H8>J9V4+1 M,W21#)ZA^AP\P[WI'_V2)<`\_L<_,OHDP>_X@`#F1;11'L3GB]+4>BZ3>7MH M)O;.I)_=CL#;-$>4!B(N&$-_>>JRA%-P688S43(5_-$8=W:B(\);WM:Y(2PC M*MS^<7;;@W%W91T:BN6QX6#O3&"<[EP*7HF/*+]]9J]8I=%3+Z56&N`C_K?O M4%U*07J!XX4[GW6[F=]];=3W$*`@_$O."P&`2IXC4C/"S/M5JUJQ<^&=J'/L M8+\'&\36<@$HCI%1'77'S?S\7)\X4PP9^UOAJM2!#BU9>]@OM/9)_OE1.2"W MJ!W3<_3-V0.+@?>J=1T)W[>^T3PI";_GC[ZW<U"6(X'#@A:3T>N7+=/;#CJF M^#"O6^30+HO;)DO&H^RVF9\/!]EM0UKPL3S0!!1+"_[<YM'DND_HI7QH77G9 MA=AY]O)ZURX,C9BNW26^B+H=[;MJZ\ZD8_C,-G2^K>]+HY'+F.V([2NF@?\Z M>F%;)FBE<);K>VEI*.Q?9CI8(B/\&3K)I-+8Q]I>&"'D<[H0->;['[L=:(1W MR$;0CQX>#Q(/#34\?_K?_:B2$USR\`5X&E<#3VOJX6`6[K"P,^`EFM"%I.'8 M5!2='J,78*:KUDP?)I$-CKS'P(3Y>4T#K]'CQY@+KRLU',[C-!]#_B)\(W5# MS>R[MYW:C#1T36IPR[7%9&9W55B7Y2^)795JUVZOVA9ICU&K;=]KP&.N->U[ M%1.!8(5FNVS'_)#$+5XH)\K6+5?HCK\A):8K>_MFL`G)`<+M&G*+ZO<\8(,N MFMK,3#37$DOS$?)9N&;4R3A8UPJA+-<:JN63QEI_@YK^<'AX>-M))5;.9>&V MRWOK("FB_CAC"1C8,PO-\)'KN#.YS<.[#7>.7ED6NN&C".IW)B@^\@O4?(=; M[W"_.BR3=)B0.G9T";O9I^@_-$F@B^/-R1NGLJ:L94;]T:H/W9I_NLKC/`Y( M4O>=PF/EHDF7"%QMN6)QL:4WZA2XOXBXV3RAX&0/0543NM=[=ORX`4.'IVT" ME5A$F36@V:J#U7)2E*5&T+&Z9N_3N[MFG6V'*_&HV&Y:,$<6Z?YR%=3;&V:N M>8>9:TX,]>SW('/-5U_W>WL;9J[I];[Y+7/-;YEK?LM<\W\T<XV7Q4,#8?S= M,UH$KH014H$336+-YI\'EK,^696&/4#,'!F-\#X$`EF,NM`1X;Y#(>L@;W=( MQL80),:X$:^`;.L#QG"AY1=!0]!/L0)`OOFL2U?\"XS3W?9&Z0[1=IAAP0BN M%E0[Y&<>Y(2]<:+RS\'<.IYC)!WEFS[790EYA`J?K+A6#%1RY?3$(_H(ED>? 
M#B.D9.`H?D]%5VMY`"YQZ)57Q<>E[H1?#M*Y4;G7D!@;[.VU91'[7%<*&]R_ M()6E#>[RVBDCZUTYR@YHVFPR&SR(TUVX.ZEY1T)N:EZS3F)K7O-(\R5&AV:4 MMJ^H*,[4G`240\@#ZG'7IM9AQ/:B[1+"\#8`+6$[V\)_*,E#4`8T"%U([O3\ M4KA==#%K_//+`;J?+B8X>WXIL.[J4D`782FX#-2E@,K"4GA?IHLAX6PG9M;P M5'.<H5"E)A+ZZ.W18`K):,I-./1K_!XC2X<O&WYE7Z9_\'HE,,C"#ZFW/?/, MV@[8LG45`:?<L!XH6E>-7T-D0L*??K2Z8/K/J`!QO]PD)-H]L@UW,HZ0H%94 M+W3[&M:"[`#>?0&4]&VS7J\URD#SFW!,!`-.6(%KZZCIP<VK"<TT<67*(+)1 MAV":DV1Y%-!EW:(<N55Q1;Q[AY"\:]MRI/LY37D;(-72\U=/AO6->/,4D%;P M:QU9>-3?_"YAJ3@JOCD1AZO=\`8$6-Q@O_1J]LNF+SHS8I*]O'UQ^EP8#$V] MAXLL4^\,7_HWR\7LC]LRBP1WXD$M)RIROZ4JVJ;#Y$@)9[OB9L!0#E-)A`+I M\-C]'R1`.5SBD\7CR.Y0P#,C&EUJPK""DR%ZS(9\62[U-F+)"=8<'$TLHG]$ M>#R;*4EB:%X/3TY/MK8BIRH=?X2H>O0_C%]Q_TYG]#][71HVA,IL72NU;WGN M=[4]S)*1`C4UH0177Y49V[H:/`^UQGJ:*PN0F[$F2.@B$XP?E%@O/O*\%!L- M6N/'W[B1IC8"_P>7@$9::9@5Z^ZOVMZT!9L4:$U+TDC+46UC>VZAP8K&:TMV M5$UC=WU*OX$EC<U9N^/6TW)*>:"_S7KW^P\>]/<>;FA->_C@:\^:=B*H?(91 M48Y2S(L*`"WHWZ2@Q:)$I[^9TGXSI?UF2OL5FM+NW<5M<HH&*DP"S/F0<7!F M]BOK;UTQ:!!;PC`$G(-2T"_>W[5\R2X7G`2:<]>("KT#$`A[A&W?RW8R(TEG MV>%W_M,#*KFSHRIY'%:R'U?"[^F?\36RP.'WR92PSQDXDJQ$;1NZ0)?W_0R= MKR#?L.%65W-V76YLW7M*0]C'(:C&IQ)991$JI`L$:B$E]T5`9'2GS'J[N]06 M=-EH,V';L3!7Q8&4Q5]7TY(9,"4``80Q:I'NBZ:<*UFR-$4-T+)#'ARJ!KN' MHRCRTFQ4X./?&R'V/67>@*S5B.&9HVNNO5IQ>)-MN;S#FLB]UQ\S8R5E/?2M MG:E5<FMTW)?>:')B\DE0U<X.?`B6>M!W@ZJKAM]4"]]$5N`,[R]/N*!V,7GW M!%3@O:Z'_%3M@;M;KK?;)$:9OMI?O=&N[Z9M!TF'K=38)L%9,)!H=OSF=099 M*`V?X2P"CBD0"`C68K'#M5S`?M(`]E(LEX)5:PL##AD`U1&Q2EE%SS:9*37& M>E-5F$,Y7S)X,,18=Q`V!BNRB6>=,D8P,&IJG]5/K699W6P_NXM[WDRV>P>X MCL_@NH?9?2HGS3AL%G-T%K.+Y3LC8DY6&-2"'N&[8\$J9>A`U`UEOVAFS-/^ MP=1HF!3@)<XF<QF"Z0+T%F/2]5?FDCA`^+[M?MN7?GI;V>N#D*_M1C'AC5/3 M%'R%`G+2_$?[\W*OMVM2K[<W3*8.NZN;;K:D?.W!L_$\4?#<)ETW,O'[_')J M#N-B1("/N`D0HI'A<%X>_0!&YU?/!J,7)T_>""XD"::T@%,7^F:$.7AA.'AR M\LSZ`QD2>A":!([>GIX8NC*:_'^^';QZ,AB9U[*M]+M.'8>@'ALV"I`]``>+ M2DQ^!M(7\-E&(9Y$$O,'B`4=YBEHR.$4:L/3%X0G+[B/NCX+JLG78R\'+^7- M1%,L&7=L.E<IZRZPJ!K_DH=5)V@7/&?:<KE#RAB$]D+W!`J_QK,^CM,RPNX" 
MG1D._-D<2B;%8J-1,Z@IB*U_*\IY-N4,;;/Y#)EL960W`+(T8_G3\.F+(\8L MGW6Q-`6!\#RAA&@$.YI[J@:G$;K![([8>B<[>?H4`J,ERQTY"1N&<W:]+$3% MH!KDC=?#@6IZ>LXQ>0XXVLC!DZ)2[9AEHFJ`,W)KIKR]:D0Q9.)J">(MU`J! MV$R3.*I`+3,JWX?R_#*_Z.`0.SR,#O0$O:R4MPH5M1Y'6/-5;HZL68$)AK`. MY9`45G?00#@Z."+MPGPEF(QM<6BB/J61Y\"$QY9M]F@=#EX.7IUR/6!]!/'I MNVROC5?78'PU/<T>95WQ^3[$$OU,?^^2-R[Z1'+7I047N6!]*X-N'`_2W?B# M=`/_-N[+3DU?3#/-?2%/X>89:>X*34/8(]T5:F5M3SYS4F@2UG7$FY*&_%;2 M.;JH^*4)Q_7B%Z<;ZLJO@&QL1WX9JF%<23GL=.Z]^<Q(>+&;*$)HL0LK8Q9A M7@659Q12-K!.:M7HLLROLR/,,(B9'-09"`H$V@C#VHY,=;OVU!C*L8%ZD'.9 MB5Z#)/*V\F+"-=!Q7'GFBVJ%[A&@NNC^),%-Y<0!Z<8/2LP1%I&P$=F]<76% MQT0Y*KU,J)L%B>=A[D#_>R"^9'=S!Q_$!XYI.^EE*]W17K4>SBK'1M$7,`&, MM(P#%;@CRWS)_'LM.UK;N#W:8$007<H.(YI+(>@@V5/(0K8LM/T![6B*O,Q# MI`L%(X=P+B-+JEM>Z!VLU\@.B/J#Z^65<3\?0CBO1/W[M<PFJ>I]K)S3\MH& M;[,D'X;"._!#6">!.M13PI,M<6/N(3A/J3!%7/W_F?YY5TV\E4_5`M643<HO MMJR?$XHHS,\?9B-O_(1;7B`-!^+Q6'!F,:4C56'D2)+5<K=UH$V0I<FC7YS_ M-ZO"&U2H=C'@Q9=TG6RFD9P3N%;QFU;+24[F^6)Q><VUC"[*^6IAEE[[F@=1 MKXW$WK`P>^'"H*(Z(H"%,P`?0I@&0/6N!37P]M*?"LS'#6'PD#J)3.V3.2(G MH:D$AX3IJY&=D8S?"LQ_H%E-*#?O:@$3&F`WTB0E0L3M=N+8[BC9,FY9L\]P M?".S-WAOVA<[9M#A'H;G-KVL5S(-WI"<K^YWF/06CTCR3&LUEM.,T)]I=&>$ M`)3;E$(WX$>)>U4/V]*F^*``/#>6-M]P*P_5B$+B?AC%'_L1,\941U"7T?'Z M39U`G]F(P9,0\4F\;1UK\L='Z:+1-@FB%`*EH@,>YOQ5Q[8>.+P`4"=O'#:- MI)0U;&!(V8ZWN%KA0N%+,5D%&UPBD[P<RY8H_,5HI5>BCIE+?]1ZT*C$U2XH MK];']HEA<\"2DZ)5C;90:F"A:%DB?-=!M`3(=3A+9X43X'A,OBF9<H9YFH'( M!2@8MX.R)6:@B#9YZ`4\/HEP,2ZW+9^/F!?L.N"10);^K-Z'&C\4=!6HK@ON M=?'WNNRX058KVH\Z9E%OHUI0'!88[T=D"\>\H=I>/3N&UVQP>-27`W^E&(+& MNEDH-*J+^T2,RLT^N:WS8/;,IB:A7KR?/B0R)*9V$34:]-GG8HF3-H@"2+$> MWCRZDQ@U!2M/"DDR&6LS4)>B0:0EL&/:!%QV+W5%!*[@NON:)&%KC,L$<KWK M:3J4"`'Z"M?U(C5KLKN5$H]@N;X,)I#VWU[C/GA^+@GE?H);HR5&+'3G"Y#/ M%M<D;G@W*ZCZ39=5<>G4(H'@60*C`,<`NAHDG'ZZ@!1"09R[#.)=;PM`J=64 MDF,%9F]D]OP20&?"`P^%Q@2^6@PZ8[B!'31+?:QF*:79`G\B:%^@+1M%^#4> M.7RM^>H$E[8HR,Y/W9C/BDJNS\>D@A/03Z2>_F!-J"Z)O:C&;:<PD[>%T[.I M89A'-N+V8)ZP:,)HR[-+I(]!C)Y)VZG1,L3F'M<KU%4!%X8!%YG-W9F&>V>? 
MS:]F`6Z@4L<(.G5*<8JQ[JMZ3/-U`:QA^A(R$?O&8VT2=FJUT_>5LN]KW(%^ MSEK^2*P`5<-516@P(!:8NA_Z<QQ?N^^XH>]E3WZUN%S[VWX)YW0J#M2T;!3B M6J\VK;$EU)S/!UYG*T<34?5JJ@*`PHN,$0P@JU$R'0'7^HB*0=@J2)""9=2W M(B48)K98&PU@BB"'+7CWJ$,C+2=D*(K2(<C7)+G,,*+)Y#-O.W><ZP`G]$1S M77[E)U3#/=_;E5HP$\44?8+@5KOB3"L?(&\8UL15]M#7Q]:W16XZ=KGG[I6> M7/P&;BMNJ/3RXVUR8;(^![!Z%4.HX;M4[LFVG1LQ_V`KY@MVT#FA2"TI9Y<9 M'DFAF1&]K*`6R*XWI*R_7D7.@%C0Z]%M'56ATKT))!)Y5T=6/PV-]%3[8MO2 M4?%@)T1.Y%$%]::U(^@BL'O(G446:'*/T8)+;K=#?+D0X#Z1+WC4^;CIQW'3 M)$602X^DPGO'&<`\=X:6K*<':V>7AP0%'PPP3^31-NS:2;N^*:]II+5#>5(S MB^!"^*)8_MZ<\&;7JR!.'(8=J<;LLREX7!:DP.\&WUZ%N:W]@?FLCO["(6J1 M!\I]82KH^8,[GE:8'='FTLY6,[,EH<>+$K0<W"?6+PEY$W8-137PM;F@^V+H M]X?\VN6EECWE.OE%,9O(K;9.B+=661H2R&"D#5HB)$"6:/GX4.VE523W!.X0 MW)/$+3&=P2IO!C[RLV74NA'8WY+7S+X*UK,J6'J0>[')S]&3\$VS,^:SZ3B_ M!(^+E,F.9Q/::S#0*8VP;LX/5-4T`MQ;]R,:]7N-NI3;P7"LB\N;XW,H82!' MQR99M@5-WV'W`JAG\3&'$ZU#M1ZFG3#A-WR;3[BO5,'[5/`K<DRJ>]-L[[.Y M.4*F,]2AXJE-Q@9$ZC&IIBVZS$/QIQ^.NJ/*A%/2)PV;R_A1H_633O2<)6(\ MK3#GL4BS'5)+3WUC*J2_*[HS<*AL8?'8AL9=,4<=#,_\HR_1Z?:`1"HO8[.# M&H7>1?>;]%;BJM6U=BO1GKNNB!K$OZ96W;N)9F]ER6':2]^?>91^<[?B!O\= MPUQW!HFYA>O1[-`WQI`]:3_Q(UP&Z3NOT,Y)?Z""1;5PWS:_M^W5F+E3QQ0$ M9=[#Z$J0=$,3"X8F'KTZO<&9E=*+$*/:QSZKK24PRIB3X`*!C*F'J%#(Z0!^ M6^$)\:+`H`S(;>8?X-I,F6K<"S+E<5"WNZP"2Y].R0D54RD7,TQ;#`I-,5XM M0?07$P)EQPX,(%9;2#@.@`TE7TXK-J!-BBKM11:BQ>XKF%C'91U8+/-DBN1, MB[G:$5">:%%"W`,UY'=*P/A,X<$7T7T=G7=YG7[NJ^?I@R.ZK\'M83=["R5D MM-E!2BL/<AL6YR^KJT5U`)WCY)+58HKH4^0.V$+"!YCL8H+^Q&7Q>]`F09+$ MM057$LSK2;K8Y1(340;Y7F7XB72;?WS[\K5+MYD<7\-9%QQU%D'56\D-C]6: M?B*S]=%?DT':4*H&/-%V2?,)^T9'#X^>^V,*Q3<S][<.V4[HG05)`7'-T;_M M-;.)'V26L;HU(VTD<OH]$[\*:V%TQN]'BAK$D>%F3J))5X?@EF]('OGLKXIW M7N[VZ^R:!FJ-HMY%&[J,'2+K!`\S`%ED_[&N6`3K@0\!K2R?PKX!'^(2+>L8 MG&$8WH*\9D#EO%I@"ETS:69,]Q;SRJ6")6]IBH804U('!IF_-Q6#"MA!IVHP M>ZP6UB$`_L"K?*PRMVXNQ!T_?_,ZDC9,87%ANW4H[KDM;]VTI-/Z%&$N:+A% MS3:WJ@6=ULV%N72CGSK6S46ZGVVP4BA@E)`$@LXPL7<B)L66[\'3]H/SGPU> M(7#(EM5QC![%_>FS]R3C#PL6*,D-VL&Q[QG/8R#R6YG=GVG&CE*DLO\V"J-! 
M22LKK"G',R,Q`/2GRA#0:\!OK:"&0AD),1#3JESI.\SP@DN95OT-BYL>G%!S M5OD,S,G)/.LWE9'5*JM!!K)SY"P0WCIB'`SD:_Q0@4'O"D**V6N.&-$$T\O- M#>/'KE9B.J-P`\QT3>DM(#;J)^"(YYPT!")U0SL47X)[5P'?'28#99*F->4Q M&V<G>DV1BS>\_(-N.[15<2ZL_"LU7Q[&-.>`VVGJ,*R]FIYQ+-U-[MHXQI+` M1(,+MT^]5G/W7LOY,K^,;UML,HK-;ZSDH9\'4^=CDW0/^):H&8-9A=%+-L%Y M";<7G`1%+BK4=!43,M'*R(BTJ@+0>.7VV4R:F>LKNCZHWN4E$!OX#V-*[^IR MOK266Z-.24UT%4+2;7`MC6*6V9O?'PV/1T^'1AP/<W;7*85!%B!T__4U&:W# MI.-)-$91K`4PK]`KL71'`5QUTY4M".8?\I)H,L)TH/L;=B2?V^.?I]_/;313 M-+/V7A@ZW&9$#:VC84+'+TLV=K<DF0"=%GXK?,VBIRK,&@$#_6FZR.;OX=*A M6!)0,JS=%*^^R!9`UW9!AIV0SX9S'UT%D3+-I[>Z!&JZ_[%7/YPX)Z(`MVHR M()I4LV`W6;A6:M7P"+CIRBU5?E7=6;3`^*MG&;=KA08K.\"E\V`0]9`#[1R& M?$<!A=L=%KS7R.$9*YEW_)AP0#.).,4I#?..<T0C3RJP#XGF2\^^=JKSTT%; MN!,C/"`@C!>)UR&<9DR0ZN?!`^P%AOV&F[/&`X)!&HH@F;F?FKRC(+<W/"UJ MLHC;!QHDW!TIGO.&/6'4+3Y!D<+@DH/7G-?ON&.*,9[[8V_][DSZ",F.6=E] MZV($?1Z<4`KG6BA.^BP$A6ZLGH-4AN2XA9>),)1WL$LM++>T3G=T_@D#M^N0 M(.QU4;[+%Q4(EHQ,4JXPLW-AB&^"%CZ[<'(U"7G((3-QA28<\?]V&<A)"I'` M>@+W<68Y/!5S<[`"<@P)+H_(T>`BGR+QP?EI:/TZ0\",J[PTBY1?[CIA&+T2 MB&?89HQ>.P%-U&BL*"J"TCJ=%"5%BP;S0?G>Z+RF>TKG$J53"VMK()[#/(,C M0!W"K')&`]DSND?2(<01`!\%#E4>OSNO9R+5G</-9*]-Z=DPCBLS@]?U=!U3 M\YU)(M2I0J+&3D;>;?CT<W@@TH1*,/GFU\@!TZQ/(.YA_WIA&%7`]VJXFGYG M+7]S3##-WM:O[A%UOV\8E4_K%:3I<3L%5ML>JN&X/#[E=@?T##-@^&6-WMX[ MP$Z#_K1''[O=F*?&\2N.%JANWSH8'"'J]*@_8$RM7TI17]5T&=@]-GZ5HXYH M\?XE4BO`_<<Y<J08._<*2J&W@\_.*%Z$/D'>>I;(\/O%I>%^ER.=>+RA`.:] M]\_;R$(:.9JFDA:@-ROE7,\).@91JUK9&\/MV:6F*@B'2[MP(,P:0*V$L&U< MP6F8H%)\4<^F%T9[-T?%ZM)0Z1+)Z++(WTLO\&T;7E2IS)>U7M3I++S1E*<G M29^N>#+$IX)O/\`NA@`-#K;*?%G,R^5J-EU.0^*(^/WFHE3@=A&E9H\$B)MQ MA>#.S%ZL_3C;]`S(VH@\WL]>YC\5T,;N=-8^_/R_]O#)&YR"?G9O_+ZZ=S$> MWRLNQO3!:ZOSOFVZ54Z+]W3\OY\B/EUO]_Y7O?9D>GZ>=<=9]W[6763=$I^J MG@)TH?>]!>"`]WJ]>[UOLKW[_0??]N_?;U%=@&R8+/EMMK?7?_!U_\&#$!@Q M.WG\QS>&%2WG1GIXOSNWD'CS;&F6T?R#B]-=8C<>/GS8>?CU/N$H9G2[UITO MIH:B=N?TC9!E^`MFDQW;W^@4LS]>X5N7XROS_Q^A.B-C8->Y0+Y<EN:0-5]_ M!WFK1O.SOQ3CY0A)QSPRFM?P:`3]WQ:JD(JNSL9&]#3?)M>S+D#YS:!+U>(R 
MO^[RL"[*?/$.'IY-EU?Y8A>8:#5?E6/X\5U>O<.Z;F7$<^&%BW&7P-*P0\^> M/=DF9]3I^76W7'[<G9N6S7]?@/T8R:)"GQMX'^PT#.)T>8U!19COEVUC9ZOI MY5)<0RG.$<H:YG2)]7&``C"D\;B7T3^+RQ5LX1,07#],*XET7]I7V8>4+E8J M,MD9+@8T(LN(^)?_GY=1@$3AK?/_U<L:;FLZW<R0^O)Q;$;WY.35T^?/1M^; M(3(6%6[KWH,'7W=Z#Q[N>P"I_.BA$`E`EV)U^"%=F7DX/'T!3WZW=3H<#.`3 MW`%5N^\R!5!%RYN98IY@1>^!9`Z?0'4T5'C1K8JE>8%I[ITP*C@8$<)*JL+Z MI3$'H)45'\?%PKSH1&4<A86073.01(V,(*T'I)IWHP#R?CV$3^T=IK%?HNU, M]%YL7SY_4OO9FO;YITEAO\#^K^E8$QD`UI?Y6!9&V7X7DH-00?M?(F1-Z]C@ "```` ` end ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 17:00 ` Michael Hayes @ 1999-11-18 18:02 ` Joern Rennecke 1999-11-18 18:27 ` Joern Rennecke ` (3 more replies) 1999-11-30 23:37 ` Michael Hayes ` (2 subsequent siblings) 3 siblings, 4 replies; 94+ messages in thread From: Joern Rennecke @ 1999-11-18 18:02 UTC (permalink / raw) To: Michael Hayes; +Cc: m.hayes, law, gcc, amylaar > OK, here goes. My first impressions are: - Your code will take huge amounts of compile time for large functions, since you process each register individually, and ref_find_ref_between can cause a nested traversal of parts of a basic block - making the total complexity something like O(3). I think my scheme has O(1) in general (there could be pathologic cases when the hash chains get too long, but that is unlikely, and if it ever gets worrisome, it could be cured with variable hash table size or expandable hash tables). - By processing each register separately, you miss register-register copies and three-address adds. Optimizing code with such constructs was actually the main objective of my patch. It actually reduces register pressure. - You can handle hard registers. Have you been prompted by some real world code to implement this, or was this an instance of doing it because you could, and with little extra effort? - Your code can generate PRE_MODIFY / POST_MODIFY, something which mine can't. - I'm not sure if your code can rewrite (mem (plus (reg ..) (const_int ..))) to use a different offset? FWIW, mine can't, but it is feasible with some effort to implement it. So right now, our patches really do different things, with only a small overlap. Of course this could be changed. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 18:02 ` Joern Rennecke @ 1999-11-18 18:27 ` Joern Rennecke 1999-11-30 23:37 ` Joern Rennecke 1999-11-18 21:28 ` Michael Hayes ` (2 subsequent siblings) 3 siblings, 1 reply; 94+ messages in thread From: Joern Rennecke @ 1999-11-18 18:27 UTC (permalink / raw) To: m.hayes; +Cc: law, gcc > - Your code will take huge amounts of compile time for large functions, > since you process each register individually, and ref_find_ref_between > can cause a nested traversal of parts of a basic block - making the > total complexity something like O(3). > I think my scheme has O(1) in general (there could be pathologic cases Oops, that should be O(N^3) / O(N^1), respectively. And on second thought, your code is more likely to be O(N^2), since instructions tend to have a limited number of registers they reference, so that the ref_find_ref_between calls can only be up to a constant multiple of the number of insns. They will probably be insignificant compared to the effort of scanning every insn as many times as there are pseudo registers. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 18:02 ` Joern Rennecke 1999-11-18 18:27 ` Joern Rennecke @ 1999-11-18 21:28 ` Michael Hayes 1999-11-18 23:06 ` Toshiyasu Morita ` (3 more replies) 1999-11-22 23:47 ` Jeffrey A Law 1999-11-30 23:37 ` Joern Rennecke 3 siblings, 4 replies; 94+ messages in thread From: Michael Hayes @ 1999-11-18 21:28 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, law, gcc, amylaar Joern Rennecke writes: > My first impressions are: > > - Your code will take huge amounts of compile time for large functions, > since you process each register individually, and ref_find_ref_between > can cause a nested traversal of parts of a basic block - making the > total complexity something like O(3). ref_find_ref_between is only very rarely called and is only used for {pre,post}_modify addressing modes with a register increment. Many of the register ref lists do not need to be computed. However, I found that this simplifies dead store elimination; this is essential prior to autoinc processing if loop unrolling has taken place. Your code does not have to worry about this since flow has removed dead stores. > - By processing each register separately, you miss register-register > copies and three-address adds. Optimizing code with such constructs > was actually the main objective of my patch. It actually reduces > register pressure. I have rarely found these cases to be important for autoinc address generation. However, reduction of register pressure is a good thing. > - You can handle hard registers. Have you been prompted by some real > world code to implement this, or was this an instance of doing it > because you could, and with little extra effort? The latter. I find it sometimes mops up some of reload's spills... > - Your code can generate PRE_MODIFY / POST_MODIFY, something which mine > can't. Yes, this is important for DSP architectures; especially things like matrix multiplication where you want a large stride. 
> - I'm not sure if your code can rewrite > (mem (plus (reg ..) (const_int ..))) to use a different offset? > FWIW, mine can't, but it is feasible with some effort to > implement it. With the information that I collect, these transformations would be straightforward. More importantly, I would like to reorder some of the memory references to improve autoinc generation. Does your code maintain the LOG_LINKS? If so I would like to try running the instruction combiner after the regmove pass. Alternatively, could your code be easily modified to remove redundant load instructions once an autoinc address was generated? Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
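To illustrate the large-stride case mentioned above, here is a minimal C sketch of a matrix-product inner loop. It is not taken from either patch; the function name row_times_column and its parameters are hypothetical. After strength reduction, the column access would typically become a pointer stepped by `stride` elements per iteration, which is the kind of access a POST_MODIFY address with a register increment could fold into the load. Whether a given compiler version and target actually do so is an assumption here, not something the thread confirms.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of one row of A with one column of B, where B is stored
   row-major and `stride` is the number of elements per row of B.
   (Hypothetical helper, for illustration only.) */
int row_times_column(const int *row, const int *col, size_t len, size_t stride)
{
    int sum = 0;

    assert(row != NULL && col != NULL);
    for (size_t i = 0; i < len; i++) {
        /* row[i]: unit step, a candidate for plain post-increment.
           col[i * stride]: step of `stride` elements, a candidate for
           post-modify by a register amount after strength reduction. */
        sum += row[i] * col[i * stride];
    }
    return sum;
}
```

For a 2x2 product, a call such as row_times_column(a, b + 1, 2, 2) would combine row 0 of a with column 1 of b.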
* Re: Autoincrement examples 1999-11-18 21:28 ` Michael Hayes @ 1999-11-18 23:06 ` Toshiyasu Morita 1999-11-19 2:35 ` Michael Hayes 1999-11-30 23:37 ` Toshiyasu Morita 1999-11-19 8:49 ` Joern Rennecke ` (2 subsequent siblings) 3 siblings, 2 replies; 94+ messages in thread From: Toshiyasu Morita @ 1999-11-18 23:06 UTC (permalink / raw) To: Michael Hayes; +Cc: gcc Michael Hayes wrote: ... > > - I'm not sure if your code can rewrite > > (mem (plus (reg ..) (const_int ..))) to use a different offset? > > FWIW, mine can't, but it is feasible with some effort to > > implement it. > > With the information that I collect, these transformations would be > straightforward. More importantly, I would like to reorder some of > the memory references to improve autoinc generation. IMHO this is bad. Improving autoinc generation prior to sched1 may have the nasty side effect of reducing sched1's ability to reorder instructions since sched1 is unable to recalculate memory offsets. This problem with autoinc generation is basically a subset of the address inheritance issue, and I believe a better solution to the entire address inheritance issue is to eliminate as much address arithmetic (incl. autoinc generation) prior to sched1 and regenerate it after scheduling. Consider this code on an in-order superscalar processor: *dest++ += 1; *dest++ += 1; If autoinc is generated before sched1, then it will generate code somewhat like this: move.l (r0),r1; add #1,r1; move.l r1,(r0)+; move.l (r0),r2; add #1,r2; move.l r2,(r0)+ When sched1 tries to reorder this code, it will try to hide the memory load latency of the two memory loads but fail because the post-increment memory stores inhibit proper scheduling. If the target processor is a typical in-order single-issue scalar processor, with a memory load latency of two (Hitachi SH2/SH3, 486, R4000, etc), the previous code saves two clocks (two add #4,r0) but loses two clocks in the memory latency for a net gain of zero clocks. 
If the target processor is a typical in-order superscalar processor issuing two instructions per clock with a memory load latency of two clocks (Hitachi SH4, Pentium, R5000, etc) then you will have saved two half-clocks (two add #4,r0 instructions) but the resulting code is unable to hide the memory load latency so the processor stalls for either four or six half-clocks (depending on pairing) for a net loss of two or four half-clocks. I believe that the best solution to this problem is to "flatten" all the autoinc addressing modes prior to sched1, e.g. pretend the target supports large offsets for memory references and convert all pre/post inc/dec instructions to offset memory references followed by a fixup at the end of the basic block. This gives the scheduler maximum freedom for reordering instructions. Post scheduling, address inheritance can be generated. Applying this to the previous sample would give the following code prior to sched1: move.l (r0),r1; add #1,r1; move.l r1,(r0); move.l (4,r0),r2; add #1,r2; move.l r2,(4,r0); add #8,r0 This code could be properly optimized by the scheduler to: move.l (r0),r1; move.l (4,r0),r2; add #1,r1; add #1,r2; move.l r1,(r0); move.l r2,(4,r0); add #8,r0 Post-sched1, the autoinc could be generated: move.l (r0),r1; move.l (4,r0),r2; add #1,r1; add #1,r2; move.l r1,(r0)+; move.l r2,(r0)+ This, IMHO, is a better way to generate address inheritance. Toshi ^ permalink raw reply [flat|nested] 94+ messages in thread
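The flattening described above can be sketched at the source level in C. This is only an analogy under stated assumptions: the function names are hypothetical, the real transformation would operate on RTL rather than on C source, and an optimizing compiler is free to turn either form into the other. The first function mirrors the post-increment sequence, where each access depends on the pointer update made by the previous one; the second mirrors the flattened sequence, with constant offsets from an unchanged base and a single fixup at the end.

```c
#include <assert.h>
#include <stddef.h>

/* Post-increment form: every access also updates the address register,
   so the second load cannot start before the first store's update. */
void bump_pairs_autoinc(int *dest, size_t pairs)
{
    assert(dest != NULL);
    for (size_t i = 0; i < pairs; i++) {
        *dest++ += 1;
        *dest++ += 1;
    }
}

/* Flattened form: both loads are issued first using fixed offsets from
   the same base, then the adds, then the stores, then one address fixup. */
void bump_pairs_flat(int *dest, size_t pairs)
{
    assert(dest != NULL);
    for (size_t i = 0; i < pairs; i++) {
        int a = dest[0];   /* move.l (r0),r1   */
        int b = dest[1];   /* move.l (4,r0),r2 */
        a += 1;
        b += 1;
        dest[0] = a;
        dest[1] = b;
        dest += 2;         /* add #8,r0, assuming 4-byte int */
    }
}
```

Both functions leave the array in the same state; only the dependence between the memory accesses and the address update differs.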
* Re: Autoincrement examples 1999-11-18 23:06 ` Toshiyasu Morita @ 1999-11-19 2:35 ` Michael Hayes 1999-11-30 23:37 ` Michael Hayes 1999-11-30 23:37 ` Toshiyasu Morita 1 sibling, 1 reply; 94+ messages in thread From: Michael Hayes @ 1999-11-19 2:35 UTC (permalink / raw) To: Toshiyasu Morita; +Cc: Michael Hayes, gcc Toshiyasu Morita writes: > Improving autoinc generation prior to sched1 may have the nasty side effect > of reducing sched1's ability to reorder instructions since sched1 is > unable to recalculate memory offsets. Yes, I understand where you are coming from. I believe the problem to be a weakness with the scheduler. > I believe that the best solution to this problem is to "flatten" all the > autoinc addressing modes prior to sched1, e.g. pretend the target supports > large offsets for memory references and convert all pre/post inc/dec > instructions to offset memory references followed by a fixup at the end of > the basic block. This gives the scheduler maximum freedom for reordering > instructions. Post scheduling, address inheritance can be > generated. However, with other processors, directly manipulating address registers stalls the pipeline and thus the scheduling characteristics can be completely different. I suppose the scheduler could take this into account. Ideally, the scheduler should be able to freely flatten and reconstruct autoinc addressing modes where it felt appropriate. Of course, this may not be straightforward ;-) I suppose as a compromise for the interim, we could aggressively generate autoincrements before sched1 for some processors, but defer this until after sched1 for others. Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 21:28 ` Michael Hayes 1999-11-18 23:06 ` Toshiyasu Morita @ 1999-11-19 8:49 ` Joern Rennecke 1999-11-30 23:37 ` Joern Rennecke 1999-11-22 23:43 ` Jeffrey A Law 1999-11-30 23:37 ` Michael Hayes 3 siblings, 1 reply; 94+ messages in thread From: Joern Rennecke @ 1999-11-19 8:49 UTC (permalink / raw) To: Michael Hayes; +Cc: amylaar, m.hayes, law, gcc, amylaar > > - By processing each register separately, you miss register-register > > copies and three-address adds. Optimizing code with such constructs > > was actually the main objective of my patch. It actually reduces > > register pressure. > > I have rarely found these cases to be important for autoinc address > generation. However, reduction of register pressure is a good thing. It gets pretty important with my loop patches applied. > > - Your code can generate PRE_MODIFY / POST_MODIFY, something which mine > > can't. > > Yes, this is important for DSP architectures; especially things like > matrix multiplication where you want a large stride. My patch can also be modified to accommodate this, but for an efficient implementation, this wants a balanced tree instead of a hash table. Just like rewriting (mem (plus (reg ..) (const_int ..))) . I have some AVL code, but I haven't assigned the deletion to the FSF and I really need the code somewhere else, too. So maybe it'll be best to go with splay trees. I understand we already have code for that in the source code tree. > With the information that I collect, these transformations would be > straightforward. More importantly, I would like to reorder some of > the memory references to improve autoinc generation. Which gets us into aliasing problems... well, I suppose it's easier these days, now that the alias code is centralized. > Does your code maintain the LOG_LINKS? If so I would like to try No, it doesn't. > running the instruction combiner after the regmove pass. 
> Alternatively, could your code be easily modified to remove redundant > load instructions once an autoinc address was generated? Could you sketch an rtl example of the transformation you have in mind? ^ permalink raw reply [flat|nested] 94+ messages in thread
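One plausible reading of why an ordered structure is wanted here, offered as an assumption rather than as a description of either patch: a hash table only answers exact-match queries, whereas rewriting an offset or picking a modify amount needs the nearest already-known offset. The C sketch below uses a sorted array with binary search as a stand-in for a balanced or splay tree; the function name nearest_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the entry in sorted[] (ascending order) that is
   closest to target, or -1 if the array is empty.  Ties go to the
   smaller offset.  (Hypothetical helper, for illustration only.) */
ptrdiff_t nearest_offset(const long *sorted, size_t n, long target)
{
    size_t lo = 0, hi = n;

    if (n == 0)
        return -1;
    assert(sorted != NULL);

    /* Lower bound: first index whose value is >= target. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == n)
        return (ptrdiff_t)(n - 1);   /* target is above every entry */
    if (lo == 0)
        return 0;                    /* target is at or below every entry */

    /* Pick whichever neighbour is closer to target. */
    if (target - sorted[lo - 1] <= sorted[lo] - target)
        return (ptrdiff_t)(lo - 1);
    return (ptrdiff_t)lo;
}
```

An ordered lookup like this takes O(log n) per query; with a hash table the same question would require examining every entry.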
* Re: Autoincrement examples 1999-11-18 21:28 ` Michael Hayes 1999-11-18 23:06 ` Toshiyasu Morita 1999-11-19 8:49 ` Joern Rennecke @ 1999-11-22 23:43 ` Jeffrey A Law 1999-11-23 7:07 ` Joern Rennecke 1999-11-30 23:37 ` Jeffrey A Law 1999-11-30 23:37 ` Michael Hayes 3 siblings, 2 replies; 94+ messages in thread From: Jeffrey A Law @ 1999-11-22 23:43 UTC (permalink / raw) To: Michael Hayes; +Cc: Joern Rennecke, gcc, amylaar In message <14388.56414.788032.326677@ongaonga.elec.canterbury.ac.nz> you write: > > - By processing each register separately, you miss register-register > > copies and three-address adds. Optimizing code with such constructs > > was actually the main objective of my patch. It actually reduces > > register pressure. > > I have rarely found these cases to be important for autoinc address > generation. However, reduction of register pressure is a good thing. I am somewhat surprised that this isn't caught by cases B, C & E, at least in limited forms. Maybe Joern could provide some testcases for the 3-address problems. > With the information that I collect, these transformations would be > straightforward. More importantly, I would like to reorder some of > the memory references to improve autoinc generation. > > Does your code maintain the LOG_LINKS? If so I would like to try > running the instruction combiner after the regmove pass. > Alternatively, could your code be easily modified to remove redundant > load instructions once an autoinc address was generated? My gut tells me that this is a mistake. Though I'm also wondering if regmove should be re-cast using a conflict/adjacency list. But that's a topic for another discussion. While I realize there are optimization opportunities you're missing, you're talking about a fairly significant amount of work for what is likely to be marginal benefit. jeff ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-22 23:43 ` Jeffrey A Law @ 1999-11-23 7:07 ` Joern Rennecke 1999-11-30 23:37 ` Joern Rennecke 1999-11-30 23:37 ` Jeffrey A Law 1 sibling, 1 reply; 94+ messages in thread From: Joern Rennecke @ 1999-11-23 7:07 UTC (permalink / raw) To: law; +Cc: m.hayes, amylaar, gcc, amylaar > > > - By processing each register separately, you miss register-register > > > copies and three-address adds. Optimizing code with such constructs > > > was actually the main objective of my patch. It actually reduces > > > register pressure. > > > > I have rarely found these cases to be important for autoinc address > > generation. However, reduction of register pressure is a good thing. > I am somewhat surprised that this isn't caught by cases B, C & E, at > least in limited forms. The regmove patches from 1997 already addressed this kind of 'piecemeal' optimization. But it turned out that many worthwhile optimizations are only discernible as worthwhile when you look at the entire set of related values. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 21:28 ` Michael Hayes ` (2 preceding siblings ...) 1999-11-22 23:43 ` Jeffrey A Law @ 1999-11-30 23:37 ` Michael Hayes 3 siblings, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-11-30 23:37 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, law, gcc, amylaar Joern Rennecke writes: > My first impressions are: > > - Your code will take huge amounts of compile time for large functions, > since you process each register individually, and ref_find_ref_between > can cause a nested traversal of parts of a basic block - making the > total complexity something like O(3). ref_find_ref_between is only very rarely called and is only used for {pre,post}_modify addressing modes with a register increment. Many of the register ref lists do not need to be computed. However, I found that this simplifies dead store elimination; this is essential prior to autoinc processing if loop unrolling has taken place. Your code does not have to worry about this since flow has removed dead stores. > - By processing each register separately, you miss register-register > copies and three-address adds. Optimizing code with such constructs > was actually the main objective of my patch. It actually reduces > register pressure. I have rarely found these cases to be important for autoinc address generation. However, reduction of register pressure is a good thing. > - You can handle hard registers. Have you been prompted by some real > world code to implement this, or was this an instance of doing it > because you could, and with little extra effort? The latter. I find it sometimes mops up some of reload's spills... > - Your code can generate PRE_MODIFY / POST_MODIFY, something which mine > can't. Yes, this is important for DSP architectures; especially things like matrix multiplication where you want a large stride. > - I'm not sure if your code can rewrite > (mem (plus (reg ..) (const_int ..))) to use a different offset? 
> FWIW, mine can't, but it is feasible with some effort to > implement it. With the information that I collect, these transformations would be straightforward. More importantly, I would like to reorder some of the memory references to improve autoinc generation. Does your code maintain the LOG_LINKS? If so I would like to try running the instruction combiner after the regmove pass. Alternatively, could your code be easily modified to remove redundant load instructions once an autoinc address was generated? Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
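[Editorial illustration, not part of the original message.] The remark about DSP targets and large strides can be made concrete with a small C routine. This is only a sketch under assumptions: the function name `dot_row_col` and its parameters are invented for the example, and whether a given compiler and target actually use POST_INC or POST_MODIFY addresses here depends on the port. The row pointer advances by one element per iteration (a POST_INC candidate), while the column pointer advances by a run-time stride held in a register (the kind of step a POST_MODIFY address with a register increment could express).

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of a contiguous row with a strided column.
   `stride` is the distance, in elements, between consecutive column
   entries (for a row-major matrix, the number of columns).  */
int
dot_row_col (const int *row, const int *col, size_t n, size_t stride)
{
  int sum = 0;

  assert (stride > 0);
  while (n > 0)
    {
      /* Unit step on `row`: a POST_INC candidate.  */
      sum += *row++ * *col;
      /* Large step on `col`: a POST_MODIFY (register increment)
         candidate.  The guard keeps the pointer inside the array
         after the final element has been read.  */
      if (--n > 0)
        col += stride;
    }
  return sum;
}
```

Calling it with a flat array `b` of 3 rows by 2 columns, `dot_row_col (a, b + 1, 3, 2)` walks the second column of `b`.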
* Re: Autoincrement examples 1999-11-18 18:02 ` Joern Rennecke 1999-11-18 18:27 ` Joern Rennecke 1999-11-18 21:28 ` Michael Hayes @ 1999-11-22 23:47 ` Jeffrey A Law 1999-11-30 23:37 ` Jeffrey A Law 1999-11-30 23:37 ` Joern Rennecke 3 siblings, 1 reply; 94+ messages in thread From: Jeffrey A Law @ 1999-11-22 23:47 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc, amylaar In message < 199911190201.CAA06779@phal.cygnus.co.uk >you write: > - By processing each register separately, you miss register-register > copies and three-address adds. Optimizing code with such constructs > was actually the main objective of my patch. It actually reduces > register pressure. Yes, but I also think this is what makes your code a lot more complex. Let's face it, the regmove code is rather complex and difficult to understand. This is actually what got me thinking about whether or not conflict and adjacency lists would simplify things. A global register coalescing pass would a. remove lots of the reg-reg copies. b. provide you with the info necessary to account for reg-reg copies that we can't eliminate. > So right now, our patches really do different things, with only a small > overlap. > Of course this could be changed. I'd like to see the two implementations converge -- it's rather dumb to have tons of autoinc code all over the place, particularly when I expect that with some work they can be merged. jeff ^ permalink raw reply [flat|nested] 94+ messages in thread
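[Editorial illustration, not part of the original message.] The idea of a coalescing pass driven by conflict information can be sketched in a few lines of C. This is a generic toy, not code from regmove or from either patch: the names `coalesce_init`, `add_conflict`, `coalesce_copy` and `coalesce_find`, and the fixed `NREGS`, are all assumptions made for the example. Registers are grouped with a union-find structure, and a copy `dst = src` is removed (the two groups merged) only when no member of one group conflicts with a member of the other.

```c
#include <assert.h>
#include <stdbool.h>

#define NREGS 8                 /* hypothetical number of pseudo registers */

static int parent[NREGS];       /* union-find forest over register numbers */
static bool conflict[NREGS][NREGS]; /* conflicts between group representatives */

/* Put every register in its own group and clear all conflicts.  */
void
coalesce_init (void)
{
  for (int i = 0; i < NREGS; i++)
    {
      parent[i] = i;
      for (int j = 0; j < NREGS; j++)
        conflict[i][j] = false;
    }
}

/* Return the representative of R's group, halving paths on the way.  */
int
coalesce_find (int r)
{
  assert (r >= 0 && r < NREGS);
  while (parent[r] != r)
    {
      parent[r] = parent[parent[r]];
      r = parent[r];
    }
  return r;
}

/* Record that A and B are live at the same time.  */
void
add_conflict (int a, int b)
{
  a = coalesce_find (a);
  b = coalesce_find (b);
  conflict[a][b] = conflict[b][a] = true;
}

/* Try to eliminate the copy DST = SRC by merging the two groups.
   Returns false when the groups conflict and the copy must stay.  */
bool
coalesce_copy (int dst, int src)
{
  int d = coalesce_find (dst);
  int s = coalesce_find (src);

  if (d == s)
    return true;
  if (conflict[d][s])
    return false;

  parent[s] = d;
  /* The merged group inherits every conflict of the absorbed group.  */
  for (int i = 0; i < NREGS; i++)
    if (conflict[s][i])
      conflict[d][i] = conflict[i][d] = true;
  return true;
}
```

Copies for which `coalesce_copy` returns false are the ones a later pass would still have to account for.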
* Re: Autoincrement examples 1999-11-22 23:47 ` Jeffrey A Law @ 1999-11-30 23:37 ` Jeffrey A Law 0 siblings, 0 replies; 94+ messages in thread From: Jeffrey A Law @ 1999-11-30 23:37 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc, amylaar In message < 199911190201.CAA06779@phal.cygnus.co.uk >you write: > - By processing each register separately, you miss register-register > copies and three-address adds. Optimizing code with such constructs > was actually the main objective of my patch. It actually reduces > register pressure. Yes, but I also think this is what makes your code a lot more complex. Let's face it, the regmove code is rather complex and difficult to understand. This is actually what got me thinking about whether or not conflict and adjacency lists would simplify things. A global register coalescing pass would a. remove lots of the reg-reg copies. b. provide you with the info necessary to account for reg-reg copies that we can't eliminate. > So right now, our patches really do different things, with only a small > overlap. > Of course this could be changed. I'd like to see the two implementations converge -- it's rather dumb to have tons of autoinc code all over the place, particularly when I expect that with some work they can be merged. jeff ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 18:02 ` Joern Rennecke ` (2 preceding siblings ...) 1999-11-22 23:47 ` Jeffrey A Law @ 1999-11-30 23:37 ` Joern Rennecke 3 siblings, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-11-30 23:37 UTC (permalink / raw) To: Michael Hayes; +Cc: m.hayes, law, gcc, amylaar > OK, here goes. My first impressions are: - Your code will take huge amounts of compile time for large functions, since you process each register individually, and ref_find_ref_between can cause a nested traversal of parts of a basic block - making the total complexity something like O(3). I think my scheme has O(1) in general (there could be pathological cases when the hash chains get too long, but that is unlikely, and if it ever gets worrisome, it could be cured with variable hash table size or expandable hash tables). - By processing each register separately, you miss register-register copies and three-address adds. Optimizing code with such constructs was actually the main objective of my patch. It actually reduces register pressure. - You can handle hard registers. Have you been prompted by some real world code to implement this, or was this an instance of doing it because you could, and with little extra effort? - Your code can generate PRE_MODIFY / POST_MODIFY, something which mine can't. - I'm not sure if your code can rewrite (mem (plus (reg ..) (const_int ..))) to use a different offset? FWIW, mine can't, but it is feasible with some effort to implement it. So right now, our patches really do different things, with only a small overlap. Of course this could be changed. ^ permalink raw reply [flat|nested] 94+ messages in thread
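[Editorial illustration, not part of the original message.] The point about three-address adds is easier to see at the source level. The two C functions below are an assumed example (the names `sum3_separate` and `sum3_single` are invented here); they compute the same value. In the first, each address is formed by an add whose result may land in a new register, so a pass that looks at one register at a time sees three unrelated pointers. In the second, the same accesses are written as one pointer stepped between loads, which is the shape that maps onto post-increment addressing and needs only one address register. How any particular compiler version treats either form is not claimed here.

```c
#include <assert.h>

/* Three related addresses, each held in its own variable.  */
int
sum3_separate (const int *p)
{
  const int *q = p + 1;   /* q = p + 1: add into a different register */
  const int *r = q + 1;   /* r = q + 1: and another */

  assert (p != 0);
  return *p + *q + *r;
}

/* The same accesses through a single pointer that is stepped after
   each load, i.e. written as explicit post-increments.  */
int
sum3_single (const int *p)
{
  int s;

  assert (p != 0);
  s = *p++;
  s += *p++;
  s += *p;
  return s;
}
```

Recognizing that `p`, `q` and `r` are related values is what allows the first form to be rewritten into the second.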
* Re: Autoincrement examples 1999-11-18 17:00 ` Michael Hayes 1999-11-18 18:02 ` Joern Rennecke @ 1999-11-30 23:37 ` Michael Hayes 1999-12-08 10:57 ` Joern Rennecke 1999-12-17 18:08 ` Joern Rennecke 3 siblings, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-11-30 23:37 UTC (permalink / raw) To: Michael Hayes; +Cc: law, gcc, amylaar Michael Hayes writes: > I'll resubmit my patches tomorrow. There have been a few mods since > my previous submission to properly fix {post,pre}_modify addressing > modes and to make the code more robust. OK, here goes. Michael. begin 644 autoinc.patch.gz [uuencoded gzip attachment autoinc.patch.gz omitted] end ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 17:00 ` Michael Hayes 1999-11-18 18:02 ` Joern Rennecke 1999-11-30 23:37 ` Michael Hayes @ 1999-12-08 10:57 ` Joern Rennecke 1999-12-08 14:38 ` Michael Hayes 1999-12-31 23:54 ` Joern Rennecke 1999-12-17 18:08 ` Joern Rennecke 3 siblings, 2 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-08 10:57 UTC (permalink / raw) To: Michael Hayes; +Cc: m.hayes, law, gcc, amylaar > OK, here goes. > > Michael. > > begin 644 autoinc.patch.gz Can you post autoinc.h ? ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-08 10:57 ` Joern Rennecke @ 1999-12-08 14:38 ` Michael Hayes 1999-12-10 8:42 ` Joern Rennecke 1999-12-31 23:54 ` Michael Hayes 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 2 replies; 94+ messages in thread From: Michael Hayes @ 1999-12-08 14:38 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc, amylaar Joern Rennecke writes: > Can you post autoinc.h ? OK, be warned, it's trivial...

/* Optimize by combining creating autoincrement memory references for
   GNU compiler.  This is part of flow optimization.
   Copyright (C) 1999 Free Software Foundation, Inc.
   Contributed by Michael P. Hayes (m.hayes@elec.canterbury.ac.nz)

This file is part of GNU CC.

GNU CC is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

GNU CC is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with GNU CC; see the file COPYING.  If not, write to
the Free Software Foundation, 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.  */

#ifdef AUTO_INC_DEC
extern int autoinc_optimize PROTO ((int, int, FILE *));
#endif

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-08 14:38 ` Michael Hayes @ 1999-12-10 8:42 ` Joern Rennecke 1999-12-10 13:36 ` Michael Hayes 1999-12-31 23:54 ` Joern Rennecke 1999-12-31 23:54 ` Michael Hayes 1 sibling, 2 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-10 8:42 UTC (permalink / raw) To: Michael Hayes; +Cc: amylaar, m.hayes, gcc, amylaar It appears your patch is not intended for existing ports. Or is there another patch for defaults.h ? autoinc.o: In function `autoinc_search': /s/egcs-ai-mh/gcc/autoinc.c:394: undefined reference to `USE_STORE_PRE_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:397: undefined reference to `USE_STORE_POST_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:400: undefined reference to `USE_LOAD_PRE_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:403: undefined reference to `USE_LOAD_POST_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:475: undefined reference to `USE_STORE_PRE_MODIFY_DISP' /s/egcs-ai-mh/gcc/autoinc.c:478: undefined reference to `USE_STORE_POST_MODIFY_DISP' /s/egcs-ai-mh/gcc/autoinc.c:481: undefined reference to `USE_LOAD_PRE_MODIFY_DISP' /s/egcs-ai-mh/gcc/autoinc.c:484: undefined reference to `USE_LOAD_POST_MODIFY_DISP' ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-10 8:42 ` Joern Rennecke @ 1999-12-10 13:36 ` Michael Hayes 1999-12-10 16:59 ` Loop patch update (Was: Re: Autoincrement examples) Joern Rennecke ` (2 more replies) 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 3 replies; 94+ messages in thread From: Michael Hayes @ 1999-12-10 13:36 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc, amylaar Joern Rennecke writes: > It appears your patch is not intended for existing ports. Or is there > another patch for defaults.h ? Here are my diffs for rtl.h again. In an earlier post on this thread I sent this with other patches to support post_modify addressing modes throughout the compiler. Michael. Index: rtl.h =================================================================== RCS file: /cvs/gcc/egcs/gcc/rtl.h,v retrieving revision 1.158 diff -c -3 -p -r1.158 rtl.h *** rtl.h 1999/12/04 03:00:03 1.158 --- rtl.h 1999/12/10 21:32:14 *************** extern const char * const note_insn_name *** 763,803 **** /* 1 means a SYMBOL_REF has been the library function in emit_library_call. */ #define SYMBOL_REF_USED(RTX) ((RTX)->used) /* Define a macro to look for REG_INC notes, but save time on machines where they never exist. */ ! /* Don't continue this line--convex cc version 4.1 would lose. */ ! #if (defined (HAVE_PRE_INCREMENT) || defined (HAVE_PRE_DECREMENT) || defined (HAVE_POST_INCREMENT) || defined (HAVE_POST_DECREMENT)) ! #define FIND_REG_INC_NOTE(insn, reg) (find_reg_note ((insn), REG_INC, (reg))) ! #else ! #define FIND_REG_INC_NOTE(insn, reg) 0 #endif /* Indicate whether the machine has any sort of auto increment addressing. If not, we can avoid checking for REG_INC notes. */ /* Don't continue this line--convex cc version 4.1 would lose. */ ! #if (defined (HAVE_PRE_INCREMENT) || defined (HAVE_PRE_DECREMENT) || defined (HAVE_POST_INCREMENT) || defined (HAVE_POST_DECREMENT)) #define AUTO_INC_DEC #endif #ifndef HAVE_PRE_INCREMENT ! 
#define HAVE_PRE_INCREMENT 0 #endif #ifndef HAVE_PRE_DECREMENT ! #define HAVE_PRE_DECREMENT 0 #endif #ifndef HAVE_POST_INCREMENT ! #define HAVE_POST_INCREMENT 0 #endif #ifndef HAVE_POST_DECREMENT ! #define HAVE_POST_DECREMENT 0 #endif /* Some architectures do not have complete pre/post increment/decrement instruction sets, or only move some modes efficiently. These macros allow us to tune autoincrement generation. */ --- 763,833 ---- /* 1 means a SYMBOL_REF has been the library function in emit_library_call. */ #define SYMBOL_REF_USED(RTX) ((RTX)->used) + #if (defined (HAVE_PRE_MODIFY_DISP) || defined (HAVE_PRE_MODIFY_REG) || defined (HAVE_POST_MODIFY_DISP) || defined (HAVE_POST_MODIFY_REG)) + #define HAVE_AUTO_MODIFY + #endif + + #if (defined (HAVE_PRE_INCREMENT) || defined (HAVE_POST_INCREMENT)) + #define HAVE_AUTO_INC + #endif + /* Define a macro to look for REG_INC notes, but save time on machines where they never exist. */ ! #if (defined (HAVE_PRE_DECREMENT) || defined (HAVE_POST_DECREMENT)) ! #define HAVE_AUTO_DEC #endif /* Indicate whether the machine has any sort of auto increment addressing. If not, we can avoid checking for REG_INC notes. */ /* Don't continue this line--convex cc version 4.1 would lose. */ ! #if (defined (HAVE_AUTO_MODIFY) || defined (HAVE_AUTO_INC) || defined (HAVE_AUTO_DEC)) #define AUTO_INC_DEC #endif + /* Define a macro to look for REG_INC notes, + but save time on machines where they never exist. */ + + #ifdef AUTO_INC_DEC + #define FIND_REG_INC_NOTE(insn, reg) (find_reg_note ((insn), REG_INC, (reg))) + #else + #define FIND_REG_INC_NOTE(insn, reg) 0 + #endif + #ifndef HAVE_PRE_INCREMENT ! #define HAVE_PRE_INCREMENT 0 #endif #ifndef HAVE_PRE_DECREMENT ! #define HAVE_PRE_DECREMENT 0 #endif #ifndef HAVE_POST_INCREMENT ! #define HAVE_POST_INCREMENT 0 #endif #ifndef HAVE_POST_DECREMENT ! 
#define HAVE_POST_DECREMENT 0 #endif + #ifndef HAVE_POST_MODIFY_DISP + #define HAVE_POST_MODIFY_DISP 0 + #endif + + #ifndef HAVE_PRE_MODIFY_DISP + #define HAVE_PRE_MODIFY_DISP 0 + #endif + + #ifndef HAVE_POST_MODIFY_REG + #define HAVE_POST_MODIFY_REG 0 + #endif + #ifndef HAVE_PRE_MODIFY_REG + #define HAVE_PRE_MODIFY_REG 0 + #endif + + /* Some architectures do not have complete pre/post increment/decrement instruction sets, or only move some modes efficiently. These macros allow us to tune autoincrement generation. */ *************** extern const char * const note_insn_name *** 834,840 **** --- 864,902 ---- #define USE_STORE_PRE_DECREMENT(MODE) HAVE_PRE_DECREMENT #endif + #ifndef USE_LOAD_POST_MODIFY_DISP + #define USE_LOAD_POST_MODIFY_DISP(MODE) HAVE_POST_MODIFY_DISP + #endif + + #ifndef USE_LOAD_PRE_MODIFY_DISP + #define USE_LOAD_PRE_MODIFY_DISP(MODE) HAVE_PRE_MODIFY_DISP + #endif + + #ifndef USE_STORE_POST_MODIFY_DISP + #define USE_STORE_POST_MODIFY_DISP(MODE) HAVE_POST_MODIFY_DISP + #endif + + #ifndef USE_STORE_PRE_MODIFY_DISP + #define USE_STORE_PRE_MODIFY_DISP(MODE) HAVE_PRE_MODIFY_DISP + #endif + + #ifndef USE_LOAD_PRE_MODIFY_REG + #define USE_LOAD_PRE_MODIFY_REG(MODE) HAVE_PRE_MODIFY_REG + #endif + + #ifndef USE_LOAD_POST_MODIFY_REG + #define USE_LOAD_POST_MODIFY_REG(MODE) HAVE_POST_MODIFY_REG + #endif + + #ifndef USE_STORE_PRE_MODIFY_REG + #define USE_STORE_PRE_MODIFY_REG(MODE) HAVE_PRE_MODIFY_REG + #endif + + #ifndef USE_STORE_POST_MODIFY_REG + #define USE_STORE_POST_MODIFY_REG(MODE) HAVE_POST_MODIFY_REG + #endif + /* Accessors for RANGE_INFO. */ /* For RANGE_{START,END} notes return the RANGE_START note. */ #define RANGE_INFO_NOTE_START(INSN) XCEXP (INSN, 0, RANGE_INFO) ^ permalink raw reply [flat|nested] 94+ messages in thread
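For a port that defines none of the new macros, the fallbacks in this diff reduce every `USE_*_MODIFY_*` test to the constant 0, which is presumably what resolves the undefined references reported for autoinc.c. Below is a minimal standalone sketch of that defaulting chain. The macro names are taken from the diff; the `enum machine_mode` values and the helper `can_use_post_modify_load` are hypothetical stand-ins added only for illustration and are not part of the patch.

```c
/* Stand-in for GCC's machine modes; hypothetical, for illustration only.  */
enum machine_mode { QImode, HImode, SImode };

/* A port that supports post-modify with a displacement would define this
   in its target headers.  Here we simply pretend the port did so.  */
#define HAVE_POST_MODIFY_DISP 1

/* Fallbacks in the style of the rtl.h diff: anything the port left
   undefined becomes 0, so tests on it fold to a constant.  */
#ifndef HAVE_POST_MODIFY_DISP
#define HAVE_POST_MODIFY_DISP 0
#endif

#ifndef HAVE_POST_MODIFY_REG
#define HAVE_POST_MODIFY_REG 0
#endif

/* Per-mode tuning macros default to the plain HAVE_* value unless the
   port overrides them.  */
#ifndef USE_LOAD_POST_MODIFY_DISP
#define USE_LOAD_POST_MODIFY_DISP(MODE) HAVE_POST_MODIFY_DISP
#endif

#ifndef USE_LOAD_POST_MODIFY_REG
#define USE_LOAD_POST_MODIFY_REG(MODE) HAVE_POST_MODIFY_REG
#endif

/* Hypothetical helper showing how a pass might consult the macros.
   BY_REGISTER selects the register-offset form over the displacement
   form.  The default macros ignore MODE, hence the cast to void.  */
static int
can_use_post_modify_load (enum machine_mode mode, int by_register)
{
  (void) mode;
  return by_register ? USE_LOAD_POST_MODIFY_REG (mode)
                     : USE_LOAD_POST_MODIFY_DISP (mode);
}
```

With these definitions, `can_use_post_modify_load (SImode, 0)` evaluates to 1 because the pretend port defined `HAVE_POST_MODIFY_DISP`, while `can_use_post_modify_load (SImode, 1)` evaluates to 0 through the fallback. Without the `#ifndef` defaults, a use such as `USE_LOAD_POST_MODIFY_REG (mode)` would be compiled as a call to an undeclared function and fail at link time, as in the errors quoted earlier in the thread.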
* Loop patch update (Was: Re: Autoincrement examples) 1999-12-10 13:36 ` Michael Hayes @ 1999-12-10 16:59 ` Joern Rennecke 1999-12-13 14:37 ` Joern Rennecke ` (2 more replies) 1999-12-14 15:49 ` Autoincrement patches " Joern Rennecke 1999-12-31 23:54 ` Autoincrement examples Michael Hayes 2 siblings, 3 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-10 16:59 UTC (permalink / raw) To: Michael Hayes; +Cc: amylaar, m.hayes, gcc, amylaar I've noticed that all the interesting examples seem to get pessimized by the loop optimizer so that no autoincrement optimization is applicable any more. So I've updated my loop patches for the current egcs mainline: Thu Sep 2 23:50:58 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.h (struct induction): New member did_derive. * loop.c (strength_reduce, record_giv): Clear did_derive when creating giv. (recombine_givs): Set did_derive when deriving from a giv. (strength_reduce): Don't apply post-increment auto_inc_opt if did_derive is set. Wed May 12 22:19:02 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.h (reg_dead_after_loop): Declare. * unroll.c (reg_dead_after_loop): Don't declare. No longer static. * loop.c (recombine_givs): If a giv is used outside the loop, use reg_dead_after_loop to find out if it matters. Wed May 12 20:27:39 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.c (strength_reduce): When doing biv->giv conversion, update reg note of NEXT->insn. When converting to a DEST_REG giv, undo all changes that tried to make DEST_ADDR givs. * combine.c (validate_subst): Use SUBST for all substitutions. Wed May 12 19:51:51 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.c (strength_reduce): Check if a biv can be detected by a REG_EQUAL note attached to the biv insn. Wed May 5 21:15:40 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.h (struct induction): New members autoinc_pred, autoinc_succ, preinc, combine_start_limit, combine_end_limit. * loop.c (last_recorded_giv, last_recorded_addr_giv): New static variables. 
(strength_reduce): Set new fields in struct induction for givs. Initialize last_recorded_giv. (find_mem_givs): Set mem_mode before calling record_giv. (record_giv): Set new fields in struct induction. Keep track of last_recorded_giv. (combine_givs_p): Supress combinations that would foil giv-giv autoincrement opportunities. (combine_givs): Propagate autoinc_{pred,succ} settings into combine_{start,end}_limit. (recombine_givs): Allow to derive givs from a non-eliminable biv. Index: combine.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/combine.c,v retrieving revision 1.101 diff -p -r1.101 combine.c *** combine.c 1999/12/09 10:46:10 1.101 --- combine.c 1999/12/11 00:42:37 *************** static int combinable_i3pat PROTO((rtx, *** 361,366 **** --- 361,367 ---- static int contains_muldiv PROTO((rtx)); static rtx try_combine PROTO((rtx, rtx, rtx)); static void undo_all PROTO((void)); + static void undo_last PROTO((void)); static void undo_commit PROTO((void)); static rtx *find_split_point PROTO((rtx *, rtx)); static rtx subst PROTO((rtx, rtx, rtx, int, int)); *************** combine_instructions (f, nregs) *** 738,743 **** --- 739,745 ---- total_successes += combine_successes; nonzero_sign_valid = 0; + reg_last_set_value = 0; /* Make recognizer allow volatile MEMs again. */ init_recog (); *************** try_combine (i3, i2, i1) *** 2695,2700 **** --- 2697,2728 ---- else return newi2pat ? i2 : i3; } + + /* Undo all the modifications recorded in undobuf after previous_undos. 
*/ + + static void + undo_last () + { + struct undo *undo, *next; + + for (undo = undobuf.undos; undo != undobuf.previous_undos; undo = next) + { + next = undo->next; + if (undo->is_int) + *undo->where.i = undo->old_contents.i; + else + *undo->where.r = undo->old_contents.r; + + undo->next = undobuf.frees; + undobuf.frees = undo; + } + + undobuf.undos = undobuf.previous_undos; + + /* Clear this here, so that subsequent get_last_value calls are not + affected. */ + subst_prev_insn = NULL_RTX; + } \f /* Undo all the modifications recorded in undobuf. */ *************** nonzero_bits (x, mode) *** 7790,7795 **** --- 7818,7827 ---- } #endif + /* If called from loop, the reg_last_* arrays are not set. */ + if (! reg_last_set_value) + return nonzero; + /* If X is a register whose nonzero bits value is current, use it. Otherwise, if X is a register whose value we can find, use that value. Otherwise, use the previously-computed global nonzero bits *************** num_sign_bit_copies (x, mode) *** 8188,8193 **** --- 8220,8229 ---- return GET_MODE_BITSIZE (Pmode) - GET_MODE_BITSIZE (ptr_mode) + 1; #endif + /* If called from loop, the reg_last_* arrays are not set. */ + if (! reg_last_set_value) + return 1; + if (reg_last_set_value[REGNO (x)] != 0 && reg_last_set_mode[REGNO (x)] == mode && (reg_last_set_label[REGNO (x)] == label_tick *************** get_last_value (x) *** 11217,11223 **** && (value = get_last_value (SUBREG_REG (x))) != 0) return gen_lowpart_for_combine (GET_MODE (x), value); ! if (GET_CODE (x) != REG) return 0; regno = REGNO (x); --- 11253,11260 ---- && (value = get_last_value (SUBREG_REG (x))) != 0) return gen_lowpart_for_combine (GET_MODE (x), value); ! /* If called from loop, the reg_last_* arrays are not set. */ ! if (GET_CODE (x) != REG || ! 
reg_last_set_value) return 0; regno = REGNO (x); *************** insn_cuid (insn) *** 12371,12376 **** --- 12408,12495 ---- abort (); return INSN_CUID (insn); + } + \f + void + validate_subst_start () + { + undobuf.undos = 0; + + /* Save the current high-water-mark so we can free storage if we didn't + accept this set of combinations. */ + undobuf.storage = (char *) oballoc (0); + } + + void + validate_subst_undo () + { + undo_all (); + } + + /* Replace FROM with to throughout in INSN, and make simplifications. + Return nonzero for success. */ + int + validate_subst (insn, from, to) + rtx insn, from, to; + { + rtx pat, new_pat; + int i; + + pat = PATTERN (insn); + + /* If from is not mentioned in PAT, we don't need to grind it through + subst. This can safe some time, and also avoids suprious failures + when a simplification is not recognized as a valid insn. */ + if (! reg_mentioned_p (from, pat)) + { + /* But we still have to make sure a REG_EQUAL note gets updated. */ + rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX); + + if (note) + SUBST (XEXP (note, 0), subst (XEXP (note, 0), from, to, 0, 1)); + return 1; + } + + /* We have to set previous_undos to prevent gen_rtx_combine from re-using + some piece of shared rtl. */ + undobuf.previous_undos = undobuf.undos; + + subst_insn = insn; + + new_pat = subst (pat, from, to, 0, 1); + + /* If PAT is a PARALLEL, check to see if it contains the CLOBBER + we use to indicate that something didn't match. If we find such a + thing, force rejection. + This is the same test as in recog_for_combine; we can't use the that + function here because it tries to use data flow information. */ + if (GET_CODE (pat) == PARALLEL) + for (i = XVECLEN (pat, 0) - 1; i >= 0; i--) + if (GET_CODE (XVECEXP (pat, 0, i)) == CLOBBER + && XEXP (XVECEXP (pat, 0, i), 0) == const0_rtx) + { + undo_last (); + return 0; + } + + /* Change INSN to a nop so that validate_change is forced to re-recognize. 
*/ + PATTERN (insn) = const0_rtx; + if (validate_change (insn, &PATTERN (insn), new_pat, 0)) + { + rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX); + + PATTERN (insn) = pat; + SUBST ( PATTERN (insn), new_pat); + if (note) + SUBST (XEXP (note, 0), subst (XEXP (note, 0), from, to, 0, 1)); + return 1; + } + else + { + PATTERN (insn) = pat; + undo_last (); + return 0; + } } \f void Index: loop.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/loop.c,v retrieving revision 1.206 diff -p -r1.206 loop.c *** loop.c 1999/12/10 15:27:55 1.206 --- loop.c 1999/12/11 00:42:40 *************** struct movable *** 254,260 **** that the reg is live outside the range from where it is set to the following label. */ unsigned int done : 1; /* 1 inhibits further processing of this */ ! unsigned int partial : 1; /* 1 means this reg is used for zero-extending. In particular, moving it does not make it invariant. */ --- 254,260 ---- that the reg is live outside the range from where it is set to the following label. */ unsigned int done : 1; /* 1 inhibits further processing of this */ ! unsigned int partial : 1; /* 1 means this reg is used for zero-extending. In particular, moving it does not make it invariant. */ *************** static rtx express_from_1 PROTO((rtx, rt *** 324,331 **** static rtx combine_givs_p PROTO((struct induction *, struct induction *)); static void combine_givs PROTO((struct iv_class *)); struct recombine_givs_stats; ! static int find_life_end PROTO((rtx, struct recombine_givs_stats *, rtx, rtx)); ! 
static void recombine_givs PROTO((struct iv_class *, rtx, rtx, int)); static int product_cheap_p PROTO((rtx, rtx)); static int maybe_eliminate_biv PROTO((struct iv_class *, rtx, rtx, int, int, int)); static int maybe_eliminate_biv_1 PROTO((rtx, rtx, struct iv_class *, int, rtx)); --- 324,335 ---- static rtx combine_givs_p PROTO((struct induction *, struct induction *)); static void combine_givs PROTO((struct iv_class *)); struct recombine_givs_stats; ! static void find_giv_uses PROTO((rtx, struct recombine_givs_stats *, rtx, ! rtx)); ! static void note_giv_use PROTO((struct induction *, rtx, int, ! struct recombine_givs_stats *)); ! static int cmp_giv_by_value_and_insn PROTO((struct induction **, struct induction **)); ! static void recombine_givs PROTO((struct iv_class *, rtx, rtx, rtx, rtx, int)); static int product_cheap_p PROTO((rtx, rtx)); static int maybe_eliminate_biv PROTO((struct iv_class *, rtx, rtx, int, int, int)); static int maybe_eliminate_biv_1 PROTO((rtx, rtx, struct iv_class *, int, rtx)); *************** scan_loop (loop_start, end, loop_cont, u *** 753,759 **** /* Count number of times each reg is set during this loop. Set VARRAY_CHAR (may_not_optimize, I) if it is not safe to move out the setting of register I. Set VARRAY_RTX (reg_single_usage, I). */ ! /* Allocate extra space for REGS that might be created by load_mems. We allocate a little extra slop as well, in the hopes that even after the moving of movables creates some new registers --- 757,763 ---- /* Count number of times each reg is set during this loop. Set VARRAY_CHAR (may_not_optimize, I) if it is not safe to move out the setting of register I. Set VARRAY_RTX (reg_single_usage, I). */ ! /* Allocate extra space for REGS that might be created by load_mems. We allocate a little extra slop as well, in the hopes that even after the moving of movables creates some new registers *************** record_excess_regs (in_this, not_in_this *** 1222,1228 **** && ! 
reg_mentioned_p (in_this, not_in_this)) *output = gen_rtx_EXPR_LIST (VOIDmode, in_this, *output); return; ! default: break; } --- 1226,1232 ---- && ! reg_mentioned_p (in_this, not_in_this)) *output = gen_rtx_EXPR_LIST (VOIDmode, in_this, *output); return; ! default: break; } *************** replace_call_address (x, reg, addr) *** 2296,2302 **** abort (); XEXP (x, 0) = addr; return; ! default: break; } --- 2300,2306 ---- abort (); XEXP (x, 0) = addr; return; ! default: break; } *************** count_nonfixed_reads (x) *** 2347,2353 **** case MEM: return ((invariant_p (XEXP (x, 0)) != 1) + count_nonfixed_reads (XEXP (x, 0))); ! default: break; } --- 2351,2357 ---- case MEM: return ((invariant_p (XEXP (x, 0)) != 1) + count_nonfixed_reads (XEXP (x, 0))); ! default: break; } *************** invariant_p (x) *** 3341,3347 **** if (MEM_VOLATILE_P (x)) return 0; break; ! default: break; } --- 3345,3351 ---- if (MEM_VOLATILE_P (x)) return 0; break; ! default: break; } *************** static rtx note_insn; *** 3710,3715 **** --- 3714,3726 ---- static rtx addr_placeholder; + /* The last giv we have seen since we passed a CODE_LABEL. Used to + find places where auto-increment is useful to generate a DEST_ADDR + giv from a giv with the same mult_val but different add_val. + We must suppress some giv combinations to allow these auto_increments + to be formed. */ + static struct induction *last_recorded_giv, *last_recorded_addr_giv; + /* ??? Unfinished optimizations, and possible future optimizations, for the strength reduction code. */ *************** static rtx addr_placeholder; *** 3744,3750 **** This does not cause a problem here, because the added registers cannot be givs outside of their loop, and hence will never be reconsidered. But scan_loop must check regnos to make sure they are in bounds. ! SCAN_START is the first instruction in the loop, as the loop would actually be executed. END is the NOTE_INSN_LOOP_END. 
LOOP_TOP is the first instruction in the loop, as it is layed out in the --- 3755,3761 ---- This does not cause a problem here, because the added registers cannot be givs outside of their loop, and hence will never be reconsidered. But scan_loop must check regnos to make sure they are in bounds. ! SCAN_START is the first instruction in the loop, as the loop would actually be executed. END is the NOTE_INSN_LOOP_END. LOOP_TOP is the first instruction in the loop, as it is layed out in the *************** strength_reduce (scan_start, end, loop_t *** 3838,3847 **** && REG_IV_TYPE (REGNO (dest_reg)) != NOT_BASIC_INDUCT) { int multi_insn_incr = 0; ! if (basic_induction_var (SET_SRC (set), GET_MODE (SET_SRC (set)), ! dest_reg, p, &inc_val, &mult_val, ! &location, &multi_insn_incr)) { /* It is a possible basic induction variable. Create and initialize an induction structure for it. */ --- 3849,3863 ---- && REG_IV_TYPE (REGNO (dest_reg)) != NOT_BASIC_INDUCT) { int multi_insn_incr = 0; + enum machine_mode mode = GET_MODE (SET_SRC (set)); + rtx note = find_reg_note (p, REG_EQUAL, 0); ! if (basic_induction_var (SET_SRC (set), mode, dest_reg, p, ! &inc_val, &mult_val, &location, &multi_insn_incr) ! || (note ! && basic_induction_var (XEXP (note, 0), mode, dest_reg, p, ! &inc_val, &mult_val, &location, ! &multi_insn_incr))) { /* It is a possible basic induction variable. Create and initialize an induction structure for it. */ *************** strength_reduce (scan_start, end, loop_t *** 4180,4185 **** --- 4196,4207 ---- if (loop_dump_stream) fprintf (loop_dump_stream, "is giv of biv %d\n", bl2->regno); + + /* If the changed insn carries a REG_EQUAL note, update it. */ + note = find_reg_note (bl->biv->insn, REG_EQUAL, NULL_RTX); + if (note) + XEXP (note, 0) = copy_rtx (src); + /* Let this giv be discovered by the generic code. 
*/ REG_IV_TYPE (bl->regno) = UNKNOWN_INDUCT; reg_biv_class[bl->regno] = NULL_PTR; *************** strength_reduce (scan_start, end, loop_t *** 4315,4321 **** for (vp = &bl->biv, next = *vp; v = next, next = v->next_iv;) { HOST_WIDE_INT offset; ! rtx set, add_val, old_reg, dest_reg, last_use_insn, note; int old_regno, new_regno; if (! v->always_executed --- 4337,4343 ---- for (vp = &bl->biv, next = *vp; v = next, next = v->next_iv;) { HOST_WIDE_INT offset; ! rtx set, src, add_val, old_reg, dest_reg, last_use_insn, note; int old_regno, new_regno; if (! v->always_executed *************** strength_reduce (scan_start, end, loop_t *** 4337,4344 **** add_val = plus_constant (next->add_val, offset); old_reg = v->dest_reg; dest_reg = gen_reg_rtx (v->mode); ! ! /* Unlike reg_iv_type / reg_iv_info, the other three arrays have been allocated with some slop space, so we may not actually need to reallocate them. If we do, the following if statement will be executed just once in this loop. */ --- 4359,4368 ---- add_val = plus_constant (next->add_val, offset); old_reg = v->dest_reg; dest_reg = gen_reg_rtx (v->mode); ! old_regno = REGNO (old_reg); ! new_regno = REGNO (dest_reg); ! ! /* Unlike reg_iv_type / reg_iv_info, the other four arrays have been allocated with some slop space, so we may not actually need to reallocate them. If we do, the following if statement will be executed just once in this loop. */ *************** strength_reduce (scan_start, end, loop_t *** 4350,4395 **** VARRAY_GROW (may_not_optimize, nregs); VARRAY_GROW (reg_single_usage, nregs); } ! if (! validate_change (next->insn, next->location, add_val, 0)) { vp = &v->next_iv; continue; } - - /* Here we can try to eliminate the increment by combining - it into the uses. */ - - /* Set last_use_insn so that we can check against it. */ ! for (last_use_insn = v->insn, p = NEXT_INSN (v->insn); ! p != next->insn; ! 
p = next_insn_in_loop (p, scan_start, end, loop_top)) { if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; ! if (reg_mentioned_p (old_reg, PATTERN (p))) ! { ! last_use_insn = p; ! } } ! ! /* If we can't get the LUIDs for the insns, we can't ! calculate the lifetime. This is likely from unrolling ! of an inner loop, so there is little point in making this ! a DEST_REG giv anyways. */ ! if (INSN_UID (v->insn) >= max_uid_for_loop ! || INSN_UID (last_use_insn) >= max_uid_for_loop ! || ! validate_change (v->insn, &SET_DEST (set), dest_reg, 0)) { /* Change the increment at NEXT back to what it was. */ if (! validate_change (next->insn, next->location, next->add_val, 0)) abort (); vp = &v->next_iv; continue; } next->add_val = add_val; v->dest_reg = dest_reg; v->giv_type = DEST_REG; v->location = &SET_SRC (set); --- 4374,4465 ---- VARRAY_GROW (may_not_optimize, nregs); VARRAY_GROW (reg_single_usage, nregs); } ! VARRAY_CHAR (may_not_optimize, new_regno) = 0; ! if (! validate_change (next->insn, next->location, add_val, 0)) { vp = &v->next_iv; continue; } ! src = SET_SRC (set); ! /* Try to replace all uses of OLD_REG with SRC. This will ! mostly win when it generates / changes address givs, but it ! might also change some DEST_REG givs or create the odd ! PEA on an 68k. */ ! last_use_insn = NULL_RTX; ! validate_subst_start (); ! for (p = NEXT_INSN (v->insn); p != next->insn; p = NEXT_INSN (p)) { + rtx newpat; + if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; ! if (! validate_subst (p, old_reg, src)) ! last_use_insn = p; } ! /* If some uses remain, we'd like to make this a DEST_REG ! giv. However, after loop unrolling, V->INSN or LAST_USE_INSN ! might have no valid luid. We need these not only for ! calculating the lifetime now, but also in recombine_givs when ! doing giv derivation, to find givs with non-overlapping ! lifetimes. So if we don't have LUIDs available, or if we ! can't calculate the giv, leave the biv increment alone. */ ! if (last_use_insn ! 
&& (INSN_UID (v->insn) >= max_uid_for_loop ! || INSN_UID (last_use_insn) >= max_uid_for_loop ! || ! validate_change (v->insn, &SET_DEST (set), ! dest_reg, 0))) { /* Change the increment at NEXT back to what it was. */ if (! validate_change (next->insn, next->location, next->add_val, 0)) abort (); + + /* Undo all the substitutions made by validate_subst above, + since the biv does hold the incremented value after + all. */ + validate_subst_undo (); + vp = &v->next_iv; continue; } + + /* If we have to make a DEST_REG giv, undo all the + substitutions made by validate_subst above, since we are + going to replace the biv by a DEST_REG giv. We must do this + before allocating anything more on obstack, e.g. with + copy_rtx. */ + if (last_use_insn) + validate_subst_undo (); + + /* If next_insn has a REG_EQUAL note that mentiones OLD_REG, + it must be replaced. */ + note = find_reg_note (next->insn, REG_EQUAL, NULL_RTX); + if (note && reg_mentioned_p (old_reg, XEXP (note, 0))) + XEXP (note, 0) = copy_rtx (SET_SRC (single_set (next->insn))); + + /* Remove the increment from the list of biv increments. */ + *vp = next; + bl->biv_count--; + VARRAY_INT (set_in_loop, old_regno)--; + VARRAY_INT (n_times_set, old_regno)--; next->add_val = add_val; + + if (! last_use_insn) + { + if (loop_dump_stream) + fprintf (loop_dump_stream, + "Increment %d of biv %d eliminated.\n\n", + INSN_UID (v->insn), old_regno); + PUT_CODE (v->insn, NOTE); + NOTE_LINE_NUMBER (v->insn) = NOTE_INSN_DELETED; + NOTE_SOURCE_FILE (v->insn) = 0; + VARRAY_INT (set_in_loop, new_regno) = 0; + VARRAY_INT (n_times_set, new_regno) = 0; + continue; + } + v->dest_reg = dest_reg; v->giv_type = DEST_REG; v->location = &SET_SRC (set); *************** strength_reduce (scan_start, end, loop_t *** 4406,4443 **** v->unrolled = 0; v->shared = 0; v->derived_from = 0; v->always_computable = 1; v->always_executed = 1; v->replaceable = 1; v->no_const_addval = 0; ! ! old_regno = REGNO (old_reg); ! new_regno = REGNO (dest_reg); ! 
VARRAY_INT (set_in_loop, old_regno)--; VARRAY_INT (set_in_loop, new_regno) = 1; - VARRAY_INT (n_times_set, old_regno)--; VARRAY_INT (n_times_set, new_regno) = 1; ! VARRAY_CHAR (may_not_optimize, new_regno) = 0; ! REG_IV_TYPE (new_regno) = GENERAL_INDUCT; REG_IV_INFO (new_regno) = v; - - /* If next_insn has a REG_EQUAL note that mentiones OLD_REG, - it must be replaced. */ - note = find_reg_note (next->insn, REG_EQUAL, NULL_RTX); - if (note && reg_mentioned_p (old_reg, XEXP (note, 0))) - XEXP (note, 0) = copy_rtx (SET_SRC (single_set (next->insn))); ! /* Remove the increment from the list of biv increments, ! and record it as a giv. */ ! *vp = next; ! bl->biv_count--; v->next_iv = bl->giv; bl->giv = v; bl->giv_count++; v->benefit = rtx_cost (SET_SRC (set), SET); bl->total_benefit += v->benefit; ! /* Now replace the biv with DEST_REG in all insns between the replaced increment and the next increment, and remember the last insn that needed a replacement. */ --- 4476,4506 ---- v->unrolled = 0; v->shared = 0; v->derived_from = 0; + v->did_derive = 0; + v->combine_start_limit = 0; + v->combine_end_limit = 0; v->always_computable = 1; v->always_executed = 1; v->replaceable = 1; v->no_const_addval = 0; ! v->autoinc_pred = 0; ! v->autoinc_succ = 0; ! v->preinc = 0; ! v->leading_combined = 0; ! VARRAY_INT (set_in_loop, new_regno) = 1; VARRAY_INT (n_times_set, new_regno) = 1; ! REG_IV_TYPE (new_regno) = GENERAL_INDUCT; REG_IV_INFO (new_regno) = v; ! /* Record V as a giv. */ v->next_iv = bl->giv; bl->giv = v; bl->giv_count++; v->benefit = rtx_cost (SET_SRC (set), SET); bl->total_benefit += v->benefit; ! /* Now replace the biv with DEST_REG in all insns between the replaced increment and the next increment, and remember the last insn that needed a replacement. */ *************** strength_reduce (scan_start, end, loop_t *** 4446,4452 **** p = next_insn_in_loop (p, scan_start, end, loop_top)) { rtx note; ! 
if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; if (reg_mentioned_p (old_reg, PATTERN (p))) --- 4509,4515 ---- p = next_insn_in_loop (p, scan_start, end, loop_top)) { rtx note; ! if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; if (reg_mentioned_p (old_reg, PATTERN (p))) *************** strength_reduce (scan_start, end, loop_t *** 4462,4468 **** = replace_rtx (XEXP (note, 0), old_reg, dest_reg); } } ! v->last_use = last_use_insn; v->lifetime = INSN_LUID (v->insn) - INSN_LUID (last_use_insn); /* If the lifetime is zero, it means that this register is really --- 4525,4531 ---- = replace_rtx (XEXP (note, 0), old_reg, dest_reg); } } ! v->last_use = last_use_insn; v->lifetime = INSN_LUID (v->insn) - INSN_LUID (last_use_insn); /* If the lifetime is zero, it means that this register is really *************** strength_reduce (scan_start, end, loop_t *** 4488,4493 **** --- 4551,4557 ---- not_every_iteration = 0; loop_depth = 0; maybe_multiple = 0; + last_recorded_giv = last_recorded_addr_giv = 0; p = scan_start; while (1) { *************** strength_reduce (scan_start, end, loop_t *** 4680,4685 **** --- 4744,4752 ---- && no_labels_between_p (p, loop_end) && loop_insn_first_p (p, loop_cont)) not_every_iteration = 0; + + if (GET_CODE (p) == CODE_LABEL) + last_recorded_giv = last_recorded_addr_giv = 0; } /* Try to calculate and save the number of loop iterations. This is *************** strength_reduce (scan_start, end, loop_t *** 4905,4911 **** /* Now that we know which givs will be reduced, try to rearrange the combinations to reduce register pressure. ! recombine_givs calls find_life_end, which needs reg_iv_type and reg_iv_info to be valid for all pseudos. We do the necessary reallocation here since it allows to check if there are still more bivs to process. */ --- 4972,4978 ---- /* Now that we know which givs will be reduced, try to rearrange the combinations to reduce register pressure. ! 
recombine_givs calls find_giv_uses, which needs reg_iv_type and reg_iv_info to be valid for all pseudos. We do the necessary reallocation here since it allows to check if there are still more bivs to process. */ *************** strength_reduce (scan_start, end, loop_t *** 4920,4926 **** VARRAY_GROW (reg_iv_type, nregs); VARRAY_GROW (reg_iv_info, nregs); } ! recombine_givs (bl, loop_start, loop_end, unroll_p); /* Reduce each giv that we decided to reduce. */ --- 4987,4993 ---- VARRAY_GROW (reg_iv_type, nregs); VARRAY_GROW (reg_iv_info, nregs); } ! recombine_givs (bl, scan_start, loop_start, loop_end, loop_top, unroll_p); /* Reduce each giv that we decided to reduce. */ *************** strength_reduce (scan_start, end, loop_t *** 4931,4979 **** { int auto_inc_opt = 0; ! /* If the code for derived givs immediately below has already allocated a new_reg, we must keep it. */ if (! v->new_reg) v->new_reg = gen_reg_rtx (v->mode); if (v->derived_from) ! { ! struct induction *d = v->derived_from; ! ! /* In case d->dest_reg is not replaceable, we have ! to replace it in v->insn now. */ ! if (! d->new_reg) ! d->new_reg = gen_reg_rtx (d->mode); ! PATTERN (v->insn) ! = replace_rtx (PATTERN (v->insn), d->dest_reg, d->new_reg); ! PATTERN (v->insn) ! = replace_rtx (PATTERN (v->insn), v->dest_reg, v->new_reg); ! /* For each place where the biv is incremented, add an ! insn to set the new, reduced reg for the giv. ! We used to do this only for biv_count != 1, but ! this fails when there is a giv after a single biv ! increment, e.g. when the last giv was expressed as ! pre-decrement. */ ! for (tv = bl->biv; tv; tv = tv->next_iv) ! { ! /* We always emit reduced giv increments before the ! biv increment when bl->biv_count != 1. So by ! emitting the add insns for derived givs after the ! biv increment, they pick up the updated value of ! the reduced giv. ! If the reduced giv is processed with ! auto_inc_opt == 1, then it is incremented earlier ! 
than the biv, hence we'll still pick up the right ! value. ! If it's processed with auto_inc_opt == -1, ! that implies that the biv increment is before the ! first reduced giv's use. The derived giv's lifetime ! is after the reduced giv's lifetime, hence in this ! case, the biv increment doesn't matter. */ ! emit_insn_after (copy_rtx (PATTERN (v->insn)), tv->insn); ! } ! continue; ! } #ifdef AUTO_INC_DEC /* If the target has auto-increment addressing modes, and --- 4998,5010 ---- { int auto_inc_opt = 0; ! /* If the code for derived givs in recombine_givs has already allocated a new_reg, we must keep it. */ if (! v->new_reg) v->new_reg = gen_reg_rtx (v->mode); if (v->derived_from) ! continue; #ifdef AUTO_INC_DEC /* If the target has auto-increment addressing modes, and *************** strength_reduce (scan_start, end, loop_t *** 5032,5037 **** --- 5063,5073 ---- else auto_inc_opt = 1; + /* We can't put an insn after v->insn if v was used to + derive other givs in recombine_givs. */ + if (auto_inc_opt == 1 && v->did_derive) + auto_inc_opt = 0; + #ifdef HAVE_cc0 { rtx prev; *************** strength_reduce (scan_start, end, loop_t *** 5065,5070 **** --- 5101,5115 ---- else insert_before = v->insn; + /* If the biv was recognized from a REG_EQUAL note, we + can have the special case that the giv is used in the + biv increment. Then the giv increment must be put + after the biv increment, which is typically actually + a copy of the giv into the biv. */ + if (reg_overlap_mentioned_p (v->dest_reg, + SET_SRC (single_set (tv->insn)))) + insert_before = NEXT_INSN (tv->insn); + if (tv->mult_val == const1_rtx) emit_iv_add_mult (tv->add_val, v->mult_val, v->new_reg, v->new_reg, insert_before); *************** strength_reduce (scan_start, end, loop_t *** 5311,5317 **** if (unrolled_insn_copies < 0) unrolled_insn_copies = 0; } ! /* Unroll loops from within strength reduction so that we can use the induction variable information that strength_reduce has already collected. 
Always unroll loops that would be as small or smaller --- 5356,5362 ---- if (unrolled_insn_copies < 0) unrolled_insn_copies = 0; } ! /* Unroll loops from within strength reduction so that we can use the induction variable information that strength_reduce has already collected. Always unroll loops that would be as small or smaller *************** find_mem_givs (x, insn, not_every_iterat *** 5436,5446 **** struct induction *v = (struct induction *) oballoc (sizeof (struct induction)); record_giv (v, insn, src_reg, addr_placeholder, mult_val, add_val, benefit, DEST_ADDR, not_every_iteration, maybe_multiple, &XEXP (x, 0), loop_start, loop_end); - - v->mem_mode = GET_MODE (x); } } return; --- 5481,5490 ---- struct induction *v = (struct induction *) oballoc (sizeof (struct induction)); + v->mem_mode = GET_MODE (x); record_giv (v, insn, src_reg, addr_placeholder, mult_val, add_val, benefit, DEST_ADDR, not_every_iteration, maybe_multiple, &XEXP (x, 0), loop_start, loop_end); } } return; *************** record_giv (v, insn, src_reg, dest_reg, *** 5622,5629 **** --- 5666,5680 ---- v->auto_inc_opt = 0; v->unrolled = 0; v->shared = 0; + v->autoinc_pred = 0; + v->autoinc_succ = 0; + v->preinc = 0; + v->leading_combined = 0; v->derived_from = 0; + v->did_derive = 0; v->last_use = 0; + v->combine_start_limit = 0; + v->combine_end_limit = 0; /* The v->always_computable field is used in update_giv_derive, to determine whether a giv can be used to derive another giv. For a *************** record_giv (v, insn, src_reg, dest_reg, *** 5774,5779 **** --- 5825,5894 ---- } } + #ifdef AUTO_INC_DEC + if (last_recorded_addr_giv + && last_recorded_addr_giv->src_reg == src_reg + && rtx_equal_p (last_recorded_addr_giv->mult_val, mult_val) + && GET_CODE (add_val) == CONST_INT) + { + /* Check if changing the previous giv to post-increment would allow + to generate the value of the current giv. */ + if (! 
last_recorded_addr_giv->preinc + && ((HAVE_POST_INCREMENT + && (INTVAL (add_val) - INTVAL (last_recorded_addr_giv->add_val) + == GET_MODE_SIZE (last_recorded_addr_giv->mem_mode))) + || (HAVE_POST_DECREMENT + && ((INTVAL (add_val) + - INTVAL (last_recorded_addr_giv->add_val)) + == -GET_MODE_SIZE (last_recorded_addr_giv->mem_mode)))) + && ! (combine_givs_p (last_recorded_addr_giv, v) + || combine_givs_p (v, last_recorded_addr_giv))) + { + last_recorded_addr_giv->autoinc_pred = 1; + v->autoinc_succ = 1; + /* Record only one autoinc opportunity for LAST_RECORDED_ADDR_GIV. */ + last_recorded_addr_giv = 0; + } + else if ((HAVE_PRE_INCREMENT + && type == DEST_ADDR + && (INTVAL (add_val) - INTVAL (last_recorded_addr_giv->add_val) + == GET_MODE_SIZE (v->mem_mode))) + || (HAVE_PRE_DECREMENT + && type == DEST_ADDR + && (INTVAL (add_val) - INTVAL (last_recorded_addr_giv->add_val) + == -GET_MODE_SIZE (v->mem_mode)))) + { + struct induction *succ = v; + + if (last_recorded_giv->giv_type == DEST_REG + && rtx_equal_p (last_recorded_giv->add_val, v->add_val) + && rtx_equal_p (last_recorded_giv->mult_val, v->mult_val)) + succ = last_recorded_giv; + last_recorded_addr_giv->autoinc_pred = 1; + succ->autoinc_succ = 1; + v->preinc = 1; + /* Record only one autoinc opportunity for LAST_RECORDED_ADDR_GIV. */ + last_recorded_addr_giv = 0; + } + } + + last_recorded_giv = v; + + /* Only record DEST_ADDR givs as such for following auto_increment tests + if we can use them at all. */ + if (type == DEST_ADDR + && GET_CODE (add_val) == CONST_INT + /* And only if it's likely to be useful. The typical case uses a + structure, sub-array or several array members each iteration, so + we should see an increment that is larger than the individual + access size. 
*/ + && (GET_CODE (reg_biv_class[REGNO (src_reg)]->biv->add_val) != CONST_INT + || (abs (INTVAL (v->mult_val) + * INTVAL (reg_biv_class[REGNO (src_reg)]->biv->add_val)) + > GET_MODE_SIZE (v->mem_mode)))) + last_recorded_addr_giv = v; + #endif + if (loop_dump_stream) { if (type == DEST_REG) *************** check_final_value (v, loop_start, loop_e *** 5915,5921 **** last_giv_use = p; } } ! /* Now that the lifetime of the giv is known, check for branches from within the lifetime to outside the lifetime if it is still replaceable. */ --- 6030,6036 ---- last_giv_use = p; } } ! /* Now that the lifetime of the giv is known, check for branches from within the lifetime to outside the lifetime if it is still replaceable. */ *************** general_induction_var (x, src_reg, add_v *** 6281,6289 **** --- 6396,6406 ---- rtx orig_x = x; char *storage; + #if 0 /* Invariants are useful to derive other givs from. */ /* If this is an invariant, forget it, it isn't a giv. */ if (invariant_p (x) == 1) return 0; + #endif /* See if the expression could be a giv and get its form. Mark our place on the obstack in case we don't find a giv. */ *************** simplify_giv_expr (x, benefit) *** 6636,6646 **** *benefit += v->benefit; if (v->cant_derive) return 0; - - tem = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, - v->src_reg, v->mult_val), - v->add_val); if (v->derive_adjustment) tem = gen_rtx_MINUS (mode, tem, v->derive_adjustment); return simplify_giv_expr (tem, benefit); --- 6753,6765 ---- *benefit += v->benefit; if (v->cant_derive) return 0; + if (v->mult_val != const0_rtx) + tem = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, + v->src_reg, v->mult_val), + v->add_val); + else + tem = v->add_val; if (v->derive_adjustment) tem = gen_rtx_MINUS (mode, tem, v->derive_adjustment); return simplify_giv_expr (tem, benefit); *************** express_from (g1, g2) *** 7076,7085 **** mult = gen_rtx_PLUS (g2->mode, mult, XEXP (add, 0)); add = tem; } ! return gen_rtx_PLUS (g2->mode, mult, add); } ! 
} \f /* Return an rtx, if any, that expresses giv G2 as a function of the register --- 7195,7204 ---- mult = gen_rtx_PLUS (g2->mode, mult, XEXP (add, 0)); add = tem; } ! return gen_rtx_PLUS (g2->mode, mult, add); } ! } \f /* Return an rtx, if any, that expresses giv G2 as a function of the register *************** combine_givs_p (g1, g2) *** 7101,7106 **** --- 7220,7241 ---- if (tem == g1->dest_reg && (g1->giv_type == DEST_REG || g2->giv_type == DEST_ADDR)) { + /* Don't combine if this would prevent an autoinc opportunity. + We only do this check in if both givs are the same; if they + can be combined even though they are different, then it is likely + that the putative autoinc-pair can be combined without an + autoincrement, too. We don't want to prevent that combination. */ + #ifdef AUTO_INC_DEC + if ((g1->combine_start_limit + && loop_insn_first_p (g2->insn, g1->combine_start_limit)) + || (g1->combine_end_limit + && loop_insn_first_p (g1->combine_end_limit, g2->insn)) + || (g2->combine_start_limit + && loop_insn_first_p (g1->insn, g2->combine_start_limit)) + || (g2->combine_end_limit + && loop_insn_first_p (g2->combine_end_limit, g1->insn))) + return 0; + #endif return g1->dest_reg; } *************** cmp_combine_givs_stats (xp, yp) *** 7151,7156 **** --- 7286,7312 ---- return d; } + static int + cmp_giv_by_value_and_insn (xp, yp) + struct induction **xp, **yp; + { + struct induction *x = *xp, *y = *yp; + HOST_WIDE_INT d; + + d = (int) GET_CODE (x->mult_val) - (int) GET_CODE (y->mult_val); + if (! d && GET_CODE (x->mult_val) == CONST_INT) + d = INTVAL (x->mult_val) - INTVAL (y->mult_val); + if (! d) + d = (int) GET_CODE (x->add_val) - (int) GET_CODE (y->add_val); + if (! d && GET_CODE (x->add_val) == CONST_INT) + d = INTVAL (x->add_val) - INTVAL (y->add_val); + if (d) + return d < 0 ? -1 : 1; + if (x->insn == y->insn) + return xp - yp; + return loop_insn_first_p (x->insn, y->insn) ? 
-1 : 1; + } + /* Check all pairs of givs for iv_class BL and see if any can be combined with any other. If so, point SAME to the giv combined with and set NEW_REG to be an expression (in terms of the other giv's DEST_REG) equivalent to the *************** combine_givs (bl) *** 7176,7186 **** --- 7332,7393 ---- giv_array = (struct induction **) alloca (giv_count * sizeof (struct induction *)); + + #ifdef AUTO_INC_DEC + /* Order givs by mult_val / add_val / position in insn stream. */ i = 0; for (g1 = bl->giv; g1; g1 = g1->next_iv) if (!g1->ignore) giv_array[i++] = g1; + qsort (giv_array, giv_count, sizeof(*giv_array), cmp_giv_by_value_and_insn); + + /* Go through givs forward in insn stream order, set combine_start_limit + for giv that flag autoinc_succ or follow one with matching + add_val/mult_val that flags it. */ + g1 = NULL_PTR; + for (i = 0; i < giv_count; i++) + { + g2 = giv_array[i]; + if (g2->autoinc_succ) + g1 = g2; + else if (! g1) + continue; + else if (! rtx_equal_p (g1->mult_val, g2->mult_val) + || ! rtx_equal_p (g1->add_val, g2->add_val)) + { + g1 = 0; + continue; + } + g2->combine_start_limit = g1->insn; + } + + /* Go through givs backward in insn stream order, set combine_end_limit + for giv that flag autoinc_pred or follow one with matching + add_val/mult_val that flags it. */ + g1 = NULL_PTR; + for (i = giv_count - 1; i >= 0; i--) + { + g2 = giv_array[i]; + if (g2->autoinc_pred) + g1 = g2; + else if (! g1) + continue; + else if (! rtx_equal_p (g1->mult_val, g2->mult_val) + || ! 
rtx_equal_p (g1->add_val, g2->add_val)) + { + g1 = 0; + continue; + } + g2->combine_end_limit = g1->insn; + } + #endif /* AUTO_INC_DEC */ + + i = 0; + for (g1 = bl->giv; g1; g1 = g1->next_iv) + if (!g1->ignore) + giv_array[i++] = g1; + stats = (struct combine_givs_stats *) xcalloc (giv_count, sizeof (*stats)); can_combine = (rtx *) xcalloc (giv_count, giv_count * sizeof(rtx)); *************** struct recombine_givs_stats *** 7322,7327 **** --- 7529,7537 ---- { int giv_number; int start_luid, end_luid; + rtx start_insn; /* First insn in loop order in which the giv (including + combinations) is used; Initialized to NULL_RTX; set + to a NOTE when invalid. */ }; /* Used below as comparison function for qsort. We want a ascending luid *************** cmp_recombine_givs_stats (xp, yp) *** 7344,7356 **** return d; } ! /* Scan X, which is a part of INSN, for the end of life of a giv. Also ! look for the start of life of a giv where the start has not been seen ! yet to unlock the search for the end of its life. ! Only consider givs that belong to BIV. ! Return the total number of lifetime ends that have been found. */ ! static int ! find_life_end (x, stats, insn, biv) rtx x, insn, biv; struct recombine_givs_stats *stats; { --- 7554,7605 ---- return d; } ! /* The last label we encountered while scanning forward for giv uses. ! Is initialized to SCAN_START (not necessarily a label) in recombine_givs. */ ! static rtx loop_last_label; ! ! /* V, a giv, is used in INSN. ! FROM_COMBINED is set if the use comes (possibly) from a combined giv. ! It must not be set if there are no combined givs for this giv, since ! this can confuse giv derivation to move the giv insn to the wrong place. ! Update start_insn / end_luid in STATS accordingly. */ ! static void ! note_giv_use (v, insn, from_combined, stats) ! struct induction *v; ! rtx insn; ! int from_combined; ! struct recombine_givs_stats *stats; ! { ! if (stats[v->ix].start_insn) ! { ! 
if (loop_insn_first_p (stats[v->ix].start_insn, loop_last_label) ! && (loop_insn_first_p (loop_last_label, insn) ! || loop_insn_first_p (insn, stats[v->ix].start_insn))) ! stats[v->ix].start_insn = loop_number_loop_starts[0]; ! } ! else ! { ! rtx p; ! ! stats[v->ix].start_insn = insn; ! if (from_combined) ! v->leading_combined = 1; ! ! /* Update start_luid now so that we won't loose this information it ! when we invalidate start_insn. */ ! for (p = insn; INSN_UID (p) >= max_uid_for_loop; ) ! p = PREV_INSN (p); ! stats[v->ix].start_luid = INSN_LUID (p); ! } ! while (INSN_UID (insn) >= max_uid_for_loop) ! insn = NEXT_INSN (insn); ! stats[v->ix].end_luid = INSN_LUID (insn); ! } ! ! /* Scan X, which is a part of INSN, for uses of givs. ! Only consider givs that belong to BIV. */ ! static void ! find_giv_uses (x, stats, insn, biv) rtx x, insn, biv; struct recombine_givs_stats *stats; { *************** find_life_end (x, stats, insn, biv) *** 7372,7419 **** if (REG_IV_TYPE (regno) == GENERAL_INDUCT && ! v->ignore ! && v->src_reg == biv ! && stats[v->ix].end_luid <= 0) { ! /* If we see a 0 here for end_luid, it means that we have ! scanned the entire loop without finding any use at all. ! We must not predicate this code on a start_luid match ! since that would make the test fail for givs that have ! been hoisted out of inner loops. */ ! if (stats[v->ix].end_luid == 0) { ! stats[v->ix].end_luid = stats[v->ix].start_luid; ! return 1 + find_life_end (SET_SRC (x), stats, insn, biv); } - else if (stats[v->ix].start_luid == INSN_LUID (insn)) - stats[v->ix].end_luid = 0; } - return find_life_end (SET_SRC (x), stats, insn, biv); } break; } case REG: { int regno = REGNO (x); ! struct induction *v = REG_IV_INFO (regno); ! ! if (REG_IV_TYPE (regno) == GENERAL_INDUCT ! && ! v->ignore ! && v->src_reg == biv ! && stats[v->ix].end_luid == 0) { ! while (INSN_UID (insn) >= max_uid_for_loop) ! insn = NEXT_INSN (insn); ! stats[v->ix].end_luid = INSN_LUID (insn); ! return 1; } ! 
return 0; } case LABEL_REF: case CONST_DOUBLE: case CONST_INT: case CONST: ! return 0; default: break; } --- 7621,7696 ---- if (REG_IV_TYPE (regno) == GENERAL_INDUCT && ! v->ignore ! && v->src_reg == biv) ! { ! /* Since we are setting a non-ignored general induction ! variable, this insn will be changed or go away, hence ! we don't have to consider uses in the SET_SRC. */ ! return; ! } ! find_giv_uses (SET_SRC (x), stats, insn, biv); ! return; ! } ! break; ! } ! /* If this is a reduced DEST_ADDR giv, the original address doesn't ! count; but if the giv has been combined with another one, we must ! count the use there. */ ! case MEM: ! { ! rtx src_reg; ! rtx add_val; ! rtx mult_val; ! int benefit; ! struct induction *v; ! ! if (general_induction_var (XEXP (x, 0), &src_reg, &add_val, ! &mult_val, 1, &benefit) ! && src_reg == biv) ! { ! for (v = reg_biv_class[REGNO (biv)]->giv; v; v = v->next_iv) { ! if (v->location == &XEXP (x, 0)) { ! int from_combined = 0; ! ! if (v->same) ! { ! v = v->same; ! from_combined = 1; ! } ! if (v->ignore) ! break; ! note_giv_use (v, insn, from_combined, stats); ! return; } } } break; } case REG: { int regno = REGNO (x); ! if (REG_IV_TYPE (regno) == GENERAL_INDUCT) { ! struct induction *v = REG_IV_INFO (regno); ! int from_combined = 0; ! ! if (v->same) ! { ! v = v->same; ! from_combined = 1; ! } ! if (! v->ignore && v->src_reg == biv) ! note_giv_use (v, insn, from_combined, stats); } ! return; } case LABEL_REF: case CONST_DOUBLE: case CONST_INT: case CONST: ! return; default: break; } *************** find_life_end (x, stats, insn, biv) *** 7422,7434 **** for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) { if (fmt[i] == 'e') ! retval += find_life_end (XEXP (x, i), stats, insn, biv); else if (fmt[i] == 'E') for (j = XVECLEN (x, i) - 1; j >= 0; j--) ! retval += find_life_end (XVECEXP (x, i, j), stats, insn, biv); } ! 
return retval; } /* For each giv that has been combined with another, look if --- 7699,7711 ---- for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) { if (fmt[i] == 'e') ! find_giv_uses (XEXP (x, i), stats, insn, biv); else if (fmt[i] == 'E') for (j = XVECLEN (x, i) - 1; j >= 0; j--) ! find_giv_uses (XVECEXP (x, i, j), stats, insn, biv); } ! return; } /* For each giv that has been combined with another, look if *************** find_life_end (x, stats, insn, biv) *** 7436,7451 **** This tends to shorten giv lifetimes, and helps the next step: try to derive givs from other givs. */ static void ! recombine_givs (bl, loop_start, loop_end, unroll_p) struct iv_class *bl; ! rtx loop_start, loop_end; int unroll_p; { struct induction *v, **giv_array, *last_giv; struct recombine_givs_stats *stats; int giv_count; int i, rescan; ! int ends_need_computing; for (giv_count = 0, v = bl->giv; v; v = v->next_iv) { --- 7713,7732 ---- This tends to shorten giv lifetimes, and helps the next step: try to derive givs from other givs. */ static void ! recombine_givs (bl, scan_start, loop_start, loop_end, loop_top, unroll_p) struct iv_class *bl; ! rtx scan_start, loop_start, loop_end, loop_top; int unroll_p; { struct induction *v, **giv_array, *last_giv; struct recombine_givs_stats *stats; int giv_count; int i, rescan; ! int n_giv_live_after_loop; ! struct induction **giv_live_after_loop; ! rtx biv_use_start, biv_use_end; ! struct induction *biv_giv; ! int life_start, life_end; for (giv_count = 0, v = bl->giv; v; v = v->next_iv) { *************** recombine_givs (bl, loop_start, loop_end *** 7456,7469 **** = (struct induction **) xmalloc (giv_count * sizeof (struct induction *)); stats = (struct recombine_givs_stats *) xmalloc (giv_count * sizeof *stats); ! /* Initialize stats and set up the ix field for each giv in stats to name ! the corresponding index into stats. */ ! for (i = 0, v = bl->giv; v; v = v->next_iv) { rtx p; if (v->ignore) ! 
continue; giv_array[i] = v; stats[i].giv_number = i; /* If this giv has been hoisted out of an inner loop, use the luid of --- 7737,7757 ---- = (struct induction **) xmalloc (giv_count * sizeof (struct induction *)); stats = (struct recombine_givs_stats *) xmalloc (giv_count * sizeof *stats); ! /* Initialize stats, and clear the live_after_loop fields. ! Also note where the biv is used by unreduced givs. */ ! for (i = 0, biv_use_start = biv_use_end = 0, v = bl->giv; v; v = v->next_iv) { rtx p; if (v->ignore) ! { ! if (! biv_use_start || loop_insn_first_p (v->insn, biv_use_start)) ! biv_use_start = v->insn; ! if (! biv_use_end || loop_insn_first_p (biv_use_end, v->insn)) ! biv_use_end = v->insn; ! continue; ! } ! v->live_after_loop = 0; giv_array[i] = v; stats[i].giv_number = i; /* If this giv has been hoisted out of an inner loop, use the luid of *************** recombine_givs (bl, loop_start, loop_end *** 7471,7476 **** --- 7759,7765 ---- for (p = v->insn; INSN_UID (p) >= max_uid_for_loop; ) p = PREV_INSN (p); stats[i].start_luid = INSN_LUID (p); + stats[i].start_insn = NULL_RTX; i++; } *************** recombine_givs (bl, loop_start, loop_end *** 7525,7654 **** last_giv = v; } ! ends_need_computing = 0; ! /* For each DEST_REG giv, compute lifetime starts, and try to compute ! lifetime ends from regscan info. */ ! for (i = giv_count - 1; i >= 0; i--) { ! v = giv_array[stats[i].giv_number]; ! if (v->ignore) continue; ! if (v->giv_type == DEST_ADDR) ! { ! /* Loop unrolling of an inner loop can even create new DEST_REG ! givs. */ ! rtx p; ! for (p = v->insn; INSN_UID (p) >= max_uid_for_loop; ) ! p = PREV_INSN (p); ! stats[i].start_luid = stats[i].end_luid = INSN_LUID (p); ! if (p != v->insn) ! stats[i].end_luid++; ! } ! else /* v->giv_type == DEST_REG */ ! { ! if (v->last_use) ! { ! stats[i].start_luid = INSN_LUID (v->insn); ! stats[i].end_luid = INSN_LUID (v->last_use); ! } ! else if (INSN_UID (v->insn) >= max_uid_for_loop) ! { ! rtx p; ! 
/* This insn has been created by loop optimization on an inner ! loop. We don't have a proper start_luid that will match ! when we see the first set. But we do know that there will ! be no use before the set, so we can set end_luid to 0 so that ! we'll start looking for the last use right away. */ ! for (p = PREV_INSN (v->insn); INSN_UID (p) >= max_uid_for_loop; ) ! p = PREV_INSN (p); ! stats[i].start_luid = INSN_LUID (p); ! stats[i].end_luid = 0; ! ends_need_computing++; ! } ! else ! { ! int regno = REGNO (v->dest_reg); ! int count = VARRAY_INT (n_times_set, regno) - 1; ! rtx p = v->insn; ! ! /* Find the first insn that sets the giv, so that we can verify ! if this giv's lifetime wraps around the loop. We also need ! the luid of the first setting insn in order to detect the ! last use properly. */ ! while (count) ! { ! p = prev_nonnote_insn (p); ! if (reg_set_p (v->dest_reg, p)) ! count--; ! } ! stats[i].start_luid = INSN_LUID (p); ! if (stats[i].start_luid > uid_luid[REGNO_FIRST_UID (regno)]) ! { ! stats[i].end_luid = -1; ! ends_need_computing++; ! } ! else ! { ! stats[i].end_luid = uid_luid[REGNO_LAST_UID (regno)]; ! if (stats[i].end_luid > INSN_LUID (loop_end)) ! { ! stats[i].end_luid = -1; ! ends_need_computing++; ! } ! } ! } ! } ! } ! /* If the regscan information was unconclusive for one or more DEST_REG ! givs, scan the all insn in the loop to find out lifetime ends. */ ! if (ends_need_computing) ! { ! rtx biv = bl->biv->src_reg; ! rtx p = loop_end; ! ! do ! { ! if (p == loop_start) ! p = loop_end; ! p = PREV_INSN (p); ! if (GET_RTX_CLASS (GET_CODE (p)) != 'i') ! continue; ! ends_need_computing -= find_life_end (PATTERN (p), stats, p, biv); } - while (ends_need_computing); } ! /* Set start_luid back to the last insn that sets the giv. This allows ! more combinations. */ ! for (i = giv_count - 1; i >= 0; i--) ! { ! v = giv_array[stats[i].giv_number]; ! if (v->ignore) ! continue; ! if (INSN_UID (v->insn) < max_uid_for_loop) ! 
stats[i].start_luid = INSN_LUID (v->insn); ! } ! /* Now adjust lifetime ends by taking combined givs into account. */ for (i = giv_count - 1; i >= 0; i--) { - unsigned luid; - int j; - v = giv_array[stats[i].giv_number]; ! if (v->ignore) continue; ! if (v->same && ! v->same->ignore) ! { ! j = v->same->ix; ! luid = stats[i].start_luid; ! /* Use unsigned arithmetic to model loop wrap-around. */ ! if (luid - stats[j].start_luid ! > (unsigned) stats[j].end_luid - stats[j].start_luid) ! stats[j].end_luid = luid; ! } } qsort (stats, giv_count, sizeof(*stats), cmp_recombine_givs_stats); --- 7814,7913 ---- last_giv = v; } ! /* Set up the giv_live_after_loop array. */ ! n_giv_live_after_loop = 0; ! giv_live_after_loop = NULL_PTR; ! for (v = bl->giv; v; v = v->next_iv) { ! struct induction *same; ! ! if (v->giv_type != DEST_REG || v->last_use) continue; ! if ((uid_luid[REGNO_FIRST_UID (REGNO (v->dest_reg))] ! > INSN_LUID (loop_start)) ! && (uid_luid[REGNO_LAST_UID (REGNO (v->dest_reg))] ! < INSN_LUID (loop_end))) ! continue; ! /* Sometimes the register is immediately overwritten after the loop. ! This happens particularily in the second loop pass, when we see ! the results of strength reduction in the first pass. */ ! if (flag_expensive_optimizations ! && reg_dead_after_loop (v->dest_reg, loop_start, loop_end)) ! continue; ! same = v->same ? v->same : v; ! if (! same->ignore ! && ! same->live_after_loop) ! { ! same->live_after_loop = 1; ! if (! giv_live_after_loop) ! giv_live_after_loop ! = (struct induction **) alloca (sizeof (struct induction *) ! * giv_count); ! giv_live_after_loop[n_giv_live_after_loop++] = same; } } ! /* Scan all the insns in the loop to find out lifetime starts and ends. */ ! { ! rtx biv = bl->biv->src_reg; ! rtx p = loop_end; ! for (loop_last_label = scan_start, p = scan_start; p; ! p = next_insn_in_loop (p, scan_start, loop_end, loop_top)) ! { ! if (GET_CODE (p) == CODE_LABEL) ! loop_last_label = p; ! else if (GET_RTX_CLASS (GET_CODE (p)) == 'i') ! 
{ ! find_giv_uses (PATTERN (p), stats, p, biv); ! /* If this is a jump, we have to consider uses outside the loop. */ ! if (GET_CODE (p) == JUMP_INSN && GET_CODE (PATTERN (p)) != RETURN) ! { ! int is_loop_exit = 1; ! rtx label; ! ! if (condjump_p (p) || condjump_in_parallel_p (p)) ! { ! label = XEXP (condjump_label (p), 0); ! /* If the destination is within the loop, and this ! is not a conditional branch at the loop end, this ! is not a loop exit. */ ! if (loop_insn_first_p (loop_start, label) ! && loop_insn_first_p (label, loop_end) ! && (simplejump_p (p) ! /* Shortcut for forward branches - by definition, ! they can't be the end of the loop */ ! || loop_insn_first_p (p, label) ! || ! no_labels_between_p (p, loop_end))) ! is_loop_exit = 0; ! } ! ! if (is_loop_exit) ! { ! for (i = n_giv_live_after_loop -1; i >= 0; i--) ! /* We don't have recorded which givs are life after the ! loop only because their giv register is life, or ! (also) because a combined giv is life after the loop, ! so just pretend it is the latter if any other givs ! have been combined with this one. */ ! note_giv_use (giv_live_after_loop[i], p, ! giv_live_after_loop[i]->combined_with, ! stats); ! } ! } ! } ! } ! } ! /* Ignore givs that are not used at all. */ for (i = giv_count - 1; i >= 0; i--) { v = giv_array[stats[i].giv_number]; ! if (v->ignore || v->same) continue; ! if (! stats[i].start_insn) ! v->ignore = 1; } qsort (stats, giv_count, sizeof(*stats), cmp_recombine_givs_stats); *************** recombine_givs (bl, loop_start, loop_end *** 7664,7680 **** When we are finished with the current LAST_GIV (i.e. the inner loop terminates), we start again with rescan, which then becomes the new LAST_GIV. */ for (i = giv_count - 1; i >= 0; i = rescan) { ! int life_start, life_end; ! for (last_giv = 0, rescan = -1; i >= 0; i--) { rtx sum; v = giv_array[stats[i].giv_number]; ! if (v->giv_type != DEST_REG || v->derived_from || v->same) continue; if (! 
last_giv) { /* Don't use a giv that's likely to be dead to derive --- 7923,7964 ---- When we are finished with the current LAST_GIV (i.e. the inner loop terminates), we start again with rescan, which then becomes the new LAST_GIV. */ + + /* The biv is also a giv, of sorts. If it can't be eliminated, we + might as well consider to derive givs from it. */ + if (biv_use_start && bl->biv_count == 1) + { + biv_giv = (struct induction *) oballoc (sizeof *biv_giv); + biv_giv->add_val = const0_rtx; + biv_giv->mult_val = const1_rtx; + biv_giv->dest_reg = biv_giv->new_reg = regno_reg_rtx[bl->regno]; + biv_giv->insn = bl->biv->insn; /* Used for debugging dump. */ + last_giv = biv_giv; + while (INSN_UID (biv_use_start) >= max_uid_for_loop) + biv_use_start = PREV_INSN (biv_use_start); + life_start = INSN_LUID (biv_use_start); + while (INSN_UID (biv_use_end) >= max_uid_for_loop) + biv_use_end = NEXT_INSN (biv_use_end); + life_end = INSN_LUID (biv_use_end); + } + else + last_giv = 0; + for (i = giv_count - 1; i >= 0; i = rescan) { ! rtx add_insn, trial_add_insn = NULL_RTX; ! for (rescan = -1; i >= 0; i--) { rtx sum; v = giv_array[stats[i].giv_number]; ! if (v->derived_from || v->same || v->ignore) continue; + + if (! v->new_reg) + v->new_reg = gen_reg_rtx (v->mode); + if (! last_giv) { /* Don't use a giv that's likely to be dead to derive *************** recombine_givs (bl, loop_start, loop_end *** 7687,7704 **** } continue; } /* Use unsigned arithmetic to model loop wrap around. */ if (((unsigned) stats[i].start_luid - life_start >= (unsigned) life_end - life_start) && ((unsigned) stats[i].end_luid - life_start > (unsigned) life_end - life_start) - /* Check that the giv insn we're about to use for deriving - precedes all uses of that giv. Note that initializing the - derived giv would defeat the purpose of reducing register - pressure. - ??? We could arrange to move the insn. 
*/ - && ((unsigned) stats[i].end_luid - INSN_LUID (loop_start) - > (unsigned) stats[i].start_luid - INSN_LUID (loop_start)) && rtx_equal_p (last_giv->mult_val, v->mult_val) /* ??? Could handle libcalls, but would need more logic. */ && ! find_reg_note (v->insn, REG_RETVAL, NULL_RTX) --- 7971,7998 ---- } continue; } + + /* ??? We would save some time by setting up add_insn only + immediately before it is going to be used, but that would + make the multi-line conditional below even harder to read. */ + if (v->giv_type == DEST_REG) + add_insn = v->insn; + else + { + if (! trial_add_insn) + { + trial_add_insn = make_insn_raw (NULL_RTX); + PREV_INSN (trial_add_insn) = NULL_RTX; + NEXT_INSN (trial_add_insn) = NULL_RTX; + } + add_insn = trial_add_insn; + } + /* Use unsigned arithmetic to model loop wrap around. */ if (((unsigned) stats[i].start_luid - life_start >= (unsigned) life_end - life_start) && ((unsigned) stats[i].end_luid - life_start > (unsigned) life_end - life_start) && rtx_equal_p (last_giv->mult_val, v->mult_val) /* ??? Could handle libcalls, but would need more logic. */ && ! find_reg_note (v->insn, REG_RETVAL, NULL_RTX) *************** recombine_givs (bl, loop_start, loop_end *** 7708,7737 **** don't have this detailed control flow information. N.B. since last_giv will be reduced, it is valid anywhere in the loop, so we don't need to check the ! validity of last_giv. ! We rely here on the fact that v->always_executed implies that ! there is no jump to someplace else in the loop before the ! giv insn, and hence any insn that is executed before the ! giv insn in the loop will have a lower luid. */ ! && (v->always_executed || ! v->combined_with) && (sum = express_from (last_giv, v)) /* Make sure we don't make the add more expensive. ADD_COST doesn't take different costs of registers and constants into account, so compare the cost of the actual SET_SRCs. */ ! && (rtx_cost (sum, SET) ! <= rtx_cost (SET_SRC (single_set (v->insn)), SET)) /* ??? 
unroll can't understand anything but reg + const_int sums. It would be cleaner to fix unroll. */ && ((GET_CODE (sum) == PLUS && GET_CODE (XEXP (sum, 0)) == REG && GET_CODE (XEXP (sum, 1)) == CONST_INT) || ! unroll_p) ! && validate_change (v->insn, &PATTERN (v->insn), ! gen_rtx_SET (VOIDmode, v->dest_reg, sum), 0)) { v->derived_from = last_giv; life_end = stats[i].end_luid; if (loop_dump_stream) { fprintf (loop_dump_stream, --- 8002,8088 ---- don't have this detailed control flow information. N.B. since last_giv will be reduced, it is valid anywhere in the loop, so we don't need to check the ! validity of last_giv. */ ! && (GET_CODE (stats[i].start_insn) != NOTE ! || ! v->combined_with ! /* We rely here on the fact that v->always_executed implies ! that there is no jump to someplace else in the loop before ! the giv insn, and hence any insn that is executed before ! the giv insn in the loop will have a lower luid. */ ! || (v->giv_type == DEST_REG ! && v->always_executed ! && ! v->leading_combined ! /* Check that the giv insn we're about to use for ! deriving precedes all uses of that giv. Note that ! initializing the derived giv would defeat the purpose ! of reducing register pressure. */ ! && ((unsigned) stats[i].end_luid - INSN_LUID (scan_start) ! > ((unsigned) stats[i].start_luid ! - INSN_LUID (scan_start))))) ! /* If we are deriving from the biv, this must be before the biv ! increment. */ ! && (last_giv != biv_giv ! || loop_insn_first_p ((v->leading_combined ! ? stats[i].start_insn : v->insn), ! bl->biv->insn)) && (sum = express_from (last_giv, v)) /* Make sure we don't make the add more expensive. ADD_COST doesn't take different costs of registers and constants into account, so compare the cost of the actual SET_SRCs. */ ! && (v->giv_type != DEST_REG ! || (rtx_cost (sum, SET) ! <= rtx_cost (SET_SRC (single_set (v->insn)), SET))) /* ??? unroll can't understand anything but reg + const_int sums. It would be cleaner to fix unroll. 
*/ && ((GET_CODE (sum) == PLUS && GET_CODE (XEXP (sum, 0)) == REG && GET_CODE (XEXP (sum, 1)) == CONST_INT) || ! unroll_p) ! && validate_change (add_insn, &PATTERN (add_insn), ! gen_rtx_SET (VOIDmode, v->new_reg, sum), 0)) { + struct induction *tv; + + last_giv->did_derive = 1; v->derived_from = last_giv; life_end = stats[i].end_luid; + if (v->giv_type == DEST_ADDR) + { + trial_add_insn = NULL_RTX; + reorder_insns (add_insn, add_insn, + PREV_INSN (stats[i].start_insn)); + } + /* Check if we want / have to move this giv. */ + else if (v->leading_combined) + { + rtx insert_after = PREV_INSN (stats[i].start_insn); + rtx prev = PREV_INSN (v->insn); + rtx next = NEXT_INSN (v->insn); + #ifdef HAVE_cc0 + if (GET_RTX_CLASS (GET_CODE (insert_after)) == 'i' + && sets_cc0_p (PATTERN (insert_after))) + insert_after = PREV_INSN (insert_after); + #endif + if (v->insn == insert_after + || prev == insert_after) + ; /* do nothing */ + else if (loop_insn_first_p (v->insn, insert_after)) + { + reorder_insns (v->insn, v->insn, insert_after); + while (INSN_UID (prev) >= max_uid_for_loop) + prev = PREV_INSN (prev); + compute_luids (next, v->insn, INSN_LUID (prev)); + } + else + { + reorder_insns (v->insn, v->insn, insert_after); + while (INSN_UID (insert_after) >= max_uid_for_loop) + insert_after = PREV_INSN (insert_after); + compute_luids (v->insn, prev, INSN_LUID (insert_after)); + } + } + if (loop_dump_stream) { fprintf (loop_dump_stream, *************** recombine_givs (bl, loop_start, loop_end *** 7740,7749 **** --- 8091,8147 ---- print_rtl (loop_dump_stream, sum); putc ('\n', loop_dump_stream); } + + /* In case LAST_GIV->dest_reg is not replaceable, we have + to replace it in ADD_INSN now. */ + PATTERN (add_insn) + = replace_rtx (PATTERN (add_insn), last_giv->dest_reg, + last_giv->new_reg); + + /* For each place where the biv is incremented, add an + insn to set the new, reduced reg for the giv. 
+ We used to do this only for biv_count != 1, but + this fails when there is a giv after a single biv + increment, e.g. when the last giv was expressed as + pre-decrement. + We do this here (rather than at giv derivation time) because + we want to copy ADD_INSN - which is not the same as V->insn + for DEST_ADDR givs - and to exploit the lifetime + information we have. */ + for (tv = bl->biv; tv; tv = tv->next_iv) + { + /* If the biv increment precedes ADD_INSN, we can ignore it. + Only handle the most common case here. */ + if (loop_insn_first_p (tv->insn, add_insn) + && (loop_insn_first_p (scan_start, tv->insn) + || loop_insn_first_p (add_insn, scan_start))) + continue; + /* Likewise if the biv increment is after the last giv use. + Only handle the most common case here. */ + if (INSN_UID (tv->insn) < max_uid_for_loop + && stats[i].end_luid < INSN_LUID (tv->insn) + && INSN_LUID (scan_start) < stats[i].end_luid) + continue; + + /* We always emit reduced giv increments before the biv + increment when bl->biv_count != 1. So by emitting + the add insns for derived givs after the biv increment, + they pick up the updated value of the reduced giv. + If the reduced giv is processed with auto_inc_opt == 1, + then it is incremented earlier than the biv, hence we'll + still pick up the right value. + If it's processed with auto_inc_opt == -1, + that implies that the biv increment is before the + first reduced giv's use. The derived giv's lifetime + is after the reduced giv's lifetime, hence in this + case, the biv increment doesn't matter. */ + emit_insn_after (copy_rtx (PATTERN (add_insn)), tv->insn); + } } else if (rescan < 0) rescan = i; } + last_giv = 0; } /* Clean up. */ *************** load_mems_and_recount_loop_regs_set (sca *** 9691,9697 **** int nregs = max_reg_num (); load_mems (scan_start, end, loop_top, start); ! /* Recalculate set_in_loop and friends since load_mems may have created new registers. 
*/ if (max_reg_num () > nregs) --- 10089,10095 ---- int nregs = max_reg_num (); load_mems (scan_start, end, loop_top, start); ! /* Recalculate set_in_loop and friends since load_mems may have created new registers. */ if (max_reg_num () > nregs) *************** load_mems_and_recount_loop_regs_set (sca *** 9724,9730 **** VARRAY_CHAR (may_not_optimize, i) = 1; VARRAY_INT (set_in_loop, i) = 1; } ! #ifdef AVOID_CCMODE_COPIES /* Don't try to move insns which set CC registers if we should not create CCmode register copies. */ --- 10122,10128 ---- VARRAY_CHAR (may_not_optimize, i) = 1; VARRAY_INT (set_in_loop, i) = 1; } ! #ifdef AVOID_CCMODE_COPIES /* Don't try to move insns which set CC registers if we should not create CCmode register copies. */ *************** replace_label (x, data) *** 10176,10182 **** if (XEXP (l, 0) != old_label) return 0; ! XEXP (l, 0) = new_label; ++LABEL_NUSES (new_label); --LABEL_NUSES (old_label); --- 10574,10580 ---- if (XEXP (l, 0) != old_label) return 0; ! XEXP (l, 0) = new_label; ++LABEL_NUSES (new_label); --LABEL_NUSES (old_label); Index: loop.h =================================================================== RCS file: /cvs/gcc/egcs/gcc/loop.h,v retrieving revision 1.20 diff -p -r1.20 loop.h *** loop.h 1999/12/08 03:22:33 1.20 --- loop.h 1999/12/11 00:42:40 *************** struct induction *** 102,107 **** --- 102,124 ---- unsigned shared : 1; unsigned no_const_addval : 1; /* 1 if add_val does not contain a const. */ unsigned multi_insn_incr : 1; /* 1 if multiple insns updated the biv. */ + + /* giv-giv autoinc in the following means that that a DEST_ADDR giv + can be formed from a preceding giv with the same mult_val but + different add_val by using auto-increment. */ + unsigned autoinc_pred : 1; /* 1 if predecessor in a giv-giv autoinc. */ + unsigned autoinc_succ : 1; /* 1 if successor in a giv-giv autoinc. */ + unsigned preinc : 1; /* 1 if considered for pre-increment in + giv-giv autoinc. 
*/ + unsigned live_after_loop : 1; /* Used inside recombine_givs to keep track + of which givs have already been included + in an array of givs live after the loop. */ + unsigned leading_combined : 1;/* In recombine_givs, set if this giv has been + combined with one or more other givs that + precede the giv insn of this giv. + Giv derivation then requires to move the + giv insn before the first use. */ + unsigned did_derive : 1; /* Set in recombine_givs. */ int lifetime; /* Length of life of this giv */ rtx derive_adjustment; /* If nonzero, is an adjustment to be subtracted from add_val when this giv *************** struct induction *** 128,133 **** --- 145,152 ---- that doesn't have this field set. */ rtx last_use; /* For a giv made from a biv increment, this is a substitute for the lifetime information. */ + rtx combine_start_limit; + rtx combine_end_limit; }; /* A `struct iv_class' is created for each biv. */ *************** void emit_unrolled_add PROTO((rtx, rtx, *** 257,262 **** --- 276,282 ---- int back_branch_in_range_p PROTO((rtx, rtx, rtx)); int loop_insn_first_p PROTO((rtx, rtx)); + int reg_dead_after_loop PROTO((rtx, rtx, rtx)); /* Forward declarations for non-static functions declared in stmt.c. 
*/ void find_loop_tree_blocks PROTO((void)); Index: rtl.h =================================================================== RCS file: /cvs/gcc/egcs/gcc/rtl.h,v retrieving revision 1.158 diff -p -r1.158 rtl.h *** rtl.h 1999/12/04 03:00:03 1.158 --- rtl.h 1999/12/11 00:42:42 *************** extern void add_clobbers PROTO ((rtx, i *** 1470,1475 **** --- 1470,1478 ---- extern void combine_instructions PROTO ((rtx, int)); extern int extended_count PROTO ((rtx, enum machine_mode, int)); extern rtx remove_death PROTO ((int, rtx)); + extern int validate_subst PROTO((rtx, rtx, rtx)); + extern void validate_subst_start PROTO((void)); + extern void validate_subst_undo PROTO((void)); #ifdef BUFSIZ extern void dump_combine_stats PROTO ((FILE *)); extern void dump_combine_total_stats PROTO ((FILE *)); Index: unroll.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/unroll.c,v retrieving revision 1.79 diff -p -r1.79 unroll.c *** unroll.c 1999/11/29 10:51:09 1.79 --- unroll.c 1999/12/11 00:42:43 *************** static int find_splittable_regs PROTO((e *** 205,211 **** unsigned HOST_WIDE_INT)); static int find_splittable_givs PROTO((struct iv_class *, enum unroll_types, rtx, rtx, rtx, int)); - static int reg_dead_after_loop PROTO((rtx, rtx, rtx)); static rtx fold_rtx_mult_add PROTO((rtx, rtx, rtx, enum machine_mode)); static int verify_addresses PROTO((struct induction *, rtx, int)); static rtx remap_split_bivs PROTO((rtx)); --- 205,210 ---- *************** find_splittable_givs (bl, unroll_type, l *** 3221,3227 **** /* ?? Could be made more intelligent in the handling of jumps, so that it can search past if statements and other similar structures. */ ! static int reg_dead_after_loop (reg, loop_start, loop_end) rtx reg, loop_start, loop_end; { --- 3220,3226 ---- /* ?? Could be made more intelligent in the handling of jumps, so that it can search past if statements and other similar structures. */ ! 
int reg_dead_after_loop (reg, loop_start, loop_end) rtx reg, loop_start, loop_end; { ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Loop patch update (Was: Re: Autoincrement examples) 1999-12-10 16:59 ` Loop patch update (Was: Re: Autoincrement examples) Joern Rennecke @ 1999-12-13 14:37 ` Joern Rennecke 1999-12-13 14:59 ` Autoincrement example (Was: Re: Loop patch update (Was: Re: Autoincrement examples)) Joern Rennecke ` (2 more replies) 1999-12-21 17:04 ` Joern Rennecke 1999-12-31 23:54 ` Joern Rennecke 2 siblings, 3 replies; 94+ messages in thread
From: Joern Rennecke @ 1999-12-13 14:37 UTC (permalink / raw)
To: m.hayes, gcc

I've found that there were still unnecessary givs left for this testcase
on SH:

typedef struct { char a,b,c,d,e,f,g,h; } s;
f (p, q, endp)
     s *p, *q;
     int *endp;
{
  int i;

  for (i = 0; i < *endp; i++)
    {
      p->a = q->b;
      p->b = q->c;
      p->c = q->d;
      p->d = q->e;
      p->e = q->f;
      p++, q++;

      p->a = q->c;
      p->b = q->d;
      p->c = q->e;
      p->d = q->f;
      p->e = q->g;
      p++, q++;

      p->a = q->d;
      p->b = q->e;
      p->c = q->f;
      p->d = q->g;
      p->e = q->h;
      p++, q++;
    }
}

The first problem - fixed by the first hunk - was an outright bug: the
sign of the calculated lifetime was wrong.

The second one is rather a missing feature: we didn't take into account that
reducing a DEST_REG giv that can be combined with a giv is really pretty
cheap.

Mon Dec 13 22:30:36 1999  J"orn Rennecke <amylaar@cygnus.co.uk>

	* loop.c (strength_reduce): Fix sign of giv lifetime calculation
	for givs made from biv increments.
	Don't reject giv for its cost if can likely be combined with the biv.

*** loop.c-19991210	Fri Dec 10 23:20:56 1999
--- loop.c	Mon Dec 13 22:28:49 1999
*************** strength_reduce (scan_start, end, loop_t
*** 4527,4533 ****
  	    }
  
  	  v->last_use = last_use_insn;
! 	  v->lifetime = INSN_LUID (v->insn) - INSN_LUID (last_use_insn);
  	  /* If the lifetime is zero, it means that this register is really
  	     a dead store.  So mark this as a giv that can be ignored.
  	     This will not prevent the biv from being eliminated.  */
--- 4527,4533 ----
  	    }
  
  	  v->last_use = last_use_insn;
! 	  v->lifetime = INSN_LUID (last_use_insn) - INSN_LUID (v->insn);
  	  /* If the lifetime is zero, it means that this register is really
  	     a dead store.  So mark this as a giv that can be ignored.
  	     This will not prevent the biv from being eliminated.  */
*************** strength_reduce (scan_start, end, loop_t
*** 4910,4916 ****
  	     of such giv's whether or not we know they are used after the loop
  	     exit.  */
  
! 	  if ( ! flag_reduce_all_givs && v->lifetime * threshold * benefit < insn_count
  	      && ! bl->reversed )
  	    {
  	      if (loop_dump_stream)
--- 4910,4922 ----
  	     of such giv's whether or not we know they are used after the loop
  	     exit.  */
  
! 	  if ( ! flag_reduce_all_givs
! 	      /* If this is an always executed DEST_REG giv with mult_val 1,
! 		 we can combine it with the biv.  */
! 	      && ! (v->giv_type == DEST_REG && all_reduced
! 		    && v->always_executed && v->mult_val == const1_rtx
! 		    && rtx_cost (v->add_val, PLUS) <= 1)
! 	      && v->lifetime * threshold * benefit < insn_count
  	      && ! bl->reversed )
  	    {
  	      if (loop_dump_stream)

^ permalink raw reply [flat|nested] 94+ messages in thread
* Autoincrement example (Was: Re: Loop patch update (Was: Re: Autoincrement examples)) 1999-12-13 14:37 ` Joern Rennecke @ 1999-12-13 14:59 ` Joern Rennecke 1999-12-31 23:54 ` Joern Rennecke 1999-12-13 20:20 ` Loop patch update (Was: Re: Autoincrement examples) Jeffrey A Law 1999-12-31 23:54 ` Joern Rennecke 2 siblings, 1 reply; 94+ messages in thread
From: Joern Rennecke @ 1999-12-13 14:59 UTC (permalink / raw)
To: m.hayes, gcc

> typedef struct { char a,b,c,d,e,f,g,h; } s;
> f (p, q, endp)
>      s *p, *q;
>      int *endp;
> {
>   int i;
>
>   for (i = 0; i < *endp; i++)
>     {
>       p->a = q->b;
>       p->b = q->c;
>       p->c = q->d;
>       p->d = q->e;
>       p->e = q->f;
>       p++, q++;
>
>       p->a = q->c;
>       p->b = q->d;
>       p->c = q->e;
>       p->d = q->f;
>       p->e = q->g;
>       p++, q++;
>
>       p->a = q->d;
>       p->b = q->e;
>       p->c = q->f;
>       p->d = q->g;
>       p->e = q->h;
>       p++, q++;
>     }
> }

P.S.: This is also an example where my regmove patch gives better code
(+ in the diff below) than your flow patch (-).

Target: SH
Options: -O2 -fno-schedule-insns2
(I disabled scheduling to make the comparison easier.)

--- /b/egcs-ai-mh-sh/gcc/lben.s	Mon Dec 13 22:50:13 1999
+++ lben.s	Mon Dec 13 22:52:13 1999
@@ -7,20 +7,18 @@ gcc2_compiled.:
 _f:
 	mov.l	r14,@-r15
 	mov	r15,r14
-	mov	#0,r3
+	mov	#0,r2
 	mov.l	@r6,r1
-	cmp/ge	r1,r3
+	cmp/ge	r1,r2
 	bt	.L4
 	add	#1,r5
 	.align 2
 .L6:
-	mov	r5,r2
-	mov.b	@r2+,r1
+	mov.b	@r5+,r1
 	mov.b	r1,@r4
-	mov.b	@r2,r1
+	mov.b	@r5+,r1
 	add	#1,r4
 	mov.b	r1,@r4
-	add	#2,r5
 	mov.b	@r5+,r1
 	add	#1,r4
 	mov.b	r1,@r4
@@ -60,13 +58,13 @@ _f:
 	add	#1,r4
 	mov.b	r1,@r4
 	mov.b	@r5,r1
+	add	#2,r5
 	add	#1,r4
 	mov.b	r1,@r4
 	add	#4,r4
-	add	#2,r5
-	add	#1,r3
+	add	#1,r2
 	mov.l	@r6,r1
-	cmp/ge	r1,r3
+	cmp/ge	r1,r2
 	bf	.L6
 .L4:
 	mov	r14,r15

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Loop patch update (Was: Re: Autoincrement examples) 1999-12-13 14:37 ` Joern Rennecke 1999-12-13 14:59 ` Autoincrement example (Was: Re: Loop patch update (Was: Re: Autoincrement examples)) Joern Rennecke @ 1999-12-13 20:20 ` Jeffrey A Law 1999-12-14 11:20 ` Joern Rennecke 1999-12-31 23:54 ` Jeffrey A Law 1999-12-31 23:54 ` Joern Rennecke 2 siblings, 2 replies; 94+ messages in thread
From: Jeffrey A Law @ 1999-12-13 20:20 UTC (permalink / raw)
To: Joern Rennecke; +Cc: m.hayes, gcc

In message <199912132235.WAA28605@phal.cygnus.co.uk> you write:

[ ... ]

> The first problem - fixed by the first hunk - was an outright bug: the
> sign of the calculated lifetime was wrong.
>
> The second one is rather a missing feature: we didn't take into account that
> reducing a DEST_REG giv that can be combined with a giv is really pretty
> cheap.
>
> Mon Dec 13 22:30:36 1999  J"orn Rennecke <amylaar@cygnus.co.uk>
>
> 	* loop.c (strength_reduce): Fix sign of giv lifetime calculation
> 	for givs made from biv increments.
> 	Don't reject giv for its cost if can likely be combined with the biv.

Assuming this doesn't depend on the other loop changes in the queue, you can
install it.

Thanks,
jeff

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Loop patch update (Was: Re: Autoincrement examples) 1999-12-13 20:20 ` Loop patch update (Was: Re: Autoincrement examples) Jeffrey A Law @ 1999-12-14 11:20 ` Joern Rennecke 1999-12-31 23:54 ` Joern Rennecke 1999-12-31 23:54 ` Jeffrey A Law 1 sibling, 1 reply; 94+ messages in thread
From: Joern Rennecke @ 1999-12-14 11:20 UTC (permalink / raw)
To: law; +Cc: amylaar, m.hayes, gcc

> Assuming this doesn't depend on the other loop changes in the queue, you can
> install it.

I've installed the first hunk. The second one only makes sense together
with the code to derive givs from bivs in recombine_givs.

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Loop patch update (Was: Re: Autoincrement examples) 1999-12-10 16:59 ` Loop patch update (Was: Re: Autoincrement examples) Joern Rennecke 1999-12-13 14:37 ` Joern Rennecke @ 1999-12-21 17:04 ` Joern Rennecke 1999-12-31 23:54 ` Joern Rennecke 2000-01-04 0:54 ` Jeffrey A Law 1999-12-31 23:54 ` Joern Rennecke 2 siblings, 2 replies; 94+ messages in thread
From: Joern Rennecke @ 1999-12-21 17:04 UTC (permalink / raw)
To: Michael Hayes; +Cc: gcc

When I tried the c4x compiler built with my amalgamated patches on a
combine.i from its own build (x86 hosted), I got a SEGV. I eventually
tracked it down to the DECL_RTL of uid_cuid being clobbered by a loop
transformation. On the c4x, (mem/f:QI (symbol_ref:QI ("uid_cuid"))) can
be shared.

*** loop.c-19991221	Tue Dec 21 22:49:24 1999
--- loop.c	Wed Dec 22 00:46:12 1999
*************** general_induction_var (x, src_reg, add_v
*** 6421,6428 ****
--- 6421,6437 ----
  #if 0 /* Invariants are useful to derive other givs from.  */
    /* If this is an invariant, forget it, it isn't a giv.  */
    if (invariant_p (x) == 1)
      return 0;
+ #else
+   /* If this is an address, we couldn't derive other givs from it, so ignore
+      it if it's invariant.
+      If we recorded these as givs, we could generate incorrect code and/or
+      corrupt the DECL_RTL of global variables on some targets when the
+      address is substituted with a pseudo, since MEMs with a
+      CONSTANT_ADDRESS_P address can be shared.  */
+   if (is_addr && invariant_p (x) == 1)
+     return 0;
  #endif
  
    /* See if the expression could be a giv and get its form.
       Mark our place on the obstack in case we don't find a giv.  */

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Loop patch update (Was: Re: Autoincrement examples) 1999-12-21 17:04 ` Joern Rennecke 1999-12-31 23:54 ` Joern Rennecke @ 2000-01-04 0:54 ` Jeffrey A Law 2000-01-20 23:33 ` Joern Rennecke 1 sibling, 1 reply; 94+ messages in thread
From: Jeffrey A Law @ 2000-01-04 0:54 UTC (permalink / raw)
To: Joern Rennecke; +Cc: Michael Hayes, gcc

In message <199912220104.BAA13879@phal.cygnus.co.uk> you write:
> When I tried the c4x compiler built with my amalgamated patches on a
> combine.i from its own build (x86 hosted), I got a SEGV. I eventually
> tracked it down to the DECL_RTL of uid_cuid being clobbered by a loop
> transformation. On the c4x, (mem/f:QI (symbol_ref:QI ("uid_cuid"))) can
> be shared.

Believe it or not, sharing of MEM (constant address) sounds awfully familiar.
I could have sworn I was working on a similar problem relatively recently.
However, I can't find any evidence of that in my mail logs. Sigh.

Anyway, instead of #if 0 #else #endif, just remove the dead code and the
unnecessary cpp conditionals.

Seems to me that you might want to go ahead and submit a change to fix up
general_induction_var independently of the rest of the loop changes.

jeff

^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Loop patch update (Was: Re: Autoincrement examples) 2000-01-04 0:54 ` Jeffrey A Law @ 2000-01-20 23:33 ` Joern Rennecke 0 siblings, 0 replies; 94+ messages in thread
From: Joern Rennecke @ 2000-01-20 23:33 UTC (permalink / raw)
To: law; +Cc: Joern Rennecke, Michael Hayes, gcc

> Seems to me that you might want to go ahead and submit a change to fix up
> general_induction_var independently of the rest of the loop changes.

It currently works, albeit for the wrong reasons...

^ permalink raw reply [flat|nested] 94+ messages in thread
* Loop patch update (Was: Re: Autoincrement examples) 1999-12-10 16:59 ` Loop patch update (Was: Re: Autoincrement examples) Joern Rennecke 1999-12-13 14:37 ` Joern Rennecke 1999-12-21 17:04 ` Joern Rennecke @ 1999-12-31 23:54 ` Joern Rennecke 2 siblings, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-31 23:54 UTC (permalink / raw) To: Michael Hayes; +Cc: amylaar, m.hayes, gcc, amylaar I've noticed that all the intresting examples seem to get pessimized by loop so that no autoincrement optimization is applicable any more. So I've updated my loop patches for the current egcs mainline: Thu Sep 2 23:50:58 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.h (struct induction): New member did_derive. * loop.c (strength_reduce, record_giv): Clear did_derive when creating giv. (recombine_givs): Set did_derive when deriving from a giv. (strength_reduce): Don't apply post-increment auto_inc_opt if did_derive is set. Wed May 12 22:19:02 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.h (reg_dead_after_loop): Declare. * unroll.c (reg_dead_after_loop): Don't declare. No longer static. * loop.c (recombine_givs): If a giv is used outside the loop, use reg_dead_after_loop to find out if it matters. Wed May 12 20:27:39 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.c (strength_reduce): When doing biv->giv conversion, update reg note of NEXT->insn. When converting to a DEST_REG giv, undo all changes that tried to make DEST_ADDR givs. * combine.c (validate_subst): Use SUBST for all substitutions. Wed May 12 19:51:51 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.c (strength_reduce): Check if a biv can be detected by a REG_EQUAL note attached to the biv insn. Wed May 5 21:15:40 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * loop.h (struct induction): New members autoinc_pred, autoinc_succ, preinc, combine_start_limit, combine_end_limit. * loop.c (last_recorded_giv, last_recorded_addr_giv): New static variables. 
(strength_reduce): Set new fields in struct induction for givs. Initialize last_recorded_giv. (find_mem_givs): Set mem_mode before calling record_giv. (record_giv): Set new fields in struct induction. Keep track of last_recorded_giv. (combine_givs_p): Supress combinations that would foil giv-giv autoincrement opportunities. (combine_givs): Propagate autoinc_{pred,succ} settings into combine_{start,end}_limit. (recombine_givs): Allow to derive givs from a non-eliminable biv. Index: combine.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/combine.c,v retrieving revision 1.101 diff -p -r1.101 combine.c *** combine.c 1999/12/09 10:46:10 1.101 --- combine.c 1999/12/11 00:42:37 *************** static int combinable_i3pat PROTO((rtx, *** 361,366 **** --- 361,367 ---- static int contains_muldiv PROTO((rtx)); static rtx try_combine PROTO((rtx, rtx, rtx)); static void undo_all PROTO((void)); + static void undo_last PROTO((void)); static void undo_commit PROTO((void)); static rtx *find_split_point PROTO((rtx *, rtx)); static rtx subst PROTO((rtx, rtx, rtx, int, int)); *************** combine_instructions (f, nregs) *** 738,743 **** --- 739,745 ---- total_successes += combine_successes; nonzero_sign_valid = 0; + reg_last_set_value = 0; /* Make recognizer allow volatile MEMs again. */ init_recog (); *************** try_combine (i3, i2, i1) *** 2695,2700 **** --- 2697,2728 ---- else return newi2pat ? i2 : i3; } + + /* Undo all the modifications recorded in undobuf after previous_undos. 
*/ + + static void + undo_last () + { + struct undo *undo, *next; + + for (undo = undobuf.undos; undo != undobuf.previous_undos; undo = next) + { + next = undo->next; + if (undo->is_int) + *undo->where.i = undo->old_contents.i; + else + *undo->where.r = undo->old_contents.r; + + undo->next = undobuf.frees; + undobuf.frees = undo; + } + + undobuf.undos = undobuf.previous_undos; + + /* Clear this here, so that subsequent get_last_value calls are not + affected. */ + subst_prev_insn = NULL_RTX; + } \f /* Undo all the modifications recorded in undobuf. */ *************** nonzero_bits (x, mode) *** 7790,7795 **** --- 7818,7827 ---- } #endif + /* If called from loop, the reg_last_* arrays are not set. */ + if (! reg_last_set_value) + return nonzero; + /* If X is a register whose nonzero bits value is current, use it. Otherwise, if X is a register whose value we can find, use that value. Otherwise, use the previously-computed global nonzero bits *************** num_sign_bit_copies (x, mode) *** 8188,8193 **** --- 8220,8229 ---- return GET_MODE_BITSIZE (Pmode) - GET_MODE_BITSIZE (ptr_mode) + 1; #endif + /* If called from loop, the reg_last_* arrays are not set. */ + if (! reg_last_set_value) + return 1; + if (reg_last_set_value[REGNO (x)] != 0 && reg_last_set_mode[REGNO (x)] == mode && (reg_last_set_label[REGNO (x)] == label_tick *************** get_last_value (x) *** 11217,11223 **** && (value = get_last_value (SUBREG_REG (x))) != 0) return gen_lowpart_for_combine (GET_MODE (x), value); ! if (GET_CODE (x) != REG) return 0; regno = REGNO (x); --- 11253,11260 ---- && (value = get_last_value (SUBREG_REG (x))) != 0) return gen_lowpart_for_combine (GET_MODE (x), value); ! /* If called from loop, the reg_last_* arrays are not set. */ ! if (GET_CODE (x) != REG || ! 
reg_last_set_value) return 0; regno = REGNO (x); *************** insn_cuid (insn) *** 12371,12376 **** --- 12408,12495 ---- abort (); return INSN_CUID (insn); + } + \f + void + validate_subst_start () + { + undobuf.undos = 0; + + /* Save the current high-water-mark so we can free storage if we didn't + accept this set of combinations. */ + undobuf.storage = (char *) oballoc (0); + } + + void + validate_subst_undo () + { + undo_all (); + } + + /* Replace FROM with to throughout in INSN, and make simplifications. + Return nonzero for success. */ + int + validate_subst (insn, from, to) + rtx insn, from, to; + { + rtx pat, new_pat; + int i; + + pat = PATTERN (insn); + + /* If from is not mentioned in PAT, we don't need to grind it through + subst. This can safe some time, and also avoids suprious failures + when a simplification is not recognized as a valid insn. */ + if (! reg_mentioned_p (from, pat)) + { + /* But we still have to make sure a REG_EQUAL note gets updated. */ + rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX); + + if (note) + SUBST (XEXP (note, 0), subst (XEXP (note, 0), from, to, 0, 1)); + return 1; + } + + /* We have to set previous_undos to prevent gen_rtx_combine from re-using + some piece of shared rtl. */ + undobuf.previous_undos = undobuf.undos; + + subst_insn = insn; + + new_pat = subst (pat, from, to, 0, 1); + + /* If PAT is a PARALLEL, check to see if it contains the CLOBBER + we use to indicate that something didn't match. If we find such a + thing, force rejection. + This is the same test as in recog_for_combine; we can't use the that + function here because it tries to use data flow information. */ + if (GET_CODE (pat) == PARALLEL) + for (i = XVECLEN (pat, 0) - 1; i >= 0; i--) + if (GET_CODE (XVECEXP (pat, 0, i)) == CLOBBER + && XEXP (XVECEXP (pat, 0, i), 0) == const0_rtx) + { + undo_last (); + return 0; + } + + /* Change INSN to a nop so that validate_change is forced to re-recognize. 
*/ + PATTERN (insn) = const0_rtx; + if (validate_change (insn, &PATTERN (insn), new_pat, 0)) + { + rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX); + + PATTERN (insn) = pat; + SUBST ( PATTERN (insn), new_pat); + if (note) + SUBST (XEXP (note, 0), subst (XEXP (note, 0), from, to, 0, 1)); + return 1; + } + else + { + PATTERN (insn) = pat; + undo_last (); + return 0; + } } \f void Index: loop.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/loop.c,v retrieving revision 1.206 diff -p -r1.206 loop.c *** loop.c 1999/12/10 15:27:55 1.206 --- loop.c 1999/12/11 00:42:40 *************** struct movable *** 254,260 **** that the reg is live outside the range from where it is set to the following label. */ unsigned int done : 1; /* 1 inhibits further processing of this */ ! unsigned int partial : 1; /* 1 means this reg is used for zero-extending. In particular, moving it does not make it invariant. */ --- 254,260 ---- that the reg is live outside the range from where it is set to the following label. */ unsigned int done : 1; /* 1 inhibits further processing of this */ ! unsigned int partial : 1; /* 1 means this reg is used for zero-extending. In particular, moving it does not make it invariant. */ *************** static rtx express_from_1 PROTO((rtx, rt *** 324,331 **** static rtx combine_givs_p PROTO((struct induction *, struct induction *)); static void combine_givs PROTO((struct iv_class *)); struct recombine_givs_stats; ! static int find_life_end PROTO((rtx, struct recombine_givs_stats *, rtx, rtx)); ! 
static void recombine_givs PROTO((struct iv_class *, rtx, rtx, int)); static int product_cheap_p PROTO((rtx, rtx)); static int maybe_eliminate_biv PROTO((struct iv_class *, rtx, rtx, int, int, int)); static int maybe_eliminate_biv_1 PROTO((rtx, rtx, struct iv_class *, int, rtx)); --- 324,335 ---- static rtx combine_givs_p PROTO((struct induction *, struct induction *)); static void combine_givs PROTO((struct iv_class *)); struct recombine_givs_stats; ! static void find_giv_uses PROTO((rtx, struct recombine_givs_stats *, rtx, ! rtx)); ! static void note_giv_use PROTO((struct induction *, rtx, int, ! struct recombine_givs_stats *)); ! static int cmp_giv_by_value_and_insn PROTO((struct induction **, struct induction **)); ! static void recombine_givs PROTO((struct iv_class *, rtx, rtx, rtx, rtx, int)); static int product_cheap_p PROTO((rtx, rtx)); static int maybe_eliminate_biv PROTO((struct iv_class *, rtx, rtx, int, int, int)); static int maybe_eliminate_biv_1 PROTO((rtx, rtx, struct iv_class *, int, rtx)); *************** scan_loop (loop_start, end, loop_cont, u *** 753,759 **** /* Count number of times each reg is set during this loop. Set VARRAY_CHAR (may_not_optimize, I) if it is not safe to move out the setting of register I. Set VARRAY_RTX (reg_single_usage, I). */ ! /* Allocate extra space for REGS that might be created by load_mems. We allocate a little extra slop as well, in the hopes that even after the moving of movables creates some new registers --- 757,763 ---- /* Count number of times each reg is set during this loop. Set VARRAY_CHAR (may_not_optimize, I) if it is not safe to move out the setting of register I. Set VARRAY_RTX (reg_single_usage, I). */ ! /* Allocate extra space for REGS that might be created by load_mems. We allocate a little extra slop as well, in the hopes that even after the moving of movables creates some new registers *************** record_excess_regs (in_this, not_in_this *** 1222,1228 **** && ! 
reg_mentioned_p (in_this, not_in_this)) *output = gen_rtx_EXPR_LIST (VOIDmode, in_this, *output); return; ! default: break; } --- 1226,1232 ---- && ! reg_mentioned_p (in_this, not_in_this)) *output = gen_rtx_EXPR_LIST (VOIDmode, in_this, *output); return; ! default: break; } *************** replace_call_address (x, reg, addr) *** 2296,2302 **** abort (); XEXP (x, 0) = addr; return; ! default: break; } --- 2300,2306 ---- abort (); XEXP (x, 0) = addr; return; ! default: break; } *************** count_nonfixed_reads (x) *** 2347,2353 **** case MEM: return ((invariant_p (XEXP (x, 0)) != 1) + count_nonfixed_reads (XEXP (x, 0))); ! default: break; } --- 2351,2357 ---- case MEM: return ((invariant_p (XEXP (x, 0)) != 1) + count_nonfixed_reads (XEXP (x, 0))); ! default: break; } *************** invariant_p (x) *** 3341,3347 **** if (MEM_VOLATILE_P (x)) return 0; break; ! default: break; } --- 3345,3351 ---- if (MEM_VOLATILE_P (x)) return 0; break; ! default: break; } *************** static rtx note_insn; *** 3710,3715 **** --- 3714,3726 ---- static rtx addr_placeholder; + /* The last giv we have seen since we passed a CODE_LABEL. Used to + find places where auto-increment is useful to generate a DEST_ADDR + giv from a giv with the same mult_val but different add_val. + We must suppress some giv combinations to allow these auto_increments + to be formed. */ + static struct induction *last_recorded_giv, *last_recorded_addr_giv; + /* ??? Unfinished optimizations, and possible future optimizations, for the strength reduction code. */ *************** static rtx addr_placeholder; *** 3744,3750 **** This does not cause a problem here, because the added registers cannot be givs outside of their loop, and hence will never be reconsidered. But scan_loop must check regnos to make sure they are in bounds. ! SCAN_START is the first instruction in the loop, as the loop would actually be executed. END is the NOTE_INSN_LOOP_END. 
LOOP_TOP is the first instruction in the loop, as it is layed out in the --- 3755,3761 ---- This does not cause a problem here, because the added registers cannot be givs outside of their loop, and hence will never be reconsidered. But scan_loop must check regnos to make sure they are in bounds. ! SCAN_START is the first instruction in the loop, as the loop would actually be executed. END is the NOTE_INSN_LOOP_END. LOOP_TOP is the first instruction in the loop, as it is layed out in the *************** strength_reduce (scan_start, end, loop_t *** 3838,3847 **** && REG_IV_TYPE (REGNO (dest_reg)) != NOT_BASIC_INDUCT) { int multi_insn_incr = 0; ! if (basic_induction_var (SET_SRC (set), GET_MODE (SET_SRC (set)), ! dest_reg, p, &inc_val, &mult_val, ! &location, &multi_insn_incr)) { /* It is a possible basic induction variable. Create and initialize an induction structure for it. */ --- 3849,3863 ---- && REG_IV_TYPE (REGNO (dest_reg)) != NOT_BASIC_INDUCT) { int multi_insn_incr = 0; + enum machine_mode mode = GET_MODE (SET_SRC (set)); + rtx note = find_reg_note (p, REG_EQUAL, 0); ! if (basic_induction_var (SET_SRC (set), mode, dest_reg, p, ! &inc_val, &mult_val, &location, &multi_insn_incr) ! || (note ! && basic_induction_var (XEXP (note, 0), mode, dest_reg, p, ! &inc_val, &mult_val, &location, ! &multi_insn_incr))) { /* It is a possible basic induction variable. Create and initialize an induction structure for it. */ *************** strength_reduce (scan_start, end, loop_t *** 4180,4185 **** --- 4196,4207 ---- if (loop_dump_stream) fprintf (loop_dump_stream, "is giv of biv %d\n", bl2->regno); + + /* If the changed insn carries a REG_EQUAL note, update it. */ + note = find_reg_note (bl->biv->insn, REG_EQUAL, NULL_RTX); + if (note) + XEXP (note, 0) = copy_rtx (src); + /* Let this giv be discovered by the generic code. 
*/ REG_IV_TYPE (bl->regno) = UNKNOWN_INDUCT; reg_biv_class[bl->regno] = NULL_PTR; *************** strength_reduce (scan_start, end, loop_t *** 4315,4321 **** for (vp = &bl->biv, next = *vp; v = next, next = v->next_iv;) { HOST_WIDE_INT offset; ! rtx set, add_val, old_reg, dest_reg, last_use_insn, note; int old_regno, new_regno; if (! v->always_executed --- 4337,4343 ---- for (vp = &bl->biv, next = *vp; v = next, next = v->next_iv;) { HOST_WIDE_INT offset; ! rtx set, src, add_val, old_reg, dest_reg, last_use_insn, note; int old_regno, new_regno; if (! v->always_executed *************** strength_reduce (scan_start, end, loop_t *** 4337,4344 **** add_val = plus_constant (next->add_val, offset); old_reg = v->dest_reg; dest_reg = gen_reg_rtx (v->mode); ! ! /* Unlike reg_iv_type / reg_iv_info, the other three arrays have been allocated with some slop space, so we may not actually need to reallocate them. If we do, the following if statement will be executed just once in this loop. */ --- 4359,4368 ---- add_val = plus_constant (next->add_val, offset); old_reg = v->dest_reg; dest_reg = gen_reg_rtx (v->mode); ! old_regno = REGNO (old_reg); ! new_regno = REGNO (dest_reg); ! ! /* Unlike reg_iv_type / reg_iv_info, the other four arrays have been allocated with some slop space, so we may not actually need to reallocate them. If we do, the following if statement will be executed just once in this loop. */ *************** strength_reduce (scan_start, end, loop_t *** 4350,4395 **** VARRAY_GROW (may_not_optimize, nregs); VARRAY_GROW (reg_single_usage, nregs); } ! if (! validate_change (next->insn, next->location, add_val, 0)) { vp = &v->next_iv; continue; } - - /* Here we can try to eliminate the increment by combining - it into the uses. */ - - /* Set last_use_insn so that we can check against it. */ ! for (last_use_insn = v->insn, p = NEXT_INSN (v->insn); ! p != next->insn; ! 
p = next_insn_in_loop (p, scan_start, end, loop_top)) { if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; ! if (reg_mentioned_p (old_reg, PATTERN (p))) ! { ! last_use_insn = p; ! } } ! ! /* If we can't get the LUIDs for the insns, we can't ! calculate the lifetime. This is likely from unrolling ! of an inner loop, so there is little point in making this ! a DEST_REG giv anyways. */ ! if (INSN_UID (v->insn) >= max_uid_for_loop ! || INSN_UID (last_use_insn) >= max_uid_for_loop ! || ! validate_change (v->insn, &SET_DEST (set), dest_reg, 0)) { /* Change the increment at NEXT back to what it was. */ if (! validate_change (next->insn, next->location, next->add_val, 0)) abort (); vp = &v->next_iv; continue; } next->add_val = add_val; v->dest_reg = dest_reg; v->giv_type = DEST_REG; v->location = &SET_SRC (set); --- 4374,4465 ---- VARRAY_GROW (may_not_optimize, nregs); VARRAY_GROW (reg_single_usage, nregs); } ! VARRAY_CHAR (may_not_optimize, new_regno) = 0; ! if (! validate_change (next->insn, next->location, add_val, 0)) { vp = &v->next_iv; continue; } ! src = SET_SRC (set); ! /* Try to replace all uses of OLD_REG with SRC. This will ! mostly win when it generates / changes address givs, but it ! might also change some DEST_REG givs or create the odd ! PEA on an 68k. */ ! last_use_insn = NULL_RTX; ! validate_subst_start (); ! for (p = NEXT_INSN (v->insn); p != next->insn; p = NEXT_INSN (p)) { + rtx newpat; + if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; ! if (! validate_subst (p, old_reg, src)) ! last_use_insn = p; } ! /* If some uses remain, we'd like to make this a DEST_REG ! giv. However, after loop unrolling, V->INSN or LAST_USE_INSN ! might have no valid luid. We need these not only for ! calculating the lifetime now, but also in recombine_givs when ! doing giv derivation, to find givs with non-overlapping ! lifetimes. So if we don't have LUIDs available, or if we ! can't calculate the giv, leave the biv increment alone. */ ! if (last_use_insn ! 
&& (INSN_UID (v->insn) >= max_uid_for_loop ! || INSN_UID (last_use_insn) >= max_uid_for_loop ! || ! validate_change (v->insn, &SET_DEST (set), ! dest_reg, 0))) { /* Change the increment at NEXT back to what it was. */ if (! validate_change (next->insn, next->location, next->add_val, 0)) abort (); + + /* Undo all the substitutions made by validate_subst above, + since the biv does hold the incremented value after + all. */ + validate_subst_undo (); + vp = &v->next_iv; continue; } + + /* If we have to make a DEST_REG giv, undo all the + substitutions made by validate_subst above, since we are + going to replace the biv by a DEST_REG giv. We must do this + before allocating anything more on obstack, e.g. with + copy_rtx. */ + if (last_use_insn) + validate_subst_undo (); + + /* If next_insn has a REG_EQUAL note that mentiones OLD_REG, + it must be replaced. */ + note = find_reg_note (next->insn, REG_EQUAL, NULL_RTX); + if (note && reg_mentioned_p (old_reg, XEXP (note, 0))) + XEXP (note, 0) = copy_rtx (SET_SRC (single_set (next->insn))); + + /* Remove the increment from the list of biv increments. */ + *vp = next; + bl->biv_count--; + VARRAY_INT (set_in_loop, old_regno)--; + VARRAY_INT (n_times_set, old_regno)--; next->add_val = add_val; + + if (! last_use_insn) + { + if (loop_dump_stream) + fprintf (loop_dump_stream, + "Increment %d of biv %d eliminated.\n\n", + INSN_UID (v->insn), old_regno); + PUT_CODE (v->insn, NOTE); + NOTE_LINE_NUMBER (v->insn) = NOTE_INSN_DELETED; + NOTE_SOURCE_FILE (v->insn) = 0; + VARRAY_INT (set_in_loop, new_regno) = 0; + VARRAY_INT (n_times_set, new_regno) = 0; + continue; + } + v->dest_reg = dest_reg; v->giv_type = DEST_REG; v->location = &SET_SRC (set); *************** strength_reduce (scan_start, end, loop_t *** 4406,4443 **** v->unrolled = 0; v->shared = 0; v->derived_from = 0; v->always_computable = 1; v->always_executed = 1; v->replaceable = 1; v->no_const_addval = 0; ! ! old_regno = REGNO (old_reg); ! new_regno = REGNO (dest_reg); ! 
VARRAY_INT (set_in_loop, old_regno)--; VARRAY_INT (set_in_loop, new_regno) = 1; - VARRAY_INT (n_times_set, old_regno)--; VARRAY_INT (n_times_set, new_regno) = 1; ! VARRAY_CHAR (may_not_optimize, new_regno) = 0; ! REG_IV_TYPE (new_regno) = GENERAL_INDUCT; REG_IV_INFO (new_regno) = v; - - /* If next_insn has a REG_EQUAL note that mentiones OLD_REG, - it must be replaced. */ - note = find_reg_note (next->insn, REG_EQUAL, NULL_RTX); - if (note && reg_mentioned_p (old_reg, XEXP (note, 0))) - XEXP (note, 0) = copy_rtx (SET_SRC (single_set (next->insn))); ! /* Remove the increment from the list of biv increments, ! and record it as a giv. */ ! *vp = next; ! bl->biv_count--; v->next_iv = bl->giv; bl->giv = v; bl->giv_count++; v->benefit = rtx_cost (SET_SRC (set), SET); bl->total_benefit += v->benefit; ! /* Now replace the biv with DEST_REG in all insns between the replaced increment and the next increment, and remember the last insn that needed a replacement. */ --- 4476,4506 ---- v->unrolled = 0; v->shared = 0; v->derived_from = 0; + v->did_derive = 0; + v->combine_start_limit = 0; + v->combine_end_limit = 0; v->always_computable = 1; v->always_executed = 1; v->replaceable = 1; v->no_const_addval = 0; ! v->autoinc_pred = 0; ! v->autoinc_succ = 0; ! v->preinc = 0; ! v->leading_combined = 0; ! VARRAY_INT (set_in_loop, new_regno) = 1; VARRAY_INT (n_times_set, new_regno) = 1; ! REG_IV_TYPE (new_regno) = GENERAL_INDUCT; REG_IV_INFO (new_regno) = v; ! /* Record V as a giv. */ v->next_iv = bl->giv; bl->giv = v; bl->giv_count++; v->benefit = rtx_cost (SET_SRC (set), SET); bl->total_benefit += v->benefit; ! /* Now replace the biv with DEST_REG in all insns between the replaced increment and the next increment, and remember the last insn that needed a replacement. */ *************** strength_reduce (scan_start, end, loop_t *** 4446,4452 **** p = next_insn_in_loop (p, scan_start, end, loop_top)) { rtx note; ! 
if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; if (reg_mentioned_p (old_reg, PATTERN (p))) --- 4509,4515 ---- p = next_insn_in_loop (p, scan_start, end, loop_top)) { rtx note; ! if (GET_RTX_CLASS (GET_CODE (p)) != 'i') continue; if (reg_mentioned_p (old_reg, PATTERN (p))) *************** strength_reduce (scan_start, end, loop_t *** 4462,4468 **** = replace_rtx (XEXP (note, 0), old_reg, dest_reg); } } ! v->last_use = last_use_insn; v->lifetime = INSN_LUID (v->insn) - INSN_LUID (last_use_insn); /* If the lifetime is zero, it means that this register is really --- 4525,4531 ---- = replace_rtx (XEXP (note, 0), old_reg, dest_reg); } } ! v->last_use = last_use_insn; v->lifetime = INSN_LUID (v->insn) - INSN_LUID (last_use_insn); /* If the lifetime is zero, it means that this register is really *************** strength_reduce (scan_start, end, loop_t *** 4488,4493 **** --- 4551,4557 ---- not_every_iteration = 0; loop_depth = 0; maybe_multiple = 0; + last_recorded_giv = last_recorded_addr_giv = 0; p = scan_start; while (1) { *************** strength_reduce (scan_start, end, loop_t *** 4680,4685 **** --- 4744,4752 ---- && no_labels_between_p (p, loop_end) && loop_insn_first_p (p, loop_cont)) not_every_iteration = 0; + + if (GET_CODE (p) == CODE_LABEL) + last_recorded_giv = last_recorded_addr_giv = 0; } /* Try to calculate and save the number of loop iterations. This is *************** strength_reduce (scan_start, end, loop_t *** 4905,4911 **** /* Now that we know which givs will be reduced, try to rearrange the combinations to reduce register pressure. ! recombine_givs calls find_life_end, which needs reg_iv_type and reg_iv_info to be valid for all pseudos. We do the necessary reallocation here since it allows to check if there are still more bivs to process. */ --- 4972,4978 ---- /* Now that we know which givs will be reduced, try to rearrange the combinations to reduce register pressure. ! 
recombine_givs calls find_giv_uses, which needs reg_iv_type and reg_iv_info to be valid for all pseudos. We do the necessary reallocation here since it allows to check if there are still more bivs to process. */ *************** strength_reduce (scan_start, end, loop_t *** 4920,4926 **** VARRAY_GROW (reg_iv_type, nregs); VARRAY_GROW (reg_iv_info, nregs); } ! recombine_givs (bl, loop_start, loop_end, unroll_p); /* Reduce each giv that we decided to reduce. */ --- 4987,4993 ---- VARRAY_GROW (reg_iv_type, nregs); VARRAY_GROW (reg_iv_info, nregs); } ! recombine_givs (bl, scan_start, loop_start, loop_end, loop_top, unroll_p); /* Reduce each giv that we decided to reduce. */ *************** strength_reduce (scan_start, end, loop_t *** 4931,4979 **** { int auto_inc_opt = 0; ! /* If the code for derived givs immediately below has already allocated a new_reg, we must keep it. */ if (! v->new_reg) v->new_reg = gen_reg_rtx (v->mode); if (v->derived_from) ! { ! struct induction *d = v->derived_from; ! ! /* In case d->dest_reg is not replaceable, we have ! to replace it in v->insn now. */ ! if (! d->new_reg) ! d->new_reg = gen_reg_rtx (d->mode); ! PATTERN (v->insn) ! = replace_rtx (PATTERN (v->insn), d->dest_reg, d->new_reg); ! PATTERN (v->insn) ! = replace_rtx (PATTERN (v->insn), v->dest_reg, v->new_reg); ! /* For each place where the biv is incremented, add an ! insn to set the new, reduced reg for the giv. ! We used to do this only for biv_count != 1, but ! this fails when there is a giv after a single biv ! increment, e.g. when the last giv was expressed as ! pre-decrement. */ ! for (tv = bl->biv; tv; tv = tv->next_iv) ! { ! /* We always emit reduced giv increments before the ! biv increment when bl->biv_count != 1. So by ! emitting the add insns for derived givs after the ! biv increment, they pick up the updated value of ! the reduced giv. ! If the reduced giv is processed with ! auto_inc_opt == 1, then it is incremented earlier ! 
than the biv, hence we'll still pick up the right ! value. ! If it's processed with auto_inc_opt == -1, ! that implies that the biv increment is before the ! first reduced giv's use. The derived giv's lifetime ! is after the reduced giv's lifetime, hence in this ! case, the biv increment doesn't matter. */ ! emit_insn_after (copy_rtx (PATTERN (v->insn)), tv->insn); ! } ! continue; ! } #ifdef AUTO_INC_DEC /* If the target has auto-increment addressing modes, and --- 4998,5010 ---- { int auto_inc_opt = 0; ! /* If the code for derived givs in recombine_givs has already allocated a new_reg, we must keep it. */ if (! v->new_reg) v->new_reg = gen_reg_rtx (v->mode); if (v->derived_from) ! continue; #ifdef AUTO_INC_DEC /* If the target has auto-increment addressing modes, and *************** strength_reduce (scan_start, end, loop_t *** 5032,5037 **** --- 5063,5073 ---- else auto_inc_opt = 1; + /* We can't put an insn after v->insn if v was used to + derive other givs in recombine_givs. */ + if (auto_inc_opt == 1 && v->did_derive) + auto_inc_opt = 0; + #ifdef HAVE_cc0 { rtx prev; *************** strength_reduce (scan_start, end, loop_t *** 5065,5070 **** --- 5101,5115 ---- else insert_before = v->insn; + /* If the biv was recognized from a REG_EQUAL note, we + can have the special case that the giv is used in the + biv increment. Then the giv increment must be put + after the biv increment, which is typically actually + a copy of the giv into the biv. */ + if (reg_overlap_mentioned_p (v->dest_reg, + SET_SRC (single_set (tv->insn)))) + insert_before = NEXT_INSN (tv->insn); + if (tv->mult_val == const1_rtx) emit_iv_add_mult (tv->add_val, v->mult_val, v->new_reg, v->new_reg, insert_before); *************** strength_reduce (scan_start, end, loop_t *** 5311,5317 **** if (unrolled_insn_copies < 0) unrolled_insn_copies = 0; } ! /* Unroll loops from within strength reduction so that we can use the induction variable information that strength_reduce has already collected. 
Always unroll loops that would be as small or smaller --- 5356,5362 ---- if (unrolled_insn_copies < 0) unrolled_insn_copies = 0; } ! /* Unroll loops from within strength reduction so that we can use the induction variable information that strength_reduce has already collected. Always unroll loops that would be as small or smaller *************** find_mem_givs (x, insn, not_every_iterat *** 5436,5446 **** struct induction *v = (struct induction *) oballoc (sizeof (struct induction)); record_giv (v, insn, src_reg, addr_placeholder, mult_val, add_val, benefit, DEST_ADDR, not_every_iteration, maybe_multiple, &XEXP (x, 0), loop_start, loop_end); - - v->mem_mode = GET_MODE (x); } } return; --- 5481,5490 ---- struct induction *v = (struct induction *) oballoc (sizeof (struct induction)); + v->mem_mode = GET_MODE (x); record_giv (v, insn, src_reg, addr_placeholder, mult_val, add_val, benefit, DEST_ADDR, not_every_iteration, maybe_multiple, &XEXP (x, 0), loop_start, loop_end); } } return; *************** record_giv (v, insn, src_reg, dest_reg, *** 5622,5629 **** --- 5666,5680 ---- v->auto_inc_opt = 0; v->unrolled = 0; v->shared = 0; + v->autoinc_pred = 0; + v->autoinc_succ = 0; + v->preinc = 0; + v->leading_combined = 0; v->derived_from = 0; + v->did_derive = 0; v->last_use = 0; + v->combine_start_limit = 0; + v->combine_end_limit = 0; /* The v->always_computable field is used in update_giv_derive, to determine whether a giv can be used to derive another giv. For a *************** record_giv (v, insn, src_reg, dest_reg, *** 5774,5779 **** --- 5825,5894 ---- } } + #ifdef AUTO_INC_DEC + if (last_recorded_addr_giv + && last_recorded_addr_giv->src_reg == src_reg + && rtx_equal_p (last_recorded_addr_giv->mult_val, mult_val) + && GET_CODE (add_val) == CONST_INT) + { + /* Check if changing the previous giv to post-increment would allow + to generate the value of the current giv. */ + if (! 
last_recorded_addr_giv->preinc + && ((HAVE_POST_INCREMENT + && (INTVAL (add_val) - INTVAL (last_recorded_addr_giv->add_val) + == GET_MODE_SIZE (last_recorded_addr_giv->mem_mode))) + || (HAVE_POST_DECREMENT + && ((INTVAL (add_val) + - INTVAL (last_recorded_addr_giv->add_val)) + == -GET_MODE_SIZE (last_recorded_addr_giv->mem_mode)))) + && ! (combine_givs_p (last_recorded_addr_giv, v) + || combine_givs_p (v, last_recorded_addr_giv))) + { + last_recorded_addr_giv->autoinc_pred = 1; + v->autoinc_succ = 1; + /* Record only one autoinc opportunity for LAST_RECORDED_ADDR_GIV. */ + last_recorded_addr_giv = 0; + } + else if ((HAVE_PRE_INCREMENT + && type == DEST_ADDR + && (INTVAL (add_val) - INTVAL (last_recorded_addr_giv->add_val) + == GET_MODE_SIZE (v->mem_mode))) + || (HAVE_PRE_DECREMENT + && type == DEST_ADDR + && (INTVAL (add_val) - INTVAL (last_recorded_addr_giv->add_val) + == -GET_MODE_SIZE (v->mem_mode)))) + { + struct induction *succ = v; + + if (last_recorded_giv->giv_type == DEST_REG + && rtx_equal_p (last_recorded_giv->add_val, v->add_val) + && rtx_equal_p (last_recorded_giv->mult_val, v->mult_val)) + succ = last_recorded_giv; + last_recorded_addr_giv->autoinc_pred = 1; + succ->autoinc_succ = 1; + v->preinc = 1; + /* Record only one autoinc opportunity for LAST_RECORDED_ADDR_GIV. */ + last_recorded_addr_giv = 0; + } + } + + last_recorded_giv = v; + + /* Only record DEST_ADDR givs as such for following auto_increment tests + if we can use them at all. */ + if (type == DEST_ADDR + && GET_CODE (add_val) == CONST_INT + /* And only if it's likely to be useful. The typical case uses a + structure, sub-array or several array members each iteration, so + we should see an increment that is larger than the individual + access size. 
*/ + && (GET_CODE (reg_biv_class[REGNO (src_reg)]->biv->add_val) != CONST_INT + || (abs (INTVAL (v->mult_val) + * INTVAL (reg_biv_class[REGNO (src_reg)]->biv->add_val)) + > GET_MODE_SIZE (v->mem_mode)))) + last_recorded_addr_giv = v; + #endif + if (loop_dump_stream) { if (type == DEST_REG) *************** check_final_value (v, loop_start, loop_e *** 5915,5921 **** last_giv_use = p; } } ! /* Now that the lifetime of the giv is known, check for branches from within the lifetime to outside the lifetime if it is still replaceable. */ --- 6030,6036 ---- last_giv_use = p; } } ! /* Now that the lifetime of the giv is known, check for branches from within the lifetime to outside the lifetime if it is still replaceable. */ *************** general_induction_var (x, src_reg, add_v *** 6281,6289 **** --- 6396,6406 ---- rtx orig_x = x; char *storage; + #if 0 /* Invariants are useful to derive other givs from. */ /* If this is an invariant, forget it, it isn't a giv. */ if (invariant_p (x) == 1) return 0; + #endif /* See if the expression could be a giv and get its form. Mark our place on the obstack in case we don't find a giv. */ *************** simplify_giv_expr (x, benefit) *** 6636,6646 **** *benefit += v->benefit; if (v->cant_derive) return 0; - - tem = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, - v->src_reg, v->mult_val), - v->add_val); if (v->derive_adjustment) tem = gen_rtx_MINUS (mode, tem, v->derive_adjustment); return simplify_giv_expr (tem, benefit); --- 6753,6765 ---- *benefit += v->benefit; if (v->cant_derive) return 0; + if (v->mult_val != const0_rtx) + tem = gen_rtx_PLUS (mode, gen_rtx_MULT (mode, + v->src_reg, v->mult_val), + v->add_val); + else + tem = v->add_val; if (v->derive_adjustment) tem = gen_rtx_MINUS (mode, tem, v->derive_adjustment); return simplify_giv_expr (tem, benefit); *************** express_from (g1, g2) *** 7076,7085 **** mult = gen_rtx_PLUS (g2->mode, mult, XEXP (add, 0)); add = tem; } ! return gen_rtx_PLUS (g2->mode, mult, add); } ! 
} \f /* Return an rtx, if any, that expresses giv G2 as a function of the register --- 7195,7204 ---- mult = gen_rtx_PLUS (g2->mode, mult, XEXP (add, 0)); add = tem; } ! return gen_rtx_PLUS (g2->mode, mult, add); } ! } \f /* Return an rtx, if any, that expresses giv G2 as a function of the register *************** combine_givs_p (g1, g2) *** 7101,7106 **** --- 7220,7241 ---- if (tem == g1->dest_reg && (g1->giv_type == DEST_REG || g2->giv_type == DEST_ADDR)) { + /* Don't combine if this would prevent an autoinc opportunity. + We only do this check in if both givs are the same; if they + can be combined even though they are different, then it is likely + that the putative autoinc-pair can be combined without an + autoincrement, too. We don't want to prevent that combination. */ + #ifdef AUTO_INC_DEC + if ((g1->combine_start_limit + && loop_insn_first_p (g2->insn, g1->combine_start_limit)) + || (g1->combine_end_limit + && loop_insn_first_p (g1->combine_end_limit, g2->insn)) + || (g2->combine_start_limit + && loop_insn_first_p (g1->insn, g2->combine_start_limit)) + || (g2->combine_end_limit + && loop_insn_first_p (g2->combine_end_limit, g1->insn))) + return 0; + #endif return g1->dest_reg; } *************** cmp_combine_givs_stats (xp, yp) *** 7151,7156 **** --- 7286,7312 ---- return d; } + static int + cmp_giv_by_value_and_insn (xp, yp) + struct induction **xp, **yp; + { + struct induction *x = *xp, *y = *yp; + HOST_WIDE_INT d; + + d = (int) GET_CODE (x->mult_val) - (int) GET_CODE (y->mult_val); + if (! d && GET_CODE (x->mult_val) == CONST_INT) + d = INTVAL (x->mult_val) - INTVAL (y->mult_val); + if (! d) + d = (int) GET_CODE (x->add_val) - (int) GET_CODE (y->add_val); + if (! d && GET_CODE (x->add_val) == CONST_INT) + d = INTVAL (x->add_val) - INTVAL (y->add_val); + if (d) + return d < 0 ? -1 : 1; + if (x->insn == y->insn) + return xp - yp; + return loop_insn_first_p (x->insn, y->insn) ? 
-1 : 1; + } + /* Check all pairs of givs for iv_class BL and see if any can be combined with any other. If so, point SAME to the giv combined with and set NEW_REG to be an expression (in terms of the other giv's DEST_REG) equivalent to the *************** combine_givs (bl) *** 7176,7186 **** --- 7332,7393 ---- giv_array = (struct induction **) alloca (giv_count * sizeof (struct induction *)); + + #ifdef AUTO_INC_DEC + /* Order givs by mult_val / add_val / position in insn stream. */ i = 0; for (g1 = bl->giv; g1; g1 = g1->next_iv) if (!g1->ignore) giv_array[i++] = g1; + qsort (giv_array, giv_count, sizeof(*giv_array), cmp_giv_by_value_and_insn); + + /* Go through givs forward in insn stream order, set combine_start_limit + for giv that flag autoinc_succ or follow one with matching + add_val/mult_val that flags it. */ + g1 = NULL_PTR; + for (i = 0; i < giv_count; i++) + { + g2 = giv_array[i]; + if (g2->autoinc_succ) + g1 = g2; + else if (! g1) + continue; + else if (! rtx_equal_p (g1->mult_val, g2->mult_val) + || ! rtx_equal_p (g1->add_val, g2->add_val)) + { + g1 = 0; + continue; + } + g2->combine_start_limit = g1->insn; + } + + /* Go through givs backward in insn stream order, set combine_end_limit + for giv that flag autoinc_pred or follow one with matching + add_val/mult_val that flags it. */ + g1 = NULL_PTR; + for (i = giv_count - 1; i >= 0; i--) + { + g2 = giv_array[i]; + if (g2->autoinc_pred) + g1 = g2; + else if (! g1) + continue; + else if (! rtx_equal_p (g1->mult_val, g2->mult_val) + || ! 
rtx_equal_p (g1->add_val, g2->add_val)) + { + g1 = 0; + continue; + } + g2->combine_end_limit = g1->insn; + } + #endif /* AUTO_INC_DEC */ + + i = 0; + for (g1 = bl->giv; g1; g1 = g1->next_iv) + if (!g1->ignore) + giv_array[i++] = g1; + stats = (struct combine_givs_stats *) xcalloc (giv_count, sizeof (*stats)); can_combine = (rtx *) xcalloc (giv_count, giv_count * sizeof(rtx)); *************** struct recombine_givs_stats *** 7322,7327 **** --- 7529,7537 ---- { int giv_number; int start_luid, end_luid; + rtx start_insn; /* First insn in loop order in which the giv (including + combinations) is used; Initialized to NULL_RTX; set + to a NOTE when invalid. */ }; /* Used below as comparison function for qsort. We want a ascending luid *************** cmp_recombine_givs_stats (xp, yp) *** 7344,7356 **** return d; } ! /* Scan X, which is a part of INSN, for the end of life of a giv. Also ! look for the start of life of a giv where the start has not been seen ! yet to unlock the search for the end of its life. ! Only consider givs that belong to BIV. ! Return the total number of lifetime ends that have been found. */ ! static int ! find_life_end (x, stats, insn, biv) rtx x, insn, biv; struct recombine_givs_stats *stats; { --- 7554,7605 ---- return d; } ! /* The last label we encountered while scanning forward for giv uses. ! Is initialized to SCAN_START (not necessarily a label) in recombine_givs. */ ! static rtx loop_last_label; ! ! /* V, a giv, is used in INSN. ! FROM_COMBINED is set if the use comes (possibly) from a combined giv. ! It must not be set if there are no combined givs for this giv, since ! this can confuse giv derivation to move the giv insn to the wrong place. ! Update start_insn / end_luid in STATS accordingly. */ ! static void ! note_giv_use (v, insn, from_combined, stats) ! struct induction *v; ! rtx insn; ! int from_combined; ! struct recombine_givs_stats *stats; ! { ! if (stats[v->ix].start_insn) ! { ! 
if (loop_insn_first_p (stats[v->ix].start_insn, loop_last_label) ! && (loop_insn_first_p (loop_last_label, insn) ! || loop_insn_first_p (insn, stats[v->ix].start_insn))) ! stats[v->ix].start_insn = loop_number_loop_starts[0]; ! } ! else ! { ! rtx p; ! ! stats[v->ix].start_insn = insn; ! if (from_combined) ! v->leading_combined = 1; ! ! /* Update start_luid now so that we won't loose this information it ! when we invalidate start_insn. */ ! for (p = insn; INSN_UID (p) >= max_uid_for_loop; ) ! p = PREV_INSN (p); ! stats[v->ix].start_luid = INSN_LUID (p); ! } ! while (INSN_UID (insn) >= max_uid_for_loop) ! insn = NEXT_INSN (insn); ! stats[v->ix].end_luid = INSN_LUID (insn); ! } ! ! /* Scan X, which is a part of INSN, for uses of givs. ! Only consider givs that belong to BIV. */ ! static void ! find_giv_uses (x, stats, insn, biv) rtx x, insn, biv; struct recombine_givs_stats *stats; { *************** find_life_end (x, stats, insn, biv) *** 7372,7419 **** if (REG_IV_TYPE (regno) == GENERAL_INDUCT && ! v->ignore ! && v->src_reg == biv ! && stats[v->ix].end_luid <= 0) { ! /* If we see a 0 here for end_luid, it means that we have ! scanned the entire loop without finding any use at all. ! We must not predicate this code on a start_luid match ! since that would make the test fail for givs that have ! been hoisted out of inner loops. */ ! if (stats[v->ix].end_luid == 0) { ! stats[v->ix].end_luid = stats[v->ix].start_luid; ! return 1 + find_life_end (SET_SRC (x), stats, insn, biv); } - else if (stats[v->ix].start_luid == INSN_LUID (insn)) - stats[v->ix].end_luid = 0; } - return find_life_end (SET_SRC (x), stats, insn, biv); } break; } case REG: { int regno = REGNO (x); ! struct induction *v = REG_IV_INFO (regno); ! ! if (REG_IV_TYPE (regno) == GENERAL_INDUCT ! && ! v->ignore ! && v->src_reg == biv ! && stats[v->ix].end_luid == 0) { ! while (INSN_UID (insn) >= max_uid_for_loop) ! insn = NEXT_INSN (insn); ! stats[v->ix].end_luid = INSN_LUID (insn); ! return 1; } ! 
return 0; } case LABEL_REF: case CONST_DOUBLE: case CONST_INT: case CONST: ! return 0; default: break; } --- 7621,7696 ---- if (REG_IV_TYPE (regno) == GENERAL_INDUCT && ! v->ignore ! && v->src_reg == biv) ! { ! /* Since we are setting a non-ignored general induction ! variable, this insn will be changed or go away, hence ! we don't have to consider uses in the SET_SRC. */ ! return; ! } ! find_giv_uses (SET_SRC (x), stats, insn, biv); ! return; ! } ! break; ! } ! /* If this is a reduced DEST_ADDR giv, the original address doesn't ! count; but if the giv has been combined with another one, we must ! count the use there. */ ! case MEM: ! { ! rtx src_reg; ! rtx add_val; ! rtx mult_val; ! int benefit; ! struct induction *v; ! ! if (general_induction_var (XEXP (x, 0), &src_reg, &add_val, ! &mult_val, 1, &benefit) ! && src_reg == biv) ! { ! for (v = reg_biv_class[REGNO (biv)]->giv; v; v = v->next_iv) { ! if (v->location == &XEXP (x, 0)) { ! int from_combined = 0; ! ! if (v->same) ! { ! v = v->same; ! from_combined = 1; ! } ! if (v->ignore) ! break; ! note_giv_use (v, insn, from_combined, stats); ! return; } } } break; } case REG: { int regno = REGNO (x); ! if (REG_IV_TYPE (regno) == GENERAL_INDUCT) { ! struct induction *v = REG_IV_INFO (regno); ! int from_combined = 0; ! ! if (v->same) ! { ! v = v->same; ! from_combined = 1; ! } ! if (! v->ignore && v->src_reg == biv) ! note_giv_use (v, insn, from_combined, stats); } ! return; } case LABEL_REF: case CONST_DOUBLE: case CONST_INT: case CONST: ! return; default: break; } *************** find_life_end (x, stats, insn, biv) *** 7422,7434 **** for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) { if (fmt[i] == 'e') ! retval += find_life_end (XEXP (x, i), stats, insn, biv); else if (fmt[i] == 'E') for (j = XVECLEN (x, i) - 1; j >= 0; j--) ! retval += find_life_end (XVECEXP (x, i, j), stats, insn, biv); } ! 
return retval; } /* For each giv that has been combined with another, look if --- 7699,7711 ---- for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) { if (fmt[i] == 'e') ! find_giv_uses (XEXP (x, i), stats, insn, biv); else if (fmt[i] == 'E') for (j = XVECLEN (x, i) - 1; j >= 0; j--) ! find_giv_uses (XVECEXP (x, i, j), stats, insn, biv); } ! return; } /* For each giv that has been combined with another, look if *************** find_life_end (x, stats, insn, biv) *** 7436,7451 **** This tends to shorten giv lifetimes, and helps the next step: try to derive givs from other givs. */ static void ! recombine_givs (bl, loop_start, loop_end, unroll_p) struct iv_class *bl; ! rtx loop_start, loop_end; int unroll_p; { struct induction *v, **giv_array, *last_giv; struct recombine_givs_stats *stats; int giv_count; int i, rescan; ! int ends_need_computing; for (giv_count = 0, v = bl->giv; v; v = v->next_iv) { --- 7713,7732 ---- This tends to shorten giv lifetimes, and helps the next step: try to derive givs from other givs. */ static void ! recombine_givs (bl, scan_start, loop_start, loop_end, loop_top, unroll_p) struct iv_class *bl; ! rtx scan_start, loop_start, loop_end, loop_top; int unroll_p; { struct induction *v, **giv_array, *last_giv; struct recombine_givs_stats *stats; int giv_count; int i, rescan; ! int n_giv_live_after_loop; ! struct induction **giv_live_after_loop; ! rtx biv_use_start, biv_use_end; ! struct induction *biv_giv; ! int life_start, life_end; for (giv_count = 0, v = bl->giv; v; v = v->next_iv) { *************** recombine_givs (bl, loop_start, loop_end *** 7456,7469 **** = (struct induction **) xmalloc (giv_count * sizeof (struct induction *)); stats = (struct recombine_givs_stats *) xmalloc (giv_count * sizeof *stats); ! /* Initialize stats and set up the ix field for each giv in stats to name ! the corresponding index into stats. */ ! for (i = 0, v = bl->giv; v; v = v->next_iv) { rtx p; if (v->ignore) ! 
continue; giv_array[i] = v; stats[i].giv_number = i; /* If this giv has been hoisted out of an inner loop, use the luid of --- 7737,7757 ---- = (struct induction **) xmalloc (giv_count * sizeof (struct induction *)); stats = (struct recombine_givs_stats *) xmalloc (giv_count * sizeof *stats); ! /* Initialize stats, and clear the live_after_loop fields. ! Also note where the biv is used by unreduced givs. */ ! for (i = 0, biv_use_start = biv_use_end = 0, v = bl->giv; v; v = v->next_iv) { rtx p; if (v->ignore) ! { ! if (! biv_use_start || loop_insn_first_p (v->insn, biv_use_start)) ! biv_use_start = v->insn; ! if (! biv_use_end || loop_insn_first_p (biv_use_end, v->insn)) ! biv_use_end = v->insn; ! continue; ! } ! v->live_after_loop = 0; giv_array[i] = v; stats[i].giv_number = i; /* If this giv has been hoisted out of an inner loop, use the luid of *************** recombine_givs (bl, loop_start, loop_end *** 7471,7476 **** --- 7759,7765 ---- for (p = v->insn; INSN_UID (p) >= max_uid_for_loop; ) p = PREV_INSN (p); stats[i].start_luid = INSN_LUID (p); + stats[i].start_insn = NULL_RTX; i++; } *************** recombine_givs (bl, loop_start, loop_end *** 7525,7654 **** last_giv = v; } ! ends_need_computing = 0; ! /* For each DEST_REG giv, compute lifetime starts, and try to compute ! lifetime ends from regscan info. */ ! for (i = giv_count - 1; i >= 0; i--) { ! v = giv_array[stats[i].giv_number]; ! if (v->ignore) continue; ! if (v->giv_type == DEST_ADDR) ! { ! /* Loop unrolling of an inner loop can even create new DEST_REG ! givs. */ ! rtx p; ! for (p = v->insn; INSN_UID (p) >= max_uid_for_loop; ) ! p = PREV_INSN (p); ! stats[i].start_luid = stats[i].end_luid = INSN_LUID (p); ! if (p != v->insn) ! stats[i].end_luid++; ! } ! else /* v->giv_type == DEST_REG */ ! { ! if (v->last_use) ! { ! stats[i].start_luid = INSN_LUID (v->insn); ! stats[i].end_luid = INSN_LUID (v->last_use); ! } ! else if (INSN_UID (v->insn) >= max_uid_for_loop) ! { ! rtx p; ! 
/* This insn has been created by loop optimization on an inner ! loop. We don't have a proper start_luid that will match ! when we see the first set. But we do know that there will ! be no use before the set, so we can set end_luid to 0 so that ! we'll start looking for the last use right away. */ ! for (p = PREV_INSN (v->insn); INSN_UID (p) >= max_uid_for_loop; ) ! p = PREV_INSN (p); ! stats[i].start_luid = INSN_LUID (p); ! stats[i].end_luid = 0; ! ends_need_computing++; ! } ! else ! { ! int regno = REGNO (v->dest_reg); ! int count = VARRAY_INT (n_times_set, regno) - 1; ! rtx p = v->insn; ! ! /* Find the first insn that sets the giv, so that we can verify ! if this giv's lifetime wraps around the loop. We also need ! the luid of the first setting insn in order to detect the ! last use properly. */ ! while (count) ! { ! p = prev_nonnote_insn (p); ! if (reg_set_p (v->dest_reg, p)) ! count--; ! } ! stats[i].start_luid = INSN_LUID (p); ! if (stats[i].start_luid > uid_luid[REGNO_FIRST_UID (regno)]) ! { ! stats[i].end_luid = -1; ! ends_need_computing++; ! } ! else ! { ! stats[i].end_luid = uid_luid[REGNO_LAST_UID (regno)]; ! if (stats[i].end_luid > INSN_LUID (loop_end)) ! { ! stats[i].end_luid = -1; ! ends_need_computing++; ! } ! } ! } ! } ! } ! /* If the regscan information was unconclusive for one or more DEST_REG ! givs, scan the all insn in the loop to find out lifetime ends. */ ! if (ends_need_computing) ! { ! rtx biv = bl->biv->src_reg; ! rtx p = loop_end; ! ! do ! { ! if (p == loop_start) ! p = loop_end; ! p = PREV_INSN (p); ! if (GET_RTX_CLASS (GET_CODE (p)) != 'i') ! continue; ! ends_need_computing -= find_life_end (PATTERN (p), stats, p, biv); } - while (ends_need_computing); } ! /* Set start_luid back to the last insn that sets the giv. This allows ! more combinations. */ ! for (i = giv_count - 1; i >= 0; i--) ! { ! v = giv_array[stats[i].giv_number]; ! if (v->ignore) ! continue; ! if (INSN_UID (v->insn) < max_uid_for_loop) ! 
stats[i].start_luid = INSN_LUID (v->insn); ! } ! /* Now adjust lifetime ends by taking combined givs into account. */ for (i = giv_count - 1; i >= 0; i--) { - unsigned luid; - int j; - v = giv_array[stats[i].giv_number]; ! if (v->ignore) continue; ! if (v->same && ! v->same->ignore) ! { ! j = v->same->ix; ! luid = stats[i].start_luid; ! /* Use unsigned arithmetic to model loop wrap-around. */ ! if (luid - stats[j].start_luid ! > (unsigned) stats[j].end_luid - stats[j].start_luid) ! stats[j].end_luid = luid; ! } } qsort (stats, giv_count, sizeof(*stats), cmp_recombine_givs_stats); --- 7814,7913 ---- last_giv = v; } ! /* Set up the giv_live_after_loop array. */ ! n_giv_live_after_loop = 0; ! giv_live_after_loop = NULL_PTR; ! for (v = bl->giv; v; v = v->next_iv) { ! struct induction *same; ! ! if (v->giv_type != DEST_REG || v->last_use) continue; ! if ((uid_luid[REGNO_FIRST_UID (REGNO (v->dest_reg))] ! > INSN_LUID (loop_start)) ! && (uid_luid[REGNO_LAST_UID (REGNO (v->dest_reg))] ! < INSN_LUID (loop_end))) ! continue; ! /* Sometimes the register is immediately overwritten after the loop. ! This happens particularily in the second loop pass, when we see ! the results of strength reduction in the first pass. */ ! if (flag_expensive_optimizations ! && reg_dead_after_loop (v->dest_reg, loop_start, loop_end)) ! continue; ! same = v->same ? v->same : v; ! if (! same->ignore ! && ! same->live_after_loop) ! { ! same->live_after_loop = 1; ! if (! giv_live_after_loop) ! giv_live_after_loop ! = (struct induction **) alloca (sizeof (struct induction *) ! * giv_count); ! giv_live_after_loop[n_giv_live_after_loop++] = same; } } ! /* Scan all the insns in the loop to find out lifetime starts and ends. */ ! { ! rtx biv = bl->biv->src_reg; ! rtx p = loop_end; ! for (loop_last_label = scan_start, p = scan_start; p; ! p = next_insn_in_loop (p, scan_start, loop_end, loop_top)) ! { ! if (GET_CODE (p) == CODE_LABEL) ! loop_last_label = p; ! else if (GET_RTX_CLASS (GET_CODE (p)) == 'i') ! 
{ ! find_giv_uses (PATTERN (p), stats, p, biv); ! /* If this is a jump, we have to consider uses outside the loop. */ ! if (GET_CODE (p) == JUMP_INSN && GET_CODE (PATTERN (p)) != RETURN) ! { ! int is_loop_exit = 1; ! rtx label; ! ! if (condjump_p (p) || condjump_in_parallel_p (p)) ! { ! label = XEXP (condjump_label (p), 0); ! /* If the destination is within the loop, and this ! is not a conditional branch at the loop end, this ! is not a loop exit. */ ! if (loop_insn_first_p (loop_start, label) ! && loop_insn_first_p (label, loop_end) ! && (simplejump_p (p) ! /* Shortcut for forward branches - by definition, ! they can't be the end of the loop */ ! || loop_insn_first_p (p, label) ! || ! no_labels_between_p (p, loop_end))) ! is_loop_exit = 0; ! } ! ! if (is_loop_exit) ! { ! for (i = n_giv_live_after_loop -1; i >= 0; i--) ! /* We don't have recorded which givs are life after the ! loop only because their giv register is life, or ! (also) because a combined giv is life after the loop, ! so just pretend it is the latter if any other givs ! have been combined with this one. */ ! note_giv_use (giv_live_after_loop[i], p, ! giv_live_after_loop[i]->combined_with, ! stats); ! } ! } ! } ! } ! } ! /* Ignore givs that are not used at all. */ for (i = giv_count - 1; i >= 0; i--) { v = giv_array[stats[i].giv_number]; ! if (v->ignore || v->same) continue; ! if (! stats[i].start_insn) ! v->ignore = 1; } qsort (stats, giv_count, sizeof(*stats), cmp_recombine_givs_stats); *************** recombine_givs (bl, loop_start, loop_end *** 7664,7680 **** When we are finished with the current LAST_GIV (i.e. the inner loop terminates), we start again with rescan, which then becomes the new LAST_GIV. */ for (i = giv_count - 1; i >= 0; i = rescan) { ! int life_start, life_end; ! for (last_giv = 0, rescan = -1; i >= 0; i--) { rtx sum; v = giv_array[stats[i].giv_number]; ! if (v->giv_type != DEST_REG || v->derived_from || v->same) continue; if (! 
last_giv) { /* Don't use a giv that's likely to be dead to derive --- 7923,7964 ---- When we are finished with the current LAST_GIV (i.e. the inner loop terminates), we start again with rescan, which then becomes the new LAST_GIV. */ + + /* The biv is also a giv, of sorts. If it can't be eliminated, we + might as well consider to derive givs from it. */ + if (biv_use_start && bl->biv_count == 1) + { + biv_giv = (struct induction *) oballoc (sizeof *biv_giv); + biv_giv->add_val = const0_rtx; + biv_giv->mult_val = const1_rtx; + biv_giv->dest_reg = biv_giv->new_reg = regno_reg_rtx[bl->regno]; + biv_giv->insn = bl->biv->insn; /* Used for debugging dump. */ + last_giv = biv_giv; + while (INSN_UID (biv_use_start) >= max_uid_for_loop) + biv_use_start = PREV_INSN (biv_use_start); + life_start = INSN_LUID (biv_use_start); + while (INSN_UID (biv_use_end) >= max_uid_for_loop) + biv_use_end = NEXT_INSN (biv_use_end); + life_end = INSN_LUID (biv_use_end); + } + else + last_giv = 0; + for (i = giv_count - 1; i >= 0; i = rescan) { ! rtx add_insn, trial_add_insn = NULL_RTX; ! for (rescan = -1; i >= 0; i--) { rtx sum; v = giv_array[stats[i].giv_number]; ! if (v->derived_from || v->same || v->ignore) continue; + + if (! v->new_reg) + v->new_reg = gen_reg_rtx (v->mode); + if (! last_giv) { /* Don't use a giv that's likely to be dead to derive *************** recombine_givs (bl, loop_start, loop_end *** 7687,7704 **** } continue; } /* Use unsigned arithmetic to model loop wrap around. */ if (((unsigned) stats[i].start_luid - life_start >= (unsigned) life_end - life_start) && ((unsigned) stats[i].end_luid - life_start > (unsigned) life_end - life_start) - /* Check that the giv insn we're about to use for deriving - precedes all uses of that giv. Note that initializing the - derived giv would defeat the purpose of reducing register - pressure. - ??? We could arrange to move the insn. 
*/ - && ((unsigned) stats[i].end_luid - INSN_LUID (loop_start) - > (unsigned) stats[i].start_luid - INSN_LUID (loop_start)) && rtx_equal_p (last_giv->mult_val, v->mult_val) /* ??? Could handle libcalls, but would need more logic. */ && ! find_reg_note (v->insn, REG_RETVAL, NULL_RTX) --- 7971,7998 ---- } continue; } + + /* ??? We would save some time by setting up add_insn only + immediately before it is going to be used, but that would + make the multi-line conditional below even harder to read. */ + if (v->giv_type == DEST_REG) + add_insn = v->insn; + else + { + if (! trial_add_insn) + { + trial_add_insn = make_insn_raw (NULL_RTX); + PREV_INSN (trial_add_insn) = NULL_RTX; + NEXT_INSN (trial_add_insn) = NULL_RTX; + } + add_insn = trial_add_insn; + } + /* Use unsigned arithmetic to model loop wrap around. */ if (((unsigned) stats[i].start_luid - life_start >= (unsigned) life_end - life_start) && ((unsigned) stats[i].end_luid - life_start > (unsigned) life_end - life_start) && rtx_equal_p (last_giv->mult_val, v->mult_val) /* ??? Could handle libcalls, but would need more logic. */ && ! find_reg_note (v->insn, REG_RETVAL, NULL_RTX) *************** recombine_givs (bl, loop_start, loop_end *** 7708,7737 **** don't have this detailed control flow information. N.B. since last_giv will be reduced, it is valid anywhere in the loop, so we don't need to check the ! validity of last_giv. ! We rely here on the fact that v->always_executed implies that ! there is no jump to someplace else in the loop before the ! giv insn, and hence any insn that is executed before the ! giv insn in the loop will have a lower luid. */ ! && (v->always_executed || ! v->combined_with) && (sum = express_from (last_giv, v)) /* Make sure we don't make the add more expensive. ADD_COST doesn't take different costs of registers and constants into account, so compare the cost of the actual SET_SRCs. */ ! && (rtx_cost (sum, SET) ! <= rtx_cost (SET_SRC (single_set (v->insn)), SET)) /* ??? 
unroll can't understand anything but reg + const_int sums. It would be cleaner to fix unroll. */ && ((GET_CODE (sum) == PLUS && GET_CODE (XEXP (sum, 0)) == REG && GET_CODE (XEXP (sum, 1)) == CONST_INT) || ! unroll_p) ! && validate_change (v->insn, &PATTERN (v->insn), ! gen_rtx_SET (VOIDmode, v->dest_reg, sum), 0)) { v->derived_from = last_giv; life_end = stats[i].end_luid; if (loop_dump_stream) { fprintf (loop_dump_stream, --- 8002,8088 ---- don't have this detailed control flow information. N.B. since last_giv will be reduced, it is valid anywhere in the loop, so we don't need to check the ! validity of last_giv. */ ! && (GET_CODE (stats[i].start_insn) != NOTE ! || ! v->combined_with ! /* We rely here on the fact that v->always_executed implies ! that there is no jump to someplace else in the loop before ! the giv insn, and hence any insn that is executed before ! the giv insn in the loop will have a lower luid. */ ! || (v->giv_type == DEST_REG ! && v->always_executed ! && ! v->leading_combined ! /* Check that the giv insn we're about to use for ! deriving precedes all uses of that giv. Note that ! initializing the derived giv would defeat the purpose ! of reducing register pressure. */ ! && ((unsigned) stats[i].end_luid - INSN_LUID (scan_start) ! > ((unsigned) stats[i].start_luid ! - INSN_LUID (scan_start))))) ! /* If we are deriving from the biv, this must be before the biv ! increment. */ ! && (last_giv != biv_giv ! || loop_insn_first_p ((v->leading_combined ! ? stats[i].start_insn : v->insn), ! bl->biv->insn)) && (sum = express_from (last_giv, v)) /* Make sure we don't make the add more expensive. ADD_COST doesn't take different costs of registers and constants into account, so compare the cost of the actual SET_SRCs. */ ! && (v->giv_type != DEST_REG ! || (rtx_cost (sum, SET) ! <= rtx_cost (SET_SRC (single_set (v->insn)), SET))) /* ??? unroll can't understand anything but reg + const_int sums. It would be cleaner to fix unroll. 
*/ && ((GET_CODE (sum) == PLUS && GET_CODE (XEXP (sum, 0)) == REG && GET_CODE (XEXP (sum, 1)) == CONST_INT) || ! unroll_p) ! && validate_change (add_insn, &PATTERN (add_insn), ! gen_rtx_SET (VOIDmode, v->new_reg, sum), 0)) { + struct induction *tv; + + last_giv->did_derive = 1; v->derived_from = last_giv; life_end = stats[i].end_luid; + if (v->giv_type == DEST_ADDR) + { + trial_add_insn = NULL_RTX; + reorder_insns (add_insn, add_insn, + PREV_INSN (stats[i].start_insn)); + } + /* Check if we want / have to move this giv. */ + else if (v->leading_combined) + { + rtx insert_after = PREV_INSN (stats[i].start_insn); + rtx prev = PREV_INSN (v->insn); + rtx next = NEXT_INSN (v->insn); + #ifdef HAVE_cc0 + if (GET_RTX_CLASS (GET_CODE (insert_after)) == 'i' + && sets_cc0_p (PATTERN (insert_after))) + insert_after = PREV_INSN (insert_after); + #endif + if (v->insn == insert_after + || prev == insert_after) + ; /* do nothing */ + else if (loop_insn_first_p (v->insn, insert_after)) + { + reorder_insns (v->insn, v->insn, insert_after); + while (INSN_UID (prev) >= max_uid_for_loop) + prev = PREV_INSN (prev); + compute_luids (next, v->insn, INSN_LUID (prev)); + } + else + { + reorder_insns (v->insn, v->insn, insert_after); + while (INSN_UID (insert_after) >= max_uid_for_loop) + insert_after = PREV_INSN (insert_after); + compute_luids (v->insn, prev, INSN_LUID (insert_after)); + } + } + if (loop_dump_stream) { fprintf (loop_dump_stream, *************** recombine_givs (bl, loop_start, loop_end *** 7740,7749 **** --- 8091,8147 ---- print_rtl (loop_dump_stream, sum); putc ('\n', loop_dump_stream); } + + /* In case LAST_GIV->dest_reg is not replaceable, we have + to replace it in ADD_INSN now. */ + PATTERN (add_insn) + = replace_rtx (PATTERN (add_insn), last_giv->dest_reg, + last_giv->new_reg); + + /* For each place where the biv is incremented, add an + insn to set the new, reduced reg for the giv. 
+ We used to do this only for biv_count != 1, but + this fails when there is a giv after a single biv + increment, e.g. when the last giv was expressed as + pre-decrement. + We do this here (rather than at giv derivation time) because + we want to copy ADD_INSN - which is not the same as V->insn + for DEST_ADDR givs - and to exploit the lifetime + information we have. */ + for (tv = bl->biv; tv; tv = tv->next_iv) + { + /* If the biv increment precedes ADD_INSN, we can ignore it. + Only handle the most common case here. */ + if (loop_insn_first_p (tv->insn, add_insn) + && (loop_insn_first_p (scan_start, tv->insn) + || loop_insn_first_p (add_insn, scan_start))) + continue; + /* Likewise if the biv increment is after the last giv use. + Only handle the most common case here. */ + if (INSN_UID (tv->insn) < max_uid_for_loop + && stats[i].end_luid < INSN_LUID (tv->insn) + && INSN_LUID (scan_start) < stats[i].end_luid) + continue; + + /* We always emit reduced giv increments before the biv + increment when bl->biv_count != 1. So by emitting + the add insns for derived givs after the biv increment, + they pick up the updated value of the reduced giv. + If the reduced giv is processed with auto_inc_opt == 1, + then it is incremented earlier than the biv, hence we'll + still pick up the right value. + If it's processed with auto_inc_opt == -1, + that implies that the biv increment is before the + first reduced giv's use. The derived giv's lifetime + is after the reduced giv's lifetime, hence in this + case, the biv increment doesn't matter. */ + emit_insn_after (copy_rtx (PATTERN (add_insn)), tv->insn); + } } else if (rescan < 0) rescan = i; } + last_giv = 0; } /* Clean up. */ *************** load_mems_and_recount_loop_regs_set (sca *** 9691,9697 **** int nregs = max_reg_num (); load_mems (scan_start, end, loop_top, start); ! /* Recalculate set_in_loop and friends since load_mems may have created new registers. 
*/ if (max_reg_num () > nregs) --- 10089,10095 ---- int nregs = max_reg_num (); load_mems (scan_start, end, loop_top, start); ! /* Recalculate set_in_loop and friends since load_mems may have created new registers. */ if (max_reg_num () > nregs) *************** load_mems_and_recount_loop_regs_set (sca *** 9724,9730 **** VARRAY_CHAR (may_not_optimize, i) = 1; VARRAY_INT (set_in_loop, i) = 1; } ! #ifdef AVOID_CCMODE_COPIES /* Don't try to move insns which set CC registers if we should not create CCmode register copies. */ --- 10122,10128 ---- VARRAY_CHAR (may_not_optimize, i) = 1; VARRAY_INT (set_in_loop, i) = 1; } ! #ifdef AVOID_CCMODE_COPIES /* Don't try to move insns which set CC registers if we should not create CCmode register copies. */ *************** replace_label (x, data) *** 10176,10182 **** if (XEXP (l, 0) != old_label) return 0; ! XEXP (l, 0) = new_label; ++LABEL_NUSES (new_label); --LABEL_NUSES (old_label); --- 10574,10580 ---- if (XEXP (l, 0) != old_label) return 0; ! XEXP (l, 0) = new_label; ++LABEL_NUSES (new_label); --LABEL_NUSES (old_label); Index: loop.h =================================================================== RCS file: /cvs/gcc/egcs/gcc/loop.h,v retrieving revision 1.20 diff -p -r1.20 loop.h *** loop.h 1999/12/08 03:22:33 1.20 --- loop.h 1999/12/11 00:42:40 *************** struct induction *** 102,107 **** --- 102,124 ---- unsigned shared : 1; unsigned no_const_addval : 1; /* 1 if add_val does not contain a const. */ unsigned multi_insn_incr : 1; /* 1 if multiple insns updated the biv. */ + + /* giv-giv autoinc in the following means that that a DEST_ADDR giv + can be formed from a preceding giv with the same mult_val but + different add_val by using auto-increment. */ + unsigned autoinc_pred : 1; /* 1 if predecessor in a giv-giv autoinc. */ + unsigned autoinc_succ : 1; /* 1 if successor in a giv-giv autoinc. */ + unsigned preinc : 1; /* 1 if considered for pre-increment in + giv-giv autoinc. 
*/ + unsigned live_after_loop : 1; /* Used inside recombine_givs to keep track + of which givs have already been included + in an array of givs live after the loop. */ + unsigned leading_combined : 1;/* In recombine_givs, set if this giv has been + combined with one or more other givs that + precede the giv insn of this giv. + Giv derivation then requires to move the + giv insn before the first use. */ + unsigned did_derive : 1; /* Set in recombine_givs. */ int lifetime; /* Length of life of this giv */ rtx derive_adjustment; /* If nonzero, is an adjustment to be subtracted from add_val when this giv *************** struct induction *** 128,133 **** --- 145,152 ---- that doesn't have this field set. */ rtx last_use; /* For a giv made from a biv increment, this is a substitute for the lifetime information. */ + rtx combine_start_limit; + rtx combine_end_limit; }; /* A `struct iv_class' is created for each biv. */ *************** void emit_unrolled_add PROTO((rtx, rtx, *** 257,262 **** --- 276,282 ---- int back_branch_in_range_p PROTO((rtx, rtx, rtx)); int loop_insn_first_p PROTO((rtx, rtx)); + int reg_dead_after_loop PROTO((rtx, rtx, rtx)); /* Forward declarations for non-static functions declared in stmt.c. 
*/ void find_loop_tree_blocks PROTO((void)); Index: rtl.h =================================================================== RCS file: /cvs/gcc/egcs/gcc/rtl.h,v retrieving revision 1.158 diff -p -r1.158 rtl.h *** rtl.h 1999/12/04 03:00:03 1.158 --- rtl.h 1999/12/11 00:42:42 *************** extern void add_clobbers PROTO ((rtx, i *** 1470,1475 **** --- 1470,1478 ---- extern void combine_instructions PROTO ((rtx, int)); extern int extended_count PROTO ((rtx, enum machine_mode, int)); extern rtx remove_death PROTO ((int, rtx)); + extern int validate_subst PROTO((rtx, rtx, rtx)); + extern void validate_subst_start PROTO((void)); + extern void validate_subst_undo PROTO((void)); #ifdef BUFSIZ extern void dump_combine_stats PROTO ((FILE *)); extern void dump_combine_total_stats PROTO ((FILE *)); Index: unroll.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/unroll.c,v retrieving revision 1.79 diff -p -r1.79 unroll.c *** unroll.c 1999/11/29 10:51:09 1.79 --- unroll.c 1999/12/11 00:42:43 *************** static int find_splittable_regs PROTO((e *** 205,211 **** unsigned HOST_WIDE_INT)); static int find_splittable_givs PROTO((struct iv_class *, enum unroll_types, rtx, rtx, rtx, int)); - static int reg_dead_after_loop PROTO((rtx, rtx, rtx)); static rtx fold_rtx_mult_add PROTO((rtx, rtx, rtx, enum machine_mode)); static int verify_addresses PROTO((struct induction *, rtx, int)); static rtx remap_split_bivs PROTO((rtx)); --- 205,210 ---- *************** find_splittable_givs (bl, unroll_type, l *** 3221,3227 **** /* ?? Could be made more intelligent in the handling of jumps, so that it can search past if statements and other similar structures. */ ! static int reg_dead_after_loop (reg, loop_start, loop_end) rtx reg, loop_start, loop_end; { --- 3220,3226 ---- /* ?? Could be made more intelligent in the handling of jumps, so that it can search past if statements and other similar structures. */ ! 
int reg_dead_after_loop (reg, loop_start, loop_end) rtx reg, loop_start, loop_end; { ^ permalink raw reply [flat|nested] 94+ messages in thread
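The recombine_givs changes in the patch above compare luids with unsigned arithmetic, so that a giv lifetime which wraps around the loop end needs no special case. A minimal sketch of that idiom follows; luid_in_lifetime is a hypothetical helper written for illustration and is not part of loop.c.

```c
/* Illustrative sketch only: luid_in_lifetime is a hypothetical helper,
   not a function from loop.c.  */

/* Return nonzero if LUID lies in the half-open lifetime
   [LIFE_START, LIFE_END).  The lifetime may wrap around the loop end,
   in which case LIFE_END is numerically smaller than LIFE_START.
   Unsigned subtraction reduces both cases to a single comparison.  */
static int
luid_in_lifetime (int luid, int life_start, int life_end)
{
  return ((unsigned) luid - (unsigned) life_start
          < (unsigned) life_end - (unsigned) life_start);
}
```

The patch tests the complementary condition (>= rather than <) to check that a candidate giv's lifetime start falls outside the lifetime of the giv it would be derived from.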
* Autoincrement patches (Was: Re: Autoincrement examples) 1999-12-10 13:36 ` Michael Hayes 1999-12-10 16:59 ` Loop patch update (Was: Re: Autoincrement examples) Joern Rennecke @ 1999-12-14 15:49 ` Joern Rennecke 1999-12-14 19:58 ` Autoincrement patches Michael Hayes 1999-12-31 23:54 ` Autoincrement patches (Was: Re: Autoincrement examples) Joern Rennecke 1999-12-31 23:54 ` Autoincrement examples Michael Hayes 2 siblings, 2 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-14 15:49 UTC (permalink / raw) To: Michael Hayes; +Cc: gcc I think I can integrate the generation of PRE_MODIFY / POST_MODIFY for constant displacements into my patch, but incrementing by a register won't quite fit. Do you think this functionality of your patch can be reasonably separated from the rest? Would it be noticeably smaller than the whole patch? ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement patches 1999-12-14 15:49 ` Autoincrement patches " Joern Rennecke @ 1999-12-14 19:58 ` Michael Hayes 1999-12-16 15:05 ` Joern Rennecke 1999-12-31 23:54 ` Michael Hayes 1999-12-31 23:54 ` Autoincrement patches (Was: Re: Autoincrement examples) Joern Rennecke 1 sibling, 2 replies; 94+ messages in thread From: Michael Hayes @ 1999-12-14 19:58 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc Joern Rennecke writes: > I think I can integrate the generation of PRE_MODIFY / POST_MODIFY for > constant displacements into my patch, but incrementing by a register > won't quite fit. Pity, since this optimisation is very important for things like matrix multiplication. > Do you think this functionality of your patch can be reasonably separated > from the rest? I don't think so since it uses the same infrastructure. Maybe using SSA it could be simplified a little. Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
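To make the matrix multiplication remark concrete, here is a minimal sketch of the kind of inner loop involved. The function dot_row_col and its parameters are hypothetical names chosen for illustration; the code comes from neither patch. The column pointer is stepped by n, a value known only at run time, so on a target with register-increment addressing that update is a candidate for merging into the memory reference.

```c
/* Illustrative sketch only: dot_row_col and its parameters are
   hypothetical names, not code from either patch.  */

/* Dot product of one row of A with one column of B, where B is an
   N x N matrix stored in row-major order.  ROW points at the first
   element of the row, COL at the first element of the column.  */
double
dot_row_col (const double *row, const double *col, int n)
{
  double sum;
  int k;

  if (n <= 0)
    return 0.0;

  sum = *row * *col;
  for (k = 1; k < n; k++)
    {
      row++;      /* Constant step: plain pre-increment shape.  */
      col += n;   /* Run-time step: increment by a register.  */
      sum += *row * *col;
    }
  return sum;
}
```

Calling this once per (row, column) pair gives a complete, if naive, matrix product; whether the col update actually becomes a register PRE_MODIFY depends on the target and on the passes under discussion.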
* Re: Autoincrement patches 1999-12-14 19:58 ` Autoincrement patches Michael Hayes @ 1999-12-16 15:05 ` Joern Rennecke 1999-12-31 23:54 ` Joern Rennecke 1999-12-31 23:54 ` Michael Hayes 1 sibling, 1 reply; 94+ messages in thread From: Joern Rennecke @ 1999-12-16 15:05 UTC (permalink / raw) To: Michael Hayes; +Cc: Joern Rennecke, gcc > > I think I can integrate the generation of PRE_MODIFY / POST_MODIFY for > > constant displacements into my patch, but incrementing by a register > > won't quite fit. > > Pity, since this optimisation is very important for things like matrix > multiplication. > > > Do you think this functionality of your patch can be reasonably separated > > from the rest? > > I don't think so since it uses the same infrastructure. Maybe using > SSA it could be simplified a little. I'm currently trying to figure out if this optimization could share infrastructure with my patch. I suppose I already have most of the required information (i.e. where are adds, and where are MEMs), and the rest shouldn't be that hard to come by. My main concern is destructive interaction of the different optimizations - i.e. one relies on some information gathered before the other changed the RTL, and the end result is incorrect code. So in order to assess the actual problem better, I need to understand what the preconditions for the register PRE/POST_MODIFY optimization are. I.e. if the add is before the memref, can the sum be used anywhere before the memref to be turned into PRE_MODIFY? I would guess not; and if it can't, it won't be a member of a set of related values, i.e. no problem there. The increment is also in a register, which could well be in a set of related values. The issue here is that the lifetime of the increment register is possibly extended. So the death information would have to be updated, a mere bookkeeping operation at this stage, but it must not be omitted. Things seem more tricky for POST_MODIFY.
In my current scheme, when a (mem (plus (reg 1) (reg 2))) is seen, the code just sees two registers that are referenced in a way that can't be optimized further at this point, but they are both expected to retain their value, and so is the memory location of their reference. So if the POST_MODIFY optimization is applied, the incremented register would have to get an invalidate_luid if it is in a set of related values, and the location of both the incremented register and the increment would have to be updated. Do you make any attempts to squeeze out further references to the incremented register between the memref and the increment? That would pose further problems. The comments in some of your code seem to be out of sync with the code. + /* Scan the list of NUM memrefs in PLIST for REGNO to see which + ones can be converted to autoincrements. X is the rtx within the insn + INCR that increments REGNO and PRE is 1 if INCR precedes the memrefs + of REGNO. Return the number of memrefs converted to autoincrements. */ + static int + autoinc_search (ref_info, regno, incr_ref, ref2, pre, ret) + struct ref_info *ref_info; + int regno; + struct ref *incr_ref; + struct ref *ref2; + int pre; + struct ref **ret; ^ permalink raw reply [flat|nested] 94+ messages in thread
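As a rough source-level model of the two cases discussed above (plain C, not RTL, and not code from either patch), the difference between the PRE_MODIFY and POST_MODIFY shapes comes down to whether the address register is updated before or after the memory reference. The function names load_pre_modify and load_post_modify are hypothetical.

```c
/* Illustrative sketch only: both helpers are hypothetical and merely
   model the ordering of the add and the memory reference.  */

/* PRE_MODIFY shape: the add precedes the memref, and the memref uses
   the updated address.  */
int
load_pre_modify (int **addr, long step)
{
  *addr += step;
  return **addr;
}

/* POST_MODIFY shape: the memref uses the old address, and the add
   follows it.  Any other use of the address register between the
   memref and the add would see the old value in the original code.  */
int
load_post_modify (int **addr, long step)
{
  int value = **addr;

  *addr += step;
  return value;
}
```

In both shapes the step stays live up to the combined access, which is the lifetime extension of the increment register mentioned above.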
* Re: Autoincrement patches 1999-12-14 19:58 ` Autoincrement patches Michael Hayes 1999-12-16 15:05 ` Joern Rennecke @ 1999-12-31 23:54 ` Michael Hayes 1 sibling, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-12-31 23:54 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc Joern Rennecke writes: > I think I can integrate the generation of PRE_MODIFY / POST_MODIFY for > constant displacements into my patch, but incrementing by a register > won't quite fit. Pity, since this optimisation is very important for things like matrix multiplication. > Do you think this functionality of your patch can be reasonably separated > from the rest? I don't think so since it uses the same infrastructure. Maybe using SSA it could be simplified a little. Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Autoincrement patches (Was: Re: Autoincrement examples) 1999-12-14 15:49 ` Autoincrement patches " Joern Rennecke 1999-12-14 19:58 ` Autoincrement patches Michael Hayes @ 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-31 23:54 UTC (permalink / raw) To: Michael Hayes; +Cc: gcc I think I can integrate the generation of PRE_MODIFY / POST_MODIFY for constant displacements into my patch, but incrementing by a register won't quite fit. Do you think this functionality of your patch can be reasonably separated from the rest? Would it be noticeably smaller than the whole patch? ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-10 13:36 ` Michael Hayes 1999-12-10 16:59 ` Loop patch update (Was: Re: Autoincrement examples) Joern Rennecke 1999-12-14 15:49 ` Autoincrement patches " Joern Rennecke @ 1999-12-31 23:54 ` Michael Hayes 2 siblings, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-12-31 23:54 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc, amylaar Joern Rennecke writes: > It appears your patch is not intended for existing ports. Or is there > another patch for defaults.h ? Here are my diffs for rtl.h again. In an earlier post on this thread I sent this with other patches to support post_modify addressing modes throughout the compiler. Michael. Index: rtl.h =================================================================== RCS file: /cvs/gcc/egcs/gcc/rtl.h,v retrieving revision 1.158 diff -c -3 -p -r1.158 rtl.h *** rtl.h 1999/12/04 03:00:03 1.158 --- rtl.h 1999/12/10 21:32:14 *************** extern const char * const note_insn_name *** 763,803 **** /* 1 means a SYMBOL_REF has been the library function in emit_library_call. */ #define SYMBOL_REF_USED(RTX) ((RTX)->used) /* Define a macro to look for REG_INC notes, but save time on machines where they never exist. */ ! /* Don't continue this line--convex cc version 4.1 would lose. */ ! #if (defined (HAVE_PRE_INCREMENT) || defined (HAVE_PRE_DECREMENT) || defined (HAVE_POST_INCREMENT) || defined (HAVE_POST_DECREMENT)) ! #define FIND_REG_INC_NOTE(insn, reg) (find_reg_note ((insn), REG_INC, (reg))) ! #else ! #define FIND_REG_INC_NOTE(insn, reg) 0 #endif /* Indicate whether the machine has any sort of auto increment addressing. If not, we can avoid checking for REG_INC notes. */ /* Don't continue this line--convex cc version 4.1 would lose. */ ! #if (defined (HAVE_PRE_INCREMENT) || defined (HAVE_PRE_DECREMENT) || defined (HAVE_POST_INCREMENT) || defined (HAVE_POST_DECREMENT)) #define AUTO_INC_DEC #endif #ifndef HAVE_PRE_INCREMENT ! 
#define HAVE_PRE_INCREMENT 0 #endif #ifndef HAVE_PRE_DECREMENT ! #define HAVE_PRE_DECREMENT 0 #endif #ifndef HAVE_POST_INCREMENT ! #define HAVE_POST_INCREMENT 0 #endif #ifndef HAVE_POST_DECREMENT ! #define HAVE_POST_DECREMENT 0 #endif /* Some architectures do not have complete pre/post increment/decrement instruction sets, or only move some modes efficiently. These macros allow us to tune autoincrement generation. */ --- 763,833 ---- /* 1 means a SYMBOL_REF has been the library function in emit_library_call. */ #define SYMBOL_REF_USED(RTX) ((RTX)->used) + #if (defined (HAVE_PRE_MODIFY_DISP) || defined (HAVE_PRE_MODIFY_REG) || defined (HAVE_POST_MODIFY_DISP) || defined (HAVE_POST_MODIFY_REG)) + #define HAVE_AUTO_MODIFY + #endif + + #if (defined (HAVE_PRE_INCREMENT) || defined (HAVE_POST_INCREMENT)) + #define HAVE_AUTO_INC + #endif + /* Define a macro to look for REG_INC notes, but save time on machines where they never exist. */ ! #if (defined (HAVE_PRE_DECREMENT) || defined (HAVE_POST_DECREMENT)) ! #define HAVE_AUTO_DEC #endif /* Indicate whether the machine has any sort of auto increment addressing. If not, we can avoid checking for REG_INC notes. */ /* Don't continue this line--convex cc version 4.1 would lose. */ ! #if (defined (HAVE_AUTO_MODIFY) || defined (HAVE_AUTO_INC) || defined (HAVE_AUTO_DEC)) #define AUTO_INC_DEC #endif + /* Define a macro to look for REG_INC notes, + but save time on machines where they never exist. */ + + #ifdef AUTO_INC_DEC + #define FIND_REG_INC_NOTE(insn, reg) (find_reg_note ((insn), REG_INC, (reg))) + #else + #define FIND_REG_INC_NOTE(insn, reg) 0 + #endif + #ifndef HAVE_PRE_INCREMENT ! #define HAVE_PRE_INCREMENT 0 #endif #ifndef HAVE_PRE_DECREMENT ! #define HAVE_PRE_DECREMENT 0 #endif #ifndef HAVE_POST_INCREMENT ! #define HAVE_POST_INCREMENT 0 #endif #ifndef HAVE_POST_DECREMENT ! 
#define HAVE_POST_DECREMENT 0 #endif + #ifndef HAVE_POST_MODIFY_DISP + #define HAVE_POST_MODIFY_DISP 0 + #endif + + #ifndef HAVE_PRE_MODIFY_DISP + #define HAVE_PRE_MODIFY_DISP 0 + #endif + + #ifndef HAVE_POST_MODIFY_REG + #define HAVE_POST_MODIFY_REG 0 + #endif + #ifndef HAVE_PRE_MODIFY_REG + #define HAVE_PRE_MODIFY_REG 0 + #endif + + /* Some architectures do not have complete pre/post increment/decrement instruction sets, or only move some modes efficiently. These macros allow us to tune autoincrement generation. */ *************** extern const char * const note_insn_name *** 834,840 **** --- 864,902 ---- #define USE_STORE_PRE_DECREMENT(MODE) HAVE_PRE_DECREMENT #endif + #ifndef USE_LOAD_POST_MODIFY_DISP + #define USE_LOAD_POST_MODIFY_DISP(MODE) HAVE_POST_MODIFY_DISP + #endif + + #ifndef USE_LOAD_PRE_MODIFY_DISP + #define USE_LOAD_PRE_MODIFY_DISP(MODE) HAVE_PRE_MODIFY_DISP + #endif + + #ifndef USE_STORE_POST_MODIFY_DISP + #define USE_STORE_POST_MODIFY_DISP(MODE) HAVE_POST_MODIFY_DISP + #endif + + #ifndef USE_STORE_PRE_MODIFY_DISP + #define USE_STORE_PRE_MODIFY_DISP(MODE) HAVE_PRE_MODIFY_DISP + #endif + + #ifndef USE_LOAD_PRE_MODIFY_REG + #define USE_LOAD_PRE_MODIFY_REG(MODE) HAVE_PRE_MODIFY_REG + #endif + + #ifndef USE_LOAD_POST_MODIFY_REG + #define USE_LOAD_POST_MODIFY_REG(MODE) HAVE_POST_MODIFY_REG + #endif + + #ifndef USE_STORE_PRE_MODIFY_REG + #define USE_STORE_PRE_MODIFY_REG(MODE) HAVE_PRE_MODIFY_REG + #endif + + #ifndef USE_STORE_POST_MODIFY_REG + #define USE_STORE_POST_MODIFY_REG(MODE) HAVE_POST_MODIFY_REG + #endif + /* Accessors for RANGE_INFO. */ /* For RANGE_{START,END} notes return the RANGE_START note. */ #define RANGE_INFO_NOTE_START(INSN) XCEXP (INSN, 0, RANGE_INFO) ^ permalink raw reply [flat|nested] 94+ messages in thread
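The defaulting scheme in the rtl.h diff above can be tried in isolation. The sketch below reuses the macro names from the patch, but the "target" that defines HAVE_POST_MODIFY_DISP and the helper can_post_modify_load are hypothetical, added only for illustration. It assumes a port that supports post-modify by a constant displacement and nothing else.

```c
#include <assert.h>

/* Hypothetical target header: this port only has post-modify by a
   constant displacement.  */
#define HAVE_POST_MODIFY_DISP 1

/* Defaults in the style of the rtl.h patch: anything the target did
   not define becomes 0, so generic code can always test the macro.  */
#ifndef HAVE_POST_MODIFY_DISP
#define HAVE_POST_MODIFY_DISP 0
#endif

#ifndef HAVE_POST_MODIFY_REG
#define HAVE_POST_MODIFY_REG 0
#endif

/* Per-mode tuning hooks default to the plain HAVE_* value.  */
#ifndef USE_LOAD_POST_MODIFY_DISP
#define USE_LOAD_POST_MODIFY_DISP(MODE) HAVE_POST_MODIFY_DISP
#endif

#ifndef USE_LOAD_POST_MODIFY_REG
#define USE_LOAD_POST_MODIFY_REG(MODE) HAVE_POST_MODIFY_REG
#endif

/* Hypothetical helper: nonzero if a load in MODE may use post-modify
   addressing, with the step either in a register or a constant.  */
int
can_post_modify_load (int mode, int step_is_register)
{
  assert (step_is_register == 0 || step_is_register == 1);
  (void) mode;  /* The default hooks ignore MODE.  */
  return step_is_register
         ? USE_LOAD_POST_MODIFY_REG (mode)
         : USE_LOAD_POST_MODIFY_DISP (mode);
}
```

Because every USE_* hook has a fallback, a port that defines none of them still compiles and links, which is what the "undefined reference" errors reported elsewhere in the thread were missing. Note that the patch tests `defined (HAVE_...)` before supplying the 0 defaults; after the defaults every macro is defined, so the order matters.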
* Re: Autoincrement examples 1999-12-10 8:42 ` Joern Rennecke 1999-12-10 13:36 ` Michael Hayes @ 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-31 23:54 UTC (permalink / raw) To: Michael Hayes; +Cc: amylaar, m.hayes, gcc, amylaar It appears your patch is not intended for existing ports. Or is there another patch for defaults.h ? autoinc.o: In function `autoinc_search': /s/egcs-ai-mh/gcc/autoinc.c:394: undefined reference to `USE_STORE_PRE_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:397: undefined reference to `USE_STORE_POST_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:400: undefined reference to `USE_LOAD_PRE_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:403: undefined reference to `USE_LOAD_POST_MODIFY_REG' /s/egcs-ai-mh/gcc/autoinc.c:475: undefined reference to `USE_STORE_PRE_MODIFY_DISP' /s/egcs-ai-mh/gcc/autoinc.c:478: undefined reference to `USE_STORE_POST_MODIFY_DISP' /s/egcs-ai-mh/gcc/autoinc.c:481: undefined reference to `USE_LOAD_PRE_MODIFY_DISP' /s/egcs-ai-mh/gcc/autoinc.c:484: undefined reference to `USE_LOAD_POST_MODIFY_DISP' ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-08 14:38 ` Michael Hayes 1999-12-10 8:42 ` Joern Rennecke @ 1999-12-31 23:54 ` Michael Hayes 1 sibling, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-12-31 23:54 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, gcc, amylaar Joern Rennecke writes: > Can you post autoinc.h ? OK, be warned, it's trivial... /* Optimize by combining creating autoincrement memory references for GNU compiler. This is part of flow optimization. Copyright (C) 1999 Free Software Foundation, Inc. Contributed by Michael P. Hayes (m.hayes@elec.canterbury.ac.nz) This file is part of GNU CC. GNU CC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. GNU CC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with GNU CC; see the file COPYING. If not, write to the Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */ #ifdef AUTO_INC_DEC extern int autoinc_optimize PROTO ((int, int, FILE *)); #endif ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-08 10:57 ` Joern Rennecke 1999-12-08 14:38 ` Michael Hayes @ 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-31 23:54 UTC (permalink / raw) To: Michael Hayes; +Cc: m.hayes, law, gcc, amylaar > OK, here goes. > > Michael. > > begin 644 autoinc.patch.gz Can you post autoinc.h ? ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 17:00 ` Michael Hayes ` (2 preceding siblings ...) 1999-12-08 10:57 ` Joern Rennecke @ 1999-12-17 18:08 ` Joern Rennecke 1999-12-17 18:27 ` Michael Hayes 1999-12-31 23:54 ` Joern Rennecke 3 siblings, 2 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-17 18:08 UTC (permalink / raw) To: Michael Hayes; +Cc: law, gcc, amylaar + /* Look for ref for REG of REF_TYPE between INSN1 and INSN2 inclusive. */ + struct ref * + ref_find_ref_between (ref_info, reg, insn1, insn2, ref_type) + struct ref_info *ref_info; + rtx reg; + rtx insn1, insn2; + enum ref_type ref_type; + { + rtx insn; + struct ref *ref; + + for (insn = insn1; insn != NEXT_INSN (insn2); insn = NEXT_INSN (insn)) + if ((ref = ref_find_ref (ref_info, reg, insn, REF_REG_DEF))) + return ref; + return NULL; + } You ignore the parameter REF_TYPE. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-17 18:08 ` Joern Rennecke @ 1999-12-17 18:27 ` Michael Hayes 1999-12-31 23:54 ` Michael Hayes 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 1 reply; 94+ messages in thread From: Michael Hayes @ 1999-12-17 18:27 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, law, gcc, amylaar Joern Rennecke writes: > You ignore the parameter REF_TYPE. Well spotted. if ((ref = ref_find_ref (ref_info, reg, insn, REF_REG_DEF))) should be if ((ref = ref_find_ref (ref_info, reg, insn, ref_type))) Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
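A self-contained sketch of the corrected search, for anyone who wants to see the fix in context. The types are simplified stand-ins, not the ones from the patch: struct insn, find_ref and REF_REG_USE are hypothetical (only REF_REG_DEF appears in the quoted code), and each insn is assumed to carry a single register reference.

```c
#include <assert.h>
#include <stddef.h>

/* REF_REG_USE is an assumed name; only REF_REG_DEF appears in the
   quoted patch.  */
enum ref_type { REF_REG_DEF, REF_REG_USE };

/* Hypothetical simplified insn: one register reference per insn.  */
struct insn
{
  struct insn *next;
  int regno;
  enum ref_type type;
};

/* Return INSN if it references REGNO with type REF_TYPE, else NULL.  */
struct insn *
find_ref (struct insn *insn, int regno, enum ref_type ref_type)
{
  if (insn->regno == regno && insn->type == ref_type)
    return insn;
  return NULL;
}

/* Look for a reference to REGNO of REF_TYPE between INSN1 and INSN2
   inclusive.  The point of the fix: REF_TYPE is passed through rather
   than a hard-coded REF_REG_DEF.  */
struct insn *
find_ref_between (struct insn *insn1, struct insn *insn2,
                  int regno, enum ref_type ref_type)
{
  struct insn *insn;
  struct insn *ref;

  assert (insn1 != NULL && insn2 != NULL);
  for (insn = insn1; insn != NULL && insn != insn2->next; insn = insn->next)
    if ((ref = find_ref (insn, regno, ref_type)))
      return ref;
  return NULL;
}
```

With the hard-coded REF_REG_DEF, a caller asking for a use would silently get the first definition in the range instead (or nothing), so the bug would not show up as a crash.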
* Re: Autoincrement examples 1999-12-17 18:27 ` Michael Hayes @ 1999-12-31 23:54 ` Michael Hayes 0 siblings, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-12-31 23:54 UTC (permalink / raw) To: Joern Rennecke; +Cc: Michael Hayes, law, gcc, amylaar Joern Rennecke writes: > You ignore the parameter REF_TYPE. Well spotted. if ((ref = ref_find_ref (ref_info, reg, insn, REF_REG_DEF))) should be if ((ref = ref_find_ref (ref_info, reg, insn, ref_type))) Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-12-17 18:08 ` Joern Rennecke 1999-12-17 18:27 ` Michael Hayes @ 1999-12-31 23:54 ` Joern Rennecke 1 sibling, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-12-31 23:54 UTC (permalink / raw) To: Michael Hayes; +Cc: law, gcc, amylaar + /* Look for ref for REG of REF_TYPE between INSN1 and INSN2 inclusive. */ + struct ref * + ref_find_ref_between (ref_info, reg, insn1, insn2, ref_type) + struct ref_info *ref_info; + rtx reg; + rtx insn1, insn2; + enum ref_type ref_type; + { + rtx insn; + struct ref *ref; + + for (insn = insn1; insn != NEXT_INSN (insn2); insn = NEXT_INSN (insn)) + if ((ref = ref_find_ref (ref_info, reg, insn, REF_REG_DEF))) + return ref; + return NULL; + } You ignore the parameter REF_TYPE. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 0:45 ` Michael Hayes 1999-11-18 7:33 ` Joern Rennecke 1999-11-18 17:00 ` Michael Hayes @ 1999-11-30 23:37 ` Michael Hayes 2 siblings, 0 replies; 94+ messages in thread From: Michael Hayes @ 1999-11-30 23:37 UTC (permalink / raw) To: law; +Cc: Michael Hayes, gcc, amylaar Jeffrey A Law writes: > This looks very similar to something Cygnus did for a customer but hasn't > had the time to contribute. > > Our implementation sat inside regmove and I believe performed similar > transformations. > What I would like to do is have a "cook off" between the two implementations. > ie, I want us to evaluate the two hunks of code both from a standpoint of which > is more effective at optimizing sequences that can use autoinc to remove > instructions and from a cleanliness/long term maintainability standpoint. What were the chief advantages of doing this during regmove? I'm not familiar with this pass but I feel the transformations are too late. For simple cases it should make no difference, but for more complex cases, autoincrement optimisation needs to run before instruction combination. > Joern -- can you do the same with Michael's implementation? I'll resubmit my patches tomorrow. There have been a few mods since my previous submission to properly fix {post,pre}_modify addressing modes and to make the code more robust. For the cook off, I've got a testsuite of 25 vector/matrix manipulation routines I have written and 40 assorted testcases that I can contribute. Michael. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 0:22 ` Jeffrey A Law 1999-11-18 0:45 ` Michael Hayes @ 1999-11-18 7:29 ` Joern Rennecke 1999-11-30 23:37 ` Joern Rennecke 1999-11-18 14:29 ` Joern Rennecke 1999-11-30 23:37 ` Jeffrey A Law 3 siblings, 1 reply; 94+ messages in thread From: Joern Rennecke @ 1999-11-18 7:29 UTC (permalink / raw) To: law; +Cc: m.hayes, gcc, amylaar > Joern -- can you get a patch for the regmove changes put together and submit > it to the list? It's not quite that simple - the regmove patches work hand-in-hand with my loop patches. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 7:29 ` Joern Rennecke @ 1999-11-30 23:37 ` Joern Rennecke 0 siblings, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-11-30 23:37 UTC (permalink / raw) To: law; +Cc: m.hayes, gcc, amylaar > Joern -- can you get a patch for the regmove changes put together and submit > it to the list? It's not quite that simple - the regmove patches work hand-in-hand with my loop patches. ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 0:22 ` Jeffrey A Law 1999-11-18 0:45 ` Michael Hayes 1999-11-18 7:29 ` Joern Rennecke @ 1999-11-18 14:29 ` Joern Rennecke 1999-11-22 23:47 ` Jeffrey A Law 1999-11-30 23:37 ` Joern Rennecke 1999-11-30 23:37 ` Jeffrey A Law 3 siblings, 2 replies; 94+ messages in thread From: Joern Rennecke @ 1999-11-18 14:29 UTC (permalink / raw) To: law; +Cc: m.hayes, gcc, gcc-patches > Joern -- can you get a patch for the regmove changes put together and submit > it to the list? These are the regmove patches: Wed Oct 20 20:45:45 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (invalidate_related): New Argument call_tally. Set rel->reg_orig_calls_crossed when setting rel->invalidate_luid. Changed all callers. (optimize_related_values_1): Don't set rel->reg_orig_calls_crossed if rel->invalidate_luid is set. (optimize_related_values): Bump CALL_TALLY *after* inputs have been processed. (find_related): When recursively processing a SET_DEST of a CALL_INSN, pass an incremented value for CALL_TALLY. Wed Oct 20 00:58:08 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (find_related): Ignore registers that change size. Fri Oct 1 15:04:25 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Fix check when to preserve update->insn. Tue Jun 29 07:46:53 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): When deciding whether to add a REG_DEAD or REG_UNUSED note, also check for a REG_INC notes we might have created. * integrate.c (copy_rtx_and_substitute): Handle NOTE_INSN_DELETED_LABEL notes. Don't handle 'n' rtx_format case. Mon Mar 8 16:00:35 1999 Jim Wilson <wilson@cygnus.com> * regmove.c (optimize_related_values): Add bounds check for b before BLOCK_HEAD check. Fri Feb 19 23:10:32 1999 Richard Henderson <rth@cygnus.com> * regmove.c (optimize_related_values): Use insn modes rather than sets_cc0_p. Watch basic block heads and ends rather than insn types. 
Thu Jan 28 01:08:31 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (find_related): Check if a register belonging to a set of related values is clobbered in an insn where it is also used. (optimize_related_values_1): Handle REG_UNUSED notes. (optimize_related_values): Likewise. Mon Dec 14 17:08:17 1998 Jim Wilson <wilson@cygnus.com> * regmove.c (REL_USE_HASH): Use unsigned HOST_WIDE_INT instead of unsigned. Fri Nov 13 10:14:04 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Reject optimization if offset for rel_base_reg_user would be too large. Fri Nov 13 04:36:06 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (rel_record_mem): Don't do anything if the register already has an invalidate_luid. Thu Nov 12 23:02:32 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (invalidate_related): Don't do anything if the register already has an invalidate_luid. (optimize_related_values): Don't update death field if invalidate_luid field is set. Wed Oct 14 21:38:11 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values): Check if cc0 is set. * regmove.c (optimize_related_values): Fix problem with multiple related values in single insn. Wed Sep 23 20:42:54 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Set use->insn when emitting the linking insn before the final 'use' for a register that does not die within the scope of the optimization. Mon Sep 21 15:04:16 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (count_sets): New function. (gen_add3_insn): If single instruction add fails and source and destination register are different, try a move / add sequence. (rel_use_chain): New member match_offset. (optimize_related_values_1): Set it, and use it to avoid linking chains when this requires more than one instruction for the add. (add_limits): New file scope array. (optimize_related_values): Initialize it. 
Mon Sep 21 14:55:36 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Don't use rel_base->reg for a chain that needs an out-of-range offset. Take setting of rel_base_reg_user into account when deciding if there are enough registers available. Tue Sep 15 16:41:00 1998 Michael Tiemann <michael@impact.tiemann.org> * regmove.c (find_related): We also have to track expressions that are just naked registers. Otherwise, we burn one register to prime the related values, and we'll also miss the second (but not subsequent) opportunities to use related values. Thu Sep 3 23:33:57 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * rtl.h (push_obstacks_nochange, end_temporary_allocation): Declare. * regmove.c (obstack.h): Include. (REL_USE_HASH_SIZE, REL_USE_HASH, rel_alloc, rel_new): Define. (struct related, struct related_baseinfo, struct update): New structs. (struct rel_use_chain, struct rel_use): Likewise. (regno_related, rel_base_list, unrelatedly_used): New variables. (related_obstack): Likewise. (regclass_compatible_p, lookup_related): New functions. (rel_build_chain, rel_record_mem, invalidate_related): Likewise. (find_related, chain_starts_earlier, chain_ends_later): Likewise. (optimize_related_values_1, optimize_related_values_0): Likewise. (optimize_related_values): Likewise. (regmove_optimize): Use regclass_compatible_p. Call optimize_related_values. Index: regmove.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/regmove.c,v retrieving revision 1.74 diff -p -r1.74 regmove.c *** regmove.c 1999/11/08 04:56:18 1.74 --- regmove.c 1999/11/18 22:21:46 *************** Boston, MA 02111-1307, USA. 
*/ *** 40,45 **** --- 40,46 ---- #include "insn-flags.h" #include "basic-block.h" #include "toplev.h" + #include "obstack.h" static int optimize_reg_copy_1 PROTO((rtx, rtx, rtx)); static void optimize_reg_copy_2 PROTO((rtx, rtx, rtx)); *************** static int fixup_match_1 PROTO((rtx, rtx *** 66,71 **** --- 67,87 ---- static int reg_is_remote_constant_p PROTO((rtx, rtx, rtx)); static int stable_and_no_regs_but_for_p PROTO((rtx, rtx, rtx)); static int regclass_compatible_p PROTO((int, int)); + #ifdef AUTO_INC_DEC + static struct rel_use *lookup_related PROTO((int, enum reg_class, HOST_WIDE_INT)); + static void rel_build_chain PROTO((struct rel_use *, struct rel_use *, int)); + static void rel_record_mem PROTO((rtx *, rtx, int, int, int, rtx, int, int)); + static void invalidate_related PROTO((rtx, int, int)); + static void find_related PROTO((rtx *, rtx, int, int)); + static int chain_starts_earlier PROTO((const PTR, const PTR)); + static int chain_ends_later PROTO((const PTR, const PTR)); + static struct related *optimize_related_values_1 PROTO((struct related *, int, + int, rtx, FILE *)); + static void optimize_related_values_0 PROTO((struct related *, int, int, + rtx, FILE *)); + static void optimize_related_values PROTO((int, FILE *)); + static void count_sets PROTO((rtx, rtx, void *)); + #endif /* AUTO_INC_DEC */ static int replacement_quality PROTO((rtx)); static int fixup_match_2 PROTO((rtx, rtx, rtx, rtx, FILE *)); static int loop_depth; *************** gen_add3_insn (r0, r1, c) *** 90,106 **** rtx r0, r1, c; { int icode = (int) add_optab->handlers[(int) GET_MODE (r0)].insn_code; ! if (icode == CODE_FOR_nothing || ! ((*insn_data[icode].operand[0].predicate) ! (r0, insn_data[icode].operand[0].mode)) || ! ((*insn_data[icode].operand[1].predicate) ! (r1, insn_data[icode].operand[1].mode)) || ! ((*insn_data[icode].operand[2].predicate) ! (c, insn_data[icode].operand[2].mode))) return NULL_RTX; ! 
return (GEN_FCN (icode) (r0, r1, c)); } --- 106,144 ---- rtx r0, r1, c; { int icode = (int) add_optab->handlers[(int) GET_MODE (r0)].insn_code; + int mcode; + rtx s, move; ! if (icode == CODE_FOR_nothing || ! ((*insn_data[icode].operand[0].predicate) ! (r0, insn_data[icode].operand[0].mode))) ! return NULL_RTX; ! ! if (((*insn_data[icode].operand[1].predicate) ! (r1, insn_data[icode].operand[1].mode)) ! && ((*insn_data[icode].operand[2].predicate) ! (c, insn_data[icode].operand[2].mode))) ! return (GEN_FCN (icode) (r0, r1, c)); ! ! mcode = (int) mov_optab->handlers[(int) GET_MODE (r0)].insn_code; ! if (REGNO (r0) == REGNO (r1) || ! ((*insn_data[icode].operand[1].predicate) ! (r0, insn_data[icode].operand[1].mode)) || ! ((*insn_data[icode].operand[2].predicate) ! (r1, insn_data[icode].operand[2].mode)) ! || ! ((*insn_data[mcode].operand[0].predicate) ! (r0, insn_data[mcode].operand[0].mode)) ! || ! ((*insn_data[mcode].operand[1].predicate) ! (c, insn_data[mcode].operand[1].mode))) return NULL_RTX; ! start_sequence (); ! move = emit_insn (GEN_FCN (mcode) (r0, c)); ! REG_NOTES (move) = gen_rtx_EXPR_LIST (REG_EQUAL, c, NULL_RTX); ! emit_insn (GEN_FCN (icode) (r0, r0, r1)); ! s = gen_sequence (); ! end_sequence (); ! return s; } *************** flags_set_1 (x, pat, data) *** 330,336 **** && reg_overlap_mentioned_p (x, flags_set_1_rtx)) flags_set_1_set = 1; } ! \f static int *regno_src_regno; /* Indicate how good a choice REG (which appears as a source) is to replace --- 368,1989 ---- && reg_overlap_mentioned_p (x, flags_set_1_rtx)) flags_set_1_set = 1; } ! ! #ifdef AUTO_INC_DEC ! ! /* Some machines have two-address-adds and instructions that can ! use only register-indirect addressing and auto_increment, but no ! offsets. If multiple fields of a struct are accessed more than ! once, cse will load each of the member addresses in separate registers. ! This not only costs a lot of registers, but also of instructions, ! 
since each add to initialize an address register must be really expanded ! into a register-register move followed by an add. ! regmove_optimize uses some heuristics to detect this case; if these ! indicate that this is likely, optimize_related_values is run once for ! the entire function. ! ! We build chains of uses of related values that can be satisfied with the ! same base register by taking advantage of auto-increment address modes ! instead of explicit add instructions. ! ! We try to link chains with disjoint lifetimes together to reduce the ! number of temporary registers and register-register copies. ! ! This optimization pass operates on basic blocks one at a time; it could be ! extended to work on extended basic blocks or entire functions. */ ! ! /* For each set of values related to a common base register, we use a ! hash table which maps constant offsets to instructions. ! ! The instructions mapped to are those that use a register which may, ! (possibly with a change in addressing mode) differ from the initial ! value of the base register by exactly that offset after the ! execution of the instruction. ! Here we define the size of the hash table, and the hash function to use. */ ! #define REL_USE_HASH_SIZE 43 ! #define REL_USE_HASH(I) ((I) % (unsigned HOST_WIDE_INT) REL_USE_HASH_SIZE) ! ! /* For each register in a set of registers that are related, we keep a ! struct related. ! ! u.base contains the register number of the base register (i.e. the one ! that was the source of the first three-address add for this set of ! related values). ! ! INSN is the instruction that initialized the register, or, for the ! base, the instruction that initialized the first non-base register. ! ! BASE is the register number of the base register. ! ! For the base register only, the member BASEINFO points to some extra data. ! ! 'luid' here means linear uid. We count them starting at the function ! start; they are used to avoid overlapping lifetimes. ! ! 
UPDATES is a list of instructions that set the register to a new ! value that is still related to the same base. ! ! When a register in a set of related values is set to something that ! is not related to the base, INVALIDATE_LUID is set to the luid of ! the instruction that does this set. This is used to avoid re-using ! this register in an overlapping lifetime for a related value. ! ! DEATH is first used to store the insn (if any) where the register dies. ! When the optimization is actually performed, the REG_DEAD note from ! the insn denoted by DEATH is removed. ! Thereafter, the removed death note is stored in DEATH, marking not ! only that the register dies, but also making the note available for reuse. ! ! We also use a struct related to keep track of registers that have been ! used for anything that we don't recognize as related values. ! The only really interesting datum for these is u.last_luid, which is ! the luid of the last reference we have seen. These struct relateds ! are marked by a zero INSN field; most other members are not used and ! remain uninitialized. */ ! ! struct related { ! rtx insn, reg; ! union { int base; int last_luid; } u; ! HOST_WIDE_INT offset; ! struct related *prev; ! struct update *updates; ! struct related_baseinfo *baseinfo; ! int invalidate_luid; ! rtx death; ! int reg_orig_calls_crossed, reg_set_call_tally, reg_orig_refs; ! }; ! ! /* HASHTAB maps offsets to register uses with a matching MATCH_OFFSET. ! PREV_BASE points to the struct related for the previous base register ! that we currently keep track of. ! INSN_LUID is the luid of the instruction that started this set of ! related values. */ ! struct related_baseinfo { ! struct rel_use *hashtab[REL_USE_HASH_SIZE]; ! struct rel_use_chain *chains; ! struct related *prev_base; ! int insn_luid; ! }; ! ! /* INSN is an instruction that sets a register that previously contained ! a related value to a new value that is related to the same base register. ! 
When the optimization is performed, we have to delete INSN. ! DEATH_INSN points to the insn (if any) where the register died that we ! set in INSN. When we perform the optimization, the REG_DEAD note has ! to be removed from DEATH_INSN. ! PREV points to the struct update that pertains to the previous ! instruction pertaining to the same register that set it from one ! related value to another one. */ ! struct update ! { ! rtx insn, death_insn; ! struct update *prev; ! }; ! ! struct rel_use_chain ! { ! struct rel_use *chain; /* Points to first use in this chain. */ ! struct rel_use_chain *prev, *linked; ! /* Only set after the chain has been completed: */ ! struct rel_use *end; /* Last use in this chain. */ ! int start_luid, end_luid, calls_crossed; ! rtx reg; /* The register allocated for this chain. */ ! HOST_WIDE_INT match_offset; /* Offset after execution of last insn. */ ! }; ! ! /* ADDRP points to the place where the actual use of the related value is. ! This is commonly a memory address, and has to be set to a register ! or some auto_inc addressing of this register. ! But ADDRP is also used for all other uses of related values to ! the place where the register is inserted; we can tell that an ! unadorned register is to be inserted because no offset adjustment ! is required, hence this is handled by the same logic as register-indirect ! addressing. The only exception to this is when SET_IN_PARALLEL is set, ! see below. ! OFFSET is the offset that is actually used in this instance, i.e. ! the value of the base register when the set of related values was ! created plus OFFSET yields the value that is used. ! This might be different from the value of the used register before ! executing INSN if we elected to use pre-{in,de}crement addressing. ! If we have the option to use post-{in,de}crement addressing, all ! choices are linked cyclically together with the SIBLING field. ! Otherwise, it's a one-link-cycle, i.e. SIBLING points at the ! 
struct rel_use it is a member of. ! MATCH_OFFSET is the offset that is available after the execution ! of INSN. It is the same as OFFSET for straight register-indirect ! addressing and for pre-{in,de}crement addressing, while it differs ! for the post-{in,de}crement addressing modes. ! If SET_IN_PARALLEL is set, MATCH_OFFSET differs from OFFSET, yet ! this is no post-{in,de}crement addressing. Rather, it is a set ! inside a PARALLEL that adds some constant to a register that holds ! one value of a set of related values that we keep track of. ! ADDRP then points only to the set destination of this set; another ! struct rel_use is used for the source of the set. ! NO_LINK_PRED is nonzero for the last use in a chain if it cannot be ! the predecessor for another chain to be linked to. This can happen ! for uses that come with a clobber, and for uses by a register that ! is live at the end of the processed range of insns ! (usually a basic block). */ ! struct rel_use ! { ! rtx insn, *addrp; ! int luid, call_tally; ! enum reg_class class; ! unsigned set_in_parallel : 1; ! unsigned no_link_pred: 1; ! HOST_WIDE_INT offset, match_offset; ! struct rel_use *next_chain, **prev_chain_ref, *next_hash, *sibling; ! }; ! ! struct related **regno_related, *rel_base_list, *unrelatedly_used; ! ! #define rel_alloc(N) obstack_alloc(&related_obstack, (N)) ! #define rel_new(X) ((X) = rel_alloc (sizeof *(X))) ! ! static struct obstack related_obstack; ! ! /* For each integer machine mode, the minimum and maximum constant that ! can be added with a single constant. ! This is supposed to define an interval around zero; if there are ! singular points disconnected from this interval, we want to leave ! them out. */ ! ! static HOST_WIDE_INT add_limits[NUM_MACHINE_MODES][2]; ! ! /* Try to find a related value with offset OFFSET from the base ! register belonging to REGNO, using a register with preferred class ! that is compatible with CLASS. */ ! static struct rel_use * ! 
lookup_related (regno, class, offset) ! int regno; ! enum reg_class class; ! HOST_WIDE_INT offset; ! { ! int base = regno_related[regno]->u.base; ! int hash = REL_USE_HASH (offset); ! struct rel_use *match = regno_related[base]->baseinfo->hashtab[hash]; ! for (; match; match = match->next_hash) ! { ! if (offset != match->match_offset) ! continue; ! if (match->next_chain) ! continue; ! if (regclass_compatible_p (class, match->class)) ! break; ! } ! return match; ! } ! ! /* Add NEW_USE at the end of the chain that currently ends with MATCH; ! if MATCH is not set, create a new chain. ! BASE is the base register number the chain belongs to. */ ! static void ! rel_build_chain (new_use, match, base) ! struct rel_use *new_use, *match; ! int base; ! { ! int hash; ! ! if (match) ! { ! struct rel_use *sibling = match; ! do ! { ! sibling->next_chain = new_use; ! if (sibling->prev_chain_ref) ! *sibling->prev_chain_ref = match; ! sibling = sibling->sibling; ! } ! while (sibling != match); ! new_use->prev_chain_ref = &match->next_chain; ! new_use->next_chain = 0; ! } ! else ! { ! struct rel_use_chain *new_chain; ! ! rel_new (new_chain); ! new_chain->chain = new_use; ! new_use->prev_chain_ref = &new_chain->chain; ! new_use->next_chain = 0; ! new_chain->linked = 0; ! new_chain->prev = regno_related[base]->baseinfo->chains; ! regno_related[base]->baseinfo->chains = new_chain; ! } ! hash = REL_USE_HASH (new_use->offset); ! new_use->next_hash = regno_related[base]->baseinfo->hashtab[hash]; ! regno_related[base]->baseinfo->hashtab[hash] = new_use; ! } ! ! /* Record the use of register ADDR in a memory reference. ! ADDRP is the memory location where the address is stored. ! SIZE is the size of the memory reference. ! PRE_OFFS is the offset that has to be added to the value in ADDR ! due to PRE_{IN,DE}CREMENT addressing in the original address; likewise, ! POST_OFFS denotes POST_{IN,DE}CREMENT addressing. INSN is the ! 
instruction that uses this address, LUID its luid, and CALL_TALLY ! the current number of calls encountered since the start of the ! function. */ ! static void ! rel_record_mem (addrp, addr, size, pre_offs, post_offs, insn, luid, call_tally) ! rtx *addrp, addr, insn; ! int size, pre_offs, post_offs; ! int luid, call_tally; ! { ! static rtx auto_inc; ! rtx orig_addr = *addrp; ! int regno, base; ! HOST_WIDE_INT offset; ! struct rel_use *new_use, *match; ! enum reg_class class; ! int hash; ! ! if (GET_CODE (addr) != REG) ! abort (); ! ! regno = REGNO (addr); ! if (! regno_related[regno] || ! regno_related[regno]->insn ! || regno_related[regno]->invalidate_luid) ! return; ! ! regno_related[regno]->reg_orig_refs += loop_depth; ! ! offset = regno_related[regno]->offset += pre_offs; ! base = regno_related[regno]->u.base; ! ! if (! auto_inc) ! { ! push_obstacks_nochange (); ! end_temporary_allocation (); ! auto_inc = gen_rtx_PRE_INC (Pmode, addr); ! pop_obstacks (); ! } ! ! XEXP (auto_inc, 0) = addr; ! *addrp = auto_inc; ! ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = addrp; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class = reg_preferred_class (regno); ! new_use->set_in_parallel = 0; ! new_use->offset = offset; ! new_use->match_offset = offset; ! new_use->sibling = new_use; ! ! do ! { ! match = lookup_related (regno, class, offset); ! if (! match) ! { ! /* We can choose PRE_{IN,DE}CREMENT on the spot with the information ! we have gathered about the preceding instructions, while we have ! to record POST_{IN,DE}CREMENT possibilities so that we can check ! later if we have a use for their output value. */ ! /* We use recog here directly because we are only testing here if ! the changes could be made, but don't really want to make a ! change right now. The caching from recog_memoized would only ! get in the way. */ ! match = lookup_related (regno, class, offset - size); ! if (HAVE_PRE_INCREMENT && match) ! { ! 
PUT_CODE (auto_inc, PRE_INC); ! if (recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! break; ! } ! match = lookup_related (regno, class, offset + size); ! if (HAVE_PRE_DECREMENT && match) ! { ! PUT_CODE (auto_inc, PRE_DEC); ! if (recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! break; ! } ! match = 0; ! } ! PUT_CODE (auto_inc, POST_INC); ! if (HAVE_POST_INCREMENT && recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! { ! struct rel_use *inc_use; ! ! rel_new (inc_use); ! *inc_use = *new_use; ! inc_use->sibling = new_use; ! new_use->sibling = inc_use; ! inc_use->prev_chain_ref = NULL_PTR; ! inc_use->next_chain = NULL_PTR; ! hash = REL_USE_HASH (inc_use->match_offset = offset + size); ! inc_use->next_hash = regno_related[base]->baseinfo->hashtab[hash]; ! regno_related[base]->baseinfo->hashtab[hash] = inc_use; ! } ! PUT_CODE (auto_inc, POST_DEC); ! if (HAVE_POST_DECREMENT && recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! { ! struct rel_use *dec_use; ! ! rel_new (dec_use); ! *dec_use = *new_use; ! dec_use->sibling = new_use->sibling; ! new_use->sibling = dec_use; ! dec_use->prev_chain_ref = NULL_PTR; ! dec_use->next_chain = NULL_PTR; ! hash = REL_USE_HASH (dec_use->match_offset = offset - size); ! dec_use->next_hash = regno_related[base]->baseinfo->hashtab[hash]; ! regno_related[base]->baseinfo->hashtab[hash] = dec_use; ! } ! } ! while (0); ! rel_build_chain (new_use, match, base); ! *addrp = orig_addr; ! ! regno_related[regno]->offset += post_offs; ! } ! ! /* Note that REG is set to something that we do not recognize as a ! related value, at an insn with linear uid LUID. */ ! static void ! invalidate_related (reg, luid, call_tally) ! rtx reg; ! int luid, call_tally; ! { ! int regno = REGNO (reg); ! struct related *rel = regno_related[regno]; ! if (! rel) ! { ! rel_new (rel); ! regno_related[regno] = rel; ! rel->prev = unrelatedly_used; ! unrelatedly_used = rel; ! rel->reg = reg; ! rel->insn = NULL_RTX; ! rel->invalidate_luid = 0; ! rel->u.last_luid = luid; ! } ! 
else if (rel->invalidate_luid) ! ; /* do nothing */ ! else if (! rel->insn) ! rel->u.last_luid = luid; ! else ! { ! rel->invalidate_luid = luid; ! rel->reg_orig_calls_crossed = call_tally - rel->reg_set_call_tally; ! } ! } ! ! /* Check the RTL fragment pointed to by XP for related values - that is, ! if any new ones are created, or if existing ones are assigned new values. ! Also note any other sets so that we can track lifetime conflicts. ! INSN is the instruction XP points into, LUID its luid, and CALL_TALLY ! the number of preceding calls in the function. */ ! static void ! find_related (xp, insn, luid, call_tally) ! rtx *xp, insn; ! int luid, call_tally; ! { ! rtx x = *xp; ! enum rtx_code code = GET_CODE (x); ! const char *fmt; ! int i; ! ! switch (code) ! { ! case SET: ! { ! rtx dst = SET_DEST (x); ! rtx src = SET_SRC (x); ! ! /* First, check whether this sets a new related value. ! We don't care about register class differences here, since ! we might still find that multiple related values share the same ! class even if it is disjoint from the class of the original ! register. ! We use a do .. while (0); here because there are many possible ! conditions that make us want to handle this like an ordinary set. */ ! do ! { ! rtx src_reg, src_const; ! int src_regno, dst_regno; ! struct related *new_related; ! ! /* First check that we have actually something like ! (set (reg pseudo_dst) (plus (reg pseudo_src) (const_int))) . */ ! if (GET_CODE (src) == PLUS) ! { ! src_reg = XEXP (src, 0); ! src_const = XEXP (src, 1); ! } ! else if (GET_CODE (src) == REG ! && GET_MODE_CLASS (GET_MODE (src)) == MODE_INT) ! { ! src_reg = src; ! src_const = const0_rtx; ! } ! else ! break; ! ! if (GET_CODE (src_reg) != REG ! || GET_CODE (src_const) != CONST_INT ! || GET_CODE (dst) != REG) ! break; ! dst_regno = REGNO (dst); ! src_regno = REGNO (src_reg); ! ! /* If only some words of a multi-word pseudo are stored into, the ! old value does not die at the store, yet we can't replace the ! register. 
We cannot handle this case, so reject any pseudos ! that have such stores. ! We approximate this with REG_CHANGES_SIZE, which is true also ! in a few cases that we could handle (i.e. same number of words, ! or size change only when reading). */ ! if (src_regno < FIRST_PSEUDO_REGISTER ! || REG_CHANGES_SIZE (src_regno) ! || dst_regno < FIRST_PSEUDO_REGISTER ! || REG_CHANGES_SIZE (dst_regno)) ! break; ! ! /* We only know how to remove the set if that is ! all that the insn does. */ ! if (x != single_set (insn)) ! break; ! ! /* We cannot handle multiple lifetimes. */ ! if ((regno_related[src_regno] ! && regno_related[src_regno]->invalidate_luid) ! || (regno_related[dst_regno] ! && regno_related[dst_regno]->invalidate_luid)) ! break; ! ! /* Check if this is merely an update of a register with a ! value belonging to a group of related values we already ! track. */ ! if (regno_related[dst_regno] && regno_related[dst_regno]->insn) ! { ! struct update *new_update; ! ! /* If the base register changes, don't handle this as a ! related value. We can currently only attribute the ! register to one base, and keep record of one lifetime ! during which we might re-use the register. */ ! if (! regno_related[src_regno] ! || ! regno_related[src_regno]->insn ! || (regno_related[dst_regno]->u.base ! != regno_related[src_regno]->u.base)) ! break; ! regno_related[src_regno]->reg_orig_refs += loop_depth; ! regno_related[dst_regno]->reg_orig_refs += loop_depth; ! regno_related[dst_regno]->offset ! = regno_related[src_regno]->offset + INTVAL (src_const); ! rel_new (new_update); ! new_update->insn = insn; ! new_update->death_insn = regno_related[dst_regno]->death; ! regno_related[dst_regno]->death = NULL_RTX; ! new_update->prev = regno_related[dst_regno]->updates; ! regno_related[dst_regno]->updates = new_update; ! return; ! } ! if (! regno_related[src_regno] ! || ! regno_related[src_regno]->insn) ! { ! if (src_regno == dst_regno) ! break; ! rel_new (new_related); ! 
new_related->reg = src_reg; ! new_related->insn = insn; ! new_related->updates = 0; ! new_related->reg_set_call_tally = call_tally; ! new_related->reg_orig_refs = loop_depth; ! new_related->u.base = src_regno; ! new_related->offset = 0; ! new_related->prev = 0; ! new_related->invalidate_luid = 0; ! new_related->death = NULL_RTX; ! rel_new (new_related->baseinfo); ! bzero ((char *) new_related->baseinfo, ! sizeof *new_related->baseinfo); ! new_related->baseinfo->prev_base = rel_base_list; ! rel_base_list = new_related; ! new_related->baseinfo->insn_luid = luid; ! regno_related[src_regno] = new_related; ! } ! /* If the destination register has been used since we started ! tracking this group of related values, there would be tricky ! lifetime problems that we don't want to tackle right now. */ ! else if (regno_related[dst_regno] ! && (regno_related[dst_regno]->u.last_luid ! >= regno_related[regno_related[src_regno]->u.base]->baseinfo->insn_luid)) ! break; ! rel_new (new_related); ! new_related->reg = dst; ! new_related->insn = insn; ! new_related->updates = 0; ! new_related->reg_set_call_tally = call_tally; ! new_related->reg_orig_refs = loop_depth; ! new_related->u.base = regno_related[src_regno]->u.base; ! new_related->offset = ! regno_related[src_regno]->offset + INTVAL (src_const); ! new_related->invalidate_luid = 0; ! new_related->death = NULL_RTX; ! new_related->prev = regno_related[src_regno]->prev; ! regno_related[src_regno]->prev = new_related; ! regno_related[dst_regno] = new_related; ! return; ! } ! while (0); ! ! /* The SET has not been recognized as setting up a related value. ! If the destination is ultimately a register, we have to ! invalidate what we have memorized about any related value ! previously stored into it. */ ! while (GET_CODE (dst) == SUBREG ! || GET_CODE (dst) == ZERO_EXTRACT ! || GET_CODE (dst) == SIGN_EXTRACT ! || GET_CODE (dst) == STRICT_LOW_PART) ! dst = XEXP (dst, 0); ! if (GET_CODE (dst) == REG) ! { ! 
find_related (&SET_SRC (x), insn, luid, call_tally); ! invalidate_related (dst, luid, call_tally); ! return; ! } ! find_related (&SET_SRC (x), insn, luid, call_tally); ! find_related (&SET_DEST (x), insn, luid, ! call_tally + (GET_CODE (insn) == CALL_INSN)); ! return; ! } ! case CLOBBER: ! { ! rtx dst = XEXP (x, 0); ! while (GET_CODE (dst) == SUBREG ! || GET_CODE (dst) == ZERO_EXTRACT ! || GET_CODE (dst) == SIGN_EXTRACT ! || GET_CODE (dst) == STRICT_LOW_PART) ! dst = XEXP (dst, 0); ! if (GET_CODE (dst) == REG) ! { ! int regno = REGNO (dst); ! struct related *rel = regno_related[regno]; ! ! /* If this clobbers a register that belongs to a set of related ! values, we have to check if the same register appears somewhere ! else in the insn : this is then likely to be a match_dup. */ ! ! if (rel ! && rel->insn ! && ! rel->invalidate_luid ! && xp != &PATTERN (insn) ! && count_occurrences (PATTERN (insn), dst) > 1) ! { ! enum reg_class class = reg_preferred_class (regno); ! struct rel_use *new_use, *match; ! HOST_WIDE_INT offset = rel->offset; ! ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = &XEXP (x, 0); ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class; ! new_use->set_in_parallel = 1; ! new_use->sibling = new_use; ! do ! { ! new_use->match_offset = new_use->offset = offset; ! match = lookup_related (regno, class, offset); ! offset++; ! } ! while (! match || match->luid != luid); ! rel_build_chain (new_use, match, rel->u.base); ! /* Prevent other registers from using the same chain. */ ! new_use->next_chain = new_use; ! } ! invalidate_related (dst, luid, call_tally); ! return; ! } ! break; ! } ! case REG: ! { ! int regno = REGNO (x); ! if (! regno_related[regno]) ! { ! rel_new (regno_related[regno]); ! regno_related[regno]->prev = unrelatedly_used; ! unrelatedly_used = regno_related[regno]; ! regno_related[regno]->reg = x; ! regno_related[regno]->insn = NULL_RTX; ! regno_related[regno]->u.last_luid = luid; ! } ! 
else if (! regno_related[regno]->insn) ! regno_related[regno]->u.last_luid = luid; ! else if (! regno_related[regno]->invalidate_luid) ! { ! struct rel_use *new_use, *match; ! HOST_WIDE_INT offset; ! int base; ! enum reg_class class; ! ! regno_related[regno]->reg_orig_refs += loop_depth; ! ! offset = regno_related[regno]->offset; ! base = regno_related[regno]->u.base; ! ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = xp; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class = reg_preferred_class (regno); ! new_use->set_in_parallel = 0; ! new_use->offset = offset; ! new_use->match_offset = offset; ! new_use->sibling = new_use; ! ! match = lookup_related (regno, class, offset); ! rel_build_chain (new_use, match, base); ! } ! return; ! } ! case MEM: ! { ! int size = GET_MODE_SIZE (GET_MODE (x)); ! rtx *addrp= &XEXP (x, 0), addr = *addrp; ! ! switch (GET_CODE (addr)) ! { ! case REG: ! rel_record_mem (addrp, addr, size, 0, 0, ! insn, luid, call_tally); ! return; ! case PRE_INC: ! rel_record_mem (addrp, XEXP (addr, 0), size, size, 0, ! insn, luid, call_tally); ! return; ! case POST_INC: ! rel_record_mem (addrp, XEXP (addr, 0), size, 0, size, ! insn, luid, call_tally); ! return; ! case PRE_DEC: ! rel_record_mem (addrp, XEXP (addr, 0), size, -size, 0, ! insn, luid, call_tally); ! return; ! case POST_DEC: ! rel_record_mem (addrp, XEXP (addr, 0), size, 0, -size, ! insn, luid, call_tally); ! return; ! default: ! break; ! } ! break; ! } ! case PARALLEL: ! { ! for (i = XVECLEN (x, 0) - 1; i >= 0; i--) ! { ! rtx *yp = &XVECEXP (x, 0, i); ! rtx y = *yp; ! if (GET_CODE (y) == SET) ! { ! rtx dst; ! ! find_related (&SET_SRC (y), insn, luid, call_tally); ! dst = SET_DEST (y); ! while (GET_CODE (dst) == SUBREG ! || GET_CODE (dst) == ZERO_EXTRACT ! || GET_CODE (dst) == SIGN_EXTRACT ! || GET_CODE (dst) == STRICT_LOW_PART) ! dst = XEXP (dst, 0); ! if (GET_CODE (dst) != REG) ! find_related (&SET_DEST (y), insn, luid, ! 
call_tally + (GET_CODE (insn) == CALL_INSN)); ! } ! else if (GET_CODE (y) != CLOBBER) ! find_related (yp, insn, luid, call_tally); ! } ! for (i = XVECLEN (x, 0) - 1; i >= 0; i--) ! { ! rtx *yp = &XVECEXP (x, 0, i); ! rtx y = *yp; ! if (GET_CODE (y) == SET) ! { ! rtx *dstp; ! ! dstp = &SET_DEST (y); ! while (GET_CODE (*dstp) == SUBREG ! || GET_CODE (*dstp) == ZERO_EXTRACT ! || GET_CODE (*dstp) == SIGN_EXTRACT ! || GET_CODE (*dstp) == STRICT_LOW_PART) ! dstp = &XEXP (*dstp, 0); ! if (GET_CODE (*dstp) == REG) ! { ! int regno = REGNO (*dstp); ! rtx src = SET_SRC (y); ! if (regno_related[regno] && regno_related[regno]->insn ! && GET_CODE (src) == PLUS ! && XEXP (src, 0) == *dstp ! && GET_CODE (XEXP (src, 1)) == CONST_INT) ! { ! struct rel_use *new_use, *match; ! enum reg_class class; ! ! regno_related[regno]->reg_orig_refs += loop_depth; ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = dstp; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class = reg_preferred_class (regno); ! new_use->set_in_parallel = 1; ! new_use->offset = regno_related[regno]->offset; ! new_use->match_offset ! = regno_related[regno]->offset ! += INTVAL (XEXP (src, 1)); ! new_use->sibling = new_use; ! match = lookup_related (regno, class, new_use->offset); ! rel_build_chain (new_use, match, ! regno_related[regno]->u.base); ! } ! else ! /* We assume here that a CALL_INSN won't set a pseudo ! at the same time as a MEM that contains the pseudo ! - if that were the case, we'd have to use an ! incremented CALL_TALLY value. */ ! invalidate_related (*dstp, luid, call_tally); ! } ! } ! else if (GET_CODE (y) == CLOBBER) ! find_related (yp, insn, luid, call_tally); ! } ! return; ! } ! default: ! break; ! } ! fmt = GET_RTX_FORMAT (code); ! ! for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) ! { ! if (fmt[i] == 'e') ! find_related (&XEXP (x, i), insn, luid, call_tally); ! if (fmt[i] == 'E') ! { ! register int j; ! for (j = 0; j < XVECLEN (x, i); j++) ! 
find_related (&XVECEXP (x, i, j), insn, luid, call_tally); ! } ! } ! } ! ! /* Comparison functions for qsort. */ ! static int ! chain_starts_earlier (chain1, chain2) ! const PTR chain1; ! const PTR chain2; ! { ! int d = ((*(struct rel_use_chain **)chain2)->start_luid ! - (*(struct rel_use_chain **)chain1)->start_luid); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->offset ! - (*(struct rel_use_chain **)chain1)->chain->offset); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->set_in_parallel ! - (*(struct rel_use_chain **)chain1)->chain->set_in_parallel); ! /* If set_in_parallel is not set on both chain's first use, they must ! differ in start_luid or offset, since otherwise they would use the ! same chain. ! Thus the remaining problem is with set_in_parallel uses; for these, we ! know that *addrp is a register. Since the same register may not be set ! multiple times in the same insn, the registers must be different. */ ! ! if (! d) ! d = (REGNO (*(*(struct rel_use_chain **)chain2)->chain->addrp) ! - REGNO (*(*(struct rel_use_chain **)chain1)->chain->addrp)); ! return d; ! } ! ! static int ! chain_ends_later (chain1, chain2) ! const PTR chain1; ! const PTR chain2; ! { ! int d = ((*(struct rel_use_chain **)chain1)->end->no_link_pred ! - (*(struct rel_use_chain **)chain2)->end->no_link_pred); ! if (! d) ! d = ((*(struct rel_use_chain **)chain1)->end_luid ! - (*(struct rel_use_chain **)chain2)->end_luid); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->offset ! - (*(struct rel_use_chain **)chain1)->chain->offset); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->set_in_parallel ! - (*(struct rel_use_chain **)chain1)->chain->set_in_parallel); ! /* If set_in_parallel is not set on both chain's first use, they must ! differ in start_luid or offset, since otherwise they would use the ! same chain. ! Thus the remaining problem is with set_in_parallel uses; for these, we ! know that *addrp is a register. 
Since the same register may not be set ! multiple times in the same insn, the registers must be different. */ ! ! if (! d) ! d = (REGNO (*(*(struct rel_use_chain **)chain2)->chain->addrp) ! - REGNO (*(*(struct rel_use_chain **)chain1)->chain->addrp)); ! return d; ! } ! ! static void ! count_sets (x, pat, trash) ! rtx x, pat; ! void *trash ATTRIBUTE_UNUSED; ! { ! if (GET_CODE (x) == REG) ! REG_N_SETS (REGNO (x))++; ! } ! ! /* Perform the optimization for a single set of related values. ! INSERT_AFTER is an instruction after which we may emit instructions ! to initialize registers that remain live beyond the end of the group ! of instructions which have been examined. */ ! static struct related * ! optimize_related_values_1 (rel_base, luid, call_tally, insert_after, ! regmove_dump_file) ! struct related *rel_base; ! int luid, call_tally; ! rtx insert_after; ! FILE *regmove_dump_file; ! { ! struct related_baseinfo *baseinfo = rel_base->baseinfo; ! struct related *rel; ! struct rel_use_chain *chain, *chain0, **chain_starttab, **chain_endtab; ! struct rel_use_chain **pred_chainp, *pred_chain, *last_init_chain; ! int num_regs, num_av_regs, num_chains, num_linked, max_end_luid, i; ! int max_start_luid; ! struct rel_use_chain *rel_base_reg_user; ! int mode; ! HOST_WIDE_INT rel_base_reg_user_offset = 0; ! ! /* For any registers that are still live, we have to arrange ! to have them set to their proper values. ! Also count with how many registers (not counting base) we are ! dealing with here. */ ! for (num_regs = -1, rel = rel_base; rel; rel = rel->prev, num_regs++) ! { ! int regno = REGNO (rel->reg); ! ! if (! rel->death ! && ! rel->invalidate_luid) ! { ! enum reg_class class = reg_preferred_class (regno); ! struct rel_use *new_use, *match; ! ! rel_new (new_use); ! new_use->insn = NULL_RTX; ! new_use->addrp = &rel->reg; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class; ! new_use->set_in_parallel = 1; ! 
new_use->match_offset = new_use->offset = rel->offset; ! new_use->sibling = new_use; ! match = lookup_related (regno, class, rel->offset); ! rel_build_chain (new_use, match, REGNO (rel_base->reg)); ! /* Prevent other registers from using the same chain. */ ! new_use->next_chain = new_use; ! ! rel->reg_orig_calls_crossed = call_tally - rel->reg_set_call_tally; ! } ! } ! ! /* Now for every chain of values related to the base, set start ! and end luid, match_offset, and reg. Also count the number of these ! chains, and determine the largest end luid. */ ! num_chains = 0; ! for (max_end_luid = 0, chain = baseinfo->chains; chain; chain = chain->prev) ! { ! struct rel_use *use, *next; ! ! num_chains++; ! next = chain->chain; ! chain->start_luid = next->luid; ! do ! { ! use = next; ! next = use->next_chain; ! } ! while (next && next != use); ! use->no_link_pred = next != NULL_PTR; ! use->next_chain = 0; ! chain->end = use; ! chain->end_luid = use->luid; ! chain->match_offset = use->match_offset; ! chain->calls_crossed = use->call_tally - chain->chain->call_tally; ! ! chain->reg = use->insn ? NULL_RTX : *use->addrp; ! ! if (use->luid > max_end_luid) ! max_end_luid = use->luid; ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Chain start: %d end: %d\n", ! chain->start_luid, chain->end_luid); ! } ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, ! "Insn %d reg %d: found %d chains.\n", ! INSN_UID (rel_base->insn), REGNO (rel_base->reg), num_chains); ! ! if (! num_chains) ! return baseinfo->prev_base; ! ! /* For every chain, we try to find another chain the lifetime of which ! ends before the lifetime of said chain starts. ! So we first sort according to luid of first and last instruction that ! is in the chain, respectively; this is O(n * log n) on average. */ ! chain_starttab = rel_alloc (num_chains * sizeof *chain_starttab); ! chain_endtab = rel_alloc (num_chains * sizeof *chain_starttab); ! 
for (chain = baseinfo->chains, i = 0; chain; chain = chain->prev, i++) ! { ! chain_starttab[i] = chain; ! chain_endtab[i] = chain; ! } ! qsort (chain_starttab, num_chains, sizeof *chain_starttab, ! chain_starts_earlier); ! qsort (chain_endtab, num_chains, sizeof *chain_endtab, chain_ends_later); ! ! ! /* Now we go through every chain, starting with the one that starts ! second (we can skip the first because we know there would be no match), ! and check it against the chain that ends first. */ ! /* ??? We assume here that regclass_compatible_p will seldom return false. ! If that is not true, we should do a more thorough search for suitable ! chain combinations. */ ! pred_chainp = chain_endtab; ! pred_chain = *pred_chainp; ! max_start_luid = chain_starttab[num_chains - 1]->start_luid; ! for (num_linked = 0, i = num_chains - 2; i >= 0; i--) ! { ! struct rel_use_chain *succ_chain = chain_starttab[i]; ! if (succ_chain->start_luid > pred_chain->end_luid ! && ! pred_chain->end->no_link_pred ! && (pred_chain->calls_crossed ! ? succ_chain->calls_crossed ! : succ_chain->end->call_tally == pred_chain->chain->call_tally) ! && regclass_compatible_p (succ_chain->chain->class, ! pred_chain->chain->class) ! /* add_limits is not valid for MODE_PARTIAL_INT . */ ! && GET_MODE_CLASS (GET_MODE (rel_base->reg)) == MODE_INT ! && (succ_chain->chain->offset - pred_chain->match_offset ! >= add_limits[(int) GET_MODE (rel_base->reg)][0]) ! && (succ_chain->chain->offset - pred_chain->match_offset ! <= add_limits[(int) GET_MODE (rel_base->reg)][1])) ! { ! /* We can link these chains together. */ ! pred_chain->linked = succ_chain; ! succ_chain->start_luid = 0; ! num_linked++; ! ! pred_chain = *++pred_chainp; ! } ! else ! max_start_luid = succ_chain->start_luid; ! } ! ! if (regmove_dump_file && num_linked) ! fprintf (regmove_dump_file, "Linked to %d sets of chains.\n", ! num_chains - num_linked); ! ! /* Now count the number of registers that are available for reuse. */ ! /* ??? 
In rare cases, we might reuse more if we took different ! end luids of the chains into account. Or we could just allocate ! some new regs. But that would probably not be worth the effort. */ ! /* ??? We should pay attention to preferred register classes here too, ! if the to-be-allocated registers have a life outside the range that ! we handle. */ ! for (num_av_regs = 0, rel = rel_base->prev; rel; rel = rel->prev) ! { ! if (! rel->invalidate_luid ! || rel->invalidate_luid > max_end_luid) ! num_av_regs++; ! } ! ! /* Propagate mandatory register assignments to the first chain in ! all sets of linked chains, and set rel_base_reg_user. */ ! for (rel_base_reg_user = 0, i = 0; i < num_chains; i++) ! { ! struct rel_use_chain *chain = chain_starttab[i]; ! if (chain->linked) ! chain->reg = chain->linked->reg; ! if (chain->reg == rel_base->reg) ! rel_base_reg_user = chain; ! } ! /* If rel_base->reg is not a mandatory allocated register, allocate ! it to that chain that starts first and has no allocated register, ! and that allows the addition of the start value in a single ! instruction. */ ! mode = (int) GET_MODE (rel_base->reg); ! if (! rel_base_reg_user) ! { ! for (i = num_chains - 1; i >= 0; --i) ! { ! struct rel_use_chain *chain = chain_starttab[i]; ! if (! chain->reg ! && chain->start_luid ! && chain->chain->offset >= add_limits[mode][0] ! && chain->chain->offset <= add_limits[mode][1] ! /* Also can't use this chain if its register is clobbered ! and other chains need to start later. */ ! && (! (chain->end->no_link_pred && chain->end->insn) ! || chain->end->luid >= max_start_luid) ! /* Also can't use it if it lasts longer than the ! base is available. */ ! && (! rel_base->invalidate_luid ! || rel_base->invalidate_luid > chain->end_luid)) ! { ! chain->reg = rel_base->reg; ! rel_base_reg_user = chain; ! break; ! } ! } ! } ! else ! rel_base_reg_user_offset = rel_base_reg_user->chain->offset; ! ! /* Now check if it is worth doing this optimization after all. ! 
Using separate registers per value, like in the code generated by cse, ! costs two instructions per register (one move and one add). ! Using the chains we have set up, we need two instructions for every ! linked set of chains, plus one instruction for every link; ! however, if the base register is allocated to a chain ! (i.e. rel_base_reg_user != 0), we don't need a move insn to start ! that chain. ! We do the optimization if we save instructions, or if we ! stay with the same number of instructions, but save registers. ! We also require that we have enough registers available for reuse. ! Moreover, we have to check that we can add the offset for ! rel_base_reg_user, in case it is a mandatory allocated register. */ ! if ((2 * num_regs ! > ((2 * num_chains - num_linked - (rel_base_reg_user != 0)) ! - (num_linked != 0))) ! && num_av_regs + (rel_base_reg_user != 0) >= num_chains - num_linked ! && rel_base_reg_user_offset >= add_limits[mode][0] ! && rel_base_reg_user_offset <= add_limits[mode][1]) ! { ! /* Hold the current offset between the initial value of rel_base->reg ! and the current value of rel_base->reg before the instruction ! that starts the current set of chains. */ ! int base_offset = 0; ! /* The next use of rel_base->reg that we have to look out for. */ ! struct rel_use *base_use; ! /* Pointer to next insn where we look for it. */ ! rtx base_use_scan = 0; ! int base_last_use_call_tally = rel_base->reg_set_call_tally; ! int base_regno; ! int base_seen; ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Optimization is worthwhile.\n"); ! ! /* First, remove all the setting insns, death notes ! and refcount increments that are now obsolete. */ ! for (rel = rel_base; rel; rel = rel->prev) ! { ! struct update *update; ! int regno = REGNO (rel->reg); ! ! if (rel != rel_base) ! { ! /* The first setting insn might be the start of a basic block. */ ! if (rel->insn == rel_base->insn ! /* We have to preserve insert_after. */ ! || rel->insn == insert_after) ! 
{ ! PUT_CODE (rel->insn, NOTE); ! NOTE_LINE_NUMBER (rel->insn) = NOTE_INSN_DELETED; ! NOTE_SOURCE_FILE (rel->insn) = 0; ! } ! else ! delete_insn (rel->insn); ! REG_N_SETS (regno)--; ! } ! REG_N_CALLS_CROSSED (regno) -= rel->reg_orig_calls_crossed; ! for (update = rel->updates; update; update = update->prev) ! { ! rtx death_insn = update->death_insn; ! if (death_insn) ! { ! rtx death_note ! = find_reg_note (death_insn, REG_DEAD, rel->reg); ! if (! death_note) ! death_note ! = find_reg_note (death_insn, REG_UNUSED, rel->reg); ! remove_note (death_insn, death_note); ! REG_N_DEATHS (regno)--; ! } ! /* We have to preserve insert_after. */ ! if (update->insn == insert_after) ! { ! PUT_CODE (update->insn, NOTE); ! NOTE_LINE_NUMBER (update->insn) = NOTE_INSN_DELETED; ! NOTE_SOURCE_FILE (update->insn) = 0; ! } ! else ! delete_insn (update->insn); ! REG_N_SETS (regno)--; ! } ! if (rel->death) ! { ! rtx death_note = find_reg_note (rel->death, REG_DEAD, rel->reg); ! if (! death_note) ! death_note = find_reg_note (rel->death, REG_UNUSED, rel->reg); ! remove_note (rel->death, death_note); ! rel->death = death_note; ! REG_N_DEATHS (regno)--; ! } ! } ! /* Go through all the chains and install the planned changes. */ ! rel = rel_base; ! if (rel_base_reg_user) ! { ! base_use = rel_base_reg_user->chain; ! base_use_scan = chain_starttab[num_chains - 1]->chain->insn; ! } ! ! /* Set last_init_chain to the chain that starts latest. */ ! for (i = 0; ! chain_starttab[i]->start_luid; i++); ! last_init_chain = chain_starttab[i]; ! /* If there are multiple chains that consist only of an ! assignment to a register that is live at the end of the ! block, they all have the same luid. The loop that emits the ! insns for all the chains below starts with the chain with the ! highest index, and due to the way insns are emitted after ! insert_after, the first emitted will eventually be the last. */ ! for (; i < num_chains; i++) ! { ! if (! chain_starttab[i]->start_luid) ! continue; ! 
if (chain_starttab[i]->chain->insn) ! break; ! last_init_chain = chain_starttab[i]; ! } ! ! base_regno = REGNO (rel_base->reg); ! base_seen = 0; ! for (i = num_chains - 1; i >= 0; i--) ! { ! int first_call_tally; ! rtx reg; ! int regno; ! struct rel_use *use, *last_use; ! ! chain0 = chain_starttab[i]; ! if (! chain0->start_luid) ! continue; ! first_call_tally = chain0->chain->call_tally; ! reg = chain0->reg; ! /* If this chain has not got a register yet, assign one. */ ! if (! reg) ! { ! do ! rel = rel->prev; ! while (! rel->death ! || (rel->invalidate_luid ! && rel->invalidate_luid <= max_end_luid)); ! reg = rel->reg; ! } ! regno = REGNO (reg); ! ! use = chain0->chain; ! ! /* Update base_offset. */ ! if (rel_base_reg_user) ! { ! rtx use_insn = use->insn; ! rtx base_use_insn = base_use->insn; ! ! if (! use_insn) ! use_insn = insert_after; ! ! while (base_use_scan != use_insn) ! { ! if (base_use_scan == base_use_insn) ! { ! base_offset = base_use->match_offset; ! base_use = base_use->next_chain; ! if (! base_use) ! { ! rel_base_reg_user = rel_base_reg_user->linked; ! if (! rel_base_reg_user) ! break; ! base_use = rel_base_reg_user->chain; ! } ! base_use_insn = base_use->insn; ! } ! base_use_scan = NEXT_INSN (base_use_scan); ! } ! /* If we are processing the start of a chain that starts with ! an instruction that also uses the base register, (that happens ! only if USE_INSN contains multiple distinct, but related ! values) and the chains using the base register have already ! been processed, the initializing instruction of the new ! register will end up after the adjustment of the base ! register. */ ! if (use_insn == base_use_insn && base_seen) ! base_offset = base_use->offset; ! } ! if (regno == base_regno) ! base_seen = 1; ! if (regno != base_regno || use->offset - base_offset) ! { ! HOST_WIDE_INT offset = use->offset - base_offset; ! rtx add; ! ! add = (offset ! ? gen_add3_insn (reg, rel_base->reg, GEN_INT (offset)) ! : gen_move_insn (reg, rel_base->reg)); ! 
if (! add) ! abort (); ! if (GET_CODE (add) == SEQUENCE) ! { ! int i; ! ! for (i = XVECLEN (add, 0) - 1; i >= 0; i--) ! note_stores (PATTERN (XVECEXP (add, 0, i)), count_sets, ! NULL); ! } ! else ! note_stores (add, count_sets, NULL); ! if (use->insn) ! add = emit_insn_before (add, use->insn); ! else ! add = emit_insn_after (add, insert_after); ! if (use->call_tally > base_last_use_call_tally) ! base_last_use_call_tally = use->call_tally; ! /* If this is the last reg initializing insn, and we ! still have to place a death note for the base reg, ! attach it to this insn - ! unless we are still using the base register. */ ! if (chain0 == last_init_chain ! && rel_base->death ! && regno != base_regno) ! { ! XEXP (rel_base->death, 0) = rel_base->reg; ! XEXP (rel_base->death, 1) = REG_NOTES (add); ! REG_NOTES (add) = rel_base->death; ! REG_N_DEATHS (base_regno)++; ! } ! } ! for (last_use = 0, chain = chain0; chain; chain = chain->linked) ! { ! int last_offset; ! ! use = chain->chain; ! if (last_use && use->offset != last_use->offset) ! { ! rtx add ! = gen_add3_insn (reg, reg, ! GEN_INT (use->offset - last_use->offset)); ! if (! add) ! abort (); ! if (use->insn) ! emit_insn_before (add, use->insn); ! else ! { ! /* Set use->insn, so that base_offset will be adjusted ! in time if REG is REL_BASE->REG . */ ! use->insn = emit_insn_after (add, last_use->insn); ! } ! REG_N_SETS (regno)++; ! } ! for (last_offset = use->offset; use; use = use->next_chain) ! { ! rtx addr; ! int use_offset; ! ! addr = *use->addrp; ! if (GET_CODE (addr) != REG) ! remove_note (use->insn, ! find_reg_note (use->insn, REG_INC, ! XEXP (addr, 0))); ! use_offset = use->offset; ! if (use_offset == last_offset) ! { ! if (use->set_in_parallel) ! { ! REG_N_SETS (REGNO (addr))--; ! addr = reg; ! } ! else if (use->match_offset > use_offset) ! addr = gen_rtx_POST_INC (Pmode, reg); ! else if (use->match_offset < use_offset) ! addr = gen_rtx_POST_DEC (Pmode, reg); ! else ! addr = reg; ! } ! 
else if (use_offset > last_offset) ! addr = gen_rtx_PRE_INC (Pmode, reg); ! else ! addr = gen_rtx_PRE_DEC (Pmode, reg); ! /* Group changes from the same chain for the same insn ! together, to avoid failures for match_dups. */ ! validate_change (use->insn, use->addrp, addr, 1); ! if ((! use->next_chain || use->next_chain->insn != use->insn) ! && ! apply_change_group ()) ! abort (); ! if (addr != reg) ! REG_NOTES (use->insn) ! = gen_rtx_EXPR_LIST (REG_INC, reg, REG_NOTES (use->insn)); ! last_use = use; ! last_offset = use->match_offset; ! } ! } ! /* If REG dies, attach its death note to the last using insn in ! the set of linked chains we just handled. ! However, if REG is the base register, don't do this if there ! will be a later initializing instruction for another register. ! The initializing instruction for last_init_chain will be inserted ! before last_init_chain->chain->insn, so if the luids (and hence ! the insns these stand for) are equal, put the death note here. */ ! if (reg == rel->reg ! && rel->death ! && (rel != rel_base ! || last_use->luid >= last_init_chain->start_luid)) ! { ! XEXP (rel->death, 0) = reg; ! ! /* Note that passing only PATTERN (LAST_USE->insn) to ! reg_set_p here is not enough, since we might have ! created an REG_INC for REG above. */ ! ! PUT_MODE (rel->death, (reg_set_p (reg, last_use->insn) ! ? REG_UNUSED : REG_DEAD)); ! XEXP (rel->death, 1) = REG_NOTES (last_use->insn); ! REG_NOTES (last_use->insn) = rel->death; ! /* Mark this death as 'used up'. That is important for the ! base register. */ ! rel->death = NULL_RTX; ! REG_N_DEATHS (regno)++; ! } ! if (regno == base_regno) ! base_last_use_call_tally = last_use->call_tally; ! else ! REG_N_CALLS_CROSSED (regno) ! += last_use->call_tally - first_call_tally; ! } ! ! REG_N_CALLS_CROSSED (base_regno) += ! base_last_use_call_tally - rel_base->reg_set_call_tally; ! } ! ! /* Finally, clear the entries that we used in regno_related. We do it ! 
item by item here, because doing it with bzero for each basic block ! would give O(n*n) time complexity. */ ! for (rel = rel_base; rel; rel = rel->prev) ! regno_related[REGNO (rel->reg)] = 0; ! return baseinfo->prev_base; ! } ! ! /* Finalize the optimization for any related values known so far, and reset ! the entries in regno_related that we have disturbed. */ ! static void ! optimize_related_values_0 (rel_base_list, luid, call_tally, insert_after, ! regmove_dump_file) ! struct related *rel_base_list; ! int luid, call_tally; ! rtx insert_after; ! FILE *regmove_dump_file; ! { ! while (rel_base_list) ! rel_base_list ! = optimize_related_values_1 (rel_base_list, luid, call_tally, ! insert_after, regmove_dump_file); ! for ( ; unrelatedly_used; unrelatedly_used = unrelatedly_used->prev) ! regno_related[REGNO (unrelatedly_used->reg)] = 0; ! } ! ! /* Scan the entire function for instances where multiple registers are ! set to values that differ only by a constant. ! Then try to reduce the number of instructions and/or registers needed ! by exploiting auto_increment and true two-address additions. */ ! ! static void ! optimize_related_values (nregs, regmove_dump_file) ! int nregs; ! FILE *regmove_dump_file; ! { ! int b; ! rtx insn; ! int luid = 0; ! int call_tally = 0; ! int save_loop_depth = loop_depth; ! enum machine_mode mode; ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Starting optimize_related_values.\n"); ! ! /* For each integer mode, find minimum and maximum value for a single- ! instruction reg-constant add. */ ! for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode; ! mode = GET_MODE_WIDER_MODE (mode)) ! { ! rtx reg = gen_rtx_REG (mode, FIRST_PSEUDO_REGISTER); ! int icode = (int) add_optab->handlers[(int) mode].insn_code; ! HOST_WIDE_INT tmp; ! rtx add, set; ! int p, p_max; ! ! add_limits[(int) mode][0] = 0; ! add_limits[(int) mode][1] = 0; ! if (icode == CODE_FOR_nothing ! || ! (*insn_data[icode].operand[0].predicate) (reg, mode) ! || ! 
(*insn_data[icode].operand[1].predicate) (reg, mode) ! || ! (*insn_data[icode].operand[2].predicate) (const1_rtx, mode)) ! continue; ! add = GEN_FCN (icode) (reg, reg, const1_rtx); ! if (GET_CODE (add) == SEQUENCE) ! continue; ! add = make_insn_raw (add); ! set = single_set (add); ! if (! set ! || GET_CODE (SET_SRC (set)) != PLUS ! || XEXP (SET_SRC (set), 1) != const1_rtx) ! continue; ! p_max = GET_MODE_BITSIZE (mode) - 1; ! if (p_max > HOST_BITS_PER_WIDE_INT - 2) ! p_max = HOST_BITS_PER_WIDE_INT - 2; ! for (p = 2; p < p_max; p++) ! { ! if (! validate_change (add, &XEXP (SET_SRC (set), 1), ! GEN_INT (((HOST_WIDE_INT) 1 << p) - 1), 0)) ! break; ! } ! add_limits[(int) mode][1] = tmp = INTVAL (XEXP (SET_SRC (set), 1)); ! /* We need a range of known good values for the constant of the add. ! Thus, before checking for the power of two, check for one less first, ! in case the power of two is an exceptional value. */ ! if (validate_change (add, &XEXP (SET_SRC (set), 1), GEN_INT (-tmp), 0)) ! { ! if (validate_change (add, &XEXP (SET_SRC (set), 1), ! GEN_INT (-tmp - 1), 0)) ! add_limits[(int) mode][0] = -tmp - 1; ! else ! add_limits[(int) mode][0] = -tmp; ! } ! } ! ! /* Insert notes before basic block ends, so that we can safely ! insert other insns. ! Don't do this when it would separate a BARRIER from the insn that ! it belongs to; we really need the notes only when the basic block ! end is due to a following label or to the end of the function. ! We must never dispose a JUMP_INSN as last insn of a basic block, ! since this confuses save_call_clobbered_regs. */ ! for (b = 0; b < n_basic_blocks; b++) ! { ! rtx end = BLOCK_END (b); ! if (GET_CODE (end) != JUMP_INSN) ! { ! rtx next = next_nonnote_insn (BLOCK_END (b)); ! if (! next || GET_CODE (next) != BARRIER) ! BLOCK_END (b) = emit_note_after (NOTE_INSN_DELETED, BLOCK_END (b)); ! } ! } ! ! gcc_obstack_init (&related_obstack); ! regno_related = rel_alloc (nregs * sizeof *regno_related); ! 
bzero ((char *) regno_related, nregs * sizeof *regno_related); ! rel_base_list = 0; ! loop_depth = 1; ! b = -1; ! ! for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) ! { ! luid++; ! ! /* Check to see if this is the first insn of the next block. There is ! no next block if we are already in the last block. */ ! if ((b+1) < n_basic_blocks && insn == BLOCK_HEAD (b+1)) ! b = b+1; ! ! if (GET_CODE (insn) == NOTE) ! { ! if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG) ! loop_depth++; ! else if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_END) ! loop_depth--; ! } ! ! /* Don't do anything before the start of the first basic block. */ ! if (b < 0) ! continue; ! ! /* Don't do anything if this instruction is in the shadow of a ! live flags register. */ ! if (GET_MODE (insn) == HImode) ! continue; ! ! if (GET_RTX_CLASS (GET_CODE (insn)) == 'i') ! { ! rtx note; ! find_related (&PATTERN (insn), insn, luid, call_tally); ! for (note = REG_NOTES (insn); note; note = XEXP (note, 1)) ! { ! if (REG_NOTE_KIND (note) == REG_DEAD ! || (REG_NOTE_KIND (note) == REG_UNUSED ! && GET_CODE (XEXP (note, 0)) == REG)) ! { ! rtx reg = XEXP (note, 0); ! int regno = REGNO (reg); ! if (REG_NOTE_KIND (note) == REG_DEAD ! && reg_set_p (reg, PATTERN (insn))) ! { ! remove_note (insn, note); ! REG_N_DEATHS (regno)--; ! } ! else if (regno_related[regno] ! && ! regno_related[regno]->invalidate_luid) ! { ! regno_related[regno]->death = insn; ! regno_related[regno]->reg_orig_calls_crossed ! = call_tally - regno_related[regno]->reg_set_call_tally; ! } ! } ! } ! /* Inputs to a call insn do not cross the call, therefore CALL_TALLY ! must be bumped *after* they have been processed. */ ! if (GET_CODE (insn) == CALL_INSN) ! call_tally++; ! } ! ! /* We end current processing at the end of a basic block, or when ! a flags register becomes live. ! ! Otherwise, we might end up with one or more extra instructions ! inserted in front of the user, to set up or adjust a register. ! 
There are cases where this could be handled smarter, but most of ! the time the user will be a branch anyways, so the extra effort ! to handle the occasional conditional instruction is probably not ! justified by the little possible extra gain. */ ! ! if (insn == BLOCK_END (b) ! || GET_MODE (insn) == QImode) ! { ! optimize_related_values_0 (rel_base_list, luid, call_tally, ! prev_nonnote_insn (insn), ! regmove_dump_file); ! rel_base_list = 0; ! } ! } ! optimize_related_values_0 (rel_base_list, luid, call_tally, ! get_last_insn (), regmove_dump_file); ! obstack_free (&related_obstack, 0); ! loop_depth = save_loop_depth; ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Finished optimize_related_values.\n"); ! } ! ! #endif /* AUTO_INC_DEC */ ! static int *regno_src_regno; /* Indicate how good a choice REG (which appears as a source) is to replace *************** copy_src_to_dest (insn, src, dest, loop_ *** 796,801 **** --- 2449,2455 ---- seq = gen_sequence (); end_sequence (); /* If this sequence uses new registers, we may not use it. */ + /* ????? This code no longer works since no_new_pseudos is set. */ if (old_num_regs != reg_rtx_no || ! validate_replace_rtx (src, dest, insn)) { *************** regmove_optimize (f, nregs, regmove_dump *** 1098,1103 **** --- 2752,2763 ---- /* Find out where a potential flags register is live, and so that we can supress some optimizations in those zones. */ mark_flags_life_zones (discover_flags_reg ()); + #ifdef AUTO_INC_DEC + /* See the comment in front of REL_USE_HASH_SIZE what + this is about. 
*/ + if (flag_regmove && flag_expensive_optimizations) + optimize_related_values (nregs, regmove_dump_file); + #endif regno_src_regno = (int *) xmalloc (sizeof *regno_src_regno * nregs); for (i = nregs; --i >= 0; ) regno_src_regno[i] = -1; *************** regmove_optimize (f, nregs, regmove_dump *** 1196,1206 **** src = recog_data.operand[op_no]; dst = recog_data.operand[match_no]; - if (GET_CODE (src) != REG) - continue; - src_subreg = src; if (GET_CODE (dst) == SUBREG && GET_MODE_SIZE (GET_MODE (dst)) >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst)))) { --- 2856,2864 ---- src = recog_data.operand[op_no]; dst = recog_data.operand[match_no]; src_subreg = src; if (GET_CODE (dst) == SUBREG + && GET_CODE (src) == REG && GET_MODE_SIZE (GET_MODE (dst)) >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst)))) { *************** regmove_optimize (f, nregs, regmove_dump *** 1209,1215 **** src, SUBREG_WORD (dst)); dst = SUBREG_REG (dst); } ! if (GET_CODE (dst) != REG || REGNO (dst) < FIRST_PSEUDO_REGISTER) continue; --- 2867,2879 ---- src, SUBREG_WORD (dst)); dst = SUBREG_REG (dst); } ! else if (GET_CODE (src) == SUBREG ! && (GET_MODE_SIZE (GET_MODE (src)) ! >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (src))))) ! src = SUBREG_REG (src); ! ! if (GET_CODE (src) != REG ! || GET_CODE (dst) != REG || REGNO (dst) < FIRST_PSEUDO_REGISTER) continue; *************** fixup_match_1 (insn, set, src, src_subre *** 1871,1876 **** --- 3535,3553 ---- validate_change (insn, recog_data.operand_loc[match_number], src, 1); if (validate_replace_rtx (dst, src_subreg, p)) success = 1; + else if (src_subreg != src + && *recog_data.operand_loc[match_number] == dst) + { + /* In this case, we originally have a subreg in the src + operand. Its mode should match the destination operand. + Moreover, P is likely to use DST in a subreg, so replacing + it with another subreg will fail - but putting the raw + register there can succeed. 
*/ + validate_change (insn, recog_data.operand_loc[match_number], + src_subreg, 1); + if (validate_replace_rtx (dst, src, p)) + success = 1; + } break; } ^ permalink raw reply [flat|nested] 94+ messages in thread
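For readers following the add_limits setup in optimize_related_values in the patch above: the loop probes constants of the form 2^p - 1 until the add pattern rejects one, then checks the negated bound and one below it, so that the recorded range is a contiguous interval around zero. What follows is a minimal standalone sketch of that probing idea, not the code from the patch. The names add_const_ok, add_range and probe_add_limits are hypothetical, and add_const_ok merely pretends that the target's add instruction takes a signed 8-bit immediate, standing in for the validate_change calls on a real insn.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for validate_change on an add insn: pretend the
   target accepts a signed 8-bit immediate.  Not part of the patch.  */
static int
add_const_ok (long c)
{
  return c >= -128 && c <= 127;
}

/* Hypothetical result type: the interval of constants known to be valid.  */
struct add_range { long min, max; };

/* Probe constants 2^p - 1 for growing p and remember the last one that
   OK accepted; then try the negated bound and one below it.  A range of
   { 0, 0 } means no usable single-instruction add was found.  */
static struct add_range
probe_add_limits (int mode_bits, int (*ok) (long))
{
  struct add_range r = { 0, 0 };
  long cur = 1;
  int p, p_max = mode_bits - 1;
  int host_max = (int) (sizeof (long) * CHAR_BIT) - 2;

  assert (mode_bits > 0);
  if (! ok (1))
    return r;
  /* Keep the shift below well inside the range of the host type.  */
  if (p_max > host_max)
    p_max = host_max;
  for (p = 2; p < p_max; p++)
    {
      long c = (1L << p) - 1;
      if (! ok (c))
        break;
      cur = c;
    }
  r.max = cur;
  /* Test -cur before -cur - 1, in case the power of two is an
     exceptional value.  */
  if (ok (-cur))
    r.min = ok (-cur - 1) ? -cur - 1 : -cur;
  return r;
}
```

With the hypothetical signed 8-bit predicate, probe_add_limits (32, add_const_ok) yields the range [-128, 127]. Probing only 2^p - 1 keeps the number of trial recognitions logarithmic in the immediate width, at the cost of missing limits that are not of that form.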
* Re: Autoincrement examples 1999-11-18 14:29 ` Joern Rennecke @ 1999-11-22 23:47 ` Jeffrey A Law 1999-11-30 23:37 ` Jeffrey A Law 1999-11-30 23:37 ` Joern Rennecke 1 sibling, 1 reply; 94+ messages in thread From: Jeffrey A Law @ 1999-11-22 23:47 UTC (permalink / raw) To: m.hayes, gcc, gcc-patches Michael -- can you please provide comments on Joern's patch? It's important that we get comments and evaluations on both implementations. jeff ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-22 23:47 ` Jeffrey A Law @ 1999-11-30 23:37 ` Jeffrey A Law 0 siblings, 0 replies; 94+ messages in thread From: Jeffrey A Law @ 1999-11-30 23:37 UTC (permalink / raw) To: m.hayes, gcc, gcc-patches Michael -- can you please provide comments on Joern's patch? It's important that we get comments and evaluations on both implementations. jeff ^ permalink raw reply [flat|nested] 94+ messages in thread
* Re: Autoincrement examples 1999-11-18 14:29 ` Joern Rennecke 1999-11-22 23:47 ` Jeffrey A Law @ 1999-11-30 23:37 ` Joern Rennecke 1 sibling, 0 replies; 94+ messages in thread From: Joern Rennecke @ 1999-11-30 23:37 UTC (permalink / raw) To: law; +Cc: m.hayes, gcc, gcc-patches > Joern -- can you get a patch for the regmove changes put together and submit > it to the list? These are the regmove patches: Wed Oct 20 20:45:45 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (invalidate_related): New Argument call_tally. Set rel->reg_orig_calls_crossed when setting rel->invalidate_luid. Changed all callers. (optimize_related_values_1): Don't set rel->reg_orig_calls_crossed if rel->invalidate_luid is set. (optimize_related_values): Bump CALL_TALLY *after* inputs have been processed. (find_related): When recursively processing a SET_DEST of a CALL_INSN, pass an incremented value for CALL_TALLY. Wed Oct 20 00:58:08 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (find_related): Ignore registers that change size. Fri Oct 1 15:04:25 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Fix check when to preserve update->insn. Tue Jun 29 07:46:53 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): When deciding whether to add a REG_DEAD or REG_UNUSED note, also check for a REG_INC notes we might have created. * integrate.c (copy_rtx_and_substitute): Handle NOTE_INSN_DELETED_LABEL notes. Don't handle 'n' rtx_format case. Mon Mar 8 16:00:35 1999 Jim Wilson <wilson@cygnus.com> * regmove.c (optimize_related_values): Add bounds check for b before BLOCK_HEAD check. Fri Feb 19 23:10:32 1999 Richard Henderson <rth@cygnus.com> * regmove.c (optimize_related_values): Use insn modes rather than sets_cc0_p. Watch basic block heads and ends rather than insn types. 
Thu Jan 28 01:08:31 1999 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (find_related): Check if a register belonging to a set of related values is clobbered in an insn where it is also used. (optimize_related_values_1): Handle REG_UNUSED notes. (optimize_related_values): Likewise. Mon Dec 14 17:08:17 1998 Jim Wilson <wilson@cygnus.com> * regmove.c (REL_USE_HASH): Use unsigned HOST_WIDE_INT instead of unsigned. Fri Nov 13 10:14:04 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Reject optimization if offset for rel_base_reg_user would be too large. Fri Nov 13 04:36:06 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (rel_record_mem): Don't do anything if the register already has an invalidate_luid. Thu Nov 12 23:02:32 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (invalidate_related): Don't do anything if the register already has an invalidate_luid. (optimize_related_values): Don't update death field if invalidate_luid field is set. Wed Oct 14 21:38:11 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values): Check if cc0 is set. * regmove.c (optimize_related_values): Fix problem with multiple related values in single insn. Wed Sep 23 20:42:54 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Set use->insn when emitting the linking insn before the final 'use' for a register that does not die within the scope of the optimization. Mon Sep 21 15:04:16 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (count_sets): New function. (gen_add3_insn): If single instruction add fails and source and destination register are different, try a move / add sequence. (rel_use_chain): New member match_offset. (optimize_related_values_1): Set it, and use it to avoid linking chains when this requires more than one instruction for the add. (add_limits): New file scope array. (optimize_related_values): Initialize it. 
Mon Sep 21 14:55:36 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * regmove.c (optimize_related_values_1): Don't use rel_base->reg for a chain that needs an out-of-range offset. Take setting of rel_base_reg_user into account when deciding if there are enough registers available. Tue Sep 15 16:41:00 1998 Michael Tiemann <michael@impact.tiemann.org> * regmove.c (find_related): We also have to track expressions that are just naked registers. Otherwise, we burn one register to prime the related values, and we'll also miss the second (but not subsequent) opportunities to use related values. Thu Sep 3 23:33:57 1998 J"orn Rennecke <amylaar@cygnus.co.uk> * rtl.h (push_obstacks_nochange, end_temporary_allocation): Declare. * regmove.c (obstack.h): Include. (REL_USE_HASH_SIZE, REL_USE_HASH, rel_alloc, rel_new): Define. (struct related, struct related_baseinfo, struct update): New structs. (struct rel_use_chain, struct rel_use): Likewise. (regno_related, rel_base_list, unrelatedly_used): New variables. (related_obstack): Likewise. (regclass_compatible_p, lookup_related): New functions. (rel_build_chain, rel_record_mem, invalidate_related): Likewise. (find_related, chain_starts_earlier, chain_ends_later): Likewise. (optimize_related_values_1, optimize_related_values_0): Likewise. (optimize_related_values): Likewise. (regmove_optimize): Use regclass_compatible_p. Call optimize_related_values. Index: regmove.c =================================================================== RCS file: /cvs/gcc/egcs/gcc/regmove.c,v retrieving revision 1.74 diff -p -r1.74 regmove.c *** regmove.c 1999/11/08 04:56:18 1.74 --- regmove.c 1999/11/18 22:21:46 *************** Boston, MA 02111-1307, USA. 
*/ *** 40,45 **** --- 40,46 ---- #include "insn-flags.h" #include "basic-block.h" #include "toplev.h" + #include "obstack.h" static int optimize_reg_copy_1 PROTO((rtx, rtx, rtx)); static void optimize_reg_copy_2 PROTO((rtx, rtx, rtx)); *************** static int fixup_match_1 PROTO((rtx, rtx *** 66,71 **** --- 67,87 ---- static int reg_is_remote_constant_p PROTO((rtx, rtx, rtx)); static int stable_and_no_regs_but_for_p PROTO((rtx, rtx, rtx)); static int regclass_compatible_p PROTO((int, int)); + #ifdef AUTO_INC_DEC + static struct rel_use *lookup_related PROTO((int, enum reg_class, HOST_WIDE_INT)); + static void rel_build_chain PROTO((struct rel_use *, struct rel_use *, int)); + static void rel_record_mem PROTO((rtx *, rtx, int, int, int, rtx, int, int)); + static void invalidate_related PROTO((rtx, int, int)); + static void find_related PROTO((rtx *, rtx, int, int)); + static int chain_starts_earlier PROTO((const PTR, const PTR)); + static int chain_ends_later PROTO((const PTR, const PTR)); + static struct related *optimize_related_values_1 PROTO((struct related *, int, + int, rtx, FILE *)); + static void optimize_related_values_0 PROTO((struct related *, int, int, + rtx, FILE *)); + static void optimize_related_values PROTO((int, FILE *)); + static void count_sets PROTO((rtx, rtx, void *)); + #endif /* AUTO_INC_DEC */ static int replacement_quality PROTO((rtx)); static int fixup_match_2 PROTO((rtx, rtx, rtx, rtx, FILE *)); static int loop_depth; *************** gen_add3_insn (r0, r1, c) *** 90,106 **** rtx r0, r1, c; { int icode = (int) add_optab->handlers[(int) GET_MODE (r0)].insn_code; ! if (icode == CODE_FOR_nothing || ! ((*insn_data[icode].operand[0].predicate) ! (r0, insn_data[icode].operand[0].mode)) || ! ((*insn_data[icode].operand[1].predicate) ! (r1, insn_data[icode].operand[1].mode)) || ! ((*insn_data[icode].operand[2].predicate) ! (c, insn_data[icode].operand[2].mode))) return NULL_RTX; ! 
return (GEN_FCN (icode) (r0, r1, c)); } --- 106,144 ---- rtx r0, r1, c; { int icode = (int) add_optab->handlers[(int) GET_MODE (r0)].insn_code; + int mcode; + rtx s, move; ! if (icode == CODE_FOR_nothing || ! ((*insn_data[icode].operand[0].predicate) ! (r0, insn_data[icode].operand[0].mode))) ! return NULL_RTX; ! ! if (((*insn_data[icode].operand[1].predicate) ! (r1, insn_data[icode].operand[1].mode)) ! && ((*insn_data[icode].operand[2].predicate) ! (c, insn_data[icode].operand[2].mode))) ! return (GEN_FCN (icode) (r0, r1, c)); ! ! mcode = (int) mov_optab->handlers[(int) GET_MODE (r0)].insn_code; ! if (REGNO (r0) == REGNO (r1) || ! ((*insn_data[icode].operand[1].predicate) ! (r0, insn_data[icode].operand[1].mode)) || ! ((*insn_data[icode].operand[2].predicate) ! (r1, insn_data[icode].operand[2].mode)) ! || ! ((*insn_data[mcode].operand[0].predicate) ! (r0, insn_data[mcode].operand[0].mode)) ! || ! ((*insn_data[mcode].operand[1].predicate) ! (c, insn_data[mcode].operand[1].mode))) return NULL_RTX; ! start_sequence (); ! move = emit_insn (GEN_FCN (mcode) (r0, c)); ! REG_NOTES (move) = gen_rtx_EXPR_LIST (REG_EQUAL, c, NULL_RTX); ! emit_insn (GEN_FCN (icode) (r0, r0, r1)); ! s = gen_sequence (); ! end_sequence (); ! return s; } *************** flags_set_1 (x, pat, data) *** 330,336 **** && reg_overlap_mentioned_p (x, flags_set_1_rtx)) flags_set_1_set = 1; } ! \f static int *regno_src_regno; /* Indicate how good a choice REG (which appears as a source) is to replace --- 368,1989 ---- && reg_overlap_mentioned_p (x, flags_set_1_rtx)) flags_set_1_set = 1; } ! ! #ifdef AUTO_INC_DEC ! ! /* Some machines have two-address-adds and instructions that can ! use only register-indirect addressing and auto_increment, but no ! offsets. If multiple fields of a struct are accessed more than ! once, cse will load each of the member addresses in separate registers. ! This not only costs a lot of registers, but also of instructions, ! 
since each add to initialize an address register must be really expanded ! into a register-register move followed by an add. ! regmove_optimize uses some heuristics to detect this case; if these ! indicate that this is likely, optimize_related_values is run once for ! the entire function. ! ! We build chains of uses of related values that can be satisfied with the ! same base register by taking advantage of auto-increment address modes ! instead of explicit add instructions. ! ! We try to link chains with disjoint lifetimes together to reduce the ! number of temporary registers and register-register copies. ! ! This optimization pass operates on basic blocks one at a time; it could be ! extended to work on extended basic blocks or entire functions. */ ! ! /* For each set of values related to a common base register, we use a ! hash table which maps constant offsets to instructions. ! ! The instructions mapped to are those that use a register which may, ! (possibly with a change in addressing mode) differ from the initial ! value of the base register by exactly that offset after the ! execution of the instruction. ! Here we define the size of the hash table, and the hash function to use. */ ! #define REL_USE_HASH_SIZE 43 ! #define REL_USE_HASH(I) ((I) % (unsigned HOST_WIDE_INT) REL_USE_HASH_SIZE) ! ! /* For each register in a set of registers that are related, we keep a ! struct related. ! ! u.base contains the register number of the base register (i.e. the one ! that was the source of the first three-address add for this set of ! related values). ! ! INSN is the instruction that initialized the register, or, for the ! base, the instruction that initialized the first non-base register. ! ! BASE is the register number of the base register. ! ! For the base register only, the member BASEINFO points to some extra data. ! ! 'luid' here means linear uid. We count them starting at the function ! start; they are used to avoid overlapping lifetimes. ! ! 
UPDATES is a list of instructions that set the register to a new ! value that is still related to the same base. ! ! When a register in a set of related values is set to something that ! is not related to the base, INVALIDATE_LUID is set to the luid of ! the instruction that does this set. This is used to avoid re-using ! this register in an overlapping lifetime for a related value. ! ! DEATH is first used to store the insn (if any) where the register dies. ! When the optimization is actually performed, the REG_DEAD note from ! the insn denoted by DEATH is removed. ! Thereafter, the removed death note is stored in DEATH, marking not ! only that the register dies, but also making the note available for reuse. ! ! We also use a struct related to keep track of registers that have been ! used for anything that we don't recognize as related values. ! The only really interesting datum for these is u.last_luid, which is ! the luid of the last reference we have seen. These struct relateds ! are marked by a zero INSN field; most other members are not used and ! remain uninitialized. */ ! ! struct related { ! rtx insn, reg; ! union { int base; int last_luid; } u; ! HOST_WIDE_INT offset; ! struct related *prev; ! struct update *updates; ! struct related_baseinfo *baseinfo; ! int invalidate_luid; ! rtx death; ! int reg_orig_calls_crossed, reg_set_call_tally, reg_orig_refs; ! }; ! ! /* HASHTAB maps offsets to register uses with a matching MATCH_OFFSET. ! PREV_BASE points to the struct related for the previous base register ! that we currently keep track of. ! INSN_LUID is the luid of the instruction that started this set of ! related values. */ ! struct related_baseinfo { ! struct rel_use *hashtab[REL_USE_HASH_SIZE]; ! struct rel_use_chain *chains; ! struct related *prev_base; ! int insn_luid; ! }; ! ! /* INSN is an instruction that sets a register that previously contained ! a related value to a new value that is related to the same base register. ! 
When the optimization is performed, we have to delete INSN. ! DEATH_INSN points to the insn (if any) where the register died that we ! set in INSN. When we perform the optimization, the REG_DEAD note has ! to be removed from DEATH_INSN. ! PREV points to the struct update that pertains to the previous ! instruction pertaining to the same register that set it from one ! related value to another one. */ ! struct update ! { ! rtx insn, death_insn; ! struct update *prev; ! }; ! ! struct rel_use_chain ! { ! struct rel_use *chain; /* Points to first use in this chain. */ ! struct rel_use_chain *prev, *linked; ! /* Only set after the chain has been completed: */ ! struct rel_use *end; /* Last use in this chain. */ ! int start_luid, end_luid, calls_crossed; ! rtx reg; /* The register allocated for this chain. */ ! HOST_WIDE_INT match_offset; /* Offset after execution of last insn. */ ! }; ! ! /* ADDRP points to the place where the actual use of the related value is. ! This is commonly a memory address, and has to be set to a register ! or some auto_inc addressing of this register. ! But ADDRP is also used for all other uses of related values, pointing to ! the place where the register is inserted; we can tell that an ! unadorned register is to be inserted because no offset adjustment ! is required, hence this is handled by the same logic as register-indirect ! addressing. The only exception to this is when SET_IN_PARALLEL is set, ! see below. ! OFFSET is the offset that is actually used in this instance, i.e. ! the value of the base register when the set of related values was ! created plus OFFSET yields the value that is used. ! This might be different from the value of the used register before ! executing INSN if we elected to use pre-{in,de}crement addressing. ! If we have the option to use post-{in,de}crement addressing, all ! choices are linked cyclically together with the SIBLING field. ! Otherwise, it's a one-link-cycle, i.e. SIBLING points at the ! 
struct rel_use it is a member of. ! MATCH_OFFSET is the offset that is available after the execution ! of INSN. It is the same as OFFSET for straight register-indirect ! addressing and for pre-{in,de}crement addressing, while it differs ! for the post-{in,de}crement addressing modes. ! If SET_IN_PARALLEL is set, MATCH_OFFSET differs from OFFSET, yet ! this is no post-{in,de}crement addressing. Rather, it is a set ! inside a PARALLEL that adds some constant to a register that holds ! one value of a set of related values that we keep track of. ! ADDRP then points only to the set destination of this set; another ! struct rel_use is used for the source of the set. ! NO_LINK_PRED is nonzero for the last use in a chain if it cannot be ! the predecessor for another chain to be linked to. This can happen ! for uses that come with a clobber, and for uses by a register that ! is live at the end of the processed range of insns ! (usually a basic block). */ ! struct rel_use ! { ! rtx insn, *addrp; ! int luid, call_tally; ! enum reg_class class; ! unsigned set_in_parallel : 1; ! unsigned no_link_pred: 1; ! HOST_WIDE_INT offset, match_offset; ! struct rel_use *next_chain, **prev_chain_ref, *next_hash, *sibling; ! }; ! ! struct related **regno_related, *rel_base_list, *unrelatedly_used; ! ! #define rel_alloc(N) obstack_alloc(&related_obstack, (N)) ! #define rel_new(X) ((X) = rel_alloc (sizeof *(X))) ! ! static struct obstack related_obstack; ! ! /* For each integer machine mode, the minimum and maximum constant that ! can be added with a single constant. ! This is supposed to define an interval around zero; if there are ! singular points disconnected from this interval, we want to leave ! them out. */ ! ! static HOST_WIDE_INT add_limits[NUM_MACHINE_MODES][2]; ! ! /* Try to find a related value with offset OFFSET from the base ! register belonging to REGNO, using a register with preferred class ! that is compatible with CLASS. */ ! static struct rel_use * ! 
lookup_related (regno, class, offset) ! int regno; ! enum reg_class class; ! HOST_WIDE_INT offset; ! { ! int base = regno_related[regno]->u.base; ! int hash = REL_USE_HASH (offset); ! struct rel_use *match = regno_related[base]->baseinfo->hashtab[hash]; ! for (; match; match = match->next_hash) ! { ! if (offset != match->match_offset) ! continue; ! if (match->next_chain) ! continue; ! if (regclass_compatible_p (class, match->class)) ! break; ! } ! return match; ! } ! ! /* Add NEW_USE at the end of the chain that currently ends with MATCH; ! If MATCH is not set, create a new chain. ! BASE is the base register number the chain belongs to. */ ! static void ! rel_build_chain (new_use, match, base) ! struct rel_use *new_use, *match; ! int base; ! { ! int hash; ! ! if (match) ! { ! struct rel_use *sibling = match; ! do ! { ! sibling->next_chain = new_use; ! if (sibling->prev_chain_ref) ! *sibling->prev_chain_ref = match; ! sibling = sibling->sibling; ! } ! while (sibling != match); ! new_use->prev_chain_ref = &match->next_chain; ! new_use->next_chain = 0; ! } ! else ! { ! struct rel_use_chain *new_chain; ! ! rel_new (new_chain); ! new_chain->chain = new_use; ! new_use->prev_chain_ref = &new_chain->chain; ! new_use->next_chain = 0; ! new_use->next_chain = NULL_PTR; ! new_chain->linked = 0; ! new_chain->prev = regno_related[base]->baseinfo->chains; ! regno_related[base]->baseinfo->chains = new_chain; ! } ! hash = REL_USE_HASH (new_use->offset); ! new_use->next_hash = regno_related[base]->baseinfo->hashtab[hash]; ! regno_related[base]->baseinfo->hashtab[hash] = new_use; ! } ! ! /* Record the use of register ADDR in a memory reference. ! ADDRP is the memory location where the address is stored. ! SIZE is the size of the memory reference. ! PRE_OFFS is the offset that has to be added to the value in ADDR ! due to PRE_{IN,DE}CREMENT addressing in the original address; likewise, ! POST_OFFSET denotes POST_{IN,DE}CREMENT addressing. INSN is the ! 
instruction that uses this address, LUID its luid, and CALL_TALLY ! the current number of calls encountered since the start of the ! function. */ ! static void ! rel_record_mem (addrp, addr, size, pre_offs, post_offs, insn, luid, call_tally) ! rtx *addrp, addr, insn; ! int size, pre_offs, post_offs; ! int luid, call_tally; ! { ! static rtx auto_inc; ! rtx orig_addr = *addrp; ! int regno, base; ! HOST_WIDE_INT offset; ! struct rel_use *new_use, *match; ! enum reg_class class; ! int hash; ! ! if (GET_CODE (addr) != REG) ! abort (); ! ! regno = REGNO (addr); ! if (! regno_related[regno] || ! regno_related[regno]->insn ! || regno_related[regno]->invalidate_luid) ! return; ! ! regno_related[regno]->reg_orig_refs += loop_depth; ! ! offset = regno_related[regno]->offset += pre_offs; ! base = regno_related[regno]->u.base; ! ! if (! auto_inc) ! { ! push_obstacks_nochange (); ! end_temporary_allocation (); ! auto_inc = gen_rtx_PRE_INC (Pmode, addr); ! pop_obstacks (); ! } ! ! XEXP (auto_inc, 0) = addr; ! *addrp = auto_inc; ! ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = addrp; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class = reg_preferred_class (regno); ! new_use->set_in_parallel = 0; ! new_use->offset = offset; ! new_use->match_offset = offset; ! new_use->sibling = new_use; ! ! do ! { ! match = lookup_related (regno, class, offset); ! if (! match) ! { ! /* We can choose PRE_{IN,DE}CREMENT on the spot with the information ! we have gathered about the preceding instructions, while we have ! to record POST_{IN,DE}CREMENT possibilities so that we can check ! later if we have a use for their output value. */ ! /* We use recog here directly because we are only testing here if ! the changes could be made, but don't really want to make a ! change right now. The caching from recog_memoized would only ! get in the way. */ ! match = lookup_related (regno, class, offset - size); ! if (HAVE_PRE_INCREMENT && match) ! { ! 
PUT_CODE (auto_inc, PRE_INC); ! if (recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! break; ! } ! match = lookup_related (regno, class, offset + size); ! if (HAVE_PRE_DECREMENT && match) ! { ! PUT_CODE (auto_inc, PRE_DEC); ! if (recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! break; ! } ! match = 0; ! } ! PUT_CODE (auto_inc, POST_INC); ! if (HAVE_POST_INCREMENT && recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! { ! struct rel_use *inc_use; ! ! rel_new (inc_use); ! *inc_use = *new_use; ! inc_use->sibling = new_use; ! new_use->sibling = inc_use; ! inc_use->prev_chain_ref = NULL_PTR; ! inc_use->next_chain = NULL_PTR; ! hash = REL_USE_HASH (inc_use->match_offset = offset + size); ! inc_use->next_hash = regno_related[base]->baseinfo->hashtab[hash]; ! regno_related[base]->baseinfo->hashtab[hash] = inc_use; ! } ! PUT_CODE (auto_inc, POST_DEC); ! if (HAVE_POST_DECREMENT && recog (PATTERN (insn), insn, NULL_PTR) >= 0) ! { ! struct rel_use *dec_use; ! ! rel_new (dec_use); ! *dec_use = *new_use; ! dec_use->sibling = new_use->sibling; ! new_use->sibling = dec_use; ! dec_use->prev_chain_ref = NULL_PTR; ! dec_use->next_chain = NULL_PTR; ! hash = REL_USE_HASH (dec_use->match_offset = offset - size); ! dec_use->next_hash = regno_related[base]->baseinfo->hashtab[hash]; ! regno_related[base]->baseinfo->hashtab[hash] = dec_use; ! } ! } ! while (0); ! rel_build_chain (new_use, match, base); ! *addrp = orig_addr; ! ! regno_related[regno]->offset += post_offs; ! } ! ! /* Note that REG is set to something that we do not recognize as a ! related value, at an insn with linear uid LUID. */ ! static void ! invalidate_related (reg, luid, call_tally) ! rtx reg; ! int luid, call_tally; ! { ! int regno = REGNO (reg); ! struct related *rel = regno_related[regno]; ! if (! rel) ! { ! rel_new (rel); ! regno_related[regno] = rel; ! rel->prev = unrelatedly_used; ! unrelatedly_used = rel; ! rel->reg = reg; ! rel->insn = NULL_RTX; ! rel->invalidate_luid = 0; ! rel->u.last_luid = luid; ! } !
else if (rel->invalidate_luid) ! ; /* do nothing */ ! else if (! rel->insn) ! rel->u.last_luid = luid; ! else ! { ! rel->invalidate_luid = luid; ! rel->reg_orig_calls_crossed = call_tally - rel->reg_set_call_tally; ! } ! } ! ! /* Check the RTL fragment pointed to by XP for related values - that is, ! if any new are created, or if they are assigned new values. Also ! note any other sets so that we can track lifetime conflicts. ! INSN is the instruction XP points into, LUID its luid, and CALL_TALLY ! the number of preceding calls in the function. */ ! static void ! find_related (xp, insn, luid, call_tally) ! rtx *xp, insn; ! int luid, call_tally; ! { ! rtx x = *xp; ! enum rtx_code code = GET_CODE (x); ! const char *fmt; ! int i; ! ! switch (code) ! { ! case SET: ! { ! rtx dst = SET_DEST (x); ! rtx src = SET_SRC (x); ! ! /* First, check out if this sets a new related value. ! We don't care about register class differences here, since ! we might still find multiple related values share the same ! class even if it is disjunct from the class of the original ! register. ! We use a do .. while (0); here because there are many possible ! conditions that make us want to handle this like an ordinary set. */ ! do ! { ! rtx src_reg, src_const; ! int src_regno, dst_regno; ! struct related *new_related; ! ! /* First check that we have actually something like ! (set (reg pseudo_dst) (plus (reg pseudo_src) (const_int))) . */ ! if (GET_CODE (src) == PLUS) ! { ! src_reg = XEXP (src, 0); ! src_const = XEXP (src, 1); ! } ! else if (GET_CODE (src) == REG ! && GET_MODE_CLASS (GET_MODE (src)) == MODE_INT) ! { ! src_reg = src; ! src_const = const0_rtx; ! } ! else ! break; ! ! if (GET_CODE (src_reg) != REG ! || GET_CODE (src_const) != CONST_INT ! || GET_CODE (dst) != REG) ! break; ! dst_regno = REGNO (dst); ! src_regno = REGNO (src_reg); ! ! /* If only some words of multi-word pseudo are stored into, the ! old value does not die at the store, yet we can't replace the ! register. 
We cannot handle this case, so reject any pseudos ! that have such stores. ! We approximate this with REG_CHANGES_SIZE, which is true also ! in a few cases that we could handle (i.e. same number of words, ! or size change only when reading). */ ! if (src_regno < FIRST_PSEUDO_REGISTER ! || REG_CHANGES_SIZE (src_regno) ! || dst_regno < FIRST_PSEUDO_REGISTER ! || REG_CHANGES_SIZE (dst_regno)) ! break; ! ! /* We only know how to remove the set if that is ! all that the insn does. */ ! if (x != single_set (insn)) ! break; ! ! /* We cannot handle multiple lifetimes. */ ! if ((regno_related[src_regno] ! && regno_related[src_regno]->invalidate_luid) ! || (regno_related[dst_regno] ! && regno_related[dst_regno]->invalidate_luid)) ! break; ! ! /* Check if this is merely an update of a register with a ! value belonging to a group of related values we already ! track. */ ! if (regno_related[dst_regno] && regno_related[dst_regno]->insn) ! { ! struct update *new_update; ! ! /* If the base register changes, don't handle this as a ! related value. We can currently only attribute the ! register to one base, and keep a record of one lifetime ! during which we might re-use the register. */ ! if (! regno_related[src_regno] ! || ! regno_related[src_regno]->insn ! || (regno_related[dst_regno]->u.base ! != regno_related[src_regno]->u.base)) ! break; ! regno_related[src_regno]->reg_orig_refs += loop_depth; ! regno_related[dst_regno]->reg_orig_refs += loop_depth; ! regno_related[dst_regno]->offset ! = regno_related[src_regno]->offset + INTVAL (src_const); ! rel_new (new_update); ! new_update->insn = insn; ! new_update->death_insn = regno_related[dst_regno]->death; ! regno_related[dst_regno]->death = NULL_RTX; ! new_update->prev = regno_related[dst_regno]->updates; ! regno_related[dst_regno]->updates = new_update; ! return; ! } ! if (! regno_related[src_regno] ! || ! regno_related[src_regno]->insn) ! { ! if (src_regno == dst_regno) ! break; ! rel_new (new_related); !
new_related->reg = src_reg; ! new_related->insn = insn; ! new_related->updates = 0; ! new_related->reg_set_call_tally = call_tally; ! new_related->reg_orig_refs = loop_depth; ! new_related->u.base = src_regno; ! new_related->offset = 0; ! new_related->prev = 0; ! new_related->invalidate_luid = 0; ! new_related->death = NULL_RTX; ! rel_new (new_related->baseinfo); ! bzero ((char *) new_related->baseinfo, ! sizeof *new_related->baseinfo); ! new_related->baseinfo->prev_base = rel_base_list; ! rel_base_list = new_related; ! new_related->baseinfo->insn_luid = luid; ! regno_related[src_regno] = new_related; ! } ! /* If the destination register has been used since we started ! tracking this group of related values, there would be tricky ! lifetime problems that we don't want to tackle right now. */ ! else if (regno_related[dst_regno] ! && (regno_related[dst_regno]->u.last_luid ! >= regno_related[regno_related[src_regno]->u.base]->baseinfo->insn_luid)) ! break; ! rel_new (new_related); ! new_related->reg = dst; ! new_related->insn = insn; ! new_related->updates = 0; ! new_related->reg_set_call_tally = call_tally; ! new_related->reg_orig_refs = loop_depth; ! new_related->u.base = regno_related[src_regno]->u.base; ! new_related->offset = ! regno_related[src_regno]->offset + INTVAL (src_const); ! new_related->invalidate_luid = 0; ! new_related->death = NULL_RTX; ! new_related->prev = regno_related[src_regno]->prev; ! regno_related[src_regno]->prev = new_related; ! regno_related[dst_regno] = new_related; ! return; ! } ! while (0); ! ! /* The SET has not been recognized as setting up a related value. ! If the destination is ultimately a register, we have to ! invalidate what we have memorized about any related value ! previously stored into it. */ ! while (GET_CODE (dst) == SUBREG ! || GET_CODE (dst) == ZERO_EXTRACT ! || GET_CODE (dst) == SIGN_EXTRACT ! || GET_CODE (dst) == STRICT_LOW_PART) ! dst = XEXP (dst, 0); ! if (GET_CODE (dst) == REG) ! { ! 
find_related (&SET_SRC (x), insn, luid, call_tally); ! invalidate_related (dst, luid, call_tally); ! return; ! } ! find_related (&SET_SRC (x), insn, luid, call_tally); ! find_related (&SET_DEST (x), insn, luid, ! call_tally + (GET_CODE (insn) == CALL_INSN)); ! return; ! } ! case CLOBBER: ! { ! rtx dst = XEXP (x, 0); ! while (GET_CODE (dst) == SUBREG ! || GET_CODE (dst) == ZERO_EXTRACT ! || GET_CODE (dst) == SIGN_EXTRACT ! || GET_CODE (dst) == STRICT_LOW_PART) ! dst = XEXP (dst, 0); ! if (GET_CODE (dst) == REG) ! { ! int regno = REGNO (dst); ! struct related *rel = regno_related[regno]; ! ! /* If this clobbers a register that belongs to a set of related ! values, we have to check if the same register appears somewhere ! else in the insn : this is then likely to be a match_dup. */ ! ! if (rel ! && rel->insn ! && ! rel->invalidate_luid ! && xp != &PATTERN (insn) ! && count_occurrences (PATTERN (insn), dst) > 1) ! { ! enum reg_class class = reg_preferred_class (regno); ! struct rel_use *new_use, *match; ! HOST_WIDE_INT offset = rel->offset; ! ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = &XEXP (x, 0); ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class; ! new_use->set_in_parallel = 1; ! new_use->sibling = new_use; ! do ! { ! new_use->match_offset = new_use->offset = offset; ! match = lookup_related (regno, class, offset); ! offset++; ! } ! while (! match || match->luid != luid); ! rel_build_chain (new_use, match, rel->u.base); ! /* Prevent other registers from using the same chain. */ ! new_use->next_chain = new_use; ! } ! invalidate_related (dst, luid, call_tally); ! return; ! } ! break; ! } ! case REG: ! { ! int regno = REGNO (x); ! if (! regno_related[regno]) ! { ! rel_new (regno_related[regno]); ! regno_related[regno]->prev = unrelatedly_used; ! unrelatedly_used = regno_related[regno]; ! regno_related[regno]->reg = x; ! regno_related[regno]->insn = NULL_RTX; ! regno_related[regno]->u.last_luid = luid; ! } ! 
else if (! regno_related[regno]->insn) ! regno_related[regno]->u.last_luid = luid; ! else if (! regno_related[regno]->invalidate_luid) ! { ! struct rel_use *new_use, *match; ! HOST_WIDE_INT offset; ! int base; ! enum reg_class class; ! ! regno_related[regno]->reg_orig_refs += loop_depth; ! ! offset = regno_related[regno]->offset; ! base = regno_related[regno]->u.base; ! ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = xp; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class = reg_preferred_class (regno); ! new_use->set_in_parallel = 0; ! new_use->offset = offset; ! new_use->match_offset = offset; ! new_use->sibling = new_use; ! ! match = lookup_related (regno, class, offset); ! rel_build_chain (new_use, match, base); ! } ! return; ! } ! case MEM: ! { ! int size = GET_MODE_SIZE (GET_MODE (x)); ! rtx *addrp= &XEXP (x, 0), addr = *addrp; ! ! switch (GET_CODE (addr)) ! { ! case REG: ! rel_record_mem (addrp, addr, size, 0, 0, ! insn, luid, call_tally); ! return; ! case PRE_INC: ! rel_record_mem (addrp, XEXP (addr, 0), size, size, 0, ! insn, luid, call_tally); ! return; ! case POST_INC: ! rel_record_mem (addrp, XEXP (addr, 0), size, 0, size, ! insn, luid, call_tally); ! return; ! case PRE_DEC: ! rel_record_mem (addrp, XEXP (addr, 0), size, -size, 0, ! insn, luid, call_tally); ! return; ! case POST_DEC: ! rel_record_mem (addrp, XEXP (addr, 0), size, 0, -size, ! insn, luid, call_tally); ! return; ! default: ! break; ! } ! break; ! } ! case PARALLEL: ! { ! for (i = XVECLEN (x, 0) - 1; i >= 0; i--) ! { ! rtx *yp = &XVECEXP (x, 0, i); ! rtx y = *yp; ! if (GET_CODE (y) == SET) ! { ! rtx dst; ! ! find_related (&SET_SRC (y), insn, luid, call_tally); ! dst = SET_DEST (y); ! while (GET_CODE (dst) == SUBREG ! || GET_CODE (dst) == ZERO_EXTRACT ! || GET_CODE (dst) == SIGN_EXTRACT ! || GET_CODE (dst) == STRICT_LOW_PART) ! dst = XEXP (dst, 0); ! if (GET_CODE (dst) != REG) ! find_related (&SET_DEST (y), insn, luid, ! 
call_tally + (GET_CODE (insn) == CALL_INSN)); ! } ! else if (GET_CODE (y) != CLOBBER) ! find_related (yp, insn, luid, call_tally); ! } ! for (i = XVECLEN (x, 0) - 1; i >= 0; i--) ! { ! rtx *yp = &XVECEXP (x, 0, i); ! rtx y = *yp; ! if (GET_CODE (y) == SET) ! { ! rtx *dstp; ! ! dstp = &SET_DEST (y); ! while (GET_CODE (*dstp) == SUBREG ! || GET_CODE (*dstp) == ZERO_EXTRACT ! || GET_CODE (*dstp) == SIGN_EXTRACT ! || GET_CODE (*dstp) == STRICT_LOW_PART) ! dstp = &XEXP (*dstp, 0); ! if (GET_CODE (*dstp) == REG) ! { ! int regno = REGNO (*dstp); ! rtx src = SET_SRC (y); ! if (regno_related[regno] && regno_related[regno]->insn ! && GET_CODE (src) == PLUS ! && XEXP (src, 0) == *dstp ! && GET_CODE (XEXP (src, 1)) == CONST_INT) ! { ! struct rel_use *new_use, *match; ! enum reg_class class; ! ! regno_related[regno]->reg_orig_refs += loop_depth; ! rel_new (new_use); ! new_use->insn = insn; ! new_use->addrp = dstp; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class = reg_preferred_class (regno); ! new_use->set_in_parallel = 1; ! new_use->offset = regno_related[regno]->offset; ! new_use->match_offset ! = regno_related[regno]->offset ! += INTVAL (XEXP (src, 1)); ! new_use->sibling = new_use; ! match = lookup_related (regno, class, new_use->offset); ! rel_build_chain (new_use, match, ! regno_related[regno]->u.base); ! } ! else ! /* We assume here that a CALL_INSN won't set a pseudo ! at the same time as a MEM that contains the pseudo ! - if that were the case, we'd have to use an ! incremented CALL_TALLY value. */ ! invalidate_related (*dstp, luid, call_tally); ! } ! } ! else if (GET_CODE (y) == CLOBBER) ! find_related (yp, insn, luid, call_tally); ! } ! return; ! } ! default: ! break; ! } ! fmt = GET_RTX_FORMAT (code); ! ! for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) ! { ! if (fmt[i] == 'e') ! find_related (&XEXP (x, i), insn, luid, call_tally); ! if (fmt[i] == 'E') ! { ! register int j; ! for (j = 0; j < XVECLEN (x, i); j++) ! 
find_related (&XVECEXP (x, i, j), insn, luid, call_tally); ! } ! } ! } ! ! /* Comparison functions for qsort. */ ! static int ! chain_starts_earlier (chain1, chain2) ! const PTR chain1; ! const PTR chain2; ! { ! int d = ((*(struct rel_use_chain **)chain2)->start_luid ! - (*(struct rel_use_chain **)chain1)->start_luid); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->offset ! - (*(struct rel_use_chain **)chain1)->chain->offset); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->set_in_parallel ! - (*(struct rel_use_chain **)chain1)->chain->set_in_parallel); ! /* If set_in_parallel is not set on both chains' first uses, they must ! differ in start_luid or offset, since otherwise they would use the ! same chain. ! Thus the remaining problem is with set_in_parallel uses; for these, we ! know that *addrp is a register. Since the same register may not be set ! multiple times in the same insn, the registers must be different. */ ! ! if (! d) ! d = (REGNO (*(*(struct rel_use_chain **)chain2)->chain->addrp) ! - REGNO (*(*(struct rel_use_chain **)chain1)->chain->addrp)); ! return d; ! } ! ! static int ! chain_ends_later (chain1, chain2) ! const PTR chain1; ! const PTR chain2; ! { ! int d = ((*(struct rel_use_chain **)chain1)->end->no_link_pred ! - (*(struct rel_use_chain **)chain2)->end->no_link_pred); ! if (! d) ! d = ((*(struct rel_use_chain **)chain1)->end_luid ! - (*(struct rel_use_chain **)chain2)->end_luid); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->offset ! - (*(struct rel_use_chain **)chain1)->chain->offset); ! if (! d) ! d = ((*(struct rel_use_chain **)chain2)->chain->set_in_parallel ! - (*(struct rel_use_chain **)chain1)->chain->set_in_parallel); ! /* If set_in_parallel is not set on both chains' first uses, they must ! differ in start_luid or offset, since otherwise they would use the ! same chain. ! Thus the remaining problem is with set_in_parallel uses; for these, we ! know that *addrp is a register. !
Since the same register may not be set ! multiple times in the same insn, the registers must be different. */ ! ! if (! d) ! d = (REGNO (*(*(struct rel_use_chain **)chain2)->chain->addrp) ! - REGNO (*(*(struct rel_use_chain **)chain1)->chain->addrp)); ! return d; ! } ! ! static void ! count_sets (x, pat, trash) ! rtx x, pat; ! void *trash ATTRIBUTE_UNUSED; ! { ! if (GET_CODE (x) == REG) ! REG_N_SETS (REGNO (x))++; ! } ! ! /* Perform the optimization for a single set of related values. ! INSERT_AFTER is an instruction after which we may emit instructions ! to initialize registers that remain live beyond the end of the group ! of instructions which have been examined. */ ! static struct related * ! optimize_related_values_1 (rel_base, luid, call_tally, insert_after, ! regmove_dump_file) ! struct related *rel_base; ! int luid, call_tally; ! rtx insert_after; ! FILE *regmove_dump_file; ! { ! struct related_baseinfo *baseinfo = rel_base->baseinfo; ! struct related *rel; ! struct rel_use_chain *chain, *chain0, **chain_starttab, **chain_endtab; ! struct rel_use_chain **pred_chainp, *pred_chain, *last_init_chain; ! int num_regs, num_av_regs, num_chains, num_linked, max_end_luid, i; ! int max_start_luid; ! struct rel_use_chain *rel_base_reg_user; ! int mode; ! HOST_WIDE_INT rel_base_reg_user_offset = 0; ! ! /* For any registers that are still live, we have to arrange ! to have them set to their proper values. ! Also count with how many registers (not counting base) we are ! dealing with here. */ ! for (num_regs = -1, rel = rel_base; rel; rel = rel->prev, num_regs++) ! { ! int regno = REGNO (rel->reg); ! ! if (! rel->death ! && ! rel->invalidate_luid) ! { ! enum reg_class class = reg_preferred_class (regno); ! struct rel_use *new_use, *match; ! ! rel_new (new_use); ! new_use->insn = NULL_RTX; ! new_use->addrp = &rel->reg; ! new_use->luid = luid; ! new_use->call_tally = call_tally; ! new_use->class = class; ! new_use->set_in_parallel = 1; ! 
new_use->match_offset = new_use->offset = rel->offset; ! new_use->sibling = new_use; ! match = lookup_related (regno, class, rel->offset); ! rel_build_chain (new_use, match, REGNO (rel_base->reg)); ! /* Prevent other registers from using the same chain. */ ! new_use->next_chain = new_use; ! ! rel->reg_orig_calls_crossed = call_tally - rel->reg_set_call_tally; ! } ! } ! ! /* Now for every chain of values related to the base, set start ! and end luid, match_offset, and reg. Also count the number of these ! chains, and determine the largest end luid. */ ! num_chains = 0; ! for (max_end_luid = 0, chain = baseinfo->chains; chain; chain = chain->prev) ! { ! struct rel_use *use, *next; ! ! num_chains++; ! next = chain->chain; ! chain->start_luid = next->luid; ! do ! { ! use = next; ! next = use->next_chain; ! } ! while (next && next != use); ! use->no_link_pred = next != NULL_PTR; ! use->next_chain = 0; ! chain->end = use; ! chain->end_luid = use->luid; ! chain->match_offset = use->match_offset; ! chain->calls_crossed = use->call_tally - chain->chain->call_tally; ! ! chain->reg = use->insn ? NULL_RTX : *use->addrp; ! ! if (use->luid > max_end_luid) ! max_end_luid = use->luid; ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Chain start: %d end: %d\n", ! chain->start_luid, chain->end_luid); ! } ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, ! "Insn %d reg %d: found %d chains.\n", ! INSN_UID (rel_base->insn), REGNO (rel_base->reg), num_chains); ! ! if (! num_chains) ! return baseinfo->prev_base; ! ! /* For every chain, we try to find another chain the lifetime of which ! ends before the lifetime of said chain starts. ! So we first sort according to luid of first and last instruction that ! is in the chain, respectively; this is O(n * log n) on average. */ ! chain_starttab = rel_alloc (num_chains * sizeof *chain_starttab); ! chain_endtab = rel_alloc (num_chains * sizeof *chain_starttab); ! 
for (chain = baseinfo->chains, i = 0; chain; chain = chain->prev, i++) ! { ! chain_starttab[i] = chain; ! chain_endtab[i] = chain; ! } ! qsort (chain_starttab, num_chains, sizeof *chain_starttab, ! chain_starts_earlier); ! qsort (chain_endtab, num_chains, sizeof *chain_endtab, chain_ends_later); ! ! ! /* Now we go through every chain, starting with the one that starts ! second (we can skip the first because we know there would be no match), ! and check it against the chain that ends first. */ ! /* ??? We assume here that reg_class_compatible_p will seldom return false. ! If that is not true, we should do a more thorough search for suitable ! chain combinations. */ ! pred_chainp = chain_endtab; ! pred_chain = *pred_chainp; ! max_start_luid = chain_starttab[num_chains - 1]->start_luid; ! for (num_linked = 0, i = num_chains - 2; i >= 0; i--) ! { ! struct rel_use_chain *succ_chain = chain_starttab[i]; ! if (succ_chain->start_luid > pred_chain->end_luid ! && ! pred_chain->end->no_link_pred ! && (pred_chain->calls_crossed ! ? succ_chain->calls_crossed ! : succ_chain->end->call_tally == pred_chain->chain->call_tally) ! && regclass_compatible_p (succ_chain->chain->class, ! pred_chain->chain->class) ! /* add_limits is not valid for MODE_PARTIAL_INT . */ ! && GET_MODE_CLASS (GET_MODE (rel_base->reg)) == MODE_INT ! && (succ_chain->chain->offset - pred_chain->match_offset ! >= add_limits[(int) GET_MODE (rel_base->reg)][0]) ! && (succ_chain->chain->offset - pred_chain->match_offset ! <= add_limits[(int) GET_MODE (rel_base->reg)][1])) ! { ! /* We can link these chains together. */ ! pred_chain->linked = succ_chain; ! succ_chain->start_luid = 0; ! num_linked++; ! ! pred_chain = *++pred_chainp; ! } ! else ! max_start_luid = succ_chain->start_luid; ! } ! ! if (regmove_dump_file && num_linked) ! fprintf (regmove_dump_file, "Linked to %d sets of chains.\n", ! num_chains - num_linked); ! ! /* Now count the number of registers that are available for reuse. */ ! /* ??? 
In rare cases, we might reuse more if we took different ! end luids of the chains into account. Or we could just allocate ! some new regs. But that would probably not be worth the effort. */ ! /* ??? We should pay attention to preferred register classes here too, ! if the to-be-allocated registers have a life outside the range that ! we handle. */ ! for (num_av_regs = 0, rel = rel_base->prev; rel; rel = rel->prev) ! { ! if (! rel->invalidate_luid ! || rel->invalidate_luid > max_end_luid) ! num_av_regs++; ! } ! ! /* Propagate mandatory register assignments to the first chain in ! all sets of linked chains, and set rel_base_reg_user. */ ! for (rel_base_reg_user = 0, i = 0; i < num_chains; i++) ! { ! struct rel_use_chain *chain = chain_starttab[i]; ! if (chain->linked) ! chain->reg = chain->linked->reg; ! if (chain->reg == rel_base->reg) ! rel_base_reg_user = chain; ! } ! /* If rel_base->reg is not a mandatorily allocated register, allocate ! it to the chain that starts first and has no allocated register, ! and that allows the addition of the start value in a single ! instruction. */ ! mode = (int) GET_MODE (rel_base->reg); ! if (! rel_base_reg_user) ! { ! for (i = num_chains - 1; i >= 0; --i) ! { ! struct rel_use_chain *chain = chain_starttab[i]; ! if (! chain->reg ! && chain->start_luid ! && chain->chain->offset >= add_limits[mode][0] ! && chain->chain->offset <= add_limits[mode][1] ! /* Also can't use this chain if its register is clobbered ! and other chains need to start later. */ ! && (! (chain->end->no_link_pred && chain->end->insn) ! || chain->end->luid >= max_start_luid) ! /* Also can't use it if it lasts longer than the ! base is available. */ ! && (! rel_base->invalidate_luid ! || rel_base->invalidate_luid > chain->end_luid)) ! { ! chain->reg = rel_base->reg; ! rel_base_reg_user = chain; ! break; ! } ! } ! } ! else ! rel_base_reg_user_offset = rel_base_reg_user->chain->offset; ! ! /* Now check if it is worth doing this optimization after all. !
Using separate registers per value, like in the code generated by cse, ! costs two instructions per register (one move and one add). ! Using the chains we have set up, we need two instructions for every ! linked set of chains, plus one instruction for every link; ! however, if the base register is allocated to a chain ! (i.e. rel_base_reg_user != 0), we don't need a move insn to start ! that chain. ! We do the optimization if we save instructions, or if we ! stay with the same number of instructions, but save registers. ! We also require that we have enough registers available for reuse. ! Moreover, we have to check that we can add the offset for ! rel_base_reg_user, in case it is a mandatorily allocated register. */ ! if ((2 * num_regs ! > ((2 * num_chains - num_linked - (rel_base_reg_user != 0)) ! - (num_linked != 0))) ! && num_av_regs + (rel_base_reg_user != 0) >= num_chains - num_linked ! && rel_base_reg_user_offset >= add_limits[mode][0] ! && rel_base_reg_user_offset <= add_limits[mode][1]) ! { ! /* Hold the current offset between the initial value of rel_base->reg ! and the current value of rel_base->reg before the instruction ! that starts the current set of chains. */ ! int base_offset = 0; ! /* The next use of rel_base->reg that we have to look out for. */ ! struct rel_use *base_use; ! /* Pointer to next insn where we look for it. */ ! rtx base_use_scan = 0; ! int base_last_use_call_tally = rel_base->reg_set_call_tally; ! int base_regno; ! int base_seen; ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Optimization is worth while.\n"); ! ! /* First, remove all the setting insns, death notes ! and refcount increments that are now obsolete. */ ! for (rel = rel_base; rel; rel = rel->prev) ! { ! struct update *update; ! int regno = REGNO (rel->reg); ! ! if (rel != rel_base) ! { ! /* The first setting insn might be the start of a basic block. */ ! if (rel->insn == rel_base->insn ! /* We have to preserve insert_after. */ ! || rel->insn == insert_after) !
{ ! PUT_CODE (rel->insn, NOTE); ! NOTE_LINE_NUMBER (rel->insn) = NOTE_INSN_DELETED; ! NOTE_SOURCE_FILE (rel->insn) = 0; ! } ! else ! delete_insn (rel->insn); ! REG_N_SETS (regno)--; ! } ! REG_N_CALLS_CROSSED (regno) -= rel->reg_orig_calls_crossed; ! for (update = rel->updates; update; update = update->prev) ! { ! rtx death_insn = update->death_insn; ! if (death_insn) ! { ! rtx death_note ! = find_reg_note (death_insn, REG_DEAD, rel->reg); ! if (! death_note) ! death_note ! = find_reg_note (death_insn, REG_UNUSED, rel->reg); ! remove_note (death_insn, death_note); ! REG_N_DEATHS (regno)--; ! } ! /* We have to preserve insert_after. */ ! if (update->insn == insert_after) ! { ! PUT_CODE (update->insn, NOTE); ! NOTE_LINE_NUMBER (update->insn) = NOTE_INSN_DELETED; ! NOTE_SOURCE_FILE (update->insn) = 0; ! } ! else ! delete_insn (update->insn); ! REG_N_SETS (regno)--; ! } ! if (rel->death) ! { ! rtx death_note = find_reg_note (rel->death, REG_DEAD, rel->reg); ! if (! death_note) ! death_note = find_reg_note (rel->death, REG_UNUSED, rel->reg); ! remove_note (rel->death, death_note); ! rel->death = death_note; ! REG_N_DEATHS (regno)--; ! } ! } ! /* Go through all the chains and install the planned changes. */ ! rel = rel_base; ! if (rel_base_reg_user) ! { ! base_use = rel_base_reg_user->chain; ! base_use_scan = chain_starttab[num_chains - 1]->chain->insn; ! } ! ! /* Set last_init_chain to the chain that starts latest. */ ! for (i = 0; ! chain_starttab[i]->start_luid; i++); ! last_init_chain = chain_starttab[i]; ! /* If there are multiple chains that consist only of an ! assignment to a register that is live at the end of the ! block, they all have the same luid. The loop that emits the ! insns for all the chains below starts with the chain with the ! highest index, and due to the way insns are emitted after ! insert_after, the first emitted will eventually be the last. */ ! for (; i < num_chains; i++) ! { ! if (! chain_starttab[i]->start_luid) ! continue; ! 
if (chain_starttab[i]->chain->insn) ! break; ! last_init_chain = chain_starttab[i]; ! } ! ! base_regno = REGNO (rel_base->reg); ! base_seen = 0; ! for (i = num_chains - 1; i >= 0; i--) ! { ! int first_call_tally; ! rtx reg; ! int regno; ! struct rel_use *use, *last_use; ! ! chain0 = chain_starttab[i]; ! if (! chain0->start_luid) ! continue; ! first_call_tally = chain0->chain->call_tally; ! reg = chain0->reg; ! /* If this chain has not got a register yet, assign one. */ ! if (! reg) ! { ! do ! rel = rel->prev; ! while (! rel->death ! || (rel->invalidate_luid ! && rel->invalidate_luid <= max_end_luid)); ! reg = rel->reg; ! } ! regno = REGNO (reg); ! ! use = chain0->chain; ! ! /* Update base_offset. */ ! if (rel_base_reg_user) ! { ! rtx use_insn = use->insn; ! rtx base_use_insn = base_use->insn; ! ! if (! use_insn) ! use_insn = insert_after; ! ! while (base_use_scan != use_insn) ! { ! if (base_use_scan == base_use_insn) ! { ! base_offset = base_use->match_offset; ! base_use = base_use->next_chain; ! if (! base_use) ! { ! rel_base_reg_user = rel_base_reg_user->linked; ! if (! rel_base_reg_user) ! break; ! base_use = rel_base_reg_user->chain; ! } ! base_use_insn = base_use->insn; ! } ! base_use_scan = NEXT_INSN (base_use_scan); ! } ! /* If we are processing the start of a chain that starts with ! an instruction that also uses the base register, (that happens ! only if USE_INSN contains multiple distinct, but related ! values) and the chains using the base register have already ! been processed, the initializing instruction of the new ! register will end up after the adjustment of the base ! register. */ ! if (use_insn == base_use_insn && base_seen) ! base_offset = base_use->offset; ! } ! if (regno == base_regno) ! base_seen = 1; ! if (regno != base_regno || use->offset - base_offset) ! { ! HOST_WIDE_INT offset = use->offset - base_offset; ! rtx add; ! ! add = (offset ! ? gen_add3_insn (reg, rel_base->reg, GEN_INT (offset)) ! : gen_move_insn (reg, rel_base->reg)); ! 
if (! add) ! abort (); ! if (GET_CODE (add) == SEQUENCE) ! { ! int i; ! ! for (i = XVECLEN (add, 0) - 1; i >= 0; i--) ! note_stores (PATTERN (XVECEXP (add, 0, i)), count_sets, ! NULL); ! } ! else ! note_stores (add, count_sets, NULL); ! if (use->insn) ! add = emit_insn_before (add, use->insn); ! else ! add = emit_insn_after (add, insert_after); ! if (use->call_tally > base_last_use_call_tally) ! base_last_use_call_tally = use->call_tally; ! /* If this is the last reg initializing insn, and we ! still have to place a death note for the base reg, ! attach it to this insn - ! unless we are still using the base register. */ ! if (chain0 == last_init_chain ! && rel_base->death ! && regno != base_regno) ! { ! XEXP (rel_base->death, 0) = rel_base->reg; ! XEXP (rel_base->death, 1) = REG_NOTES (add); ! REG_NOTES (add) = rel_base->death; ! REG_N_DEATHS (base_regno)++; ! } ! } ! for (last_use = 0, chain = chain0; chain; chain = chain->linked) ! { ! int last_offset; ! ! use = chain->chain; ! if (last_use && use->offset != last_use->offset) ! { ! rtx add ! = gen_add3_insn (reg, reg, ! GEN_INT (use->offset - last_use->offset)); ! if (! add) ! abort (); ! if (use->insn) ! emit_insn_before (add, use->insn); ! else ! { ! /* Set use->insn, so that base_offset will be adjusted ! in time if REG is REL_BASE->REG . */ ! use->insn = emit_insn_after (add, last_use->insn); ! } ! REG_N_SETS (regno)++; ! } ! for (last_offset = use->offset; use; use = use->next_chain) ! { ! rtx addr; ! int use_offset; ! ! addr = *use->addrp; ! if (GET_CODE (addr) != REG) ! remove_note (use->insn, ! find_reg_note (use->insn, REG_INC, ! XEXP (addr, 0))); ! use_offset = use->offset; ! if (use_offset == last_offset) ! { ! if (use->set_in_parallel) ! { ! REG_N_SETS (REGNO (addr))--; ! addr = reg; ! } ! else if (use->match_offset > use_offset) ! addr = gen_rtx_POST_INC (Pmode, reg); ! else if (use->match_offset < use_offset) ! addr = gen_rtx_POST_DEC (Pmode, reg); ! else ! addr = reg; ! } ! 
else if (use_offset > last_offset) ! addr = gen_rtx_PRE_INC (Pmode, reg); ! else ! addr = gen_rtx_PRE_DEC (Pmode, reg); ! /* Group changes from the same chain for the same insn ! together, to avoid failures for match_dups. */ ! validate_change (use->insn, use->addrp, addr, 1); ! if ((! use->next_chain || use->next_chain->insn != use->insn) ! && ! apply_change_group ()) ! abort (); ! if (addr != reg) ! REG_NOTES (use->insn) ! = gen_rtx_EXPR_LIST (REG_INC, reg, REG_NOTES (use->insn)); ! last_use = use; ! last_offset = use->match_offset; ! } ! } ! /* If REG dies, attach its death note to the last using insn in ! the set of linked chains we just handled. ! However, if REG is the base register, don't do this if there ! will be a later initializing instruction for another register. ! The initializing instruction for last_init_chain will be inserted ! before last_init_chain->chain->insn, so if the luids (and hence ! the insns these stand for) are equal, put the death note here. */ ! if (reg == rel->reg ! && rel->death ! && (rel != rel_base ! || last_use->luid >= last_init_chain->start_luid)) ! { ! XEXP (rel->death, 0) = reg; ! ! /* Note that passing only PATTERN (LAST_USE->insn) to ! reg_set_p here is not enough, since we might have ! created an REG_INC for REG above. */ ! ! PUT_MODE (rel->death, (reg_set_p (reg, last_use->insn) ! ? REG_UNUSED : REG_DEAD)); ! XEXP (rel->death, 1) = REG_NOTES (last_use->insn); ! REG_NOTES (last_use->insn) = rel->death; ! /* Mark this death as 'used up'. That is important for the ! base register. */ ! rel->death = NULL_RTX; ! REG_N_DEATHS (regno)++; ! } ! if (regno == base_regno) ! base_last_use_call_tally = last_use->call_tally; ! else ! REG_N_CALLS_CROSSED (regno) ! += last_use->call_tally - first_call_tally; ! } ! ! REG_N_CALLS_CROSSED (base_regno) += ! base_last_use_call_tally - rel_base->reg_set_call_tally; ! } ! ! /* Finally, clear the entries that we used in regno_related. We do it ! 
item by item here, because doing it with bzero for each basic block ! would give O(n*n) time complexity. */ ! for (rel = rel_base; rel; rel = rel->prev) ! regno_related[REGNO (rel->reg)] = 0; ! return baseinfo->prev_base; ! } ! ! /* Finalize the optimization for any related values know so far, and reset ! the entries in regno_related that we have disturbed. */ ! static void ! optimize_related_values_0 (rel_base_list, luid, call_tally, insert_after, ! regmove_dump_file) ! struct related *rel_base_list; ! int luid, call_tally; ! rtx insert_after; ! FILE *regmove_dump_file; ! { ! while (rel_base_list) ! rel_base_list ! = optimize_related_values_1 (rel_base_list, luid, call_tally, ! insert_after, regmove_dump_file); ! for ( ; unrelatedly_used; unrelatedly_used = unrelatedly_used->prev) ! regno_related[REGNO (unrelatedly_used->reg)] = 0; ! } ! ! /* Scan the entire function for instances where multiple registers are ! set to values that differ only by a constant. ! Then try to reduce the number of instructions and/or registers needed ! by exploiting auto_increment and true two-address additions. */ ! ! static void ! optimize_related_values (nregs, regmove_dump_file) ! int nregs; ! FILE *regmove_dump_file; ! { ! int b; ! rtx insn; ! int luid = 0; ! int call_tally = 0; ! int save_loop_depth = loop_depth; ! enum machine_mode mode; ! ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Starting optimize_related_values.\n"); ! ! /* For each integer mode, find minimum and maximum value for a single- ! instruction reg-constant add. */ ! for (mode = GET_CLASS_NARROWEST_MODE (MODE_INT); mode != VOIDmode; ! mode = GET_MODE_WIDER_MODE (mode)) ! { ! rtx reg = gen_rtx_REG (mode, FIRST_PSEUDO_REGISTER); ! int icode = (int) add_optab->handlers[(int) mode].insn_code; ! HOST_WIDE_INT tmp; ! rtx add, set; ! int p, p_max; ! ! add_limits[(int) mode][0] = 0; ! add_limits[(int) mode][1] = 0; ! if (icode == CODE_FOR_nothing ! || ! (*insn_data[icode].operand[0].predicate) (reg, mode) ! || ! 
(*insn_data[icode].operand[1].predicate) (reg, mode) ! || ! (*insn_data[icode].operand[2].predicate) (const1_rtx, mode)) ! continue; ! add = GEN_FCN (icode) (reg, reg, const1_rtx); ! if (GET_CODE (add) == SEQUENCE) ! continue; ! add = make_insn_raw (add); ! set = single_set (add); ! if (! set ! || GET_CODE (SET_SRC (set)) != PLUS ! || XEXP (SET_SRC (set), 1) != const1_rtx) ! continue; ! p_max = GET_MODE_BITSIZE (mode) - 1; ! if (p_max > HOST_BITS_PER_WIDE_INT - 2) ! p_max = HOST_BITS_PER_WIDE_INT - 2; ! for (p = 2; p < p_max; p++) ! { ! if (! validate_change (add, &XEXP (SET_SRC (set), 1), ! GEN_INT (((HOST_WIDE_INT) 1 << p) - 1), 0)) ! break; ! } ! add_limits[(int) mode][1] = tmp = INTVAL (XEXP (SET_SRC (set), 1)); ! /* We need a range of known good values for the constant of the add. ! Thus, before checking for the power of two, check for one less first, ! in case the power of two is an exceptional value. */ ! if (validate_change (add, &XEXP (SET_SRC (set), 1), GEN_INT (-tmp), 0)) ! { ! if (validate_change (add, &XEXP (SET_SRC (set), 1), ! GEN_INT (-tmp - 1), 0)) ! add_limits[(int) mode][0] = -tmp - 1; ! else ! add_limits[(int) mode][0] = -tmp; ! } ! } ! ! /* Insert notes before basic block ends, so that we can safely ! insert other insns. ! Don't do this when it would separate a BARRIER from the insn that ! it belongs to; we really need the notes only when the basic block ! end is due to a following label or to the end of the function. ! We must never dispose a JUMP_INSN as last insn of a basic block, ! since this confuses save_call_clobbered_regs. */ ! for (b = 0; b < n_basic_blocks; b++) ! { ! rtx end = BLOCK_END (b); ! if (GET_CODE (end) != JUMP_INSN) ! { ! rtx next = next_nonnote_insn (BLOCK_END (b)); ! if (! next || GET_CODE (next) != BARRIER) ! BLOCK_END (b) = emit_note_after (NOTE_INSN_DELETED, BLOCK_END (b)); ! } ! } ! ! gcc_obstack_init (&related_obstack); ! regno_related = rel_alloc (nregs * sizeof *regno_related); ! 
bzero ((char *) regno_related, nregs * sizeof *regno_related); ! rel_base_list = 0; ! loop_depth = 1; ! b = -1; ! ! for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) ! { ! luid++; ! ! /* Check to see if this is the first insn of the next block. There is ! no next block if we are already in the last block. */ ! if ((b+1) < n_basic_blocks && insn == BLOCK_HEAD (b+1)) ! b = b+1; ! ! if (GET_CODE (insn) == NOTE) ! { ! if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_BEG) ! loop_depth++; ! else if (NOTE_LINE_NUMBER (insn) == NOTE_INSN_LOOP_END) ! loop_depth--; ! } ! ! /* Don't do anything before the start of the first basic block. */ ! if (b < 0) ! continue; ! ! /* Don't do anything if this instruction is in the shadow of a ! live flags register. */ ! if (GET_MODE (insn) == HImode) ! continue; ! ! if (GET_RTX_CLASS (GET_CODE (insn)) == 'i') ! { ! rtx note; ! find_related (&PATTERN (insn), insn, luid, call_tally); ! for (note = REG_NOTES (insn); note; note = XEXP (note, 1)) ! { ! if (REG_NOTE_KIND (note) == REG_DEAD ! || (REG_NOTE_KIND (note) == REG_UNUSED ! && GET_CODE (XEXP (note, 0)) == REG)) ! { ! rtx reg = XEXP (note, 0); ! int regno = REGNO (reg); ! if (REG_NOTE_KIND (note) == REG_DEAD ! && reg_set_p (reg, PATTERN (insn))) ! { ! remove_note (insn, note); ! REG_N_DEATHS (regno)--; ! } ! else if (regno_related[regno] ! && ! regno_related[regno]->invalidate_luid) ! { ! regno_related[regno]->death = insn; ! regno_related[regno]->reg_orig_calls_crossed ! = call_tally - regno_related[regno]->reg_set_call_tally; ! } ! } ! } ! /* Inputs to a call insn do not cross the call, therefore CALL_TALLY ! must be bumped *after* they have been processed. */ ! if (GET_CODE (insn) == CALL_INSN) ! call_tally++; ! } ! ! /* We end current processing at the end of a basic block, or when ! a flags register becomes live. ! ! Otherwise, we might end up with one or more extra instructions ! inserted in front of the user, to set up or adjust a register. ! 
There are cases where this could be handled smarter, but most of ! the time the user will be a branch anyways, so the extra effort ! to handle the occasional conditional instruction is probably not ! justified by the little possible extra gain. */ ! ! if (insn == BLOCK_END (b) ! || GET_MODE (insn) == QImode) ! { ! optimize_related_values_0 (rel_base_list, luid, call_tally, ! prev_nonnote_insn (insn), ! regmove_dump_file); ! rel_base_list = 0; ! } ! } ! optimize_related_values_0 (rel_base_list, luid, call_tally, ! get_last_insn (), regmove_dump_file); ! obstack_free (&related_obstack, 0); ! loop_depth = save_loop_depth; ! if (regmove_dump_file) ! fprintf (regmove_dump_file, "Finished optimize_related_values.\n"); ! } ! ! #endif /* AUTO_INC_DEC */ ! static int *regno_src_regno; /* Indicate how good a choice REG (which appears as a source) is to replace *************** copy_src_to_dest (insn, src, dest, loop_ *** 796,801 **** --- 2449,2455 ---- seq = gen_sequence (); end_sequence (); /* If this sequence uses new registers, we may not use it. */ + /* ????? This code no longer works since no_new_pseudos is set. */ if (old_num_regs != reg_rtx_no || ! validate_replace_rtx (src, dest, insn)) { *************** regmove_optimize (f, nregs, regmove_dump *** 1098,1103 **** --- 2752,2763 ---- /* Find out where a potential flags register is live, and so that we can supress some optimizations in those zones. */ mark_flags_life_zones (discover_flags_reg ()); + #ifdef AUTO_INC_DEC + /* See the comment in front of REL_USE_HASH_SIZE what + this is about. 
*/ + if (flag_regmove && flag_expensive_optimizations) + optimize_related_values (nregs, regmove_dump_file); + #endif regno_src_regno = (int *) xmalloc (sizeof *regno_src_regno * nregs); for (i = nregs; --i >= 0; ) regno_src_regno[i] = -1; *************** regmove_optimize (f, nregs, regmove_dump *** 1196,1206 **** src = recog_data.operand[op_no]; dst = recog_data.operand[match_no]; - if (GET_CODE (src) != REG) - continue; - src_subreg = src; if (GET_CODE (dst) == SUBREG && GET_MODE_SIZE (GET_MODE (dst)) >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst)))) { --- 2856,2864 ---- src = recog_data.operand[op_no]; dst = recog_data.operand[match_no]; src_subreg = src; if (GET_CODE (dst) == SUBREG + && GET_CODE (src) == REG && GET_MODE_SIZE (GET_MODE (dst)) >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (dst)))) { *************** regmove_optimize (f, nregs, regmove_dump *** 1209,1215 **** src, SUBREG_WORD (dst)); dst = SUBREG_REG (dst); } ! if (GET_CODE (dst) != REG || REGNO (dst) < FIRST_PSEUDO_REGISTER) continue; --- 2867,2879 ---- src, SUBREG_WORD (dst)); dst = SUBREG_REG (dst); } ! else if (GET_CODE (src) == SUBREG ! && (GET_MODE_SIZE (GET_MODE (src)) ! >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (src))))) ! src = SUBREG_REG (src); ! ! if (GET_CODE (src) != REG ! || GET_CODE (dst) != REG || REGNO (dst) < FIRST_PSEUDO_REGISTER) continue; *************** fixup_match_1 (insn, set, src, src_subre *** 1871,1876 **** --- 3535,3553 ---- validate_change (insn, recog_data.operand_loc[match_number], src, 1); if (validate_replace_rtx (dst, src_subreg, p)) success = 1; + else if (src_subreg != src + && *recog_data.operand_loc[match_number] == dst) + { + /* In this case, we originally have a subreg in the src + operand. It's mode should match the destination operand. + Moreover, P is likely to use DST in a subreg, so replacing + it with another subreg will fail - but putting the raw + register there can succeed. 
*/ + validate_change (insn, recog_data.operand_loc[match_number], + src_subreg, 1); + if (validate_replace_rtx (dst, src, p)) + success = 1; + } break; } ^ permalink raw reply [flat|nested] 94+ messages in thread
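For readers following the patch above: the loop at the top of optimize_related_values probes, for each integer mode, how large a constant a single add instruction accepts, by trying 2^p - 1 for increasing p and then checking the negative side. A minimal sketch of that probing idea, outside of GCC, might look like the following. The names probe_add_limits, add_const_ok_fn and fits_signed_8bit are hypothetical stand-ins (the patch itself uses validate_change on a scratch insn), and the sketch assumes an add of 1 has already been accepted.

```c
#include <limits.h>

/* Hypothetical stand-in for "does the target accept this add constant?". */
typedef int (*add_const_ok_fn)(long value);

/* Example predicate: a target whose add takes a signed 8-bit immediate. */
int fits_signed_8bit(long value)
{
    return value >= -128 && value <= 127;
}

/* Fill limits[0] (minimum) and limits[1] (maximum) with a range of
   constants known to be accepted.  Assumes ok(1) is already known true. */
void probe_add_limits(add_const_ok_fn ok, int mode_bits, long limits[2])
{
    int host_bits = (int)(sizeof(long) * CHAR_BIT);
    int p_max = mode_bits - 1;
    long max = 1;

    if (p_max > host_bits - 2)
        p_max = host_bits - 2;

    /* Try 3, 7, 15, ... and stop at the first rejected value. */
    for (int p = 2; p < p_max; p++) {
        long candidate = (1L << p) - 1;
        if (!ok(candidate))
            break;
        max = candidate;
    }
    limits[1] = max;

    /* Check -max first, then one further, in case the power of two
       itself is an exceptional value. */
    limits[0] = 0;
    if (ok(-max))
        limits[0] = ok(-max - 1) ? -max - 1 : -max;
}
```

Calling probe_add_limits (fits_signed_8bit, 32, lim) fills lim with the range the example predicate accepts; a real target would answer through its insn predicates instead.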
* Re: Autoincrement examples
From: Jeffrey A Law @ 1999-11-30 23:37 UTC
To: Michael Hayes; +Cc: gcc, amylaar

In message <14315.24602.150585.362111@ongaonga.elec.canterbury.ac.nz> you write:
> Michael Hayes writes:
> > I've also been tidying up a patch that I'm about to submit for a
> > separate autoincrement pass that is run as part of flow optimization.
> > This collects lists of register references within a basic block and
> > uses these lists to look for a sequence of memory references to merge
> > with an increment insn.  I found that this approach worked better than
> > scanning def-use chains.
>
> Here's the patches I was referring to.  There are three new files:
> autoinc.c, ref.c, ref.h.  I've written this as a bolt-on to
> life_analysis_1 since it completely replaces the autoinc code in
> flow.c.
>
> I'd be interested in folks' opinions as to whether this is the best
> approach or whether I'm barking up the wrong tree....
>
> For starters, here are the docs from autoinc.c:
>
>    There are a number of transformations which can be made by
>    this optimization:
>
>    case A:
>      *REG1; REG1 = REG1 + INC   =>  REG1 = REG1; *REG1++
>
>    case B:
>      *REG1; REG2 = REG1 + INC   =>  *REG1++; REG2 = REG1
>      where REG1 dies in the add insn.
>
>    case C: (very uncommon)
>      *REG1; REG2 = REG1 + INC   =>  REG2 = REG1; *REG2++
>      where REG1 is live after the add insn and where REG2 is not used
>      between the first memref and the add insn.  This case requires
>      a new move insn to be inserted before the first memref which makes
>      REG2 live earlier.  However, this won't affect autoinc processing
>      for REG2 since REG2 is not operand 1 of an add insn.
>
>    case D:
>      REG1 = REG1 + INC; *REG1   =>  REG1 = REG1; *++REG1
>
>    case E:
>      REG2 = REG1 + INC; *REG1   =>  *++REG1; REG2 = REG1
>      where REG1 dies in the last memref and REG2 is not used between
>      the add insn and last memref.
>
>    case F:
>      *REG1; *(REG1 + INC)       =>  *REG1; *++REG1
>      where REG1 dies in the last memref.
>
>    This latter case is useful for DSP architectures which can handle
>    multiple autoincrement addresses better than multiple indirect
>    addresses.  This case could be handled by a separate, optional, scan
>    of the register ref list.
>
>    Note that strength_reduce in loop.c performs the following
>    transformation which we try to undo:
>      *R; R = R + 1; *R; R = R + 1  =>  *R; *(R + 1); R = R + 2
>
>    However, the following is not transformed:
>      R = R + 1; *R; R = R + 1; *R

This looks very similar to something Cygnus did for a customer but hasn't
had the time to contribute.  Our implementation sat inside regmove and I
believe performed similar transformations.

What I would like to do is have a "cook off" between the two
implementations.  I.e., I want us to evaluate the two hunks of code both
from the standpoint of which is more effective at optimizing sequences that
can use autoinc to remove instructions, and from a cleanliness/long-term
maintainability standpoint.

I do _not_ want to ultimately have two hunks of code that basically do the
same thing.  That's dumb.

Joern -- can you get a patch for the regmove changes put together and
submit it to the list?

Michael -- can you look at Joern's implementation and compare it to your
own?

Joern -- can you do the same with Michael's implementation?

It would be nice if some other folks could try these two implementations on
targets that have autoincrement addresses to see what effect they have.

The two key issues are which optimizes better and which is more
maintainable long term.

jeff
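To make the quoted cases a little more concrete at the source level, here is a small C sketch, not taken from either patch, of three loop shapes that compute the same thing: a load followed by a separate add (the left-hand side of case A), the same loop with the add folded into a post-increment access, and the offset form that strength reduction is described as producing. The function names are made up for illustration, and whether a given compiler actually emits autoincrement addressing for any of them depends on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Left-hand side of case A: a memory reference, then an add to the pointer. */
int sum_load_then_add(const int *p, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += *p;    /* *REG1 */
        p = p + 1;    /* REG1 = REG1 + INC */
    }
    return sum;
}

/* The same loop with the add folded into the access: *REG1++. */
int sum_post_increment(const int *p, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += *p++;
    return sum;
}

/* Shape after strength reduction: *R; *(R + 1); R = R + 2.
   Requires an even element count. */
int sum_offset_pairs(const int *p, size_t n)
{
    int sum = 0;
    assert(n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        sum += p[0];  /* *R */
        sum += p[1];  /* *(R + 1) */
        p += 2;       /* R = R + 2 */
    }
    return sum;
}
```

All three return the same sum for the same input; the interesting part is only which address arithmetic is left for the autoincrement pass to recombine.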
* Re: Autoincrement examples
From: Denis Chertykov @ 1999-09-22 12:31 UTC
To: Bernd Schmidt; +Cc: gcc

Bernd Schmidt <bernds@cygnus.co.uk> writes:

> I'm playing with a patch to improve the generation of auto-increment
> addressing modes, e.g. by generating POST_MODIFY and PRE_MODIFY rtxs for
> targets where this is possible.

A few months ago I was playing with autoincrement/autodecrement addressing
modes. (I'm working on my own gcc port: ATMEL AVR, an 8-bit microcontroller.)

The combiner pass generates (and tries to recognize) insns like:

(parallel [(set (mem:MODE (reg1)) reg2)
           (set (reg1) (plus (reg1) (const_int MODE_SIZE)))])

IMHO the combiner must transform this insn into an insn with a
post-increment address.
(I know about the flow pass, and flow must do this. But combine must
transform insns to POST/PRE INCREMENT/DECREMENT form where that is possible.)

This is my experimental code.
I call `try_auto_inc' from `recog_for_combine' immediately before:

  insn_code_number = recog (pat, insn, &num_clobbers_to_add);

as:

  pat = try_auto_inc (pnewpat, insn, &notes); /* XXXX */

  /* Is the result of combination a valid instruction?  */
  insn_code_number = recog (pat, insn, &num_clobbers_to_add);

try_auto_inc and try_auto_inc_1:

rtx
try_auto_inc_1 (reg, mem_addr, set)
     rtx reg;
     rtx mem_addr;
     rtx set;
{
  if (GET_CODE (set) == SET)
    {
      rtx addr = XEXP (mem_addr,0);
      HOST_WIDE_INT offset = 0;

      if (GET_CODE (addr) == PLUS
          && GET_CODE (XEXP (addr, 1)) == CONST_INT)
        offset = INTVAL (XEXP (addr, 1)), addr = XEXP (addr, 0);

      if (GET_CODE (addr) == REG
          && ! reg_overlap_mentioned_p (addr, reg))
        {
          register rtx y;
          register int size = GET_MODE_SIZE (GET_MODE (mem_addr));

          /* Is the next use an increment that might make auto-increment? */
          if ((y = SET_SRC (set), GET_CODE (y) == PLUS) )
            if (XEXP (y, 0) == addr && addr == SET_DEST (y))
              if (GET_CODE (XEXP (y, 1)) == CONST_INT)
                if ((HAVE_POST_INCREMENT
                     && (INTVAL (XEXP (y, 1)) == size && offset == 0))
                    || (HAVE_POST_DECREMENT
                        && (INTVAL (XEXP (y, 1)) == - size && offset == 0))
                    || (HAVE_PRE_INCREMENT
                        && (INTVAL (XEXP (y, 1)) == size && offset == size))
                    || (HAVE_PRE_DECREMENT
                        && (INTVAL (XEXP (y, 1)) == - size
                            && offset == - size)))
                  {
                    enum rtx_code inc_code
                      = (INTVAL (XEXP (y, 1)) == size
                         ? (offset ? PRE_INC : POST_INC)
                         : (offset ? PRE_DEC : POST_DEC));
                    /* This is the simple case.  Try to make the auto-inc.  */
                    return gen_rtx_MEM (GET_MODE (mem_addr),
                                        gen_rtx_fmt_e (inc_code, Pmode, addr));
                  }
        }
    }
  return NULL_RTX;
}

rtx
try_auto_inc (pnewpat, insn, pnotes)
     rtx *pnewpat;
     rtx insn;
     rtx *pnotes;
{
  rtx pat = *pnewpat;
  rtx new_rtx;
  rtx *new_mem;

  if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) == 2)
    {
      rtx incr = XVECEXP (pat, 0, 1);
      rtx x = XVECEXP (pat, 0, 0);
      rtx addr;
      rtx reg;

      if (GET_CODE (SET_SRC (x)) == MEM && REG_P (SET_DEST (x)))
        {
          addr = SET_SRC (x);
          reg = SET_DEST (x);
          new_rtx = gen_rtx_SET (VOIDmode, reg, addr);
          new_mem = &XEXP (new_rtx,1);
        }
      else if (GET_CODE (SET_DEST (x)) == MEM && REG_P (SET_SRC (x)))
        {
          addr = SET_DEST (x);
          reg = SET_SRC (x);
          new_rtx = gen_rtx_SET (VOIDmode, addr, reg);
          new_mem = &XEXP (new_rtx,0);
        }
      else
        return *pnewpat;

      if (pat = try_auto_inc_1 (reg, addr, incr))
        {
          *new_mem = pat;
          *pnewpat = new_rtx;
          /* insn has an implicit side effect.  */
          /* Warning !!! */
          /* combine don't work correctly with NOTES */
          /* combine.c must be changed.  */
          *pnotes = gen_rtx_EXPR_LIST (REG_INC, XEXP (addr,0), 0);
        }
    }
  return *pnewpat;
}

Patch for combine.c:

===File ~/d/Archive/gcc/egcs-19990502/gcc/combiner-patch====
*** combine.c   Thu Mar 25 19:54:31 1999
--- /disk5/egcs-19990502/gcc/combine.c  Thu Jul 15 23:17:05 1999
*************** try_combine (i3, i2, i1)
*** 2235,2241 ****
        }

        for (note = new_other_notes; note; note = XEXP (note, 1))
!         if (GET_CODE (XEXP (note, 0)) == REG)
            REG_N_DEATHS (REGNO (XEXP (note, 0)))++;

        distribute_notes (new_other_notes, undobuf.other_insn,
--- 2235,2243 ----
        }

        for (note = new_other_notes; note; note = XEXP (note, 1))
!         if ((REG_NOTE_KIND (note) == REG_UNUSED
!              || REG_NOTE_KIND (note) == REG_DEAD)
!             && GET_CODE (XEXP (note, 0)) == REG)
            REG_N_DEATHS (REGNO (XEXP (note, 0)))++;

        distribute_notes (new_other_notes, undobuf.other_insn,
*************** try_combine (i3, i2, i1)
*** 2391,2397 ****
      if (newi2pat && new_i2_notes)
        {
          for (temp = new_i2_notes; temp; temp = XEXP (temp, 1))
!           if (GET_CODE (XEXP (temp, 0)) == REG)
              REG_N_DEATHS (REGNO (XEXP (temp, 0)))++;

          distribute_notes (new_i2_notes, i2, i2, NULL_RTX, NULL_RTX, NULL_RTX);
--- 2393,2401 ----
      if (newi2pat && new_i2_notes)
        {
          for (temp = new_i2_notes; temp; temp = XEXP (temp, 1))
!           if ((REG_NOTE_KIND (temp) == REG_UNUSED
!                || REG_NOTE_KIND (temp) == REG_DEAD)
!               && GET_CODE (XEXP (temp, 0)) == REG)
              REG_N_DEATHS (REGNO (XEXP (temp, 0)))++;

          distribute_notes (new_i2_notes, i2, i2, NULL_RTX, NULL_RTX, NULL_RTX);
*************** try_combine (i3, i2, i1)
*** 2400,2406 ****
      if (new_i3_notes)
        {
          for (temp = new_i3_notes; temp; temp = XEXP (temp, 1))
!           if (GET_CODE (XEXP (temp, 0)) == REG)
              REG_N_DEATHS (REGNO (XEXP (temp, 0)))++;

          distribute_notes (new_i3_notes, i3, i3, NULL_RTX, NULL_RTX, NULL_RTX);
--- 2404,2412 ----
      if (new_i3_notes)
        {
          for (temp = new_i3_notes; temp; temp = XEXP (temp, 1))
!           if ((REG_NOTE_KIND (temp) == REG_UNUSED
!                || REG_NOTE_KIND (temp) == REG_DEAD)
!               && GET_CODE (XEXP (temp, 0)) == REG)
              REG_N_DEATHS (REGNO (XEXP (temp, 0)))++;

          distribute_notes (new_i3_notes, i3, i3, NULL_RTX, NULL_RTX, NULL_RTX);
*************** recog_for_combine (pnewpat, insn, pnotes
*** 9115,9120 ****
--- 9124,9131 ----
        if (GET_CODE (XVECEXP (pat, 0, i)) == CLOBBER
            && XEXP (XVECEXP (pat, 0, i), 0) == const0_rtx)
          return -1;
+
+   pat = try_auto_inc (pnewpat, insn, &notes); /* XXXX */

    /* Is the result of combination a valid instruction?  */
    insn_code_number = recog (pat, insn, &num_clobbers_to_add);
============================================================

A patch for rtlanal.c is also needed:

===File ~/d/Archive/gcc/egcs-19990502/gcc/rtlanal-patch=====
*** rtlanal.c   Mon Apr 12 06:18:55 1999
--- /disk5/egcs-19990502/gcc/rtlanal.c  Tue May 25 21:41:14 1999
*************** dead_or_set_regno_p (insn, test_regno)
*** 1292,1308 ****
       TEST_REGNO.  */
    for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
      {
!       if (REG_NOTE_KIND (link) != REG_DEAD
!           || GET_CODE (XEXP (link, 0)) != REG)
!         continue;
!
!       regno = REGNO (XEXP (link, 0));
!       endregno = (regno >= FIRST_PSEUDO_REGISTER ? regno + 1
!                   : regno + HARD_REGNO_NREGS (regno,
!                                               GET_MODE (XEXP (link, 0))));
!
!       if (test_regno >= regno && test_regno < endregno)
!         return 1;
      }

    if (GET_CODE (insn) == CALL_INSN
--- 1292,1308 ----
       TEST_REGNO.  */
    for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
      {
!       if ((REG_NOTE_KIND (link) == REG_DEAD || REG_NOTE_KIND (link) == REG_INC)
!           && GET_CODE (XEXP (link, 0)) == REG)
!         {
!           regno = REGNO (XEXP (link, 0));
!           endregno = (regno >= FIRST_PSEUDO_REGISTER ? regno + 1
!                       : regno + HARD_REGNO_NREGS (regno,
!                                                   GET_MODE (XEXP (link, 0))));
!
!           if (test_regno >= regno && test_regno < endregno)
!             return 1;
!         }
      }

    if (GET_CODE (insn) == CALL_INSN
============================================================

Denis.

PS: all my patches may be too old because they are against egcs-19990502.
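The core decision in try_auto_inc_1 above can be read as a small table keyed on the increment, the offset in the memory address, and the access size. A standalone sketch of just that table, with hypothetical names (classify_auto_inc and enum auto_inc_kind are not part of GCC), could look like this. It assumes size is positive, and it deliberately leaves out the HAVE_* target checks and the register checks that the posted code performs.

```c
/* Hypothetical result type; GCC itself uses rtx codes such as POST_INC. */
enum auto_inc_kind {
    AUTO_NONE,
    AUTO_POST_INC,
    AUTO_POST_DEC,
    AUTO_PRE_INC,
    AUTO_PRE_DEC
};

/* increment: constant added to the address register by the second SET.
   offset:    constant offset in the memory address (0 for a plain register).
   size:      size in bytes of the memory access, assumed positive.  */
enum auto_inc_kind classify_auto_inc(long increment, long offset, long size)
{
    if (increment == size && offset == 0)
        return AUTO_POST_INC;   /* use the address, then step forward */
    if (increment == -size && offset == 0)
        return AUTO_POST_DEC;   /* use the address, then step back */
    if (increment == size && offset == size)
        return AUTO_PRE_INC;    /* step forward, then use the address */
    if (increment == -size && offset == -size)
        return AUTO_PRE_DEC;    /* step back, then use the address */
    return AUTO_NONE;           /* no auto-inc form matches */
}
```

For a 2-byte access, an increment of 2 with offset 0 classifies as post-increment, while an increment of 2 with offset 2 classifies as pre-increment; any other combination is left alone.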
* Re: Autoincrement examples
From: Rask Ingemann Lambertsen @ 1999-09-29 6:15 UTC
To: GCC mailing list

On 22-Sep-99 11:08:25, Bernd Schmidt wrote the following about
"Autoincrement examples":

> If someone has (preferably small) example code (for any target), which
> shows how the compiler generates auto-increments either very well or very
> badly,

This particular example has long been a pet peeve of mine wrt. GCC code
generation on m68k:

void stringcopy (const char *src, char *dest)
{
  while (*dest++ = *src++)
    ;
}

GCC 2.95.1 generates this code for the loop body:

L5:
        moveb a1@+,a0@
        tstb a0@+
        jne L5

That sort of code will give any m68k programmer red eyes, since the tstb
instruction is unneeded. The loop body should be just

L5:
        moveb a1@+,a0@+
        jne L5

In particular on the 68010, the latter loop should be quite a bit faster
(2*strlen(src) + 2 memory accesses instead of 5*strlen(src) memory accesses).

Regards,

| Rask Ingemann Lambertsen    | E-mail: mailto:rask@kampsax.k-net.dk  |
| Registered Phase5 developer | WWW: http://www.gbar.dtu.dk/~c948374/ |
| A4000, 866 kkeys/s (RC5-64) | "ThrustMe" on XPilot, ARCnet and IRC  |
|     If it jams, force it. If it breaks, it needed replacing.        |
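As a rough C-level picture of the difference between the two loop bodies above, the following sketch puts the posted function next to a variant that stores the byte and then reads it back through dest before advancing, which mirrors the moveb/tstb pair. The second function, stringcopy_reread, is a made-up illustration and not compiler output; both functions copy the same bytes, so the re-read buys nothing.

```c
/* As posted: the value of the assignment is the byte just stored, so
   the loop test needs no extra memory access. */
void stringcopy(const char *src, char *dest)
{
    while ((*dest++ = *src++))
        ;
}

/* Hypothetical variant mirroring the shape of the 2.95.1 loop body:
   store first, then test the stored byte through dest. */
void stringcopy_reread(const char *src, char *dest)
{
    for (;;) {
        *dest = *src++;          /* moveb a1@+,a0@ */
        if (*dest++ == '\0')     /* tstb a0@+ */
            break;               /* jne L5 loops while nonzero */
    }
}
```

Either function can be called as stringcopy ("abc", buf) with a buffer of at least four bytes.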
[parent not found: <14390.31570.64315.731520@ongaonga.elec.canterbury.ac.nz>]
* Re: Autoincrement examples
From: Joern Rennecke @ 1999-11-20 9:09 UTC
To: Michael Hayes; +Cc: amylaar, m.hayes, law, gcc, amylaar

> The instruction combiner may then convert this into:
...
> (set (reg u) (mult (mem (post_inc (reg a))) (mem (post_inc (reg b)))))
> (set (reg v) (plus (reg v) (reg u)))
...
> Now if autoinc generation takes place after instruction combination we
> end up with the following that exhibits load instructions that could
> be optimised away:
...
> (set (reg s) (mem (post_inc (reg a))))
> (set (reg t) (mem (post_inc (reg b))))
> (set (reg u) (mult (reg s) (reg t)))
> (set (reg v) (plus (reg v) (reg u)))

Oh. So you are not actually talking about dead loads, but loads that could
be combined into the instruction that uses the loaded value.
This optimization opportunity seems to be due to a peculiarity of your
target architecture - not being a true load-store one, but allowing different
sets of addressing modes in different instructions.

Indeed, it seems best to let combine do this job.
Updating LOG_LINKS seems tedious, but not intrinsically hard to do. Just
time-consuming to code.
* Re: Autoincrement examples
From: Michael Hayes @ 1999-11-20 14:48 UTC
To: Joern Rennecke; +Cc: Michael Hayes, law, gcc, amylaar

Joern Rennecke writes:
> Oh. So you are not actually talking about dead loads, but loads that could
> be combined into the instruction that uses the loaded value.

Yes.

> This optimization opportunity seems to be due to a peculiarity of your
> target architecture - not being a true load-store one, but allowing different
> sets of addressing modes in different instructions.

This is typical of digital signal processor architectures where you
want to perform the kernel of a dot-product with a single instruction;
usually a MAC with a pair of autoincrement memory references.

Michael.
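As an illustration of the kind of source that leads to the RTL quoted in this exchange, a dot-product kernel in C might look like the sketch below. The function name `dot_product` and the choice of `short` inputs with a `long` accumulator are assumptions made for the example; the thread does not show the original source.

```c
#include <assert.h>
#include <stddef.h>

/* Dot-product kernel.  Each iteration loads through two pointers that
   are both post-incremented, multiplies the loaded values and adds the
   product to the accumulator.  This is the shape that a DSP with a
   multiply-accumulate instruction taking two autoincrement memory
   operands could execute as one instruction per iteration.  */
long
dot_product (const short *a, const short *b, size_t n)
{
  long v = 0;

  assert (n == 0 || (a != NULL && b != NULL));
  while (n-- > 0)
    v += (long) *a++ * (long) *b++;
  return v;
}
```

If the two loads are left as separate instructions, as in the second RTL sequence quoted above, the loop body needs two loads plus the multiply and add, rather than the single combined form.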