From: "Rahul Kharche"
Cc: "sdkteam-gnu"
Subject: RFC: missed loop optimizations from loop induction variable copies
Date: Tue, 22 Sep 2009 11:53:00 -0000
Message-ID: <4D60B0700D1DB54A8C0C6E9BE69163700B7F5FDB@EXCHANGEVS.IceraSemi.local>

The following example shows missed loop optimizations at -O2 caused by
the creation of unnecessary loop induction variables. Alternatively, it
is a case of IVOpts not being able to coalesce copies of induction
variables. A related case was posted earlier in PR41026, which involved
a copy of a type-promoted loop index variable. I believe this example
makes the problem more obvious.

struct struct_t {
  int* data;
};

void testAutoIncStruct (struct struct_t* sp, int start, int end)
{
  int i;
  for (i = 0; i+start < end; i++)
    {
      sp->data[i+start] = 0;
    }
}

With the GCC v4.4.1 release and

  gcc -O2 -fdump-tree-all

on the above case, we get the following dump from IVOpts:

testAutoIncStruct (struct struct_t * sp, int start, int end)
{
  unsigned int D.1283;
  unsigned int D.1284;
  int D.1282;
  unsigned int ivtmp.32;
  int * pretmp.17;
  int i;
  int * D.1245;
  unsigned int D.1244;
  unsigned int D.1243;

:
  if (start_3(D) < end_5(D))
    goto ;
  else
    goto ;

:
  pretmp.17_22 = sp_6(D)->data;
  D.1282_23 = start_3(D) + 1;
  ivtmp.32_25 = (unsigned int) D.1282_23;
  D.1283_27 = (unsigned int) end_5(D);
  D.1284_28 = D.1283_27 + 1;

:
  # start_20 = PHI
  # ivtmp.32_7 = PHI
  D.1243_9 = (unsigned int) start_20;
  D.1244_10 = D.1243_9 * 4;
  D.1245_11 = pretmp.17_22 + D.1244_10;
  *D.1245_11 = 0;
  start_26 = (int) ivtmp.32_7;
  start_4 = start_26;
  ivtmp.32_24 = ivtmp.32_7 + 1;
  if (ivtmp.32_24 != D.1284_28)
    goto ;
  else
    goto ;

:
  goto ;

:
  return;

}

IVOpts cannot identify start_26, start_4 and ivtmp.32_7 as copies of
one another. The root cause is that the expression 'i + start' is
identified as a common expression between the test in the header and
the index operation in the latch. It is unified by copy propagation or
FRE prior to the loop optimizations, and this creates a new induction
variable.
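For readers who prefer C to GIMPLE, here is a rough hand-written rendering of the loop shape in the dump above, next to the test case as posted. It is only a sketch: the PHI arguments are not visible in the dump, so I am assuming start_20 merges start_3 with start_4 and ivtmp.32_7 merges ivtmp.32_25 with ivtmp.32_24. The names testAutoIncStruct_two_ivs and same_result are made up for illustration.

```c
#include <assert.h>
#include <string.h>

struct struct_t { int *data; };

/* Test case as posted: a single source-level counter, i. */
void testAutoIncStruct(struct struct_t *sp, int start, int end)
{
    int i;
    for (i = 0; i + start < end; i++)
        sp->data[i + start] = 0;
}

/* Approximate C rendering of the -O2 IVOpts dump.
   ASSUMPTION: the PHI arguments missing from the dump are
   (start_3, start_4) for start_20 and (ivtmp.32_25, ivtmp.32_24)
   for ivtmp.32_7. */
void testAutoIncStruct_two_ivs(struct struct_t *sp, int start, int end)
{
    if (start < end) {
        int *base = sp->data;                        /* pretmp.17_22 */
        unsigned int iv = (unsigned int)(start + 1); /* ivtmp.32_25  */
        unsigned int bound = (unsigned int)end + 1u; /* D.1284_28    */
        int s = start;                               /* start_20     */
        do {
            base[s] = 0;      /* address is computed from s          */
            s = (int)iv;      /* start_26, start_4: copy of iv       */
            iv = iv + 1u;     /* ivtmp.32_24: exit test uses iv only */
        } while (iv != bound);
    }
}

/* Hypothetical helper: returns 1 when both versions leave a
   16-element buffer in the same state. */
int same_result(int start, int end)
{
    int a[16], b[16];
    struct struct_t sa = { a }, sb = { b };

    assert(start >= 0 && end <= 16);  /* keep accesses inside the buffers */
    memset(a, 0xff, sizeof a);
    memset(b, 0xff, sizeof b);
    testAutoIncStruct(&sa, start, end);
    testAutoIncStruct_two_ivs(&sb, start, end);
    return memcmp(a, b, sizeof a) == 0;
}
```

The point of the rendering is that s and iv advance in lockstep, so one of them is redundant: the store could be addressed from iv, or the exit test could be written in terms of s.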

If we disable tree copy propagation and FRE with

  gcc -O2 -fno-tree-copy-prop -fno-tree-fre -fdump-tree-all

we get

testAutoIncStruct (struct struct_t * sp, int start, int end)
{
  unsigned int D.1287;
  unsigned int D.1288;
  unsigned int D.1289;
  int D.1290;
  unsigned int D.1284;
  unsigned int D.1285;
  unsigned int D.1286;
  int * pretmp.17;
  int i;
  int * D.1245;
  unsigned int D.1244;
  unsigned int D.1243;
  int D.1242;
  int * D.1241;

:
  if (start_3(D) < end_5(D))
    goto ;
  else
    goto ;

:
  pretmp.17_23 = sp_6(D)->data;
  D.1287_27 = (unsigned int) end_5(D);
  D.1288_28 = (unsigned int) start_3(D);
  D.1289_29 = D.1287_27 - D.1288_28;
  D.1290_30 = (int) D.1289_29;

:
  # i_20 = PHI
  D.1241_7 = pretmp.17_23;
  D.1284_26 = (unsigned int) start_3(D);
  D.1285_25 = (unsigned int) i_20;
  D.1286_24 = D.1284_26 + D.1285_25;
  MEM[base: pretmp.17_23, index: D.1286_24, step: 4] = 0;
  i_12 = i_20 + 1;
  if (i_12 != D.1290_30)
    goto ;
  else
    goto ;

:
  goto ;

:
  return;

}

The correct single induction variable has been identified here.

This is not a loop header copying problem either. If we disable loop
header copying, we still get multiple induction variables created. In
fact, in the above case loop header copying correctly enables
post-increment mode on our port.

testAutoIncStruct (struct struct_t * sp, int start, int end)
{
  unsigned int D.1282;
  unsigned int ivtmp.31;
  unsigned int ivtmp.29;
  int i;
  int * D.1245;
  unsigned int D.1244;
  unsigned int D.1243;
  int D.1242;
  int * D.1241;

:
  ivtmp.29_18 = (unsigned int) start_3(D);
  D.1282_21 = (unsigned int) start_3(D);
  ivtmp.31_22 = D.1282_21 * 4;
  goto ;

:
  D.1241_7 = sp_6(D)->data;
  D.1244_10 = ivtmp.31_19;
  D.1245_11 = D.1241_7 + D.1244_10;
  *D.1245_11 = 0;
  ivtmp.29_17 = ivtmp.29_8 + 1;
  ivtmp.31_20 = ivtmp.31_19 + 4;

:
  # ivtmp.29_8 = PHI
  # ivtmp.31_19 = PHI
  D.1242_23 = (int) ivtmp.29_8;
  if (D.1242_23 < end_5(D))
    goto ;
  else
    goto ;

:
  return;

}

Does this imply that we should try not to copy propagate or FRE
potential induction variables? Or is this simply a missed case in
IVOpts?

Rahul