From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 32673 invoked by alias); 6 Oct 2002 19:46:02 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 32655 invoked by uid 71); 6 Oct 2002 19:46:02 -0000
Date: Sun, 06 Oct 2002 12:46:00 -0000
Message-ID: <20021006194602.32654.qmail@sources.redhat.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org,
From: Bernd Paysan
Subject: Re: optimization/8092: cross-jump triggers too often
Reply-To: Bernd Paysan
X-SW-Source: 2002-10/txt/msg00224.txt.bz2
List-Id:

The following reply was made to PR optimization/8092; it has been noted by GNATS.

From: Bernd Paysan
To: anton@mips.complang.tuwien.ac.at, Anton Ertl, rth@gcc.gnu.org,
    gcc-bugs@gcc.gnu.org, gcc-prs@gcc.gnu.org, obody@gcc.gnu.org,
    gcc-gnats@gcc.gnu.org
Cc:
Subject: Re: optimization/8092: cross-jump triggers too often
Date: Sun, 6 Oct 2002 21:40:46 +0200

On Saturday 05 October 2002 11:57, Anton Ertl wrote:
> 2) Fix the bug that moves unrelated code into virtual machine
> instructions even with -fno-gcse; we can work around that in the
> present case (not yet done in the timings above), but at least I did
> not find the source of this code and thus the workaround.

I also suggest fixing the bug that moves unrelated code around within GCSE.
The code I've seen moved around is access to static pointers, like
stdin/stdout, __ctype_toupper, and so on. This indicates that the constant
propagation code has some bugs in it.
The following code illustrates this:

#include <stdio.h>

int foo (void ** code)
{
  static void * symbols[] = { &&label1, &&label2, &&label3, &&label4,
                              &&label5, &&label6, &&label7 };
  int a, b, c;

 label1:
  a = (int)stdin;
  goto **(code++);
 label2:
  b = (int)stdout;
  goto **(code++);
 label3:
  c = (int)stderr;
  goto **(code++);
 label4:
  a = 12;
  goto **(code++);
 label5:
  b = 123;
  goto **(code++);
 label6:
  c = 1234;
  goto **(code++);
 label7:
  return a^b^c;
}

Compile with -O2 -fno-cross-jump to see how the initializing code for
"global expressions" like stdout/stderr gets cluttered all over the place.
The dump obtained with -dG shows which expression is copied:

GCSE pass 1

SET hash table (11 buckets, 3 entries)
Index 0 (hash value 5)
  (set (reg/v:SI 60) (const_int 12 [0xc]))
Index 1 (hash value 6)
  (set (reg/v:SI 61) (const_int 123 [0x7b]))
Index 2 (hash value 7)
  (set (reg/v:SI 62) (const_int 1234 [0x4d2]))

LDST list:
  Pattern ( 0): (mem/f:SI (symbol_ref:SI ("stderr")) [5 stderr+0 S4 A32])
     Loads : (insn_list 42 (nil))
    Stores : (nil)
  Pattern ( 0): (mem/f:SI (symbol_ref:SI ("stdout")) [5 stdout+0 S4 A32])
     Loads : (insn_list 28 (nil))
    Stores : (nil)
  Pattern ( 0): (mem/f:SI (symbol_ref:SI ("stdin")) [5 stdin+0 S4 A32])
     Loads : (insn_list 14 (nil))
    Stores : (nil)

Expression hash table (15 buckets, 5 entries)
Index 0 (hash value 7)
  (mem/f:SI (symbol_ref:SI ("stdin")) [5 stdin+0 S4 A32])
Index 1 (hash value 2)
  (mem:SI (reg/v/f:SI 59) [4 S4 A32])
Index 2 (hash value 12)
  (plus:SI (reg/v/f:SI 59) (const_int 4 [0x4]))
Index 3 (hash value 7)
  (mem/f:SI (symbol_ref:SI ("stdout")) [5 stdout+0 S4 A32])
Index 4 (hash value 8)
  (mem/f:SI (symbol_ref:SI ("stderr")) [5 stderr+0 S4 A32])

PRE: redundant insn 14 (expression 0) in bb 1, reaching reg is 77
PRE: redundant insn 28 (expression 3) in bb 2, reaching reg is 78
PRE: redundant insn 42 (expression 4) in bb 3, reaching reg is 79
PRE/HOIST: edge (0,1), copy expression 0
PRE/HOIST: end of bb 1, insn 139, copying expression 3 to reg 78
PRE/HOIST: edge (1,2), copy expression 3
PRE/HOIST: end of bb 1, insn 142, copying expression 4 to reg 79
PRE/HOIST: edge (1,3), copy expression 4
PRE/HOIST: end of bb 2, insn 145, copying expression 4 to reg 79
PRE/HOIST: edge (2,3), copy expression 4
PRE/HOIST: end of bb 3, insn 148, copying expression 3 to reg 78
PRE/HOIST: edge (3,2), copy expression 3
PRE/HOIST: end of bb 4, insn 151, copying expression 3 to reg 78
PRE/HOIST: edge (4,2), copy expression 3
PRE/HOIST: end of bb 4, insn 154, copying expression 4 to reg 79
PRE/HOIST: edge (4,3), copy expression 4
PRE/HOIST: end of bb 5, insn 157, copying expression 3 to reg 78
PRE/HOIST: edge (5,2), copy expression 3
PRE/HOIST: end of bb 5, insn 160, copying expression 4 to reg 79
PRE/HOIST: edge (5,3), copy expression 4
PRE/HOIST: end of bb 6, insn 163, copying expression 3 to reg 78
PRE/HOIST: edge (6,2), copy expression 3
PRE/HOIST: end of bb 6, insn 166, copying expression 4 to reg 79
PRE/HOIST: edge (6,3), copy expression 4

PRE GCSE of foo, pass 1: 4072 bytes needed, 3 substs, 21 insns created

Apparently, GCSE copies constant expressions that seem to be "sufficiently
complicated" all over the place. In this case, though, the target
architecture implements (mem/f (symbol_ref xxx)) with the same instruction as
(const_int xxx), which is not copied. If I "poison" symbol_ref expressions in
gcse.c (so that they aren't considered), I still get a few redundant
expressions spread all over the place, e.g. increments of the stack pointer
(which happen fairly often, but doing so in advance is stupid).

As far as I understood the code, gcse.c copies "partially redundant"
expressions throughout the entire CFG to "make them fully redundant". That's
what's happening here. How can an expression that occurs only once be
"redundant"?
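
For readers unfamiliar with the term, here is a minimal source-level sketch
of what partial redundancy elimination is meant to do in the textbook case.
It is an illustration under stated assumptions, not code taken from gcse.c:
the names before_pre, after_pre and check_equivalent are hypothetical, and
the temporary t only loosely corresponds to the "reaching reg" in the dump.

```c
/* Hypothetical illustration of textbook PRE; not code from GCC. */
#include <assert.h>

/* Before: *p is loaded inside the branch and again after the join. */
int before_pre(int cond, const int *p)
{
    int x = 0;
    if (cond)
        x = *p;          /* first load, only on this path */
    return x + *p;       /* second load: redundant only if cond was true */
}

/* After: a copy of the load is inserted on the path that lacked it. */
int after_pre(int cond, const int *p)
{
    int x = 0;
    int t;               /* temporary holding the loaded value */
    if (cond) {
        t = *p;
        x = t;
    } else {
        t = *p;          /* inserted copy, one per edge lacking the load */
    }
    return x + t;        /* the later load is now fully redundant */
}

/* Both versions must agree for either branch direction. */
void check_equivalent(const int *p)
{
    assert(before_pre(0, p) == after_pre(0, p));
    assert(before_pre(1, p) == after_pre(1, p));
}
```

In this two-way branch, one inserted copy removes one load. In the
computed-goto example, each block ending in an indirect jump appears to be
treated as a predecessor of every label, so the insertion is repeated at the
end of each predecessor block, as the PRE/HOIST lines of the dump show.
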
And why doesn't gcse.c try to remove the copied expressions once it finds
out that the attempt to improve the code has failed? And why is there no
cost calculation? If you copy an expression from one block to m blocks, the
costs are multiplied by m.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/