From: "manolis.tsamis at vrull dot eu"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
Date: Mon, 13 Nov 2023 15:26:51 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: manolis.tsamis at vrull dot eu
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415

--- Comment #48 from Manolis Tsamis ---
(In reply to dave.anglin from comment #47)
> On 2023-11-13 4:33 a.m., manolis.tsamis at vrull dot eu wrote:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
> >
> > --- Comment #44 from Manolis Tsamis ---
> > (In reply to John David Anglin from comment #39)
> >> In the f-m-o pass, the following three insns that set call clobbered
> >> registers r20-r22 are pulled from loop:
> >>
> >> (insn 186 183 190 29 (set (reg/f:SI 22 %r22 [478])
> >>         (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> >>             (const_int 388 [0x184]))) "../Python/compile.c":5964:9 120 {addsi3}
> >>      (nil))
> >> (insn 190 186 187 29 (set (reg/f:SI 21 %r21 [479])
> >>         (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> >>             (const_int 392 [0x188]))) "../Python/compile.c":5964:9 120 {addsi3}
> >>      (nil))
> >> (insn 194 191 195 29 (set (reg/f:SI 20 %r20 [480])
> >>         (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> >>             (const_int 396 [0x18c]))) "../Python/compile.c":5964:9 120 {addsi3}
> >>      (nil))
> >>
> >> They are used in the following insns before call to compiler_visit_expr1:
> >>
> >> (insn 242 238 258 32 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int *)prephitmp_37 + 388B]+0 S4 A32])
> >>         (reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173])) "../Python/compile.c":5968:22 42 {*pa.md:2193}
> >>      (expr_list:REG_DEAD (reg:SI 23 %r23 [orig:173 vect__102.2442 ] [173])
> >>         (expr_list:REG_DEAD (reg/f:SI 22 %r22 [478])
> >>             (nil))))
> >> (insn 258 242 246 32 (set (reg:SI 26 %r26)
> >>         (reg/v/f:SI 5 %r5 [orig:198 c ] [198])) "../Python/compile.c":5969:15 42 {*pa.md:2193}
> >>      (nil))
> >> (insn 246 258 250 32 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int *)prephitmp_37 + 392B]+0 S4 A32])
> >>         (reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169])) "../Python/compile.c":5968:22 42 {*pa.md:2193}
> >>      (expr_list:REG_DEAD (reg:SI 29 %r29 [orig:169 vect__102.2443 ] [169])
> >>         (expr_list:REG_DEAD (reg/f:SI 21 %r21 [479])
> >>             (nil))))
> >> (insn 250 246 254 32 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int *)prephitmp_37 + 396B]+0 S4 A32])
> >>         (reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145])) "../Python/compile.c":5968:22 42 {*pa.md:2193}
> >>      (expr_list:REG_DEAD (reg:SI 31 %r31 [orig:145 vect__102.2444 ] [145])
> >>         (expr_list:REG_DEAD (reg/f:SI
> >> 20 %r20 [480])
> >>             (nil))))
> >>
> >> After the call, we have:
> >>
> >> (insn 1241 269 273 30 (set (reg/f:SI 22 %r22 [478])
> >>         (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])) "../Python/compile.c":5970:20 -1
> >>      (nil))
> >> (insn 273 1241 1242 30 (set (mem:SI (plus:SI (reg/f:SI 22 %r22 [478])
> >>                 (const_int 388 [0x184])) [4 MEM[(int *)_107 + 388B]+0 S4 A32])
> >>         (reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167])) "../Python/compile.c":5970:20 42 {*pa.md:2193}
> >>      (nil))
> >> (insn 1242 273 277 30 (set (reg/f:SI 21 %r21 [479])
> >>         (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])) "../Python/compile.c":5970:20 -1
> >>      (nil))
> >> (insn 277 1242 1243 30 (set (mem:SI (plus:SI (reg/f:SI 21 %r21 [479])
> >>                 (const_int 392 [0x188])) [4 MEM[(int *)_107 + 392B]+0 S4 A32])
> >>         (reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156])) "../Python/compile.c":5970:20 42 {*pa.md:2193}
> >>      (nil))
> >> (insn 1243 277 281 30 (set (reg/f:SI 20 %r20 [480])
> >>         (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])) "../Python/compile.c":5970:20 -1
> >>      (nil))
> >> (insn 281 1243 299 30 (set (mem:SI (plus:SI (reg/f:SI 20 %r20 [480])
> >>                 (const_int 396 [0x18c])) [4 MEM[(int *)_107 + 396B]+0 S4 A32])
> >>         (reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134])) "../Python/compile.c":5970:20 42 {*pa.md:2193}
> >>      (nil))
> >>
> >> We have lost the offsets that were added initially to r20, r21 and r22.
> >>
> >> Previous ce3 pass had:
> >>
> >> (insn 272 269 273 30 (set (reg/f:SI 22 %r22 [478])
> >>         (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> >>             (const_int 388 [0x184]))) "../Python/compile.c":5970:20 120 {addsi3}
> >>      (nil))
> >> (insn 273 272 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) [4 MEM[(int *)_107 + 388B]+0 S4 A32])
> >>         (reg:SI 14 %r14 [orig:167 vect_pretmp_36.2450 ] [167])) "../Python/compile.c":5970:20 42 {*pa.md:2193}
> >>      (nil))
> >> (insn 276 273 277 30 (set (reg/f:SI 21 %r21 [479])
> >>         (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> >>             (const_int 392 [0x188]))) "../Python/compile.c":5970:20 120 {addsi3}
> >>      (nil))
> >> (insn 277 276 280 30 (set (mem:SI (reg/f:SI 21 %r21 [479]) [4 MEM[(int *)_107 + 392B]+0 S4 A32])
> >>         (reg:SI 13 %r13 [orig:156 vect_pretmp_36.2451 ] [156])) "../Python/compile.c":5970:20 42 {*pa.md:2193}
> >>      (nil))
> >> (insn 280 277 281 30 (set (reg/f:SI 20 %r20 [480])
> >>         (plus:SI (reg/f:SI 19 %r19 [orig:127 prephitmp_37 ] [127])
> >>             (const_int 396 [0x18c]))) "../Python/compile.c":5970:20 120 {addsi3}
> >>      (nil))
> >> (insn 281 280 284 30 (set (mem:SI (reg/f:SI 20 %r20 [480]) [4 MEM[(int *)_107 + 396B]+0 S4 A32])
> >>         (reg:SI 12 %r12 [orig:134 vect_pretmp_36.2452 ] [134])) "../Python/compile.c":5970:20 42 {*pa.md:2193}
> >>      (nil))
> >>
> >> So, this is a f-m-o bug.
> > Hi Dave,
> >
> > I don't see an f-m-o bug here. The offsets aren't lost, they're just moved in
> > the corresponding memory loads/stores. If you look the stores in ce3 they
> > don't have offsets whereas after f-m-o they have. E.g. in ce3: (insn 273 272
> > 276 30 (set (mem:SI (reg/f:SI 22 %r22 [478]) ...) but in f-m-o it is (insn 273
> > 1241 1242 30 (set (mem:SI (plus:SI (reg/f:SI 22 %r22 [478]) (const_int 388
> > [0x184]) ...).
> >
> > This is the way that f-m-o works.
> > It can also be seen in the f-m-o dumps, where
> > offsets changes to memory ops are reported as 'Memory offset changed' and
> > instructions which got their offset propagated (like insns 272, 276, 280) are
> > reported as 'Instruction folded':
> Hi Manolis,
>
> If you look at the f-m-o transformation applied to insn 272 and insn 273,
> you will see that "reg/f:SI 22 %r22 [478]" is not dead after these insns.
> The transformation changes the value of r22 which is wrong without changing
> all uses of the register and adjusting the other sets for the register.
> It only changed the use in insn 273 and not the uses earlier in the loop.

I see, thanks for pointing that out!

I'll debug this further and see why f-m-o's use detection code misses this case.

Manolis
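
For illustration only, the failure mode Dave describes can be modelled in a few lines of Python. This is a rough sketch under stated assumptions, not GCC code: the control flow is simplified to a single loop, `BASE` is a made-up value standing in for r19, and `run_loop` and `partial_fold` are hypothetical names. It only tracks which addresses get written through r22.

```python
BASE = 1000  # hypothetical value held in r19


def run_loop(partial_fold, iterations=2):
    """Return the set of addresses written through r22 in a toy loop."""
    written = set()
    # Definition outside the loop (compare insn 186): r22 = r19 + 388.
    r22 = BASE + 388
    for _ in range(iterations):
        # Use at the top of the loop (compare insn 242): store to [r22].
        written.add(r22)
        # A call clobbers r22 here, so the code after it sets r22 again.
        if partial_fold:
            # Folded form (compare insns 1241 and 273): r22 = r19, and only
            # this one store gets the +388 moved into its address.
            r22 = BASE
            written.add(r22 + 388)
        else:
            # Unfolded form (compare insns 272 and 273 in the ce3 dump).
            r22 = BASE + 388
            written.add(r22)
    return written


print(sorted(run_loop(partial_fold=False)))  # [1388]
print(sorted(run_loop(partial_fold=True)))   # [1000, 1388]
```

In this model the extra write to address 1000 comes from the second iteration: the store at the top of the loop still uses r22 directly, but r22 now holds the base without the offset, because the fold adjusted only one of its uses.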