From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 595B2384AB41; Wed, 8 May 2024 18:10:23 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 595B2384AB41 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1715191823; bh=7bLNXucjRX1bXdipwjGuL1QWEyvUcSPscacEeLyPDSA=; h=From:To:Subject:Date:In-Reply-To:References:From; b=jleotr5wB9kJV/AItachX7K7di2DwYVBPHcpAWJFa2aYLh4M9LtaesRkNM2YNepdT vsIqazcxWBJZFZIsio1DwcG9ksEHP5qcaS8GZQfOdItsF7CY90N03UY/+WMigmxaII 24WzXp4UMp52LuBpzHm+/zboMWFTEsfvaqQ+XVRw= From: "slyfox at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/114872] [13/14/15 Regression] Miscompilation with -O2 after commit r13-8037 Date: Wed, 08 May 2024 18:10:22 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: tree-optimization X-Bugzilla-Version: 13.2.1 X-Bugzilla-Keywords: wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: slyfox at gcc dot gnu.org X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: 13.3 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D114872 --- Comment #25 from Sergei Trofimovich --- (In reply to Richard Biener from comment #24) > (In reply to Sergei Trofimovich from comment #23) > [...] > > Why did `gcc` generate unconditional NULL dereference here? I suspect it > > somehow inferred that `__pyx_t_6 =3D NULL;` in that branch, but not bef= ore > > comparison. >=20 > That's what happens if we isolate an unreachable path because of a NULL > dereference (like if exposed by jump-threading). We make the NULL > dereference volatile so it stays but DCE/DSE can cleanup code on the path > leading to it. >=20 > If you run into such path the this might suggest that jump-threading > triggered > a problem with the setjmp/longjmp, so it's then likely some condition tha= t's > evaluated in a wrong way after the longjmp, either because a dependent > value wasn't properly preserved or by GCC breaking that. Seeing stack me= mory > arguments used on a call in a previous comment Yeah, that makes sense. Having stared a bit more at __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__() I think I get the problem now. We deal with the code similar to the following: __pyx_pf_4sage_4libs_3gap_7element_19GapElement_Function_2__call__() { __pyx_t_6 =3D NULL; // the loop is not very important, but it forces `__pyx_t_6` initializa= tion before `setjmp()` for (;;) { __pyx_t_6 =3D something(); int done =3D use_pre_setjmp(__pyx_t_6); __pyx_t_6 =3D NULL; if (done) break; } int mode =3D setjmp(jb); switch (mode) { case 1: // longjmp() case break; case 0: // regular case (`case 3:` in real code) __pyx_t_6 =3D something_else(); // set __pyx_t_6 to non-zero int done =3D use_post_setjmp(__pyx_t_6); // call longjmp(jb, 1) here __pyx_t_6 =3D NULL; break; } // get here via longjmp() if (__pyx_t_6 !=3D NULL) deref(__pyx_t_6); } AFAIU `gcc` is smart enough to see that all paths to `deref()` reach with `__pyx_t_6 =3D NULL`, but it does not eliminate the `deref()` entirely and uses `if (__pyx_t_6 !=3D NULL) deref(NULL);` as Richard explained above. Now due to `longjmp()` `__pyx_t_6 =3D NULL;` does not get executed (even though it's present in assembly code as `movq $0, -200(%rbp)` in all the places where it's present in C code. As a result after the `longjmp()` `__pyx_t_6` is not `NULL` and we get to `deref(NNULL)` and SIGSEGV. Thus it's a matter of missing `volatile __pyx_t_6`. Sounds about right? > I wondered if POSIX suggests > that even non-register variables need to be made volatile and thus whether > SRA or FRE might impose problems with code using setjmp/longjmp. That matches my understanding as well. Would it be fair to say that sprinkl= ing `volatile` has to be done for every single local variable in the function to prevent possible stack reuse? And that rule should extend to the functions that cou= ld host an inline variant of the function using setjmp()/longjmp() and not just immediate caller of setjmp()/longjmp()?=