From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/111422] Wrong code at -O3 on x86_64-linux-gnu
Date: Tue, 11 Jun 2024 10:36:49 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization, needs-bisection, wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111422

--- Comment #7 from GCC Commits ---
The releases/gcc-12 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:170c2bba7cb85b3ac9380a7d5a1c6d82b3c6aa63

commit r12-10506-g170c2bba7cb85b3ac9380a7d5a1c6d82b3c6aa63
Author: Jakub Jelinek
Date:   Tue Jan 16 11:49:34 2024 +0100

    cfgexpand: Workaround CSE of ADDR_EXPRs in VAR_DECL partitioning [PR113372]

    The following patch adds a quick workaround to bugs in VAR_DECL partitioning.
    The problem is that there is no dependency between ADDR_EXPRs of local
    decls and CLOBBERs of those vars, so VN can CSE uses of ADDR_EXPRs
    (including ivopts integral variants thereof), which can break
    add_scope_conflicts discovery of what variables are actually used
    in a certain region.
    E.g. we can have
      ivtmp.40_3 = (unsigned long) &MEM [(void *)&bitint.6 + 8B];
    ...
      uses of ivtmp.40_3
    ...
      bitint.6 ={v} {CLOBBER(eos)};
    ...
      ivtmp.28_43 = (unsigned long) &MEM [(void *)&bitint.6 + 8B];
    ...
      uses of ivtmp.28_43
    before VN (such as dom3), which the add_scope_conflicts code identifies as
    2 independent uses of the bitint.6 variable (which is correct), but then VN
    determines ivtmp.28_43 is the same as ivtmp.40_3 and just uses ivtmp.40_3
    even in the second region; at that point add_scope_conflicts thinks the
    bitint.6 variable is not used in that region anymore.

    The following patch does a simple single def-stmt check for such ADDR_EXPRs
    (rather than say trying to do a full propagation of what SSA_NAMEs can
    contain ADDR_EXPRs of local variables), which seems to work around all 4 PRs.

    In addition to this patch I've used the attached one to gather statistics
    on the total size of all variable partitions in a function, and it seems
    that besides the new testcases nothing is really affected compared to no
    patch (I've actually just modified the patch to == OMP_SCAN instead of
    == ADDR_EXPR, so it looks the same except that it never triggers).
    The comparison wasn't perfect because I've only gathered BITS_PER_WORD,
    main_input_filename (did some replacement of build directories and
    /tmp/ccXXXXXX names of LTO to make it more similar between the two
    bootstraps/regtests), current_function_name and the total size of all
    variable partitions if any, because I didn't record e.g. the optimization
    options and so e.g.
    torture tests which iterate over options could have different partition
    sizes even in one compiler when BITS_PER_WORD, main_input_filename and
    current_function_name are all equal. So I had to write an awk script to
    check if the first triple in the second build appeared in the first one
    and the quadruple in the second build appeared in the first one too,
    otherwise print the result, and that only triggered in the new tests.
    Also, the cc1plus binary according to objdump -dr is identical between the
    two builds except for the ADDR_EXPR vs. OMP_SCAN constant in the two spots.

    2024-01-16  Jakub Jelinek

            PR tree-optimization/113372
            PR middle-end/90348
            PR middle-end/110115
            PR middle-end/111422
            * cfgexpand.cc (add_scope_conflicts_2): New function.
            (add_scope_conflicts_1): Use it.

            * gcc.c-torture/execute/pr90348.c: New test.
            * gcc.c-torture/execute/pr110115.c: New test.
            * gcc.c-torture/execute/pr111422.c: New test.

    (cherry picked from commit 1251d3957de04dc9b023a23c09400217e13deadb)
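
The triple/quadruple comparison between the two builds, described in the
commit message above, can be sketched roughly as follows. This is not the
awk script that was actually used (it is not part of the message); it is a
minimal Python sketch under assumptions. The record layout, the function
name unmatched_records and the sample values are all hypothetical. Only the
four field names come from the message.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout, following the four fields named in the message:
# (BITS_PER_WORD, main_input_filename, current_function_name, total partition size)
Record = Tuple[int, str, str, int]


def unmatched_records(first: Iterable[Record], second: Iterable[Record]) -> List[Record]:
    """Return records of the second build that have no counterpart in the first.

    A record counts as matched only if its leading triple and its full
    quadruple both occur somewhere in the first build. The same triple may
    occur several times with different sizes (e.g. torture tests that
    iterate over options), so plain set membership is used.
    """
    first = list(first)
    triples = {rec[:3] for rec in first}
    quadruples = set(first)
    return [
        rec
        for rec in second
        if rec[:3] not in triples or rec not in quadruples
    ]


if __name__ == "__main__":
    # Made-up sample data; the sizes are illustrative only.
    first_build = [
        (64, "pr90348.c", "main", 32),
        (64, "a.c", "foo", 16),
        (64, "a.c", "foo", 48),
    ]
    second_build = [
        (64, "pr90348.c", "main", 16),
        (64, "a.c", "foo", 48),
        (64, "a.c", "foo", 16),
    ]
    for rec in unmatched_records(first_build, second_build):
        print(rec)
```

With the sample data, only the pr90348.c record is reported: its triple is
known to the first build but its size differs, whereas both a.c records match
regardless of the order in which the option variants were recorded.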