From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled
Date: Fri, 27 Jan 2023 15:13:26 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552

--- Comment #26 from Richard Biener ---
And yes, to IV optimization the gcov counter for the loop body is just another IV candidate that can be used, and in this case it allows eliding the otherwise unused original IV.
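
To make the IV point concrete, here is a minimal C sketch of the transformation at the source level. It is not the kernel code and not GCC output: `body_counter`, `work_done`, and both function names are hypothetical stand-ins, and the second function is a hand-written approximation of what IV optimization may do when it picks the counter as the controlling IV.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a gcov counter that is bumped once per loop body. */
static int64_t body_counter;

/* Hypothetical stand-in for the real work done in the loop body. */
static volatile unsigned long work_done;

/* Original shape: i is used only to control the loop, never in the body. */
static void loop_with_original_iv(unsigned int n)
{
    int64_t c = body_counter;
    for (unsigned int i = 1; i < n; i++) {
        work_done++;
        c++;                    /* profiling counter update */
    }
    body_counter = c;
}

/* Hand-written equivalent: the counter itself drives the exit test,
 * so the original IV i is no longer needed. */
static void loop_with_counter_as_iv(unsigned int n)
{
    int64_t c = body_counter;
    if (n > 1) {
        int64_t end = c + (int64_t)(n - 1);   /* counter value at loop exit */
        do {
            work_done++;
            c++;
        } while (c != end);
    }
    body_counter = c;
}
```

Both functions run the body the same number of times and leave the same counter value. The second form is only correct if nothing else changes the counter while the loop runs, which is why it matters whether the counter lives in a register or is re-read from memory.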
Now, in principle we should have applied store motion and not only PRE, which would have avoided the issue by not tricking the RA into reloading the value from where we store it in the loop. But the kernel uses -fno-tree-loop-im, which prevents that. If you enable it (that is, drop -fno-tree-loop-im) you'd get

  [local count: 105119324]:
  __gcov0.prep_compound_page_I_lsm.1755_4 = __gcov0.prep_compound_page[7];
  _92 = (long unsigned int) page_12(D);
  _57 = _92 + 1;
  _119 = page_12(D) + 40;
  ivtmp.1762_136 = (unsigned int) _119;

  [local count: 955630225]:
  # i_66 = PHI
  # ivtmp.1762_6 = PHI
  p_15 = (struct page *) ivtmp.1762_6;
  MEM [(union *)p_15 + 12B] = 1024B;
  MEM[(volatile long unsigned int *)p_15 + 4B] ={v} _57;
  i_17 = i_66 + 1;
  ivtmp.1762_46 = ivtmp.1762_6 + 40;
  if (nr_pages_11 != i_17)
    goto ; [89.00%]
  else
    goto ; [11.00%]

  [local count: 105119324]:
  _73 = (unsigned int) nr_pages_11;
  _163 = _73 + 4294967294;
  _159 = (long long int) _163;
  _1 = __gcov0.prep_compound_page_I_lsm.1755_4 + 1;
  PROF_edge_counter_74 = _1 + _159;
  __gcov0.prep_compound_page[7] = PROF_edge_counter_74;

which is the desired optimization, handling the counter in the loop like an induction variable instead of going through memory.
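
The effect of store motion on the counter can be sketched in plain C. This is an illustration under assumptions, not the kernel source: `edge_counter` stands in for `__gcov0.prep_compound_page[7]`, `sink` stands in for the per-page stores, `nr_pages` is taken as `unsigned int`, and all function names are hypothetical. The third variant mirrors the arithmetic in the last block of the dump (`+ 4294967294` is `nr_pages - 2` modulo 2^32), and like that block it is only valid once the loop has run at least once.

```c
#include <assert.h>
#include <stdint.h>

static int64_t edge_counter;          /* hypothetical gcov edge counter */
static volatile unsigned long sink;   /* hypothetical per-page work */

/* No store motion: the counter is loaded and stored on every iteration. */
static void count_through_memory(unsigned int nr_pages)
{
    for (unsigned int i = 1; i < nr_pages; i++) {
        sink = i;
        edge_counter = edge_counter + 1;
    }
}

/* Store motion by hand: load once before the loop, store once after it. */
static void count_in_register(unsigned int nr_pages)
{
    int64_t lsm = edge_counter;
    for (unsigned int i = 1; i < nr_pages; i++) {
        sink = i;
        lsm = lsm + 1;
    }
    edge_counter = lsm;
}

/* Counter treated as an induction variable: its final value is computed
 * after the loop, following the shape of the exit block in the dump. */
static void count_closed_form(unsigned int nr_pages)
{
    if (nr_pages < 2)
        return;                        /* loop body never runs */
    int64_t lsm = edge_counter;
    for (unsigned int i = 1; i < nr_pages; i++)
        sink = i;
    uint32_t extra = (uint32_t)nr_pages + UINT32_C(4294967294);
    edge_counter = (lsm + 1) + (int64_t)extra;
}

/* Run one variant from a known starting counter value. */
static int64_t run(void (*fn)(unsigned int), unsigned int nr_pages, int64_t start)
{
    edge_counter = start;
    fn(nr_pages);
    return edge_counter;
}

/* All three variants must leave the same counter value. */
static void check_equivalence(unsigned int nr_pages, int64_t start)
{
    int64_t a = run(count_through_memory, nr_pages, start);
    int64_t b = run(count_in_register, nr_pages, start);
    int64_t c = run(count_closed_form, nr_pages, start);
    assert(a == b && b == c);
}
```

Calling `check_equivalence(1024, 5)` exercises all three variants; each should leave the counter at 5 + 1023.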