From: "hubicka at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/106293] [13 regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022
Date: Fri, 04 Aug 2023 10:09:30 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293

Jan Hubicka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13/14 Regression]          |[13 regression] 456.hmmer
                   |456.hmmer at -Ofast         |at -Ofast -march=native
                   |-march=native regressed by  |regressed by 19% on zen2
                   |19% on zen2 and zen3 in     |and zen3 in July 2022
                   |July 2022                   |

--- Comment #26 from Jan Hubicka ---
We are out of
regression finally, but there are still several things to fix:

 1) the vectorizer produces a corrupt profile
 2) loop split is not able to work out that it splits off the last iteration
 3) we work way too hard optimizing loops that iterate 0 times.

The loop in question really iterates zero times.  It is created by loop split
from the internal loop:

    for (k = 1; k <= M; k++) {
      mc[k] = mpp[k-1] + tpmm[k-1];
      if ((sc = ip[k-1] + tpim[k-1]) > mc[k]) mc[k] = sc;
      if ((sc = dpp[k-1] + tpdm[k-1]) > mc[k]) mc[k] = sc;
      if ((sc = xmb + bp[k]) > mc[k]) mc[k] = sc;
      mc[k] += ms[k];
      if (mc[k] < -INFTY) mc[k] = -INFTY;

      dc[k] = dc[k-1] + tpdd[k-1];
      if ((sc = mc[k-1] + tpmd[k-1]) > dc[k]) dc[k] = sc;
      if (dc[k] < -INFTY) dc[k] = -INFTY;

      if (k < M) {
        ic[k] = mpp[k] + tpmi[k];
        if ((sc = ip[k] + tpii[k]) > ic[k]) ic[k] = sc;
        ic[k] += is[k];
        if (ic[k] < -INFTY) ic[k] = -INFTY;
      }
    }

Loop split peels off the last iteration: the for condition is (k <= M) while
we split on (k < M).  M is variable, and nothing seems to be able to optimize
out the second loop after splitting.

My plan is to add a pattern match so that loop split gets this right and
records an upper bound on the iteration count, but first I want to show the
other bugs exposed by this scenario.
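
To make the splitting concrete, here is a reduced, hand-written sketch of the two shapes. It is not GCC's actual output: the function names run_unsplit and run_split, the tail_trips counter, and the placeholder loop bodies are all assumptions made for illustration. Only the exit test (k <= M) and the split condition (k < M) are taken from the loop above.

```c
/* Illustrative only: placeholder bodies stand in for the real hmmer work. */

/* Original shape: exit test is k <= M, the guarded block tests k < M. */
int run_unsplit(int M, int *mc, int *ic)
{
    int k;
    for (k = 1; k <= M; k++) {
        mc[k] = k;              /* unconditional work (placeholder) */
        if (k < M)
            ic[k] = 2 * k;      /* guarded work (placeholder) */
    }
    return k;
}

/* Hand-split shape: the first loop covers the iterations where k < M
   holds, so the guard is dropped; the second loop covers what is left.
   Returns how many times the second loop body runs. */
int run_split(int M, int *mc, int *ic)
{
    int k = 1;
    int tail_trips = 0;         /* hypothetical counter, for the demo only */

    for (; k <= M && k < M; k++) {  /* guard known to be true here */
        mc[k] = k;
        ic[k] = 2 * k;
    }
    for (; k <= M; k++) {           /* guard known to be false here */
        mc[k] = k;
        tail_trips++;
    }
    return tail_trips;
}
```

In this sketch the second loop body runs exactly once for any M >= 1 (and not at all for M < 1), so its back edge is never taken. That is the kind of upper bound on the iteration count that the planned pattern match would let loop split record.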