From: "hubicka at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/110649] [14 Regression] 25% sphinx3 spec2006 regression on Ice Lake and zen between g:acaa441a98bebc52 (2023-07-06 11:36) and g:55900189ab517906 (2023-07-07 00:23)
Date: Sun, 16 Jul 2023 20:06:34 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110649

--- Comment #7 from Jan Hubicka ---
I found why the vectorizer gets the profile scaling of the vectorized epilogue wrong. The culprit is scale_profile_for_vect_loop, which uses niter_for_unrolled_loop; that function does not understand that if the iteration count is not divisible, the epilogue (unless the loop is masked) will use the count.
The upper bound computation is actually right in the update of loop_info, so we can just use it directly instead of relying on niter_for_unrolled_loop.

The wrong profile in

;; basic block 14, loop depth 2, count 13764235 (guessed, freq 1.9247), maybe hot
;; Invalid sum of incoming counts 25234431 (guessed, freq 3.5286), should be 13764235 (guessed, freq 1.9247)

is caused by loop peeling. The unrolled loop is peeled 4 times, which seems like a reasonable idea, but I am not sure why the profile is not updated correctly here.
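
To make the divisibility point concrete, here is a minimal sketch of the arithmetic only. It is not GCC code: the struct and function names are hypothetical, and it assumes that "not divisible" refers to the vectorization factor, that the loop is not masked, and that no iterations are peeled for alignment.

```c
#include <assert.h>

/* Hypothetical illustration; these names do not exist in GCC.  */
struct iter_split
{
  unsigned long vector_iters;   /* iterations of the vectorized loop body */
  unsigned long epilogue_iters; /* leftover scalar iterations for the epilogue */
};

/* Split NITER scalar iterations between a vector loop with vectorization
   factor VF and its epilogue.  Assumes an unmasked loop with no peeling
   for alignment.  */
static struct iter_split
split_iterations (unsigned long niter, unsigned long vf)
{
  struct iter_split s;

  assert (vf > 0);
  s.vector_iters = niter / vf;   /* full vector iterations */
  s.epilogue_iters = niter % vf; /* nonzero only when VF does not divide NITER */
  return s;
}
```

With niter = 10 and vf = 4 the vector loop runs 2 times and 2 scalar iterations are left for the epilogue; with niter = 8 the epilogue gets none. An estimate that ignores the remainder attributes no iterations to the epilogue in the first case.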