From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/108601] [13 Regression] vector peeling ICEs with VLA in gcc_r in SPEC2017 since g:c13223b790bbc5e4a3f5605e057eac59b61b2c85
Date: Wed, 01 Feb 2023 08:41:37 +0000
X-Bugzilla-Keywords: ice-on-valid-code
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Target-Milestone: 13.0

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108601

--- Comment #13 from Richard Biener ---
(In reply to Tamar Christina from comment #7)
> (In reply to Andrew Pinski from comment #1)
> > So here is how I would tackle this:
> > Put all the needed .i/.ii files in a response file.
> >
> >
> > $CC -c @files @options
> > $CC -r -o file.o @fileso @options
> >
> > Since this is only at profile generated stage it is not as hard ...
> > Then start by reducing the needed .o files in `fileso` .
> > When that is finished. Update `files` to match `fileso`.
> > and then run delta (or another automated reducer) over the files in `files`.
> > Maybe even change -flto=auto etc.
>
> Thanks! Managed to reduce it to something fairly simple.
>
> Repro:
>
> ----
>
> decode_options() {
>   int flag = 1;
>   for (; flag <= 1 << 21; flag <<= 1)
>     ;
> }
>
> ----
>
> compile with gcc -fprofile-generate -mcpu=neoverse-v1 -Ofast opts.i

OK so after _very_ many analyses we get

t.c:3:15: note:   ***** Choosing vector mode VNx2DI
t.c:3:15: note:   ***** Re-trying epilogue analysis with vector mode VNx2DI
...

but then vect_can_advance_ivs_p should return false for the VNx2DI mode
vectorized loop and thus no epilogue peeling possible?  We also do
not choose any epilogue vector mode in the end, so the issue isn't really
epilogue vectorization related.

I suppose the VNx2DI vector loop doesn't use fully masked vectorization
but we should have forced that because we cannot create an epilogue.

That boils down to vect_can_peel_nonlinear_iv_p but oddly enough that's
called from vectorizable_nonlinear_induction itself, possibly because
it calls vect_peel_nonlinear_iv_init.  But then this should never happen
because peeling should be disabled if not possible (but in this context
we don't know whether we actually need to peel).

I think we should remove the vect_can_peel_nonlinear_iv_p call from
vectorizable_nonlinear_induction and adjust vect_can_peel_nonlinear_iv_p
to require a .is_constant () VF.
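
For readers following along, here is a small standalone sketch of the induction pattern in the reproducer. It is not GCC code: the names count_iterations and iv_after are hypothetical and exist only for this illustration. The point is that flag is a nonlinear (shift) induction variable, so its value after skipping some number of iterations has a closed form, but evaluating that closed form up front needs the skip count as a plain integer. As far as I understand it, that is the situation peeling is in, and with a VLA mode such as VNx2DI the VF is not a compile-time constant.

```c
#include <assert.h>

/* Same loop shape as the reproducer: `flag` starts at 1 and is shifted
   left by one each iteration until it exceeds 1 << 21.  */
int count_iterations(void)
{
    int n = 0;
    for (int flag = 1; flag <= 1 << 21; flag <<= 1)
        n++;
    return n;
}

/* Hypothetical helper, not a GCC function: value of a shift-by-one
   induction variable after `skip` scalar iterations.  Assumes a
   non-negative `init` and a result that fits in int.  */
int iv_after(int init, unsigned skip)
{
    assert(skip < 31);   /* keep the shift count in range for a 32-bit int */
    return init << skip;
}
```

Here count_iterations() returns 22, and iv_after(1, 22) equals 1 << 22, the first value of flag that fails the loop condition. Both only work out because the iteration count is an ordinary integer.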