From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 620DA3858D38; Wed,  1 Feb 2023 08:41:41 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 620DA3858D38
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1675240901;
	bh=pwe7NGZftxDoQ5O35FKHRSRVWl0xyZhXehESw5zNB6M=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=ZiY8EhNrl4uDeVmtRw6UlzAg7wrdJ3vK1PyJNETMPZe4E7ROhl/s1aTwfL0B773Yr
	 7OUOQufVojsoSvdVqQxfSZ4pWlXH0DaJeb0VPJ0icRPc3AUyL7w48/sMeEM9Gs+8p0
	 2sRzCH+7ZgxHXCxhmWLEKrjXNRrieOA8jcaPBSVY=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/108601] [13 Regression] vector peeling ICEs
 with VLA in gcc_r in SPEC2017 since
 g:c13223b790bbc5e4a3f5605e057eac59b61b2c85
Date: Wed, 01 Feb 2023 08:41:37 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: ice-on-valid-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-108601-4-AKyreuLGSj@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-108601-4@http.gcc.gnu.org/bugzilla/>
References: <bug-108601-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108601
--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #7)
> (In reply to Andrew Pinski from comment #1)
> > So here is how I would tackle this:
> > Put all the needed .i/.ii files in a response file.
> >
> >
> > $CC -c @files @options
> > $CC -r -o file.o @fileso @options
> >
> > Since this is only at profile generated stage it is not as hard ...
> > Then start by reducing the needed .o files in `fileso` .
> > When that is finished. Update `files` to match `fileso`.
> > and then run delta (or another automated reducer) over the files in `files`.
> > Maybe even change -flto=auto etc.
>
> Thanks! Managed to reduce it to something fairly simple.
>
> Repro:
>
> ----
>
> decode_options() {
>   int flag = 1;
>   for (; flag <= 1 << 21; flag <<= 1)
>     ;
> }
>
> ----
>
> compile with gcc -fprofile-generate -mcpu=neoverse-v1 -Ofast opts.i

OK so after _very_ many analyses we get

t.c:3:15: note:  ***** Choosing vector mode VNx2DI
t.c:3:15: note:  ***** Re-trying epilogue analysis with vector mode VNx2DI
...

but then vect_can_advance_ivs_p should return false for the VNx2DI mode
vectorized loop, and thus no epilogue peeling should be possible?

We also do not choose any epilogue vector mode in the end, so the issue
isn't really epilogue vectorization related.  I suppose the
VNx2DI vector loop doesn't use fully masked vectorization but we should
have forced that because we cannot create an epilogue.
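
To illustrate the distinction between a vector loop with an epilogue and a
fully masked one, here is a minimal scalar sketch in plain C.  It is not
vectorizer code: the fixed lane count VF = 4 and both function names are
assumptions made for illustration only (with SVE the lane count is not
known at compile time).

```c
#include <stddef.h>

enum { VF = 4 };  /* assumed fixed lane count, for illustration only */

/* Main "vector" loop over whole groups of VF elements, followed by a
   scalar epilogue that handles the leftover n % VF elements.  */
static long sum_with_epilogue(const int *a, size_t n)
{
  long sum = 0;
  size_t i = 0;
  for (; i + VF <= n; i += VF)
    for (size_t lane = 0; lane < VF; lane++)
      sum += a[i + lane];
  for (; i < n; i++)  /* epilogue */
    sum += a[i];
  return sum;
}

/* Fully masked variant: every iteration covers VF lanes and lanes past
   the end are predicated off, so no epilogue loop is needed.  */
static long sum_fully_masked(const int *a, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i += VF)
    for (size_t lane = 0; lane < VF; lane++)
      if (i + lane < n)  /* per-lane predicate */
        sum += a[i + lane];
  return sum;
}
```

Both functions compute the same sum for any n; the masked form simply never
needs a separate remainder loop, which is the property that matters when an
epilogue cannot be created.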

That boils down to vect_can_peel_nonlinear_iv_p but oddly enough that's
called from vectorizable_nonlinear_induction itself, possibly because
it calls vect_peel_nonlinear_iv_init.  But then this should never happen
because peeling should be disabled if not possible (but in this context
we don't know whether we actually need to peel).

I think we should remove the vect_can_peel_nonlinear_iv_p call from
vectorizable_nonlinear_induction and adjust vect_can_peel_nonlinear_iv_p
to require a .is_constant () VF.
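
As a rough illustration of why skipping iterations of a nonlinear IV wants
a known iteration count, here is a small sketch in plain C for a shift
induction like the `flag <<= 1` in the repro.  The function names are
hypothetical and this does not mirror the GCC implementation; it only shows
that the adjusted initial value is a function of the number of skipped
iterations, which has to be available when the init is computed.

```c
#include <stdint.h>

/* Scalar reference: value of an `x <<= 1` induction after n iterations.  */
static uint32_t shl_iv_after(uint32_t init, unsigned n)
{
  uint32_t x = init;
  for (unsigned i = 0; i < n; i++)
    x <<= 1;
  return x;
}

/* Closed form a loop resuming after n skipped iterations would start
   from.  A shift by >= 32 is undefined in C for a 32-bit type, so that
   case is handled explicitly: every bit has been shifted out.  */
static uint32_t shl_iv_peeled_init(uint32_t init, unsigned n)
{
  return n < 32 ? init << n : 0;
}
```

For a linear IV the corresponding adjustment is just init + n * step, which
is easy to emit for a runtime n; for shift or multiply style inductions the
closed form depends on n in a less convenient way, as the range check above
hints.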