From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/105219] [12 Regression] SVE: Wrong code with -O3 -msve-vector-bits=128 -mtune=thunderx
Date: Wed, 27 Apr 2022 12:30:19 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105219

--- Comment #17 from Richard Biener ---
(In reply to Richard Biener from comment #16)
> (In reply to rsandifo@gcc.gnu.org from comment #15)
> > (In reply to Richard Biener from comment #14)
> > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > > index d7bc34636bd..3b63ab7b669 100644
> > > --- a/gcc/tree-vect-loop.cc
> > > +++ b/gcc/tree-vect-loop.cc
> > > @@ -9977,7 +9981,7 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple
> > > *loop_vectorized_call)
> > > lowest_vf) - 1
> > > : wi::udiv_floor (loop->nb_iterations_upper_bound +
> > > bias_for_lowest,
> > > lowest_vf) - 1);
> > > - if (main_vinfo)
> > > + if (main_vinfo && !main_vinfo->peeling_for_alignment)
> > > {
> > > unsigned int bound;
> > > poly_uint64 main_iters

> > It might be better to add the maximum peeling amount to main_iters.
> > Maybe you'd prefer this anyway for GCC 12 though.
> >
> > I wonder if there's a similar problem for peeling for gaps,
> > in cases where the epilogue doesn't need the same peeling.
>
> I don't quite understand the code in if (main_vinfo) but the point is
> that for our case main_iters is zero (and so is prologue_iters if that
> would exist). I'm not sure how the code can be adjusted with that
> given it computes upper bounds and uses min() for the upper bound
> of the epilogue - we'd need to adjust that with a max (2*vf-2,
> old-upper-bound)
> when there's prologue peeling and the short cut exists (I don't actually
> compute that).

That is, the code does

  if (can_div_away_from_zero_p (main_iters,
                                LOOP_VINFO_VECT_FACTOR (loop_vinfo),
                                &bound))
    loop->nb_iterations_upper_bound
      = wi::umin ((widest_int) (bound - 1),
                  loop->nb_iterations_upper_bound);

and so assumes that the scalar epilogue never runs for more than
epilogue VF - 1 times which is wrong. So I simply gated this whole code.
But you are right that peeling for gaps would need similar handling
so I'll play safe and add && !main_vinfo->peeling_for_gaps.

>
> peeling for gaps means we run the epilogue for main VF more iterations,
> but that would just mean the vectorized epilogue executes one more time
> and has peeling for gaps applied as well, so the scalar epilogue runs
> for epilogue VF more iterations.
>
> I'm not sure what conditions prevent epilogue vectorization but I think
> there were some at least.
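
The bound computation quoted in this comment can be sketched in isolation.
What follows is a minimal C++ model, not the GCC implementation: plain 64-bit
integers stand in for poly_uint64 and widest_int, and the names MainLoopInfo,
div_away_from_zero and epilogue_upper_bound are hypothetical, invented only
for illustration. The guard against a zero quotient is likewise an assumption
of this sketch. It shows the round-up division, the umin-style clamp, and the
gating on both peeling flags that the comment proposes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the main loop's loop_vec_info; only the
// fields mentioned in the thread are modelled.
struct MainLoopInfo {
  std::uint64_t main_iters;    // value fed into the division (assumed constant here)
  bool peeling_for_alignment;  // main loop peels a prologue for alignment
  bool peeling_for_gaps;       // main loop peels for gaps
};

// Round-up division for unsigned integers. This plays the role that
// can_div_away_from_zero_p has in the quoted code, for plain integers only.
std::uint64_t div_away_from_zero(std::uint64_t a, std::uint64_t b) {
  assert(b != 0);
  return a / b + (a % b != 0 ? 1 : 0);
}

// Returns the (possibly tightened) upper bound for the epilogue loop.
// The clamp is skipped when there is no main loop info or when the main
// loop uses either kind of peeling, mirroring the proposed gating.
std::uint64_t epilogue_upper_bound(const MainLoopInfo *main_vinfo,
                                   std::uint64_t epilogue_vf,
                                   std::uint64_t old_bound) {
  if (main_vinfo
      && !main_vinfo->peeling_for_alignment
      && !main_vinfo->peeling_for_gaps) {
    std::uint64_t bound = div_away_from_zero(main_vinfo->main_iters, epilogue_vf);
    // Assumption of this sketch: skip the clamp when the quotient is zero,
    // so that "bound - 1" cannot wrap around.
    if (bound > 0)
      return std::min(bound - 1, old_bound);
  }
  return old_bound;
}
```

With main_iters = 16, an epilogue VF of 4 and an old bound of 10, the clamp
yields 3 when neither peeling flag is set; setting either flag leaves the old
bound of 10 untouched.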