From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 0F2EB38582A0; Mon, 24 Oct 2022 09:54:34 +0000 (GMT)
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107176] [10/11/12/13 Regression] Wrong code
 at -Os on x86_64-pc-linux-gnu since r7-2012-g43aabfcfd4139e4c
Date: Mon, 24 Oct 2022 09:54:30 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 10.5
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-107176-4-STLroeK0EC@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-107176-4@http.gcc.gnu.org/bugzilla/>
References: <bug-107176-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107176
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> final value replacement:
>   b_lsm.8_26 = PHI <b_lsm.8_15(4)>
>  with expr: 1
>  final stmt:
>   b_lsm.8_26 = 1;
>
> where
>
> (get_scalar_evolution
>   (scalar = b_lsm.8_15)
>   (scalar_evolution = {0, +, 1}_1))
> (chrec_apply
>   (varying_loop = 1)
>   (chrec = {0, +, 1}_1)
>   (x = 1)
>   (res = 1))
>
> and
>
>   <bb 3> [local count: 955630225]:
>   _1 = (unsigned int) b_lsm.8_15;
>   _2 = _1 + 4294967206;
>   # RANGE [irange] long int [0, 4294967295] NONZERO 0xffffffff
>   _12 = (long int) _2;
>   # RANGE [irange] long int [91, 4294967386] NONZERO 0x1ffffffff
>   _3 = _12 + 91;
>
>   <bb 4> [local count: 1073741824]:
>   # b_lsm.8_15 = PHI <0(2), _3(3)>

For comparison in PR66375 we have

<bb 4> :
# c_6 = PHI <0(2), c_12(3)>
a.1_5 = a;
if (a.1_5 <= 12)
  goto <bb 3>; [INV]
else
  goto <bb 5>; [INV]

<bb 3> :
_1 =3D (signed char) c_6;
_2 =3D (int) _1;
c_12 =3D _2 + -11;
_4 =3D a.1_5 + 1;
a =3D _4;
goto <bb 4>;

so the backedge definition is

  (long)((unsigned)IV + -90u) + 91

vs.

  (int)(signed char)IV + -11
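
As a rough illustration, the two backedge definitions can be modelled as
plain C step functions.  This is only a sketch: the function names are made
up for this example, and int64_t/uint32_t/int32_t stand in for long,
unsigned int and int on the assumption of an LP64 target such as
x86_64-pc-linux-gnu.

```c
#include <stdint.h>

/* PR107176 backedge: (long)((unsigned)IV + -90u) + 91.
   Hypothetical helper name; types assume an LP64 target. */
int64_t step_pr107176(int64_t iv)
{
    uint32_t narrow = (uint32_t)iv;   /* truncating conversion, modulo 2^32 */
    narrow += 4294967206u;            /* same bit pattern as adding -90u */
    return (int64_t)narrow + 91;      /* zero-extend, then add in the wide type */
}

/* PR66375 backedge: (int)(signed char)IV + -11.
   The narrowing is spelled out by hand so that the wrap-around does not
   rely on implementation-defined conversion behaviour. */
int32_t step_pr66375(int32_t iv)
{
    int32_t low = (int32_t)((uint32_t)iv & 0xffu);  /* low 8 bits: 0..255 */
    if (low > 127)
        low -= 256;                                 /* reinterpret as signed char */
    return low - 11;                                /* sign-extend, then add in int */
}
```

Feeding the initial value 0 into step_pr107176 gives 4294967297 under this
model, not the 1 that chrec_apply computed from {0, +, 1}_1 for x = 1.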


I think the issue is the CONVERT_EXPR handling in follow_ssa_edge_expr, where
it isn't clear in which cases we can analyze the evolution in the narrower
type and in which cases in the wider one.  In this PR we mishandle the
middle conversion, while in the older case we mishandled the "initial"
conversion.  I suspect that trying to optimize things on-the-fly is
difficult, as is reasoning about the relevant cases there.

I've tried (again) to keep the current evolution tentative until we hit the
loop PHI again when following the use-def chain from the latch definition,
but then we don't even know whether we will have an evolution in the end
(I tried { initial, +, scev_not_known }).  Going fully symbolic leads to the
issue pointed out in comment 6 of PR66375: we'd get
{ (int)(signed char)0, +, -11 }, which isn't what we want.
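
To see why that symbolic result is not what we want, here is a small sketch
comparing the PR66375 recurrence with an affine evolution of initial value 0
and step -11 evaluated in int.  The function names are hypothetical and int
is assumed to be 32 bits wide.

```c
#include <stdint.h>

/* Value of c after n trips around the PR66375 loop, starting from 0,
   applying c = (int)(signed char)c + -11 each time. */
int32_t pr66375_actual(unsigned n)
{
    int32_t c = 0;
    for (unsigned i = 0; i < n; i++) {
        int32_t low = (int32_t)((uint32_t)c & 0xffu);  /* low 8 bits */
        if (low > 127)
            low -= 256;                                /* as signed char */
        c = low - 11;
    }
    return c;
}

/* Value predicted by the affine evolution { 0, +, -11 } evaluated in int. */
int32_t pr66375_symbolic(unsigned n)
{
    return (int32_t)n * -11;
}

/* First trip count in 1..limit where the two disagree, or 0 if none. */
unsigned pr66375_first_mismatch(unsigned limit)
{
    for (unsigned n = 1; n <= limit; n++)
        if (pr66375_actual(n) != pr66375_symbolic(n))
            return n;
    return 0;
}
```

Under these assumptions the two agree while the value still fits in signed
char and part ways once the narrowing conversion starts to wrap.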

If, as in this bug, we have two evolutions in different types, we probably
have to give up.  Maybe we need to think of a PLUS as { unknown, +, val }
and instead fend off "wrongly" typed evolutions when we reach the
loop PHI node again.  But then the PR66375 case _does_ have an expression
correctly describing the evolution of the IV.  In this PR's case, though, I
think there is none.  Or rather, I guess it would be
{ (long){ 0u, +, -90u }_1, +, 90 }_1, but that at least wouldn't be affine.
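
A quick way to check the "not affine" part is to iterate this PR's backedge
definition and compare successive differences.  The sketch below uses
made-up function names and assumes a 64-bit long and a 32-bit unsigned int;
it models the GIMPLE above and is not the SCEV code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* One trip around the loop: (long)((unsigned)IV + -90u) + 91. */
int64_t pr107176_next(int64_t iv)
{
    uint32_t narrow = (uint32_t)iv + 4294967206u;  /* wraps modulo 2^32 */
    return (int64_t)narrow + 91;
}

/* True if the first n steps starting from 0 all add the same amount,
   i.e. the values seen so far fit a single { init, +, step } evolution. */
bool pr107176_is_affine(unsigned n)
{
    int64_t prev = 0;
    int64_t cur = pr107176_next(prev);
    int64_t first_step = cur - prev;

    for (unsigned i = 1; i < n; i++) {
        prev = cur;
        cur = pr107176_next(prev);
        if (cur - prev != first_step)
            return false;   /* step changed, so no single affine evolution */
    }
    return true;
}
```

In this model the first step differs from the following ones, so no single
{ init, +, step } in long describes the sequence from the first iteration on.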