From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 6EE133858420; Fri,  4 Nov 2022 13:40:41 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6EE133858420
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1667569258;
	bh=metG6ygsnXhWJHFHp4TU5wrcTrSJ6SYGL0dBArf031Y=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=jI9iyF9D2Pxht5TJDB2ar05nK067mLGYhwSGbOhjK1W8HV32P3AcBO5bZxk99k4NO
	 xGfHSX9Xyc4r+v16nU2oM1m/Dk+iZj6bxJNxCtplCGSTcGViee0sSSfPu+VGxYN/lP
	 XZVe555rCysaUXOxuLySXLbLq2Q4ZXUPn2AxYwys=
From: "amacleod at redhat dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/55157] Missed VRP with != 0 and multiply
Date: Fri, 04 Nov 2022 13:40:41 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.8.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: amacleod at redhat dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-55157-4-6zHD3nuaCk@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-55157-4@http.gcc.gnu.org/bugzilla/>
References: <bug-55157-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157
--- Comment #6 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Aldy Hernandez from comment #4)

>
> The patch below does this, but it does have a 3% penalty for VRP (though no
> penalty to overall compilation).  I'm inclined to pursue this route, since
> it makes nonzero mask optimization more pervasive across the board.
>
> What do you think Andrew?
>

1) Why wouldn't this be done in set_range_from_nonzero_bits()?  That call is
just above that spot in the code. Or is the name misleading and it does
something else?

2) That seems expensive; we must be doing unnecessary work.  Maybe it would
speed up if we first checked whether either the ctz or the clz would cause it
to do anything, thus avoiding creating a couple of ranges and performing a
union and intersection in cases where neither the leading nor the trailing bit
is a zero.

3) It also seems to me that you then only need to add the zero/union iff there
are trailing zero bits, i.e., if there are no trailing zeros, then just set
the lb to 0, and calculate the UB based on the clz.
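
To make 2) and 3) concrete, here is a rough standalone sketch of the control
flow I have in mind.  It is not the actual patch and does not use irange or
wide_int; URange, zero_also and range_from_nonzero_mask are made-up names, and
it assumes an unsigned 64-bit value whose possibly-set bits are given by mask:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an unsigned range; NOT GCC's irange.
struct URange {
  uint64_t lb;
  uint64_t ub;
  bool zero_also;  // true if 0 is possible in addition to [lb, ub]
};

// Count trailing zero bits of a nonzero value.
static unsigned count_trailing_zeros(uint64_t x) {
  assert(x != 0);
  unsigned n = 0;
  while ((x & 1) == 0) { x >>= 1; ++n; }
  return n;
}

// Count leading zero bits of a nonzero value.
static unsigned count_leading_zeros(uint64_t x) {
  assert(x != 0);
  unsigned n = 0;
  while ((x >> 63) == 0) { x <<= 1; ++n; }
  return n;
}

// Bound a value whose set bits all lie inside `mask`.
URange range_from_nonzero_mask(uint64_t mask) {
  if (mask == 0)
    return {0, 0, false};           // the value can only be 0

  unsigned clz = count_leading_zeros(mask);
  unsigned ctz = count_trailing_zeros(mask);

  // Point 2: neither end of the mask has a zero bit, so the mask cannot
  // narrow anything; bail out before building any ranges.
  if (clz == 0 && ctz == 0)
    return {0, UINT64_MAX, false};

  // Upper bound derived from the leading zeros only.
  uint64_t ub = UINT64_MAX >> clz;

  // Point 3: no trailing zeros, so just [0, ub]; no union needed.
  if (ctz == 0)
    return {0, ub, false};

  // Trailing zeros: the smallest nonzero value is 1 << ctz, and 0 has to
  // be added back separately (the zero/union case).
  return {uint64_t{1} << ctz, ub, true};
}
```

For a mask of 0xF0 this gives [0x10, 0xFF] plus zero, while a mask of 0x0F
takes the cheap path and gives [0, 0xF] without any union.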

I should think that would speed things up a bit.