From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 14EE83858D33; Tue, 31 Jan 2023 12:03:20 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 14EE83858D33
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1675166600;
	bh=oBbg9QjBKCy5+5Hw57y76Qgzkt6ZbyILqvZZ0CyR79w=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=KofgUtH5fg4qTb1gatoOY4kLhWgLkIqHwyjKmJ72are2wF7UmsQ5o7dWJ5LaL8rhg
	 hgQZYfFVgBwheH+V0nQjFDKrJRp1y49ZyaVHL34++5OsRg+D9jTYm8QPOvd1W77CY0
	 SljP4q83DE6Awl1dOSpOtHsDuP8QU13Kr0bHhpSI=
From: "rsandifo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/108583] [13 Regression] wrong code with vector division
 by uint16 at -O2
Date: Tue, 31 Jan 2023 12:03:19 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rsandifo at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: tnfchris at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-108583-4-3BLbjGW6rI@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-108583-4@http.gcc.gnu.org/bugzilla/>
References: <bug-108583-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583
--- Comment #13 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
OK, hopefully I understand now.  Sorry for being slow.

But what specific constraints do we want to apply to the optimisation?

(In reply to Tamar Christina from comment #3)
> Right, so this is because in the expansion we don't have enough context to
> decide how to optimize the division.
>
> This optimization is only possible when the input is widened because you
> need an additional free bit so that the second addition can't overflow.
>
> The vectorizer has this context but since we didn't want a new IFN the
> context should instead be derivable in
> targetm.vectorize.can_special_div_by_const hook.
>
> So my proposal to fix this and keep
> targetm.vectorize.can_special_div_by_const as a general divide optimization
> hook is to pass the actual `tree` operand0 as well to the hook such that the
> hook has a bit more context.
>
> I was hoping to use get_nonzero_bits to get what the actual range of the
> operand is.  But it looks like for widening operations it still reports -1.

The original motivating example was:

void draw_bitmap1(uint8_t* restrict pixel, uint8_t level, int n)
{
  for (int i = 0; i < (n & -16); i+=1)
    pixel[i] = (pixel[i] * level) / 0xff;
}

where we then do a 16-bit division.  But since level and pixel[i] are
unconstrained, the maximum value of pixel[i] * level is 0xfe01.
There's no free bit in that sense.  It seems more that the range of the
dividend is [0, N*N] for a divisor of N.
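
As a rough sketch of why that bound matters, assume the rewrite under
discussion has the form x / 0xff == (x + ((x + 257) >> 8)) >> 8, carried
out entirely in 16 bits.  That formula is not quoted in this comment, and
the names div255_u16, count_mismatches and check_ranges are made up for
illustration.  Under that assumption the result is exact for every
dividend up to 0xff * 0xff, but wrong for an arbitrary uint16_t:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed form of the rewrite (not taken from this comment):
   x / 0xff == (x + ((x + 257) >> 8)) >> 8, with every step truncated
   to 16 bits, as a vector of uint16_t lanes would do.  */
static uint16_t
div255_u16 (uint16_t x)
{
  uint16_t t = (uint16_t) (x + 257);	/* Wraps when x > 0xfefe.  */
  t = (uint16_t) (x + (t >> 8));
  return t >> 8;
}

/* Count the x in [0, limit] for which the 16-bit form differs from
   a plain x / 0xff.  */
static unsigned int
count_mismatches (uint32_t limit)
{
  unsigned int bad = 0;
  for (uint32_t x = 0; x <= limit; x++)
    if (div255_u16 ((uint16_t) x) != x / 0xff)
      bad++;
  return bad;
}

/* Exact on the range the example can produce, wrong outside it.  */
static void
check_ranges (void)
{
  assert (count_mismatches (0xff * 0xff) == 0);
  assert (div255_u16 (0xffff) != 0xffff / 0xff);
}
```

Calling check_ranges () from a test harness should pass both assertions:
the first covers the [0, N*N] range for N = 0xff, and the second shows
the wrap-around for a dividend outside it.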

If that's the condition we want to test for, it seems like something
we need to check in the vectoriser rather than the hook.  And it's
not something we can easily do in the vector form, since we don't
track ranges for vectors (AFAIK).