From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 442C13861898; Thu, 11 Jan 2024 09:58:07 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 442C13861898
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1704967087;
	bh=BneCAMZHZ1Wbu3ER9hEn0K9hTdr7hUI2f1Kg+1Wpahk=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=CHN2Xl1M5WcrYRxbsGGnea1FLkxzEbSe1p7A4SRLazRLYk9tPZLUDPfth341LdIG6
	 ht7i/E9nN4ryqSl/+cGebs2h6DrZA03X+RKBGnPPro2YK46r8VueO2b/ClVDvawFjZ
	 t/103c1Y9cwK8mwJazg+w6KQqCyiEduDxUOT92FU=
From: "fxue at os dot amperecomputing.com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/113326] Optimize vector shift with constant delta on
 shifting-count operand
Date: Thu, 11 Jan 2024 09:58:05 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: unknown
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: fxue at os dot amperecomputing.com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-113326-4-vNBO7hOOnv@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-113326-4@http.gcc.gnu.org/bugzilla/>
References: <bug-113326-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113326
--- Comment #7 from Feng Xue <fxue at os dot amperecomputing.com> ---
(In reply to Richard Biener from comment #6)
> (In reply to Andrew Pinski from comment #5)
> > One more thing:
> > ```
> >  vect_shift_0 = vect_value >> { 0, 1, 2, 3 };
> >  vect_shift_1 = vect_value >> { 4, 5, 6, 7 };
> >  vect_shift_2 = vect_value >> { 8, 9, 10, 11 };
> >  vect_shift_3 = vect_value >> { 12, 13, 14, 15 };
> > ```
> > vs
> > ```
> >  vect_shift_0 = vect_value >> { 0, 1, 2, 3 };
> >  vect_shift_1 = vect_shift_0 >> { 4, 4, 4, 4 };
> >  vect_shift_2 = vect_shift_0 >> { 8, 8, 8, 8 };
> >  vect_shift_3 = vect_shift_0 >> { 12, 12, 12, 12 };
> > ```
> >
> > the first has fully independent operations while in the second case, there
> > is one dependent and the rest are independent operations.
> >
> > On cores which have many vector units the first one might be faster than the
> > second one.  So this needs a cost model too.
>
> Note the vectorizer has the shift values dependent as well (across
> iterations),
> we just constant propagate after unrolling here.
>=20
> Note this is basically asking for "strength-reduction" of expensive
> constants which could be more generally useful and not only for this
> specific shift case.  Consider the same example but with an add instead
> of a shift for example, the same exact set of constants will appear.

It is. But so far I have only found that vector shift gives special treatment
to constant operands based on their numerical pattern. Not sure whether any
other operator would.
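
To make the vector case concrete, here is a minimal sketch using GCC's
generic vector extensions. The type name v4u and the function names are
made up for illustration and do not come from GCC itself. It only shows
that the two forms from comment #5 produce the same lanes, with the second
form needing just a uniform shift count; it says nothing about which form
is faster, which is the cost-model question raised above.

```c
/* 4 x 32-bit unsigned lanes, GCC generic vector extension. */
typedef unsigned int v4u __attribute__((vector_size(16)));

/* Form 1: independent shift with a non-uniform constant count vector. */
v4u shift_independent(v4u value)
{
    return value >> (v4u){ 4, 5, 6, 7 };
}

/* Form 2: reuse value >> { 0, 1, 2, 3 }, then shift by a uniform count. */
v4u shift_chained(v4u value)
{
    v4u shift_0 = value >> (v4u){ 0, 1, 2, 3 };
    return shift_0 >> (v4u){ 4, 4, 4, 4 };
}

/* Returns 1 when all four lanes of a and b are equal, 0 otherwise. */
int lanes_equal(v4u a, v4u b)
{
    for (int i = 0; i < 4; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Unsigned lanes are used so the right shift is a plain logical shift, and all
counts stay below the 32-bit lane width.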

BTW, there is also a scalar version of this strength reduction for shift, like:

  int a = value >> n;
  int b = value >> (n + 6);

  ==>

  int a = value >> n;
  int b = a >> 6;      // (n + 6) is not needed

But this is not covered by the current scalar strength-reduction pass.
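
A small sketch of the scalar equivalence, under stated assumptions: n and
the constant delta are non-negative, n + delta is smaller than the bit width
of int, and right shift of a negative int is arithmetic (which is what GCC
implements; the C standard leaves it implementation-defined). The function
names are hypothetical and only serve the illustration.

```c
#include <limits.h>

/* Direct form: shift by the full count (n + delta). */
int shift_direct(int value, int n, int delta)
{
    return value >> (n + delta);
}

/* Reduced form: reuse a = value >> n, then shift by the constant delta. */
int shift_reduced(int value, int n, int delta)
{
    int a = value >> n;
    return a >> delta;
}

/* Returns 1 when both forms are well defined for these counts. */
int counts_in_range(int n, int delta)
{
    int width = (int)(sizeof(int) * CHAR_BIT);
    return n >= 0 && delta >= 0 && n + delta < width;
}
```

The range check matters: when n + delta reaches the bit width, the direct
form has undefined behavior while the reduced form may still be well defined,
so the two forms are only interchangeable inside that range.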