public inbox for gcc-patches@gcc.gnu.org
From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Sandiford <Richard.Sandiford@arm.com>,
	Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>
Subject: RE: [PATCH 1/2]middle-end Support optimized division by pow2 bitmask
Date: Wed, 22 Jun 2022 00:34:53 +0000
Message-ID: <VI1PR08MB53259CC9F66E245E38CBC5DAFFB29@VI1PR08MB5325.eurprd08.prod.outlook.com>
In-Reply-To: <VI1PR08MB53256C0BAB19A02DEF1700E9FFAA9@VI1PR08MB5325.eurprd08.prod.outlook.com>

> -----Original Message-----
> From: Tamar Christina
> Sent: Tuesday, June 14, 2022 4:58 PM
> To: Richard Sandiford <richard.sandiford@arm.com>; Richard Biener
> <rguenther@suse.de>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> Subject: RE: [PATCH 1/2]middle-end Support optimized division by pow2
> bitmask
> 
> 
> 
> > -----Original Message-----
> > From: Richard Sandiford <richard.sandiford@arm.com>
> > Sent: Tuesday, June 14, 2022 2:43 PM
> > To: Richard Biener <rguenther@suse.de>
> > Cc: Tamar Christina <Tamar.Christina@arm.com>;
> > gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> > Subject: Re: [PATCH 1/2]middle-end Support optimized division by pow2
> > bitmask
> >
> > Richard Biener <rguenther@suse.de> writes:
> > > On Mon, 13 Jun 2022, Tamar Christina wrote:
> > >
> > >> > -----Original Message-----
> > >> > From: Richard Biener <rguenther@suse.de>
> > >> > Sent: Monday, June 13, 2022 12:48 PM
> > >> > To: Tamar Christina <Tamar.Christina@arm.com>
> > >> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Sandiford
> > >> > <Richard.Sandiford@arm.com>
> > >> > Subject: RE: [PATCH 1/2]middle-end Support optimized division by
> > >> > pow2 bitmask
> > >> >
> > >> > On Mon, 13 Jun 2022, Tamar Christina wrote:
> > >> >
> > >> > > > -----Original Message-----
> > >> > > > From: Richard Biener <rguenther@suse.de>
> > >> > > > Sent: Monday, June 13, 2022 10:39 AM
> > >> > > > To: Tamar Christina <Tamar.Christina@arm.com>
> > >> > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard
> > >> > > > Sandiford <Richard.Sandiford@arm.com>
> > >> > > > Subject: Re: [PATCH 1/2]middle-end Support optimized division
> > >> > > > by pow2 bitmask
> > >> > > >
> > >> > > > On Mon, 13 Jun 2022, Richard Biener wrote:
> > >> > > >
> > >> > > > > On Thu, 9 Jun 2022, Tamar Christina wrote:
> > >> > > > >
> > >> > > > > > Hi All,
> > >> > > > > >
> > >> > > > > > In plenty of image and video processing code it's common
> > >> > > > > > to modify pixel values by a widening operation and then
> > >> > > > > > scale them back into range by dividing by 255.
> > >> > > > > >
> > >> > > > > > This patch adds an optab to allow us to emit an optimized
> > >> > > > > > sequence when doing an unsigned division that is equivalent
> > >> > > > > > to:
> > >> > > > > >
> > >> > > > > >    x = y / (2 ^ (bitsize (y)/2) - 1)
> > >> > > > > >
> > >> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > >> > > > > > x86_64-pc-linux-gnu and no issues.
> > >> > > > > >
> > >> > > > > > Ok for master?
> > >> > > > >
> > >> > > > > Looking at 2/2 it seems that this is the wrong way to
> > >> > > > > attack the problem.  The ISA doesn't have such instruction
> > >> > > > > so adding an optab looks premature.  I suppose that there's
> > >> > > > > no unsigned vector integer division and thus we open-code
> > >> > > > > that in a different way?
> > >> > > > > Isn't the correct thing then to fixup that open-coding if
> > >> > > > > it is more efficient?
> > >> > > >
> > >> > >
> > >> > > The problem is that even if you fix up the open-coding, it would
> > >> > > need to be something target specific? The sequence of
> > >> > > instructions we generate doesn't have a GIMPLE representation.
> > >> > > So whatever is generated I'd have to fix up in RTL then.
> > >> >
> > >> > What's the operation that doesn't have a GIMPLE representation?
> > >>
> > >> For NEON we use two operations:
> > >> 1. Add High narrowing lowpart, essentially doing
> > >>    (a +w b) >>.n bitsize(a)/2
> > >>    where the + widens and the >> narrows.  So you give it two
> > >>    shorts and get a byte.
> > >> 2. Add widening add of lowpart, so basically lowpart (a +w b).
> > >>
> > >> For SVE2 we use a different sequence: two back-to-back
> > >> sequences of:
> > >> 1. Add narrow high part (bottom).  In SVE the Top and Bottom
> > >>    instructions select even and odd elements of the vector rather
> > >>    than the "top half" and "bottom half".
> > >>
> > >>    So this instruction does: add each vector element of the first
> > >>    source vector to the corresponding vector element of the second
> > >>    source vector, and place the most significant half of the result
> > >>    in the even-numbered half-width destination elements, while
> > >>    setting the odd-numbered elements to zero.
> > >>
> > >> So there's an explicit permute in there. The instructions are
> > >> sufficiently different that there wouldn't be a single GIMPLE
> > >> representation.
> > >
> > > I see.  Are these also useful to express scalar integer division?
> > >
> > > I'll defer to others to ack the special udiv_pow2_bitmask optab or
> > > suggest some piecemeal things other targets might be able to do as
> > > well.  It does look very special.  I'd also bikeshed it to
> > > udiv_pow2m1 since 'bitmask' is less obvious than 2^n-1 (assuming I
> > > interpreted 'bitmask' correctly ;)).  It seems to be even less
> > > general since it is a unary op and the actual divisor is
> > > constrained by the mode itself?
> >
> > Yeah, those were my concerns as well.  For n-bit numbers, the same
> > kind of arithmetic transformation can be used for any 2^m-1 for m in
> > [n/2, n), so from a target-independent point of view, m==n/2 isn't
> > particularly special.
> > Hard-coding one value of m would make sense if there was an underlying
> > instruction that did exactly this, but like you say, there isn't.
> >
> > Would a compromise be to define an optab for ADDHN and then add a
> > vector pattern for this division that (at least initially) prefers
> > ADDHN over the current approach whenever ADDHN is available?  We
> > could then adapt the conditions on the pattern if other targets also
> > provide ADDHN but don't want this transform.  (I think the other
> > instructions in the pattern already have optabs.)
> >
> > That still leaves open the question about what to do about SVE2, but
> > the underlying problem there is that the vectoriser doesn't know about
> > the B/T layout.
> 
> Wouldn't it be better to just generalize the optab and to pass on the mask?
> I'd prefer to do that than teach the vectorizer about ADDHN (which can't be
> easily done now) let alone teaching it about B/T.   It also seems somewhat
> unnecessary to diverge the implementation here in the mid-end. After all,
> you can generate better SSE code here as well, so focusing on generating ISA
> specific code from here for each ISA seems like the wrong approach to me.

Ping, is there any consensus here? 

Thanks,
Tamar
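
For readers following the thread, the NEON sequence described in the quoted text (an add-high-narrow followed by a widening add) can be modelled on plain integers. This is only a sketch under stated assumptions: the helper names addhn, uaddw and div255 are hypothetical, and the constant 257 and the final shift by 8 are not given in this message. 16-bit lanes are modelled with Python integers and an explicit mask.

```python
LANE_BITS = 16                      # assumed width of the widened pixel values
LANE_MASK = (1 << LANE_BITS) - 1


def addhn(a, b):
    """Model of an add-high-narrow: add two 16-bit lanes with wraparound,
    then keep only the most significant half (8 bits) of the sum."""
    return ((a + b) & LANE_MASK) >> (LANE_BITS // 2)


def uaddw(wide, narrow):
    """Model of a widening add: add an 8-bit value to a 16-bit lane."""
    return (wide + narrow) & LANE_MASK


def div255(x):
    """Divide a 16-bit value by 255 with two adds and a shift.

    Assumption: x is at most 255 * 255, as produced by widening the
    product of two 8-bit pixels, so the 16-bit sums never wrap.
    """
    t = addhn(x, 257)               # (x + 257) >> 8; 257 is an assumed constant
    return uaddw(x, t) >> 8         # (x + t) >> 8
```

For example, div255(255 * 255) evaluates to 255, matching the exact quotient.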

> 
> Thanks,
> Tamar
> 
> >
> > Thanks,
> > Richard
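
Richard Sandiford's remark that the same kind of transformation applies to any divisor 2^m-1 with m in [n/2, n) can be checked with a small model. The formula below is an assumed generalisation, not something given in the thread, and div_pow2m1 and exact_for are hypothetical names. Python's unbounded integers are used, so the intermediate sums never wrap as they could in an n-bit register.

```python
def div_pow2m1(x, m):
    """Return x // (2**m - 1) using only adds and shifts.

    Assumed form: (x + ((x + 2**m + 1) >> m)) >> m.
    Intended for 0 <= x < 2**(2*m), i.e. n-bit x with m >= n/2.
    """
    t = (x + (1 << m) + 1) >> m
    return (x + t) >> m


def exact_for(n, m):
    """Compare the shift-and-add form with true division for every n-bit x."""
    d = (1 << m) - 1
    return all(div_pow2m1(x, m) == x // d for x in range(1 << n))
```

With n = 8, m = 3 falls below n/2 and the sketch returns 35 for 252 / 7 instead of 36, which is consistent with the lower bound in the quoted range.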

