public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: Tamar Christina <Tamar.Christina@arm.com>,
	"gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>
Subject: Re: [PATCH 1/2]middle-end Support optimized division by pow2 bitmask
Date: Tue, 14 Jun 2022 14:42:44 +0100	[thread overview]
Message-ID: <mptv8t3cqej.fsf@arm.com> (raw)
In-Reply-To: <2p382n54-427o-8q82-6o45-p2nn6869opr5@fhfr.qr> (Richard Biener's message of "Tue, 14 Jun 2022 15:18:38 +0200 (CEST)")

Richard Biener <rguenther@suse.de> writes:
> On Mon, 13 Jun 2022, Tamar Christina wrote:
>
>> > -----Original Message-----
>> > From: Richard Biener <rguenther@suse.de>
>> > Sent: Monday, June 13, 2022 12:48 PM
>> > To: Tamar Christina <Tamar.Christina@arm.com>
>> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Sandiford
>> > <Richard.Sandiford@arm.com>
>> > Subject: RE: [PATCH 1/2]middle-end Support optimized division by pow2
>> > bitmask
>> > 
>> > On Mon, 13 Jun 2022, Tamar Christina wrote:
>> > 
>> > > > -----Original Message-----
>> > > > From: Richard Biener <rguenther@suse.de>
>> > > > Sent: Monday, June 13, 2022 10:39 AM
>> > > > To: Tamar Christina <Tamar.Christina@arm.com>
>> > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Sandiford
>> > > > <Richard.Sandiford@arm.com>
>> > > > Subject: Re: [PATCH 1/2]middle-end Support optimized division by
>> > > > pow2 bitmask
>> > > >
>> > > > On Mon, 13 Jun 2022, Richard Biener wrote:
>> > > >
>> > > > > On Thu, 9 Jun 2022, Tamar Christina wrote:
>> > > > >
>> > > > > > Hi All,
>> > > > > >
>> > > > > > In plenty of image and video processing code it's common to
>> > > > > > modify pixel values by a widening operation and then scale them
>> > > > > > back into range
>> > > > by dividing by 255.
>> > > > > >
>> > > > > > This patch adds an optab to allow us to emit an optimized
>> > > > > > sequence when doing an unsigned division that is equivalent to:
>> > > > > >
>> > > > > >    x = y / (2 ^ (bitsize (y)/2)-1
>> > > > > >
>> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
>> > > > > > x86_64-pc-linux-gnu and no issues.
>> > > > > >
>> > > > > > Ok for master?
>> > > > >
>> > > > > Looking at 2/2 it seems that this is the wrong way to attack the
>> > > > > problem.  The ISA doesn't have such instruction so adding an optab
>> > > > > looks premature.  I suppose that there's no unsigned vector
>> > > > > integer division and thus we open-code that in a different way?
>> > > > > Isn't the correct thing then to fixup that open-coding if it is more
>> > efficient?
>> > > >
>> > >
>> > > The problem is that even if you fixup the open-coding it would need to
>> > > be something target specific? The sequence of instructions we generate
>> > > don't have a GIMPLE representation.  So whatever is generated I'd have
>> > > to fixup in RTL then.
>> > 
>> > What's the operation that doesn't have a GIMPLE representation?
>> 
>> For NEON use two operations:
>> 1. Add High narrowing lowpart, essentially doing (a +w b) >>.n bitsize(a)/2
>>     Where the + widens and the >> narrows.  So you give it two shorts, get a byte
>> 2. Add widening add of lowpart so basically lowpart (a +w b)
>> 
>> For SVE2 we use a different sequence, we use two back-to-back sequences of:
>> 1. Add narrow high part (bottom).  In SVE the Top and Bottom instructions select
>>    Even and odd elements of the vector rather than "top half" and "bottom half".
>> 
>>    So this instruction does : Add each vector element of the first source vector to the
>>    corresponding vector element of the second source vector, and place the most
>>     significant half of the result in the even-numbered half-width destination elements,
>>     while setting the odd-numbered elements to zero.
>> 
>> So there's an explicit permute in there. The instructions are sufficiently different that there
>> wouldn't be a single GIMPLE representation.
>
> I see.  Are these also useful to express scalar integer division?
>
> I'll defer to others to ack the special udiv_pow2_bitmask optab
> or suggest some piecemail things other targets might be able to do as 
> well.  It does look very special.  I'd also bikeshed it to
> udiv_pow2m1 since 'bitmask' is less obvious than 2^n-1 (assuming
> I interpreted 'bitmask' correctly ;)).  It seems to be even less
> general since it is an unary op and the actual divisor is constrained
> by the mode itself?

Yeah, those were my concerns as well.  For n-bit numbers, the same kind
of arithmetic transformation can be used for any 2^m-1 for m in [n/2, n),
so from a target-independent point of view, m==n/2 isn't particularly
special.  Hard-coding one value of m would make sense if there was an
underlying instruction that did exactly this, but like you say, there
isn't.

Would a compromise be to define an optab for ADDHN and then add a vector
pattern for this division that (at least initially) prefers ADDHN over
the current approach whenever ADDHN is available?  We could then adapt
the conditions on the pattern if other targets also provide ADDHN but
don't want this transform.  (I think the other instructions in the
pattern already have optabs.)

That still leaves open the question about what to do about SVE2,
but the underlying problem there is that the vectoriser doesn't
know about the B/T layout.

Thanks,
Richard

  parent reply	other threads:[~2022-06-14 13:42 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-09  4:39 Tamar Christina
2022-06-09  4:40 ` [PATCH 2/2]AArch64 aarch64: Add implementation for pow2 bitmask division Tamar Christina
2022-06-13  9:24 ` [PATCH 1/2]middle-end Support optimized division by pow2 bitmask Richard Biener
2022-06-13  9:39   ` Richard Biener
2022-06-13 10:09     ` Tamar Christina
2022-06-13 11:47       ` Richard Biener
2022-06-13 14:37         ` Tamar Christina
2022-06-14 13:18           ` Richard Biener
2022-06-14 13:38             ` Tamar Christina
2022-06-14 13:42             ` Richard Sandiford [this message]
2022-06-14 15:57               ` Tamar Christina
2022-06-14 16:09                 ` Richard Biener
2022-06-22  0:34                 ` Tamar Christina
2022-06-26 19:55                   ` Jeff Law
2022-09-23  9:33 ` [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization Tamar Christina
2022-09-23  9:33 ` [PATCH 2/4]AArch64 Add implementation for pow2 bitmask division Tamar Christina
2022-10-31 11:34   ` Tamar Christina
2022-11-09  8:33     ` Tamar Christina
2022-11-09 16:02     ` Kyrylo Tkachov
2022-09-23  9:33 ` [PATCH 3/4]AArch64 Add SVE2 " Tamar Christina
2022-10-31 11:34   ` Tamar Christina
2022-11-09  8:33     ` Tamar Christina
2022-11-12 12:17   ` Richard Sandiford
2022-09-23  9:34 ` [PATCH 4/4]AArch64 sve2: rewrite pack + NARROWB + NARROWB to NARROWB + NARROWT Tamar Christina
2022-10-31 11:34   ` Tamar Christina
2022-11-09  8:33     ` Tamar Christina
2022-11-12 12:25   ` Richard Sandiford
2022-11-12 12:33     ` Richard Sandiford
2022-09-26 10:39 ` [PATCH 1/4]middle-end Support not decomposing specific divisions during vectorization Richard Biener
2022-10-31 11:34   ` Tamar Christina
2022-10-31 17:12     ` Jeff Law
2022-11-08 17:36     ` Tamar Christina
2022-11-09  8:01       ` Richard Biener
2022-11-09  8:26         ` Tamar Christina
2022-11-09 10:37 ` Kyrylo Tkachov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptv8t3cqej.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=Tamar.Christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=nd@arm.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).