public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] RISC-V: branch-(not)equals-zero compares against $zero
@ 2022-11-08 19:55 Philipp Tomsich
  2022-11-17 22:44 ` Jeff Law
  0 siblings, 1 reply; 5+ messages in thread
From: Philipp Tomsich @ 2022-11-08 19:55 UTC (permalink / raw)
  To: gcc-patches
  Cc: Vineet Gupta, Kito Cheng, Jeff Law, Palmer Dabbelt,
	Christoph Muellner, Philipp Tomsich

If we are testing a register or a paradoxical subreg (i.e. anything that is not
a partial subreg) for equality/non-equality with zero, we can generate a branch
that compares against $zero.  This will work for QI, HI, SI and DImode, so we
enable this for ANYI.

2020-08-30  gcc/ChangeLog:

	* config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern.

---

 gcc/config/riscv/riscv.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 798f7370a08..97f6ca37891 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2205,6 +2205,19 @@
 
 ;; Conditional branches
 
+(define_insn "*branch<mode>_equals_zero"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "equality_operator"
+			 [(match_operand:ANYI 2 "register_operand" "r")
+			  (const_int 0)])
+	 (label_ref (match_operand 0 "" ""))
+	 (pc)))]
+  "!partial_subreg_p (operands[2])"
+  "b%C1\t%2,zero,%0"
+  [(set_attr "type" "branch")
+   (set_attr "mode" "none")])
+
 (define_insn "*branch<mode>"
   [(set (pc)
 	(if_then_else
-- 
2.34.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] RISC-V: branch-(not)equals-zero compares against $zero
  2022-11-08 19:55 [PATCH] RISC-V: branch-(not)equals-zero compares against $zero Philipp Tomsich
@ 2022-11-17 22:44 ` Jeff Law
  2022-11-18  4:53   ` Palmer Dabbelt
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Law @ 2022-11-17 22:44 UTC (permalink / raw)
  To: Philipp Tomsich, gcc-patches
  Cc: Vineet Gupta, Kito Cheng, Jeff Law, Palmer Dabbelt, Christoph Muellner


On 11/8/22 12:55, Philipp Tomsich wrote:
> If we are testing a register or a paradoxical subreg (i.e. anything that is not
> a partial subreg) for equality/non-equality with zero, we can generate a branch
> that compares against $zero.  This will work for QI, HI, SI and DImode, so we
> enable this for ANYI.
>
> 2020-08-30  gcc/ChangeLog:
>
> 	* config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern.

I've gone back an forth on this a few times.  As you know, I hate 
subregs in the target descriptions and I guess I need to extend that to 
querying if something is a subreg or not rather than just subregs 
appearing in the RTL.


Presumably the idea behind rejecting partial subregs is the bits outside 
the partial is unspecified, but that's also going to be true if we're 
looking at a hardreg in QImode (for example) irrespective of it being 
wrapped in a subreg.


I don't doubt it works the vast majority of the time, but I haven't been 
able to convince myself it'll work all the time.  How do we ensure that 
the bits outside the mode are zero?  I've been bitten by this kind of 
problem before, and it's safe to say it was exceedingly painful to find.


Jeff



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] RISC-V: branch-(not)equals-zero compares against $zero
  2022-11-17 22:44 ` Jeff Law
@ 2022-11-18  4:53   ` Palmer Dabbelt
  2022-11-18  9:14     ` Philipp Tomsich
  2022-11-19 16:47     ` Jeff Law
  0 siblings, 2 replies; 5+ messages in thread
From: Palmer Dabbelt @ 2022-11-18  4:53 UTC (permalink / raw)
  To: jeffreyalaw
  Cc: philipp.tomsich, gcc-patches, Vineet Gupta, Kito Cheng, jlaw,
	christoph.muellner

On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote:
>
> On 11/8/22 12:55, Philipp Tomsich wrote:
>> If we are testing a register or a paradoxical subreg (i.e. anything that is not
>> a partial subreg) for equality/non-equality with zero, we can generate a branch
>> that compares against $zero.  This will work for QI, HI, SI and DImode, so we
>> enable this for ANYI.
>>
>> 2020-08-30  gcc/ChangeLog:
>>
>> 	* config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern.
>
> I've gone back an forth on this a few times.  As you know, I hate
> subregs in the target descriptions and I guess I need to extend that to
> querying if something is a subreg or not rather than just subregs
> appearing in the RTL.
>
>
> Presumably the idea behind rejecting partial subregs is the bits outside
> the partial is unspecified, but that's also going to be true if we're
> looking at a hardreg in QImode (for example) irrespective of it being
> wrapped in a subreg.
>
>
> I don't doubt it works the vast majority of the time, but I haven't been
> able to convince myself it'll work all the time.  How do we ensure that
> the bits outside the mode are zero?  I've been bitten by this kind of
> problem before, and it's safe to say it was exceedingly painful to find.

I don't really understand the middle-end issues here (if there are 
any?), but I'm pretty sure code like this has passed by a few times 
before and we've yet to find a reliable way to optimize these cases.  
There's a bunch of patterns where knowing the XLEN-extension of shorter 
values would let us generate better code, but there's also cases where 
we'd generate worse code by ensure any extension scheme is followed.

Every time I've seen this come up before I've managed to convince myself 
we can't really fix the problem in the backend, though: if we always 
generate extended values in registers then we just push the cost over to 
the other patterns.  The only way I've come up with to handle something 
like this is to push more types into the middle-end so we can track 
these high bits and generate the faster sequences where we know what 
they are.  That seems like a huge mess, though, and every time it comes 
up folks run away ;)

Sorry if that's kind of vague, I usually find a way to break these but 
my box isn't cooperating with GCC builds today so I haven't even gotten 
that far yet...

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] RISC-V: branch-(not)equals-zero compares against $zero
  2022-11-18  4:53   ` Palmer Dabbelt
@ 2022-11-18  9:14     ` Philipp Tomsich
  2022-11-19 16:47     ` Jeff Law
  1 sibling, 0 replies; 5+ messages in thread
From: Philipp Tomsich @ 2022-11-18  9:14 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: jeffreyalaw, gcc-patches, Vineet Gupta, Kito Cheng, jlaw,
	christoph.muellner

On Fri, 18 Nov 2022 at 05:53, Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote:
> >
> > On 11/8/22 12:55, Philipp Tomsich wrote:
> >> If we are testing a register or a paradoxical subreg (i.e. anything that is not
> >> a partial subreg) for equality/non-equality with zero, we can generate a branch
> >> that compares against $zero.  This will work for QI, HI, SI and DImode, so we
> >> enable this for ANYI.
> >>
> >> 2020-08-30  gcc/ChangeLog:
> >>
> >>      * config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern.
> >
> > I've gone back an forth on this a few times.  As you know, I hate
> > subregs in the target descriptions and I guess I need to extend that to
> > querying if something is a subreg or not rather than just subregs
> > appearing in the RTL.
> >
> >
> > Presumably the idea behind rejecting partial subregs is the bits outside
> > the partial is unspecified, but that's also going to be true if we're
> > looking at a hardreg in QImode (for example) irrespective of it being
> > wrapped in a subreg.
> >
> >
> > I don't doubt it works the vast majority of the time, but I haven't been
> > able to convince myself it'll work all the time.  How do we ensure that
> > the bits outside the mode are zero?  I've been bitten by this kind of
> > problem before, and it's safe to say it was exceedingly painful to find.
>
> I don't really understand the middle-end issues here (if there are
> any?), but I'm pretty sure code like this has passed by a few times
> before and we've yet to find a reliable way to optimize these cases.
> There's a bunch of patterns where knowing the XLEN-extension of shorter
> values would let us generate better code, but there's also cases where
> we'd generate worse code by ensure any extension scheme is followed.
>
> Every time I've seen this come up before I've managed to convince myself
> we can't really fix the problem in the backend, though: if we always
> generate extended values in registers then we just push the cost over to
> the other patterns.  The only way I've come up with to handle something
> like this is to push more types into the middle-end so we can track
> these high bits and generate the faster sequences where we know what
> they are.  That seems like a huge mess, though, and every time it comes
> up folks run away ;)

You are perfectly right that this problem can not be fixed in the
backend, at least not in a general manner (i.e., additional patterns
can resolve some of the cases and it is messy in the backend).

In fact, we are looking at fixing this before/during lowering by
avoiding the extension whenever possible (based on the type
information and even value ranges).  However, this work will miss
GCC13 and that is the reason why the band-aid was submitted here.

> Sorry if that's kind of vague, I usually find a way to break these but
> my box isn't cooperating with GCC builds today so I haven't even gotten
> that far yet...

I am gathering the original rationale why this should be safe from our
internal communication (the change is 2 years old, after all) and will
follow up.
If you find a way to break this in the meantime, please let us know.

Philipp.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] RISC-V: branch-(not)equals-zero compares against $zero
  2022-11-18  4:53   ` Palmer Dabbelt
  2022-11-18  9:14     ` Philipp Tomsich
@ 2022-11-19 16:47     ` Jeff Law
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff Law @ 2022-11-19 16:47 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: philipp.tomsich, gcc-patches, Vineet Gupta, Kito Cheng, jlaw,
	christoph.muellner


On 11/17/22 21:53, Palmer Dabbelt wrote:
> On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote:
>>
>> On 11/8/22 12:55, Philipp Tomsich wrote:
>>> If we are testing a register or a paradoxical subreg (i.e. anything 
>>> that is not
>>> a partial subreg) for equality/non-equality with zero, we can 
>>> generate a branch
>>> that compares against $zero.  This will work for QI, HI, SI and 
>>> DImode, so we
>>> enable this for ANYI.
>>>
>>> 2020-08-30  gcc/ChangeLog:
>>>
>>>     * config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern.
>>
>> I've gone back an forth on this a few times.  As you know, I hate
>> subregs in the target descriptions and I guess I need to extend that to
>> querying if something is a subreg or not rather than just subregs
>> appearing in the RTL.
>>
>>
>> Presumably the idea behind rejecting partial subregs is the bits outside
>> the partial is unspecified, but that's also going to be true if we're
>> looking at a hardreg in QImode (for example) irrespective of it being
>> wrapped in a subreg.
>>
>>
>> I don't doubt it works the vast majority of the time, but I haven't been
>> able to convince myself it'll work all the time.  How do we ensure that
>> the bits outside the mode are zero?  I've been bitten by this kind of
>> problem before, and it's safe to say it was exceedingly painful to find.
>
> I don't really understand the middle-end issues here (if there are 
> any?), but I'm pretty sure code like this has passed by a few times 
> before and we've yet to find a reliable way to optimize these cases.  
> There's a bunch of patterns where knowing the XLEN-extension of 
> shorter values would let us generate better code, but there's also 
> cases where we'd generate worse code by ensure any extension scheme is 
> followed.

It's not really the extension scheme, though that is a subset of the 
concerns in this space.  Essentially we have to be 100% sure that the 
bits outside of the branch mode (QI/HI/SI) and XLEN are zero, it's not 
just the sign bit.  This becomes even more of a concern as we exploit 
the bitmanip extensions more aggressively.


The SUBREG check is supposed to avoid that problem, but I'm not 
convinced it's sufficient.

Philipp claims that PROMOTE_MODE plus WORD_REGISTER_OPERATIONS is 
sufficient here, but I'm not sure that's the case.   He's digging out 
the rationale from some internal archives which we'll dig into once he 
finds it.

I'd be happy to be proved wrong :-)


jeff




^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-11-19 16:48 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-08 19:55 [PATCH] RISC-V: branch-(not)equals-zero compares against $zero Philipp Tomsich
2022-11-17 22:44 ` Jeff Law
2022-11-18  4:53   ` Palmer Dabbelt
2022-11-18  9:14     ` Philipp Tomsich
2022-11-19 16:47     ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).