public inbox for gcc-patches@gcc.gnu.org
* [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
@ 2023-03-24  1:53 Feng Wang
  2023-04-22  0:08 ` Jeff Law
  2023-04-22  0:13 ` Jeff Law
  0 siblings, 2 replies; 8+ messages in thread
From: Feng Wang @ 2023-03-24  1:53 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, palmer, Feng Wang

This patch optimizes the combine processing for sext.b/h in rv64.
Please refer to the following test case,
int sextb32(int x)
{ return (x << 24) >> 24; }

The rtl expression is as follows,
(insn 6 3 7 2 (set (reg:SI 138)
        (ashift:SI (subreg/s/u:SI (reg/v:DI 136 [ xD.2271 ]) 0)
            (const_int 24 [0x18]))) "sextb.c":2:13 195 {ashlsi3}
     (expr_list:REG_DEAD (reg/v:DI 136 [ xD.2271 ])
        (nil)))
(insn 7 6 8 2 (set (reg:SI 137)
        (ashiftrt:SI (reg:SI 138)
            (const_int 24 [0x18]))) "sextb.c":2:20 196 {ashrsi3}
     (expr_list:REG_DEAD (reg:SI 138)
        (nil)))

During the combine phase, they will combine into
(set (reg:SI 137)
    (ashiftrt:SI (subreg:SI (ashift:DI (reg:DI 140)
                (const_int 24 [0x18])) 0)
        (const_int 24 [0x18])))

The optimal combine result is
(set (reg:SI 137)
    (sign_extend:SI (subreg:QI (reg:DI 140) 0)))
This can then be matched by the sext.b instruction.

Because of the intervening subreg, extract_left_shift currently cannot
find the left-shift amount. It needs to look through one more layer of
RTL to obtain it.
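
As a rough illustration (not part of the patch), the equivalence that combine is expected to discover can be checked in plain C. The helper names below (shift_pair8, shift_pair16, sext8, sext16, check_equivalent) are made up for this sketch. It assumes GCC's documented behavior that right-shifting a negative signed value is an arithmetic shift and that narrowing integer conversions wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-pair form, as written in the test case.  The left shift is done
   on an unsigned value so that it is well defined in ISO C.  */
static int32_t shift_pair8 (int32_t x)
{
  return (int32_t) ((uint32_t) x << 24) >> 24;
}

static int32_t shift_pair16 (int32_t x)
{
  return (int32_t) ((uint32_t) x << 16) >> 16;
}

/* Sign-extension form: what sext.b / sext.h compute from the low bits.  */
static int32_t sext8 (int32_t x)
{
  return (int32_t) (int8_t) x;
}

static int32_t sext16 (int32_t x)
{
  return (int32_t) (int16_t) x;
}

/* Assert that both forms agree for one input value.  */
static void check_equivalent (int32_t x)
{
  assert (shift_pair8 (x) == sext8 (x));
  assert (shift_pair16 (x) == sext16 (x));
}
```

Calling check_equivalent on values around the byte and halfword sign boundaries (0x7f, 0x80, 0x7fff, 0x8000) exercises the interesting cases.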

gcc/ChangeLog:

        * combine.cc (extract_left_shift): Add SUBREG case.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/zbb-sext-rv64.c: New test.
---
 gcc/combine.cc                                 |  5 +++++
 gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c | 12 ++++++++++++
 2 files changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 053879500b7..fb396a3d974 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -7915,6 +7915,11 @@ extract_left_shift (scalar_int_mode mode, rtx x, int count)
 
   switch (code)
     {
+    case SUBREG:
+      x = XEXP (x, 0);
+      if (GET_CODE(x) != ASHIFT)
+        break;
+
     case ASHIFT:
       /* This is the shift itself.  If it is wide enough, we will return
 	 either the value being shifted if the shift count is equal to
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c b/gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c
new file mode 100644
index 00000000000..4086ee56f57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-sext-rv64.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g_zbb -mabi=lp64d -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int sextb32(int x)
+{ return (x << 24) >> 24; }
+
+int sexth32(int x)
+{ return (x << 16) >> 16; }
+
+/* { dg-final { scan-assembler "sext.b" } } */
+/* { dg-final { scan-assembler "sext.h" } } */
\ No newline at end of file
-- 
2.17.1



* Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
  2023-03-24  1:53 [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64 Feng Wang
@ 2023-04-22  0:08 ` Jeff Law
  2023-04-23  0:24   ` Feng Wang
  2023-04-22  0:13 ` Jeff Law
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Law @ 2023-04-22  0:08 UTC (permalink / raw)
  To: Feng Wang, gcc-patches; +Cc: kito.cheng, palmer



On 3/23/23 19:53, Feng Wang wrote:
> This patch optimizes the combine processing for sext.b/h in rv64.
> Please refer to the following test case,
> int sextb32(int x)
> { return (x << 24) >> 24; }
> 
> The rtl expression is as follows,
> (insn 6 3 7 2 (set (reg:SI 138)
>          (ashift:SI (subreg/s/u:SI (reg/v:DI 136 [ xD.2271 ]) 0)
>              (const_int 24 [0x18]))) "sextb.c":2:13 195 {ashlsi3}
>       (expr_list:REG_DEAD (reg/v:DI 136 [ xD.2271 ])
>          (nil)))
> (insn 7 6 8 2 (set (reg:SI 137)
>          (ashiftrt:SI (reg:SI 138)
>              (const_int 24 [0x18]))) "sextb.c":2:20 196 {ashrsi3}
>       (expr_list:REG_DEAD (reg:SI 138)
>          (nil)))
> 
> During the combine phase, they will combine into
> (set (reg:SI 137)
>      (ashiftrt:SI (subreg:SI (ashift:DI (reg:DI 140)
>                  (const_int 24 [0x18])) 0)
>          (const_int 24 [0x18])))
> 
> The optimal combine result is
> (set (reg:SI 137)
>      (sign_extend:SI (subreg:QI (reg:DI 140) 0)))
> This can then be matched by the sext.b instruction.
> 
> Because of the intervening subreg, extract_left_shift currently cannot
> find the left-shift amount. It needs to look through one more layer of
> RTL to obtain it.
> 
> gcc/ChangeLog:
> 
>          * combine.cc (extract_left_shift): Add SUBREG case.
> 
> gcc/testsuite/ChangeLog:
> 
>          * gcc.target/riscv/zbb-sext-rv64.c: New test.
SUBREGs have painful semantics and we should be very careful just 
stripping them.

For example, you might have a subreg that extracts the *high* part.  Or 
you might have (subreg (mem)) or a paradoxical subreg, etc.

At the *least* this case would need verification that you're getting the 
lowpart.  However, I suspect there's other conditions that need to be 
checked to make this valid.
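
To make the lowpart concern concrete, here is a small C model (a sketch only, not GCC code). A DImode register is modeled as a uint64_t, and the hypothetical helpers lowpart32 and highpart32 stand in for subregs that select the low or high 32 bits. The shift count is assumed to be less than 32. Looking through the subreg is only sound when taking the subreg commutes with the shift, which holds for the low part but not for the high part.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a lowpart SImode subreg of a DImode value.  */
static uint32_t lowpart32 (uint64_t x)
{
  return (uint32_t) x;
}

/* Model of a subreg that selects the high half instead.  */
static uint32_t highpart32 (uint64_t x)
{
  return (uint32_t) (x >> 32);
}

/* Nonzero if "shift in 64 bits, then take the low part" equals
   "take the low part, then shift in 32 bits".  COUNT must be < 32.  */
static int lowpart_commutes (uint64_t x, unsigned count)
{
  return lowpart32 (x << count) == (uint32_t) (lowpart32 (x) << count);
}

/* Same question for the high part.  COUNT must be < 32.  */
static int highpart_commutes (uint64_t x, unsigned count)
{
  return highpart32 (x << count) == (uint32_t) (highpart32 (x) << count);
}

/* One value for which the two cases differ.  */
static void check_example (void)
{
  uint64_t x = 0x0000000100000100ULL;
  assert (lowpart_commutes (x, 24));
  assert (!highpart_commutes (x, 24));
}
```

For the high part, bits shifted up from the low half leak into the result, so the shift seen inside the subreg is not the shift applied to the extracted value.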

But I would suggest we look elsewhere.  It could be that combine is 
reassociating the subreg in ways that are undesirable and which 
ultimately makes our job harder. Additionally if we can fix this in a 
generic simplification/folder routine, then multiple passes can benefit.

For example in simplify_context::simplify_binary_operation we get a form 
more amenable to optimization.

> #0  simplify_context::simplify_binary_operation (this=0x7fffffffda68, code=ASHIFTRT, mode=E_SImode, 
>     op0=0x7fffea11eb40, op1=0x7fffea009610) at /home/jlaw/riscv-persist/ventana/gcc/gcc/simplify-rtx.cc:2558
> 2558      gcc_assert (GET_RTX_CLASS (code) != RTX_COMPARE);
> (gdb) p code
> $24 = ASHIFTRT
> (gdb) p mode
> $25 = E_SImode
> (gdb) p debug_rtx (op0)
> (ashift:SI (subreg/s/u:SI (reg/v:DI 74 [ x ]) 0)
>     (const_int 24 [0x18]))
> $26 = void
> (gdb) p debug_rtx (op1)
> (const_int 24 [0x18])
> $27 = void

So that's (ashiftrt (ashift (object) 24) 24), i.e. a sign extension.

In other words, we really don't have to think about the fact that the 
underlying object is a SUBREG, because the outer operations are very 
clearly a sign extension regardless of the object they're operating on.

With that in mind, I would suggest you look at adding a case to detect 
zero/sign extension in simplify_context::simplify_binary_operation_1.
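
The shape of such a check can be sketched in plain C. This is only a model of the idea, not the simplify-rtx implementation. The names sign_extend_source_bits, sign_extend32 and check_shift_pair are hypothetical, and the sketch assumes GCC's arithmetic right shift for negative signed values.

```c
#include <assert.h>
#include <stdint.h>

/* If (ashiftrt (ashift X LCOUNT) RCOUNT) in a WIDTH-bit mode is a plain
   sign extension, return the number of low bits of X that are extended;
   otherwise return 0.  */
static unsigned sign_extend_source_bits (unsigned width, unsigned lcount,
                                         unsigned rcount)
{
  if (lcount == 0 || lcount != rcount || lcount >= width)
    return 0;
  return width - lcount;
}

/* Reference sign extension of the low BITS bits (1..31) of X to 32 bits,
   written without shifting signed values.  */
static int32_t sign_extend32 (uint32_t x, unsigned bits)
{
  uint32_t mask = (1u << bits) - 1u;
  uint32_t sign = 1u << (bits - 1);
  uint32_t low = x & mask;
  return (int32_t) ((low ^ sign) - sign);
}

/* Check that the shift pair and the reference agree for X and COUNT,
   where COUNT is in the range 1..31.  */
static void check_shift_pair (uint32_t x, unsigned count)
{
  unsigned bits = sign_extend_source_bits (32, count, count);
  assert (bits != 0);
  assert (((int32_t) (x << count) >> count) == sign_extend32 (x, bits));
}
```

With a width of 32, counts of 24 and 16 give source widths of 8 and 16 bits, which correspond to the QImode and HImode extensions that sext.b and sext.h implement.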

Thanks,
Jeff


* Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
  2023-03-24  1:53 [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64 Feng Wang
  2023-04-22  0:08 ` Jeff Law
@ 2023-04-22  0:13 ` Jeff Law
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff Law @ 2023-04-22  0:13 UTC (permalink / raw)
  To: Feng Wang, gcc-patches; +Cc: kito.cheng, palmer



On 3/23/23 19:53, Feng Wang wrote:
> This patch optimizes the combine processing for sext.b/h in rv64.
> Please refer to the following test case,
[ ... ]
I've opened BZ109592 to track this problem.

jeff


* Re: Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
  2023-04-22  0:08 ` Jeff Law
@ 2023-04-23  0:24   ` Feng Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Feng Wang @ 2023-04-23  0:24 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, palmer

On 2023-04-22 08:08  Jeff Law<jeffreyalaw@gmail.com> wrote:
>
>
>
>On 3/23/23 19:53, Feng Wang wrote:
>[ ... ]
>Thanks,
>Jeff 
You are right, I will modify it according to your suggestion.
Thanks.
Feng Wang


* Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
  2023-03-27  1:32   ` Feng Wang
@ 2023-03-27  2:05     ` Jeff Law
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Law @ 2023-03-27  2:05 UTC (permalink / raw)
  To: Feng Wang, juzhe.zhong, gcc-patches; +Cc: kito.cheng, palmer



On 3/26/23 19:32, Feng Wang wrote:
> On 2023-03-26 02:18  Jeff Law<jeffreyalaw@gmail.com> wrote:
>> [ ... ]
> Hi Jeff, do you think my patch is suitable? What else needs to be improved?
I haven't looked at it in any detail.  We're in stage4 right now, so 
it's regression bugfixes only going into the tree.  Once gcc-13 branches 
I'll be focused on helping folks move RVV forward, submitting/refining 
various RISC-V patches from Ventana and reviewing other RISC-V related 
patches.

Jeff


* Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
  2023-03-24  2:45 juzhe.zhong
  2023-03-24  6:13 ` Feng Wang
@ 2023-03-25 18:18 ` Jeff Law
  2023-03-27  1:32   ` Feng Wang
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff Law @ 2023-03-25 18:18 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches; +Cc: kito.cheng, palmer, wangfeng



On 3/23/23 20:45, juzhe.zhong@rivai.ai wrote:
> Sounds like you are looking at redundant extension problem in RISC-V port.
> This is the issue I want to fix but I don't find the time to do that.
> My first impression is that we need to fix redundant extension in "ree" 
> PASS.
> I am not sure.
It's actually quite a bit more complicated.

Some extension elimination can and probably should be happening in 
gimple. In gimple you have access to type information as well as range 
information.  So you have the opportunity to do things like rewrite the 
IL to use different types when it's safe to do so, or to use range 
information to identify when an object is already properly extended and 
thus eliminate the extension before we expand gimple into RTL.

Once in RTL, you can use forward propagation to eliminate extensions, or 
at least fold them into existing operations.  combine can eliminate 
extensions and it has the ability to track (for example) if the upper 
bits are copies of the sign bit, if they're known zero, etc.  combine is 
also capable of recognizing that a load implicitly extends and using 
that knowledge to eliminate extensions or to discover that a pair of 
shifts are just zero or sign extending a value, etc etc.  combine also 
interacts with simplify-rtx which is used by other passes, so there's a 
chance that work in simplify-rtx can eliminate extensions not just in 
combine, but in other passes as well.
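
The "upper bits are copies of the sign bit" idea can be illustrated with a small C model. This is a sketch for a single concrete value; the analysis in GCC reasons about RTL expressions and must hold for every possible value. The names sign_bit_copies32 and sign_extension_is_redundant are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Count how many of the top bits of X are equal to the sign bit,
   including the sign bit itself.  The result is in the range 1..32.  */
static unsigned sign_bit_copies32 (int32_t x)
{
  uint32_t u = (uint32_t) x;
  uint32_t sign = u >> 31;
  unsigned copies = 1;

  for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == sign; bit--)
    copies++;
  return copies;
}

/* Sign-extending X from BITS bits (1..32) changes nothing when bit
   BITS-1 and everything above it are already copies of the sign bit.  */
static int sign_extension_is_redundant (int32_t x, unsigned bits)
{
  return sign_bit_copies32 (x) >= 32 - bits + 1;
}

/* A few boundary values around the 8-bit sign bit.  */
static void check_examples (void)
{
  assert (sign_extension_is_redundant (127, 8));
  assert (sign_extension_is_redundant (-128, 8));
  assert (!sign_extension_is_redundant (128, 8));
}
```

When a pass can prove a lower bound like this for the operand of an extension, the extension can be dropped without changing the result.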

REE is a post-register allocation pass and kind of the last chance to 
eliminate extensions.

So for any given redundant extension, the way to go (IMHO) is to walk 
through the optimizer pipeline to see where it can potentially be 
eliminated.  In general, the earlier in the optimizer pipeline the 
extension can be eliminated, the better.

Jeff


* Re: [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
  2023-03-24  2:45 juzhe.zhong
@ 2023-03-24  6:13 ` Feng Wang
  2023-03-25 18:18 ` Jeff Law
  1 sibling, 0 replies; 8+ messages in thread
From: Feng Wang @ 2023-03-24  6:13 UTC (permalink / raw)
  To: juzhe.zhong, gcc-patches; +Cc: kito.cheng, palmer, Jeff Law

Hi Juzhe,

Thank you for your reply. I am indeed doing some optimization work right now.
I am very interested in the issue you raised, and I will take the time to try to optimize it.
I hope I can communicate with you and learn more from you in the future.

Thanks.


* [PATCH] RISC-V: Optimize zbb ins sext.b and sext.h in rv64
@ 2023-03-24  2:45 juzhe.zhong
  2023-03-24  6:13 ` Feng Wang
  2023-03-25 18:18 ` Jeff Law
  0 siblings, 2 replies; 8+ messages in thread
From: juzhe.zhong @ 2023-03-24  2:45 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, palmer, jeffreyalaw, wangfeng

Sounds like you are looking at the redundant extension problem in the RISC-V port.
This is an issue I want to fix, but I haven't found the time to do so.
My first impression is that we need to fix redundant extensions in the "ree" pass,
but I am not sure.

Since you are looking at this kind of issue, would you mind looking at this one as well?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108016 

Thanks.



juzhe.zhong@rivai.ai
