* [PATCH] LoongArch: Only transform move/move/bstrins to srai/bstrins when -Os
@ 2024-06-15 13:47 Xi Ruoyao
2024-06-26 7:54 ` Ping: " Xi Ruoyao
0 siblings, 1 reply; 3+ messages in thread
From: Xi Ruoyao @ 2024-06-15 13:47 UTC
To: gcc-patches; +Cc: chenglulu, i, xuchenghua, Xi Ruoyao
The first form (move/move/bstrins) has a lower latency than srai/bstrins
(due to the special handling of "move" in LA464 and LA664), even though
it is longer.
gcc/ChangeLog:
* config/loongarch/loongarch.md (define_peephole2): Require
optimize_insn_for_size_p () for move/move/bstrins =>
srai/bstrins transform.
---
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
gcc/config/loongarch/loongarch.md | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 25c1d323ba0..e4434c3bd4e 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1617,20 +1617,23 @@ (define_insn_and_split "*bstrins_<mode>_for_ior_mask"
})
;; We always avoid the shift operation in bstrins_<mode>_for_ior_mask
-;; if possible, but the result may be sub-optimal when one of the masks
+;; if possible, but the result may be larger when one of the masks
;; is (1 << N) - 1 and one of the src register is the dest register.
;; For example:
;; move t0, a0
;; move a0, a1
;; bstrins.d a0, t0, 42, 0
;; ret
-;; using a shift operation would be better:
+;; using a shift operation would be smaller:
;; srai.d t0, a1, 43
;; bstrins.d a0, t0, 63, 43
;; ret
;; unfortunately we cannot figure it out in split1: before reload we cannot
;; know if the dest register is one of the src register. Fix it up in
;; peephole2.
+;;
+;; Note that the first form has a lower latency so this should only be
+;; done when optimizing for size.
(define_peephole2
[(set (match_operand:GPR 0 "register_operand")
(match_operand:GPR 1 "register_operand"))
@@ -1639,7 +1642,7 @@ (define_peephole2
(match_operand:SI 3 "const_int_operand")
(const_int 0))
(match_dup 0))]
- "peep2_reg_dead_p (3, operands[0])"
+ "peep2_reg_dead_p (3, operands[0]) && optimize_insn_for_size_p ()"
[(const_int 0)]
{
int len = GET_MODE_BITSIZE (<MODE>mode) - INTVAL (operands[3]);
--
2.45.2
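
For readers following the thread, here is a minimal C sketch of why the two
sequences in the comment are interchangeable. The helpers bstrins_d and
srai_d are hypothetical software models of the instructions, written for
this illustration only (they are not GCC or LoongArch APIs), and the
register names are plain variables. The sketch only shows that both forms
compute the same value; it says nothing about their latency.

```c
#include <stdint.h>

/* Mask covering the low 43 bits, i.e. (1 << N) - 1 with N = 43. */
#define LOW43 ((UINT64_C(1) << 43) - 1)

/* Hypothetical model of "bstrins.d rd, rj, msb, lsb": bits [msb:lsb] of rd
   are replaced by bits [msb-lsb:0] of rj; the other bits of rd are kept. */
uint64_t bstrins_d(uint64_t rd, uint64_t rj, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    uint64_t field = width == 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
    return (rd & ~(field << lsb)) | ((rj & field) << lsb);
}

/* Hypothetical model of "srai.d rd, rj, sh" (arithmetic shift right),
   written with unsigned arithmetic so it stays portable. */
uint64_t srai_d(uint64_t rj, unsigned sh)
{
    uint64_t fill = ((rj >> 63) != 0 && sh != 0) ? ~UINT64_C(0) << (64 - sh) : 0;
    return (rj >> sh) | fill;
}

/* First form: move t0, a0; move a0, a1; bstrins.d a0, t0, 42, 0 */
uint64_t form_move_move_bstrins(uint64_t a0, uint64_t a1)
{
    uint64_t t0 = a0;
    a0 = a1;
    return bstrins_d(a0, t0, 42, 0);
}

/* Second form: srai.d t0, a1, 43; bstrins.d a0, t0, 63, 43 */
uint64_t form_srai_bstrins(uint64_t a0, uint64_t a1)
{
    uint64_t t0 = srai_d(a1, 43);
    return bstrins_d(a0, t0, 63, 43);
}

/* The source-level expression both sequences implement:
   low 43 bits taken from a0, the remaining high bits from a1. */
uint64_t reference(uint64_t a0, uint64_t a1)
{
    return (a0 & LOW43) | (a1 & ~LOW43);
}
```

Both forms leave the low 43 bits of a0 in place and take bits 63..43 from
a1, so the peephole only trades one instruction of code size against
whatever latency difference the target has.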
* Ping: [PATCH] LoongArch: Only transform move/move/bstrins to srai/bstrins when -Os
2024-06-15 13:47 [PATCH] LoongArch: Only transform move/move/bstrins to srai/bstrins when -Os Xi Ruoyao
@ 2024-06-26 7:54 ` Xi Ruoyao
2024-06-26 9:10 ` Lulu Cheng
0 siblings, 1 reply; 3+ messages in thread
From: Xi Ruoyao @ 2024-06-26 7:54 UTC
To: gcc-patches; +Cc: chenglulu, i, xuchenghua
Ping.
On Sat, 2024-06-15 at 21:47 +0800, Xi Ruoyao wrote:
> The first form has a lower latency (due to the special handling of
> "move" in LA464 and LA664) despite it's longer.
>
> gcc/ChangeLog:
>
> * config/loongarch/loongarch.md (define_peephole2): Require
> optimize_insn_for_size_p () for move/move/bstrins =>
> srai/bstrins transform.
> ---
>
> Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
>
> gcc/config/loongarch/loongarch.md | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/loongarch/loongarch.md
> b/gcc/config/loongarch/loongarch.md
> index 25c1d323ba0..e4434c3bd4e 100644
> --- a/gcc/config/loongarch/loongarch.md
> +++ b/gcc/config/loongarch/loongarch.md
> @@ -1617,20 +1617,23 @@ (define_insn_and_split
> "*bstrins_<mode>_for_ior_mask"
> })
>
> ;; We always avoid the shift operation in bstrins_<mode>_for_ior_mask
> -;; if possible, but the result may be sub-optimal when one of the
> masks
> +;; if possible, but the result may be larger when one of the masks
> ;; is (1 << N) - 1 and one of the src register is the dest register.
> ;; For example:
> ;; move t0, a0
> ;; move a0, a1
> ;; bstrins.d a0, t0, 42, 0
> ;; ret
> -;; using a shift operation would be better:
> +;; using a shift operation would be smaller:
> ;; srai.d t0, a1, 43
> ;; bstrins.d a0, t0, 63, 43
> ;; ret
> ;; unfortunately we cannot figure it out in split1: before reload we
> cannot
> ;; know if the dest register is one of the src register. Fix it up
> in
> ;; peephole2.
> +;;
> +;; Note that the first form has a lower latency so this should only
> be
> +;; done when optimizing for size.
> (define_peephole2
> [(set (match_operand:GPR 0 "register_operand")
> (match_operand:GPR 1 "register_operand"))
> @@ -1639,7 +1642,7 @@ (define_peephole2
> (match_operand:SI 3 "const_int_operand")
> (const_int 0))
> (match_dup 0))]
> - "peep2_reg_dead_p (3, operands[0])"
> + "peep2_reg_dead_p (3, operands[0]) && optimize_insn_for_size_p ()"
> [(const_int 0)]
> {
> int len = GET_MODE_BITSIZE (<MODE>mode) - INTVAL (operands[3]);
--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
* Re: Ping: [PATCH] LoongArch: Only transform move/move/bstrins to srai/bstrins when -Os
2024-06-26 7:54 ` Ping: " Xi Ruoyao
@ 2024-06-26 9:10 ` Lulu Cheng
0 siblings, 0 replies; 3+ messages in thread
From: Lulu Cheng @ 2024-06-26 9:10 UTC
To: Xi Ruoyao, gcc-patches; +Cc: i, xuchenghua
>> ;; We always avoid the shift operation in bstrins_<mode>_for_ior_mask
>> -;; if possible, but the result may be sub-optimal when one of the
>> masks
>> +;; if possible, but the result may be larger when one of the masks
>> ;; is (1 << N) - 1 and one of the src register is the dest register.
>> ;; For example:
>> ;; move t0, a0
>> ;; move a0, a1
>> ;; bstrins.d a0, t0, 42, 0
>> ;; ret
>> -;; using a shift operation would be better:
>> +;; using a shift operation would be smaller:
>> ;; srai.d t0, a1, 43
>> ;; bstrins.d a0, t0, 63, 43
>> ;; ret
>> ;; unfortunately we cannot figure it out in split1: before reload we
>> cannot
>> ;; know if the dest register is one of the src register. Fix it up
>> in
>> ;; peephole2.
>> +;;
>> +;; Note that the first form has a lower latency so this should only
>> be
>> +;; done when optimizing for size.

In my test, the two forms have the same latency. Is there a problem with
my test?

>> (define_peephole2
>> [(set (match_operand:GPR 0 "register_operand")
>> (match_operand:GPR 1 "register_operand"))
>> @@ -1639,7 +1642,7 @@ (define_peephole2
>> (match_operand:SI 3 "const_int_operand")
>> (const_int 0))
>> (match_dup 0))]
>> - "peep2_reg_dead_p (3, operands[0])"
>> + "peep2_reg_dead_p (3, operands[0]) && optimize_insn_for_size_p ()"
>> [(const_int 0)]
>> {
>> int len = GET_MODE_BITSIZE (<MODE>mode) - INTVAL (operands[3]);