public inbox for gcc-patches@gcc.gnu.org
* [PATCH v1 0/3] RISC-V: Improve sequences with shifted zero-extended operands
@ 2022-05-24 21:47 Philipp Tomsich
  2022-05-24 21:47 ` [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate Philipp Tomsich
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Philipp Tomsich @ 2022-05-24 21:47 UTC (permalink / raw)
  To: gcc-patches
  Cc: Manolis Tsamis, Jim Wilson, Vineet Gupta, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich


Code generation currently misses some opportunities for optimized
sequences when zero-extension is combined with shifts.


Philipp Tomsich (3):
  RISC-V: add consecutive_bits_operand predicate
  RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  RISC-V: Replace zero_extendsidi2_shifted with generalized split

 gcc/config/riscv/bitmanip.md               | 44 ++++++++++++++++++++++
 gcc/config/riscv/predicates.md             | 11 ++++++
 gcc/config/riscv/riscv.md                  | 37 +++++++++---------
 gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++++++
 4 files changed, 88 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c

-- 
2.34.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate
  2022-05-24 21:47 [PATCH v1 0/3] RISC-V: Improve sequences with shifted zero-extended operands Philipp Tomsich
@ 2022-05-24 21:47 ` Philipp Tomsich
  2022-06-07 10:25   ` Kito Cheng
  2022-05-24 21:47 ` [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w Philipp Tomsich
  2022-05-24 21:47 ` [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split Philipp Tomsich
  2 siblings, 1 reply; 14+ messages in thread
From: Philipp Tomsich @ 2022-05-24 21:47 UTC (permalink / raw)
  To: gcc-patches
  Cc: Manolis Tsamis, Jim Wilson, Vineet Gupta, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich

Provide an easy way to constrain constants to a single, consecutive
run of ones.

gcc/ChangeLog:

	* config/riscv/predicates.md (consecutive_bits_operand):
          Implement new predicate.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---

 gcc/config/riscv/predicates.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index c37caa2502b..90db5dfcdd5 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -243,3 +243,14 @@ (define_predicate "const63_operand"
 (define_predicate "imm5_operand"
   (and (match_code "const_int")
        (match_test "INTVAL (op) < 5")))
+
+;; A CONST_INT operand that consists of a single run of consecutive set bits.
+(define_predicate "consecutive_bits_operand"
+  (match_code "const_int")
+{
+	unsigned HOST_WIDE_INT val = UINTVAL (op);
+	if (exact_log2 ((val >> ctz_hwi (val)) + 1) < 0)
+	        return false;
+
+	return true;
+})
-- 
2.34.1



* [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  2022-05-24 21:47 [PATCH v1 0/3] RISC-V: Improve sequences with shifted zero-extended operands Philipp Tomsich
  2022-05-24 21:47 ` [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate Philipp Tomsich
@ 2022-05-24 21:47 ` Philipp Tomsich
  2022-06-07 10:25   ` Kito Cheng
  2022-06-17  8:34   ` Andreas Schwab
  2022-05-24 21:47 ` [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split Philipp Tomsich
  2 siblings, 2 replies; 14+ messages in thread
From: Philipp Tomsich @ 2022-05-24 21:47 UTC (permalink / raw)
  To: gcc-patches
  Cc: Manolis Tsamis, Jim Wilson, Vineet Gupta, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich

When encountering a prescaled (biased) value as a candidate for
sh[123]add.uw, the combine pass will present this as shifted by the
aggregate amount (prescale + shift-amount) with an appropriately
adjusted mask constant that has fewer than 32 bits set.

E.g., here's the failing expression seen in combine for a prescale of
1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
  Trying 7, 8 -> 10:
      7: r78:SI=r81:DI#0<<0x1
        REG_DEAD r81:DI
      8: r79:DI=zero_extend(r78:SI)
        REG_DEAD r78:SI
     10: r80:DI=r79:DI<<0x2+r82:DI
        REG_DEAD r79:DI
        REG_DEAD r82:DI
  Failed to match this instruction:
  (set (reg:DI 80 [ cD.1491 ])
      (plus:DI (and:DI (ashift:DI (reg:DI 81)
                       (const_int 3 [0x3]))
               (const_int 17179869176 [0x3fffffff8]))
          (reg:DI 82)))

To address this, we introduce a splitter handling these cases.

gcc/ChangeLog:

	* config/riscv/bitmanip.md: Add split to handle opportunities
	  for slli + sh[123]add.uw

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zba-shadd.c: New test.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-developed-by: Manolis Tsamis <manolis.tsamis@vrull.eu>

---

 gcc/config/riscv/bitmanip.md               | 44 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++++++
 2 files changed, 57 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 0ab9ffe3c0b..6c1ccc6f8c5 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -79,6 +79,50 @@ (define_insn "*shNadduw"
   [(set_attr "type" "bitmanip")
    (set_attr "mode" "DI")])
 
+;; During combine, we may encounter an attempt to combine
+;;   slli rtmp, rs, #imm
+;;   zext.w rtmp, rtmp
+;;   sh[123]add rd, rtmp, rs2
+;; which will lead to the immediate not satisfying the above constraints.
+;; By splitting the compound expression, we can simplify to a slli and a
+;; sh[123]add.uw.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+	(plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+				    (match_operand:QI 2 "immediate_operand"))
+			 (match_operand:DI 3 "consecutive_bits_operand"))
+		 (match_operand:DI 4 "register_operand")))
+   (clobber (match_operand:DI 5 "register_operand"))]
+  "TARGET_64BIT && TARGET_ZBA"
+  [(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6)))
+   (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
+						  (match_dup 7))
+				       (match_dup 8))
+			       (match_dup 4)))]
+{
+	unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
+	/* scale: shift within the sh[123]add.uw */
+	int scale = 32 - clz_hwi (mask);
+	/* bias:  pre-scale amount (i.e. the prior shift amount) */
+	int bias = ctz_hwi (mask) - scale;
+
+	/* If the bias + scale don't add up to operand[2], reject. */
+	if ((scale + bias) != UINTVAL (operands[2]))
+	   FAIL;
+
+	/* If the shift-amount is out-of-range for sh[123]add.uw, reject. */
+	if ((scale < 1) || (scale > 3))
+	   FAIL;
+
+	/* If there's no bias, the '*shNadduw' pattern should have matched. */
+	if (bias == 0)
+	   FAIL;
+
+	operands[6] = GEN_INT (bias);
+	operands[7] = GEN_INT (scale);
+	operands[8] = GEN_INT (0xffffffffULL << scale);
+})
+
 (define_insn "*add.uw"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(plus:DI (zero_extend:DI
diff --git a/gcc/testsuite/gcc.target/riscv/zba-shadd.c b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
new file mode 100644
index 00000000000..33da2530f3f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zba -mabi=lp64" } */
+
+unsigned long foo(unsigned int a, unsigned long b)
+{
+        a = a << 1;
+        unsigned long c = (unsigned long) a;
+        unsigned long d = b + (c<<2);
+        return d;
+}
+
+/* { dg-final { scan-assembler "sh2add.uw" } } */
+/* { dg-final { scan-assembler-not "zext" } } */
\ No newline at end of file
-- 
2.34.1



* [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split
  2022-05-24 21:47 [PATCH v1 0/3] RISC-V: Improve sequences with shifted zero-extended operands Philipp Tomsich
  2022-05-24 21:47 ` [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate Philipp Tomsich
  2022-05-24 21:47 ` [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w Philipp Tomsich
@ 2022-05-24 21:47 ` Philipp Tomsich
  2022-06-07 10:24   ` Kito Cheng
  2 siblings, 1 reply; 14+ messages in thread
From: Philipp Tomsich @ 2022-05-24 21:47 UTC (permalink / raw)
  To: gcc-patches
  Cc: Manolis Tsamis, Jim Wilson, Vineet Gupta, Kito Cheng,
	Palmer Dabbelt, Andrew Waterman, Philipp Tomsich

The current method of treating shifts of extended values on RISC-V
frequently causes sequences of 3 shifts, despite the presence of the
'zero_extendsidi2_shifted' pattern.

Consider:
    unsigned long f(unsigned int a, unsigned long b)
    {
            a = a << 1;
            unsigned long c = (unsigned long) a;
            c = b + (c<<4);
            return c;
    }
which will present at combine-time as:
    Trying 7, 8 -> 9:
        7: r78:SI=r81:DI#0<<0x1
          REG_DEAD r81:DI
        8: r79:DI=zero_extend(r78:SI)
          REG_DEAD r78:SI
        9: r72:DI=r79:DI<<0x4
          REG_DEAD r79:DI
    Failed to match this instruction:
    (set (reg:DI 72 [ _1 ])
        (and:DI (ashift:DI (reg:DI 81)
                (const_int 5 [0x5]))
    	(const_int 68719476704 [0xfffffffe0])))
and produce the following (optimized) assembly:
    f:
	slliw	a5,a0,1
	slli	a5,a5,32
	srli	a5,a5,28
	add	a0,a5,a1
	ret

The current way of handling this (in 'zero_extendsidi2_shifted')
doesn't apply for two reasons:
- this is seen before reload, and
- (more importantly) the constant mask is not 0xfffffffful.

To address this, we introduce a generalized version of shifting
zero-extended values that supports any mask of consecutive ones as
long as the number of trailing zeros is the inner shift-amount.

With this new split, we generate the following assembly for the
aforementioned function:
    f:
	slli	a0,a0,33
	srli	a0,a0,28
	add	a0,a0,a1
	ret

gcc/ChangeLog:

	* config/riscv/riscv.md (zero_extendsidi2_shifted): Replace
	  with a generalized split that requires no clobber, runs
	  before reload and works for smaller masks.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---

 gcc/config/riscv/riscv.md | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index b8ab0cf169a..cc10cd90a74 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2119,23 +2119,26 @@ (define_split
 ;; occur when unsigned int is used for array indexing.  Split this into two
 ;; shifts.  Otherwise we can get 3 shifts.
 
-(define_insn_and_split "zero_extendsidi2_shifted"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
-			   (match_operand:QI 2 "immediate_operand" "I"))
-		(match_operand 3 "immediate_operand" "")))
-   (clobber (match_scratch:DI 4 "=&r"))]
-  "TARGET_64BIT && !TARGET_ZBA
-   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 4)
-	(ashift:DI (match_dup 1) (const_int 32)))
-   (set (match_dup 0)
-	(lshiftrt:DI (match_dup 4) (match_dup 5)))]
-  "operands[5] = GEN_INT (32 - (INTVAL (operands [2])));"
-  [(set_attr "type" "shift")
-   (set_attr "mode" "DI")])
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+	(and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+			   (match_operand:QI 2 "immediate_operand"))
+	        (match_operand:DI 3 "consecutive_bits_operand")))]
+  "TARGET_64BIT"
+  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 4)))
+   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (match_dup 5)))]
+{
+	unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
+	int leading = clz_hwi (mask);
+	int trailing = ctz_hwi (mask);
+
+	/* The shift-amount must match the number of trailing zeros.  */
+	if (trailing != UINTVAL (operands[2]))
+	   FAIL;
+
+	operands[4] = GEN_INT (leading + trailing);
+	operands[5] = GEN_INT (leading);
+})
 
 ;;
 ;;  ....................
-- 
2.34.1



* Re: [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split
  2022-05-24 21:47 ` [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split Philipp Tomsich
@ 2022-06-07 10:24   ` Kito Cheng
  2022-06-07 10:50     ` Philipp Tomsich
  0 siblings, 1 reply; 14+ messages in thread
From: Kito Cheng @ 2022-06-07 10:24 UTC (permalink / raw)
  To: Philipp Tomsich
  Cc: GCC Patches, Manolis Tsamis, Jim Wilson, Vineet Gupta,
	Palmer Dabbelt, Andrew Waterman

On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> The current method of treating shifts of extended values on RISC-V
> frequently causes sequences of 3 shifts, despite the presence of the
> 'zero_extendsidi2_shifted' pattern.
>
> Consider:
>     unsigned long f(unsigned int a, unsigned long b)
>     {
>             a = a << 1;
>             unsigned long c = (unsigned long) a;
>             c = b + (c<<4);
>             return c;
>     }
> which will present at combine-time as:
>     Trying 7, 8 -> 9:
>         7: r78:SI=r81:DI#0<<0x1
>           REG_DEAD r81:DI
>         8: r79:DI=zero_extend(r78:SI)
>           REG_DEAD r78:SI
>         9: r72:DI=r79:DI<<0x4
>           REG_DEAD r79:DI
>     Failed to match this instruction:
>     (set (reg:DI 72 [ _1 ])
>         (and:DI (ashift:DI (reg:DI 81)
>                 (const_int 5 [0x5]))
>         (const_int 68719476704 [0xfffffffe0])))
> and produce the following (optimized) assembly:
>     f:
>         slliw   a5,a0,1
>         slli    a5,a5,32
>         srli    a5,a5,28
>         add     a0,a5,a1
>         ret
>
> The current way of handling this (in 'zero_extendsidi2_shifted')
> doesn't apply for two reasons:
> - this is seen before reload, and
> - (more importantly) the constant mask is not 0xfffffffful.
>
> To address this, we introduce a generalized version of shifting
> zero-extended values that supports any mask of consecutive ones as
> long as the number of trailing zeros is the inner shift-amount.
>
> With this new split, we generate the following assembly for the
> aforementioned function:
>     f:
>         slli    a0,a0,33
>         srli    a0,a0,28
>         add     a0,a0,a1
>         ret
>
> gcc/ChangeLog:
>
>         * config/riscv/riscv.md (zero_extendsidi2_shifted): Replace
>           with a generalized split that requires no clobber, runs
>           before reload and works for smaller masks.
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> ---
>
>  gcc/config/riscv/riscv.md | 37 ++++++++++++++++++++-----------------
>  1 file changed, 20 insertions(+), 17 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index b8ab0cf169a..cc10cd90a74 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -2119,23 +2119,26 @@ (define_split
>  ;; occur when unsigned int is used for array indexing.  Split this into two
>  ;; shifts.  Otherwise we can get 3 shifts.
>
> -(define_insn_and_split "zero_extendsidi2_shifted"
> -  [(set (match_operand:DI 0 "register_operand" "=r")
> -       (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> -                          (match_operand:QI 2 "immediate_operand" "I"))
> -               (match_operand 3 "immediate_operand" "")))
> -   (clobber (match_scratch:DI 4 "=&r"))]
> -  "TARGET_64BIT && !TARGET_ZBA
> -   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> -  "#"
> -  "&& reload_completed"
> -  [(set (match_dup 4)
> -       (ashift:DI (match_dup 1) (const_int 32)))
> -   (set (match_dup 0)
> -       (lshiftrt:DI (match_dup 4) (match_dup 5)))]
> -  "operands[5] = GEN_INT (32 - (INTVAL (operands [2])));"
> -  [(set_attr "type" "shift")
> -   (set_attr "mode" "DI")])
> +(define_split
> +  [(set (match_operand:DI 0 "register_operand")
> +       (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
> +                          (match_operand:QI 2 "immediate_operand"))
> +               (match_operand:DI 3 "consecutive_bits_operand")))]
> +  "TARGET_64BIT"
> +  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 4)))
> +   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (match_dup 5)))]

I would prefer to keep using another register if possible:

like this:
+  [(set (match_dup 6) (ashift:DI (match_dup 1) (match_dup 4)))
+   (set (match_dup 0) (lshiftrt:DI (match_dup 6) (match_dup 5)))]

if (can_create_pseudo_p ())
  operands[6] = gen_reg_rtx (DImode);
else
  operands[6] = operands[0];


> +{
> +       unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
> +       int leading = clz_hwi (mask);
> +       int trailing = ctz_hwi (mask);
> +
> +       /* The shift-amount must match the number of trailing bits */
> +       if (trailing != UINTVAL (operands[2]))
> +          FAIL;
> +
> +       operands[4] = GEN_INT (leading + trailing);
> +       operands[5] = GEN_INT (leading);
> +})
>
>  ;;
>  ;;  ....................
> --
> 2.34.1
>


* Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  2022-05-24 21:47 ` [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w Philipp Tomsich
@ 2022-06-07 10:25   ` Kito Cheng
  2022-06-14 11:39     ` Philipp Tomsich
  2022-06-17  8:34   ` Andreas Schwab
  1 sibling, 1 reply; 14+ messages in thread
From: Kito Cheng @ 2022-06-07 10:25 UTC (permalink / raw)
  To: Philipp Tomsich
  Cc: GCC Patches, Manolis Tsamis, Jim Wilson, Vineet Gupta,
	Palmer Dabbelt, Andrew Waterman

LGTM, you can commit that without [3/3] if you like :)

On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> When encountering a prescaled (biased) value as a candidate for
> sh[123]add.uw, the combine pass will present this as shifted by the
> aggregate amount (prescale + shift-amount) with an appropriately
> adjusted mask constant that has fewer than 32 bits set.
>
> E.g., here's the failing expression seen in combine for a prescale of
> 1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
>   Trying 7, 8 -> 10:
>       7: r78:SI=r81:DI#0<<0x1
>         REG_DEAD r81:DI
>       8: r79:DI=zero_extend(r78:SI)
>         REG_DEAD r78:SI
>      10: r80:DI=r79:DI<<0x2+r82:DI
>         REG_DEAD r79:DI
>         REG_DEAD r82:DI
>   Failed to match this instruction:
>   (set (reg:DI 80 [ cD.1491 ])
>       (plus:DI (and:DI (ashift:DI (reg:DI 81)
>                        (const_int 3 [0x3]))
>                (const_int 17179869176 [0x3fffffff8]))
>           (reg:DI 82)))
>
> To address this, we introduce a splitter handling these cases.
>
> gcc/ChangeLog:
>
>         * config/riscv/bitmanip.md: Add split to handle opportunities
>           for slli + sh[123]add.uw
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/zba-shadd.c: New test.
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> Co-developed-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
>
> ---
>
>  gcc/config/riscv/bitmanip.md               | 44 ++++++++++++++++++++++
>  gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++++++
>  2 files changed, 57 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c
>
> diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> index 0ab9ffe3c0b..6c1ccc6f8c5 100644
> --- a/gcc/config/riscv/bitmanip.md
> +++ b/gcc/config/riscv/bitmanip.md
> @@ -79,6 +79,50 @@ (define_insn "*shNadduw"
>    [(set_attr "type" "bitmanip")
>     (set_attr "mode" "DI")])
>
> +;; During combine, we may encounter an attempt to combine
> +;;   slli rtmp, rs, #imm
> +;;   zext.w rtmp, rtmp
> +;;   sh[123]add rd, rtmp, rs2
> +;; which will lead to the immediate not satisfying the above constraints.
> +;; By splitting the compound expression, we can simplify to a slli and a
> +;; sh[123]add.uw.
> +(define_split
> +  [(set (match_operand:DI 0 "register_operand")
> +       (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
> +                                   (match_operand:QI 2 "immediate_operand"))
> +                        (match_operand:DI 3 "consecutive_bits_operand"))
> +                (match_operand:DI 4 "register_operand")))
> +   (clobber (match_operand:DI 5 "register_operand"))]
> +  "TARGET_64BIT && TARGET_ZBA"
> +  [(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6)))
> +   (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
> +                                                 (match_dup 7))
> +                                      (match_dup 8))
> +                              (match_dup 4)))]
> +{
> +       unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
> +       /* scale: shift within the sh[123]add.uw */
> +       int scale = 32 - clz_hwi (mask);
> +       /* bias:  pre-scale amount (i.e. the prior shift amount) */
> +       int bias = ctz_hwi (mask) - scale;
> +
> +       /* If the bias + scale don't add up to operand[2], reject. */
> +       if ((scale + bias) != UINTVAL (operands[2]))
> +          FAIL;
> +
> +       /* If the shift-amount is out-of-range for sh[123]add.uw, reject. */
> +       if ((scale < 1) || (scale > 3))
> +          FAIL;
> +
> +       /* If there's no bias, the '*shNadduw' pattern should have matched. */
> +       if (bias == 0)
> +          FAIL;
> +
> +       operands[6] = GEN_INT (bias);
> +       operands[7] = GEN_INT (scale);
> +       operands[8] = GEN_INT (0xffffffffULL << scale);
> +})
> +
>  (define_insn "*add.uw"
>    [(set (match_operand:DI 0 "register_operand" "=r")
>         (plus:DI (zero_extend:DI
> diff --git a/gcc/testsuite/gcc.target/riscv/zba-shadd.c b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
> new file mode 100644
> index 00000000000..33da2530f3f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=rv64gc_zba -mabi=lp64" } */
> +
> +unsigned long foo(unsigned int a, unsigned long b)
> +{
> +        a = a << 1;
> +        unsigned long c = (unsigned long) a;
> +        unsigned long d = b + (c<<2);
> +        return d;
> +}
> +
> +/* { dg-final { scan-assembler "sh2add.uw" } } */
> +/* { dg-final { scan-assembler-not "zext" } } */
> \ No newline at end of file
> --
> 2.34.1
>


* Re: [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate
  2022-05-24 21:47 ` [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate Philipp Tomsich
@ 2022-06-07 10:25   ` Kito Cheng
  2022-06-14 11:38     ` Philipp Tomsich
  0 siblings, 1 reply; 14+ messages in thread
From: Kito Cheng @ 2022-06-07 10:25 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: GCC Patches, Andrew Waterman, Vineet Gupta

LGTM


On Wed, May 25, 2022 at 5:48 AM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> Provide an easy way to constrain constants to a single, consecutive
> run of ones.
>
> gcc/ChangeLog:
>
>         * config/riscv/predicates.md (consecutive_bits_operand):
>           Implement new predicate.
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> ---
>
>  gcc/config/riscv/predicates.md | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> index c37caa2502b..90db5dfcdd5 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -243,3 +243,14 @@ (define_predicate "const63_operand"
>  (define_predicate "imm5_operand"
>    (and (match_code "const_int")
>         (match_test "INTVAL (op) < 5")))
> +
> +;; A CONST_INT operand that consists of a single run of consecutive set bits.
> +(define_predicate "consecutive_bits_operand"
> +  (match_code "const_int")
> +{
> +       unsigned HOST_WIDE_INT val = UINTVAL (op);
> +       if (exact_log2 ((val >> ctz_hwi (val)) + 1) < 0)
> +               return false;
> +
> +       return true;
> +})
> --
> 2.34.1
>


* Re: [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split
  2022-06-07 10:24   ` Kito Cheng
@ 2022-06-07 10:50     ` Philipp Tomsich
  2022-06-07 13:18       ` Kito Cheng
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Tomsich @ 2022-06-07 10:50 UTC (permalink / raw)
  To: Kito Cheng
  Cc: GCC Patches, Manolis Tsamis, Jim Wilson, Vineet Gupta,
	Palmer Dabbelt, Andrew Waterman

On Tue, 7 Jun 2022 at 12:24, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
> >
> > The current method of treating shifts of extended values on RISC-V
> > frequently causes sequences of 3 shifts, despite the presence of the
> > 'zero_extendsidi2_shifted' pattern.
> >
> > Consider:
> >     unsigned long f(unsigned int a, unsigned long b)
> >     {
> >             a = a << 1;
> >             unsigned long c = (unsigned long) a;
> >             c = b + (c<<4);
> >             return c;
> >     }
> > which will present at combine-time as:
> >     Trying 7, 8 -> 9:
> >         7: r78:SI=r81:DI#0<<0x1
> >           REG_DEAD r81:DI
> >         8: r79:DI=zero_extend(r78:SI)
> >           REG_DEAD r78:SI
> >         9: r72:DI=r79:DI<<0x4
> >           REG_DEAD r79:DI
> >     Failed to match this instruction:
> >     (set (reg:DI 72 [ _1 ])
> >         (and:DI (ashift:DI (reg:DI 81)
> >                 (const_int 5 [0x5]))
> >         (const_int 68719476704 [0xfffffffe0])))
> > and produce the following (optimized) assembly:
> >     f:
> >         slliw   a5,a0,1
> >         slli    a5,a5,32
> >         srli    a5,a5,28
> >         add     a0,a5,a1
> >         ret
> >
> > The current way of handling this (in 'zero_extendsidi2_shifted')
> > doesn't apply for two reasons:
> > - this is seen before reload, and
> > - (more importantly) the constant mask is not 0xfffffffful.
> >
> > To address this, we introduce a generalized version of shifting
> > zero-extended values that supports any mask of consecutive ones as
> > long as the number of trailing zeros is the inner shift-amount.
> >
> > With this new split, we generate the following assembly for the
> > aforementioned function:
> >     f:
> >         slli    a0,a0,33
> >         srli    a0,a0,28
> >         add     a0,a0,a1
> >         ret
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/riscv.md (zero_extendsidi2_shifted): Replace
> >           with a generalized split that requires no clobber, runs
> >           before reload and works for smaller masks.
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> >
> >  gcc/config/riscv/riscv.md | 37 ++++++++++++++++++++-----------------
> >  1 file changed, 20 insertions(+), 17 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index b8ab0cf169a..cc10cd90a74 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -2119,23 +2119,26 @@ (define_split
> >  ;; occur when unsigned int is used for array indexing.  Split this into two
> >  ;; shifts.  Otherwise we can get 3 shifts.
> >
> > -(define_insn_and_split "zero_extendsidi2_shifted"
> > -  [(set (match_operand:DI 0 "register_operand" "=r")
> > -       (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> > -                          (match_operand:QI 2 "immediate_operand" "I"))
> > -               (match_operand 3 "immediate_operand" "")))
> > -   (clobber (match_scratch:DI 4 "=&r"))]
> > -  "TARGET_64BIT && !TARGET_ZBA
> > -   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> > -  "#"
> > -  "&& reload_completed"
> > -  [(set (match_dup 4)
> > -       (ashift:DI (match_dup 1) (const_int 32)))
> > -   (set (match_dup 0)
> > -       (lshiftrt:DI (match_dup 4) (match_dup 5)))]
> > -  "operands[5] = GEN_INT (32 - (INTVAL (operands [2])));"
> > -  [(set_attr "type" "shift")
> > -   (set_attr "mode" "DI")])
> > +(define_split
> > +  [(set (match_operand:DI 0 "register_operand")
> > +       (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
> > +                          (match_operand:QI 2 "immediate_operand"))
> > +               (match_operand:DI 3 "consecutive_bits_operand")))]
> > +  "TARGET_64BIT"
> > +  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 4)))
> > +   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (match_dup 5)))]
>
> I would prefer to keep using another register if possible:
>
> like this:
> +  [(set (match_dup 6) (ashift:DI (match_dup 1) (match_dup 4)))
> +   (set (match_dup 0) (lshiftrt:DI (match_dup 6) (match_dup 5)))]
>
> if (can_create_pseudo_p ())
>   operands[6] = gen_reg_rtx (DImode);
> else
>   operands[6] = operands[0];

I don't see the benefit to this (unless you expect opportunities for
CSE), as there will be a linear dependency chain anyway.  I'd like to
understand your reasoning behind this a bit better, as our style
currently generally tries to avoid introducing temporaries where
possible.

Thanks,
Philipp.

>
> > +{
> > +       unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
> > +       int leading = clz_hwi (mask);
> > +       int trailing = ctz_hwi (mask);
> > +
> > +       /* The shift-amount must match the number of trailing bits */
> > +       if (trailing != UINTVAL (operands[2]))
> > +          FAIL;
> > +
> > +       operands[4] = GEN_INT (leading + trailing);
> > +       operands[5] = GEN_INT (leading);
> > +})
> >
> >  ;;
> >  ;;  ....................
> > --
> > 2.34.1
> >


* Re: [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split
  2022-06-07 10:50     ` Philipp Tomsich
@ 2022-06-07 13:18       ` Kito Cheng
  0 siblings, 0 replies; 14+ messages in thread
From: Kito Cheng @ 2022-06-07 13:18 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: Andrew Waterman, Vineet Gupta, GCC Patches

Using the same pseudo register creates one longer live range instead
of two shorter live ranges, which is not good when the instruction
scheduler tries to separate those two instructions.  I also think the
register allocator has more complete knowledge to decide which way is
better (the same register or different ones), so I prefer to use
another pseudo here if possible.

That's also what the AArch64/ARM/x86 ports do: use a new pseudo as a
temporary if possible.


On Tue, Jun 7, 2022 at 6:50 PM Philipp Tomsich <philipp.tomsich@vrull.eu> wrote:
>
> On Tue, 7 Jun 2022 at 12:24, Kito Cheng <kito.cheng@gmail.com> wrote:
> >
> > On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich
> > <philipp.tomsich@vrull.eu> wrote:
> > >
> > > The current method of treating shifts of extended values on RISC-V
> > > frequently causes sequences of 3 shifts, despite the presence of the
> > > 'zero_extendsidi2_shifted' pattern.
> > >
> > > Consider:
> > >     unsigned long f(unsigned int a, unsigned long b)
> > >     {
> > >             a = a << 1;
> > >             unsigned long c = (unsigned long) a;
> > >             c = b + (c<<4);
> > >             return c;
> > >     }
> > > which will present at combine-time as:
> > >     Trying 7, 8 -> 9:
> > >         7: r78:SI=r81:DI#0<<0x1
> > >           REG_DEAD r81:DI
> > >         8: r79:DI=zero_extend(r78:SI)
> > >           REG_DEAD r78:SI
> > >         9: r72:DI=r79:DI<<0x4
> > >           REG_DEAD r79:DI
> > >     Failed to match this instruction:
> > >     (set (reg:DI 72 [ _1 ])
> > >         (and:DI (ashift:DI (reg:DI 81)
> > >                 (const_int 5 [0x5]))
> > >         (const_int 68719476704 [0xfffffffe0])))
> > > and produce the following (optimized) assembly:
> > >     f:
> > >         slliw   a5,a0,1
> > >         slli    a5,a5,32
> > >         srli    a5,a5,28
> > >         add     a0,a5,a1
> > >         ret
> > >
> > > The current way of handling this (in 'zero_extendsidi2_shifted')
> > > doesn't apply for two reasons:
> > > - this is seen before reload, and
> > > - (more importantly) the constant mask is not 0xfffffffful.
> > >
> > > To address this, we introduce a generalized version of shifting
> > > zero-extended values that supports any mask of consecutive ones as
> > > long as the number of trailing zeros is the inner shift-amount.
> > >
> > > With this new split, we generate the following assembly for the
> > > aforementioned function:
> > >     f:
> > >         slli    a0,a0,33
> > >         srli    a0,a0,28
> > >         add     a0,a0,a1
> > >         ret
> > >
> > > gcc/ChangeLog:
> > >
> > >         * config/riscv/riscv.md (zero_extendsidi2_shifted): Replace
> > >           with a generalized split that requires no clobber, runs
> > >           before reload and works for smaller masks.
> > >
> > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > > ---
> > >
> > >  gcc/config/riscv/riscv.md | 37 ++++++++++++++++++++-----------------
> > >  1 file changed, 20 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > > index b8ab0cf169a..cc10cd90a74 100644
> > > --- a/gcc/config/riscv/riscv.md
> > > +++ b/gcc/config/riscv/riscv.md
> > > @@ -2119,23 +2119,26 @@ (define_split
> > >  ;; occur when unsigned int is used for array indexing.  Split this into two
> > >  ;; shifts.  Otherwise we can get 3 shifts.
> > >
> > > -(define_insn_and_split "zero_extendsidi2_shifted"
> > > -  [(set (match_operand:DI 0 "register_operand" "=r")
> > > -       (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> > > -                          (match_operand:QI 2 "immediate_operand" "I"))
> > > -               (match_operand 3 "immediate_operand" "")))
> > > -   (clobber (match_scratch:DI 4 "=&r"))]
> > > -  "TARGET_64BIT && !TARGET_ZBA
> > > -   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> > > -  "#"
> > > -  "&& reload_completed"
> > > -  [(set (match_dup 4)
> > > -       (ashift:DI (match_dup 1) (const_int 32)))
> > > -   (set (match_dup 0)
> > > -       (lshiftrt:DI (match_dup 4) (match_dup 5)))]
> > > -  "operands[5] = GEN_INT (32 - (INTVAL (operands [2])));"
> > > -  [(set_attr "type" "shift")
> > > -   (set_attr "mode" "DI")])
> > > +(define_split
> > > +  [(set (match_operand:DI 0 "register_operand")
> > > +       (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
> > > +                          (match_operand:QI 2 "immediate_operand"))
> > > +               (match_operand:DI 3 "consecutive_bits_operand")))]
> > > +  "TARGET_64BIT"
> > > +  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 4)))
> > > +   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (match_dup 5)))]
> >
> > I would prefer to keep using another register if possible:
> >
> > like this:
> > +  [(set (match_dup 6) (ashift:DI (match_dup 1) (match_dup 4)))
> > +   (set (match_dup 0) (lshiftrt:DI (match_dup 6) (match_dup 5)))]
> >
> > if (can_create_pseudo_p ())
> >   operands[6] = gen_reg_rtx (DImode);
> > else
> >   operands[6] = operands[0];
>
> I don't see the benefit to this (unless you expect opportunities for
> CSE), as there will be a linear dependency chain anyway.  I'd like to
> understand your reasoning behind this a bit better, as our style
> currently generally tries to not avoid introducing temporaries if it
> is avoidable.
>
> Thanks,
> Philipp.
>
> >
> > > +{
> > > +       unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
> > > +       int leading = clz_hwi (mask);
> > > +       int trailing = ctz_hwi (mask);
> > > +
> > > +       /* The shift-amount must match the number of trailing zeros.  */
> > > +       if (trailing != UINTVAL (operands[2]))
> > > +          FAIL;
> > > +
> > > +       operands[4] = GEN_INT (leading + trailing);
> > > +       operands[5] = GEN_INT (leading);
> > > +})
> > >
> > >  ;;
> > >  ;;  ....................
> > > --
> > > 2.34.1
> > >
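For readers following along, the index arithmetic the split relies on can be checked outside GCC. The sketch below is a hypothetical stand-alone C model, not GCC code: the compiler builtins `__builtin_clzll`/`__builtin_ctzll` stand in for GCC's `clz_hwi`/`ctz_hwi`, and the mask must be nonzero.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the split: for a mask that is one run of ones with `shamt`
   trailing zeros, (x << shamt) & mask equals
   (x << (leading + trailing)) >> leading, where leading/trailing count
   the zero bits around the run of ones.  */

static unsigned left_amount (uint64_t mask)
{ return __builtin_clzll (mask) + __builtin_ctzll (mask); }

static unsigned right_amount (uint64_t mask)
{ return __builtin_clzll (mask); }

/* The two-shift replacement sequence emitted by the split.  */
static uint64_t two_shifts (uint64_t x, uint64_t mask)
{ return (x << left_amount (mask)) >> right_amount (mask); }

/* The original masked-shift form; the split only applies (does not
   FAIL) when the mask's trailing-zero count matches shamt.  */
static uint64_t masked_shift (uint64_t x, uint64_t mask, unsigned shamt)
{ return (x << shamt) & mask; }
```

With the mask 0xfffffffe0 from the commit message, this recovers the slli-by-33/srli-by-28 pair shown in the optimized assembly.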

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate
  2022-06-07 10:25   ` Kito Cheng
@ 2022-06-14 11:38     ` Philipp Tomsich
  0 siblings, 0 replies; 14+ messages in thread
From: Philipp Tomsich @ 2022-06-14 11:38 UTC (permalink / raw)
  To: Kito Cheng; +Cc: GCC Patches, Andrew Waterman, Vineet Gupta

Thanks, applied to master!


On Tue, 7 Jun 2022 at 12:26, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> LGTM
>
>
> On Wed, May 25, 2022 at 5:48 AM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
> >
> > Provide an easy way to constrain for constants that are a single,
> > consecutive run of ones.
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/predicates.md (consecutive_bits_operand):
> >           Implement new predicate.
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> >
> >  gcc/config/riscv/predicates.md | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
> > index c37caa2502b..90db5dfcdd5 100644
> > --- a/gcc/config/riscv/predicates.md
> > +++ b/gcc/config/riscv/predicates.md
> > @@ -243,3 +243,14 @@ (define_predicate "const63_operand"
> >  (define_predicate "imm5_operand"
> >    (and (match_code "const_int")
> >         (match_test "INTVAL (op) < 5")))
> > +
> > +;; A CONST_INT operand that consists of a single run of consecutive set bits.
> > +(define_predicate "consecutive_bits_operand"
> > +  (match_code "const_int")
> > +{
> > +       unsigned HOST_WIDE_INT val = UINTVAL (op);
> > +       if (exact_log2 ((val >> ctz_hwi (val)) + 1) < 0)
> > +               return false;
> > +
> > +       return true;
> > +})
> > --
> > 2.34.1
> >
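The bit trick in the predicate above (shift out the trailing zeros, then check that adding one yields a power of two) can be sketched as stand-alone C. This is a hypothetical model, not the GCC code: `__builtin_ctzll` stands in for `ctz_hwi`, and zero is rejected explicitly here since the builtin is undefined for zero inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the set bits of val form one contiguous run: after
   discarding trailing zeros, the remaining value must be of the form
   2^k - 1, i.e. run & (run + 1) == 0.  */
static bool consecutive_bits_p (uint64_t val)
{
  if (val == 0)
    return false;                               /* no run of ones */
  uint64_t run = val >> __builtin_ctzll (val);  /* strip trailing zeros */
  return (run & (run + 1)) == 0;                /* run + 1 a power of 2? */
}
```

Masks such as 0xfffffffe0 and 0x3fffffff8 from the other two patches in this series satisfy the check; a value like 0xf0f0 does not.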

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  2022-06-07 10:25   ` Kito Cheng
@ 2022-06-14 11:39     ` Philipp Tomsich
  0 siblings, 0 replies; 14+ messages in thread
From: Philipp Tomsich @ 2022-06-14 11:39 UTC (permalink / raw)
  To: Kito Cheng
  Cc: GCC Patches, Manolis Tsamis, Jim Wilson, Vineet Gupta,
	Palmer Dabbelt, Andrew Waterman

Thanks, applied to master!

For [3/3], I'll submit a new standalone patch with the requested changes.

On Tue, 7 Jun 2022 at 12:25, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> LGTM, you can commit that without [3/3] if you like :)
>
> On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
> >
> > When encountering a prescaled (biased) value as a candidate for
> > sh[123]add.uw, the combine pass will present this as shifted by the
> > aggregate amount (prescale + shift-amount) with an appropriately
> > adjusted mask constant that has fewer than 32 bits set.
> >
> > E.g., here's the failing expression seen in combine for a prescale of
> > 1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
> >   Trying 7, 8 -> 10:
> >       7: r78:SI=r81:DI#0<<0x1
> >         REG_DEAD r81:DI
> >       8: r79:DI=zero_extend(r78:SI)
> >         REG_DEAD r78:SI
> >      10: r80:DI=r79:DI<<0x2+r82:DI
> >         REG_DEAD r79:DI
> >         REG_DEAD r82:DI
> >   Failed to match this instruction:
> >   (set (reg:DI 80 [ cD.1491 ])
> >       (plus:DI (and:DI (ashift:DI (reg:DI 81)
> >                        (const_int 3 [0x3]))
> >                (const_int 17179869176 [0x3fffffff8]))
> >           (reg:DI 82)))
> >
> > To address this, we introduce a splitter handling these cases.
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/bitmanip.md: Add split to handle opportunities
> >           for slli + sh[123]add.uw
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/zba-shadd.c: New test.
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > Co-developed-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
> >
> > ---
> >
> >  gcc/config/riscv/bitmanip.md               | 44 ++++++++++++++++++++++
> >  gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++++++
> >  2 files changed, 57 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c
> >
> > diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> > index 0ab9ffe3c0b..6c1ccc6f8c5 100644
> > --- a/gcc/config/riscv/bitmanip.md
> > +++ b/gcc/config/riscv/bitmanip.md
> > @@ -79,6 +79,50 @@ (define_insn "*shNadduw"
> >    [(set_attr "type" "bitmanip")
> >     (set_attr "mode" "DI")])
> >
> > +;; During combine, we may encounter an attempt to combine
> > +;;   slli rtmp, rs, #imm
> > +;;   zext.w rtmp, rtmp
> > +;;   sh[123]add rd, rtmp, rs2
> > +;; which will lead to the immediate not satisfying the above constraints.
> > +;; By splitting the compound expression, we can simplify to a slli and a
> > +;; sh[123]add.uw.
> > +(define_split
> > +  [(set (match_operand:DI 0 "register_operand")
> > +       (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
> > +                                   (match_operand:QI 2 "immediate_operand"))
> > +                        (match_operand:DI 3 "consecutive_bits_operand"))
> > +                (match_operand:DI 4 "register_operand")))
> > +   (clobber (match_operand:DI 5 "register_operand"))]
> > +  "TARGET_64BIT && TARGET_ZBA"
> > +  [(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6)))
> > +   (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
> > +                                                 (match_dup 7))
> > +                                      (match_dup 8))
> > +                              (match_dup 4)))]
> > +{
> > +       unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
> > +       /* scale: shift within the sh[123]add.uw */
> > +       int scale = 32 - clz_hwi (mask);
> > +       /* bias:  pre-scale amount (i.e. the prior shift amount) */
> > +       int bias = ctz_hwi (mask) - scale;
> > +
> > +       /* If the bias + scale don't add up to operand[2], reject. */
> > +       if ((scale + bias) != UINTVAL (operands[2]))
> > +          FAIL;
> > +
> > +       /* If the shift-amount is out-of-range for sh[123]add.uw, reject. */
> > +       if ((scale < 1) || (scale > 3))
> > +          FAIL;
> > +
> > +       /* If there's no bias, the '*shNadduw' pattern should have matched. */
> > +       if (bias == 0)
> > +          FAIL;
> > +
> > +       operands[6] = GEN_INT (bias);
> > +       operands[7] = GEN_INT (scale);
> > +       operands[8] = GEN_INT (0xffffffffULL << scale);
> > +})
> > +
> >  (define_insn "*add.uw"
> >    [(set (match_operand:DI 0 "register_operand" "=r")
> >         (plus:DI (zero_extend:DI
> > diff --git a/gcc/testsuite/gcc.target/riscv/zba-shadd.c b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
> > new file mode 100644
> > index 00000000000..33da2530f3f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
> > @@ -0,0 +1,13 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -march=rv64gc_zba -mabi=lp64" } */
> > +
> > +unsigned long foo(unsigned int a, unsigned long b)
> > +{
> > +        a = a << 1;
> > +        unsigned long c = (unsigned long) a;
> > +        unsigned long d = b + (c<<2);
> > +        return d;
> > +}
> > +
> > +/* { dg-final { scan-assembler "sh2add.uw" } } */
> > +/* { dg-final { scan-assembler-not "zext" } } */
> > \ No newline at end of file
> > --
> > 2.34.1
> >
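The scale/bias recovery in this split can likewise be modeled as stand-alone C. This is a hypothetical sketch, not the GCC code: the compiler builtins stand in for `clz_hwi`/`ctz_hwi`, and the mask must be nonzero. For the mask 0x3fffffff8 from the commit message it recovers scale 2 and bias 1.

```c
#include <assert.h>
#include <stdint.h>

/* The sh[123]add.uw shift amount ("scale") is 32 minus the count of
   leading zeros of the combined mask, and the prior slli amount
   ("bias") is the count of trailing zeros minus that scale.  */

static int uw_scale (uint64_t mask)
{ return 32 - __builtin_clzll (mask); }

static int uw_bias (uint64_t mask)
{ return __builtin_ctzll (mask) - uw_scale (mask); }

/* The split applies only when scale + bias matches the aggregate
   shift amount, scale is in [1,3], and there is a nonzero bias
   (otherwise the plain '*shNadduw' pattern should have matched).  */
static int uw_split_applies (uint64_t mask, unsigned shamt)
{
  int scale = uw_scale (mask);
  int bias = uw_bias (mask);
  return (unsigned) (scale + bias) == shamt
         && scale >= 1 && scale <= 3
         && bias != 0;
}
```

A mask of 0xffffffff shifted left by exactly the scale (e.g. 0x3fffffffc with shamt 2) has bias 0 and is correctly rejected, since '*shNadduw' already covers it.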

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  2022-05-24 21:47 ` [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w Philipp Tomsich
  2022-06-07 10:25   ` Kito Cheng
@ 2022-06-17  8:34   ` Andreas Schwab
  2022-06-17 14:00     ` Kito Cheng
  1 sibling, 1 reply; 14+ messages in thread
From: Andreas Schwab @ 2022-06-17  8:34 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: gcc-patches, Andrew Waterman, Vineet Gupta, Kito Cheng

../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* gen_split_44(rtx_ins\
n*, rtx_def**)':
../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer express\
ions of different signedness: 'int' and 'long unsigned int' [-Werror=sign-compa\
re]
  110 |         if ((scale + bias) != UINTVAL (operands[2]))

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  2022-06-17  8:34   ` Andreas Schwab
@ 2022-06-17 14:00     ` Kito Cheng
  2022-06-17 14:19       ` Philipp Tomsich
  0 siblings, 1 reply; 14+ messages in thread
From: Kito Cheng @ 2022-06-17 14:00 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Philipp Tomsich, Vineet Gupta, GCC Patches, Andrew Waterman

Hi Andreas:

Fixed via https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7,
thanks!

On Fri, Jun 17, 2022 at 4:34 PM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> ../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* gen_split_44(rtx_ins\
> n*, rtx_def**)':
> ../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer express\
> ions of different signedness: 'int' and 'long unsigned int' [-Werror=sign-compa\
> re]
>   110 |         if ((scale + bias) != UINTVAL (operands[2]))
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  2022-06-17 14:00     ` Kito Cheng
@ 2022-06-17 14:19       ` Philipp Tomsich
  0 siblings, 0 replies; 14+ messages in thread
From: Philipp Tomsich @ 2022-06-17 14:19 UTC (permalink / raw)
  To: Kito Cheng; +Cc: Andreas Schwab, Vineet Gupta, GCC Patches, Andrew Waterman

Kito, thanks: you were a few minutes ahead of my fix there.

On Fri, 17 Jun 2022 at 16:00, Kito Cheng <kito.cheng@gmail.com> wrote:

> Hi Andreas:
>
> Fixed via
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7
> ,
> thanks!
>
> On Fri, Jun 17, 2022 at 4:34 PM Andreas Schwab <schwab@linux-m68k.org>
> wrote:
> >
> > ../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn*
> gen_split_44(rtx_ins\
> > n*, rtx_def**)':
> > ../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer
> express\
> > ions of different signedness: 'int' and 'long unsigned int'
> [-Werror=sign-compa\
> > re]
> >   110 |         if ((scale + bias) != UINTVAL (operands[2]))
> >
> > --
> > Andreas Schwab, schwab@linux-m68k.org
> > GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> > "And now for something completely different."
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-06-17 14:20 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-24 21:47 [PATCH v1 0/3] RISC-V: Improve sequences with shifted zero-extended operands Philipp Tomsich
2022-05-24 21:47 ` [PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate Philipp Tomsich
2022-06-07 10:25   ` Kito Cheng
2022-06-14 11:38     ` Philipp Tomsich
2022-05-24 21:47 ` [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w Philipp Tomsich
2022-06-07 10:25   ` Kito Cheng
2022-06-14 11:39     ` Philipp Tomsich
2022-06-17  8:34   ` Andreas Schwab
2022-06-17 14:00     ` Kito Cheng
2022-06-17 14:19       ` Philipp Tomsich
2022-05-24 21:47 ` [PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split Philipp Tomsich
2022-06-07 10:24   ` Kito Cheng
2022-06-07 10:50     ` Philipp Tomsich
2022-06-07 13:18       ` Kito Cheng
