public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
@ 2022-09-05 21:44 Philipp Tomsich
  2022-09-06 11:39 ` Alexander Monakov
  0 siblings, 1 reply; 18+ messages in thread
From: Philipp Tomsich @ 2022-09-05 21:44 UTC (permalink / raw)
  To: gcc-patches
  Cc: Palmer Dabbelt, Vineet Gupta, Jojo R, Christoph Muellner,
	Kito Cheng, Philipp Tomsich

TARGET_MODE_REP_EXTENDED is supposed to match LOAD_EXTEND_OP, so this
adds an implementation using the same logic as in LOAD_EXTEND_OP.

This reduces the number of extension operations, as evidenced in the
reduction of dynamic instructions for the xz benchmark in SPEC CPU:

                    # dynamic instructions
                    baseline          new          improvement
xz, workload 1    384681308026    374464538911        2.66%
xz, workload 2    985995327109    974304030498        1.19%
xz, workload 3    545372994523    533717744260        2.14%

The shift-shift-2.c testcase needs to be adjusted, as it will no
longer use slliw/slriw for sub5, but will instead emit slli/slri.

No new regressions runnung the riscv.exp suite.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_mode_rep_extended):
	(TARGET_MODE_REP_EXTENDED): Implement.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/shift-shift-2.c: Adjust.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>

---

 gcc/config/riscv/riscv.cc                      | 15 +++++++++++++++
 gcc/testsuite/gcc.target/riscv/shift-shift-2.c |  2 --
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 675d92c0961..cf829f390ab 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5053,6 +5053,18 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
   return true;
 }
 
+/* Implement TARGET_MODE_REP_EXTENDED.  */
+
+static int
+riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep)
+{
+  /* On 64-bit targets, SImode register values are sign-extended to DImode.  */
+  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
+    return SIGN_EXTEND;
+
+  return UNKNOWN;
+}
+
 /* Implement TARGET_MODES_TIEABLE_P.
 
    Don't allow floating-point modes to be tied, since type punning of
@@ -6153,6 +6165,9 @@ riscv_init_libfuncs (void)
 #undef TARGET_HARD_REGNO_MODE_OK
 #define TARGET_HARD_REGNO_MODE_OK riscv_hard_regno_mode_ok
 
+#undef TARGET_MODE_REP_EXTENDED
+#define TARGET_MODE_REP_EXTENDED riscv_mode_rep_extended
+
 #undef TARGET_MODES_TIEABLE_P
 #define TARGET_MODES_TIEABLE_P riscv_modes_tieable_p
 
diff --git a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
index 5f93be15ac5..2f38b3f0fec 100644
--- a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
+++ b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c
@@ -38,5 +38,3 @@ sub5 (unsigned int i)
 }
 /* { dg-final { scan-assembler-times "slli" 5 } } */
 /* { dg-final { scan-assembler-times "srli" 5 } } */
-/* { dg-final { scan-assembler-times "slliw" 1 } } */
-/* { dg-final { scan-assembler-times "srliw" 1 } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-09-05 21:44 [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED Philipp Tomsich
@ 2022-09-06 11:39 ` Alexander Monakov
  2022-09-16 23:48   ` Jeff Law
  2022-11-04 23:00   ` Philipp Tomsich
  0 siblings, 2 replies; 18+ messages in thread
From: Alexander Monakov @ 2022-09-06 11:39 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: gcc-patches, Vineet Gupta, Kito Cheng, Jojo R

On Mon, 5 Sep 2022, Philipp Tomsich wrote:

> +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep)
> +{
> +  /* On 64-bit targets, SImode register values are sign-extended to DImode.  */
> +  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
> +    return SIGN_EXTEND;

I think this leads to a counter-intuitive requirement that a hand-written
inline asm must sign-extend its output operands that are bound to either
signed or unsigned 32-bit lvalues. Will compiler users be aware of that?

Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause
miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing
hook says that nothing needs to be done to truncate, but the new hook says
that the result of the truncation is properly sign-extended.

The documentation for TARGET_MODE_REP_EXTENDED warns about that:

    In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION
    should return false when truncating to mode. 

Alexander

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-09-06 11:39 ` Alexander Monakov
@ 2022-09-16 23:48   ` Jeff Law
  2022-09-17  7:59     ` Palmer Dabbelt
  2022-11-04 23:00   ` Philipp Tomsich
  1 sibling, 1 reply; 18+ messages in thread
From: Jeff Law @ 2022-09-16 23:48 UTC (permalink / raw)
  To: gcc-patches


On 9/6/22 05:39, Alexander Monakov via Gcc-patches wrote:
> On Mon, 5 Sep 2022, Philipp Tomsich wrote:
>
>> +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep)
>> +{
>> +  /* On 64-bit targets, SImode register values are sign-extended to DImode.  */
>> +  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
>> +    return SIGN_EXTEND;
> I think this leads to a counter-intuitive requirement that a hand-written
> inline asm must sign-extend its output operands that are bound to either
> signed or unsigned 32-bit lvalues. Will compiler users be aware of that?

Is this significantly different than on MIPS?  Hand-written code there 
also has to ensure that the results are properly sign extended and it's 
been that way for 20+ years since the introduction of mips64 IIRC.  
Though I don't think we had MODE_REP_EXTENDED that long.

Haha, MIPS is the only target that currently defines 
TARGET_MODE_REP_EXTENDED :-)



>
> Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause
> miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing
> hook says that nothing needs to be done to truncate, but the new hook says
> that the result of the truncation is properly sign-extended.
>
> The documentation for TARGET_MODE_REP_EXTENDED warns about that:
>
>      In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION
>      should return false when truncating to mode.

This may well need adjusting in Philipp's patch.   I'd be surprised if 
the MIPS definition wasn't usable nearly verbatim here.


jeff


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-09-16 23:48   ` Jeff Law
@ 2022-09-17  7:59     ` Palmer Dabbelt
  0 siblings, 0 replies; 18+ messages in thread
From: Palmer Dabbelt @ 2022-09-17  7:59 UTC (permalink / raw)
  To: gcc-patches

On Fri, 16 Sep 2022 16:48:24 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
>
> On 9/6/22 05:39, Alexander Monakov via Gcc-patches wrote:
>> On Mon, 5 Sep 2022, Philipp Tomsich wrote:
>>
>>> +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep)
>>> +{
>>> +  /* On 64-bit targets, SImode register values are sign-extended to DImode.  */
>>> +  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
>>> +    return SIGN_EXTEND;
>> I think this leads to a counter-intuitive requirement that a hand-written
>> inline asm must sign-extend its output operands that are bound to either
>> signed or unsigned 32-bit lvalues. Will compiler users be aware of that?
>
> Is this significantly different than on MIPS?  Hand-written code there
> also has to ensure that the results are properly sign extended and it's
> been that way for 20+ years since the introduction of mips64 IIRC. 
> Though I don't think we had MODE_REP_EXTENDED that long.

IMO the problem isn't so much that asm has this constraint, it's that 
it's a new constraint and thus risks breaking code that used to work.  
That said...

> Haha, MIPS is the only target that currently defines
> TARGET_MODE_REP_EXTENDED :-)
>
>
>
>>
>> Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause
>> miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing
>> hook says that nothing needs to be done to truncate, but the new hook says
>> that the result of the truncation is properly sign-extended.
>>
>> The documentation for TARGET_MODE_REP_EXTENDED warns about that:
>>
>>      In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION
>>      should return false when truncating to mode.
>
> This may well need adjusting in Philipp's patch.   I'd be surprised if
> the MIPS definition wasn't usable nearly verbatim here.

Yes, and we have a few MIPS-isms in the ISA but don't have the same 
flavor of TRULY_NOOP_TRUNCATION.  It's been pointed out a handful of 
times and I'm not sure what the right way to go is here, every time I 
try and reason about which is going to produce better code I come up 
with a different answer.  IIRC last time I looked at this I came to the 
conclusion that we're doing the right thing for RISC-V because most of 
our instructions implicitly truncate.  It's pretty easy to generate bad 
code here and I'm pretty sure we could fix some of that by moving to a 
more MIPS-like TRULY_MODE_TRUNCATION, but I think we'd end up just 
pushing the problems around.

Every time I look at this I also get worried that we've leaked some of 
these internal promotion rules into something visible to inline asm, but 
when I poke around it seems like things generally work.



>
>
> jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-09-06 11:39 ` Alexander Monakov
  2022-09-16 23:48   ` Jeff Law
@ 2022-11-04 23:00   ` Philipp Tomsich
  2022-11-05  6:16     ` [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations Zhongyunde
                       ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Philipp Tomsich @ 2022-11-04 23:00 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Vineet Gupta, Kito Cheng, Jojo R

[-- Attachment #1: Type: text/plain, Size: 3889 bytes --]

Alexander,

I had missed your comment until now.

On Tue, 6 Sept 2022 at 13:39, Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Mon, 5 Sep 2022, Philipp Tomsich wrote:
>
> > +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode
mode_rep)
> > +{
> > +  /* On 64-bit targets, SImode register values are sign-extended to
DImode.  */
> > +  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
> > +    return SIGN_EXTEND;
>
> I think this leads to a counter-intuitive requirement that a hand-written
> inline asm must sign-extend its output operands that are bound to either
> signed or unsigned 32-bit lvalues. Will compiler users be aware of that?

I am not sure if I fully understand your concern, as the mode of the
asm-output will be derived from the variable type.
So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type
of a.

The concern, as far as I understand would be the case where the
assembly-sequence leaves an incompatible extension in the register.
So l understand the problem being:
    int a;
    asm volatile ("zext.w %0, %1" : "=r"(a) : "r"(b));  // possible pitfall
-- the programmer wants a SImode representation (so it needs to be extended)
but not
    long long a;
    asm volatile ("zext.w %0, %1" : "=r"(a): "r"(b));   // possible pitfall
-- the programmer wants the DImode representation (don't extend)

If we look at the output of expand for the first case (I made sure to
consume "a" again using a "asm volatile ("nop" : : "r"(a))"), we see that
an explicit extension is created from SImode to DImode (the "(subreg:SI
(reg:DI ...) 0)" in the asm-operand for the next block is a bigger concern
— this may be an artifact of TARGET_TRULY_NOOP_TRUNCATION not being
properly defined on RISC-V; but this issue didn't originate from this
patch):

;; __asm__ __volatile__("zext.w %0,%1" : "=r" a_2 : "r" b_1(D));
(insn 7 5 6 (set (reg:SI 75 [ aD.1490 ])
        (asm_operands/v:SI ("zext.w %0,%1") ("=r") 0 [
                (reg/v:DI 74 [ bD.1487 ])
            ]
             [
                (asm_input:DI ("r") ./rep-asm.c:5)
            ]
             [] ./rep-asm.c:5)) "./rep-asm.c":5:3 -1
     (nil))
(insn 6 7 0 (set (reg/v:DI 72 [ aD.1490 ])
        (sign_extend:DI (reg:SI 75 [ aD.1490 ]))) "./rep-asm.c":5:3 -1
     (nil))
;; __asm__ __volatile__("nop" :  : "r" a_2);
(insn 8 6 0 (asm_operands/v ("nop") ("") 0 [
            (subreg/s/u:SI (reg/v:DI 72 [ aD.1490 ]) 0)
        ]
         [
            (asm_input:SI ("r") ./rep-asm.c:6)
        ]
         [] ./rep-asm.c:6) "./rep-asm.c":6:3 -1
     (nil))

which translates to:

f:

#APP

# 5 "./rep-asm.c" 1

zext.w a0,a0

# 0 "" 2

#NO_APP

sext.w a0,a0

#APP

# 6 "./rep-asm.c" 1

nop

# 0 "" 2

#NO_APP


If this patch is not applied (and TARGET_MODE_REP_EXTENDED is not defined),
we see the sext.w missing.
So in fact, we may have unintentionally fixed an issue?

> Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause
> miscompilation when a 64-bit variable is truncated to 32 bits: the
pre-existing
> hook says that nothing needs to be done to truncate, but the new hook says
> that the result of the truncation is properly sign-extended.
>
> The documentation for TARGET_MODE_REP_EXTENDED warns about that:
>
>     In order to enforce the representation of mode,
TARGET_TRULY_NOOP_TRUNCATION
>     should return false when truncating to mode.

Good point.  This should indeed be added and we'll look into this in more
detail for v2.

Looking into this in more detail, it seems that this change doesn't make
things worse (we already had that problem before, as an explicit extension
might also lead to the same reasoning).
So somehow we've been avoiding this bullet so far, although I don't know
yet how...

Philipp.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations
  2022-11-04 23:00   ` Philipp Tomsich
@ 2022-11-05  6:16     ` Zhongyunde
  2022-11-05  6:34       ` Andrew Pinski
  2022-11-07 13:55     ` [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED Alexander Monakov
  2022-11-20 16:09     ` Jeff Law
  2 siblings, 1 reply; 18+ messages in thread
From: Zhongyunde @ 2022-11-05  6:16 UTC (permalink / raw)
  To: pinskia, hongtao.liu
  Cc: gcc-patches, Zhangwen(Esan), Weiwei (weiwei, Compiler), zhong_1985624

[-- Attachment #1: Type: text/plain, Size: 340 bytes --]

hi,
  This patch is try to fix the issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, 
would you like to give me some suggestion, thanks.

~/source/gccUpstreamDir/gcc/testsuite(cfg) » git format-patch -1 --start-number=00 HEAD -o ~/patch
/home/zhongyunde/patch/0000-PHIOPT-Add-A-B-CST-B-match-and-simplify-optimization.patch

[-- Attachment #2: 0000-PHIOPT-Add-A-B-CST-B-match-and-simplify-optimization.patch --]
[-- Type: application/octet-stream, Size: 3087 bytes --]

From 80552e137217129ae00c66cfdcc835a4af32647b Mon Sep 17 00:00:00 2001
From: zhongyunde <zhongyunde@huawei.com>
Date: Sat, 5 Nov 2022 13:22:33 +0800
Subject: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations

    Refer to commit b6bdd7a4, use pattern match to simple
    A ? B + CST : B (where CST is power of 2) simplifications.
    Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190

    gcc/
            * match.pd (A ? B + CST : B): Add simplifcations for A ? B + POW2 : B

    gcc/testsuite/
            * gcc.dg/pr107190.c: New test.
---
 gcc/match.pd                    | 21 +++++++++++++++++++++
 gcc/testsuite/gcc.dg/pr107190.c | 27 +++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr107190.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 194ba8f5188..611a586d2c1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4503,6 +4503,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && INTEGRAL_TYPE_P (TREE_TYPE (@0)))
   (cond @1 (convert @2) (convert @3))))
 
+#if GIMPLE
+(if (canonicalize_math_p ())
+/* These patterns are mostly used by PHIOPT to move some operations outside of
+   the if statements. They should be done late because it gives jump threading
+   and few other passes to reduce what is going on.  */
+/* a ? x op C : x -> x op << log2(C) when C is power of 2. */
+ (for op (plus minus bit_ior bit_xor lshift rshift lrotate rrotate)
+  (simplify
+   (cond @0 (op:s @1 INTEGER_CST@2) @1)
+    /* powerof2cst */
+   (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@2))
+    (with {
+      tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
+     }
+     (op @1 (lshift (convert (convert:boolean_type_node @0)) { shift; })))
+   )
+  )
+ )
+)
+#endif
+
 /* Simplification moved from fold_cond_expr_with_comparison.  It may also
    be extended.  */
 /* This pattern implements two kinds simplification:
diff --git a/gcc/testsuite/gcc.dg/pr107190.c b/gcc/testsuite/gcc.dg/pr107190.c
new file mode 100644
index 00000000000..235b2761a02
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr107190.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fexpensive-optimizations -fdump-tree-phiopt2-details" } */
+
+#  define BN_BITS4        32
+#  define BN_MASK2        (0xffffffffffffffffL)
+#  define BN_MASK2l       (0xffffffffL)
+#  define BN_MASK2h       (0xffffffff00000000L)
+#  define BN_MASK2h1      (0xffffffff80000000L)
+#  define LBITS(a)        ((a)&BN_MASK2l)
+#  define HBITS(a)        (((a)>>BN_BITS4)&BN_MASK2l)
+#  define L2HBITS(a)      (((a)<<BN_BITS4)&BN_MASK2)
+
+unsigned int test_m(unsigned long in0, unsigned long in1) {
+    unsigned long m, m1, lt, ht, bl, bh;
+    lt = LBITS(in0);
+    ht = HBITS(in0);
+    bl = LBITS(in1);
+    bh = HBITS(in1);
+    m  = bh * lt;
+    m1 = bl * ht;
+    ht = bh * ht;
+    m  = (m + m1) & BN_MASK2;
+    if (m < m1) ht += L2HBITS((unsigned long)1);
+    return ht + m;
+}
+
+/* { dg-final { scan-tree-dump "COND_EXPR in block 2 and PHI in block 4 converted to straightline code" "phiopt2" } } */
-- 
2.19.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations
  2022-11-05  6:16     ` [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations Zhongyunde
@ 2022-11-05  6:34       ` Andrew Pinski
  2022-11-05  9:03         ` Zhongyunde
  0 siblings, 1 reply; 18+ messages in thread
From: Andrew Pinski @ 2022-11-05  6:34 UTC (permalink / raw)
  To: Zhongyunde
  Cc: hongtao.liu, gcc-patches, Zhangwen(Esan),
	Weiwei (weiwei, Compiler),
	zhong_1985624

On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com> wrote:
>
> hi,
>   This patch is try to fix the issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190,
> would you like to give me some suggestion, thanks.

This seems like a "simplified" version of
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html
which just handles power of 2 constants where we know the cond will be
removed.
We could do even more "simplified" of 1 if needed really.
What is the IR before PHI-OPT? Is it just + 1?

Also your pattern can be simplified to use integer_pow2p in the match
part instead of INTEGER_CST.

Thanks,
Andrew

>
> ~/source/gccUpstreamDir/gcc/testsuite(cfg) » git format-patch -1 --start-number=00 HEAD -o ~/patch
> /home/zhongyunde/patch/0000-PHIOPT-Add-A-B-CST-B-match-and-simplify-optimization.patch

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations
  2022-11-05  6:34       ` Andrew Pinski
@ 2022-11-05  9:03         ` Zhongyunde
  2022-11-08 14:58           ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: Zhongyunde @ 2022-11-05  9:03 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: hongtao.liu, gcc-patches, Zhangwen(Esan),
	Weiwei (weiwei, Compiler),
	zhong_1985624

[-- Attachment #1: Type: text/plain, Size: 1343 bytes --]


> -----Original Message-----
> From: Andrew Pinski [mailto:pinskia@gcc.gnu.org]
> Sent: Saturday, November 5, 2022 2:34 PM
> To: Zhongyunde <zhongyunde@huawei.com>
> Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan)
> <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler)
> <weiwei64@huawei.com>; zhong_1985624@163.com
> Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify
> optimizations
> 
> On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com>
> wrote:
> >
> > hi,
> >   This patch is try to fix the issue
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190,
> > would you like to give me some suggestion, thanks.
> 
> This seems like a "simplified" version of
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html
> which just handles power of 2 constants where we know the cond will be
> removed.
> We could do even more "simplified" of 1 if needed really.
> What is the IR before PHI-OPT? Is it just + 1?

Thanks for your attention. It is + 4294967296 before PHI-OPT  (See detail https://gcc.godbolt.org/z/6zEc6ja1z)
So we should keep matching the power of 2 constants ?

> Also your pattern can be simplified to use integer_pow2p in the match part
> instead of INTEGER_CST.
> 
Apply your comment, thanks

> Thanks,
> Andrew



[-- Attachment #2: 0001-PHIOPT-Add-A-B-op-CST-B-match-and-simplify-optimizat.patch --]
[-- Type: application/octet-stream, Size: 3072 bytes --]

From 1dfbda734390d8398a12a455028149b058076a51 Mon Sep 17 00:00:00 2001
From: zhongyunde <zhongyunde@huawei.com>
Date: Sat, 5 Nov 2022 13:22:33 +0800
Subject: [PATCH] [PHIOPT] Add A ? B op CST : B match and simplify
 optimizations

    Refer to commit b6bdd7a4, use pattern match to simple
    A ? B op CST : B (where CST is power of 2) simplifications.
    Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190

    gcc/
            * match.pd (A ? B op CST : B): Add simplifcations for A ? B op POW2 : B

    gcc/testsuite/
            * gcc.dg/pr107190.c: New test.
---
 gcc/match.pd                    | 21 +++++++++++++++++++++
 gcc/testsuite/gcc.dg/pr107190.c | 27 +++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr107190.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 194ba8f5188..7ff393f371f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4503,6 +4503,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && INTEGRAL_TYPE_P (TREE_TYPE (@0)))
   (cond @1 (convert @2) (convert @3))))
 
+#if GIMPLE
+(if (canonicalize_math_p ())
+/* These patterns are mostly used by PHIOPT to move some operations outside of
+   the if statements. They should be done late because it gives jump threading
+   and few other passes to reduce what is going on.  */
+/* a ? x op C : x -> x op << log2(C) when C is power of 2. */
+ (for op (plus minus bit_ior bit_xor lshift rshift lrotate rrotate)
+  (simplify
+   (cond @0 (op:s @1 integer_pow2p@2) @1)
+    /* powerof2cst */
+   (if (INTEGRAL_TYPE_P (type))
+    (with {
+      tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
+     }
+     (op @1 (lshift (convert (convert:boolean_type_node @0)) { shift; })))
+   )
+  )
+ )
+)
+#endif
+
 /* Simplification moved from fold_cond_expr_with_comparison.  It may also
    be extended.  */
 /* This pattern implements two kinds simplification:
diff --git a/gcc/testsuite/gcc.dg/pr107190.c b/gcc/testsuite/gcc.dg/pr107190.c
new file mode 100644
index 00000000000..235b2761a02
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr107190.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fexpensive-optimizations -fdump-tree-phiopt2-details" } */
+
+#  define BN_BITS4        32
+#  define BN_MASK2        (0xffffffffffffffffL)
+#  define BN_MASK2l       (0xffffffffL)
+#  define BN_MASK2h       (0xffffffff00000000L)
+#  define BN_MASK2h1      (0xffffffff80000000L)
+#  define LBITS(a)        ((a)&BN_MASK2l)
+#  define HBITS(a)        (((a)>>BN_BITS4)&BN_MASK2l)
+#  define L2HBITS(a)      (((a)<<BN_BITS4)&BN_MASK2)
+
+unsigned int test_m(unsigned long in0, unsigned long in1) {
+    unsigned long m, m1, lt, ht, bl, bh;
+    lt = LBITS(in0);
+    ht = HBITS(in0);
+    bl = LBITS(in1);
+    bh = HBITS(in1);
+    m  = bh * lt;
+    m1 = bl * ht;
+    ht = bh * ht;
+    m  = (m + m1) & BN_MASK2;
+    if (m < m1) ht += L2HBITS((unsigned long)1);
+    return ht + m;
+}
+
+/* { dg-final { scan-tree-dump "COND_EXPR in block 2 and PHI in block 4 converted to straightline code" "phiopt2" } } */
-- 
2.19.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-11-04 23:00   ` Philipp Tomsich
  2022-11-05  6:16     ` [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations Zhongyunde
@ 2022-11-07 13:55     ` Alexander Monakov
  2022-11-08 23:45       ` Philipp Tomsich
  2022-11-20 16:09     ` Jeff Law
  2 siblings, 1 reply; 18+ messages in thread
From: Alexander Monakov @ 2022-11-07 13:55 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: gcc-patches, Vineet Gupta, Kito Cheng, Jojo R



On Sat, 5 Nov 2022, Philipp Tomsich wrote:

> Alexander,
> 
> I had missed your comment until now.

Please make sure to read replies from Jeff and Palmer as well (their responses
went to gcc-patches with empty Cc list):
https://inbox.sourceware.org/gcc-patches/ba895f78-7f47-0f4-5bfe-e21893c4bcb@ispras.ru/T/#m7b7e5708b82de3b05ba8007ae6544891a03bdc42

For now let me respond to some of the more prominent points:

> > I think this leads to a counter-intuitive requirement that a hand-written
> > inline asm must sign-extend its output operands that are bound to either
> > signed or unsigned 32-bit lvalues. Will compiler users be aware of that?
> 
> I am not sure if I fully understand your concern, as the mode of the
> asm-output will be derived from the variable type.
> So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type
> of a.

Yes. The problem arises when 'a' later undergoes conversion to a wider type.

> The concern, as far as I understand would be the case where the
> assembly-sequence leaves an incompatible extension in the register.

Existing documentation like the psABI does not constrain compiler users in any
way, so their inline asm snippets are free to leave garbage in upper bits. Thus
there's no "incompatibility" to speak of.

To give a specific example that will be problematic if you go far enough down
the road of matching MIPS64 behavior:

long f(void)
{
    int x;
    asm("" : "=r"(x));
    return x;
}

here GCC (unlike LLVM) omits sign extension of 'x', assuming that asm output
must have been sign-extended to 64 bits by the programmer.

Alexander

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations
  2022-11-05  9:03         ` Zhongyunde
@ 2022-11-08 14:58           ` Richard Biener
  2022-11-08 15:51             ` 钟云德
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Biener @ 2022-11-08 14:58 UTC (permalink / raw)
  To: Zhongyunde
  Cc: Andrew Pinski, zhong_1985624, Weiwei (weiwei, Compiler),
	hongtao.liu, gcc-patches, Zhangwen(Esan)

On Sat, Nov 5, 2022 at 10:03 AM Zhongyunde via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> > -----Original Message-----
> > From: Andrew Pinski [mailto:pinskia@gcc.gnu.org]
> > Sent: Saturday, November 5, 2022 2:34 PM
> > To: Zhongyunde <zhongyunde@huawei.com>
> > Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan)
> > <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler)
> > <weiwei64@huawei.com>; zhong_1985624@163.com
> > Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify
> > optimizations
> >
> > On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com>
> > wrote:
> > >
> > > hi,
> > >   This patch is try to fix the issue
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190,
> > > would you like to give me some suggestion, thanks.
> >
> > This seems like a "simplified" version of
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html
> > which just handles power of 2 constants where we know the cond will be
> > removed.
> > We could do even more "simplified" of 1 if needed really.
> > What is the IR before PHI-OPT? Is it just + 1?
>
> Thanks for your attention. It is + 4294967296 before PHI-OPT  (See detail https://gcc.godbolt.org/z/6zEc6ja1z)
> So we should keep matching the power of 2 constants ?
>
> > Also your pattern can be simplified to use integer_pow2p in the match part
> > instead of INTEGER_CST.
> >
> Apply your comment, thanks

How does the patch fix the mentioned bug?  match.pd patterns should make things
"simpler" but x + (a << C') isn't simpler than a ? x + C : x.  It
looks you are targeting
PHI-OPT here, so maybe instead extend value_replacement to handle this case,
it does look similar to the case with neutral/absorbing element there?

Richard.

>
> > Thanks,
> > Andrew
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re:Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations
  2022-11-08 14:58           ` Richard Biener
@ 2022-11-08 15:51             ` 钟云德
  2022-11-09  8:00               ` Richard Biener
  0 siblings, 1 reply; 18+ messages in thread
From: 钟云德 @ 2022-11-08 15:51 UTC (permalink / raw)
  To: Richard Biener
  Cc: Zhongyunde, Andrew Pinski, Weiwei (weiwei, Compiler),
	hongtao.liu, gcc-patches, Zhangwen(Esan)

[-- Attachment #1: Type: text/plain, Size: 3406 bytes --]

At 2022-11-08 22:58:34, "Richard Biener" <richard.guenther@gmail.com> wrote:

>On Sat, Nov 5, 2022 at 10:03 AM Zhongyunde via Gcc-patches
><gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> > -----Original Message-----
>> > From: Andrew Pinski [mailto:pinskia@gcc.gnu.org]
>> > Sent: Saturday, November 5, 2022 2:34 PM
>> > To: Zhongyunde <zhongyunde@huawei.com>
>> > Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan)
>> > <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler)
>> > <weiwei64@huawei.com>; zhong_1985624@163.com
>> > Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify
>> > optimizations
>> >
>> > On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com>
>> > wrote:
>> > >
>> > > hi,
>> > >   This patch is try to fix the issue
>> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190,
>> > > would you like to give me some suggestion, thanks.
>> >
>> > This seems like a "simplified" version of
>> > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html
>> > which just handles power of 2 constants where we know the cond will be
>> > removed.
>> > We could do even more "simplified" of 1 if needed really.
>> > What is the IR before PHI-OPT? Is it just + 1?
>>
>> Thanks for your attention. It is + 4294967296 before PHI-OPT  (See detail https://gcc.godbolt.org/z/6zEc6ja1z)
>> So we should keep matching the power of 2 constants ?
>>
>> > Also your pattern can be simplified to use integer_pow2p in the match part
>> > instead of INTEGER_CST.
>> >
>> Apply your comment, thanks
>
>How does the patch fix the mentioned bug?  match.pd patterns should make things
>"simpler" but x + (a << C') isn't simpler than a ? x + C : x.  It
>looks you are targeting
>PHI-OPT here, so maybe instead extend value_replacement to handle this case,
>it does look similar to the case with neutral/absorbing element there?
>

>Richard.


Thanks. This patch try to fix the 1st issued mentioned in107090 – [aarch64] sequence logic should be combined with mul and umulh (gnu.org)
Sure, I'll take a look at the function value_replacement. 
I have also noticed that the function of two_value_replacement is very close to patch I want to achieve, and it may be easy to extend.
It seems can be expressed equally in match.pd (called by match_simplify_replacement), so how do we
choose where to implement may be better?
```
|
/* Do the replacement of conditional if it can be done.  */if (!early_p                                                                                                                && !diamond_p                                                                                                           && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))                                                              cfgchanged = true;                                                                                          elseif (!diamond_p                                                                                                              && match_simplify_replacement (bb, bb1, e1, e2, phi,                                                                                                   arg0, arg1, early_p))                                                             cfgchanged = true;                                                                                          
|
```
>> > Thanks, >> > Andrew >> >>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-11-07 13:55     ` [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED Alexander Monakov
@ 2022-11-08 23:45       ` Philipp Tomsich
  2022-11-09 17:21         ` Alexander Monakov
  0 siblings, 1 reply; 18+ messages in thread
From: Philipp Tomsich @ 2022-11-08 23:45 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Vineet Gupta, Kito Cheng, Jojo R, Jeff Law

On Mon, 7 Nov 2022 at 14:55, Alexander Monakov <amonakov@ispras.ru> wrote:
>
>
>
> On Sat, 5 Nov 2022, Philipp Tomsich wrote:
>
> > Alexander,
> >
> > I had missed your comment until now.
>
> Please make sure to read replies from Jeff and Palmer as well (their responses
> went to gcc-patches with empty Cc list):
> https://inbox.sourceware.org/gcc-patches/ba895f78-7f47-0f4-5bfe-e21893c4bcb@ispras.ru/T/#m7b7e5708b82de3b05ba8007ae6544891a03bdc42
>
> For now let me respond to some of the more prominent points:
>
> > > I think this leads to a counter-intuitive requirement that a hand-written
> > > inline asm must sign-extend its output operands that are bound to either
> > > signed or unsigned 32-bit lvalues. Will compiler users be aware of that?
> >
> > I am not sure if I fully understand your concern, as the mode of the
> > asm-output will be derived from the variable type.
> > So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type
> > of a.
>
> Yes. The problem arises when 'a' later undergoes conversion to a wider type.
>
> > The concern, as far as I understand would be the case where the
> > assembly-sequence leaves an incompatible extension in the register.
>
> Existing documentation like the psABI does not constrain compiler users in any
> way, so their inline asm snippets are free to leave garbage in upper bits. Thus
> there's no "incompatibility" to speak of.
>
> To give a specific example that will be problematic if you go far enough down
> the road of matching MIPS64 behavior:
>
> long f(void)
> {
>     int x;
>     asm("" : "=r"(x));
>     return x;
> }
>
> here GCC (unlike LLVM) omits sign extension of 'x', assuming that asm output
> must have been sign-extended to 64 bits by the programmer.

In fact, with the proposed patch (but also without it), GCC will sign-extend:
f:
  sext.w a0,a0
  ret
  .size f, .-f

To make sure that this is not just an extension to promote the int to
long for the function return, I next added another empty asm to
consume 'x'.
This clearly shows that the extension is performed to postprocess the
output of the asm-statement:

f:
  # ./asm2.c:4:     asm("" : "=r"(x));
  sext.w a0,a0 # x, x
  # ./asm2.c:5:     asm("" : : "r"(x));
  # ./asm2.c:7: }
  ret

Thanks,
Philipp.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations
  2022-11-08 15:51             ` 钟云德
@ 2022-11-09  8:00               ` Richard Biener
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Biener @ 2022-11-09  8:00 UTC (permalink / raw)
  To: 钟云德
  Cc: Zhongyunde, Andrew Pinski, Weiwei (weiwei, Compiler),
	hongtao.liu, gcc-patches, Zhangwen(Esan)

[-- Attachment #1: Type: text/plain, Size: 3996 bytes --]

On Tue, Nov 8, 2022 at 4:51 PM 钟云德 <zhong_1985624@163.com> wrote:

> At 2022-11-08 22:58:34, "Richard Biener" <richard.guenther@gmail.com>
> wrote:
>
> >On Sat, Nov 5, 2022 at 10:03 AM Zhongyunde via Gcc-patches
> ><gcc-patches@gcc.gnu.org> wrote:
> >>
> >>
> >> > -----Original Message-----
> >> > From: Andrew Pinski [mailto:pinskia@gcc.gnu.org]
> >> > Sent: Saturday, November 5, 2022 2:34 PM
> >> > To: Zhongyunde <zhongyunde@huawei.com>
> >> > Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan)
> >> > <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler)
> >> > <weiwei64@huawei.com>; zhong_1985624@163.com
> >> > Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify
> >> > optimizations
> >> >
> >> > On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com>
> >> > wrote:
> >> > >
> >> > > hi,
> >> > >   This patch is try to fix the issue
> >> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190,
> >> > > would you like to give me some suggestion, thanks.
> >> >
> >> > This seems like a "simplified" version of
> >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html
> >> > which just handles power of 2 constants where we know the cond will be
> >> > removed.
> >> > We could do even more "simplified" of 1 if needed really.
> >> > What is the IR before PHI-OPT? Is it just + 1?
> >>
> >> Thanks for your attention. It is + 4294967296 before PHI-OPT  (See detail https://gcc.godbolt.org/z/6zEc6ja1z)
> >> So we should keep matching the power of 2 constants ?
> >>
> >> > Also your pattern can be simplified to use integer_pow2p in the match part
> >> > instead of INTEGER_CST.
> >> >
> >> Apply your comment, thanks
> >
> >How does the patch fix the mentioned bug?  match.pd patterns should make things
> >"simpler" but x + (a << C') isn't simpler than a ? x + C : x.  It
> >looks you are targeting
> >PHI-OPT here, so maybe instead extend value_replacement to handle this case,
> >it does look similar to the case with neutral/absorbing element there?
> >
> >Richard.
>
> Thanks. This patch try to fix the 1st issued mentioned in 107090 – [aarch64] sequence logic should be combined with mul and umulh (gnu.org) <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090>
> Sure, I'll take a look at the function value_replacement.
> I have also noticed that the function of two_value_replacement is very close to patch I want to achieve, and it may be easy to extend.
> It seems can be expressed equally in match.pd (called by match_simplify_replacement), so how do we
> choose where to implement may be better?
>
>
I think that if we realize that we don't want a simplification to always
apply (like by checking
for canonicalize_math_p ()), then we should look for a pass which is a good
fit and since
you add the pattern to trigger a PHI-OPT optimization that looked like an
obvious choice ...

```
>
>           /* Do the replacement of conditional if it can be done.  */                                                             if (!early_p                                                                                                                && !diamond_p                                                                                                           && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))                                                              cfgchanged = true;                                                                                                    else if (!diamond_p                                                                                                              && match_simplify_replacement (bb, bb1, e1, e2, phi,                                                                                                   arg0, arg1, early_p))                                                             cfgchanged = true;
>
> ```
> >> > Thanks, >> > Andrew >> >>
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-11-08 23:45       ` Philipp Tomsich
@ 2022-11-09 17:21         ` Alexander Monakov
  0 siblings, 0 replies; 18+ messages in thread
From: Alexander Monakov @ 2022-11-09 17:21 UTC (permalink / raw)
  To: Philipp Tomsich; +Cc: gcc-patches, Vineet Gupta, Kito Cheng, Jojo R, Jeff Law

On Wed, 9 Nov 2022, Philipp Tomsich wrote:

> > To give a specific example that will be problematic if you go far enough down
> > the road of matching MIPS64 behavior:
> >
> > long f(void)
> > {
> >     int x;
> >     asm("" : "=r"(x));
> >     return x;
> > }
> >
> > here GCC (unlike LLVM) omits sign extension of 'x', assuming that asm output
> > must have been sign-extended to 64 bits by the programmer.
> 
> In fact, with the proposed patch (but also without it), GCC will sign-extend:
> f:
>   sext.w a0,a0
>   ret
>   .size f, .-f

I'm aware. I said "will be problematic if ...", meaning that GCC omits sign
extension when targeting MIPS64, and if you match MIPS64 behavior on RISC-V, 
you'll get in that situation as well.

> To make sure that this is not just an extension to promote the int to
> long for the function return, I next added another empty asm to
> consume 'x'.
> This clearly shows that the extension is performed to postprocess the
> output of the asm-statement:
> 
> f:
>   # ./asm2.c:4:     asm("" : "=r"(x));
>   sext.w a0,a0 # x, x
>   # ./asm2.c:5:     asm("" : : "r"(x));
>   # ./asm2.c:7: }
>   ret

No, you cannot distinguish post-processing the output of the first asm vs.
pre-processing the input of the second asm. Try

  asm("" : "+r"(x));

as the second asm instead, and you'll get

f:
# t.c:17:     asm("" : "=r"(x));
# t.c:18:     asm("" : "+r"(x));
# t.c:20: }
        sext.w  a0,a0   #, x
        ret

If it helps, here's a Compiler Explorer link comparing with MIPS64:
https://godbolt.org/z/7eobvdKdK

Alexander

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-11-04 23:00   ` Philipp Tomsich
  2022-11-05  6:16     ` [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations Zhongyunde
  2022-11-07 13:55     ` [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED Alexander Monakov
@ 2022-11-20 16:09     ` Jeff Law
  2022-11-21 13:49       ` Alexander Monakov
  2 siblings, 1 reply; 18+ messages in thread
From: Jeff Law @ 2022-11-20 16:09 UTC (permalink / raw)
  To: Philipp Tomsich, Alexander Monakov
  Cc: Vineet Gupta, gcc-patches, Kito Cheng, Jojo R


On 11/4/22 17:00, Philipp Tomsich wrote:
> Alexander,
>
> I had missed your comment until now.
>
> On Tue, 6 Sept 2022 at 13:39, Alexander Monakov <amonakov@ispras.ru> wrote:
>> On Mon, 5 Sep 2022, Philipp Tomsich wrote:
>>
>>> +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode
> mode_rep)
>>> +{
>>> +  /* On 64-bit targets, SImode register values are sign-extended to
> DImode.  */
>>> +  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
>>> +    return SIGN_EXTEND;
>> I think this leads to a counter-intuitive requirement that a hand-written
>> inline asm must sign-extend its output operands that are bound to either
>> signed or unsigned 32-bit lvalues. Will compiler users be aware of that?
> I am not sure if I fully understand your concern, as the mode of the
> asm-output will be derived from the variable type.
> So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type
> of a.

Correct.


>
> The concern, as far as I understand would be the case where the
> assembly-sequence leaves an incompatible extension in the register.

Right.  The question in my mind is whether or not the responsibility 
should be on the compiler or on the developer to ensure the ASM output 
is properly extended.  If someone's writing ASMs, then to a large 
degree, I consider it their responsibility to make sure things are 
properly extended -- even more so if having the compiler do it results 
in slower code independent of ASMs.


Jeff


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-11-20 16:09     ` Jeff Law
@ 2022-11-21 13:49       ` Alexander Monakov
  2022-11-21 14:56         ` Jeff Law
  0 siblings, 1 reply; 18+ messages in thread
From: Alexander Monakov @ 2022-11-21 13:49 UTC (permalink / raw)
  To: Jeff Law; +Cc: Philipp Tomsich, Vineet Gupta, gcc-patches, Kito Cheng, Jojo R

[-- Attachment #1: Type: text/plain, Size: 1305 bytes --]


On Sun, 20 Nov 2022, Jeff Law wrote:

> > The concern, as far as I understand would be the case where the
> > assembly-sequence leaves an incompatible extension in the register.
> 
> Right.  The question in my mind is whether or not the responsibility should be
> on the compiler or on the developer to ensure the ASM output is properly
> extended.  If someone's writing ASMs, then to a large degree, I consider it
> their responsibility to make sure things are properly extended

Right. They should also find out the hard way, with zero documentation telling
them they had to (let alone *why* they had to), and without a hypothetical
-fsanitize=abi that would catch exactly the point of the missing extension
instead of letting the program crash mysteriously at a much later point.
Uphill both ways, in a world where LLVM does not place such burden on the
programmer, and even among GCC targets only mips64 made a precedent (also
without documenting the new requirement).

> -- even more so
> if having the compiler do it results in slower code independent of ASMs.

I think LLVM demonstrates well enough that a compiler can do a better job
than GCC at eliminating redundant extensions without upgrading requirements
for inline asm (in the usual C code, not for sub-word outputs of an asm).

Alexander

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-11-21 13:49       ` Alexander Monakov
@ 2022-11-21 14:56         ` Jeff Law
  2022-11-21 15:33           ` Alexander Monakov
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Law @ 2022-11-21 14:56 UTC (permalink / raw)
  To: Alexander Monakov
  Cc: Philipp Tomsich, Vineet Gupta, gcc-patches, Kito Cheng, Jojo R


On 11/21/22 06:49, Alexander Monakov wrote:
> On Sun, 20 Nov 2022, Jeff Law wrote:
>
>>> The concern, as far as I understand would be the case where the
>>> assembly-sequence leaves an incompatible extension in the register.
>> Right.  The question in my mind is whether or not the responsibility should be
>> on the compiler or on the developer to ensure the ASM output is properly
>> extended.  If someone's writing ASMs, then to a large degree, I consider it
>> their responsibility to make sure things are properly extended
> Right. They should also find out the hard way, with zero documentation telling
> them they had to (let alone *why* they had to), and without a hypothetical
> -fsanitize=abi that would catch exactly the point of the missing extension
> instead of letting the program crash mysteriously at a much later point.
> Uphill both ways, in a world where LLVM does not place such burden on the
> programmer, and even among GCC targets only mips64 made a precedent (also
> without documenting the new requirement).

They're writing assembly code -- in my book that means they'd better 
have a pretty good understanding of the architecture, its limitations 
and quirks.

  I'm happy to give them some diagnostics, guardrails, etc etc, but 
slowing down standard C code to placate someone who doesn't really know 
the architecture, but is trying to write assembly code is a step too far 
for me.


>
>> -- even more so
>> if having the compiler do it results in slower code independent of ASMs.
> I think LLVM demonstrates well enough that a compiler can do a better job
> than GCC at eliminating redundant extensions without upgrading requirements
> for inline asm (in the usual C code, not for sub-word outputs of an asm).

Perhaps.  But it's also the case the LLVM and GCC have some significant 
differences in implementation.  Sometimes those differences in 
impementations have notable impacts on how code is generated.


jeff

>
> Alexander

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED
  2022-11-21 14:56         ` Jeff Law
@ 2022-11-21 15:33           ` Alexander Monakov
  0 siblings, 0 replies; 18+ messages in thread
From: Alexander Monakov @ 2022-11-21 15:33 UTC (permalink / raw)
  To: Jeff Law; +Cc: Philipp Tomsich, Vineet Gupta, gcc-patches, Kito Cheng, Jojo R


On Mon, 21 Nov 2022, Jeff Law wrote:

> They're writing assembly code -- in my book that means they'd better have a
> pretty good understanding of the architecture, its limitations and quirks.

That GCC ties together optimization and inline asm interface via its internal
TARGET_MODE_REP_EXTENDED macro is not a quirk of the RISC-V architecture.
It's a quirk of GCC.

Alexander

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-11-21 15:33 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-05 21:44 [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED Philipp Tomsich
2022-09-06 11:39 ` Alexander Monakov
2022-09-16 23:48   ` Jeff Law
2022-09-17  7:59     ` Palmer Dabbelt
2022-11-04 23:00   ` Philipp Tomsich
2022-11-05  6:16     ` [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations Zhongyunde
2022-11-05  6:34       ` Andrew Pinski
2022-11-05  9:03         ` Zhongyunde
2022-11-08 14:58           ` Richard Biener
2022-11-08 15:51             ` 钟云德
2022-11-09  8:00               ` Richard Biener
2022-11-07 13:55     ` [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED Alexander Monakov
2022-11-08 23:45       ` Philipp Tomsich
2022-11-09 17:21         ` Alexander Monakov
2022-11-20 16:09     ` Jeff Law
2022-11-21 13:49       ` Alexander Monakov
2022-11-21 14:56         ` Jeff Law
2022-11-21 15:33           ` Alexander Monakov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).