From: Georg-Johann Lay <avr@gjlay.de>
To: Richard Biener <richard.guenther@gmail.com>, pan2.li@intel.com
Cc: gcc-patches@gcc.gnu.org, juzhe.zhong@rivai.ai,
yanzhang.wang@intel.com, kito.cheng@gmail.com,
Tamar.Christina@arm.com
Subject: Re: [PATCH v1] Internal-fn: Add new internal function SAT_ADDU
Date: Tue, 27 Feb 2024 11:53:12 +0100 [thread overview]
Message-ID: <5e617497-8f81-45e4-bf69-7f4c8b08930b@gjlay.de> (raw)
In-Reply-To: <CAFiYyc342qk3EQN9RzZPaFQ6pFF1QEVXyS_pD7-MKO+1JAga-g@mail.gmail.com>
Am 19.02.24 um 08:36 schrieb Richard Biener:
> On Sat, Feb 17, 2024 at 11:30 AM <pan2.li@intel.com> wrote:
>>
>> From: Pan Li <pan2.li@intel.com>
>>
>> This patch would like to add the middle-end presentation for the
>> unsigned saturation add. Aka set the result of add to the max
>> when overflow. It will take the pattern similar as below.
>>
>> SAT_ADDU (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
Does this even try to wort out the costs?
For example, with the following example
#define T __UINT16_TYPE__
T sat_add1 (T x, T y)
{
return (x + y) | (- (T)((T)(x + y) < x));
}
T sat_add2 (T x, T y)
{
T z = x + y;
if (z < x)
z = (T) -1;
return z;
}
And then "avr-gcc -S -Os -dp" the code is
sat_add1:
add r22,r24 ; 7 [c=8 l=2] *addhi3/0
adc r23,r25
ldi r18,lo8(1) ; 8 [c=4 l=2] *movhi/4
ldi r19,0
cp r22,r24 ; 9 [c=8 l=2] cmphi3/2
cpc r23,r25
brlo .L2 ; 10 [c=16 l=1] branch
ldi r19,0 ; 31 [c=4 l=1] movqi_insn/0
ldi r18,0 ; 32 [c=4 l=1] movqi_insn/0
.L2:
clr r24 ; 13 [c=12 l=4] neghi2/1
clr r25
sub r24,r18
sbc r25,r19
or r24,r22 ; 29 [c=4 l=1] iorqi3/0
or r25,r23 ; 30 [c=4 l=1] iorqi3/0
ret ; 35 [c=0 l=1] return
sat_add2:
add r22,r24 ; 8 [c=8 l=2] *addhi3/0
adc r23,r25
cp r22,r24 ; 9 [c=8 l=2] cmphi3/2
cpc r23,r25
brsh .L3 ; 10 [c=16 l=1] branch
ldi r22,lo8(-1) ; 5 [c=4 l=2] *movhi/4
ldi r23,lo8(-1)
.L3:
mov r25,r23 ; 21 [c=4 l=1] movqi_insn/0
mov r24,r22 ; 22 [c=4 l=1] movqi_insn/0
ret ; 25 [c=0 l=1] return
i.e. the conditional jump is better than overly smart arithmetic
(smaller and faster code with less register pressure).
With larger dypes the difference is even more pronounced-
Johann
>> Take uint8_t as example, we will have:
>>
>> * SAT_ADDU (1, 254) => 255.
>> * SAT_ADDU (1, 255) => 255.
>> * SAT_ADDU (2, 255) => 255.
>> * SAT_ADDU (255, 255) => 255.
>>
>> The patch also implement the SAT_ADDU in the riscv backend as
>> the sample. Given below example:
>>
>> uint64_t sat_add_u64 (uint64_t x, uint64_t y)
>> {
>> return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
>> }
>>
>> Before this patch:
>>
>> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
>> {
>> long unsigned int _1;
>> _Bool _2;
>> long unsigned int _3;
>> long unsigned int _4;
>> uint64_t _7;
>> long unsigned int _10;
>> __complex__ long unsigned int _11;
>>
>> ;; basic block 2, loop depth 0
>> ;; pred: ENTRY
>> _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
>> _1 = REALPART_EXPR <_11>;
>> _10 = IMAGPART_EXPR <_11>;
>> _2 = _10 != 0;
>> _3 = (long unsigned int) _2;
>> _4 = -_3;
>> _7 = _1 | _4;
>> return _7;
>> ;; succ: EXIT
>>
>> }
>>
>> After this patch:
>>
>> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
>> {
>> uint64_t _7;
>>
>> ;; basic block 2, loop depth 0
>> ;; pred: ENTRY
>> _7 = .SAT_ADDU (x_5(D), y_6(D)); [tail call]
>> return _7;
>> ;; succ: EXIT
>>
>> }
>>
>> Then we will have the middle-end representation like .SAT_ADDU after
>> this patch.
>
> I'll note that on RTL we already have SS_PLUS/US_PLUS and friends and
> the corresponding ssadd/usadd optabs. There's not much documentation
> unfortunately besides the use of gen_*_fixed_libfunc usage where the comment
> suggests this is used for fixed-point operations. It looks like arm uses
> fractional/accumulator modes for this but for example bfin has ssaddsi3.
>
> So the question is whether the fixed-point case can be distinguished from
> the integer case based on mode.
>
> There's also FIXED_POINT_TYPE on the GENERIC/GIMPLE side and
> no special tree operator codes for them. So compared to what appears
> to be the case on RTL we'd need a way to represent saturating integer
> operations on GIMPLE.
>
> The natural thing is to use direct optab internal functions (that's what you
> basically did, but you added a new optab, IMO without good reason).
> More GIMPLE-like would be to let the types involved decide whether
> it's signed or unsigned saturation. That's actually what I'd prefer here
> and if we don't map 1:1 to optabs then instead use tree codes like
> S_PLUS_EXPR (mimicing RTL here).
>
> Any other opinions? Anyone knows more about fixed-point and RTL/modes?
>
> Richard.
>
>> PR target/51492
>> PR target/112600
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv-protos.h (riscv_expand_saturation_addu):
>> New func decl for the SAT_ADDU expand.
>> * config/riscv/riscv.cc (riscv_expand_saturation_addu): New func
>> impl for the SAT_ADDU expand.
>> * config/riscv/riscv.md (sat_addu_<mode>3): New pattern to impl
>> the standard name SAT_ADDU.
>> * doc/md.texi: Add doc for SAT_ADDU.
>> * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADDU.
>> * internal-fn.def (SAT_ADDU): Add SAT_ADDU.
>> * match.pd: Add simplify pattern patch for SAT_ADDU.
>> * optabs.def (OPTAB_D): Add sat_addu_optab.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/sat_addu-1.c: New test.
>> * gcc.target/riscv/sat_addu-2.c: New test.
>> * gcc.target/riscv/sat_addu-3.c: New test.
>> * gcc.target/riscv/sat_addu-4.c: New test.
>> * gcc.target/riscv/sat_addu-run-1.c: New test.
>> * gcc.target/riscv/sat_addu-run-2.c: New test.
>> * gcc.target/riscv/sat_addu-run-3.c: New test.
>> * gcc.target/riscv/sat_addu-run-4.c: New test.
>> * gcc.target/riscv/sat_arith.h: New test.
>>
>> Signed-off-by: Pan Li <pan2.li@intel.com>
>> ---
>> gcc/config/riscv/riscv-protos.h | 1 +
>> gcc/config/riscv/riscv.cc | 46 +++++++++++++++++
>> gcc/config/riscv/riscv.md | 11 +++++
>> gcc/doc/md.texi | 11 +++++
>> gcc/internal-fn.cc | 1 +
>> gcc/internal-fn.def | 1 +
>> gcc/match.pd | 22 +++++++++
>> gcc/optabs.def | 2 +
>> gcc/testsuite/gcc.target/riscv/sat_addu-1.c | 18 +++++++
>> gcc/testsuite/gcc.target/riscv/sat_addu-2.c | 20 ++++++++
>> gcc/testsuite/gcc.target/riscv/sat_addu-3.c | 17 +++++++
>> gcc/testsuite/gcc.target/riscv/sat_addu-4.c | 16 ++++++
>> .../gcc.target/riscv/sat_addu-run-1.c | 42 ++++++++++++++++
>> .../gcc.target/riscv/sat_addu-run-2.c | 42 ++++++++++++++++
>> .../gcc.target/riscv/sat_addu-run-3.c | 42 ++++++++++++++++
>> .../gcc.target/riscv/sat_addu-run-4.c | 49 +++++++++++++++++++
>> gcc/testsuite/gcc.target/riscv/sat_arith.h | 15 ++++++
>> 17 files changed, 356 insertions(+)
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-1.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-2.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-3.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-4.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-1.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-2.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-3.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_addu-run-4.c
>> create mode 100644 gcc/testsuite/gcc.target/riscv/sat_arith.h
>>
>> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
>> index ae1685850ac..f201b2384f9 100644
>> --- a/gcc/config/riscv/riscv-protos.h
>> +++ b/gcc/config/riscv/riscv-protos.h
>> @@ -132,6 +132,7 @@ extern void riscv_asm_output_external (FILE *, const tree, const char *);
>> extern bool
>> riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
>> extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
>> +extern void riscv_expand_saturation_addu (rtx, rtx, rtx);
>>
>> #ifdef RTX_CODE
>> extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0);
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index 799d7919a4a..84e86eb5d49 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -10657,6 +10657,52 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
>> return true;
>> }
>>
>> +/* Emit insn for the saturation addu, aka (x + y) | - ((x + y) < x). */
>> +void
>> +riscv_expand_saturation_addu (rtx dest, rtx x, rtx y)
>> +{
>> + machine_mode mode = GET_MODE (dest);
>> + rtx pmode_sum = gen_reg_rtx (Pmode);
>> + rtx pmode_lt = gen_reg_rtx (Pmode);
>> + rtx pmode_x = gen_lowpart (Pmode, x);
>> + rtx pmode_y = gen_lowpart (Pmode, y);
>> + rtx pmode_dest = gen_reg_rtx (Pmode);
>> +
>> + /* Step-1: sum = x + y */
>> + if (mode == SImode && mode != Pmode)
>> + { /* Take addw to avoid the sum truncate. */
>> + rtx simode_sum = gen_reg_rtx (SImode);
>> + riscv_emit_binary (PLUS, simode_sum, x, y);
>> + emit_move_insn (pmode_sum, gen_lowpart (Pmode, simode_sum));
>> + }
>> + else
>> + riscv_emit_binary (PLUS, pmode_sum, pmode_x, pmode_y);
>> +
>> + /* Step-1.1: truncate sum for HI and QI as we have no insn for add QI/HI. */
>> + if (mode == HImode || mode == QImode)
>> + {
>> + int shift_bits = GET_MODE_BITSIZE (Pmode)
>> + - GET_MODE_BITSIZE (mode).to_constant ();
>> +
>> + gcc_assert (shift_bits > 0);
>> +
>> + riscv_emit_binary (ASHIFT, pmode_sum, pmode_sum, GEN_INT (shift_bits));
>> + riscv_emit_binary (LSHIFTRT, pmode_sum, pmode_sum, GEN_INT (shift_bits));
>> + }
>> +
>> + /* Step-2: lt = sum < x */
>> + riscv_emit_binary (LTU, pmode_lt, pmode_sum, pmode_x);
>> +
>> + /* Step-3: lt = -lt */
>> + riscv_emit_unary (NEG, pmode_lt, pmode_lt);
>> +
>> + /* Step-4: pmode_dest = sum | lt */
>> + riscv_emit_binary (IOR, pmode_dest, pmode_lt, pmode_sum);
>> +
>> + /* Step-5: dest = pmode_dest */
>> + emit_move_insn (dest, gen_lowpart (mode, pmode_dest));
>> +}
>> +
>> /* Initialize the GCC target structure. */
>> #undef TARGET_ASM_ALIGNED_HI_OP
>> #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
>> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
>> index 39b29795cd6..03cbe5a2ca9 100644
>> --- a/gcc/config/riscv/riscv.md
>> +++ b/gcc/config/riscv/riscv.md
>> @@ -3841,6 +3841,17 @@ (define_insn "*large_load_address"
>> [(set_attr "type" "load")
>> (set (attr "length") (const_int 8))])
>>
>> +(define_expand "sat_addu_<mode>3"
>> + [(match_operand:ANYI 0 "register_operand")
>> + (match_operand:ANYI 1 "register_operand")
>> + (match_operand:ANYI 2 "register_operand")]
>> + ""
>> + {
>> + riscv_expand_saturation_addu (operands[0], operands[1], operands[2]);
>> + DONE;
>> + }
>> +)
>> +
>> (include "bitmanip.md")
>> (include "crypto.md")
>> (include "sync.md")
>> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
>> index b0c61925120..5867afdb1a0 100644
>> --- a/gcc/doc/md.texi
>> +++ b/gcc/doc/md.texi
>> @@ -6653,6 +6653,17 @@ The operation is only supported for vector modes @var{m}.
>>
>> This pattern is not allowed to @code{FAIL}.
>>
>> +@cindex @code{sat_addu_@var{m}3} instruction pattern
>> +@item @samp{sat_addu_@var{m}3}
>> +Perform the saturation unsigned add for the operand 1 and operand 2 and
>> +store the result into the operand 0. All operands have mode @var{m},
>> +which is a scalar integer mode.
>> +
>> +@smallexample
>> + typedef unsigned char uint8_t;
>> + uint8_t sat_addu (uint8_t x, uint8_t y) => return (x + y) | -((x + y) < x);
>> +@end smallexample
>> +
>> @cindex @code{cmla@var{m}4} instruction pattern
>> @item @samp{cmla@var{m}4}
>> Perform a vector multiply and accumulate that is semantically the same as
>> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
>> index a07f25f3aee..dee73dbc614 100644
>> --- a/gcc/internal-fn.cc
>> +++ b/gcc/internal-fn.cc
>> @@ -4159,6 +4159,7 @@ commutative_binary_fn_p (internal_fn fn)
>> case IFN_VEC_WIDEN_PLUS_HI:
>> case IFN_VEC_WIDEN_PLUS_EVEN:
>> case IFN_VEC_WIDEN_PLUS_ODD:
>> + case IFN_SAT_ADDU:
>> return true;
>>
>> default:
>> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
>> index c14d30365c1..a04592fc779 100644
>> --- a/gcc/internal-fn.def
>> +++ b/gcc/internal-fn.def
>> @@ -428,6 +428,7 @@ DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_ABD,
>> binary)
>> DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
>> DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
>> +DEF_INTERNAL_OPTAB_FN (SAT_ADDU, ECF_CONST | ECF_NOTHROW, sat_addu, binary)
>>
>> /* FP scales. */
>> DEF_INTERNAL_FLT_FN (LDEXP, ECF_CONST, ldexp, binary)
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 711c3a10c3f..9de1106adcf 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1994,6 +1994,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> )
>> )
>>
>> +#if GIMPLE
>> +
>> +/* Saturation add unsigned, aka:
>> + SAT_ADDU = (X + Y) | - ((X + Y) < X) or
>> + SAT_ADDU = (X + Y) | - ((X + Y) < Y). */
>> +(simplify
>> + (bit_ior:c (plus:c@2 @0 @1) (negate (convert (lt @2 @0))))
>> + (if (optimize
>> + && INTEGRAL_TYPE_P (type)
>> + && TYPE_UNSIGNED (TREE_TYPE (@0))
>> + && types_match (type, TREE_TYPE (@0))
>> + && types_match (type, TREE_TYPE (@1))
>> + && direct_internal_fn_supported_p (IFN_SAT_ADDU, type, OPTIMIZE_FOR_BOTH))
>> + (IFN_SAT_ADDU @0 @1)))
>> +
>> +/* SAT_ADDU (X, 0) = X */
>> +(simplify
>> + (IFN_SAT_ADDU:c @0 integer_zerop)
>> + @0)
>> +
>> +#endif
>> +
>> /* A few cases of fold-const.cc negate_expr_p predicate. */
>> (match negate_expr_p
>> INTEGER_CST
>> diff --git a/gcc/optabs.def b/gcc/optabs.def
>> index ad14f9328b9..a2c11b7707b 100644
>> --- a/gcc/optabs.def
>> +++ b/gcc/optabs.def
>> @@ -300,6 +300,8 @@ OPTAB_D (usubc5_optab, "usubc$I$a5")
>> OPTAB_D (addptr3_optab, "addptr$a3")
>> OPTAB_D (spaceship_optab, "spaceship$a3")
>>
>> +OPTAB_D (sat_addu_optab, "sat_addu_$a3")
>> +
>> OPTAB_D (smul_highpart_optab, "smul$a3_highpart")
>> OPTAB_D (umul_highpart_optab, "umul$a3_highpart")
>>
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-1.c b/gcc/testsuite/gcc.target/riscv/sat_addu-1.c
>> new file mode 100644
>> index 00000000000..229abef0faa
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-1.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
>> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
>> +/* { dg-final { check-function-bodies "**" "" } } */
>> +
>> +#include "sat_arith.h"
>> +
>> +/*
>> +** sat_addu_uint8_t:
>> +** add\s+[atx][0-9]+,\s*a0,\s*a1
>> +** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
>> +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** neg\s+[atx][0-9]+,\s*[atx][0-9]+
>> +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** andi\s+a0,\s*a0,\s*0xff
>> +** ret
>> +*/
>> +DEF_SAT_ADDU(uint8_t)
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-2.c b/gcc/testsuite/gcc.target/riscv/sat_addu-2.c
>> new file mode 100644
>> index 00000000000..4023b030811
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-2.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
>> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
>> +/* { dg-final { check-function-bodies "**" "" } } */
>> +
>> +#include "sat_arith.h"
>> +
>> +/*
>> +** sat_addu_uint16_t:
>> +** add\s+[atx][0-9]+,\s*a0,\s*a1
>> +** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
>> +** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
>> +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** neg\s+[atx][0-9]+,\s*[atx][0-9]+
>> +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** slli\s+a0,\s*a0,\s*48
>> +** srli\s+a0,\s*a0,\s*48
>> +** ret
>> +*/
>> +DEF_SAT_ADDU(uint16_t)
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-3.c b/gcc/testsuite/gcc.target/riscv/sat_addu-3.c
>> new file mode 100644
>> index 00000000000..4d0af97fb67
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-3.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
>> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
>> +/* { dg-final { check-function-bodies "**" "" } } */
>> +
>> +#include "sat_arith.h"
>> +
>> +/*
>> +** sat_addu_uint32_t:
>> +** addw\s+[atx][0-9]+,\s*a0,\s*a1
>> +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** neg\s+[atx][0-9]+,\s*[atx][0-9]+
>> +** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** sext.w\s+a0,\s*a0
>> +** ret
>> +*/
>> +DEF_SAT_ADDU(uint32_t)
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-4.c b/gcc/testsuite/gcc.target/riscv/sat_addu-4.c
>> new file mode 100644
>> index 00000000000..926f31266e3
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-4.c
>> @@ -0,0 +1,16 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
>> +/* { dg-skip-if "" { *-*-* } { "-flto" } } */
>> +/* { dg-final { check-function-bodies "**" "" } } */
>> +
>> +#include "sat_arith.h"
>> +
>> +/*
>> +** sat_addu_uint64_t:
>> +** add\s+[atx][0-9]+,\s*a0,\s*a1
>> +** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** neg\s+[atx][0-9]+,\s*[atx][0-9]+
>> +** or\s+a0,\s*[atx][0-9]+,\s*[atx][0-9]+
>> +** ret
>> +*/
>> +DEF_SAT_ADDU(uint64_t)
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-run-1.c b/gcc/testsuite/gcc.target/riscv/sat_addu-run-1.c
>> new file mode 100644
>> index 00000000000..b19515c39d1
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-run-1.c
>> @@ -0,0 +1,42 @@
>> +/* { dg-do run { target { riscv_v } } } */
>> +/* { dg-additional-options "-std=c99" } */
>> +
>> +#include "sat_arith.h"
>> +
>> +DEF_SAT_ADDU(uint8_t)
>> +
>> +int
>> +main ()
>> +{
>> + if (RUN_SAT_ADDU (uint8_t, 0, 0) != 0)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 0, 1) != 1)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 1, 1) != 2)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 0, 254) != 254)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 1, 254) != 255)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 2, 254) != 255)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 0, 255) != 255)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 1, 255) != 255)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 2, 255) != 255)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint8_t, 255, 255) != 255)
>> + __builtin_abort ();
>> +
>> + return 0;
>> +}
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-run-2.c b/gcc/testsuite/gcc.target/riscv/sat_addu-run-2.c
>> new file mode 100644
>> index 00000000000..90073fbe4ba
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-run-2.c
>> @@ -0,0 +1,42 @@
>> +/* { dg-do run { target { riscv_v } } } */
>> +/* { dg-additional-options "-std=c99" } */
>> +
>> +#include "sat_arith.h"
>> +
>> +DEF_SAT_ADDU(uint16_t)
>> +
>> +int
>> +main ()
>> +{
>> + if (RUN_SAT_ADDU (uint16_t, 0, 0) != 0)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 0, 1) != 1)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 1, 1) != 2)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 0, 65534) != 65534)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 1, 65534) != 65535)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 2, 65534) != 65535)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 0, 65535) != 65535)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 1, 65535) != 65535)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 2, 65535) != 65535)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint16_t, 65535, 65535) != 65535)
>> + __builtin_abort ();
>> +
>> + return 0;
>> +}
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-run-3.c b/gcc/testsuite/gcc.target/riscv/sat_addu-run-3.c
>> new file mode 100644
>> index 00000000000..996dd3de737
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-run-3.c
>> @@ -0,0 +1,42 @@
>> +/* { dg-do run { target { riscv_v } } } */
>> +/* { dg-additional-options "-std=c99" } */
>> +
>> +#include "sat_arith.h"
>> +
>> +DEF_SAT_ADDU(uint32_t)
>> +
>> +int
>> +main ()
>> +{
>> + if (RUN_SAT_ADDU (uint32_t, 0, 0) != 0)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 0, 1) != 1)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 1, 1) != 2)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 0, 4294967294) != 4294967294)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 1, 4294967294) != 4294967295)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 2, 4294967294) != 4294967295)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 0, 4294967295) != 4294967295)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 1, 4294967295) != 4294967295)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 2, 4294967295) != 4294967295)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint32_t, 4294967295, 4294967295) != 4294967295)
>> + __builtin_abort ();
>> +
>> + return 0;
>> +}
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_addu-run-4.c b/gcc/testsuite/gcc.target/riscv/sat_addu-run-4.c
>> new file mode 100644
>> index 00000000000..51a5421577b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_addu-run-4.c
>> @@ -0,0 +1,49 @@
>> +/* { dg-do run { target { riscv_v } } } */
>> +/* { dg-additional-options "-std=c99" } */
>> +
>> +#include "sat_arith.h"
>> +
>> +DEF_SAT_ADDU(uint64_t)
>> +
>> +int
>> +main ()
>> +{
>> + if (RUN_SAT_ADDU (uint64_t, 0, 0) != 0)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 0, 1) != 1)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 1, 1) != 2)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 0, 18446744073709551614u)
>> + != 18446744073709551614u)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 1, 18446744073709551614u)
>> + != 18446744073709551615u)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 2, 18446744073709551614u)
>> + != 18446744073709551615u)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 0, 18446744073709551615u)
>> + != 18446744073709551615u)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 1, 18446744073709551615u)
>> + != 18446744073709551615u)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 2, 18446744073709551615u)
>> + != 18446744073709551615u)
>> + __builtin_abort ();
>> +
>> + if (RUN_SAT_ADDU (uint64_t, 18446744073709551615u, 18446744073709551615u)
>> + != 18446744073709551615u)
>> + __builtin_abort ();
>> +
>> + return 0;
>> +}
>> diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
>> new file mode 100644
>> index 00000000000..4c00157685e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
>> @@ -0,0 +1,15 @@
>> +#ifndef HAVE_SAT_ARITH
>> +#define HAVE_SAT_ARITH
>> +
>> +#include <stdint.h>
>> +
>> +#define DEF_SAT_ADDU(TYPE) \
>> +TYPE __attribute__((noinline)) \
>> +sat_addu_##TYPE (TYPE x, TYPE y) \
>> +{ \
>> + return (x + y) | (-(TYPE)((TYPE)(x + y) < x)); \
>> +}
>> +
>> +#define RUN_SAT_ADDU(TYPE, x, y) sat_addu_##TYPE(x, y)
>> +
>> +#endif
>> --
>> 2.34.1
>>
next prev parent reply other threads:[~2024-02-27 10:53 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-17 10:30 pan2.li
2024-02-19 7:36 ` Richard Biener
2024-02-19 8:30 ` Andrew Pinski
2024-02-19 8:42 ` Li, Pan2
2024-02-19 8:41 ` Li, Pan2
2024-02-19 8:55 ` Tamar Christina
2024-02-19 12:59 ` Li, Pan2
2024-02-19 13:04 ` Tamar Christina
2024-02-24 11:18 ` Li, Pan2
2024-02-27 10:53 ` Georg-Johann Lay [this message]
2024-02-27 11:15 ` Tamar Christina
2024-02-27 12:07 ` Georg-Johann Lay
2024-02-24 11:12 ` [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS pan2.li
2024-02-25 9:01 ` Tamar Christina
2024-02-26 6:27 ` Li, Pan2
2024-02-27 9:43 ` Richard Biener
2024-02-27 9:56 ` Tamar Christina
2024-02-27 12:41 ` Li, Pan2
2024-02-27 12:57 ` Tamar Christina
2024-02-27 13:42 ` Richard Biener
2024-02-27 14:35 ` Li, Pan2
2024-03-02 7:46 ` Li, Pan2
2024-03-04 10:31 ` Richard Biener
2024-03-05 7:09 ` Li, Pan2
2024-03-05 8:41 ` Richard Biener
2024-03-07 1:54 ` Li, Pan2
2024-03-07 8:41 ` Richard Biener
2024-03-08 1:01 ` Li, Pan2
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5e617497-8f81-45e4-bf69-7f4c8b08930b@gjlay.de \
--to=avr@gjlay.de \
--cc=Tamar.Christina@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=juzhe.zhong@rivai.ai \
--cc=kito.cheng@gmail.com \
--cc=pan2.li@intel.com \
--cc=richard.guenther@gmail.com \
--cc=yanzhang.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).