From: Kito Cheng
Date: Fri, 12 Nov 2021 00:42:30 +0800
Subject: Re: [PATCH v1 8/8] RISC-V: bitmanip: relax minmax to operate on GPR
To: Philipp Tomsich
Cc: wilson@tuliptree.org, GCC Patches

Hi Philipp:

This testcase gives the wrong result with this patch, even without the
si3_sext pattern:

#include <stdio.h>

#define MAX(A, B) ((A) > (B) ? (A) : (B))

long long __attribute__((noinline, noipa))
foo6(long long a, long long b, int c)
{
  int xa = a;
  int xb = b;
  return MAX(MAX(xa, xb), c);
}

int main() {
  long long a = 0x200000000ll;
  long long b = 0x1ffffffffl;
  int c = 10;
  long long d = foo6(a, b, c);
  printf ("%lld %lld %d = %lld\n", a, b, c, d);
  return 0;
}

On Fri, Nov 12, 2021 at 12:27 AM Kito Cheng wrote:
>
> IIRC it does not work even without the sign-extend pattern, since I ran a
> similar experiment before (not for RISC-V, but the same concept). I guess
> I need more time to test that.
>
> On Fri, Nov 12, 2021 at 00:18, Philipp Tomsich wrote:
>>
>> Kito,
>>
>> Unless I am missing something, the problem is not the relaxation to
>> GPR, but rather the sign-extending pattern I had squashed into the
>> same patch.
>> If you disable "si3_sext", a sext.w will have to be
>> emitted after the 'max' and before the return (or before the SImode
>> output is consumed as a DImode), pushing the REE opportunity to a
>> subsequent consumer (e.g. an addw).
>>
>> This will generate
>>   foo6:
>>         max     a0,a0,a1
>>         sext.w  a0,a0
>>         ret
>> which (assuming that the inputs to max are properly sign-extended
>> SImode values living in DImode registers) will be the same as
>> performing the two sext.w before the max.
>>
>> Having a second set of eyes on this is appreciated; let me know if
>> you agree and I'll revise, once I have collected feedback on the
>> remaining patches of the series.
>>
>> Philipp.
>>
>>
>> On Thu, 11 Nov 2021 at 17:00, Kito Cheng wrote:
>> >
>> > Hi Philipp:
>> >
>> > We can't pretend we have an SImode min/max instruction without those
>> > semantics.
>> > Given this testcase, x86 and rv64gc print out 8589934592 8589934591 = 0,
>> > but with this patch and compiling with rv64gc_zba_zbb -O3, the output
>> > becomes 8589934592 8589934591 = 8589934592
>> >
>> > -------------Testcase---------------
>> > #include <stdio.h>
>> > long long __attribute__((noinline, noipa))
>> > foo6(long long a, long long b)
>> > {
>> >   int xa = a;
>> >   int xb = b;
>> >   return (xa > xb ? xa : xb);
>> > }
>> > int main() {
>> >   long long a = 0x200000000ll;
>> >   long long b = 0x1ffffffffl;
>> >   long long c = foo6(a, b);
>> >   printf ("%lld %lld = %lld\n", a, b, c);
>> >   return 0;
>> > }
>> > --------------------------------------
>> > rv64gc_zba_zbb -O3 w/o this patch:
>> > foo6:
>> >         sext.w  a1,a1
>> >         sext.w  a0,a0
>> >         max     a0,a0,a1
>> >         ret
>> >
>> > --------------------------------------
>> > rv64gc_zba_zbb -O3 w/ this patch:
>> > foo6:
>> >         max     a0,a0,a1
>> >         ret
>> >
>> > On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich
>> > wrote:
>> > >
>> > > While min/minu/max/maxu instructions are provided for XLEN only, these
>> > > can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
>> > > always sign-extended, which ensures that the XLEN-wide instructions
>> > > can be used for signed and unsigned comparisons on SImode yielding a
>> > > correct ordering of values.
>> > >
>> > > This commit
>> > >  - relaxes the minmax pattern to express for GPR (instead of X only),
>> > >    providing both a si3 and di3 expansion on RV64
>> > >  - adds a sign-extending form for the si3 pattern for RV64 to allow REE
>> > >    to eliminate redundant extensions
>> > >  - adds test-cases for both
>> > >
>> > > gcc/ChangeLog:
>> > >
>> > >         * config/riscv/bitmanip.md: Relax minmax to GPR (i.e. SImode or
>> > >         DImode) on RV64.
>> > >         * config/riscv/bitmanip.md (si3_sext): Add
>> > >         pattern for REE.
>> > >
>> > > gcc/testsuite/ChangeLog:
>> > >
>> > >         * gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
>> > >         operands checking that no redundant sign- or zero-extensions
>> > >         are emitted.
>> > >
>> > > Signed-off-by: Philipp Tomsich
>> > > ---
>> > >
>> > >  gcc/config/riscv/bitmanip.md                 | 14 +++++++++++---
>> > >  gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
>> > >  2 files changed, 28 insertions(+), 6 deletions(-)
>> > >
>> > > diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
>> > > index 000deb48b16..2a28f78f5f6 100644
>> > > --- a/gcc/config/riscv/bitmanip.md
>> > > +++ b/gcc/config/riscv/bitmanip.md
>> > > @@ -260,13 +260,21 @@ (define_insn "bswap2"
>> > >    [(set_attr "type" "bitmanip")])
>> > >
>> > >  (define_insn "3"
>> > > -  [(set (match_operand:X 0 "register_operand" "=r")
>> > > -        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
>> > > -                           (match_operand:X 2 "register_operand" "r")))]
>> > > +  [(set (match_operand:GPR 0 "register_operand" "=r")
>> > > +        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
>> > > +                             (match_operand:GPR 2 "register_operand" "r")))]
>> > >    "TARGET_ZBB"
>> > >    "\t%0,%1,%2"
>> > >    [(set_attr "type" "bitmanip")])
>> > >
>> > > +(define_insn "si3_sext"
>> > > +  [(set (match_operand:DI 0 "register_operand" "=r")
>> > > +        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
>> > > +                                            (match_operand:SI 2 "register_operand" "r"))))]
>> > > +  "TARGET_64BIT && TARGET_ZBB"
>> > > +  "\t%0,%1,%2"
>> > > +  [(set_attr "type" "bitmanip")])
>> > > +
>> > >  ;; orc.b (or-combine) is added as an unspec for the benefit of the support
>> > >  ;; for optimized string functions (such as strcmp).
>> > >  (define_insn "orcb2"
>> > > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
>> > > index f44c398ea08..7169e873551 100644
>> > > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
>> > > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
>> > > @@ -1,5 +1,5 @@
>> > >  /* { dg-do compile } */
>> > > -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
>> > > +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
>> > >
>> > >  long
>> > >  foo1 (long i, long j)
>> > > @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
>> > >    return i > j ? i : j;
>> > >  }
>> > >
>> > > +unsigned int
>> > > +foo5(unsigned int a, unsigned int b)
>> > > +{
>> > > +  return a > b ? a : b;
>> > > +}
>> > > +
>> > > +int
>> > > +foo6(int a, int b)
>> > > +{
>> > > +  return a > b ? a : b;
>> > > +}
>> > > +
>> > >  /* { dg-final { scan-assembler-times "min" 3 } } */
>> > > -/* { dg-final { scan-assembler-times "max" 3 } } */
>> > > +/* { dg-final { scan-assembler-times "max" 4 } } */
>> > >  /* { dg-final { scan-assembler-times "minu" 1 } } */
>> > > -/* { dg-final { scan-assembler-times "maxu" 1 } } */
>> > > +/* { dg-final { scan-assembler-times "maxu" 3 } } */
>> > > +/* { dg-final { scan-assembler-not "zext.w" } } */
>> > > +/* { dg-final { scan-assembler-not "sext.w" } } */
>> > > --
>> > > 2.32.0
>> > >
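
P.S. The disagreement in this thread can be checked without a RISC-V
toolchain. The following is a minimal sketch in portable C that models the
two instructions involved on 64-bit register values. The helper names
(sext_w, max64, extend_first, extend_last) are hypothetical and exist only
for this illustration; they are not part of the patch or of GCC. The
"extend_last" sequence is an assumption about what a compiler might emit
when it treats the XLEN-wide max as an SImode operation and sign-extends
only the final result.

```c
#include <stdint.h>

/* Model of RV64 sext.w: sign-extend the low 32 bits of a register.
   Written without implementation-defined narrowing conversions.  */
static int64_t sext_w(int64_t x)
{
    uint64_t low = (uint64_t)x & 0xffffffffu;
    return (int64_t)(low ^ 0x80000000u) - 0x80000000LL;
}

/* Model of the XLEN-wide (64-bit) signed max instruction.  */
static int64_t max64(int64_t a, int64_t b)
{
    return a > b ? a : b;
}

/* What the C source asks for: truncate a and b to int first,
   then take MAX(MAX(xa, xb), c).  */
static int64_t extend_first(int64_t a, int64_t b, int64_t c)
{
    return max64(max64(sext_w(a), sext_w(b)), c);
}

/* Assumed miscompiled form: 64-bit max on registers that were never
   sign-extended, followed by a single sext.w on the result.  */
static int64_t extend_last(int64_t a, int64_t b, int64_t c)
{
    return sext_w(max64(max64(a, b), c));
}
```

With the values from the testcase at the top of this mail
(a = 0x200000000, b = 0x1ffffffff, c = 10), extend_first gives 10 while
extend_last gives 0. A single trailing sext.w is therefore only equivalent
when the inputs to max are already sign-extended, which is the assumption
Philipp states explicitly and the testcase deliberately violates.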