From: Bruce Hoult
Date: Fri, 22 Mar 2024 16:43:40 +1300
Subject: Re: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector
To: Christoph Müllner
Cc: gcc-patches@gcc.gnu.org, Kito Cheng, Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Camel Coder, Juzhe-Zhong, Jun Sha, Xianmiao Qu, Jin Ma
References: <20240321234552.2140254-1-christoph.muellner@vrull.eu>
In-Reply-To: <20240321234552.2140254-1-christoph.muellner@vrull.eu>

> The effect is demonstrated by a new test case that shows that the
> by-pieces framework now emits `sb` instructions instead of triggering
> an ICE

So these small memset() now don't use RVV at all if xtheadvector is
enabled?

I don't have evidence whether the use of RVV (whether V or
xtheadvector) for these memsets is a win or not, but the treatment
should probably be consistent.

I don't know why RVV 1.0 uses a fractional LMUL at all here. It would
work perfectly well with LMUL=1 and just setting vl to the appropriate
length (which is always less than 16 bytes). Use of fractional LMUL
doesn't save any resources.

On Fri, Mar 22, 2024 at 12:46 PM Christoph Müllner wrote:
>
> The expansion of `memset` (via expand_builtin_memset_args())
> uses clear_by_pieces() and store_by_pieces() to avoid calls
> to the C runtime. To check if a type can be used for that purpose
> the function by_pieces_mode_supported_p() tests if a `mov` and
> a `vec_duplicate` INSN can be expanded by the backend.
>
> The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> not types but type iterators) include fractional LMUL types.
> E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> `RVVMF2QI`.
>
> This results in an attempt to use fractional LMUL-types for
> the `memset` expansion resulting in an ICE for XTheadVector,
> because that extension cannot handle fractional LMULs.
>
> This patch addresses this issue by splitting the definition
> of the `VI` mode iterator into `VI_NOFRAC` (without fractional
> LMUL types) and `VI_FRAC` (only fractional LMUL types).
> Further, it defines `V_VLS` such that `VI_FRAC` types are only
> included if XTheadVector is not enabled.
>
> The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE.
>
> Signed-off-by: Christoph Müllner
>
>         PR 114194
>
> gcc/ChangeLog:
>
>         * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
>         Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
>
> Signed-off-by: Christoph Müllner
> ---
>  gcc/config/riscv/vector-iterators.md          | 19 +++++--
>  .../riscv/rvv/xtheadvector/pr114194.c         | 56 +++++++++++++++++++
>  2 files changed, 69 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
>
> diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
> index c2ea7e8b10a..a24e1bf078f 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
>    UNSPECV_FRM_RESTORE_EXIT
>  ])
>
> -(define_mode_iterator VI [
> -  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +;; Subset of VI with fractional LMUL types
> +(define_mode_iterator VI_FRAC [
> +  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> +  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> +  (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +])
>
> +;; Subset of VI with non-fractional LMUL types
> +(define_mode_iterator VI_NOFRAC [
> +  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> +  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> +  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
>    (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
>    (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
>  ])
>
> +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> +
>  ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
>  ;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
>  ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> new file mode 100644
> index 00000000000..fc2d1349425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** foo0_1:
> +**	sb\tzero,0([a-x0-9]+)
> +**	ret
> +*/
> +void foo0_1 (void *p)
> +{
> +  __builtin_memset (p, 0, 1);
> +}
> +
> +/*
> +** foo0_7:
> +**	sb\tzero,0([a-x0-9]+)
> +**	sb\tzero,1([a-x0-9]+)
> +**	sb\tzero,2([a-x0-9]+)
> +**	sb\tzero,3([a-x0-9]+)
> +**	sb\tzero,4([a-x0-9]+)
> +**	sb\tzero,5([a-x0-9]+)
> +**	sb\tzero,6([a-x0-9]+)
> +**	ret
> +*/
> +void foo0_7 (void *p)
> +{
> +  __builtin_memset (p, 0, 7);
> +}
> +
> +/*
> +** foo1_1:
> +**	li\t[a-x0-9]+,1
> +**	sb\t[a-x0-9]+,0([a-x0-9]+)
> +**	ret
> +*/
> +void foo1_1 (void *p)
> +{
> +  __builtin_memset (p, 1, 1);
> +}
> +
> +/*
> +** foo1_5:
> +**	li\t[a-x0-9]+,1
> +**	sb\t[a-x0-9]+,0([a-x0-9]+)
> +**	sb\t[a-x0-9]+,1([a-x0-9]+)
> +**	sb\t[a-x0-9]+,2([a-x0-9]+)
> +**	sb\t[a-x0-9]+,3([a-x0-9]+)
> +**	sb\t[a-x0-9]+,4([a-x0-9]+)
> +**	ret
> +*/
> +void foo1_5 (void *p)
> +{
> +  __builtin_memset (p, 1, 5);
> +}
> --
> 2.44.0
>
>
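
To make the LMUL=1 suggestion at the top of this reply concrete, here is a
rough sketch of a small memset done with e8/m1 and a short vl. It is not
taken from the patch or from GCC's expander. The helper name small_memset
is made up for illustration, and the vector path assumes a toolchain that
provides the RVV 1.0 C intrinsics in <riscv_vector.h> (the XTheadVector
intrinsics may differ). A scalar fallback is included only so the sketch
also builds on hosts without RVV.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_vector)
#include <riscv_vector.h>
#endif

/* Fill n bytes at p with c.  Hypothetical helper, for illustration only. */
static void small_memset(void *p, uint8_t c, size_t n)
{
    uint8_t *dst = (uint8_t *)p;
#if defined(__riscv_vector)
    /* Assumed RVV 1.0 path: 8-bit elements at LMUL=1.  No fractional
       LMUL is involved; vl alone limits how many bytes are written. */
    while (n > 0) {
        size_t vl = __riscv_vsetvl_e8m1(n);          /* vl <= n */
        vuint8m1_t v = __riscv_vmv_v_x_u8m1(c, vl);  /* splat c */
        __riscv_vse8_v_u8m1(dst, v, vl);             /* store vl bytes */
        dst += vl;
        n -= vl;
    }
#else
    /* Portable fallback so the sketch compiles on non-RVV hosts. */
    for (size_t i = 0; i < n; i++)
        dst[i] = c;
#endif
}
```

With VLEN >= 128, VLMAX for e8/m1 is at least 16, so any length below 16
bytes is covered by a single pass; the loop only matters for smaller VLEN.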