From: Richard Biener
Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
Date: Thu, 26 Oct 2023 15:59:31 +0200
Message-Id: <6D1DE93C-B4FC-4E64-AB39-D7D0F929AB64@gmail.com>
To: "Li, Pan2"
Cc: gcc-patches@gcc.gnu.org, juzhe.zhong@rivai.ai, "Wang, Yanzhang", kito.cheng@gmail.com, "Liu, Hongtao", Richard Sandiford

> On 26.10.2023 at 13:59, Li, Pan2 wrote:
>
> Thanks Richard for the comments.
>
>> Can you explain why this is necessary? In particular what is lhs_rtx
>> mode vs ops[0].value mode?
>
> For the testcase gcc.target/aarch64/sve/popcount_1.c, the RTL is listed below.
>
> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
> The ops[0].value is (reg:VNx2DI 104).
>
> Removing the restriction lets the vector RTL reach expand_fn_using_insn, where it hits the INTEGRAL_TYPE_P assertion.

But I think this shows we mis-selected the optab. A convert_move is certainly
not correct unconditionally here (the target might not support that).

> Pan
>
> -----Original Message-----
> From: Richard Biener
> Sent: Thursday, October 26, 2023 4:38 PM
> To: Li, Pan2
> Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; Wang, Yanzhang; kito.cheng@gmail.com; Liu, Hongtao; Richard Sandiford
> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
>
>> On Thu, Oct 26, 2023 at 4:18 AM wrote:
>>
>> From: Pan Li
>>
>> Update in v2:
>>
>> * Fix one ICE caused by a type assertion.
>> * Adjust some test cases for aarch64 sve and riscv vector.
>>
>> Original log:
>>
>> vectorizable_call has a restriction on the size of the data type:
>> DF to DI is allowed but SF to DI isn't. You may see the message below
>> when trying to vectorize a function call like lrintf.
>>
>> void
>> test_lrintf (long *out, float *in, unsigned count)
>> {
>>   for (unsigned i = 0; i < count; i++)
>>     out[i] = __builtin_lrintf (in[i]);
>> }
>>
>> lrintf.c:5:26: missed: couldn't vectorize loop
>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>>
>> As a result, a standard name pattern like lrintmn2 cannot work for
>> differing data type sizes like SF => DI. This patch removes the data
>> type size check and unblocks standard names like lrintmn2.
>>
>> The tests below pass with this patch.
>>
>> * The x86 bootstrap and regression test.
>> * The aarch64 regression test.
>> * The risc-v regression tests.
>>
>> gcc/ChangeLog:
>>
>> * internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
>> * tree-vect-stmts.cc (vectorizable_call): Remove size check.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
>> * gcc.target/aarch64/sve/clz_1.c: Ditto.
>> * gcc.target/aarch64/sve/popcount_1.c: Ditto.
>> * gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>>
>> Signed-off-by: Pan Li
>> ---
>>  gcc/internal-fn.cc                                         |  3 ++-
>>  gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c             |  3 +--
>>  gcc/testsuite/gcc.target/aarch64/sve/clz_1.c               |  3 +--
>>  gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c          |  3 +--
>>  .../gcc.target/riscv/rvv/autovec/unop/popcount.c           |  2 +-
>>  gcc/tree-vect-stmts.cc                                     | 13 -------------
>>  6 files changed, 6 insertions(+), 21 deletions(-)
>>
>> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
>> index 61d5a9e4772..17c0f4c3805 100644
>> --- a/gcc/internal-fn.cc
>> +++ b/gcc/internal-fn.cc
>> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs,
>>             emit_move_insn (lhs_rtx, ops[0].value);
>>           else
>>             {
>> -             gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
>> +             gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
>> +                                  || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)));
>
> Can you explain why this is necessary? In particular what is lhs_rtx
> mode vs ops[0].value mode?
>
>>               convert_move (lhs_rtx, ops[0].value, 0);
>
> I'm not sure convert_move handles vector modes correctly. Richard
> probably added this code, CCed.
>
> Richard.
>
>>             }
>>         }
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
>> index bdc9856faaf..940d08bbc7b 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
>> @@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
>>  }
>>
>>  /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
>> -/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */
>> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
>> +/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
>> index 0c7a4e6d768..58b8ff406d2 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
>> @@ -18,5 +18,4 @@ clz_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
>>  }
>>
>>  /* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
>> -/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */
>> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
>> +/* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
>> index dfb6f4ac7a5..0eba898307c 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c
>> @@ -18,5 +18,4 @@ popcount_64 (unsigned int *restrict dst, uint64_t *restrict src, int size)
>>  }
>>
>>  /* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 1 } } */
>> -/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 2 } } */
>> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
>> +/* { dg-final { scan-assembler-times {\tcnt\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 1 } } */
>> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
>> index 585a522aa81..e6e3c70f927 100644
>> --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
>> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/popcount.c
>> @@ -1461,4 +1461,4 @@ main ()
>>    RUN_ALL ()
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 229 "vect" } } */
>> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 384 "vect" } } */
>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> index a9200767f67..fa4ca0634e8 100644
>> --- a/gcc/tree-vect-stmts.cc
>> +++ b/gcc/tree-vect-stmts.cc
>> @@ -3361,19 +3361,6 @@ vectorizable_call (vec_info *vinfo,
>>
>>        return false;
>>      }
>> -  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
>> -     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
>> -     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
>> -     by a pack of the two vectors into an SI vector.  We would need
>> -     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
>> -  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
>> -    {
>> -      if (dump_enabled_p ())
>> -        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> -                         "mismatched vector sizes %T and %T\n",
>> -                         vectype_in, vectype_out);
>> -      return false;
>> -    }
>>
>>    if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
>>        != VECTOR_BOOLEAN_TYPE_P (vectype_in))
>> --
>> 2.34.1
>>
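
To make the mode question concrete, here is a minimal scalar sketch of the loop shape in popcount_1.c. It is not GCC code and says nothing about what the expander actually emits; it only models, per lane, why a count produced in a 64-bit lane (the VNx2DI ops[0].value) needs some narrowing step before it can land in a 32-bit lane (the VNx2SI lhs_rtx). The name popcount_64_two_step is made up for illustration, and the sketch assumes a compiler that provides __builtin_popcountll (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as popcount_64 in gcc.target/aarch64/sve/popcount_1.c:
   64-bit inputs, 32-bit outputs.  */
static void
popcount_64 (unsigned int *restrict dst, const uint64_t *restrict src,
             int size)
{
  for (int i = 0; i < size; ++i)
    dst[i] = __builtin_popcountll (src[i]);
}

/* Hypothetical two-step model of the same loop (illustration only):
   the count is first held in a value as wide as the input lane,
   then explicitly narrowed to the 32-bit destination lane.  */
static void
popcount_64_two_step (unsigned int *restrict dst,
                      const uint64_t *restrict src, int size)
{
  for (int i = 0; i < size; ++i)
    {
      uint64_t wide = (uint64_t) __builtin_popcountll (src[i]);
      dst[i] = (unsigned int) wide; /* explicit 64-bit -> 32-bit narrowing */
    }
}

/* Returns 1 when both variants agree on every element, 0 otherwise.  */
static int
variants_agree (const uint64_t *src, int size)
{
  unsigned int a[16], b[16];
  assert (size <= 16);
  popcount_64 (a, src, size);
  popcount_64_two_step (b, src, size);
  for (int i = 0; i < size; ++i)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

In scalar C the narrowing is always available, so the two variants agree. The open question in this thread is whether the equivalent vector-mode narrowing can be assumed in expand_fn_using_insn, which is why an unconditional convert_move looks wrong to me.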