From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id F04C93858D28 for ; Fri, 19 Aug 2022 12:52:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org F04C93858D28 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4705A1042; Fri, 19 Aug 2022 05:52:17 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C9F943F66F; Fri, 19 Aug 2022 05:52:14 -0700 (PDT) From: Richard Sandiford To: "juzhe.zhong\@rivai.ai" Mail-Followup-To: "juzhe.zhong\@rivai.ai" , gcc-patches , rguenther , "kito.cheng" , richard.sandiford@arm.com Cc: gcc-patches , rguenther , "kito.cheng" Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, 1) and allow the machine_mode definition with poly_uint16 (1, 1) References: <20220818104608.259204-1-juzhe.zhong@rivai.ai> Date: Fri, 19 Aug 2022 13:52:13 +0100 In-Reply-To: (juzhe's message of "Fri, 19 Aug 2022 17:57:46 +0800") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-51.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, WEIRD_PORT autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Aug 2022 12:52:17 -0000 "juzhe.zhong@rivai.ai" writes: >> Ah, right, sorry for the bogus suggestion. >> In that case, what specifically goes wrong? Which test in >> test_vector_subregs_modes fails, and what does the ICE look like? > > Because may_be (nunits, 1) return true. For nunits =3D (1,1), it will exe= cute test_vector_subregs_modes. > The fail ICE: > ../../../riscv-gcc/gcc/simplify-rtx.cc:8396: test_vector_subregs_modes: F= AIL: ASSERT_RTX_EQ (expected, simplify_subreg (QImode, x, inner_mode, byte)) > expected: (nil) > > actual: (const_int 0 [0]) > cc1: internal compiler error: in assert_rtx_eq_at, at selftest-rtl.cc:57 > 0x1304ee1 selftest::assert_rtx_eq_at(selftest::location const&, char cons= t*, rtx_def*, rtx_def*) > ../../../riscv-gcc/gcc/selftest-rtl.cc:57 > 0x1332504 test_vector_subregs_modes > ../../../riscv-gcc/gcc/simplify-rtx.cc:8395 > 0x1332988 test_vector_subregs_fore_back > ../../../riscv-gcc/gcc/simplify-rtx.cc:8442 > 0x1332ae7 test_vector_subregs > ../../../riscv-gcc/gcc/simplify-rtx.cc:8467 > 0x1332c57 test_vector_ops > ../../../riscv-gcc/gcc/simplify-rtx.cc:8487 > 0x1332c7b selftest::simplify_rtx_cc_tests() > ../../../riscv-gcc/gcc/simplify-rtx.cc:8547 > 0x21318fa selftest::run_tests() > ../../../riscv-gcc/gcc/selftest-run-tests.cc:115 > 0x1362a76 toplev::run_self_tests() > ../../../riscv-gcc/gcc/toplev.cc:2205 > > I analyzed the codes: > In test_vector_subregs_fore_back, when nunits =3D (1,1). The expected =3D= NULL_RTX and simplify_subreg (QImode, x, inner_mode, byte) =3D const_int 0. > So the assertion fails. Hmm, ok, so the subreg operation is unexpected succeeding. > This is the test for stepped vector using 2 element per pattern. For pol= y_uint16 (1,1), it's true it is possible only has 1 element.=20 The stepped case is 3 elements per pattern rather than 2. In a stepped pattern: a, b, b+n are represented explicitly, then the rest are implicitly b+n*2, b+n*3, etc. The case being handled by this code is instead the 2-element case: a, b are represented explicitly, then the rest are implicitly all b. Why is (1,1) different though? The test is constructing: nunits: 1 + 1x shape: nelts_per_pattern =3D=3D 2, npatterns =3D=3D 1 elements: a, b[, b, b, b, b, ...] It then tests subregs starting at 0 + 1x (so starting at the first b). But for (2,2) we should have: nunits: 2 + 2x shape: nelts_per_pattern =3D=3D 2, npatterns =3D=3D 2 elements: a1, a2, b1, b2[, b1, b2, b1, b2, ...] and we should test subregs starting at 0 + 2x (so starting at the first b1). The two cases should be very similar, it's just that the (2,2) case doubles the number of patterns. > I think it makes sense to fail the test. However for poly (1,1) machine m= ode, can we have the chance that some target define this > kind of machine mode only used for intrinsics? I already developed full = RVV support in GCC (including intrinsc and auto-vectorization). > I only enable auto-vectorization with mode larger than (2,2) and test it = fully. > From my experience, it seems the stepped vector only created during VLA a= uto-vectorization. And I think only allow poly (1,1)mode used in intrinsics= will > not create issues. Am I understanding wrong =EF=BC=9FFeel free to correct= me. Thanks ~ Following on from what I said above, it doesn't look like this particular case is related to stepped vectors. (1,1) shouldn't (need to) be a special case though. Any potentital problems that would occur for (1,1) with npatterns=3D=3D1 would also occur for (n,n) with npatterns=3D=3Dn. E.g. if stepped vectors are problematic for (1,1) then an interleaving of 2 stepped vectors (i.e. npatterns=3D=3D2) would be problematic for (2,2). So yeah, preventing a mode being used for autovectorisation would allow the target to have a bit more control over which constants are actually generated. But it shouldn't be necessary to do that for correctness. Thanks, Richard > juzhe.zhong@rivai.ai >=20=20 > From: Richard Sandiford > Date: 2022-08-19 17:35 > To: juzhe.zhong\@rivai.ai > CC: rguenther; gcc-patches; kito.cheng > Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1= , 1) and allow the machine_mode definition with poly_uint16 (1, 1) > "juzhe.zhong@rivai.ai" writes: >> Hi, Richard. I tried the codes: >> if (!nunits.is_constant () && maybe_gt (nunits, 1))=20 >> test_vector_subregs_modes (x, nunits - min_nunits, count); >> >> It still failed. For nunits =3D (1,1) , maybe_gt (nunits, 1) return true= value. >=20=20 > Ah, right, sorry for the bogus suggestion. >=20=20 > In that case, what specifically goes wrong? Which test in > test_vector_subregs_modes fails, and what does the ICE look like? >=20=20 > Thanks, > Richard >=20=20 >> But I tried: >> if (!nunits.is_constant () && known_gt (nunits, 1))=20 >> test_vector_subregs_modes (x, nunits - min_nunits, count); >> It pass. But it report a warning: "warning: comparison between signed an= d unsigned integer expressions [-Wsign-compare]" during the compilation. >> >> Finally, I tried: >> if (!nunits.is_constant () && known_gt (GET_MODE_NUNITS (inner_mode), 1)= )=20 >> test_vector_subregs_modes (x, nunits - min_nunits, count); >> It passed with no warning. >> >> Is 'known_gt (GET_MODE_NUNITS (inner_mode), 1)' a good solution for this? >> Thanks! >> >> >> juzhe.zhong@rivai.ai >>=20=20 >> From: Richard Sandiford >> Date: 2022-08-19 16:03 >> To: juzhe.zhong >> CC: gcc-patches; rguenther; kito.cheng >> Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (= 1, 1) and allow the machine_mode definition with poly_uint16 (1, 1) >> juzhe.zhong@rivai.ai writes: >>> From: zhongjuzhe >>> >>> Hello. This patch is preparing for following RVV support. >>> >>> Both ARM SVE and RVV (RISC-V 'V' Extension) support length-agnostic vec= tor. >>> The minimum vector length of ARM SVE is 128-bit and the runtime invaria= nt of ARM SVE is always 128-bit blocks. >>> However, the minimum vector length of RVV can be 32bit in 'Zve32*' sub-= extension and 64bit in 'Zve*' sub-extension. >>> >>> So I define the machine_mode as follows: >>> VECTOR_MODE_WITH_PREFIX (VNx, INT, DI, 1, 0); >>> ADJUST_NUNITS (MODE, riscv_vector_chunks); >>> The riscv_vector_chunks =3D poly_uint16 (1, 1) >>> >>> The compilation is failed for the stepped vector test: >>> (const_vector:VNx1DI repeat [ >>> (const_int 8 [0x8]) >>> (const_int 7 [0x7]) >>> ]) >>> >>> I understand for stepped vector should always have aleast 2 elements an= d stepped vector initialization is common >>> for VLA (vector-lengthe agnostic) auto-vectorization. It makes sense th= at report fail for stepped vector of poly_uint16 (1, 1). >>> >>> machine mode with nunits =3D poly_uint16 (1, 1) needs to implemented in= intrinsics. And I would like to enable RVV auto-vectorization >>> with vector mode only nunits is larger than poly_uint16 (2, 2) in RISC-= V backend. I think it will not create issue if we define >>> vector mode with nunits =3D poly_uint16 (1, 1). Feel free to correct me= or offer me some other better solutions. Thanks! >>> >>>=20=20=20 >>> >>> gcc/ChangeLog: >>> >>> * simplify-rtx.cc (test_vector_subregs_fore_back): skip test fo= r poly_uint16 (1, 1). >>> >>> --- >>> gcc/simplify-rtx.cc | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc >>> index 7d09bf7103d..61e0dfa00d0 100644 >>> --- a/gcc/simplify-rtx.cc >>> +++ b/gcc/simplify-rtx.cc >>> @@ -8438,7 +8438,7 @@ test_vector_subregs_fore_back (machine_mode inner= _mode) >>> rtx x =3D builder.build (); >>>=20=20 >>> test_vector_subregs_modes (x); >>> - if (!nunits.is_constant ()) >>> + if (!nunits.is_constant () && known_ne (nunits, poly_uint16 (1, 1))) >>> test_vector_subregs_modes (x, nunits - min_nunits, count); >>=20=20 >> I think instead we should use maybe_gt (nunits, 1), on the basis that >> the fore_back tests require vectors that have a minimum of 2 elements. >> Something like poly_uint16 (1, 2) would have the same problem as >> poly_uint16 (1, 1). ({1, 2} is an unlikely value, but it's OK in >> principle.) >>=20=20 >> This corresponds to the minimum of 3 elements for the stepped tests: >>=20=20 >> if (GET_MODE_CLASS (mode) =3D=3D MODE_VECTOR_INT >> && maybe_gt (GET_MODE_NUNITS (mode), 2)) >> { >> test_vector_ops_series (mode, scalar_reg); >> test_vector_subregs (mode); >> } >>=20=20 >> Thanks, >> Richard >>=20=20 >=20=20