From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 5C6873858C66 for ; Tue, 25 Jul 2023 09:41:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5C6873858C66 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E1A7715BF; Tue, 25 Jul 2023 02:42:03 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 0FFD73F6C4; Tue, 25 Jul 2023 02:41:19 -0700 (PDT) From: Richard Sandiford To: =?utf-8?B?6ZKf5bGF5ZOy?= Mail-Followup-To: =?utf-8?B?6ZKf5bGF5ZOy?= ,rguenther , gcc-patches , richard.sandiford@arm.com Cc: rguenther , gcc-patches Subject: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* References: <20230724114650.61923-1-juzhe.zhong@rivai.ai> <62B332CEE95B33E1+2023072507160172829450@rivai.ai> Date: Tue, 25 Jul 2023 10:41:18 +0100 In-Reply-To: <62B332CEE95B33E1+2023072507160172829450@rivai.ai> (=?utf-8?B?IumSn+WxheWTsiIncw==?= message of "Tue, 25 Jul 2023 07:16:02 +0800") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-19.0 required=5.0 tests=BAYES_00,BODY_8BITS,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE,WEIRD_PORT autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: =E9=92=9F=E5=B1=85=E5=93=B2 writes: > Hi, Richi. Thank you so much for review. > >>> This function doesn't seem to care about conditional vectorization >>> support, so why are you changing it? > > I debug and analyze the code here: > > Breakpoint 1, vectorizable_call (vinfo=3D0x3d358d0, stmt_info=3D0x3dcc820= , gsi=3D0x0, vec_stmt=3D0x0, slp_node=3D0x0, cost_vec=3D0x7fffffffc668) at = ../../../riscv-gcc/gcc/tree-vect-stmts.cc:3485 > 3485 ifn =3D vectorizable_internal_function (cfn, callee, vectype_= out, > (gdb) p cfn > $1 =3D CFN_COND_ADD > (gdb) call print_gimple_stmt(stdout,stmt,0,0) > _9 =3D .COND_ADD (_27, _6, _8, _6); > > When target like RISC-V didnt' support COND_* (we only support COND_LEN_*= , no support COND_*), > ifn will be IFN_LAST. > Also, the following code doesn't modify ifn until > "internal_fn cond_fn =3D get_conditional_internal_fn (ifn);" > When ifn =3D=3D IFN_LAST, cond_fn will IFN_LAST too. > > So, I extend "vectorizable_internal_function" to return COND_LEN_*. > > Am I going into wrong direction on debug && analysis. The main difference between IFN_COND_ADD and IFN_COND_LEN_ADD is that IFN_COND_ADD makes logical sense for scalars, whereas IFN_COND_LEN_ADD only makes sense for vectors. So I think if-conversion has to create IFN_COND_ADD rather than IFN_COND_LEN_ADD even for targets that only support IFN_COND_LEN_ADD. (Which is what the patch does.) IFN_COND_LEN_ADD is really a generalisation of IFN_COND_ADD. In a sense, every target that supports IFN_COND_LEN_ADD also supports IFN_COND_ADD. All we need is two extra arguments. And the patch does seem to take that approach (by adding dummy length arguments). So I think there are at least four ways we could go: (1) RVV implements cond_* optabs as expanders. RVV therefore supports both IFN_COND_ADD and IFN_COND_LEN_ADD. No dummy length arguments are needed at the gimple level. (2) Target-independent code automatically provides cond_* optabs based on cond_len_* optabs. Otherwise the same as (1). (3) A gimple pass lowers IFN_COND_ADD to IFN_COND_LEN_ADD on targets that need it. Could be vec lowering or isel. (4) The vectoriser maintains the distinction between IFN_COND_ADD and IFN_COND_LEN_ADD from the outset. It adds dummy length arguments where necessary. The patch is going for (4). But I think we should at least consider whether any of the other alternatives make sense. If we do go for (4), I think the get_conditional_len_internal_fn is in the wrong place. It should be applied: internal_fn ifn; if (internal_fn_p (cfn)) ifn =3D as_internal_fn (cfn); else ifn =3D associated_internal_fn (fndecl); if (ifn !=3D IFN_LAST && direct_internal_fn_p (ifn)) { // After this point I don't think we should assume that every IFN_COND_* has an associated tree code. It should be possible to create IFN_COND_*s from unconditional internal functions too. So rather than use: tree_code code =3D conditional_internal_fn_code (ifn); len_ifn =3D get_conditional_len_internal_fn (code); I think we should have an internal-fn helper that returns IFN_COND_LEN_* for a given IFN_COND_*. It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD etc. too. It would help if we combined the COND_* and COND_LEN_* definitions. For example: DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary) DEF_INTERNAL_OPTAB_FN (COND_LEN_ADD, ECF_CONST, cond_len_add, cond_len_bina= ry) could become: DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary) that expands to the COND_ADD and COND_LEN_ADD definitions. This would help to keep the definitions in sync and would make it eaiser to do a mapping between COND_* and COND_LEN_*. Thanks, Richard