From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by sourceware.org (Postfix) with ESMTPS id 20C633858CD1 for ; Mon, 31 Jul 2023 09:30:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 20C633858CD1 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 941F51F854; Mon, 31 Jul 2023 09:30:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1690795803; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Rztl2tk9VyyYoXFS2nElQI/TrntyJWn9fwW9fVs8LMw=; b=tOv9YCR8icU+ELx3q8/9y8VmLzll5+qxrzPXlWSARXlWWClfvuls5MZLo5hVA2Z26c5Ltc 6bz4M8pQOsB55WDaz5/3q33h/J0t61CvFQ+LDEXaxmiSOtpt9A2D6LA0xkmIx3+uVdvtwv KvNG8xt+RTw64Qhbw5AxaeiPwFDgdwM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1690795803; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Rztl2tk9VyyYoXFS2nElQI/TrntyJWn9fwW9fVs8LMw=; b=lc1qopFoe903/j6dPc7R1mKpGzUGgTz058Q0e5ovGfX738eZOfGIVi9ll9sxOmp5iBgwSy 1t4VZiNNoePVl2CA== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 83F442C142; Mon, 31 Jul 2023 09:30:03 +0000 (UTC) Date: Mon, 31 Jul 2023 09:30:03 +0000 (UTC) From: Richard Biener To: Richard Sandiford cc: =?GB2312?B?1tO+09Xc?= , gcc-patches Subject: Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_* In-Reply-To: Message-ID: References: <20230724114650.61923-1-juzhe.zhong@rivai.ai> <62B332CEE95B33E1+2023072507160172829450@rivai.ai> User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-5.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE,WEIRD_PORT autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, 25 Jul 2023, Richard Sandiford wrote: > ??? writes: > > Hi, Richi. Thank you so much for review. > > > >>> This function doesn't seem to care about conditional vectorization > >>> support, so why are you changing it? > > > > I debug and analyze the code here: > > > > Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820, gsi=0x0, vec_stmt=0x0, slp_node=0x0, cost_vec=0x7fffffffc668) at ../../../riscv-gcc/gcc/tree-vect-stmts.cc:3485 > > 3485 ifn = vectorizable_internal_function (cfn, callee, vectype_out, > > (gdb) p cfn > > $1 = CFN_COND_ADD > > (gdb) call print_gimple_stmt(stdout,stmt,0,0) > > _9 = .COND_ADD (_27, _6, _8, _6); > > > > When target like RISC-V didnt' support COND_* (we only support COND_LEN_*, no support COND_*), > > ifn will be IFN_LAST. > > Also, the following code doesn't modify ifn until > > "internal_fn cond_fn = get_conditional_internal_fn (ifn);" > > When ifn == IFN_LAST, cond_fn will IFN_LAST too. > > > > So, I extend "vectorizable_internal_function" to return COND_LEN_*. > > > > Am I going into wrong direction on debug && analysis. > > The main difference between IFN_COND_ADD and IFN_COND_LEN_ADD is that > IFN_COND_ADD makes logical sense for scalars, whereas IFN_COND_LEN_ADD > only makes sense for vectors. So I think if-conversion has to create > IFN_COND_ADD rather than IFN_COND_LEN_ADD even for targets that only > support IFN_COND_LEN_ADD. (Which is what the patch does.) > > IFN_COND_LEN_ADD is really a generalisation of IFN_COND_ADD. In a > sense, every target that supports IFN_COND_LEN_ADD also supports > IFN_COND_ADD. All we need is two extra arguments. And the patch > does seem to take that approach (by adding dummy length arguments). > > So I think there are at least four ways we could go: > > (1) RVV implements cond_* optabs as expanders. RVV therefore supports > both IFN_COND_ADD and IFN_COND_LEN_ADD. No dummy length arguments > are needed at the gimple level. > > (2) Target-independent code automatically provides cond_* optabs > based on cond_len_* optabs. Otherwise the same as (1). > > (3) A gimple pass lowers IFN_COND_ADD to IFN_COND_LEN_ADD on targets > that need it. Could be vec lowering or isel. > > (4) The vectoriser maintains the distinction between IFN_COND_ADD and > IFN_COND_LEN_ADD from the outset. It adds dummy length arguments > where necessary. > > The patch is going for (4). But I think we should at least consider > whether any of the other alternatives make sense. I think only (1) and (4) make sense. The vectorizer would need to check for IFN_COND_LEN_ADD or IFN_COND_ADD support anyway even when only emitting IFN_COND_ADD. Since we've already gone half-way then we can as well do (4). That also possibly simplifies N targets as opposed to just the vectorizer. So my preference would be (4), but I'm also fine with (1). Richard. > If we do go for (4), I think the get_conditional_len_internal_fn > is in the wrong place. It should be applied: > > internal_fn ifn; > if (internal_fn_p (cfn)) > ifn = as_internal_fn (cfn); > else > ifn = associated_internal_fn (fndecl); > if (ifn != IFN_LAST && direct_internal_fn_p (ifn)) > { > // After this point > > I don't think we should assume that every IFN_COND_* has an associated > tree code. It should be possible to create IFN_COND_*s from unconditional > internal functions too. So rather than use: > > tree_code code = conditional_internal_fn_code (ifn); > len_ifn = get_conditional_len_internal_fn (code); > > I think we should have an internal-fn helper that returns IFN_COND_LEN_* > for a given IFN_COND_*. It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD > etc. too. > > It would help if we combined the COND_* and COND_LEN_* definitions. > For example: > > DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary) > DEF_INTERNAL_OPTAB_FN (COND_LEN_ADD, ECF_CONST, cond_len_add, cond_len_binary) > > could become: > > DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary) > > that expands to the COND_ADD and COND_LEN_ADD definitions. This would > help to keep the definitions in sync and would make it eaiser to do a > mapping between COND_* and COND_LEN_*. > > Thanks, > Richard > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)