From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 52BD73858D20; Fri, 14 Apr 2023 09:42:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 52BD73858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 857D12F4; Fri, 14 Apr 2023 02:43:19 -0700 (PDT) Received: from [10.57.68.11] (unknown [10.57.68.11]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 2DD4C3F587; Fri, 14 Apr 2023 02:42:34 -0700 (PDT) Message-ID: Date: Fri, 14 Apr 2023 10:42:28 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0 Subject: Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64 Content-Language: en-US To: Richard Biener Cc: Andrew Stubbs , "gcc-regression@gcc.gnu.org" , "gcc-patches@gcc.gnu.org" , "haochen.jiang@intel.com" References: <202304130148.33D1mmns1987590@shliclel4214.sh.intel.com> <341dd608-a512-3c74-303d-1942876a3850@arm.com> From: "Andre Vieira (lists)" In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_NUMSUBJECT,NICE_REPLY_A,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 14/04/2023 10:09, Richard Biener wrote: > On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists) > wrote: >> >> Resending this to everyone (sorry for the double send Richard). >> >> On 14/04/2023 09:15, Andre Vieira (lists) wrote: >> > >> > >> > On 14/04/2023 07:55, Richard Biener wrote: >> >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) >> >> wrote: >> >>> >> >>> >> >>> >> >>> On 13/04/2023 15:00, Richard Biener wrote: >> >>>> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches >> >>>> wrote: >> >>>>> >> >>>>> >> >>>>> >> >>> >> >>> But that's not it, I've been looking at it, and there is code in place >> >>> that does what I expected which is defer the choice of vectype for simd >> >>> clones until vectorizable_simd_clone_call, unfortunately it has a >> >>> mistaken assumption that simdclones don't return :/ >> >> >> >> I think that's not it - when the SIMD clone returns a vector we have to >> >> determine the vector type in this function. We cannot defer this. >> > >> > What's 'this function' here, do you mean we have to determine the >> > vectype in 'vect_get_vector_types_for_stmt' & >> > 'vect_determine_vf_for_stmt' ? > > Yes. > >> Because at that time we don't yet know >> > what clone we will be using, this choice is done inside >> > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need >> > to know the vf as that has to be a multiple of the chosen clone's >> > simdlen. So we simply can't use the simdclone's types (as that depends >> > on the simdlen) to choose the vf because the choice of simdlend depends >> > on the vf. And there was already code in place to handle this, >> > unfortunately that code was wrong and had the wrong assumption that >> > simdclones didn't return (probably was true at some point and bitrotted). > > But to compute the VF we need to know the vector types! We're only > calling vectorizable_* when the VF is final. That said, the code you quote: > >> >> >> >>> see vect_get_vector_types_for_stmt: >> >>> ... >> >>> if (gimple_get_lhs (stmt) == NULL_TREE > > is just for the case of a function without return value. For this case > it's OK to do nothing - 'vectype' is the vector type of all vector defs > a stmt produces. > > For calls with a LHS it should fall through to generic code doing > get_vectype_for_scalar_type on the LHS type. I think that may work, but right now it will still go and look at the arguments of the call and use the smallest type among them to adjust the nunits (in 'vect_get_vector_types_for_stmt'). In fact (this is just for illustration) if I hack that function like this: --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -12745,8 +12745,11 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info, /* The number of units is set according to the smallest scalar type (or the largest vector size, but we only support one vector size per vectorization). */ - scalar_type = vect_get_smallest_scalar_type (stmt_info, - TREE_TYPE (vectype)); + if (simd_clone_call_p (stmt_info->stmt)) + scalar_type = TREE_TYPE (vectype); + else + scalar_type = vect_get_smallest_scalar_type (stmt_info, + TREE_TYPE (vectype)); if (scalar_type != TREE_TYPE (vectype)) { if (dump_enabled_p ()) It will use the same types as before without (-m32), like I explained before the -m32 turns the pointer inside MASK_CALL into a 32-bit pointer so now the smallest size is 32-bits. This makes it pick V8SI instead of the original V4DI (scalar return type is DImode). Changing the VF to 8, thus unrolling the loop as it needs to make 2 calls, each handling 4 nunits.