From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 1B2BE3858C52 for ; Wed, 9 Nov 2022 11:33:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1B2BE3858C52 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D82441FB; Wed, 9 Nov 2022 03:33:38 -0800 (PST) Received: from [10.1.26.145] (E121495.arm.com [10.1.26.145]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 35B893F534; Wed, 9 Nov 2022 03:33:32 -0800 (PST) Message-ID: <6a1c88e9-d55a-2b81-3c5b-2716ec06e110@arm.com> Date: Wed, 9 Nov 2022 11:33:26 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.4.0 Subject: Re: [AArch64] Enable generation of FRINTNZ instructions Content-Language: en-US To: Richard Biener Cc: "Andre Vieira (lists) via Gcc-patches" , richard.sandiford@arm.com References: <8225375c-eb9e-f9b3-6bcd-9fbccf2fc87b@arm.com> <70s9nn94-452-5rrr-4458-q6n3qp563652@fhfr.qr> <36e3469a-3922-d49e-4006-0088eac29157@arm.com> <653o8886-3p5n-sr82-9n83-71q33o8824@fhfr.qr> <6c730f35-10b1-2843-cffc-4ed0851380be@arm.com> <85sr96q-o3s-602o-3436-40713n68pp84@fhfr.qr> <8d593d5f-41a0-6051-0ce0-d72834ecfa25@arm.com> <9d3755df-6c41-20e4-31fb-811e5cd9182a@arm.com> <231396s0-2756-q51s-q55-o8npqo91on32@fhfr.qr> <5d7bb7af-b09e-cb91-b457-c6148f65028e@arm.com> <051c70ff-7c59-bbfe-7780-4cb2380d9a93@arm.com> <1cdffce8-e0e5-3304-52b9-3b736a4d380d@arm.com> From: "Andre Vieira (lists)" In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,NICE_REPLY_A,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 07/11/2022 14:56, Richard Biener wrote: > On Mon, 7 Nov 2022, Andre Vieira (lists) wrote: > >> On 07/11/2022 11:05, Richard Biener wrote: >>> On Fri, 4 Nov 2022, Andre Vieira (lists) wrote: >>> >>>> Sorry for the delay, just been reminded I still had this patch outstanding >>>> from last stage 1. Hopefully since it has been mostly reviewed it could go >>>> in >>>> for this stage 1? >>>> >>>> I addressed the comments and gave the slp-part of vectorizable_call some >>>> TLC >>>> to make it work. >>>> >>>> I also changed vect_get_slp_defs as I noticed that the call from >>>> vectorizable_call was creating an auto_vec with 'nargs' that might be less >>>> than the number of children in the slp_node >>> how so? Please fix that in the caller. It looks like it probably >>> shoud use vect_nargs instead? >> Well that was my first intuition, but when I looked at it further the variant >> it's calling: >> void vect_get_slp_defs (vec_info *, slp_tree slp_node, vec > >> *vec_oprnds, unsigned n) >> >> Is actually creating a vector of vectors of slp defs. So for each child of >> slp_node it calls: >> void vect_get_slp_defs (slp_tree slp_node, vec *vec_defs) >> >> Which returns a vector of vectorized defs. So vect_nargs would be the right >> size for the inner vec of vec_defs, but the outer should have the same >> number of elements as the original slp_node has children. > No, the inner vector is the vector of vectors for each arg, the outer > vector should be the one for each argument. Hm, that was a confusing > sentence. > > That said, the number of SLP children of a call node should eventually > be the number of arguments of the call (plus masks, etc.). So it > looks about correct besides the vec_nargs issue? Yeah you are right, I misunderstood what the children were, so you have a child node for each argument of the call. Though, since you iterate over the 'scalar' arguments of the call I actually think 'nargs' was correct to begin with, which would explain why this never went wrong... So I think it is actually correct as is, I must have gotten confused by some earlier investigation into how to deal with the scalar arguments... Sorry for the noise, I'll undo these changes to the patch.