From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id D69193858031 for ; Tue, 7 Dec 2021 11:45:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D69193858031 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id B018221B3F; Tue, 7 Dec 2021 11:45:42 +0000 (UTC) Received: from murzim.suse.de (murzim.suse.de [10.160.4.192]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 924D5A3B81; Tue, 7 Dec 2021 11:45:42 +0000 (UTC) Date: Tue, 7 Dec 2021 12:45:42 +0100 (CET) From: Richard Biener To: "Andre Vieira (lists)" cc: "gcc-patches@gcc.gnu.org" , richard.sandiford@arm.com Subject: Re: [vect] Re-analyze all modes for epilogues In-Reply-To: Message-ID: References: <4a2e6dde-cc5c-97fe-7a43-bd59d542c2ce@arm.com> <4272814n-8538-p793-157q-5n6q16r48n51@fhfr.qr> <623fbfd9-b97c-8c6e-0348-07d6c4496592@arm.com> <5c887c48-7f7e-c02b-2998-7a7c41b11af8@arm.com> <33cb143e-bb2e-e214-cd5f-66fd2d1bd20b@arm.com> <5op15ns-4sq8-2sn3-41qs-49q44417sp6@fhfr.qr> <99qs2o2p-pn87-n164-q8n9-9p814r6n75r1@fhfr.qr> <475fae98-9541-5dca-2e60-eaff172ff787@arm.com> <8p72o15s-5894-4or0-409r-oo4p74o238r1@fhfr.qr> <21e3500d-6cf5-ed46-6f95-1f554c5dbc50@arm.com> <5477e0cb-6dc9-e828-7c20-a99de3c6840c@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 07 Dec 2021 11:45:45 -0000 On Tue, 7 Dec 2021, Andre Vieira (lists) wrote: > Hi, > > I've split this particular part off, since it's not only relevant to > unrolling. The new test shows how this is useful for existing (non-unrolling) > cases. I also had to fix the costing function, the main_vf / epilogue_vf > calculations for old and new didn't take into consideration that the main_vf > could be lower, nor did it take into consideration that they were not > necessarily always a multiple of each other.  So using CEIL here is the > correct approach. > > Bootstrapped and regression tested on aarch64-none-linux-gnu. > > OK for trunk? + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "***** Re-trying analysis with first vector mode" + " %s for epilogue.\n", + GET_MODE_NAME (vector_modes[mode_i])); that will now unconditionally dump VOIDmode which isn't really useful information. I suggest to dump "Re-trying analysis using mode autodetection for epilouge.\n". OK with that change. Can you check whether, give we know the main VF, the epilogue analysis does not start with am autodetected vector mode that needs a too large VF? Thanks, Richard. > gcc/ChangeLog: > >         * tree-vect-loop.c (vect_better_loop_vinfo_p): Round factors up for > epilogue costing. >         (vect_analyze_loop): Re-analyze all modes for epilogues. > > gcc/testsuite/ChangeLog: > >         * gcc.target/aarch64/masked_epilogue.c: New test. > > On 30/11/2021 13:56, Richard Biener wrote: > > On Tue, 30 Nov 2021, Andre Vieira (lists) wrote: > > > >> On 25/11/2021 12:46, Richard Biener wrote: > >>> Oops, my fault, yes, it does. I would suggest to refactor things so > >>> that the mode_i = first_loop_i case is there only once. I also wonder > >>> if all the argument about starting at 0 doesn't apply to the > >>> not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well? So > >>> what's the reason to differ here? So in the end I'd just change > >>> the existing > >>> > >>> if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)) > >>> { > >>> > >>> to > >>> > >>> if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo) > >>> || first_loop_vinfo->suggested_unroll_factor > 1) > >>> { > >>> > >>> and maybe revisit this when we have an actual testcase showing that > >>> doing sth else has a positive effect? > >>> > >>> Thanks, > >>> Richard. > >> So I had a quick chat with Richard Sandiford and he is suggesting resetting > >> mode_i to 0 for all cases. > >> > >> He pointed out that for some tunings the SVE mode might come after the NEON > >> mode, which means that even for not-unrolled loop_vinfos we could end up > >> with > >> a suboptimal choice of mode for the epilogue. I.e. it could be that we pick > >> V16QI for main vectorization, but that's VNx16QI + 1 in the array, so we'd > >> not > >> try VNx16QI for the epilogue. > >> > >> This would simplify the mode selecting cases, by just simply restarting at > >> mode_i in all epilogue cases. Is that something you'd be OK? > > Works for me with an updated comment. Even better with showing a > > testcase exercising such tuning. > > > > Richard. > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)