From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <rguenther@suse.de>
Received: from mx2.suse.de (mx2.suse.de [195.135.220.15])
 by sourceware.org (Postfix) with ESMTPS id 73B8E385803C
 for <gcc-patches@gcc.gnu.org>; Tue,  5 Jan 2021 09:04:29 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 73B8E385803C
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=suse.de
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=rguenther@suse.de
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.221.27])
 by mx2.suse.de (Postfix) with ESMTP id 62ACDAD29;
 Tue,  5 Jan 2021 09:04:28 +0000 (UTC)
Date: Tue, 5 Jan 2021 10:04:28 +0100 (CET)
From: Richard Biener <rguenther@suse.de>
To: Richard Sandiford <richard.sandiford@arm.com>
cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] vect: Fix missing alias checks for 128-bit SVE [PR98371]
In-Reply-To: <mpt8s9vnux1.fsf@arm.com>
Message-ID: <nycvar.YFH.7.76.2101051004200.17979@zhemvz.fhfr.qr>
References: <mpt8s9vnux1.fsf@arm.com>
User-Agent: Alpine 2.21 (LSU 202 2017-01-01)
MIME-Version: 1.0
X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 8BIT
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 05 Jan 2021 09:04:32 -0000

On Fri, 18 Dec 2020, Richard Sandiford wrote:

> On AArch64, the vectoriser tries various ways of vectorising with both
> SVE and Advanced SIMD and picks the best one.  All other things being
> equal, it prefers earlier attempts over later attempts.
> 
> The way this works currently is that, once it has a successful
> vectorisation attempt A, it analyses all other attempts as epilogue
> loops of A:
> 
>       /* When pick_lowest_cost_p is true, we should in principle iterate
> 	 over all the loop_vec_infos that LOOP_VINFO could replace and
> 	 try to vectorize LOOP_VINFO under the same conditions.
> 	 E.g. when trying to replace an epilogue loop, we should vectorize
> 	 LOOP_VINFO as an epilogue loop with the same VF limit.  When trying
> 	 to replace the main loop, we should vectorize LOOP_VINFO as a main
> 	 loop too.
> 
> 	 However, autovectorize_vector_modes is usually sorted as follows:
> 
> 	 - Modes that naturally produce lower VFs usually follow modes that
> 	   naturally produce higher VFs.
> 
> 	 - When modes naturally produce the same VF, maskable modes
> 	   usually follow unmaskable ones, so that the maskable mode
> 	   can be used to vectorize the epilogue of the unmaskable mode.
> 
> 	 This order is preferred because it leads to the maximum
> 	 epilogue vectorization opportunities.  Targets should only use
> 	 a different order if they want to make wide modes available while
> 	 disparaging them relative to earlier, smaller modes.  The assumption
> 	 in that case is that the wider modes are more expensive in some
> 	 way that isn't reflected directly in the costs.
> 
> 	 There should therefore be few interesting cases in which
> 	 LOOP_VINFO fails when treated as an epilogue loop, succeeds when
> 	 treated as a standalone loop, and ends up being genuinely cheaper
> 	 than FIRST_LOOP_VINFO.  */
> 
> However, the vectoriser can normally elide alias checks for epilogue
> loops, on the basis that the main loop should do them instead.
> Converting an epilogue loop to a main loop can therefore cause the alias
> checks to be skipped.  (It probably also unfairly penalises the original
> loop in the cost comparison, given that one loop will have alias checks
> and the other won't.)
> 
> As the comment says, we should in principle analyse each vector mode
> twice: once as a main loop and once as an epilogue.  However, doing
> that up-front would be quite expensive.  This patch instead goes for a
> compromise: if an epilogue loop for mode M2 seems better than a main
> loop for mode M1, re-analyse with M2 as the main loop.
> 
> The patch fixes dg.torture.exp=pr69719.c when testing with
> -msve-vector-bits=128.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
> 
> 
> gcc/
> 	PR tree-optimization/98371
> 	* tree-vect-loop.c (vect_reanalyze_as_main_loop): New function.
> 	(vect_analyze_loop): If an epilogue loop appears to be cheaper
> 	than the main loop, re-analyze it as a main loop before adopting
> 	it as a main loop.
> ---
>  gcc/tree-vect-loop.c | 61 +++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 58 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 688538a4521..221541f4a38 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -2850,6 +2850,45 @@ vect_joust_loop_vinfos (loop_vec_info new_loop_vinfo,
>    return true;
>  }
>  
> +/* If LOOP_VINFO is already a main loop, return it unmodified.  Otherwise
> +   try to reanalyze it as a main loop.  Return the loop_vinfo on success
> +   and null on failure.  */
> +
> +static loop_vec_info
> +vect_reanalyze_as_main_loop (loop_vec_info loop_vinfo, unsigned int *n_stmts)
> +{
> +  if (!LOOP_VINFO_EPILOGUE_P (loop_vinfo))
> +    return loop_vinfo;
> +
> +  if (dump_enabled_p ())
> +    dump_printf_loc (MSG_NOTE, vect_location,
> +		     "***** Reanalyzing as a main loop with vector mode %s\n",
> +		     GET_MODE_NAME (loop_vinfo->vector_mode));
> +
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> +  vec_info_shared *shared = loop_vinfo->shared;
> +  opt_loop_vec_info main_loop_vinfo = vect_analyze_loop_form (loop, shared);
> +  gcc_assert (main_loop_vinfo);
> +
> +  main_loop_vinfo->vector_mode = loop_vinfo->vector_mode;
> +
> +  bool fatal = false;
> +  bool res = vect_analyze_loop_2 (main_loop_vinfo, fatal, n_stmts);
> +  loop->aux = NULL;
> +  if (!res)
> +    {
> +      if (dump_enabled_p ())
> +	dump_printf_loc (MSG_NOTE, vect_location,
> +			 "***** Failed to analyze main loop with vector"
> +			 " mode %s\n",
> +			 GET_MODE_NAME (loop_vinfo->vector_mode));
> +      delete main_loop_vinfo;
> +      return NULL;
> +    }
> +  LOOP_VINFO_VECTORIZABLE_P (main_loop_vinfo) = 1;
> +  return main_loop_vinfo;
> +}
> +
>  /* Function vect_analyze_loop.
>  
>     Apply a set of analyses on LOOP, and create a loop_vec_info struct
> @@ -2997,9 +3036,25 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
>  	      if (vinfos.is_empty ()
>  		  && vect_joust_loop_vinfos (loop_vinfo, first_loop_vinfo))
>  		{
> -		  delete first_loop_vinfo;
> -		  first_loop_vinfo = opt_loop_vec_info::success (NULL);
> -		  LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = NULL;
> +		  loop_vec_info main_loop_vinfo
> +		    = vect_reanalyze_as_main_loop (loop_vinfo, &n_stmts);
> +		  if (main_loop_vinfo == loop_vinfo)
> +		    {
> +		      delete first_loop_vinfo;
> +		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
> +		    }
> +		  else if (main_loop_vinfo
> +			   && vect_joust_loop_vinfos (main_loop_vinfo,
> +						      first_loop_vinfo))
> +		    {
> +		      delete first_loop_vinfo;
> +		      first_loop_vinfo = opt_loop_vec_info::success (NULL);
> +		      delete loop_vinfo;
> +		      loop_vinfo
> +			= opt_loop_vec_info::success (main_loop_vinfo);
> +		    }
> +		  else
> +		    delete main_loop_vinfo;
>  		}
>  	    }
>  
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)