From: Richard Sandiford <richard.sandiford@arm.com>
Subject: Re: [PATCH 14/25] Disable inefficient vectorization of elementwise loads/stores.
Date: Mon, 17 Sep 2018 09:16:00 -0000
Message-ID: <87lg804hb1.fsf@arm.com>
In-Reply-To: (ams's message of "Wed, 5 Sep 2018 12:50:29 +0100")

ams writes:
> If the autovectorizer tries to load a GCN 64-lane vector elementwise
> then it blows away the register file and produces horrible code.

Do all the registers really need to be live at once, or is it "just"
bad scheduling?  I'd have expected the initial rtl to load each element
and then insert it immediately, so that the number of insertions
doesn't directly affect register pressure.

> This patch simply disallows elementwise loads for such large vectors.
> Is there a better way to disable this in the middle-end?

Do you ever want elementwise accesses for GCN?  If not, it might be
better to disable them in the target's cost model.
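FWIW, a minimal sketch of the cost-model route (hypothetical and
untested -- gcn_vectorization_cost is a made-up name, but the
TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST hook and the vec_construct
cost kind are the existing mechanism):

  /* Sketch only: make element-by-element construction of very wide
     vectors prohibitively expensive, so the vectorizer rejects
     VMAT_ELEMENTWISE on cost grounds rather than it being
     hard-disabled in the middle-end.  */
  static int
  gcn_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
                          tree vectype, int misalign)
  {
    if (type_of_cost == vec_construct
        && vectype
        && known_ge (TYPE_VECTOR_SUBPARTS (vectype), 64U))
      /* Building a 64-lane vector one element at a time blows away
         the register file; report a cost high enough to veto it.  */
      return 100;

    return default_builtin_vectorization_cost (type_of_cost, vectype,
                                               misalign);
  }

  #undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
  #define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
    gcn_vectorization_cost

That would keep the decision in the backend and leave VMAT_ELEMENTWISE
available for targets where it's acceptable.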
Thanks,
Richard

>
> 2018-09-05  Julian Brown
>
> gcc/
> 	* tree-vect-stmts.c (get_load_store_type): Don't use VMAT_ELEMENTWISE
> 	loads/stores with many-element (>=64) vectors.
> ---
>  gcc/tree-vect-stmts.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 8875201..a333991 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -2452,6 +2452,26 @@ get_load_store_type (stmt_vec_info stmt_info, tree vectype, bool slp,
>        *memory_access_type = VMAT_CONTIGUOUS;
>      }
>
> +  /* FIXME: Element-wise accesses can be extremely expensive if we have a
> +     large number of elements to deal with (e.g. 64 for AMD GCN) using the
> +     current generic code expansion.  Until an efficient code sequence is
> +     supported for affected targets instead, don't attempt vectorization for
> +     VMAT_ELEMENTWISE at all.  */
> +  if (*memory_access_type == VMAT_ELEMENTWISE)
> +    {
> +      poly_uint64 nelements = TYPE_VECTOR_SUBPARTS (vectype);
> +
> +      if (maybe_ge (nelements, 64))
> +        {
> +          if (dump_enabled_p ())
> +            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +                             "too many elements (%u) for elementwise accesses\n",
> +                             (unsigned) nelements.to_constant ());
> +
> +          return false;
> +        }
> +    }
> +
>    if ((*memory_access_type == VMAT_ELEMENTWISE
>         || *memory_access_type == VMAT_STRIDED_SLP)
>        && !nunits.is_constant ())