From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.sandiford@arm.com>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by sourceware.org (Postfix) with ESMTP id 6CA2A3858C60
 for <gcc-patches@gcc.gnu.org>; Fri, 24 Sep 2021 17:17:51 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6CA2A3858C60
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EE0ED6D;
 Fri, 24 Sep 2021 10:17:50 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 7763D3F59C;
 Fri, 24 Sep 2021 10:17:50 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <rguenther@suse.de>
Mail-Followup-To: Richard Biener <rguenther@suse.de>, gcc-patches@gcc.gnu.org,
 richard.sandiford@arm.com
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Allow different vector types for stmt groups
References: <op1o7789-5ns6-q56-6on9-948o6o2q83r1@fhfr.qr>
Date: Fri, 24 Sep 2021 18:17:49 +0100
In-Reply-To: <op1o7789-5ns6-q56-6on9-948o6o2q83r1@fhfr.qr> (Richard Biener's
 message of "Mon, 20 Sep 2021 14:14:51 +0200 (CEST)")
Message-ID: <mptpmsx99ky.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 24 Sep 2021 17:17:54 -0000

Richard Biener <rguenther@suse.de> writes:
> This allows vectorization (in practice non-loop vectorization) to
> have a stmt participate in different vector type vectorizations.
> It allows us to remove vect_update_shared_vectype and replace it
> by pushing/popping STMT_VINFO_VECTYPE from SLP_TREE_VECTYPE around
> vect_analyze_stmt and vect_transform_stmt.
>
> For data-ref the situation is a bit more complicated since we
> analyze alignment info with a specific vector type in mind which
> doesn't play well when that changes.
>
> So the bulk of the change is passing down the actual vector type
> used for a vectorized access to the various accessors of alignment
> info, first and foremost dr_misalignment but also aligned_access_p,
> known_alignment_for_access_p, vect_known_alignment_in_bytes and
> vect_supportable_dr_alignment.  I took the liberty to replace
> ALL_CAPS macro accessors with the lower-case function invocations.
>
> The actual changes to the behavior are in dr_misalignment which now
> is the place factoring in the negative step adjustment as well as
> handling alignment queries for a vector type with bigger alignment
> requirements than what we can (or have) analyze(d).
>
> vect_slp_analyze_node_alignment makes use of this and upon receiving
> a vector type with a bigger alingment desire re-analyzes the DR
> with respect to it but keeps an older more precise result if possible.
> In this context it might be possible to do the analysis just once
> but instead of analyzing with respect to a specific desired alignment
> look for the biggest alignment we can compute a not unknown alignment.
>
> The ChangeLog includes the functional changes but not the bulk due
> to the alignment accessor API changes - I hope that's something good.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, testing on SPEC
> CPU 2017 in progress (for stats and correctness).
>
> Any comments?

Sorry for the super-slow response, some comments below.

> [=E2=80=A6]
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index a57700f2c1b..c42fc2fb272 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -887,37 +887,53 @@ vect_slp_analyze_instance_dependence (vec_info *vin=
fo, slp_instance instance)
>    return res;
>  }
>=20=20
> -/* Return the misalignment of DR_INFO.  */
> +/* Return the misalignment of DR_INFO accessed in VECTYPE.  */
>=20=20
>  int
> -dr_misalignment (dr_vec_info *dr_info)
> +dr_misalignment (dr_vec_info *dr_info, tree vectype)
>  {
> +  HOST_WIDE_INT diff =3D 0;
> +  /* Alignment is only analyzed for the first element of a DR group,
> +     use that but adjust misalignment by the offset of the access.  */
>    if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt))
>      {
>        dr_vec_info *first_dr
>  	=3D STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (dr_info->stmt));
> -      int misalign =3D first_dr->misalignment;
> -      gcc_assert (misalign !=3D DR_MISALIGNMENT_UNINITIALIZED);
> -      if (misalign =3D=3D DR_MISALIGNMENT_UNKNOWN)
> -	return misalign;
>        /* vect_analyze_data_ref_accesses guarantees that DR_INIT are
>  	 INTEGER_CSTs and the first element in the group has the lowest
>  	 address.  Likewise vect_compute_data_ref_alignment will
>  	 have ensured that target_alignment is constant and otherwise
>  	 set misalign to DR_MISALIGNMENT_UNKNOWN.  */

Can you move the second sentence down so that it stays with the to_constant?

> -      HOST_WIDE_INT diff =3D (TREE_INT_CST_LOW (DR_INIT (dr_info->dr))
> -			    - TREE_INT_CST_LOW (DR_INIT (first_dr->dr)));
> +      diff =3D (TREE_INT_CST_LOW (DR_INIT (dr_info->dr))
> +	      - TREE_INT_CST_LOW (DR_INIT (first_dr->dr)));
>        gcc_assert (diff >=3D 0);
> -      unsigned HOST_WIDE_INT target_alignment_c
> -	=3D first_dr->target_alignment.to_constant ();
> -      return (misalign + diff) % target_alignment_c;
> +      dr_info =3D first_dr;
>      }
> -  else
> +
> +  int misalign =3D dr_info->misalignment;
> +  gcc_assert (misalign !=3D DR_MISALIGNMENT_UNINITIALIZED);
> +  if (misalign =3D=3D DR_MISALIGNMENT_UNKNOWN)
> +    return misalign;
> +
> +  /* If the access is only aligned for a vector type with smaller alignm=
ent
> +     requirement the access has unknown misalignment.  */
> +  if (maybe_lt (dr_info->target_alignment * BITS_PER_UNIT,
> +		targetm.vectorize.preferred_vector_alignment (vectype)))
> +    return DR_MISALIGNMENT_UNKNOWN;
> +
> +  /* If this is a backward running DR then first access in the larger
> +     vectype actually is N-1 elements before the address in the DR.
> +     Adjust misalign accordingly.  */
> +  if (tree_int_cst_sgn (DR_STEP (dr_info->dr)) < 0)
>      {
> -      int misalign =3D dr_info->misalignment;
> -      gcc_assert (misalign !=3D DR_MISALIGNMENT_UNINITIALIZED);
> -      return misalign;
> +      if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ())
> +	return DR_MISALIGNMENT_UNKNOWN;
> +      diff +=3D ((TYPE_VECTOR_SUBPARTS (vectype).to_constant () - 1)
> +	       * -TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype))));

The target alignment might only be element alignment.  If so, I think we
should either skip this block or return early above.  This would avoid
returning DR_MISALIGNMENT_UNKNOWN unnecessarily for VLA.

Or, more generally, I think this is case where known_misalignment
does work.  I.e. diff can be a poly_int64 and:

>      }
> +  unsigned HOST_WIDE_INT target_alignment_c
> +    =3D dr_info->target_alignment.to_constant ();
> +  return (misalign + diff) % target_alignment_c;

the last line can become:

  if (!known_misalignment (misalign + diff, target_alignment_c, &misalign))
    return DR_MISALIGNMENT_UNKNOWN;
  return misalign;

avoiding the need for the is_constant above.

> [=E2=80=A6]
> @@ -1896,10 +1902,9 @@ vect_enhance_data_refs_alignment (loop_vec_info lo=
op_vinfo)
>  	      unsigned int target_align =3D
>  		DR_TARGET_ALIGNMENT (dr_info).to_constant ();
>  	      unsigned int dr_size =3D vect_get_scalar_dr_size (dr_info);
> -	      mis =3D (negative
> -		     ? DR_MISALIGNMENT (dr_info)
> -		     : -DR_MISALIGNMENT (dr_info));
> -	      if (DR_MISALIGNMENT (dr_info) !=3D 0)
> +	      unsigned int mis =3D dr_misalignment (dr_info, vectype);
> +	      mis =3D negative ? mis : -mis;

Just checking: is this still correct?  It probably is, but it looked a
bit like we're handling negative steps twice.  Same further down.

LGTM otherwise FWIW.

Thanks,
Richard