[PATCH] tree-optimization/102139 - fix SLP DR base alignment

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] tree-optimization/102139 - fix SLP DR base alignment
@ 2021-08-31  9:25 Richard Biener
  2021-08-31 11:43 ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2021-08-31  9:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: richard.sandiford

When doing whole-function SLP we have to make sure the recorded
base alignments we compute as the maximum alignment seen for a
base anywhere in the function is actually valid at the point
we want to make use of it.

To make this work we now record the stmt the alignment was derived
from in addition to the DRs innermost behavior and we use a
dominance check to verify the recorded info is valid when doing
BB vectorization.

Note this leaves a small(?) hole for the case where we have sth
like

    unaligned DR
    call (); // does not return
    aligned DR

since we'll derive an aligned access for the earlier DR but the
later DR is never actually reached since the call does not
return.  To plug this hole one option (for the easy backporting)
would be to simply not use the base-alignment recording at all.
Alternatively we'd have to store the dataref grouping 'id' somewhere
in the DR itself and use that to handle this particular case.

For optimal handling we'd need the ability to record different
base alignments based on context, we could hash on the BB and
at query time walk immediate dominators to find the "best"
base alignment.  But I'd rather leave such improvement for trunk.

Any opinions?  The issue looks quite serious and IMHO warrants a
timely fix, even if partial - I'm not sure how often the same-BB
case would trigger.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Thanks,
Richard.

2021-08-31  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/102139
	* tree-vectorizer.h (vec_base_alignments): Adjust hash-map
	type to record a std::pair of the stmt and the innermost
	loop behavior.
	* tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
	(vect_compute_data_ref_alignment): Verify the recorded
	base alignment can be used.
---
 gcc/tree-vect-data-refs.c | 19 ++++++++++++-------
 gcc/tree-vectorizer.h     |  7 ++++---
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 37f46d1aaa3..e2549811961 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -895,11 +895,11 @@ vect_record_base_alignment (vec_info *vinfo, stmt_vec_info stmt_info,
 			    innermost_loop_behavior *drb)
 {
   bool existed;
-  innermost_loop_behavior *&entry
+  std::pair<gimple *, innermost_loop_behavior *> &entry
     = vinfo->base_alignments.get_or_insert (drb->base_address, &existed);
-  if (!existed || entry->base_alignment < drb->base_alignment)
+  if (!existed || entry.second->base_alignment < drb->base_alignment)
     {
-      entry = drb;
+      entry = std::make_pair (stmt_info->stmt, drb);
       if (dump_enabled_p ())
 	dump_printf_loc (MSG_NOTE, vect_location,
 			 "recording new base alignment for %T\n"
@@ -1060,11 +1060,16 @@ vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info)
 
   /* Calculate the maximum of the pooled base address alignment and the
      alignment that we can compute for DR itself.  */
-  innermost_loop_behavior **entry = base_alignments->get (drb->base_address);
-  if (entry && base_alignment < (*entry)->base_alignment)
+  std::pair<gimple *, innermost_loop_behavior *> *entry
+    = base_alignments->get (drb->base_address);
+  if (entry
+      && base_alignment < (*entry).second->base_alignment
+      && (loop_vinfo
+	  || dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt_info->stmt),
+			     gimple_bb (entry->first))))
     {
-      base_alignment = (*entry)->base_alignment;
-      base_misalignment = (*entry)->base_misalignment;
+      base_alignment = entry->second->base_alignment;
+      base_misalignment = entry->second->base_misalignment;
     }
 
   if (drb->offset_alignment < vect_align_c
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 72e018e8eac..8db642c7dc3 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -106,10 +106,11 @@ struct stmt_info_for_cost {
 
 typedef vec<stmt_info_for_cost> stmt_vector_for_cost;
 
-/* Maps base addresses to an innermost_loop_behavior that gives the maximum
-   known alignment for that base.  */
+/* Maps base addresses to an innermost_loop_behavior and the stmt it was
+   derived from that gives the maximum known alignment for that base.  */
 typedef hash_map<tree_operand_hash,
-		 innermost_loop_behavior *> vec_base_alignments;
+		 std::pair<gimple *, innermost_loop_behavior *> >
+	  vec_base_alignments;
 
 /************************************************************************
   SLP
-- 
2.31.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tree-optimization/102139 - fix SLP DR base alignment
  2021-08-31  9:25 [PATCH] tree-optimization/102139 - fix SLP DR base alignment Richard Biener
@ 2021-08-31 11:43 ` Richard Biener
  2021-09-01  9:59   ` Richard Sandiford
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2021-08-31 11:43 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 1991 bytes --]

On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> When doing whole-function SLP we have to make sure the recorded
> base alignments we compute as the maximum alignment seen for a
> base anywhere in the function is actually valid at the point
> we want to make use of it.
>
> To make this work we now record the stmt the alignment was derived
> from in addition to the DRs innermost behavior and we use a
> dominance check to verify the recorded info is valid when doing
> BB vectorization.
>
> Note this leaves a small(?) hole for the case where we have sth
> like
>
>     unaligned DR
>     call (); // does not return
>     aligned DR
>
> since we'll derive an aligned access for the earlier DR but the
> later DR is never actually reached since the call does not
> return.  To plug this hole one option (for the easy backporting)
> would be to simply not use the base-alignment recording at all.
> Alternatively we'd have to store the dataref grouping 'id' somewhere
> in the DR itself and use that to handle this particular case.

It turns out this isn't too difficult so the following is a patch adjusted
to cover that case together with a testcase.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2021-08-31  Richard Biener  <rguenther@suse.de>

        PR tree-optimization/102139
        * tree-vectorizer.h (vec_base_alignments): Adjust hash-map
        type to record a std::pair of the stmt-info and the innermost
        loop behavior.
        (dr_vec_info::group): New member.
        * tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
        (vect_compute_data_ref_alignment): Verify the recorded
        base alignment can be used.
        (data_ref_pair): Remove.
        (dr_group_sort_cmp): Adjust.
        (vect_analyze_data_ref_accesses): Store the group-ID in the
        dr_vec_info and operate on a vector of dr_vec_infos.

        * gcc.dg/torture/pr102139.c: New testcase.

[-- Attachment #2: p --]
[-- Type: application/octet-stream, Size: 10810 bytes --]

From b68f235bda237ce63bf3e621488226549c13bf72 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguenther@suse.de>
Date: Tue, 31 Aug 2021 10:28:40 +0200
Subject: [PATCH] tree-optimization/102139 - fix SLP DR base alignment
To: gcc-patches@gcc.gnu.org

When doing whole-function SLP we have to make sure the recorded
base alignments we compute as the maximum alignment seen for a
base anywhere in the function is actually valid at the point
we want to make use of it.

To make this work we now record the stmt the alignment was derived
from in addition to the DRs innermost behavior and we use a
dominance check to verify the recorded info is valid when doing
BB vectorization.  For this to work for groups inside a BB that are
separate by a call that might not return we now store the DR
analysis group-id permanently and use that for an additional check
when the DRs are in the same BB.

2021-08-31  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/102139
	* tree-vectorizer.h (vec_base_alignments): Adjust hash-map
	type to record a std::pair of the stmt-info and the innermost
	loop behavior.
	(dr_vec_info::group): New member.
	* tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
	(vect_compute_data_ref_alignment): Verify the recorded
	base alignment can be used.
	(data_ref_pair): Remove.
	(dr_group_sort_cmp): Adjust.
	(vect_analyze_data_ref_accesses): Store the group-ID in the
	dr_vec_info and operate on a vector of dr_vec_infos.

	* gcc.dg/torture/pr102139.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr102139.c | 43 ++++++++++++++++
 gcc/tree-vect-data-refs.c               | 66 +++++++++++++------------
 gcc/tree-vectorizer.h                   | 10 ++--
 3 files changed, 85 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr102139.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr102139.c b/gcc/testsuite/gcc.dg/torture/pr102139.c
new file mode 100644
index 00000000000..06c1357438a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr102139.c
@@ -0,0 +1,43 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-slp-vectorize" } */
+
+typedef double aligned_double __attribute__((aligned(2*sizeof(double))));
+
+void __attribute__((noipa))
+bar (int aligned, double *p)
+{
+  if (aligned)
+    {
+      *(aligned_double *)p = 3.;
+      p[1] = 4.;
+    }
+  else
+    {
+      p[2] = 0.;
+      p[3] = 1.;
+    }
+}
+
+void __attribute__((noipa))
+foo (int i)
+{
+  if (i)
+    __builtin_exit (0);
+}
+void __attribute__((noipa))
+baz (double *p)
+{
+  p[0] = 0.;
+  p[1] = 1.;
+  foo (1);
+  *(aligned_double *)p = 3.;
+  p[1] = 4.;
+}
+
+double x[8] __attribute__((aligned(2*sizeof (double))));
+int main()
+{
+  bar (0, &x[1]);
+  baz (&x[1]);
+  return 0;
+}
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 37f46d1aaa3..f43d0f4785e 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -895,11 +895,11 @@ vect_record_base_alignment (vec_info *vinfo, stmt_vec_info stmt_info,
 			    innermost_loop_behavior *drb)
 {
   bool existed;
-  innermost_loop_behavior *&entry
+  std::pair<stmt_vec_info, innermost_loop_behavior *> &entry
     = vinfo->base_alignments.get_or_insert (drb->base_address, &existed);
-  if (!existed || entry->base_alignment < drb->base_alignment)
+  if (!existed || entry.second->base_alignment < drb->base_alignment)
     {
-      entry = drb;
+      entry = std::make_pair (stmt_info, drb);
       if (dump_enabled_p ())
 	dump_printf_loc (MSG_NOTE, vect_location,
 			 "recording new base alignment for %T\n"
@@ -1060,11 +1060,18 @@ vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info)
 
   /* Calculate the maximum of the pooled base address alignment and the
      alignment that we can compute for DR itself.  */
-  innermost_loop_behavior **entry = base_alignments->get (drb->base_address);
-  if (entry && base_alignment < (*entry)->base_alignment)
+  std::pair<stmt_vec_info, innermost_loop_behavior *> *entry
+    = base_alignments->get (drb->base_address);
+  if (entry
+      && base_alignment < (*entry).second->base_alignment
+      && (loop_vinfo
+	  || (dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt_info->stmt),
+			      gimple_bb (entry->first->stmt))
+	      && (gimple_bb (stmt_info->stmt) != gimple_bb (entry->first->stmt)
+		  || (entry->first->dr_aux.group <= dr_info->group)))))
     {
-      base_alignment = (*entry)->base_alignment;
-      base_misalignment = (*entry)->base_misalignment;
+      base_alignment = entry->second->base_alignment;
+      base_misalignment = entry->second->base_misalignment;
     }
 
   if (drb->offset_alignment < vect_align_c
@@ -2813,18 +2820,16 @@ vect_analyze_data_ref_access (vec_info *vinfo, dr_vec_info *dr_info)
   return vect_analyze_group_access (vinfo, dr_info);
 }
 
-typedef std::pair<data_reference_p, int> data_ref_pair;
-
 /* Compare two data-references DRA and DRB to group them into chunks
    suitable for grouping.  */
 
 static int
 dr_group_sort_cmp (const void *dra_, const void *drb_)
 {
-  data_ref_pair dra_pair = *(data_ref_pair *)const_cast<void *>(dra_);
-  data_ref_pair drb_pair = *(data_ref_pair *)const_cast<void *>(drb_);
-  data_reference_p dra = dra_pair.first;
-  data_reference_p drb = drb_pair.first;
+  dr_vec_info *dra_info = *(dr_vec_info **)const_cast<void *>(dra_);
+  dr_vec_info *drb_info = *(dr_vec_info **)const_cast<void *>(drb_);
+  data_reference_p dra = dra_info->dr;
+  data_reference_p drb = drb_info->dr;
   int cmp;
 
   /* Stabilize sort.  */
@@ -2832,8 +2837,8 @@ dr_group_sort_cmp (const void *dra_, const void *drb_)
     return 0;
 
   /* Different group IDs lead never belong to the same group.  */
-  if (dra_pair.second != drb_pair.second)
-    return dra_pair.second < drb_pair.second ? -1 : 1;
+  if (dra_info->group != drb_info->group)
+    return dra_info->group < drb_info->group ? -1 : 1;
 
   /* Ordering of DRs according to base.  */
   cmp = data_ref_compare_tree (DR_BASE_ADDRESS (dra),
@@ -2953,18 +2958,18 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
   /* Sort the array of datarefs to make building the interleaving chains
      linear.  Don't modify the original vector's order, it is needed for
      determining what dependencies are reversed.  */
-  vec<data_ref_pair> datarefs_copy;
+  vec<dr_vec_info *> datarefs_copy;
   datarefs_copy.create (datarefs.length ());
   for (unsigned i = 0; i < datarefs.length (); i++)
     {
-      int group_id;
+      dr_vec_info *dr_info = vinfo->lookup_dr (datarefs[i]);
       /* If the caller computed DR grouping use that, otherwise group by
 	 basic blocks.  */
       if (dataref_groups)
-	group_id = (*dataref_groups)[i];
+	dr_info->group = (*dataref_groups)[i];
       else
-	group_id = gimple_bb (DR_STMT (datarefs[i]))->index;
-      datarefs_copy.quick_push (data_ref_pair (datarefs[i], group_id));
+	dr_info->group = gimple_bb (DR_STMT (datarefs[i]))->index;
+      datarefs_copy.quick_push (dr_info);
     }
   datarefs_copy.qsort (dr_group_sort_cmp);
   hash_set<stmt_vec_info> to_fixup;
@@ -2972,9 +2977,9 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
   /* Build the interleaving chains.  */
   for (i = 0; i < datarefs_copy.length () - 1;)
     {
-      data_reference_p dra = datarefs_copy[i].first;
-      int dra_group_id = datarefs_copy[i].second;
-      dr_vec_info *dr_info_a = vinfo->lookup_dr (dra);
+      dr_vec_info *dr_info_a = datarefs_copy[i];
+      data_reference_p dra = dr_info_a->dr;
+      int dra_group_id = dr_info_a->group;
       stmt_vec_info stmtinfo_a = dr_info_a->stmt;
       stmt_vec_info lastinfo = NULL;
       if (!STMT_VINFO_VECTORIZABLE (stmtinfo_a)
@@ -2985,9 +2990,9 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 	}
       for (i = i + 1; i < datarefs_copy.length (); ++i)
 	{
-	  data_reference_p drb = datarefs_copy[i].first;
-	  int drb_group_id = datarefs_copy[i].second;
-	  dr_vec_info *dr_info_b = vinfo->lookup_dr (drb);
+	  dr_vec_info *dr_info_b = datarefs_copy[i];
+	  data_reference_p drb = dr_info_b->dr;
+	  int drb_group_id = dr_info_b->group;
 	  stmt_vec_info stmtinfo_b = dr_info_b->stmt;
 	  if (!STMT_VINFO_VECTORIZABLE (stmtinfo_b)
 	      || STMT_VINFO_GATHER_SCATTER_P (stmtinfo_b))
@@ -3048,7 +3053,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 	  HOST_WIDE_INT init_a = TREE_INT_CST_LOW (DR_INIT (dra));
 	  HOST_WIDE_INT init_b = TREE_INT_CST_LOW (DR_INIT (drb));
 	  HOST_WIDE_INT init_prev
-	    = TREE_INT_CST_LOW (DR_INIT (datarefs_copy[i-1].first));
+	    = TREE_INT_CST_LOW (DR_INIT (datarefs_copy[i-1]->dr));
 	  gcc_assert (init_a <= init_b
 		      && init_a <= init_prev
 		      && init_prev <= init_b);
@@ -3056,7 +3061,7 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 	  /* Do not place the same access in the interleaving chain twice.  */
 	  if (init_b == init_prev)
 	    {
-	      gcc_assert (gimple_uid (DR_STMT (datarefs_copy[i-1].first))
+	      gcc_assert (gimple_uid (DR_STMT (datarefs_copy[i-1]->dr))
 			  < gimple_uid (DR_STMT (drb)));
 	      /* Simply link in duplicates and fix up the chain below.  */
 	    }
@@ -3169,10 +3174,9 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
       to_fixup.add (newgroup);
     }
 
-  data_ref_pair *dr_pair;
-  FOR_EACH_VEC_ELT (datarefs_copy, i, dr_pair)
+  dr_vec_info *dr_info;
+  FOR_EACH_VEC_ELT (datarefs_copy, i, dr_info)
     {
-      dr_vec_info *dr_info = vinfo->lookup_dr (dr_pair->first);
       if (STMT_VINFO_VECTORIZABLE (dr_info->stmt)
 	  && !vect_analyze_data_ref_access (vinfo, dr_info))
 	{
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 72e018e8eac..7453d2a9131 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -106,10 +106,11 @@ struct stmt_info_for_cost {
 
 typedef vec<stmt_info_for_cost> stmt_vector_for_cost;
 
-/* Maps base addresses to an innermost_loop_behavior that gives the maximum
-   known alignment for that base.  */
+/* Maps base addresses to an innermost_loop_behavior and the stmt it was
+   derived from that gives the maximum known alignment for that base.  */
 typedef hash_map<tree_operand_hash,
-		 innermost_loop_behavior *> vec_base_alignments;
+		 std::pair<stmt_vec_info, innermost_loop_behavior *> >
+	  vec_base_alignments;
 
 /************************************************************************
   SLP
@@ -1059,6 +1060,9 @@ public:
   data_reference *dr;
   /* The statement that contains the data reference.  */
   stmt_vec_info stmt;
+  /* The analysis group this DR belongs to when doing BB vectorization.
+     DRs of the same group belong to the same conditional execution context.  */
+  unsigned group;
   /* The misalignment in bytes of the reference, or -1 if not known.  */
   int misalignment;
   /* The byte alignment that we'd ideally like the reference to have,
-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tree-optimization/102139 - fix SLP DR base alignment
  2021-08-31 11:43 ` Richard Biener
@ 2021-09-01  9:59   ` Richard Sandiford
  2021-09-01 10:07     ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Sandiford @ 2021-09-01  9:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, GCC Patches

Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> When doing whole-function SLP we have to make sure the recorded
>> base alignments we compute as the maximum alignment seen for a
>> base anywhere in the function is actually valid at the point
>> we want to make use of it.

Ah, yeah, the danger of optimisations that silently rely on the
then-current restrictions :-(

>> To make this work we now record the stmt the alignment was derived
>> from in addition to the DRs innermost behavior and we use a
>> dominance check to verify the recorded info is valid when doing
>> BB vectorization.
>>
>> Note this leaves a small(?) hole for the case where we have sth
>> like
>>
>>     unaligned DR
>>     call (); // does not return
>>     aligned DR
>>
>> since we'll derive an aligned access for the earlier DR but the
>> later DR is never actually reached since the call does not
>> return.  To plug this hole one option (for the easy backporting)
>> would be to simply not use the base-alignment recording at all.
>> Alternatively we'd have to store the dataref grouping 'id' somewhere
>> in the DR itself and use that to handle this particular case.
>
> It turns out this isn't too difficult so the following is a patch adjusted
> to cover that case together with a testcase.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK?

TBH I know nothing about this group id scheme, so I'm not really
in a position to comment.  But it LGTM from the (few) bits I do understand.

I guess we're leaving the same easter egg for loop optimisation if
we ever support early exits, but I'm not sure what to do about that.

Thanks,
Richard

>
> Thanks,
> Richard.
>
> 2021-08-31  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/102139
>         * tree-vectorizer.h (vec_base_alignments): Adjust hash-map
>         type to record a std::pair of the stmt-info and the innermost
>         loop behavior.
>         (dr_vec_info::group): New member.
>         * tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
>         (vect_compute_data_ref_alignment): Verify the recorded
>         base alignment can be used.
>         (data_ref_pair): Remove.
>         (dr_group_sort_cmp): Adjust.
>         (vect_analyze_data_ref_accesses): Store the group-ID in the
>         dr_vec_info and operate on a vector of dr_vec_infos.
>
>         * gcc.dg/torture/pr102139.c: New testcase.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tree-optimization/102139 - fix SLP DR base alignment
  2021-09-01  9:59   ` Richard Sandiford
@ 2021-09-01 10:07     ` Richard Biener
  2021-09-01 10:26       ` Richard Sandiford
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2021-09-01 10:07 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Biener, GCC Patches

On Wed, 1 Sep 2021, Richard Sandiford wrote:

> Richard Biener <richard.guenther@gmail.com> writes:
> > On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> When doing whole-function SLP we have to make sure the recorded
> >> base alignments we compute as the maximum alignment seen for a
> >> base anywhere in the function is actually valid at the point
> >> we want to make use of it.
> 
> Ah, yeah, the danger of optimisations that silently rely on the
> then-current restrictions :-(

Yeah.

> >> To make this work we now record the stmt the alignment was derived
> >> from in addition to the DRs innermost behavior and we use a
> >> dominance check to verify the recorded info is valid when doing
> >> BB vectorization.
> >>
> >> Note this leaves a small(?) hole for the case where we have sth
> >> like
> >>
> >>     unaligned DR
> >>     call (); // does not return
> >>     aligned DR
> >>
> >> since we'll derive an aligned access for the earlier DR but the
> >> later DR is never actually reached since the call does not
> >> return.  To plug this hole one option (for the easy backporting)
> >> would be to simply not use the base-alignment recording at all.
> >> Alternatively we'd have to store the dataref grouping 'id' somewhere
> >> in the DR itself and use that to handle this particular case.
> >
> > It turns out this isn't too difficult so the following is a patch adjusted
> > to cover that case together with a testcase.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK?
> 
> TBH I know nothing about this group id scheme, so I'm not really
> in a position to comment.  But it LGTM from the (few) bits I do understand.
> 
> I guess we're leaving the same easter egg for loop optimisation if
> we ever support early exits, but I'm not sure what to do about that.

We're currently not recording alignment from masked DRs
(aka DR_IS_CONDITIONAL_IN_STMT), I suppose we'd need to mark
all stmts after early exits in this way then since in the end
they will have to be masked on the early exit.

So we _might_ be fine there ...

Pushed.

Thanks,
Richard.

> Thanks,
> Richard
> 
> >
> > Thanks,
> > Richard.
> >
> > 2021-08-31  Richard Biener  <rguenther@suse.de>
> >
> >         PR tree-optimization/102139
> >         * tree-vectorizer.h (vec_base_alignments): Adjust hash-map
> >         type to record a std::pair of the stmt-info and the innermost
> >         loop behavior.
> >         (dr_vec_info::group): New member.
> >         * tree-vect-data-refs.c (vect_record_base_alignment): Adjust.
> >         (vect_compute_data_ref_alignment): Verify the recorded
> >         base alignment can be used.
> >         (data_ref_pair): Remove.
> >         (dr_group_sort_cmp): Adjust.
> >         (vect_analyze_data_ref_accesses): Store the group-ID in the
> >         dr_vec_info and operate on a vector of dr_vec_infos.
> >
> >         * gcc.dg/torture/pr102139.c: New testcase.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] tree-optimization/102139 - fix SLP DR base alignment
  2021-09-01 10:07     ` Richard Biener
@ 2021-09-01 10:26       ` Richard Sandiford
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Sandiford @ 2021-09-01 10:26 UTC (permalink / raw)
  To: Richard Biener; +Cc: Richard Biener, GCC Patches

Richard Biener <rguenther@suse.de> writes:
> On Wed, 1 Sep 2021, Richard Sandiford wrote:
>
>> Richard Biener <richard.guenther@gmail.com> writes:
>> > On Tue, Aug 31, 2021 at 11:26 AM Richard Biener via Gcc-patches
>> > <gcc-patches@gcc.gnu.org> wrote:
>> >>
>> >> When doing whole-function SLP we have to make sure the recorded
>> >> base alignments we compute as the maximum alignment seen for a
>> >> base anywhere in the function is actually valid at the point
>> >> we want to make use of it.
>> 
>> Ah, yeah, the danger of optimisations that silently rely on the
>> then-current restrictions :-(
>
> Yeah.
>
>> >> To make this work we now record the stmt the alignment was derived
>> >> from in addition to the DRs innermost behavior and we use a
>> >> dominance check to verify the recorded info is valid when doing
>> >> BB vectorization.
>> >>
>> >> Note this leaves a small(?) hole for the case where we have sth
>> >> like
>> >>
>> >>     unaligned DR
>> >>     call (); // does not return
>> >>     aligned DR
>> >>
>> >> since we'll derive an aligned access for the earlier DR but the
>> >> later DR is never actually reached since the call does not
>> >> return.  To plug this hole one option (for the easy backporting)
>> >> would be to simply not use the base-alignment recording at all.
>> >> Alternatively we'd have to store the dataref grouping 'id' somewhere
>> >> in the DR itself and use that to handle this particular case.
>> >
>> > It turns out this isn't too difficult so the following is a patch adjusted
>> > to cover that case together with a testcase.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> >
>> > OK?
>> 
>> TBH I know nothing about this group id scheme, so I'm not really
>> in a position to comment.  But it LGTM from the (few) bits I do understand.
>> 
>> I guess we're leaving the same easter egg for loop optimisation if
>> we ever support early exits, but I'm not sure what to do about that.
>
> We're currently not recording alignment from masked DRs
> (aka DR_IS_CONDITIONAL_IN_STMT), I suppose we'd need to mark
> all stmts after early exits in this way then since in the end
> they will have to be masked on the early exit.
>
> So we _might_ be fine there ...

Yeah, for a pure SVE-like implementation that's probably true.  But we
also have the option of vectorising an early exit by branching if the
condition is true for *any* element, then handling the remaining
elements with an epilogue.

It would be quite a different approach from what we're doing now,
and wouldn't necessarily require up-front if-conversion.  But the
point is that it's theoretically possible, just as whole-function
SLP was theoretically possible (but seemingly some way off) when
this code was written :-)

Thanks,
Richard

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-09-01 10:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-31  9:25 [PATCH] tree-optimization/102139 - fix SLP DR base alignment Richard Biener
2021-08-31 11:43 ` Richard Biener
2021-09-01  9:59   ` Richard Sandiford
2021-09-01 10:07     ` Richard Biener
2021-09-01 10:26       ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).