public inbox for gcc-patches@gcc.gnu.org
* [RFC] S/390: Alignment peeling prolog generation
@ 2017-04-11 14:38 Robin Dapp
From: Robin Dapp @ 2017-04-11 14:38 UTC
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1193 bytes --]

Hi,

when looking at various vectorization examples on s390x I noticed that
we still peel vf/2 iterations for alignment even though vectorization
costs of unaligned loads and stores are the same as normal loads/stores.

A simple example is

void foo(int *restrict a, int *restrict b, unsigned int n)
{
  for (unsigned int i = 0; i < n; i++)
    {
      b[i] = a[i] * 2 + 1;
    }
}

which gets peeled unless __builtin_assume_aligned (a, 8) is used.
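
For reference, the non-peeled variant could be written roughly as below.
This is only a sketch: the name foo_aligned and the local pointer pa are
made up for illustration, and the caller has to guarantee that a really
is 8-byte aligned.

```c
/* Hypothetical variant of foo: promises the compiler that A is at least
   8-byte aligned.  Passing a pointer that is not 8-byte aligned is
   undefined behavior.  */
void foo_aligned (int *restrict a, int *restrict b, unsigned int n)
{
  /* __builtin_assume_aligned returns its first argument and lets the
     compiler assume the returned pointer has the given alignment.  */
  int *restrict pa = __builtin_assume_aligned (a, 8);

  for (unsigned int i = 0; i < n; i++)
    {
      b[i] = pa[i] * 2 + 1;
    }
}
```

The alignment hint is only attached to the pointer returned by the
builtin, which is why the loop reads through pa and not through a.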

In tree-vect-data-refs.c there are several checks that involve costs in
the peeling decision, none of which seems to suffice in this case. For a
loop with only read DRs there is a check that has been triggering (i.e.
disabling peeling) since we implemented the vectorization costs.

Here, we have DR_MISALIGNMENT (dr) == -1 for all DRs but the costs
should still dictate that we never peel. I attached a tentative patch for
discussion which fixes the problem by checking the costs for npeel = 0
and npeel = vf/2 after ensuring we support all misalignments. Is there a
better way and place to do it? Are we missing something somewhere else
that would preclude the peeling from happening?
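
To illustrate the comparison of npeel = 0 against npeel = vf/2, here is a
toy model in plain C. It is not the vectorizer's cost interface: struct
toy_costs, toy_loop_cost and toy_choose_npeel are hypothetical names, the
cost formula is a simplification, and it optimistically assumes that
peeling aligns every access.

```c
/* Toy cost model.  All names and numbers are hypothetical and do not
   correspond to GCC's internal cost hooks.  */
struct toy_costs
{
  unsigned aligned_access;   /* cost of one aligned vector load/store */
  unsigned unaligned_access; /* cost of one unaligned vector load/store */
  unsigned scalar_iter;      /* cost of one peeled scalar iteration */
};

/* Estimated cost of VEC_ITERS vector iterations with N_REFS memory
   accesses each, after peeling NPEEL scalar iterations.  Assumes that
   any peeling makes all accesses aligned.  */
static unsigned
toy_loop_cost (const struct toy_costs *c, unsigned npeel,
               unsigned n_refs, unsigned vec_iters)
{
  unsigned access = npeel ? c->aligned_access : c->unaligned_access;
  return npeel * c->scalar_iter + vec_iters * n_refs * access;
}

/* Return the cheaper of npeel = 0 and npeel = vf / 2.  Ties favor 0,
   i.e. no peeling.  */
static unsigned
toy_choose_npeel (const struct toy_costs *c, unsigned vf,
                  unsigned n_refs, unsigned vec_iters)
{
  unsigned no_peel = toy_loop_cost (c, 0, n_refs, vec_iters);
  unsigned peel = toy_loop_cost (c, vf / 2, n_refs, vec_iters);
  return peel < no_peel ? vf / 2 : 0;
}
```

With equal aligned and unaligned access costs, as on s390x, the peeled
variant only adds the scalar prolog iterations, so this model always
picks npeel = 0.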

This is not intended for stage 4, obviously :)

Regards
 Robin

[-- Attachment #2: gcc-omit-peeling.diff --]
[-- Type: text/x-patch, Size: 2442 bytes --]

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 3fc762a..795c22c 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1418,6 +1418,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   stmt_vec_info stmt_info;
   unsigned int npeel = 0;
   bool all_misalignments_unknown = true;
+  bool all_misalignments_supported = true;
   unsigned int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   unsigned possible_npeel_number = 1;
   tree vectype;
@@ -1547,6 +1548,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                 }
 
               all_misalignments_unknown = false;
+
               /* Data-ref that was chosen for the case that all the
                  misalignments are unknown is not relevant anymore, since we
                  have a data-ref with known alignment.  */
@@ -1609,6 +1611,24 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               break;
             }
         }
+
+      /* Check if target supports misaligned data access for current data
+	 reference.  */
+      vectype = STMT_VINFO_VECTYPE (stmt_info);
+      machine_mode mode = TYPE_MODE (vectype);
+      if (targetm.vectorize.
+	  support_vector_misalignment (mode, TREE_TYPE (DR_REF (dr)),
+				       DR_MISALIGNMENT (dr), false))
+	{
+	  vect_peeling_hash_insert (&peeling_htab, loop_vinfo,
+				    dr, 0);
+	  /* Also insert vf/2 peeling that will be used when all
+	     misalignments are unknown. */
+	  vect_peeling_hash_insert (&peeling_htab, loop_vinfo,
+				    dr, vf / 2);
+	}
+      else
+	all_misalignments_supported = false;
     }
 
   /* Check if we can possibly peel the loop.  */
@@ -1687,6 +1707,18 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
             dr0 = first_store;
         }
 
+      /* If the target supports accessing all data references in a misaligned
+	 way, check costs to see if we can leave them unaligned and do not
+	 perform any peeling.  */
+      if (all_misalignments_supported)
+	{
+	  dr0 = vect_peeling_hash_choose_best_peeling (&peeling_htab,
+						       loop_vinfo, &npeel,
+						       &body_cost_vec);
+	  if (!dr0 || !npeel)
+	    do_peeling = false;
+	}
+
       /* In case there are only loads with different unknown misalignments, use
          peeling only if it may help to align other accesses in the loop or
 	 if it may help improving load bandwith when we'd end up using


Thread overview: 51+ messages
2017-04-11 14:38 [RFC] S/390: Alignment peeling prolog generation Robin Dapp
2017-04-11 14:57 ` Bin.Cheng
2017-04-11 15:03   ` Robin Dapp
2017-04-11 15:07     ` Bin.Cheng
2017-04-11 16:25   ` Richard Biener
2017-04-12  7:51     ` Robin Dapp
2017-04-12  7:58       ` Richard Biener
2017-05-04  9:04         ` Robin Dapp
2017-05-05 11:04           ` Richard Biener
2017-05-08 16:12             ` Robin Dapp
2017-05-09 10:38               ` Richard Biener
2017-05-11 11:17                 ` Robin Dapp
2017-05-11 12:15                   ` Richard Biener
2017-05-11 12:16                     ` Richard Biener
2017-05-11 12:48                       ` Richard Biener
2017-05-11 11:17                 ` [PATCH 1/5] Vect peeling cost model Robin Dapp
2017-05-11 11:18                 ` [PATCH 2/5] " Robin Dapp
2017-05-11 11:19                 ` [PATCH 3/5] " Robin Dapp
2017-05-11 11:20                 ` [PATCH 4/5] " Robin Dapp
2017-05-11 15:30                   ` [PATCH 4/5 v2] " Robin Dapp
2017-05-12  9:36                     ` Richard Biener
2017-05-23 15:58                       ` [PATCH 0/5 v3] " Robin Dapp
2017-05-24  7:51                         ` Richard Biener
2017-05-24 11:57                           ` Robin Dapp
2017-05-24 13:56                             ` Richard Biener
2017-06-03 17:12                         ` Andreas Schwab
2017-06-06  7:13                           ` Robin Dapp
2017-06-06 17:26                             ` Andreas Schwab
2017-06-07 10:50                               ` Robin Dapp
2017-06-07 11:43                                 ` Andreas Schwab
2017-05-23 15:58                       ` [PATCH 1/5 " Robin Dapp
2017-05-23 15:58                       ` [PATCH 2/5 " Robin Dapp
2017-05-23 19:25                         ` Richard Sandiford
2017-05-24  7:37                           ` Robin Dapp
2017-05-24  7:53                             ` Richard Sandiford
2017-05-23 15:59                       ` [PATCH 5/5 " Robin Dapp
2017-05-23 15:59                       ` [PATCH 4/5 " Robin Dapp
2017-05-31 13:56                         ` Christophe Lyon
2017-05-31 14:37                           ` Robin Dapp
2017-05-31 14:49                             ` Christophe Lyon
2017-05-23 16:02                       ` [PATCH 3/5 " Robin Dapp
2017-05-11 11:59                 ` [PATCH 5/5] " Robin Dapp
2017-05-08 16:13             ` [PATCH 3/4] " Robin Dapp
2017-05-09 10:41               ` Richard Biener
2017-05-08 16:27             ` [PATCH 4/4] " Robin Dapp
2017-05-09 10:55               ` Richard Biener
2017-05-04  9:04         ` [PATCH 1/3] " Robin Dapp
2017-05-05 10:32           ` Richard Biener
2017-05-04  9:07         ` [PATCH 2/3] " Robin Dapp
2017-05-05 10:37           ` Richard Biener
2017-05-04  9:14         ` [PATCH 3/3] " Robin Dapp
