public inbox for gcc-patches@gcc.gnu.org
From: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: Jakub Jelinek <jakub@redhat.com>,
	Richard Sandiford <richard.sandiford@arm.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)
Date: Tue, 16 Aug 2022 11:24:44 +0100	[thread overview]
Message-ID: <aa8a155a-d420-83e2-7687-7b4fb5c50f2d@arm.com> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2208091350580.3343@jbgna.fhfr.qr>

[-- Attachment #1: Type: text/plain, Size: 7989 bytes --]

Hi,

New version of the patch attached, but I haven't recreated the ChangeLog 
yet; I'm waiting to see if this is what you had in mind. See also some 
replies to your comments in-line below:

On 09/08/2022 15:34, Richard Biener wrote:

> @@ -2998,7 +3013,7 @@ ifcvt_split_critical_edges (class loop *loop, bool
> aggressive_if_conv)
>     auto_vec<edge> critical_edges;
>
>     /* Loop is not well formed.  */
> -  if (num <= 2 || loop->inner || !single_exit (loop))
> +  if (num <= 2 || loop->inner)
>       return false;
>
>     body = get_loop_body (loop);
>
> this doesn't appear in the ChangeLog nor is it clear to me why it's
> needed?  Likewise
So both these and...
>
> -  /* Save BB->aux around loop_version as that uses the same field.  */
> -  save_length = loop->inner ? loop->inner->num_nodes : loop->num_nodes;
> -  void **saved_preds = XALLOCAVEC (void *, save_length);
> -  for (unsigned i = 0; i < save_length; i++)
> -    saved_preds[i] = ifc_bbs[i]->aux;
> +  void **saved_preds = NULL;
> +  if (any_complicated_phi || need_to_predicate)
> +    {
> +      /* Save BB->aux around loop_version as that uses the same field.
> */
> +      save_length = loop->inner ? loop->inner->num_nodes :
> loop->num_nodes;
> +      saved_preds = XALLOCAVEC (void *, save_length);
> +      for (unsigned i = 0; i < save_length; i++)
> +       saved_preds[i] = ifc_bbs[i]->aux;
> +    }
>
> is that just premature optimization?

.. these changes are to make sure we can still use the loop versioning 
code even for cases where there are bitfields to lower but no ifcvts 
(i.e. the number of BBs is <= 2).
I wasn't sure about the loop->inner condition, but in the small examples 
I tried it seemed to work; that is, loop versioning seems to be able to 
handle nested loops.

The single_exit condition is still required for both, because the code 
to create the loop versions depends on it. It does look like I missed 
this in the ChangeLog...

> +  /* BITSTART and BITEND describe the region we can safely load from
> inside the
> +     structure.  BITPOS is the bit position of the value inside the
> +     representative that we will end up loading OFFSET bytes from the
> start
> +     of the struct.  BEST_MODE is the mode describing the optimal size of
> the
> +     representative chunk we load.  If this is a write we will store the
> same
> +     sized representative back, after we have changed the appropriate
> bits.  */
> +  get_bit_range (&bitstart, &bitend, comp_ref, &bitpos, &offset);
>
> I think you need to give up when get_bit_range sets bitstart = bitend to
> zero
>
> +  if (get_best_mode (bitsize, bitpos.to_constant (), bitstart, bitend,
> +                    TYPE_ALIGN (TREE_TYPE (struct_expr)),
> +                    INT_MAX, false, &best_mode))
>
> +  tree rep_decl = build_decl (UNKNOWN_LOCATION, FIELD_DECL,
> +                             NULL_TREE, rep_type);
> +  /* Load from the start of 'offset + bitpos % alignment'.  */
> +  uint64_t extra_offset = bitpos.to_constant ();
>
> you shouldn't build a new FIELD_DECL.  Either you use
> DECL_BIT_FIELD_REPRESENTATIVE directly or you use a
> BIT_FIELD_REF accessing the "representative".
> DECL_BIT_FIELD_REPRESENTATIVE exists so it can maintain
> a variable field offset, you can also subset that with an
> intermediate BIT_FIELD_REF if DECL_BIT_FIELD_REPRESENTATIVE is
> too large for your taste.
>
> I'm not sure all the offset calculation you do is correct, but
> since you shouldn't invent a new FIELD_DECL it probably needs
> to change anyway ...
I can use the DECL_BIT_FIELD_REPRESENTATIVE, but I'll still need some 
offset calculation/extraction. It's easier to explain with an example:

In vect-bitfield-read-3.c the struct:
typedef struct {
     int  c;
     int  b;
     bool a : 1;
} struct_t;

a field access 'vect_false[i].a' or 'vect_true[i].a' will lead to a 
DECL_BIT_FIELD_REPRESENTATIVE with a TYPE_SIZE of 8 (and TYPE_PRECISION 
also 8, as expected). However, the DECL_FIELD_OFFSET of both the 
original field decl (the actual bitfield member) and the 
DECL_BIT_FIELD_REPRESENTATIVE is 0, and the DECL_FIELD_BIT_OFFSET is 64. 
These lead to the correct load:
_1 = vect_false[i].D;

D here, being the representative, is an 8-bit load from vect_false[i] + 
64 bits, so all good there. However, when we construct the BIT_FIELD_REF 
we can't simply use DECL_FIELD_BIT_OFFSET (field_decl) as the 
BIT_FIELD_REF's bitpos, because `verify_gimple` checks that bitpos + 
bitsize < TYPE_SIZE (TREE_TYPE (load)) for BIT_FIELD_REF (load, 
bitsize, bitpos).

So instead I change bitpos such that:
align_of_representative = TYPE_ALIGN (TREE_TYPE (representative));
bitpos -= bitpos.to_constant () / align_of_representative
	  * align_of_representative;

I've now rewritten this to:
poly_int64 q, r;
if (can_div_trunc_p (bitpos, align_of_representative, &q, &r))
  bitpos = r;

This makes it slightly clearer, and it also means I no longer need the 
changes to the original tree offset, as I'm just using D for the load.
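
To make the numbers concrete, here is a tiny standalone sketch (not GCC 
code, and assuming the representative's TYPE_ALIGN is 8 bits for the 
struct_t example above) that just redoes that truncating division on the 
vect-bitfield-read-3.c values:

#include <stdio.h>

int main (void)
{
  unsigned bitpos = 64;    /* DECL_FIELD_BIT_OFFSET of 'a' in struct_t.  */
  unsigned rep_align = 8;  /* assumed TYPE_ALIGN of the 8-bit representative.  */
  unsigned q = bitpos / rep_align;  /* 8, i.e. the byte offset of D.  */
  unsigned r = bitpos % rep_align;  /* 0, bitpos relative to D.  */
  printf ("q = %u, r = %u\n", q, r);
  /* The BIT_FIELD_REF then becomes BIT_FIELD_REF <D, 1, 0>, which passes
     the verify_gimple size check against the 8-bit TYPE_SIZE of D.  */
  return 0;
}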
> Note that for optimization it will be important that all
> accesses to the bitfield members of the same bitfield use the
> same underlying area (CSE and store-forwarding will thank you).
>
> +
> +  need_to_lower_bitfields = bitfields_to_lower_p (loop,
> &bitfields_to_lower);
> +  if (!ifcvt_split_critical_edges (loop, aggressive_if_conv)
> +      && !need_to_lower_bitfields)
>       goto cleanup;
>
> so we lower bitfields even when we cannot split critical edges?
> why?
>
> +  need_to_ifcvt
> +    = if_convertible_loop_p (loop) && dbg_cnt (if_conversion_tree);
> +  if (!need_to_ifcvt && !need_to_lower_bitfields)
>       goto cleanup;
>
> likewise - if_convertible_loop_p performs other checks, the only
> one we want to elide is the loop->num_nodes <= 2 check since
> we want to lower bitfields in single-block loops as well.  That
> means we only have to scan for bitfield accesses in the first
> block "prematurely".  So I would interwind the need_to_lower_bitfields
> into if_convertible_loop_p and if_convertible_loop_p_1 and
> put the loop->num_nodes <= 2 after it when !need_to_lower_bitfields.
I'm not sure I understood this, but I'd rather keep the (new) 
'need_to_ifcvt' and 'need_to_lower_bitfields' separate. One thing I did 
change is that we no longer check for bitfields to lower if there are 
if-stmts that we can't lower, since we will not be vectorizing this loop 
anyway, so there's no point in wasting time lowering bitfields. At the 
same time though, I'd like to be able to lower bitfields if there are no 
ifcvts.
> +  if (!useless_type_conversion_p (TREE_TYPE (lhs), ret_type))
> +    {
> +      pattern_stmt
> +       = gimple_build_assign (vect_recog_temp_ssa_var (ret_type, NULL),
> +                              NOP_EXPR, lhs);
> +      lhs = gimple_get_lhs (pattern_stmt);
> +      append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
> +    }
>
> hm - so you have for example
>
>   int _1 = MEM;
>   int:3 _2 = BIT_FIELD_REF <_1, ...>
>   type _3 = (type) _2;
>
> and that _3 = (type) _2 is because of integer promotion and you
> perform all the shifting in that type.  I suppose you should
> verify that the cast is indeed promoting, not narrowing, since
> otherwise you'll produce wrong code?  That said, shouldn't you
> perform the shift / mask in the type of _1 instead?  (the hope
> is, of course, that typeof (_1) == type in most cases)
>
> Similar comments apply to vect_recog_bit_insert_pattern.
Good shout, I hadn't realized that yet because the testcases didn't 
exhibit that problem, but when using the REPRESENTATIVE macro they do 
test it now. I don't think bit_insert is a problem though. In 
bit_insert, 'value' always has the relevant bits starting at its LSB. So 
regardless of whether the load (and store) type is larger or smaller 
than the value's type, performing the shifts and masks in the load type 
should be OK, as you'll only be 'cutting off' the MSBs, which would be 
the ones that would get truncated anyway. Or am I missing something here?
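
To illustrate that argument with plain C (just the arithmetic with 
made-up widths, not the pattern code itself): doing the mask and shift 
in a narrower load type only drops MSBs of 'value' that a wider 
computation would discard at the store anyway:

#include <assert.h>
#include <stdint.h>

int main (void)
{
  uint8_t load = 0xFF;            /* 8-bit representative word.  */
  uint32_t value = 0xDEADBEEFu;   /* wider 'value'; only the low 4 bits matter.  */
  unsigned precision = 4, bitpos = 2;

  /* Insert done in the narrower load type, as the pattern would.  */
  uint8_t mask8 = (uint8_t) ((1u << precision) - 1);
  uint8_t res8 = (uint8_t) ((load & (uint8_t) ~(mask8 << bitpos))
			    | (((uint8_t) value & mask8) << bitpos));

  /* Insert done in the wider type of 'value', truncated at the end.  */
  uint32_t mask32 = (1u << precision) - 1;
  uint32_t res32 = ((uint32_t) load & ~(mask32 << bitpos))
		   | ((value & mask32) << bitpos);

  assert (res8 == (uint8_t) res32);  /* same stored bits either way */
  return 0;
}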

[-- Attachment #2: vect_bitfield2.patch --]
[-- Type: text/plain, Size: 29509 bytes --]

diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..01cf34fb44484ca926ca5de99eef76dd99b69e92
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
@@ -0,0 +1,40 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s { int i : 31; };
+
+#define ELT0 {0}
+#define ELT1 {1}
+#define ELT2 {2}
+#define ELT3 {3}
+#define N 32
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    int res = 0;
+    for (int i = 0; i < n; ++i)
+      res += ptr[i].i;
+    return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..1a4a1579c1478b9407ad21b19e8fbdca9f674b42
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+    unsigned i : 31;
+    char a : 4;
+};
+
+#define N 32
+#define ELT0 {0x7FFFFFFFUL, 0}
+#define ELT1 {0x7FFFFFFFUL, 1}
+#define ELT2 {0x7FFFFFFFUL, 2}
+#define ELT3 {0x7FFFFFFFUL, 3}
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    int res = 0;
+    for (int i = 0; i < n; ++i)
+      res += ptr[i].a;
+    return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..216611a29fd8bbfbafdbdb79d790e520f44ba672
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+#include <stdbool.h>
+
+extern void abort(void);
+
+typedef struct {
+    int  c;
+    int  b;
+    bool a : 1;
+} struct_t;
+
+#define N 16
+#define ELT_F { 0xFFFFFFFF, 0xFFFFFFFF, 0 }
+#define ELT_T { 0xFFFFFFFF, 0xFFFFFFFF, 1 }
+
+struct_t vect_false[N] = { ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F,
+			   ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F  };
+struct_t vect_true[N]  = { ELT_F, ELT_F, ELT_T, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F,
+			   ELT_F, ELT_F, ELT_T, ELT_F, ELT_F, ELT_F, ELT_F, ELT_F  };
+int main (void)
+{
+  unsigned ret = 0;
+  for (unsigned i = 0; i < N; i++)
+  {
+      ret |= vect_false[i].a;
+  }
+  if (ret)
+    abort ();
+
+  for (unsigned i = 0; i < N; i++)
+  {
+      ret |= vect_true[i].a;
+  }
+  if (!ret)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
new file mode 100644
index 0000000000000000000000000000000000000000..5bc9c412e9616aefcbf49a4518f1603380a54b2f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
@@ -0,0 +1,45 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+    unsigned i : 31;
+    char x : 2;
+    char a : 4;
+};
+
+#define N 32
+#define ELT0 {0x7FFFFFFFUL, 3, 0}
+#define ELT1 {0x7FFFFFFFUL, 3, 1}
+#define ELT2 {0x7FFFFFFFUL, 3, 2}
+#define ELT3 {0x7FFFFFFFUL, 3, 3}
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    int res = 0;
+    for (int i = 0; i < n; ++i)
+      res += ptr[i].a;
+    return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-1.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..19683d277b1ade1034496136f1d03bb2b446900f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-1.c
@@ -0,0 +1,39 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s { int i : 31; };
+
+#define N 32
+#define V 5
+struct s A[N];
+
+void __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    for (int i = 0; i < n; ++i)
+      ptr[i].i = V;
+}
+
+void __attribute__ ((noipa))
+check_f(struct s *ptr) {
+    for (unsigned i = 0; i < N; ++i)
+      if (ptr[i].i != V)
+	abort ();
+}
+
+int main (void)
+{
+  check_vect ();
+  __builtin_memset (&A[0], 0, sizeof(struct s) * N);
+
+  f(&A[0], N);
+  check_f (&A[0]);
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-2.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-2.c
new file mode 100644
index 0000000000000000000000000000000000000000..d550dd35ab75eb67f6e53f89fbf55b7315e50bc9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-2.c
@@ -0,0 +1,42 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+    unsigned i : 31;
+    char a : 4;
+};
+
+#define N 32
+#define V 5
+struct s A[N];
+
+void __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    for (int i = 0; i < n; ++i)
+      ptr[i].a = V;
+}
+
+void __attribute__ ((noipa))
+check_f(struct s *ptr) {
+    for (unsigned i = 0; i < N; ++i)
+      if (ptr[i].a != V)
+	abort ();
+}
+
+int main (void)
+{
+  check_vect ();
+  __builtin_memset (&A[0], 0, sizeof(struct s) * N);
+
+  f(&A[0], N);
+  check_f (&A[0]);
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-3.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-3.c
new file mode 100644
index 0000000000000000000000000000000000000000..3303d2610ff972d986be172962c129634ee64254
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-write-3.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+    unsigned i : 31;
+    char x : 2;
+    char a : 4;
+};
+
+#define N 32
+#define V 5
+struct s A[N];
+
+void __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    for (int i = 0; i < n; ++i)
+      ptr[i].a = V;
+}
+
+void __attribute__ ((noipa))
+check_f(struct s *ptr) {
+    for (unsigned i = 0; i < N; ++i)
+      if (ptr[i].a != V)
+	abort ();
+}
+
+int main (void)
+{
+  check_vect ();
+  __builtin_memset (&A[0], 0, sizeof(struct s) * N);
+
+  f(&A[0], N);
+  check_f (&A[0]);
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 1c8e1a45234b8c3565edaacd55abbee23d8ea240..f450dbb1922586b3d405281f605fb0d8a7fc8fc2 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "ssa.h"
 #include "expmed.h"
+#include "expr.h"
 #include "optabs-query.h"
 #include "gimple-pretty-print.h"
 #include "alias.h"
@@ -123,6 +124,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vectorizer.h"
 #include "tree-eh.h"
 
+/* For lang_hooks.types.type_for_mode.  */
+#include "langhooks.h"
+
 /* Only handle PHIs with no more arguments unless we are asked to by
    simd pragma.  */
 #define MAX_PHI_ARG_NUM \
@@ -145,6 +149,12 @@ static bool need_to_rewrite_undefined;
    before phi_convertible_by_degenerating_args.  */
 static bool any_complicated_phi;
 
+/* True if we have bitfield accesses we can lower.  */
+static bool need_to_lower_bitfields;
+
+/* True if there is any ifcvting to be done.  */
+static bool need_to_ifcvt;
+
 /* Hash for struct innermost_loop_behavior.  It depends on the user to
    free the memory.  */
 
@@ -2898,18 +2908,22 @@ version_loop_for_if_conversion (class loop *loop, vec<gimple *> *preds)
   class loop *new_loop;
   gimple *g;
   gimple_stmt_iterator gsi;
-  unsigned int save_length;
+  unsigned int save_length = 0;
 
   g = gimple_build_call_internal (IFN_LOOP_VECTORIZED, 2,
 				  build_int_cst (integer_type_node, loop->num),
 				  integer_zero_node);
   gimple_call_set_lhs (g, cond);
 
-  /* Save BB->aux around loop_version as that uses the same field.  */
-  save_length = loop->inner ? loop->inner->num_nodes : loop->num_nodes;
-  void **saved_preds = XALLOCAVEC (void *, save_length);
-  for (unsigned i = 0; i < save_length; i++)
-    saved_preds[i] = ifc_bbs[i]->aux;
+  void **saved_preds = NULL;
+  if (any_complicated_phi || need_to_predicate)
+    {
+      /* Save BB->aux around loop_version as that uses the same field.  */
+      save_length = loop->inner ? loop->inner->num_nodes : loop->num_nodes;
+      saved_preds = XALLOCAVEC (void *, save_length);
+      for (unsigned i = 0; i < save_length; i++)
+	saved_preds[i] = ifc_bbs[i]->aux;
+    }
 
   initialize_original_copy_tables ();
   /* At this point we invalidate porfile confistency until IFN_LOOP_VECTORIZED
@@ -2921,8 +2935,9 @@ version_loop_for_if_conversion (class loop *loop, vec<gimple *> *preds)
 			   profile_probability::always (), true);
   free_original_copy_tables ();
 
-  for (unsigned i = 0; i < save_length; i++)
-    ifc_bbs[i]->aux = saved_preds[i];
+  if (any_complicated_phi || need_to_predicate)
+    for (unsigned i = 0; i < save_length; i++)
+      ifc_bbs[i]->aux = saved_preds[i];
 
   if (new_loop == NULL)
     return NULL;
@@ -2998,7 +3013,7 @@ ifcvt_split_critical_edges (class loop *loop, bool aggressive_if_conv)
   auto_vec<edge> critical_edges;
 
   /* Loop is not well formed.  */
-  if (num <= 2 || loop->inner || !single_exit (loop))
+  if (loop->inner)
     return false;
 
   body = get_loop_body (loop);
@@ -3259,6 +3274,196 @@ ifcvt_hoist_invariants (class loop *loop, edge pe)
   free (body);
 }
 
+/* Returns the DECL_BIT_FIELD_REPRESENTATIVE of the bitfield access in STMT
+   iff the representative's type mode is not BLKmode.  If BITPOS is not NULL
+   it will hold the poly_int64 value of the DECL_FIELD_BIT_OFFSET of the
+   bitfield access and STRUCT_EXPR, if not NULL, will hold the tree
+   representing the base struct of this bitfield.  */
+
+static tree
+get_bitfield_rep (gassign *stmt, bool write, poly_int64 *bitpos,
+		  tree *struct_expr)
+{
+  tree comp_ref = write ? gimple_get_lhs (stmt)
+			: gimple_assign_rhs1 (stmt);
+
+  if (struct_expr)
+    *struct_expr = TREE_OPERAND (comp_ref, 0);
+
+  tree field_decl = TREE_OPERAND (comp_ref, 1);
+  if (bitpos)
+    *bitpos = tree_to_poly_int64 (DECL_FIELD_BIT_OFFSET (field_decl));
+
+  tree rep_decl = DECL_BIT_FIELD_REPRESENTATIVE (field_decl);
+  /* Bail out if the representative is BLKmode as we will not be able to
+     vectorize this.  */
+  if (TYPE_MODE (TREE_TYPE (rep_decl)) == E_BLKmode)
+    return NULL_TREE;
+
+  return rep_decl;
+
+}
+
+/* Lowers the bitfield described by DATA.
+   For a write like:
+
+   struct.bf = _1;
+
+   lower to:
+
+   __ifc_1 = struct.<representative>;
+   __ifc_2 = BIT_INSERT_EXPR (__ifc_1, _1, bitpos);
+   struct.<representative> = __ifc_2;
+
+   For a read:
+
+   _1 = struct.bf;
+
+    lower to:
+
+    __ifc_1 = struct.<representative>;
+    _1 =  BIT_FIELD_REF (__ifc_1, bitsize, bitpos);
+
+    where representative is a legal load that contains the bitfield value,
+    bitsize is the size of the bitfield and bitpos the offset to the start of
+    the bitfield within the representative.  */
+
+static void
+lower_bitfield (gassign *stmt, bool write)
+{
+  tree struct_expr;
+  poly_int64 bitpos;
+  tree rep_decl = get_bitfield_rep (stmt, write, &bitpos, &struct_expr);
+  tree rep_type = TREE_TYPE (rep_decl);
+  tree bf_type = TREE_TYPE (gimple_get_lhs (stmt));
+
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Lowering:\n");
+      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+      fprintf (dump_file, "to:\n");
+    }
+
+  /* BITPOS represents the position of the first bit of the bitfield we are
+     accessing.  However, sometimes it is relative to the start of the struct,
+     and sometimes relative to the start of the representative we are loading.
+     In the former case, the following code adapts BITPOS to the latter, since
+     that is the value BIT_FIELD_REF expects as its bit position.  In the
+     latter case this has no effect.  */
+  HOST_WIDE_INT q;
+  poly_int64 r;
+  poly_int64 rep_align = TYPE_ALIGN (rep_type);
+  if (can_div_trunc_p (bitpos, rep_align, &q, &r))
+    bitpos = r;
+
+  /* REP_COMP_REF is a COMPONENT_REF for the representative.  NEW_VAL is its
+     defining SSA_NAME.  */
+  tree rep_comp_ref = build3 (COMPONENT_REF, rep_type, struct_expr, rep_decl,
+			      NULL_TREE);
+  tree new_val = ifc_temp_var (rep_type, rep_comp_ref, &gsi);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (new_val), 0, TDF_SLIM);
+
+  tree bitpos_tree = build_int_cst (bitsizetype, bitpos);
+  if (write)
+    {
+      new_val = ifc_temp_var (rep_type,
+			      build3 (BIT_INSERT_EXPR, rep_type, new_val,
+				      unshare_expr (gimple_assign_rhs1 (stmt)),
+				      bitpos_tree), &gsi);
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (new_val), 0, TDF_SLIM);
+
+      gimple *new_stmt = gimple_build_assign (unshare_expr (rep_comp_ref),
+					      new_val);
+      gimple_set_vuse (new_stmt, gimple_vuse (stmt));
+      tree vdef = gimple_vdef (stmt);
+      gimple_set_vdef (new_stmt, vdef);
+      SSA_NAME_DEF_STMT (vdef) = new_stmt;
+      gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	print_gimple_stmt (dump_file, new_stmt, 0, TDF_SLIM);
+    }
+  else
+    {
+      tree bfr = build3 (BIT_FIELD_REF, bf_type, new_val,
+			 build_int_cst (bitsizetype, TYPE_PRECISION (bf_type)),
+			 bitpos_tree);
+      new_val = ifc_temp_var (bf_type, bfr, &gsi);
+      redundant_ssa_names.safe_push (std::make_pair (gimple_get_lhs (stmt),
+						     new_val));
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (new_val), 0, TDF_SLIM);
+    }
+
+  gsi_remove (&gsi, true);
+}
+
+/* Return TRUE if there are bitfields to lower in this LOOP.  Fill
+   READS_TO_LOWER and WRITES_TO_LOWER with the bitfield accesses found.  */
+
+static bool
+bitfields_to_lower_p (class loop *loop,
+		      vec <gassign *> &reads_to_lower,
+		      vec <gassign *> &writes_to_lower)
+{
+  basic_block *bbs = get_loop_body (loop);
+  gimple_stmt_iterator gsi;
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Analyzing loop %d for bitfields:\n", loop->num);
+    }
+
+  for (unsigned i = 0; i < loop->num_nodes; ++i)
+    {
+      basic_block bb = bbs[i];
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gassign *stmt = dyn_cast<gassign*> (gsi_stmt (gsi));
+	  if (!stmt)
+	    continue;
+
+	  tree op = gimple_get_lhs (stmt);
+	  bool write = TREE_CODE (op) == COMPONENT_REF;
+
+	  if (!write)
+	    op = gimple_assign_rhs1 (stmt);
+
+	  if (TREE_CODE (op) != COMPONENT_REF)
+	    continue;
+
+	  if (DECL_BIT_FIELD_TYPE (TREE_OPERAND (op, 1)))
+	    {
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+
+	      if (!get_bitfield_rep (stmt, write, NULL, NULL))
+		{
+		  if (dump_file && (dump_flags & TDF_DETAILS))
+		    fprintf (dump_file, "\t Bitfield NOT OK to lower,"
+					" representative is BLKmode.\n");
+		  return false;
+		}
+
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		fprintf (dump_file, "\tBitfield OK to lower.\n");
+	      if (write)
+		writes_to_lower.safe_push (stmt);
+	      else
+		reads_to_lower.safe_push (stmt);
+	    }
+	}
+    }
+  return !reads_to_lower.is_empty () || !writes_to_lower.is_empty ();
+}
+
+
 /* If-convert LOOP when it is legal.  For the moment this pass has no
    profitability analysis.  Returns non-zero todo flags when something
    changed.  */
@@ -3269,12 +3474,18 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
   unsigned int todo = 0;
   bool aggressive_if_conv;
   class loop *rloop;
+  vec <gassign *> reads_to_lower;
+  vec <gassign *> writes_to_lower;
   bitmap exit_bbs;
   edge pe;
 
  again:
+  reads_to_lower.create (4);
+  writes_to_lower.create (4);
   rloop = NULL;
   ifc_bbs = NULL;
+  need_to_lower_bitfields = false;
+  need_to_ifcvt = false;
   need_to_predicate = false;
   need_to_rewrite_undefined = false;
   any_complicated_phi = false;
@@ -3290,16 +3501,30 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
 	aggressive_if_conv = true;
     }
 
-  if (!ifcvt_split_critical_edges (loop, aggressive_if_conv))
+  if (!single_exit (loop))
     goto cleanup;
 
-  if (!if_convertible_loop_p (loop)
-      || !dbg_cnt (if_conversion_tree))
-    goto cleanup;
+  /* If there are more than two BBs in the loop then there is at least one if
+     to convert.  */
+  if (loop->num_nodes > 2)
+    {
+      need_to_ifcvt = true;
+      if (!ifcvt_split_critical_edges (loop, aggressive_if_conv))
+	goto cleanup;
+
+      if (!if_convertible_loop_p (loop) || !dbg_cnt (if_conversion_tree))
+	goto cleanup;
+
+      if ((need_to_predicate || any_complicated_phi)
+	  && ((!flag_tree_loop_vectorize && !loop->force_vectorize)
+	      || loop->dont_vectorize))
+	goto cleanup;
+    }
 
-  if ((need_to_predicate || any_complicated_phi)
-      && ((!flag_tree_loop_vectorize && !loop->force_vectorize)
-	  || loop->dont_vectorize))
+  need_to_lower_bitfields = bitfields_to_lower_p (loop, reads_to_lower,
+						  writes_to_lower);
+
+  if (!need_to_ifcvt && !need_to_lower_bitfields)
     goto cleanup;
 
   /* The edge to insert invariant stmts on.  */
@@ -3310,7 +3535,8 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
      Either version this loop, or if the pattern is right for outer-loop
      vectorization, version the outer loop.  In the latter case we will
      still if-convert the original inner loop.  */
-  if (need_to_predicate
+  if (need_to_lower_bitfields
+      || need_to_predicate
       || any_complicated_phi
       || flag_tree_loop_if_convert != 1)
     {
@@ -3350,10 +3576,31 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
 	pe = single_pred_edge (gimple_bb (preds->last ()));
     }
 
-  /* Now all statements are if-convertible.  Combine all the basic
-     blocks into one huge basic block doing the if-conversion
-     on-the-fly.  */
-  combine_blocks (loop);
+  if (need_to_lower_bitfields)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "-------------------------\n");
+	  fprintf (dump_file, "Start lowering bitfields\n");
+	}
+      while (!reads_to_lower.is_empty ())
+	lower_bitfield (reads_to_lower.pop (), false);
+      while (!writes_to_lower.is_empty ())
+	lower_bitfield (writes_to_lower.pop (), true);
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Done lowering bitfields\n");
+	  fprintf (dump_file, "-------------------------\n");
+	}
+    }
+  if (need_to_ifcvt)
+    {
+      /* Now all statements are if-convertible.  Combine all the basic
+	 blocks into one huge basic block doing the if-conversion
+	 on-the-fly.  */
+      combine_blocks (loop);
+    }
 
   /* Perform local CSE, this esp. helps the vectorizer analysis if loads
      and stores are involved.  CSE only the loop body, not the entry
@@ -3380,6 +3627,8 @@ tree_if_conversion (class loop *loop, vec<gimple *> *preds)
   todo |= TODO_cleanup_cfg;
 
  cleanup:
+  reads_to_lower.release ();
+  writes_to_lower.release ();
   if (ifc_bbs)
     {
       unsigned int i;
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index b279a82551eb70379804d405983ae5dc44b66bf5..e93cdc727da4bb7863b2ad13f29f7d550492adea 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -4301,7 +4301,8 @@ vect_find_stmt_data_reference (loop_p loop, gimple *stmt,
       free_data_ref (dr);
       return opt_result::failure_at (stmt,
 				     "not vectorized:"
-				     " statement is bitfield access %G", stmt);
+				     " statement is an unsupported"
+				     " bitfield access %G", stmt);
     }
 
   if (DR_BASE_ADDRESS (dr)
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index dfbfb71b3c69a0205ccc1b287cb50fa02a70942e..5486aa72a33274db954abf275c2c30dae3accc1c 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-eh.h"
 #include "gimplify.h"
 #include "gimple-iterator.h"
+#include "gimplify-me.h"
 #include "cfgloop.h"
 #include "tree-vectorizer.h"
 #include "dumpfile.h"
@@ -1828,6 +1829,206 @@ vect_recog_widen_sum_pattern (vec_info *vinfo,
   return pattern_stmt;
 }
 
+/* Function vect_recog_bitfield_ref_pattern
+
+   Try to find the following pattern:
+
+   _2 = BIT_FIELD_REF (_1, bitsize, bitpos);
+   _3 = (type) _2;
+
+   where type is a non-bitfield type, that is to say, its precision matches
+   2^(TYPE_SIZE(type) - (TYPE_UNSIGNED (type) ? 1 : 2)).
+
+   Input:
+
+   * STMT_VINFO: The stmt from which the pattern search begins.
+   here it starts with:
+   _3 = (type) _2;
+
+   Output:
+
+   * TYPE_OUT: The vector type of the output of this pattern.
+
+   * Return value: A new stmt that will be used to replace the sequence of
+   stmts that constitute the pattern. In this case it will be:
+   patt1 = (type) _1;
+   patt2 = patt1 >> bitpos;
+   _3 = patt2 & ((1 << bitsize) - 1);
+
+*/
+
+static gimple *
+vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
+				 tree *type_out)
+{
+  gassign *nop_stmt = dyn_cast <gassign *> (stmt_info->stmt);
+  if (!nop_stmt
+      || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (nop_stmt))
+      || TREE_CODE (gimple_assign_rhs1 (nop_stmt)) != SSA_NAME)
+    return NULL;
+
+  gassign *bf_stmt
+    = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (gimple_assign_rhs1 (nop_stmt)));
+
+  if (!bf_stmt || gimple_assign_rhs_code (bf_stmt) != BIT_FIELD_REF)
+    return NULL;
+
+  tree bf_ref = gimple_assign_rhs1 (bf_stmt);
+  tree lhs = TREE_OPERAND (bf_ref, 0);
+
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return NULL;
+
+  gimple *pattern_stmt;
+  tree ret_type = TREE_TYPE (gimple_assign_lhs (nop_stmt));
+
+  /* We move the conversion earlier if the loaded type is smaller than the
+     return type to enable the use of widening loads.  */
+  if (TYPE_PRECISION (TREE_TYPE (lhs)) < TYPE_PRECISION (ret_type)
+      && !useless_type_conversion_p (TREE_TYPE (lhs), ret_type))
+    {
+      pattern_stmt
+	= gimple_build_assign (vect_recog_temp_ssa_var (ret_type, NULL),
+			       NOP_EXPR, lhs);
+      lhs = gimple_get_lhs (pattern_stmt);
+      append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
+    }
+
+  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
+  unsigned HOST_WIDE_INT mask_i = bit_field_size (bf_ref).to_constant ();
+  tree mask = build_int_cst (TREE_TYPE (lhs),
+			     ((1ULL << mask_i) - 1) << shift_n);
+  pattern_stmt
+    = gimple_build_assign (vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL),
+			   BIT_AND_EXPR, lhs, mask);
+  lhs = gimple_get_lhs (pattern_stmt);
+  if (shift_n)
+    {
+      append_pattern_def_seq (vinfo, stmt_info, pattern_stmt,
+			      get_vectype_for_scalar_type (vinfo,
+							   TREE_TYPE (lhs)));
+      pattern_stmt
+	= gimple_build_assign (vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL),
+			       RSHIFT_EXPR, lhs,
+			       build_int_cst (sizetype, shift_n));
+      lhs = gimple_get_lhs (pattern_stmt);
+    }
+
+  if (!useless_type_conversion_p (TREE_TYPE (lhs), ret_type))
+    {
+      append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
+      pattern_stmt
+	= gimple_build_assign (vect_recog_temp_ssa_var (ret_type, NULL),
+			       NOP_EXPR, lhs);
+      lhs = gimple_get_lhs (pattern_stmt);
+    }
+
+  *type_out = STMT_VINFO_VECTYPE (stmt_info);
+  vect_pattern_detected ("bitfield_ref pattern", stmt_info->stmt);
+
+  return pattern_stmt;
+}
+
+/* Function vect_recog_bit_insert_pattern
+
+   Try to find the following pattern:
+
+   _3 = BIT_INSERT_EXPR (_1, _2, bitpos);
+
+   Input:
+
+   * STMT_VINFO: The stmt we want to replace.
+
+   Output:
+
+   * TYPE_OUT: The vector type of the output of this pattern.
+
+   * Return value: A new stmt that will be used to replace the sequence of
+   stmts that constitute the pattern. In this case it will be:
+   patt1 = _2 & mask;		    // Clearing of the non-relevant bits in the
+				    // 'to-write value'.
+   patt2 = patt1 << bitpos;	    // Shift the cleaned value in to place.
+   patt3 = _1 & ~(mask << bitpos);  // Clearing the bits we want to write to,
+				    // from the value we want to write to.
+   _3 = patt3 | patt2;		    // Write bits.
+
+
+   where mask = ((1 << TYPE_PRECISION (_2)) - 1), a mask to keep the number of
+   bits corresponding to the real size of the bitfield value we are writing to.
+
+*/
+
+static gimple *
+vect_recog_bit_insert_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
+			       tree *type_out)
+{
+  gassign *bf_stmt = dyn_cast <gassign *> (stmt_info->stmt);
+  if (!bf_stmt || gimple_assign_rhs_code (bf_stmt) != BIT_INSERT_EXPR)
+    return NULL;
+
+  tree load = gimple_assign_rhs1 (bf_stmt);
+  tree value = gimple_assign_rhs2 (bf_stmt);
+  tree offset = gimple_assign_rhs3 (bf_stmt);
+
+  tree bf_type = TREE_TYPE (value);
+  tree load_type = TREE_TYPE (load);
+
+  if (!INTEGRAL_TYPE_P (load_type))
+    return NULL;
+
+  gimple *pattern_stmt;
+
+  if (!useless_type_conversion_p (TREE_TYPE (value), load_type))
+    {
+      value = fold_build1 (NOP_EXPR, load_type, value);
+      if (!CONSTANT_CLASS_P (value))
+	{
+	  pattern_stmt
+	    = gimple_build_assign (vect_recog_temp_ssa_var (load_type, NULL),
+				   value);
+	  value = gimple_get_lhs (pattern_stmt);
+	  append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
+	}
+    }
+
+  unsigned HOST_WIDE_INT mask_i = (1ULL << TYPE_PRECISION (bf_type)) - 1;
+  tree mask_t = build_int_cst (load_type, mask_i);
+  /* Clear bits we don't want to write back from value and shift it in place.  */
+  pattern_stmt
+    = gimple_build_assign (vect_recog_temp_ssa_var (load_type, NULL),
+			   fold_build2 (BIT_AND_EXPR, load_type, value,
+					mask_t));
+  append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
+  unsigned HOST_WIDE_INT shift_n = tree_to_uhwi (offset);
+  if (shift_n)
+    {
+      pattern_stmt
+	= gimple_build_assign (vect_recog_temp_ssa_var (load_type, NULL),
+			       LSHIFT_EXPR, value, offset);
+      append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
+      value = gimple_get_lhs (pattern_stmt);
+    }
+  /* Mask off the bits in the loaded value.  */
+  mask_i <<= shift_n;
+  mask_i = ~mask_i;
+  mask_t = build_int_cst (load_type, mask_i);
+
+  tree lhs = vect_recog_temp_ssa_var (load_type, NULL);
+  pattern_stmt = gimple_build_assign (lhs, BIT_AND_EXPR, load, mask_t);
+  append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
+
+  /* Compose the value to write back.  */
+  pattern_stmt
+    = gimple_build_assign (vect_recog_temp_ssa_var (load_type, NULL),
+			   BIT_IOR_EXPR, lhs, value);
+
+  *type_out = STMT_VINFO_VECTYPE (stmt_info);
+  vect_pattern_detected ("bit_insert pattern", stmt_info->stmt);
+
+  return pattern_stmt;
+}
+
+
 /* Recognize cases in which an operation is performed in one type WTYPE
    but could be done more efficiently in a narrower type NTYPE.  For example,
    if we have:
@@ -5623,6 +5824,8 @@ struct vect_recog_func
    taken which means usually the more complex one needs to preceed the
    less comples onex (widen_sum only after dot_prod or sad for example).  */
 static vect_recog_func vect_vect_recog_func_ptrs[] = {
+  { vect_recog_bitfield_ref_pattern, "bitfield_ref" },
+  { vect_recog_bit_insert_pattern, "bit_insert" },
   { vect_recog_over_widening_pattern, "over_widening" },
   /* Must come after over_widening, which narrows the shift as much as
      possible beforehand.  */


Thread overview: 25+ messages
2022-07-26 10:00 [RFC] Teach vectorizer to deal with bitfield reads Andre Vieira (lists)
2022-07-27 11:37 ` Richard Biener
2022-07-29  8:57   ` Andre Vieira (lists)
2022-07-29  9:11     ` Richard Biener
2022-07-29 10:31     ` Jakub Jelinek
2022-07-29 10:52       ` Richard Biener
2022-08-01 10:21         ` Andre Vieira (lists)
2022-08-01 13:16           ` Richard Biener
2022-08-08 14:06             ` [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads) Andre Vieira (lists)
2022-08-09 14:34               ` Richard Biener
2022-08-16 10:24                 ` Andre Vieira (lists) [this message]
2022-08-17 12:49                   ` Richard Biener
2022-08-25  9:09                     ` Andre Vieira (lists)
2022-09-08  9:07                       ` Andre Vieira (lists)
2022-09-08 11:51                       ` Richard Biener
2022-09-26 15:23                         ` Andre Vieira (lists)
2022-09-27 12:34                           ` Richard Biener
2022-09-28  9:43                             ` Andre Vieira (lists)
2022-09-28 17:31                               ` Andre Vieira (lists)
2022-09-29  7:54                                 ` Richard Biener
2022-10-07 14:20                                   ` Andre Vieira (lists)
2022-10-12  1:55                                     ` Hongtao Liu
2022-10-12  2:11                                       ` Hongtao Liu
2022-08-01 10:13       ` [RFC] Teach vectorizer to deal with bitfield reads Andre Vieira (lists)
2022-10-12  9:02 ` Eric Botcazou
