public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
@ 2013-07-17 20:58 Zoran Jovanovic
  2013-07-17 21:19 ` Joseph S. Myers
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Zoran Jovanovic @ 2013-07-17 20:58 UTC (permalink / raw)
  To: gcc-patches; +Cc: Petar Jovanovic

Hello,
This patch adds new optimization pass that combines several adjacent bit field accesses that copy values from one memory location to another into single bit field access.

Example:

Original code:
  <unnamed-unsigned:3> D.1351;
  <unnamed-unsigned:9> D.1350;
  <unnamed-unsigned:7> D.1349;
  D.1349_2 = p1_1(D)->f1;
  p2_3(D)->f1 = D.1349_2;
  D.1350_4 = p1_1(D)->f2;
  p2_3(D)->f2 = D.1350_4;
  D.1351_5 = p1_1(D)->f3;
  p2_3(D)->f3 = D.1351_5;

Optimized code:
  <unnamed-unsigned:19> D.1358;
  D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
  BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;

Algorithm works on basic block level and consists of following 3 major steps:
1. Go trough basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure.
2. Identify records that represent adjacent bit field accesses and mark them as merged.
3. Modify trees accordingly.

New command line option "-ftree-bitfield-merge" is introduced.

Tested - passed gcc regression tests.

Changelog -

gcc/ChangeLog:
2013-07-17 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
  * Makefile.in : Added tree-ssa-bitfield-merge.o to OBJS.
  * common.opt (ftree-bitfield-merge): New option.
  * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
  * dwarf2out.c (field_type): static removed from declaration.
  (simple_type_size_in_bits): static removed from declaration.
  (field_byte_offset): static removed from declaration.
  (field_type): static inline removed from declaration.
  * passes.c (init_optimization_passes): pass_bitfield_merge pass added.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg.c: New test.
  * timevar.def : Added TV_TREE_BITFIELD_MERGE.
  * tree-pass.h : Added pass_bitfield_merge declaration.
  * tree-ssa-bitfield-merge.c : New file.

Patch -

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d5121f3..5cdd6eb 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1417,6 +1417,7 @@ OBJS = \
 	tree-ssa-dom.o \
 	tree-ssa-dse.o \
 	tree-ssa-forwprop.o \
+	tree-ssa-bitfield-merge.o \
 	tree-ssa-ifcombine.o \
 	tree-ssa-live.o \
 	tree-ssa-loop-ch.o \
@@ -2312,6 +2313,11 @@ tree-ssa-forwprop.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
    $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
    langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) \
    $(OPTABS_H) tree-ssa-propagate.h
+tree-ssa-bitfield-merge.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) \
+   coretypes.h $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
+   $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) \
+   langhooks.h $(FLAGS_H) $(GIMPLE_H) tree-pretty-print.h \
+   gimple-pretty-print.h $(EXPR_H)
 tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
    $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
    $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
@@ -3803,6 +3809,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/ipa-inline.h \
   $(srcdir)/asan.c \
   $(srcdir)/tsan.c \
+  $(srcdir)/tree-ssa-bitfield-merge.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/common.opt b/gcc/common.opt
index 4c7933e..e0dbc37 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2088,6 +2088,10 @@ ftree-forwprop
 Common Report Var(flag_tree_forwprop) Init(1) Optimization
 Enable forward propagation on trees
 
+ftree-bitfield-merge
+Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
+Enable bit field merge on trees
+
 ftree-fre
 Common Report Var(flag_tree_fre) Optimization
 Enable Full Redundancy Elimination (FRE) on trees
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dd82880..7b671aa 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -409,7 +409,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
+-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
@@ -7582,6 +7582,11 @@ pointer alignment information.
 This pass only operates on local scalar variables and is enabled by default
 at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
 
+@item -ftree-bitfield-merge
+@opindex ftree-bitfield-merge
+Combines several adjacent bit field accesses that copy values
+from one memory location to another into single bit field access.
+
 @item -ftree-ccp
 @opindex ftree-ccp
 Perform sparse conditional constant propagation (CCP) on trees.  This
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index f42ad66..a08eede 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3100,11 +3100,11 @@ static dw_loc_descr_ref loc_descriptor (rtx, enum machine_mode mode,
 static dw_loc_list_ref loc_list_from_tree (tree, int);
 static dw_loc_descr_ref loc_descriptor_from_tree (tree, int);
 static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
-static tree field_type (const_tree);
+tree field_type (const_tree);
 static unsigned int simple_type_align_in_bits (const_tree);
 static unsigned int simple_decl_align_in_bits (const_tree);
-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
-static HOST_WIDE_INT field_byte_offset (const_tree);
+unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
+HOST_WIDE_INT field_byte_offset (const_tree);
 static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 					 dw_loc_list_ref);
 static void add_data_member_location_attribute (dw_die_ref, tree);
@@ -10061,7 +10061,7 @@ is_base_type (tree type)
    else return BITS_PER_WORD if the type actually turns out to be an
    ERROR_MARK node.  */
 
-static inline unsigned HOST_WIDE_INT
+unsigned HOST_WIDE_INT
 simple_type_size_in_bits (const_tree type)
 {
   if (TREE_CODE (type) == ERROR_MARK)
@@ -14375,7 +14375,7 @@ ceiling (HOST_WIDE_INT value, unsigned int boundary)
    `integer_type_node' if the given node turns out to be an
    ERROR_MARK node.  */
 
-static inline tree
+tree
 field_type (const_tree decl)
 {
   tree type;
@@ -14426,7 +14426,7 @@ round_up_to_align (double_int t, unsigned int align)
    because the offset is actually variable.  (We can't handle the latter case
    just yet).  */
 
-static HOST_WIDE_INT
+HOST_WIDE_INT
 field_byte_offset (const_tree decl)
 {
   double_int object_offset_in_bits;
diff --git a/gcc/passes.c b/gcc/passes.c
index c8b03ee..3149adc 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1325,6 +1325,7 @@ init_optimization_passes (void)
 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
 	  NEXT_PASS (pass_rename_ssa_copies);
 	  NEXT_PASS (pass_ccp);
+	  NEXT_PASS (pass_bitfield_merge);
 	  /* After CCP we rewrite no longer addressed locals into SSA
 	     form if possible.  */
 	  NEXT_PASS (pass_forwprop);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c
new file mode 100644
index 0000000..9be6bc9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-bitfieldmerge" }  */
+
+struct S
+{
+  unsigned f1:7;
+  unsigned f2:9;
+  unsigned f3:3;
+  unsigned f4:5;
+  unsigned f5:1;
+  unsigned f6:2;
+};
+
+unsigned
+foo (struct S *p1, struct S *p2, int *ptr)
+{
+  p2->f1 = p1->f1;
+  p2->f2 = p1->f2;
+  p2->f3 = p1->f3;
+  *ptr = 7;
+  p2->f4 = p1->f4;
+  p2->f5 = p1->f5;
+  p2->f6 = p1->f6;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "19" "bitfieldmerge" } } */
+/* { dg-final { scan-tree-dump "8" "bitfieldmerge"} } */
+/* { dg-final { cleanup-tree-dump "bitfieldmerge" } } */
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 44f0eac..d9b1c23 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -149,6 +149,7 @@ DEFTIMEVAR (TV_TREE_FRE		     , "tree FRE")
 DEFTIMEVAR (TV_TREE_SINK             , "tree code sinking")
 DEFTIMEVAR (TV_TREE_PHIOPT	     , "tree linearize phis")
 DEFTIMEVAR (TV_TREE_FORWPROP	     , "tree forward propagate")
+DEFTIMEVAR (TV_TREE_BITFIELD_MERGE   , "tree bitfield merge")
 DEFTIMEVAR (TV_TREE_PHIPROP	     , "tree phiprop")
 DEFTIMEVAR (TV_TREE_DCE		     , "tree conservative DCE")
 DEFTIMEVAR (TV_TREE_CD_DCE	     , "tree aggressive DCE")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index b8c59a7..59ca028 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -337,6 +337,7 @@ extern struct gimple_opt_pass pass_warn_function_noreturn;
 extern struct gimple_opt_pass pass_cselim;
 extern struct gimple_opt_pass pass_phiopt;
 extern struct gimple_opt_pass pass_forwprop;
+extern struct gimple_opt_pass pass_bitfield_merge;
 extern struct gimple_opt_pass pass_phiprop;
 extern struct gimple_opt_pass pass_tree_ifcombine;
 extern struct gimple_opt_pass pass_dse;
diff --git a/gcc/tree-ssa-bitfield-merge.c b/gcc/tree-ssa-bitfield-merge.c
new file mode 100755
index 0000000..71f41b3
--- /dev/null
+++ b/gcc/tree-ssa-bitfield-merge.c
@@ -0,0 +1,503 @@
+/* Forward propagation of expressions for single use variables.
+   Copyright (C) 2004, 2005, 2007, 2008, 2009, 2010, 2011
+   Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "tm_p.h"
+#include "basic-block.h"
+#include "timevar.h"
+#include "gimple-pretty-print.h"
+#include "tree-flow.h"
+#include "tree-pass.h"
+#include "tree-dump.h"
+#include "langhooks.h"
+#include "flags.h"
+#include "gimple.h"
+#include "expr.h"
+#include "ggc.h"
+
+tree
+field_type (const_tree decl);
+
+bool
+expressions_equal_p (tree e1, tree e2);
+
+HOST_WIDE_INT
+field_byte_offset (const_tree decl);
+
+unsigned HOST_WIDE_INT
+simple_type_size_in_bits (const_tree type);
+
+/* This pass combines several adjacent bit field accesses that copy values
+   from one memory location to another into single bit field access.  */
+
+/* Data for single bit field read/write sequence.  */
+struct GTY (()) bitfield_access_d {
+  gimple load_stmt;		  /* Bit field load statement.  */
+  gimple store_stmt;		  /* Bit field store statement.  */
+  unsigned src_offset_bytes;	  /* Bit field offset at src in bytes.  */
+  unsigned src_bit_offset;	  /* Bit field offset inside source word.  */
+  unsigned src_bit_size;	  /* Size of bit field in source word.  */
+  unsigned dst_offset_bytes;	  /* Bit field offset at dst in bytes.  */
+  unsigned dst_bit_offset;	  /* Bit field offset inside destination
+				     word.  */
+  unsigned dst_bit_size;	  /* Size of bit field in destination word.  */
+  tree src_addr;		  /* Address of source memory access.  */
+  tree dst_addr;		  /* Address of destination memory access.  */
+  bool merged;			  /* True if access is merged with another
+				     one.  */
+  bool modified;		  /* True if bitfield size is modified.  */
+  bool is_barrier;		  /* True if access is barrier (call or mem
+				     access).  */
+  struct bitfield_access_d *next; /* Access with which this one is merged.  */
+};
+
+typedef struct bitfield_access_d bitfield_access_o;
+typedef struct bitfield_access_d *bitfield_access;
+
+/* Connecting register with bit field access sequence that defines value in
+   that register.  */
+struct GTY (()) bitfield_stmt_access_pair_d
+{
+  gimple stmt;
+  bitfield_access access;
+};
+
+typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
+typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
+
+static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
+  htab_t bitfield_stmt_access_htab;
+
+/* Hash table callbacks for bitfield_stmt_access_htab.  */
+
+static hashval_t
+bitfield_stmt_access_pair_htab_hash (const void *p)
+{
+  const struct bitfield_stmt_access_pair_d *entry =
+    (const struct bitfield_stmt_access_pair_d *)p;
+  return (hashval_t) (uintptr_t) entry->stmt;
+}
+
+static int
+bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
+{
+  const struct bitfield_stmt_access_pair_d *entry1 =
+    (const struct bitfield_stmt_access_pair_d *)p1;
+  const struct bitfield_stmt_access_pair_d *entry2 =
+    (const struct bitfield_stmt_access_pair_d *)p2;
+  return entry1->stmt == entry2->stmt;
+}
+
+
+static bool cfg_changed;
+
+/* Compare two bit field access records.  */
+
+static int
+cmp_access (const void *p1, const void *p2)
+{
+  const bitfield_access a1 = (*(const bitfield_access*)p1);
+  const bitfield_access a2 = (*(const bitfield_access*)p2);
+
+  if (!expressions_equal_p (a1->src_addr, a1->src_addr))
+    return a1 - a2;
+
+  if (!expressions_equal_p (a1->dst_addr, a1->dst_addr))
+    return a1 - a2;
+
+  return a1->src_bit_offset - a2->src_bit_offset;
+}
+
+/* Create new bit field access structure and add it to given bitfield_accesses
+   htab.  */
+static bitfield_access
+create_and_insert_access (vec<bitfield_access>
+		       *bitfield_accesses)
+{
+  bitfield_access access = ggc_alloc_bitfield_access_d ();
+  memset (access, 0, sizeof (struct bitfield_access_d));
+  bitfield_accesses->safe_push (access);
+  return access;
+}
+
+/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
+static inline HOST_WIDE_INT
+get_bit_offset (tree decl)
+{
+  HOST_WIDE_INT object_offset_in_bytes = field_byte_offset (decl);
+  tree type = DECL_BIT_FIELD_TYPE (decl);
+  HOST_WIDE_INT bitpos_int;
+  HOST_WIDE_INT highest_order_object_bit_offset;
+  HOST_WIDE_INT highest_order_field_bit_offset;
+  HOST_WIDE_INT bit_offset;
+
+  /* Must be a field and a bit field.  */
+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
+  if (! host_integerp (bit_position (decl), 0)
+      || ! host_integerp (DECL_SIZE (decl), 1))
+    return -1;
+
+  bitpos_int = int_bit_position (decl);
+
+  /* Note that the bit offset is always the distance (in bits) from the
+     highest-order bit of the "containing object" to the highest-order bit of
+     the bit-field itself.  Since the "high-order end" of any object or field
+     is different on big-endian and little-endian machines, the computation
+     below must take account of these differences.  */
+  highest_order_object_bit_offset = object_offset_in_bytes * BITS_PER_UNIT;
+  highest_order_field_bit_offset = bitpos_int;
+
+  if (! BYTES_BIG_ENDIAN)
+    {
+      highest_order_field_bit_offset += tree_low_cst (DECL_SIZE (decl), 0);
+      highest_order_object_bit_offset += simple_type_size_in_bits (type);
+    }
+
+  bit_offset
+    = (! BYTES_BIG_ENDIAN
+       ? highest_order_object_bit_offset - highest_order_field_bit_offset
+       : highest_order_field_bit_offset - highest_order_object_bit_offset);
+
+  return bit_offset;
+}
+
+/* Slightly modified add_byte_size_attribute from dwarf2out.c.  */
+static inline HOST_WIDE_INT
+get_byte_size (tree tree_node)
+{
+  unsigned size;
+
+  switch (TREE_CODE (tree_node))
+    {
+    case ERROR_MARK:
+      size = 0;
+      break;
+    case ENUMERAL_TYPE:
+    case RECORD_TYPE:
+    case UNION_TYPE:
+    case QUAL_UNION_TYPE:
+      size = int_size_in_bytes (tree_node);
+      break;
+    case FIELD_DECL:
+      /* For a data member of a struct or union, the DW_AT_byte_size is
+	 generally given as the number of bytes normally allocated for an
+	 object of the *declared* type of the member itself.  This is true
+	 even for bit-fields.  */
+      size = simple_type_size_in_bits (field_type (tree_node)) / BITS_PER_UNIT;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  return size;
+}
+
+/* Returns size of combined bitfields.  */
+static int
+get_merged_bit_field_size (bitfield_access access)
+{
+  bitfield_access tmp_access = access;
+  int size = 0;
+
+  while (tmp_access)
+  {
+    size += tmp_access->src_bit_size;
+    tmp_access = tmp_access->next;
+  }
+  return size;
+}
+
+/* Adds new pair consisting of statement and bit field access structure that
+   contains it.  */
+static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
+{
+  bitfield_stmt_access_pair new_pair;
+  void **slot;
+  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
+  new_pair->stmt = stmt;
+  new_pair->access = access;
+  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
+  if (*slot == HTAB_EMPTY_ENTRY)
+    {
+      *slot = new_pair;
+      return true;
+    }
+  return false;
+}
+
+/* Main entry point for the bit field merge optimization.  */
+static unsigned int
+ssa_bitfield_merge (void)
+{
+  basic_block bb;
+  unsigned int todoflags = 0;
+  vec<bitfield_access> bitfield_accesses;
+  int ix, iy;
+  bitfield_access access;
+
+  cfg_changed = false;
+
+  FOR_EACH_BB (bb)
+    {
+      gimple_stmt_iterator gsi;
+      vec<bitfield_access> bitfield_accesses_merge = vNULL;
+      bitfield_accesses.create (0);
+
+      bitfield_stmt_access_htab
+	= htab_create_ggc (128,
+			 bitfield_stmt_access_pair_htab_hash,
+			 bitfield_stmt_access_pair_htab_eq,
+			 NULL);
+
+      /* Identify all bitfield copy sequences in the basic-block.  */
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  tree lhs, rhs;
+	  void **slot;
+	  struct bitfield_stmt_access_pair_d asdata;
+
+	  if (!is_gimple_assign (stmt))
+	    {
+	      gsi_next (&gsi);
+	      continue;
+	    }
+
+	  lhs = gimple_assign_lhs (stmt);
+	  rhs = gimple_assign_rhs1 (stmt);
+
+	  if (TREE_CODE (rhs) == COMPONENT_REF)
+	    {
+	      use_operand_p use;
+	      gimple use_stmt;
+	      tree op0 = TREE_OPERAND (rhs, 0);
+	      tree op1 = TREE_OPERAND (rhs, 1);
+
+	      if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1))
+		{
+		  if (single_imm_use (lhs, &use, &use_stmt)
+		       && is_gimple_assign (use_stmt))
+		    {
+		      tree use_lhs = gimple_assign_lhs (use_stmt);
+		      if (TREE_CODE (use_lhs) == COMPONENT_REF)
+			{
+			  tree use_op0 = TREE_OPERAND (use_lhs, 0);
+			  tree use_op1 = TREE_OPERAND (use_lhs, 1);
+			  if (TREE_CODE (use_op1) == FIELD_DECL
+			      && DECL_BIT_FIELD_TYPE (use_op1))
+			    {
+			      /* Create new bit field access structure.  */
+			      access = create_and_insert_access
+					 (&bitfield_accesses);
+			      /* Collect access data - load instruction.  */
+			      access->src_bit_size = tree_low_cst
+						      (DECL_SIZE (op1), 1);
+			      access->src_bit_offset = get_bit_offset (op1);
+			      access->src_offset_bytes = get_byte_size (op1);
+			      access->src_addr = op0;
+			      access->load_stmt = gsi_stmt (gsi);
+			      /* Collect access data - store instruction.  */
+			      access->dst_bit_size = tree_low_cst (DECL_SIZE
+								    (use_op1),
+								   1);
+			      access->dst_bit_offset = get_bit_offset
+							 (use_op1);
+			      access->dst_offset_bytes = get_byte_size
+							   (use_op1);
+			      access->dst_addr = use_op0;
+			      access->store_stmt = use_stmt;
+			      add_stmt_access_pair (access, stmt);
+			      add_stmt_access_pair (access, use_stmt);
+			    }
+			}
+		    }
+		}
+	    }
+
+	  /* Insert barrier for merging if statement is function call or memory
+	     access.  */
+	  asdata.stmt = stmt;
+	  slot = htab_find_slot (bitfield_stmt_access_htab, &asdata,
+				 NO_INSERT);
+	  if (!slot && ((gimple_code (stmt) == GIMPLE_CALL)
+	      || (gimple_has_mem_ops (stmt))))
+	    {
+	      /* Create new bit field access structure.  */
+	      access = create_and_insert_access (&bitfield_accesses);
+	      /* Mark it as barrier.  */
+	      access->is_barrier = true;
+	    }
+
+	  gsi_next (&gsi);
+	}
+
+      /* If there are no at least two accesses go to the next basic block.  */
+      if (bitfield_accesses.length () <= 1)
+	{
+	  bitfield_accesses.release ();
+	  continue;
+	}
+
+      /* Find bitfield accesses that can be merged.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  bitfield_access head_access;
+	  bitfield_access mrg_access;
+	  bitfield_access prev_access;
+
+	  if (!bitfield_accesses_merge.exists ())
+	    bitfield_accesses_merge.create (0);
+
+	  bitfield_accesses_merge.safe_push (access);
+
+	  if (!access->is_barrier
+	      && !(access == bitfield_accesses.last ()
+	      && !bitfield_accesses_merge.is_empty ()))
+	    continue;
+
+	  bitfield_accesses_merge.qsort (cmp_access);
+
+	  head_access = NULL;
+	  for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
+	    {
+	      if (head_access
+		  && expressions_equal_p (head_access->src_addr,
+					  mrg_access->src_addr)
+		  && expressions_equal_p (head_access->dst_addr,
+					  mrg_access->dst_addr)
+		  && prev_access->src_offset_bytes
+		     == prev_access->src_offset_bytes
+		  && prev_access->dst_offset_bytes
+		     == prev_access->dst_offset_bytes
+		  && prev_access->src_bit_offset + prev_access->src_bit_size
+		     == mrg_access->src_bit_offset
+		  && prev_access->dst_bit_offset + prev_access->dst_bit_size
+		     == mrg_access->dst_bit_offset)
+		{
+		  /* Merge conditions are satisfied - merge accesses.  */
+		  mrg_access->merged = true;
+		  prev_access->next = mrg_access;
+		  head_access->modified = true;
+		  prev_access = mrg_access;
+		}
+	      else
+		head_access = prev_access = mrg_access;
+	    }
+	  bitfield_accesses_merge.release ();
+	  bitfield_accesses_merge = vNULL;
+	}
+
+      /* Modify generated code.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  if (access->merged)
+	    {
+	      /* Access merged - remove instructions.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	    }
+	  else if (access->modified)
+	    {
+	      /* Access modified - modify generated code.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tree tmp_ssa;
+	      tree itype = make_node (INTEGER_TYPE);
+	      tree new_rhs;
+	      tree new_lhs;
+	      gimple new_stmt;
+
+	      /* Bitfield size changed - modify load statement.  */
+	      access->src_bit_size = get_merged_bit_field_size (access);
+	      TYPE_PRECISION (itype) = access->src_bit_size;
+	      fixup_unsigned_type (itype);
+	      tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);
+	      new_rhs = build3 (BIT_FIELD_REF, itype, access->src_addr,
+				build_int_cst (unsigned_type_node,
+					       access->src_bit_size),
+				build_int_cst (unsigned_type_node,
+					       access->src_bit_offset));
+
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
+	      gsi_remove (&tmp_gsi, true);
+
+	      /* Bitfield size changed - modify store statement.  */
+	      new_lhs = build3 (BIT_FIELD_REF, itype, access->dst_addr,
+				build_int_cst (unsigned_type_node,
+					       access->src_bit_size),
+				build_int_cst (unsigned_type_node,
+					       access->dst_bit_offset));
+
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      gsi_remove (&tmp_gsi, true);
+
+	      cfg_changed = true;
+	    }
+	}
+      /* Empty or delete data structures used for basic block.  */
+      htab_empty (bitfield_stmt_access_htab);
+      bitfield_accesses.release ();
+    }
+
+  if (cfg_changed)
+    todoflags |= TODO_cleanup_cfg;
+
+  return todoflags;
+}
+
+static bool
+gate_bitfield_merge (void)
+{
+  return flag_tree_bitfield_merge;
+}
+
+struct gimple_opt_pass pass_bitfield_merge =
+{
+ {
+  GIMPLE_PASS,
+  "bitfieldmerge",		/* name */
+  OPTGROUP_NONE,                /* optinfo_flags */
+  gate_bitfield_merge,		/* gate */
+  ssa_bitfield_merge,		/* execute */
+  NULL,				/* sub */
+  NULL,				/* next */
+  0,				/* static_pass_number */
+  TV_TREE_BITFIELD_MERGE,	/* tv_id */
+  PROP_cfg | PROP_ssa,		/* properties_required */
+  0,				/* properties_provided */
+  0,				/* properties_destroyed */
+  0,				/* todo_flags_start */
+  TODO_update_ssa
+  | TODO_verify_ssa		/* todo_flags_finish */
+ }
+};
+
+#include "gt-tree-ssa-bitfield-merge.h"


Regards,
Zoran Jovanovic

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-17 20:58 [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside) Zoran Jovanovic
@ 2013-07-17 21:19 ` Joseph S. Myers
  2013-07-17 21:25 ` Andrew Pinski
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Joseph S. Myers @ 2013-07-17 21:19 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches, Petar Jovanovic

[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]

On Wed, 17 Jul 2013, Zoran Jovanovic wrote:

> This patch adds new optimization pass that combines several adjacent bit 
> field accesses that copy values from one memory location to another into 
> single bit field access.

Could you clarify if this works correctly in the presence of unions?  That 
is, if the sequence of bit-fields being read from overlaps with the 
sequence written to (but no individual store involves a write overlapping 
with a read), whether because they are in the same structure or because 
they are in structures appropriately overlaid with unions, the semantics 
of the sequence of loads and stores is preserved (which may not be the 
same as a simple copy)?  There should be comprehensive testcases added to 
the testsuite covering different variations on this issue.

> +Enable bit field merge on trees

"bit-field", see codingconventions.html (this applies to both --help text 
and Texinfo documentation).

> +   Copyright (C) 2004, 2005, 2007, 2008, 2009, 2010, 2011
> +   Free Software Foundation, Inc.

<year>-2013 (all on one line).

> +tree
> +field_type (const_tree decl);
> +
> +bool
> +expressions_equal_p (tree e1, tree e2);
> +
> +HOST_WIDE_INT
> +field_byte_offset (const_tree decl);
> +
> +unsigned HOST_WIDE_INT
> +simple_type_size_in_bits (const_tree type);

Never include such non-static declarations in a .c file; include the 
appropriate header for the declarations instead.  Try to use static 
forward declarations only if needed because of recursion (otherwise 
topologically sort the functions in the source file).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-17 20:58 [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside) Zoran Jovanovic
  2013-07-17 21:19 ` Joseph S. Myers
@ 2013-07-17 21:25 ` Andrew Pinski
  2013-07-18 10:06 ` Richard Biener
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Andrew Pinski @ 2013-07-17 21:25 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches, Petar Jovanovic

On Wed, Jul 17, 2013 at 1:38 PM, Zoran Jovanovic
<Zoran.Jovanovic@imgtec.com> wrote:
> Hello,
> This patch adds new optimization pass that combines several adjacent bit field accesses that copy values from one memory location to another into single bit field access.
>
> Example:
>
> Original code:
>   <unnamed-unsigned:3> D.1351;
>   <unnamed-unsigned:9> D.1350;
>   <unnamed-unsigned:7> D.1349;
>   D.1349_2 = p1_1(D)->f1;
>   p2_3(D)->f1 = D.1349_2;
>   D.1350_4 = p1_1(D)->f2;
>   p2_3(D)->f2 = D.1350_4;
>   D.1351_5 = p1_1(D)->f3;
>   p2_3(D)->f3 = D.1351_5;
>
> Optimized code:
>   <unnamed-unsigned:19> D.1358;
>   D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
>   BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;
>
> Algorithm works on basic block level and consists of following 3 major steps:
> 1. Go trough basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure.
> 2. Identify records that represent adjacent bit field accesses and mark them as merged.
> 3. Modify trees accordingly.
>
> New command line option "-ftree-bitfield-merge" is introduced.
>
> Tested - passed gcc regression tests.


This patch looks like it will fix
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45274 .

Thanks,
Andrew Pinski

>
> Changelog -
>
> gcc/ChangeLog:
> 2013-07-17 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
>   * Makefile.in : Added tree-ssa-bitfield-merge.o to OBJS.
>   * common.opt (ftree-bitfield-merge): New option.
>   * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
>   * dwarf2out.c (field_type): static removed from declaration.
>   (simple_type_size_in_bits): static removed from declaration.
>   (field_byte_offset): static removed from declaration.
>   (field_type): static inline removed from declaration.
>   * passes.c (init_optimization_passes): pass_bitfield_merge pass added.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg.c: New test.
>   * timevar.def : Added TV_TREE_BITFIELD_MERGE.
>   * tree-pass.h : Added pass_bitfield_merge declaration.
>   * tree-ssa-bitfield-merge.c : New file.
>
> Patch -
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index d5121f3..5cdd6eb 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1417,6 +1417,7 @@ OBJS = \
>         tree-ssa-dom.o \
>         tree-ssa-dse.o \
>         tree-ssa-forwprop.o \
> +       tree-ssa-bitfield-merge.o \
>         tree-ssa-ifcombine.o \
>         tree-ssa-live.o \
>         tree-ssa-loop-ch.o \
> @@ -2312,6 +2313,11 @@ tree-ssa-forwprop.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
>     $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
>     langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) \
>     $(OPTABS_H) tree-ssa-propagate.h
> +tree-ssa-bitfield-merge.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) \
> +   coretypes.h $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
> +   $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) \
> +   langhooks.h $(FLAGS_H) $(GIMPLE_H) tree-pretty-print.h \
> +   gimple-pretty-print.h $(EXPR_H)
>  tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
>     $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
>     $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
> @@ -3803,6 +3809,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
>    $(srcdir)/ipa-inline.h \
>    $(srcdir)/asan.c \
>    $(srcdir)/tsan.c \
> +  $(srcdir)/tree-ssa-bitfield-merge.c \
>    @all_gtfiles@
>
>  # Compute the list of GT header files from the corresponding C sources,
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 4c7933e..e0dbc37 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2088,6 +2088,10 @@ ftree-forwprop
>  Common Report Var(flag_tree_forwprop) Init(1) Optimization
>  Enable forward propagation on trees
>
> +ftree-bitfield-merge
> +Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
> +Enable bit field merge on trees
> +
>  ftree-fre
>  Common Report Var(flag_tree_fre) Optimization
>  Enable Full Redundancy Elimination (FRE) on trees
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index dd82880..7b671aa 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -409,7 +409,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
>  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -7582,6 +7582,11 @@ pointer alignment information.
>  This pass only operates on local scalar variables and is enabled by default
>  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
>
> +@item -ftree-bitfield-merge
> +@opindex ftree-bitfield-merge
> +Combines several adjacent bit field accesses that copy values
> +from one memory location to another into single bit field access.
> +
>  @item -ftree-ccp
>  @opindex ftree-ccp
>  Perform sparse conditional constant propagation (CCP) on trees.  This
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index f42ad66..a08eede 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -3100,11 +3100,11 @@ static dw_loc_descr_ref loc_descriptor (rtx, enum machine_mode mode,
>  static dw_loc_list_ref loc_list_from_tree (tree, int);
>  static dw_loc_descr_ref loc_descriptor_from_tree (tree, int);
>  static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
> -static tree field_type (const_tree);
> +tree field_type (const_tree);
>  static unsigned int simple_type_align_in_bits (const_tree);
>  static unsigned int simple_decl_align_in_bits (const_tree);
> -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> -static HOST_WIDE_INT field_byte_offset (const_tree);
> +unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> +HOST_WIDE_INT field_byte_offset (const_tree);
>  static void add_AT_location_description        (dw_die_ref, enum dwarf_attribute,
>                                          dw_loc_list_ref);
>  static void add_data_member_location_attribute (dw_die_ref, tree);
> @@ -10061,7 +10061,7 @@ is_base_type (tree type)
>     else return BITS_PER_WORD if the type actually turns out to be an
>     ERROR_MARK node.  */
>
> -static inline unsigned HOST_WIDE_INT
> +unsigned HOST_WIDE_INT
>  simple_type_size_in_bits (const_tree type)
>  {
>    if (TREE_CODE (type) == ERROR_MARK)
> @@ -14375,7 +14375,7 @@ ceiling (HOST_WIDE_INT value, unsigned int boundary)
>     `integer_type_node' if the given node turns out to be an
>     ERROR_MARK node.  */
>
> -static inline tree
> +tree
>  field_type (const_tree decl)
>  {
>    tree type;
> @@ -14426,7 +14426,7 @@ round_up_to_align (double_int t, unsigned int align)
>     because the offset is actually variable.  (We can't handle the latter case
>     just yet).  */
>
> -static HOST_WIDE_INT
> +HOST_WIDE_INT
>  field_byte_offset (const_tree decl)
>  {
>    double_int object_offset_in_bits;
> diff --git a/gcc/passes.c b/gcc/passes.c
> index c8b03ee..3149adc 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -1325,6 +1325,7 @@ init_optimization_passes (void)
>           NEXT_PASS (pass_remove_cgraph_callee_edges);
>           NEXT_PASS (pass_rename_ssa_copies);
>           NEXT_PASS (pass_ccp);
> +         NEXT_PASS (pass_bitfield_merge);
>           /* After CCP we rewrite no longer addressed locals into SSA
>              form if possible.  */
>           NEXT_PASS (pass_forwprop);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c
> new file mode 100644
> index 0000000..9be6bc9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-bitfieldmerge" }  */
> +
> +struct S
> +{
> +  unsigned f1:7;
> +  unsigned f2:9;
> +  unsigned f3:3;
> +  unsigned f4:5;
> +  unsigned f5:1;
> +  unsigned f6:2;
> +};
> +
> +unsigned
> +foo (struct S *p1, struct S *p2, int *ptr)
> +{
> +  p2->f1 = p1->f1;
> +  p2->f2 = p1->f2;
> +  p2->f3 = p1->f3;
> +  *ptr = 7;
> +  p2->f4 = p1->f4;
> +  p2->f5 = p1->f5;
> +  p2->f6 = p1->f6;
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "19" "bitfieldmerge" } } */
> +/* { dg-final { scan-tree-dump "8" "bitfieldmerge"} } */
> +/* { dg-final { cleanup-tree-dump "bitfieldmerge" } } */
> diff --git a/gcc/timevar.def b/gcc/timevar.def
> index 44f0eac..d9b1c23 100644
> --- a/gcc/timevar.def
> +++ b/gcc/timevar.def
> @@ -149,6 +149,7 @@ DEFTIMEVAR (TV_TREE_FRE                  , "tree FRE")
>  DEFTIMEVAR (TV_TREE_SINK             , "tree code sinking")
>  DEFTIMEVAR (TV_TREE_PHIOPT          , "tree linearize phis")
>  DEFTIMEVAR (TV_TREE_FORWPROP        , "tree forward propagate")
> +DEFTIMEVAR (TV_TREE_BITFIELD_MERGE   , "tree bitfield merge")
>  DEFTIMEVAR (TV_TREE_PHIPROP         , "tree phiprop")
>  DEFTIMEVAR (TV_TREE_DCE                     , "tree conservative DCE")
>  DEFTIMEVAR (TV_TREE_CD_DCE          , "tree aggressive DCE")
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index b8c59a7..59ca028 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -337,6 +337,7 @@ extern struct gimple_opt_pass pass_warn_function_noreturn;
>  extern struct gimple_opt_pass pass_cselim;
>  extern struct gimple_opt_pass pass_phiopt;
>  extern struct gimple_opt_pass pass_forwprop;
> +extern struct gimple_opt_pass pass_bitfield_merge;
>  extern struct gimple_opt_pass pass_phiprop;
>  extern struct gimple_opt_pass pass_tree_ifcombine;
>  extern struct gimple_opt_pass pass_dse;
> diff --git a/gcc/tree-ssa-bitfield-merge.c b/gcc/tree-ssa-bitfield-merge.c
> new file mode 100755
> index 0000000..71f41b3
> --- /dev/null
> +++ b/gcc/tree-ssa-bitfield-merge.c
> @@ -0,0 +1,503 @@
> +/* Forward propagation of expressions for single use variables.
> +   Copyright (C) 2004, 2005, 2007, 2008, 2009, 2010, 2011
> +   Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tm.h"
> +#include "tree.h"
> +#include "tm_p.h"
> +#include "basic-block.h"
> +#include "timevar.h"
> +#include "gimple-pretty-print.h"
> +#include "tree-flow.h"
> +#include "tree-pass.h"
> +#include "tree-dump.h"
> +#include "langhooks.h"
> +#include "flags.h"
> +#include "gimple.h"
> +#include "expr.h"
> +#include "ggc.h"
> +
> +tree
> +field_type (const_tree decl);
> +
> +bool
> +expressions_equal_p (tree e1, tree e2);
> +
> +HOST_WIDE_INT
> +field_byte_offset (const_tree decl);
> +
> +unsigned HOST_WIDE_INT
> +simple_type_size_in_bits (const_tree type);
> +
> +/* This pass combines several adjacent bit field accesses that copy values
> +   from one memory location to another into single bit field access.  */
> +
> +/* Data for single bit field read/write sequence.  */
> +struct GTY (()) bitfield_access_d {
> +  gimple load_stmt;              /* Bit field load statement.  */
> +  gimple store_stmt;             /* Bit field store statement.  */
> +  unsigned src_offset_bytes;     /* Bit field offset at src in bytes.  */
> +  unsigned src_bit_offset;       /* Bit field offset inside source word.  */
> +  unsigned src_bit_size;         /* Size of bit field in source word.  */
> +  unsigned dst_offset_bytes;     /* Bit field offset at dst in bytes.  */
> +  unsigned dst_bit_offset;       /* Bit field offset inside destination
> +                                    word.  */
> +  unsigned dst_bit_size;         /* Size of bit field in destination word.  */
> +  tree src_addr;                 /* Address of source memory access.  */
> +  tree dst_addr;                 /* Address of destination memory access.  */
> +  bool merged;                   /* True if access is merged with another
> +                                    one.  */
> +  bool modified;                 /* True if bitfield size is modified.  */
> +  bool is_barrier;               /* True if access is barrier (call or mem
> +                                    access).  */
> +  struct bitfield_access_d *next; /* Access with which this one is merged.  */
> +};
> +
> +typedef struct bitfield_access_d bitfield_access_o;
> +typedef struct bitfield_access_d *bitfield_access;
> +
> +/* Connecting register with bit field access sequence that defines value in
> +   that register.  */
> +struct GTY (()) bitfield_stmt_access_pair_d
> +{
> +  gimple stmt;
> +  bitfield_access access;
> +};
> +
> +typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
> +typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
> +
> +static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
> +  htab_t bitfield_stmt_access_htab;
> +
> +/* Hash table callbacks for bitfield_stmt_access_htab.  */
> +
> +static hashval_t
> +bitfield_stmt_access_pair_htab_hash (const void *p)
> +{
> +  const struct bitfield_stmt_access_pair_d *entry =
> +    (const struct bitfield_stmt_access_pair_d *)p;
> +  return (hashval_t) (uintptr_t) entry->stmt;
> +}
> +
> +static int
> +bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
> +{
> +  const struct bitfield_stmt_access_pair_d *entry1 =
> +    (const struct bitfield_stmt_access_pair_d *)p1;
> +  const struct bitfield_stmt_access_pair_d *entry2 =
> +    (const struct bitfield_stmt_access_pair_d *)p2;
> +  return entry1->stmt == entry2->stmt;
> +}
> +
> +
> +static bool cfg_changed;
> +
> +/* Compare two bit field access records.  */
> +
> +static int
> +cmp_access (const void *p1, const void *p2)
> +{
> +  const bitfield_access a1 = (*(const bitfield_access*)p1);
> +  const bitfield_access a2 = (*(const bitfield_access*)p2);
> +
> +  if (!expressions_equal_p (a1->src_addr, a1->src_addr))
> +    return a1 - a2;
> +
> +  if (!expressions_equal_p (a1->dst_addr, a1->dst_addr))
> +    return a1 - a2;
> +
> +  return a1->src_bit_offset - a2->src_bit_offset;
> +}
> +
> +/* Create new bit field access structure and add it to given bitfield_accesses
> +   htab.  */
> +static bitfield_access
> +create_and_insert_access (vec<bitfield_access>
> +                      *bitfield_accesses)
> +{
> +  bitfield_access access = ggc_alloc_bitfield_access_d ();
> +  memset (access, 0, sizeof (struct bitfield_access_d));
> +  bitfield_accesses->safe_push (access);
> +  return access;
> +}
> +
> +/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
> +static inline HOST_WIDE_INT
> +get_bit_offset (tree decl)
> +{
> +  HOST_WIDE_INT object_offset_in_bytes = field_byte_offset (decl);
> +  tree type = DECL_BIT_FIELD_TYPE (decl);
> +  HOST_WIDE_INT bitpos_int;
> +  HOST_WIDE_INT highest_order_object_bit_offset;
> +  HOST_WIDE_INT highest_order_field_bit_offset;
> +  HOST_WIDE_INT bit_offset;
> +
> +  /* Must be a field and a bit field.  */
> +  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
> +  if (! host_integerp (bit_position (decl), 0)
> +      || ! host_integerp (DECL_SIZE (decl), 1))
> +    return -1;
> +
> +  bitpos_int = int_bit_position (decl);
> +
> +  /* Note that the bit offset is always the distance (in bits) from the
> +     highest-order bit of the "containing object" to the highest-order bit of
> +     the bit-field itself.  Since the "high-order end" of any object or field
> +     is different on big-endian and little-endian machines, the computation
> +     below must take account of these differences.  */
> +  highest_order_object_bit_offset = object_offset_in_bytes * BITS_PER_UNIT;
> +  highest_order_field_bit_offset = bitpos_int;
> +
> +  if (! BYTES_BIG_ENDIAN)
> +    {
> +      highest_order_field_bit_offset += tree_low_cst (DECL_SIZE (decl), 0);
> +      highest_order_object_bit_offset += simple_type_size_in_bits (type);
> +    }
> +
> +  bit_offset
> +    = (! BYTES_BIG_ENDIAN
> +       ? highest_order_object_bit_offset - highest_order_field_bit_offset
> +       : highest_order_field_bit_offset - highest_order_object_bit_offset);
> +
> +  return bit_offset;
> +}
> +
> +/* Slightly modified add_byte_size_attribute from dwarf2out.c.  */
> +static inline HOST_WIDE_INT
> +get_byte_size (tree tree_node)
> +{
> +  unsigned size;
> +
> +  switch (TREE_CODE (tree_node))
> +    {
> +    case ERROR_MARK:
> +      size = 0;
> +      break;
> +    case ENUMERAL_TYPE:
> +    case RECORD_TYPE:
> +    case UNION_TYPE:
> +    case QUAL_UNION_TYPE:
> +      size = int_size_in_bytes (tree_node);
> +      break;
> +    case FIELD_DECL:
> +      /* For a data member of a struct or union, the DW_AT_byte_size is
> +        generally given as the number of bytes normally allocated for an
> +        object of the *declared* type of the member itself.  This is true
> +        even for bit-fields.  */
> +      size = simple_type_size_in_bits (field_type (tree_node)) / BITS_PER_UNIT;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  return size;
> +}
> +
> +/* Returns size of combined bitfields.  */
> +static int
> +get_merged_bit_field_size (bitfield_access access)
> +{
> +  bitfield_access tmp_access = access;
> +  int size = 0;
> +
> +  while (tmp_access)
> +  {
> +    size += tmp_access->src_bit_size;
> +    tmp_access = tmp_access->next;
> +  }
> +  return size;
> +}
> +
> +/* Adds new pair consisting of statement and bit field access structure that
> +   contains it.  */
> +static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
> +{
> +  bitfield_stmt_access_pair new_pair;
> +  void **slot;
> +  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
> +  new_pair->stmt = stmt;
> +  new_pair->access = access;
> +  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
> +  if (*slot == HTAB_EMPTY_ENTRY)
> +    {
> +      *slot = new_pair;
> +      return true;
> +    }
> +  return false;
> +}
> +
> +/* Main entry point for the bit field merge optimization.  */
> +static unsigned int
> +ssa_bitfield_merge (void)
> +{
> +  basic_block bb;
> +  unsigned int todoflags = 0;
> +  vec<bitfield_access> bitfield_accesses;
> +  int ix, iy;
> +  bitfield_access access;
> +
> +  cfg_changed = false;
> +
> +  FOR_EACH_BB (bb)
> +    {
> +      gimple_stmt_iterator gsi;
> +      vec<bitfield_access> bitfield_accesses_merge = vNULL;
> +      bitfield_accesses.create (0);
> +
> +      bitfield_stmt_access_htab
> +       = htab_create_ggc (128,
> +                        bitfield_stmt_access_pair_htab_hash,
> +                        bitfield_stmt_access_pair_htab_eq,
> +                        NULL);
> +
> +      /* Identify all bitfield copy sequences in the basic-block.  */
> +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> +       {
> +         gimple stmt = gsi_stmt (gsi);
> +         tree lhs, rhs;
> +         void **slot;
> +         struct bitfield_stmt_access_pair_d asdata;
> +
> +         if (!is_gimple_assign (stmt))
> +           {
> +             gsi_next (&gsi);
> +             continue;
> +           }
> +
> +         lhs = gimple_assign_lhs (stmt);
> +         rhs = gimple_assign_rhs1 (stmt);
> +
> +         if (TREE_CODE (rhs) == COMPONENT_REF)
> +           {
> +             use_operand_p use;
> +             gimple use_stmt;
> +             tree op0 = TREE_OPERAND (rhs, 0);
> +             tree op1 = TREE_OPERAND (rhs, 1);
> +
> +             if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1))
> +               {
> +                 if (single_imm_use (lhs, &use, &use_stmt)
> +                      && is_gimple_assign (use_stmt))
> +                   {
> +                     tree use_lhs = gimple_assign_lhs (use_stmt);
> +                     if (TREE_CODE (use_lhs) == COMPONENT_REF)
> +                       {
> +                         tree use_op0 = TREE_OPERAND (use_lhs, 0);
> +                         tree use_op1 = TREE_OPERAND (use_lhs, 1);
> +                         if (TREE_CODE (use_op1) == FIELD_DECL
> +                             && DECL_BIT_FIELD_TYPE (use_op1))
> +                           {
> +                             /* Create new bit field access structure.  */
> +                             access = create_and_insert_access
> +                                        (&bitfield_accesses);
> +                             /* Collect access data - load instruction.  */
> +                             access->src_bit_size = tree_low_cst
> +                                                     (DECL_SIZE (op1), 1);
> +                             access->src_bit_offset = get_bit_offset (op1);
> +                             access->src_offset_bytes = get_byte_size (op1);
> +                             access->src_addr = op0;
> +                             access->load_stmt = gsi_stmt (gsi);
> +                             /* Collect access data - store instruction.  */
> +                             access->dst_bit_size = tree_low_cst (DECL_SIZE
> +                                                                   (use_op1),
> +                                                                  1);
> +                             access->dst_bit_offset = get_bit_offset
> +                                                        (use_op1);
> +                             access->dst_offset_bytes = get_byte_size
> +                                                          (use_op1);
> +                             access->dst_addr = use_op0;
> +                             access->store_stmt = use_stmt;
> +                             add_stmt_access_pair (access, stmt);
> +                             add_stmt_access_pair (access, use_stmt);
> +                           }
> +                       }
> +                   }
> +               }
> +           }
> +
> +         /* Insert barrier for merging if statement is function call or memory
> +            access.  */
> +         asdata.stmt = stmt;
> +         slot = htab_find_slot (bitfield_stmt_access_htab, &asdata,
> +                                NO_INSERT);
> +         if (!slot && ((gimple_code (stmt) == GIMPLE_CALL)
> +             || (gimple_has_mem_ops (stmt))))
> +           {
> +             /* Create new bit field access structure.  */
> +             access = create_and_insert_access (&bitfield_accesses);
> +             /* Mark it as barrier.  */
> +             access->is_barrier = true;
> +           }
> +
> +         gsi_next (&gsi);
> +       }
> +
> +      /* If there are no at least two accesses go to the next basic block.  */
> +      if (bitfield_accesses.length () <= 1)
> +       {
> +         bitfield_accesses.release ();
> +         continue;
> +       }
> +
> +      /* Find bitfield accesses that can be merged.  */
> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +       {
> +         bitfield_access head_access;
> +         bitfield_access mrg_access;
> +         bitfield_access prev_access;
> +
> +         if (!bitfield_accesses_merge.exists ())
> +           bitfield_accesses_merge.create (0);
> +
> +         bitfield_accesses_merge.safe_push (access);
> +
> +         if (!access->is_barrier
> +             && !(access == bitfield_accesses.last ()
> +             && !bitfield_accesses_merge.is_empty ()))
> +           continue;
> +
> +         bitfield_accesses_merge.qsort (cmp_access);
> +
> +         head_access = NULL;
> +         for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
> +           {
> +             if (head_access
> +                 && expressions_equal_p (head_access->src_addr,
> +                                         mrg_access->src_addr)
> +                 && expressions_equal_p (head_access->dst_addr,
> +                                         mrg_access->dst_addr)
> +                 && prev_access->src_offset_bytes
> +                    == prev_access->src_offset_bytes
> +                 && prev_access->dst_offset_bytes
> +                    == prev_access->dst_offset_bytes
> +                 && prev_access->src_bit_offset + prev_access->src_bit_size
> +                    == mrg_access->src_bit_offset
> +                 && prev_access->dst_bit_offset + prev_access->dst_bit_size
> +                    == mrg_access->dst_bit_offset)
> +               {
> +                 /* Merge conditions are satisfied - merge accesses.  */
> +                 mrg_access->merged = true;
> +                 prev_access->next = mrg_access;
> +                 head_access->modified = true;
> +                 prev_access = mrg_access;
> +               }
> +             else
> +               head_access = prev_access = mrg_access;
> +           }
> +         bitfield_accesses_merge.release ();
> +         bitfield_accesses_merge = vNULL;
> +       }
> +
> +      /* Modify generated code.  */
> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +       {
> +         if (access->merged)
> +           {
> +             /* Access merged - remove instructions.  */
> +             gimple_stmt_iterator tmp_gsi;
> +             tmp_gsi = gsi_for_stmt (access->load_stmt);
> +             gsi_remove (&tmp_gsi, true);
> +             tmp_gsi = gsi_for_stmt (access->store_stmt);
> +             gsi_remove (&tmp_gsi, true);
> +           }
> +         else if (access->modified)
> +           {
> +             /* Access modified - modify generated code.  */
> +             gimple_stmt_iterator tmp_gsi;
> +             tree tmp_ssa;
> +             tree itype = make_node (INTEGER_TYPE);
> +             tree new_rhs;
> +             tree new_lhs;
> +             gimple new_stmt;
> +
> +             /* Bitfield size changed - modify load statement.  */
> +             access->src_bit_size = get_merged_bit_field_size (access);
> +             TYPE_PRECISION (itype) = access->src_bit_size;
> +             fixup_unsigned_type (itype);
> +             tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);
> +             new_rhs = build3 (BIT_FIELD_REF, itype, access->src_addr,
> +                               build_int_cst (unsigned_type_node,
> +                                              access->src_bit_size),
> +                               build_int_cst (unsigned_type_node,
> +                                              access->src_bit_offset));
> +
> +             tmp_gsi = gsi_for_stmt (access->load_stmt);
> +             new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
> +             gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +             SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
> +             gsi_remove (&tmp_gsi, true);
> +
> +             /* Bitfield size changed - modify store statement.  */
> +             new_lhs = build3 (BIT_FIELD_REF, itype, access->dst_addr,
> +                               build_int_cst (unsigned_type_node,
> +                                              access->src_bit_size),
> +                               build_int_cst (unsigned_type_node,
> +                                              access->dst_bit_offset));
> +
> +             tmp_gsi = gsi_for_stmt (access->store_stmt);
> +             new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
> +             gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +             gsi_remove (&tmp_gsi, true);
> +
> +             cfg_changed = true;
> +           }
> +       }
> +      /* Empty or delete data structures used for basic block.  */
> +      htab_empty (bitfield_stmt_access_htab);
> +      bitfield_accesses.release ();
> +    }
> +
> +  if (cfg_changed)
> +    todoflags |= TODO_cleanup_cfg;
> +
> +  return todoflags;
> +}
> +
> +static bool
> +gate_bitfield_merge (void)
> +{
> +  return flag_tree_bitfield_merge;
> +}
> +
> +struct gimple_opt_pass pass_bitfield_merge =
> +{
> + {
> +  GIMPLE_PASS,
> +  "bitfieldmerge",             /* name */
> +  OPTGROUP_NONE,                /* optinfo_flags */
> +  gate_bitfield_merge,         /* gate */
> +  ssa_bitfield_merge,          /* execute */
> +  NULL,                                /* sub */
> +  NULL,                                /* next */
> +  0,                           /* static_pass_number */
> +  TV_TREE_BITFIELD_MERGE,      /* tv_id */
> +  PROP_cfg | PROP_ssa,         /* properties_required */
> +  0,                           /* properties_provided */
> +  0,                           /* properties_destroyed */
> +  0,                           /* todo_flags_start */
> +  TODO_update_ssa
> +  | TODO_verify_ssa            /* todo_flags_finish */
> + }
> +};
> +
> +#include "gt-tree-ssa-bitfield-merge.h"
>
>
> Regards,
> Zoran Jovanovic
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-17 20:58 [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside) Zoran Jovanovic
  2013-07-17 21:19 ` Joseph S. Myers
  2013-07-17 21:25 ` Andrew Pinski
@ 2013-07-18 10:06 ` Richard Biener
  2013-07-30 15:02   ` Zoran Jovanovic
  2013-07-18 18:14 ` Cary Coutant
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2013-07-18 10:06 UTC (permalink / raw)
  To: Zoran Jovanovic, gcc-patches; +Cc: Petar Jovanovic

Zoran Jovanovic <Zoran.Jovanovic@imgtec.com> wrote:

>Hello,
>This patch adds new optimization pass that combines several adjacent
>bit field accesses that copy values from one memory location to another
>into single bit field access.
>
>Example:
>
>Original code:
>  <unnamed-unsigned:3> D.1351;
>  <unnamed-unsigned:9> D.1350;
>  <unnamed-unsigned:7> D.1349;
>  D.1349_2 = p1_1(D)->f1;
>  p2_3(D)->f1 = D.1349_2;
>  D.1350_4 = p1_1(D)->f2;
>  p2_3(D)->f2 = D.1350_4;
>  D.1351_5 = p1_1(D)->f3;
>  p2_3(D)->f3 = D.1351_5;
>
>Optimized code:
>  <unnamed-unsigned:19> D.1358;
>  D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
>  BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;
>
>Algorithm works on basic block level and consists of following 3 major
>steps:
>1. Go trough basic block statements list. If there are statement pairs
>that implement copy of bit field content from one memory location to
>another record statements pointers and other necessary data in
>corresponding data structure.
>2. Identify records that represent adjacent bit field accesses and mark
>them as merged.
>3. Modify trees accordingly.

All this should use BITFIELD_REPRESENTATIVE both to decide what accesses are related and for the lowering. This makes sure to honor the appropriate memory models.

In theory only lowering is necessary and FRE and DSE will do the job of optimizing - also properly accounting for alias issues that Joseph mentions. The lowering and analysis is strongly related to SRA So I don't believe we want a new pass for this.

Richard.

>New command line option "-ftree-bitfield-merge" is introduced.
>
>Tested - passed gcc regression tests.
>
>Changelog -
>
>gcc/ChangeLog:
>2013-07-17 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
>  * Makefile.in : Added tree-ssa-bitfield-merge.o to OBJS.
>  * common.opt (ftree-bitfield-merge): New option.
>  * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
>  * dwarf2out.c (field_type): static removed from declaration.
>  (simple_type_size_in_bits): static removed from declaration.
>  (field_byte_offset): static removed from declaration.
>  (field_type): static inline removed from declaration.
>  * passes.c (init_optimization_passes): pass_bitfield_merge pass
>added.
>  * testsuite/gcc.dg/tree-ssa/bitfldmrg.c: New test.
>  * timevar.def : Added TV_TREE_BITFIELD_MERGE.
>  * tree-pass.h : Added pass_bitfield_merge declaration.
>  * tree-ssa-bitfield-merge.c : New file.
>
>Patch -
>
>diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>index d5121f3..5cdd6eb 100644
>--- a/gcc/Makefile.in
>+++ b/gcc/Makefile.in
>@@ -1417,6 +1417,7 @@ OBJS = \
> 	tree-ssa-dom.o \
> 	tree-ssa-dse.o \
> 	tree-ssa-forwprop.o \
>+	tree-ssa-bitfield-merge.o \
> 	tree-ssa-ifcombine.o \
> 	tree-ssa-live.o \
> 	tree-ssa-loop-ch.o \
>@@ -2312,6 +2313,11 @@ tree-ssa-forwprop.o : tree-ssa-forwprop.c
>$(CONFIG_H) $(SYSTEM_H) coretypes.h \
>    $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
>    langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H)
>$(EXPR_H) \
>    $(OPTABS_H) tree-ssa-propagate.h
>+tree-ssa-bitfield-merge.o : tree-ssa-forwprop.c $(CONFIG_H)
>$(SYSTEM_H) \
>+   coretypes.h $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
>+   $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) $(DIAGNOSTIC_H)
>$(TIMEVAR_H) \
>+   langhooks.h $(FLAGS_H) $(GIMPLE_H) tree-pretty-print.h \
>+   gimple-pretty-print.h $(EXPR_H)
> tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H)
>coretypes.h \
>    $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \
>    $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
>@@ -3803,6 +3809,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h
>$(srcdir)/coretypes.h \
>   $(srcdir)/ipa-inline.h \
>   $(srcdir)/asan.c \
>   $(srcdir)/tsan.c \
>+  $(srcdir)/tree-ssa-bitfield-merge.c \
>   @all_gtfiles@
> 
> # Compute the list of GT header files from the corresponding C
>sources,
>diff --git a/gcc/common.opt b/gcc/common.opt
>index 4c7933e..e0dbc37 100644
>--- a/gcc/common.opt
>+++ b/gcc/common.opt
>@@ -2088,6 +2088,10 @@ ftree-forwprop
> Common Report Var(flag_tree_forwprop) Init(1) Optimization
> Enable forward propagation on trees
> 
>+ftree-bitfield-merge
>+Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
>+Enable bit field merge on trees
>+
> ftree-fre
> Common Report Var(flag_tree_fre) Optimization
> Enable Full Redundancy Elimination (FRE) on trees
>diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>index dd82880..7b671aa 100644
>--- a/gcc/doc/invoke.texi
>+++ b/gcc/doc/invoke.texi
>@@ -409,7 +409,7 @@ Objective-C and Objective-C++ Dialects}.
> -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
> -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
>--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>+-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch
>@gol
> -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
>@@ -7582,6 +7582,11 @@ pointer alignment information.
> This pass only operates on local scalar variables and is enabled by
>default
> at @option{-O} and higher.  It requires that @option{-ftree-ccp} is
>enabled.
> 
>+@item -ftree-bitfield-merge
>+@opindex ftree-bitfield-merge
>+Combines several adjacent bit field accesses that copy values
>+from one memory location to another into single bit field access.
>+
> @item -ftree-ccp
> @opindex ftree-ccp
> Perform sparse conditional constant propagation (CCP) on trees.  This
>diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
>index f42ad66..a08eede 100644
>--- a/gcc/dwarf2out.c
>+++ b/gcc/dwarf2out.c
>@@ -3100,11 +3100,11 @@ static dw_loc_descr_ref loc_descriptor (rtx,
>enum machine_mode mode,
> static dw_loc_list_ref loc_list_from_tree (tree, int);
> static dw_loc_descr_ref loc_descriptor_from_tree (tree, int);
> static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
>-static tree field_type (const_tree);
>+tree field_type (const_tree);
> static unsigned int simple_type_align_in_bits (const_tree);
> static unsigned int simple_decl_align_in_bits (const_tree);
>-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
>-static HOST_WIDE_INT field_byte_offset (const_tree);
>+unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
>+HOST_WIDE_INT field_byte_offset (const_tree);
> static void add_AT_location_description	(dw_die_ref, enum
>dwarf_attribute,
> 					 dw_loc_list_ref);
> static void add_data_member_location_attribute (dw_die_ref, tree);
>@@ -10061,7 +10061,7 @@ is_base_type (tree type)
>    else return BITS_PER_WORD if the type actually turns out to be an
>    ERROR_MARK node.  */
> 
>-static inline unsigned HOST_WIDE_INT
>+unsigned HOST_WIDE_INT
> simple_type_size_in_bits (const_tree type)
> {
>   if (TREE_CODE (type) == ERROR_MARK)
>@@ -14375,7 +14375,7 @@ ceiling (HOST_WIDE_INT value, unsigned int
>boundary)
>    `integer_type_node' if the given node turns out to be an
>    ERROR_MARK node.  */
> 
>-static inline tree
>+tree
> field_type (const_tree decl)
> {
>   tree type;
>@@ -14426,7 +14426,7 @@ round_up_to_align (double_int t, unsigned int
>align)
>    because the offset is actually variable.  (We can't handle the
>latter case
>    just yet).  */
> 
>-static HOST_WIDE_INT
>+HOST_WIDE_INT
> field_byte_offset (const_tree decl)
> {
>   double_int object_offset_in_bits;
>diff --git a/gcc/passes.c b/gcc/passes.c
>index c8b03ee..3149adc 100644
>--- a/gcc/passes.c
>+++ b/gcc/passes.c
>@@ -1325,6 +1325,7 @@ init_optimization_passes (void)
> 	  NEXT_PASS (pass_remove_cgraph_callee_edges);
> 	  NEXT_PASS (pass_rename_ssa_copies);
> 	  NEXT_PASS (pass_ccp);
>+	  NEXT_PASS (pass_bitfield_merge);
> 	  /* After CCP we rewrite no longer addressed locals into SSA
> 	     form if possible.  */
> 	  NEXT_PASS (pass_forwprop);
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c
>b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c
>new file mode 100644
>index 0000000..9be6bc9
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg.c
>@@ -0,0 +1,29 @@
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-bitfieldmerge"
>}  */
>+
>+struct S
>+{
>+  unsigned f1:7;
>+  unsigned f2:9;
>+  unsigned f3:3;
>+  unsigned f4:5;
>+  unsigned f5:1;
>+  unsigned f6:2;
>+};
>+
>+unsigned
>+foo (struct S *p1, struct S *p2, int *ptr)
>+{
>+  p2->f1 = p1->f1;
>+  p2->f2 = p1->f2;
>+  p2->f3 = p1->f3;
>+  *ptr = 7;
>+  p2->f4 = p1->f4;
>+  p2->f5 = p1->f5;
>+  p2->f6 = p1->f6;
>+  return 0;
>+}
>+
>+/* { dg-final { scan-tree-dump "19" "bitfieldmerge" } } */
>+/* { dg-final { scan-tree-dump "8" "bitfieldmerge"} } */
>+/* { dg-final { cleanup-tree-dump "bitfieldmerge" } } */
>diff --git a/gcc/timevar.def b/gcc/timevar.def
>index 44f0eac..d9b1c23 100644
>--- a/gcc/timevar.def
>+++ b/gcc/timevar.def
>@@ -149,6 +149,7 @@ DEFTIMEVAR (TV_TREE_FRE		     , "tree FRE")
> DEFTIMEVAR (TV_TREE_SINK             , "tree code sinking")
> DEFTIMEVAR (TV_TREE_PHIOPT	     , "tree linearize phis")
> DEFTIMEVAR (TV_TREE_FORWPROP	     , "tree forward propagate")
>+DEFTIMEVAR (TV_TREE_BITFIELD_MERGE   , "tree bitfield merge")
> DEFTIMEVAR (TV_TREE_PHIPROP	     , "tree phiprop")
> DEFTIMEVAR (TV_TREE_DCE		     , "tree conservative DCE")
> DEFTIMEVAR (TV_TREE_CD_DCE	     , "tree aggressive DCE")
>diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
>index b8c59a7..59ca028 100644
>--- a/gcc/tree-pass.h
>+++ b/gcc/tree-pass.h
>@@ -337,6 +337,7 @@ extern struct gimple_opt_pass
>pass_warn_function_noreturn;
> extern struct gimple_opt_pass pass_cselim;
> extern struct gimple_opt_pass pass_phiopt;
> extern struct gimple_opt_pass pass_forwprop;
>+extern struct gimple_opt_pass pass_bitfield_merge;
> extern struct gimple_opt_pass pass_phiprop;
> extern struct gimple_opt_pass pass_tree_ifcombine;
> extern struct gimple_opt_pass pass_dse;
>diff --git a/gcc/tree-ssa-bitfield-merge.c
>b/gcc/tree-ssa-bitfield-merge.c
>new file mode 100755
>index 0000000..71f41b3
>--- /dev/null
>+++ b/gcc/tree-ssa-bitfield-merge.c
>@@ -0,0 +1,503 @@
>+/* Forward propagation of expressions for single use variables.
>+   Copyright (C) 2004, 2005, 2007, 2008, 2009, 2010, 2011
>+   Free Software Foundation, Inc.
>+
>+This file is part of GCC.
>+
>+GCC is free software; you can redistribute it and/or modify
>+it under the terms of the GNU General Public License as published by
>+the Free Software Foundation; either version 3, or (at your option)
>+any later version.
>+
>+GCC is distributed in the hope that it will be useful,
>+but WITHOUT ANY WARRANTY; without even the implied warranty of
>+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+GNU General Public License for more details.
>+
>+You should have received a copy of the GNU General Public License
>+along with GCC; see the file COPYING3.  If not see
>+<http://www.gnu.org/licenses/>.  */
>+
>+#include "config.h"
>+#include "system.h"
>+#include "coretypes.h"
>+#include "tm.h"
>+#include "tree.h"
>+#include "tm_p.h"
>+#include "basic-block.h"
>+#include "timevar.h"
>+#include "gimple-pretty-print.h"
>+#include "tree-flow.h"
>+#include "tree-pass.h"
>+#include "tree-dump.h"
>+#include "langhooks.h"
>+#include "flags.h"
>+#include "gimple.h"
>+#include "expr.h"
>+#include "ggc.h"
>+
>+tree
>+field_type (const_tree decl);
>+
>+bool
>+expressions_equal_p (tree e1, tree e2);
>+
>+HOST_WIDE_INT
>+field_byte_offset (const_tree decl);
>+
>+unsigned HOST_WIDE_INT
>+simple_type_size_in_bits (const_tree type);
>+
>+/* This pass combines several adjacent bit field accesses that copy
>values
>+   from one memory location to another into single bit field access.
> */
>+
>+/* Data for single bit field read/write sequence.  */
>+struct GTY (()) bitfield_access_d {
>+  gimple load_stmt;		  /* Bit field load statement.  */
>+  gimple store_stmt;		  /* Bit field store statement.  */
>+  unsigned src_offset_bytes;	  /* Bit field offset at src in bytes.
> */
>+  unsigned src_bit_offset;	  /* Bit field offset inside source word.
> */
>+  unsigned src_bit_size;	  /* Size of bit field in source word.  */
>+  unsigned dst_offset_bytes;	  /* Bit field offset at dst in bytes.
> */
>+  unsigned dst_bit_offset;	  /* Bit field offset inside destination
>+				     word.  */
>+  unsigned dst_bit_size;	  /* Size of bit field in destination word.
> */
>+  tree src_addr;		  /* Address of source memory access.  */
>+  tree dst_addr;		  /* Address of destination memory access.  */
>+  bool merged;			  /* True if access is merged with another
>+				     one.  */
>+  bool modified;		  /* True if bitfield size is modified.  */
>+  bool is_barrier;		  /* True if access is barrier (call or mem
>+				     access).  */
>+  struct bitfield_access_d *next; /* Access with which this one is
>merged.  */
>+};
>+
>+typedef struct bitfield_access_d bitfield_access_o;
>+typedef struct bitfield_access_d *bitfield_access;
>+
>+/* Connecting register with bit field access sequence that defines
>value in
>+   that register.  */
>+struct GTY (()) bitfield_stmt_access_pair_d
>+{
>+  gimple stmt;
>+  bitfield_access access;
>+};
>+
>+typedef struct bitfield_stmt_access_pair_d
>bitfield_stmt_access_pair_o;
>+typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
>+
>+static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
>+  htab_t bitfield_stmt_access_htab;
>+
>+/* Hash table callbacks for bitfield_stmt_access_htab.  */
>+
>+static hashval_t
>+bitfield_stmt_access_pair_htab_hash (const void *p)
>+{
>+  const struct bitfield_stmt_access_pair_d *entry =
>+    (const struct bitfield_stmt_access_pair_d *)p;
>+  return (hashval_t) (uintptr_t) entry->stmt;
>+}
>+
>+static int
>+bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
>+{
>+  const struct bitfield_stmt_access_pair_d *entry1 =
>+    (const struct bitfield_stmt_access_pair_d *)p1;
>+  const struct bitfield_stmt_access_pair_d *entry2 =
>+    (const struct bitfield_stmt_access_pair_d *)p2;
>+  return entry1->stmt == entry2->stmt;
>+}
>+
>+
>+static bool cfg_changed;
>+
>+/* Compare two bit field access records.  */
>+
>+static int
>+cmp_access (const void *p1, const void *p2)
>+{
>+  const bitfield_access a1 = (*(const bitfield_access*)p1);
>+  const bitfield_access a2 = (*(const bitfield_access*)p2);
>+
>+  if (!expressions_equal_p (a1->src_addr, a1->src_addr))
>+    return a1 - a2;
>+
>+  if (!expressions_equal_p (a1->dst_addr, a1->dst_addr))
>+    return a1 - a2;
>+
>+  return a1->src_bit_offset - a2->src_bit_offset;
>+}
>+
>+/* Create new bit field access structure and add it to given
>bitfield_accesses
>+   htab.  */
>+static bitfield_access
>+create_and_insert_access (vec<bitfield_access>
>+		       *bitfield_accesses)
>+{
>+  bitfield_access access = ggc_alloc_bitfield_access_d ();
>+  memset (access, 0, sizeof (struct bitfield_access_d));
>+  bitfield_accesses->safe_push (access);
>+  return access;
>+}
>+
>+/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
>+static inline HOST_WIDE_INT
>+get_bit_offset (tree decl)
>+{
>+  HOST_WIDE_INT object_offset_in_bytes = field_byte_offset (decl);
>+  tree type = DECL_BIT_FIELD_TYPE (decl);
>+  HOST_WIDE_INT bitpos_int;
>+  HOST_WIDE_INT highest_order_object_bit_offset;
>+  HOST_WIDE_INT highest_order_field_bit_offset;
>+  HOST_WIDE_INT bit_offset;
>+
>+  /* Must be a field and a bit field.  */
>+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
>+  if (! host_integerp (bit_position (decl), 0)
>+      || ! host_integerp (DECL_SIZE (decl), 1))
>+    return -1;
>+
>+  bitpos_int = int_bit_position (decl);
>+
>+  /* Note that the bit offset is always the distance (in bits) from
>the
>+     highest-order bit of the "containing object" to the highest-order
>bit of
>+     the bit-field itself.  Since the "high-order end" of any object
>or field
>+     is different on big-endian and little-endian machines, the
>computation
>+     below must take account of these differences.  */
>+  highest_order_object_bit_offset = object_offset_in_bytes *
>BITS_PER_UNIT;
>+  highest_order_field_bit_offset = bitpos_int;
>+
>+  if (! BYTES_BIG_ENDIAN)
>+    {
>+      highest_order_field_bit_offset += tree_low_cst (DECL_SIZE
>(decl), 0);
>+      highest_order_object_bit_offset += simple_type_size_in_bits
>(type);
>+    }
>+
>+  bit_offset
>+    = (! BYTES_BIG_ENDIAN
>+       ? highest_order_object_bit_offset -
>highest_order_field_bit_offset
>+       : highest_order_field_bit_offset -
>highest_order_object_bit_offset);
>+
>+  return bit_offset;
>+}
>+
>+/* Slightly modified add_byte_size_attribute from dwarf2out.c.  */
>+static inline HOST_WIDE_INT
>+get_byte_size (tree tree_node)
>+{
>+  unsigned size;
>+
>+  switch (TREE_CODE (tree_node))
>+    {
>+    case ERROR_MARK:
>+      size = 0;
>+      break;
>+    case ENUMERAL_TYPE:
>+    case RECORD_TYPE:
>+    case UNION_TYPE:
>+    case QUAL_UNION_TYPE:
>+      size = int_size_in_bytes (tree_node);
>+      break;
>+    case FIELD_DECL:
>+      /* For a data member of a struct or union, the DW_AT_byte_size
>is
>+	 generally given as the number of bytes normally allocated for an
>+	 object of the *declared* type of the member itself.  This is true
>+	 even for bit-fields.  */
>+      size = simple_type_size_in_bits (field_type (tree_node)) /
>BITS_PER_UNIT;
>+      break;
>+    default:
>+      gcc_unreachable ();
>+    }
>+
>+  return size;
>+}
>+
>+/* Returns size of combined bitfields.  */
>+static int
>+get_merged_bit_field_size (bitfield_access access)
>+{
>+  bitfield_access tmp_access = access;
>+  int size = 0;
>+
>+  while (tmp_access)
>+  {
>+    size += tmp_access->src_bit_size;
>+    tmp_access = tmp_access->next;
>+  }
>+  return size;
>+}
>+
>+/* Adds new pair consisting of statement and bit field access
>structure that
>+   contains it.  */
>+static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
>+{
>+  bitfield_stmt_access_pair new_pair;
>+  void **slot;
>+  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
>+  new_pair->stmt = stmt;
>+  new_pair->access = access;
>+  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
>+  if (*slot == HTAB_EMPTY_ENTRY)
>+    {
>+      *slot = new_pair;
>+      return true;
>+    }
>+  return false;
>+}
>+
>+/* Main entry point for the bit field merge optimization.  */
>+static unsigned int
>+ssa_bitfield_merge (void)
>+{
>+  basic_block bb;
>+  unsigned int todoflags = 0;
>+  vec<bitfield_access> bitfield_accesses;
>+  int ix, iy;
>+  bitfield_access access;
>+
>+  cfg_changed = false;
>+
>+  FOR_EACH_BB (bb)
>+    {
>+      gimple_stmt_iterator gsi;
>+      vec<bitfield_access> bitfield_accesses_merge = vNULL;
>+      bitfield_accesses.create (0);
>+
>+      bitfield_stmt_access_htab
>+	= htab_create_ggc (128,
>+			 bitfield_stmt_access_pair_htab_hash,
>+			 bitfield_stmt_access_pair_htab_eq,
>+			 NULL);
>+
>+      /* Identify all bitfield copy sequences in the basic-block.  */
>+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
>+	{
>+	  gimple stmt = gsi_stmt (gsi);
>+	  tree lhs, rhs;
>+	  void **slot;
>+	  struct bitfield_stmt_access_pair_d asdata;
>+
>+	  if (!is_gimple_assign (stmt))
>+	    {
>+	      gsi_next (&gsi);
>+	      continue;
>+	    }
>+
>+	  lhs = gimple_assign_lhs (stmt);
>+	  rhs = gimple_assign_rhs1 (stmt);
>+
>+	  if (TREE_CODE (rhs) == COMPONENT_REF)
>+	    {
>+	      use_operand_p use;
>+	      gimple use_stmt;
>+	      tree op0 = TREE_OPERAND (rhs, 0);
>+	      tree op1 = TREE_OPERAND (rhs, 1);
>+
>+	      if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1))
>+		{
>+		  if (single_imm_use (lhs, &use, &use_stmt)
>+		       && is_gimple_assign (use_stmt))
>+		    {
>+		      tree use_lhs = gimple_assign_lhs (use_stmt);
>+		      if (TREE_CODE (use_lhs) == COMPONENT_REF)
>+			{
>+			  tree use_op0 = TREE_OPERAND (use_lhs, 0);
>+			  tree use_op1 = TREE_OPERAND (use_lhs, 1);
>+			  if (TREE_CODE (use_op1) == FIELD_DECL
>+			      && DECL_BIT_FIELD_TYPE (use_op1))
>+			    {
>+			      /* Create new bit field access structure.  */
>+			      access = create_and_insert_access
>+					 (&bitfield_accesses);
>+			      /* Collect access data - load instruction.  */
>+			      access->src_bit_size = tree_low_cst
>+						      (DECL_SIZE (op1), 1);
>+			      access->src_bit_offset = get_bit_offset (op1);
>+			      access->src_offset_bytes = get_byte_size (op1);
>+			      access->src_addr = op0;
>+			      access->load_stmt = gsi_stmt (gsi);
>+			      /* Collect access data - store instruction.  */
>+			      access->dst_bit_size = tree_low_cst (DECL_SIZE
>+								    (use_op1),
>+								   1);
>+			      access->dst_bit_offset = get_bit_offset
>+							 (use_op1);
>+			      access->dst_offset_bytes = get_byte_size
>+							   (use_op1);
>+			      access->dst_addr = use_op0;
>+			      access->store_stmt = use_stmt;
>+			      add_stmt_access_pair (access, stmt);
>+			      add_stmt_access_pair (access, use_stmt);
>+			    }
>+			}
>+		    }
>+		}
>+	    }
>+
>+	  /* Insert barrier for merging if statement is function call or
>memory
>+	     access.  */
>+	  asdata.stmt = stmt;
>+	  slot = htab_find_slot (bitfield_stmt_access_htab, &asdata,
>+				 NO_INSERT);
>+	  if (!slot && ((gimple_code (stmt) == GIMPLE_CALL)
>+	      || (gimple_has_mem_ops (stmt))))
>+	    {
>+	      /* Create new bit field access structure.  */
>+	      access = create_and_insert_access (&bitfield_accesses);
>+	      /* Mark it as barrier.  */
>+	      access->is_barrier = true;
>+	    }
>+
>+	  gsi_next (&gsi);
>+	}
>+
>+      /* If there are no at least two accesses go to the next basic
>block.  */
>+      if (bitfield_accesses.length () <= 1)
>+	{
>+	  bitfield_accesses.release ();
>+	  continue;
>+	}
>+
>+      /* Find bitfield accesses that can be merged.  */
>+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
>+	{
>+	  bitfield_access head_access;
>+	  bitfield_access mrg_access;
>+	  bitfield_access prev_access;
>+
>+	  if (!bitfield_accesses_merge.exists ())
>+	    bitfield_accesses_merge.create (0);
>+
>+	  bitfield_accesses_merge.safe_push (access);
>+
>+	  if (!access->is_barrier
>+	      && !(access == bitfield_accesses.last ()
>+	      && !bitfield_accesses_merge.is_empty ()))
>+	    continue;
>+
>+	  bitfield_accesses_merge.qsort (cmp_access);
>+
>+	  head_access = NULL;
>+	  for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access);
>iy++)
>+	    {
>+	      if (head_access
>+		  && expressions_equal_p (head_access->src_addr,
>+					  mrg_access->src_addr)
>+		  && expressions_equal_p (head_access->dst_addr,
>+					  mrg_access->dst_addr)
>+		  && prev_access->src_offset_bytes
>+		     == prev_access->src_offset_bytes
>+		  && prev_access->dst_offset_bytes
>+		     == prev_access->dst_offset_bytes
>+		  && prev_access->src_bit_offset + prev_access->src_bit_size
>+		     == mrg_access->src_bit_offset
>+		  && prev_access->dst_bit_offset + prev_access->dst_bit_size
>+		     == mrg_access->dst_bit_offset)
>+		{
>+		  /* Merge conditions are satisfied - merge accesses.  */
>+		  mrg_access->merged = true;
>+		  prev_access->next = mrg_access;
>+		  head_access->modified = true;
>+		  prev_access = mrg_access;
>+		}
>+	      else
>+		head_access = prev_access = mrg_access;
>+	    }
>+	  bitfield_accesses_merge.release ();
>+	  bitfield_accesses_merge = vNULL;
>+	}
>+
>+      /* Modify generated code.  */
>+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
>+	{
>+	  if (access->merged)
>+	    {
>+	      /* Access merged - remove instructions.  */
>+	      gimple_stmt_iterator tmp_gsi;
>+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
>+	      gsi_remove (&tmp_gsi, true);
>+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
>+	      gsi_remove (&tmp_gsi, true);
>+	    }
>+	  else if (access->modified)
>+	    {
>+	      /* Access modified - modify generated code.  */
>+	      gimple_stmt_iterator tmp_gsi;
>+	      tree tmp_ssa;
>+	      tree itype = make_node (INTEGER_TYPE);
>+	      tree new_rhs;
>+	      tree new_lhs;
>+	      gimple new_stmt;
>+
>+	      /* Bitfield size changed - modify load statement.  */
>+	      access->src_bit_size = get_merged_bit_field_size (access);
>+	      TYPE_PRECISION (itype) = access->src_bit_size;
>+	      fixup_unsigned_type (itype);
>+	      tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);
>+	      new_rhs = build3 (BIT_FIELD_REF, itype, access->src_addr,
>+				build_int_cst (unsigned_type_node,
>+					       access->src_bit_size),
>+				build_int_cst (unsigned_type_node,
>+					       access->src_bit_offset));
>+
>+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
>+	      new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
>+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
>+	      SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
>+	      gsi_remove (&tmp_gsi, true);
>+
>+	      /* Bitfield size changed - modify store statement.  */
>+	      new_lhs = build3 (BIT_FIELD_REF, itype, access->dst_addr,
>+				build_int_cst (unsigned_type_node,
>+					       access->src_bit_size),
>+				build_int_cst (unsigned_type_node,
>+					       access->dst_bit_offset));
>+
>+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
>+	      new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
>+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
>+	      gsi_remove (&tmp_gsi, true);
>+
>+	      cfg_changed = true;
>+	    }
>+	}
>+      /* Empty or delete data structures used for basic block.  */
>+      htab_empty (bitfield_stmt_access_htab);
>+      bitfield_accesses.release ();
>+    }
>+
>+  if (cfg_changed)
>+    todoflags |= TODO_cleanup_cfg;
>+
>+  return todoflags;
>+}
>+
>+static bool
>+gate_bitfield_merge (void)
>+{
>+  return flag_tree_bitfield_merge;
>+}
>+
>+struct gimple_opt_pass pass_bitfield_merge =
>+{
>+ {
>+  GIMPLE_PASS,
>+  "bitfieldmerge",		/* name */
>+  OPTGROUP_NONE,                /* optinfo_flags */
>+  gate_bitfield_merge,		/* gate */
>+  ssa_bitfield_merge,		/* execute */
>+  NULL,				/* sub */
>+  NULL,				/* next */
>+  0,				/* static_pass_number */
>+  TV_TREE_BITFIELD_MERGE,	/* tv_id */
>+  PROP_cfg | PROP_ssa,		/* properties_required */
>+  0,				/* properties_provided */
>+  0,				/* properties_destroyed */
>+  0,				/* todo_flags_start */
>+  TODO_update_ssa
>+  | TODO_verify_ssa		/* todo_flags_finish */
>+ }
>+};
>+
>+#include "gt-tree-ssa-bitfield-merge.h"
>
>
>Regards,
>Zoran Jovanovic


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-17 20:58 [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside) Zoran Jovanovic
                   ` (2 preceding siblings ...)
  2013-07-18 10:06 ` Richard Biener
@ 2013-07-18 18:14 ` Cary Coutant
  2013-07-18 19:47 ` Hans-Peter Nilsson
  2013-08-23 14:25 ` Zoran Jovanovic
  5 siblings, 0 replies; 17+ messages in thread
From: Cary Coutant @ 2013-07-18 18:14 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches, Petar Jovanovic

>   * dwarf2out.c (field_type): static removed from declaration.
>   (simple_type_size_in_bits): static removed from declaration.
>   (field_byte_offset): static removed from declaration.
>   (field_type): static inline removed from declaration.

If you're going to use these declarations from
tree-ssa-bitfield-merge.c, it would be better to move the declarations
into dwarf2out.h, and include that file from
tree-ssa-bitfield-merge.c. Even better would be to move these routines
(which today are in dwarf2out.c simply because that was the only file
that needed them) to a more appropriate location. I'd suggest
tree.h/tree.c, but tree.c is already way too big -- does someone have
a better suggestion?

-cary

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-17 20:58 [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside) Zoran Jovanovic
                   ` (3 preceding siblings ...)
  2013-07-18 18:14 ` Cary Coutant
@ 2013-07-18 19:47 ` Hans-Peter Nilsson
  2013-08-23 14:25 ` Zoran Jovanovic
  5 siblings, 0 replies; 17+ messages in thread
From: Hans-Peter Nilsson @ 2013-07-18 19:47 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches, Petar Jovanovic

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1502 bytes --]

On Wed, 17 Jul 2013, Zoran Jovanovic wrote:
> Hello,
> This patch adds new optimization pass that combines several adjacent bit field accesses that copy values from one memory location to another into single bit field access.
>
> Example:
>
> Original code:
>   <unnamed-unsigned:3> D.1351;
>   <unnamed-unsigned:9> D.1350;
>   <unnamed-unsigned:7> D.1349;
>   D.1349_2 = p1_1(D)->f1;
>   p2_3(D)->f1 = D.1349_2;
>   D.1350_4 = p1_1(D)->f2;
>   p2_3(D)->f2 = D.1350_4;
>   D.1351_5 = p1_1(D)->f3;
>   p2_3(D)->f3 = D.1351_5;
>
> Optimized code:
>   <unnamed-unsigned:19> D.1358;
>   D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
>   BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;
>
> Algorithm works on basic block level and consists of following 3 major steps:
> 1. Go trough basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure.
> 2. Identify records that represent adjacent bit field accesses and mark them as merged.

I see noone else asked, so: How are volatile bitfields handled
or accesses to bitfields in structures declared volatile?

(A quick grep found no match for "volatil" in the patch; maybe
they're rejected by other means, but better double-check.  This
pass must not widen, narrow or combine such accesses and this
may or may not depend on the language standard.)

> 3. Modify trees accordingly.

brgds, H-P

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-18 10:06 ` Richard Biener
@ 2013-07-30 15:02   ` Zoran Jovanovic
  2013-08-27 11:33     ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Zoran Jovanovic @ 2013-07-30 15:02 UTC (permalink / raw)
  To: Richard Biener, gcc-patches; +Cc: Petar Jovanovic

Thank you for the reply.
I am in the process of modifying the patch according to some comments received.
Currently I am considering the usage of DECL_BIT_FIELD_REPRESENTATIVE.
I see that they can be used during analysis phase for deciding which accesses can be merged - only accesses with same representative will be merged.
I have more dilemmas with usage of representatives for lowering. If my understanding is correct bit field representative can only be associated to a field declaration, and not to a BIT_FIELD_REF node. As a consequence optimization must use COMPONENT_REF to model new bit field access (which should be an equivalent of several merged accesses). To use COMPONENT_REF a new field declaration with appropriate bit size (equal to sum of bit sizes of all merged bit field accesses) must be created and then corresponding bit field representative could be attached.
Is my understanding correct? Is creating a new field declaration for every set of merged bit field accesses acceptable?

Regards,
Zoran Jovanovic

________________________________________
From: Richard Biener [richard.guenther@gmail.com]
Sent: Thursday, July 18, 2013 11:31 AM
To: Zoran Jovanovic; gcc-patches@gcc.gnu.org
Cc: Petar Jovanovic
Subject: Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)

Zoran Jovanovic <Zoran.Jovanovic@imgtec.com> wrote:

>Hello,
>This patch adds new optimization pass that combines several adjacent
>bit field accesses that copy values from one memory location to another
>into single bit field access.
>
>Example:
>
>Original code:
>  <unnamed-unsigned:3> D.1351;
>  <unnamed-unsigned:9> D.1350;
>  <unnamed-unsigned:7> D.1349;
>  D.1349_2 = p1_1(D)->f1;
>  p2_3(D)->f1 = D.1349_2;
>  D.1350_4 = p1_1(D)->f2;
>  p2_3(D)->f2 = D.1350_4;
>  D.1351_5 = p1_1(D)->f3;
>  p2_3(D)->f3 = D.1351_5;
>
>Optimized code:
>  <unnamed-unsigned:19> D.1358;
>  D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
>  BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;
>
>Algorithm works on basic block level and consists of following 3 major
>steps:
>1. Go trough basic block statements list. If there are statement pairs
>that implement copy of bit field content from one memory location to
>another record statements pointers and other necessary data in
>corresponding data structure.
>2. Identify records that represent adjacent bit field accesses and mark
>them as merged.
>3. Modify trees accordingly.

All this should use BITFIELD_REPRESENTATIVE both to decide what accesses are related and for the lowering. This makes sure to honor the appropriate memory models.

In theory only lowering is necessary and FRE and DSE will do the job of optimizing - also properly accounting for alias issues that Joseph mentions. The lowering and analysis is strongly related to SRA So I don't believe we want a new pass for this.

Richard.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-17 20:58 [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside) Zoran Jovanovic
                   ` (4 preceding siblings ...)
  2013-07-18 19:47 ` Hans-Peter Nilsson
@ 2013-08-23 14:25 ` Zoran Jovanovic
  2013-08-23 22:19   ` Joseph S. Myers
                     ` (2 more replies)
  5 siblings, 3 replies; 17+ messages in thread
From: Zoran Jovanovic @ 2013-08-23 14:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: Petar Jovanovic

Hello,
This is new patch version. 
Optimization does not use BIT_FIELD_REF any more, instead it generates new COMPONENT_REF and FIELD_DECL.
Existing Bit field representative is associated with newly created field declaration.
During analysis phase optimization uses bit field representatives when deciding which bit-fields accesses can be merged.
Instead of having separate pass optimization is moved to tree-sra.c and executed with sra early.
New test case involving unions is added.
Also, some other comments received on first patch are applied in new implementation.


Example:

Original code:
  <unnamed-unsigned:3> D.1351;
  <unnamed-unsigned:9> D.1350;
  <unnamed-unsigned:7> D.1349;
  D.1349_2 = p1_1(D)->f1;
  p2_3(D)->f1 = D.1349_2;
  D.1350_4 = p1_1(D)->f2;
  p2_3(D)->f2 = D.1350_4;
  D.1351_5 = p1_1(D)->f3;
  p2_3(D)->f3 = D.1351_5;

Optimized code:
  <unnamed-unsigned:19> D.1358;
  _16 = pr1_2(D)->_field0;
  pr2_4(D)->_field0 = _16;
  
Algorithm works on basic block level and consists of following 3 major steps:
1. Go trough basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure.
2. Identify records that represent adjacent bit field accesses and mark them as merged.
3. Modify trees accordingly.

New command line option "-ftree-bitfield-merge" is introduced.

Tested - passed gcc regression tests.

Changelog -

gcc/ChangeLog:
2013-08-22 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
  * Makefile.in : Added tree-sra.c to GTFILES.
  * common.opt (ftree-bitfield-merge): New option.
  * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
  * tree-sra.c (ssa_bitfield_merge): New function.
  Entry for (-ftree-bitfield-merge).
  (bitfield_stmt_access_pair_htab_hash): New function.
  (bitfield_stmt_access_pair_htab_eq): New function.
  (cmp_access): New function.
  (create_and_insert_access): New function.
  (get_bit_offset): New function.
  (get_merged_bit_field_size): New function.
  (add_stmt_access_pair): New function.
  * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.
  (field_byte_offset): declaration moved to tree.h, static removed.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.
  * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): moved from dwarf2out.c.
  * tree.h (expressions_equal_p): declaration added.
  (field_byte_offset): declaration added.

Patch -

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6034046..dad9337 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3831,6 +3831,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/vtable-verify.c \
   $(srcdir)/asan.c \
   $(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
+  $(srcdir)/tree-sra.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/common.opt b/gcc/common.opt
index 9082280..fe0ecd9 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2160,6 +2160,10 @@ ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+ftree-bitfield-merge
+Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
+Enable bit-field merge on trees
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dae7605..7abe538 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,7 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
+-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
@@ -7646,6 +7646,11 @@ pointer alignment information.
 This pass only operates on local scalar variables and is enabled by default
 at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
 
+@item -ftree-bitfield-merge
+@opindex ftree-bitfield-merge
+Combines several adjacent bit-field accesses that copy values
+from one memory location to another into single bit-field access.
+
 @item -ftree-ccp
 @opindex ftree-ccp
 Perform sparse conditional constant propagation (CCP) on trees.  This
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index fc1c3f2..f3530ef 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3104,8 +3104,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
 static tree field_type (const_tree);
 static unsigned int simple_type_align_in_bits (const_tree);
 static unsigned int simple_decl_align_in_bits (const_tree);
-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
-static HOST_WIDE_INT field_byte_offset (const_tree);
 static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 					 dw_loc_list_ref);
 static void add_data_member_location_attribute (dw_die_ref, tree);
@@ -10145,25 +10143,6 @@ is_base_type (tree type)
   return 0;
 }
 
-/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
-   node, return the size in bits for the type if it is a constant, or else
-   return the alignment for the type if the type's size is not constant, or
-   else return BITS_PER_WORD if the type actually turns out to be an
-   ERROR_MARK node.  */
-
-static inline unsigned HOST_WIDE_INT
-simple_type_size_in_bits (const_tree type)
-{
-  if (TREE_CODE (type) == ERROR_MARK)
-    return BITS_PER_WORD;
-  else if (TYPE_SIZE (type) == NULL_TREE)
-    return 0;
-  else if (host_integerp (TYPE_SIZE (type), 1))
-    return tree_low_cst (TYPE_SIZE (type), 1);
-  else
-    return TYPE_ALIGN (type);
-}
-
 /* Similarly, but return a double_int instead of UHWI.  */
 
 static inline double_int
@@ -14516,7 +14495,7 @@ round_up_to_align (double_int t, unsigned int align)
    because the offset is actually variable.  (We can't handle the latter case
    just yet).  */
 
-static HOST_WIDE_INT
+HOST_WIDE_INT
 field_byte_offset (const_tree decl)
 {
   double_int object_offset_in_bits;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
new file mode 100644
index 0000000..e9e96b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
+
+struct S
+{
+  unsigned f1:7;
+  unsigned f2:9;
+  unsigned f3:3;
+  unsigned f4:5;
+  unsigned f5:1;
+  unsigned f6:2;
+};
+
+unsigned
+foo (struct S *p1, struct S *p2, int *ptr)
+{
+  p2->f1 = p1->f1;
+  p2->f2 = p1->f2;
+  p2->f3 = p1->f3;
+  *ptr = 7;
+  p2->f4 = p1->f4;
+  p2->f5 = p1->f5;
+  p2->f6 = p1->f6;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "19" "esra" } } */
+/* { dg-final { scan-tree-dump "8" "esra"} } */
+/* { dg-final { cleanup-tree-dump "esra" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
new file mode 100644
index 0000000..c056a30
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
+
+struct proba1
+{
+  unsigned f1:7;
+  unsigned f2:5;
+  unsigned f3:1;
+  unsigned f4:7;
+};
+
+
+struct proba2
+{
+  unsigned f0:1;
+  unsigned f1:7;
+  unsigned f2:5;
+  unsigned f3:1;
+  unsigned f4:7;
+};
+
+union proba_un
+{
+  struct proba1 a;
+  struct proba2 b;
+};
+
+
+int func (union proba_un *pr1, union proba_un *pr2)
+{
+  pr2->a.f1 = pr1->a.f1;
+  pr2->a.f2 = pr1->a.f2;
+  pr2->a.f3 = pr1->a.f3;
+  pr2->b.f2 = pr1->b.f2;
+  pr2->a.f4 = pr1->a.f4;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "13" "esra" } } */
+/* { dg-final { scan-tree-dump "7" "esra"} } */
+/* { dg-final { cleanup-tree-dump "esra" } } */
+
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
old mode 100644
new mode 100755
index 8e3bb81..cf31463
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "gimple-pretty-print.h"
 #include "ipa-inline.h"
+#include "ggc.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -3424,12 +3425,469 @@ perform_intra_sra (void)
   return ret;
 }
 
+/* This optimization combines several adjacent bit-field accesses that copy
+   values from one memory location to another into single bit-field access.  */
+
+/* Data for single bit-field read/write sequence.  */
+struct GTY (()) bitfield_access_d {
+  gimple load_stmt;		  /* Bit-field load statement.  */
+  gimple store_stmt;		  /* Bit-field store statement.  */
+  unsigned src_offset_words;	  /* Bit-field offset at src in words.  */
+  unsigned src_bit_offset;	  /* Bit-field offset inside source word.  */
+  unsigned src_bit_size;	  /* Size of bit-field in source word.  */
+  unsigned dst_offset_words;	  /* Bit-field offset at dst in words.  */
+  unsigned dst_bit_offset;	  /* Bit-field offset inside destination
+				     word.  */
+  unsigned dst_bit_size;	  /* Size of bit-field in destination word.  */
+  tree src_addr;		  /* Address of source memory access.  */
+  tree dst_addr;		  /* Address of destination memory access.  */
+  bool merged;			  /* True if access is merged with another
+				     one.  */
+  bool modified;		  /* True if bit-field size is modified.  */
+  bool is_barrier;		  /* True if access is barrier (call or mem
+				     access).  */
+  struct bitfield_access_d *next; /* Access with which this one is merged.  */
+  tree bitfield_type;		  /* Field type.  */
+  tree bitfield_representative;	  /* Bit field representative of original
+				     declaration.  */
+  tree field_decl_context;	  /* Context of original bit-field
+				     declaration.  */
+};
+
+typedef struct bitfield_access_d bitfield_access_o;
+typedef struct bitfield_access_d *bitfield_access;
+
+/* Connecting register with bit-field access sequence that defines value in
+   that register.  */
+struct GTY (()) bitfield_stmt_access_pair_d
+{
+  gimple stmt;
+  bitfield_access access;
+};
+
+typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
+typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
+
+static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
+  htab_t bitfield_stmt_access_htab;
+
+/* Hash table callbacks for bitfield_stmt_access_htab.  */
+
+static hashval_t
+bitfield_stmt_access_pair_htab_hash (const void *p)
+{
+  const struct bitfield_stmt_access_pair_d *entry =
+    (const struct bitfield_stmt_access_pair_d *)p;
+  return (hashval_t) (uintptr_t) entry->stmt;
+}
+
+static int
+bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
+{
+  const struct bitfield_stmt_access_pair_d *entry1 =
+    (const struct bitfield_stmt_access_pair_d *)p1;
+  const struct bitfield_stmt_access_pair_d *entry2 =
+    (const struct bitfield_stmt_access_pair_d *)p2;
+  return entry1->stmt == entry2->stmt;
+}
+
+/* Counter used for generating unique names for new fields.  */
+static unsigned new_field_no;
+
+/* Compare two bit-field access records.  */
+
+static int
+cmp_access (const void *p1, const void *p2)
+{
+  const bitfield_access a1 = (*(const bitfield_access*)p1);
+  const bitfield_access a2 = (*(const bitfield_access*)p2);
+
+  if (a1->bitfield_representative - a2->bitfield_representative)
+    return a1->bitfield_representative - a2->bitfield_representative;
+
+  if (!expressions_equal_p (a1->src_addr, a1->src_addr))
+    return a1 - a2;
+
+  if (!expressions_equal_p (a1->dst_addr, a1->dst_addr))
+    return a1 - a2;
+
+  if (a1->src_offset_words - a2->src_offset_words)
+    return a1->src_offset_words - a2->src_offset_words;
+
+  return a1->src_bit_offset - a2->src_bit_offset;
+}
+
+/* Create new bit-field access structure and add it to given bitfield_accesses
+   htab.  */
+
+static bitfield_access
+create_and_insert_access (vec<bitfield_access>
+		       *bitfield_accesses)
+{
+  bitfield_access access = ggc_alloc_bitfield_access_d ();
+  memset (access, 0, sizeof (struct bitfield_access_d));
+  bitfield_accesses->safe_push (access);
+  return access;
+}
+
+/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
+
+static inline HOST_WIDE_INT
+get_bit_offset (tree decl)
+{
+  HOST_WIDE_INT object_offset_in_bytes = field_byte_offset (decl);
+  tree type = DECL_BIT_FIELD_TYPE (decl);
+  HOST_WIDE_INT bitpos_int;
+  HOST_WIDE_INT highest_order_object_bit_offset;
+  HOST_WIDE_INT highest_order_field_bit_offset;
+  HOST_WIDE_INT bit_offset;
+
+  /* Must be a field and a bit-field.  */
+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
+  if (! host_integerp (bit_position (decl), 0)
+      || ! host_integerp (DECL_SIZE (decl), 1))
+    return -1;
+
+  bitpos_int = int_bit_position (decl);
+
+  /* Note that the bit offset is always the distance (in bits) from the
+     highest-order bit of the "containing object" to the highest-order bit of
+     the bit-field itself.  Since the "high-order end" of any object or field
+     is different on big-endian and little-endian machines, the computation
+     below must take account of these differences.  */
+  highest_order_object_bit_offset = object_offset_in_bytes * BITS_PER_UNIT;
+  highest_order_field_bit_offset = bitpos_int;
+
+  if (! BYTES_BIG_ENDIAN)
+    {
+      highest_order_field_bit_offset += tree_low_cst (DECL_SIZE (decl), 0);
+      highest_order_object_bit_offset += simple_type_size_in_bits (type);
+    }
+
+  bit_offset
+    = (! BYTES_BIG_ENDIAN
+       ? highest_order_object_bit_offset - highest_order_field_bit_offset
+       : highest_order_field_bit_offset - highest_order_object_bit_offset);
+
+  return bit_offset;
+}
+
+/* Returns size of combined bitfields.  */
+
+static int
+get_merged_bit_field_size (bitfield_access access)
+{
+  bitfield_access tmp_access = access;
+  int size = 0;
+
+  while (tmp_access)
+  {
+    size += tmp_access->src_bit_size;
+    tmp_access = tmp_access->next;
+  }
+  return size;
+}
+
+/* Adds new pair consisting of statement and bit-field access structure that
+   contains it.  */
+
+static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
+{
+  bitfield_stmt_access_pair new_pair;
+  void **slot;
+  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
+  new_pair->stmt = stmt;
+  new_pair->access = access;
+  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
+  if (*slot == HTAB_EMPTY_ENTRY)
+    {
+      *slot = new_pair;
+      return true;
+    }
+  return false;
+}
+
+/* Main entry point for the bit-field merge optimization.  */
+
+static unsigned int
+ssa_bitfield_merge (void)
+{
+  basic_block bb;
+  unsigned int todoflags = 0;
+  vec<bitfield_access> bitfield_accesses;
+  int ix, iy;
+  bitfield_access access;
+  bool cfg_changed = false;
+
+  /* In the strict volatile bitfields case, doing code changes here may prevent
+     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
+  if (flag_strict_volatile_bitfields> 0)
+    return 0;
+
+  FOR_EACH_BB (bb)
+    {
+      gimple_stmt_iterator gsi;
+      vec<bitfield_access> bitfield_accesses_merge = vNULL;
+      tree prev_representative = NULL_TREE;
+      bitfield_accesses.create (0);
+
+      bitfield_stmt_access_htab
+	= htab_create_ggc (128, bitfield_stmt_access_pair_htab_hash,
+			   bitfield_stmt_access_pair_htab_eq, NULL);
+
+      /* Identify all bitfield copy sequences in the basic-block.  */
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  tree lhs, rhs;
+	  void **slot;
+	  struct bitfield_stmt_access_pair_d asdata;
+
+	  if (!is_gimple_assign (stmt))
+	    {
+	      gsi_next (&gsi);
+	      continue;
+	    }
+
+	  lhs = gimple_assign_lhs (stmt);
+	  rhs = gimple_assign_rhs1 (stmt);
+
+	  if (TREE_CODE (rhs) == COMPONENT_REF)
+	    {
+	      use_operand_p use;
+	      gimple use_stmt;
+	      tree op0 = TREE_OPERAND (rhs, 0);
+	      tree op1 = TREE_OPERAND (rhs, 1);
+
+	      if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1)
+		  && !TREE_THIS_VOLATILE (op1))
+		{
+		  if (single_imm_use (lhs, &use, &use_stmt)
+		       && is_gimple_assign (use_stmt))
+		    {
+		      tree use_lhs = gimple_assign_lhs (use_stmt);
+		      if (TREE_CODE (use_lhs) == COMPONENT_REF)
+			{
+			  tree use_op0 = TREE_OPERAND (use_lhs, 0);
+			  tree use_op1 = TREE_OPERAND (use_lhs, 1);
+			  tree tmp_repr = DECL_BIT_FIELD_REPRESENTATIVE (op1);
+			  if (TREE_CODE (use_op1) == FIELD_DECL
+			      && DECL_BIT_FIELD_TYPE (use_op1)
+			      && !TREE_THIS_VOLATILE (use_op1))
+			    {
+			      if (prev_representative
+				  && (prev_representative != tmp_repr))
+				{
+				  /* If previous access has different
+				     representative then barrier is needed
+				     between it and new access.  */
+				  access = create_and_insert_access
+					     (&bitfield_accesses);
+				  access->is_barrier = true;
+				}
+			      prev_representative = tmp_repr;
+			      /* Create new bit-field access structure.  */
+			      access = create_and_insert_access
+					 (&bitfield_accesses);
+			      /* Collect access data - load instruction.  */
+			      access->src_bit_size = tree_low_cst
+						      (DECL_SIZE (op1), 1);
+			      access->src_bit_offset = get_bit_offset (op1);
+			      access->src_offset_words =
+				field_byte_offset (op1) / UNITS_PER_WORD;
+			      access->src_addr = op0;
+			      access->load_stmt = gsi_stmt (gsi);
+			      /* Collect access data - store instruction.  */
+			      access->dst_bit_size =
+				tree_low_cst (DECL_SIZE (use_op1), 1);
+			      access->dst_bit_offset =
+				get_bit_offset (use_op1);
+			      access->dst_offset_words =
+				field_byte_offset (use_op1) / UNITS_PER_WORD;
+			      access->dst_addr = use_op0;
+			      access->store_stmt = use_stmt;
+			      add_stmt_access_pair (access, stmt);
+			      add_stmt_access_pair (access, use_stmt);
+			      access->bitfield_type
+				= DECL_BIT_FIELD_TYPE (use_op1);
+			      access->bitfield_representative = tmp_repr;
+			      access->field_decl_context =
+				DECL_FIELD_CONTEXT (op1);
+			    }
+			}
+		    }
+		}
+	    }
+
+	  /* Insert barrier for merging if statement is function call or memory
+	     access.  */
+	  asdata.stmt = stmt;
+	  slot
+	    = htab_find_slot (bitfield_stmt_access_htab, &asdata, NO_INSERT);
+	  if (!slot
+	      && ((gimple_code (stmt) == GIMPLE_CALL)
+		  || (gimple_has_mem_ops (stmt))))
+	    {
+	      /* Create new bit-field access structure.  */
+	      access = create_and_insert_access (&bitfield_accesses);
+	      /* Mark it as barrier.  */
+	      access->is_barrier = true;
+	    }
+
+	  gsi_next (&gsi);
+	}
+
+      /* If there are no at least two accesses go to the next basic block.  */
+      if (bitfield_accesses.length () <= 1)
+	{
+	  bitfield_accesses.release ();
+	  continue;
+	}
+
+      /* Find bit-field accesses that can be merged.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  bitfield_access head_access;
+	  bitfield_access mrg_access;
+	  bitfield_access prev_access;
+	  if (!bitfield_accesses_merge.exists ())
+	    bitfield_accesses_merge.create (0);
+
+	  bitfield_accesses_merge.safe_push (access);
+
+	  if (!access->is_barrier
+	      && !(access == bitfield_accesses.last ()
+	      && !bitfield_accesses_merge.is_empty ()))
+	    continue;
+
+	  bitfield_accesses_merge.qsort (cmp_access);
+
+	  head_access = NULL;
+	  for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
+	    {
+	      if (head_access
+		  && expressions_equal_p (head_access->src_addr,
+					  mrg_access->src_addr)
+		  && expressions_equal_p (head_access->dst_addr,
+					  mrg_access->dst_addr)
+		  && prev_access->src_offset_words
+		     == prev_access->src_offset_words
+		  && prev_access->dst_offset_words
+		     == prev_access->dst_offset_words
+		  && prev_access->src_bit_offset + prev_access->src_bit_size
+		     == mrg_access->src_bit_offset
+		  && prev_access->dst_bit_offset + prev_access->dst_bit_size
+		     == mrg_access->dst_bit_offset
+		  && prev_access->bitfield_representative
+		     == mrg_access->bitfield_representative)
+		{
+		  /* Merge conditions are satisfied - merge accesses.  */
+		  mrg_access->merged = true;
+		  prev_access->next = mrg_access;
+		  head_access->modified = true;
+		  prev_access = mrg_access;
+		}
+	      else
+		head_access = prev_access = mrg_access;
+	    }
+	  bitfield_accesses_merge.release ();
+	  bitfield_accesses_merge = vNULL;
+	}
+
+      /* Modify generated code.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  if (access->merged)
+	    {
+	      /* Access merged - remove instructions.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	    }
+	  else if (access->modified)
+	    {
+	      /* Access modified - modify generated code.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tree tmp_ssa;
+	      tree itype = make_node (INTEGER_TYPE);
+	      tree new_rhs;
+	      tree new_lhs;
+	      gimple new_stmt;
+	      char new_field_name [15];
+	      int decl_size;
+
+	      /* Bit-field size changed - modify load statement.  */
+	      access->src_bit_size = get_merged_bit_field_size (access);
+
+	      TYPE_PRECISION (itype) = access->src_bit_size;
+	      fixup_unsigned_type (itype);
+
+	      /* Create new declaration.  */
+	      tree new_field = make_node (FIELD_DECL);
+	      sprintf (new_field_name, "_field%d", new_field_no++);
+	      DECL_NAME (new_field) = get_identifier (new_field_name);
+	      TREE_TYPE (new_field) = itype;
+	      DECL_BIT_FIELD (new_field) = 1;
+	      DECL_BIT_FIELD_TYPE (new_field) = access->bitfield_type;
+	      DECL_BIT_FIELD_REPRESENTATIVE (new_field) =
+		access->bitfield_representative;
+	      DECL_FIELD_CONTEXT (new_field) = access->field_decl_context;
+	      DECL_NONADDRESSABLE_P (new_field) = 1;
+	      DECL_FIELD_OFFSET (new_field) =
+		build_int_cst (unsigned_type_node, access->src_offset_words);
+	      DECL_FIELD_BIT_OFFSET (new_field) =
+		build_int_cst (unsigned_type_node, access->src_bit_offset);
+	      DECL_SIZE (new_field) = build_int_cst (unsigned_type_node,
+						     access->src_bit_size);
+	      decl_size = access->src_bit_size / BITS_PER_UNIT
+		+ (access->src_bit_size % BITS_PER_UNIT ? 1 : 0);
+	      DECL_SIZE_UNIT (new_field) =
+		build_int_cst (unsigned_type_node, decl_size);
+
+	      tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);
+
+	      /* Create new component ref.  */
+	      new_rhs = build3 (COMPONENT_REF, itype, access->src_addr,
+				new_field, NULL);
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
+	      gsi_remove (&tmp_gsi, true);
+
+	      /* Bit-field size changed - modify store statement.  */
+	      /* Create new component ref.  */
+	      new_lhs = build3 (COMPONENT_REF, itype, access->dst_addr,
+				new_field, NULL);
+	      new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      gsi_remove (&tmp_gsi, true);
+	      cfg_changed = true;
+	    }
+	}
+      /* Empty or delete data structures used for basic block.  */
+      htab_empty (bitfield_stmt_access_htab);
+      bitfield_accesses.release ();
+    }
+
+  if (cfg_changed)
+    todoflags |= TODO_cleanup_cfg;
+
+  return todoflags;
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
+  unsigned int todoflags = 0;
   sra_mode = SRA_MODE_EARLY_INTRA;
-  return perform_intra_sra ();
+  if (flag_tree_bitfield_merge)
+    todoflags = ssa_bitfield_merge ();
+  return todoflags | perform_intra_sra ();
 }
 
 /* Perform "late" intraprocedural SRA.  */
@@ -5095,3 +5553,5 @@ make_pass_early_ipa_sra (gcc::context *ctxt)
 {
   return new pass_early_ipa_sra (ctxt);
 }
+
+#include "gt-tree-sra.h"
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 6886efb..afc73a6 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -4176,29 +4176,6 @@ get_next_value_id (void)
   return next_value_id++;
 }
 
-
-/* Compare two expressions E1 and E2 and return true if they are equal.  */
-
-bool
-expressions_equal_p (tree e1, tree e2)
-{
-  /* The obvious case.  */
-  if (e1 == e2)
-    return true;
-
-  /* If only one of them is null, they cannot be equal.  */
-  if (!e1 || !e2)
-    return false;
-
-  /* Now perform the actual comparison.  */
-  if (TREE_CODE (e1) == TREE_CODE (e2)
-      && operand_equal_p (e1, e2, OEP_PURE_SAME))
-    return true;
-
-  return false;
-}
-
-
 /* Return true if the nary operation NARY may trap.  This is a copy
    of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
 
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index 94e3603..707b18c 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -21,10 +21,6 @@
 #ifndef TREE_SSA_SCCVN_H
 #define TREE_SSA_SCCVN_H
 
-/* In tree-ssa-sccvn.c  */
-bool expressions_equal_p (tree, tree);
-
-
 /* TOP of the VN lattice.  */
 extern tree VN_TOP;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 1947105..7e8d4a8 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12108,4 +12108,44 @@ contains_bitfld_component_ref_p (const_tree ref)
   return false;
 }
 
+/* Compare two expressions E1 and E2 and return true if they are equal.  */
+
+bool
+expressions_equal_p (tree e1, tree e2)
+{
+  /* The obvious case.  */
+  if (e1 == e2)
+    return true;
+
+  /* If only one of them is null, they cannot be equal.  */
+  if (!e1 || !e2)
+    return false;
+
+  /* Now perform the actual comparison.  */
+  if (TREE_CODE (e1) == TREE_CODE (e2)
+      && operand_equal_p (e1, e2, OEP_PURE_SAME))
+    return true;
+
+  return false;
+}
+
+/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
+   node, return the size in bits for the type if it is a constant, or else
+   return the alignment for the type if the type's size is not constant, or
+   else return BITS_PER_WORD if the type actually turns out to be an
+   ERROR_MARK node.  */
+
+unsigned HOST_WIDE_INT
+simple_type_size_in_bits (const_tree type)
+{
+  if (TREE_CODE (type) == ERROR_MARK)
+    return BITS_PER_WORD;
+  else if (TYPE_SIZE (type) == NULL_TREE)
+    return 0;
+  else if (host_integerp (TYPE_SIZE (type), 1))
+    return tree_low_cst (TYPE_SIZE (type), 1);
+  else
+    return TYPE_ALIGN (type);
+}
+
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index 84bd699..0ea2203 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5981,6 +5981,7 @@ extern tree get_ref_base_and_extent (tree, HOST_WIDE_INT *,
 				     HOST_WIDE_INT *, HOST_WIDE_INT *);
 extern bool contains_bitfld_component_ref_p (const_tree);
 extern bool type_in_anonymous_namespace_p (tree);
+extern bool expressions_equal_p (tree e1, tree e2);
 
 /* In tree-nested.c */
 extern tree build_addr (tree, tree);
@@ -6502,6 +6503,10 @@ extern bool block_may_fallthru (const_tree);
 /* In vtable-verify.c.  */
 extern void save_vtable_map_decl (tree);
 
+/* In dwarf2out.c.  */
+HOST_WIDE_INT
+field_byte_offset (const_tree decl);
+
 ?
 /* Functional interface to the builtin functions.  */
 
@@ -6613,5 +6618,6 @@ builtin_decl_implicit_p (enum built_in_function fncode)
 #endif	/* NO_DOLLAR_IN_LABEL */
 #endif	/* NO_DOT_IN_LABEL */
 
+extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
 
 #endif  /* GCC_TREE_H  */


Regards,
Zoran Jovanovic

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-08-23 14:25 ` Zoran Jovanovic
@ 2013-08-23 22:19   ` Joseph S. Myers
       [not found]   ` <140b1b1b35a.2760.0f39ed3bcad52ef2c88c90062b7714dc@gmail.com>
  2013-09-24 23:10   ` Zoran Jovanovic
  2 siblings, 0 replies; 17+ messages in thread
From: Joseph S. Myers @ 2013-08-23 22:19 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches, Petar Jovanovic

On Fri, 23 Aug 2013, Zoran Jovanovic wrote:

> New test case involving unions is added.

Tests for unions should include *execution* tests that the relevant bits 
have the right values after the sequence of assignments.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
       [not found]   ` <140b1b1b35a.2760.0f39ed3bcad52ef2c88c90062b7714dc@gmail.com>
@ 2013-08-24 22:32     ` Bernhard Reutner-Fischer
  0 siblings, 0 replies; 17+ messages in thread
From: Bernhard Reutner-Fischer @ 2013-08-24 22:32 UTC (permalink / raw)
  To: Zoran Jovanovic, gcc-patches; +Cc: Petar Jovanovic

On 23 August 2013 16:05:32 Zoran Jovanovic <Zoran.Jovanovic@imgtec.com> wrote:
> Hello,
> This is new patch version. Optimization does not use BIT_FIELD_REF any 
> more, instead it generates new COMPONENT_REF and FIELD_DECL.
> Existing Bit field representative is associated with newly created field 
> declaration.
> During analysis phase optimization uses bit field representatives when 
> deciding which bit-fields accesses can be merged.
> Instead of having separate pass optimization is moved to tree-sra.c and 
> executed with sra early.
> New test case involving unions is added.
> Also, some other comments received on first patch are applied in new 
> implementation.
>
>
> Example:
>
> Original code:
>   <unnamed-unsigned:3> D.1351;
>   <unnamed-unsigned:9> D.1350;
>   <unnamed-unsigned:7> D.1349;
>   D.1349_2 = p1_1(D)->f1;
>   p2_3(D)->f1 = D.1349_2;
>   D.1350_4 = p1_1(D)->f2;
>   p2_3(D)->f2 = D.1350_4;
>   D.1351_5 = p1_1(D)->f3;
>   p2_3(D)->f3 = D.1351_5;
>
> Optimized code:
>   <unnamed-unsigned:19> D.1358;
>   _16 = pr1_2(D)->_field0;
>   pr2_4(D)->_field0 = _16;
>
> Algorithm works on basic block level and consists of following 3 major steps:
> 1. Go trough basic block statements list. If there are statement pairs that 
> implement copy of bit field content from one memory location to another 
> record statements pointers and other necessary data in corresponding data 
> structure.
> 2. Identify records that represent adjacent bit field accesses and mark 
> them as merged.
> 3. Modify trees accordingly.
>
> New command line option "-ftree-bitfield-merge" is introduced.
>
> Tested - passed gcc regression tests.
>
> Changelog -
>
> gcc/ChangeLog:
> 2013-08-22 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
>   * Makefile.in : Added tree-sra.c to GTFILES.
>   * common.opt (ftree-bitfield-merge): New option.
>   * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
>   * tree-sra.c (ssa_bitfield_merge): New function.
>   Entry for (-ftree-bitfield-merge).
>   (bitfield_stmt_access_pair_htab_hash): New function.
>   (bitfield_stmt_access_pair_htab_eq): New function.
>   (cmp_access): New function.
>   (create_and_insert_access): New function.
>   (get_bit_offset): New function.
>   (get_merged_bit_field_size): New function.
>   (add_stmt_access_pair): New function.
>   * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.
>   (field_byte_offset): declaration moved to tree.h, static removed.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
>   * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.
>   * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.
>   * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.
>   (simple_type_size_in_bits): moved from dwarf2out.c.
>   * tree.h (expressions_equal_p): declaration added.
>   (field_byte_offset): declaration added.
>
> Patch -
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 6034046..dad9337 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -3831,6 +3831,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h 
> $(srcdir)/coretypes.h \
>    $(srcdir)/vtable-verify.c \
>    $(srcdir)/asan.c \
>    $(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
> +  $(srcdir)/tree-sra.c \
>    @all_gtfiles@
>
>  # Compute the list of GT header files from the corresponding C sources,
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 9082280..fe0ecd9 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2160,6 +2160,10 @@ ftree-sra
>  Common Report Var(flag_tree_sra) Optimization
>  Perform scalar replacement of aggregates
>
> +ftree-bitfield-merge
> +Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
> +Enable bit-field merge on trees
> +
>  ftree-ter
>  Common Report Var(flag_tree_ter) Optimization
>  Replace temporary expressions in the SSA->normal pass
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index dae7605..7abe538 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,7 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
>  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -7646,6 +7646,11 @@ pointer alignment information.
>  This pass only operates on local scalar variables and is enabled by default
>  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
>
> +@item -ftree-bitfield-merge
> +@opindex ftree-bitfield-merge
> +Combines several adjacent bit-field accesses that copy values
> +from one memory location to another into single bit-field access.

into one single
Would be easier to understand, IMHO. Same for the other occurrences in this 
patch.

> +
>  @item -ftree-ccp
>  @opindex ftree-ccp
>  Perform sparse conditional constant propagation (CCP) on trees.  This
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index fc1c3f2..f3530ef 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -3104,8 +3104,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned 
> int);
>  static tree field_type (const_tree);
>  static unsigned int simple_type_align_in_bits (const_tree);
>  static unsigned int simple_decl_align_in_bits (const_tree);
> -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> -static HOST_WIDE_INT field_byte_offset (const_tree);
>  static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
>  					 dw_loc_list_ref);
>  static void add_data_member_location_attribute (dw_die_ref, tree);
> @@ -10145,25 +10143,6 @@ is_base_type (tree type)
>    return 0;
>  }
>
> -/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> -   node, return the size in bits for the type if it is a constant, or else
> -   return the alignment for the type if the type's size is not constant, or
> -   else return BITS_PER_WORD if the type actually turns out to be an
> -   ERROR_MARK node.  */
> -
> -static inline unsigned HOST_WIDE_INT
> -simple_type_size_in_bits (const_tree type)
> -{
> -  if (TREE_CODE (type) == ERROR_MARK)
> -    return BITS_PER_WORD;
> -  else if (TYPE_SIZE (type) == NULL_TREE)
> -    return 0;
> -  else if (host_integerp (TYPE_SIZE (type), 1))
> -    return tree_low_cst (TYPE_SIZE (type), 1);
> -  else
> -    return TYPE_ALIGN (type);
> -}
> -
>  /* Similarly, but return a double_int instead of UHWI.  */
>
>  static inline double_int
> @@ -14516,7 +14495,7 @@ round_up_to_align (double_int t, unsigned int align)
>     because the offset is actually variable.  (We can't handle the latter case
>     just yet).  */
>
> -static HOST_WIDE_INT
> +HOST_WIDE_INT
>  field_byte_offset (const_tree decl)
>  {
>    double_int object_offset_in_bits;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> new file mode 100644
> index 0000000..e9e96b7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
> +
> +struct S
> +{
> +  unsigned f1:7;
> +  unsigned f2:9;
> +  unsigned f3:3;
> +  unsigned f4:5;
> +  unsigned f5:1;
> +  unsigned f6:2;
> +};
> +
> +unsigned
> +foo (struct S *p1, struct S *p2, int *ptr)
> +{
> +  p2->f1 = p1->f1;
> +  p2->f2 = p1->f2;
> +  p2->f3 = p1->f3;
> +  *ptr = 7;
> +  p2->f4 = p1->f4;
> +  p2->f5 = p1->f5;
> +  p2->f6 = p1->f6;
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "19" "esra" } } */
> +/* { dg-final { scan-tree-dump "8" "esra"} } */
> +/* { dg-final { cleanup-tree-dump "esra" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> new file mode 100644
> index 0000000..c056a30
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> @@ -0,0 +1,42 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
> +
> +struct proba1
> +{
> +  unsigned f1:7;
> +  unsigned f2:5;
> +  unsigned f3:1;
> +  unsigned f4:7;
> +};
> +
> +
> +struct proba2
> +{
> +  unsigned f0:1;
> +  unsigned f1:7;
> +  unsigned f2:5;
> +  unsigned f3:1;
> +  unsigned f4:7;
> +};
> +
> +union proba_un
> +{
> +  struct proba1 a;
> +  struct proba2 b;
> +};
> +
> +
> +int func (union proba_un *pr1, union proba_un *pr2)
> +{
> +  pr2->a.f1 = pr1->a.f1;
> +  pr2->a.f2 = pr1->a.f2;
> +  pr2->a.f3 = pr1->a.f3;
> +  pr2->b.f2 = pr1->b.f2;
> +  pr2->a.f4 = pr1->a.f4;
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "13" "esra" } } */
> +/* { dg-final { scan-tree-dump "7" "esra"} } */
> +/* { dg-final { cleanup-tree-dump "esra" } } */
> +
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> old mode 100644
> new mode 100755
> index 8e3bb81..cf31463
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-inline.h"
>  #include "gimple-pretty-print.h"
>  #include "ipa-inline.h"
> +#include "ggc.h"
>
>  /* Enumeration of all aggregate reductions we can do.  */
>  enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
> @@ -3424,12 +3425,469 @@ perform_intra_sra (void)
>    return ret;
>  }
>
> +/* This optimization combines several adjacent bit-field accesses that copy
> +   values from one memory location to another into single bit-field 
> access.  */

Ditto.
> +
> +/* Data for single bit-field read/write sequence.  */
> +struct GTY (()) bitfield_access_d {
> +  gimple load_stmt;		  /* Bit-field load statement.  */
> +  gimple store_stmt;		  /* Bit-field store statement.  */
> +  unsigned src_offset_words;	  /* Bit-field offset at src in words.  */
> +  unsigned src_bit_offset;	  /* Bit-field offset inside source word.  */
> +  unsigned src_bit_size;	  /* Size of bit-field in source word.  */
> +  unsigned dst_offset_words;	  /* Bit-field offset at dst in words.  */
> +  unsigned dst_bit_offset;	  /* Bit-field offset inside destination
> +				     word.  */
> +  unsigned dst_bit_size;	  /* Size of bit-field in destination word.  */
> +  tree src_addr;		  /* Address of source memory access.  */
> +  tree dst_addr;		  /* Address of destination memory access.  */
> +  bool merged;			  /* True if access is merged with another
> +				     one.  */
> +  bool modified;		  /* True if bit-field size is modified.  */
> +  bool is_barrier;		  /* True if access is barrier (call or mem
> +				     access).  */
> +  struct bitfield_access_d *next; /* Access with which this one is merged.  */
> +  tree bitfield_type;		  /* Field type.  */
> +  tree bitfield_representative;	  /* Bit field representative of original
> +				     declaration.  */
> +  tree field_decl_context;	  /* Context of original bit-field
> +				     declaration.  */
> +};
> +
> +typedef struct bitfield_access_d bitfield_access_o;
> +typedef struct bitfield_access_d *bitfield_access;
> +
> +/* Connecting register with bit-field access sequence that defines value in
> +   that register.  */
> +struct GTY (()) bitfield_stmt_access_pair_d
> +{
> +  gimple stmt;
> +  bitfield_access access;
> +};
> +
> +typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
> +typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
> +
> +static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
> +  htab_t bitfield_stmt_access_htab;
> +
> +/* Hash table callbacks for bitfield_stmt_access_htab.  */
> +
> +static hashval_t
> +bitfield_stmt_access_pair_htab_hash (const void *p)
> +{
> +  const struct bitfield_stmt_access_pair_d *entry =
> +    (const struct bitfield_stmt_access_pair_d *)p;

Perhaps shorter to say const bitfield_stmt_access_pair here?

> +  return (hashval_t) (uintptr_t) entry->stmt;
> +}
> +
> +static int
> +bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
> +{
> +  const struct bitfield_stmt_access_pair_d *entry1 =
> +    (const struct bitfield_stmt_access_pair_d *)p1;
> +  const struct bitfield_stmt_access_pair_d *entry2 =
> +    (const struct bitfield_stmt_access_pair_d *)p2;

Likewise const bitfield_stmt_access_pair ?
> +  return entry1->stmt == entry2->stmt;
> +}
> +
> +/* Counter used for generating unique names for new fields.  */
> +static unsigned new_field_no;
> +
> +/* Compare two bit-field access records.  */
> +
> +static int
> +cmp_access (const void *p1, const void *p2)
> +{
> +  const bitfield_access a1 = (*(const bitfield_access*)p1);
> +  const bitfield_access a2 = (*(const bitfield_access*)p2);
> +
> +  if (a1->bitfield_representative - a2->bitfield_representative)
> +    return a1->bitfield_representative - a2->bitfield_representative;
> +
> +  if (!expressions_equal_p (a1->src_addr, a1->src_addr))

The second parm has to be a2->src_addr to make sense to me.

> +    return a1 - a2;
> +
> +  if (!expressions_equal_p (a1->dst_addr, a1->dst_addr))

Ditto.

> +    return a1 - a2;
> +
> +  if (a1->src_offset_words - a2->src_offset_words)
> +    return a1->src_offset_words - a2->src_offset_words;
> +
> +  return a1->src_bit_offset - a2->src_bit_offset;
> +}
> +
> +/* Create new bit-field access structure and add it to given bitfield_accesses
> +   htab.  */
> +
> +static bitfield_access
> +create_and_insert_access (vec<bitfield_access>
> +		       *bitfield_accesses)
> +{
> +  bitfield_access access = ggc_alloc_bitfield_access_d ();
> +  memset (access, 0, sizeof (struct bitfield_access_d));
> +  bitfield_accesses->safe_push (access);
> +  return access;
> +}
> +
> +/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
> +
> +static inline HOST_WIDE_INT
> +get_bit_offset (tree decl)
> +{
> +  HOST_WIDE_INT object_offset_in_bytes = field_byte_offset (decl);
> +  tree type = DECL_BIT_FIELD_TYPE (decl);
> +  HOST_WIDE_INT bitpos_int;
> +  HOST_WIDE_INT highest_order_object_bit_offset;
> +  HOST_WIDE_INT highest_order_field_bit_offset;
> +  HOST_WIDE_INT bit_offset;
> +
> +  /* Must be a field and a bit-field.  */
> +  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
> +  if (! host_integerp (bit_position (decl), 0)
> +      || ! host_integerp (DECL_SIZE (decl), 1))

Didn't we have integer_zero and integer_onep for these, nowadays?

> +    return -1;
> +
> +  bitpos_int = int_bit_position (decl);
> +
> +  /* Note that the bit offset is always the distance (in bits) from the
> +     highest-order bit of the "containing object" to the highest-order bit of
> +     the bit-field itself.  Since the "high-order end" of any object or field
> +     is different on big-endian and little-endian machines, the computation
> +     below must take account of these differences.  */
> +  highest_order_object_bit_offset = object_offset_in_bytes * BITS_PER_UNIT;
> +  highest_order_field_bit_offset = bitpos_int;
> +
> +  if (! BYTES_BIG_ENDIAN)
> +    {
> +      highest_order_field_bit_offset += tree_low_cst (DECL_SIZE (decl), 0);
> +      highest_order_object_bit_offset += simple_type_size_in_bits (type);
> +    }
> +
> +  bit_offset
> +    = (! BYTES_BIG_ENDIAN
> +       ? highest_order_object_bit_offset - highest_order_field_bit_offset
> +       : highest_order_field_bit_offset - highest_order_object_bit_offset);

I'd manually move the LE store to the if above and do BE in an else to 
improve readability.

> +
> +  return bit_offset;
> +}
> +
> +/* Returns size of combined bitfields.  */
> +
> +static int
> +get_merged_bit_field_size (bitfield_access access)
> +{
> +  bitfield_access tmp_access = access;
> +  int size = 0;

I take it this won't overflow for insanely large bit-fields..

> +
> +  while (tmp_access)
> +  {
> +    size += tmp_access->src_bit_size;
> +    tmp_access = tmp_access->next;
> +  }
> +  return size;
> +}
> +
> +/* Adds new pair consisting of statement and bit-field access structure that
> +   contains it.  */

Params in upper case.

> +
> +static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
> +{
> +  bitfield_stmt_access_pair new_pair;
> +  void **slot;
> +  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
> +  new_pair->stmt = stmt;
> +  new_pair->access = access;
> +  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
> +  if (*slot == HTAB_EMPTY_ENTRY)
> +    {
> +      *slot = new_pair;
> +      return true;
> +    }
> +  return false;
> +}
> +
> +/* Main entry point for the bit-field merge optimization.  */
> +
> +static unsigned int
> +ssa_bitfield_merge (void)
> +{
> +  basic_block bb;
> +  unsigned int todoflags = 0;
> +  vec<bitfield_access> bitfield_accesses;
> +  int ix, iy;
> +  bitfield_access access;
> +  bool cfg_changed = false;
> +
> +  /* In the strict volatile bitfields case, doing code changes here may 
> prevent
> +     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
> +  if (flag_strict_volatile_bitfields> 0)

My mailer does not render a space before the ">", is it missing?

> +    return 0;
> +
> +  FOR_EACH_BB (bb)
> +    {
> +      gimple_stmt_iterator gsi;
> +      vec<bitfield_access> bitfield_accesses_merge = vNULL;
> +      tree prev_representative = NULL_TREE;
> +      bitfield_accesses.create (0);
> +
> +      bitfield_stmt_access_htab
> +	= htab_create_ggc (128, bitfield_stmt_access_pair_htab_hash,
> +			   bitfield_stmt_access_pair_htab_eq, NULL);

This sounds like it allocates htab even for BBs without a bit-field, no? 
Isn't that a bit wasteful?

> +
> +      /* Identify all bitfield copy sequences in the basic-block.  */
> +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> +	{
> +	  gimple stmt = gsi_stmt (gsi);
> +	  tree lhs, rhs;
> +	  void **slot;
> +	  struct bitfield_stmt_access_pair_d asdata;
> +
> +	  if (!is_gimple_assign (stmt))
> +	    {
> +	      gsi_next (&gsi);
> +	      continue;
> +	    }
> +
> +	  lhs = gimple_assign_lhs (stmt);
> +	  rhs = gimple_assign_rhs1 (stmt);
> +
> +	  if (TREE_CODE (rhs) == COMPONENT_REF)
> +	    {
> +	      use_operand_p use;
> +	      gimple use_stmt;
> +	      tree op0 = TREE_OPERAND (rhs, 0);
> +	      tree op1 = TREE_OPERAND (rhs, 1);
> +
> +	      if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1)
> +		  && !TREE_THIS_VOLATILE (op1))
> +		{
> +		  if (single_imm_use (lhs, &use, &use_stmt)
> +		       && is_gimple_assign (use_stmt))
> +		    {
> +		      tree use_lhs = gimple_assign_lhs (use_stmt);
> +		      if (TREE_CODE (use_lhs) == COMPONENT_REF)
> +			{
> +			  tree use_op0 = TREE_OPERAND (use_lhs, 0);
> +			  tree use_op1 = TREE_OPERAND (use_lhs, 1);
> +			  tree tmp_repr = DECL_BIT_FIELD_REPRESENTATIVE (op1);
> +			  if (TREE_CODE (use_op1) == FIELD_DECL
> +			      && DECL_BIT_FIELD_TYPE (use_op1)
> +			      && !TREE_THIS_VOLATILE (use_op1))
> +			    {
> +			      if (prev_representative
> +				  && (prev_representative != tmp_repr))
> +				{
> +				  /* If previous access has different
> +				     representative then barrier is needed
> +				     between it and new access.  */
> +				  access = create_and_insert_access
> +					     (&bitfield_accesses);
> +				  access->is_barrier = true;
> +				}
> +			      prev_representative = tmp_repr;
> +			      /* Create new bit-field access structure.  */
> +			      access = create_and_insert_access
> +					 (&bitfield_accesses);
> +			      /* Collect access data - load instruction.  */
> +			      access->src_bit_size = tree_low_cst
> +						      (DECL_SIZE (op1), 1);
> +			      access->src_bit_offset = get_bit_offset (op1);
> +			      access->src_offset_words =
> +				field_byte_offset (op1) / UNITS_PER_WORD;
> +			      access->src_addr = op0;
> +			      access->load_stmt = gsi_stmt (gsi);
> +			      /* Collect access data - store instruction.  */
> +			      access->dst_bit_size =
> +				tree_low_cst (DECL_SIZE (use_op1), 1);
> +			      access->dst_bit_offset =
> +				get_bit_offset (use_op1);
> +			      access->dst_offset_words =
> +				field_byte_offset (use_op1) / UNITS_PER_WORD;
> +			      access->dst_addr = use_op0;
> +			      access->store_stmt = use_stmt;
> +			      add_stmt_access_pair (access, stmt);
> +			      add_stmt_access_pair (access, use_stmt);
> +			      access->bitfield_type
> +				= DECL_BIT_FIELD_TYPE (use_op1);
> +			      access->bitfield_representative = tmp_repr;
> +			      access->field_decl_context =
> +				DECL_FIELD_CONTEXT (op1);
> +			    }
> +			}
> +		    }
> +		}
> +	    }
> +
> +	  /* Insert barrier for merging if statement is function call or memory
> +	     access.  */
> +	  asdata.stmt = stmt;
> +	  slot
> +	    = htab_find_slot (bitfield_stmt_access_htab, &asdata, NO_INSERT);
> +	  if (!slot
> +	      && ((gimple_code (stmt) == GIMPLE_CALL)
> +		  || (gimple_has_mem_ops (stmt))))
> +	    {
> +	      /* Create new bit-field access structure.  */
> +	      access = create_and_insert_access (&bitfield_accesses);
> +	      /* Mark it as barrier.  */
> +	      access->is_barrier = true;
> +	    }
> +
> +	  gsi_next (&gsi);
> +	}
> +
> +      /* If there are no at least two accesses go to the next basic block.  */
> +      if (bitfield_accesses.length () <= 1)
> +	{
> +	  bitfield_accesses.release ();
> +	  continue;
> +	}
> +
> +      /* Find bit-field accesses that can be merged.  */
> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +	{
> +	  bitfield_access head_access;
> +	  bitfield_access mrg_access;
> +	  bitfield_access prev_access;
> +	  if (!bitfield_accesses_merge.exists ())
> +	    bitfield_accesses_merge.create (0);
> +
> +	  bitfield_accesses_merge.safe_push (access);
> +
> +	  if (!access->is_barrier
> +	      && !(access == bitfield_accesses.last ()
> +	      && !bitfield_accesses_merge.is_empty ()))
> +	    continue;
> +
> +	  bitfield_accesses_merge.qsort (cmp_access);
> +
> +	  head_access = NULL;
> +	  for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
> +	    {
> +	      if (head_access
> +		  && expressions_equal_p (head_access->src_addr,
> +					  mrg_access->src_addr)
> +		  && expressions_equal_p (head_access->dst_addr,
> +					  mrg_access->dst_addr)
> +		  && prev_access->src_offset_words
> +		     == prev_access->src_offset_words

Really? How come?

> +		  && prev_access->dst_offset_words
> +		     == prev_access->dst_offset_words

Ditto. Did you mean == mrg_access?

> +		  && prev_access->src_bit_offset + prev_access->src_bit_size
> +		     == mrg_access->src_bit_offset
> +		  && prev_access->dst_bit_offset + prev_access->dst_bit_size
> +		     == mrg_access->dst_bit_offset
> +		  && prev_access->bitfield_representative
> +		     == mrg_access->bitfield_representative)
> +		{
> +		  /* Merge conditions are satisfied - merge accesses.  */
> +		  mrg_access->merged = true;
> +		  prev_access->next = mrg_access;
> +		  head_access->modified = true;
> +		  prev_access = mrg_access;
> +		}
> +	      else
> +		head_access = prev_access = mrg_access;
> +	    }
> +	  bitfield_accesses_merge.release ();
> +	  bitfield_accesses_merge = vNULL;
> +	}
> +
> +      /* Modify generated code.  */
> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +	{
> +	  if (access->merged)
> +	    {
> +	      /* Access merged - remove instructions.  */
> +	      gimple_stmt_iterator tmp_gsi;
> +	      tmp_gsi = gsi_for_stmt (access->load_stmt);
> +	      gsi_remove (&tmp_gsi, true);
> +	      tmp_gsi = gsi_for_stmt (access->store_stmt);
> +	      gsi_remove (&tmp_gsi, true);
> +	    }
> +	  else if (access->modified)
> +	    {
> +	      /* Access modified - modify generated code.  */
> +	      gimple_stmt_iterator tmp_gsi;
> +	      tree tmp_ssa;
> +	      tree itype = make_node (INTEGER_TYPE);
> +	      tree new_rhs;
> +	      tree new_lhs;
> +	      gimple new_stmt;
> +	      char new_field_name [15];
> +	      int decl_size;
> +
> +	      /* Bit-field size changed - modify load statement.  */
> +	      access->src_bit_size = get_merged_bit_field_size (access);
> +
> +	      TYPE_PRECISION (itype) = access->src_bit_size;
> +	      fixup_unsigned_type (itype);
> +
> +	      /* Create new declaration.  */
> +	      tree new_field = make_node (FIELD_DECL);
> +	      sprintf (new_field_name, "_field%d", new_field_no++);

Safe with huge fields?

> +	      DECL_NAME (new_field) = get_identifier (new_field_name);
> +	      TREE_TYPE (new_field) = itype;
> +	      DECL_BIT_FIELD (new_field) = 1;
> +	      DECL_BIT_FIELD_TYPE (new_field) = access->bitfield_type;
> +	      DECL_BIT_FIELD_REPRESENTATIVE (new_field) =
> +		access->bitfield_representative;
> +	      DECL_FIELD_CONTEXT (new_field) = access->field_decl_context;
> +	      DECL_NONADDRESSABLE_P (new_field) = 1;
> +	      DECL_FIELD_OFFSET (new_field) =
> +		build_int_cst (unsigned_type_node, access->src_offset_words);
> +	      DECL_FIELD_BIT_OFFSET (new_field) =
> +		build_int_cst (unsigned_type_node, access->src_bit_offset);
> +	      DECL_SIZE (new_field) = build_int_cst (unsigned_type_node,
> +						     access->src_bit_size);
> +	      decl_size = access->src_bit_size / BITS_PER_UNIT
> +		+ (access->src_bit_size % BITS_PER_UNIT ? 1 : 0);
> +	      DECL_SIZE_UNIT (new_field) =
> +		build_int_cst (unsigned_type_node, decl_size);
> +
> +	      tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);

Did Richi add a shorthand for that not too long ago or is this the current 
way already?

Thanks,
> +
> +	      /* Create new component ref.  */
> +	      new_rhs = build3 (COMPONENT_REF, itype, access->src_addr,
> +				new_field, NULL);
> +	      tmp_gsi = gsi_for_stmt (access->load_stmt);
> +	      new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
> +	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +	      SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
> +	      gsi_remove (&tmp_gsi, true);
> +
> +	      /* Bit-field size changed - modify store statement.  */
> +	      /* Create new component ref.  */
> +	      new_lhs = build3 (COMPONENT_REF, itype, access->dst_addr,
> +				new_field, NULL);
> +	      new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
> +	      tmp_gsi = gsi_for_stmt (access->store_stmt);
> +	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +	      gsi_remove (&tmp_gsi, true);
> +	      cfg_changed = true;
> +	    }
> +	}
> +      /* Empty or delete data structures used for basic block.  */
> +      htab_empty (bitfield_stmt_access_htab);
> +      bitfield_accesses.release ();
> +    }
> +
> +  if (cfg_changed)
> +    todoflags |= TODO_cleanup_cfg;
> +
> +  return todoflags;
> +}
> +
>  /* Perform early intraprocedural SRA.  */
>  static unsigned int
>  early_intra_sra (void)
>  {
> +  unsigned int todoflags = 0;
>    sra_mode = SRA_MODE_EARLY_INTRA;
> -  return perform_intra_sra ();
> +  if (flag_tree_bitfield_merge)
> +    todoflags = ssa_bitfield_merge ();
> +  return todoflags | perform_intra_sra ();
>  }
>
>  /* Perform "late" intraprocedural SRA.  */
> @@ -5095,3 +5553,5 @@ make_pass_early_ipa_sra (gcc::context *ctxt)
>  {
>    return new pass_early_ipa_sra (ctxt);
>  }
> +
> +#include "gt-tree-sra.h"
> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
> index 6886efb..afc73a6 100644
> --- a/gcc/tree-ssa-sccvn.c
> +++ b/gcc/tree-ssa-sccvn.c
> @@ -4176,29 +4176,6 @@ get_next_value_id (void)
>    return next_value_id++;
>  }
>
> -
> -/* Compare two expressions E1 and E2 and return true if they are equal.  */
> -
> -bool
> -expressions_equal_p (tree e1, tree e2)
> -{
> -  /* The obvious case.  */
> -  if (e1 == e2)
> -    return true;
> -
> -  /* If only one of them is null, they cannot be equal.  */
> -  if (!e1 || !e2)
> -    return false;
> -
> -  /* Now perform the actual comparison.  */
> -  if (TREE_CODE (e1) == TREE_CODE (e2)
> -      && operand_equal_p (e1, e2, OEP_PURE_SAME))
> -    return true;
> -
> -  return false;
> -}
> -
> -
>  /* Return true if the nary operation NARY may trap.  This is a copy
>     of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
>
> diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
> index 94e3603..707b18c 100644
> --- a/gcc/tree-ssa-sccvn.h
> +++ b/gcc/tree-ssa-sccvn.h
> @@ -21,10 +21,6 @@
>  #ifndef TREE_SSA_SCCVN_H
>  #define TREE_SSA_SCCVN_H
>
> -/* In tree-ssa-sccvn.c  */
> -bool expressions_equal_p (tree, tree);
> -
> -
>  /* TOP of the VN lattice.  */
>  extern tree VN_TOP;
>
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 1947105..7e8d4a8 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -12108,4 +12108,44 @@ contains_bitfld_component_ref_p (const_tree ref)
>    return false;
>  }
>
> +/* Compare two expressions E1 and E2 and return true if they are equal.  */
> +
> +bool
> +expressions_equal_p (tree e1, tree e2)
> +{
> +  /* The obvious case.  */
> +  if (e1 == e2)
> +    return true;
> +
> +  /* If only one of them is null, they cannot be equal.  */
> +  if (!e1 || !e2)
> +    return false;
> +
> +  /* Now perform the actual comparison.  */
> +  if (TREE_CODE (e1) == TREE_CODE (e2)
> +      && operand_equal_p (e1, e2, OEP_PURE_SAME))
> +    return true;
> +
> +  return false;
> +}
> +
> +/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> +   node, return the size in bits for the type if it is a constant, or else
> +   return the alignment for the type if the type's size is not constant, or
> +   else return BITS_PER_WORD if the type actually turns out to be an
> +   ERROR_MARK node.  */
> +
> +unsigned HOST_WIDE_INT
> +simple_type_size_in_bits (const_tree type)
> +{
> +  if (TREE_CODE (type) == ERROR_MARK)
> +    return BITS_PER_WORD;
> +  else if (TYPE_SIZE (type) == NULL_TREE)
> +    return 0;
> +  else if (host_integerp (TYPE_SIZE (type), 1))
> +    return tree_low_cst (TYPE_SIZE (type), 1);
> +  else
> +    return TYPE_ALIGN (type);
> +}
> +
>  #include "gt-tree.h"
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 84bd699..0ea2203 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -5981,6 +5981,7 @@ extern tree get_ref_base_and_extent (tree, 
> HOST_WIDE_INT *,
>  				     HOST_WIDE_INT *, HOST_WIDE_INT *);
>  extern bool contains_bitfld_component_ref_p (const_tree);
>  extern bool type_in_anonymous_namespace_p (tree);
> +extern bool expressions_equal_p (tree e1, tree e2);
>
>  /* In tree-nested.c */
>  extern tree build_addr (tree, tree);
> @@ -6502,6 +6503,10 @@ extern bool block_may_fallthru (const_tree);
>  /* In vtable-verify.c.  */
>  extern void save_vtable_map_decl (tree);
>
> +/* In dwarf2out.c.  */
> +HOST_WIDE_INT
> +field_byte_offset (const_tree decl);
> +
>  ?
>  /* Functional interface to the builtin functions.  */
>
> @@ -6613,5 +6618,6 @@ builtin_decl_implicit_p (enum built_in_function fncode)
>  #endif	/* NO_DOLLAR_IN_LABEL */
>  #endif	/* NO_DOT_IN_LABEL */
>
> +extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
>
>  #endif  /* GCC_TREE_H  */
>
>
> Regards,
> Zoran Jovanovic


Sent with AquaMail for Android
http://www.aqua-mail.com


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-07-30 15:02   ` Zoran Jovanovic
@ 2013-08-27 11:33     ` Richard Biener
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Biener @ 2013-08-27 11:33 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches, Petar Jovanovic

On Tue, Jul 30, 2013 at 4:47 PM, Zoran Jovanovic
<Zoran.Jovanovic@imgtec.com> wrote:
> Thank you for the reply.
> I am in the process of modifying the patch according to some comments received.
> Currently I am considering the usage of DECL_BIT_FIELD_REPRESENTATIVE.
> I see that they can be used during analysis phase for deciding which accesses can be merged - only accesses with same representative will be merged.
> I have more dilemmas with usage of representatives for lowering. If my understanding is correct bit field representative can only be associated to a field declaration, and not to a BIT_FIELD_REF node. As a consequence optimization must use COMPONENT_REF to model new bit field access (which should be an equivalent of several merged accesses). To use COMPONENT_REF a new field declaration with appropriate bit size (equal to sum of bit sizes of all merged bit field accesses) must be created and then corresponding bit field representative could be attached.
> Is my understanding correct? Is creating a new field declaration for every set of merged bit field accesses acceptable?

You should just use the DECL_BIT_FIELD_REPRESENTATIVE FIELD_DECL, not
create any new FIELD_DECLs.  Also you can use BIT_FIELD_REFs but you need to
query the proper bit position / size from the DECL_BIT_FIELD_REPRESENTATIVE
FIELD_DECL.  Using the COMPONENT_REF is better though, I think.

Full patch queued for review (but you might want to update it to not create new
FIELD_DECLs).

Richard.

>
> Regards,
> Zoran Jovanovic
>
> ________________________________________
> From: Richard Biener [richard.guenther@gmail.com]
> Sent: Thursday, July 18, 2013 11:31 AM
> To: Zoran Jovanovic; gcc-patches@gcc.gnu.org
> Cc: Petar Jovanovic
> Subject: Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
>
> Zoran Jovanovic <Zoran.Jovanovic@imgtec.com> wrote:
>
>>Hello,
>>This patch adds new optimization pass that combines several adjacent
>>bit field accesses that copy values from one memory location to another
>>into single bit field access.
>>
>>Example:
>>
>>Original code:
>>  <unnamed-unsigned:3> D.1351;
>>  <unnamed-unsigned:9> D.1350;
>>  <unnamed-unsigned:7> D.1349;
>>  D.1349_2 = p1_1(D)->f1;
>>  p2_3(D)->f1 = D.1349_2;
>>  D.1350_4 = p1_1(D)->f2;
>>  p2_3(D)->f2 = D.1350_4;
>>  D.1351_5 = p1_1(D)->f3;
>>  p2_3(D)->f3 = D.1351_5;
>>
>>Optimized code:
>>  <unnamed-unsigned:19> D.1358;
>>  D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
>>  BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;
>>
>>Algorithm works on basic block level and consists of following 3 major
>>steps:
>>1. Go trough basic block statements list. If there are statement pairs
>>that implement copy of bit field content from one memory location to
>>another record statements pointers and other necessary data in
>>corresponding data structure.
>>2. Identify records that represent adjacent bit field accesses and mark
>>them as merged.
>>3. Modify trees accordingly.
>
> All this should use BITFIELD_REPRESENTATIVE both to decide what accesses are related and for the lowering. This makes sure to honor the appropriate memory models.
>
> In theory only lowering is necessary and FRE and DSE will do the job of optimizing - also properly accounting for alias issues that Joseph mentions. The lowering and analysis is strongly related to SRA So I don't believe we want a new pass for this.
>
> Richard.
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-08-23 14:25 ` Zoran Jovanovic
  2013-08-23 22:19   ` Joseph S. Myers
       [not found]   ` <140b1b1b35a.2760.0f39ed3bcad52ef2c88c90062b7714dc@gmail.com>
@ 2013-09-24 23:10   ` Zoran Jovanovic
       [not found]     ` <CAFiYyc0dcpDeXqwM2G3BTJUkpTsjzivRVEuWGfmGE4QcMhxERA@mail.gmail.com>
  2 siblings, 1 reply; 17+ messages in thread
From: Zoran Jovanovic @ 2013-09-24 23:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: Petar Jovanovic

Hello,
This is new patch version. 
Comments from Bernhard Reutner-Fischer review applied.
Also, test case bitfildmrg2.c modified - it is now execute test.


Example:

Original code:
  <unnamed-unsigned:3> D.1351;
  <unnamed-unsigned:9> D.1350;
  <unnamed-unsigned:7> D.1349;
  D.1349_2 = p1_1(D)->f1;
  p2_3(D)->f1 = D.1349_2;
  D.1350_4 = p1_1(D)->f2;
  p2_3(D)->f2 = D.1350_4;
  D.1351_5 = p1_1(D)->f3;
  p2_3(D)->f3 = D.1351_5;

Optimized code:
  <unnamed-unsigned:19> D.1358;
  _16 = pr1_2(D)->_field0;
  pr2_4(D)->_field0 = _16;
  
Algorithm works on basic block level and consists of following 3 major steps:
1. Go trough basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure.
2. Identify records that represent adjacent bit field accesses and mark them as merged.
3. Modify trees accordingly.

New command line option "-ftree-bitfield-merge" is introduced.

Tested - passed gcc regression tests.

Changelog -

gcc/ChangeLog:
2013-09-24 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
  * Makefile.in : Added tree-sra.c to GTFILES.
  * common.opt (ftree-bitfield-merge): New option.
  * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
  * tree-sra.c (ssa_bitfield_merge): New function.
  Entry for (-ftree-bitfield-merge).
  (bitfield_stmt_access_pair_htab_hash): New function.
  (bitfield_stmt_access_pair_htab_eq): New function.
  (cmp_access): New function.
  (create_and_insert_access): New function.
  (get_bit_offset): New function.
  (get_merged_bit_field_size): New function.
  (add_stmt_access_pair): New function.
  * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.
  (field_byte_offset): declaration moved to tree.h, static removed.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.
  * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): moved from dwarf2out.c.
  * tree.h (expressions_equal_p): declaration added.
  (field_byte_offset): declaration added.

Patch -

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a2e3f7a..54aa8e7 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3847,6 +3847,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/asan.c \
   $(srcdir)/ubsan.c \
   $(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
+  $(srcdir)/tree-sra.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/common.opt b/gcc/common.opt
index 202e169..afac514 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2164,6 +2164,10 @@ ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+ftree-bitfield-merge
+Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
+Enable bit-field merge on trees
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index aa0f4ed..e588cae 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,7 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
+-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
@@ -7679,6 +7679,11 @@ pointer alignment information.
 This pass only operates on local scalar variables and is enabled by default
 at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
 
+@item -ftree-bitfield-merge
+@opindex ftree-bitfield-merge
+Combines several adjacent bit-field accesses that copy values
+from one memory location to another into one single bit-field access.
+
 @item -ftree-ccp
 @opindex ftree-ccp
 Perform sparse conditional constant propagation (CCP) on trees.  This
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 95049e4..e74db17 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3108,8 +3108,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
 static tree field_type (const_tree);
 static unsigned int simple_type_align_in_bits (const_tree);
 static unsigned int simple_decl_align_in_bits (const_tree);
-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
-static HOST_WIDE_INT field_byte_offset (const_tree);
 static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 					 dw_loc_list_ref);
 static void add_data_member_location_attribute (dw_die_ref, tree);
@@ -10149,25 +10147,6 @@ is_base_type (tree type)
   return 0;
 }
 
-/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
-   node, return the size in bits for the type if it is a constant, or else
-   return the alignment for the type if the type's size is not constant, or
-   else return BITS_PER_WORD if the type actually turns out to be an
-   ERROR_MARK node.  */
-
-static inline unsigned HOST_WIDE_INT
-simple_type_size_in_bits (const_tree type)
-{
-  if (TREE_CODE (type) == ERROR_MARK)
-    return BITS_PER_WORD;
-  else if (TYPE_SIZE (type) == NULL_TREE)
-    return 0;
-  else if (host_integerp (TYPE_SIZE (type), 1))
-    return tree_low_cst (TYPE_SIZE (type), 1);
-  else
-    return TYPE_ALIGN (type);
-}
-
 /* Similarly, but return a double_int instead of UHWI.  */
 
 static inline double_int
@@ -14521,7 +14500,7 @@ round_up_to_align (double_int t, unsigned int align)
    because the offset is actually variable.  (We can't handle the latter case
    just yet).  */
 
-static HOST_WIDE_INT
+HOST_WIDE_INT
 field_byte_offset (const_tree decl)
 {
   double_int object_offset_in_bits;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
new file mode 100644
index 0000000..e9e96b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
+
+struct S
+{
+  unsigned f1:7;
+  unsigned f2:9;
+  unsigned f3:3;
+  unsigned f4:5;
+  unsigned f5:1;
+  unsigned f6:2;
+};
+
+unsigned
+foo (struct S *p1, struct S *p2, int *ptr)
+{
+  p2->f1 = p1->f1;
+  p2->f2 = p1->f2;
+  p2->f3 = p1->f3;
+  *ptr = 7;
+  p2->f4 = p1->f4;
+  p2->f5 = p1->f5;
+  p2->f6 = p1->f6;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "19" "esra" } } */
+/* { dg-final { scan-tree-dump "8" "esra"} } */
+/* { dg-final { cleanup-tree-dump "esra" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
new file mode 100644
index 0000000..653e904
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
@@ -0,0 +1,90 @@
+/* Check whether use of -ftree-bitfield-merge results in presence of overlaping
+   unions results in incorrect code.  */
+/* { dg-options "-O2 -ftree-bitfield-merge" }  */
+/* { dg-do run } */
+#include <stdio.h>
+extern void abort (void);
+
+struct s1
+{
+  unsigned f1:4;
+  unsigned f2:4;
+  unsigned f3:4;
+};
+
+struct s2
+{
+  unsigned char c;
+  unsigned f1:4;
+  unsigned f2:4;
+  unsigned f3:4;
+};
+
+struct s3
+{
+  unsigned f1:3;
+  unsigned f2:3;
+  unsigned f3:3;
+};
+
+struct s4
+{
+  unsigned f0:3;
+  unsigned f1:3;
+  unsigned f2:3;
+  unsigned f3:3;
+};
+
+union un_1
+{
+  struct s1 a;
+  struct s2 b;
+};
+
+union un_2
+{
+  struct s3 a;
+  struct s4 b;
+};
+
+void f1 (union un_1 *p1, union un_1 *p2)
+{
+  p2->a.f3 = p1->b.f3;
+  p2->a.f2 = p1->b.f2;
+  p2->a.f1 = p1->b.f1;
+
+  if (p1->b.f1 != 3)
+    abort ();
+}
+
+void f2 (union un_2 *p1, union un_2 *p2)
+{
+  p2->b.f1 = p1->a.f1;
+  p2->b.f2 = p1->a.f2;
+  p2->b.f3 = p1->a.f3;
+
+  if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0)
+    abort ();
+}
+
+int main ()
+{
+  union un_1 u1;
+  union un_2 u2;
+
+  u1.b.f1 = 1;
+  u1.b.f2 = 2;
+  u1.b.f3 = 3;
+  u1.b.c = 0;
+
+  f1 (&u1, &u1);
+
+  u2.b.f0 = 0;
+  u2.b.f1 = 1;
+  u2.b.f2 = 2;
+  u2.b.f3 = 3;
+
+  f2 (&u2, &u2);
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 58c7565..610245b 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -92,6 +92,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-pretty-print.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "ggc.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
@@ -3424,12 +3425,476 @@ perform_intra_sra (void)
   return ret;
 }
 
+/* This optimization combines several adjacent bit-field accesses that copy
+   values from one memory location to another into one single bit-field
+   access.  */
+
+/* Data for single bit-field read/write sequence.  */
+struct GTY (()) bitfield_access_d {
+  gimple load_stmt;		  /* Bit-field load statement.  */
+  gimple store_stmt;		  /* Bit-field store statement.  */
+  unsigned src_offset_words;	  /* Bit-field offset at src in words.  */
+  unsigned src_bit_offset;	  /* Bit-field offset inside source word.  */
+  unsigned src_bit_size;	  /* Size of bit-field in source word.  */
+  unsigned dst_offset_words;	  /* Bit-field offset at dst in words.  */
+  unsigned dst_bit_offset;	  /* Bit-field offset inside destination
+				     word.  */
+  unsigned src_field_offset;	  /* Source field offset.  */
+  unsigned dst_bit_size;	  /* Size of bit-field in destination word.  */
+  tree src_addr;		  /* Address of source memory access.  */
+  tree dst_addr;		  /* Address of destination memory access.  */
+  bool merged;			  /* True if access is merged with another
+				     one.  */
+  bool modified;		  /* True if bit-field size is modified.  */
+  bool is_barrier;		  /* True if access is barrier (call or mem
+				     access).  */
+  struct bitfield_access_d *next; /* Access with which this one is merged.  */
+  tree bitfield_type;		  /* Field type.  */
+  tree bitfield_representative;	  /* Bit field representative of original
+				     declaration.  */
+  tree field_decl_context;	  /* Context of original bit-field
+				     declaration.  */
+};
+
+typedef struct bitfield_access_d bitfield_access_o;
+typedef struct bitfield_access_d *bitfield_access;
+
+/* Connecting register with bit-field access sequence that defines value in
+   that register.  */
+struct GTY (()) bitfield_stmt_access_pair_d
+{
+  gimple stmt;
+  bitfield_access access;
+};
+
+typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
+typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
+
+static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
+  htab_t bitfield_stmt_access_htab;
+
+/* Hash table callbacks for bitfield_stmt_access_htab.  */
+
+static hashval_t
+bitfield_stmt_access_pair_htab_hash (const void *p)
+{
+  const bitfield_stmt_access_pair entry = (const bitfield_stmt_access_pair)p;
+  return (hashval_t) (uintptr_t) entry->stmt;
+}
+
+static int
+bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
+{
+  const struct bitfield_stmt_access_pair_d *entry1 =
+    (const bitfield_stmt_access_pair)p1;
+  const struct bitfield_stmt_access_pair_d *entry2 =
+    (const bitfield_stmt_access_pair)p2;
+  return entry1->stmt == entry2->stmt;
+}
+
+/* Counter used for generating unique names for new fields.  */
+static unsigned new_field_no;
+
+/* Compare two bit-field access records.  */
+
+static int
+cmp_access (const void *p1, const void *p2)
+{
+  const bitfield_access a1 = (*(const bitfield_access*)p1);
+  const bitfield_access a2 = (*(const bitfield_access*)p2);
+
+  if (a1->bitfield_representative - a2->bitfield_representative)
+    return a1->bitfield_representative - a2->bitfield_representative;
+
+  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
+    return a1 - a2;
+
+  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
+    return a1 - a2;
+
+  if (a1->src_offset_words - a2->src_offset_words)
+    return a1->src_offset_words - a2->src_offset_words;
+
+  return a1->src_bit_offset - a2->src_bit_offset;
+}
+
+/* Create new bit-field access structure and add it to given bitfield_accesses
+   htab.  */
+
+static bitfield_access
+create_and_insert_access (vec<bitfield_access>
+		       *bitfield_accesses)
+{
+  bitfield_access access = ggc_alloc_bitfield_access_d ();
+  memset (access, 0, sizeof (struct bitfield_access_d));
+  bitfield_accesses->safe_push (access);
+  return access;
+}
+
+/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
+
+static inline HOST_WIDE_INT
+get_bit_offset (tree decl)
+{
+  tree type = DECL_BIT_FIELD_TYPE (decl);
+  HOST_WIDE_INT bitpos_int;
+
+  /* Must be a field and a bit-field.  */
+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
+  /* Bit position and decl size should be integer constants that can be
+     represented in a signle HOST_WIDE_INT.  */
+  if (! host_integerp (bit_position (decl), 0)
+      || ! host_integerp (DECL_SIZE (decl), 1))
+    return -1;
+
+  bitpos_int = int_bit_position (decl);
+  return bitpos_int;
+}
+
+/* Returns size of combined bitfields.  Size cannot be larger than size
+   of largest directly accessible memory unit.  */
+
+static int
+get_merged_bit_field_size (bitfield_access access)
+{
+  bitfield_access tmp_access = access;
+  int size = 0;
+
+  while (tmp_access)
+  {
+    size += tmp_access->src_bit_size;
+    tmp_access = tmp_access->next;
+  }
+  return size;
+}
+
+/* Adds new pair consisting of statement and bit-field access structure that
+   contains it.  */
+
+static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
+{
+  bitfield_stmt_access_pair new_pair;
+  void **slot;
+  if (!bitfield_stmt_access_htab)
+    bitfield_stmt_access_htab =
+      htab_create_ggc (128, bitfield_stmt_access_pair_htab_hash,
+		       bitfield_stmt_access_pair_htab_eq, NULL);
+  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
+  new_pair->stmt = stmt;
+  new_pair->access = access;
+  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
+  if (*slot == HTAB_EMPTY_ENTRY)
+    {
+      *slot = new_pair;
+      return true;
+    }
+  return false;
+}
+
+/* Returns true if given COMPONENT_REF is part of an union.  */
+
+static bool part_of_union_p (tree component)
+{
+  tree tmp = component;
+  bool res = false;
+  while (TREE_CODE (tmp) == COMPONENT_REF)
+    {
+      if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)
+	{
+	  res = true;
+	  break;
+	}
+      tmp = TREE_OPERAND (tmp, 0);
+    }
+  if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE))
+    res = true;
+  return res;
+}
+
+/* Main entry point for the bit-field merge optimization.  */
+
+static unsigned int
+ssa_bitfield_merge (void)
+{
+  basic_block bb;
+  unsigned int todoflags = 0;
+  vec<bitfield_access> bitfield_accesses;
+  int ix, iy;
+  bitfield_access access;
+  bool cfg_changed = false;
+
+  /* In the strict volatile bitfields case, doing code changes here may prevent
+     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
+  if (flag_strict_volatile_bitfields > 0)
+    return 0;
+
+  FOR_EACH_BB (bb)
+    {
+      gimple_stmt_iterator gsi;
+      vec<bitfield_access> bitfield_accesses_merge = vNULL;
+      tree prev_representative = NULL_TREE;
+      bitfield_accesses.create (0);
+
+      /* Identify all bitfield copy sequences in the basic-block.  */
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  tree lhs, rhs;
+	  void **slot;
+	  struct bitfield_stmt_access_pair_d asdata;
+
+	  if (!is_gimple_assign (stmt))
+	    {
+	      gsi_next (&gsi);
+	      continue;
+	    }
+
+	  lhs = gimple_assign_lhs (stmt);
+	  rhs = gimple_assign_rhs1 (stmt);
+
+	  if (TREE_CODE (rhs) == COMPONENT_REF)
+	    {
+	      use_operand_p use;
+	      gimple use_stmt;
+	      tree op0 = TREE_OPERAND (rhs, 0);
+	      tree op1 = TREE_OPERAND (rhs, 1);
+
+	      if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1)
+		  && !TREE_THIS_VOLATILE (op1) && !part_of_union_p (rhs))
+		{
+		  if (single_imm_use (lhs, &use, &use_stmt)
+		       && is_gimple_assign (use_stmt))
+		    {
+		      tree use_lhs = gimple_assign_lhs (use_stmt);
+		      if (TREE_CODE (use_lhs) == COMPONENT_REF)
+			{
+			  tree use_op0 = TREE_OPERAND (use_lhs, 0);
+			  tree use_op1 = TREE_OPERAND (use_lhs, 1);
+			  tree tmp_repr = DECL_BIT_FIELD_REPRESENTATIVE (op1);
+
+			  if (TREE_CODE (use_op1) == FIELD_DECL
+			      && DECL_BIT_FIELD_TYPE (use_op1)
+			      && !TREE_THIS_VOLATILE (use_op1))
+			    {
+			      if (prev_representative
+				  && (prev_representative != tmp_repr))
+				{
+				  /* If previous access has different
+				     representative then barrier is needed
+				     between it and new access.  */
+				  access = create_and_insert_access
+					     (&bitfield_accesses);
+				  access->is_barrier = true;
+				}
+			      prev_representative = tmp_repr;
+			      /* Create new bit-field access structure.  */
+			      access = create_and_insert_access
+					 (&bitfield_accesses);
+			      /* Collect access data - load instruction.  */
+			      access->src_bit_size = tree_low_cst
+						      (DECL_SIZE (op1), 1);
+			      access->src_bit_offset = get_bit_offset (op1);
+			      access->src_offset_words =
+				field_byte_offset (op1) / UNITS_PER_WORD;
+			      access->src_field_offset =
+				tree_low_cst (DECL_FIELD_OFFSET (op1),1);
+			      access->src_addr = op0;
+			      access->load_stmt = gsi_stmt (gsi);
+			      /* Collect access data - store instruction.  */
+			      access->dst_bit_size =
+				tree_low_cst (DECL_SIZE (use_op1), 1);
+			      access->dst_bit_offset =
+				get_bit_offset (use_op1);
+			      access->dst_offset_words =
+				field_byte_offset (use_op1) / UNITS_PER_WORD;
+			      access->dst_addr = use_op0;
+			      access->store_stmt = use_stmt;
+			      add_stmt_access_pair (access, stmt);
+			      add_stmt_access_pair (access, use_stmt);
+			      access->bitfield_type
+				= DECL_BIT_FIELD_TYPE (use_op1);
+			      access->bitfield_representative = tmp_repr;
+			      access->field_decl_context =
+				DECL_FIELD_CONTEXT (op1);
+			    }
+			}
+		    }
+		}
+	    }
+
+	  /* Insert barrier for merging if statement is function call or memory
+	     access.  */
+	  if (bitfield_stmt_access_htab)
+	    {
+	      asdata.stmt = stmt;
+	      slot
+		= htab_find_slot (bitfield_stmt_access_htab, &asdata,
+				  NO_INSERT);
+	      if (!slot
+		  && ((gimple_code (stmt) == GIMPLE_CALL)
+		      || (gimple_has_mem_ops (stmt))))
+		{
+		  /* Create new bit-field access structure.  */
+		  access = create_and_insert_access (&bitfield_accesses);
+		  /* Mark it as barrier.  */
+		  access->is_barrier = true;
+		}
+	    }
+	  gsi_next (&gsi);
+	}
+
+      /* If there are no at least two accesses go to the next basic block.  */
+      if (bitfield_accesses.length () <= 1)
+	{
+	  bitfield_accesses.release ();
+	  continue;
+	}
+
+      /* Find bit-field accesses that can be merged.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  bitfield_access head_access;
+	  bitfield_access mrg_access;
+	  bitfield_access prev_access;
+	  if (!bitfield_accesses_merge.exists ())
+	    bitfield_accesses_merge.create (0);
+
+	  bitfield_accesses_merge.safe_push (access);
+
+	  if (!access->is_barrier
+	      && !(access == bitfield_accesses.last ()
+	      && !bitfield_accesses_merge.is_empty ()))
+	    continue;
+
+	  bitfield_accesses_merge.qsort (cmp_access);
+
+	  head_access = prev_access = NULL;
+	  for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
+	    {
+	      if (head_access
+		  && expressions_equal_p (head_access->src_addr,
+					  mrg_access->src_addr)
+		  && expressions_equal_p (head_access->dst_addr,
+					  mrg_access->dst_addr)
+		  && prev_access->src_offset_words
+		     == mrg_access->src_offset_words
+		  && prev_access->dst_offset_words
+		     == mrg_access->dst_offset_words
+		  && prev_access->src_bit_offset + prev_access->src_bit_size
+		     == mrg_access->src_bit_offset
+		  && prev_access->dst_bit_offset + prev_access->dst_bit_size
+		     == mrg_access->dst_bit_offset
+		  && prev_access->bitfield_representative
+		     == mrg_access->bitfield_representative)
+		{
+		  /* Merge conditions are satisfied - merge accesses.  */
+		  mrg_access->merged = true;
+		  prev_access->next = mrg_access;
+		  head_access->modified = true;
+		  prev_access = mrg_access;
+		}
+	      else
+		head_access = prev_access = mrg_access;
+	    }
+	  bitfield_accesses_merge.release ();
+	  bitfield_accesses_merge = vNULL;
+	}
+
+      /* Modify generated code.  */
+      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+	{
+	  if (access->merged)
+	    {
+	      /* Access merged - remove instructions.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      gsi_remove (&tmp_gsi, true);
+	    }
+	  else if (access->modified)
+	    {
+	      /* Access modified - modify generated code.  */
+	      gimple_stmt_iterator tmp_gsi;
+	      tree tmp_ssa;
+	      tree itype = make_node (INTEGER_TYPE);
+	      tree new_rhs;
+	      tree new_lhs;
+	      gimple new_stmt;
+	      char new_field_name [15];
+	      int decl_size;
+
+	      /* Bit-field size changed - modify load statement.  */
+	      access->src_bit_size = get_merged_bit_field_size (access);
+
+	      TYPE_PRECISION (itype) = access->src_bit_size;
+	      fixup_unsigned_type (itype);
+
+	      /* Create new declaration.  */
+	      tree new_field = make_node (FIELD_DECL);
+	      sprintf (new_field_name, "_field%x", new_field_no++);
+	      DECL_NAME (new_field) = get_identifier (new_field_name);
+	      TREE_TYPE (new_field) = itype;
+	      DECL_BIT_FIELD (new_field) = 1;
+	      DECL_BIT_FIELD_TYPE (new_field) = access->bitfield_type;
+	      DECL_BIT_FIELD_REPRESENTATIVE (new_field) =
+		access->bitfield_representative;
+	      DECL_FIELD_CONTEXT (new_field) = access->field_decl_context;
+	      DECL_NONADDRESSABLE_P (new_field) = 1;
+	      DECL_FIELD_OFFSET (new_field) =
+		build_int_cst (unsigned_type_node, access->src_field_offset);
+	      DECL_FIELD_BIT_OFFSET (new_field) =
+		build_int_cst (unsigned_type_node, access->src_bit_offset);
+	      DECL_SIZE (new_field) = build_int_cst (unsigned_type_node,
+						     access->src_bit_size);
+	      decl_size = access->src_bit_size / BITS_PER_UNIT
+		+ (access->src_bit_size % BITS_PER_UNIT ? 1 : 0);
+	      DECL_SIZE_UNIT (new_field) =
+		build_int_cst (unsigned_type_node, decl_size);
+
+	      tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);
+
+	      /* Create new component ref.  */
+	      new_rhs = build3 (COMPONENT_REF, itype, access->src_addr,
+				new_field, NULL);
+	      tmp_gsi = gsi_for_stmt (access->load_stmt);
+	      new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
+	      gsi_remove (&tmp_gsi, true);
+
+	      /* Bit-field size changed - modify store statement.  */
+	      /* Create new component ref.  */
+	      new_lhs = build3 (COMPONENT_REF, itype, access->dst_addr,
+				new_field, NULL);
+	      new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
+	      tmp_gsi = gsi_for_stmt (access->store_stmt);
+	      gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
+	      gsi_remove (&tmp_gsi, true);
+	      cfg_changed = true;
+	    }
+	}
+      /* Empty or delete data structures used for basic block.  */
+      htab_empty (bitfield_stmt_access_htab);
+      bitfield_stmt_access_htab = NULL;
+      bitfield_accesses.release ();
+    }
+
+  if (cfg_changed)
+    todoflags |= TODO_cleanup_cfg;
+
+  return todoflags;
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
+  unsigned int todoflags = 0;
   sra_mode = SRA_MODE_EARLY_INTRA;
-  return perform_intra_sra ();
+  if (flag_tree_bitfield_merge)
+    todoflags = ssa_bitfield_merge ();
+  return todoflags | perform_intra_sra ();
 }
 
 /* Perform "late" intraprocedural SRA.  */
@@ -5105,3 +5570,5 @@ make_pass_early_ipa_sra (gcc::context *ctxt)
 {
   return new pass_early_ipa_sra (ctxt);
 }
+
+#include "gt-tree-sra.h"
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index bd2feb4..683fd76 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -4176,29 +4176,6 @@ get_next_value_id (void)
   return next_value_id++;
 }
 
-
-/* Compare two expressions E1 and E2 and return true if they are equal.  */
-
-bool
-expressions_equal_p (tree e1, tree e2)
-{
-  /* The obvious case.  */
-  if (e1 == e2)
-    return true;
-
-  /* If only one of them is null, they cannot be equal.  */
-  if (!e1 || !e2)
-    return false;
-
-  /* Now perform the actual comparison.  */
-  if (TREE_CODE (e1) == TREE_CODE (e2)
-      && operand_equal_p (e1, e2, OEP_PURE_SAME))
-    return true;
-
-  return false;
-}
-
-
 /* Return true if the nary operation NARY may trap.  This is a copy
    of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
 
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index 94e3603..707b18c 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -21,10 +21,6 @@
 #ifndef TREE_SSA_SCCVN_H
 #define TREE_SSA_SCCVN_H
 
-/* In tree-ssa-sccvn.c  */
-bool expressions_equal_p (tree, tree);
-
-
 /* TOP of the VN lattice.  */
 extern tree VN_TOP;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 6593cf8..6683957 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12155,4 +12155,44 @@ contains_bitfld_component_ref_p (const_tree ref)
   return false;
 }
 
+/* Compare two expressions E1 and E2 and return true if they are equal.  */
+
+bool
+expressions_equal_p (tree e1, tree e2)
+{
+  /* The obvious case.  */
+  if (e1 == e2)
+    return true;
+
+  /* If only one of them is null, they cannot be equal.  */
+  if (!e1 || !e2)
+    return false;
+
+  /* Now perform the actual comparison.  */
+  if (TREE_CODE (e1) == TREE_CODE (e2)
+      && operand_equal_p (e1, e2, OEP_PURE_SAME))
+    return true;
+
+  return false;
+}
+
+/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
+   node, return the size in bits for the type if it is a constant, or else
+   return the alignment for the type if the type's size is not constant, or
+   else return BITS_PER_WORD if the type actually turns out to be an
+   ERROR_MARK node.  */
+
+unsigned HOST_WIDE_INT
+simple_type_size_in_bits (const_tree type)
+{
+  if (TREE_CODE (type) == ERROR_MARK)
+    return BITS_PER_WORD;
+  else if (TYPE_SIZE (type) == NULL_TREE)
+    return 0;
+  else if (host_integerp (TYPE_SIZE (type), 1))
+    return tree_low_cst (TYPE_SIZE (type), 1);
+  else
+    return TYPE_ALIGN (type);
+}
+
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index a263a2c..b2bd481 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4528,6 +4528,7 @@ extern tree get_ref_base_and_extent (tree, HOST_WIDE_INT *,
 				     HOST_WIDE_INT *, HOST_WIDE_INT *);
 extern bool contains_bitfld_component_ref_p (const_tree);
 extern bool type_in_anonymous_namespace_p (tree);
+extern bool expressions_equal_p (tree e1, tree e2);
 
 /* In tree-nested.c */
 extern tree build_addr (tree, tree);
@@ -4693,6 +4694,10 @@ extern tree resolve_asm_operand_names (tree, tree, tree, tree);
 extern tree tree_overlaps_hard_reg_set (tree, HARD_REG_SET *);
 #endif
 
+/* In dwarf2out.c.  */
+HOST_WIDE_INT
+field_byte_offset (const_tree decl);
+
 ?
 /* In tree-inline.c  */
 
@@ -4979,5 +4984,6 @@ builtin_decl_implicit_p (enum built_in_function fncode)
 #endif	/* NO_DOLLAR_IN_LABEL */
 #endif	/* NO_DOT_IN_LABEL */
 
+extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
 
 #endif  /* GCC_TREE_H  */


Regards,
Zoran Jovanovic

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
       [not found]     ` <CAFiYyc0dcpDeXqwM2G3BTJUkpTsjzivRVEuWGfmGE4QcMhxERA@mail.gmail.com>
@ 2013-11-08 13:07       ` Richard Biener
  2013-11-08 14:11         ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2013-11-08 13:07 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches

> Hello,
> This is new patch version.
> Comments from Bernhard Reutner-Fischer review applied.
> Also, test case bitfildmrg2.c modified - it is now execute test.
> 
> 
> Example:
> 
> Original code:
>   <unnamed-unsigned:3> D.1351;
>   <unnamed-unsigned:9> D.1350;
>   <unnamed-unsigned:7> D.1349;
>   D.1349_2 = p1_1(D)->f1;
>   p2_3(D)->f1 = D.1349_2;
>   D.1350_4 = p1_1(D)->f2;
>   p2_3(D)->f2 = D.1350_4;
>   D.1351_5 = p1_1(D)->f3;
>   p2_3(D)->f3 = D.1351_5;
> 
> Optimized code:
>   <unnamed-unsigned:19> D.1358;
>   _16 = pr1_2(D)->_field0;
>   pr2_4(D)->_field0 = _16;
> 
> Algorithm works on basic block level and consists of following 3 major steps:
> 1. Go trough basic block statements list. If there are statement pairs
> that implement copy of bit field content from one memory location to
> another record statements pointers and other necessary data in
> corresponding data structure.
> 2. Identify records that represent adjacent bit field accesses and
> mark them as merged.
> 3. Modify trees accordingly.
> 
> New command line option "-ftree-bitfield-merge" is introduced.
> 
> Tested - passed gcc regression tests.

Comments inline (sorry for the late reply...)

> Changelog -
> 
> gcc/ChangeLog:
> 2013-09-24 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
>   * Makefile.in : Added tree-sra.c to GTFILES.
>   * common.opt (ftree-bitfield-merge): New option.
>   * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
>   * tree-sra.c (ssa_bitfield_merge): New function.
>   Entry for (-ftree-bitfield-merge).
>   (bitfield_stmt_access_pair_htab_hash): New function.
>   (bitfield_stmt_access_pair_htab_eq): New function.
>   (cmp_access): New function.
>   (create_and_insert_access): New function.
>   (get_bit_offset): New function.
>   (get_merged_bit_field_size): New function.
>   (add_stmt_access_pair): New function.
>   * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.
>   (field_byte_offset): declaration moved to tree.h, static removed.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
>   * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.
>   * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.
>   * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.
>   (simple_type_size_in_bits): moved from dwarf2out.c.
>   * tree.h (expressions_equal_p): declaration added.
>   (field_byte_offset): declaration added.
> 
> Patch -
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index a2e3f7a..54aa8e7 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -3847,6 +3847,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h
> $(srcdir)/coretypes.h \
>    $(srcdir)/asan.c \
>    $(srcdir)/ubsan.c \
>    $(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
> +  $(srcdir)/tree-sra.c \
>    @all_gtfiles@
> 
>  # Compute the list of GT header files from the corresponding C sources,
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 202e169..afac514 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2164,6 +2164,10 @@ ftree-sra
>  Common Report Var(flag_tree_sra) Optimization
>  Perform scalar replacement of aggregates
> 
> +ftree-bitfield-merge
> +Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
> +Enable bit-field merge on trees
> +

Drop the tree- prefix for new options, it doesn't tell users anything.
I suggest

fmerge-bitfields
Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
Merge loads and stores of consecutive bitfields

and adjust docs with regarding to the flag name change of course.

>  ftree-ter
>  Common Report Var(flag_tree_ter) Optimization
>  Replace temporary expressions in the SSA->normal pass
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index aa0f4ed..e588cae 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,7 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
>  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -7679,6 +7679,11 @@ pointer alignment information.
>  This pass only operates on local scalar variables and is enabled by default
>  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
> 
> +@item -ftree-bitfield-merge
> +@opindex ftree-bitfield-merge
> +Combines several adjacent bit-field accesses that copy values
> +from one memory location to another into one single bit-field access.
> +
>  @item -ftree-ccp
>  @opindex ftree-ccp
>  Perform sparse conditional constant propagation (CCP) on trees.  This
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 95049e4..e74db17 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -3108,8 +3108,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT,
> unsigned int);
>  static tree field_type (const_tree);
>  static unsigned int simple_type_align_in_bits (const_tree);
>  static unsigned int simple_decl_align_in_bits (const_tree);
> -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> -static HOST_WIDE_INT field_byte_offset (const_tree);
>  static void add_AT_location_description        (dw_die_ref, enum
> dwarf_attribute,
>                                          dw_loc_list_ref);
>  static void add_data_member_location_attribute (dw_die_ref, tree);
> @@ -10149,25 +10147,6 @@ is_base_type (tree type)
>    return 0;
>  }
> 
> -/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> -   node, return the size in bits for the type if it is a constant, or else
> -   return the alignment for the type if the type's size is not constant, or
> -   else return BITS_PER_WORD if the type actually turns out to be an
> -   ERROR_MARK node.  */
> -
> -static inline unsigned HOST_WIDE_INT
> -simple_type_size_in_bits (const_tree type)
> -{
> -  if (TREE_CODE (type) == ERROR_MARK)
> -    return BITS_PER_WORD;
> -  else if (TYPE_SIZE (type) == NULL_TREE)
> -    return 0;
> -  else if (host_integerp (TYPE_SIZE (type), 1))
> -    return tree_low_cst (TYPE_SIZE (type), 1);
> -  else
> -    return TYPE_ALIGN (type);
> -}
> -
>  /* Similarly, but return a double_int instead of UHWI.  */
> 
>  static inline double_int
> @@ -14521,7 +14500,7 @@ round_up_to_align (double_int t, unsigned int align)
>     because the offset is actually variable.  (We can't handle the latter case
>     just yet).  */
> 
> -static HOST_WIDE_INT
> +HOST_WIDE_INT
>  field_byte_offset (const_tree decl)
>  {
>    double_int object_offset_in_bits;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> new file mode 100644
> index 0000000..e9e96b7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
> +
> +struct S
> +{
> +  unsigned f1:7;
> +  unsigned f2:9;
> +  unsigned f3:3;
> +  unsigned f4:5;
> +  unsigned f5:1;
> +  unsigned f6:2;
> +};
> +
> +unsigned
> +foo (struct S *p1, struct S *p2, int *ptr)
> +{
> +  p2->f1 = p1->f1;
> +  p2->f2 = p1->f2;
> +  p2->f3 = p1->f3;
> +  *ptr = 7;
> +  p2->f4 = p1->f4;
> +  p2->f5 = p1->f5;
> +  p2->f6 = p1->f6;
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "19" "esra" } } */
> +/* { dg-final { scan-tree-dump "8" "esra"} } */

That's an awfully unspecific dump scan ;)  Please make it more
specific and add a comment what code generation you expect.

> +/* { dg-final { cleanup-tree-dump "esra" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> new file mode 100644
> index 0000000..653e904
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> @@ -0,0 +1,90 @@
> +/* Check whether use of -ftree-bitfield-merge results in presence of overlaping
> +   unions results in incorrect code.  */
> +/* { dg-options "-O2 -ftree-bitfield-merge" }  */
> +/* { dg-do run } */
> +#include <stdio.h>
> +extern void abort (void);
> +
> +struct s1
> +{
> +  unsigned f1:4;
> +  unsigned f2:4;
> +  unsigned f3:4;
> +};
> +
> +struct s2
> +{
> +  unsigned char c;
> +  unsigned f1:4;
> +  unsigned f2:4;
> +  unsigned f3:4;
> +};
> +
> +struct s3
> +{
> +  unsigned f1:3;
> +  unsigned f2:3;
> +  unsigned f3:3;
> +};
> +
> +struct s4
> +{
> +  unsigned f0:3;
> +  unsigned f1:3;
> +  unsigned f2:3;
> +  unsigned f3:3;
> +};
> +
> +union un_1
> +{
> +  struct s1 a;
> +  struct s2 b;
> +};
> +
> +union un_2
> +{
> +  struct s3 a;
> +  struct s4 b;
> +};
> +
> +void f1 (union un_1 *p1, union un_1 *p2)
> +{
> +  p2->a.f3 = p1->b.f3;
> +  p2->a.f2 = p1->b.f2;
> +  p2->a.f1 = p1->b.f1;
> +
> +  if (p1->b.f1 != 3)
> +    abort ();
> +}
> +
> +void f2 (union un_2 *p1, union un_2 *p2)
> +{
> +  p2->b.f1 = p1->a.f1;
> +  p2->b.f2 = p1->a.f2;
> +  p2->b.f3 = p1->a.f3;
> +
> +  if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0)
> +    abort ();
> +}
> +
> +int main ()
> +{
> +  union un_1 u1;
> +  union un_2 u2;
> +
> +  u1.b.f1 = 1;
> +  u1.b.f2 = 2;
> +  u1.b.f3 = 3;
> +  u1.b.c = 0;
> +
> +  f1 (&u1, &u1);
> +
> +  u2.b.f0 = 0;
> +  u2.b.f1 = 1;
> +  u2.b.f2 = 2;
> +  u2.b.f3 = 3;
> +
> +  f2 (&u2, &u2);
> +
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 58c7565..610245b 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -92,6 +92,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-pretty-print.h"
>  #include "ipa-inline.h"
>  #include "ipa-utils.h"
> +#include "ggc.h"

You shouldn't need that I think.

>  /* Enumeration of all aggregate reductions we can do.  */
>  enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
> @@ -3424,12 +3425,476 @@ perform_intra_sra (void)
>    return ret;
>  }
> 
> +/* This optimization combines several adjacent bit-field accesses that copy
> +   values from one memory location to another into one single bit-field
> +   access.  */
> +
> +/* Data for single bit-field read/write sequence.  */
> +struct GTY (()) bitfield_access_d {
> +  gimple load_stmt;              /* Bit-field load statement.  */
> +  gimple store_stmt;             /* Bit-field store statement.  */
> +  unsigned src_offset_words;     /* Bit-field offset at src in words.  */
> +  unsigned src_bit_offset;       /* Bit-field offset inside source word.  */
> +  unsigned src_bit_size;         /* Size of bit-field in source word.  */
> +  unsigned dst_offset_words;     /* Bit-field offset at dst in words.  */
> +  unsigned dst_bit_offset;       /* Bit-field offset inside destination
> +                                    word.  */
> +  unsigned src_field_offset;     /* Source field offset.  */
> +  unsigned dst_bit_size;         /* Size of bit-field in destination word.  */
> +  tree src_addr;                 /* Address of source memory access.  */
> +  tree dst_addr;                 /* Address of destination memory access.  */
> +  bool merged;                   /* True if access is merged with another
> +                                    one.  */
> +  bool modified;                 /* True if bit-field size is modified.  */
> +  bool is_barrier;               /* True if access is barrier (call or mem
> +                                    access).  */
> +  struct bitfield_access_d *next; /* Access with which this one is merged.  */
> +  tree bitfield_type;            /* Field type.  */
> +  tree bitfield_representative;          /* Bit field representative
> of original
> +                                    declaration.  */
> +  tree field_decl_context;       /* Context of original bit-field
> +                                    declaration.  */
> +};
> +
> +typedef struct bitfield_access_d bitfield_access_o;
> +typedef struct bitfield_access_d *bitfield_access;
> +
> +/* Connecting register with bit-field access sequence that defines value in
> +   that register.  */
> +struct GTY (()) bitfield_stmt_access_pair_d
> +{
> +  gimple stmt;
> +  bitfield_access access;
> +};
> +
> +typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
> +typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
> +
> +static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
> +  htab_t bitfield_stmt_access_htab;

Nor does this need to be registered with the garbage collector.  Its
lifetime is not greater than that of the SRA pass, right?

> +/* Hash table callbacks for bitfield_stmt_access_htab.  */
> +
> +static hashval_t
> +bitfield_stmt_access_pair_htab_hash (const void *p)
> +{
> +  const bitfield_stmt_access_pair entry = (const bitfield_stmt_access_pair)p;
> +  return (hashval_t) (uintptr_t) entry->stmt;
> +}
> +
> +static int
> +bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
> +{
> +  const struct bitfield_stmt_access_pair_d *entry1 =
> +    (const bitfield_stmt_access_pair)p1;
> +  const struct bitfield_stmt_access_pair_d *entry2 =
> +    (const bitfield_stmt_access_pair)p2;
> +  return entry1->stmt == entry2->stmt;
> +}
> +
> +/* Counter used for generating unique names for new fields.  */
> +static unsigned new_field_no;

Must be stale...?

> +/* Compare two bit-field access records.  */
> +
> +static int
> +cmp_access (const void *p1, const void *p2)
> +{
> +  const bitfield_access a1 = (*(const bitfield_access*)p1);
> +  const bitfield_access a2 = (*(const bitfield_access*)p2);
> +
> +  if (a1->bitfield_representative - a2->bitfield_representative)
> +    return a1->bitfield_representative - a2->bitfield_representative;

Subtracting two unrelated pointers is undefined - I suggest you
use DECL_UID (a1->bitfield_representative).

> +  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
> +    return a1 - a2;

Same - the comparison ends up depending on memory layout, that's bad.

> +
> +  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
> +    return a1 - a2;
> +  if (a1->src_offset_words - a2->src_offset_words)
> +    return a1->src_offset_words - a2->src_offset_words;
> +
> +  return a1->src_bit_offset - a2->src_bit_offset;
> +}
> +
> +/* Create new bit-field access structure and add it to given bitfield_accesses
> +   htab.  */
> +
> +static bitfield_access
> +create_and_insert_access (vec<bitfield_access>
> +                      *bitfield_accesses)
> +{
> +  bitfield_access access = ggc_alloc_bitfield_access_d ();
> +  memset (access, 0, sizeof (struct bitfield_access_d));
> +  bitfield_accesses->safe_push (access);
> +  return access;
> +}
> +
> +/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
> +
> +static inline HOST_WIDE_INT
> +get_bit_offset (tree decl)
> +{
> +  tree type = DECL_BIT_FIELD_TYPE (decl);
> +  HOST_WIDE_INT bitpos_int;
> +
> +  /* Must be a field and a bit-field.  */
> +  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
> +  /* Bit position and decl size should be integer constants that can be
> +     represented in a signle HOST_WIDE_INT.  */
> +  if (! host_integerp (bit_position (decl), 0)
> +      || ! host_integerp (DECL_SIZE (decl), 1))
> +    return -1;
> +
> +  bitpos_int = int_bit_position (decl);
> +  return bitpos_int;
> +}
> +
> +/* Returns size of combined bitfields.  Size cannot be larger than size
> +   of largest directly accessible memory unit.  */
> +
> +static int
> +get_merged_bit_field_size (bitfield_access access)
> +{
> +  bitfield_access tmp_access = access;
> +  int size = 0;
> +
> +  while (tmp_access)
> +  {
> +    size += tmp_access->src_bit_size;
> +    tmp_access = tmp_access->next;
> +  }
> +  return size;
> +}
> +
> +/* Adds new pair consisting of statement and bit-field access structure that
> +   contains it.  */
> +
> +static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
> +{
> +  bitfield_stmt_access_pair new_pair;
> +  void **slot;
> +  if (!bitfield_stmt_access_htab)
> +    bitfield_stmt_access_htab =
> +      htab_create_ggc (128, bitfield_stmt_access_pair_htab_hash,
> +                      bitfield_stmt_access_pair_htab_eq, NULL);
> +  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
> +  new_pair->stmt = stmt;
> +  new_pair->access = access;
> +  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
> +  if (*slot == HTAB_EMPTY_ENTRY)
> +    {
> +      *slot = new_pair;
> +      return true;
> +    }
> +  return false;
> +}
> +
> +/* Returns true if given COMPONENT_REF is part of an union.  */
> +
> +static bool part_of_union_p (tree component)
> +{
> +  tree tmp = component;
> +  bool res = false;
> +  while (TREE_CODE (tmp) == COMPONENT_REF)
> +    {
> +      if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)
> +       {
> +         res = true;
> +         break;
> +       }
> +      tmp = TREE_OPERAND (tmp, 0);
> +    }
> +  if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE))
> +    res = true;
> +  return res;
> +}
> +
> +/* Main entry point for the bit-field merge optimization.  */

Ok, I'm skipping to here (I was wondering what you need all the above
functions for)

> +static unsigned int
> +ssa_bitfield_merge (void)
> +{
> +  basic_block bb;
> +  unsigned int todoflags = 0;
> +  vec<bitfield_access> bitfield_accesses;
> +  int ix, iy;
> +  bitfield_access access;
> +  bool cfg_changed = false;
> +
> +  /* In the strict volatile bitfields case, doing code changes here may prevent
> +     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
> +  if (flag_strict_volatile_bitfields > 0)
> +    return 0;

Hmm, I'm not sure we should care ... - but well, I don't care ;)

> +  FOR_EACH_BB (bb)
> +    {
> +      gimple_stmt_iterator gsi;
> +      vec<bitfield_access> bitfield_accesses_merge = vNULL;
> +      tree prev_representative = NULL_TREE;
> +      bitfield_accesses.create (0);
> +
> +      /* Identify all bitfield copy sequences in the basic-block.  */
> +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> +       {
> +         gimple stmt = gsi_stmt (gsi);
> +         tree lhs, rhs;
> +         void **slot;
> +         struct bitfield_stmt_access_pair_d asdata;
> +
> +         if (!is_gimple_assign (stmt))

Instead of checking TREE_THIS_VOLATILE below check here
  || gimple_has_volatile_ops (stmt)

also you can narrow the assigns to visit by checking for

           !is_gimple_assign_single (stmt)

instead

> +           {
> +             gsi_next (&gsi);
> +             continue;
> +           }
> +
> +         lhs = gimple_assign_lhs (stmt);
> +         rhs = gimple_assign_rhs1 (stmt);
> +
> +         if (TREE_CODE (rhs) == COMPONENT_REF)
> +           {
> +             use_operand_p use;
> +             gimple use_stmt;
> +             tree op0 = TREE_OPERAND (rhs, 0);
> +             tree op1 = TREE_OPERAND (rhs, 1);
> +
> +             if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1)

op1 is always a FIELD_DECL

> +                 && !TREE_THIS_VOLATILE (op1) && !part_of_union_p (rhs))

I wonder what's the issue with unions ... sure, for

union {
  int field1 : 3;
  int field2 : 2;
};

but for

union {
  struct {
     int field1 : 3;
     int field2 : 2;
  } a;
...
};

?  Thus I'd simply check

            && TREE_CODE (DECL_CONTEXT (op1)) != UNION_TYPE
            && TREE_CODE (DECL_CONTEXT (op1)) != QUAL_UNION_TYPE

maybe introduce

#define UNION_TYPE_P(TYPE)
  (TREE_CODE (TYPE) == UNION_TYPE            \
   || TREE_CODE (TYPE) == QUAL_UNION_TYPE)

alongside RECORD_OR_UNION_TYPE_P in tree.h.

> +               {
> +                 if (single_imm_use (lhs, &use, &use_stmt)
> +                      && is_gimple_assign (use_stmt))
> +                   {
> +                     tree use_lhs = gimple_assign_lhs (use_stmt);
> +                     if (TREE_CODE (use_lhs) == COMPONENT_REF)

I'm not sure I follow the logic here, but it seems that you match
a very specific pattern only - a bitfield copy.

I'd have applied the lowering (this is really a lowering pass
with a cost model you make up here) if the same 
DECL_BIT_FIELD_REPRESENTATIVE is used more than once in a BB
on the same underlying base object.  That is, we are reasonably
sure we can remove a redundant load and eventually redundant stores.

Thus, very simplistic you'd just record a op0, 
DECL_BIT_FIELD_REPRESENTATIVE pair into the hashtable, using
iterative_hash_expr for op0 and DECL_UID for the representative,
counting the number of times you see them.  Make sure to apply
the same for stores to bitfields, of course.

> +                       {
> +                         tree use_op0 = TREE_OPERAND (use_lhs, 0);
> +                         tree use_op1 = TREE_OPERAND (use_lhs, 1);
> +                         tree tmp_repr = DECL_BIT_FIELD_REPRESENTATIVE (op1);
> +
> +                         if (TREE_CODE (use_op1) == FIELD_DECL
> +                             && DECL_BIT_FIELD_TYPE (use_op1)
> +                             && !TREE_THIS_VOLATILE (use_op1))
> +                           {
> +                             if (prev_representative
> +                                 && (prev_representative != tmp_repr))
> +                               {
> +                                 /* If previous access has different
> +                                    representative then barrier is needed
> +                                    between it and new access.  */
> +                                 access = create_and_insert_access
> +                                            (&bitfield_accesses);
> +                                 access->is_barrier = true;
> +                               }
> +                             prev_representative = tmp_repr;
> +                             /* Create new bit-field access structure.  */
> +                             access = create_and_insert_access
> +                                        (&bitfield_accesses);
> +                             /* Collect access data - load instruction.  */
> +                             access->src_bit_size = tree_low_cst
> +                                                     (DECL_SIZE (op1), 1);
> +                             access->src_bit_offset = get_bit_offset (op1);
> +                             access->src_offset_words =
> +                               field_byte_offset (op1) / UNITS_PER_WORD;
> +                             access->src_field_offset =
> +                               tree_low_cst (DECL_FIELD_OFFSET (op1),1);
> +                             access->src_addr = op0;
> +                             access->load_stmt = gsi_stmt (gsi);
> +                             /* Collect access data - store instruction.  */
> +                             access->dst_bit_size =
> +                               tree_low_cst (DECL_SIZE (use_op1), 1);
> +                             access->dst_bit_offset =
> +                               get_bit_offset (use_op1);
> +                             access->dst_offset_words =
> +                               field_byte_offset (use_op1) / UNITS_PER_WORD;
> +                             access->dst_addr = use_op0;
> +                             access->store_stmt = use_stmt;
> +                             add_stmt_access_pair (access, stmt);
> +                             add_stmt_access_pair (access, use_stmt);
> +                             access->bitfield_type
> +                               = DECL_BIT_FIELD_TYPE (use_op1);
> +                             access->bitfield_representative = tmp_repr;
> +                             access->field_decl_context =
> +                               DECL_FIELD_CONTEXT (op1);
> +                           }
> +                       }
> +                   }
> +               }
> +           }
> +
> +         /* Insert barrier for merging if statement is function call or memory
> +            access.  */
> +         if (bitfield_stmt_access_htab)
> +           {
> +             asdata.stmt = stmt;
> +             slot
> +               = htab_find_slot (bitfield_stmt_access_htab, &asdata,
> +                                 NO_INSERT);
> +             if (!slot
> +                 && ((gimple_code (stmt) == GIMPLE_CALL)
> +                     || (gimple_has_mem_ops (stmt))))
> +               {
> +                 /* Create new bit-field access structure.  */
> +                 access = create_and_insert_access (&bitfield_accesses);
> +                 /* Mark it as barrier.  */
> +                 access->is_barrier = true;
> +               }
> +           }
> +         gsi_next (&gsi);
> +       }
> +
> +      /* If there are no at least two accesses go to the next basic block.  */
> +      if (bitfield_accesses.length () <= 1)
> +       {
> +         bitfield_accesses.release ();
> +         continue;
> +       }
> +
> +      /* Find bit-field accesses that can be merged.  */
> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +       {
> +         bitfield_access head_access;
> +         bitfield_access mrg_access;
> +         bitfield_access prev_access;
> +         if (!bitfield_accesses_merge.exists ())
> +           bitfield_accesses_merge.create (0);
> +
> +         bitfield_accesses_merge.safe_push (access);
> +
> +         if (!access->is_barrier
> +             && !(access == bitfield_accesses.last ()
> +             && !bitfield_accesses_merge.is_empty ()))
> +           continue;
> +
> +         bitfield_accesses_merge.qsort (cmp_access);
> +
> +         head_access = prev_access = NULL;
> +         for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
> +           {
> +             if (head_access
> +                 && expressions_equal_p (head_access->src_addr,
> +                                         mrg_access->src_addr)
> +                 && expressions_equal_p (head_access->dst_addr,
> +                                         mrg_access->dst_addr)
> +                 && prev_access->src_offset_words
> +                    == mrg_access->src_offset_words
> +                 && prev_access->dst_offset_words
> +                    == mrg_access->dst_offset_words
> +                 && prev_access->src_bit_offset + prev_access->src_bit_size
> +                    == mrg_access->src_bit_offset
> +                 && prev_access->dst_bit_offset + prev_access->dst_bit_size
> +                    == mrg_access->dst_bit_offset
> +                 && prev_access->bitfield_representative
> +                    == mrg_access->bitfield_representative)
> +               {
> +                 /* Merge conditions are satisfied - merge accesses.  */
> +                 mrg_access->merged = true;
> +                 prev_access->next = mrg_access;
> +                 head_access->modified = true;
> +                 prev_access = mrg_access;
> +               }
> +             else
> +               head_access = prev_access = mrg_access;
> +           }
> +         bitfield_accesses_merge.release ();
> +         bitfield_accesses_merge = vNULL;
> +       }
> +
> +      /* Modify generated code.  */

Ick, so you are actually applying an optimization instead of just
lowering the accesses to DECL_BIT_FIELD_REPRESENTATIVE accesses
and letting CSE and DSE do their job.  Hmm.  I don't think this is
a good idea.

Instead what I'd like to see is more something like the following
(bah, I never merged BIT_FIELD_EXPR which performs the bit-field-insert
in a single stmt);  quickly hacked together, without a cost model,
just the lowering piece (likely not endian clean - but who knows).

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c	(revision 204561)
+++ gcc/tree-sra.c	(working copy)
@@ -3445,10 +3445,121 @@ perform_intra_sra (void)
   return ret;
 }
 
+static void
+lower_bitfields (void)
+{
+  basic_block bb;
+  FOR_EACH_BB (bb)
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	 !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (!gimple_assign_single_p (stmt)
+	    || gimple_has_volatile_ops (stmt))
+	  continue;
+
+	/* Lower a bitfield read.  */
+	tree ref = gimple_assign_rhs1 (stmt);
+	if (TREE_CODE (ref) == COMPONENT_REF
+	    && DECL_BIT_FIELD_TYPE (TREE_OPERAND (ref, 1))
+	    && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (ref, 1)))
+	  {
+	    tree field = TREE_OPERAND (ref, 1);
+	    tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+	    unsigned HOST_WIDE_INT off;
+	    if (host_integerp (DECL_FIELD_OFFSET (field), 1)
+		&& host_integerp (DECL_FIELD_OFFSET (rep), 1))
+	      off = (tree_low_cst (DECL_FIELD_OFFSET (field), 1)
+		     - tree_low_cst (DECL_FIELD_OFFSET (rep), 1)) * BITS_PER_UNIT;
+	    else
+	      off = 0;
+	    off += (tree_low_cst (DECL_FIELD_BIT_OFFSET (field), 1)
+		    - tree_low_cst (DECL_FIELD_BIT_OFFSET (rep), 1));
+	    tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+	    gimple load
+	      = gimple_build_assign (loadres,
+				     build3 (COMPONENT_REF, TREE_TYPE (rep),
+					     TREE_OPERAND (ref, 0), rep,
+					     NULL_TREE));
+	    gimple_set_vuse (load, gimple_vuse (stmt));
+	    gsi_insert_before (&gsi, load, GSI_SAME_STMT);
+	    gimple_assign_set_rhs1 (stmt,
+				    build3 (BIT_FIELD_REF, TREE_TYPE (ref),
+					    loadres,
+					    DECL_SIZE (field),
+					    bitsize_int (off)));
+	    update_stmt (stmt);
+	  }
+	ref = gimple_assign_lhs (stmt);
+	if (TREE_CODE (ref) == COMPONENT_REF
+	    && DECL_BIT_FIELD_TYPE (TREE_OPERAND (ref, 1))
+	    && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (ref, 1)))
+	  {
+	    tree field = TREE_OPERAND (ref, 1);
+	    tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+	    unsigned HOST_WIDE_INT off;
+	    if (host_integerp (DECL_FIELD_OFFSET (field), 1)
+		&& host_integerp (DECL_FIELD_OFFSET (rep), 1))
+	      off = (tree_low_cst (DECL_FIELD_OFFSET (field), 1)
+		     - tree_low_cst (DECL_FIELD_OFFSET (rep), 1)) * BITS_PER_UNIT;
+	    else
+	      off = 0;
+	    off += (tree_low_cst (DECL_FIELD_BIT_OFFSET (field), 1)
+		    - tree_low_cst (DECL_FIELD_BIT_OFFSET (rep), 1));
+	    tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+	    gimple load
+	      = gimple_build_assign (loadres,
+				     build3 (COMPONENT_REF, TREE_TYPE (rep),
+					     unshare_expr (TREE_OPERAND (ref, 0)),
+					     rep,
+					     NULL_TREE));
+	    gimple_set_vuse (load, gimple_vuse (stmt));
+	    gsi_insert_before (&gsi, load, GSI_SAME_STMT);
+	    /* FIXME:  BIT_FIELD_EXPR.  */
+	    /* Mask out bits.  */
+	    tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
+	    gimple tems
+	      = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					      masked, loadres,
+					      double_int_to_tree (TREE_TYPE (rep),
+								  double_int::mask (TREE_INT_CST_LOW (DECL_SIZE (field))).lshift (off)));
+	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
+	    /* Zero-extend the value to representative size.  */
+	    tree tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL);
+	    tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
+						 gimple_assign_rhs1 (stmt),
+						 NULL_TREE);
+	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
+	    tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
+	    tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE);
+	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
+	    /* Shift the value into place.  */
+	    tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
+	    tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
+						 size_int (off));
+	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
+	    /* Merge masked loaded value and value.  */
+	    tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
+	    gimple mod
+	      = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
+					      masked, tem2);
+	    gsi_insert_before (&gsi, mod, GSI_SAME_STMT);
+	    /* Finally adjust the store.  */
+	    gimple_assign_set_rhs1 (stmt, modres);
+	    gimple_assign_set_lhs (stmt,
+				   build3 (COMPONENT_REF, TREE_TYPE (rep),
+					   TREE_OPERAND (ref, 0), rep,
+					   NULL_TREE));
+	    update_stmt (stmt);
+	  }
+      }
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
+  lower_bitfields ();
   sra_mode = SRA_MODE_EARLY_INTRA;
   return perform_intra_sra ();
 }


The idea is that this lowering then makes ESRA able to handle
it (you can see that for example in the result from
gcc.c-torture/execute/20000113-1.c which is miscompiled by
the above, eh ... what did I say about not testing it??).

I'll try to get the above working and cleaned up a bit today
and maybe early next week.  So stay tuned, I'll hand it over
to make you test if it works as advertised for you.

Richard.


> +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> +       {
> +         if (access->merged)
> +           {
> +             /* Access merged - remove instructions.  */
> +             gimple_stmt_iterator tmp_gsi;
> +             tmp_gsi = gsi_for_stmt (access->load_stmt);
> +             gsi_remove (&tmp_gsi, true);
> +             tmp_gsi = gsi_for_stmt (access->store_stmt);
> +             gsi_remove (&tmp_gsi, true);
> +           }
> +         else if (access->modified)
> +           {
> +             /* Access modified - modify generated code.  */
> +             gimple_stmt_iterator tmp_gsi;
> +             tree tmp_ssa;
> +             tree itype = make_node (INTEGER_TYPE);
> +             tree new_rhs;
> +             tree new_lhs;
> +             gimple new_stmt;
> +             char new_field_name [15];
> +             int decl_size;
> +
> +             /* Bit-field size changed - modify load statement.  */
> +             access->src_bit_size = get_merged_bit_field_size (access);
> +
> +             TYPE_PRECISION (itype) = access->src_bit_size;
> +             fixup_unsigned_type (itype);
> +
> +             /* Create new declaration.  */
> +             tree new_field = make_node (FIELD_DECL);
> +             sprintf (new_field_name, "_field%x", new_field_no++);
> +             DECL_NAME (new_field) = get_identifier (new_field_name);
> +             TREE_TYPE (new_field) = itype;
> +             DECL_BIT_FIELD (new_field) = 1;
> +             DECL_BIT_FIELD_TYPE (new_field) = access->bitfield_type;
> +             DECL_BIT_FIELD_REPRESENTATIVE (new_field) =
> +               access->bitfield_representative;
> +             DECL_FIELD_CONTEXT (new_field) = access->field_decl_context;
> +             DECL_NONADDRESSABLE_P (new_field) = 1;
> +             DECL_FIELD_OFFSET (new_field) =
> +               build_int_cst (unsigned_type_node, access->src_field_offset);
> +             DECL_FIELD_BIT_OFFSET (new_field) =
> +               build_int_cst (unsigned_type_node, access->src_bit_offset);
> +             DECL_SIZE (new_field) = build_int_cst (unsigned_type_node,
> +                                                    access->src_bit_size);
> +             decl_size = access->src_bit_size / BITS_PER_UNIT
> +               + (access->src_bit_size % BITS_PER_UNIT ? 1 : 0);
> +             DECL_SIZE_UNIT (new_field) =
> +               build_int_cst (unsigned_type_node, decl_size);
> +
> +             tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL), NULL);
> +
> +             /* Create new component ref.  */
> +             new_rhs = build3 (COMPONENT_REF, itype, access->src_addr,
> +                               new_field, NULL);
> +             tmp_gsi = gsi_for_stmt (access->load_stmt);
> +             new_stmt = gimple_build_assign (tmp_ssa, new_rhs);
> +             gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +             SSA_NAME_DEF_STMT (tmp_ssa) = new_stmt;
> +             gsi_remove (&tmp_gsi, true);
> +
> +             /* Bit-field size changed - modify store statement.  */
> +             /* Create new component ref.  */
> +             new_lhs = build3 (COMPONENT_REF, itype, access->dst_addr,
> +                               new_field, NULL);
> +             new_stmt = gimple_build_assign (new_lhs, tmp_ssa);
> +             tmp_gsi = gsi_for_stmt (access->store_stmt);
> +             gsi_insert_after (&tmp_gsi, new_stmt, GSI_SAME_STMT);
> +             gsi_remove (&tmp_gsi, true);
> +             cfg_changed = true;
> +           }
> +       }
> +      /* Empty or delete data structures used for basic block.  */
> +      htab_empty (bitfield_stmt_access_htab);
> +      bitfield_stmt_access_htab = NULL;
> +      bitfield_accesses.release ();
> +    }
> +
> +  if (cfg_changed)
> +    todoflags |= TODO_cleanup_cfg;
> +
> +  return todoflags;
> +}
> +
>  /* Perform early intraprocedural SRA.  */
>  static unsigned int
>  early_intra_sra (void)
>  {
> +  unsigned int todoflags = 0;
>    sra_mode = SRA_MODE_EARLY_INTRA;
> -  return perform_intra_sra ();
> +  if (flag_tree_bitfield_merge)
> +    todoflags = ssa_bitfield_merge ();
> +  return todoflags | perform_intra_sra ();
>  }
> 
>  /* Perform "late" intraprocedural SRA.  */
> @@ -5105,3 +5570,5 @@ make_pass_early_ipa_sra (gcc::context *ctxt)
>  {
>    return new pass_early_ipa_sra (ctxt);
>  }
> +
> +#include "gt-tree-sra.h"
> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
> index bd2feb4..683fd76 100644
> --- a/gcc/tree-ssa-sccvn.c
> +++ b/gcc/tree-ssa-sccvn.c
> @@ -4176,29 +4176,6 @@ get_next_value_id (void)
>    return next_value_id++;
>  }
> 
> -
> -/* Compare two expressions E1 and E2 and return true if they are equal.  */
> -
> -bool
> -expressions_equal_p (tree e1, tree e2)
> -{
> -  /* The obvious case.  */
> -  if (e1 == e2)
> -    return true;
> -
> -  /* If only one of them is null, they cannot be equal.  */
> -  if (!e1 || !e2)
> -    return false;
> -
> -  /* Now perform the actual comparison.  */
> -  if (TREE_CODE (e1) == TREE_CODE (e2)
> -      && operand_equal_p (e1, e2, OEP_PURE_SAME))
> -    return true;
> -
> -  return false;
> -}
> -
> -
>  /* Return true if the nary operation NARY may trap.  This is a copy
>     of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
> 
> diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
> index 94e3603..707b18c 100644
> --- a/gcc/tree-ssa-sccvn.h
> +++ b/gcc/tree-ssa-sccvn.h
> @@ -21,10 +21,6 @@
>  #ifndef TREE_SSA_SCCVN_H
>  #define TREE_SSA_SCCVN_H
> 
> -/* In tree-ssa-sccvn.c  */
> -bool expressions_equal_p (tree, tree);
> -
> -
>  /* TOP of the VN lattice.  */
>  extern tree VN_TOP;
> 
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 6593cf8..6683957 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -12155,4 +12155,44 @@ contains_bitfld_component_ref_p (const_tree ref)
>    return false;
>  }
> 
> +/* Compare two expressions E1 and E2 and return true if they are equal.  */
> +
> +bool
> +expressions_equal_p (tree e1, tree e2)
> +{
> +  /* The obvious case.  */
> +  if (e1 == e2)
> +    return true;
> +
> +  /* If only one of them is null, they cannot be equal.  */
> +  if (!e1 || !e2)
> +    return false;
> +
> +  /* Now perform the actual comparison.  */
> +  if (TREE_CODE (e1) == TREE_CODE (e2)
> +      && operand_equal_p (e1, e2, OEP_PURE_SAME))
> +    return true;
> +
> +  return false;
> +}
> +
> +/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> +   node, return the size in bits for the type if it is a constant, or else
> +   return the alignment for the type if the type's size is not constant, or
> +   else return BITS_PER_WORD if the type actually turns out to be an
> +   ERROR_MARK node.  */
> +
> +unsigned HOST_WIDE_INT
> +simple_type_size_in_bits (const_tree type)
> +{
> +  if (TREE_CODE (type) == ERROR_MARK)
> +    return BITS_PER_WORD;
> +  else if (TYPE_SIZE (type) == NULL_TREE)
> +    return 0;
> +  else if (host_integerp (TYPE_SIZE (type), 1))
> +    return tree_low_cst (TYPE_SIZE (type), 1);
> +  else
> +    return TYPE_ALIGN (type);
> +}
> +
>  #include "gt-tree.h"
> diff --git a/gcc/tree.h b/gcc/tree.h
> index a263a2c..b2bd481 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4528,6 +4528,7 @@ extern tree get_ref_base_and_extent (tree,
> HOST_WIDE_INT *,
>                                      HOST_WIDE_INT *, HOST_WIDE_INT *);
>  extern bool contains_bitfld_component_ref_p (const_tree);
>  extern bool type_in_anonymous_namespace_p (tree);
> +extern bool expressions_equal_p (tree e1, tree e2);
> 
>  /* In tree-nested.c */
>  extern tree build_addr (tree, tree);
> @@ -4693,6 +4694,10 @@ extern tree resolve_asm_operand_names (tree,
> tree, tree, tree);
>  extern tree tree_overlaps_hard_reg_set (tree, HARD_REG_SET *);
>  #endif
> 
> +/* In dwarf2out.c.  */
> +HOST_WIDE_INT
> +field_byte_offset (const_tree decl);
> +
>  ?
>  /* In tree-inline.c  */
> 
> @@ -4979,5 +4984,6 @@ builtin_decl_implicit_p (enum built_in_function fncode)
>  #endif /* NO_DOLLAR_IN_LABEL */
>  #endif /* NO_DOT_IN_LABEL */
> 
> +extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
> 
>  #endif  /* GCC_TREE_H  */
> 
> 
> Regards,
> Zoran Jovanovic
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-11-08 13:07       ` Richard Biener
@ 2013-11-08 14:11         ` Richard Biener
  2014-03-09 20:40           ` Zoran Jovanovic
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2013-11-08 14:11 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: gcc-patches

[-- Attachment #1: Type: TEXT/PLAIN, Size: 43815 bytes --]

On Fri, 8 Nov 2013, Richard Biener wrote:

> > Hello,
> > This is new patch version.
> > Comments from Bernhard Reutner-Fischer review applied.
> > Also, test case bitfildmrg2.c modified - it is now execute test.
> > 
> > 
> > Example:
> > 
> > Original code:
> >   <unnamed-unsigned:3> D.1351;
> >   <unnamed-unsigned:9> D.1350;
> >   <unnamed-unsigned:7> D.1349;
> >   D.1349_2 = p1_1(D)->f1;
> >   p2_3(D)->f1 = D.1349_2;
> >   D.1350_4 = p1_1(D)->f2;
> >   p2_3(D)->f2 = D.1350_4;
> >   D.1351_5 = p1_1(D)->f3;
> >   p2_3(D)->f3 = D.1351_5;
> > 
> > Optimized code:
> >   <unnamed-unsigned:19> D.1358;
> >   _16 = pr1_2(D)->_field0;
> >   pr2_4(D)->_field0 = _16;
> > 
> > Algorithm works on basic block level and consists of following 3 major steps:
> > 1. Go trough basic block statements list. If there are statement pairs
> > that implement copy of bit field content from one memory location to
> > another record statements pointers and other necessary data in
> > corresponding data structure.
> > 2. Identify records that represent adjacent bit field accesses and
> > mark them as merged.
> > 3. Modify trees accordingly.
> > 
> > New command line option "-ftree-bitfield-merge" is introduced.
> > 
> > Tested - passed gcc regression tests.
> 
> Comments inline (sorry for the late reply...)
> 
> > Changelog -
> > 
> > gcc/ChangeLog:
> > 2013-09-24 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
> >   * Makefile.in : Added tree-sra.c to GTFILES.
> >   * common.opt (ftree-bitfield-merge): New option.
> >   * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
> >   * tree-sra.c (ssa_bitfield_merge): New function.
> >   Entry for (-ftree-bitfield-merge).
> >   (bitfield_stmt_access_pair_htab_hash): New function.
> >   (bitfield_stmt_access_pair_htab_eq): New function.
> >   (cmp_access): New function.
> >   (create_and_insert_access): New function.
> >   (get_bit_offset): New function.
> >   (get_merged_bit_field_size): New function.
> >   (add_stmt_access_pair): New function.
> >   * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.
> >   (field_byte_offset): declaration moved to tree.h, static removed.
> >   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
> >   * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
> >   * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.
> >   * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.
> >   * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.
> >   (simple_type_size_in_bits): moved from dwarf2out.c.
> >   * tree.h (expressions_equal_p): declaration added.
> >   (field_byte_offset): declaration added.
> > 
> > Patch -
> > 
> > diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> > index a2e3f7a..54aa8e7 100644
> > --- a/gcc/Makefile.in
> > +++ b/gcc/Makefile.in
> > @@ -3847,6 +3847,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h
> > $(srcdir)/coretypes.h \
> >    $(srcdir)/asan.c \
> >    $(srcdir)/ubsan.c \
> >    $(srcdir)/tsan.c $(srcdir)/ipa-devirt.c \
> > +  $(srcdir)/tree-sra.c \
> >    @all_gtfiles@
> > 
> >  # Compute the list of GT header files from the corresponding C sources,
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index 202e169..afac514 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2164,6 +2164,10 @@ ftree-sra
> >  Common Report Var(flag_tree_sra) Optimization
> >  Perform scalar replacement of aggregates
> > 
> > +ftree-bitfield-merge
> > +Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
> > +Enable bit-field merge on trees
> > +
> 
> Drop the tree- prefix for new options, it doesn't tell users anything.
> I suggest
> 
> fmerge-bitfields
> Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
> Merge loads and stores of consecutive bitfields
> 
> and adjust docs with regarding to the flag name change of course.
> 
> >  ftree-ter
> >  Common Report Var(flag_tree_ter) Optimization
> >  Replace temporary expressions in the SSA->normal pass
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index aa0f4ed..e588cae 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -412,7 +412,7 @@ Objective-C and Objective-C++ Dialects}.
> >  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
> >  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
> >  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> > --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> > +-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> >  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> >  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> >  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> > @@ -7679,6 +7679,11 @@ pointer alignment information.
> >  This pass only operates on local scalar variables and is enabled by default
> >  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
> > 
> > +@item -ftree-bitfield-merge
> > +@opindex ftree-bitfield-merge
> > +Combines several adjacent bit-field accesses that copy values
> > +from one memory location to another into one single bit-field access.
> > +
> >  @item -ftree-ccp
> >  @opindex ftree-ccp
> >  Perform sparse conditional constant propagation (CCP) on trees.  This
> > diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> > index 95049e4..e74db17 100644
> > --- a/gcc/dwarf2out.c
> > +++ b/gcc/dwarf2out.c
> > @@ -3108,8 +3108,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT,
> > unsigned int);
> >  static tree field_type (const_tree);
> >  static unsigned int simple_type_align_in_bits (const_tree);
> >  static unsigned int simple_decl_align_in_bits (const_tree);
> > -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> > -static HOST_WIDE_INT field_byte_offset (const_tree);
> >  static void add_AT_location_description        (dw_die_ref, enum
> > dwarf_attribute,
> >                                          dw_loc_list_ref);
> >  static void add_data_member_location_attribute (dw_die_ref, tree);
> > @@ -10149,25 +10147,6 @@ is_base_type (tree type)
> >    return 0;
> >  }
> > 
> > -/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> > -   node, return the size in bits for the type if it is a constant, or else
> > -   return the alignment for the type if the type's size is not constant, or
> > -   else return BITS_PER_WORD if the type actually turns out to be an
> > -   ERROR_MARK node.  */
> > -
> > -static inline unsigned HOST_WIDE_INT
> > -simple_type_size_in_bits (const_tree type)
> > -{
> > -  if (TREE_CODE (type) == ERROR_MARK)
> > -    return BITS_PER_WORD;
> > -  else if (TYPE_SIZE (type) == NULL_TREE)
> > -    return 0;
> > -  else if (host_integerp (TYPE_SIZE (type), 1))
> > -    return tree_low_cst (TYPE_SIZE (type), 1);
> > -  else
> > -    return TYPE_ALIGN (type);
> > -}
> > -
> >  /* Similarly, but return a double_int instead of UHWI.  */
> > 
> >  static inline double_int
> > @@ -14521,7 +14500,7 @@ round_up_to_align (double_int t, unsigned int align)
> >     because the offset is actually variable.  (We can't handle the latter case
> >     just yet).  */
> > 
> > -static HOST_WIDE_INT
> > +HOST_WIDE_INT
> >  field_byte_offset (const_tree decl)
> >  {
> >    double_int object_offset_in_bits;
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> > new file mode 100644
> > index 0000000..e9e96b7
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> > @@ -0,0 +1,30 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -ftree-bitfield-merge -fdump-tree-esra" }  */
> > +
> > +struct S
> > +{
> > +  unsigned f1:7;
> > +  unsigned f2:9;
> > +  unsigned f3:3;
> > +  unsigned f4:5;
> > +  unsigned f5:1;
> > +  unsigned f6:2;
> > +};
> > +
> > +unsigned
> > +foo (struct S *p1, struct S *p2, int *ptr)
> > +{
> > +  p2->f1 = p1->f1;
> > +  p2->f2 = p1->f2;
> > +  p2->f3 = p1->f3;
> > +  *ptr = 7;
> > +  p2->f4 = p1->f4;
> > +  p2->f5 = p1->f5;
> > +  p2->f6 = p1->f6;
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "19" "esra" } } */
> > +/* { dg-final { scan-tree-dump "8" "esra"} } */
> 
> That's an awfully unspecific dump scan ;)  Please make it more
> specific and add a comment what code generation you expect.
> 
> > +/* { dg-final { cleanup-tree-dump "esra" } } */
> > +
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> > new file mode 100644
> > index 0000000..653e904
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> > @@ -0,0 +1,90 @@
> > +/* Check whether use of -ftree-bitfield-merge results in presence of overlaping
> > +   unions results in incorrect code.  */
> > +/* { dg-options "-O2 -ftree-bitfield-merge" }  */
> > +/* { dg-do run } */
> > +#include <stdio.h>
> > +extern void abort (void);
> > +
> > +struct s1
> > +{
> > +  unsigned f1:4;
> > +  unsigned f2:4;
> > +  unsigned f3:4;
> > +};
> > +
> > +struct s2
> > +{
> > +  unsigned char c;
> > +  unsigned f1:4;
> > +  unsigned f2:4;
> > +  unsigned f3:4;
> > +};
> > +
> > +struct s3
> > +{
> > +  unsigned f1:3;
> > +  unsigned f2:3;
> > +  unsigned f3:3;
> > +};
> > +
> > +struct s4
> > +{
> > +  unsigned f0:3;
> > +  unsigned f1:3;
> > +  unsigned f2:3;
> > +  unsigned f3:3;
> > +};
> > +
> > +union un_1
> > +{
> > +  struct s1 a;
> > +  struct s2 b;
> > +};
> > +
> > +union un_2
> > +{
> > +  struct s3 a;
> > +  struct s4 b;
> > +};
> > +
> > +void f1 (union un_1 *p1, union un_1 *p2)
> > +{
> > +  p2->a.f3 = p1->b.f3;
> > +  p2->a.f2 = p1->b.f2;
> > +  p2->a.f1 = p1->b.f1;
> > +
> > +  if (p1->b.f1 != 3)
> > +    abort ();
> > +}
> > +
> > +void f2 (union un_2 *p1, union un_2 *p2)
> > +{
> > +  p2->b.f1 = p1->a.f1;
> > +  p2->b.f2 = p1->a.f2;
> > +  p2->b.f3 = p1->a.f3;
> > +
> > +  if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0)
> > +    abort ();
> > +}
> > +
> > +int main ()
> > +{
> > +  union un_1 u1;
> > +  union un_2 u2;
> > +
> > +  u1.b.f1 = 1;
> > +  u1.b.f2 = 2;
> > +  u1.b.f3 = 3;
> > +  u1.b.c = 0;
> > +
> > +  f1 (&u1, &u1);
> > +
> > +  u2.b.f0 = 0;
> > +  u2.b.f1 = 1;
> > +  u2.b.f2 = 2;
> > +  u2.b.f3 = 3;
> > +
> > +  f2 (&u2, &u2);
> > +
> > +  return 0;
> > +}
> > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> > index 58c7565..610245b 100644
> > --- a/gcc/tree-sra.c
> > +++ b/gcc/tree-sra.c
> > @@ -92,6 +92,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "gimple-pretty-print.h"
> >  #include "ipa-inline.h"
> >  #include "ipa-utils.h"
> > +#include "ggc.h"
> 
> You shouldn't need that I think.
> 
> >  /* Enumeration of all aggregate reductions we can do.  */
> >  enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
> > @@ -3424,12 +3425,476 @@ perform_intra_sra (void)
> >    return ret;
> >  }
> > 
> > +/* This optimization combines several adjacent bit-field accesses that copy
> > +   values from one memory location to another into one single bit-field
> > +   access.  */
> > +
> > +/* Data for single bit-field read/write sequence.  */
> > +struct GTY (()) bitfield_access_d {
> > +  gimple load_stmt;              /* Bit-field load statement.  */
> > +  gimple store_stmt;             /* Bit-field store statement.  */
> > +  unsigned src_offset_words;     /* Bit-field offset at src in words.  */
> > +  unsigned src_bit_offset;       /* Bit-field offset inside source word.  */
> > +  unsigned src_bit_size;         /* Size of bit-field in source word.  */
> > +  unsigned dst_offset_words;     /* Bit-field offset at dst in words.  */
> > +  unsigned dst_bit_offset;       /* Bit-field offset inside destination
> > +                                    word.  */
> > +  unsigned src_field_offset;     /* Source field offset.  */
> > +  unsigned dst_bit_size;         /* Size of bit-field in destination word.  */
> > +  tree src_addr;                 /* Address of source memory access.  */
> > +  tree dst_addr;                 /* Address of destination memory access.  */
> > +  bool merged;                   /* True if access is merged with another
> > +                                    one.  */
> > +  bool modified;                 /* True if bit-field size is modified.  */
> > +  bool is_barrier;               /* True if access is barrier (call or mem
> > +                                    access).  */
> > +  struct bitfield_access_d *next; /* Access with which this one is merged.  */
> > +  tree bitfield_type;            /* Field type.  */
> > +  tree bitfield_representative;          /* Bit field representative
> > of original
> > +                                    declaration.  */
> > +  tree field_decl_context;       /* Context of original bit-field
> > +                                    declaration.  */
> > +};
> > +
> > +typedef struct bitfield_access_d bitfield_access_o;
> > +typedef struct bitfield_access_d *bitfield_access;
> > +
> > +/* Connecting register with bit-field access sequence that defines value in
> > +   that register.  */
> > +struct GTY (()) bitfield_stmt_access_pair_d
> > +{
> > +  gimple stmt;
> > +  bitfield_access access;
> > +};
> > +
> > +typedef struct bitfield_stmt_access_pair_d bitfield_stmt_access_pair_o;
> > +typedef struct bitfield_stmt_access_pair_d *bitfield_stmt_access_pair;
> > +
> > +static GTY ((param_is (struct bitfield_stmt_access_pair_d)))
> > +  htab_t bitfield_stmt_access_htab;
> 
> Nor does this need to be registered with the garbage collector.  Its
> lifetime is not greater than that of the SRA pass, right?
> 
> > +/* Hash table callbacks for bitfield_stmt_access_htab.  */
> > +
> > +static hashval_t
> > +bitfield_stmt_access_pair_htab_hash (const void *p)
> > +{
> > +  const bitfield_stmt_access_pair entry = (const bitfield_stmt_access_pair)p;
> > +  return (hashval_t) (uintptr_t) entry->stmt;
> > +}
> > +
> > +static int
> > +bitfield_stmt_access_pair_htab_eq (const void *p1, const void *p2)
> > +{
> > +  const struct bitfield_stmt_access_pair_d *entry1 =
> > +    (const bitfield_stmt_access_pair)p1;
> > +  const struct bitfield_stmt_access_pair_d *entry2 =
> > +    (const bitfield_stmt_access_pair)p2;
> > +  return entry1->stmt == entry2->stmt;
> > +}
> > +
> > +/* Counter used for generating unique names for new fields.  */
> > +static unsigned new_field_no;
> 
> Must be stale...?
> 
> > +/* Compare two bit-field access records.  */
> > +
> > +static int
> > +cmp_access (const void *p1, const void *p2)
> > +{
> > +  const bitfield_access a1 = (*(const bitfield_access*)p1);
> > +  const bitfield_access a2 = (*(const bitfield_access*)p2);
> > +
> > +  if (a1->bitfield_representative - a2->bitfield_representative)
> > +    return a1->bitfield_representative - a2->bitfield_representative;
> 
> Subtracting two unrelated pointers is undefined - I suggest you
> use DECL_UID (a1->bitfield_representative).
> 
> > +  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
> > +    return a1 - a2;
> 
> Same - the comparison ends up depending on memory layout, that's bad.
> 
> > +
> > +  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
> > +    return a1 - a2;
> > +  if (a1->src_offset_words - a2->src_offset_words)
> > +    return a1->src_offset_words - a2->src_offset_words;
> > +
> > +  return a1->src_bit_offset - a2->src_bit_offset;
> > +}
> > +
> > +/* Create new bit-field access structure and add it to given bitfield_accesses
> > +   htab.  */
> > +
> > +static bitfield_access
> > +create_and_insert_access (vec<bitfield_access>
> > +                      *bitfield_accesses)
> > +{
> > +  bitfield_access access = ggc_alloc_bitfield_access_d ();
> > +  memset (access, 0, sizeof (struct bitfield_access_d));
> > +  bitfield_accesses->safe_push (access);
> > +  return access;
> > +}
> > +
> > +/* Slightly modified add_bit_offset_attribute from dwarf2out.c.  */
> > +
> > +static inline HOST_WIDE_INT
> > +get_bit_offset (tree decl)
> > +{
> > +  tree type = DECL_BIT_FIELD_TYPE (decl);
> > +  HOST_WIDE_INT bitpos_int;
> > +
> > +  /* Must be a field and a bit-field.  */
> > +  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
> > +  /* Bit position and decl size should be integer constants that can be
> > +     represented in a signle HOST_WIDE_INT.  */
> > +  if (! host_integerp (bit_position (decl), 0)
> > +      || ! host_integerp (DECL_SIZE (decl), 1))
> > +    return -1;
> > +
> > +  bitpos_int = int_bit_position (decl);
> > +  return bitpos_int;
> > +}
> > +
> > +/* Returns size of combined bitfields.  Size cannot be larger than size
> > +   of largest directly accessible memory unit.  */
> > +
> > +static int
> > +get_merged_bit_field_size (bitfield_access access)
> > +{
> > +  bitfield_access tmp_access = access;
> > +  int size = 0;
> > +
> > +  while (tmp_access)
> > +  {
> > +    size += tmp_access->src_bit_size;
> > +    tmp_access = tmp_access->next;
> > +  }
> > +  return size;
> > +}
> > +
> > +/* Adds new pair consisting of statement and bit-field access structure that
> > +   contains it.  */
> > +
> > +static bool add_stmt_access_pair (bitfield_access access, gimple stmt)
> > +{
> > +  bitfield_stmt_access_pair new_pair;
> > +  void **slot;
> > +  if (!bitfield_stmt_access_htab)
> > +    bitfield_stmt_access_htab =
> > +      htab_create_ggc (128, bitfield_stmt_access_pair_htab_hash,
> > +                      bitfield_stmt_access_pair_htab_eq, NULL);
> > +  new_pair = ggc_alloc_bitfield_stmt_access_pair_o ();
> > +  new_pair->stmt = stmt;
> > +  new_pair->access = access;
> > +  slot = htab_find_slot (bitfield_stmt_access_htab, new_pair, INSERT);
> > +  if (*slot == HTAB_EMPTY_ENTRY)
> > +    {
> > +      *slot = new_pair;
> > +      return true;
> > +    }
> > +  return false;
> > +}
> > +
> > +/* Returns true if given COMPONENT_REF is part of an union.  */
> > +
> > +static bool part_of_union_p (tree component)
> > +{
> > +  tree tmp = component;
> > +  bool res = false;
> > +  while (TREE_CODE (tmp) == COMPONENT_REF)
> > +    {
> > +      if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)
> > +       {
> > +         res = true;
> > +         break;
> > +       }
> > +      tmp = TREE_OPERAND (tmp, 0);
> > +    }
> > +  if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE))
> > +    res = true;
> > +  return res;
> > +}
> > +
> > +/* Main entry point for the bit-field merge optimization.  */
> 
> Ok, I'm skipping to here (I was wondering what you need all the above
> functions for)
> 
> > +static unsigned int
> > +ssa_bitfield_merge (void)
> > +{
> > +  basic_block bb;
> > +  unsigned int todoflags = 0;
> > +  vec<bitfield_access> bitfield_accesses;
> > +  int ix, iy;
> > +  bitfield_access access;
> > +  bool cfg_changed = false;
> > +
> > +  /* In the strict volatile bitfields case, doing code changes here may prevent
> > +     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
> > +  if (flag_strict_volatile_bitfields > 0)
> > +    return 0;
> 
> Hmm, I'm not sure we should care ... - but well, I don't care ;)
> 
> > +  FOR_EACH_BB (bb)
> > +    {
> > +      gimple_stmt_iterator gsi;
> > +      vec<bitfield_access> bitfield_accesses_merge = vNULL;
> > +      tree prev_representative = NULL_TREE;
> > +      bitfield_accesses.create (0);
> > +
> > +      /* Identify all bitfield copy sequences in the basic-block.  */
> > +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> > +       {
> > +         gimple stmt = gsi_stmt (gsi);
> > +         tree lhs, rhs;
> > +         void **slot;
> > +         struct bitfield_stmt_access_pair_d asdata;
> > +
> > +         if (!is_gimple_assign (stmt))
> 
> Instead of checking TREE_THIS_VOLATILE below check here
>   || gimple_has_volatile_ops (stmt)
> 
> also you can narrow the assigns to visit by checking for
> 
>            !is_gimple_assign_single (stmt)
> 
> instead
> 
> > +           {
> > +             gsi_next (&gsi);
> > +             continue;
> > +           }
> > +
> > +         lhs = gimple_assign_lhs (stmt);
> > +         rhs = gimple_assign_rhs1 (stmt);
> > +
> > +         if (TREE_CODE (rhs) == COMPONENT_REF)
> > +           {
> > +             use_operand_p use;
> > +             gimple use_stmt;
> > +             tree op0 = TREE_OPERAND (rhs, 0);
> > +             tree op1 = TREE_OPERAND (rhs, 1);
> > +
> > +             if (TREE_CODE (op1) == FIELD_DECL && DECL_BIT_FIELD_TYPE (op1)
> 
> op1 is always a FIELD_DECL
> 
> > +                 && !TREE_THIS_VOLATILE (op1) && !part_of_union_p (rhs))
> 
> I wonder what's the issue with unions ... sure, for
> 
> union {
>   int field1 : 3;
>   int field2 : 2;
> };
> 
> but for
> 
> union {
>   struct {
>      int field1 : 3;
>      int field2 : 2;
>   } a;
> ...
> };
> 
> ?  Thus I'd simply check
> 
>             && TREE_CODE (DECL_CONTEXT (op1)) != UNION_TYPE
>             && TREE_CODE (DECL_CONTEXT (op1)) != QUAL_UNION_TYPE
> 
> maybe introduce
> 
> #define UNION_TYPE_P(TYPE)
>   (TREE_CODE (TYPE) == UNION_TYPE            \
>    || TREE_CODE (TYPE) == QUAL_UNION_TYPE)
> 
> alongside RECORD_OR_UNION_TYPE_P in tree.h.
> 
> > +               {
> > +                 if (single_imm_use (lhs, &use, &use_stmt)
> > +                      && is_gimple_assign (use_stmt))
> > +                   {
> > +                     tree use_lhs = gimple_assign_lhs (use_stmt);
> > +                     if (TREE_CODE (use_lhs) == COMPONENT_REF)
> 
> I'm not sure I follow the logic here, but it seems that you match
> a very specific pattern only - a bitfield copy.
> 
> I'd have applied the lowering (this is really a lowering pass
> with a cost model you make up here) if the same 
> DECL_BIT_FIELD_REPRESENTATIVE is used more than once in a BB
> on the same underlying base object.  That is, we are reasonably
> sure we can remove a redundant load and eventually redundant stores.
> 
> Thus, very simplistic you'd just record a op0, 
> DECL_BIT_FIELD_REPRESENTATIVE pair into the hashtable, using
> iterative_hash_expr for op0 and DECL_UID for the representative,
> counting the number of times you see them.  Make sure to apply
> the same for stores to bitfields, of course.
> 
> > +                       {
> > +                         tree use_op0 = TREE_OPERAND (use_lhs, 0);
> > +                         tree use_op1 = TREE_OPERAND (use_lhs, 1);
> > +                         tree tmp_repr = DECL_BIT_FIELD_REPRESENTATIVE (op1);
> > +
> > +                         if (TREE_CODE (use_op1) == FIELD_DECL
> > +                             && DECL_BIT_FIELD_TYPE (use_op1)
> > +                             && !TREE_THIS_VOLATILE (use_op1))
> > +                           {
> > +                             if (prev_representative
> > +                                 && (prev_representative != tmp_repr))
> > +                               {
> > +                                 /* If previous access has different
> > +                                    representative then barrier is needed
> > +                                    between it and new access.  */
> > +                                 access = create_and_insert_access
> > +                                            (&bitfield_accesses);
> > +                                 access->is_barrier = true;
> > +                               }
> > +                             prev_representative = tmp_repr;
> > +                             /* Create new bit-field access structure.  */
> > +                             access = create_and_insert_access
> > +                                        (&bitfield_accesses);
> > +                             /* Collect access data - load instruction.  */
> > +                             access->src_bit_size = tree_low_cst
> > +                                                     (DECL_SIZE (op1), 1);
> > +                             access->src_bit_offset = get_bit_offset (op1);
> > +                             access->src_offset_words =
> > +                               field_byte_offset (op1) / UNITS_PER_WORD;
> > +                             access->src_field_offset =
> > +                               tree_low_cst (DECL_FIELD_OFFSET (op1),1);
> > +                             access->src_addr = op0;
> > +                             access->load_stmt = gsi_stmt (gsi);
> > +                             /* Collect access data - store instruction.  */
> > +                             access->dst_bit_size =
> > +                               tree_low_cst (DECL_SIZE (use_op1), 1);
> > +                             access->dst_bit_offset =
> > +                               get_bit_offset (use_op1);
> > +                             access->dst_offset_words =
> > +                               field_byte_offset (use_op1) / UNITS_PER_WORD;
> > +                             access->dst_addr = use_op0;
> > +                             access->store_stmt = use_stmt;
> > +                             add_stmt_access_pair (access, stmt);
> > +                             add_stmt_access_pair (access, use_stmt);
> > +                             access->bitfield_type
> > +                               = DECL_BIT_FIELD_TYPE (use_op1);
> > +                             access->bitfield_representative = tmp_repr;
> > +                             access->field_decl_context =
> > +                               DECL_FIELD_CONTEXT (op1);
> > +                           }
> > +                       }
> > +                   }
> > +               }
> > +           }
> > +
> > +         /* Insert barrier for merging if statement is function call or memory
> > +            access.  */
> > +         if (bitfield_stmt_access_htab)
> > +           {
> > +             asdata.stmt = stmt;
> > +             slot
> > +               = htab_find_slot (bitfield_stmt_access_htab, &asdata,
> > +                                 NO_INSERT);
> > +             if (!slot
> > +                 && ((gimple_code (stmt) == GIMPLE_CALL)
> > +                     || (gimple_has_mem_ops (stmt))))
> > +               {
> > +                 /* Create new bit-field access structure.  */
> > +                 access = create_and_insert_access (&bitfield_accesses);
> > +                 /* Mark it as barrier.  */
> > +                 access->is_barrier = true;
> > +               }
> > +           }
> > +         gsi_next (&gsi);
> > +       }
> > +
> > +      /* If there are no at least two accesses go to the next basic block.  */
> > +      if (bitfield_accesses.length () <= 1)
> > +       {
> > +         bitfield_accesses.release ();
> > +         continue;
> > +       }
> > +
> > +      /* Find bit-field accesses that can be merged.  */
> > +      for (ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
> > +       {
> > +         bitfield_access head_access;
> > +         bitfield_access mrg_access;
> > +         bitfield_access prev_access;
> > +         if (!bitfield_accesses_merge.exists ())
> > +           bitfield_accesses_merge.create (0);
> > +
> > +         bitfield_accesses_merge.safe_push (access);
> > +
> > +         if (!access->is_barrier
> > +             && !(access == bitfield_accesses.last ()
> > +             && !bitfield_accesses_merge.is_empty ()))
> > +           continue;
> > +
> > +         bitfield_accesses_merge.qsort (cmp_access);
> > +
> > +         head_access = prev_access = NULL;
> > +         for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
> > +           {
> > +             if (head_access
> > +                 && expressions_equal_p (head_access->src_addr,
> > +                                         mrg_access->src_addr)
> > +                 && expressions_equal_p (head_access->dst_addr,
> > +                                         mrg_access->dst_addr)
> > +                 && prev_access->src_offset_words
> > +                    == mrg_access->src_offset_words
> > +                 && prev_access->dst_offset_words
> > +                    == mrg_access->dst_offset_words
> > +                 && prev_access->src_bit_offset + prev_access->src_bit_size
> > +                    == mrg_access->src_bit_offset
> > +                 && prev_access->dst_bit_offset + prev_access->dst_bit_size
> > +                    == mrg_access->dst_bit_offset
> > +                 && prev_access->bitfield_representative
> > +                    == mrg_access->bitfield_representative)
> > +               {
> > +                 /* Merge conditions are satisfied - merge accesses.  */
> > +                 mrg_access->merged = true;
> > +                 prev_access->next = mrg_access;
> > +                 head_access->modified = true;
> > +                 prev_access = mrg_access;
> > +               }
> > +             else
> > +               head_access = prev_access = mrg_access;
> > +           }
> > +         bitfield_accesses_merge.release ();
> > +         bitfield_accesses_merge = vNULL;
> > +       }
> > +
> > +      /* Modify generated code.  */
> 
> Ick, so you are actually applying an optimization instead of just
> lowering the accesses to DECL_BIT_FIELD_REPRESENTATIVE accesses
> and letting CSE and DSE do their job.  Hmm.  I don't think this is
> a good idea.
> 
> Instead what I'd like to see is more something like the following
> (bah, I never merged BIT_FIELD_EXPR which performs the bit-field-insert
> in a single stmt);  quickly hacked together, without a cost model,
> just the lowering piece (likely not endian clean - but who knows).
> 
> Index: gcc/tree-sra.c
> ===================================================================
> --- gcc/tree-sra.c	(revision 204561)
> +++ gcc/tree-sra.c	(working copy)
> @@ -3445,10 +3445,121 @@ perform_intra_sra (void)
>    return ret;
>  }
>  
> +static void
> +lower_bitfields (void)
> +{
> +  basic_block bb;
> +  FOR_EACH_BB (bb)
> +    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> +	 !gsi_end_p (gsi); gsi_next (&gsi))
> +      {
> +	gimple stmt = gsi_stmt (gsi);
> +	if (!gimple_assign_single_p (stmt)
> +	    || gimple_has_volatile_ops (stmt))
> +	  continue;
> +
> +	/* Lower a bitfield read.  */
> +	tree ref = gimple_assign_rhs1 (stmt);
> +	if (TREE_CODE (ref) == COMPONENT_REF
> +	    && DECL_BIT_FIELD_TYPE (TREE_OPERAND (ref, 1))
> +	    && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (ref, 1)))
> +	  {
> +	    tree field = TREE_OPERAND (ref, 1);
> +	    tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> +	    unsigned HOST_WIDE_INT off;
> +	    if (host_integerp (DECL_FIELD_OFFSET (field), 1)
> +		&& host_integerp (DECL_FIELD_OFFSET (rep), 1))
> +	      off = (tree_low_cst (DECL_FIELD_OFFSET (field), 1)
> +		     - tree_low_cst (DECL_FIELD_OFFSET (rep), 1)) * BITS_PER_UNIT;
> +	    else
> +	      off = 0;
> +	    off += (tree_low_cst (DECL_FIELD_BIT_OFFSET (field), 1)
> +		    - tree_low_cst (DECL_FIELD_BIT_OFFSET (rep), 1));
> +	    tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
> +	    gimple load
> +	      = gimple_build_assign (loadres,
> +				     build3 (COMPONENT_REF, TREE_TYPE (rep),
> +					     TREE_OPERAND (ref, 0), rep,
> +					     NULL_TREE));
> +	    gimple_set_vuse (load, gimple_vuse (stmt));
> +	    gsi_insert_before (&gsi, load, GSI_SAME_STMT);
> +	    gimple_assign_set_rhs1 (stmt,
> +				    build3 (BIT_FIELD_REF, TREE_TYPE (ref),
> +					    loadres,
> +					    DECL_SIZE (field),
> +					    bitsize_int (off)));
> +	    update_stmt (stmt);
> +	  }
> +	ref = gimple_assign_lhs (stmt);
> +	if (TREE_CODE (ref) == COMPONENT_REF
> +	    && DECL_BIT_FIELD_TYPE (TREE_OPERAND (ref, 1))
> +	    && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (ref, 1)))
> +	  {
> +	    tree field = TREE_OPERAND (ref, 1);
> +	    tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> +	    unsigned HOST_WIDE_INT off;
> +	    if (host_integerp (DECL_FIELD_OFFSET (field), 1)
> +		&& host_integerp (DECL_FIELD_OFFSET (rep), 1))
> +	      off = (tree_low_cst (DECL_FIELD_OFFSET (field), 1)
> +		     - tree_low_cst (DECL_FIELD_OFFSET (rep), 1)) * BITS_PER_UNIT;
> +	    else
> +	      off = 0;
> +	    off += (tree_low_cst (DECL_FIELD_BIT_OFFSET (field), 1)
> +		    - tree_low_cst (DECL_FIELD_BIT_OFFSET (rep), 1));
> +	    tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
> +	    gimple load
> +	      = gimple_build_assign (loadres,
> +				     build3 (COMPONENT_REF, TREE_TYPE (rep),
> +					     unshare_expr (TREE_OPERAND (ref, 0)),
> +					     rep,
> +					     NULL_TREE));
> +	    gimple_set_vuse (load, gimple_vuse (stmt));
> +	    gsi_insert_before (&gsi, load, GSI_SAME_STMT);
> +	    /* FIXME:  BIT_FIELD_EXPR.  */
> +	    /* Mask out bits.  */
> +	    tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
> +	    gimple tems
> +	      = gimple_build_assign_with_ops (BIT_AND_EXPR,
> +					      masked, loadres,
> +					      double_int_to_tree (TREE_TYPE (rep),
> +								  double_int::mask (TREE_INT_CST_LOW (DECL_SIZE (field))).lshift (off)));
> +	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
> +	    /* Zero-extend the value to representative size.  */
> +	    tree tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL);
> +	    tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
> +						 gimple_assign_rhs1 (stmt),
> +						 NULL_TREE);
> +	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
> +	    tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
> +	    tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE);
> +	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
> +	    /* Shift the value into place.  */
> +	    tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
> +	    tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
> +						 size_int (off));
> +	    gsi_insert_before (&gsi, tems, GSI_SAME_STMT);
> +	    /* Merge masked loaded value and value.  */
> +	    tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
> +	    gimple mod
> +	      = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
> +					      masked, tem2);
> +	    gsi_insert_before (&gsi, mod, GSI_SAME_STMT);
> +	    /* Finally adjust the store.  */
> +	    gimple_assign_set_rhs1 (stmt, modres);
> +	    gimple_assign_set_lhs (stmt,
> +				   build3 (COMPONENT_REF, TREE_TYPE (rep),
> +					   TREE_OPERAND (ref, 0), rep,
> +					   NULL_TREE));
> +	    update_stmt (stmt);
> +	  }
> +      }
> +}
> +
>  /* Perform early intraprocedural SRA.  */
>  static unsigned int
>  early_intra_sra (void)
>  {
> +  lower_bitfields ();
>    sra_mode = SRA_MODE_EARLY_INTRA;
>    return perform_intra_sra ();
>  }
> 
> 
> The idea is that this lowering then makes ESRA able to handle
> it (you can see that for example in the result from
> gcc.c-torture/execute/20000113-1.c which is miscompiled by
> the above, eh ... what did I say about not testing it??).
> 
> I'll try to get the above working and cleaned up a bit today
> and maybe early next week.  So stay tuned, I'll hand it over
> to make you test if it works as advertised for you.

Just forgot to invert the mask.  So, simple cost model as I
suggested and the lowering I suggested shows that for your
copying testcase we fail to optimize this because of aliasing
issues ...  we get

  _18 = p1_2(D)->D.1790;
  _3 = BIT_FIELD_REF <_18, 7, 0>;
  _19 = p2_4(D)->D.1790;
  _20 = _19 & 4294967168;
  _21 = (unsigned int) _3;
  _22 = _21 | _20;
  p2_4(D)->D.1790 = _22;
  _23 = p1_2(D)->D.1790;
  _6 = BIT_FIELD_REF <_23, 9, 7>;
  _24 = _22 & 4294901887;
  _25 = (unsigned int) _6;
  _26 = _25 << 7;
  _27 = _24 | _26;
  p2_4(D)->D.1790 = _27;

and because p1 and p2 alias the now bigger memory accesses conflict.

Interesting issue ... ;)

Code generation quality with lowering to shifts and masks for bitfield
writes is another issue (BIT_FIELD_EXPR, or BIT_FIELD_COMPOSE as
I renamed it to IIRC would somewhat make that easier to fix).

Updated and somewhat tested patch (with cost model as I proposed)
below.  Old BIT_FIELD_COMPOSE patch attached.

Richard.

Index: gcc/tree-sra.c
===================================================================
--- gcc/tree-sra.c	(revision 204561)
+++ gcc/tree-sra.c	(working copy)
@@ -3445,10 +3445,279 @@ perform_intra_sra (void)
   return ret;
 }
 
+/* Bitfield access and hashtable support commoning same base and
+   representative.  */
+
+struct bfaccess
+{
+  bfaccess (tree r) : ref (r), count (1) {}
+
+  tree ref;
+  unsigned count;
+
+  /* hash_table support */
+  typedef bfaccess value_type;
+  typedef bfaccess compare_type;
+  static inline hashval_t hash (const bfaccess *);
+  static inline int equal (const bfaccess*, const bfaccess *);
+  static inline void remove (bfaccess*);
+};
+
+hashval_t
+bfaccess::hash (const bfaccess *a)
+{
+  return iterative_hash_hashval_t
+      (iterative_hash_expr (TREE_OPERAND (a->ref, 0), 0),
+       DECL_UID (DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (a->ref, 1))));
+}
+
+int
+bfaccess::equal (const bfaccess *a, const bfaccess *b)
+{
+  return ((DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (a->ref, 1))
+	   == DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (b->ref, 1)))
+	  && operand_equal_p (TREE_OPERAND (a->ref, 0),
+			      TREE_OPERAND (b->ref, 0), 0));
+}
+
+void
+bfaccess::remove (bfaccess *a)
+{
+  delete a;
+}
+
+/* Return whether REF is a bitfield access the bit offset of the bitfield
+   within the representative in *OFF if that is not NULL.  */
+
+static bool
+bitfield_access_p (tree ref, unsigned HOST_WIDE_INT *off)
+{
+  if (TREE_CODE (ref) != COMPONENT_REF)
+    return false;
+
+  tree field = TREE_OPERAND (ref, 1);
+  if (!DECL_BIT_FIELD_TYPE (field))
+    return false;
+
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  if (!rep)
+    return false;
+
+  if (!off)
+    return true;
+
+  if (host_integerp (DECL_FIELD_OFFSET (field), 1)
+      && host_integerp (DECL_FIELD_OFFSET (rep), 1))
+    *off = (tree_low_cst (DECL_FIELD_OFFSET (field), 1)
+	    - tree_low_cst (DECL_FIELD_OFFSET (rep), 1)) * BITS_PER_UNIT;
+  else
+    *off = 0;
+  *off += (tree_low_cst (DECL_FIELD_BIT_OFFSET (field), 1)
+	  - tree_low_cst (DECL_FIELD_BIT_OFFSET (rep), 1));
+
+  return true;
+}
+
+
+/* Lower the bitfield read at *GSI, the offset of the bitfield
+   relative to the bitfield representative is OFF bits.  */
+
+static void
+lower_bitfield_read (gimple_stmt_iterator *gsi, unsigned HOST_WIDE_INT off)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_rhs1 (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load
+      = gimple_build_assign (loadres,
+			     build3 (COMPONENT_REF, TREE_TYPE (rep),
+				     TREE_OPERAND (ref, 0), rep,
+				     NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  gimple_assign_set_rhs1 (stmt,
+			  build3 (BIT_FIELD_REF, TREE_TYPE (ref),
+				  loadres,
+				  DECL_SIZE (field),
+				  bitsize_int (off)));
+  update_stmt (stmt);
+}
+
+/* Lower the bitfield write at *GSI, the offset of the bitfield
+   relative to the bitfield representative is OFF bits.  */
+
+static void
+lower_bitfield_write (gimple_stmt_iterator *gsi, unsigned HOST_WIDE_INT off)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_lhs (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load
+      = gimple_build_assign (loadres,
+			     build3 (COMPONENT_REF, TREE_TYPE (rep),
+				     unshare_expr
+				     (TREE_OPERAND (ref, 0)),
+				     rep,
+				     NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  /* FIXME:  BIT_FIELD_EXPR.  */
+  /* Mask out bits.  */
+  tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
+  tree mask
+      = double_int_to_tree (TREE_TYPE (rep),
+			    ~double_int::mask
+			    (TREE_INT_CST_LOW (DECL_SIZE (field)))
+			    .lshift (off));
+  gimple tems
+      = gimple_build_assign_with_ops (BIT_AND_EXPR,
+				      masked, loadres, mask);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Zero-extend the value to representative size.  */
+  tree tem2;
+  if (!TYPE_UNSIGNED (TREE_TYPE (field)))
+    {
+      tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)),
+			    NULL);
+      tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
+					   gimple_assign_rhs1 (stmt),
+					   NULL_TREE);
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = gimple_assign_rhs1 (stmt);
+  tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
+  tems = gimple_build_assign_with_ops (NOP_EXPR, tem,
+				       tem2, NULL_TREE);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Shift the value into place.  */
+  if (off != 0)
+    {
+      tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
+      tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
+					   size_int (off));
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = tem;
+  /* Merge masked loaded value and value.  */
+  tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple mod
+      = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
+				      masked, tem2);
+  gsi_insert_before (gsi, mod, GSI_SAME_STMT);
+  /* Finally adjust the store.  */
+  gimple_assign_set_rhs1 (stmt, modres);
+  gimple_assign_set_lhs (stmt,
+			 build3 (COMPONENT_REF, TREE_TYPE (rep),
+				 TREE_OPERAND (ref, 0), rep,
+				 NULL_TREE));
+  update_stmt (stmt);
+}
+
+/* Lower bitfield accesses to accesses of their
+   DECL_BIT_FIELD_REPRESENTATIVE.  */
+
+static void
+lower_bitfields (bool all)
+{
+  basic_block bb;
+
+  hash_table <bfaccess> bf;
+  bf.create (1);
+
+  FOR_EACH_BB (bb)
+    {
+      bool any = false;
+      bf.empty ();
+
+      /* We do two passes, the first one identifying interesting
+         bitfield accesses and the second one actually lowering them.  */
+      if (!all)
+	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	     !gsi_end_p (gsi); gsi_next (&gsi))
+	  {
+	    gimple stmt = gsi_stmt (gsi);
+	    if (!gimple_assign_single_p (stmt)
+		|| gimple_has_volatile_ops (stmt))
+	      continue;
+
+	    tree ref = gimple_assign_rhs1 (stmt);
+	    if (bitfield_access_p (ref, NULL))
+	      {
+		bfaccess a(ref);
+		bfaccess **slot = bf.find_slot (&a, INSERT);
+		if (*slot)
+		  (*slot)->count++;
+		else
+		  *slot = new bfaccess(a);
+		if ((*slot)->count > 1)
+		  any = true;
+	      }
+
+	    ref = gimple_assign_lhs (stmt);
+	    if (bitfield_access_p (ref, NULL))
+	      {
+		bfaccess a(ref);
+		bfaccess **slot = bf.find_slot (&a, INSERT);
+		if (*slot)
+		  (*slot)->count++;
+		else
+		  *slot = new bfaccess(a);
+		if ((*slot)->count > 1)
+		  any = true;
+	      }
+	  }
+
+      if (!all && !any)
+	continue;
+
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  if (!gimple_assign_single_p (stmt)
+	      || gimple_has_volatile_ops (stmt))
+	    continue;
+
+	  tree ref;
+	  unsigned HOST_WIDE_INT off;
+
+	  /* Lower a bitfield read.  */
+	  ref = gimple_assign_rhs1 (stmt);
+	  if (bitfield_access_p (ref, &off))
+	    {
+	      bfaccess a(ref);
+	      bfaccess *aa = bf.find (&a);
+	      if (all || (aa->count > 1))
+		lower_bitfield_read (&gsi, off);
+	    }
+	  /* Lower a bitfield write to a read-modify-write cycle.  */
+	  ref = gimple_assign_lhs (stmt);
+	  if (bitfield_access_p (ref, &off))
+	    {
+	      bfaccess a(ref);
+	      bfaccess *aa = bf.find (&a);
+	      if (all || (aa->count > 1))
+		lower_bitfield_write (&gsi, off);
+	    }
+	}
+    }
+
+  bf.dispose ();
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
+  lower_bitfields (false);
   sra_mode = SRA_MODE_EARLY_INTRA;
   return perform_intra_sra ();
 }

[-- Attachment #2: BIT_FIELD_COMPOSE --]
[-- Type: TEXT/PLAIN, Size: 20420 bytes --]

2011-06-16  Richard Guenther  <rguenther@suse.de>

	* expr.c (expand_expr_real_1): Handle BIT_FIELD_COMPOSE.
	* fold-const.c (operand_equal_p): Likewise.
	(build_bit_mask): New function.
	(fold_quaternary_loc): Likewise.
	(fold): Call it.
	(fold_build4_stat_loc): New function.
	* gimplify.c (gimplify_expr): Handle BIT_FIELD_COMPOSE.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-ssa-operands.c (get_expr_operands): Likewise.
	* tree.def (BIT_FIELD_COMPOSE): New tree code.
	* tree.h (build_bit_mask): Declare.
	(fold_quaternary): Define.
	(fold_quaternary_loc): Declare.
	(fold_build4): Define.
	(fold_build4_loc): Likewise.
	(fold_build4_stat_loc): Declare.
	* gimple.c (gimple_rhs_class_table): Handle BIT_FIELD_COMPOSE.

Index: trunk/gcc/expr.c
===================================================================
*** trunk.orig/gcc/expr.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/expr.c	2011-07-05 14:15:25.000000000 +0200
*************** expand_expr_real_1 (tree exp, rtx target
*** 8693,8698 ****
--- 8693,8710 ----
  
        return expand_constructor (exp, target, modifier, false);
  
+     case BIT_FIELD_COMPOSE:
+       {
+         unsigned bitpos = (unsigned) TREE_INT_CST_LOW (TREE_OPERAND (exp, 3));
+         unsigned bitsize = (unsigned) TREE_INT_CST_LOW (TREE_OPERAND (exp, 2));
+ 	rtx op0 = expand_normal (TREE_OPERAND (exp, 0));
+ 	rtx op1 = expand_normal (TREE_OPERAND (exp, 1));
+ 	rtx dst = gen_reg_rtx (mode);
+ 	emit_move_insn (dst, op0);
+ 	store_bit_field (dst, bitsize, bitpos, mode, op1);
+ 	return dst;
+       }
+ 
      case TARGET_MEM_REF:
        {
  	addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
Index: trunk/gcc/fold-const.c
===================================================================
*** trunk.orig/gcc/fold-const.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/fold-const.c	2011-07-05 14:15:25.000000000 +0200
*************** operand_equal_p (const_tree arg0, const_
*** 2667,2672 ****
--- 2667,2675 ----
  	case DOT_PROD_EXPR:
  	  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
  
+ 	case BIT_FIELD_COMPOSE:
+ 	  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2) && OP_SAME (3);
+ 
  	default:
  	  return 0;
  	}
*************** contains_label_p (tree st)
*** 13236,13241 ****
--- 13239,13257 ----
     (walk_tree_without_duplicates (&st, contains_label_1 , NULL) != NULL_TREE);
  }
  
+ /* Builds and returns a mask of integral type TYPE for masking out
+    BITSIZE bits at bit position BITPOS in a word of type TYPE.
+    The mask has the bits set from bit BITPOS to BITPOS + BITSIZE - 1.  */
+ 
+ tree
+ build_bit_mask (tree type, unsigned int bitsize, unsigned int bitpos)
+ {
+   tree mask = double_int_to_tree (type, double_int_mask (bitsize));
+   mask = const_binop (LSHIFT_EXPR, mask, size_int (bitpos));
+ 
+   return mask;
+ }
+ 
  /* Fold a ternary expression of code CODE and type TYPE with operands
     OP0, OP1, and OP2.  Return the folded expression if folding is
     successful.  Otherwise, return NULL_TREE.  */
*************** fold_ternary_loc (location_t loc, enum t
*** 13573,13578 ****
--- 13589,13606 ----
  	    }
  	}
  
+       /* Perform constant folding.  */
+       if (TREE_CODE (op0) == INTEGER_CST)
+ 	{
+ 	  unsigned bitpos = (unsigned) TREE_INT_CST_LOW (op2);
+ 	  unsigned bitsize = (unsigned) TREE_INT_CST_LOW (op1);
+ 	  double_int res;
+ 	  res = double_int_rshift (tree_to_double_int (op0), bitpos,
+ 				   HOST_BITS_PER_DOUBLE_INT, false);
+ 	  res = double_int_ext (res, bitsize, TYPE_UNSIGNED (type));
+ 	  return double_int_to_tree (type, res);
+ 	}
+ 
        /* A bit-field-ref that referenced the full argument can be stripped.  */
        if (INTEGRAL_TYPE_P (TREE_TYPE (arg0))
  	  && TYPE_PRECISION (TREE_TYPE (arg0)) == tree_low_cst (arg1, 1)
*************** fold_ternary_loc (location_t loc, enum t
*** 13597,13602 ****
--- 13625,13701 ----
      } /* switch (code) */
  }
  
+ /* Fold a quaternary expression of code CODE and type TYPE with operands
+    OP0, OP1, OP2, and OP3.  Return the folded expression if folding is
+    successful.  Otherwise, return NULL_TREE.  */
+ 
+ tree
+ fold_quaternary_loc (location_t loc, enum tree_code code, tree type,
+ 		     tree op0, tree op1, tree op2, tree op3)
+ {
+   tree arg0 = NULL_TREE, arg1 = NULL_TREE;
+   enum tree_code_class kind = TREE_CODE_CLASS (code);
+ 
+   gcc_assert (IS_EXPR_CODE_CLASS (kind)
+               && TREE_CODE_LENGTH (code) == 4);
+ 
+   /* Strip any conversions that don't change the mode.  This is safe
+      for every expression, except for a comparison expression because
+      its signedness is derived from its operands.  So, in the latter
+      case, only strip conversions that don't change the signedness.
+ 
+      Note that this is done as an internal manipulation within the
+      constant folder, in order to find the simplest representation of
+      the arguments so that their form can be studied.  In any cases,
+      the appropriate type conversions should be put back in the tree
+      that will get out of the constant folder.  */
+   if (op0)
+     {
+       arg0 = op0;
+       STRIP_NOPS (arg0);
+     }
+ 
+   if (op1)
+     {
+       arg1 = op1;
+       STRIP_NOPS (arg1);
+     }
+ 
+   switch (code)
+     {
+     case BIT_FIELD_COMPOSE:
+       /* Perform (partial) constant folding of BIT_FIELD_COMPOSE.  */
+       if (TREE_CODE (arg0) == INTEGER_CST
+ 	  || TREE_CODE (arg1) == INTEGER_CST)
+         {
+           unsigned bitpos = (unsigned) TREE_INT_CST_LOW (op3);
+           unsigned bitsize = (unsigned) TREE_INT_CST_LOW (op2);
+           tree bits, mask;
+           /* build a mask to mask/clear the bits in the word.  */
+           mask = build_bit_mask (type, bitsize, bitpos);
+           /* extend the bits to the word type, shift them to the right
+              place and mask the bits.  */
+           bits = fold_convert_loc (loc, type, arg1);
+           bits = fold_build2_loc (loc, BIT_AND_EXPR, type,
+ 				  fold_build2_loc (loc, LSHIFT_EXPR, type,
+ 						   bits, size_int (bitpos)),
+ 				  mask);
+           /* switch to clear mask and do the composition.  */
+           mask = fold_build1_loc (loc, BIT_NOT_EXPR, type, mask);
+           return fold_build2_loc (loc, BIT_IOR_EXPR, type,
+ 				  fold_build2_loc (loc, BIT_AND_EXPR, type,
+ 						   fold_convert (type, arg0),
+ 						   mask),
+ 				  bits);
+         }
+ 
+       return NULL_TREE;
+ 
+     default:
+       return NULL_TREE;
+     }
+ }
+ 
  /* Perform constant folding and related simplification of EXPR.
     The related simplifications include x*1 => x, x*0 => 0, etc.,
     and application of the associative law.
*************** fold (tree expr)
*** 13638,13644 ****
    if (IS_EXPR_CODE_CLASS (kind))
      {
        tree type = TREE_TYPE (t);
!       tree op0, op1, op2;
  
        switch (TREE_CODE_LENGTH (code))
  	{
--- 13737,13743 ----
    if (IS_EXPR_CODE_CLASS (kind))
      {
        tree type = TREE_TYPE (t);
!       tree op0, op1, op2, op3;
  
        switch (TREE_CODE_LENGTH (code))
  	{
*************** fold (tree expr)
*** 13657,13662 ****
--- 13756,13768 ----
  	  op2 = TREE_OPERAND (t, 2);
  	  tem = fold_ternary_loc (loc, code, type, op0, op1, op2);
  	  return tem ? tem : expr;
+ 	case 4:
+ 	  op0 = TREE_OPERAND (t, 0);
+ 	  op1 = TREE_OPERAND (t, 1);
+ 	  op2 = TREE_OPERAND (t, 2);
+ 	  op3 = TREE_OPERAND (t, 3);
+ 	  tem = fold_quaternary_loc (loc, code, type, op0, op1, op2, op3);
+ 	  return tem ? tem : expr;
  	default:
  	  break;
  	}
*************** fold_build3_stat_loc (location_t loc, en
*** 14113,14118 ****
--- 14219,14310 ----
    return tem;
  }
  
+ /* Fold a quaternary tree expression with code CODE of type TYPE with
+    operands OP0, OP1, OP2, and OP3.  Return a folded expression if
+    successful.  Otherwise, return a tree expression with code CODE of
+    type TYPE with operands OP0, OP1, OP2, and OP3.  */
+ 
+ tree
+ fold_build4_stat_loc (location_t loc, enum tree_code code, tree type,
+ 		      tree op0, tree op1, tree op2, tree op3 MEM_STAT_DECL)
+ {
+   tree tem;
+ #ifdef ENABLE_FOLD_CHECKING
+   unsigned char checksum_before_op0[16],
+                 checksum_before_op1[16],
+                 checksum_before_op2[16],
+                 checksum_before_op3[16],
+ 		checksum_after_op0[16],
+ 		checksum_after_op1[16],
+ 		checksum_after_op2[16];
+ 		checksum_after_op3[16];
+   struct md5_ctx ctx;
+   htab_t ht;
+ 
+   ht = htab_create (32, htab_hash_pointer, htab_eq_pointer, NULL);
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op0, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_before_op0);
+   htab_empty (ht);
+ 
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op1, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_before_op1);
+   htab_empty (ht);
+ 
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op2, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_before_op2);
+   htab_empty (ht);
+ 
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op3, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_before_op3);
+   htab_empty (ht);
+ #endif
+ 
+   gcc_assert (TREE_CODE_CLASS (code) != tcc_vl_exp);
+   tem = fold_quaternary_loc (loc, code, type, op0, op1, op2, op3);
+   if (!tem)
+     tem = build4_stat_loc (loc, code, type, op0, op1, op2, op3 PASS_MEM_STAT);
+ 
+ #ifdef ENABLE_FOLD_CHECKING
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op0, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_after_op0);
+   htab_empty (ht);
+ 
+   if (memcmp (checksum_before_op0, checksum_after_op0, 16))
+     fold_check_failed (op0, tem);
+ 
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op1, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_after_op1);
+   htab_empty (ht);
+ 
+   if (memcmp (checksum_before_op1, checksum_after_op1, 16))
+     fold_check_failed (op1, tem);
+ 
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op2, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_after_op2);
+   htab_delete (ht);
+ 
+   if (memcmp (checksum_before_op2, checksum_after_op2, 16))
+     fold_check_failed (op2, tem);
+ 
+   md5_init_ctx (&ctx);
+   fold_checksum_tree (op3, &ctx, ht);
+   md5_finish_ctx (&ctx, checksum_after_op3);
+   htab_delete (ht);
+ 
+   if (memcmp (checksum_before_op3, checksum_after_op3, 16))
+     fold_check_failed (op3, tem);
+ #endif
+   return tem;
+ }
+ 
+ 
  /* Fold a CALL_EXPR expression of type TYPE with operands FN and NARGS
     arguments in ARGARRAY, and a null static chain.
     Return a folded expression if successful.  Otherwise, return a CALL_EXPR
Index: trunk/gcc/gimplify.c
===================================================================
*** trunk.orig/gcc/gimplify.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/gimplify.c	2011-07-05 14:15:25.000000000 +0200
*************** gimplify_expr (tree *expr_p, gimple_seq
*** 7239,7244 ****
--- 7239,7248 ----
  	  /* Classified as tcc_expression.  */
  	  goto expr_3;
  
+ 	case BIT_FIELD_COMPOSE:
+ 	  /* Arguments 3 and 4 are constants.  */
+ 	  goto expr_2;
+ 
  	case POINTER_PLUS_EXPR:
            /* Convert ((type *)A)+offset into &A->field_of_type_and_offset.
  	     The second is gimple immediate saving a need for extra statement.
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/tree-inline.c	2011-07-05 14:15:25.000000000 +0200
*************** estimate_operator_cost (enum tree_code c
*** 3414,3419 ****
--- 3414,3423 ----
          return weights->div_mod_cost;
        return 1;
  
+     /* Bit-field insertion needs several shift and mask operations.  */
+     case BIT_FIELD_COMPOSE:
+       return 3;
+ 
      default:
        /* We expect a copy assignment with no operator.  */
        gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/tree-pretty-print.c	2011-07-05 14:15:25.000000000 +0200
*************** dump_generic_node (pretty_printer *buffe
*** 1217,1222 ****
--- 1217,1234 ----
        pp_string (buffer, ">");
        break;
  
+     case BIT_FIELD_COMPOSE:
+       pp_string (buffer, "BIT_FIELD_COMPOSE <");
+       dump_generic_node (buffer, TREE_OPERAND (node, 0), spc, flags, false);
+       pp_string (buffer, ", ");
+       dump_generic_node (buffer, TREE_OPERAND (node, 1), spc, flags, false);
+       pp_string (buffer, ", ");
+       dump_generic_node (buffer, TREE_OPERAND (node, 2), spc, flags, false);
+       pp_string (buffer, ", ");
+       dump_generic_node (buffer, TREE_OPERAND (node, 3), spc, flags, false);
+       pp_string (buffer, ">");
+       break;
+ 
      case ARRAY_REF:
      case ARRAY_RANGE_REF:
        op0 = TREE_OPERAND (node, 0);
Index: trunk/gcc/tree-ssa-operands.c
===================================================================
*** trunk.orig/gcc/tree-ssa-operands.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/tree-ssa-operands.c	2011-07-05 14:15:25.000000000 +0200
*************** get_expr_operands (gimple stmt, tree *ex
*** 974,979 ****
--- 974,984 ----
        get_expr_operands (stmt, &TREE_OPERAND (expr, 0), flags);
        return;
  
+     case BIT_FIELD_COMPOSE:
+       gcc_assert (TREE_CODE (TREE_OPERAND (expr, 2)) == INTEGER_CST
+ 		  && TREE_CODE (TREE_OPERAND (expr, 3)) == INTEGER_CST);
+       /* Fallthru.  */
+ 
      case TRUTH_AND_EXPR:
      case TRUTH_OR_EXPR:
      case TRUTH_XOR_EXPR:
Index: trunk/gcc/tree.def
===================================================================
*** trunk.orig/gcc/tree.def	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/tree.def	2011-07-05 14:15:25.000000000 +0200
*************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
*** 784,789 ****
--- 784,801 ----
     descriptor of type ptr_mode.  */
  DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
  
+ /* Given a word, a value and a bitfield position and size within
+    the word, produce the value that results if replacing the
+    described parts of word with value.
+    Operand 0 is a tree for the word of integral type;
+    Operand 1 is a tree for the value of integral type;
+    Operand 2 is a tree giving the constant number of bits being
+    referenced which is less or equal to the precision of the value;
+    Operand 3 is a tree giving the constant position of the first referenced
+    bit such that the sum of operands 2 and 3 is less than or equal to the
+    precision of the word.  */
+ DEFTREECODE (BIT_FIELD_COMPOSE, "bitfield_compose", tcc_expression, 4)
+ 
  /* Given two real or integer operands of the same type,
     returns a complex value of the corresponding complex type.  */
  DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
Index: trunk/gcc/tree.h
===================================================================
*** trunk.orig/gcc/tree.h	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/tree.h	2011-07-05 14:15:25.000000000 +0200
*************** extern bool is_typedef_decl (tree x);
*** 5140,5145 ****
--- 5140,5146 ----
  extern bool typedef_variant_p (tree);
  extern bool auto_var_in_fn_p (const_tree, const_tree);
  extern tree build_low_bits_mask (tree, unsigned);
+ extern tree build_bit_mask (tree type, unsigned int, unsigned int);
  extern tree tree_strip_nop_conversions (tree);
  extern tree tree_strip_sign_nop_conversions (tree);
  extern tree lhd_gcc_personality (void);
*************** extern tree fold_binary_loc (location_t,
*** 5196,5201 ****
--- 5197,5205 ----
  #define fold_ternary(CODE,T1,T2,T3,T4)\
     fold_ternary_loc (UNKNOWN_LOCATION, CODE, T1, T2, T3, T4)
  extern tree fold_ternary_loc (location_t, enum tree_code, tree, tree, tree, tree);
+ #define fold_quaternary(CODE,T1,T2,T3,T4,T5)\
+    fold_quaternary_loc (UNKNOWN_LOCATION, CODE, T1, T2, T3, T4, T5)
+ extern tree fold_quaternary_loc (location_t, enum tree_code, tree, tree, tree, tree, tree);
  #define fold_build1(c,t1,t2)\
     fold_build1_stat_loc (UNKNOWN_LOCATION, c, t1, t2 MEM_STAT_INFO)
  #define fold_build1_loc(l,c,t1,t2)\
*************** extern tree fold_build2_stat_loc (locati
*** 5214,5219 ****
--- 5218,5229 ----
     fold_build3_stat_loc (l, c, t1, t2, t3, t4 MEM_STAT_INFO)
  extern tree fold_build3_stat_loc (location_t, enum tree_code, tree, tree, tree,
  				  tree MEM_STAT_DECL);
+ #define fold_build4(c,t1,t2,t3,t4,t5)\
+    fold_build4_stat_loc (UNKNOWN_LOCATION, c, t1, t2, t3, t4, t5 MEM_STAT_INFO)
+ #define fold_build4_loc(l,c,t1,t2,t3,t4,t5)\
+    fold_build4_stat_loc (l, c, t1, t2, t3, t4, t5 MEM_STAT_INFO)
+ extern tree fold_build4_stat_loc (location_t, enum tree_code, tree, tree, tree,
+ 				  tree, tree MEM_STAT_DECL);
  extern tree fold_build1_initializer_loc (location_t, enum tree_code, tree, tree);
  extern tree fold_build2_initializer_loc (location_t, enum tree_code, tree, tree, tree);
  extern tree fold_build3_initializer_loc (location_t, enum tree_code, tree, tree, tree, tree);
Index: trunk/gcc/gimple.c
===================================================================
*** trunk.orig/gcc/gimple.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/gimple.c	2011-07-05 14:15:25.000000000 +0200
*************** get_gimple_rhs_num_ops (enum tree_code c
*** 2623,2629 ****
        || (SYM) == ADDR_EXPR						    \
        || (SYM) == WITH_SIZE_EXPR					    \
        || (SYM) == SSA_NAME						    \
!       || (SYM) == VEC_COND_EXPR) ? GIMPLE_SINGLE_RHS			    \
     : GIMPLE_INVALID_RHS),
  #define END_OF_BASE_TREE_CODES (unsigned char) GIMPLE_INVALID_RHS,
  
--- 2623,2630 ----
        || (SYM) == ADDR_EXPR						    \
        || (SYM) == WITH_SIZE_EXPR					    \
        || (SYM) == SSA_NAME						    \
!       || (SYM) == VEC_COND_EXPR						    \
!       || (SYM) == BIT_FIELD_COMPOSE) ? GIMPLE_SINGLE_RHS			    \
     : GIMPLE_INVALID_RHS),
  #define END_OF_BASE_TREE_CODES (unsigned char) GIMPLE_INVALID_RHS,
  
Index: trunk/gcc/cfgexpand.c
===================================================================
*** trunk.orig/gcc/cfgexpand.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/cfgexpand.c	2011-07-05 14:15:25.000000000 +0200
*************** expand_debug_expr (tree exp)
*** 3236,3246 ****
      case VEC_WIDEN_MULT_LO_EXPR:
        return NULL;
  
!    /* Misc codes.  */
      case ADDR_SPACE_CONVERT_EXPR:
      case FIXED_CONVERT_EXPR:
      case OBJ_TYPE_REF:
      case WITH_SIZE_EXPR:
        return NULL;
  
      case DOT_PROD_EXPR:
--- 3236,3247 ----
      case VEC_WIDEN_MULT_LO_EXPR:
        return NULL;
  
!     /* Misc codes.  */
      case ADDR_SPACE_CONVERT_EXPR:
      case FIXED_CONVERT_EXPR:
      case OBJ_TYPE_REF:
      case WITH_SIZE_EXPR:
+     case BIT_FIELD_COMPOSE:
        return NULL;
  
      case DOT_PROD_EXPR:
Index: trunk/gcc/gimple-fold.c
===================================================================
*** trunk.orig/gcc/gimple-fold.c	2011-07-05 13:39:12.000000000 +0200
--- trunk/gcc/gimple-fold.c	2011-07-05 14:19:14.000000000 +0200
*************** gimple_fold_stmt_to_constant_1 (gimple s
*** 2864,2869 ****
--- 2864,2886 ----
  
  		  return build_vector (TREE_TYPE (rhs), nreverse (list));
  		}
+ 	      else if (TREE_CODE (rhs) == BIT_FIELD_COMPOSE)
+ 		{
+ 		  tree val0 = TREE_OPERAND (rhs, 0);
+ 		  tree val1 = TREE_OPERAND (rhs, 1);
+ 		  if (TREE_CODE (val0) == SSA_NAME)
+ 		    val0 = (*valueize) (val0);
+ 		  if (TREE_CODE (val1) == SSA_NAME)
+ 		    val1 = (*valueize) (val1);
+ 		  if (TREE_CODE (val0) == INTEGER_CST
+ 		      && TREE_CODE (val1) == INTEGER_CST)
+ 		    return fold_quaternary_loc (EXPR_LOCATION (rhs),
+ 						TREE_CODE (rhs),
+ 						TREE_TYPE (rhs),
+ 						val0, val1,
+ 						TREE_OPERAND (rhs, 2),
+ 						TREE_OPERAND (rhs, 3));
+ 		}
  
                if (kind == tcc_reference)
  		{

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2013-11-08 14:11         ` Richard Biener
@ 2014-03-09 20:40           ` Zoran Jovanovic
  2014-03-12 21:51             ` Bernhard Reutner-Fischer
  0 siblings, 1 reply; 17+ messages in thread
From: Zoran Jovanovic @ 2014-03-09 20:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hello,
This is new patch version. 
Approach suggested by Richard Biener with lowering bit-field accesses instead of modifying gimple trees is implemented.
Although, algorithm still detects adjacent bit field accesses, which copy values from one to another bit-field of same type.
If those accesses are detected field size used during lowering is equal to sum of sizes of all adjacent fields that can be merged. 
Idea is to let dse and cse to remove unnecessary instructions afterwards.
I wanted to preserve this behavior because it was one of original goals of this work.


Example:

Original code:
  <bb 2>:
  _2 = pr1.f1;
  pr2.f1 = _2;
  _4 = pr1.f2;
  pr2.f2 = _4;
  _6 = pr1.f3;
  pr2.f3 = _6;
  return;


Optimized code:
  <bb 2>:
  _8 = pr1.D.1364;
  _9 = BIT_FIELD_REF <_8, 13, 0>;
  _10 = pr2.D.1364;
  _11 = _10 & 4294959104;
  _12 = (unsigned int) _9;
  _13 = _11 | _12;
  pr2.D.1364 = _13;
  _14 = pr1.D.1364;
  _15 = BIT_FIELD_REF <_14, 13, 0>;
  _16 = pr2.D.1364;
  _17 = _16 & 4294959104;
  _18 = (unsigned int) _15;
  _19 = _17 | _18;
  pr2.D.1364 = _19;
  _20 = pr1.D.1364;
  _21 = BIT_FIELD_REF <_20, 13, 0>;
  _22 = pr2.D.1364;
  _23 = _22 & 4294959104;
  _24 = (unsigned int) _21;
  _25 = _23 | _24;
  pr2.D.1364 = _25;
  return;
  

Algorithm works on basic block level and consists of following 3 major steps:
1. Go through basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure.
2. Identify records that represent adjacent bit field accesses and mark them as merged.
3. Lower bit-field accesses by using new field size for those that can be merged.


New command line option "-fmerge-bitfields" is introduced.

Tested - passed gcc regression tests.

Changelog -

gcc/ChangeLog:
2014-03-09 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
  * common.opt (fmerge-bitfields): New option.
  * doc/invoke.texi: Added reference to "-fmerge-bitfields".
  * tree-sra.c (lower_bitfields): New function.
  Entry for (-fmerge-bitfields).
  (bfaccess::hash): New function.
  (bfaccess::equal): New function.
  (bfaccess::remove): New function.
  (bitfield_access_p): New function.
  (lower_bitfield_read): New function.
  (lower_bitfield_write): New function.
  (bitfield_stmt_access_pair_htab_hash): New function.
  (bitfield_stmt_access_pair_htab_eq): New function.
  (create_and_insert_access): New function.
  (get_bit_offset): New function.
  (get_merged_bit_field_size): New function.
  (add_stmt_access_pair): New function.
  (cmp_access): New function.
  * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.
  (field_byte_offset): declaration moved to tree.h, static removed.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.
  * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): moved from dwarf2out.c.
  * tree.h (expressions_equal_p): declaration added.
  (field_byte_offset): declaration added.

Patch -

diff --git a/gcc/common.opt b/gcc/common.opt
index 661516d..3331d03 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2193,6 +2193,10 @@ ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+fmerge-bitfields
+Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization
+Merge loads and stores of consecutive bitfields
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 24bd76e..54bae56 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -411,7 +411,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
+-fmerge-bitfields -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
@@ -7807,6 +7807,11 @@ pointer alignment information.
 This pass only operates on local scalar variables and is enabled by default
 at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
 
+@item -fbitfield-merge
+@opindex fmerge-bitfields
+Combines several adjacent bit-field accesses that copy values
+from one memory location to another into one single bit-field access.
+
 @item -ftree-ccp
 @opindex ftree-ccp
 Perform sparse conditional constant propagation (CCP) on trees.  This
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 2b584a5..5150d40 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3119,8 +3119,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
 static tree field_type (const_tree);
 static unsigned int simple_type_align_in_bits (const_tree);
 static unsigned int simple_decl_align_in_bits (const_tree);
-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
-static HOST_WIDE_INT field_byte_offset (const_tree);
 static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 					 dw_loc_list_ref);
 static void add_data_member_location_attribute (dw_die_ref, tree);
@@ -10281,25 +10279,6 @@ is_base_type (tree type)
   return 0;
 }
 
-/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
-   node, return the size in bits for the type if it is a constant, or else
-   return the alignment for the type if the type's size is not constant, or
-   else return BITS_PER_WORD if the type actually turns out to be an
-   ERROR_MARK node.  */
-
-static inline unsigned HOST_WIDE_INT
-simple_type_size_in_bits (const_tree type)
-{
-  if (TREE_CODE (type) == ERROR_MARK)
-    return BITS_PER_WORD;
-  else if (TYPE_SIZE (type) == NULL_TREE)
-    return 0;
-  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
-    return tree_to_uhwi (TYPE_SIZE (type));
-  else
-    return TYPE_ALIGN (type);
-}
-
 /* Similarly, but return a double_int instead of UHWI.  */
 
 static inline double_int
@@ -14667,7 +14646,7 @@ round_up_to_align (double_int t, unsigned int align)
    because the offset is actually variable.  (We can't handle the latter case
    just yet).  */
 
-static HOST_WIDE_INT
+HOST_WIDE_INT
 field_byte_offset (const_tree decl)
 {
   double_int object_offset_in_bits;
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 284d544..c6a19b2 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -3462,10 +3462,608 @@ perform_intra_sra (void)
   return ret;
 }
 
+/* Bitfield access and hashtable support commoning same base and
+   representative.  */
+
+struct bfaccess
+{
+  bfaccess (tree r):ref (r), r_count (1), w_count (1), merged (false),
+    modified (false), is_barrier (false), next (0), head_access (0)
+  {
+  }
+
+  tree ref;
+  unsigned r_count;  /* Read counter.  */
+  unsigned w_count;  /* Write counter.  */
+
+  /* hash_table support.  */
+  typedef bfaccess value_type;
+  typedef bfaccess compare_type;
+  static inline hashval_t hash (const bfaccess *);
+  static inline int equal (const bfaccess *, const bfaccess *);
+  static inline void remove (bfaccess *);
+
+  gimple load_stmt;		/* Bit-field load statement.  */
+  gimple store_stmt;		/* Bit-field store statement.  */
+  unsigned src_offset_words;	/* Bit-field offset at src in words.  */
+  unsigned src_bit_offset;	/* Bit-field offset inside source word.  */
+  unsigned src_bit_size;	/* Size of bit-field in source word.  */
+  unsigned dst_offset_words;	/* Bit-field offset at dst in words.  */
+  unsigned dst_bit_offset;	/* Bit-field offset inside destination
+				   word.  */
+  unsigned src_field_offset;	/* Source field offset.  */
+  unsigned dst_bit_size;	/* Size of bit-field in destination word.  */
+  tree src_addr;		/* Address of source memory access.  */
+  tree dst_addr;		/* Address of destination memory access.  */
+  bool merged;			/* True if access is merged with another
+				   one.  */
+  bool modified;		/* True if bit-field size is modified.  */
+  bool is_barrier;		/* True if access is barrier (call or mem
+				   access).  */
+  struct bfaccess *next;	/* Access with which this one is merged.  */
+  tree bitfield_representative;	/* Bit field representative of original
+				   declaration.  */
+  struct bfaccess *head_access;	/* Head of access list where this one is
+				   merged.  */
+};
+
+hashval_t bfaccess::hash (const bfaccess * a)
+{
+  return iterative_hash_hashval_t
+    (iterative_hash_expr (TREE_OPERAND (a->ref, 0), 0),
+     DECL_UID (DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (a->ref, 1))));
+}
+
+int
+bfaccess::equal (const bfaccess * a, const bfaccess * b)
+{
+  return ((DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (a->ref, 1))
+	   == DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (b->ref, 1)))
+	  && operand_equal_p (TREE_OPERAND (a->ref, 0),
+			      TREE_OPERAND (b->ref, 0), 0));
+}
+
+void
+bfaccess::remove (bfaccess * a)
+{
+  delete a;
+}
+
+/* Return whether REF is a bitfield access the bit offset of the bitfield
+   within the representative in *OFF if that is not NULL.  */
+
+static bool
+bitfield_access_p (tree ref, unsigned HOST_WIDE_INT * off)
+{
+  if (TREE_CODE (ref) != COMPONENT_REF)
+    return false;
+
+  tree field = TREE_OPERAND (ref, 1);
+  if (!DECL_BIT_FIELD_TYPE (field))
+    return false;
+
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  if (!rep)
+    return false;
+
+  if (!off)
+    return true;
+
+  if (tree_fits_uhwi_p (DECL_FIELD_OFFSET (field))
+      && tree_fits_uhwi_p (DECL_FIELD_OFFSET (rep)))
+    *off = (tree_to_uhwi (DECL_FIELD_OFFSET (field))
+	    - tree_to_uhwi (DECL_FIELD_OFFSET (rep))) * BITS_PER_UNIT;
+  else
+    *off = 0;
+  *off += (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
+	   - tree_to_uhwi (DECL_FIELD_BIT_OFFSET (rep)));
+
+  return true;
+}
+
+
+/* Lower the bitfield read at *GSI, the offset of the bitfield
+   relative to the bitfield representative is OFF bits.  */
+#include "gimple-pretty-print.h"
+static void
+lower_bitfield_read (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
+		     tree size, tree type)
+{
+  //debug_tree (size);
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_rhs1 (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load = gimple_build_assign (loadres,
+				     build3 (COMPONENT_REF, TREE_TYPE (rep),
+					     TREE_OPERAND (ref, 0), rep,
+					     NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  if (!type)
+    type = TREE_TYPE (ref);
+  gimple_assign_set_rhs1 (stmt,
+			  build3 (BIT_FIELD_REF, type,
+				  loadres, size, bitsize_int (off)));
+  update_stmt (stmt);
+}
+
+/* Lower the bitfield write at *GSI, the offset of the bitfield
+   relative to the bitfield representative is OFF bits.  */
+
+static void
+lower_bitfield_write (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
+		      tree size)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_lhs (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load = gimple_build_assign (loadres,
+				     build3 (COMPONENT_REF, TREE_TYPE (rep),
+					     unshare_expr
+					     (TREE_OPERAND (ref, 0)),
+					     rep,
+					     NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  /* FIXME:  BIT_FIELD_EXPR.  */
+  /* Mask out bits.  */
+  tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
+  tree mask = double_int_to_tree (TREE_TYPE (rep),
+				  ~double_int::mask
+				  (TREE_INT_CST_LOW (size)).lshift (off));
+  gimple tems = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					      masked, loadres, mask);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Zero-extend the value to representative size.  */
+  tree tem2;
+  if (!TYPE_UNSIGNED (TREE_TYPE (field)))
+    {
+      tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL);
+      tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
+					   gimple_assign_rhs1 (stmt),
+					   NULL_TREE);
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = gimple_assign_rhs1 (stmt);
+  tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
+  tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Shift the value into place.  */
+  if (off != 0)
+    {
+      tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
+      tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
+					   size_int (off));
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = tem;
+  /* Merge masked loaded value and value.  */
+  tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple mod = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
+					     masked, tem2);
+  gsi_insert_before (gsi, mod, GSI_SAME_STMT);
+  /* Finally adjust the store.  */
+  gimple_assign_set_rhs1 (stmt, modres);
+  gimple_assign_set_lhs (stmt,
+			 build3 (COMPONENT_REF, TREE_TYPE (rep),
+				 TREE_OPERAND (ref, 0), rep, NULL_TREE));
+  update_stmt (stmt);
+}
+
+/* Connecting register with bit-field access sequence that defines value in
+   that register.  */
+struct bitfield_stmt_access_pair
+{
+  gimple stmt;
+  bfaccess *access;
+  bitfield_stmt_access_pair (gimple s, bfaccess *a):stmt (s), access (a) {};
+  /* hash_table support.  */
+  typedef bitfield_stmt_access_pair value_type;
+  typedef bitfield_stmt_access_pair compare_type;
+  static inline hashval_t hash (const bitfield_stmt_access_pair *);
+  static inline int equal (const bitfield_stmt_access_pair *,
+			   const bitfield_stmt_access_pair *);
+  static inline void remove (bitfield_stmt_access_pair *);
+};
+
+hashval_t bitfield_stmt_access_pair::hash (const bitfield_stmt_access_pair *a)
+{
+  return hashval_t (gimple_uid (a->stmt));
+}
+
+int bitfield_stmt_access_pair::equal (const bitfield_stmt_access_pair *a,
+				      const bitfield_stmt_access_pair *b)
+{
+  return a->stmt == b->stmt;
+}
+
+void bitfield_stmt_access_pair::remove (bitfield_stmt_access_pair *a)
+{
+  delete a;
+}
+
+/* Create new bit-field access structure and add it to given bitfield_accesses
+   htab.  */
+
+static struct bfaccess *
+create_and_insert_access (vec < struct bfaccess *>*bitfield_accesses,
+			  struct bfaccess *access)
+{
+  if (!access)
+    access = new bfaccess (NULL);
+  bitfield_accesses->safe_push (access);
+  return access;
+
+}
+
+static inline HOST_WIDE_INT
+get_bit_offset (tree decl)
+{
+  tree type = DECL_BIT_FIELD_TYPE (decl);
+  HOST_WIDE_INT bitpos_int;
+
+  /* Must be a field and a bit-field.  */
+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
+  /* Bit position and decl size should be integer constants that can be
+     represented in a signle HOST_WIDE_INT.  */
+  if (!tree_fits_uhwi_p (bit_position (decl))
+      || !tree_fits_uhwi_p (DECL_SIZE (decl)))
+    return -1;
+
+  bitpos_int = int_bit_position (decl);
+  return bitpos_int;
+}
+
+static bool
+add_stmt_access_pair (hash_table < bitfield_stmt_access_pair > &bf_stmnt_acc,
+		      bfaccess *access, gimple stmt)
+{
+  bitfield_stmt_access_pair p(stmt, access);
+  bitfield_stmt_access_pair  **slot = bf_stmnt_acc.find_slot (&p, INSERT);
+  if (!*slot)
+    {
+      *slot = new bitfield_stmt_access_pair (stmt, access);
+      return true;
+    }
+  return false;
+}
+
+/* Compare two bit-field access records.  */
+
+static int
+cmp_access (const void *p1, const void *p2)
+{
+  const struct bfaccess *a1 = *((const struct bfaccess **) p1);
+  const struct bfaccess *a2 = *((const struct bfaccess **) p2);
+
+  if (DECL_UID (a1->bitfield_representative) -
+      DECL_UID (a2->bitfield_representative))
+    return DECL_UID (a1->bitfield_representative) -
+      DECL_UID (a2->bitfield_representative);
+
+  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
+    return a1 - a2;
+  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
+    return a1 - a2;
+  if (a1->src_offset_words - a2->src_offset_words)
+    return a1->src_offset_words - a2->src_offset_words;
+  return a1->src_bit_offset - a2->src_bit_offset;
+}
+
+/* Returns size of combined bitfields.  Size cannot be larger than size
+   of largest directly accessible memory unit.  */
+
+static int
+get_merged_bit_field_size (bfaccess * access)
+{
+  bfaccess *tmp_access = access;
+  int size = 0;
+
+  while (tmp_access)
+    {
+      size += tmp_access->src_bit_size;
+      tmp_access = tmp_access->next;
+    }
+  return size;
+}
+
+/* Lower bitfield accesses to accesses of their
+   DECL_BIT_FIELD_REPRESENTATIVE.  */
+
+static void
+lower_bitfields (bool all)
+{
+  basic_block bb;
+
+  hash_table < bfaccess > bf;
+  bf.create (1);
+  hash_table < bitfield_stmt_access_pair > bf_stmnt_acc;
+  bf_stmnt_acc.create (1);
+
+  vec < struct bfaccess *>bitfield_accesses;
+  struct bfaccess *access;
+
+  FOR_EACH_BB_FN (bb, cfun)
+  {
+    bool any = false;
+    bf.empty ();
+    bf_stmnt_acc.empty ();
+    tree prev_representative = NULL_TREE;
+    bitfield_accesses.create (0);
+
+    /* We do two passes, the first one identifying interesting
+       bitfield accesses and the second one actually lowering them.  */
+    if (!all)
+      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+
+	  if (!gimple_assign_single_p (stmt)
+	      || gimple_has_volatile_ops (stmt))
+	    continue;
+
+	  tree ref = gimple_assign_rhs1 (stmt);
+	  if (bitfield_access_p (ref, NULL))
+	    {
+	      bfaccess a (ref);
+	      bfaccess **slot = bf.find_slot (&a, INSERT);
+	      gimple use_stmt;
+	      use_operand_p use;
+	      tree op0 = TREE_OPERAND (ref, 0);
+	      tree op1 = TREE_OPERAND (ref, 1);
+
+	      if (TREE_CODE (DECL_CONTEXT (op1)) == UNION_TYPE
+		  || TREE_CODE (DECL_CONTEXT (op1)) == QUAL_UNION_TYPE)
+		continue;
+
+	      if (*slot)
+		(*slot)->r_count++;
+	      else
+		*slot = new bfaccess (a);
+
+	      if ((*slot)->r_count > 1)
+		any = true;
+
+	      if (single_imm_use (gimple_assign_lhs (stmt), &use, &use_stmt)
+		  && is_gimple_assign (use_stmt))
+		{
+		  tree uses_stmt_lhs = gimple_assign_lhs (use_stmt);
+		  if (bitfield_access_p (uses_stmt_lhs, NULL))
+		    {
+		      tree use_op0 = TREE_OPERAND (uses_stmt_lhs, 0);
+		      tree use_op1 = TREE_OPERAND (uses_stmt_lhs, 1);
+		      tree use_repr = DECL_BIT_FIELD_REPRESENTATIVE (use_op1);
+
+		      if (prev_representative
+			  && (prev_representative != use_repr))
+			{
+			  /* If previous access has different
+			     representative then barrier is needed
+			     between it and new access.  */
+			  access = create_and_insert_access
+			    (&bitfield_accesses, NULL);
+			  access->is_barrier = true;
+			}
+		      prev_representative = use_repr;
+		      /* Create new bit-field access structure.  */
+		      access = create_and_insert_access
+			(&bitfield_accesses, NULL);
+		      /* Collect access data - load instruction.  */
+		      access->src_bit_size = tree_to_uhwi (DECL_SIZE (op1));
+		      access->src_bit_offset = get_bit_offset (op1);
+		      access->src_offset_words =
+			field_byte_offset (op1) / UNITS_PER_WORD;
+		      access->src_field_offset =
+			tree_to_uhwi (DECL_FIELD_OFFSET (op1));
+		      access->src_addr = op0;
+		      access->load_stmt = gsi_stmt (gsi);
+		      /* Collect access data - store instruction.  */
+		      access->dst_bit_size =
+			tree_to_uhwi (DECL_SIZE (use_op1));
+		      access->dst_bit_offset = get_bit_offset (use_op1);
+		      access->dst_offset_words =
+			field_byte_offset (use_op1) / UNITS_PER_WORD;
+		      access->dst_addr = use_op0;
+		      access->store_stmt = use_stmt;
+		      add_stmt_access_pair (bf_stmnt_acc, access, stmt);
+		      add_stmt_access_pair (bf_stmnt_acc, access, use_stmt);
+		      access->bitfield_representative = use_repr;
+		    }
+		}
+	    }
+
+	  ref = gimple_assign_lhs (stmt);
+	  if (bitfield_access_p (ref, NULL))
+	    {
+	      bfaccess a (ref);
+	      bfaccess **slot = bf.find_slot (&a, INSERT);
+	      if (*slot)
+		(*slot)->w_count++;
+	      else
+		*slot = new bfaccess (a);
+	      if ((*slot)->w_count > 1)
+		any = true;
+	    }
+	  /* Insert barrier for merging if statement is function call or memory
+	     access.  */
+	  bitfield_stmt_access_pair asdata (stmt, NULL);
+	  if (!bf_stmnt_acc.find (&asdata)
+	      && ((gimple_code (stmt) == GIMPLE_CALL)
+		 || (gimple_has_mem_ops (stmt))))
+	    {
+	      /* Create new bit-field access structure.  */
+	      access = create_and_insert_access (&bitfield_accesses, NULL);
+	      /* Mark it as barrier.  */
+	      access->is_barrier = true;
+	    }
+	}
+
+    if (!all && !any)
+      continue;
+
+    /* If there are no at least two accesses go to the next basic block.  */
+    if (bitfield_accesses.length () <= 1)
+      {
+	bitfield_accesses.release ();
+	continue;
+      }
+    vec < struct bfaccess *>bitfield_accesses_merge = vNULL;
+    /* Find bit-field accesses that can be merged.  */
+    for (int ix = 0; bitfield_accesses.iterate (ix, &access); ix++)
+      {
+	struct bfaccess *head_access;
+	struct bfaccess *mrg_access;
+	struct bfaccess *prev_access;
+
+	if (!bitfield_accesses_merge.exists ())
+	  bitfield_accesses_merge.create (0);
+
+	if (!access->is_barrier)
+	  bitfield_accesses_merge.safe_push (access);
+
+	if (!access->is_barrier
+	    && !(access == bitfield_accesses.last ()
+		 && !bitfield_accesses_merge.is_empty ()))
+	  continue;
+
+	bitfield_accesses_merge.qsort (cmp_access);
+	head_access = prev_access = NULL;
+	int iy;
+	for (iy = 0; bitfield_accesses_merge.iterate (iy, &mrg_access); iy++)
+	  {
+	    if (head_access
+		&& expressions_equal_p (head_access->src_addr,
+					mrg_access->src_addr)
+		&& expressions_equal_p (head_access->dst_addr,
+					mrg_access->dst_addr)
+		&& prev_access->src_offset_words
+		== mrg_access->src_offset_words
+		&& prev_access->dst_offset_words
+		== mrg_access->dst_offset_words
+		&& prev_access->src_bit_offset + prev_access->src_bit_size
+		== mrg_access->src_bit_offset
+		&& prev_access->dst_bit_offset + prev_access->dst_bit_size
+		== mrg_access->dst_bit_offset
+		&& prev_access->bitfield_representative
+		== mrg_access->bitfield_representative)
+	      {
+		/* Merge conditions are satisfied - merge accesses.  */
+		mrg_access->merged = true;
+		prev_access->next = mrg_access;
+		head_access->modified = true;
+		prev_access = mrg_access;
+		mrg_access->head_access = head_access;
+	      }
+	    else
+	      head_access = prev_access = mrg_access;
+	  }
+	bitfield_accesses_merge.release ();
+	bitfield_accesses_merge = vNULL;
+      }
+
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	 !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+	if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
+	  continue;
+
+	tree ref;
+	unsigned HOST_WIDE_INT off;
+	tree size;
+
+	/* Lower a bitfield read.  */
+	ref = gimple_assign_rhs1 (stmt);
+	if (bitfield_access_p (ref, &off))
+	  {
+	    bfaccess a (ref);
+	    bfaccess *aa = NULL;
+	    bitfield_stmt_access_pair st_acc (stmt, NULL);
+	    bitfield_stmt_access_pair *p_st_acc;
+	    p_st_acc = bf_stmnt_acc.find (&st_acc);
+	    if (p_st_acc)
+	      aa = p_st_acc->access;
+	    if (!aa)
+	      aa = bf.find (&a);
+
+	    if (aa->merged || aa->modified)
+	      size =
+		build_int_cst (unsigned_type_node,
+			       get_merged_bit_field_size (aa->
+							  head_access ? aa->
+							  head_access : aa));
+	    else
+	      size = DECL_SIZE (TREE_OPERAND (ref, 1));
+	    if (aa->merged)
+	      off = aa->head_access->src_bit_offset;
+
+	    if (aa->merged || aa->modified)
+	      {
+		tree tmp_ssa;
+		tree itype = make_node (INTEGER_TYPE);
+		TYPE_PRECISION (itype) = TREE_INT_CST_LOW (size);
+		fixup_unsigned_type (itype);
+		lower_bitfield_read (&gsi, off, size, itype);
+		tmp_ssa =
+		make_ssa_name (create_tmp_var (itype, NULL), NULL);
+		gimple_assign_set_lhs (aa->load_stmt, tmp_ssa);
+		update_stmt (aa->load_stmt);
+		gimple_assign_set_rhs1 (aa->store_stmt, tmp_ssa);
+	      }
+	    else if (all || (aa->r_count > 1))
+	      lower_bitfield_read (&gsi, off, size, NULL);
+	  }
+	/* Lower a bitfield write to a read-modify-write cycle.  */
+	ref = gimple_assign_lhs (stmt);
+	if (bitfield_access_p (ref, &off))
+	  {
+	    bfaccess a (ref);
+	    bfaccess *aa = NULL;
+	    bitfield_stmt_access_pair st_acc (stmt, NULL);
+	    bitfield_stmt_access_pair *p_st_acc;
+	    p_st_acc = bf_stmnt_acc.find (&st_acc);
+	    if (p_st_acc)
+	      aa = p_st_acc->access;
+	    if (!aa)
+	      aa = bf.find (&a);
+
+	    if (aa->merged || aa->modified)
+	      size =
+		build_int_cst (unsigned_type_node,
+			       get_merged_bit_field_size (aa->
+							  head_access ? aa->
+							  head_access : aa));
+	    else
+	      size = DECL_SIZE (TREE_OPERAND (ref, 1));
+	    if (aa->merged)
+	      off = aa->head_access->dst_bit_offset;
+
+	    if (all || (aa->w_count > 1) || aa->merged || aa->modified)
+	      lower_bitfield_write (&gsi, off, size);
+	  }
+      }
+  }
+
+  bf.dispose ();
+  bf_stmnt_acc.dispose ();
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
+
+  if (flag_tree_bitfield_merge)
+    lower_bitfields (false);
   sra_mode = SRA_MODE_EARLY_INTRA;
   return perform_intra_sra ();
 }
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index f7ec8b6..cde6ce6 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -4193,29 +4193,6 @@ get_next_value_id (void)
   return next_value_id++;
 }
 
-
-/* Compare two expressions E1 and E2 and return true if they are equal.  */
-
-bool
-expressions_equal_p (tree e1, tree e2)
-{
-  /* The obvious case.  */
-  if (e1 == e2)
-    return true;
-
-  /* If only one of them is null, they cannot be equal.  */
-  if (!e1 || !e2)
-    return false;
-
-  /* Now perform the actual comparison.  */
-  if (TREE_CODE (e1) == TREE_CODE (e2)
-      && operand_equal_p (e1, e2, OEP_PURE_SAME))
-    return true;
-
-  return false;
-}
-
-
 /* Return true if the nary operation NARY may trap.  This is a copy
    of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
 
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index f52783a..0aa5537 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -21,10 +21,6 @@
 #ifndef TREE_SSA_SCCVN_H
 #define TREE_SSA_SCCVN_H
 
-/* In tree-ssa-sccvn.c  */
-bool expressions_equal_p (tree, tree);
-
-
 /* TOP of the VN lattice.  */
 extern tree VN_TOP;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index d102d07..78355cc 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12369,4 +12369,44 @@ get_base_address (tree t)
   return t;
 }
 
+/* Compare two expressions E1 and E2 and return true if they are equal.  */
+
+bool
+expressions_equal_p (tree e1, tree e2)
+{
+  /* The obvious case.  */
+  if (e1 == e2)
+    return true;
+
+  /* If only one of them is null, they cannot be equal.  */
+  if (!e1 || !e2)
+    return false;
+
+  /* Now perform the actual comparison.  */
+  if (TREE_CODE (e1) == TREE_CODE (e2)
+      && operand_equal_p (e1, e2, OEP_PURE_SAME))
+    return true;
+
+  return false;
+}
+
+/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
+   node, return the size in bits for the type if it is a constant, or else
+   return the alignment for the type if the type's size is not constant, or
+   else return BITS_PER_WORD if the type actually turns out to be an
+   ERROR_MARK node.  */
+
+unsigned HOST_WIDE_INT
+simple_type_size_in_bits (const_tree type)
+{
+  if (TREE_CODE (type) == ERROR_MARK)
+    return BITS_PER_WORD;
+  else if (TYPE_SIZE (type) == NULL_TREE)
+    return 0;
+  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
+    return tree_to_uhwi (TYPE_SIZE (type));
+  else
+    return TYPE_ALIGN (type);
+}
+
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index 0dc8d0d..4a5b930 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3984,6 +3984,7 @@ extern tree substitute_placeholder_in_expr (tree, tree);
   ((EXP) == 0 || TREE_CONSTANT (EXP) ? (EXP)	\
    : substitute_placeholder_in_expr (EXP, OBJ))
 
+extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
 
 /* stabilize_reference (EXP) returns a reference equivalent to EXP
    but it can be used multiple times
@@ -4100,6 +4101,11 @@ inlined_function_outer_scope_p (const_tree block)
        (TREE = function_args_iter_cond (&(ITER))) != NULL_TREE;		\
        function_args_iter_next (&(ITER)))
 
+
+/* In dwarf2out.c.  */
+HOST_WIDE_INT
+field_byte_offset (const_tree decl);
+
 /* In tree.c */
 extern unsigned crc32_string (unsigned, const char *);
 extern unsigned crc32_byte (unsigned, char);
@@ -4244,6 +4250,7 @@ extern tree obj_type_ref_class (tree ref);
 extern bool types_same_for_odr (tree type1, tree type2);
 extern bool contains_bitfld_component_ref_p (const_tree);
 extern bool type_in_anonymous_namespace_p (tree);
+extern bool expressions_equal_p (tree e1, tree e2);
 extern bool block_may_fallthru (const_tree);
 extern void using_eh_for_cleanups (void);
 extern bool using_eh_for_cleanups_p (void);



Regards,
Zoran Jovanovic

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
  2014-03-09 20:40           ` Zoran Jovanovic
@ 2014-03-12 21:51             ` Bernhard Reutner-Fischer
  0 siblings, 0 replies; 17+ messages in thread
From: Bernhard Reutner-Fischer @ 2014-03-12 21:51 UTC (permalink / raw)
  To: Zoran Jovanovic; +Cc: Richard Biener, gcc-patches

On Sun, Mar 09, 2014 at 08:35:43PM +0000, Zoran Jovanovic wrote:
> Hello,
> This is new patch version. 
> Approach suggested by Richard Biener with lowering bit-field accesses instead of modifying gimple trees is implemented.
 
> New command line option "-fmerge-bitfields" is introduced.
> 
> Tested - passed gcc regression tests.
> 
> Changelog -
> 
> gcc/ChangeLog:
> 2014-03-09 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
>   * common.opt (fmerge-bitfields): New option.
>   * doc/invoke.texi: Added reference to "-fmerge-bitfields".

Present tense.

>   * tree-sra.c (lower_bitfields): New function.
>   Entry for (-fmerge-bitfields).
>   (bfaccess::hash): New function.
>   (bfaccess::equal): New function.
>   (bfaccess::remove): New function.
>   (bitfield_access_p): New function.
>   (lower_bitfield_read): New function.
>   (lower_bitfield_write): New function.
>   (bitfield_stmt_access_pair_htab_hash): New function.
>   (bitfield_stmt_access_pair_htab_eq): New function.
>   (create_and_insert_access): New function.
>   (get_bit_offset): New function.
>   (get_merged_bit_field_size): New function.
>   (add_stmt_access_pair): New function.
>   (cmp_access): New function.
>   * dwarf2out.c (simple_type_size_in_bits): moved to tree.c.

Present tense. Capital 'M'ove

>   (field_byte_offset): declaration moved to tree.h, static removed.

Capital 'D'eclaration. These are supposed to be sentences. By removing
static you IMHO 'make extern'.

>   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
>   * tree-ssa-sccvn.c (expressions_equal_p): moved to tree.c.

See above.

>   * tree-ssa-sccvn.h (expressions_equal_p): declaration moved to tree.h.

Likewise.

>   * tree.c (expressions_equal_p): moved from tree-ssa-sccvn.c.

See above.

>   (simple_type_size_in_bits): moved from dwarf2out.c.

See above.

>   * tree.h (expressions_equal_p): declaration added.

Ditto.

>   (field_byte_offset): declaration added.

Ditto.

> 
> Patch -
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 661516d..3331d03 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2193,6 +2193,10 @@ ftree-sra
>  Common Report Var(flag_tree_sra) Optimization
>  Perform scalar replacement of aggregates
>  
> +fmerge-bitfields
> +Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization

Optimization but not enabled for any level. So, where would one
generally want this enabled? CSiBE numbers? SPEC you-name-it
improvements? size(1) improvements where? In GCC there is generally no
interest in the size(1) added to the collection itself, so let me ask
for size(1) and bloat(-o-meter) stats for gcc, cc1 and collect2, just
for the sake of it?

> +Merge loads and stores of consecutive bitfields
> +
>  ftree-ter
>  Common Report Var(flag_tree_ter) Optimization
>  Replace temporary expressions in the SSA->normal pass
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 24bd76e..54bae56 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -411,7 +411,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
>  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-fmerge-bitfields -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -7807,6 +7807,11 @@ pointer alignment information.
>  This pass only operates on local scalar variables and is enabled by default
>  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
>  
> +@item -fbitfield-merge

you are talking about '-fmerge-bitfields' up until here (except for
Subject. [Confusion starts here -- Subject: -ftree-bitfield-merge; sofar
Intro -fmerge-bitfields and ChangeLog -fmerge-bitfields]

> +@opindex fmerge-bitfields
> +Combines several adjacent bit-field accesses that copy values
> +from one memory location to another into one single bit-field access.
> +
>  @item -ftree-ccp
>  @opindex ftree-ccp
>  Perform sparse conditional constant propagation (CCP) on trees.  This

> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 284d544..c6a19b2 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -3462,10 +3462,608 @@ perform_intra_sra (void)
>    return ret;
>  }
>  
> +/* Bitfield access and hashtable support commoning same base and
> +   representative.  */
> +
> +struct bfaccess
> +{
> +  bfaccess (tree r):ref (r), r_count (1), w_count (1), merged (false),
> +    modified (false), is_barrier (false), next (0), head_access (0)
> +  {
> +  }
> +
> +  tree ref;
> +  unsigned r_count;  /* Read counter.  */
> +  unsigned w_count;  /* Write counter.  */
> +
> +  /* hash_table support.  */
> +  typedef bfaccess value_type;
> +  typedef bfaccess compare_type;
> +  static inline hashval_t hash (const bfaccess *);

I suspect there's a reason to not have the * be a const* ?

> +  static inline int equal (const bfaccess *, const bfaccess *);

..which holds true here as well.

> +  static inline void remove (bfaccess *);
> +
> +  gimple load_stmt;		/* Bit-field load statement.  */
> +  gimple store_stmt;		/* Bit-field store statement.  */
> +  unsigned src_offset_words;	/* Bit-field offset at src in words.  */
> +  unsigned src_bit_offset;	/* Bit-field offset inside source word.  */
> +  unsigned src_bit_size;	/* Size of bit-field in source word.  */
> +  unsigned dst_offset_words;	/* Bit-field offset at dst in words.  */
> +  unsigned dst_bit_offset;	/* Bit-field offset inside destination
> +				   word.  */
> +  unsigned src_field_offset;	/* Source field offset.  */
> +  unsigned dst_bit_size;	/* Size of bit-field in destination word.  */
> +  tree src_addr;		/* Address of source memory access.  */
> +  tree dst_addr;		/* Address of destination memory access.  */
> +  bool merged;			/* True if access is merged with another
> +				   one.  */
> +  bool modified;		/* True if bit-field size is modified.  */
> +  bool is_barrier;		/* True if access is barrier (call or mem
> +				   access).  */

Back then the above usually were bitfields:1 themselves and for space
considerations were moved to a place where alignment begged for or with
cacheline friendliness in mind but i guess those times are past
nowadays, yes?

> +  struct bfaccess *next;	/* Access with which this one is merged.  */
> +  tree bitfield_representative;	/* Bit field representative of original
> +				   declaration.  */
> +  struct bfaccess *head_access;	/* Head of access list where this one is
> +				   merged.  */
> +};

> +/* Return whether REF is a bitfield access the bit offset of the bitfield

mhm. Maybe it's late here by now, but can you actually parse the
sentence above? Is there an 'at' missing somewhere?
"a bitfield access at" perhaps?

Here I'll stop attempts to follow what you wrote, no offence.

TIA && cheers,

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
@ 2013-07-22 10:08 Bernd Edlinger
  0 siblings, 0 replies; 17+ messages in thread
From: Bernd Edlinger @ 2013-07-22 10:08 UTC (permalink / raw)
  To: Zoran Jovanovic, gcc-patches; +Cc: Petar Jovanovic

Hello Zoran,

I may be wrong, but what you are trying to do is very similar to what's
in fold-const.c optimize_bit_field_compare().

There was a discussion in April 2012 on this thread: "Continue strict-volatile-bitfields fixes"

http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01094.html

The result was that this optimization seems to break other possible optimizations later on,
when -fstrict-volatile-bitfields was enabled on the SH target. Even when the bit fields are NOT volatile.
(Of course you should not touch volatile bit fields at all)

And this was added to optimize_bit_field_compare as a result:

  /* In the strict volatile bitfields case, doing code changes here may prevent
     other optimizations, in particular in a SLOW_BYTE_ACCESS setting.  */
  if (flag_strict_volatile_bitfields> 0)
    return 0;



Regards
Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-03-12 21:46 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-17 20:58 [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside) Zoran Jovanovic
2013-07-17 21:19 ` Joseph S. Myers
2013-07-17 21:25 ` Andrew Pinski
2013-07-18 10:06 ` Richard Biener
2013-07-30 15:02   ` Zoran Jovanovic
2013-08-27 11:33     ` Richard Biener
2013-07-18 18:14 ` Cary Coutant
2013-07-18 19:47 ` Hans-Peter Nilsson
2013-08-23 14:25 ` Zoran Jovanovic
2013-08-23 22:19   ` Joseph S. Myers
     [not found]   ` <140b1b1b35a.2760.0f39ed3bcad52ef2c88c90062b7714dc@gmail.com>
2013-08-24 22:32     ` Bernhard Reutner-Fischer
2013-09-24 23:10   ` Zoran Jovanovic
     [not found]     ` <CAFiYyc0dcpDeXqwM2G3BTJUkpTsjzivRVEuWGfmGE4QcMhxERA@mail.gmail.com>
2013-11-08 13:07       ` Richard Biener
2013-11-08 14:11         ` Richard Biener
2014-03-09 20:40           ` Zoran Jovanovic
2014-03-12 21:51             ` Bernhard Reutner-Fischer
2013-07-22 10:08 Bernd Edlinger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).