public inbox for gcc-patches@gcc.gnu.org
* [PATCH 2/5] Make tree-complex.c:extract_component() handle V_C_Es
  2009-04-28 10:10 [PATCH 0/5] New implementation of SRA Martin Jambor
  2009-04-28 10:10 ` [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA Martin Jambor
@ 2009-04-28 10:10 ` Martin Jambor
  2009-04-28 11:52   ` Richard Guenther
  2009-04-28 10:11 ` [PATCH 1/5] Get rid of old external tree-sra.c stuff Martin Jambor
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-04-28 10:10 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Guenther, Jan Hubicka

[-- Attachment #1: complex_vce.diff --]
[-- Type: text/plain, Size: 868 bytes --]

Currently, tree-complex.c:extract_component() cannot handle
VIEW_CONVERT_EXPRs, which makes the new SRA ICE during bootstrap
(IIRC).  This seems to be an omission, so I added a case label so
that these expressions are handled just like the other handled
components.
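
For context, here is a sketch of the kind of GIMPLE that can now reach
this switch (a hypothetical reconstruction for illustration, not taken
from the actual bootstrap failure):

  union u { _Complex float c; unsigned long long l; };

  /* After SRA creates a scalar replacement SR.5 for the union, the
     complex lowering pass may see a load such as: */
  _1 = REALPART_EXPR <VIEW_CONVERT_EXPR<_Complex float>(SR.5)>;

extract_component() descends into the operand of the REALPART_EXPR
and, without the new case label, the VIEW_CONVERT_EXPR presumably fell
through to the default case and caused the ICE.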

Thanks,

Martin


2009-04-27  Martin Jambor  <mjambor@suse.cz>

	* tree-complex.c (extract_component): Added VIEW_CONVERT_EXPR switch
	case.


Index: mine/gcc/tree-complex.c
===================================================================
--- mine.orig/gcc/tree-complex.c	2009-04-25 19:11:37.000000000 +0200
+++ mine/gcc/tree-complex.c	2009-04-25 19:11:47.000000000 +0200
@@ -601,6 +601,7 @@ extract_component (gimple_stmt_iterator 
     case INDIRECT_REF:
     case COMPONENT_REF:
     case ARRAY_REF:
+    case VIEW_CONVERT_EXPR:
       {
 	tree inner_type = TREE_TYPE (TREE_TYPE (t));
 


* [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA
  2009-04-28 10:10 [PATCH 0/5] New implementation of SRA Martin Jambor
@ 2009-04-28 10:10 ` Martin Jambor
  2009-04-28 12:15   ` Richard Guenther
  2009-04-28 10:10 ` [PATCH 2/5] Make tree-complex.c:extract_component() handle V_C_Es Martin Jambor
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-04-28 10:10 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Guenther, Jan Hubicka

[-- Attachment #1: fix_iinln.diff --]
[-- Type: text/plain, Size: 1880 bytes --]

The new intra-SRA produces an extra copy assignment, which breaks the
pattern matching in ipa-prop.c.  The following patch fixes that.
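
For illustration, this is roughly the shape of the member-pointer
pattern that ipa-prop.c matches, before and after the new intra-SRA
(a hedged sketch with made-up names; the real GIMPLE may differ in
detail):

  /* Previously, the constants appeared directly in the stores:  */
  ptr.__pfn = &foo;    /* an ADDR_EXPR of a FUNCTION_DECL */
  ptr.__delta = 0;     /* an INTEGER_CST */

  /* The new intra-SRA routes them through a simple copy:  */
  SR_1 = &foo;
  ptr.__pfn = SR_1;
  SR_2 = 0;
  ptr.__delta = SR_2;

The new get_ssa_def_if_simple_copy() looks through such single-operand
copies so that determine_cst_member_ptr() still finds the ADDR_EXPR
and the INTEGER_CST.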

Thanks,

Martin


2009-04-27  Martin Jambor  <mjambor@suse.cz>

	* ipa-prop.c (get_ssa_def_if_simple_copy): New function.
	(determine_cst_member_ptr): Call get_ssa_def_if_simple_copy to skip
	simple copies.


Index: mine/gcc/ipa-prop.c
===================================================================
--- mine.orig/gcc/ipa-prop.c
+++ mine/gcc/ipa-prop.c
@@ -456,6 +456,22 @@ fill_member_ptr_cst_jump_function (struc
   jfunc->value.member_cst.delta = delta;
 }
 
+/* If RHS is an SSA_NAME and it is defined by a simple copy assignment
+   statement, return the RHS of its defining statement.  */
+
+static inline tree
+get_ssa_def_if_simple_copy (tree rhs)
+{
+  if (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
+    {
+      gimple def_stmt = SSA_NAME_DEF_STMT (rhs);
+
+      if (is_gimple_assign (def_stmt) && gimple_num_ops (def_stmt) == 2)
+	rhs = gimple_assign_rhs1 (def_stmt);
+    }
+  return rhs;
+}
+
 /* Traverse statements from CALL backwards, scanning whether the argument ARG
    which is a member pointer is filled in with constant values.  If it is, fill
    the jump function JFUNC in appropriately.  METHOD_FIELD and DELTA_FIELD are
@@ -495,6 +511,7 @@ determine_cst_member_ptr (gimple call, t
       fld = TREE_OPERAND (lhs, 1);
       if (!method && fld == method_field)
 	{
+	  rhs = get_ssa_def_if_simple_copy (rhs);
 	  if (TREE_CODE (rhs) == ADDR_EXPR
 	      && TREE_CODE (TREE_OPERAND (rhs, 0)) == FUNCTION_DECL
 	      && TREE_CODE (TREE_TYPE (TREE_OPERAND (rhs, 0))) == METHOD_TYPE)
@@ -512,6 +529,7 @@ determine_cst_member_ptr (gimple call, t
 
       if (!delta && fld == delta_field)
 	{
+	  rhs = get_ssa_def_if_simple_copy (rhs);
 	  if (TREE_CODE (rhs) == INTEGER_CST)
 	    {
 	      delta = rhs;


* [PATCH 0/5] New implementation of SRA
@ 2009-04-28 10:10 Martin Jambor
  2009-04-28 10:10 ` [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA Martin Jambor
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Martin Jambor @ 2009-04-28 10:10 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Guenther, Jan Hubicka


Hi,

this patch set contains my new implementation of intraprocedural
Scalar Replacement of Aggregates (SRA) that I would like to commit to
trunk first.  It is essentially a part of the merge of the pretty-ipa
branch, because an only slightly different variant of this
implementation has been on that branch for a number of months.
Nevertheless, this patch set does not contain the interprocedural
variant, IPA-SRA, that I posted here earlier.  I will send that as a
followup patch later (hopefully once this is in).

Unlike the previous SRA, this one is not based on decomposing
aggregate types and references but on get_ref_base_and_extent() and
on classifying accesses into aggregates on the basis of their offsets
and sizes.  It only creates scalar replacements, and only for those
components that are actually individually accessed in a function.
The main advantages of this new implementation are:

  - It is able to scalarize unions and aggregates that contain unions
    (PR 32964).  Moreover, get_ref_base_and_extent() potentially
    catches other scalarizable accesses that simple reference analysis
    does not; one example, sketched after this list, is an access to a
    one-element array that is not at the end of a structure.

  - It is simpler.  In fact, simplicity was the main objective of this
    new implementation.  On the other hand, whenever performance or
    bogus-warning avoidance required it (the two were usually tightly
    connected), the necessary computations and data structures were
    added, even though some of them are not entirely simple.

    Still, with comments stripped off, the new implementation has 2246
    lines of code whereas the old one has 3375.  I also believe it is
    easier to grasp how the new one works, though my view is obviously
    skewed.  Hopefully, it contains fewer bugs and makes those that are
    there easier to hunt down.  Avoiding all sorts of trickery also
    makes reasoning about SRA's effects on the code easier.
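
To make the one-element-array case mentioned in the first point
concrete, here is a hypothetical testcase (made up for illustration;
it is not part of the patch set):

  struct s { int a[1]; int b; };

  int
  foo (struct s p)
  {
    p.a[0] = 4;   /* get_ref_base_and_extent() reports base P, offset 0
                     and the size of an int, so the access can be
                     classified and scalarized even though the array is
                     not at the end of the structure.  */
    return p.a[0] + p.b;
  }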


Its behavior on the branch shows that it does not bring about any
non-noise compile-time or run-time regressions.  My isolated and so
far rather sloppy benchmarks of this particular version suggest the
same thing.  Nevertheless, I am about to test it thoroughly using one
of our automated testers.  If there are indeed no problems, I will
propose that these patches be committed to trunk.

Because I believe that the final version will be very similar to this
one, I'd like to invite others, especially middle-end maintainers, to
review it.

I have bootstrapped and tested the patches on x86_64-linux-gnu without
any problems.  I intend to do that on i686 as well, and I'd like to do
it on hppa-linux-gnu too, but trunk does not bootstrap there as it is
(probably one of the post expand-from-SSA issues).  I'll keep trying.

Thank you very much in advance,

Martin


* [PATCH 1/5] Get rid of old external tree-sra.c stuff
  2009-04-28 10:10 [PATCH 0/5] New implementation of SRA Martin Jambor
  2009-04-28 10:10 ` [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA Martin Jambor
  2009-04-28 10:10 ` [PATCH 2/5] Make tree-complex.c:extract_component() handle V_C_Es Martin Jambor
@ 2009-04-28 10:11 ` Martin Jambor
  2009-04-28 12:55   ` Richard Guenther
  2009-04-28 10:12 ` [PATCH 5/5] "Fix" the rest of the fallouts of new intra-SRA Martin Jambor
  2009-04-28 10:14 ` [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates Martin Jambor
  4 siblings, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-04-28 10:11 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Guenther, Jan Hubicka

[-- Attachment #1: move_insert_edge_copies_seq.diff --]
[-- Type: text/plain, Size: 5261 bytes --]

This patch gets rid of everything external in the old tree-sra.c.
sra_insert_before, sra_insert_after, sra_init_cache and
sra_type_can_be_decomposed_p are not actually used anywhere else, so
they are made static.  insert_edge_copies_seq is used in mudflap, so I
copied the function there and made it static too.  The original one had
to be moved upwards in the file so that tree-sra.c compiles.  Yes, this
patch duplicates the function, but the original copy is nuked with the
rest of the file by the next patch.

Thanks,

Martin


2009-04-27  Martin Jambor  <mjambor@suse.cz>

	* tree-flow.h (insert_edge_copies_seq): Undeclare.
	(sra_insert_before): Likewise.
	(sra_insert_after): Likewise.
	(sra_init_cache): Likewise.
	(sra_type_can_be_decomposed_p): Likewise.

	* tree-mudflap.c (insert_edge_copies_seq): Copied here from tree-sra.c.

	* tree-sra.c (sra_type_can_be_decomposed_p): Made static.
	(sra_insert_before): Likewise.
	(sra_insert_after): Likewise.
	(sra_init_cache): Likewise.
	(insert_edge_copies_seq): Made static and moved upwards.


Index: mine/gcc/tree-flow.h
===================================================================
--- mine.orig/gcc/tree-flow.h
+++ mine/gcc/tree-flow.h
@@ -873,13 +873,6 @@ tree vn_lookup_with_vuses (tree, VEC (tr
 /* In tree-ssa-sink.c  */
 bool is_hidden_global_store (gimple);
 
-/* In tree-sra.c  */
-void insert_edge_copies_seq (gimple_seq, basic_block);
-void sra_insert_before (gimple_stmt_iterator *, gimple_seq);
-void sra_insert_after (gimple_stmt_iterator *, gimple_seq);
-void sra_init_cache (void);
-bool sra_type_can_be_decomposed_p (tree);
-
 /* In tree-loop-linear.c  */
 extern void linear_transform_loops (void);
 extern unsigned perfect_loop_nest_depth (struct loop *);
Index: mine/gcc/tree-mudflap.c
===================================================================
--- mine.orig/gcc/tree-mudflap.c
+++ mine/gcc/tree-mudflap.c
@@ -447,6 +447,26 @@ execute_mudflap_function_ops (void)
   return 0;
 }
 
+/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
+   if BB has more than one edge, SEQ will be replicated for each edge.
+   Also, abnormal edges will be ignored.  */
+
+static void
+insert_edge_copies_seq (gimple_seq seq, basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+  unsigned n_copies = -1;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_ABNORMAL))
+      n_copies++;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_ABNORMAL))
+      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
+}
+
 /* Create and initialize local shadow variables for the lookup cache
    globals.  Put their decls in the *_l globals for use by
    mf_build_check_statement_for.  */
Index: mine/gcc/tree-sra.c
===================================================================
--- mine.orig/gcc/tree-sra.c
+++ mine/gcc/tree-sra.c
@@ -236,7 +236,7 @@ is_sra_scalar_type (tree type)
    instantiated, just that if we decide to break up the type into
    separate pieces that it can be done.  */
 
-bool
+static bool
 sra_type_can_be_decomposed_p (tree type)
 {
   unsigned int cache = TYPE_UID (TYPE_MAIN_VARIANT (type)) * 2;
@@ -1263,6 +1263,26 @@ build_element_name (struct sra_elt *elt)
   return XOBFINISH (&sra_obstack, char *);
 }
 
+/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
+   if BB has more than one edge, SEQ will be replicated for each edge.
+   Also, abnormal edges will be ignored.  */
+
+static void
+insert_edge_copies_seq (gimple_seq seq, basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+  unsigned n_copies = -1;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_ABNORMAL))
+      n_copies++;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_ABNORMAL))
+      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
+}
+
 /* Instantiate an element as an independent variable.  */
 
 static void
@@ -2785,29 +2805,9 @@ generate_element_init (struct sra_elt *e
   return ret;
 }
 
-/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
-   if BB has more than one edge, STMT will be replicated for each edge.
-   Also, abnormal edges will be ignored.  */
-
-void
-insert_edge_copies_seq (gimple_seq seq, basic_block bb)
-{
-  edge e;
-  edge_iterator ei;
-  unsigned n_copies = -1;
-
-  FOR_EACH_EDGE (e, ei, bb->succs)
-    if (!(e->flags & EDGE_ABNORMAL)) 
-      n_copies++;
-
-  FOR_EACH_EDGE (e, ei, bb->succs)
-    if (!(e->flags & EDGE_ABNORMAL)) 
-      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
-}
-
 /* Helper function to insert LIST before GSI, and set up line number info.  */
 
-void
+static void
 sra_insert_before (gimple_stmt_iterator *gsi, gimple_seq seq)
 {
   gimple stmt = gsi_stmt (*gsi);
@@ -2819,7 +2819,7 @@ sra_insert_before (gimple_stmt_iterator
 
 /* Similarly, but insert after GSI.  Handles insertion onto edges as well.  */
 
-void
+static void
 sra_insert_after (gimple_stmt_iterator *gsi, gimple_seq seq)
 {
   gimple stmt = gsi_stmt (*gsi);
@@ -3597,7 +3597,7 @@ debug_sra_elt_name (struct sra_elt *elt)
   fputc ('\n', stderr);
 }
 
-void 
+static void
 sra_init_cache (void)
 {
   if (sra_type_decomp_cache)


* [PATCH 5/5] "Fix" the rest of the fallouts of new intra-SRA
  2009-04-28 10:10 [PATCH 0/5] New implementation of SRA Martin Jambor
                   ` (2 preceding siblings ...)
  2009-04-28 10:11 ` [PATCH 1/5] Get rid of old external tree-sra.c stuff Martin Jambor
@ 2009-04-28 10:12 ` Martin Jambor
  2009-04-28 13:05   ` Richard Guenther
  2009-04-28 10:14 ` [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates Martin Jambor
  4 siblings, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-04-28 10:12 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Guenther, Jan Hubicka

[-- Attachment #1: testcase_fixes.diff --]
[-- Type: text/plain, Size: 3075 bytes --]

The following patch amends testcases (rather than the compiler) so
that they do not fail with the new intra-SRA.

The FRE testcases rely on the fact that SRA does not scalarize unions.
The new one, however, does.  Therefore I simply switched SRA off for
them.  Hopefully that is the correct thing to do.
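
For reference, those tests revolve around type punning through a union,
roughly along these lines (an abridged sketch from memory, not the
exact testcase):

  union U { int i; float f; } u;   /* the tests use an intflt typedef */

  u.f = x;
  r = u.i;   /* FRE is expected to value-number this load against the
                store above; the new SRA scalarizes U first, leaving
                nothing for FRE to do, hence -fno-tree-sra.  */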

The gfortran.dg/pr25923.f90 testcase expects a weird warning that I do
not give out, because the new SRA does not scalarize anything in that
particular testcase: the individual fields of the structure are never
accessed individually.  An unpatched compiler with -fno-tree-sra does
not give the warning either.  I believe the expectation was left there
as potentially useful and not misleading, but I tend to think it is not
really required...

With the previous patch and this one, there are no regressions in the
testsuite on x86_64-linux-gnu.  I have even bootstrapped and tested
Ada.

Thanks,

Martin


2009-04-27  Martin Jambor  <mjambor@suse.cz>

	    * gfortran.dg/pr25923.f90: Remove warning expectation.
	    * gcc.dg/tree-ssa/ssa-fre-7.c: Compile with -fno-tree-sra.
	    * gcc.dg/tree-ssa/ssa-fre-8.c: Likewise.
	    * gcc.dg/tree-ssa/ssa-fre-9.c: Likewise.


Index: mine/gcc/testsuite/gfortran.dg/pr25923.f90
===================================================================
--- mine.orig/gcc/testsuite/gfortran.dg/pr25923.f90
+++ mine/gcc/testsuite/gfortran.dg/pr25923.f90
@@ -10,7 +10,7 @@ implicit none
 
 contains
 
-  function baz(arg) result(res) ! { dg-warning "res.yr' may be" }
+  function baz(arg) result(res)
     type(bar), intent(in) :: arg
     type(bar) :: res
     logical, external:: some_func
Index: mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-7.c
===================================================================
--- mine.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-7.c
+++ mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre-details -fdump-tree-optimized" } */
+/* { dg-options "-O -fno-tree-sra -fdump-tree-fre-details -fdump-tree-optimized" } */
 #if (__SIZEOF_INT__ == __SIZEOF_FLOAT__)
 typedef int intflt;
 #elif (__SIZEOF_LONG__ == __SIZEOF_FLOAT__)
Index: mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-8.c
===================================================================
--- mine.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-8.c
+++ mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-8.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre-details" } */
+/* { dg-options "-O -fno-tree-sra -fdump-tree-fre-details" } */
 #if (__SIZEOF_INT__ == __SIZEOF_FLOAT__)
 typedef int intflt;
 #elif (__SIZEOF_LONG__ == __SIZEOF_FLOAT__)
Index: mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-9.c
===================================================================
--- mine.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-9.c
+++ mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-fre-stats" } */
+/* { dg-options "-O -fno-tree-sra -fdump-tree-fre-stats" } */
 
 union loc {
     unsigned reg;


* [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-04-28 10:10 [PATCH 0/5] New implementation of SRA Martin Jambor
                   ` (3 preceding siblings ...)
  2009-04-28 10:12 ` [PATCH 5/5] "Fix" the rest of the fallouts of new intra-SRA Martin Jambor
@ 2009-04-28 10:14 ` Martin Jambor
  2009-04-28 10:27   ` Martin Jambor
  2009-04-29 10:59   ` Richard Guenther
  4 siblings, 2 replies; 25+ messages in thread
From: Martin Jambor @ 2009-04-28 10:14 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Guenther, Jan Hubicka

[-- Attachment #1: trunk_intra_sra.diff --]
[-- Type: text/plain, Size: 181047 bytes --]

This is the new intraprocedural SRA.  I have stripped off the
interprocedural part and will propose to commit it separately later.
I have tried to remove almost every trace of IPA-SRA; however, two
provisions for it have remained in the patch.  First, an enumeration
(rather than a boolean) is used to distinguish between "early" and
"late" SRA so that other SRA modes can be added later on.  Second,
scan_function() has a hook parameter and a void pointer parameter
which are not used in this patch but will be used by IPA-SRA.
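
For illustration, a minimal sketch of how a future mode could slot into
these two provisions (SRA_MODE_EARLY_IPA and the scan_function()
signature below are my guesses; the real IPA-SRA interface may well
differ):

  enum sra_mode { SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
                  SRA_MODE_INTRA,       /* late intraprocedural SRA */
                  SRA_MODE_EARLY_IPA }; /* hypothetical IPA-SRA mode */

  /* The intraprocedural passes would pass NULL for ANALYSIS_HOOK and
     DATA; IPA-SRA would supply its own callback and summary data.  */
  static bool
  scan_function (bool (*analysis_hook) (gimple, void *), void *data);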

Otherwise, the patch is hopefully self-contained, and the basis of its
operation is described in the initial comment.

The patch bootstraps (on x86_64-linux-gnu; I am about to try it on
hppa-linux-gnu too) but produces a small number of testsuite failures,
which are handled by the two following patches.

Thanks,

Martin


2009-04-27  Martin Jambor  <mjambor@suse.cz>

	* tree-sra.c (enum sra_mode): The whole contents of the file were
	replaced.

Index: mine/gcc/tree-sra.c
===================================================================
--- mine.orig/gcc/tree-sra.c
+++ mine/gcc/tree-sra.c
@@ -1,19 +1,18 @@
 /* Scalar Replacement of Aggregates (SRA) converts some structure
    references into scalar references, exposing them to the scalar
    optimizers.
-   Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009
-   Free Software Foundation, Inc.
-   Contributed by Diego Novillo <dnovillo@redhat.com>
+   Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+   Contributed by Martin Jambor <mjambor@suse.cz>
 
 This file is part of GCC.
 
-GCC is free software; you can redistribute it and/or modify it
-under the terms of the GNU General Public License as published by the
-Free Software Foundation; either version 3, or (at your option) any
-later version.
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
 
-GCC is distributed in the hope that it will be useful, but WITHOUT
-ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
 FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 for more details.
 
@@ -21,3656 +20,2436 @@ You should have received a copy of the G
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+/* This file implements Scalar Reduction of Aggregates (SRA).  SRA is run
+   twice, once in the early stages of compilation (early SRA) and once in the
+   late stages (late SRA).  The aim of both is to turn references to scalar
+   parts of aggregates into uses of independent scalar variables.
+
+   The two passes are nearly identical; the only difference is that early SRA
+   does not scalarize unions which are used as the result in a GIMPLE_RETURN
+   statement because together with inlining this can lead to weird type
+   conversions.
+
+   Both passes operate in four stages:
+
+   1. The declarations that have properties which make them candidates for
+      scalarization are identified in function find_var_candidates().  The
+      candidates are stored in candidate_bitmap.
+
+   2. The function body is scanned.  In the process, declarations which are
+      used in a manner that prevent their scalarization are removed from the
+      candidate bitmap.  More importantly, for every access into an aggregate,
+      an access structure (struct access) is created by create_access() and
+      stored in a vector associated with the aggregate.  Among other
+      information, the aggregate declaration, the offset and size of the access
+      and its type are stored in the structure.
+
+      On a related note, assign_link structures are created for every assign
+      statement between candidate aggregates and attached to the related
+      accesses.
+
+   3. The vectors of accesses are analyzed.  They are first sorted according to
+      their offset and size and then scanned for partially overlapping accesses
+      (i.e. those which overlap but one is not entirely within another).  Such
+      an access disqualifies the whole aggregate from being scalarized.
+
+      If there is no such inhibiting overlap, a representative access structure
+      is chosen for every unique combination of offset and size.  Afterwards,
+      the pass builds a set of trees from these structures, in which children
+      of an access are within their parent (in terms of offset and size).
+
+      Then accesses are propagated whenever possible (i.e. in cases when it
+      does not create a partially overlapping access) across assign_links from
+      the right hand side to the left hand side.
+
+      Then the set of trees for each declaration is traversed again and those
+      accesses which should be replaced by a scalar are identified.
+
+   4. The function is traversed again, and for every reference into an
+      aggregate that has some component which is about to be scalarized,
+      statements are amended and new statements are created as necessary.
+      Finally, if a parameter got scalarized, the scalar replacements are
+      initialized with values from respective parameter aggregates.
+*/
+
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
+#include "alloc-pool.h"
 #include "tm.h"
-#include "ggc.h"
 #include "tree.h"
-
-/* These RTL headers are needed for basic-block.h.  */
-#include "rtl.h"
-#include "tm_p.h"
-#include "hard-reg-set.h"
-#include "basic-block.h"
-#include "diagnostic.h"
-#include "langhooks.h"
-#include "tree-inline.h"
-#include "tree-flow.h"
 #include "gimple.h"
+#include "tree-flow.h"
+#include "diagnostic.h"
 #include "tree-dump.h"
-#include "tree-pass.h"
 #include "timevar.h"
-#include "flags.h"
-#include "bitmap.h"
-#include "obstack.h"
-#include "target.h"
-/* expr.h is needed for MOVE_RATIO.  */
-#include "expr.h"
 #include "params.h"
+#include "target.h"
+#include "flags.h"
 
-
-/* This object of this pass is to replace a non-addressable aggregate with a
-   set of independent variables.  Most of the time, all of these variables
-   will be scalars.  But a secondary objective is to break up larger
-   aggregates into smaller aggregates.  In the process we may find that some
-   bits of the larger aggregate can be deleted as unreferenced.
-
-   This substitution is done globally.  More localized substitutions would
-   be the purvey of a load-store motion pass.
-
-   The optimization proceeds in phases:
-
-     (1) Identify variables that have types that are candidates for
-	 decomposition.
-
-     (2) Scan the function looking for the ways these variables are used.
-	 In particular we're interested in the number of times a variable
-	 (or member) is needed as a complete unit, and the number of times
-	 a variable (or member) is copied.
-
-     (3) Based on the usage profile, instantiate substitution variables.
-
-     (4) Scan the function making replacements.
+/* Enumeration of all aggregate reductions we can do.  */
+enum sra_mode {SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
+	       SRA_MODE_INTRA};	     /* late intraprocedural SRA */
+
+/* Global variable describing which aggregate reduction we are performing at
+   the moment.  */
+static enum sra_mode sra_mode;
+
+struct assign_link;
+
+/* ACCESS represents each access to an aggregate variable (as a whole or a
+   part).  It can also represent a group of accesses that refer to exactly the
+   same fragment of an aggregate (i.e. those that have exactly the same offset
+   and size).  Such representatives for a single aggregate, once determined,
+   are linked in a linked list and have the group fields set.
+
+   Moreover, when doing intraprocedural SRA, a tree is built from those
+   representatives (by the means of first_child and next_sibling pointers), in
+   which all items in a subtree are "within" the root, i.e. their offset is
+   greater or equal to offset of the root and offset+size is smaller or equal
+   to offset+size of the root.  Children of an access are sorted by offset.
 */
 
-
-/* True if this is the "early" pass, before inlining.  */
-static bool early_sra;
-
-/* The set of aggregate variables that are candidates for scalarization.  */
-static bitmap sra_candidates;
-
-/* Set of scalarizable PARM_DECLs that need copy-in operations at the
-   beginning of the function.  */
-static bitmap needs_copy_in;
-
-/* Sets of bit pairs that cache type decomposition and instantiation.  */
-static bitmap sra_type_decomp_cache;
-static bitmap sra_type_inst_cache;
-
-/* One of these structures is created for each candidate aggregate and
-   each (accessed) member or group of members of such an aggregate.  */
-struct sra_elt
-{
-  /* A tree of the elements.  Used when we want to traverse everything.  */
-  struct sra_elt *parent;
-  struct sra_elt *groups;
-  struct sra_elt *children;
-  struct sra_elt *sibling;
-
-  /* If this element is a root, then this is the VAR_DECL.  If this is
-     a sub-element, this is some token used to identify the reference.
-     In the case of COMPONENT_REF, this is the FIELD_DECL.  In the case
-     of an ARRAY_REF, this is the (constant) index.  In the case of an
-     ARRAY_RANGE_REF, this is the (constant) RANGE_EXPR.  In the case
-     of a complex number, this is a zero or one.  */
-  tree element;
-
-  /* The type of the element.  */
-  tree type;
-
-  /* A VAR_DECL, for any sub-element we've decided to replace.  */
-  tree replacement;
-
-  /* The number of times the element is referenced as a whole.  I.e.
-     given "a.b.c", this would be incremented for C, but not for A or B.  */
-  unsigned int n_uses;
-
-  /* The number of times the element is copied to or from another
-     scalarizable element.  */
-  unsigned int n_copies;
-
-  /* True if TYPE is scalar.  */
-  bool is_scalar;
-
-  /* True if this element is a group of members of its parent.  */
-  bool is_group;
-
-  /* True if we saw something about this element that prevents scalarization,
-     such as non-constant indexing.  */
-  bool cannot_scalarize;
-
-  /* True if we've decided that structure-to-structure assignment
-     should happen via memcpy and not per-element.  */
-  bool use_block_copy;
-
-  /* True if everything under this element has been marked TREE_NO_WARNING.  */
-  bool all_no_warning;
-
-  /* A flag for use with/after random access traversals.  */
-  bool visited;
-
-  /* True if there is BIT_FIELD_REF on the lhs with a vector. */
-  bool is_vector_lhs;
-
-  /* 1 if the element is a field that is part of a block, 2 if the field
-     is the block itself, 0 if it's neither.  */
-  char in_bitfld_block;
-};
-
-#define IS_ELEMENT_FOR_GROUP(ELEMENT) (TREE_CODE (ELEMENT) == RANGE_EXPR)
-
-#define FOR_EACH_ACTUAL_CHILD(CHILD, ELT)			\
-  for ((CHILD) = (ELT)->is_group				\
-		 ? next_child_for_group (NULL, (ELT))		\
-		 : (ELT)->children;				\
-       (CHILD);							\
-       (CHILD) = (ELT)->is_group				\
-		 ? next_child_for_group ((CHILD), (ELT))	\
-		 : (CHILD)->sibling)
-
-/* Helper function for above macro.  Return next child in group.  */
-static struct sra_elt *
-next_child_for_group (struct sra_elt *child, struct sra_elt *group)
-{
-  gcc_assert (group->is_group);
-
-  /* Find the next child in the parent.  */
-  if (child)
-    child = child->sibling;
-  else
-    child = group->parent->children;
-
-  /* Skip siblings that do not belong to the group.  */
-  while (child)
-    {
-      tree g_elt = group->element;
-      if (TREE_CODE (g_elt) == RANGE_EXPR)
-	{
-	  if (!tree_int_cst_lt (child->element, TREE_OPERAND (g_elt, 0))
-	      && !tree_int_cst_lt (TREE_OPERAND (g_elt, 1), child->element))
-	    break;
-	}
-      else
-	gcc_unreachable ();
-
-      child = child->sibling;
-    }
-
-  return child;
-}
-
-/* Random access to the child of a parent is performed by hashing.
-   This prevents quadratic behavior, and allows SRA to function
-   reasonably on larger records.  */
-static htab_t sra_map;
-
-/* All structures are allocated out of the following obstack.  */
-static struct obstack sra_obstack;
-
-/* Debugging functions.  */
-static void dump_sra_elt_name (FILE *, struct sra_elt *);
-extern void debug_sra_elt_name (struct sra_elt *);
-
-/* Forward declarations.  */
-static tree generate_element_ref (struct sra_elt *);
-static gimple_seq sra_build_assignment (tree dst, tree src);
-static void mark_all_v_defs_seq (gimple_seq);
-
-\f
-/* Return true if DECL is an SRA candidate.  */
-
-static bool
-is_sra_candidate_decl (tree decl)
-{
-  return DECL_P (decl) && bitmap_bit_p (sra_candidates, DECL_UID (decl));
-}
-
-/* Return true if TYPE is a scalar type.  */
-
-static bool
-is_sra_scalar_type (tree type)
-{
-  enum tree_code code = TREE_CODE (type);
-  return (code == INTEGER_TYPE || code == REAL_TYPE || code == VECTOR_TYPE
-	  || code == FIXED_POINT_TYPE
-	  || code == ENUMERAL_TYPE || code == BOOLEAN_TYPE
-	  || code == POINTER_TYPE || code == OFFSET_TYPE
-	  || code == REFERENCE_TYPE);
-}
-
-/* Return true if TYPE can be decomposed into a set of independent variables.
-
-   Note that this doesn't imply that all elements of TYPE can be
-   instantiated, just that if we decide to break up the type into
-   separate pieces that it can be done.  */
-
-static bool
-sra_type_can_be_decomposed_p (tree type)
-{
-  unsigned int cache = TYPE_UID (TYPE_MAIN_VARIANT (type)) * 2;
-  tree t;
-
-  /* Avoid searching the same type twice.  */
-  if (bitmap_bit_p (sra_type_decomp_cache, cache+0))
-    return true;
-  if (bitmap_bit_p (sra_type_decomp_cache, cache+1))
-    return false;
-
-  /* The type must have a definite nonzero size.  */
-  if (TYPE_SIZE (type) == NULL || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST
-      || integer_zerop (TYPE_SIZE (type)))
-    goto fail;
-
-  /* The type must be a non-union aggregate.  */
-  switch (TREE_CODE (type))
-    {
-    case RECORD_TYPE:
-      {
-	bool saw_one_field = false;
-
-	for (t = TYPE_FIELDS (type); t ; t = TREE_CHAIN (t))
-	  if (TREE_CODE (t) == FIELD_DECL)
-	    {
-	      /* Reject incorrectly represented bit fields.  */
-	      if (DECL_BIT_FIELD (t)
-		  && INTEGRAL_TYPE_P (TREE_TYPE (t))
-		  && (tree_low_cst (DECL_SIZE (t), 1)
-		      != TYPE_PRECISION (TREE_TYPE (t))))
-		goto fail;
-
-	      saw_one_field = true;
-	    }
-
-	/* Record types must have at least one field.  */
-	if (!saw_one_field)
-	  goto fail;
-      }
-      break;
-
-    case ARRAY_TYPE:
-      /* Array types must have a fixed lower and upper bound.  */
-      t = TYPE_DOMAIN (type);
-      if (t == NULL)
-	goto fail;
-      if (TYPE_MIN_VALUE (t) == NULL || !TREE_CONSTANT (TYPE_MIN_VALUE (t)))
-	goto fail;
-      if (TYPE_MAX_VALUE (t) == NULL || !TREE_CONSTANT (TYPE_MAX_VALUE (t)))
-	goto fail;
-      break;
-
-    case COMPLEX_TYPE:
-      break;
-
-    default:
-      goto fail;
-    }
-
-  bitmap_set_bit (sra_type_decomp_cache, cache+0);
-  return true;
-
- fail:
-  bitmap_set_bit (sra_type_decomp_cache, cache+1);
-  return false;
-}
-
-/* Returns true if the TYPE is one of the available va_list types.
-   Otherwise it returns false.
-   Note, that for multiple calling conventions there can be more
-   than just one va_list type present.  */
-
-static bool
-is_va_list_type (tree type)
+struct access
 {
-  tree h;
-
-  if (type == NULL_TREE)
-    return false;
-  h = targetm.canonical_va_list_type (type);
-  if (h == NULL_TREE)
-    return false;
-  if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (h))
-	 return true;
-  return false;
-}
-
-/* Return true if DECL can be decomposed into a set of independent
-   (though not necessarily scalar) variables.  */
-
-static bool
-decl_can_be_decomposed_p (tree var)
-{
-  /* Early out for scalars.  */
-  if (is_sra_scalar_type (TREE_TYPE (var)))
-    return false;
-
-  /* The variable must not be aliased.  */
-  if (!is_gimple_non_addressable (var))
-    {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	{
-	  fprintf (dump_file, "Cannot scalarize variable ");
-	  print_generic_expr (dump_file, var, dump_flags);
-	  fprintf (dump_file, " because it must live in memory\n");
-	}
-      return false;
-    }
-
-  /* The variable must not be volatile.  */
-  if (TREE_THIS_VOLATILE (var))
-    {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	{
-	  fprintf (dump_file, "Cannot scalarize variable ");
-	  print_generic_expr (dump_file, var, dump_flags);
-	  fprintf (dump_file, " because it is declared volatile\n");
-	}
-      return false;
-    }
-
-  /* We must be able to decompose the variable's type.  */
-  if (!sra_type_can_be_decomposed_p (TREE_TYPE (var)))
-    {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	{
-	  fprintf (dump_file, "Cannot scalarize variable ");
-	  print_generic_expr (dump_file, var, dump_flags);
-	  fprintf (dump_file, " because its type cannot be decomposed\n");
-	}
-      return false;
-    }
-
-  /* HACK: if we decompose a va_list_type_node before inlining, then we'll
-     confuse tree-stdarg.c, and we won't be able to figure out which and
-     how many arguments are accessed.  This really should be improved in
-     tree-stdarg.c, as the decomposition is truly a win.  This could also
-     be fixed if the stdarg pass ran early, but this can't be done until
-     we've aliasing information early too.  See PR 30791.  */
-  if (early_sra && is_va_list_type (TREE_TYPE (var)))
-    return false;
-
-  return true;
-}
-
-/* Return true if TYPE can be *completely* decomposed into scalars.  */
-
-static bool
-type_can_instantiate_all_elements (tree type)
-{
-  if (is_sra_scalar_type (type))
-    return true;
-  if (!sra_type_can_be_decomposed_p (type))
-    return false;
-
-  switch (TREE_CODE (type))
-    {
-    case RECORD_TYPE:
-      {
-	unsigned int cache = TYPE_UID (TYPE_MAIN_VARIANT (type)) * 2;
-	tree f;
-
-	if (bitmap_bit_p (sra_type_inst_cache, cache+0))
-	  return true;
-	if (bitmap_bit_p (sra_type_inst_cache, cache+1))
-	  return false;
-
-	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
-	  if (TREE_CODE (f) == FIELD_DECL)
-	    {
-	      if (!type_can_instantiate_all_elements (TREE_TYPE (f)))
-		{
-		  bitmap_set_bit (sra_type_inst_cache, cache+1);
-		  return false;
-		}
-	    }
-
-	bitmap_set_bit (sra_type_inst_cache, cache+0);
-	return true;
-      }
-
-    case ARRAY_TYPE:
-      return type_can_instantiate_all_elements (TREE_TYPE (type));
-
-    case COMPLEX_TYPE:
-      return true;
-
-    default:
-      gcc_unreachable ();
-    }
-}
-
-/* Test whether ELT or some sub-element cannot be scalarized.  */
-
-static bool
-can_completely_scalarize_p (struct sra_elt *elt)
-{
-  struct sra_elt *c;
-
-  if (elt->cannot_scalarize)
-    return false;
-
-  for (c = elt->children; c; c = c->sibling)
-    if (!can_completely_scalarize_p (c))
-      return false;
-
-  for (c = elt->groups; c; c = c->sibling)
-    if (!can_completely_scalarize_p (c))
-      return false;
-
-  return true;
-}
-
-\f
-/* A simplified tree hashing algorithm that only handles the types of
-   trees we expect to find in sra_elt->element.  */
-
-static hashval_t
-sra_hash_tree (tree t)
-{
-  hashval_t h;
-
-  switch (TREE_CODE (t))
-    {
-    case VAR_DECL:
-    case PARM_DECL:
-    case RESULT_DECL:
-      h = DECL_UID (t);
-      break;
-
-    case INTEGER_CST:
-      h = TREE_INT_CST_LOW (t) ^ TREE_INT_CST_HIGH (t);
-      break;
-
-    case RANGE_EXPR:
-      h = iterative_hash_expr (TREE_OPERAND (t, 0), 0);
-      h = iterative_hash_expr (TREE_OPERAND (t, 1), h);
-      break;
-
-    case FIELD_DECL:
-      /* We can have types that are compatible, but have different member
-	 lists, so we can't hash fields by ID.  Use offsets instead.  */
-      h = iterative_hash_expr (DECL_FIELD_OFFSET (t), 0);
-      h = iterative_hash_expr (DECL_FIELD_BIT_OFFSET (t), h);
-      break;
-
-    case BIT_FIELD_REF:
-      /* Don't take operand 0 into account, that's our parent.  */
-      h = iterative_hash_expr (TREE_OPERAND (t, 1), 0);
-      h = iterative_hash_expr (TREE_OPERAND (t, 2), h);
-      break;
-
-    default:
-      gcc_unreachable ();
-    }
-
-  return h;
-}
-
-/* Hash function for type SRA_PAIR.  */
-
-static hashval_t
-sra_elt_hash (const void *x)
-{
-  const struct sra_elt *const e = (const struct sra_elt *) x;
-  const struct sra_elt *p;
-  hashval_t h;
-
-  h = sra_hash_tree (e->element);
-
-  /* Take into account everything except bitfield blocks back up the
-     chain.  Given that chain lengths are rarely very long, this
-     should be acceptable.  If we truly identify this as a performance
-     problem, it should work to hash the pointer value
-     "e->parent".  */
-  for (p = e->parent; p ; p = p->parent)
-    if (!p->in_bitfld_block)
-      h = (h * 65521) ^ sra_hash_tree (p->element);
-
-  return h;
-}
-
-/* Equality function for type SRA_PAIR.  */
-
-static int
-sra_elt_eq (const void *x, const void *y)
-{
-  const struct sra_elt *const a = (const struct sra_elt *) x;
-  const struct sra_elt *const b = (const struct sra_elt *) y;
-  tree ae, be;
-  const struct sra_elt *ap = a->parent;
-  const struct sra_elt *bp = b->parent;
-
-  if (ap)
-    while (ap->in_bitfld_block)
-      ap = ap->parent;
-  if (bp)
-    while (bp->in_bitfld_block)
-      bp = bp->parent;
-
-  if (ap != bp)
-    return false;
-
-  ae = a->element;
-  be = b->element;
-
-  if (ae == be)
-    return true;
-  if (TREE_CODE (ae) != TREE_CODE (be))
-    return false;
-
-  switch (TREE_CODE (ae))
-    {
-    case VAR_DECL:
-    case PARM_DECL:
-    case RESULT_DECL:
-      /* These are all pointer unique.  */
-      return false;
-
-    case INTEGER_CST:
-      /* Integers are not pointer unique, so compare their values.  */
-      return tree_int_cst_equal (ae, be);
-
-    case RANGE_EXPR:
-      return
-	tree_int_cst_equal (TREE_OPERAND (ae, 0), TREE_OPERAND (be, 0))
-	&& tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1));
-
-    case FIELD_DECL:
-      /* Fields are unique within a record, but not between
-	 compatible records.  */
-      if (DECL_FIELD_CONTEXT (ae) == DECL_FIELD_CONTEXT (be))
-	return false;
-      return fields_compatible_p (ae, be);
-
-    case BIT_FIELD_REF:
-      return
-	tree_int_cst_equal (TREE_OPERAND (ae, 1), TREE_OPERAND (be, 1))
-	&& tree_int_cst_equal (TREE_OPERAND (ae, 2), TREE_OPERAND (be, 2));
-
-    default:
-      gcc_unreachable ();
-    }
-}
-
-/* Create or return the SRA_ELT structure for CHILD in PARENT.  PARENT
-   may be null, in which case CHILD must be a DECL.  */
-
-static struct sra_elt *
-lookup_element (struct sra_elt *parent, tree child, tree type,
-		enum insert_option insert)
-{
-  struct sra_elt dummy;
-  struct sra_elt **slot;
-  struct sra_elt *elt;
-
-  if (parent)
-    dummy.parent = parent->is_group ? parent->parent : parent;
-  else
-    dummy.parent = NULL;
-  dummy.element = child;
-
-  slot = (struct sra_elt **) htab_find_slot (sra_map, &dummy, insert);
-  if (!slot && insert == NO_INSERT)
-    return NULL;
-
-  elt = *slot;
-  if (!elt && insert == INSERT)
-    {
-      *slot = elt = XOBNEW (&sra_obstack, struct sra_elt);
-      memset (elt, 0, sizeof (*elt));
-
-      elt->parent = parent;
-      elt->element = child;
-      elt->type = type;
-      elt->is_scalar = is_sra_scalar_type (type);
-
-      if (parent)
-	{
-	  if (IS_ELEMENT_FOR_GROUP (elt->element))
-	    {
-	      elt->is_group = true;
-	      elt->sibling = parent->groups;
-	      parent->groups = elt;
-	    }
-	  else
-	    {
-	      elt->sibling = parent->children;
-	      parent->children = elt;
-	    }
-	}
-
-      /* If this is a parameter, then if we want to scalarize, we have
-	 one copy from the true function parameter.  Count it now.  */
-      if (TREE_CODE (child) == PARM_DECL)
-	{
-	  elt->n_copies = 1;
-	  bitmap_set_bit (needs_copy_in, DECL_UID (child));
-	}
-    }
-
-  return elt;
-}
-
-/* Create or return the SRA_ELT structure for EXPR if the expression
-   refers to a scalarizable variable.  */
-
-static struct sra_elt *
-maybe_lookup_element_for_expr (tree expr)
-{
-  struct sra_elt *elt;
-  tree child;
-
-  switch (TREE_CODE (expr))
-    {
-    case VAR_DECL:
-    case PARM_DECL:
-    case RESULT_DECL:
-      if (is_sra_candidate_decl (expr))
-	return lookup_element (NULL, expr, TREE_TYPE (expr), INSERT);
-      return NULL;
-
-    case ARRAY_REF:
-      /* We can't scalarize variable array indices.  */
-      if (in_array_bounds_p (expr))
-        child = TREE_OPERAND (expr, 1);
-      else
-	return NULL;
-      break;
-
-    case ARRAY_RANGE_REF:
-      /* We can't scalarize variable array indices.  */
-      if (range_in_array_bounds_p (expr))
-	{
-	  tree domain = TYPE_DOMAIN (TREE_TYPE (expr));
-	  child = build2 (RANGE_EXPR, integer_type_node,
-			  TYPE_MIN_VALUE (domain), TYPE_MAX_VALUE (domain));
-	}
-      else
-	return NULL;
-      break;
-
-    case COMPONENT_REF:
-      {
-	tree type = TREE_TYPE (TREE_OPERAND (expr, 0));
-	/* Don't look through unions.  */
-	if (TREE_CODE (type) != RECORD_TYPE)
-	  return NULL;
-	/* Neither through variable-sized records.  */
-	if (TYPE_SIZE (type) == NULL_TREE
-	    || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
-	  return NULL;
-	child = TREE_OPERAND (expr, 1);
-      }
-      break;
-
-    case REALPART_EXPR:
-      child = integer_zero_node;
-      break;
-    case IMAGPART_EXPR:
-      child = integer_one_node;
-      break;
+  /* Values returned by `get_ref_base_and_extent' for each COMPONENT_REF.
+     If EXPR isn't a COMPONENT_REF, just set `BASE = EXPR', `OFFSET = 0'
+     and `SIZE = TREE_SIZE (TREE_TYPE (expr))'.  */
+  HOST_WIDE_INT offset;
+  HOST_WIDE_INT size;
+  tree base;
+
+  /* Expression.  */
+  tree expr;
+  /* Type.  */
+  tree type;
 
-    default:
-      return NULL;
-    }
+  /* Next group representative for this aggregate. */
+  struct access *next_grp;
 
-  elt = maybe_lookup_element_for_expr (TREE_OPERAND (expr, 0));
-  if (elt)
-    return lookup_element (elt, child, TREE_TYPE (expr), INSERT);
-  return NULL;
-}
-
-\f
-/* Functions to walk just enough of the tree to see all scalarizable
-   references, and categorize them.  */
-
-/* A set of callbacks for phases 2 and 4.  They'll be invoked for the
-   various kinds of references seen.  In all cases, *GSI is an iterator
-   pointing to the statement being processed.  */
-struct sra_walk_fns
-{
-  /* Invoked when ELT is required as a unit.  Note that ELT might refer to
-     a leaf node, in which case this is a simple scalar reference.  *EXPR_P
-     points to the location of the expression.  IS_OUTPUT is true if this
-     is a left-hand-side reference.  USE_ALL is true if we saw something we
-     couldn't quite identify and had to force the use of the entire object.  */
-  void (*use) (struct sra_elt *elt, tree *expr_p,
-	       gimple_stmt_iterator *gsi, bool is_output, bool use_all);
-
-  /* Invoked when we have a copy between two scalarizable references.  */
-  void (*copy) (struct sra_elt *lhs_elt, struct sra_elt *rhs_elt,
-		gimple_stmt_iterator *gsi);
-
-  /* Invoked when ELT is initialized from a constant.  VALUE may be NULL,
-     in which case it should be treated as an empty CONSTRUCTOR.  */
-  void (*init) (struct sra_elt *elt, tree value, gimple_stmt_iterator *gsi);
-
-  /* Invoked when we have a copy between one scalarizable reference ELT
-     and one non-scalarizable reference OTHER without side-effects. 
-     IS_OUTPUT is true if ELT is on the left-hand side.  */
-  void (*ldst) (struct sra_elt *elt, tree other,
-		gimple_stmt_iterator *gsi, bool is_output);
-
-  /* True during phase 2, false during phase 4.  */
-  /* ??? This is a hack.  */
-  bool initial_scan;
+  /* Pointer to the group representative.  Pointer to itself if the struct is
+     the representative.  */
+  struct access *group_representative;
+
+  /* If this access has any children (in terms of the definition above), this
+     points to the first one.  */
+  struct access *first_child;
+
+  /* Pointer to the next sibling in the access tree as described above.  */
+  struct access *next_sibling;
+
+  /* Pointers to the first and last element in the linked list of assign
+     links.  */
+  struct assign_link *first_link, *last_link;
+  /* Pointer to the next access in the work queue.  */
+  struct access *next_queued;
+
+  /* Replacement variable for this access "region."  Never to be accessed
+     directly, always only by the means of get_access_replacement() and only
+     when grp_to_be_replaced flag is set.  */
+  tree replacement_decl;
+
+  /* Is this particular access a write access?  */
+  unsigned write : 1;
+
+  /* Is this access currently in the work queue?  */
+  unsigned grp_queued : 1;
+  /* Does this group contain a write access?  This flag is propagated down the
+     access tree.  */
+  unsigned grp_write : 1;
+  /* Does this group contain a read access?  This flag is propagated down the
+     access tree.  */
+  unsigned grp_read : 1;
+  /* Is the subtree rooted in this access fully covered by scalar
+     replacements?  */
+  unsigned grp_covered : 1;
+  /* If set to true, this access and all below it in an access tree must not be
+     scalarized.  */
+  unsigned grp_unscalarizable_region : 1;
+  /* Whether data have been written to parts of the aggregate covered by this
+     access which is not to be scalarized.  This flag is propagated up in the
+     access tree.  */
+  unsigned grp_unscalarized_data : 1;
+  /* Does this access and/or group contain a write access through a
+     BIT_FIELD_REF?  */
+  unsigned grp_bfr_lhs : 1;
+
+  /* Set when a scalar replacement should be created for this variable.  We do
+     the decision and creation at different places because create_tmp_var
+     cannot be called from within FOR_EACH_REFERENCED_VAR. */
+  unsigned grp_to_be_replaced : 1;
 };
 
-#ifdef ENABLE_CHECKING
-/* Invoked via walk_tree, if *TP contains a candidate decl, return it.  */
-
-static tree
-sra_find_candidate_decl (tree *tp, int *walk_subtrees,
-			 void *data ATTRIBUTE_UNUSED)
-{
-  tree t = *tp;
-  enum tree_code code = TREE_CODE (t);
-
-  if (code == VAR_DECL || code == PARM_DECL || code == RESULT_DECL)
-    {
-      *walk_subtrees = 0;
-      if (is_sra_candidate_decl (t))
-	return t;
-    }
-  else if (TYPE_P (t))
-    *walk_subtrees = 0;
-
-  return NULL;
-}
-#endif
-
-/* Walk most expressions looking for a scalarizable aggregate.
-   If we find one, invoke FNS->USE.  */
-
-static void
-sra_walk_expr (tree *expr_p, gimple_stmt_iterator *gsi, bool is_output,
-	       const struct sra_walk_fns *fns)
-{
-  tree expr = *expr_p;
-  tree inner = expr;
-  bool disable_scalarization = false;
-  bool use_all_p = false;
-
-  /* We're looking to collect a reference expression between EXPR and INNER,
-     such that INNER is a scalarizable decl and all other nodes through EXPR
-     are references that we can scalarize.  If we come across something that
-     we can't scalarize, we reset EXPR.  This has the effect of making it
-     appear that we're referring to the larger expression as a whole.  */
-
-  while (1)
-    switch (TREE_CODE (inner))
-      {
-      case VAR_DECL:
-      case PARM_DECL:
-      case RESULT_DECL:
-	/* If there is a scalarizable decl at the bottom, then process it.  */
-	if (is_sra_candidate_decl (inner))
-	  {
-	    struct sra_elt *elt = maybe_lookup_element_for_expr (expr);
-	    if (disable_scalarization)
-	      elt->cannot_scalarize = true;
-	    else
-	      fns->use (elt, expr_p, gsi, is_output, use_all_p);
-	  }
-	return;
-
-      case ARRAY_REF:
-	/* Non-constant index means any member may be accessed.  Prevent the
-	   expression from being scalarized.  If we were to treat this as a
-	   reference to the whole array, we can wind up with a single dynamic
-	   index reference inside a loop being overridden by several constant
-	   index references during loop setup.  It's possible that this could
-	   be avoided by using dynamic usage counts based on BB trip counts
-	   (based on loop analysis or profiling), but that hardly seems worth
-	   the effort.  */
-	/* ??? Hack.  Figure out how to push this into the scan routines
-	   without duplicating too much code.  */
-	if (!in_array_bounds_p (inner))
-	  {
-	    disable_scalarization = true;
-	    goto use_all;
-	  }
-	/* ??? Are we assured that non-constant bounds and stride will have
-	   the same value everywhere?  I don't think Fortran will...  */
-	if (TREE_OPERAND (inner, 2) || TREE_OPERAND (inner, 3))
-	  goto use_all;
-	inner = TREE_OPERAND (inner, 0);
-	break;
-
-      case ARRAY_RANGE_REF:
-	if (!range_in_array_bounds_p (inner))
-	  {
-	    disable_scalarization = true;
-	    goto use_all;
-	  }
-	/* ??? See above non-constant bounds and stride .  */
-	if (TREE_OPERAND (inner, 2) || TREE_OPERAND (inner, 3))
-	  goto use_all;
-	inner = TREE_OPERAND (inner, 0);
-	break;
-
-      case COMPONENT_REF:
-	{
-	  tree type = TREE_TYPE (TREE_OPERAND (inner, 0));
-	  /* Don't look through unions.  */
-	  if (TREE_CODE (type) != RECORD_TYPE)
-	    goto use_all;
-	  /* Neither through variable-sized records.  */
-	  if (TYPE_SIZE (type) == NULL_TREE
-	      || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
-	    goto use_all;
-	  inner = TREE_OPERAND (inner, 0);
-	}
-	break;
-
-      case REALPART_EXPR:
-      case IMAGPART_EXPR:
-	inner = TREE_OPERAND (inner, 0);
-	break;
-
-      case BIT_FIELD_REF:
-	/* A bit field reference to a specific vector is scalarized but for
-	   ones for inputs need to be marked as used on the left hand size so
-	   when we scalarize it, we can mark that variable as non renamable.  */
-	if (is_output
-	    && TREE_CODE (TREE_TYPE (TREE_OPERAND (inner, 0))) == VECTOR_TYPE)
-	  {
-	    struct sra_elt *elt
-	      = maybe_lookup_element_for_expr (TREE_OPERAND (inner, 0));
-	    if (elt)
-	      elt->is_vector_lhs = true;
-	  }
+typedef struct access *access_p;
 
-	/* A bit field reference (access to *multiple* fields simultaneously)
-	   is not currently scalarized.  Consider this an access to the full
-	   outer element, to which walk_tree will bring us next.  */
-	goto use_all;
-
-      CASE_CONVERT:
-	/* Similarly, a nop explicitly wants to look at an object in a
-	   type other than the one we've scalarized.  */
-	goto use_all;
-
-      case VIEW_CONVERT_EXPR:
-	/* Likewise for a view conversion, but with an additional twist:
-	   it can be on the LHS and, in this case, an access to the full
-	   outer element would mean a killing def.  So we need to punt
-	   if we haven't already a full access to the current element,
-	   because we cannot pretend to have a killing def if we only
-	   have a partial access at some level.  */
-	if (is_output && !use_all_p && inner != expr)
-	  disable_scalarization = true;
-	goto use_all;
-
-      case WITH_SIZE_EXPR:
-	/* This is a transparent wrapper.  The entire inner expression really
-	   is being used.  */
-	goto use_all;
-
-      use_all:
-        expr_p = &TREE_OPERAND (inner, 0);
-	inner = expr = *expr_p;
-	use_all_p = true;
-	break;
-
-      default:
-#ifdef ENABLE_CHECKING
-	/* Validate that we're not missing any references.  */
-	gcc_assert (!walk_tree (&inner, sra_find_candidate_decl, NULL, NULL));
-#endif
-	return;
-      }
-}
-
-/* Walk the arguments of a GIMPLE_CALL looking for scalarizable aggregates.
-   If we find one, invoke FNS->USE.  */
-
-static void
-sra_walk_gimple_call (gimple stmt, gimple_stmt_iterator *gsi,
-		    const struct sra_walk_fns *fns)
-{
-  int i;
-  int nargs = gimple_call_num_args (stmt);
+DEF_VEC_P (access_p);
+DEF_VEC_ALLOC_P (access_p, heap);
 
-  for (i = 0; i < nargs; i++)
-    sra_walk_expr (gimple_call_arg_ptr (stmt, i), gsi, false, fns);
-
-  if (gimple_call_lhs (stmt))
-    sra_walk_expr (gimple_call_lhs_ptr (stmt), gsi, true, fns);
-}
-
-/* Walk the inputs and outputs of a GIMPLE_ASM looking for scalarizable
-   aggregates.  If we find one, invoke FNS->USE.  */
-
-static void
-sra_walk_gimple_asm (gimple stmt, gimple_stmt_iterator *gsi,
-		   const struct sra_walk_fns *fns)
-{
-  size_t i;
-  for (i = 0; i < gimple_asm_ninputs (stmt); i++)
-    sra_walk_expr (&TREE_VALUE (gimple_asm_input_op (stmt, i)), gsi, false, fns);
-  for (i = 0; i < gimple_asm_noutputs (stmt); i++)
-    sra_walk_expr (&TREE_VALUE (gimple_asm_output_op (stmt, i)), gsi, true, fns);
-}
-
-/* Walk a GIMPLE_ASSIGN and categorize the assignment appropriately.  */
-
-static void
-sra_walk_gimple_assign (gimple stmt, gimple_stmt_iterator *gsi,
-			const struct sra_walk_fns *fns)
-{
-  struct sra_elt *lhs_elt = NULL, *rhs_elt = NULL;
-  tree lhs, rhs;
+/* Alloc pool for allocating access structures.  */
+static alloc_pool access_pool;
 
-  /* If there is more than 1 element on the RHS, only walk the lhs.  */
-  if (!gimple_assign_single_p (stmt))
-    {
-      sra_walk_expr (gimple_assign_lhs_ptr (stmt), gsi, true, fns);
-      return;
-    }
-
-  lhs = gimple_assign_lhs (stmt);
-  rhs = gimple_assign_rhs1 (stmt);
-  lhs_elt = maybe_lookup_element_for_expr (lhs);
-  rhs_elt = maybe_lookup_element_for_expr (rhs);
-
-  /* If both sides are scalarizable, this is a COPY operation.  */
-  if (lhs_elt && rhs_elt)
-    {
-      fns->copy (lhs_elt, rhs_elt, gsi);
-      return;
-    }
-
-  /* If the RHS is scalarizable, handle it.  There are only two cases.  */
-  if (rhs_elt)
-    {
-      if (!rhs_elt->is_scalar && !TREE_SIDE_EFFECTS (lhs))
-	fns->ldst (rhs_elt, lhs, gsi, false);
-      else
-	fns->use (rhs_elt, gimple_assign_rhs1_ptr (stmt), gsi, false, false);
-    }
-
-  /* If it isn't scalarizable, there may be scalarizable variables within, so
-     check for a call or else walk the RHS to see if we need to do any
-     copy-in operations.  We need to do it before the LHS is scalarized so
-     that the statements get inserted in the proper place, before any
-     copy-out operations.  */
-  else
-    sra_walk_expr (gimple_assign_rhs1_ptr (stmt), gsi, false, fns);
-
-  /* Likewise, handle the LHS being scalarizable.  We have cases similar
-     to those above, but also want to handle RHS being constant.  */
-  if (lhs_elt)
-    {
-      /* If this is an assignment from a constant, or constructor, then
-	 we have access to all of the elements individually.  Invoke INIT.  */
-      if (TREE_CODE (rhs) == COMPLEX_EXPR
-	  || TREE_CODE (rhs) == COMPLEX_CST
-	  || TREE_CODE (rhs) == CONSTRUCTOR)
-	fns->init (lhs_elt, rhs, gsi);
-
-      /* If this is an assignment from read-only memory, treat this as if
-	 we'd been passed the constructor directly.  Invoke INIT.  */
-      else if (TREE_CODE (rhs) == VAR_DECL
-	       && TREE_STATIC (rhs)
-	       && !DECL_EXTERNAL (rhs)
-	       && TREE_READONLY (rhs)
-	       && targetm.binds_local_p (rhs))
-	fns->init (lhs_elt, DECL_INITIAL (rhs), gsi);
-
-      /* If this is a copy from a non-scalarizable lvalue, invoke LDST.
-	 The lvalue requirement prevents us from trying to directly scalarize
-	 the result of a function call.  Which would result in trying to call
-	 the function multiple times, and other evil things.  */
-      else if (!lhs_elt->is_scalar
-	       && !TREE_SIDE_EFFECTS (rhs) && is_gimple_addressable (rhs))
-	fns->ldst (lhs_elt, rhs, gsi, true);
-
-      /* Otherwise we're being used in some context that requires the
-	 aggregate to be seen as a whole.  Invoke USE.  */
-      else
-	fns->use (lhs_elt, gimple_assign_lhs_ptr (stmt), gsi, true, false);
-    }
-
-  /* Similarly to above, LHS_ELT being null only means that the LHS as a
-     whole is not a scalarizable reference.  There may be occurrences of
-     scalarizable variables within, which implies a USE.  */
-  else
-    sra_walk_expr (gimple_assign_lhs_ptr (stmt), gsi, true, fns);
-}
-
-/* Entry point to the walk functions.  Search the entire function,
-   invoking the callbacks in FNS on each of the references to
-   scalarizable variables.  */
-
-static void
-sra_walk_function (const struct sra_walk_fns *fns)
-{
-  basic_block bb;
-  gimple_stmt_iterator si, ni;
-
-  /* ??? Phase 4 could derive some benefit to walking the function in
-     dominator tree order.  */
-
-  FOR_EACH_BB (bb)
-    for (si = gsi_start_bb (bb); !gsi_end_p (si); si = ni)
-      {
-	gimple stmt;
-
-	stmt = gsi_stmt (si);
-
-	ni = si;
-	gsi_next (&ni);
-
-	/* If the statement does not reference memory, then it doesn't
-	   make any structure references that we care about.  */
-	if (!gimple_references_memory_p (stmt))
-	  continue;
-
-	switch (gimple_code (stmt))
-	  {
-	  case GIMPLE_RETURN:
-	    /* If we have "return <retval>" then the return value is
-	       already exposed for our pleasure.  Walk it as a USE to
-	       force all the components back in place for the return.
-	       */
-	    if (gimple_return_retval (stmt)  == NULL_TREE)
-	      ;
-	    else
-	      sra_walk_expr (gimple_return_retval_ptr (stmt), &si, false,
-                             fns);
-	    break;
-
-	  case GIMPLE_ASSIGN:
-	    sra_walk_gimple_assign (stmt, &si, fns);
-	    break;
-	  case GIMPLE_CALL:
-	    sra_walk_gimple_call (stmt, &si, fns);
-	    break;
-	  case GIMPLE_ASM:
-	    sra_walk_gimple_asm (stmt, &si, fns);
-	    break;
-
-	  default:
-	    break;
-	  }
-      }
-}
-\f
-/* Phase One: Scan all referenced variables in the program looking for
-   structures that could be decomposed.  */
-
-static bool
-find_candidates_for_sra (void)
+/* A structure linking lhs and rhs accesses from an aggregate assignment.  They
+   are used to propagate subaccesses from rhs to lhs as long as they don't
+   conflict with what is already there.  */
+struct assign_link
 {
-  bool any_set = false;
-  tree var;
-  referenced_var_iterator rvi;
+  struct access *lacc, *racc;
+  struct assign_link *next;
+};
 
-  FOR_EACH_REFERENCED_VAR (var, rvi)
-    {
-      if (decl_can_be_decomposed_p (var))
-        {
-          bitmap_set_bit (sra_candidates, DECL_UID (var));
-          any_set = true;
-        }
-    }
+/* Alloc pool for allocating assign link structures.  */
+static alloc_pool link_pool;
 
-  return any_set;
-}
+/* Base (tree) -> Vector (VEC(access_p,heap) *) map.  */
+static struct pointer_map_t *base_access_vec;
+
+/* Bitmap of bases (candidates).  */
+static bitmap candidate_bitmap;
+/* Bitmap of declarations used in a return statement.  */
+static bitmap retvals_bitmap;
+/* Obstack for creation of fancy names.  */
+static struct obstack name_obstack;
 
-\f
-/* Phase Two: Scan all references to scalarizable variables.  Count the
-   number of times they are used or copied respectively.  */
+/* Head of a linked list of accesses that need to have their subaccesses
+   propagated to their assignment counterparts.  */
+static struct access *work_queue_head;
 
-/* Callbacks to fill in SRA_WALK_FNS.  Everything but USE is
-   considered a copy, because we can decompose the reference such that
-   the sub-elements needn't be contiguous.  */
+/* Dump contents of ACCESS to file F in a human friendly way.  If GRP is true,
+   the fields representing the whole group are dumped, otherwise only those
+   describing the individual access are.  */
 
 static void
-scan_use (struct sra_elt *elt, tree *expr_p ATTRIBUTE_UNUSED,
-	  gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
-	  bool is_output ATTRIBUTE_UNUSED, bool use_all ATTRIBUTE_UNUSED)
+dump_access (FILE *f, struct access *access, bool grp)
 {
-  elt->n_uses += 1;
+  fprintf (f, "access { ");
+  fprintf (f, "base = (%d)'", DECL_UID (access->base));
+  print_generic_expr (f, access->base, 0);
+  fprintf (f, "', offset = %d", (int) access->offset);
+  fprintf (f, ", size = %d", (int) access->size);
+  fprintf (f, ", expr = ");
+  print_generic_expr (f, access->expr, 0);
+  fprintf (f, ", type = ");
+  print_generic_expr (f, access->type, 0);
+  if (grp)
+    fprintf (f, ", grp_write = %d, grp_read = %d, grp_covered = %d, "
+	     "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
+	     "grp_to_be_replaced = %d\n",
+	     access->grp_write, access->grp_read, access->grp_covered,
+	     access->grp_unscalarizable_region, access->grp_unscalarized_data,
+	     access->grp_to_be_replaced);
+  else
+    fprintf (f, ", write = %d'\n", access->write);
 }
 
+/* Dump a subtree rooted in ACCESS to file F, indenting by LEVEL.  */
+
 static void
-scan_copy (struct sra_elt *lhs_elt, struct sra_elt *rhs_elt,
-	   gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED)
+dump_access_tree_1 (FILE *f, struct access *access, int level)
 {
-  lhs_elt->n_copies += 1;
-  rhs_elt->n_copies += 1;
+  do
+    {
+      int i;
+
+      for (i = 0; i < level; i++)
+	fputs ("* ", dump_file);
+
+      dump_access (f, access, true);
+
+      if (access->first_child)
+	dump_access_tree_1 (f, access->first_child, level + 1);
+
+      access = access->next_sibling;
+    }
+  while (access);
 }
 
+/* Dump all access trees for a variable, given the pointer to the first root in
+   ACCESS.  */
+
 static void
-scan_init (struct sra_elt *lhs_elt, tree rhs ATTRIBUTE_UNUSED,
-	   gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED)
+dump_access_tree (FILE *f, struct access *access)
 {
-  lhs_elt->n_copies += 1;
+  for (; access; access = access->next_grp)
+    dump_access_tree_1 (f, access, 0);
 }
 
-static void
-scan_ldst (struct sra_elt *elt, tree other ATTRIBUTE_UNUSED,
-	   gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
-	   bool is_output ATTRIBUTE_UNUSED)
+/* Return a vector of pointers to accesses for the variable given in BASE or
+   NULL if there is none.  */
+
+static VEC (access_p, heap) *
+get_base_access_vector (tree base)
 {
-  elt->n_copies += 1;
+  void **slot;
+
+  slot = pointer_map_contains (base_access_vec, base);
+  if (!slot)
+    return NULL;
+  else
+    return *(VEC (access_p, heap) **) slot;
 }
 
-/* Dump the values we collected during the scanning phase.  */
+/* Find an access with the required OFFSET and SIZE in a subtree of accesses
+   rooted in ACCESS.  Return NULL if it cannot be found.  */
 
-static void
-scan_dump (struct sra_elt *elt)
+static struct access *
+find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
+			HOST_WIDE_INT size)
 {
-  struct sra_elt *c;
-
-  dump_sra_elt_name (dump_file, elt);
-  fprintf (dump_file, ": n_uses=%u n_copies=%u\n", elt->n_uses, elt->n_copies);
+  while (access && (access->offset != offset || access->size != size))
+    {
+      struct access *child = access->first_child;
 
-  for (c = elt->children; c ; c = c->sibling)
-    scan_dump (c);
+      while (child && (child->offset + child->size <= offset))
+	child = child->next_sibling;
+      access = child;
+    }
 
-  for (c = elt->groups; c ; c = c->sibling)
-    scan_dump (c);
+  return access;
 }
 
-/* Entry point to phase 2.  Scan the entire function, building up
-   scalarization data structures, recording copies and uses.  */
+/* Return the first group representative for BASE or NULL if none exists.  */
 
-static void
-scan_function (void)
+static struct access *
+get_first_repr_for_decl (tree base)
 {
-  static const struct sra_walk_fns fns = {
-    scan_use, scan_copy, scan_init, scan_ldst, true
-  };
-  bitmap_iterator bi;
-
-  sra_walk_function (&fns);
+  VEC (access_p, heap) *access_vec;
 
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    {
-      unsigned i;
+  access_vec = get_base_access_vector (base);
+  if (!access_vec)
+    return NULL;
 
-      fputs ("\nScan results:\n", dump_file);
-      EXECUTE_IF_SET_IN_BITMAP (sra_candidates, 0, i, bi)
-	{
-	  tree var = referenced_var (i);
-	  struct sra_elt *elt = lookup_element (NULL, var, NULL, NO_INSERT);
-	  if (elt)
-	    scan_dump (elt);
-	}
-      fputc ('\n', dump_file);
-    }
+  return VEC_index (access_p, access_vec, 0);
 }
-\f
-/* Phase Three: Make decisions about which variables to scalarize, if any.
-   All elements to be scalarized have replacement variables made for them.  */
 
-/* A subroutine of build_element_name.  Recursively build the element
-   name on the obstack.  */
+/* Find an access representative for the variable BASE and given OFFSET and
+   SIZE.  Requires that access trees have already been built.  Return NULL if
+   it cannot be found.  */
+
+static struct access *
+get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
+				 HOST_WIDE_INT size)
+{
+  struct access *access;
+
+  access = get_first_repr_for_decl (base);
+  while (access && (access->offset + access->size <= offset))
+    access = access->next_grp;
+  if (!access)
+    return NULL;
+
+  return find_access_in_subtree (access, offset, size);
+}
 
+/* Add LINK to the linked list of assign links of RACC.  */
 static void
-build_element_name_1 (struct sra_elt *elt)
+add_link_to_rhs (struct access *racc, struct assign_link *link)
 {
-  tree t;
-  char buffer[32];
+  gcc_assert (link->racc == racc);
 
-  if (elt->parent)
+  if (!racc->first_link)
     {
-      build_element_name_1 (elt->parent);
-      obstack_1grow (&sra_obstack, '$');
+      gcc_assert (!racc->last_link);
+      racc->first_link = link;
+    }
+  else
+    racc->last_link->next = link;
 
-      if (TREE_CODE (elt->parent->type) == COMPLEX_TYPE)
-	{
-	  if (elt->element == integer_zero_node)
-	    obstack_grow (&sra_obstack, "real", 4);
-	  else
-	    obstack_grow (&sra_obstack, "imag", 4);
-	  return;
-	}
+  racc->last_link = link;
+  link->next = NULL;
+}
+
+/* Move all link structures from the linked list in OLD_RACC to the linked
+   list in NEW_RACC.  */
+static void
+relink_to_new_repr (struct access *new_racc, struct access *old_racc)
+{
+  if (!old_racc->first_link)
+    {
+      gcc_assert (!old_racc->last_link);
+      return;
     }
 
-  t = elt->element;
-  if (TREE_CODE (t) == INTEGER_CST)
+  if (new_racc->first_link)
     {
-      /* ??? Eh.  Don't bother doing double-wide printing.  */
-      sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (t));
-      obstack_grow (&sra_obstack, buffer, strlen (buffer));
-    }
-  else if (TREE_CODE (t) == BIT_FIELD_REF)
-    {
-      sprintf (buffer, "B" HOST_WIDE_INT_PRINT_DEC,
-	       tree_low_cst (TREE_OPERAND (t, 2), 1));
-      obstack_grow (&sra_obstack, buffer, strlen (buffer));
-      sprintf (buffer, "F" HOST_WIDE_INT_PRINT_DEC,
-	       tree_low_cst (TREE_OPERAND (t, 1), 1));
-      obstack_grow (&sra_obstack, buffer, strlen (buffer));
+      gcc_assert (!new_racc->last_link->next);
+      gcc_assert (!old_racc->last_link || !old_racc->last_link->next);
+
+      new_racc->last_link->next = old_racc->first_link;
+      new_racc->last_link = old_racc->last_link;
     }
   else
     {
-      tree name = DECL_NAME (t);
-      if (name)
-	obstack_grow (&sra_obstack, IDENTIFIER_POINTER (name),
-		      IDENTIFIER_LENGTH (name));
-      else
-	{
-	  sprintf (buffer, "D%u", DECL_UID (t));
-	  obstack_grow (&sra_obstack, buffer, strlen (buffer));
-	}
+      gcc_assert (!new_racc->last_link);
+
+      new_racc->first_link = old_racc->first_link;
+      new_racc->last_link = old_racc->last_link;
     }
+  old_racc->first_link = old_racc->last_link = NULL;
 }
 
-/* Construct a pretty variable name for an element's replacement variable.
-   The name is built on the obstack.  */
+/* Add ACCESS to the work queue (which is actually a stack).  */
 
-static char *
-build_element_name (struct sra_elt *elt)
+static void
+add_access_to_work_queue (struct access *access)
+{
+  if (!access->grp_queued)
+    {
+      gcc_assert (!access->next_queued);
+      access->next_queued = work_queue_head;
+      access->grp_queued = 1;
+      work_queue_head = access;
+    }
+}
+
+/* Pop an access from the work queue, and return it, assuming there is one.  */
+
+static struct access *
+pop_access_from_work_queue (void)
 {
-  build_element_name_1 (elt);
-  obstack_1grow (&sra_obstack, '\0');
-  return XOBFINISH (&sra_obstack, char *);
+  struct access *access = work_queue_head;
+
+  work_queue_head = access->next_queued;
+  access->next_queued = NULL;
+  access->grp_queued = 0;
+  return access;
 }
 
-/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
-   if BB has more than one edge, STMT will be replicated for each edge.
-   Also, abnormal edges will be ignored.  */
+
+/* Allocate necessary structures.  */
 
 static void
-insert_edge_copies_seq (gimple_seq seq, basic_block bb)
+sra_initialize (void)
 {
-  edge e;
-  edge_iterator ei;
-  unsigned n_copies = -1;
+  candidate_bitmap = BITMAP_ALLOC (NULL);
+  retvals_bitmap = BITMAP_ALLOC (NULL);
+  gcc_obstack_init (&name_obstack);
+  access_pool = create_alloc_pool ("SRA accesses", sizeof (struct access), 16);
+  link_pool = create_alloc_pool ("SRA links", sizeof (struct assign_link), 16);
+  base_access_vec = pointer_map_create ();
+}
 
-  FOR_EACH_EDGE (e, ei, bb->succs)
-    if (!(e->flags & EDGE_ABNORMAL))
-      n_copies++;
+/* Hook fed to pointer_map_traverse, deallocate stored vectors.  */
 
-  FOR_EACH_EDGE (e, ei, bb->succs)
-    if (!(e->flags & EDGE_ABNORMAL))
-      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
+static bool
+delete_base_accesses (const void *key ATTRIBUTE_UNUSED, void **value,
+		      void *data ATTRIBUTE_UNUSED)
+{
+  VEC (access_p, heap) *access_vec;
+  access_vec = (VEC (access_p, heap) *) *value;
+  VEC_free (access_p, heap, access_vec);
+
+  return true;
 }
 
-/* Instantiate an element as an independent variable.  */
+/* Deallocate all general structures.  */
 
 static void
-instantiate_element (struct sra_elt *elt)
+sra_deinitialize (void)
 {
-  struct sra_elt *base_elt;
-  tree var, base;
-  bool nowarn = TREE_NO_WARNING (elt->element);
+  BITMAP_FREE (candidate_bitmap);
+  BITMAP_FREE (retvals_bitmap);
+  free_alloc_pool (access_pool);
+  free_alloc_pool (link_pool);
+  obstack_free (&name_obstack, NULL);
 
-  for (base_elt = elt; base_elt->parent; base_elt = base_elt->parent)
-    if (!nowarn)
-      nowarn = TREE_NO_WARNING (base_elt->parent->element);
-  base = base_elt->element;
+  pointer_map_traverse (base_access_vec, delete_base_accesses, NULL);
+  pointer_map_destroy (base_access_vec);
+}
 
-  elt->replacement = var = make_rename_temp (elt->type, "SR");
+/* Remove DECL from candidates for SRA and write REASON to the dump file if
+   there is one.  */
+static void
+disqualify_candidate (tree decl, const char *reason)
+{
+  bitmap_clear_bit (candidate_bitmap, DECL_UID (decl));
 
-  if (DECL_P (elt->element)
-      && !tree_int_cst_equal (DECL_SIZE (var), DECL_SIZE (elt->element)))
+  if (dump_file)
     {
-      DECL_SIZE (var) = DECL_SIZE (elt->element);
-      DECL_SIZE_UNIT (var) = DECL_SIZE_UNIT (elt->element);
-
-      elt->in_bitfld_block = 1;
-      elt->replacement = fold_build3 (BIT_FIELD_REF, elt->type, var,
-				      DECL_SIZE (var),
-				      BYTES_BIG_ENDIAN
-				      ? size_binop (MINUS_EXPR,
-						    TYPE_SIZE (elt->type),
-						    DECL_SIZE (var))
-				      : bitsize_int (0));
+      fprintf (dump_file, "! Disqualifying ");
+      print_generic_expr (dump_file, decl, 0);
+      fprintf (dump_file, " - %s\n", reason);
     }
+}
 
-  /* For vectors, if used on the left hand side with BIT_FIELD_REF,
-     they are not a gimple register.  */
-  if (TREE_CODE (TREE_TYPE (var)) == VECTOR_TYPE && elt->is_vector_lhs)
-    DECL_GIMPLE_REG_P (var) = 0;
+/* Return true iff the type contains a field or an element which does not allow
+   scalarization.  */
 
-  DECL_SOURCE_LOCATION (var) = DECL_SOURCE_LOCATION (base);
-  DECL_ARTIFICIAL (var) = 1;
+static bool
+type_internals_preclude_sra_p (tree type)
+{
+  tree fld;
+  tree et;
 
-  if (TREE_THIS_VOLATILE (elt->type))
+  switch (TREE_CODE (type))
     {
-      TREE_THIS_VOLATILE (var) = 1;
-      TREE_SIDE_EFFECTS (var) = 1;
+    case RECORD_TYPE:
+    case UNION_TYPE:
+    case QUAL_UNION_TYPE:
+      for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
+	if (TREE_CODE (fld) == FIELD_DECL)
+	  {
+	    tree ft = TREE_TYPE (fld);
+
+	    if (TREE_THIS_VOLATILE (fld)
+		|| !DECL_FIELD_OFFSET (fld) || !DECL_SIZE (fld)
+		|| !host_integerp (DECL_FIELD_OFFSET (fld), 1)
+		|| !host_integerp (DECL_SIZE (fld), 1))
+	      return true;
+
+	    if (AGGREGATE_TYPE_P (ft)
+		&& type_internals_preclude_sra_p (ft))
+	      return true;
+	  }
+
+      return false;
+
+    case ARRAY_TYPE:
+      et = TREE_TYPE (type);
+
+      if (AGGREGATE_TYPE_P (et))
+	return type_internals_preclude_sra_p (et);
+      else
+	return false;
+
+    default:
+      return false;
     }
+}
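+
+/* For instance, a record such as
+
+     struct s { volatile int flag; int buf[n]; };
+
+   is precluded both by the volatile field and by the field whose size is
+   not a compile time constant.  */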
 
-  if (DECL_NAME (base) && !DECL_IGNORED_P (base))
+/* Create and insert an access for EXPR.  Return the created access or NULL
+   if it is not possible.  */
+
+static struct access *
+create_access (tree expr, bool write)
+{
+  struct access *access;
+  void **slot;
+  VEC (access_p,heap) *vec;
+  HOST_WIDE_INT offset, size, max_size;
+  tree base = expr;
+  bool unscalarizable_region = false;
+
+  if (handled_component_p (expr))
+    base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
+  else
     {
-      char *pretty_name = build_element_name (elt);
-      DECL_NAME (var) = get_identifier (pretty_name);
-      obstack_free (&sra_obstack, pretty_name);
+      tree tree_size;
 
-      SET_DECL_DEBUG_EXPR (var, generate_element_ref (elt));
-      DECL_DEBUG_EXPR_IS_FROM (var) = 1;
-      
-      DECL_IGNORED_P (var) = 0;
-      TREE_NO_WARNING (var) = nowarn;
+      tree_size = TYPE_SIZE (TREE_TYPE (base));
+      if (tree_size && host_integerp (tree_size, 1))
+	size = max_size = tree_low_cst (tree_size, 1);
+      else
+	size = max_size = -1;
+
+      offset = 0;
     }
-  else
+
+  if (!base || !DECL_P (base)
+      || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
+    return NULL;
+
+  if (size != max_size)
     {
-      DECL_IGNORED_P (var) = 1;
-      /* ??? We can't generate any warning that would be meaningful.  */
-      TREE_NO_WARNING (var) = 1;
-    }
-
-  /* Zero-initialize bit-field scalarization variables, to avoid
-     triggering undefined behavior.  */
-  if (TREE_CODE (elt->element) == BIT_FIELD_REF
-      || (var != elt->replacement
-	  && TREE_CODE (elt->replacement) == BIT_FIELD_REF))
-    {
-      gimple_seq init = sra_build_assignment (var,
-                                              fold_convert (TREE_TYPE (var),
-                                                            integer_zero_node)
-                                             );
-      insert_edge_copies_seq (init, ENTRY_BLOCK_PTR);
-      mark_all_v_defs_seq (init);
+      size = max_size;
+      unscalarizable_region = true;
     }
 
-  if (dump_file)
+  if (size < 0)
     {
-      fputs ("  ", dump_file);
-      dump_sra_elt_name (dump_file, elt);
-      fputs (" -> ", dump_file);
-      print_generic_expr (dump_file, var, dump_flags);
-      fputc ('\n', dump_file);
+      disqualify_candidate (base, "Encountered an ultra variable sized "
+			    "access.");
+      return NULL;
     }
+
+  access = (struct access *) pool_alloc (access_pool);
+  memset (access, 0, sizeof (struct access));
+
+  access->base = base;
+  access->offset = offset;
+  access->size = size;
+  access->expr = expr;
+  access->type = TREE_TYPE (expr);
+  access->write = write;
+  access->grp_unscalarizable_region = unscalarizable_region;
+
+  slot = pointer_map_contains (base_access_vec, base);
+  if (slot)
+    vec = (VEC (access_p, heap) *) *slot;
+  else
+    vec = VEC_alloc (access_p, heap, 32);
+
+  VEC_safe_push (access_p, heap, vec, access);
+
+  *((struct VEC (access_p,heap) **)
+	pointer_map_insert (base_access_vec, base)) = vec;
+
+  return access;
 }
 
-/* Make one pass across an element tree deciding whether or not it's
-   profitable to instantiate individual leaf scalars.
 
-   PARENT_USES and PARENT_COPIES are the sum of the N_USES and N_COPIES
-   fields all the way up the tree.  */
+/* Callback of walk_tree.  Search the given tree for declarations and exclude
+   them from the candidates.  */
 
-static void
-decide_instantiation_1 (struct sra_elt *elt, unsigned int parent_uses,
-			unsigned int parent_copies)
+static tree
+disqualify_all (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
 {
-  if (dump_file && !elt->parent)
-    {
-      fputs ("Initial instantiation for ", dump_file);
-      dump_sra_elt_name (dump_file, elt);
-      fputc ('\n', dump_file);
-    }
+  tree base = *tp;
 
-  if (elt->cannot_scalarize)
-    return;
+  if (TREE_CODE (base) == SSA_NAME)
+    base = SSA_NAME_VAR (base);
 
-  if (elt->is_scalar)
+  if (DECL_P (base))
     {
-      /* The decision is simple: instantiate if we're used more frequently
-	 than the parent needs to be seen as a complete unit.  */
-      if (elt->n_uses + elt->n_copies + parent_copies > parent_uses)
-	instantiate_element (elt);
+      disqualify_candidate (base, "From within disqualify_all().");
+      *walk_subtrees = 0;
     }
   else
-    {
-      struct sra_elt *c, *group;
-      unsigned int this_uses = elt->n_uses + parent_uses;
-      unsigned int this_copies = elt->n_copies + parent_copies;
-
-      /* Consider groups of sub-elements as weighing in favour of
-	 instantiation whatever their size.  */
-      for (group = elt->groups; group ; group = group->sibling)
-	FOR_EACH_ACTUAL_CHILD (c, group)
-	  {
-	    c->n_uses += group->n_uses;
-	    c->n_copies += group->n_copies;
-	  }
+    *walk_subtrees = 1;
 
+  return NULL_TREE;
-    }
+
+  return NULL_TREE;
 }
 
-/* Compute the size and number of all instantiated elements below ELT.
-   We will only care about this if the size of the complete structure
-   fits in a HOST_WIDE_INT, so we don't have to worry about overflow.  */
+/* Scan expression EXPR and create access structures for all accesses to
+   candidates for scalarization.  Return the created access or NULL if none is
+   created.  */
 
-static unsigned int
-sum_instantiated_sizes (struct sra_elt *elt, unsigned HOST_WIDE_INT *sizep)
+static struct access *
+build_access_from_expr_1 (tree *expr_ptr,
+			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write)
 {
-  if (elt->replacement)
+  struct access *ret = NULL;
+  tree expr = *expr_ptr;
+  tree safe_expr = expr;
+  bool bit_ref;
+
+  if (TREE_CODE (expr) == BIT_FIELD_REF)
     {
-      *sizep += TREE_INT_CST_LOW (TYPE_SIZE_UNIT (elt->type));
-      return 1;
+      expr = TREE_OPERAND (expr, 0);
+      bit_ref = true;
     }
   else
+    bit_ref = false;
+
+  while (TREE_CODE (expr) == NOP_EXPR
+	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
+	 || TREE_CODE (expr) == REALPART_EXPR
+	 || TREE_CODE (expr) == IMAGPART_EXPR)
+    expr = TREE_OPERAND (expr, 0);
+
+  switch (TREE_CODE (expr))
     {
-      struct sra_elt *c;
-      unsigned int count = 0;
+    case ADDR_EXPR:
+    case SSA_NAME:
+    case INDIRECT_REF:
+      break;
+
+    case VAR_DECL:
+    case PARM_DECL:
+    case RESULT_DECL:
+    case COMPONENT_REF:
+    case ARRAY_REF:
+      ret = create_access (expr, write);
+      break;
 
-      for (c = elt->children; c ; c = c->sibling)
-	count += sum_instantiated_sizes (c, sizep);
+    case REALPART_EXPR:
+    case IMAGPART_EXPR:
+      expr = TREE_OPERAND (expr, 0);
+      ret = create_access (expr, write);
+      break;
 
-      return count;
+    case ARRAY_RANGE_REF:
+    default:
+      walk_tree (&safe_expr, disqualify_all, NULL, NULL);
+      break;
     }
+
+  if (write && bit_ref && ret)
+    ret->grp_bfr_lhs = 1;
+
+  return ret;
 }
 
-/* Instantiate fields in ELT->TYPE that are not currently present as
-   children of ELT.  */
+/* Scan expression EXPR and create access structures for all accesses to
+   candidates for scalarization.  Return true if any access has been
+   inserted.  */
 
-static void instantiate_missing_elements (struct sra_elt *elt);
+static bool
+build_access_from_expr (tree *expr_ptr,
+			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write,
+			void *data ATTRIBUTE_UNUSED)
+{
+  return build_access_from_expr_1 (expr_ptr, gsi, write) != NULL;
+}
 
-static struct sra_elt *
-instantiate_missing_elements_1 (struct sra_elt *elt, tree child, tree type)
+/* Disqualify LHS and RHS for scalarization if STMT has to terminate its basic
+   block in modes in which that matters; return true iff they have been
+   disqualified.  RHS may be NULL, in which case it is ignored.  If we
+   scalarize an aggregate, intra-SRA may need to insert statements after each
+   original statement, which is not possible when a statement unconditionally
+   has to end its basic block.  */
+static bool
+disqualify_ops_if_throwing_stmt (gimple stmt, tree *lhs, tree *rhs)
 {
-  struct sra_elt *sub = lookup_element (elt, child, type, INSERT);
-  if (sub->is_scalar)
+  if (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))
     {
-      if (sub->replacement == NULL)
-	instantiate_element (sub);
+      walk_tree (lhs, disqualify_all, NULL, NULL);
+      if (rhs)
+	walk_tree (rhs, disqualify_all, NULL, NULL);
+      return true;
     }
-  else
-    instantiate_missing_elements (sub);
-  return sub;
+  return false;
 }
 
-/* Obtain the canonical type for field F of ELEMENT.  */
 
-static tree
-canon_type_for_field (tree f, tree element)
-{
-  tree field_type = TREE_TYPE (f);
+/* Result code for scan_assign callback for scan_function.  */
+enum scan_assign_result {SRA_SA_NONE,       /* nothing done for the stmt */
+			 SRA_SA_PROCESSED,  /* stmt analyzed/changed */
+			 SRA_SA_REMOVED};   /* stmt redundant and eliminated */
 
-  /* canonicalize_component_ref() unwidens some bit-field types (not
-     marked as DECL_BIT_FIELD in C++), so we must do the same, lest we
-     may introduce type mismatches.  */
-  if (INTEGRAL_TYPE_P (field_type)
-      && DECL_MODE (f) != TYPE_MODE (field_type))
-    field_type = TREE_TYPE (get_unwidened (build3 (COMPONENT_REF,
-						   field_type,
-						   element,
-						   f, NULL_TREE),
-					   NULL_TREE));
-
-  return field_type;
-}
-
-/* Look for adjacent fields of ELT starting at F that we'd like to
-   scalarize as a single variable.  Return the last field of the
-   group.  */
 
-static tree
-try_instantiate_multiple_fields (struct sra_elt *elt, tree f)
+/* Scan expressions occurring in the statement pointed to by STMT_PTR, create
+   access structures for all accesses to candidates for scalarization and
+   remove those candidates which occur in statements or expressions that
+   prevent them from being split apart.  Return SRA_SA_PROCESSED if any access
+   has been inserted.  */
+
+static enum scan_assign_result
+build_accesses_from_assign (gimple *stmt_ptr,
+			    gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
+			    void *data ATTRIBUTE_UNUSED)
 {
-  int count;
-  unsigned HOST_WIDE_INT align, bit, size, alchk;
-  enum machine_mode mode;
-  tree first = f, prev;
-  tree type, var;
-  struct sra_elt *block;
-
-  /* Point fields are typically best handled as standalone entities.  */
-  if (POINTER_TYPE_P (TREE_TYPE (f)))
-    return f;
-    
-  if (!is_sra_scalar_type (TREE_TYPE (f))
-      || !host_integerp (DECL_FIELD_OFFSET (f), 1)
-      || !host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
-      || !host_integerp (DECL_SIZE (f), 1)
-      || lookup_element (elt, f, NULL, NO_INSERT))
-    return f;
-
-  block = elt;
-
-  /* For complex and array objects, there are going to be integer
-     literals as child elements.  In this case, we can't just take the
-     alignment and mode of the decl, so we instead rely on the element
-     type.
-
-     ??? We could try to infer additional alignment from the full
-     object declaration and the location of the sub-elements we're
-     accessing.  */
-  for (count = 0; !DECL_P (block->element); count++)
-    block = block->parent;
-
-  align = DECL_ALIGN (block->element);
-  alchk = GET_MODE_BITSIZE (DECL_MODE (block->element));
-
-  if (count)
-    {
-      type = TREE_TYPE (block->element);
-      while (count--)
-	type = TREE_TYPE (type);
-
-      align = TYPE_ALIGN (type);
-      alchk = GET_MODE_BITSIZE (TYPE_MODE (type));
-    }
-
-  if (align < alchk)
-    align = alchk;
-
-  /* Coalescing wider fields is probably pointless and
-     inefficient.  */
-  if (align > BITS_PER_WORD)
-    align = BITS_PER_WORD;
-
-  bit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
-    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
-  size = tree_low_cst (DECL_SIZE (f), 1);
-
-  alchk = align - 1;
-  alchk = ~alchk;
-
-  if ((bit & alchk) != ((bit + size - 1) & alchk))
-    return f;
-
-  /* Find adjacent fields in the same alignment word.  */
-
-  for (prev = f, f = TREE_CHAIN (f);
-       f && TREE_CODE (f) == FIELD_DECL
-	 && is_sra_scalar_type (TREE_TYPE (f))
-	 && host_integerp (DECL_FIELD_OFFSET (f), 1)
-	 && host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)
-	 && host_integerp (DECL_SIZE (f), 1)
-	 && !lookup_element (elt, f, NULL, NO_INSERT);
-       prev = f, f = TREE_CHAIN (f))
-    {
-      unsigned HOST_WIDE_INT nbit, nsize;
-
-      nbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
-	+ tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
-      nsize = tree_low_cst (DECL_SIZE (f), 1);
+  gimple stmt = *stmt_ptr;
+  tree *lhs_ptr, *rhs_ptr;
+  struct access *lacc, *racc;
 
-      if (bit + size == nbit)
-	{
-	  if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
-	    {
-	      /* If we're at an alignment boundary, don't bother
-		 growing alignment such that we can include this next
-		 field.  */
-	      if ((nbit & alchk)
-		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
-		break;
-
-	      align = GET_MODE_BITSIZE (DECL_MODE (f));
-	      alchk = align - 1;
-	      alchk = ~alchk;
+  if (gimple_assign_rhs2 (stmt))
+    return SRA_SA_NONE;
 
-	      if ((bit & alchk) != ((nbit + nsize - 1) & alchk))
-		break;
-	    }
-	  size += nsize;
-	}
-      else if (nbit + nsize == bit)
-	{
-	  if ((nbit & alchk) != ((bit + size - 1) & alchk))
-	    {
-	      if ((bit & alchk)
-		  || GET_MODE_BITSIZE (DECL_MODE (f)) <= align)
-		break;
-
-	      align = GET_MODE_BITSIZE (DECL_MODE (f));
-	      alchk = align - 1;
-	      alchk = ~alchk;
+  lhs_ptr = gimple_assign_lhs_ptr (stmt);
+  rhs_ptr = gimple_assign_rhs1_ptr (stmt);
 
-	      if ((nbit & alchk) != ((bit + size - 1) & alchk))
-		break;
-	    }
-	  bit = nbit;
-	  size += nsize;
-	}
-      else
-	break;
-    }
+  if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
+    return SRA_SA_NONE;
+
+  racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
+  lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);
+
+  if (lacc && racc
+      && !lacc->grp_unscalarizable_region
+      && !racc->grp_unscalarizable_region
+      && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
+      && lacc->size <= racc->size
+      && useless_type_conversion_p (lacc->type, racc->type))
+    {
+      struct assign_link *link;
 
-  f = prev;
+      link = (struct assign_link *) pool_alloc (link_pool);
+      memset (link, 0, sizeof (struct assign_link));
 
-  if (f == first)
-    return f;
+      link->lacc = lacc;
+      link->racc = racc;
+
+      add_link_to_rhs (racc, link);
+    }
+
+  return (lacc || racc) ? SRA_SA_PROCESSED : SRA_SA_NONE;
+}
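+
+/* Thus, for an aggregate assignment dst = src in which both sides are
+   candidates of compatible types, an assign link between the two accesses
+   is recorded so that subaccesses of src can later be propagated across
+   the assignment to dst.  */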
 
-  gcc_assert ((bit & alchk) == ((bit + size - 1) & alchk));
+/* Scan the function and look for interesting statements.  Return true if any
+   has been found or processed, as indicated by the callbacks.  SCAN_EXPR is a
+   callback called on all expressions within statements except assign
+   statements and those deemed entirely unsuitable for some reason (all
+   operands in such statements and expressions are removed from
+   candidate_bitmap).  SCAN_ASSIGN is a callback called on all assign
+   statements.  HANDLE_SSA_DEFS is a callback called on assign statements and
+   on those call statements which have a lhs; it is the only callback which
+   may be NULL.  ANALYSIS_STAGE is true when running in the analysis stage of
+   a pass and thus no statement is being modified.  DATA is a pointer passed
+   to all callbacks.  If any single callback returns true, this function also
+   returns true, otherwise it returns false.  */
 
-  /* Try to widen the bit range so as to cover padding bits as well.  */
+static bool
+scan_function (bool (*scan_expr) (tree *, gimple_stmt_iterator *, bool, void *),
+	       enum scan_assign_result (*scan_assign) (gimple *,
+						       gimple_stmt_iterator *,
+						       void *),
+	       bool (*handle_ssa_defs)(gimple, void *),
+	       bool analysis_stage, void *data)
+{
+  gimple_stmt_iterator gsi;
+  basic_block bb;
+  unsigned i;
+  tree *t;
+  bool ret = false;
 
-  if ((bit & ~alchk) || size != align)
+  FOR_EACH_BB (bb)
     {
-      unsigned HOST_WIDE_INT mbit = bit & alchk;
-      unsigned HOST_WIDE_INT msize = align;
+      bool bb_changed = false;
 
-      for (f = TYPE_FIELDS (elt->type);
-	   f; f = TREE_CHAIN (f))
+      gsi = gsi_start_bb (bb);
+      while (!gsi_end_p (gsi))
 	{
-	  unsigned HOST_WIDE_INT fbit, fsize;
+	  gimple stmt = gsi_stmt (gsi);
+	  enum scan_assign_result assign_result;
+	  bool any = false, deleted = false;
 
-	  /* Skip the fields from first to prev.  */
-	  if (f == first)
+	  switch (gimple_code (stmt))
 	    {
-	      f = prev;
-	      continue;
-	    }
-
-	  if (!(TREE_CODE (f) == FIELD_DECL
-		&& host_integerp (DECL_FIELD_OFFSET (f), 1)
-		&& host_integerp (DECL_FIELD_BIT_OFFSET (f), 1)))
-	    continue;
-
-	  fbit = tree_low_cst (DECL_FIELD_OFFSET (f), 1) * BITS_PER_UNIT
-	    + tree_low_cst (DECL_FIELD_BIT_OFFSET (f), 1);
+	    case GIMPLE_RETURN:
+	      t = gimple_return_retval_ptr (stmt);
+	      if (*t != NULL_TREE)
+		{
+		  if (DECL_P (*t))
+		    {
+		      tree ret_type = TREE_TYPE (*t);
+		      if (sra_mode == SRA_MODE_EARLY_INTRA
+			  && (TREE_CODE (ret_type) == UNION_TYPE
+			      || TREE_CODE (ret_type) == QUAL_UNION_TYPE))
+			disqualify_candidate (*t,
+					      "Union in a return statement.");
+		      else
+			bitmap_set_bit (retvals_bitmap, DECL_UID (*t));
+		    }
+		  any |= scan_expr (t, &gsi, false, data);
+		}
+	      break;
 
-	  /* If we're past the selected word, we're fine.  */
-	  if ((bit & alchk) < (fbit & alchk))
-	    continue;
+	    case GIMPLE_ASSIGN:
+	      assign_result = scan_assign (&stmt, &gsi, data);
+	      any |= assign_result == SRA_SA_PROCESSED;
+	      deleted = assign_result == SRA_SA_REMOVED;
+	      if (handle_ssa_defs && assign_result != SRA_SA_REMOVED)
+		any |= handle_ssa_defs (stmt, data);
+	      break;
 
-	  if (host_integerp (DECL_SIZE (f), 1))
-	    fsize = tree_low_cst (DECL_SIZE (f), 1);
-	  else
-	    /* Assume a variable-sized field takes up all space till
-	       the end of the word.  ??? Endianness issues?  */
-	    fsize = align - (fbit & alchk);
+	    case GIMPLE_CALL:
+	      /* Operands must be processed before the lhs.  */
+	      for (i = 0; i < gimple_call_num_args (stmt); i++)
+		{
+		  tree *argp = gimple_call_arg_ptr (stmt, i);
+		  any |= scan_expr (argp, &gsi, false, data);
+		}
 
-	  if ((fbit & alchk) < (bit & alchk))
-	    {
-	      /* A large field might start at a previous word and
-		 extend into the selected word.  Exclude those
-		 bits.  ??? Endianness issues? */
-	      HOST_WIDE_INT diff = fbit + fsize - mbit;
+	      if (gimple_call_lhs (stmt))
+		{
+		  tree *lhs_ptr = gimple_call_lhs_ptr (stmt);
+		  if (!analysis_stage
+		      || !disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, NULL))
+		    {
+		      any |= scan_expr (lhs_ptr, &gsi, true, data);
+		      if (handle_ssa_defs)
+			any |= handle_ssa_defs (stmt, data);
+		    }
+		}
+	      break;
 
-	      if (diff <= 0)
-		continue;
+	    case GIMPLE_ASM:
+	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
+		{
+		  tree *op = &TREE_VALUE (gimple_asm_input_op (stmt, i));
+		  any |= scan_expr (op, &gsi, false, data);
+		}
+	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
+		{
+		  tree *op = &TREE_VALUE (gimple_asm_output_op (stmt, i));
+		  any |= scan_expr (op, &gsi, true, data);
+		}
+	      break;
 
+	    default:
-	      msize -= diff;
+	    default:
+	      if (analysis_stage)
+		walk_gimple_op (stmt, disqualify_all, NULL);
+	      break;
 	    }
-	  else
+
+	  if (any)
 	    {
-	      /* Non-overlapping, great.  */
-	      if (fbit + fsize <= mbit
-		  || mbit + msize <= fbit)
-		continue;
+	      ret = true;
+	      bb_changed = true;
 
-	      if (fbit <= mbit)
+	      if (!analysis_stage)
 		{
-		  unsigned HOST_WIDE_INT diff = fbit + fsize - mbit;
-		  mbit += diff;
-		  msize -= diff;
+		  update_stmt (stmt);
+		  if (!stmt_could_throw_p (stmt))
+		    remove_stmt_from_eh_region (stmt);
 		}
-	      else if (fbit > mbit)
-		msize -= (mbit + msize - fbit);
-	      else
-		gcc_unreachable ();
+	    }
+	  if (deleted)
+	    bb_changed = true;
+	  else
+	    gsi_next (&gsi);
 	}
-
-      bit = mbit;
-      size = msize;
+      if (!analysis_stage && bb_changed)
+	gimple_purge_dead_eh_edges (bb);
     }
 
-  /* Now we know the bit range we're interested in.  Find the smallest
-     machine mode we can use to access it.  */
-
-  for (mode = smallest_mode_for_size (size, MODE_INT);
-       ;
-       mode = GET_MODE_WIDER_MODE (mode))
-    {
-      gcc_assert (mode != VOIDmode);
-
-      alchk = GET_MODE_PRECISION (mode) - 1;
-      alchk = ~alchk;
+  return ret;
+}
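+
+/* For instance, the analysis stage could be driven by a call like
+
+     scan_function (build_access_from_expr, build_accesses_from_assign,
+		    NULL, true, NULL);
+
+   i.e. with no SSA-def handling, in analysis mode and with no callback
+   data.  */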
 
-      if ((bit & alchk) == ((bit + size - 1) & alchk))
-	break;
-    }
+/* Helper of qsort for sorting the access array.  The array contains pointers
+   to accesses.  An access is considered smaller than another one if it has a
+   smaller offset or if the offsets are the same but its size is bigger.  */
 
-  gcc_assert (~alchk < align);
+static int
+compare_access_positions (const void *a, const void *b)
+{
+  const access_p *fp1 = (const access_p *) a;
+  const access_p *fp2 = (const access_p *) b;
+  const access_p f1 = *fp1;
+  const access_p f2 = *fp2;
 
-  /* Create the field group as a single variable.  */
+  if (f1->offset != f2->offset)
+    return f1->offset < f2->offset ? -1 : 1;
 
-  /* We used to create a type for the mode above, but size turns
-     to be out not of mode-size.  As we need a matching type
-     to build a BIT_FIELD_REF, use a nonstandard integer type as
-     fallback.  */
-  type = lang_hooks.types.type_for_size (size, 1);
-  if (!type || TYPE_PRECISION (type) != size)
-    type = build_nonstandard_integer_type (size, 1);
-  gcc_assert (type);
-  var = build3 (BIT_FIELD_REF, type, NULL_TREE,
-		bitsize_int (size), bitsize_int (bit));
+  if (f1->size == f2->size)
+    return 0;
+  /* We want the bigger accesses first, thus the opposite operator in the next
+     line: */
+  return f1->size > f2->size ? -1 : 1;
+}
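+
+/* For example, accesses with (offset, size) pairs (0, 64), (0, 32) and
+   (32, 32) are already sorted: at equal offsets the bigger access comes
+   first so that it can represent the whole group.  */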
 
-  block = instantiate_missing_elements_1 (elt, var, type);
-  gcc_assert (block && block->is_scalar);
 
-  var = block->replacement;
-  block->in_bitfld_block = 2;
+/* Append the name of the declaration DECL to the name obstack.  A helper
+   function for make_fancy_name.  */
 
-  /* Add the member fields to the group, such that they access
-     portions of the group variable.  */
+static void
+make_fancy_decl_name (tree decl)
+{
+  char buffer[32];
 
-  for (f = first; f != TREE_CHAIN (prev); f = TREE_CHAIN (f))
+  tree name = DECL_NAME (decl);
+  if (name)
+    obstack_grow (&name_obstack, IDENTIFIER_POINTER (name),
+		  IDENTIFIER_LENGTH (name));
+  else
     {
-      tree field_type = canon_type_for_field (f, elt->element);
-      struct sra_elt *fld = lookup_element (block, f, field_type, INSERT);
-
-      gcc_assert (fld && fld->is_scalar && !fld->replacement);
-
-      fld->replacement = fold_build3 (BIT_FIELD_REF, field_type, var,
-				      bitsize_int (TYPE_PRECISION (field_type)),
-				      bitsize_int
-				      ((TREE_INT_CST_LOW (DECL_FIELD_OFFSET (f))
-					* BITS_PER_UNIT
-					+ (TREE_INT_CST_LOW
-					   (DECL_FIELD_BIT_OFFSET (f)))
-					- (TREE_INT_CST_LOW
-					   (TREE_OPERAND (block->element, 2))))
-				       & ~alchk));
-      fld->in_bitfld_block = 1;
+      sprintf (buffer, "D%u", DECL_UID (decl));
+      obstack_grow (&name_obstack, buffer, strlen (buffer));
     }
-
-  return prev;
 }
 
+/* Helper for make_fancy_name.  */
+
 static void
-instantiate_missing_elements (struct sra_elt *elt)
+make_fancy_name_1 (tree expr)
 {
-  tree type = elt->type;
+  char buffer[32];
+  tree index;
 
-  switch (TREE_CODE (type))
+  if (DECL_P (expr))
     {
-    case RECORD_TYPE:
-      {
-	tree f;
-	for (f = TYPE_FIELDS (type); f ; f = TREE_CHAIN (f))
-	  if (TREE_CODE (f) == FIELD_DECL)
-	    {
-	      tree last = try_instantiate_multiple_fields (elt, f);
-
-	      if (last != f)
-		{
-		  f = last;
-		  continue;
-		}
-
-	      instantiate_missing_elements_1 (elt, f,
-					      canon_type_for_field
-					      (f, elt->element));
-	    }
-	break;
-      }
-
-    case ARRAY_TYPE:
-      {
-	tree i, max, subtype;
-
-	i = TYPE_MIN_VALUE (TYPE_DOMAIN (type));
-	max = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
-	subtype = TREE_TYPE (type);
+      make_fancy_decl_name (expr);
+      return;
+    }
 
-	while (1)
-	  {
-	    instantiate_missing_elements_1 (elt, i, subtype);
-	    if (tree_int_cst_equal (i, max))
-	      break;
-	    i = int_const_binop (PLUS_EXPR, i, integer_one_node, true);
-	  }
+  switch (TREE_CODE (expr))
+    {
+    case COMPONENT_REF:
+      make_fancy_name_1 (TREE_OPERAND (expr, 0));
+      obstack_1grow (&name_obstack, '$');
+      make_fancy_decl_name (TREE_OPERAND (expr, 1));
+      break;
 
+    case ARRAY_REF:
+      make_fancy_name_1 (TREE_OPERAND (expr, 0));
+      obstack_1grow (&name_obstack, '$');
+      /* Arrays with only one element may not have a constant as their
+	 index. */
+      index = TREE_OPERAND (expr, 1);
+      if (TREE_CODE (index) != INTEGER_CST)
 	break;
-      }
+      sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (index));
+      obstack_grow (&name_obstack, buffer, strlen (buffer));
 
-    case COMPLEX_TYPE:
-      type = TREE_TYPE (type);
-      instantiate_missing_elements_1 (elt, integer_zero_node, type);
-      instantiate_missing_elements_1 (elt, integer_one_node, type);
       break;
 
+    case BIT_FIELD_REF:
+    case REALPART_EXPR:
+    case IMAGPART_EXPR:
+      gcc_unreachable ();	/* We treat these as scalars.  */
+      break;
     default:
-      gcc_unreachable ();
+      break;
     }
 }
 
-/* Return true if there is only one non aggregate field in the record, TYPE.
-   Return false otherwise.  */
+/* Create a human readable name for a replacement variable based on EXPR.  */
 
-static bool
-single_scalar_field_in_record_p (tree type)
+static char *
+make_fancy_name (tree expr)
 {
-   int num_fields = 0;
-   tree field;
-   if (TREE_CODE (type) != RECORD_TYPE)
-     return false;
-
-   for (field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
-     if (TREE_CODE (field) == FIELD_DECL)
-       {
-         num_fields++;
-
-         if (num_fields == 2)
-           return false;
-	 
-         if (AGGREGATE_TYPE_P (TREE_TYPE (field)))
-           return false;
-       }
-
-   return true;
-}
-
-/* Make one pass across an element tree deciding whether to perform block
-   or element copies.  If we decide on element copies, instantiate all
-   elements.  Return true if there are any instantiated sub-elements.  */
+  make_fancy_name_1 (expr);
+  obstack_1grow (&name_obstack, '\0');
+  return XOBFINISH (&name_obstack, char *);
+}
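+
+/* For instance, an access to x.data[4] produces the name "x$data$4" on
+   the obstack (the array index is appended only when it is an
+   INTEGER_CST).  */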
+
+/* Helper function for build_ref_for_offset.  */
 
 static bool
-decide_block_copy (struct sra_elt *elt)
+build_ref_for_offset_1 (tree *res, tree type, HOST_WIDE_INT offset,
+			tree exp_type)
 {
-  struct sra_elt *c;
-  bool any_inst;
-
-  /* We shouldn't be invoked on groups of sub-elements as they must
-     behave like their parent as far as block copy is concerned.  */
-  gcc_assert (!elt->is_group);
-
-  /* If scalarization is disabled, respect it.  */
-  if (elt->cannot_scalarize)
+  while (1)
     {
-      elt->use_block_copy = 1;
+      tree fld;
+      tree tr_size, index;
+      HOST_WIDE_INT el_size;
 
-      if (dump_file)
-	{
-	  fputs ("Scalarization disabled for ", dump_file);
-	  dump_sra_elt_name (dump_file, elt);
-	  fputc ('\n', dump_file);
-	}
+      if (offset == 0 && exp_type
+	  && useless_type_conversion_p (exp_type, type))
+	return true;
 
-      /* Disable scalarization of sub-elements */
-      for (c = elt->children; c; c = c->sibling)
+      switch (TREE_CODE (type))
 	{
-	  c->cannot_scalarize = 1;
-	  decide_block_copy (c);
-	}
+	case UNION_TYPE:
+	case QUAL_UNION_TYPE:
+	case RECORD_TYPE:
+	  /* Some Ada records are half-unions, treat all of them the same.  */
+	  for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
+	    {
+	      HOST_WIDE_INT pos, size;
+	      tree expr, *expr_ptr;
 
-      /* Groups behave like their parent.  */
-      for (c = elt->groups; c; c = c->sibling)
-	{
-	  c->cannot_scalarize = 1;
-	  c->use_block_copy = 1;
-	}
+	      if (TREE_CODE (fld) != FIELD_DECL)
+		continue;
 
-      return false;
-    }
+	      pos = int_bit_position (fld);
+	      gcc_assert (TREE_CODE (type) == RECORD_TYPE || pos == 0);
+	      size = tree_low_cst (DECL_SIZE (fld), 1);
+	      if (pos > offset || (pos + size) <= offset)
+		continue;
 
-  /* Don't decide if we've no uses and no groups.  */
-  if (elt->n_uses == 0 && elt->n_copies == 0 && elt->groups == NULL)
-    ;
-
-  else if (!elt->is_scalar)
-    {
-      tree size_tree = TYPE_SIZE_UNIT (elt->type);
-      bool use_block_copy = true;
-
-      /* Tradeoffs for COMPLEX types pretty much always make it better
-	 to go ahead and split the components.  */
-      if (TREE_CODE (elt->type) == COMPLEX_TYPE)
-	use_block_copy = false;
-
-      /* Don't bother trying to figure out the rest if the structure is
-	 so large we can't do easy arithmetic.  This also forces block
-	 copies for variable sized structures.  */
-      else if (host_integerp (size_tree, 1))
-	{
-	  unsigned HOST_WIDE_INT full_size, inst_size = 0;
-	  unsigned int max_size, max_count, inst_count, full_count;
-
-	  /* If the sra-max-structure-size parameter is 0, then the
-	     user has not overridden the parameter and we can choose a
-	     sensible default.  */
-	  max_size = SRA_MAX_STRUCTURE_SIZE
-	    ? SRA_MAX_STRUCTURE_SIZE
-	    : MOVE_RATIO (optimize_function_for_speed_p (cfun)) * UNITS_PER_WORD;
-	  max_count = SRA_MAX_STRUCTURE_COUNT
-	    ? SRA_MAX_STRUCTURE_COUNT
-	    : MOVE_RATIO (optimize_function_for_speed_p (cfun));
-
-	  full_size = tree_low_cst (size_tree, 1);
-	  full_count = count_type_elements (elt->type, false);
-	  inst_count = sum_instantiated_sizes (elt, &inst_size);
-
-	  /* If there is only one scalar field in the record, don't block copy.  */
-	  if (single_scalar_field_in_record_p (elt->type))
-	    use_block_copy = false;
-
-	  /* ??? What to do here.  If there are two fields, and we've only
-	     instantiated one, then instantiating the other is clearly a win.
-	     If there are a large number of fields then the size of the copy
-	     is much more of a factor.  */
-
-	  /* If the structure is small, and we've made copies, go ahead
-	     and instantiate, hoping that the copies will go away.  */
-	  if (full_size <= max_size
-	      && (full_count - inst_count) <= max_count
-	      && elt->n_copies > elt->n_uses)
-	    use_block_copy = false;
-	  else if (inst_count * 100 >= full_count * SRA_FIELD_STRUCTURE_RATIO
-		   && inst_size * 100 >= full_size * SRA_FIELD_STRUCTURE_RATIO)
-	    use_block_copy = false;
-
-	  /* In order to avoid block copy, we have to be able to instantiate
-	     all elements of the type.  See if this is possible.  */
-	  if (!use_block_copy
-	      && (!can_completely_scalarize_p (elt)
-		  || !type_can_instantiate_all_elements (elt->type)))
-	    use_block_copy = true;
-	}
-
-      elt->use_block_copy = use_block_copy;
-
-      /* Groups behave like their parent.  */
-      for (c = elt->groups; c; c = c->sibling)
-	c->use_block_copy = use_block_copy;
+	      if (res)
+		{
+		  expr = build3 (COMPONENT_REF, TREE_TYPE (fld), *res, fld,
+				 NULL_TREE);
+		  expr_ptr = &expr;
+		}
+	      else
+		expr_ptr = NULL;
+	      if (build_ref_for_offset_1 (expr_ptr, TREE_TYPE (fld),
+					  offset - pos, exp_type))
+		{
+		  if (res)
+		    *res = expr;
+		  return true;
+		}
+	    }
+	  return false;
 
-      if (dump_file)
-	{
-	  fprintf (dump_file, "Using %s for ",
-		   use_block_copy ? "block-copy" : "element-copy");
-	  dump_sra_elt_name (dump_file, elt);
-	  fputc ('\n', dump_file);
-	}
+	case ARRAY_TYPE:
+	  tr_size = TYPE_SIZE (TREE_TYPE (type));
+	  if (!tr_size || !host_integerp (tr_size, 1))
+	    return false;
+	  el_size = tree_low_cst (tr_size, 1);
+
+	  index = build_int_cst (TYPE_DOMAIN (type), offset / el_size);
+	  if (!integer_zerop (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
+	    index = int_const_binop (PLUS_EXPR, index,
+				     TYPE_MIN_VALUE (TYPE_DOMAIN (type)), 0);
+	  if (res)
+	    *res = build4 (ARRAY_REF, TREE_TYPE (type), *res, index, NULL_TREE,
+			   NULL_TREE);
+	  offset = offset % el_size;
+	  type = TREE_TYPE (type);
+	  break;
+
+	default:
+	  if (offset != 0)
+	    return false;
 
-      if (!use_block_copy)
-	{
-	  instantiate_missing_elements (elt);
-	  return true;
+	  if (exp_type)
+	    return false;
+	  else
+	    return true;
 	}
     }
-
-  any_inst = elt->replacement != NULL;
-
-  for (c = elt->children; c ; c = c->sibling)
-    any_inst |= decide_block_copy (c);
-
-  return any_inst;
 }
 
-/* Entry point to phase 3.  Instantiate scalar replacement variables.  */
-
-static void
-decide_instantiations (void)
-{
-  unsigned int i;
-  bool cleared_any;
-  bitmap_head done_head;
-  bitmap_iterator bi;
-
-  /* We cannot clear bits from a bitmap we're iterating over,
-     so save up all the bits to clear until the end.  */
-  bitmap_initialize (&done_head, &bitmap_default_obstack);
-  cleared_any = false;
+/* Construct an expression that would reference a part of aggregate *EXPR of
+   type TYPE at the given OFFSET of the type EXP_TYPE.  If EXPR is NULL, the
+   function only determines whether it can build such a reference without
+   actually doing it.
 
-  EXECUTE_IF_SET_IN_BITMAP (sra_candidates, 0, i, bi)
-    {
-      tree var = referenced_var (i);
-      struct sra_elt *elt = lookup_element (NULL, var, NULL, NO_INSERT);
-      if (elt)
-	{
-	  decide_instantiation_1 (elt, 0, 0);
-	  if (!decide_block_copy (elt))
-	    elt = NULL;
-	}
-      if (!elt)
-	{
-	  bitmap_set_bit (&done_head, i);
-	  cleared_any = true;
-	}
-    }
+   FIXME: Eventually this should be replaced with
+   maybe_fold_offset_to_reference() from tree-ssa-ccp.c but that requires a
+   minor rewrite of fold_stmt.  */
 
-  if (cleared_any)
+static bool
+build_ref_for_offset (tree *expr, tree type, HOST_WIDE_INT offset,
+		      tree exp_type, bool allow_ptr)
+{
+  if (allow_ptr && POINTER_TYPE_P (type))
     {
-      bitmap_and_compl_into (sra_candidates, &done_head);
-      bitmap_and_compl_into (needs_copy_in, &done_head);
+      type = TREE_TYPE (type);
+      if (expr)
+	*expr = fold_build1 (INDIRECT_REF, type, *expr);
     }
-  bitmap_clear (&done_head);
-  
-  mark_set_for_renaming (sra_candidates);
 
-  if (dump_file)
-    fputc ('\n', dump_file);
+  return build_ref_for_offset_1 (expr, type, offset, exp_type);
 }
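+
+/* As an example, assuming 32-bit int and given
+
+     struct in { int a, b; };
+     struct out { struct in i; } o;
+
+   a call with TYPE being struct out, OFFSET 32 and EXP_TYPE int rewrites
+   *EXPR into o.i.b; with EXPR being NULL it merely checks that such a
+   reference could be built.  */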
 
-\f
-/* Phase Four: Update the function to match the replacements created.  */
+/* The very first phase of intraprocedural SRA.  It marks in candidate_bitmap
+   those variables whose type is suitable for scalarization.  */
 
-/* Mark all the variables in virtual operands in all the statements in
-   LIST for renaming.  */
-
-static void
-mark_all_v_defs_seq (gimple_seq seq)
+static bool
+find_var_candidates (void)
 {
-  gimple_stmt_iterator gsi;
+  tree var, type;
+  referenced_var_iterator rvi;
+  bool ret = false;
 
-  for (gsi = gsi_start (seq); !gsi_end_p (gsi); gsi_next (&gsi))
-    update_stmt_if_modified (gsi_stmt (gsi));
-}
+  FOR_EACH_REFERENCED_VAR (var, rvi)
+    {
+      if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
+        continue;
+      type = TREE_TYPE (var);
+
+      if (!AGGREGATE_TYPE_P (type)
+	  || needs_to_live_in_memory (var)
+	  || TREE_THIS_VOLATILE (var)
+	  || !COMPLETE_TYPE_P (type)
+	  || !host_integerp (TYPE_SIZE (type), 1)
+          || tree_low_cst (TYPE_SIZE (type), 1) == 0
+	  || type_internals_preclude_sra_p (type))
+	continue;
 
-/* Mark every replacement under ELT with TREE_NO_WARNING.  */
+      bitmap_set_bit (candidate_bitmap, DECL_UID (var));
 
-static void
-mark_no_warning (struct sra_elt *elt)
-{
-  if (!elt->all_no_warning)
-    {
-      if (elt->replacement)
-	TREE_NO_WARNING (elt->replacement) = 1;
-      else
+      if (dump_file)
 	{
-	  struct sra_elt *c;
-	  FOR_EACH_ACTUAL_CHILD (c, elt)
-	    mark_no_warning (c);
+	  fprintf (dump_file, "Candidate (%d): ", DECL_UID (var));
+	  print_generic_expr (dump_file, var, 0);
+	  fprintf (dump_file, "\n");
 	}
-      elt->all_no_warning = true;
+      ret = true;
     }
+
+  return ret;
 }
 
-/* Build a single level component reference to ELT rooted at BASE.  */
+/* Return true if TYPE should be considered a scalar type by SRA.  */
 
-static tree
-generate_one_element_ref (struct sra_elt *elt, tree base)
+static bool
+is_sra_scalar_type (tree type)
 {
-  switch (TREE_CODE (TREE_TYPE (base)))
-    {
-    case RECORD_TYPE:
-      {
-	tree field = elt->element;
-
-	/* We can't test elt->in_bitfld_block here because, when this is
-	   called from instantiate_element, we haven't set this field
-	   yet.  */
-	if (TREE_CODE (field) == BIT_FIELD_REF)
-	  {
-	    tree ret = unshare_expr (field);
-	    TREE_OPERAND (ret, 0) = base;
-	    return ret;
-	  }
-
-	/* Watch out for compatible records with differing field lists.  */
-	if (DECL_FIELD_CONTEXT (field) != TYPE_MAIN_VARIANT (TREE_TYPE (base)))
-	  field = find_compatible_field (TREE_TYPE (base), field);
-
-        return build3 (COMPONENT_REF, elt->type, base, field, NULL);
-      }
-
-    case ARRAY_TYPE:
-      if (TREE_CODE (elt->element) == RANGE_EXPR)
-	return build4 (ARRAY_RANGE_REF, elt->type, base,
-		       TREE_OPERAND (elt->element, 0), NULL, NULL);
-      else
-	return build4 (ARRAY_REF, elt->type, base, elt->element, NULL, NULL);
+  enum tree_code code = TREE_CODE (type);
+  return (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type)
+	  || FIXED_POINT_TYPE_P (type) || POINTER_TYPE_P (type)
+	  || code == VECTOR_TYPE || code == COMPLEX_TYPE
+	  || code == OFFSET_TYPE);
+}
 
-    case COMPLEX_TYPE:
-      if (elt->element == integer_zero_node)
-	return build1 (REALPART_EXPR, elt->type, base);
-      else
-	return build1 (IMAGPART_EXPR, elt->type, base);
 
-    default:
-      gcc_unreachable ();
-    }
-}
+/* Sort all accesses for the given variable, check for partial overlaps and
+   return NULL if there are any.  If there are none, pick a representative for
+   each combination of offset and size and create a linked list out of them.
+   Return the pointer to the first representative and make sure it is the first
+   one in the vector of accesses.  */
+
+static struct access *
+sort_and_splice_var_accesses (tree var)
+{
+  int i, j, access_count;
+  struct access *res, **prev_acc_ptr = &res;
+  VEC (access_p, heap) *access_vec;
+  bool first = true;
+  HOST_WIDE_INT low = -1, high = 0;
 
-/* Build a full component reference to ELT rooted at its native variable.  */
+  access_vec = get_base_access_vector (var);
+  if (!access_vec)
+    return NULL;
+  access_count = VEC_length (access_p, access_vec);
 
-static tree
-generate_element_ref (struct sra_elt *elt)
-{
-  if (elt->parent)
-    return generate_one_element_ref (elt, generate_element_ref (elt->parent));
-  else
-    return elt->element;
-}
+  /* Sort by <OFFSET, SIZE>.  */
+  qsort (VEC_address (access_p, access_vec), access_count, sizeof (access_p),
+	 compare_access_positions);
 
-/* Return true if BF is a bit-field that we can handle like a scalar.  */
+  i = 0;
+  while (i < access_count)
+    {
+      struct access *access = VEC_index (access_p, access_vec, i);
+      bool modification = access->write;
+      bool grp_read = !access->write;
+      bool grp_bfr_lhs = access->grp_bfr_lhs;
+      bool first_scalar = is_sra_scalar_type (access->type);
+      bool unscalarizable_region = access->grp_unscalarizable_region;
 
-static bool
-scalar_bitfield_p (tree bf)
-{
-  return (TREE_CODE (bf) == BIT_FIELD_REF
-	  && (is_gimple_reg (TREE_OPERAND (bf, 0))
-	      || (TYPE_MODE (TREE_TYPE (TREE_OPERAND (bf, 0))) != BLKmode
-		  && (!TREE_SIDE_EFFECTS (TREE_OPERAND (bf, 0))
-		      || (GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE
-						       (TREE_OPERAND (bf, 0))))
-			  <= BITS_PER_WORD)))));
-}
-
-/* Create an assignment statement from SRC to DST.  */
-
-static gimple_seq
-sra_build_assignment (tree dst, tree src)
-{
-  gimple stmt;
-  gimple_seq seq = NULL, seq2 = NULL;
-  /* Turning BIT_FIELD_REFs into bit operations enables other passes
-     to do a much better job at optimizing the code.
-     From dst = BIT_FIELD_REF <var, sz, off> we produce
-
-	SR.1 = (scalar type) var;
-	SR.2 = SR.1 >> off;
-	SR.3 = SR.2 & ((1 << sz) - 1);
-	... possible sign extension of SR.3 ...
-	dst = (destination type) SR.3;
-   */
-  if (scalar_bitfield_p (src))
-    {
-      tree var, shift, width;
-      tree utype, stype;
-      bool unsignedp = (INTEGRAL_TYPE_P (TREE_TYPE (src))
-		        ? TYPE_UNSIGNED (TREE_TYPE (src)) : true);
-      struct gimplify_ctx gctx;
-
-      var = TREE_OPERAND (src, 0);
-      width = TREE_OPERAND (src, 1);
-      /* The offset needs to be adjusted to a right shift quantity
-	 depending on the endianness.  */
-      if (BYTES_BIG_ENDIAN)
+      if (first || access->offset >= high)
 	{
-	  tree tmp = size_binop (PLUS_EXPR, width, TREE_OPERAND (src, 2));
-	  shift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), tmp);
+	  first = false;
+	  low = access->offset;
+	  high = access->offset + access->size;
 	}
+      else if (access->offset > low && access->offset + access->size > high)
+	return NULL;
       else
-	shift = TREE_OPERAND (src, 2);
+	gcc_assert (access->offset >= low
+		    && access->offset + access->size <= high);
 
-      /* In weird cases we have non-integral types for the source or
-	 destination object.
-	 ???  For unknown reasons we also want an unsigned scalar type.  */
-      stype = TREE_TYPE (var);
-      if (!INTEGRAL_TYPE_P (stype))
-	stype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
-						(TYPE_SIZE (stype)), 1);
-      else if (!TYPE_UNSIGNED (stype))
-	stype = unsigned_type_for (stype);
-
-      utype = TREE_TYPE (dst);
-      if (!INTEGRAL_TYPE_P (utype))
-	utype = lang_hooks.types.type_for_size (TREE_INT_CST_LOW
-						(TYPE_SIZE (utype)), 1);
-      else if (!TYPE_UNSIGNED (utype))
-	utype = unsigned_type_for (utype);
-
-      /* Convert the base var of the BIT_FIELD_REF to the scalar type
-	 we use for computation if we cannot use it directly.  */
-      if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
-	var = fold_convert (stype, var);
-      else
-	var = fold_build1 (VIEW_CONVERT_EXPR, stype, var);
+      j = i + 1;
+      while (j < access_count)
+	{
+	  struct access *ac2 = VEC_index (access_p, access_vec, j);
+	  if (ac2->offset != access->offset || ac2->size != access->size)
+	    break;
+	  modification |= ac2->write;
+	  grp_read |= !ac2->write;
+	  grp_bfr_lhs |= ac2->grp_bfr_lhs;
+	  unscalarizable_region |= ac2->grp_unscalarizable_region;
+	  relink_to_new_repr (access, ac2);
 
-      if (!integer_zerop (shift))
-	var = fold_build2 (RSHIFT_EXPR, stype, var, shift);
+	  /* If one of the equivalent accesses is scalar, use it as a
+	     representative (this happens when there is for example only a
+	     single scalar field in a structure).  */
+	  if (!first_scalar && is_sra_scalar_type (ac2->type))
+	    {
+	      struct access tmp_acc;
+	      first_scalar = true;
 
-      /* If we need a masking operation, produce one.  */
-      if (TREE_INT_CST_LOW (width) == TYPE_PRECISION (stype))
-	unsignedp = true;
-      else
-	{
-	  tree one = build_int_cst_wide (stype, 1, 0);
-	  tree mask = int_const_binop (LSHIFT_EXPR, one, width, 0);
-	  mask = int_const_binop (MINUS_EXPR, mask, one, 0);
-	  var = fold_build2 (BIT_AND_EXPR, stype, var, mask);
+	      memcpy (&tmp_acc, ac2, sizeof (struct access));
+	      memcpy (ac2, access,  sizeof (struct access));
+	      memcpy (access, &tmp_acc, sizeof (struct access));
+	    }
+	  ac2->group_representative = access;
+	  j++;
 	}
 
-      /* After shifting and masking, convert to the target type.  */
-      var = fold_convert (utype, var);
+      i = j;
 
-      /* Perform sign extension, if required.
-	 ???  This should never be necessary.  */
-      if (!unsignedp)
-	{
-	  tree signbit = int_const_binop (LSHIFT_EXPR,
-					  build_int_cst_wide (utype, 1, 0),
-					  size_binop (MINUS_EXPR, width,
-						      bitsize_int (1)), 0);
+      access->group_representative = access;
+      access->grp_write = modification;
+      access->grp_read = grp_read;
+      access->grp_bfr_lhs = grp_bfr_lhs;
+      access->grp_unscalarizable_region = unscalarizable_region;
+      if (access->first_link)
+	add_access_to_work_queue (access);
 
-	  var = fold_build2 (BIT_XOR_EXPR, utype, var, signbit);
-	  var = fold_build2 (MINUS_EXPR, utype, var, signbit);
-	}
+      *prev_acc_ptr = access;
+      prev_acc_ptr = &access->next_grp;
+    }
 
-      /* fold_build3 (BIT_FIELD_REF, ...) sometimes returns a cast.  */
-      STRIP_NOPS (dst);
+  gcc_assert (res == VEC_index (access_p, access_vec, 0));
+  return res;
+}
 
-      /* Finally, move and convert to the destination.  */
-      if (INTEGRAL_TYPE_P (TREE_TYPE (dst)))
-	var = fold_convert (TREE_TYPE (dst), var);
-      else
-	var = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (dst), var);
+/* Create a variable for the given ACCESS, deriving its type, name and a few
+   other properties from it.  Return the variable declaration; the caller
+   stores it in ACCESS->replacement_decl.  */
 
-      push_gimplify_context (&gctx);
-      gctx.allow_rhs_cond_expr = true;
+static tree
+create_access_replacement (struct access *access)
+{
+  tree repl;
 
-      gimplify_assign (dst, var, &seq);
+  repl = make_rename_temp (access->type, "SR");
+  get_var_ann (repl);
+  add_referenced_var (repl);
 
-      if (gimple_referenced_vars (cfun))
-	for (var = gctx.temps; var; var = TREE_CHAIN (var))
-	  {
-	    add_referenced_var (var);
-	    mark_sym_for_renaming (var);
-	  }
-      pop_gimplify_context (NULL);
+  DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
+  DECL_ARTIFICIAL (repl) = 1;
 
-      return seq;
-    }
+  if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
+    {
+      char *pretty_name = make_fancy_name (access->expr);
 
-  /* fold_build3 (BIT_FIELD_REF, ...) sometimes returns a cast.  */
-  if (CONVERT_EXPR_P (dst))
+      DECL_NAME (repl) = get_identifier (pretty_name);
+      obstack_free (&name_obstack, pretty_name);
+
+      SET_DECL_DEBUG_EXPR (repl, access->expr);
+      DECL_DEBUG_EXPR_IS_FROM (repl) = 1;
+      DECL_IGNORED_P (repl) = 0;
+      TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);
+    }
+  else
     {
-      STRIP_NOPS (dst);
-      src = fold_convert (TREE_TYPE (dst), src);
+      DECL_IGNORED_P (repl) = 1;
+      TREE_NO_WARNING (repl) = 1;
     }
-  /* It was hoped that we could perform some type sanity checking
-     here, but since front-ends can emit accesses of fields in types
-     different from their nominal types and copy structures containing
-     them as a whole, we'd have to handle such differences here.
-     Since such accesses under different types require compatibility
-     anyway, there's little point in making tests and/or adding
-     conversions to ensure the types of src and dst are the same.
-     So we just assume type differences at this point are ok.
-     The only exception we make here are pointer types, which can be different
-     in e.g. structurally equal, but non-identical RECORD_TYPEs.  */
-  else if (POINTER_TYPE_P (TREE_TYPE (dst))
-	   && !useless_type_conversion_p (TREE_TYPE (dst), TREE_TYPE (src)))
-    src = fold_convert (TREE_TYPE (dst), src);
 
-  /* ???  Only call the gimplifier if we need to.  Otherwise we may 
-     end up substituting with DECL_VALUE_EXPR - see PR37380.  */
-  if (!handled_component_p (src)
-      && !SSA_VAR_P (src))
+  if (access->grp_bfr_lhs)
+    DECL_GIMPLE_REG_P (repl) = 0;
+
+  if (dump_file)
     {
-      src = force_gimple_operand (src, &seq2, false, NULL_TREE);
-      gimple_seq_add_seq (&seq, seq2);
+      fprintf (dump_file, "Created a replacement for ");
+      print_generic_expr (dump_file, access->base, 0);
+      fprintf (dump_file, " offset: %u, size: %u: ",
+	       (unsigned) access->offset, (unsigned) access->size);
+      print_generic_expr (dump_file, repl, 0);
+      fprintf (dump_file, "\n");
     }
-  stmt = gimple_build_assign (dst, src);
-  gimple_seq_add_stmt (&seq, stmt);
-  return seq;
-}
 
-/* BIT_FIELD_REFs must not be shared.  sra_build_elt_assignment()
-   takes care of assignments, but we must create copies for uses.  */
-#define REPLDUP(t) (TREE_CODE (t) != BIT_FIELD_REF ? (t) : unshare_expr (t))
+  return repl;
+}
 
-/* Emit an assignment from SRC to DST, but if DST is a scalarizable
-   BIT_FIELD_REF, turn it into bit operations.  */
+/* Return ACCESS scalar replacement, create it if it does not exist yet.  */
 
-static gimple_seq
-sra_build_bf_assignment (tree dst, tree src)
+static inline tree
+get_access_replacement (struct access *access)
 {
-  tree var, type, utype, tmp, tmp2, tmp3;
-  gimple_seq seq;
-  gimple stmt;
-  tree cst, cst2, mask;
-  tree minshift, maxshift;
-
-  if (TREE_CODE (dst) != BIT_FIELD_REF)
-    return sra_build_assignment (dst, src);
+  gcc_assert (access->grp_to_be_replaced);
 
-  var = TREE_OPERAND (dst, 0);
+  if (access->replacement_decl)
+    return access->replacement_decl;
 
-  if (!scalar_bitfield_p (dst))
-    return sra_build_assignment (REPLDUP (dst), src);
+  access->replacement_decl = create_access_replacement (access);
+  return access->replacement_decl;
+}
 
-  seq = NULL;
+/* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
+   linked list along the way.  Stop when *ACCESS is NULL or when the access it
+   points to is not "within" the root.  */
 
-  cst = fold_convert (bitsizetype, TREE_OPERAND (dst, 2));
-  cst2 = size_binop (PLUS_EXPR,
-		     fold_convert (bitsizetype, TREE_OPERAND (dst, 1)),
-		     cst);
+static void
+build_access_subtree (struct access **access)
+{
+  struct access *root = *access, *last_child = NULL;
+  HOST_WIDE_INT limit = root->offset + root->size;
 
-  if (BYTES_BIG_ENDIAN)
+  *access = (*access)->next_grp;
+  while (*access && (*access)->offset + (*access)->size <= limit)
     {
-      maxshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst);
-      minshift = size_binop (MINUS_EXPR, TYPE_SIZE (TREE_TYPE (var)), cst2);
+      if (!last_child)
+	root->first_child = *access;
+      else
+	last_child->next_sibling = *access;
+      last_child = *access;
+
+      build_access_subtree (access);
     }
-  else
+}
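+
+/* An illustrative sketch (not from the patch): given representatives <0, 64>,
+   <0, 32> and <32, 32> of one variable, build_access_subtree above makes
+   <0, 64> the root and <0, 32> and <32, 32> its two children, because both
+   fall within the root's <offset, offset + size) interval.  */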
+
+/* Build a tree of access representatives.  ACCESS is the pointer to the first
+   one, others are linked in a list by the next_grp field.  Decisions about
+   scalar replacements are made later by analyze_access_trees.  */
+
+static void
+build_access_trees (struct access *access)
+{
+  while (access)
     {
-      maxshift = cst2;
-      minshift = cst;
+      struct access *root = access;
+
+      build_access_subtree (&access);
+      root->next_grp = access;
     }
+}
 
-  type = TREE_TYPE (var);
-  if (!INTEGRAL_TYPE_P (type))
-    type = lang_hooks.types.type_for_size
-      (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (var))), 1);
-  if (TYPE_UNSIGNED (type))
-    utype = type;
-  else
-    utype = unsigned_type_for (type);
+/* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
+   that seems beneficial and ALLOW_REPLACEMENTS allows it.  Also set all sorts
+   of access flags appropriately along the way, notably always set grp_read
+   when MARK_READ is true and grp_write when MARK_WRITE is true.  */
 
-  mask = build_int_cst_wide (utype, 1, 0);
-  if (TREE_INT_CST_LOW (maxshift) == TYPE_PRECISION (utype))
-    cst = build_int_cst_wide (utype, 0, 0);
-  else
-    cst = int_const_binop (LSHIFT_EXPR, mask, maxshift, true);
-  if (integer_zerop (minshift))
-    cst2 = mask;
-  else
-    cst2 = int_const_binop (LSHIFT_EXPR, mask, minshift, true);
-  mask = int_const_binop (MINUS_EXPR, cst, cst2, true);
-  mask = fold_build1 (BIT_NOT_EXPR, utype, mask);
-
-  if (TYPE_MAIN_VARIANT (utype) != TYPE_MAIN_VARIANT (TREE_TYPE (var))
-      && !integer_zerop (mask))
-    {
-      tmp = var;
-      if (!is_gimple_variable (tmp))
-	tmp = unshare_expr (var);
-      else
-	TREE_NO_WARNING (var) = true;
+static bool
+analyze_access_subtree (struct access *root, bool allow_replacements,
+			bool mark_read, bool mark_write)
+{
+  struct access *child;
+  HOST_WIDE_INT limit = root->offset + root->size;
+  HOST_WIDE_INT covered_to = root->offset;
+  bool scalar = is_sra_scalar_type (root->type);
+  bool hole = false, sth_created = false;
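+
+  /* HOLE is set when some part of ROOT is not going to be covered by scalar
+     replacements in its subtree; grp_covered, set at the end when there is no
+     hole, later allows statements assigning to the whole aggregate to be
+     removed rather than kept for the unscalarized parts.  */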
 
-      tmp2 = make_rename_temp (utype, "SR");
+  if (mark_read)
+    root->grp_read = true;
+  else if (root->grp_read)
+    mark_read = true;
 
-      if (INTEGRAL_TYPE_P (TREE_TYPE (var)))
-	tmp = fold_convert (utype, tmp);
-      else
-	tmp = fold_build1 (VIEW_CONVERT_EXPR, utype, tmp);
+  if (mark_write)
+    root->grp_write = true;
+  else if (root->grp_write)
+    mark_write = true;
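+
+  /* The grp_read and grp_write flags thus propagate from each node down to
+     its whole subtree.  */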
 
-      stmt = gimple_build_assign (tmp2, tmp);
-      gimple_seq_add_stmt (&seq, stmt);
-    }
-  else
-    tmp2 = var;
+  if (root->grp_unscalarizable_region)
+    allow_replacements = false;
 
-  if (!integer_zerop (mask))
+  for (child = root->first_child; child; child = child->next_sibling)
     {
-      tmp = make_rename_temp (utype, "SR");
-      stmt = gimple_build_assign (tmp, fold_build2 (BIT_AND_EXPR, utype,
-						    tmp2, mask));
-      gimple_seq_add_stmt (&seq, stmt);
-    }
-  else
-    tmp = mask;
+      if (!hole && child->offset < covered_to)
+	hole = true;
+      else
+	covered_to += child->size;
 
-  if (is_gimple_reg (src) && INTEGRAL_TYPE_P (TREE_TYPE (src)))
-    tmp2 = src;
-  else if (INTEGRAL_TYPE_P (TREE_TYPE (src)))
-    {
-      gimple_seq tmp_seq;
-      tmp2 = make_rename_temp (TREE_TYPE (src), "SR");
-      tmp_seq = sra_build_assignment (tmp2, src);
-      gimple_seq_add_seq (&seq, tmp_seq);
-    }
-  else
-    {
-      gimple_seq tmp_seq;
-      tmp2 = make_rename_temp
-	(lang_hooks.types.type_for_size
-	 (TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (src))),
-	  1), "SR");
-      tmp_seq = sra_build_assignment (tmp2, fold_build1 (VIEW_CONVERT_EXPR,
-						      TREE_TYPE (tmp2), src));
-      gimple_seq_add_seq (&seq, tmp_seq);
+      sth_created |= analyze_access_subtree (child,
+					     allow_replacements && !scalar,
+					     mark_read, mark_write);
+
+      root->grp_unscalarized_data |= child->grp_unscalarized_data;
+      hole |= !child->grp_covered;
     }
 
-  if (!TYPE_UNSIGNED (TREE_TYPE (tmp2)))
+  if (allow_replacements && scalar && !root->first_child)
     {
-      gimple_seq tmp_seq;
-      tree ut = unsigned_type_for (TREE_TYPE (tmp2));
-      tmp3 = make_rename_temp (ut, "SR");
-      tmp2 = fold_convert (ut, tmp2);
-      tmp_seq = sra_build_assignment (tmp3, tmp2);
-      gimple_seq_add_seq (&seq, tmp_seq);
-
-      tmp2 = fold_build1 (BIT_NOT_EXPR, utype, mask);
-      tmp2 = int_const_binop (RSHIFT_EXPR, tmp2, minshift, true);
-      tmp2 = fold_convert (ut, tmp2);
-      tmp2 = fold_build2 (BIT_AND_EXPR, ut, tmp3, tmp2);
-
-      if (tmp3 != tmp2)
+      if (dump_file)
 	{
-	  tmp3 = make_rename_temp (ut, "SR");
-	  tmp_seq = sra_build_assignment (tmp3, tmp2);
-          gimple_seq_add_seq (&seq, tmp_seq);
+	  fprintf (dump_file, "Marking ");
+	  print_generic_expr (dump_file, root->base, 0);
+	  fprintf (dump_file, " offset: %u, size: %u: ",
+		   (unsigned) root->offset, (unsigned) root->size);
+	  fprintf (dump_file, " to be replaced.\n");
 	}
 
-      tmp2 = tmp3;
+      root->grp_to_be_replaced = 1;
+      sth_created = true;
+      hole = false;
     }
+  else if (covered_to < limit)
+    hole = true;
 
-  if (TYPE_MAIN_VARIANT (TREE_TYPE (tmp2)) != TYPE_MAIN_VARIANT (utype))
+  if (sth_created && !hole)
     {
-      gimple_seq tmp_seq;
-      tmp3 = make_rename_temp (utype, "SR");
-      tmp2 = fold_convert (utype, tmp2);
-      tmp_seq = sra_build_assignment (tmp3, tmp2);
-      gimple_seq_add_seq (&seq, tmp_seq);
-      tmp2 = tmp3;
+      root->grp_covered = 1;
+      return true;
     }
+  if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
+    root->grp_unscalarized_data = 1; /* Not covered and written to.  */
+  return sth_created;
+}
 
-  if (!integer_zerop (minshift))
+/* Analyze all access trees linked by next_grp by the means of
+   analyze_access_subtree.  */
+static bool
+analyze_access_trees (struct access *access)
+{
+  bool ret = false;
+
+  while (access)
     {
-      tmp3 = make_rename_temp (utype, "SR");
-      stmt = gimple_build_assign (tmp3, fold_build2 (LSHIFT_EXPR, utype,
-						     tmp2, minshift));
-      gimple_seq_add_stmt (&seq, stmt);
-      tmp2 = tmp3;
+      if (analyze_access_subtree (access, true, false, false))
+	ret = true;
+      access = access->next_grp;
     }
 
-  if (utype != TREE_TYPE (var))
-    tmp3 = make_rename_temp (utype, "SR");
-  else
-    tmp3 = var;
-  stmt = gimple_build_assign (tmp3, fold_build2 (BIT_IOR_EXPR, utype,
-						 tmp, tmp2));
-      gimple_seq_add_stmt (&seq, stmt);
+  return ret;
+}
+
+/* Return true iff a potential new child of LACC at offset NORM_OFFSET and with
+   size SIZE would conflict with an already existing one.  If exactly such a
+   child already exists in LACC, store a pointer to it in EXACT_MATCH.  */
+
+static bool
+child_would_conflict_in_lacc (struct access *lacc, HOST_WIDE_INT norm_offset,
+			      HOST_WIDE_INT size, struct access **exact_match)
+{
+  struct access *child;
 
-  if (tmp3 != var)
+  for (child = lacc->first_child; child; child = child->next_sibling)
     {
-      if (TREE_TYPE (var) == type)
-	stmt = gimple_build_assign (var, fold_convert (type, tmp3));
-      else
-	stmt = gimple_build_assign (var, fold_build1 (VIEW_CONVERT_EXPR,
-						      TREE_TYPE (var), tmp3));
-      gimple_seq_add_stmt (&seq, stmt);
+      if (child->offset == norm_offset && child->size == size)
+	{
+	  *exact_match = child;
+	  return true;
+	}
+
+      if (child->offset < norm_offset + size
+	  && child->offset + child->size > norm_offset)
+	return true;
     }
 
-  return seq;
+  return false;
 }
 
-/* Expand an assignment of SRC to the scalarized representation of
-   ELT.  If it is a field group, try to widen the assignment to cover
-   the full variable.  */
+/* Set the expr of TARGET to one just like MODEL but with its own base at the
+   bottom of the handled components.  */
 
-static gimple_seq
-sra_build_elt_assignment (struct sra_elt *elt, tree src)
+static void
+duplicate_expr_for_different_base (struct access *target,
+				   struct access *model)
 {
-  tree dst = elt->replacement;
-  tree var, tmp, cst, cst2;
-  gimple stmt;
-  gimple_seq seq;
+  tree t, expr = unshare_expr (model->expr);
 
-  if (TREE_CODE (dst) != BIT_FIELD_REF
-      || !elt->in_bitfld_block)
-    return sra_build_assignment (REPLDUP (dst), src);
+  gcc_assert (handled_component_p (expr));
+  t = expr;
+  while (handled_component_p (TREE_OPERAND (t, 0)))
+    t = TREE_OPERAND (t, 0);
+  gcc_assert (TREE_OPERAND (t, 0) == model->base);
+  TREE_OPERAND (t, 0) = target->base;
 
-  var = TREE_OPERAND (dst, 0);
-
-  /* Try to widen the assignment to the entire variable.
-     We need the source to be a BIT_FIELD_REF as well, such that, for
-     BIT_FIELD_REF<d,sz,dp> = BIT_FIELD_REF<s,sz,sp>,
-     by design, conditions are met such that we can turn it into
-     d = BIT_FIELD_REF<s,dw,sp-dp>.  */
-  if (elt->in_bitfld_block == 2
-      && TREE_CODE (src) == BIT_FIELD_REF)
-    {
-      tmp = src;
-      cst = TYPE_SIZE (TREE_TYPE (var));
-      cst2 = size_binop (MINUS_EXPR, TREE_OPERAND (src, 2),
-			 TREE_OPERAND (dst, 2));
+  target->expr = expr;
+}
 
-      src = TREE_OPERAND (src, 0);
 
-      /* Avoid full-width bit-fields.  */
-      if (integer_zerop (cst2)
-	  && tree_int_cst_equal (cst, TYPE_SIZE (TREE_TYPE (src))))
-	{
-	  if (INTEGRAL_TYPE_P (TREE_TYPE (src))
-	      && !TYPE_UNSIGNED (TREE_TYPE (src)))
-	    src = fold_convert (unsigned_type_for (TREE_TYPE (src)), src);
+/* Create a new child access of PARENT, with all properties just like MODEL
+   except for its offset and with its grp_write true and grp_read false.
+   Return the new access.  Note that this access is created long after all
+   splicing and sorting; it is not located in any access vector and is
+   automatically a representative of its group.  */
 
-	  /* If a single conversion won't do, we'll need a statement
-	     list.  */
-	  if (TYPE_MAIN_VARIANT (TREE_TYPE (var))
-	      != TYPE_MAIN_VARIANT (TREE_TYPE (src)))
-	    {
-              gimple_seq tmp_seq;
-	      seq = NULL;
+static struct access *
+create_artificial_child_access (struct access *parent, struct access *model,
+				HOST_WIDE_INT new_offset)
+{
+  struct access *access;
+  struct access **child;
 
-	      if (!INTEGRAL_TYPE_P (TREE_TYPE (src)))
-		src = fold_build1 (VIEW_CONVERT_EXPR,
-				   lang_hooks.types.type_for_size
-				   (TREE_INT_CST_LOW
-				    (TYPE_SIZE (TREE_TYPE (src))),
-				    1), src);
-	      gcc_assert (TYPE_UNSIGNED (TREE_TYPE (src)));
-
-	      tmp = make_rename_temp (TREE_TYPE (src), "SR");
-	      stmt = gimple_build_assign (tmp, src);
-	      gimple_seq_add_stmt (&seq, stmt);
-
-	      tmp_seq = sra_build_assignment (var,
-					      fold_convert (TREE_TYPE (var),
-							    tmp));
-	      gimple_seq_add_seq (&seq, tmp_seq);
+  gcc_assert (!model->grp_unscalarizable_region);
 
-	      return seq;
-	    }
+  access = (struct access *) pool_alloc (access_pool);
+  memset (access, 0, sizeof (struct access));
+  access->base = parent->base;
+  access->offset = new_offset;
+  access->size = model->size;
+  duplicate_expr_for_different_base (access, model);
+  access->type = model->type;
+  access->grp_write = true;
+  access->grp_read = false;
 
-	  src = fold_convert (TREE_TYPE (var), src);
-	}
-      else
-	{
-	  src = fold_convert (TREE_TYPE (var), tmp);
-	}
+  child = &parent->first_child;
+  while (*child && (*child)->offset < new_offset)
+    child = &(*child)->next_sibling;
 
-      return sra_build_assignment (var, src);
-    }
+  access->next_sibling = *child;
+  *child = access;
 
-  return sra_build_bf_assignment (dst, src);
+  return access;
 }
 
-/* Generate a set of assignment statements in *LIST_P to copy all
-   instantiated elements under ELT to or from the equivalent structure
-   rooted at EXPR.  COPY_OUT controls the direction of the copy, with
-   true meaning to copy out of EXPR into ELT.  */
 
-static void
-generate_copy_inout (struct sra_elt *elt, bool copy_out, tree expr,
-		     gimple_seq *seq_p)
+/* Propagate all subaccesses of RACC across an assignment link to LACC.  Return
+   true if any new subaccess was created.  Additionally, if RACC is a scalar
+   access but LACC is not, change the type of the latter.  */
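+
+/* For example (an illustrative case, not from the patch): given an aggregate
+   assignment l = r where r has a known scalar subaccess at some offset but l
+   does not, an artificial child of l is created at the corresponding offset
+   so that it can get a scalar replacement too and the aggregate copy can
+   later be turned into copies of scalars.  */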
+
+static bool
+propagate_subaccesses_across_link (struct access *lacc, struct access *racc)
 {
-  struct sra_elt *c;
-  gimple_seq tmp_seq;
-  tree t;
+  struct access *rchild;
+  HOST_WIDE_INT norm_delta = lacc->offset - racc->offset;
+  bool ret = false;
+
+  if (is_sra_scalar_type (lacc->type)
+      || lacc->grp_unscalarizable_region
+      || racc->grp_unscalarizable_region)
+    return false;
 
-  if (!copy_out && TREE_CODE (expr) == SSA_NAME
-      && TREE_CODE (TREE_TYPE (expr)) == COMPLEX_TYPE)
+  if (!lacc->first_child && !racc->first_child
+      && is_sra_scalar_type (racc->type)
+      && (sra_mode == SRA_MODE_INTRA
+          || !bitmap_bit_p (retvals_bitmap, DECL_UID (lacc->base))))
     {
-      tree r, i;
+      duplicate_expr_for_different_base (lacc, racc);
+      lacc->type = racc->type;
+      return false;
+    }
 
-      c = lookup_element (elt, integer_zero_node, NULL, NO_INSERT);
-      r = c->replacement;
-      c = lookup_element (elt, integer_one_node, NULL, NO_INSERT);
-      i = c->replacement;
+  gcc_assert (lacc->size <= racc->size);
 
-      t = build2 (COMPLEX_EXPR, elt->type, r, i);
-      tmp_seq = sra_build_bf_assignment (expr, t);
-      SSA_NAME_DEF_STMT (expr) = gimple_seq_last_stmt (tmp_seq);
-      gimple_seq_add_seq (seq_p, tmp_seq);
-    }
-  else if (elt->replacement)
-    {
-      if (copy_out)
-	tmp_seq = sra_build_elt_assignment (elt, expr);
-      else
-	tmp_seq = sra_build_bf_assignment (expr, REPLDUP (elt->replacement));
-      gimple_seq_add_seq (seq_p, tmp_seq);
-    }
-  else
+  for (rchild = racc->first_child; rchild; rchild = rchild->next_sibling)
     {
-      FOR_EACH_ACTUAL_CHILD (c, elt)
+      struct access *new_acc = NULL;
+      HOST_WIDE_INT norm_offset = rchild->offset + norm_delta;
+
+      if (rchild->grp_unscalarizable_region)
+	continue;
+
+      if (child_would_conflict_in_lacc (lacc, norm_offset, rchild->size,
+					&new_acc))
 	{
-	  t = generate_one_element_ref (c, unshare_expr (expr));
-	  generate_copy_inout (c, copy_out, t, seq_p);
+	  if (new_acc && rchild->first_child)
+	    ret |= propagate_subaccesses_across_link (new_acc, rchild);
+	  continue;
 	}
+
+      new_acc = create_artificial_child_access (lacc, rchild, norm_offset);
+      if (racc->first_child)
+	propagate_subaccesses_across_link (new_acc, rchild);
+
+      ret = true;
     }
+
+  return ret;
 }
 
-/* Generate a set of assignment statements in *LIST_P to copy all instantiated
-   elements under SRC to their counterparts under DST.  There must be a 1-1
-   correspondence of instantiated elements.  */
+/* Propagate all subaccesses across assignment links.  */
 
 static void
-generate_element_copy (struct sra_elt *dst, struct sra_elt *src, gimple_seq *seq_p)
+propagate_all_subaccesses (void)
 {
-  struct sra_elt *dc, *sc;
-
-  FOR_EACH_ACTUAL_CHILD (dc, dst)
+  while (work_queue_head)
     {
-      sc = lookup_element (src, dc->element, NULL, NO_INSERT);
-      if (!sc && dc->in_bitfld_block == 2)
-	{
-	  struct sra_elt *dcs;
+      struct access *racc = pop_access_from_work_queue ();
+      struct assign_link *link;
 
-	  FOR_EACH_ACTUAL_CHILD (dcs, dc)
-	    {
-	      sc = lookup_element (src, dcs->element, NULL, NO_INSERT);
-	      gcc_assert (sc);
-	      generate_element_copy (dcs, sc, seq_p);
-	    }
+      gcc_assert (racc->first_link);
 
-	  continue;
-	}
+      for (link = racc->first_link; link; link = link->next)
+	{
+	  struct access *lacc = link->lacc;
 
-      /* If DST and SRC are structs with the same elements, but do not have
-	 the same TYPE_MAIN_VARIANT, then lookup of DST FIELD_DECL in SRC
-	 will fail.  Try harder by finding the corresponding FIELD_DECL
-	 in SRC.  */
-      if (!sc)
-	{
-	  tree f;
-
-	  gcc_assert (useless_type_conversion_p (dst->type, src->type));
-	  gcc_assert (TREE_CODE (dc->element) == FIELD_DECL);
-	  for (f = TYPE_FIELDS (src->type); f ; f = TREE_CHAIN (f))
-	    if (simple_cst_equal (DECL_FIELD_OFFSET (f),
-				  DECL_FIELD_OFFSET (dc->element)) > 0
-		&& simple_cst_equal (DECL_FIELD_BIT_OFFSET (f),
-				     DECL_FIELD_BIT_OFFSET (dc->element)) > 0
-		&& simple_cst_equal (DECL_SIZE (f),
-				     DECL_SIZE (dc->element)) > 0
-		&& (useless_type_conversion_p (TREE_TYPE (dc->element),
-					       TREE_TYPE (f))
-		    || (POINTER_TYPE_P (TREE_TYPE (dc->element))
-			&& POINTER_TYPE_P (TREE_TYPE (f)))))
-	      break;
-	  gcc_assert (f != NULL_TREE);
-	  sc = lookup_element (src, f, NULL, NO_INSERT);
+	  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
+	    continue;
+	  lacc = lacc->group_representative;
+	  if (propagate_subaccesses_across_link (lacc, racc)
+	      && lacc->first_link)
+	    add_access_to_work_queue (lacc);
 	}
-
-      generate_element_copy (dc, sc, seq_p);
-    }
-
-  if (dst->replacement)
-    {
-      gimple_seq tmp_seq;
-
-      gcc_assert (src->replacement);
-
-      tmp_seq = sra_build_elt_assignment (dst, REPLDUP (src->replacement));
-      gimple_seq_add_seq (seq_p, tmp_seq);
     }
 }
 
-/* Generate a set of assignment statements in *LIST_P to zero all instantiated
-   elements under ELT.  In addition, do not assign to elements that have been
-   marked VISITED but do reset the visited flag; this allows easy coordination
-   with generate_element_init.  */
+/* Go through all accesses collected throughout the (intraprocedural) analysis
+   stage, exclude overlapping ones, identify representatives and build trees
+   out of them, making decisions about scalarization on the way.  Return true
+   iff there are any to-be-scalarized variables after this stage.  */
 
-static void
-generate_element_zero (struct sra_elt *elt, gimple_seq *seq_p)
+static bool
+analyze_all_variable_accesses (void)
 {
-  struct sra_elt *c;
+  tree var;
+  referenced_var_iterator rvi;
+  bool res = false;
 
-  if (elt->visited)
-    {
-      elt->visited = false;
-      return;
-    }
+  FOR_EACH_REFERENCED_VAR (var, rvi)
+    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
+      {
+	struct access *access;
 
-  if (!elt->in_bitfld_block)
-    FOR_EACH_ACTUAL_CHILD (c, elt)
-      generate_element_zero (c, seq_p);
+	access = sort_and_splice_var_accesses (var);
+	if (access)
+	  build_access_trees (access);
+	else
+	  disqualify_candidate (var,
+				"No or inhibitingly overlapping accesses.");
+      }
 
-  if (elt->replacement)
-    {
-      tree t;
-      gimple_seq tmp_seq;
+  propagate_all_subaccesses ();
 
-      gcc_assert (elt->is_scalar);
-      t = fold_convert (elt->type, integer_zero_node);
+  FOR_EACH_REFERENCED_VAR (var, rvi)
+    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
+      {
+	struct access *access = get_first_repr_for_decl (var);
 
-      tmp_seq = sra_build_elt_assignment (elt, t);
-      gimple_seq_add_seq (seq_p, tmp_seq);
-    }
+	if (analyze_access_trees (access))
+	  {
+	    res = true;
+	    if (dump_file)
+	      {
+		fprintf (dump_file, "\nAccess trees for ");
+		print_generic_expr (dump_file, var, 0);
+		fprintf (dump_file, " (UID: %u): \n", DECL_UID (var));
+		dump_access_tree (dump_file, access);
+		fprintf (dump_file, "\n");
+	      }
+	  }
+	else
+	  disqualify_candidate (var, "No scalar replacements to be created.");
+      }
+
+  return res;
 }
 
-/* Generate an assignment VAR = INIT, where INIT may need gimplification.
-   Add the result to *LIST_P.  */
+/* Return true iff a reference into aggregate AGG can be built for every single
+   to-be-replaced access that is a child of ACCESS, its sibling or a child of
+   its sibling.  TOP_OFFSET is the offset of the processed access subtree which
+   has to be subtracted from the offset of each access.  */
 
-static void
-generate_one_element_init (struct sra_elt *elt, tree init, gimple_seq *seq_p)
+static bool
+ref_expr_for_all_replacements_p (struct access *access, tree agg,
+				 HOST_WIDE_INT top_offset)
 {
-  gimple_seq tmp_seq = sra_build_elt_assignment (elt, init);
-  gimple_seq_add_seq (seq_p, tmp_seq);
+  do
+    {
+      if (access->grp_to_be_replaced
+	  && !build_ref_for_offset (NULL, TREE_TYPE (agg),
+				    access->offset - top_offset,
+				    access->type, false))
+	return false;
+
+      if (access->first_child
+	  && !ref_expr_for_all_replacements_p (access->first_child, agg,
+					       top_offset))
+	return false;
+
+      access = access->next_sibling;
+    }
+  while (access);
+
+  return true;
 }
 
-/* Generate a set of assignment statements in *LIST_P to set all instantiated
-   elements under ELT with the contents of the initializer INIT.  In addition,
-   mark all assigned elements VISITED; this allows easy coordination with
-   generate_element_zero.  Return false if we found a case we couldn't
-   handle.  */
 
-static bool
-generate_element_init_1 (struct sra_elt *elt, tree init, gimple_seq *seq_p)
+/* Generate statements copying scalar replacements of accesses within a subtree
+   into or out of AGG.  ACCESS is the first child of the root of the subtree to
+   be processed.  AGG is an aggregate type expression (it can be a declaration
+   but does not have to be; it can for example also be an INDIRECT_REF).
+   TOP_OFFSET is the offset of the processed subtree which has to be subtracted
+   from offsets of individual accesses to get corresponding offsets for AGG.
+   If CHUNK_SIZE is non-zero, copy only replacements in the interval
+   <START_OFFSET, START_OFFSET + CHUNK_SIZE>, otherwise copy all of them.  GSI
+   is a statement iterator used to place the new statements.  WRITE should be
+   true when the statements should write from AGG to the replacements and
+   false if vice versa.  If INSERT_AFTER is true, new statements will be added
+   after the current statement in GSI; otherwise they will be added before
+   it.  */
+
+static void
+generate_subtree_copies (struct access *access, tree agg,
+			 HOST_WIDE_INT top_offset,
+			 HOST_WIDE_INT start_offset, HOST_WIDE_INT chunk_size,
+			 gimple_stmt_iterator *gsi, bool write,
+			 bool insert_after)
 {
-  bool result = true;
-  enum tree_code init_code;
-  struct sra_elt *sub;
-  tree t;
-  unsigned HOST_WIDE_INT idx;
-  tree value, purpose;
+  do
+    {
+      tree expr = unshare_expr (agg);
 
-  /* We can be passed DECL_INITIAL of a static variable.  It might have a
-     conversion, which we strip off here.  */
-  STRIP_USELESS_TYPE_CONVERSION (init);
-  init_code = TREE_CODE (init);
+      if (chunk_size && access->offset >= start_offset + chunk_size)
+	return;
 
-  if (elt->is_scalar)
-    {
-      if (elt->replacement)
-	{
-	  generate_one_element_init (elt, init, seq_p);
-	  elt->visited = true;
-	}
-      return result;
-    }
+      if (access->grp_to_be_replaced
+	  && (chunk_size == 0
+	      || access->offset + access->size > start_offset))
+	{
+	  bool repl_found;
+	  gimple stmt;
+
+	  repl_found = build_ref_for_offset (&expr, TREE_TYPE (agg),
+					     access->offset - top_offset,
+					     access->type, false);
+	  gcc_assert (repl_found);
 
-  switch (init_code)
-    {
-    case COMPLEX_CST:
-    case COMPLEX_EXPR:
-      FOR_EACH_ACTUAL_CHILD (sub, elt)
-	{
-	  if (sub->element == integer_zero_node)
-	    t = (init_code == COMPLEX_EXPR
-		 ? TREE_OPERAND (init, 0) : TREE_REALPART (init));
+	  if (write)
+	    stmt = gimple_build_assign (get_access_replacement (access), expr);
 	  else
-	    t = (init_code == COMPLEX_EXPR
-		 ? TREE_OPERAND (init, 1) : TREE_IMAGPART (init));
-	  result &= generate_element_init_1 (sub, t, seq_p);
-	}
-      break;
-
-    case CONSTRUCTOR:
-      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init), idx, purpose, value)
-	{
-	  /* Array constructors are routinely created with NULL indices.  */
-	  if (purpose == NULL_TREE)
 	    {
-	      result = false;
-	      break;
+	      tree repl = get_access_replacement (access);
+	      TREE_NO_WARNING (repl) = 1;
+	      stmt = gimple_build_assign (expr, repl);
 	    }
-	  if (TREE_CODE (purpose) == RANGE_EXPR)
-	    {
-	      tree lower = TREE_OPERAND (purpose, 0);
-	      tree upper = TREE_OPERAND (purpose, 1);
 
-	      while (1)
-		{
-	  	  sub = lookup_element (elt, lower, NULL, NO_INSERT);
-		  if (sub != NULL)
-		    result &= generate_element_init_1 (sub, value, seq_p);
-		  if (tree_int_cst_equal (lower, upper))
-		    break;
-		  lower = int_const_binop (PLUS_EXPR, lower,
-					   integer_one_node, true);
-		}
-	    }
+	  if (insert_after)
+	    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
 	  else
-	    {
-	      sub = lookup_element (elt, purpose, NULL, NO_INSERT);
-	      if (sub != NULL)
-		result &= generate_element_init_1 (sub, value, seq_p);
-	    }
+	    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+	  update_stmt (stmt);
 	}
-      break;
 
-    default:
-      elt->visited = true;
-      result = false;
-    }
+      if (access->first_child)
+	generate_subtree_copies (access->first_child, agg, top_offset,
+				 start_offset, chunk_size, gsi,
+				 write, insert_after);
 
-  return result;
+      access = access->next_sibling;
+    }
+  while (access);
 }
 
-/* A wrapper function for generate_element_init_1 that handles cleanup after
-   gimplification.  */
+/* Assign zero to all scalar replacements in an access subtree.  ACCESS is the
+   root of the subtree to be processed.  GSI is the statement iterator used
+   for inserting statements which are added after the current statement if
+   INSERT_AFTER is true or before it otherwise.  */
 
-static bool
-generate_element_init (struct sra_elt *elt, tree init, gimple_seq *seq_p)
-{
-  bool ret;
-  struct gimplify_ctx gctx;
+static void
+init_subtree_with_zero (struct access *access, gimple_stmt_iterator *gsi,
+			bool insert_after)
 
-  push_gimplify_context (&gctx);
-  ret = generate_element_init_1 (elt, init, seq_p);
-  pop_gimplify_context (NULL);
+{
+  struct access *child;
 
-  /* The replacement can expose previously unreferenced variables.  */
-  if (ret && *seq_p)
+  if (access->grp_to_be_replaced)
     {
-      gimple_stmt_iterator i;
+      gimple stmt;
 
-      for (i = gsi_start (*seq_p); !gsi_end_p (i); gsi_next (&i))
-	find_new_referenced_vars (gsi_stmt (i));
+      stmt = gimple_build_assign (get_access_replacement (access),
+				  fold_convert (access->type,
+						integer_zero_node));
+      if (insert_after)
+	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
+      else
+	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+      update_stmt (stmt);
     }
 
-  return ret;
+  for (child = access->first_child; child; child = child->next_sibling)
+    init_subtree_with_zero (child, gsi, insert_after);
 }
 
-/* Helper function to insert LIST before GSI, and set up line number info.  */
+/* Search for an access representative for the given expression EXPR and
+   return it or NULL if it cannot be found.  */
 
-static void
-sra_insert_before (gimple_stmt_iterator *gsi, gimple_seq seq)
+static struct access *
+get_access_for_expr (tree expr)
 {
-  gimple stmt = gsi_stmt (*gsi);
-
-  if (gimple_has_location (stmt))
-    annotate_all_with_location (seq, gimple_location (stmt));
-  gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
-}
+  HOST_WIDE_INT offset, size, max_size;
+  tree base;
 
-/* Similarly, but insert after GSI.  Handles insertion onto edges as well.  */
+  if (TREE_CODE (expr) == NOP_EXPR
+      || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
+    expr = TREE_OPERAND (expr, 0);
 
-static void
-sra_insert_after (gimple_stmt_iterator *gsi, gimple_seq seq)
-{
-  gimple stmt = gsi_stmt (*gsi);
+  if (handled_component_p (expr))
+    {
+      base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
+      size = max_size;
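+      /* A MAX_SIZE of -1 means the size of the reference is not known at
+	 compile time (e.g. because of a variable array index), and such
+	 expressions cannot correspond to any recorded access.  */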
+      if (size == -1 || !base || !DECL_P (base))
+	return NULL;
+    }
+  else if (DECL_P (expr))
+    {
+      tree tree_size;
 
-  if (gimple_has_location (stmt))
-    annotate_all_with_location (seq, gimple_location (stmt));
+      base = expr;
+      tree_size = TYPE_SIZE (TREE_TYPE (base));
+      if (tree_size && host_integerp (tree_size, 1))
+	size = max_size = tree_low_cst (tree_size, 1);
+      else
+	return NULL;
 
-  if (stmt_ends_bb_p (stmt))
-    insert_edge_copies_seq (seq, gsi_bb (*gsi));
+      offset = 0;
+    }
   else
-    gsi_insert_seq_after (gsi, seq, GSI_SAME_STMT);
+    return NULL;
+
+  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
+    return NULL;
+
+  return get_var_base_offset_size_access (base, offset, size);
 }
 
-/* Similarly, but replace the statement at GSI.  */
+/* Replace *EXPR, which has type TYPE, with the value of the replacement of
+   ACCESS.  If TYPE is not TREE_ADDRESSABLE, this is done by producing a
+   special V_C_E assignment statement that converts the replacement to a new
+   temporary of the requested type; otherwise it is done by going through the
+   base aggregate.  */
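+
+/* A sketch of the non-addressable read case (illustrative, with made-up
+   temporary names): when a float is read from an access whose replacement
+   SR.1 has an integer type, the code below emits
+   SRvce.2 = VIEW_CONVERT_EXPR<float>(SR.1) before the statement and uses
+   SRvce.2 in place of the original expression.  */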
 
 static void
-sra_replace (gimple_stmt_iterator *gsi, gimple_seq seq)
+fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
+				 gimple_stmt_iterator *gsi, bool write)
 {
-  sra_insert_before (gsi, seq);
-  unlink_stmt_vdef (gsi_stmt (*gsi));
-  gsi_remove (gsi, false);
-  if (gsi_end_p (*gsi))
-    *gsi = gsi_last (gsi_seq (*gsi));
-  else
-    gsi_prev (gsi);
-}
+  tree repl = get_access_replacement (access);
+  if (!TREE_ADDRESSABLE (type))
+    {
+      tree tmp = create_tmp_var (type, "SRvce");
 
-/* Data structure that bitfield_overlaps_p fills in with information
-   about the element passed in and how much of it overlaps with the
-   bit-range passed it to.  */
+      add_referenced_var (tmp);
+      if (is_gimple_reg_type (type))
+	tmp = make_ssa_name (tmp, NULL);
 
-struct bitfield_overlap_info
-{
-  /* The bit-length of an element.  */
-  tree field_len;
+      if (write)
+	{
+	  gimple stmt;
+	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);
 
-  /* The bit-position of the element in its parent.  */
-  tree field_pos;
+	  *expr = tmp;
+	  if (is_gimple_reg_type (type))
+	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
+	  stmt = gimple_build_assign (repl, conv);
+	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  gimple stmt;
+	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
 
-  /* The number of bits of the element that overlap with the incoming
-     bit range.  */
-  tree overlap_len;
+	  stmt = gimple_build_assign (tmp, conv);
+	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+	  if (is_gimple_reg_type (type))
+	    SSA_NAME_DEF_STMT (tmp) = stmt;
+	  *expr = tmp;
+	  update_stmt (stmt);
+	}
+    }
+  else
+    {
+      if (write)
+	{
+	  gimple stmt;
 
-  /* The first bit of the element that overlaps with the incoming bit
-     range.  */
-  tree overlap_pos;
-};
+	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
+	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
+	  update_stmt (stmt);
+	}
+      else
+	{
+	  gimple stmt;
+
+	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
+	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
+	  update_stmt (stmt);
+	}
+    }
+}
 
-/* Return true if a BIT_FIELD_REF<(FLD->parent), BLEN, BPOS>
-   expression (referenced as BF below) accesses any of the bits in FLD,
-   false if it doesn't.  If DATA is non-null, its field_len and
-   field_pos are filled in such that BIT_FIELD_REF<(FLD->parent),
-   field_len, field_pos> (referenced as BFLD below) represents the
-   entire field FLD->element, and BIT_FIELD_REF<BFLD, overlap_len,
-   overlap_pos> represents the portion of the entire field that
-   overlaps with BF.  */
+
+/* Callback for scan_function.  Replace the expression *EXPR with a scalar
+   replacement if there is one and generate other statements to do type
+   conversion or subtree copying if necessary.  GSI is used to place newly
+   created statements; WRITE is true if the expression is being written to (it
+   is the LHS of a statement or an output in an assembly statement).  */
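+
+/* Note that a BIT_FIELD_REF on top of the expression is peeled off below but
+   remembered, so that only the referenced chunk of the aggregate needs to be
+   refreshed by generate_subtree_copies.  */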
 
 static bool
-bitfield_overlaps_p (tree blen, tree bpos, struct sra_elt *fld,
-		     struct bitfield_overlap_info *data)
+sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write,
+		 void *data ATTRIBUTE_UNUSED)
 {
-  tree flen, fpos;
-  bool ret;
+  struct access *access;
+  tree type, bfr;
 
-  if (TREE_CODE (fld->element) == FIELD_DECL)
+  if (TREE_CODE (*expr) == BIT_FIELD_REF)
     {
-      flen = fold_convert (bitsizetype, DECL_SIZE (fld->element));
-      fpos = fold_convert (bitsizetype, DECL_FIELD_OFFSET (fld->element));
-      fpos = size_binop (MULT_EXPR, fpos, bitsize_int (BITS_PER_UNIT));
-      fpos = size_binop (PLUS_EXPR, fpos, DECL_FIELD_BIT_OFFSET (fld->element));
-    }
-  else if (TREE_CODE (fld->element) == BIT_FIELD_REF)
-    {
-      flen = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 1));
-      fpos = fold_convert (bitsizetype, TREE_OPERAND (fld->element, 2));
-    }
-  else if (TREE_CODE (fld->element) == INTEGER_CST)
-    {
-      tree domain_type = TYPE_DOMAIN (TREE_TYPE (fld->parent->element));
-      flen = fold_convert (bitsizetype, TYPE_SIZE (fld->type));
-      fpos = fold_convert (bitsizetype, fld->element);
-      if (domain_type && TYPE_MIN_VALUE (domain_type))
-	fpos = size_binop (MINUS_EXPR, fpos,
-			   fold_convert (bitsizetype,
-			   		 TYPE_MIN_VALUE (domain_type)));
-      fpos = size_binop (MULT_EXPR, flen, fpos);
+      bfr = *expr;
+      expr = &TREE_OPERAND (*expr, 0);
     }
   else
-    gcc_unreachable ();
-
-  gcc_assert (host_integerp (blen, 1)
-	      && host_integerp (bpos, 1)
-	      && host_integerp (flen, 1)
-	      && host_integerp (fpos, 1));
+    bfr = NULL_TREE;
 
-  ret = ((!tree_int_cst_lt (fpos, bpos)
-	  && tree_int_cst_lt (size_binop (MINUS_EXPR, fpos, bpos),
-			      blen))
-	 || (!tree_int_cst_lt (bpos, fpos)
-	     && tree_int_cst_lt (size_binop (MINUS_EXPR, bpos, fpos),
-				 flen)));
+  if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
+    expr = &TREE_OPERAND (*expr, 0);
+  type = TREE_TYPE (*expr);
 
-  if (!ret)
-    return ret;
+  access = get_access_for_expr (*expr);
+  if (!access)
+    return false;
 
-  if (data)
+  if (access->grp_to_be_replaced)
     {
-      tree bend, fend;
-
-      data->field_len = flen;
-      data->field_pos = fpos;
-
-      fend = size_binop (PLUS_EXPR, fpos, flen);
-      bend = size_binop (PLUS_EXPR, bpos, blen);
-
-      if (tree_int_cst_lt (bend, fend))
-	data->overlap_len = size_binop (MINUS_EXPR, bend, fpos);
+      if (!useless_type_conversion_p (type, access->type))
+	fix_incompatible_types_for_expr (expr, type, access, gsi, write);
       else
-	data->overlap_len = NULL;
+	*expr = get_access_replacement (access);
+    }
 
-      if (tree_int_cst_lt (fpos, bpos))
+  if (access->first_child)
+    {
+      HOST_WIDE_INT start_offset, chunk_size;
+      if (bfr
+	  && host_integerp (TREE_OPERAND (bfr, 1), 1)
+	  && host_integerp (TREE_OPERAND (bfr, 2), 1))
 	{
-	  data->overlap_pos = size_binop (MINUS_EXPR, bpos, fpos);
-	  data->overlap_len = size_binop (MINUS_EXPR,
-					  data->overlap_len
-					  ? data->overlap_len
-					  : data->field_len,
-					  data->overlap_pos);
+	  start_offset = tree_low_cst (TREE_OPERAND (bfr, 1), 1);
+	  chunk_size = tree_low_cst (TREE_OPERAND (bfr, 2), 1);
 	}
       else
-	data->overlap_pos = NULL;
-    }
+	start_offset = chunk_size = 0;
 
-  return ret;
+      generate_subtree_copies (access->first_child, access->base, 0,
+			       start_offset, chunk_size, gsi, write, write);
+    }
+  return true;
 }
 
-/* Add to LISTP a sequence of statements that copies BLEN bits between
-   VAR and the scalarized elements of ELT, starting a bit VPOS of VAR
-   and at bit BPOS of ELT.  The direction of the copy is given by
-   TO_VAR.  */
+/* Store all replacements in the access tree rooted in TOP_RACC either to their
+   base aggregate if there are unscalarized data or directly to LHS
+   otherwise.  */
 
 static void
-sra_explode_bitfield_assignment (tree var, tree vpos, bool to_var,
-				 gimple_seq *seq_p, tree blen, tree bpos,
-				 struct sra_elt *elt)
+handle_unscalarized_data_in_subtree (struct access *top_racc, tree lhs,
+				     gimple_stmt_iterator *gsi)
 {
-  struct sra_elt *fld;
-  struct bitfield_overlap_info flp;
-
-  FOR_EACH_ACTUAL_CHILD (fld, elt)
-    {
-      tree flen, fpos;
+  if (top_racc->grp_unscalarized_data)
+    generate_subtree_copies (top_racc->first_child, top_racc->base, 0, 0, 0,
+			     gsi, false, false);
+  else
+    generate_subtree_copies (top_racc->first_child, lhs, top_racc->offset,
+			     0, 0, gsi, false, false);
+}
 
-      if (!bitfield_overlaps_p (blen, bpos, fld, &flp))
-	continue;
 
-      flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
-      fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
+/* Try to generate statements to load all sub-replacements in an access
+   (sub)tree (LACC is the first child) from scalar replacements in the TOP_RACC
+   (sub)tree.  If that is not possible, refresh the TOP_RACC base aggregate and
+   load the accesses from it.  LEFT_OFFSET is the offset of the whole left
+   subtree being copied, RIGHT_OFFSET is the same thing for the right subtree.
+   OLD_GSI and NEW_GSI are statement iterators used to insert the new
+   statements.  *REFRESHED is true iff the RHS top aggregate has already been
+   refreshed from the contents of its scalar replacements and is set to true
+   if this function has to do it.  */
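+
+/* In other words (an illustrative summary): for an aggregate copy l = r this
+   pairs each scalar replacement of l with the replacement of r at the
+   corresponding offset and emits a direct scalar copy; only when there is no
+   counterpart on the right is the value loaded from the aggregate itself.  */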
 
-      if (fld->replacement)
+static void
+load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
+				 HOST_WIDE_INT left_offset,
+				 HOST_WIDE_INT right_offset,
+				 gimple_stmt_iterator *old_gsi,
+				 gimple_stmt_iterator *new_gsi,
+				 bool *refreshed, tree lhs)
+{
+  do
+    {
+      if (lacc->grp_to_be_replaced)
 	{
-	  tree infld, invar, type;
-          gimple_seq st;
-
-	  infld = fld->replacement;
-
-	  type = unsigned_type_for (TREE_TYPE (infld));
-	  if (TYPE_PRECISION (type) != TREE_INT_CST_LOW (flen))
-	    type = build_nonstandard_integer_type (TREE_INT_CST_LOW (flen), 1);
+	  struct access *racc;
+	  HOST_WIDE_INT offset = lacc->offset - left_offset + right_offset;
 
-	  if (TREE_CODE (infld) == BIT_FIELD_REF)
+	  racc = find_access_in_subtree (top_racc, offset, lacc->size);
+	  if (racc && racc->grp_to_be_replaced)
 	    {
-	      fpos = size_binop (PLUS_EXPR, fpos, TREE_OPERAND (infld, 2));
-	      infld = TREE_OPERAND (infld, 0);
-	    }
-	  else if (BYTES_BIG_ENDIAN && DECL_P (fld->element)
-		   && !tree_int_cst_equal (TYPE_SIZE (TREE_TYPE (infld)),
-					   DECL_SIZE (fld->element)))
-	    {
-	      fpos = size_binop (PLUS_EXPR, fpos,
-				 TYPE_SIZE (TREE_TYPE (infld)));
-	      fpos = size_binop (MINUS_EXPR, fpos,
-				 DECL_SIZE (fld->element));
-	    }
-
-	  infld = fold_build3 (BIT_FIELD_REF, type, infld, flen, fpos);
+	      gimple stmt;
 
-	  invar = size_binop (MINUS_EXPR, flp.field_pos, bpos);
-	  if (flp.overlap_pos)
-	    invar = size_binop (PLUS_EXPR, invar, flp.overlap_pos);
-	  invar = size_binop (PLUS_EXPR, invar, vpos);
-
-	  invar = fold_build3 (BIT_FIELD_REF, type, var, flen, invar);
+	      if (useless_type_conversion_p (lacc->type, racc->type))
+		stmt = gimple_build_assign (get_access_replacement (lacc),
+					    get_access_replacement (racc));
+	      else
+		{
+		  tree rhs = fold_build1 (VIEW_CONVERT_EXPR, lacc->type,
+					  get_access_replacement (racc));
+		  stmt = gimple_build_assign (get_access_replacement (lacc),
+					      rhs);
+		}
 
-	  if (to_var)
-	    st = sra_build_bf_assignment (invar, infld);
+	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
+	      update_stmt (stmt);
+	    }
 	  else
-	    st = sra_build_bf_assignment (infld, invar);
+	    {
+	      tree expr = unshare_expr (top_racc->base);
+	      bool repl_found;
+	      gimple stmt;
+
+	      /* No suitable access on the right hand side, need to load from
+		 the aggregate.  See if we have to update it first... */
+	      if (!*refreshed)
+		{
+		  gcc_assert (top_racc->first_child);
+		  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
+		  *refreshed = true;
+		}
 
-	  gimple_seq_add_seq (seq_p, st);
+	      repl_found = build_ref_for_offset (&expr,
+						 TREE_TYPE (top_racc->base),
+						 lacc->offset - left_offset,
+						 lacc->type, false);
+	      gcc_assert (repl_found);
+	      stmt = gimple_build_assign (get_access_replacement (lacc),
+					  expr);
+	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
+	      update_stmt (stmt);
+	    }
 	}
-      else
+      else if (lacc->grp_read && !lacc->grp_covered && !*refreshed)
 	{
-	  tree sub = size_binop (MINUS_EXPR, flp.field_pos, bpos);
-	  sub = size_binop (PLUS_EXPR, vpos, sub);
-	  if (flp.overlap_pos)
-	    sub = size_binop (PLUS_EXPR, sub, flp.overlap_pos);
-
-	  sra_explode_bitfield_assignment (var, sub, to_var, seq_p,
-					   flen, fpos, fld);
+	  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
+	  *refreshed = true;
 	}
+
+      if (lacc->first_child)
+	load_assign_lhs_subreplacements (lacc->first_child, top_racc,
+					 left_offset, right_offset,
+					 old_gsi, new_gsi, refreshed, lhs);
+      lacc = lacc->next_sibling;
     }
+  while (lacc);
 }
 
-/* Add to LISTBEFOREP statements that copy scalarized members of ELT
-   that overlap with BIT_FIELD_REF<(ELT->element), BLEN, BPOS> back
-   into the full variable, and to LISTAFTERP, if non-NULL, statements
-   that copy the (presumably modified) overlapping portions of the
-   full variable back to the scalarized variables.  */
+/* Return true iff ACC is non-NULL and has subaccesses.  */
 
-static void
-sra_sync_for_bitfield_assignment (gimple_seq *seq_before_p,
-                                  gimple_seq *seq_after_p,
-				  tree blen, tree bpos,
-				  struct sra_elt *elt)
+static inline bool
+access_has_children_p (struct access *acc)
 {
-  struct sra_elt *fld;
-  struct bitfield_overlap_info flp;
-
-  FOR_EACH_ACTUAL_CHILD (fld, elt)
-    if (bitfield_overlaps_p (blen, bpos, fld, &flp))
-      {
-	if (fld->replacement || (!flp.overlap_len && !flp.overlap_pos))
-	  {
-	    generate_copy_inout (fld, false, generate_element_ref (fld),
-				 seq_before_p);
-	    mark_no_warning (fld);
-	    if (seq_after_p)
-	      generate_copy_inout (fld, true, generate_element_ref (fld),
-				   seq_after_p);
-	  }
-	else
-	  {
-	    tree flen = flp.overlap_len ? flp.overlap_len : flp.field_len;
-	    tree fpos = flp.overlap_pos ? flp.overlap_pos : bitsize_int (0);
-
-	    sra_sync_for_bitfield_assignment (seq_before_p, seq_after_p,
-					      flen, fpos, fld);
-	  }
-      }
+  return acc && acc->first_child;
 }
 
-/* Scalarize a USE.  To recap, this is either a simple reference to ELT,
-   if elt is scalar, or some occurrence of ELT that requires a complete
-   aggregate.  IS_OUTPUT is true if ELT is being modified.  */
-
-static void
-scalarize_use (struct sra_elt *elt, tree *expr_p, gimple_stmt_iterator *gsi,
-	       bool is_output, bool use_all)
-{
-  gimple stmt = gsi_stmt (*gsi);
-  tree bfexpr;
-
-  if (elt->replacement)
-    {
-      tree replacement = elt->replacement;
-
-      /* If we have a replacement, then updating the reference is as
-	 simple as modifying the existing statement in place.  */
-      if (is_output
-	  && TREE_CODE (elt->replacement) == BIT_FIELD_REF
-	  && is_gimple_reg (TREE_OPERAND (elt->replacement, 0))
-	  && is_gimple_assign (stmt)
-	  && gimple_assign_lhs_ptr (stmt) == expr_p)
-	{
-          gimple_seq newseq;
-          /* RHS must be a single operand. */
-          gcc_assert (gimple_assign_single_p (stmt));
-	  newseq = sra_build_elt_assignment (elt, gimple_assign_rhs1 (stmt));
-	  sra_replace (gsi, newseq);
-	  return;
-	}
-      else if (!is_output
-	       && TREE_CODE (elt->replacement) == BIT_FIELD_REF
-	       && is_gimple_assign (stmt)
-	       && gimple_assign_rhs1_ptr (stmt) == expr_p)
-	{
-	  tree tmp = make_rename_temp
-	    (TREE_TYPE (gimple_assign_lhs (stmt)), "SR");
-	  gimple_seq newseq = sra_build_assignment (tmp, REPLDUP (elt->replacement));
-
-	  sra_insert_before (gsi, newseq);
-	  replacement = tmp;
-	}
-      if (is_output)
-	  update_stmt_if_modified (stmt);
-      *expr_p = REPLDUP (replacement);
-      update_stmt (stmt);
-    }
-  else if (use_all && is_output
-	   && is_gimple_assign (stmt)
-	   && TREE_CODE (bfexpr
-			 = gimple_assign_lhs (stmt)) == BIT_FIELD_REF
-	   && &TREE_OPERAND (bfexpr, 0) == expr_p
-	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
-	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
-    {
-      gimple_seq seq_before = NULL;
-      gimple_seq seq_after = NULL;
-      tree blen = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 1));
-      tree bpos = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 2));
-      bool update = false;
-
-      if (!elt->use_block_copy)
-	{
-	  tree type = TREE_TYPE (bfexpr);
-	  tree var = make_rename_temp (type, "SR"), tmp, vpos;
-          gimple st;
-
-	  gimple_assign_set_lhs (stmt, var);
-	  update = true;
+/* Modify assignments with a CONSTRUCTOR on their RHS.  STMT contains a pointer
+   to the assignment and GSI is the statement iterator pointing at it.  Returns
+   the same values as sra_modify_assign.  */
 
-	  if (!TYPE_UNSIGNED (type))
-	    {
-	      type = unsigned_type_for (type);
-	      tmp = make_rename_temp (type, "SR");
-	      st = gimple_build_assign (tmp, fold_convert (type, var));
-	      gimple_seq_add_stmt (&seq_after, st);
-	      var = tmp;
-	    }
+static enum scan_assign_result
+sra_modify_constructor_assign (gimple *stmt, gimple_stmt_iterator *gsi)
+{
+  tree lhs = gimple_assign_lhs (*stmt);
+  struct access *acc;
 
-	  /* If VAR is wider than BLEN bits, it is padded at the
-	     most-significant end.  We want to set VPOS such that
-	     <BIT_FIELD_REF VAR BLEN VPOS> would refer to the
-	     least-significant BLEN bits of VAR.  */
-	  if (BYTES_BIG_ENDIAN)
-	    vpos = size_binop (MINUS_EXPR, TYPE_SIZE (type), blen);
-	  else
-	    vpos = bitsize_int (0);
-	  sra_explode_bitfield_assignment
-	    (var, vpos, false, &seq_after, blen, bpos, elt);
-	}
-      else
-	sra_sync_for_bitfield_assignment
-	  (&seq_before, &seq_after, blen, bpos, elt);
+  gcc_assert (TREE_CODE (lhs) != REALPART_EXPR
+	      && TREE_CODE (lhs) != IMAGPART_EXPR);
+  acc = get_access_for_expr (lhs);
+  if (!acc)
+    return SRA_SA_NONE;
 
-      if (seq_before)
-	{
-	  mark_all_v_defs_seq (seq_before);
-	  sra_insert_before (gsi, seq_before);
-	}
-      if (seq_after)
-	{
-	  mark_all_v_defs_seq (seq_after);
-	  sra_insert_after (gsi, seq_after);
-	}
+  if (VEC_length (constructor_elt,
+		  CONSTRUCTOR_ELTS (gimple_assign_rhs1 (*stmt))) > 0)
+    {
+      /* I have never seen this code path trigger but if it can happen the
+	 following should handle it gracefully.  */
+      if (access_has_children_p (acc))
+	generate_subtree_copies (acc->first_child, acc->base, 0, 0, 0, gsi,
+				 true, true);
+      return SRA_SA_PROCESSED;
+    }
 
-      if (update)
-	update_stmt (stmt);
+  if (acc->grp_covered)
+    {
+      init_subtree_with_zero (acc, gsi, false);
+      unlink_stmt_vdef (*stmt);
+      gsi_remove (gsi, true);
+      return SRA_SA_REMOVED;
     }
-  else if (use_all && !is_output
-	   && is_gimple_assign (stmt)
-	   && TREE_CODE (bfexpr
-			 = gimple_assign_rhs1 (stmt)) == BIT_FIELD_REF
-	   && &TREE_OPERAND (gimple_assign_rhs1 (stmt), 0) == expr_p
-	   && INTEGRAL_TYPE_P (TREE_TYPE (bfexpr))
-	   && TREE_CODE (TREE_TYPE (*expr_p)) == RECORD_TYPE)
+  else
     {
-      gimple_seq seq = NULL;
-      tree blen = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 1));
-      tree bpos = fold_convert (bitsizetype, TREE_OPERAND (bfexpr, 2));
-      bool update = false;
+      init_subtree_with_zero (acc, gsi, true);
+      return SRA_SA_PROCESSED;
+    }
+}
 
-      if (!elt->use_block_copy)
-	{
-	  tree type = TREE_TYPE (bfexpr);
-	  tree var = make_rename_temp (type, "SR"), tmp, vpos;
-	  gimple st = NULL;
 
-	  gimple_assign_set_rhs1 (stmt, var);
-	  update = true;
+/* Modify statements that have an IMAGPART_EXPR or REALPART_EXPR of a
+   to-be-scalarized expression on their lhs.  STMT is the statement and GSI is
+   the iterator used to place new helper statements.  Returns the same values
+   as sra_modify_assign.  */
 
-	  if (!TYPE_UNSIGNED (type))
-	    {
-	      type = unsigned_type_for (type);
-	      tmp = make_rename_temp (type, "SR");
-	      st = gimple_build_assign (var,
-					fold_convert (TREE_TYPE (var), tmp));
-	      var = tmp;
-	    }
+static enum scan_assign_result
+sra_modify_partially_complex_lhs (gimple stmt, gimple_stmt_iterator *gsi)
+{
+  tree lhs, complex, ptype, rp, ip;
+  struct access *access;
+  gimple new_stmt, aux_stmt;
 
-	  gimple_seq_add_stmt (&seq,
-                               gimple_build_assign
-				 (var, build_int_cst_wide (type, 0, 0)));
-
-	  /* If VAR is wider than BLEN bits, it is padded at the
-	     most-significant end.  We want to set VPOS such that
-	     <BIT_FIELD_REF VAR BLEN VPOS> would refer to the
-	     least-significant BLEN bits of VAR.  */
-	  if (BYTES_BIG_ENDIAN)
-	    vpos = size_binop (MINUS_EXPR, TYPE_SIZE (type), blen);
-	  else
-	    vpos = bitsize_int (0);
-	  sra_explode_bitfield_assignment
-	    (var, vpos, true, &seq, blen, bpos, elt);
+  lhs = gimple_assign_lhs (stmt);
+  complex = TREE_OPERAND (lhs, 0);
 
-	  if (st)
-	    gimple_seq_add_stmt (&seq, st);
-	}
-      else
-	sra_sync_for_bitfield_assignment
-	  (&seq, NULL, blen, bpos, elt);
+  access = get_access_for_expr (complex);
 
-      if (seq)
-	{
-	  mark_all_v_defs_seq (seq);
-	  sra_insert_before (gsi, seq);
-	}
+  if (!access || !access->grp_to_be_replaced)
+    return SRA_SA_NONE;
+
+  ptype = TREE_TYPE (TREE_TYPE (complex));
+  rp = create_tmp_var (ptype, "SRr");
+  add_referenced_var (rp);
+  rp = make_ssa_name (rp, NULL);
 
-      if (update)
-	update_stmt (stmt);
+  ip = create_tmp_var (ptype, "SRp");
+  add_referenced_var (ip);
+  ip = make_ssa_name (ip, NULL);
+
+  if (TREE_CODE (lhs) == IMAGPART_EXPR)
+    {
+      aux_stmt = gimple_build_assign (rp, fold_build1 (REALPART_EXPR, ptype,
+					     get_access_replacement (access)));
+      SSA_NAME_DEF_STMT (rp) = aux_stmt;
+      gimple_assign_set_lhs (stmt, ip);
+      SSA_NAME_DEF_STMT (ip) = stmt;
     }
   else
     {
-      gimple_seq seq = NULL;
+      aux_stmt = gimple_build_assign (ip, fold_build1 (IMAGPART_EXPR, ptype,
+					     get_access_replacement (access)));
+      SSA_NAME_DEF_STMT (ip) = aux_stmt;
+      gimple_assign_set_lhs (stmt, rp);
+      SSA_NAME_DEF_STMT (rp) = stmt;
+    }
 
-      /* Otherwise we need some copies.  If ELT is being read, then we
-	 want to store all (modified) sub-elements back into the
-	 structure before the reference takes place.  If ELT is being
-	 written, then we want to load the changed values back into
-	 our shadow variables.  */
-      /* ??? We don't check modified for reads, we just always write all of
-	 the values.  We should be able to record the SSA number of the VOP
-	 for which the values were last read.  If that number matches the
-	 SSA number of the VOP in the current statement, then we needn't
-	 emit an assignment.  This would also eliminate double writes when
-	 a structure is passed as more than one argument to a function call.
-	 This optimization would be most effective if sra_walk_function
-	 processed the blocks in dominator order.  */
+  gsi_insert_before (gsi, aux_stmt, GSI_SAME_STMT);
+  update_stmt (aux_stmt);
+  new_stmt = gimple_build_assign (get_access_replacement (access),
+				  fold_build2 (COMPLEX_EXPR, access->type,
+					       rp, ip));
+  gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
+  update_stmt (new_stmt);
+  return SRA_SA_PROCESSED;
+}
 
-      generate_copy_inout (elt, is_output, generate_element_ref (elt), &seq);
-      if (seq == NULL)
-	return;
-      mark_all_v_defs_seq (seq);
-      if (is_output)
-	sra_insert_after (gsi, seq);
-      else
-	{
-	  sra_insert_before (gsi, seq);
-	  if (use_all)
-	    mark_no_warning (elt);
-	}
+/* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */
+
+static bool
+contains_view_convert_expr_p (tree t)
+{
+  while (1)
+    {
+      if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
+	return true;
+      if (!handled_component_p (t))
+	return false;
+      t = TREE_OPERAND (t, 0);
     }
 }
 
-/* Scalarize a COPY.  To recap, this is an assignment statement between
-   two scalarizable references, LHS_ELT and RHS_ELT.  */
+/* Change STMT to assign compatible types by means of adding component or array
+   references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
+   variables with the same names in sra_modify_assign.  This is done in such a
+   complicated way in order to make testsuite/g++.dg/tree-ssa/ssa-sra-2.C
+   happy, and it does help in at least some cases.  */
 
 static void
-scalarize_copy (struct sra_elt *lhs_elt, struct sra_elt *rhs_elt,
-		gimple_stmt_iterator *gsi)
+fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
+				   struct access *lacc, struct access *racc,
+				   tree lhs, tree *rhs, tree ltype, tree rtype)
 {
-  gimple_seq seq;
-  gimple stmt;
-
-  if (lhs_elt->replacement && rhs_elt->replacement)
+  if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
+      && !access_has_children_p (lacc))
     {
-      /* If we have two scalar operands, modify the existing statement.  */
-      stmt = gsi_stmt (*gsi);
-
-      /* See the commentary in sra_walk_function concerning
-	 RETURN_EXPR, and why we should never see one here.  */
-      gcc_assert (is_gimple_assign (stmt));
-      gcc_assert (gimple_assign_copy_p (stmt));
-
-
-      gimple_assign_set_lhs (stmt, lhs_elt->replacement);
-      gimple_assign_set_rhs1 (stmt, REPLDUP (rhs_elt->replacement));
-      update_stmt (stmt);
+      tree expr = unshare_expr (lhs);
+      bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
+					 false);
+      if (found)
+	{
+	  gimple_assign_set_lhs (*stmt, expr);
+	  return;
+	}
     }
-  else if (lhs_elt->use_block_copy || rhs_elt->use_block_copy)
+
+  if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
+      && !access_has_children_p (racc))
     {
-      /* If either side requires a block copy, then sync the RHS back
-	 to the original structure, leave the original assignment
-	 statement (which will perform the block copy), then load the
-	 LHS values out of its now-updated original structure.  */
-      /* ??? Could perform a modified pair-wise element copy.  That
-	 would at least allow those elements that are instantiated in
-	 both structures to be optimized well.  */
-
-      seq = NULL;
-      generate_copy_inout (rhs_elt, false,
-			   generate_element_ref (rhs_elt), &seq);
-      if (seq)
-	{
-	  mark_all_v_defs_seq (seq);
-	  sra_insert_before (gsi, seq);
-	}
-
-      seq = NULL;
-      generate_copy_inout (lhs_elt, true,
-			   generate_element_ref (lhs_elt), &seq);
-      if (seq)
+      tree expr = unshare_expr (*rhs);
+      bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
+					 false);
+      if (found)
 	{
-	  mark_all_v_defs_seq (seq);
-	  sra_insert_after (gsi, seq);
+	  gimple_assign_set_rhs1 (*stmt, expr);
+	  return;
 	}
     }
-  else
-    {
-      /* Otherwise both sides must be fully instantiated.  In which
-	 case perform pair-wise element assignments and replace the
-	 original block copy statement.  */
 
-      stmt = gsi_stmt (*gsi);
-      update_stmt_if_modified (stmt);
+  *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
+  gimple_assign_set_rhs_from_tree (gsi, *rhs);
+  *stmt = gsi_stmt (*gsi);
+}
+
+/* Callback of scan_function to process assign statements.  It examines both
+   sides of the statement, replaces them with a scalar replacement if there is
+   one and generates copies of replacements if scalarized aggregates have been
+   used in the assignment.  STMT is a pointer to the assign statement, GSI is
+   used to hold generated statements for type conversions and subtree
+   copying.  */
+
+static enum scan_assign_result
+sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
+		   void *data ATTRIBUTE_UNUSED)
+{
+  struct access *lacc, *racc;
+  tree ltype, rtype;
+  tree lhs, rhs;
+  bool modify_this_stmt;
 
-      seq = NULL;
-      generate_element_copy (lhs_elt, rhs_elt, &seq);
-      gcc_assert (seq);
-      mark_all_v_defs_seq (seq);
-      sra_replace (gsi, seq);
-    }
-}
+  if (gimple_assign_rhs2 (*stmt))
+    return SRA_SA_NONE;
+  lhs = gimple_assign_lhs (*stmt);
+  rhs = gimple_assign_rhs1 (*stmt);
 
-/* Scalarize an INIT.  To recap, this is an assignment to a scalarizable
-   reference from some form of constructor: CONSTRUCTOR, COMPLEX_CST or
-   COMPLEX_EXPR.  If RHS is NULL, it should be treated as an empty
-   CONSTRUCTOR.  */
+  if (TREE_CODE (rhs) == CONSTRUCTOR)
+    return sra_modify_constructor_assign (stmt, gsi);
 
-static void
-scalarize_init (struct sra_elt *lhs_elt, tree rhs, gimple_stmt_iterator *gsi)
-{
-  bool result = true;
-  gimple_seq seq = NULL, init_seq = NULL;
+  if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
+    return sra_modify_partially_complex_lhs (*stmt, gsi);
 
-  /* Generate initialization statements for all members extant in the RHS.  */
-  if (rhs)
+  if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
+      || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
     {
-      /* Unshare the expression just in case this is from a decl's initial.  */
-      rhs = unshare_expr (rhs);
-      result = generate_element_init (lhs_elt, rhs, &init_seq);
+      modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
+					  gsi, false, data);
+      modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
+					   gsi, true, data);
+      return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
     }
 
-  if (!result)
+  lacc = get_access_for_expr (lhs);
+  racc = get_access_for_expr (rhs);
+  if (!lacc && !racc)
+    return SRA_SA_NONE;
+
+  modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
+		      || (racc && racc->grp_to_be_replaced));
+
+  if (lacc && lacc->grp_to_be_replaced)
     {
-      /* If we failed to convert the entire initializer, then we must
-	 leave the structure assignment in place and must load values
-	 from the structure into the slots for which we did not find
-	 constants.  The easiest way to do this is to generate a complete
-	 copy-out, and then follow that with the constant assignments
-	 that we were able to build.  DCE will clean things up.  */
-      gimple_seq seq0 = NULL;
-      generate_copy_inout (lhs_elt, true, generate_element_ref (lhs_elt),
-			   &seq0);
-      gimple_seq_add_seq (&seq0, seq);
-      seq = seq0;
+      lhs = get_access_replacement (lacc);
+      gimple_assign_set_lhs (*stmt, lhs);
+      ltype = lacc->type;
     }
   else
-    {
-      /* CONSTRUCTOR is defined such that any member not mentioned is assigned
-	 a zero value.  Initialize the rest of the instantiated elements.  */
-      generate_element_zero (lhs_elt, &seq);
-      gimple_seq_add_seq (&seq, init_seq);
-    }
+    ltype = TREE_TYPE (lhs);
 
-  if (lhs_elt->use_block_copy || !result)
+  if (racc && racc->grp_to_be_replaced)
     {
-      /* Since LHS is not fully instantiated, we must leave the structure
-	 assignment in place.  Treating this case differently from a USE
-	 exposes constants to later optimizations.  */
-      if (seq)
-	{
-	  mark_all_v_defs_seq (seq);
-	  sra_insert_after (gsi, seq);
-	}
+      rhs = get_access_replacement (racc);
+      gimple_assign_set_rhs1 (*stmt, rhs);
+      rtype = racc->type;
     }
   else
-    {
-      /* The LHS is fully instantiated.  The list of initializations
-	 replaces the original structure assignment.  */
-      gcc_assert (seq);
-      update_stmt_if_modified (gsi_stmt (*gsi));
-      mark_all_v_defs_seq (seq);
-      sra_replace (gsi, seq);
-    }
-}
+    rtype = TREE_TYPE (rhs);
 
-/* A subroutine of scalarize_ldst called via walk_tree.  Set TREE_NO_TRAP
-   on all INDIRECT_REFs.  */
-
-static tree
-mark_notrap (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
-{
-  tree t = *tp;
-
-  if (TREE_CODE (t) == INDIRECT_REF)
+  /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate
+     the statement makes the position of this compatibility fixup a bit
+     awkward but hopefully makes some sense.  */
+  if (modify_this_stmt)
     {
-      TREE_THIS_NOTRAP (t) = 1;
-      *walk_subtrees = 0;
+      if (!useless_type_conversion_p (ltype, rtype))
+	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
+					   lhs, &rhs, ltype, rtype);
     }
-  else if (IS_TYPE_OR_DECL_P (t))
-    *walk_subtrees = 0;
-
-  return NULL;
-}
-
-/* Scalarize a LDST.  To recap, this is an assignment between one scalarizable
-   reference ELT and one non-scalarizable reference OTHER.  IS_OUTPUT is true
-   if ELT is on the left-hand side.  */
 
-static void
-scalarize_ldst (struct sra_elt *elt, tree other,
-		gimple_stmt_iterator *gsi, bool is_output)
-{
-  /* Shouldn't have gotten called for a scalar.  */
-  gcc_assert (!elt->replacement);
-
-  if (elt->use_block_copy)
+  if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
+      || (access_has_children_p (racc)
+	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
+      || (access_has_children_p (lacc)
+	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
     {
-      /* Since ELT is not fully instantiated, we have to leave the
-	 block copy in place.  Treat this as a USE.  */
-      scalarize_use (elt, NULL, gsi, is_output, false);
+      if (access_has_children_p (racc))
+	generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
+				 gsi, false, false);
+      if (access_has_children_p (lacc))
+	generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
+				 gsi, true, true);
     }
   else
     {
-      /* The interesting case is when ELT is fully instantiated.  In this
-	 case we can have each element stored/loaded directly to/from the
-	 corresponding slot in OTHER.  This avoids a block copy.  */
-
-      gimple_seq seq = NULL;
-      gimple stmt = gsi_stmt (*gsi);
-
-      update_stmt_if_modified (stmt);
-      generate_copy_inout (elt, is_output, other, &seq);
-      gcc_assert (seq);
-      mark_all_v_defs_seq (seq);
-
-      /* Preserve EH semantics.  */
-      if (stmt_ends_bb_p (stmt))
-	{
-	  gimple_stmt_iterator si;
-	  gimple first;
-          gimple_seq blist = NULL;
-	  bool thr = stmt_could_throw_p (stmt);
-
-	  /* If the last statement of this BB created an EH edge
-	     before scalarization, we have to locate the first
-	     statement that can throw in the new statement list and
-	     use that as the last statement of this BB, such that EH
-	     semantics is preserved.  All statements up to this one
-	     are added to the same BB.  All other statements in the
-	     list will be added to normal outgoing edges of the same
-	     BB.  If they access any memory, it's the same memory, so
-	     we can assume they won't throw.  */
-	  si = gsi_start (seq);
-	  for (first = gsi_stmt (si);
-	       thr && !gsi_end_p (si) && !stmt_could_throw_p (first);
-	       first = gsi_stmt (si))
+      if (access_has_children_p (lacc) && access_has_children_p (racc))
+	{
+	  gimple_stmt_iterator orig_gsi = *gsi;
+	  bool refreshed;
+
+	  if (lacc->grp_read && !lacc->grp_covered)
 	    {
-	      gsi_remove (&si, false);
-	      gimple_seq_add_stmt (&blist, first);
+	      handle_unscalarized_data_in_subtree (racc, lhs, gsi);
+	      refreshed = true;
 	    }
+	  else
+	    refreshed = false;
 
-	  /* Extract the first remaining statement from LIST, this is
-	     the EH statement if there is one.  */
-	  gsi_remove (&si, false);
-
-	  if (blist)
-	    sra_insert_before (gsi, blist);
-
-	  /* Replace the old statement with this new representative.  */
-	  gsi_replace (gsi, first, true);
+	  load_assign_lhs_subreplacements (lacc->first_child, racc,
+					   lacc->offset, racc->offset,
+					   &orig_gsi, gsi, &refreshed, lhs);
+	  if (!refreshed || !racc->grp_unscalarized_data)
+	    {
+	      if (*stmt == gsi_stmt (*gsi))
+		gsi_next (gsi);
 
-	  if (!gsi_end_p (si))
+	      unlink_stmt_vdef (*stmt);
+	      gsi_remove (&orig_gsi, true);
+	      return SRA_SA_REMOVED;
+	    }
+	}
+      else
+	{
+	  if (access_has_children_p (racc))
 	    {
-	      /* If any reference would trap, then they all would.  And more
-		 to the point, the first would.  Therefore none of the rest
-		 will trap since the first didn't.  Indicate this by
-		 iterating over the remaining statements and set
-		 TREE_THIS_NOTRAP in all INDIRECT_REFs.  */
-	      do
+	      if (!racc->grp_unscalarized_data)
 		{
-		  walk_gimple_stmt (&si, NULL, mark_notrap, NULL);
-		  gsi_next (&si);
+		  generate_subtree_copies (racc->first_child, lhs,
+					   racc->offset, 0, 0, gsi,
+					   false, false);
+		  gcc_assert (*stmt == gsi_stmt (*gsi));
+		  unlink_stmt_vdef (*stmt);
+		  gsi_remove (gsi, true);
+		  return SRA_SA_REMOVED;
 		}
-	      while (!gsi_end_p (si));
-
-	      insert_edge_copies_seq (seq, gsi_bb (*gsi));
+	      else
+		generate_subtree_copies (racc->first_child, lhs,
+					 racc->offset, 0, 0, gsi, false, true);
 	    }
+	  else if (access_has_children_p (lacc))
+	    generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
+				     0, 0, gsi, true, true);
 	}
-      else
-	sra_replace (gsi, seq);
     }
+
+  return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
 }
 
-/* Generate initializations for all scalarizable parameters.  */
+/* Generate statements initializing scalar replacements of parts of function
+   parameters.  */
 
 static void
-scalarize_parms (void)
+initialize_parameter_reductions (void)
 {
+  gimple_stmt_iterator gsi;
   gimple_seq seq = NULL;
-  unsigned i;
-  bitmap_iterator bi;
+  tree parm;
 
-  EXECUTE_IF_SET_IN_BITMAP (needs_copy_in, 0, i, bi)
+  for (parm = DECL_ARGUMENTS (current_function_decl);
+       parm;
+       parm = TREE_CHAIN (parm))
     {
-      tree var = referenced_var (i);
-      struct sra_elt *elt = lookup_element (NULL, var, NULL, NO_INSERT);
-      generate_copy_inout (elt, true, var, &seq);
-    }
+      VEC (access_p, heap) *access_vec;
+      struct access *access;
 
-  if (seq)
-    {
-      insert_edge_copies_seq (seq, ENTRY_BLOCK_PTR);
-      mark_all_v_defs_seq (seq);
-    }
-}
+      if (!bitmap_bit_p (candidate_bitmap, DECL_UID (parm)))
+	continue;
+      access_vec = get_base_access_vector (parm);
+      if (!access_vec)
+	continue;
 
-/* Entry point to phase 4.  Update the function to match replacements.  */
+      if (!seq)
+	{
+	  seq = gimple_seq_alloc ();
+	  gsi = gsi_start (seq);
+	}
 
-static void
-scalarize_function (void)
-{
-  static const struct sra_walk_fns fns = {
-    scalarize_use, scalarize_copy, scalarize_init, scalarize_ldst, false
-  };
+      for (access = VEC_index (access_p, access_vec, 0);
+	   access;
+	   access = access->next_grp)
+	generate_subtree_copies (access, parm, 0, 0, 0, &gsi, true, true);
+    }
 
-  sra_walk_function (&fns);
-  scalarize_parms ();
-  gsi_commit_edge_inserts ();
+  if (seq)
+    gsi_insert_seq_on_edge_immediate (single_succ_edge (ENTRY_BLOCK_PTR), seq);
 }
 
-\f
-/* Debug helper function.  Print ELT in a nice human-readable format.  */
-
-static void
-dump_sra_elt_name (FILE *f, struct sra_elt *elt)
+/* The "main" function of intraprocedural SRA passes.  Runs the analysis and if
+   it reveals there are components of some aggregates to be scalarized, it runs
+   the required transformations.  */
+static unsigned int
+perform_intra_sra (void)
 {
-  if (elt->parent && TREE_CODE (elt->parent->type) == COMPLEX_TYPE)
-    {
-      fputs (elt->element == integer_zero_node ? "__real__ " : "__imag__ ", f);
-      dump_sra_elt_name (f, elt->parent);
-    }
-  else
-    {
-      if (elt->parent)
-        dump_sra_elt_name (f, elt->parent);
-      if (DECL_P (elt->element))
-	{
-	  if (TREE_CODE (elt->element) == FIELD_DECL)
-	    fputc ('.', f);
-	  print_generic_expr (f, elt->element, dump_flags);
-	}
-      else if (TREE_CODE (elt->element) == BIT_FIELD_REF)
-	fprintf (f, "$B" HOST_WIDE_INT_PRINT_DEC "F" HOST_WIDE_INT_PRINT_DEC,
-		 tree_low_cst (TREE_OPERAND (elt->element, 2), 1),
-		 tree_low_cst (TREE_OPERAND (elt->element, 1), 1));
-      else if (TREE_CODE (elt->element) == RANGE_EXPR)
-	fprintf (f, "["HOST_WIDE_INT_PRINT_DEC".."HOST_WIDE_INT_PRINT_DEC"]",
-		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 0)),
-		 TREE_INT_CST_LOW (TREE_OPERAND (elt->element, 1)));
-      else
-	fprintf (f, "[" HOST_WIDE_INT_PRINT_DEC "]",
-		 TREE_INT_CST_LOW (elt->element));
-    }
-}
+  int ret = 0;
+  sra_initialize ();
 
-/* Likewise, but callable from the debugger.  */
+  if (!find_var_candidates ())
+    goto out;
 
-void
-debug_sra_elt_name (struct sra_elt *elt)
-{
-  dump_sra_elt_name (stderr, elt);
-  fputc ('\n', stderr);
-}
+  if (!scan_function (build_access_from_expr, build_accesses_from_assign, NULL,
+		      true, NULL))
+    goto out;
 
-static void
-sra_init_cache (void)
-{
-  if (sra_type_decomp_cache)
-    return;
+  if (!analyze_all_variable_accesses ())
+    goto out;
 
-  sra_type_decomp_cache = BITMAP_ALLOC (NULL);
-  sra_type_inst_cache = BITMAP_ALLOC (NULL);
-}
+  scan_function (sra_modify_expr, sra_modify_assign, NULL,
+		 false, NULL);
+  initialize_parameter_reductions ();
 
 
-/* Main entry point.  */
+  if (sra_mode == SRA_MODE_EARLY_INTRA)
+    ret = TODO_update_ssa;
+  else
+    ret = TODO_update_ssa | TODO_rebuild_alias;
+ out:
+  sra_deinitialize ();
+  return ret;
+}
 
+/* Perform early intraprocedural SRA.  */
 static unsigned int
-tree_sra (void)
+early_intra_sra (void)
 {
-  /* Initialize local variables.  */
-  gcc_obstack_init (&sra_obstack);
-  sra_candidates = BITMAP_ALLOC (NULL);
-  needs_copy_in = BITMAP_ALLOC (NULL);
-  sra_init_cache ();
-  sra_map = htab_create (101, sra_elt_hash, sra_elt_eq, NULL);
-
-  /* Scan.  If we find anything, instantiate and scalarize.  */
-  if (find_candidates_for_sra ())
-    {
-      scan_function ();
-      decide_instantiations ();
-      scalarize_function ();
-    }
-
-  /* Free allocated memory.  */
-  htab_delete (sra_map);
-  sra_map = NULL;
-  BITMAP_FREE (sra_candidates);
-  BITMAP_FREE (needs_copy_in);
-  BITMAP_FREE (sra_type_decomp_cache);
-  BITMAP_FREE (sra_type_inst_cache);
-  obstack_free (&sra_obstack, NULL);
-  return 0;
+  sra_mode = SRA_MODE_EARLY_INTRA;
+  return perform_intra_sra ();
 }
 
+/* Perform "late" intraprocedural SRA.  */
 static unsigned int
-tree_sra_early (void)
+late_intra_sra (void)
 {
-  unsigned int ret;
-
-  early_sra = true;
-  ret = tree_sra ();
-  early_sra = false;
-
-  return ret;
+  sra_mode = SRA_MODE_INTRA;
+  return perform_intra_sra ();
 }
 
+
 static bool
-gate_sra (void)
+gate_intra_sra (void)
 {
   return flag_tree_sra != 0;
 }
 
+
 struct gimple_opt_pass pass_sra_early =
 {
  {
   GIMPLE_PASS,
-  "esra",				/* name */
-  gate_sra,				/* gate */
-  tree_sra_early,			/* execute */
+  "esra",	 			/* name */
+  gate_intra_sra,			/* gate */
+  early_intra_sra,			/* execute */
   NULL,					/* sub */
   NULL,					/* next */
   0,					/* static_pass_number */
   TV_TREE_SRA,				/* tv_id */
-  PROP_cfg | PROP_ssa,			/* properties_required */
+  PROP_cfg | PROP_ssa,                  /* properties_required */
   0,					/* properties_provided */
-  0,				        /* properties_destroyed */
+  0,					/* properties_destroyed */
   0,					/* todo_flags_start */
   TODO_dump_func
   | TODO_update_ssa
@@ -3679,20 +2458,21 @@ struct gimple_opt_pass pass_sra_early =
  }
 };
 
+
 struct gimple_opt_pass pass_sra =
 {
  {
   GIMPLE_PASS,
-  "sra",				/* name */
-  gate_sra,				/* gate */
-  tree_sra,				/* execute */
+  "sra",	 			/* name */
+  gate_intra_sra,			/* gate */
+  late_intra_sra,			/* execute */
   NULL,					/* sub */
   NULL,					/* next */
   0,					/* static_pass_number */
   TV_TREE_SRA,				/* tv_id */
-  PROP_cfg | PROP_ssa,			/* properties_required */
+  PROP_cfg | PROP_ssa,                  /* properties_required */
   0,					/* properties_provided */
-  0,				        /* properties_destroyed */
+  0,					/* properties_destroyed */
   TODO_update_address_taken,		/* todo_flags_start */
   TODO_dump_func
   | TODO_update_ssa
Index: mine/gcc/Makefile.in
===================================================================
--- mine.orig/gcc/Makefile.in
+++ mine/gcc/Makefile.in
@@ -2732,11 +2732,9 @@ tree-ssa-ccp.o : tree-ssa-ccp.c $(TREE_F
    $(DIAGNOSTIC_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h \
    $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h \
    tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(TOPLEV_H)
-tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(RTL_H) \
-    $(TM_P_H) $(TREE_FLOW_H) $(DIAGNOSTIC_H) $(TREE_INLINE_H) \
-    $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) $(GIMPLE_H) \
-    langhooks.h $(TREE_PASS_H) $(FLAGS_H) $(EXPR_H) $(BASIC_BLOCK_H) \
-    $(BITMAP_H) $(GGC_H) hard-reg-set.h $(OBSTACK_H) $(PARAMS_H) $(TARGET_H)
+tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
+   $(GIMPLE_H) $(TREE_FLOW_H) $(DIAGNOSTIC_H) $(TREE_DUMP_H) \
+   $(TIMEVAR_H) $(PARAMS_H) $(TARGET_H) $(FLAGS_H)
 tree-switch-conversion.o : tree-switch-conversion.c $(CONFIG_H) $(SYSTEM_H) \
     $(TREE_H) $(TM_P_H) $(TREE_FLOW_H) $(DIAGNOSTIC_H) $(TREE_INLINE_H) \
     $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) $(GIMPLE_H) \

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-04-28 10:14 ` [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates Martin Jambor
@ 2009-04-28 10:27   ` Martin Jambor
  2009-04-29 12:56     ` Richard Guenther
  2009-04-29 10:59   ` Richard Guenther
  1 sibling, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-04-28 10:27 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Guenther, Jan Hubicka

On Tue, Apr 28, 2009 at 12:04:32PM +0200, Martin Jambor wrote:
> This  is  the  new  intraprocedural  SRA.  I  have  stripped  off  the
> interprocedural part  and will propose to commit  it separately later.
> I have  tried to  remove almost every  trace of IPA-SRA,  however, two
> provisions for it  have remained in the patch.   First, an enumeration
> (rather than  a boolean) is  used to distinguish between  "early" and
> "late" SRA  so that other  SRA modes can  be added later  on.  Second,
> scan_function()  has a  hook parameter  and a  void  pointer parameter
> which are not used in this patch but will be by IPA-SRA.
> 
> Otherwise, the patch is hopefully self-contained and the basis of its
> operation is described by the initial comment.
> 
> The patch bootstraps (on x86_64-linux-gnu but I am about to try it on
> hppa-linux-gnu too) but produces a small number of testsuite failures
> which are handled by the two following patches.
> 
> Thanks,
> 
> Martin
> 
> 
> 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> 
> 	* tree-sra.c (enum sra_mode): The whole contents of the file was
> 	replaced.

Hm, the  patch is quite unreadable;  below is the  new tree-sra.c file
which entirely replaces the old one (note that the patch also modifies
the Makefile though):



/* Scalar Replacement of Aggregates (SRA) converts some structure
   references into scalar references, exposing them to the scalar
   optimizers.
   Copyright (C) 2008, 2009 Free Software Foundation, Inc.
   Contributed by Martin Jambor <mjambor@suse.cz>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* This file implements Scalar Reduction of Aggregates (SRA).  SRA is run
   twice, once in the early stages of compilation (early SRA) and once in the
   late stages (late SRA).  The aim of both is to turn references to scalar
   parts of aggregates into uses of independent scalar variables.

   The two passes are nearly identical, the only difference is that early SRA
   does not scalarize unions which are used as the result in a GIMPLE_RETURN
   statement because together with inlining this can lead to weird type
   conversions.

   Both passes operate in four stages:

   1. The declarations that have properties which make them candidates for
      scalarization are identified in function find_var_candidates().  The
      candidates are stored in candidate_bitmap.

   2. The function body is scanned.  In the process, declarations which are
      used in a manner that prevents their scalarization are removed from the
      candidate bitmap.  More importantly, for every access into an aggregate,
      an access structure (struct access) is created by create_access() and
      stored in a vector associated with the aggregate.  Among other
      information, the aggregate declaration, the offset and size of the access
      and its type are stored in the structure.

      On a related note, assign_link structures are created for every assign
      statement between candidate aggregates and attached to the related
      accesses.

   3. The vectors of accesses are analyzed.  They are first sorted according to
      their offset and size and then scanned for partially overlapping accesses
      (i.e. those which overlap but one is not entirely within another).  Such
      an access disqualifies the whole aggregate from being scalarized.

      If there is no such inhibiting overlap, a representative access structure
      is chosen for every unique combination of offset and size.  Afterwards,
      the pass builds a set of trees from these structures, in which children
      of an access are within their parent (in terms of offset and size).

      Then accesses  are propagated  whenever possible (i.e.  in cases  when it
      does not create a partially overlapping access) across assign_links from
      the right hand side to the left hand side.

      Then the set of trees for each declaration is traversed again and those
      accesses which should be replaced by a scalar are identified.

   4. The function is traversed again, and for every reference into an
      aggregate that has some component which is about to be scalarized,
      statements are amended and new statements are created as necessary.
      Finally, if a parameter got scalarized, the scalar replacements are
      initialized with values from respective parameter aggregates.
*/
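
/* As a quick illustration of the stages above (an invented example, not
   taken from any testcase, assuming 32-bit ints and floats): for

       struct S { int i; float f; } s;
       s.i = 2; s.f = 1.0f; g (s.i);

   stage 1 finds the candidate s, stage 2 records accesses at bit offset 0
   (size 32) and bit offset 32 (size 32), stage 3 marks both for replacement
   and stage 4 rewrites the three statements to use two new scalar variables
   in place of s.  */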

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "alloc-pool.h"
#include "tm.h"
#include "tree.h"
#include "gimple.h"
#include "tree-flow.h"
#include "diagnostic.h"
#include "tree-dump.h"
#include "timevar.h"
#include "params.h"
#include "target.h"
#include "flags.h"

/* Enumeration of all aggregate reductions we can do.  */
enum sra_mode {SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
	       SRA_MODE_INTRA};	     /* late intraprocedural SRA */

/* Global variable describing which aggregate reduction we are performing at
   the moment.  */
static enum sra_mode sra_mode;

struct assign_link;

/* ACCESS represents each access to an aggregate variable (as a whole or a
   part).  It can also represent a group of accesses that refer to exactly the
   same fragment of an aggregate (i.e. those that have exactly the same offset
   and size).  Such representatives for a single aggregate, once determined,
   are linked in a linked list and have the group fields set.

   Moreover, when doing intraprocedural SRA, a tree is built from those
   representatives (by the means of first_child and next_sibling pointers), in
   which all items in a subtree are "within" the root, i.e. their offset is
   greater or equal to offset of the root and offset+size is smaller or equal
   to offset+size of the root.  Children of an access are sorted by offset.
*/
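
/* An invented illustration of such a tree (assuming 32-bit ints and that the
   whole variable is accessed somewhere, so that a root access exists): for

       struct outer { struct inner { int a; int b; } in; int c; } o;

   the root access would cover bits [0, 96) and have a first_child covering
   [0, 64) for o.in, whose next_sibling covers [64, 96) for o.c; the access
   for o.in would in turn have children for o.in.a and o.in.b.  */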

struct access
{
  /* Values returned by `get_ref_base_and_extent' for each COMPONENT_REF.
     If EXPR isn't a COMPONENT_REF, just set `BASE = EXPR', `OFFSET = 0'
     and `SIZE = TYPE_SIZE (TREE_TYPE (expr))'.  */
  HOST_WIDE_INT offset;
  HOST_WIDE_INT size;
  tree base;

  /* Expression.  */
  tree expr;
  /* Type.  */
  tree type;

  /* Next group representative for this aggregate. */
  struct access *next_grp;

  /* Pointer to the group representative.  Pointer to itself if the struct is
     the representative.  */
  struct access *group_representative;

  /* If this access has any children (in terms of the definition above), this
     points to the first one.  */
  struct access *first_child;

  /* Pointer to the next sibling in the access tree as described above.  */
  struct access *next_sibling;

  /* Pointers to the first and last element in the linked list of assign
     links.  */
  struct assign_link *first_link, *last_link;
  /* Pointer to the next access in the work queue.  */
  struct access *next_queued;

  /* Replacement variable for this access "region."  Never to be accessed
     directly, always only by the means of get_access_replacement() and only
     when grp_to_be_replaced flag is set.  */
  tree replacement_decl;

  /* Is this particular access a write access?  */
  unsigned write : 1;

  /* Is this access currently in the work queue?  */
  unsigned grp_queued : 1;
  /* Does this group contain a write access?  This flag is propagated down the
     access tree.  */
  unsigned grp_write : 1;
  /* Does this group contain a read access?  This flag is propagated down the
     access tree.  */
  unsigned grp_read : 1;
  /* Is the subtree rooted in this access fully covered by scalar
     replacements?  */
  unsigned grp_covered : 1;
  /* If set to true, this access and all below it in an access tree must not be
     scalarized.  */
  unsigned grp_unscalarizable_region : 1;
  /* Whether data have been written to parts of the aggregate covered by this
     access which are not to be scalarized.  This flag is propagated up in the
     access tree.  */
  unsigned grp_unscalarized_data : 1;
  /* Does this access and/or group contain a write access through a
     BIT_FIELD_REF?  */
  unsigned grp_bfr_lhs : 1;

  /* Set when a scalar replacement should be created for this variable.  We do
     the decision and creation at different places because create_tmp_var
     cannot be called from within FOR_EACH_REFERENCED_VAR. */
  unsigned grp_to_be_replaced : 1;
};

typedef struct access *access_p;

DEF_VEC_P (access_p);
DEF_VEC_ALLOC_P (access_p, heap);

/* Alloc pool for allocating access structures.  */
static alloc_pool access_pool;

/* A structure linking lhs and rhs accesses from an aggregate assignment.  They
   are used to propagate subaccesses from rhs to lhs as long as they don't
   conflict with what is already there.  */
struct assign_link
{
  struct access *lacc, *racc;
  struct assign_link *next;
};

/* Alloc pool for allocating assign link structures.  */
static alloc_pool link_pool;

/* Base (tree) -> Vector (VEC(access_p,heap) *) map.  */
static struct pointer_map_t *base_access_vec;

/* Bitmap of bases (candidates).  */
static bitmap candidate_bitmap;
/* Bitmap of declarations used in a return statement.  */
static bitmap retvals_bitmap;
/* Obstack for creation of fancy names.  */
static struct obstack name_obstack;

/* Head of a linked list of accesses that need to have its subaccesses
   propagated to their assignment counterparts. */
static struct access *work_queue_head;

/* Dump contents of ACCESS to file F in a human friendly way.  If GRP is true,
   representative fields are dumped, otherwise those which only describe the
   individual access are.  */

static void
dump_access (FILE *f, struct access *access, bool grp)
{
  fprintf (f, "access { ");
  fprintf (f, "base = (%d)'", DECL_UID (access->base));
  print_generic_expr (f, access->base, 0);
  fprintf (f, "', offset = %d", (int) access->offset);
  fprintf (f, ", size = %d", (int) access->size);
  fprintf (f, ", expr = ");
  print_generic_expr (f, access->expr, 0);
  fprintf (f, ", type = ");
  print_generic_expr (f, access->type, 0);
  if (grp)
    fprintf (f, ", grp_write = %d, grp_read = %d, grp_covered = %d, "
	     "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
	     "grp_to_be_replaced = %d\n",
	     access->grp_write, access->grp_read, access->grp_covered,
	     access->grp_unscalarizable_region, access->grp_unscalarized_data,
	     access->grp_to_be_replaced);
  else
    fprintf (f, ", write = %d'\n", access->write);
}

/* Dump a subtree rooted in ACCESS to file F, indent by LEVEL.  */

static void
dump_access_tree_1 (FILE *f, struct access *access, int level)
{
  do
    {
      int i;

      for (i = 0; i < level; i++)
	fputs ("* ", dump_file);

      dump_access (f, access, true);

      if (access->first_child)
	dump_access_tree_1 (f, access->first_child, level + 1);

      access = access->next_sibling;
    }
  while (access);
}

/* Dump all access trees for a variable, given the pointer to the first root in
   ACCESS.  */

static void
dump_access_tree (FILE *f, struct access *access)
{
  for (; access; access = access->next_grp)
    dump_access_tree_1 (f, access, 0);
}

/* Return a vector of pointers to accesses for the variable given in BASE or
   NULL if there is none.  */

static VEC (access_p, heap) *
get_base_access_vector (tree base)
{
  void **slot;

  slot = pointer_map_contains (base_access_vec, base);
  if (!slot)
    return NULL;
  else
    return *(VEC (access_p, heap) **) slot;
}

/* Find an access with required OFFSET and SIZE in a subtree of accesses rooted
   in ACCESS.  Return NULL if it cannot be found.  */

static struct access *
find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
			HOST_WIDE_INT size)
{
  while (access && (access->offset != offset || access->size != size))
    {
      struct access *child = access->first_child;

      while (child && (child->offset + child->size <= offset))
	child = child->next_sibling;
      access = child;
    }

  return access;
}

/* Return the first group representative for BASE or NULL if none exists.  */

static struct access *
get_first_repr_for_decl (tree base)
{
  VEC (access_p, heap) *access_vec;

  access_vec = get_base_access_vector (base);
  if (!access_vec)
    return NULL;

  return VEC_index (access_p, access_vec, 0);
}

/* Find an access representative for the variable BASE and given OFFSET and
   SIZE.  Requires that access trees have already been built.  Return NULL if
   it cannot be found.  */

static struct access *
get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
				 HOST_WIDE_INT size)
{
  struct access *access;

  access = get_first_repr_for_decl (base);
  while (access && (access->offset + access->size <= offset))
    access = access->next_grp;
  if (!access)
    return NULL;

  return find_access_in_subtree (access, offset, size);
}

/* Add LINK to the linked list of assign links of RACC.  */
static void
add_link_to_rhs (struct access *racc, struct assign_link *link)
{
  gcc_assert (link->racc == racc);

  if (!racc->first_link)
    {
      gcc_assert (!racc->last_link);
      racc->first_link = link;
    }
  else
    racc->last_link->next = link;

  racc->last_link = link;
  link->next = NULL;
}

/* Move all link structures in their linked list in OLD_RACC to the linked list
   in NEW_RACC.  */
static void
relink_to_new_repr (struct access *new_racc, struct access *old_racc)
{
  if (!old_racc->first_link)
    {
      gcc_assert (!old_racc->last_link);
      return;
    }

  if (new_racc->first_link)
    {
      gcc_assert (!new_racc->last_link->next);
      gcc_assert (!old_racc->last_link || !old_racc->last_link->next);

      new_racc->last_link->next = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  else
    {
      gcc_assert (!new_racc->last_link);

      new_racc->first_link = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  old_racc->first_link = old_racc->last_link = NULL;
}

/* Add ACCESS to the work queue (which is actually a stack).  */

static void
add_access_to_work_queue (struct access *access)
{
  if (!access->grp_queued)
    {
      gcc_assert (!access->next_queued);
      access->next_queued = work_queue_head;
      access->grp_queued = 1;
      work_queue_head = access;
    }
}

/* Pop an access from the work queue, and return it, assuming there is one.  */

static struct access *
pop_access_from_work_queue (void)
{
  struct access *access = work_queue_head;

  work_queue_head = access->next_queued;
  access->next_queued = NULL;
  access->grp_queued = 0;
  return access;
}


/* Allocate necessary structures.  */

static void
sra_initialize (void)
{
  candidate_bitmap = BITMAP_ALLOC (NULL);
  retvals_bitmap = BITMAP_ALLOC (NULL);
  gcc_obstack_init (&name_obstack);
  access_pool = create_alloc_pool ("SRA accesses", sizeof (struct access), 16);
  link_pool = create_alloc_pool ("SRA links", sizeof (struct assign_link), 16);
  base_access_vec = pointer_map_create ();
}

/* Hook fed to pointer_map_traverse, deallocate stored vectors.  */

static bool
delete_base_accesses (const void *key ATTRIBUTE_UNUSED, void **value,
		     void *data ATTRIBUTE_UNUSED)
{
  VEC (access_p, heap) *access_vec;
  access_vec = (VEC (access_p, heap) *) *value;
  VEC_free (access_p, heap, access_vec);

  return true;
}

/* Deallocate all general structures.  */

static void
sra_deinitialize (void)
{
  BITMAP_FREE (candidate_bitmap);
  BITMAP_FREE (retvals_bitmap);
  free_alloc_pool (access_pool);
  free_alloc_pool (link_pool);
  obstack_free (&name_obstack, NULL);

  pointer_map_traverse (base_access_vec, delete_base_accesses, NULL);
  pointer_map_destroy (base_access_vec);
}

/* Remove DECL from candidates for SRA and write REASON to the dump file if
   there is one.  */
static void
disqualify_candidate (tree decl, const char *reason)
{
  bitmap_clear_bit (candidate_bitmap, DECL_UID (decl));

  if (dump_file)
    {
      fprintf (dump_file, "! Disqualifying ");
      print_generic_expr (dump_file, decl, 0);
      fprintf (dump_file, " - %s\n", reason);
    }
}

/* Return true iff the type contains a field or an element which does not allow
   scalarization.  */

static bool
type_internals_preclude_sra_p (tree type)
{
  tree fld;
  tree et;

  switch (TREE_CODE (type))
    {
    case RECORD_TYPE:
    case UNION_TYPE:
    case QUAL_UNION_TYPE:
      for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	if (TREE_CODE (fld) == FIELD_DECL)
	  {
	    tree ft = TREE_TYPE (fld);

	    if (TREE_THIS_VOLATILE (fld)
		|| !DECL_FIELD_OFFSET (fld) || !DECL_SIZE (fld)
		|| !host_integerp (DECL_FIELD_OFFSET (fld), 1)
		|| !host_integerp (DECL_SIZE (fld), 1))
	      return true;

	    if (AGGREGATE_TYPE_P (ft)
		&& type_internals_preclude_sra_p (ft))
	      return true;
	  }

      return false;

    case ARRAY_TYPE:
      et = TREE_TYPE (type);

      if (AGGREGATE_TYPE_P (et))
	return type_internals_preclude_sra_p (et);
      else
	return false;

    default:
      return false;
    }
}
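
/* For instance (an invented example), a struct containing a volatile int
   field, or a field whose size or offset is not a compile-time constant,
   would be rejected by the function above, whereas a plain struct of scalar
   fields would pass.  */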

/* Create and insert an access for EXPR.  Return the created access, or NULL
   if it is not possible.  */

static struct access *
create_access (tree expr, bool write)
{
  struct access *access;
  void **slot;
  VEC (access_p,heap) *vec;
  HOST_WIDE_INT offset, size, max_size;
  tree base = expr;
  bool unscalarizable_region = false;

  if (handled_component_p (expr))
    base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
  else
    {
      tree tree_size;

      tree_size = TYPE_SIZE (TREE_TYPE (base));
      if (tree_size && host_integerp (tree_size, 1))
	size = max_size = tree_low_cst (tree_size, 1);
      else
	size = max_size = -1;

      offset = 0;
    }

  if (!base || !DECL_P (base)
      || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  if (size != max_size)
    {
      size = max_size;
      unscalarizable_region = true;
    }

  if (size < 0)
    {
      disqualify_candidate (base, "Encountered an ultra variable sized "
			    "access.");
      return NULL;
    }

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));

  access->base = base;
  access->offset = offset;
  access->size = size;
  access->expr = expr;
  access->type = TREE_TYPE (expr);
  access->write = write;
  access->grp_unscalarizable_region = unscalarizable_region;

  slot = pointer_map_contains (base_access_vec, base);
  if (slot)
    vec = (VEC (access_p, heap) *) *slot;
  else
    vec = VEC_alloc (access_p, heap, 32);

  VEC_safe_push (access_p, heap, vec, access);

  *((struct VEC (access_p,heap) **)
	pointer_map_insert (base_access_vec, base)) = vec;

  return access;
}
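
/* As an invented example, given a candidate "struct S { int a; int b; } s"
   with 32-bit ints, create_access for the expression s.b would record
   base = s, offset = 32, size = 32 and type = int, while an access to the
   whole of s would get offset = 0 and size = 64.  */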


/* Callback of walk_tree.  Search the given tree for a declaration and exclude
   it from the candidates.  */

static tree
disqualify_all (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
{
  tree base = *tp;

  if (TREE_CODE (base) == SSA_NAME)
    base = SSA_NAME_VAR (base);

  if (DECL_P (base))
    {
      disqualify_candidate (base, "From within disqualify_all().");
      *walk_subtrees = 0;
    }
  else
    *walk_subtrees = 1;

  return NULL_TREE;
}

/* Scan expression EXPR and create access structures for all accesses to
   candidates for scalarization.  Return the created access or NULL if none is
   created.  */

static struct access *
build_access_from_expr_1 (tree *expr_ptr,
			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write)
{
  struct access *ret = NULL;
  tree expr = *expr_ptr;
  tree safe_expr = expr;
  bool bit_ref;

  if (TREE_CODE (expr) == BIT_FIELD_REF)
    {
      expr = TREE_OPERAND (expr, 0);
      bit_ref = true;
    }
  else
    bit_ref = false;

  while (TREE_CODE (expr) == NOP_EXPR
	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
	 || TREE_CODE (expr) == REALPART_EXPR
	 || TREE_CODE (expr) == IMAGPART_EXPR)
    expr = TREE_OPERAND (expr, 0);

  switch (TREE_CODE (expr))
    {
    case ADDR_EXPR:
    case SSA_NAME:
    case INDIRECT_REF:
      break;

    case VAR_DECL:
    case PARM_DECL:
    case RESULT_DECL:
    case COMPONENT_REF:
    case ARRAY_REF:
      ret = create_access (expr, write);
      break;

    case REALPART_EXPR:
    case IMAGPART_EXPR:
      expr = TREE_OPERAND (expr, 0);
      ret = create_access (expr, write);
      break;

    case ARRAY_RANGE_REF:
    default:
      walk_tree (&safe_expr, disqualify_all, NULL, NULL);
      break;
    }

  if (write && bit_ref && ret)
    ret->grp_bfr_lhs = 1;

  return ret;
}

/* Scan expression EXPR and create access structures for all accesses to
   candidates for scalarization.  Return true if any access has been
   inserted.  */

static bool
build_access_from_expr (tree *expr_ptr,
			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write,
			void *data ATTRIBUTE_UNUSED)
{
  return build_access_from_expr_1 (expr_ptr, gsi, write) != NULL;
}

/* Disqualify LHS and RHS for scalarization if STMT must end its basic block in
   modes in which it matters; return true iff they have been disqualified.  RHS
   may be NULL, in which case it is ignored.  If we scalarize an aggregate in
   intra-SRA we may need to add statements after each statement.  This is not
   possible if a statement unconditionally has to end the basic block.  */
static bool
disqualify_ops_if_throwing_stmt (gimple stmt, tree *lhs, tree *rhs)
{
  if (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))
    {
      walk_tree (lhs, disqualify_all, NULL, NULL);
      if (rhs)
	walk_tree (rhs, disqualify_all, NULL, NULL);
      return true;
    }
  return false;
}


/* Result code for scan_assign callback for scan_function.  */
enum scan_assign_result {SRA_SA_NONE,       /* nothing done for the stmt */
			 SRA_SA_PROCESSED,  /* stmt analyzed/changed */
			 SRA_SA_REMOVED};   /* stmt redundant and eliminated */


/* Scan expressions occurring in the statement pointed to by STMT_PTR, create
   access structures for all accesses to candidates for scalarization and
   remove those candidates which occur in statements or expressions that
   prevent them from being split apart.  Return SRA_SA_PROCESSED if any access
   has been created, SRA_SA_NONE otherwise.  */

static enum scan_assign_result
build_accesses_from_assign (gimple *stmt_ptr,
			    gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
			    void *data ATTRIBUTE_UNUSED)
{
  gimple stmt = *stmt_ptr;
  tree *lhs_ptr, *rhs_ptr;
  struct access *lacc, *racc;

  if (gimple_assign_rhs2 (stmt))
    return SRA_SA_NONE;

  lhs_ptr = gimple_assign_lhs_ptr (stmt);
  rhs_ptr = gimple_assign_rhs1_ptr (stmt);

  if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
    return SRA_SA_NONE;

  racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
  lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);

  if (lacc && racc
      && !lacc->grp_unscalarizable_region
      && !racc->grp_unscalarizable_region
      && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
      && lacc->size <= racc->size
      && useless_type_conversion_p (lacc->type, racc->type))
    {
      struct assign_link *link;

      link = (struct assign_link *) pool_alloc (link_pool);
      memset (link, 0, sizeof (struct assign_link));

      link->lacc = lacc;
      link->racc = racc;

      add_link_to_rhs (racc, link);
    }

  return (lacc || racc) ? SRA_SA_PROCESSED : SRA_SA_NONE;
}
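
/* For example (invented), for an assignment "dst = src;" between two
   candidate aggregates of the same type, the link created above allows the
   analysis stage to propagate sub-accesses of src to dst, so that a read of
   src.f elsewhere can give rise to a corresponding sub-access, and therefore
   potentially a scalar replacement, of dst.f as well.  */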

/* Scan the function and look for interesting statements.  Return true if any
   has been found or processed, as indicated by the callbacks.  SCAN_EXPR is a
   callback called on all expressions within statements except assign
   statements and those deemed entirely unsuitable for some reason (all
   operands in such statements and expressions are removed from
   candidate_bitmap).  SCAN_ASSIGN is a callback called on all assign
   statements.  HANDLE_SSA_DEFS is a callback called on assign statements and
   those call statements which have a lhs; it is the only callback which may
   be NULL.  ANALYSIS_STAGE is true when running in the analysis stage of a
   pass and thus no statement is being modified.  DATA is a pointer passed to
   all callbacks.  If any single callback returns true, this function also
   returns true, otherwise it returns false.  */

static bool
scan_function (bool (*scan_expr) (tree *, gimple_stmt_iterator *, bool, void *),
	       enum scan_assign_result (*scan_assign) (gimple *,
						       gimple_stmt_iterator *,
						       void *),
	       bool (*handle_ssa_defs)(gimple, void *),
	       bool analysis_stage, void *data)
{
  gimple_stmt_iterator gsi;
  basic_block bb;
  unsigned i;
  tree *t;
  bool ret = false;

  FOR_EACH_BB (bb)
    {
      bool bb_changed = false;

      gsi = gsi_start_bb (bb);
      while (!gsi_end_p (gsi))
	{
	  gimple stmt = gsi_stmt (gsi);
	  enum scan_assign_result assign_result;
	  bool any = false, deleted = false;

	  switch (gimple_code (stmt))
	    {
	    case GIMPLE_RETURN:
	      t = gimple_return_retval_ptr (stmt);
	      if (*t != NULL_TREE)
		{
		  if (DECL_P (*t))
		    {
		      tree ret_type = TREE_TYPE (*t);
		      if (sra_mode == SRA_MODE_EARLY_INTRA
			  && (TREE_CODE (ret_type) == UNION_TYPE
			      || TREE_CODE (ret_type) == QUAL_UNION_TYPE))
			disqualify_candidate (*t,
					      "Union in a return statement.");
		      else
			bitmap_set_bit (retvals_bitmap, DECL_UID (*t));
		    }
		  any |= scan_expr (t, &gsi, false, data);
		}
	      break;

	    case GIMPLE_ASSIGN:
	      assign_result = scan_assign (&stmt, &gsi, data);
	      any |= assign_result == SRA_SA_PROCESSED;
	      deleted = assign_result == SRA_SA_REMOVED;
	      if (handle_ssa_defs && assign_result != SRA_SA_REMOVED)
		any |= handle_ssa_defs (stmt, data);
	      break;

	    case GIMPLE_CALL:
	      /* Operands must be processed before the lhs.  */
	      for (i = 0; i < gimple_call_num_args (stmt); i++)
		{
		  tree *argp = gimple_call_arg_ptr (stmt, i);
		  any |= scan_expr (argp, &gsi, false, data);
		}

	      if (gimple_call_lhs (stmt))
		{
		  tree *lhs_ptr = gimple_call_lhs_ptr (stmt);
		  if (!analysis_stage ||
		      !disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, NULL))
		    {
		      any |= scan_expr (lhs_ptr, &gsi, true, data);
		      if (handle_ssa_defs)
			any |= handle_ssa_defs (stmt, data);
		    }
		}
	      break;

	    case GIMPLE_ASM:
	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_input_op (stmt, i));
		  any |= scan_expr (op, &gsi, false, data);
		}
	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_output_op (stmt, i));
		  any |= scan_expr (op, &gsi, true, data);
		}
	      break;

	    default:
	      if (analysis_stage)
		walk_gimple_op (stmt, disqualify_all, NULL);
	      break;
	    }

	  if (any)
	    {
	      ret = true;
	      bb_changed = true;

	      if (!analysis_stage)
		{
		  update_stmt (stmt);
		  if (!stmt_could_throw_p (stmt))
		    remove_stmt_from_eh_region (stmt);
		}
	    }
	  if (deleted)
	    bb_changed = true;
	  else
	    gsi_next (&gsi);
	}
      if (!analysis_stage && bb_changed)
	gimple_purge_dead_eh_edges (bb);
    }

  return ret;
}

/* A qsort comparison function for an array of pointers to accesses.  An
   access is considered smaller than another if it has a smaller offset or if
   the offsets are the same but its size is bigger. */

static int
compare_access_positions (const void *a, const void *b)
{
  const access_p *fp1 = (const access_p *) a;
  const access_p *fp2 = (const access_p *) b;
  const access_p f1 = *fp1;
  const access_p f2 = *fp2;

  if (f1->offset != f2->offset)
    return f1->offset < f2->offset ? -1 : 1;

  if (f1->size == f2->size)
    return 0;
  /* We want the bigger accesses first, thus the opposite operator in the next
     line: */
  return f1->size > f2->size ? -1 : 1;
}


/* Append the name of declaration DECL to the name obstack.  A helper function
   for make_fancy_name.  */

static void
make_fancy_decl_name (tree decl)
{
  char buffer[32];

  tree name = DECL_NAME (decl);
  if (name)
    obstack_grow (&name_obstack, IDENTIFIER_POINTER (name),
		  IDENTIFIER_LENGTH (name));
  else
    {
      sprintf (buffer, "D%u", DECL_UID (decl));
      obstack_grow (&name_obstack, buffer, strlen (buffer));
    }
}

/* Helper for make_fancy_name.  */

static void
make_fancy_name_1 (tree expr)
{
  char buffer[32];
  tree index;

  if (DECL_P (expr))
    {
      make_fancy_decl_name (expr);
      return;
    }

  switch (TREE_CODE (expr))
    {
    case COMPONENT_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      make_fancy_decl_name (TREE_OPERAND (expr, 1));
      break;

    case ARRAY_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      /* Arrays with only one element may not have a constant as their
	 index. */
      index = TREE_OPERAND (expr, 1);
      if (TREE_CODE (index) != INTEGER_CST)
	break;
      sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (index));
      obstack_grow (&name_obstack, buffer, strlen (buffer));

      break;

    case BIT_FIELD_REF:
    case REALPART_EXPR:
    case IMAGPART_EXPR:
      gcc_unreachable (); 	/* we treat these as scalars.  */
      break;
    default:
      break;
    }
}

/* Create a human readable name for a replacement variable from the access
   expression EXPR.  */

static char *
make_fancy_name (tree expr)
{
  make_fancy_name_1 (expr);
  obstack_1grow (&name_obstack, '\0');
  return XOBFINISH (&name_obstack, char *);
}

/* Helper function for build_ref_for_offset.  */

static bool
build_ref_for_offset_1 (tree *res, tree type, HOST_WIDE_INT offset,
			tree exp_type)
{
  while (1)
    {
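      /* Each iteration peels off one level of TYPE; we are done as soon as a
	 type compatible with EXP_TYPE sits at offset zero.  */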
      tree fld;
      tree tr_size, index;
      HOST_WIDE_INT el_size;

      if (offset == 0 && exp_type
	  && useless_type_conversion_p (exp_type, type))
	return true;

      switch (TREE_CODE (type))
	{
	case UNION_TYPE:
	case QUAL_UNION_TYPE:
	case RECORD_TYPE:
	  /* Some Ada records are half-unions, treat all of them the same.  */
	  for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	    {
	      HOST_WIDE_INT pos, size;
	      tree expr, *expr_ptr;

	      if (TREE_CODE (fld) != FIELD_DECL)
		continue;

	      pos = int_bit_position (fld);
	      gcc_assert (TREE_CODE (type) == RECORD_TYPE || pos == 0);
	      size = tree_low_cst (DECL_SIZE (fld), 1);
	      if (pos > offset || (pos + size) <= offset)
		continue;

	      if (res)
		{
		  expr = build3 (COMPONENT_REF, TREE_TYPE (fld), *res, fld,
				 NULL_TREE);
		  expr_ptr = &expr;
		}
	      else
		expr_ptr = NULL;
	      if (build_ref_for_offset_1 (expr_ptr, TREE_TYPE (fld),
					  offset - pos, exp_type))
		{
		  if (res)
		    *res = expr;
		  return true;
		}
	    }
	  return false;

	case ARRAY_TYPE:
	  tr_size = TYPE_SIZE (TREE_TYPE (type));
	  if (!tr_size || !host_integerp (tr_size, 1))
	    return false;
	  el_size = tree_low_cst (tr_size, 1);

	  index = build_int_cst (TYPE_DOMAIN (type), offset / el_size);
	  if (!integer_zerop (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
	    index = int_const_binop (PLUS_EXPR, index,
				     TYPE_MIN_VALUE (TYPE_DOMAIN (type)), 0);
	  if (res)
	    *res = build4 (ARRAY_REF, TREE_TYPE (type), *res, index, NULL_TREE,
			   NULL_TREE);
	  offset = offset % el_size;
	  type = TREE_TYPE (type);
	  break;

	default:
	  if (offset != 0)
	    return false;

	  if (exp_type)
	    return false;
	  else
	    return true;
	}
    }
}

/* Construct an expression that would reference a part of aggregate *EXPR of
   type TYPE at the given OFFSET of the type EXP_TYPE.  If EXPR is NULL, the
   function only determines whether it can build such a reference without
   actually doing it.

   FIXME: Eventually this should be replaced with
   maybe_fold_offset_to_reference() from tree-ssa-ccp.c but that requires a
   minor rewrite of fold_stmt.
 */

static bool
build_ref_for_offset (tree *expr, tree type, HOST_WIDE_INT offset,
		      tree exp_type, bool allow_ptr)
{
  if (allow_ptr && POINTER_TYPE_P (type))
    {
      type = TREE_TYPE (type);
      if (expr)
	*expr = fold_build1 (INDIRECT_REF, type, *expr);
    }

  return build_ref_for_offset_1 (expr, type, offset, exp_type);
}

/* The very first phase of intraprocedural SRA.  It marks in candidate_bitmap
   those with type which is suitable for scalarization.  */

static bool
find_var_candidates (void)
{
  tree var, type;
  referenced_var_iterator rvi;
  bool ret = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    {
      if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
        continue;
      type = TREE_TYPE (var);

      if (!AGGREGATE_TYPE_P (type)
	  || needs_to_live_in_memory (var)
	  || TREE_THIS_VOLATILE (var)
	  || !COMPLETE_TYPE_P (type)
	  || !host_integerp (TYPE_SIZE (type), 1)
          || tree_low_cst (TYPE_SIZE (type), 1) == 0
	  || type_internals_preclude_sra_p (type))
	continue;

      bitmap_set_bit (candidate_bitmap, DECL_UID (var));

      if (dump_file)
	{
	  fprintf (dump_file, "Candidate (%d): ", DECL_UID (var));
	  print_generic_expr (dump_file, var, 0);
	  fprintf (dump_file, "\n");
	}
      ret = true;
    }

  return ret;
}

/* Return true if TYPE should be considered a scalar type by SRA.  */

static bool
is_sra_scalar_type (tree type)
{
  enum tree_code code = TREE_CODE (type);
  return (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type)
	  || FIXED_POINT_TYPE_P (type) || POINTER_TYPE_P (type)
	  || code == VECTOR_TYPE || code == COMPLEX_TYPE
	  || code == OFFSET_TYPE);
}


/* Sort all accesses for the given variable, check for partial overlaps and
   return NULL if there are any.  If there are none, pick a representative for
   each combination of offset and size and create a linked list out of them.
   Return the pointer to the first representative and make sure it is the first
   one in the vector of accesses.  */

static struct access *
sort_and_splice_var_accesses (tree var)
{
  int i, j, access_count;
  struct access *res, **prev_acc_ptr = &res;
  VEC (access_p, heap) *access_vec;
  bool first = true;
  HOST_WIDE_INT low = -1, high = 0;

  access_vec = get_base_access_vector (var);
  if (!access_vec)
    return NULL;
  access_count = VEC_length (access_p, access_vec);

  /* Sort by <OFFSET, SIZE>.  */
  qsort (VEC_address (access_p, access_vec), access_count, sizeof (access_p),
	 compare_access_positions);

  i = 0;
  while (i < access_count)
    {
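      /* ACCESS opens a new group.  Because the vector is sorted, an access
	 starting at or after HIGH is disjoint from the previous group whereas
	 one that starts inside <LOW, HIGH> but ends past HIGH is a partial
	 overlap which makes the whole variable unscalarizable.  */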
      struct access *access = VEC_index (access_p, access_vec, i);
      bool modification = access->write;
      bool grp_read = !access->write;
      bool grp_bfr_lhs = access->grp_bfr_lhs;
      bool first_scalar = is_sra_scalar_type (access->type);
      bool unscalarizable_region = access->grp_unscalarizable_region;

      if (first || access->offset >= high)
	{
	  first = false;
	  low = access->offset;
	  high = access->offset + access->size;
	}
      else if (access->offset > low && access->offset + access->size > high)
	return NULL;
      else
	gcc_assert (access->offset >= low
		    && access->offset + access->size <= high);

      j = i + 1;
      while (j < access_count)
	{
	  struct access *ac2 = VEC_index (access_p, access_vec, j);
	  if (ac2->offset != access->offset || ac2->size != access->size)
	    break;
	  modification |= ac2->write;
	  grp_read |= !ac2->write;
	  grp_bfr_lhs |= ac2->grp_bfr_lhs;
	  unscalarizable_region |= ac2->grp_unscalarizable_region;
	  relink_to_new_repr (access, ac2);

	  /* If one of the equivalent accesses is scalar, use it as a
	     representative (this happens for example when there is only a
	     single scalar field in a structure).  */
	  if (!first_scalar && is_sra_scalar_type (ac2->type))
	    {
	      struct access tmp_acc;
	      first_scalar = true;

	      memcpy (&tmp_acc, ac2, sizeof (struct access));
	      memcpy (ac2, access,  sizeof (struct access));
	      memcpy (access, &tmp_acc, sizeof (struct access));
	    }
	  ac2->group_representative = access;
	  j++;
	}

      i = j;

      access->group_representative = access;
      access->grp_write = modification;
      access->grp_read = grp_read;
      access->grp_bfr_lhs = grp_bfr_lhs;
      access->grp_unscalarizable_region = unscalarizable_region;
      if (access->first_link)
	add_access_to_work_queue (access);

      *prev_acc_ptr = access;
      prev_acc_ptr = &access->next_grp;
    }

  gcc_assert (res == VEC_index (access_p, access_vec, 0));
  return res;
}

/* Create a replacement variable for the given ACCESS, deriving its type, name
   and a few other properties from it.  Return the new declaration.  */

static tree
create_access_replacement (struct access *access)
{
  tree repl;

  repl = make_rename_temp (access->type, "SR");
  get_var_ann (repl);
  add_referenced_var (repl);

  DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
  DECL_ARTIFICIAL (repl) = 1;

  if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
    {
      char *pretty_name = make_fancy_name (access->expr);

      DECL_NAME (repl) = get_identifier (pretty_name);
      obstack_free (&name_obstack, pretty_name);

      SET_DECL_DEBUG_EXPR (repl, access->expr);
      DECL_DEBUG_EXPR_IS_FROM (repl) = 1;
      DECL_IGNORED_P (repl) = 0;
      TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);
    }
  else
    {
      DECL_IGNORED_P (repl) = 1;
      TREE_NO_WARNING (repl) = 1;
    }

  if (access->grp_bfr_lhs)
    DECL_GIMPLE_REG_P (repl) = 0;

  if (dump_file)
    {
      fprintf (dump_file, "Created a replacement for ");
      print_generic_expr (dump_file, access->base, 0);
      fprintf (dump_file, " offset: %u, size: %u: ",
	       (unsigned) access->offset, (unsigned) access->size);
      print_generic_expr (dump_file, repl, 0);
      fprintf (dump_file, "\n");
    }

  return repl;
}

/* Return the scalar replacement of ACCESS, creating it if it does not exist
   yet.  */

static inline tree
get_access_replacement (struct access *access)
{
  gcc_assert (access->grp_to_be_replaced);

  if (access->replacement_decl)
    return access->replacement_decl;

  access->replacement_decl = create_access_replacement (access);
  return access->replacement_decl;
}

/* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
   linked list along the way.  Stop when *ACCESS is NULL or the access it
   points to is not "within" the root.  */

static void
build_access_subtree (struct access **access)
{
  struct access *root = *access, *last_child = NULL;
  HOST_WIDE_INT limit = root->offset + root->size;

  *access = (*access)->next_grp;
  while  (*access && (*access)->offset + (*access)->size <= limit)
    {
      if (!last_child)
	root->first_child = *access;
      else
	last_child->next_sibling = *access;
      last_child = *access;

      build_access_subtree (access);
    }
}

/* Build a tree of access representatives, ACCESS is the pointer to the first
   one, others are linked in a list by the next_grp field.  Decisions about
   scalar replacements are made later in analyze_access_trees.  */

static void
build_access_trees (struct access *access)
{
  while (access)
    {
      struct access *root = access;

      build_access_subtree (&access);
      root->next_grp = access;
    }
}

/* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
   that seems beneficial and ALLOW_REPLACEMENTS allows it.  Also set all sorts
   of access flags appropriately along the way, notably always set grp_read
   when MARK_READ is true and grp_write when MARK_WRITE is true.  */

static bool
analyze_access_subtree (struct access *root, bool allow_replacements,
			bool mark_read, bool mark_write)
{
  struct access *child;
  HOST_WIDE_INT limit = root->offset + root->size;
  HOST_WIDE_INT covered_to = root->offset;
  bool scalar = is_sra_scalar_type (root->type);
  bool hole = false, sth_created = false;

  if (mark_read)
    root->grp_read = true;
  else if (root->grp_read)
    mark_read = true;

  if (mark_write)
    root->grp_write = true;
  else if (root->grp_write)
    mark_write = true;

  if (root->grp_unscalarizable_region)
    allow_replacements = false;

  for (child = root->first_child; child; child = child->next_sibling)
    {
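      /* Children are sorted by offset.  An overlapping child means the region
	 cannot be covered cleanly; gaps between children are caught by the
	 comparison of COVERED_TO with LIMIT below.  */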
      if (!hole && child->offset < covered_to)
	hole = true;
      else
	covered_to += child->size;

      sth_created |= analyze_access_subtree (child,
					     allow_replacements && !scalar,
					     mark_read, mark_write);

      root->grp_unscalarized_data |= child->grp_unscalarized_data;
      hole |= !child->grp_covered;
    }

  if (allow_replacements && scalar && !root->first_child)
    {
      if (dump_file)
	{
	  fprintf (dump_file, "Marking ");
	  print_generic_expr (dump_file, root->base, 0);
	  fprintf (dump_file, " offset: %u, size: %u: ",
		   (unsigned) root->offset, (unsigned) root->size);
	  fprintf (dump_file, " to be replaced.\n");
	}

      root->grp_to_be_replaced = 1;
      sth_created = true;
      hole = false;
    }
  else if (covered_to < limit)
    hole = true;

  if (sth_created && !hole)
    {
      root->grp_covered = 1;
      return true;
    }
  if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
    root->grp_unscalarized_data = 1; /* Not covered and written to.  */
  if (sth_created)
    return true;
  return false;
}

/* Analyze all access trees linked by next_grp by the means of
   analyze_access_subtree.  */
static bool
analyze_access_trees (struct access *access)
{
  bool ret = false;

  while (access)
    {
      if (analyze_access_subtree (access, true, false, false))
	ret = true;
      access = access->next_grp;
    }

  return ret;
}

/* Return true iff a potential new child of LACC at offset OFFSET and with size
   SIZE would conflict with an already existing one.  If exactly such a child
   already exists in LACC, store a pointer to it in EXACT_MATCH.  */

static bool
child_would_conflict_in_lacc (struct access *lacc, HOST_WIDE_INT norm_offset,
			      HOST_WIDE_INT size, struct access **exact_match)
{
  struct access *child;

  for (child = lacc->first_child; child; child = child->next_sibling)
    {
      if (child->offset == norm_offset && child->size == size)
	{
	  *exact_match = child;
	  return true;
	}

      if (child->offset < norm_offset + size
	  && child->offset + child->size > norm_offset)
	return true;
    }

  return false;
}

/* Set the expr of TARGET to one just like MODEL's but with TARGET's own base
   at the bottom of the handled components.  */

static void
duplicate_expr_for_different_base (struct access *target,
				   struct access *model)
{
  tree t, expr = unshare_expr (model->expr);

  gcc_assert (handled_component_p (expr));
  t = expr;
  while (handled_component_p (TREE_OPERAND (t, 0)))
    t = TREE_OPERAND (t, 0);
  gcc_assert (TREE_OPERAND (t, 0) == model->base);
  TREE_OPERAND (t, 0) = target->base;

  target->expr = expr;
}


/* Create a new child access of PARENT, with all properties just like MODEL
   except for its offset and with its grp_write true and grp_read false.
   Return the new access.  Note that this access is created long after all
   splicing and sorting, it's not located in any access vector and is
   automatically a representative of its group.  */

static struct access *
create_artificial_child_access (struct access *parent, struct access *model,
				HOST_WIDE_INT new_offset)
{
  struct access *access;
  struct access **child;

  gcc_assert (!model->grp_unscalarizable_region);

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));
  access->base = parent->base;
  access->offset = new_offset;
  access->size = model->size;
  duplicate_expr_for_different_base (access, model);
  access->type = model->type;
  access->grp_write = true;
  access->grp_read = false;

  child = &parent->first_child;
  while (*child && (*child)->offset < new_offset)
    child = &(*child)->next_sibling;

  access->next_sibling = *child;
  *child = access;

  return access;
}


/* Propagate all subaccesses of RACC across an assignment link to LACC. Return
   true if any new subaccess was created.  Additionally, if RACC is a scalar
   access but LACC is not, change the type of the latter.  */

static bool
propagate_subaccesses_across_link (struct access *lacc, struct access *racc)
{
  struct access *rchild;
  HOST_WIDE_INT norm_delta = lacc->offset - racc->offset;
  bool ret = false;

  if (is_sra_scalar_type (lacc->type)
      || lacc->grp_unscalarizable_region
      || racc->grp_unscalarizable_region)
    return false;

  if (!lacc->first_child && !racc->first_child
      && is_sra_scalar_type (racc->type)
      && (sra_mode == SRA_MODE_INTRA
          || !bitmap_bit_p (retvals_bitmap, DECL_UID (lacc->base))))
    {
      duplicate_expr_for_different_base (lacc, racc);
      lacc->type = racc->type;
      return false;
    }

  gcc_assert (lacc->size <= racc->size);

  for (rchild = racc->first_child; rchild; rchild = rchild->next_sibling)
    {
      struct access *new_acc = NULL;
      HOST_WIDE_INT norm_offset = rchild->offset + norm_delta;

      if (rchild->grp_unscalarizable_region)
	continue;

      if (child_would_conflict_in_lacc (lacc, norm_offset, rchild->size,
					&new_acc))
	{
	  if (new_acc && rchild->first_child)
	    ret |= propagate_subaccesses_across_link (new_acc, rchild);
	  continue;
	}

      new_acc = create_artificial_child_access (lacc, rchild, norm_offset);
      if (rchild->first_child)
	propagate_subaccesses_across_link (new_acc, rchild);

      ret = true;
    }

  return ret;
}

/* Propagate all subaccesses across assignment links.  */

static void
propagate_all_subaccesses (void)
{
  while (work_queue_head)
    {
      struct access *racc = pop_access_from_work_queue ();
      struct assign_link *link;

      gcc_assert (racc->first_link);

      for (link = racc->first_link; link; link = link->next)
	{
	  struct access *lacc = link->lacc;

	  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
	    continue;
	  lacc = lacc->group_representative;
	  if (propagate_subaccesses_across_link (lacc, racc)
	      && lacc->first_link)
	    add_access_to_work_queue (lacc);
	}
    }
}

/* Go through all accesses collected throughout the (intraprocedural) analysis
   stage, exclude overlapping ones, identify representatives and build trees
   out of them, making decisions about scalarization on the way.  Return true
   iff there are any to-be-scalarized variables after this stage. */

static bool
analyze_all_variable_accesses (void)
{
  tree var;
  referenced_var_iterator rvi;
  bool res = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access;

	access = sort_and_splice_var_accesses (var);
	if (access)
	  build_access_trees (access);
	else
	  disqualify_candidate (var,
				"No or inhibitingly overlapping accesses.");
      }

  propagate_all_subaccesses ();

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access = get_first_repr_for_decl (var);

	if (analyze_access_trees (access))
	  {
	    res = true;
	    if (dump_file)
	      {
		fprintf (dump_file, "\nAccess trees for ");
		print_generic_expr (dump_file, var, 0);
		fprintf (dump_file, " (UID: %u): \n", DECL_UID (var));
		dump_access_tree (dump_file, access);
		fprintf (dump_file, "\n");
	      }
	  }
	else
	  disqualify_candidate (var, "No scalar replacements to be created.");
      }

  return res;
}

/* Return true iff a reference expression into aggregate AGG can be built for
   every single to-be-replaced access that is a child of ACCESS, its sibling
   or a child of its sibling.  TOP_OFFSET is the offset of the processed
   access subtree which has to be subtracted from the offset of each access.  */

static bool
ref_expr_for_all_replacements_p (struct access *access, tree agg,
				 HOST_WIDE_INT top_offset)
{
  do
    {
      if (access->grp_to_be_replaced
	  && !build_ref_for_offset (NULL, TREE_TYPE (agg),
				    access->offset - top_offset,
				    access->type, false))
	return false;

      if (access->first_child
	  && !ref_expr_for_all_replacements_p (access->first_child, agg,
					       top_offset))
	return false;

      access = access->next_sibling;
    }
  while (access);

  return true;
}


/* Generate statements copying scalar replacements of accesses within a subtree
   into or out of AGG.  ACCESS is the first child of the root of the subtree to
   be processed.  AGG is an aggregate type expression (can be a declaration but
   does not have to be, it can for example also be an indirect_ref).
   TOP_OFFSET is the offset of the processed subtree which has to be subtracted
   from offsets of individual accesses to get corresponding offsets for AGG.
   If CHUNK_SIZE is non-zero, copy only replacements in the interval
   <start_offset, start_offset + chunk_size>, otherwise copy all.  GSI is a
   statement iterator used to place the new statements.  WRITE should be true
   when the statements should write from AGG to the replacement and false if
   vice versa.  If INSERT_AFTER is true, new statements will be added after the
   current statement in GSI, they will be added before the statement
   otherwise.  */

static void
generate_subtree_copies (struct access *access, tree agg,
			 HOST_WIDE_INT top_offset,
			 HOST_WIDE_INT start_offset, HOST_WIDE_INT chunk_size,
			 gimple_stmt_iterator *gsi, bool write,
			 bool insert_after)
{
  do
    {
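      /* Sibling accesses are sorted by offset, so once we get past the
	 requested chunk there is nothing more to copy.  */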
      tree expr = unshare_expr (agg);

      if (chunk_size && access->offset >= start_offset + chunk_size)
	return;

      if (access->grp_to_be_replaced
	  && (chunk_size == 0
	      || access->offset + access->size > start_offset))
	{
	  bool repl_found;
	  gimple stmt;

	  repl_found = build_ref_for_offset (&expr, TREE_TYPE (agg),
					     access->offset - top_offset,
					     access->type, false);
	  gcc_assert (repl_found);

	  if (write)
	    stmt = gimple_build_assign (get_access_replacement (access), expr);
	  else
	    {
	      tree repl = get_access_replacement (access);
	      TREE_NO_WARNING (repl) = 1;
	      stmt = gimple_build_assign (expr, repl);
	    }

	  if (insert_after)
	    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  else
	    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}

      if (access->first_child)
	generate_subtree_copies (access->first_child, agg, top_offset,
				 start_offset, chunk_size, gsi,
				 write, insert_after);

      access = access->next_sibling;
    }
  while (access);
}

/* Assign zero to all scalar replacements in an access subtree.  ACCESS is the
   root of the subtree to be processed.  GSI is the statement iterator used
   for inserting statements which are added after the current statement if
   INSERT_AFTER is true or before it otherwise.  */

static void
init_subtree_with_zero (struct access *access, gimple_stmt_iterator *gsi,
			bool insert_after)

{
  struct access *child;

  if (access->grp_to_be_replaced)
    {
      gimple stmt;

      stmt = gimple_build_assign (get_access_replacement (access),
				  fold_convert (access->type,
						integer_zero_node));
      if (insert_after)
	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
      else
	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
      update_stmt (stmt);
    }

  for (child = access->first_child; child; child = child->next_sibling)
    init_subtree_with_zero (child, gsi, insert_after);
}

/* Search for an access representative for the given expression EXPR and
   return it or NULL if it cannot be found.  */

static struct access *
get_access_for_expr (tree expr)
{
  HOST_WIDE_INT offset, size, max_size;
  tree base;

  if (TREE_CODE (expr) == NOP_EXPR
      || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    expr = TREE_OPERAND (expr, 0);

  if (handled_component_p (expr))
    {
      base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
      size = max_size;
      if (size == -1 || !base || !DECL_P (base))
	return NULL;
    }
  else if (DECL_P (expr))
    {
      tree tree_size;

      base = expr;
      tree_size = TYPE_SIZE (TREE_TYPE (base));
      if (tree_size && host_integerp (tree_size, 1))
	size = max_size = tree_low_cst (tree_size, 1);
      else
	return NULL;

      offset = 0;
    }
  else
    return NULL;

  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  return get_var_base_offset_size_access (base, offset, size);
}

/* Substitute into *EXPR an expression of type TYPE with the value of the
   replacement of ACCESS.  This is done either by producing a special V_C_E
   assignment statement converting the replacement to a new temporary of the
   requested type if TYPE is not TREE_ADDRESSABLE or by going through the base
   aggregate if it is.  */

static void
fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
				 gimple_stmt_iterator *gsi, bool write)
{
  tree repl = get_access_replacement (access);
  if (!TREE_ADDRESSABLE (type))
    {
      tree tmp = create_tmp_var (type, "SRvce");

      add_referenced_var (tmp);
      if (is_gimple_reg_type (type))
	tmp = make_ssa_name (tmp, NULL);

      if (write)
	{
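	  /* Let the original statement define TMP and convert TMP into the
	     replacement in a follow-up assignment.  */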
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);

	  *expr = tmp;
	  if (is_gimple_reg_type (type))
	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
	  stmt = gimple_build_assign (repl, conv);
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);

	  stmt = gimple_build_assign (tmp, conv);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  if (is_gimple_reg_type (type))
	    SSA_NAME_DEF_STMT (tmp) = stmt;
	  *expr = tmp;
	  update_stmt (stmt);
	}
    }
  else
    {
      if (write)
	{
	  gimple stmt;

	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;

	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}
    }
}


/* Callback for scan_function.  Replace the expression EXPR with a scalar
   replacement if there is one and generate other statements to do type
   conversion or subtree copying if necessary.  GSI is used to place newly
   created statements, WRITE is true if the expression is being written to (it
   is on a LHS of a statement or output in an assembly statement).  */

static bool
sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write,
		 void *data ATTRIBUTE_UNUSED)
{
  struct access *access;
  tree type, bfr;

  if (TREE_CODE (*expr) == BIT_FIELD_REF)
    {
      bfr = *expr;
      expr = &TREE_OPERAND (*expr, 0);
    }
  else
    bfr = NULL_TREE;

  if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
    expr = &TREE_OPERAND (*expr, 0);
  type = TREE_TYPE (*expr);

  access = get_access_for_expr (*expr);
  if (!access)
    return false;

  if (access->grp_to_be_replaced)
    {
      if (!useless_type_conversion_p (type, access->type))
	fix_incompatible_types_for_expr (expr, type, access, gsi, write);
      else
	*expr = get_access_replacement (access);
    }

  if (access->first_child)
    {
      HOST_WIDE_INT start_offset, chunk_size;
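      /* For a BIT_FIELD_REF, refresh only the chunk of the aggregate which
	 the original reference actually touches.  */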
      if (bfr
	  && host_integerp (TREE_OPERAND (bfr, 1), 1)
	  && host_integerp (TREE_OPERAND (bfr, 2), 1))
	{
	  start_offset = tree_low_cst (TREE_OPERAND (bfr, 1), 1);
	  chunk_size = tree_low_cst (TREE_OPERAND (bfr, 2), 1);
	}
      else
	start_offset = chunk_size = 0;

      generate_subtree_copies (access->first_child, access->base, 0,
			       start_offset, chunk_size, gsi, write, write);
    }
  return true;
}

/* Store all replacements in the access tree rooted in TOP_RACC to their base
   aggregate if there are unscalarized data or directly to LHS otherwise.  */

static void
handle_unscalarized_data_in_subtree (struct access *top_racc, tree lhs,
				     gimple_stmt_iterator *gsi)
{
  if (top_racc->grp_unscalarized_data)
    generate_subtree_copies (top_racc->first_child, top_racc->base, 0, 0, 0,
			     gsi, false, false);
  else
    generate_subtree_copies (top_racc->first_child, lhs, top_racc->offset,
			     0, 0, gsi, false, false);
}


/* Try to generate statements to load all sub-replacements in an access
   (sub)tree (LACC is the first child) from scalar replacements in the TOP_RACC
   (sub)tree.  If that is not possible, refresh the TOP_RACC base aggregate and
   load the accesses from it.  LEFT_OFFSET is the offset of the left whole
   subtree being copied, RIGHT_OFFSET is the same thing for the right subtree.
   OLD_GSI and NEW_GSI are stmt iterators used for statement insertions.
   *REFRESHED is true iff the rhs top aggregate has already been refreshed by
   the contents of its scalar replacements and is set to true if this function
   has to do it.  */

static void
load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
				 HOST_WIDE_INT left_offset,
				 HOST_WIDE_INT right_offset,
				 gimple_stmt_iterator *old_gsi,
				 gimple_stmt_iterator *new_gsi,
				 bool *refreshed, tree lhs)
{
  do
    {
      if (lacc->grp_to_be_replaced)
	{
	  struct access *racc;
	  HOST_WIDE_INT offset = lacc->offset - left_offset + right_offset;

	  racc = find_access_in_subtree (top_racc, offset, lacc->size);
	  if (racc && racc->grp_to_be_replaced)
	    {
	      gimple stmt;

	      if (useless_type_conversion_p (lacc->type, racc->type))
		stmt = gimple_build_assign (get_access_replacement (lacc),
					    get_access_replacement (racc));
	      else
		{
		  tree rhs = fold_build1 (VIEW_CONVERT_EXPR, lacc->type,
					  get_access_replacement (racc));
		  stmt = gimple_build_assign (get_access_replacement (lacc),
					      rhs);
		}

	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	  else
	    {
	      tree expr = unshare_expr (top_racc->base);
	      bool repl_found;
	      gimple stmt;

	      /* No suitable access on the right hand side, need to load from
		 the aggregate.  See if we have to update it first... */
	      if (!*refreshed)
		{
		  gcc_assert (top_racc->first_child);
		  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
		  *refreshed = true;
		}

	      repl_found = build_ref_for_offset (&expr,
						 TREE_TYPE (top_racc->base),
						 lacc->offset - left_offset,
						 lacc->type, false);
	      gcc_assert (repl_found);
	      stmt = gimple_build_assign (get_access_replacement (lacc),
					  expr);
	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	}
      else if (lacc->grp_read && !lacc->grp_covered && !*refreshed)
	{
	  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
	  *refreshed = true;
	}

      if (lacc->first_child)
	load_assign_lhs_subreplacements (lacc->first_child, top_racc,
					 left_offset, right_offset,
					 old_gsi, new_gsi, refreshed, lhs);
      lacc = lacc->next_sibling;
    }
  while (lacc);
}

/* Return true iff ACC is non-NULL and has subaccesses.  */

static inline bool
access_has_children_p (struct access *acc)
{
  return acc && acc->first_child;
}

/* Modify assignments with a CONSTRUCTOR on their RHS.  STMT is a pointer to
   the assignment statement and GSI is the statement iterator pointing at it.
   Returns the same values as sra_modify_assign.  */

static enum scan_assign_result
sra_modify_constructor_assign (gimple *stmt, gimple_stmt_iterator *gsi)
{
  tree lhs = gimple_assign_lhs (*stmt);
  struct access *acc;

  gcc_assert (TREE_CODE (lhs) != REALPART_EXPR
	      && TREE_CODE (lhs) != IMAGPART_EXPR);
  acc = get_access_for_expr (lhs);
  if (!acc)
    return SRA_SA_NONE;

  if (VEC_length (constructor_elt,
		  CONSTRUCTOR_ELTS (gimple_assign_rhs1 (*stmt))) > 0)
    {
      /* I have never seen this code path trigger but if it can happen the
	 following should handle it gracefully.  */
      if (access_has_children_p (acc))
	generate_subtree_copies (acc->first_child, acc->base, 0, 0, 0, gsi,
				 true, true);
      return SRA_SA_PROCESSED;
    }

  if (acc->grp_covered)
    {
      init_subtree_with_zero (acc, gsi, false);
      unlink_stmt_vdef (*stmt);
      gsi_remove (gsi, true);
      return SRA_SA_REMOVED;
    }
  else
    {
      init_subtree_with_zero (acc, gsi, true);
      return SRA_SA_PROCESSED;
    }
}


/* Modify statements which have an IMAGPART_EXPR or REALPART_EXPR of a
   to-be-scalarized aggregate on their lhs.  STMT is the statement and GSI is
   the iterator used to place new helper statements.  Returns the same values
   as sra_modify_assign.  */

static enum scan_assign_result
sra_modify_partially_complex_lhs (gimple stmt, gimple_stmt_iterator *gsi)
{
  tree lhs, complex, ptype, rp, ip;
  struct access *access;
  gimple new_stmt, aux_stmt;

  lhs = gimple_assign_lhs (stmt);
  complex = TREE_OPERAND (lhs, 0);

  access = get_access_for_expr (complex);

  if (!access || !access->grp_to_be_replaced)
    return SRA_SA_NONE;

  ptype = TREE_TYPE (TREE_TYPE (complex));
  rp = create_tmp_var (ptype, "SRr");
  add_referenced_var (rp);
  rp = make_ssa_name (rp, NULL);

  ip = create_tmp_var (ptype, "SRp");
  add_referenced_var (ip);
  ip = make_ssa_name (ip, NULL);

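  /* Load the part not defined by the original statement from the current
     replacement, let the original statement define the other part and
     finally rebuild the whole complex value in the replacement.  */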
  if (TREE_CODE (lhs) == IMAGPART_EXPR)
    {
      aux_stmt = gimple_build_assign (rp, fold_build1 (REALPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (rp) = aux_stmt;
      gimple_assign_set_lhs (stmt, ip);
      SSA_NAME_DEF_STMT (ip) = stmt;
    }
  else
    {
      aux_stmt = gimple_build_assign (ip, fold_build1 (IMAGPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (ip) = aux_stmt;
      gimple_assign_set_lhs (stmt, rp);
      SSA_NAME_DEF_STMT (rp) = stmt;
    }

  gsi_insert_before (gsi, aux_stmt, GSI_SAME_STMT);
  update_stmt (aux_stmt);
  new_stmt = gimple_build_assign (get_access_replacement (access),
				  fold_build2 (COMPLEX_EXPR, access->type,
					       rp, ip));
  gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
  update_stmt (new_stmt);
  return SRA_SA_PROCESSED;
}

/* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */

static bool
contains_view_convert_expr_p (tree t)
{
  while (1)
    {
      if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
	return true;
      if (!handled_component_p (t))
	return false;
      t = TREE_OPERAND (t, 0);
    }
}

/* Change STMT to assign compatible types by means of adding component or array
   references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
   variables with the same names in sra_modify_assign.  This is done in such
   a complicated way in order to make
   testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy and so it helps in at least some
   cases.  */

static void
fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
				   struct access *lacc, struct access *racc,
				   tree lhs, tree *rhs, tree ltype, tree rtype)
{
  if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
      && !access_has_children_p (lacc))
    {
      tree expr = unshare_expr (lhs);
      bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
					 false);
      if (found)
	{
	  gimple_assign_set_lhs (*stmt, expr);
	  return;
	}
    }

  if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
      && !access_has_children_p (racc))
    {
      tree expr = unshare_expr (*rhs);
      bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
					 false);
      if (found)
	{
	  gimple_assign_set_rhs1 (*stmt, expr);
	  return;
	}
    }

  *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
  gimple_assign_set_rhs_from_tree (gsi, *rhs);
  *stmt = gsi_stmt (*gsi);
}

/* Callback of scan_function to process assign statements.  It examines both
   sides of the statement, replaces them with a scalar replacement if there is
   one and generates copying of replacements if scalarized aggregates have been
   used in the assignment.  STMT is a pointer to the assign statement, GSI is
   used to hold generated statements for type conversions and subtree
   copying.  */

static enum scan_assign_result
sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
		   void *data ATTRIBUTE_UNUSED)
{
  struct access *lacc, *racc;
  tree ltype, rtype;
  tree lhs, rhs;
  bool modify_this_stmt;

  if (gimple_assign_rhs2 (*stmt))
    return SRA_SA_NONE;
  lhs = gimple_assign_lhs (*stmt);
  rhs = gimple_assign_rhs1 (*stmt);

  if (TREE_CODE (rhs) == CONSTRUCTOR)
    return sra_modify_constructor_assign (stmt, gsi);

  if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
    return sra_modify_partially_complex_lhs (*stmt, gsi);

  if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
      || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
    {
      modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
					  gsi, false, data);
      modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
					   gsi, true, data);
      return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
    }

  lacc = get_access_for_expr (lhs);
  racc = get_access_for_expr (rhs);
  if (!lacc && !racc)
    return SRA_SA_NONE;

  modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
		      || (racc && racc->grp_to_be_replaced));

  if (lacc && lacc->grp_to_be_replaced)
    {
      lhs = get_access_replacement (lacc);
      gimple_assign_set_lhs (*stmt, lhs);
      ltype = lacc->type;
    }
  else
    ltype = TREE_TYPE (lhs);

  if (racc && racc->grp_to_be_replaced)
    {
      rhs = get_access_replacement (racc);
      gimple_assign_set_rhs1 (*stmt, rhs);
      rtype = racc->type;
    }
  else
    rtype = TREE_TYPE (rhs);

  /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate
     the statement makes the position of this compatibility fixup a bit
     awkward but hopefully makes some sense.  */
  if (modify_this_stmt)
    {
      if (!useless_type_conversion_p (ltype, rtype))
	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
					   lhs, &rhs, ltype, rtype);
    }

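  /* If V_C_Es are involved or some replacement cannot be expressed as a
     reference into the other side, play it safe and copy everything through
     the original base aggregates.  */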
  if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
      || (access_has_children_p (racc)
	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
      || (access_has_children_p (lacc)
	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
    {
      if (access_has_children_p (racc))
	generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
				 gsi, false, false);
      if (access_has_children_p (lacc))
	generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
				 gsi, true, true);
    }
  else
    {
      if (access_has_children_p (lacc) && access_has_children_p (racc))
	{
	  gimple_stmt_iterator orig_gsi = *gsi;
	  bool refreshed;

	  if (lacc->grp_read && !lacc->grp_covered)
	    {
	      handle_unscalarized_data_in_subtree (racc, lhs, gsi);
	      refreshed = true;
	    }
	  else
	    refreshed = false;

	  load_assign_lhs_subreplacements (lacc->first_child, racc,
					   lacc->offset, racc->offset,
					   &orig_gsi, gsi, &refreshed, lhs);
	  if (!refreshed || !racc->grp_unscalarized_data)
	    {
	      if (*stmt == gsi_stmt (*gsi))
		gsi_next (gsi);

	      unlink_stmt_vdef (*stmt);
	      gsi_remove (&orig_gsi, true);
	      return SRA_SA_REMOVED;
	    }
	}
      else
	{
	  if (access_has_children_p (racc))
	    {
	      if (!racc->grp_unscalarized_data)
		{
		  generate_subtree_copies (racc->first_child, lhs,
					   racc->offset, 0, 0, gsi,
					   false, false);
		  gcc_assert (*stmt == gsi_stmt (*gsi));
		  unlink_stmt_vdef (*stmt);
		  gsi_remove (gsi, true);
		  return SRA_SA_REMOVED;
		}
	      else
		generate_subtree_copies (racc->first_child, lhs,
					 racc->offset, 0, 0, gsi, false, true);
	    }
	  else if (access_has_children_p (lacc))
	    generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
				     0, 0, gsi, true, true);
	}
    }

  return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
}

/* Generate statements initializing scalar replacements of parts of function
   parameters.  */

static void
initialize_parameter_reductions (void)
{
  gimple_stmt_iterator gsi;
  gimple_seq seq = NULL;
  tree parm;

  for (parm = DECL_ARGUMENTS (current_function_decl);
       parm;
       parm = TREE_CHAIN (parm))
    {
      VEC (access_p, heap) *access_vec;
      struct access *access;

      if (!bitmap_bit_p (candidate_bitmap, DECL_UID (parm)))
	continue;
      access_vec = get_base_access_vector (parm);
      if (!access_vec)
	continue;

      if (!seq)
	{
	  seq = gimple_seq_alloc ();
	  gsi = gsi_start (seq);
	}

      for (access = VEC_index (access_p, access_vec, 0);
	   access;
	   access = access->next_grp)
	generate_subtree_copies (access, parm, 0, 0, 0, &gsi, true, true);
    }

  if (seq)
    gsi_insert_seq_on_edge_immediate (single_succ_edge (ENTRY_BLOCK_PTR), seq);
}

/* The "main" function of intraprocedural SRA passes.  Runs the analysis and if
   it reveals there are components of some aggregates to be scalarized, it runs
   the required transformations.  */
static unsigned int
perform_intra_sra (void)
{
  int ret = 0;
  sra_initialize ();

  if (!find_var_candidates ())
    goto out;

  if (!scan_function (build_access_from_expr, build_accesses_from_assign, NULL,
		      true, NULL))
    goto out;

  if (!analyze_all_variable_accesses ())
    goto out;

  scan_function (sra_modify_expr, sra_modify_assign, NULL,
		 false, NULL);
  initialize_parameter_reductions ();

  if (sra_mode == SRA_MODE_EARLY_INTRA)
    ret = TODO_update_ssa;
  else
    ret = TODO_update_ssa | TODO_rebuild_alias;
 out:
  sra_deinitialize ();
  return ret;
}

/* Perform early intraprocedural SRA.  */
static unsigned int
early_intra_sra (void)
{
  sra_mode = SRA_MODE_EARLY_INTRA;
  return perform_intra_sra ();
}

/* Perform "late" intraprocedural SRA.  */
static unsigned int
late_intra_sra (void)
{
  sra_mode = SRA_MODE_INTRA;
  return perform_intra_sra ();
}


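/* Gate function of the intraprocedural (early and late) SRA passes.  */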
static bool
gate_intra_sra (void)
{
  return flag_tree_sra != 0;
}


struct gimple_opt_pass pass_sra_early =
{
 {
  GIMPLE_PASS,
  "esra",	 			/* name */
  gate_intra_sra,			/* gate */
  early_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  0,					/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};


struct gimple_opt_pass pass_sra =
{
 {
  GIMPLE_PASS,
  "sra",	 			/* name */
  gate_intra_sra,			/* gate */
  late_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  TODO_update_address_taken,		/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/5] Make tree-complex.c:extract_component() handle  V_C_Es
  2009-04-28 10:10 ` [PATCH 2/5] Make tree-complex.c:extract_component() handle V_C_Es Martin Jambor
@ 2009-04-28 11:52   ` Richard Guenther
  2009-05-20 10:20     ` Martin Jambor
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Guenther @ 2009-04-28 11:52 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Jan Hubicka

On Tue, 28 Apr 2009, Martin Jambor wrote:

> Currently     tree-complex.c:extract_component()     cannot     handle
> VIEW_CONVERT_EXPRs  which  makes  the  new SRA  ICE  during  bootstrap
> (IIRC).  This seems to be an omission so I added a label for this code
> so that they are handled just like other handled components.
> 
> Thanks,

Ok.

Thanks,
Richard.

> 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> 
> 	* tree-complex.c (extract_component): Added VIEW_CONVERT_EXPR switch
> 	case.
> 
> 
> Index: mine/gcc/tree-complex.c
> ===================================================================
> --- mine.orig/gcc/tree-complex.c	2009-04-25 19:11:37.000000000 +0200
> +++ mine/gcc/tree-complex.c	2009-04-25 19:11:47.000000000 +0200
> @@ -601,6 +601,7 @@ extract_component (gimple_stmt_iterator 
>      case INDIRECT_REF:
>      case COMPONENT_REF:
>      case ARRAY_REF:
> +    case VIEW_CONVERT_EXPR:
>        {
>  	tree inner_type = TREE_TYPE (TREE_TYPE (t));

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA
  2009-04-28 10:10 ` [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA Martin Jambor
@ 2009-04-28 12:15   ` Richard Guenther
  2009-04-29 12:39     ` Martin Jambor
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Guenther @ 2009-04-28 12:15 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Richard Guenther, Jan Hubicka

On Tue, Apr 28, 2009 at 12:04 PM, Martin Jambor <mjambor@suse.cz> wrote:
> The new intra-SRA produces an extra copy assignment and that breaks
> ipa-prop.c pattern matching.  The following patch fixes that.
>
> Thanks,
>
> Martin
>
>
> 2009-04-27  Martin Jambor  <mjambor@suse.cz>
>
>        * ipa-prop.c (get_ssa_def_if_simple_copy): New function.
>        (determine_cst_member_ptr): Call get_ssa_def_if_simple_copy to skip
>        simple copies.
>
>
> Index: mine/gcc/ipa-prop.c
> ===================================================================
> --- mine.orig/gcc/ipa-prop.c
> +++ mine/gcc/ipa-prop.c
> @@ -456,6 +456,22 @@ fill_member_ptr_cst_jump_function (struc
>   jfunc->value.member_cst.delta = delta;
>  }
>
> +/* If RHS is an SSA_NAME and it is defined by a simple copy assign statement,
> +   return the rhs of its defining statement.  */
> +
> +static inline tree
> +get_ssa_def_if_simple_copy (tree rhs)
> +{
> +  if (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
> +    {
> +      gimple def_stmt = SSA_NAME_DEF_STMT (rhs);
> +
> +      if (is_gimple_assign (def_stmt) && gimple_num_ops (def_stmt) == 2)
> +       rhs = gimple_assign_rhs1 (def_stmt);
> +    }
> +  return rhs;
> +}

IMHO this function should loop.  Also use gimple_assign_single_p
instead of the assign && num_ops check.  You also have to
check the gimple_assign_rhs_code to be SSA_NAME, otherwise
you happily look through all unary operations.
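
Untested and from memory, but I mean something along these lines, where
the loop condition stops at anything that is not a plain SSA name copy:

static inline tree
get_ssa_def_if_simple_copy (tree rhs)
{
  /* Walk a chain of simple copies.  gimple_assign_single_p rejects unary
     and binary operations, so we never look through those.  */
  while (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
    {
      gimple def_stmt = SSA_NAME_DEF_STMT (rhs);

      if (!gimple_assign_single_p (def_stmt))
	break;
      rhs = gimple_assign_rhs1 (def_stmt);
    }
  return rhs;
}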

Richard.

> +
>  /* Traverse statements from CALL backwards, scanning whether the argument ARG
>    which is a member pointer is filled in with constant values.  If it is, fill
>    the jump function JFUNC in appropriately.  METHOD_FIELD and DELTA_FIELD are
> @@ -495,6 +511,7 @@ determine_cst_member_ptr (gimple call, t
>       fld = TREE_OPERAND (lhs, 1);
>       if (!method && fld == method_field)
>        {
> +         rhs = get_ssa_def_if_simple_copy (rhs);
>          if (TREE_CODE (rhs) == ADDR_EXPR
>              && TREE_CODE (TREE_OPERAND (rhs, 0)) == FUNCTION_DECL
>              && TREE_CODE (TREE_TYPE (TREE_OPERAND (rhs, 0))) == METHOD_TYPE)
> @@ -512,6 +529,7 @@ determine_cst_member_ptr (gimple call, t
>
>       if (!delta && fld == delta_field)
>        {
> +         rhs = get_ssa_def_if_simple_copy (rhs);
>          if (TREE_CODE (rhs) == INTEGER_CST)
>            {
>              delta = rhs;
>
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/5] Get rid off old external tree-sra.c stuff
  2009-04-28 10:11 ` [PATCH 1/5] Get rid off old external tree-sra.c stuff Martin Jambor
@ 2009-04-28 12:55   ` Richard Guenther
  2009-05-20 10:19     ` Martin Jambor
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Guenther @ 2009-04-28 12:55 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Jan Hubicka

On Tue, 28 Apr 2009, Martin Jambor wrote:

> This  patch gets  rid of  all external  things in  the  old tree-sra.
> sra_insert_before,      sra_insert_after,      sra_init_cache      and
> sra_type_can_be_decomposed_p  are not actually  used anywhere  so they
> are made  static.  insert_edge_copies_seq is  used in mudflap and  so I
> copied the function there and made it static too.  The original one had
> to be moved  upwards in the file so that  tree-sra compiles.  Yes, this
> patch duplicates the  function but the original copy  is nuked with the
> rest of the file by the next patch.
> 
> Thanks,
> 
> Martin

This is ok.

Thanks,
Richard.

> 
> 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> 
> 	* tree-flow.h (insert_edge_copies_seq): Undeclare.
> 	(sra_insert_before): Likewise.
> 	(sra_insert_after): Likewise.
> 	(sra_init_cache): Likewise.
> 	(sra_type_can_be_decomposed_p): Likewise.
> 
> 	* tree-mudflap.c (insert_edge_copies_seq): Copied here from tree-sra.c
> 
> 	* tree-sra.c (sra_type_can_be_decomposed_p): Made static.
> 	(sra_insert_before): Likewise.
> 	(sra_insert_after): Likewise.
> 	(sra_init_cache): Likewise.
> 	(insert_edge_copies_seq): Made static and moved upwards.
> 
> 
> Index: mine/gcc/tree-flow.h
> ===================================================================
> --- mine.orig/gcc/tree-flow.h
> +++ mine/gcc/tree-flow.h
> @@ -873,13 +873,6 @@ tree vn_lookup_with_vuses (tree, VEC (tr
>  /* In tree-ssa-sink.c  */
>  bool is_hidden_global_store (gimple);
>  
> -/* In tree-sra.c  */
> -void insert_edge_copies_seq (gimple_seq, basic_block);
> -void sra_insert_before (gimple_stmt_iterator *, gimple_seq);
> -void sra_insert_after (gimple_stmt_iterator *, gimple_seq);
> -void sra_init_cache (void);
> -bool sra_type_can_be_decomposed_p (tree);
> -
>  /* In tree-loop-linear.c  */
>  extern void linear_transform_loops (void);
>  extern unsigned perfect_loop_nest_depth (struct loop *);
> Index: mine/gcc/tree-mudflap.c
> ===================================================================
> --- mine.orig/gcc/tree-mudflap.c
> +++ mine/gcc/tree-mudflap.c
> @@ -447,6 +447,26 @@ execute_mudflap_function_ops (void)
>    return 0;
>  }
>  
> +/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
> +   if BB has more than one edge, STMT will be replicated for each edge.
> +   Also, abnormal edges will be ignored.  */
> +
> +static void
> +insert_edge_copies_seq (gimple_seq seq, basic_block bb)
> +{
> +  edge e;
> +  edge_iterator ei;
> +  unsigned n_copies = -1;
> +
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> +    if (!(e->flags & EDGE_ABNORMAL))
> +      n_copies++;
> +
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> +    if (!(e->flags & EDGE_ABNORMAL))
> +      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
> +}
> +
>  /* Create and initialize local shadow variables for the lookup cache
>     globals.  Put their decls in the *_l globals for use by
>     mf_build_check_statement_for.  */
> Index: mine/gcc/tree-sra.c
> ===================================================================
> --- mine.orig/gcc/tree-sra.c
> +++ mine/gcc/tree-sra.c
> @@ -236,7 +236,7 @@ is_sra_scalar_type (tree type)
>     instantiated, just that if we decide to break up the type into
>     separate pieces that it can be done.  */
>  
> -bool
> +static bool
>  sra_type_can_be_decomposed_p (tree type)
>  {
>    unsigned int cache = TYPE_UID (TYPE_MAIN_VARIANT (type)) * 2;
> @@ -1263,6 +1263,26 @@ build_element_name (struct sra_elt *elt)
>    return XOBFINISH (&sra_obstack, char *);
>  }
>  
> +/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
> +   if BB has more than one edge, STMT will be replicated for each edge.
> +   Also, abnormal edges will be ignored.  */
> +
> +static void
> +insert_edge_copies_seq (gimple_seq seq, basic_block bb)
> +{
> +  edge e;
> +  edge_iterator ei;
> +  unsigned n_copies = -1;
> +
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> +    if (!(e->flags & EDGE_ABNORMAL))
> +      n_copies++;
> +
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> +    if (!(e->flags & EDGE_ABNORMAL))
> +      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
> +}
> +
>  /* Instantiate an element as an independent variable.  */
>  
>  static void
> @@ -2785,29 +2805,9 @@ generate_element_init (struct sra_elt *e
>    return ret;
>  }
>  
> -/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
> -   if BB has more than one edge, STMT will be replicated for each edge.
> -   Also, abnormal edges will be ignored.  */
> -
> -void
> -insert_edge_copies_seq (gimple_seq seq, basic_block bb)
> -{
> -  edge e;
> -  edge_iterator ei;
> -  unsigned n_copies = -1;
> -
> -  FOR_EACH_EDGE (e, ei, bb->succs)
> -    if (!(e->flags & EDGE_ABNORMAL)) 
> -      n_copies++;
> -
> -  FOR_EACH_EDGE (e, ei, bb->succs)
> -    if (!(e->flags & EDGE_ABNORMAL)) 
> -      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
> -}
> -
>  /* Helper function to insert LIST before GSI, and set up line number info.  */
>  
> -void
> +static void
>  sra_insert_before (gimple_stmt_iterator *gsi, gimple_seq seq)
>  {
>    gimple stmt = gsi_stmt (*gsi);
> @@ -2819,7 +2819,7 @@ sra_insert_before (gimple_stmt_iterator
>  
>  /* Similarly, but insert after GSI.  Handles insertion onto edges as well.  */
>  
> -void
> +static void
>  sra_insert_after (gimple_stmt_iterator *gsi, gimple_seq seq)
>  {
>    gimple stmt = gsi_stmt (*gsi);
> @@ -3597,7 +3597,7 @@ debug_sra_elt_name (struct sra_elt *elt)
>    fputc ('\n', stderr);
>  }
>  
> -void 
> +static void
>  sra_init_cache (void)
>  {
>    if (sra_type_decomp_cache)
> 
> 

-- 
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/5] "Fix" the rest of the fallouts of new intra-SRA
  2009-04-28 10:12 ` [PATCH 5/5] "Fix" the rest of the fallouts of new intra-SRA Martin Jambor
@ 2009-04-28 13:05   ` Richard Guenther
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Guenther @ 2009-04-28 13:05 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Jan Hubicka, fortran

On Tue, 28 Apr 2009, Martin Jambor wrote:

> The following patch amends testcases (rather than the compiler) so
> that they do not fail with the new intra-SRA.
> 
> The FRE testcases rely on the fact that SRA does not scalarize unions.
> The new one,  however, does.  Therefore I simply  switched SRA off for
> them.  Hopefully that is the correct thing to do.
> 
> The gfortran.dg/pr25923.f90 expects a weird warning that I do not give
> out because the new SRA does not scalarize anything in that particular
> testcase, as the  individual fields  of the  structure  are never
> accessed individually.  An  unpatched compiler with -fno-tree-sra also
> does not give any warning.  I believe it was left there as potentially
> useful  and not  misleading but  I tend  to believe  it's  not really
> required...

Hm.  The PR was about garbled warning messages and the warning itself
does look correct (and useful).  But it indeed does not make sense
for SRA to scalarize the return value.  Thus I think it is more 
appropriate to XFAIL the warning.
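
I.e. keep the directive but mark it XFAIL; from memory the dejagnu
syntax would be something like

  ! { dg-warning "res.yr' may be" "" { xfail *-*-* } }

(untested) instead of removing it.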
 
> With  the previous and  this patch,  there are  no regressions  in the
> testsuite on  x86_64-linux-gnu.  I  have even bootstrapped  and tested
> Ada.

The FRE testcase changes are ok.  XFAILing the fortran testcase is also
ok if there are no objections from the Fortran maintainers (CCed).

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> 
> 	    * gfortran.dg/pr25923.f90: Remove warning expectation.
> 	    * gcc.dg/tree-ssa/ssa-fre-7.c: Compile with -fno-tree-sra.
> 	    * gcc.dg/tree-ssa/ssa-fre-8.c: Likewise.
> 	    * gcc.dg/tree-ssa/ssa-fre-9.c: Likewise.
> 
> 
> Index: mine/gcc/testsuite/gfortran.dg/pr25923.f90
> ===================================================================
> --- mine.orig/gcc/testsuite/gfortran.dg/pr25923.f90
> +++ mine/gcc/testsuite/gfortran.dg/pr25923.f90
> @@ -10,7 +10,7 @@ implicit none
>  
>  contains
>  
> -  function baz(arg) result(res) ! { dg-warning "res.yr' may be" }
> +  function baz(arg) result(res)
>      type(bar), intent(in) :: arg
>      type(bar) :: res
>      logical, external:: some_func
> Index: mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-7.c
> ===================================================================
> --- mine.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-7.c
> +++ mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-7.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-fre-details -fdump-tree-optimized" } */
> +/* { dg-options "-O -fno-tree-sra -fdump-tree-fre-details -fdump-tree-optimized" } */
>  #if (__SIZEOF_INT__ == __SIZEOF_FLOAT__)
>  typedef int intflt;
>  #elif (__SIZEOF_LONG__ == __SIZEOF_FLOAT__)
> Index: mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-8.c
> ===================================================================
> --- mine.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-8.c
> +++ mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-8.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-fre-details" } */
> +/* { dg-options "-O -fno-tree-sra -fdump-tree-fre-details" } */
>  #if (__SIZEOF_INT__ == __SIZEOF_FLOAT__)
>  typedef int intflt;
>  #elif (__SIZEOF_LONG__ == __SIZEOF_FLOAT__)
> Index: mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-9.c
> ===================================================================
> --- mine.orig/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-9.c
> +++ mine/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-9.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-fre-stats" } */
> +/* { dg-options "-O -fno-tree-sra -fdump-tree-fre-stats" } */
>  
>  union loc {
>      unsigned reg;
> 
> 

-- 
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of  Aggregates.
  2009-04-28 10:14 ` [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates Martin Jambor
  2009-04-28 10:27   ` Martin Jambor
@ 2009-04-29 10:59   ` Richard Guenther
  2009-04-29 12:16     ` Martin Jambor
  1 sibling, 1 reply; 25+ messages in thread
From: Richard Guenther @ 2009-04-29 10:59 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Jan Hubicka

On Tue, 28 Apr 2009, Martin Jambor wrote:

> This  is  the  new  intraprocedural  SRA.  I  have  stripped  off  the
> interprocedural part  and will propose to commit  it separately later.
> I have  tried to  remove almost every  trace of IPA-SRA;  however, two
> provisions for it  have remained in the patch.   First, an enumeration
> (rather than  a boolean) is  used to distinguish between  "early" and
> "late" SRA  so that other  SRA modes can  be added later  on.  Second,
> scan_function()  has a  hook parameter  and a  void  pointer parameter
> which are not used in this patch but will be by IPA-SRA.
> 
> Otherwise, the patch is hopefully self-contained and the basis of its
> operation is described in the initial comment.
> 
> The patch bootstraps (on x86_64-linux-gnu but I am about to try it on
> hppa-linux-gnu too) but produces a small number of testsuite failures
> which are handled by the two following patches.
> 
> Thanks,
> 
> Martin
> 
> 
> 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> 
> 	* tree-sra.c (enum sra_mode): The whole contents of the file was
> 	replaced.

That looks odd ;)  I would make it

	* tree-sra.c: New implementation of SRA.

and in the usual way list any exported functions that were removed and
list new exported functions that were added.

Can you instead of posting a diff post the new contents?  That would
be way easier to review.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-04-29 10:59   ` Richard Guenther
@ 2009-04-29 12:16     ` Martin Jambor
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Jambor @ 2009-04-29 12:16 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches, Jan Hubicka

Hi,

On Wed, Apr 29, 2009 at 12:27:10PM +0200, Richard Guenther wrote:
> On Tue, 28 Apr 2009, Martin Jambor wrote:
> 
> > This  is  the  new  intraprocedural  SRA.  I  have  stripped  off  the
> > interprocedural part  and will propose to commit  it separately later.
> > I have  tried to  remove almost every  trace of IPA-SRA;  however, two
> > provisions for it  have remained in the patch.   First, an enumeration
> > (rather than  a boolean) is  used to distinguish between  "early" and
> > "late" SRA  so that other  SRA modes can  be added later  on.  Second,
> > scan_function()  has a  hook parameter  and a  void  pointer parameter
> > which are not used in this patch but will be by IPA-SRA.
> > 
> > Otherwise, the patch is hopefully self-contained and the basis of its
> > operation is described in the initial comment.
> > 
> > The patch bootstraps (on x86_64-linux-gnu but I am about to try it on
> > hppa-linux-gnu too) but produces a small number of testsuite failures
> > which are handled by the two following patches.
> > 
> > Thanks,
> > 
> > Martin
> > 
> > 
> > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> > 
> > 	* tree-sra.c (enum sra_mode): The whole contents of the file was
> > 	replaced.
> 
> That looks odd ;)  I would make it
> 
> 	* tree-sra.c: New implementation of SRA.
> 
> and in the usual way list any exported functions that were removed and
> list new exported functions that were added.

OK. I will use that entry.   There are no new exported functions, only
two  exported gimple_opt_pass  structures (which  are really  the only
part of the old file which was not really replaced but only changed).

Old exported functions are made static by the first patch in the
series. 

> 
> Can you instead of posting a diff post the new contents?  That would
> be way easier to review.

I've already done that
(http://gcc.gnu.org/ml/gcc-patches/2009-04/msg02218.html).

However, when  I was writing  an email to  Honza about the last  ICE I
fixed, I realized that two comparisons had the opposite operators to
what I originally meant and that could lead to some pessimizations.
Thinking about it more, I have also realized that I could be bolder
and propagate subaccesses among accesses of different sizes if I limit
the  propagated  children  so  that   they  are  not  outside  of  the
destination size.
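
For instance (a made-up fragment, not from the testsuite):

  struct small { int a; int b; };
  struct big { struct small s; int rest[4]; };

  extern struct big g;
  extern void use (int);

  void
  f (void)
  {
    struct big b = g;
    struct small t;

    t = b.s;    /* assign link from the access of b.s to t; children of
                   b.s, e.g. b.s.a, can be propagated to t as long as
                   they do not fall outside t's size.  */
    use (t.a);
  }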

The new version (already bootstrapped and tested) is below.  There are
only  minor  changes  in  functions  build_accesses_from_assign()  and
propagate_subacesses_accross_link().

I'm about to patch the automatic tester now.

Thanks,

Martin




/* Scalar Replacement of Aggregates (SRA) converts some structure
   references into scalar references, exposing them to the scalar
   optimizers.
   Copyright (C) 2008, 2009 Free Software Foundation, Inc.
   Contributed by Martin Jambor <mjambor@suse.cz>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* This file implements Scalar Reduction of Aggregates (SRA).  SRA is run
   twice, once in the early stages of compilation (early SRA) and once in the
   late stages (late SRA).  The aim of both is to turn references to scalar
   parts of aggregates into uses of independent scalar variables.

   The two passes are nearly identical, the only difference is that early SRA
   does not scalarize unions which are used as the result in a GIMPLE_RETURN
   statement because together with inlining this can lead to weird type
   conversions.

   Both passes operate in four stages:

   1. The declarations that have properties which make them candidates for
      scalarization are identified in function find_var_candidates().  The
      candidates are stored in candidate_bitmap.

   2. The function body is scanned.  In the process, declarations which are
      used in a manner that prevent their scalarization are removed from the
      candidate bitmap.  More importantly, for every access into an aggregate,
      an access structure (struct access) is created by create_access() and
      stored in a vector associated with the aggregate.  Among other
      information, the aggregate declaration, the offset and size of the access
      and its type are stored in the structure.

      On a related note, assign_link structures are created for every assign
      statement between candidate aggregates and attached to the related
      accesses.

   3. The vectors of accesses are analyzed.  They are first sorted according to
      their offset and size and then scanned for partially overlapping accesses
      (i.e. those which overlap but one is not entirely within another).  Such
      an access disqualifies the whole aggregate from being scalarized.

      If there is no such inhibiting overlap, a representative access structure
      is chosen for every unique combination of offset and size.  Afterwards,
      the pass builds a set of trees from these structures, in which children
      of an access are within their parent (in terms of offset and size).

      Then accesses  are propagated  whenever possible (i.e.  in cases  when it
      does not create a partially overlapping access) across assign_links from
      the right hand side to the left hand side.

      Then the set of trees for each declaration is traversed again and those
      accesses which should be replaced by a scalar are identified.

   4. The function is traversed again, and for every reference into an
      aggregate that has some component which is about to be scalarized,
      statements are amended and new statements are created as necessary.
      Finally, if a parameter got scalarized, the scalar replacements are
      initialized with values from respective parameter aggregates.
*/
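
/* As an illustration (a made-up fragment, not part of the pass), a function
   such as

     struct pair { int x; int y; };
     int f (int a, int b)
     {
       struct pair p;
       p.x = a;
       p.y = b;
       return p.x + p.y;
     }

   is conceptually rewritten so that p.x and p.y are replaced by two
   independent scalar variables which the later scalar optimizers can
   track.  */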

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "alloc-pool.h"
#include "tm.h"
#include "tree.h"
#include "gimple.h"
#include "tree-flow.h"
#include "diagnostic.h"
#include "tree-dump.h"
#include "timevar.h"
#include "params.h"
#include "target.h"
#include "flags.h"

/* Enumeration of all aggregate reductions we can do.  */
enum sra_mode {SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
	       SRA_MODE_INTRA};	     /* late intraprocedural SRA */

/* Global variable describing which aggregate reduction we are performing at
   the moment.  */
static enum sra_mode sra_mode;

struct assign_link;

/* ACCESS represents each access to an aggregate variable (as a whole or a
   part).  It can also represent a group of accesses that refer to exactly the
   same fragment of an aggregate (i.e. those that have exactly the same offset
   and size).  Such representatives for a single aggregate, once determined,
   are linked in a linked list and have the group fields set.

   Moreover, when doing intraprocedural SRA, a tree is built from those
   representatives (by the means of first_child and next_sibling pointers), in
   which all items in a subtree are "within" the root, i.e. their offset is
   greater or equal to offset of the root and offset+size is smaller or equal
   to offset+size of the root.  Children of an access are sorted by offset.
*/
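
/* For example (hypothetical layout, assuming 32-bit int and 16-bit short),
   for a variable

     struct S { int i; struct { short a; short b; } in; } s;

   accessed as s, s.i, s.in, s.in.a and s.in.b, the representatives form
   the tree

     s        (offset  0, size 64)
     +-- s.i  (offset  0, size 32)
     +-- s.in (offset 32, size 32)
         +-- s.in.a (offset 32, size 16)
         +-- s.in.b (offset 48, size 16)  */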

struct access
{
  /* Values returned by `get_ref_base_and_extent' for each COMPONENT_REF.
     If EXPR isn't a COMPONENT_REF, just set `BASE = EXPR', `OFFSET = 0',
     `SIZE = TREE_SIZE (TREE_TYPE (expr))'.  */
  HOST_WIDE_INT offset;
  HOST_WIDE_INT size;
  tree base;

  /* Expression.  */
  tree expr;
  /* Type.  */
  tree type;

  /* Next group representative for this aggregate. */
  struct access *next_grp;

  /* Pointer to the group representative.  Pointer to itself if the struct is
     the representative.  */
  struct access *group_representative;

  /* If this access has any children (in terms of the definition above), this
     points to the first one.  */
  struct access *first_child;

  /* Pointer to the next sibling in the access tree as described above.  */
  struct access *next_sibling;

  /* Pointers to the first and last element in the linked list of assign
     links.  */
  struct assign_link *first_link, *last_link;
  /* Pointer to the next access in the work queue.  */
  struct access *next_queued;

  /* Replacement variable for this access "region."  Never to be accessed
     directly, always only by the means of get_access_replacement() and only
     when grp_to_be_replaced flag is set.  */
  tree replacement_decl;

  /* Is this particular access a write access?  */
  unsigned write : 1;

  /* Is this access currently in the work queue?  */
  unsigned grp_queued : 1;
  /* Does this group contain a write access?  This flag is propagated down the
     access tree.  */
  unsigned grp_write : 1;
  /* Does this group contain a read access?  This flag is propagated down the
     access tree.  */
  unsigned grp_read : 1;
  /* Is the subtree rooted in this access fully covered by scalar
     replacements?  */
  unsigned grp_covered : 1;
  /* If set to true, this access and all below it in an access tree must not be
     scalarized.  */
  unsigned grp_unscalarizable_region : 1;
  /* Whether data have been written to parts of the aggregate covered by this
     access which is not to be scalarized.  This flag is propagated up in the
     access tree.  */
  unsigned grp_unscalarized_data : 1;
  /* Does this access and/or group contain a write access through a
     BIT_FIELD_REF?  */
  unsigned grp_bfr_lhs : 1;

  /* Set when a scalar replacement should be created for this variable.  The
     decision and the creation are made in different places because
     create_tmp_var cannot be called from within FOR_EACH_REFERENCED_VAR. */
  unsigned grp_to_be_replaced : 1;
};

typedef struct access *access_p;

DEF_VEC_P (access_p);
DEF_VEC_ALLOC_P (access_p, heap);

/* Alloc pool for allocating access structures.  */
static alloc_pool access_pool;

/* A structure linking lhs and rhs accesses from an aggregate assignment.  They
   are used to propagate subaccesses from rhs to lhs as long as they don't
   conflict with what is already there.  */
struct assign_link
{
  struct access *lacc, *racc;
  struct assign_link *next;
};

/* Alloc pool for allocating assign link structures.  */
static alloc_pool link_pool;

/* Base (tree) -> Vector (VEC(access_p,heap) *) map.  */
static struct pointer_map_t *base_access_vec;

/* Bitmap of bases (candidates).  */
static bitmap candidate_bitmap;
/* Bitmap of declarations used in a return statement.  */
static bitmap retvals_bitmap;
/* Obstack for creation of fancy names.  */
static struct obstack name_obstack;

/* Head of a linked list of accesses that need to have their subaccesses
   propagated to their assignment counterparts. */
static struct access *work_queue_head;

/* Dump contents of ACCESS to file F in a human friendly way.  If GRP is true,
   representative fields are dumped, otherwise those which only describe the
   individual access are.  */

static void
dump_access (FILE *f, struct access *access, bool grp)
{
  fprintf (f, "access { ");
  fprintf (f, "base = (%d)'", DECL_UID (access->base));
  print_generic_expr (f, access->base, 0);
  fprintf (f, "', offset = %d", (int) access->offset);
  fprintf (f, ", size = %d", (int) access->size);
  fprintf (f, ", expr = ");
  print_generic_expr (f, access->expr, 0);
  fprintf (f, ", type = ");
  print_generic_expr (f, access->type, 0);
  if (grp)
    fprintf (f, ", grp_write = %d, grp_read = %d, grp_covered = %d, "
	     "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
	     "grp_to_be_replaced = %d\n",
	     access->grp_write, access->grp_read, access->grp_covered,
	     access->grp_unscalarizable_region, access->grp_unscalarized_data,
	     access->grp_to_be_replaced);
  else
    fprintf (f, ", write = %d'\n", access->write);
}

/* Dump a subtree rooted in ACCESS to file F, indent by LEVEL.  */

static void
dump_access_tree_1 (FILE *f, struct access *access, int level)
{
  do
    {
      int i;

      for (i = 0; i < level; i++)
	fputs ("* ", dump_file);

      dump_access (f, access, true);

      if (access->first_child)
	dump_access_tree_1 (f, access->first_child, level + 1);

      access = access->next_sibling;
    }
  while (access);
}

/* Dump all access trees for a variable, given the pointer to the first root in
   ACCESS.  */

static void
dump_access_tree (FILE *f, struct access *access)
{
  for (; access; access = access->next_grp)
    dump_access_tree_1 (f, access, 0);
}

/* Return a vector of pointers to accesses for the variable given in BASE or
   NULL if there is none.  */

static VEC (access_p, heap) *
get_base_access_vector (tree base)
{
  void **slot;

  slot = pointer_map_contains (base_access_vec, base);
  if (!slot)
    return NULL;
  else
    return *(VEC (access_p, heap) **) slot;
}

/* Find an access with required OFFSET and SIZE in a subtree of accesses rooted
   in ACCESS.  Return NULL if it cannot be found.  */

static struct access *
find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
			HOST_WIDE_INT size)
{
  while (access && (access->offset != offset || access->size != size))
    {
      struct access *child = access->first_child;

      while (child && (child->offset + child->size <= offset))
	child = child->next_sibling;
      access = child;
    }

  return access;
}

/* Return the first group representative for DECL or NULL if none exists.  */

static struct access *
get_first_repr_for_decl (tree base)
{
  VEC (access_p, heap) *access_vec;

  access_vec = get_base_access_vector (base);
  if (!access_vec)
    return NULL;

  return VEC_index (access_p, access_vec, 0);
}

/* Find an access representative for the variable BASE and given OFFSET and
   SIZE.  Requires that access trees have already been built.  Return NULL if
   it cannot be found.  */

static struct access *
get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
				 HOST_WIDE_INT size)
{
  struct access *access;

  access = get_first_repr_for_decl (base);
  while (access && (access->offset + access->size <= offset))
    access = access->next_grp;
  if (!access)
    return NULL;

  return find_access_in_subtree (access, offset, size);
}

/* Add LINK to the linked list of assign links of RACC.  */
static void
add_link_to_rhs (struct access *racc, struct assign_link *link)
{
  gcc_assert (link->racc == racc);

  if (!racc->first_link)
    {
      gcc_assert (!racc->last_link);
      racc->first_link = link;
    }
  else
    racc->last_link->next = link;

  racc->last_link = link;
  link->next = NULL;
}

/* Move all link structures in their linked list in OLD_RACC to the linked list
   in NEW_RACC.  */
static void
relink_to_new_repr (struct access *new_racc, struct access *old_racc)
{
  if (!old_racc->first_link)
    {
      gcc_assert (!old_racc->last_link);
      return;
    }

  if (new_racc->first_link)
    {
      gcc_assert (!new_racc->last_link->next);
      gcc_assert (!old_racc->last_link || !old_racc->last_link->next);

      new_racc->last_link->next = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  else
    {
      gcc_assert (!new_racc->last_link);

      new_racc->first_link = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  old_racc->first_link = old_racc->last_link = NULL;
}

/* Add ACCESS to the work queue (which is actually a stack).  */

static void
add_access_to_work_queue (struct access *access)
{
  if (!access->grp_queued)
    {
      gcc_assert (!access->next_queued);
      access->next_queued = work_queue_head;
      access->grp_queued = 1;
      work_queue_head = access;
    }
}

/* Pop an access from the work queue, and return it, assuming there is one.  */

static struct access *
pop_access_from_work_queue (void)
{
  struct access *access = work_queue_head;

  work_queue_head = access->next_queued;
  access->next_queued = NULL;
  access->grp_queued = 0;
  return access;
}


/* Allocate necessary structures.  */

static void
sra_initialize (void)
{
  candidate_bitmap = BITMAP_ALLOC (NULL);
  retvals_bitmap = BITMAP_ALLOC (NULL);
  gcc_obstack_init (&name_obstack);
  access_pool = create_alloc_pool ("SRA accesses", sizeof (struct access), 16);
  link_pool = create_alloc_pool ("SRA links", sizeof (struct assign_link), 16);
  base_access_vec = pointer_map_create ();
}

/* Hook fed to pointer_map_traverse, deallocate stored vectors.  */

static bool
delete_base_accesses (const void *key ATTRIBUTE_UNUSED, void **value,
		     void *data ATTRIBUTE_UNUSED)
{
  VEC (access_p, heap) *access_vec;
  access_vec = (VEC (access_p, heap) *) *value;
  VEC_free (access_p, heap, access_vec);

  return true;
}

/* Deallocate all general structures.  */

static void
sra_deinitialize (void)
{
  BITMAP_FREE (candidate_bitmap);
  BITMAP_FREE (retvals_bitmap);
  free_alloc_pool (access_pool);
  free_alloc_pool (link_pool);
  obstack_free (&name_obstack, NULL);

  pointer_map_traverse (base_access_vec, delete_base_accesses, NULL);
  pointer_map_destroy (base_access_vec);
}

/* Remove DECL from candidates for SRA and write REASON to the dump file if
   there is one.  */
static void
disqualify_candidate (tree decl, const char *reason)
{
  bitmap_clear_bit (candidate_bitmap, DECL_UID (decl));

  if (dump_file)
    {
      fprintf (dump_file, "! Disqualifying ");
      print_generic_expr (dump_file, decl, 0);
      fprintf (dump_file, " - %s\n", reason);
    }
}

/* Return true iff the type contains a field or an element which does not allow
   scalarization.  */

static bool
type_internals_preclude_sra_p (tree type)
{
  tree fld;
  tree et;

  switch (TREE_CODE (type))
    {
    case RECORD_TYPE:
    case UNION_TYPE:
    case QUAL_UNION_TYPE:
      for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	if (TREE_CODE (fld) == FIELD_DECL)
	  {
	    tree ft = TREE_TYPE (fld);

	    if (TREE_THIS_VOLATILE (fld)
		|| !DECL_FIELD_OFFSET (fld) || !DECL_SIZE (fld)
		|| !host_integerp (DECL_FIELD_OFFSET (fld), 1)
		|| !host_integerp (DECL_SIZE (fld), 1))
	      return true;

	    if (AGGREGATE_TYPE_P (ft)
		&& type_internals_preclude_sra_p (ft))
	      return true;
	  }

      return false;

    case ARRAY_TYPE:
      et = TREE_TYPE (type);

      if (AGGREGATE_TYPE_P (et))
	return type_internals_preclude_sra_p (et);
      else
	return false;

    default:
      return false;
    }
}
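
/* Thus, for example, a hypothetical

     struct vs { volatile int v; int w; };

   precludes scalarization because of the volatile field, as does any record
   with a field whose size or offset is not a host integer constant.  */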

/* Create and insert an access for EXPR.  Return the created access, or NULL
   if it is not possible.  */

static struct access *
create_access (tree expr, bool write)
{
  struct access *access;
  void **slot;
  VEC (access_p,heap) *vec;
  HOST_WIDE_INT offset, size, max_size;
  tree base = expr;
  bool unscalarizable_region = false;

  if (handled_component_p (expr))
    base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
  else
    {
      tree tree_size;

      tree_size = TYPE_SIZE (TREE_TYPE (base));
      if (tree_size && host_integerp (tree_size, 1))
	size = max_size = tree_low_cst (tree_size, 1);
      else
	size = max_size = -1;

      offset = 0;
    }

  if (!base || !DECL_P (base)
      || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  if (size != max_size)
    {
      size = max_size;
      unscalarizable_region = true;
    }

  if (size < 0)
    {
      disqualify_candidate (base, "Encountered an ultra variable sized "
			    "access.");
      return NULL;
    }

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));

  access->base = base;
  access->offset = offset;
  access->size = size;
  access->expr = expr;
  access->type = TREE_TYPE (expr);
  access->write = write;
  access->grp_unscalarizable_region = unscalarizable_region;

  slot = pointer_map_contains (base_access_vec, base);
  if (slot)
    vec = (VEC (access_p, heap) *) *slot;
  else
    vec = VEC_alloc (access_p, heap, 32);

  VEC_safe_push (access_p, heap, vec, access);

  *((struct VEC (access_p,heap) **)
	pointer_map_insert (base_access_vec, base)) = vec;

  return access;
}


/* Callback of walk_tree.  Search the given tree for a declaration and exclude
   it from the candidates.  */

static tree
disqualify_all (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
{
  tree base = *tp;


  if (TREE_CODE (base) == SSA_NAME)
    base = SSA_NAME_VAR (base);

  if (DECL_P (base))
    {
      disqualify_candidate (base, "From within disqualify_all().");
      *walk_subtrees = 0;
    }
  else
    *walk_subtrees = 1;


  return NULL_TREE;
}

/* Scan expression EXPR and create access structures for all accesses to
   candidates for scalarization.  Return the created access or NULL if none is
   created.  */

static struct access *
build_access_from_expr_1 (tree *expr_ptr,
			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write)
{
  struct access *ret = NULL;
  tree expr = *expr_ptr;
  tree safe_expr = expr;
  bool bit_ref;

  if (TREE_CODE (expr) == BIT_FIELD_REF)
    {
      expr = TREE_OPERAND (expr, 0);
      bit_ref = true;
    }
  else
    bit_ref = false;

  while (TREE_CODE (expr) == NOP_EXPR
	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
	 || TREE_CODE (expr) == REALPART_EXPR
	 || TREE_CODE (expr) == IMAGPART_EXPR)
    expr = TREE_OPERAND (expr, 0);

  switch (TREE_CODE (expr))
    {
    case ADDR_EXPR:
    case SSA_NAME:
    case INDIRECT_REF:
      break;

    case VAR_DECL:
    case PARM_DECL:
    case RESULT_DECL:
    case COMPONENT_REF:
    case ARRAY_REF:
      ret = create_access (expr, write);
      break;

    case REALPART_EXPR:
    case IMAGPART_EXPR:
      expr = TREE_OPERAND (expr, 0);
      ret = create_access (expr, write);
      break;

    case ARRAY_RANGE_REF:
    default:
      walk_tree (&safe_expr, disqualify_all, NULL, NULL);
      break;
    }

  if (write && bit_ref && ret)
    ret->grp_bfr_lhs = 1;

  return ret;
}

/* Scan expression EXPR and create access structures for all accesses to
   candidates for scalarization.  Return true if any access has been
   inserted.  */

static bool
build_access_from_expr (tree *expr_ptr,
			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write,
			void *data ATTRIBUTE_UNUSED)
{
  return build_access_from_expr_1 (expr_ptr, gsi, write) != NULL;
}

/* Disqualify LHS and RHS for scalarization if STMT must end its basic block in
   modes in which it matters; return true iff they have been disqualified.  RHS
   may be NULL, in that case ignore it.  If we scalarize an aggregate in
   intra-SRA we may need to add statements after each statement.  This is not
   possible if a statement unconditionally has to end the basic block.  */
static bool
disqualify_ops_if_throwing_stmt (gimple stmt, tree *lhs, tree *rhs)
{
  if (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))
    {
      walk_tree (lhs, disqualify_all, NULL, NULL);
      if (rhs)
	walk_tree (rhs, disqualify_all, NULL, NULL);
      return true;
    }
  return false;
}


/* Result code for scan_assign callback for scan_function.  */
enum scan_assign_result {SRA_SA_NONE,       /* nothing done for the stmt */
			 SRA_SA_PROCESSED,  /* stmt analyzed/changed */
			 SRA_SA_REMOVED};   /* stmt redundant and eliminated */


/* Scan expressions occurring in the statement pointed to by STMT_PTR, create
   access structures for all accesses to candidates for scalarization and
   remove those candidates which occur in statements or expressions that
   prevent them from being split apart.  Return true if any access has been
   inserted.  */

static enum scan_assign_result
build_accesses_from_assign (gimple *stmt_ptr,
			    gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
			    void *data ATTRIBUTE_UNUSED)
{
  gimple stmt = *stmt_ptr;
  tree *lhs_ptr, *rhs_ptr;
  struct access *lacc, *racc;

  if (gimple_assign_rhs2 (stmt))
    return SRA_SA_NONE;

  lhs_ptr = gimple_assign_lhs_ptr (stmt);
  rhs_ptr = gimple_assign_rhs1_ptr (stmt);

  if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
    return SRA_SA_NONE;

  racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
  lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);

  if (lacc && racc
      && !lacc->grp_unscalarizable_region
      && !racc->grp_unscalarizable_region
      && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
      && useless_type_conversion_p (lacc->type, racc->type))
    {
      struct assign_link *link;

      link = (struct assign_link *) pool_alloc (link_pool);
      memset (link, 0, sizeof (struct assign_link));

      link->lacc = lacc;
      link->racc = racc;

      add_link_to_rhs (racc, link);
    }

  return (lacc || racc) ? SRA_SA_PROCESSED : SRA_SA_NONE;
}

/* Scan the function and look for interesting statements.  Return true if any
   have been found or processed, as indicated by the callbacks.  SCAN_EXPR is
   a callback called on all expressions within statements except assign
   statements and those deemed entirely unsuitable for some reason (all
   operands in such statements and expressions are removed from
   candidate_bitmap).  SCAN_ASSIGN is a callback called on all assign
   statements.  HANDLE_SSA_DEFS is a callback called on assign statements and
   those call statements which have a LHS; it is the only callback which can
   be NULL.  ANALYSIS_STAGE is true when
   running in the analysis stage of a pass and thus no statement is being
   modified.  DATA is a pointer passed to all callbacks.  If any single
   callback returns true, this function also returns true, otherwise it returns
   false.  */

static bool
scan_function (bool (*scan_expr) (tree *, gimple_stmt_iterator *, bool, void *),
	       enum scan_assign_result (*scan_assign) (gimple *,
						       gimple_stmt_iterator *,
						       void *),
	       bool (*handle_ssa_defs)(gimple, void *),
	       bool analysis_stage, void *data)
{
  gimple_stmt_iterator gsi;
  basic_block bb;
  unsigned i;
  tree *t;
  bool ret = false;

  FOR_EACH_BB (bb)
    {
      bool bb_changed = false;

      gsi = gsi_start_bb (bb);
      while (!gsi_end_p (gsi))
	{
	  gimple stmt = gsi_stmt (gsi);
	  enum scan_assign_result assign_result;
	  bool any = false, deleted = false;

	  switch (gimple_code (stmt))
	    {
	    case GIMPLE_RETURN:
	      t = gimple_return_retval_ptr (stmt);
	      if (*t != NULL_TREE)
		{
		  if (DECL_P (*t))
		    {
		      tree ret_type = TREE_TYPE (*t);
		      if (sra_mode == SRA_MODE_EARLY_INTRA
			  && (TREE_CODE (ret_type) == UNION_TYPE
			      || TREE_CODE (ret_type) == QUAL_UNION_TYPE))
			disqualify_candidate (*t,
					      "Union in a return statement.");
		      else
			bitmap_set_bit (retvals_bitmap, DECL_UID (*t));
		    }
		  any |= scan_expr (t, &gsi, false, data);
		}
	      break;

	    case GIMPLE_ASSIGN:
	      assign_result = scan_assign (&stmt, &gsi, data);
	      any |= assign_result == SRA_SA_PROCESSED;
	      deleted = assign_result == SRA_SA_REMOVED;
	      if (handle_ssa_defs && assign_result != SRA_SA_REMOVED)
		any |= handle_ssa_defs (stmt, data);
	      break;

	    case GIMPLE_CALL:
	      /* Operands must be processed before the lhs.  */
	      for (i = 0; i < gimple_call_num_args (stmt); i++)
		{
		  tree *argp = gimple_call_arg_ptr (stmt, i);
		  any |= scan_expr (argp, &gsi, false, data);
		}

	      if (gimple_call_lhs (stmt))
		{
		  tree *lhs_ptr = gimple_call_lhs_ptr (stmt);
		  if (!analysis_stage
		      || !disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, NULL))
		    {
		      any |= scan_expr (lhs_ptr, &gsi, true, data);
		      if (handle_ssa_defs)
			any |= handle_ssa_defs (stmt, data);
		    }
		}
	      break;

	    case GIMPLE_ASM:
	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_input_op (stmt, i));
		  any |= scan_expr (op, &gsi, false, data);
		}
	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_output_op (stmt, i));
		  any |= scan_expr (op, &gsi, true, data);
		}
	      break;

	    default:
	      if (analysis_stage)
		walk_gimple_op (stmt, disqualify_all, NULL);
	      break;
	    }

	  if (any)
	    {
	      ret = true;
	      bb_changed = true;

	      if (!analysis_stage)
		{
		  update_stmt (stmt);
		  if (!stmt_could_throw_p (stmt))
		    remove_stmt_from_eh_region (stmt);
		}
	    }
	  if (deleted)
	    bb_changed = true;
	  else
	    gsi_next (&gsi);
	}
      if (!analysis_stage && bb_changed)
	gimple_purge_dead_eh_edges (bb);
    }

  return ret;
}

/* Helper of the qsort function.  The array holds pointers to accesses.  An
   access is considered smaller than another if it has a smaller offset or if
   the offsets are the same but its size is bigger.  */

static int
compare_access_positions (const void *a, const void *b)
{
  const access_p *fp1 = (const access_p *) a;
  const access_p *fp2 = (const access_p *) b;
  const access_p f1 = *fp1;
  const access_p f2 = *fp2;

  if (f1->offset != f2->offset)
    return f1->offset < f2->offset ? -1 : 1;

  if (f1->size == f2->size)
    return 0;
  /* We want the bigger accesses first, thus the opposite operator in the next
     line: */
  return f1->size > f2->size ? -1 : 1;
}
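
/* For instance, accesses with (offset, size) pairs (0, 32), (0, 64) and
   (32, 32) are ordered as (0, 64), (0, 32), (32, 32), i.e. an enclosing
   access always precedes the accesses contained within it.  */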


/* Append a name of the declaration to the name obstack.  A helper function for
   make_fancy_name.  */

static void
make_fancy_decl_name (tree decl)
{
  char buffer[32];

  tree name = DECL_NAME (decl);
  if (name)
    obstack_grow (&name_obstack, IDENTIFIER_POINTER (name),
		  IDENTIFIER_LENGTH (name));
  else
    {
      sprintf (buffer, "D%u", DECL_UID (decl));
      obstack_grow (&name_obstack, buffer, strlen (buffer));
    }
}

/* Helper for make_fancy_name.  */

static void
make_fancy_name_1 (tree expr)
{
  char buffer[32];
  tree index;

  if (DECL_P (expr))
    {
      make_fancy_decl_name (expr);
      return;
    }

  switch (TREE_CODE (expr))
    {
    case COMPONENT_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      make_fancy_decl_name (TREE_OPERAND (expr, 1));
      break;

    case ARRAY_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      /* Arrays with only one element may not have a constant as their
	 index. */
      index = TREE_OPERAND (expr, 1);
      if (TREE_CODE (index) != INTEGER_CST)
	break;
      sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (index));
      obstack_grow (&name_obstack, buffer, strlen (buffer));

      break;

    case BIT_FIELD_REF:
    case REALPART_EXPR:
    case IMAGPART_EXPR:
      gcc_unreachable (); 	/* we treat these as scalars.  */
      break;
    default:
      break;
    }
}

/* Create a human readable name for a replacement variable based on EXPR.  */

static char *
make_fancy_name (tree expr)
{
  make_fancy_name_1 (expr);
  obstack_1grow (&name_obstack, '\0');
  return XOBFINISH (&name_obstack, char *);
}
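
/* E.g. for a hypothetical expression rec.fld.arr[3], the name built above
   would be "rec$fld$arr$3".  */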

/* Helper function for build_ref_for_offset.  */

static bool
build_ref_for_offset_1 (tree *res, tree type, HOST_WIDE_INT offset,
			tree exp_type)
{
  while (1)
    {
      tree fld;
      tree tr_size, index;
      HOST_WIDE_INT el_size;

      if (offset == 0 && exp_type
	  && useless_type_conversion_p (exp_type, type))
	return true;

      switch (TREE_CODE (type))
	{
	case UNION_TYPE:
	case QUAL_UNION_TYPE:
	case RECORD_TYPE:
	  /* Some Ada records are half-unions; treat all of them the same.  */
	  for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	    {
	      HOST_WIDE_INT pos, size;
	      tree expr, *expr_ptr;

	      if (TREE_CODE (fld) != FIELD_DECL)
		continue;

	      pos = int_bit_position (fld);
	      gcc_assert (TREE_CODE (type) == RECORD_TYPE || pos == 0);
	      size = tree_low_cst (DECL_SIZE (fld), 1);
	      if (pos > offset || (pos + size) <= offset)
		continue;

	      if (res)
		{
		  expr = build3 (COMPONENT_REF, TREE_TYPE (fld), *res, fld,
				 NULL_TREE);
		  expr_ptr = &expr;
		}
	      else
		expr_ptr = NULL;
	      if (build_ref_for_offset_1 (expr_ptr, TREE_TYPE (fld),
					  offset - pos, exp_type))
		{
		  if (res)
		    *res = expr;
		  return true;
		}
	    }
	  return false;

	case ARRAY_TYPE:
	  tr_size = TYPE_SIZE (TREE_TYPE (type));
	  if (!tr_size || !host_integerp (tr_size, 1))
	    return false;
	  el_size = tree_low_cst (tr_size, 1);

	  index = build_int_cst (TYPE_DOMAIN (type), offset / el_size);
	  if (!integer_zerop (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
	    index = int_const_binop (PLUS_EXPR, index,
				     TYPE_MIN_VALUE (TYPE_DOMAIN (type)), 0);
	  if (res)
	    *res = build4 (ARRAY_REF, TREE_TYPE (type), *res, index, NULL_TREE,
			   NULL_TREE);
	  offset = offset % el_size;
	  type = TREE_TYPE (type);
	  break;

	default:
	  if (offset != 0)
	    return false;

	  if (exp_type)
	    return false;
	  else
	    return true;
	}
    }
}

/* Construct an expression that would reference a part of aggregate *EXPR of
   type TYPE at the given OFFSET of the type EXP_TYPE.  If EXPR is NULL, the
   function only determines whether it can build such a reference without
   actually doing it.

   FIXME: Eventually this should be replaced with
   maybe_fold_offset_to_reference() from tree-ssa-ccp.c but that requires a
   minor rewrite of fold_stmt.
 */

static bool
build_ref_for_offset (tree *expr, tree type, HOST_WIDE_INT offset,
		      tree exp_type, bool allow_ptr)
{
  if (allow_ptr && POINTER_TYPE_P (type))
    {
      type = TREE_TYPE (type);
      if (expr)
	*expr = fold_build1 (INDIRECT_REF, type, *expr);
    }

  return build_ref_for_offset_1 (expr, type, offset, exp_type);
}
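
/* As an example (made-up types, assuming 32-bit int), for TYPE being

     struct S { int i; int j; };

   a call with OFFSET 32 (in bits) and EXP_TYPE of int descends into the
   field j and builds a COMPONENT_REF selecting j from *EXPR.  */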

/* The very first phase of intraprocedural SRA.  It marks in candidate_bitmap
   those with type which is suitable for scalarization.  */

static bool
find_var_candidates (void)
{
  tree var, type;
  referenced_var_iterator rvi;
  bool ret = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    {
      if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
        continue;
      type = TREE_TYPE (var);

      if (!AGGREGATE_TYPE_P (type)
	  || needs_to_live_in_memory (var)
	  || TREE_THIS_VOLATILE (var)
	  || !COMPLETE_TYPE_P (type)
	  || !host_integerp (TYPE_SIZE (type), 1)
          || tree_low_cst (TYPE_SIZE (type), 1) == 0
	  || type_internals_preclude_sra_p (type))
	continue;

      bitmap_set_bit (candidate_bitmap, DECL_UID (var));

      if (dump_file)
	{
	  fprintf (dump_file, "Candidate (%d): ", DECL_UID (var));
	  print_generic_expr (dump_file, var, 0);
	  fprintf (dump_file, "\n");
	}
      ret = true;
    }

  return ret;
}

/* Return true if TYPE should be considered a scalar type by SRA.  */

static bool
is_sra_scalar_type (tree type)
{
  enum tree_code code = TREE_CODE (type);
  return (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type)
	  || FIXED_POINT_TYPE_P (type) || POINTER_TYPE_P (type)
	  || code == VECTOR_TYPE || code == COMPLEX_TYPE
	  || code == OFFSET_TYPE);
}


/* Sort all accesses for the given variable, check for partial overlaps and
   return NULL if there are any.  If there are none, pick a representative for
   each combination of offset and size and create a linked list out of them.
   Return the pointer to the first representative and make sure it is the first
   one in the vector of accesses.  */

static struct access *
sort_and_splice_var_accesses (tree var)
{
  int i, j, access_count;
  struct access *res, **prev_acc_ptr = &res;
  VEC (access_p, heap) *access_vec;
  bool first = true;
  HOST_WIDE_INT low = -1, high = 0;

  access_vec = get_base_access_vector (var);
  if (!access_vec)
    return NULL;
  access_count = VEC_length (access_p, access_vec);

  /* Sort by <OFFSET, SIZE>.  */
  qsort (VEC_address (access_p, access_vec), access_count, sizeof (access_p),
	 compare_access_positions);

  i = 0;
  while (i < access_count)
    {
      struct access *access = VEC_index (access_p, access_vec, i);
      bool modification = access->write;
      bool grp_read = !access->write;
      bool grp_bfr_lhs = access->grp_bfr_lhs;
      bool first_scalar = is_sra_scalar_type (access->type);
      bool unscalarizable_region = access->grp_unscalarizable_region;

      if (first || access->offset >= high)
	{
	  first = false;
	  low = access->offset;
	  high = access->offset + access->size;
	}
      else if (access->offset > low && access->offset + access->size > high)
	return NULL;
      else
	gcc_assert (access->offset >= low
		    && access->offset + access->size <= high);

      j = i + 1;
      while (j < access_count)
	{
	  struct access *ac2 = VEC_index (access_p, access_vec, j);
	  if (ac2->offset != access->offset || ac2->size != access->size)
	    break;
	  modification |= ac2->write;
	  grp_read |= !ac2->write;
	  grp_bfr_lhs |= ac2->grp_bfr_lhs;
	  unscalarizable_region |= ac2->grp_unscalarizable_region;
	  relink_to_new_repr (access, ac2);

	  /* If one of the equivalent accesses is scalar, use it as a
	     representative (this happens, for example, when there is a
	     single scalar field in a structure).  */
	  if (!first_scalar && is_sra_scalar_type (ac2->type))
	    {
	      struct access tmp_acc;
	      first_scalar = true;

	      memcpy (&tmp_acc, ac2, sizeof (struct access));
	      memcpy (ac2, access,  sizeof (struct access));
	      memcpy (access, &tmp_acc, sizeof (struct access));
	    }
	  ac2->group_representative = access;
	  j++;
	}

      i = j;

      access->group_representative = access;
      access->grp_write = modification;
      access->grp_read = grp_read;
      access->grp_bfr_lhs = grp_bfr_lhs;
      access->grp_unscalarizable_region = unscalarizable_region;
      if (access->first_link)
	add_access_to_work_queue (access);

      *prev_acc_ptr = access;
      prev_acc_ptr = &access->next_grp;
    }

  gcc_assert (res == VEC_index (access_p, access_vec, 0));
  return res;
}

/* Create a variable for the given ACCESS which determines the type, name and a
   few other properties.  Return the variable declaration and store it also to
   ACCESS->replacement.  */

static tree
create_access_replacement (struct access *access)
{
  tree repl;

  repl = make_rename_temp (access->type, "SR");
  get_var_ann (repl);
  add_referenced_var (repl);

  DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
  DECL_ARTIFICIAL (repl) = 1;

  if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
    {
      char *pretty_name = make_fancy_name (access->expr);

      DECL_NAME (repl) = get_identifier (pretty_name);
      obstack_free (&name_obstack, pretty_name);

      SET_DECL_DEBUG_EXPR (repl, access->expr);
      DECL_DEBUG_EXPR_IS_FROM (repl) = 1;
      DECL_IGNORED_P (repl) = 0;
      TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);
    }
  else
    {
      DECL_IGNORED_P (repl) = 1;
      TREE_NO_WARNING (repl) = 1;
    }

  if (access->grp_bfr_lhs)
    DECL_GIMPLE_REG_P (repl) = 0;

  if (dump_file)
    {
      fprintf (dump_file, "Created a replacement for ");
      print_generic_expr (dump_file, access->base, 0);
      fprintf (dump_file, " offset: %u, size: %u: ",
	       (unsigned) access->offset, (unsigned) access->size);
      print_generic_expr (dump_file, repl, 0);
      fprintf (dump_file, "\n");
    }

  return repl;
}

/* Return ACCESS scalar replacement, create it if it does not exist yet.  */

static inline tree
get_access_replacement (struct access *access)
{
  gcc_assert (access->grp_to_be_replaced);

  if (access->replacement_decl)
    return access->replacement_decl;

  access->replacement_decl = create_access_replacement (access);
  return access->replacement_decl;
}

/* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
   linked list along the way.  Stop when *ACCESS is NULL or the access pointed
   to it is not "within" the root.  */

static void
build_access_subtree (struct access **access)
{
  struct access *root = *access, *last_child = NULL;
  HOST_WIDE_INT limit = root->offset + root->size;

  *access = (*access)->next_grp;
  while  (*access && (*access)->offset + (*access)->size <= limit)
    {
      if (!last_child)
	root->first_child = *access;
      else
	last_child->next_sibling = *access;
      last_child = *access;

      build_access_subtree (access);
    }
}

/* Build a tree of access representatives.  ACCESS is the pointer to the first
   one, others are linked in a list by the next_grp field.  */

static void
build_access_trees (struct access *access)
{
  while (access)
    {
      struct access *root = access;

      build_access_subtree (&access);
      root->next_grp = access;
    }
}

/* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
   they seem beneficial and ALLOW_REPLACEMENTS allows it.  Also set all sorts
   of access flags appropriately along the way, notably always set grp_read
   when MARK_READ is true and grp_write when MARK_WRITE is true.  */

static bool
analyze_access_subtree (struct access *root, bool allow_replacements,
			bool mark_read, bool mark_write)
{
  struct access *child;
  HOST_WIDE_INT limit = root->offset + root->size;
  HOST_WIDE_INT covered_to = root->offset;
  bool scalar = is_sra_scalar_type (root->type);
  bool hole = false, sth_created = false;

  if (mark_read)
    root->grp_read = true;
  else if (root->grp_read)
    mark_read = true;

  if (mark_write)
    root->grp_write = true;
  else if (root->grp_write)
    mark_write = true;

  if (root->grp_unscalarizable_region)
    allow_replacements = false;

  for (child = root->first_child; child; child = child->next_sibling)
    {
      if (!hole && child->offset < covered_to)
	hole = true;
      else
	covered_to += child->size;

      sth_created |= analyze_access_subtree (child,
					     allow_replacements && !scalar,
					     mark_read, mark_write);

      root->grp_unscalarized_data |= child->grp_unscalarized_data;
      hole |= !child->grp_covered;
    }

  if (allow_replacements && scalar && !root->first_child)
    {
      if (dump_file)
	{
	  fprintf (dump_file, "Marking ");
	  print_generic_expr (dump_file, root->base, 0);
	  fprintf (dump_file, " offset: %u, size: %u: ",
		   (unsigned) root->offset, (unsigned) root->size);
	  fprintf (dump_file, " to be replaced.\n");
	}

      root->grp_to_be_replaced = 1;
      sth_created = true;
      hole = false;
    }
  else if (covered_to < limit)
    hole = true;

  if (sth_created && !hole)
    {
      root->grp_covered = 1;
      return true;
    }
  if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
    root->grp_unscalarized_data = 1; /* not covered and written to */
  if (sth_created)
    return true;
  return false;
}
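
/* For instance, given a hypothetical struct { int i; int j; } accessed only
   through its field i, the subtree rooted at the whole-variable access is
   left with a hole (j is never covered), so grp_covered stays zero and
   grp_unscalarized_data is set if the aggregate is also written to (or is
   a PARM_DECL).  */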

/* Analyze all access trees linked by next_grp by the means of
   analyze_access_subtree.  */
static bool
analyze_access_trees (struct access *access)
{
  bool ret = false;

  while (access)
    {
      if (analyze_access_subtree (access, true, false, false))
	ret = true;
      access = access->next_grp;
    }

  return ret;
}

/* Return true iff a potential new child of LACC at offset OFFSET and with size
   SIZE would conflict with an already existing one.  If exactly such a child
   already exists in LACC, store a pointer to it in EXACT_MATCH.  */

static bool
child_would_conflict_in_lacc (struct access *lacc, HOST_WIDE_INT norm_offset,
			      HOST_WIDE_INT size, struct access **exact_match)
{
  struct access *child;

  for (child = lacc->first_child; child; child = child->next_sibling)
    {
      if (child->offset == norm_offset && child->size == size)
	{
	  *exact_match = child;
	  return true;
	}

      if (child->offset < norm_offset + size
	  && child->offset + child->size > norm_offset)
	return true;
    }

  return false;
}

/* Set the expr of TARGET to one just like MODEL but with its own base at the
   bottom of the handled components.  */

static void
duplicate_expr_for_different_base (struct access *target,
				   struct access *model)
{
  tree t, expr = unshare_expr (model->expr);

  gcc_assert (handled_component_p (expr));
  t = expr;
  while (handled_component_p (TREE_OPERAND (t, 0)))
    t = TREE_OPERAND (t, 0);
  gcc_assert (TREE_OPERAND (t, 0) == model->base);
  TREE_OPERAND (t, 0) = target->base;

  target->expr = expr;
}


/* Create a new child access of PARENT, with all properties just like MODEL
   except for its offset and with its grp_write true and grp_read false.
   Return the new access.  Note that this access is created long after all
   splicing and sorting; it is not located in any access vector and is
   automatically a representative of its group.  */

static struct access *
create_artificial_child_access (struct access *parent, struct access *model,
				HOST_WIDE_INT new_offset)
{
  struct access *access;
  struct access **child;

  gcc_assert (!model->grp_unscalarizable_region);

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));
  access->base = parent->base;
  access->offset = new_offset;
  access->size = model->size;
  duplicate_expr_for_different_base (access, model);
  access->type = model->type;
  access->grp_write = true;
  access->grp_read = false;

  child = &parent->first_child;
  while (*child && (*child)->offset < new_offset)
    child = &(*child)->next_sibling;

  access->next_sibling = *child;
  *child = access;

  return access;
}


/* Propagate all subaccesses of RACC across an assignment link to LACC. Return
   true if any new subaccess was created.  Additionally, if RACC is a scalar
   access but LACC is not, change the type of the latter.  */

static bool
propagate_subacesses_accross_link (struct access *lacc, struct access *racc)
{
  struct access *rchild;
  HOST_WIDE_INT norm_delta = lacc->offset - racc->offset;
  HOST_WIDE_INT lbase_size = tree_low_cst (DECL_SIZE (lacc->base), 1);

  bool ret = false;

  if (is_sra_scalar_type (lacc->type)
      || lacc->grp_unscalarizable_region
      || racc->grp_unscalarizable_region)
    return false;

  if (!lacc->first_child && !racc->first_child
      && is_sra_scalar_type (racc->type)
      && (sra_mode == SRA_MODE_INTRA
          || !bitmap_bit_p (retvals_bitmap, DECL_UID (lacc->base))))
    {
      duplicate_expr_for_different_base (lacc, racc);
      lacc->type = racc->type;
      return false;
    }

  for (rchild = racc->first_child; rchild; rchild = rchild->next_sibling)
    {
      struct access *new_acc = NULL;
      HOST_WIDE_INT norm_offset = rchild->offset + norm_delta;

      if (rchild->grp_unscalarizable_region
	  || norm_offset + rchild->size > lbase_size)
	continue;

      if (child_would_conflict_in_lacc (lacc, norm_offset, rchild->size,
					&new_acc))
	{
	  if (new_acc && rchild->first_child)
	    ret |= propagate_subacesses_accross_link (new_acc, rchild);
	  continue;
	}

      new_acc = create_artificial_child_access (lacc, rchild, norm_offset);
      if (racc->first_child)
	propagate_subacesses_accross_link (new_acc, rchild);

      ret = true;
    }

  return ret;
}

/* Propagate all subaccesses across assignment links.  */

static void
propagate_all_subaccesses (void)
{
  while (work_queue_head)
    {
      struct access *racc = pop_access_from_work_queue ();
      struct assign_link *link;

      gcc_assert (racc->first_link);

      for (link = racc->first_link; link; link = link->next)
	{
	  struct access *lacc = link->lacc;

	  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
	    continue;
	  lacc = lacc->group_representative;
	  if (propagate_subacesses_accross_link (lacc, racc)
	      && lacc->first_link)
	    add_access_to_work_queue (lacc);
	}
    }
}

/* Go through all accesses collected throughout the (intraprocedural) analysis
   stage, exclude overlapping ones, identify representatives and build trees
   out of them, making decisions about scalarization on the way.  Return true
   iff there are any to-be-scalarized variables after this stage. */

static bool
analyze_all_variable_accesses (void)
{
  tree var;
  referenced_var_iterator rvi;
  bool res = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access;

	access = sort_and_splice_var_accesses (var);
	if (access)
	  build_access_trees (access);
	else
	  disqualify_candidate (var,
				"No or inhibitingly overlapping accesses.");
      }

  propagate_all_subaccesses ();

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access = get_first_repr_for_decl (var);

	if (analyze_access_trees (access))
	  {
	    res = true;
	    if (dump_file)
	      {
		fprintf (dump_file, "\nAccess trees for ");
		print_generic_expr (dump_file, var, 0);
		fprintf (dump_file, " (UID: %u): \n", DECL_UID (var));
		dump_access_tree (dump_file, access);
		fprintf (dump_file, "\n");
	      }
	  }
	else
	  disqualify_candidate (var, "No scalar replacements to be created.");
      }

  return res;
}

/* Return true iff a reference statement into aggregate AGG can be built for
   every single to-be-replaced access that is a child of ACCESS, its sibling
   or a child of its sibling.  TOP_OFFSET is the offset of the processed access
   subtree and has to be subtracted from the offset of each access.  */

static bool
ref_expr_for_all_replacements_p (struct access *access, tree agg,
				 HOST_WIDE_INT top_offset)
{
  do
    {
      if (access->grp_to_be_replaced
	  && !build_ref_for_offset (NULL, TREE_TYPE (agg),
				    access->offset - top_offset,
				    access->type, false))
	return false;

      if (access->first_child
	  && !ref_expr_for_all_replacements_p (access->first_child, agg,
					       top_offset))
	return false;

      access = access->next_sibling;
    }
  while (access);

  return true;
}


/* Generate statements copying scalar replacements of accesses within a subtree
   into or out of AGG.  ACCESS is the first child of the root of the subtree to
   be processed.  AGG is an aggregate type expression (can be a declaration but
   does not have to be, it can for example also be an indirect_ref).
   TOP_OFFSET is the offset of the processed subtree which has to be subtracted
   from offsets of individual accesses to get corresponding offsets for AGG.
   If CHUNK_SIZE is non-zero, copy only replacements in the interval
   <start_offset, start_offset + chunk_size>, otherwise copy all.  GSI is a
   statement iterator used to place the new statements.  WRITE should be true
   when the statements should write from AGG to the replacement and false if
   vice versa.  If INSERT_AFTER is true, new statements will be added after the
   current statement in GSI, they will be added before the statement
   otherwise.  */

static void
generate_subtree_copies (struct access *access, tree agg,
			 HOST_WIDE_INT top_offset,
			 HOST_WIDE_INT start_offset, HOST_WIDE_INT chunk_size,
			 gimple_stmt_iterator *gsi, bool write,
			 bool insert_after)
{
  do
    {
      tree expr = unshare_expr (agg);

      if (chunk_size && access->offset >= start_offset + chunk_size)
	return;

      if (access->grp_to_be_replaced
	  && (chunk_size == 0
	      || access->offset + access->size > start_offset))
	{
	  bool repl_found;
	  gimple stmt;

	  repl_found = build_ref_for_offset (&expr, TREE_TYPE (agg),
					     access->offset - top_offset,
					     access->type, false);
	  gcc_assert (repl_found);

	  if (write)
	    stmt = gimple_build_assign (get_access_replacement (access), expr);
	  else
	    {
	      tree repl = get_access_replacement (access);
	      TREE_NO_WARNING (repl) = 1;
	      stmt = gimple_build_assign (expr, repl);
	    }

	  if (insert_after)
	    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  else
	    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}

      if (access->first_child)
	generate_subtree_copies (access->first_child, agg, top_offset,
				 start_offset, chunk_size, gsi,
				 write, insert_after);

      access = access->next_sibling;
    }
  while (access);
}

/* Assign zero to all scalar replacements in an access subtree.  ACCESS is
   the root of the subtree to be processed.  GSI is the statement iterator used
   for inserting statements which are added after the current statement if
   INSERT_AFTER is true or before it otherwise.  */

static void
init_subtree_with_zero (struct access *access, gimple_stmt_iterator *gsi,
			bool insert_after)

{
  struct access *child;

  if (access->grp_to_be_replaced)
    {
      gimple stmt;

      stmt = gimple_build_assign (get_access_replacement (access),
				  fold_convert (access->type,
						integer_zero_node));
      if (insert_after)
	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
      else
	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
      update_stmt (stmt);
    }

  for (child = access->first_child; child; child = child->next_sibling)
    init_subtree_with_zero (child, gsi, insert_after);
}

/* Search for an access representative for the given expression EXPR and
   return it or NULL if it cannot be found.  */

static struct access *
get_access_for_expr (tree expr)
{
  HOST_WIDE_INT offset, size, max_size;
  tree base;

  if (TREE_CODE (expr) == NOP_EXPR
      || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    expr = TREE_OPERAND (expr, 0);

  if (handled_component_p (expr))
    {
      base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
      size = max_size;
      if (size == -1 || !base || !DECL_P (base))
	return NULL;
    }
  else if (DECL_P (expr))
    {
      tree tree_size;

      base = expr;
      tree_size = TYPE_SIZE (TREE_TYPE (base));
      if (tree_size && host_integerp (tree_size, 1))
	size = max_size = tree_low_cst (tree_size, 1);
      else
	return NULL;

      offset = 0;
    }
  else
    return NULL;

  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  return get_var_base_offset_size_access (base, offset, size);
}

/* Substitute into *EXPR, which has type TYPE, the value of the replacement of
   ACCESS.  This is done either by producing a special V_C_E assignment
   statement converting the replacement to a new temporary of the requested
   type if TYPE is not TREE_ADDRESSABLE, or by going through the base
   aggregate if it is.  GSI and WRITE have the same meaning as in
   sra_modify_expr.  */

static void
fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
				 gimple_stmt_iterator *gsi, bool write)
{
  tree repl = get_access_replacement (access);
  if (!TREE_ADDRESSABLE (type))
    {
      tree tmp = create_tmp_var (type, "SRvce");

      add_referenced_var (tmp);
      if (is_gimple_reg_type (type))
	tmp = make_ssa_name (tmp, NULL);

      if (write)
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);

	  *expr = tmp;
	  if (is_gimple_reg_type (type))
	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
	  stmt = gimple_build_assign (repl, conv);
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);

	  stmt = gimple_build_assign (tmp, conv);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  if (is_gimple_reg_type (type))
	    SSA_NAME_DEF_STMT (tmp) = stmt;
	  *expr = tmp;
	  update_stmt (stmt);
	}
    }
  else
    {
      if (write)
	{
	  gimple stmt;

	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;

	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}
    }
}
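
/* As an example (a hand-written sketch with made-up names), if an expression
   of type int is to be replaced by a replacement of type float of the same
   size, a read (WRITE false) produces, before the statement,

     SRvce.1 = VIEW_CONVERT_EXPR<int>(s$f);

   and substitutes SRvce.1 into *EXPR, while a write substitutes the new
   temporary into *EXPR and emits, after the statement,

     s$f = VIEW_CONVERT_EXPR<float>(SRvce.1);  */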


/* Callback for scan_function.  Replace the expression EXPR with a scalar
   replacement if there is one and generate other statements to do type
   conversion or subtree copying if necessary.  GSI is used to place newly
   created statements, WRITE is true if the expression is being written to (it
   is on the LHS of a statement or an output in an assembly statement).  */

static bool
sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write,
		 void *data ATTRIBUTE_UNUSED)
{
  struct access *access;
  tree type, bfr;

  if (TREE_CODE (*expr) == BIT_FIELD_REF)
    {
      bfr = *expr;
      expr = &TREE_OPERAND (*expr, 0);
    }
  else
    bfr = NULL_TREE;

  if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
    expr = &TREE_OPERAND (*expr, 0);
  type = TREE_TYPE (*expr);

  access = get_access_for_expr (*expr);
  if (!access)
    return false;

  if (access->grp_to_be_replaced)
    {
      if (!useless_type_conversion_p (type, access->type))
	fix_incompatible_types_for_expr (expr, type, access, gsi, write);
      else
	*expr = get_access_replacement (access);
    }

  if (access->first_child)
    {
      HOST_WIDE_INT start_offset, chunk_size;
      if (bfr
	  && host_integerp (TREE_OPERAND (bfr, 1), 1)
	  && host_integerp (TREE_OPERAND (bfr, 2), 1))
	{
	  start_offset = tree_low_cst (TREE_OPERAND (bfr, 1), 1);
	  chunk_size = tree_low_cst (TREE_OPERAND (bfr, 2), 1);
	}
      else
	start_offset = chunk_size = 0;

      generate_subtree_copies (access->first_child, access->base, 0,
			       start_offset, chunk_size, gsi, write, write);
    }
  return true;
}
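
/* For instance (an illustrative sketch only, modulo later SSA renaming), a
   statement

     x = s.f;

   where s.f has a scalar replacement of a compatible type simply becomes

     x = s$f;

   and if a BIT_FIELD_REF covers replaced sub-parts of s, those replacements
   are first copied back into s by generate_subtree_copies above.  */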

/* Store all replacements in the access tree rooted in TOP_RACC either to their
   base aggregate if there are unscalarized data or directly to LHS
   otherwise.  */

static void
handle_unscalarized_data_in_subtree (struct access *top_racc, tree lhs,
				     gimple_stmt_iterator *gsi)
{
  if (top_racc->grp_unscalarized_data)
    generate_subtree_copies (top_racc->first_child, top_racc->base, 0, 0, 0,
			     gsi, false, false);
  else
    generate_subtree_copies (top_racc->first_child, lhs, top_racc->offset,
			     0, 0, gsi, false, false);
}


/* Try to generate statements to load all sub-replacements in an access
   (sub)tree (LACC is the first child) from scalar replacements in the TOP_RACC
   (sub)tree.  If that is not possible, refresh the TOP_RACC base aggregate and
   load the accesses from it.  LEFT_OFFSET is the offset of the whole left
   subtree being copied, RIGHT_OFFSET is the same thing for the right subtree.
   OLD_GSI is the statement iterator used to insert statements refreshing the
   aggregate, NEW_GSI is the one used to insert the loading statements.
   *REFRESHED is true iff the rhs top aggregate has already been refreshed by
   the contents of its scalar replacements and is set to true if this function
   has to do it.  LHS is passed to handle_unscalarized_data_in_subtree when
   refreshing.  */

static void
load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
				 HOST_WIDE_INT left_offset,
				 HOST_WIDE_INT right_offset,
				 gimple_stmt_iterator *old_gsi,
				 gimple_stmt_iterator *new_gsi,
				 bool *refreshed, tree lhs)
{
  do
    {
      if (lacc->grp_to_be_replaced)
	{
	  struct access *racc;
	  HOST_WIDE_INT offset = lacc->offset - left_offset + right_offset;

	  racc = find_access_in_subtree (top_racc, offset, lacc->size);
	  if (racc && racc->grp_to_be_replaced)
	    {
	      gimple stmt;

	      if (useless_type_conversion_p (lacc->type, racc->type))
		stmt = gimple_build_assign (get_access_replacement (lacc),
					    get_access_replacement (racc));
	      else
		{
		  tree rhs = fold_build1 (VIEW_CONVERT_EXPR, lacc->type,
					  get_access_replacement (racc));
		  stmt = gimple_build_assign (get_access_replacement (lacc),
					      rhs);
		}

	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	  else
	    {
	      tree expr = unshare_expr (top_racc->base);
	      bool repl_found;
	      gimple stmt;

	      /* No suitable access on the right hand side, need to load from
		 the aggregate.  See if we have to update it first... */
	      if (!*refreshed)
		{
		  gcc_assert (top_racc->first_child);
		  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
		  *refreshed = true;
		}

	      repl_found = build_ref_for_offset (&expr,
						 TREE_TYPE (top_racc->base),
						 lacc->offset - left_offset,
						 lacc->type, false);
	      gcc_assert (repl_found);
	      stmt = gimple_build_assign (get_access_replacement (lacc),
					  expr);
	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	}
      else if (lacc->grp_read && !lacc->grp_covered && !*refreshed)
	{
	  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
	  *refreshed = true;
	}

      if (lacc->first_child)
	load_assign_lhs_subreplacements (lacc->first_child, top_racc,
					 left_offset, right_offset,
					 old_gsi, new_gsi, refreshed, lhs);
      lacc = lacc->next_sibling;
    }
  while (lacc);
}

/* Return true iff ACC is non-NULL and has subaccesses.  */

static inline bool
access_has_children_p (struct access *acc)
{
  return acc && acc->first_child;
}

/* Modify assignments with a CONSTRUCTOR on their RHS.  STMT is a pointer to
   the assign statement and GSI is the statement iterator pointing at it.
   Returns the same values as sra_modify_assign.  */

static enum scan_assign_result
sra_modify_constructor_assign (gimple *stmt, gimple_stmt_iterator *gsi)
{
  tree lhs = gimple_assign_lhs (*stmt);
  struct access *acc;

  gcc_assert (TREE_CODE (lhs) != REALPART_EXPR
	      && TREE_CODE (lhs) != IMAGPART_EXPR);
  acc = get_access_for_expr (lhs);
  if (!acc)
    return SRA_SA_NONE;

  if (VEC_length (constructor_elt,
		  CONSTRUCTOR_ELTS (gimple_assign_rhs1 (*stmt))) > 0)
    {
      /* I have never seen this code path trigger but if it can happen the
	 following should handle it gracefully.  */
      if (access_has_children_p (acc))
	generate_subtree_copies (acc->first_child, acc->base, 0, 0, 0, gsi,
				 true, true);
      return SRA_SA_PROCESSED;
    }

  if (acc->grp_covered)
    {
      init_subtree_with_zero (acc, gsi, false);
      unlink_stmt_vdef (*stmt);
      gsi_remove (gsi, true);
      return SRA_SA_REMOVED;
    }
  else
    {
      init_subtree_with_zero (acc, gsi, true);
      return SRA_SA_PROCESSED;
    }
}


/* Modify statements which have an IMAGPART_EXPR or REALPART_EXPR of a
   to-be-scalarized aggregate on their lhs.  STMT is the statement and GSI is
   the iterator used to place new helper statements.  Returns the same values
   as sra_modify_assign.  */

static enum scan_assign_result
sra_modify_partially_complex_lhs (gimple stmt, gimple_stmt_iterator *gsi)
{
  tree lhs, complex, ptype, rp, ip;
  struct access *access;
  gimple new_stmt, aux_stmt;

  lhs = gimple_assign_lhs (stmt);
  complex = TREE_OPERAND (lhs, 0);

  access = get_access_for_expr (complex);

  if (!access || !access->grp_to_be_replaced)
    return SRA_SA_NONE;

  ptype = TREE_TYPE (TREE_TYPE (complex));
  rp = create_tmp_var (ptype, "SRr");
  add_referenced_var (rp);
  rp = make_ssa_name (rp, NULL);

  ip = create_tmp_var (ptype, "SRp");
  add_referenced_var (ip);
  ip = make_ssa_name (ip, NULL);

  if (TREE_CODE (lhs) == IMAGPART_EXPR)
    {
      aux_stmt = gimple_build_assign (rp, fold_build1 (REALPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (rp) = aux_stmt;
      gimple_assign_set_lhs (stmt, ip);
      SSA_NAME_DEF_STMT (ip) = stmt;
    }
  else
    {
      aux_stmt = gimple_build_assign (ip, fold_build1 (IMAGPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (ip) = aux_stmt;
      gimple_assign_set_lhs (stmt, rp);
      SSA_NAME_DEF_STMT (rp) = stmt;
    }

  gsi_insert_before (gsi, aux_stmt, GSI_SAME_STMT);
  update_stmt (aux_stmt);
  new_stmt = gimple_build_assign (get_access_replacement (access),
				  fold_build2 (COMPLEX_EXPR, access->type,
					       rp, ip));
  gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
  update_stmt (new_stmt);
  return SRA_SA_PROCESSED;
}
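
/* An illustration of the above (a sketch with made-up names, where SR$c
   stands for the replacement variable of complex variable c and SSA renaming
   happens later):

     __imag c = x_1;

   becomes

     SRr_2 = REALPART_EXPR <SR$c>;
     SRp_3 = x_1;
     SR$c = COMPLEX_EXPR <SRr_2, SRp_3>;  */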

/* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */

static bool
contains_view_convert_expr_p (tree t)
{
  while (1)
    {
      if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
	return true;
      if (!handled_component_p (t))
	return false;
      t = TREE_OPERAND (t, 0);
    }
}

/* Change STMT to assign compatible types by means of adding component or array
   references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
   variables with the same names in sra_modify_assign.  This is done in such a
   complicated way in order to make
   testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy, and so it helps in at least
   some cases.  */

static void
fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
				   struct access *lacc, struct access *racc,
				   tree lhs, tree *rhs, tree ltype, tree rtype)
{
  if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
      && !access_has_children_p (lacc))
    {
      tree expr = unshare_expr (lhs);
      bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
					 false);
      if (found)
	{
	  gimple_assign_set_lhs (*stmt, expr);
	  return;
	}
    }

  if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
      && !access_has_children_p (racc))
    {
      tree expr = unshare_expr (*rhs);
      bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
					 false);
      if (found)
	{
	  gimple_assign_set_rhs1 (*stmt, expr);
	  return;
	}
    }

  *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
  gimple_assign_set_rhs_from_tree (gsi, *rhs);
  *stmt = gsi_stmt (*gsi);
}

/* Callback of scan_function to process assign statements.  It examines both
   sides of the statement, replaces them with a scalar replacement if there is
   one and generates copying of replacements if scalarized aggregates have been
   used in the assignment.  STMT is a pointer to the assign statement, GSI is
   used to place generated statements for type conversions and subtree
   copying.  */

static enum scan_assign_result
sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
		   void *data ATTRIBUTE_UNUSED)
{
  struct access *lacc, *racc;
  tree ltype, rtype;
  tree lhs, rhs;
  bool modify_this_stmt;

  if (gimple_assign_rhs2 (*stmt))
    return SRA_SA_NONE;
  lhs = gimple_assign_lhs (*stmt);
  rhs = gimple_assign_rhs1 (*stmt);

  if (TREE_CODE (rhs) == CONSTRUCTOR)
    return sra_modify_constructor_assign (stmt, gsi);

  if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
    return sra_modify_partially_complex_lhs (*stmt, gsi);

  if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
      || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
    {
      modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
					  gsi, false, data);
      modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
					   gsi, true, data);
      return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
    }

  lacc = get_access_for_expr (lhs);
  racc = get_access_for_expr (rhs);
  if (!lacc && !racc)
    return SRA_SA_NONE;

  modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
		      || (racc && racc->grp_to_be_replaced));

  if (lacc && lacc->grp_to_be_replaced)
    {
      lhs = get_access_replacement (lacc);
      gimple_assign_set_lhs (*stmt, lhs);
      ltype = lacc->type;
    }
  else
    ltype = TREE_TYPE (lhs);

  if (racc && racc->grp_to_be_replaced)
    {
      rhs = get_access_replacement (racc);
      gimple_assign_set_rhs1 (*stmt, rhs);
      rtype = racc->type;
    }
  else
    rtype = TREE_TYPE (rhs);

  /* fix_modified_assign_compatibility() may call
     gimple_assign_set_rhs_from_tree(), which might reallocate the statement;
     *stmt is therefore reloaded there, which makes the position of this check
     a bit awkward but hopefully makes some sense.  */
  if (modify_this_stmt)
    {
      if (!useless_type_conversion_p (ltype, rtype))
	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
					   lhs, &rhs, ltype, rtype);
    }

  if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
      || (access_has_children_p (racc)
	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
      || (access_has_children_p (lacc)
	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
    {
      if (access_has_children_p (racc))
	generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
				 gsi, false, false);
      if (access_has_children_p (lacc))
	generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
				 gsi, true, true);
    }
  else
    {
      if (access_has_children_p (lacc) && access_has_children_p (racc))
	{
	  gimple_stmt_iterator orig_gsi = *gsi;
	  bool refreshed;

	  if (lacc->grp_read && !lacc->grp_covered)
	    {
	      handle_unscalarized_data_in_subtree (racc, lhs, gsi);
	      refreshed = true;
	    }
	  else
	    refreshed = false;

	  load_assign_lhs_subreplacements (lacc->first_child, racc,
					   lacc->offset, racc->offset,
					   &orig_gsi, gsi, &refreshed, lhs);
	  if (!refreshed || !racc->grp_unscalarized_data)
	    {
	      if (*stmt == gsi_stmt (*gsi))
		gsi_next (gsi);

	      unlink_stmt_vdef (*stmt);
	      gsi_remove (&orig_gsi, true);
	      return SRA_SA_REMOVED;
	    }
	}
      else
	{
	  if (access_has_children_p (racc))
	    {
	      if (!racc->grp_unscalarized_data)
		{
		  generate_subtree_copies (racc->first_child, lhs,
					   racc->offset, 0, 0, gsi,
					   false, false);
		  gcc_assert (*stmt == gsi_stmt (*gsi));
		  unlink_stmt_vdef (*stmt);
		  gsi_remove (gsi, true);
		  return SRA_SA_REMOVED;
		}
	      else
		generate_subtree_copies (racc->first_child, lhs,
					 racc->offset, 0, 0, gsi, false, true);
	    }
	  else if (access_has_children_p (lacc))
	    generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
				     0, 0, gsi, true, true);
	}
    }

  return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
}
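
/* To illustrate the case when both sides have children and there is no
   unscalarized data (a sketch, not actual dump output): an assignment d = s
   between two fully scalarized aggregates is replaced by copies between the
   replacements, e.g.

     d$i = s$i;
     d$f = s$f;

   and the original aggregate assignment is removed.  */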

/* Generate statements initializing scalar replacements of parts of function
   parameters.  */

static void
initialize_parameter_reductions (void)
{
  gimple_stmt_iterator gsi;
  gimple_seq seq = NULL;
  tree parm;

  for (parm = DECL_ARGUMENTS (current_function_decl);
       parm;
       parm = TREE_CHAIN (parm))
    {
      VEC (access_p, heap) *access_vec;
      struct access *access;

      if (!bitmap_bit_p (candidate_bitmap, DECL_UID (parm)))
	continue;
      access_vec = get_base_access_vector (parm);
      if (!access_vec)
	continue;

      if (!seq)
	{
	  seq = gimple_seq_alloc ();
	  gsi = gsi_start (seq);
	}

      for (access = VEC_index (access_p, access_vec, 0);
	   access;
	   access = access->next_grp)
	generate_subtree_copies (access, parm, 0, 0, 0, &gsi, true, true);
    }

  if (seq)
    gsi_insert_seq_on_edge_immediate (single_succ_edge (ENTRY_BLOCK_PTR), seq);
}
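
/* For example (a sketch only), for a function taking a struct S parameter p
   by value whose fields are scalarized, this inserts on the single successor
   edge of the entry block:

     p$i = p.i;
     p$f = p.f;  */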

/* The "main" function of intraprocedural SRA passes.  Runs the analysis and if
   it reveals there are components of some aggregates to be scalarized, it runs
   the required transformations.  */
static unsigned int
perform_intra_sra (void)
{
  int ret = 0;
  sra_initialize ();

  if (!find_var_candidates ())
    goto out;

  if (!scan_function (build_access_from_expr, build_accesses_from_assign, NULL,
		      true, NULL))
    goto out;

  if (!analyze_all_variable_accesses ())
    goto out;

  scan_function (sra_modify_expr, sra_modify_assign, NULL,
		 false, NULL);
  initialize_parameter_reductions ();

  if (sra_mode == SRA_MODE_EARLY_INTRA)
    ret = TODO_update_ssa;
  else
    ret = TODO_update_ssa | TODO_rebuild_alias;
 out:
  sra_deinitialize ();
  return ret;
}

/* Perform early intraprocedural SRA.  */
static unsigned int
early_intra_sra (void)
{
  sra_mode = SRA_MODE_EARLY_INTRA;
  return perform_intra_sra ();
}

/* Perform "late" intraprocedural SRA.  */
static unsigned int
late_intra_sra (void)
{
  sra_mode = SRA_MODE_INTRA;
  return perform_intra_sra ();
}


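/* Gate function of the intraprocedural SRA passes.  */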
static bool
gate_intra_sra (void)
{
  return flag_tree_sra != 0;
}


struct gimple_opt_pass pass_sra_early =
{
 {
  GIMPLE_PASS,
  "esra",	 			/* name */
  gate_intra_sra,			/* gate */
  early_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  0,					/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};


struct gimple_opt_pass pass_sra =
{
 {
  GIMPLE_PASS,
  "sra",	 			/* name */
  gate_intra_sra,			/* gate */
  late_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  TODO_update_address_taken,		/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA
  2009-04-28 12:15   ` Richard Guenther
@ 2009-04-29 12:39     ` Martin Jambor
  2009-04-29 13:13       ` Richard Guenther
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-04-29 12:39 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches, Richard Guenther, Jan Hubicka

Hi,

On Tue, Apr 28, 2009 at 01:48:55PM +0200, Richard Guenther wrote:
> On Tue, Apr 28, 2009 at 12:04 PM, Martin Jambor <mjambor@suse.cz> wrote:
> > The new intra-SRA produces an extra copy assignment and that breaks
> > ipa-prop.c pattern matching.  The following patch fixes that.
> >
> > Thanks,
> >
> > Martin
> >
> >
> > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> >
> >        * ipa-prop.c (get_ssa_def_if_simple_copy): New function.
> >        (determine_cst_member_ptr): Call get_ssa_def_if_simple_copy to skip
> >        simple copies.
> >
> >
> > Index: mine/gcc/ipa-prop.c
> > ===================================================================
> > --- mine.orig/gcc/ipa-prop.c
> > +++ mine/gcc/ipa-prop.c
> > @@ -456,6 +456,22 @@ fill_member_ptr_cst_jump_function (struc
> >   jfunc->value.member_cst.delta = delta;
> >  }
> >
> > +/* If RHS is an SSA_NAMe and it is defined by a simple copy assign statement,
> > +   return the rhs of its defining statement.  */
> > +
> > +static inline tree
> > +get_ssa_def_if_simple_copy (tree rhs)
> > +{
> > +  if (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
> > +    {
> > +      gimple def_stmt = SSA_NAME_DEF_STMT (rhs);
> > +
> > +      if (is_gimple_assign (def_stmt) && gimple_num_ops (def_stmt) == 2)
> > +       rhs = gimple_assign_rhs1 (def_stmt);
> > +    }
> > +  return rhs;
> > +}
> 
> IMHO this function should loop.  Also use gimple_assign_single_p
> instead of the assign && num_ops check. 

OK
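
Something along these lines, then?  (An untested sketch of the looping
variant using gimple_assign_single_p, as you suggest:)

  static tree
  get_ssa_def_if_simple_copy (tree rhs)
  {
    while (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
      {
        gimple def_stmt = SSA_NAME_DEF_STMT (rhs);

        if (!gimple_assign_single_p (def_stmt))
          break;
        rhs = gimple_assign_rhs1 (def_stmt);
      }
    return rhs;
  }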

> You also  have to check  the gimple_assign_rhs_code to  be SSA_NAME,
> otherwise you happily look through all unary operations.
> 

Will the RHS code be SSA_NAME even when the RHS is an invariant? (I am
eventually looking  for an invariant,  specifically an ADDR_EXPR  of a
FUNCTION_DECL and an integer constant, not an SSA name.)

Thanks,

Martin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of  Aggregates.
  2009-04-28 10:27   ` Martin Jambor
@ 2009-04-29 12:56     ` Richard Guenther
  2009-05-10 10:33       ` Martin Jambor
  2009-05-10 10:39       ` Martin Jambor
  0 siblings, 2 replies; 25+ messages in thread
From: Richard Guenther @ 2009-04-29 12:56 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Jan Hubicka

On Tue, 28 Apr 2009, Martin Jambor wrote:

> On Tue, Apr 28, 2009 at 12:04:32PM +0200, Martin Jambor wrote:
> > This  is  the  new  intraprocedural  SRA.  I  have  stripped  off  the
> > interprocedural part  and will propose to commit  it separately later.
> > I have  tried to  remove almost every  trace of IPA-SRA,  however, two
> > provisions for it  have remained in the patch.   First, an enumeration
> > (rather than  a boolean) is  used to distuinguish between  "early" and
> > "late" SRA  so that other  SRA modes can  be added later  on.  Second,
> > scan_function()  has a  hook parameter  and a  void  pointer parameter
> > which are not used in this patch but will be by IPA-SRA.
> > 
> > Otherwise, the patch is hopefully self-contained and the basis of its
> > operation is described by the initial comment.
> > 
> > The patch bootstraps (on x86_64-linux-gnu but I am about to try it on
> > hppa-linux-gnu too) but produces a small number of testsuite failures
> > which are handled by the two following patches.
> > 
> > Thanks,
> > 
> > Martin
> > 
> > 
> > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> > 
> > 	* tree-sra.c (enum sra_mode): The whole contents of the file was
> > 	replaced.
> 
> Hm, the  patch is quite unreadable,  below is the  new tree-sra.c file
> which entirely replaces the old one (note that the patch also modifies
> the Makefile though):

Ah.  Here it is ... the comment to the changelog still applies.

> /* Scalar Replacement of Aggregates (SRA) converts some structure
>    references into scalar references, exposing them to the scalar
>    optimizers.
>    Copyright (C) 2008, 2009 Free Software Foundation, Inc.
>    Contributed by Martin Jambor <mjambor@suse.cz>
> 
> This file is part of GCC.
> 
> GCC is free software; you can redistribute it and/or modify it under
> the terms of the GNU General Public License as published by the Free
> Software Foundation; either version 3, or (at your option) any later
> version.
> 
> GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> WARRANTY; without even the implied warranty of MERCHANTABILITY or
> FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> for more details.
> 
> You should have received a copy of the GNU General Public License
> along with GCC; see the file COPYING3.  If not see
> <http://www.gnu.org/licenses/>.  */
> 
> /* This file implements Scalar Reduction of Aggregates (SRA).  SRA is run
>    twice, once in the early stages of compilation (early SRA) and once in the
>    late stages (late SRA).  The aim of both is to turn references to scalar
>    parts of aggregates into uses of independent scalar variables.
> 
>    The two passes are nearly identical, the only difference is that early SRA
>    does not scalarize unions which are used as the result in a GIMPLE_RETURN
>    statement because together with inlining this can lead to weird type
>    conversions.

Do you happen to have a testcase for this or can you describe that problem
some more?

>    Both passes operate in four stages:
> 
>    1. The declarations that have properties which make them candidates for
>       scalarization are identified in function find_var_candidates().  The
>       candidates are stored in candidate_bitmap.
> 
>    2. The function body is scanned.  In the process, declarations which are
>       used in a manner that prevent their scalarization are removed from the
>       candidate bitmap.  More importantly, for every access into an aggregate,
>       an access structure (struct access) is created by create_access() and
>       stored in a vector associated with the aggregate.  Among other
>       information, the aggregate declaration, the offset and size of the access
>       and its type are stored in the structure.
> 
>       On a related note, assign_link structures are created for every assign
>       statement between candidate aggregates and attached to the related
>       accesses.
> 
>    3. The vectors of accesses are analyzed.  They are first sorted according to
>       their offset and size and then scanned for partially overlapping accesses
>       (i.e. those which overlap but one is not entirely within another).  Such
>       an access disqualifies the whole aggregate from being scalarized.

This happens only when get_ref_base_and_extent punts and returns -1 for
the access size?  And of course with struct field accesses of different
structs that are inside the same union.

>       If there is no such inhibiting overlap, a representative access structure
>       is chosen for every unique combination of offset and size.  Afterwards,
>       the pass builds a set of trees from these structures, in which children
>       of an access are within their parent (in terms of offset and size).
> 
>       Then accesses  are propagated  whenever possible (i.e.  in cases  when it
>       does not create a partially overlapping access) across assign_links from
>       the right hand side to the left hand side.
> 
>       Then the set of trees for each declaration is traversed again and those
>       accesses which should be replaced by a scalar are identified.
> 
>    4. The function is traversed again, and for every reference into an
>       aggregate that has some component which is about to be scalarized,
>       statements are amended and new statements are created as necessary.
>       Finally, if a parameter got scalarized, the scalar replacements are
>       initialized with values from respective parameter aggregates.
> */

Closing */ goes to the previous line.
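
(As an aside for readers of the thread, the transformation described above
is, in a hand-written illustration with made-up replacement names, turning

  struct S { int i; float f; } s;
  s.i = 1;
  s.f = 2.0f;
  return s.i;

into uses of independent scalars

  int s$i;
  float s$f;
  s$i = 1;
  s$f = 2.0f;
  return s$i;

with the aggregate s itself no longer referenced.)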

> 
> #include "config.h"
> #include "system.h"
> #include "coretypes.h"
> #include "alloc-pool.h"
> #include "tm.h"
> #include "tree.h"
> #include "gimple.h"
> #include "tree-flow.h"
> #include "diagnostic.h"
> #include "tree-dump.h"
> #include "timevar.h"
> #include "params.h"
> #include "target.h"
> #include "flags.h"
> 
> /* Enumeration of all aggregate reductions we can do.  */
> enum sra_mode {SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
> 	       SRA_MODE_INTRA};	     /* late intraprocedural SRA */

Spaces after { and before }.

> 
> /* Global variable describing which aggregate reduction we are performing at
>    the moment.  */
> static enum sra_mode sra_mode;
> 
> struct assign_link;
> 
> /* ACCESS represents each access to an aggregate variable (as a whole or a
>    part).  It can also represent a group of accesses that refer to exactly the
>    same fragment of an aggregate (i.e. those that have exactly the same offset
>    and size).  Such representatives for a single aggregate, once determined,
>    are linked in a linked list and have the group fields set.
> 
>    Moreover, when doing intraprocedural SRA, a tree is built from those
>    representatives (by the means of first_child and next_sibling pointers), in
>    which all items in a subtree are "within" the root, i.e. their offset is
>    greater or equal to offset of the root and offset+size is smaller or equal
>    to offset+size of the root.  Children of an access are sorted by offset.
> */

*/ to previous line.
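
(To make the tree structure concrete, an illustrative example assuming
32-bit int: for

  struct B { int x; int y; };
  struct A { struct B b; int z; } a;

accesses to a.b, a.b.x, a.b.y and a.z yield representatives

  a.b      offset 0,  size 64
    a.b.x  offset 0,  size 32
    a.b.y  offset 32, size 32
  a.z      offset 64, size 32

where a.b.x and a.b.y are children of a.b reached through first_child and
next_sibling, while a.z is a second tree root reached through next_grp.)
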

> struct access
> {
>   /* Values returned by `get_ref_base_and_extent' for each COMPONENT_REF
>      If EXPR isn't a COMPONENT_REF just set `BASE = EXPR', `OFFSET = 0',
>      `SIZE = TREE_SIZE (TREE_TYPE (expr))'.  */

s/COMPONENT_REF/component reference/g - it's not only COMPONENT_REF
trees we handle.

>   HOST_WIDE_INT offset;
>   HOST_WIDE_INT size;
>   tree base;
> 
>   /* Expression.  */
>   tree expr;
>   /* Type.  */
>   tree type;
> 
>   /* Next group representative for this aggregate. */
>   struct access *next_grp;
> 
>   /* Pointer to the group representative.  Pointer to itself if the struct is
>      the representative.  */
>   struct access *group_representative;
> 
>   /* If this access has any children (in terms of the definition above), this
>      points to the first one.  */
>   struct access *first_child;
> 
>   /* Pointer to the next sibling in the access tree as described above.  */
>   struct access *next_sibling;
> 
>   /* Pointers to the first and last element in the linked list of assign
>      links.  */
>   struct assign_link *first_link, *last_link;

vertical space missing.

>   /* Pointer to the next access in the work queue.  */
>   struct access *next_queued;
> 
>   /* Replacement variable for this access "region."  Never to be accessed
>      directly, always only by the means of get_access_replacement() and only
>      when grp_to_be_replaced flag is set.  */
>   tree replacement_decl;
> 
>   /* Is this particular access a write access?  */
>   unsigned write : 1;
> 
>   /* Is this access currently in the work queue?  */
>   unsigned grp_queued : 1;
>   /* Does this group contain a write access?  This flag is propagated down the
>      access tree.  */
>   unsigned grp_write : 1;
>   /* Does this group contain a read access?  This flag is propagated down the
>      access tree.  */
>   unsigned grp_read : 1;
>   /* Is the subtree rooted in this access fully covered by scalar
>      replacements?  */
>   unsigned grp_covered : 1;
>   /* If set to true, this access and all below it in an access tree must not be
>      scalarized.  */
>   unsigned grp_unscalarizable_region : 1;
>   /* Whether data have been written to parts of the aggregate covered by this
>      access which is not to be scalarized.  This flag is propagated up in the
>      access tree.  */
>   unsigned grp_unscalarized_data : 1;
>   /* Does this access and/or group contain a write access through a
>      BIT_FIELD_REF?  */
>   unsigned grp_bfr_lhs : 1;
> 
>   /* Set when a scalar replacement should be created for this variable.  We do
>      the decision and creation at different places because create_tmp_var
>      cannot be called from within FOR_EACH_REFERENCED_VAR. */
>   unsigned grp_to_be_replaced : 1;
> };
> 
> typedef struct access *access_p;
> 
> DEF_VEC_P (access_p);
> DEF_VEC_ALLOC_P (access_p, heap);
> 
> /* Alloc pool for allocating access structures.  */
> static alloc_pool access_pool;
> 
> /* A structure linking lhs and rhs accesses from an aggregate assignment.  They
>    are used to propagate subaccesses from rhs to lhs as long as they don't
>    conflict with what is already there.  */
> struct assign_link
> {
>   struct access *lacc, *racc;
>   struct assign_link *next;
> };
> 
> /* Alloc pool for allocating assign link structures.  */
> static alloc_pool link_pool;
> 
> /* Base (tree) -> Vector (VEC(access_p,heap) *) map.  */
> static struct pointer_map_t *base_access_vec;
> 
> /* Bitmap of bases (candidates).  */
> static bitmap candidate_bitmap;
> /* Bitmap of declarations used in a return statement.  */
> static bitmap retvals_bitmap;
> /* Obstack for creation of fancy names.  */
> static struct obstack name_obstack;
> 
> /* Head of a linked list of accesses that need to have their subaccesses
>    propagated to their assignment counterparts.  */
> static struct access *work_queue_head;
> 
> /* Dump contents of ACCESS to file F in a human friendly way.  If GRP is true,
>    representative fields are dumped, otherwise those which only describe the
>    individual access are.  */
> 
> static void
> dump_access (FILE *f, struct access *access, bool grp)
> {
>   fprintf (f, "access { ");
>   fprintf (f, "base = (%d)'", DECL_UID (access->base));
>   print_generic_expr (f, access->base, 0);
>   fprintf (f, "', offset = %d", (int) access->offset);
>   fprintf (f, ", size = %d", (int) access->size);

you can use ", offset = "HOST_WIDE_INT_PRINT_DEC, access->offset here.

>   fprintf (f, ", expr = ");
>   print_generic_expr (f, access->expr, 0);
>   fprintf (f, ", type = ");
>   print_generic_expr (f, access->type, 0);
>   if (grp)
>     fprintf (f, ", grp_write = %d, grp_read = %d, grp_covered = %d, "
> 	     "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
> 	     "grp_to_be_replaced = %d\n",
> 	     access->grp_write, access->grp_read, access->grp_covered,
> 	     access->grp_unscalarizable_region, access->grp_unscalarized_data,
> 	     access->grp_to_be_replaced);
>   else
>     fprintf (f, ", write = %d'\n", access->write);
> }
> 
> /* Dump a subtree rooted in ACCESS to file F, indent by LEVEL.  */
> 
> static void
> dump_access_tree_1 (FILE *f, struct access *access, int level)
> {
>   do
>     {
>       int i;
> 
>       for (i = 0; i < level; i++)
> 	fputs ("* ", dump_file);
> 
>       dump_access (f, access, true);
> 
>       if (access->first_child)
> 	dump_access_tree_1 (f, access->first_child, level + 1);
> 
>       access = access->next_sibling;
>     }
>   while (access);
> }
> 
> /* Dump all access trees for a variable, given the pointer to the first root in
>    ACCESS.  */
> 
> static void
> dump_access_tree (FILE *f, struct access *access)
> {
>   for (; access; access = access->next_grp)
>     dump_access_tree_1 (f, access, 0);
> }
> 
> /* Return a vector of pointers to accesses for the variable given in BASE or
>    NULL if there is none.  */
> 
> static VEC (access_p, heap) *
> get_base_access_vector (tree base)
> {
>   void **slot;
> 
>   slot = pointer_map_contains (base_access_vec, base);
>   if (!slot)
>     return NULL;
>   else
>     return *(VEC (access_p, heap) **) slot;
> }
> 
> /* Find an access with required OFFSET and SIZE in a subtree of accesses rooted
>    in ACCESS.  Return NULL if it cannot be found.  */
> 
> static struct access *
> find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
> 			HOST_WIDE_INT size)
> {
>   while (access && (access->offset != offset || access->size != size))
>     {
>       struct access *child = access->first_child;
> 
>       while (child && (child->offset + child->size <= offset))
> 	child = child->next_sibling;

Do you limit the number of siblings?  We should keep an eye on this
for potential compile-time issues (not that I expect any).

>       access = child;
>     }
> 
>   return access;
> }
> 
> /* Return the first group representative for DECL or NULL if none exists.  */
> 
> static struct access *
> get_first_repr_for_decl (tree base)
> {
>   VEC (access_p, heap) *access_vec;
> 
>   access_vec = get_base_access_vector (base);
>   if (!access_vec)
>     return NULL;
> 
>   return VEC_index (access_p, access_vec, 0);
> }
> 
> /* Find an access representative for the variable BASE and given OFFSET and
>    SIZE.  Requires that access trees have already been built.  Return NULL if
>    it cannot be found.  */
> 
> static struct access *
> get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
> 				 HOST_WIDE_INT size)
> {
>   struct access *access;
> 
>   access = get_first_repr_for_decl (base);
>   while (access && (access->offset + access->size <= offset))
>     access = access->next_grp;
>   if (!access)
>     return NULL;
> 
>   return find_access_in_subtree (access, offset, size);
> }
> 
> /* Add LINK to the linked list of assign links of RACC.  */
> static void
> add_link_to_rhs (struct access *racc, struct assign_link *link)
> {
>   gcc_assert (link->racc == racc);
> 
>   if (!racc->first_link)
>     {
>       gcc_assert (!racc->last_link);
>       racc->first_link = link;
>     }
>   else
>     racc->last_link->next = link;
> 
>   racc->last_link = link;
>   link->next = NULL;
> }
> 
> /* Move all link structures in their linked list in OLD_RACC to the linked list
>    in NEW_RACC.  */
> static void
> relink_to_new_repr (struct access *new_racc, struct access *old_racc)
> {
>   if (!old_racc->first_link)
>     {
>       gcc_assert (!old_racc->last_link);
>       return;
>     }
> 
>   if (new_racc->first_link)
>     {
>       gcc_assert (!new_racc->last_link->next);
>       gcc_assert (!old_racc->last_link || !old_racc->last_link->next);
> 
>       new_racc->last_link->next = old_racc->first_link;
>       new_racc->last_link = old_racc->last_link;
>     }
>   else
>     {
>       gcc_assert (!new_racc->last_link);
> 
>       new_racc->first_link = old_racc->first_link;
>       new_racc->last_link = old_racc->last_link;
>     }
>   old_racc->first_link = old_racc->last_link = NULL;
> }
> 
> /* Add ACCESS to the work queue (which is actually a stack).  */
> 
> static void
> add_access_to_work_queue (struct access *access)
> {
>   if (!access->grp_queued)
>     {
>       gcc_assert (!access->next_queued);
>       access->next_queued = work_queue_head;
>       access->grp_queued = 1;
>       work_queue_head = access;
>     }
> }
> 
> /* Pop an access from the work queue, and return it, assuming there is one.  */
> 
> static struct access *
> pop_access_from_work_queue (void)
> {
>   struct access *access = work_queue_head;
> 
>   work_queue_head = access->next_queued;
>   access->next_queued = NULL;
>   access->grp_queued = 0;
>   return access;
> }
> 
> 
> /* Allocate necessary structures.  */
> 
> static void
> sra_initialize (void)
> {
>   candidate_bitmap = BITMAP_ALLOC (NULL);
>   retvals_bitmap = BITMAP_ALLOC (NULL);
>   gcc_obstack_init (&name_obstack);
>   access_pool = create_alloc_pool ("SRA accesses", sizeof (struct access), 16);
>   link_pool = create_alloc_pool ("SRA links", sizeof (struct assign_link), 16);
>   base_access_vec = pointer_map_create ();
> }
> 
> /* Hook fed to pointer_map_traverse, deallocate stored vectors.  */
> 
> static bool
> delete_base_accesses (const void *key ATTRIBUTE_UNUSED, void **value,
> 		     void *data ATTRIBUTE_UNUSED)
> {
>   VEC (access_p, heap) *access_vec;
>   access_vec = (VEC (access_p, heap) *) *value;
>   VEC_free (access_p, heap, access_vec);
> 
>   return true;
> }
> 
> /* Deallocate all general structures.  */
> 
> static void
> sra_deinitialize (void)
> {
>   BITMAP_FREE (candidate_bitmap);
>   BITMAP_FREE (retvals_bitmap);
>   free_alloc_pool (access_pool);
>   free_alloc_pool (link_pool);
>   obstack_free (&name_obstack, NULL);
> 
>   pointer_map_traverse (base_access_vec, delete_base_accesses, NULL);
>   pointer_map_destroy (base_access_vec);
> }
> 
> /* Remove DECL from candidates for SRA and write REASON to the dump file if
>    there is one.  */
> static void
> disqualify_candidate (tree decl, const char *reason)
> {
>   bitmap_clear_bit (candidate_bitmap, DECL_UID (decl));
> 
>   if (dump_file)

&& (dump_flags & TDF_DETAILS)

>     {
>       fprintf (dump_file, "! Disqualifying ");
>       print_generic_expr (dump_file, decl, 0);
>       fprintf (dump_file, " - %s\n", reason);
>     }
> }
> 
> /* Return true iff the type contains a field or an element which does not allow
>    scalarization.  */
> 
> static bool
> type_internals_preclude_sra_p (tree type)
> {
>   tree fld;
>   tree et;
> 
>   switch (TREE_CODE (type))
>     {
>     case RECORD_TYPE:
>     case UNION_TYPE:
>     case QUAL_UNION_TYPE:
>       for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
> 	if (TREE_CODE (fld) == FIELD_DECL)
> 	  {
> 	    tree ft = TREE_TYPE (fld);
> 
> 	    if (TREE_THIS_VOLATILE (fld)
> 		|| !DECL_FIELD_OFFSET (fld) || !DECL_SIZE (fld)
> 		|| !host_integerp (DECL_FIELD_OFFSET (fld), 1)
> 		|| !host_integerp (DECL_SIZE (fld), 1))
> 	      return true;
> 
> 	    if (AGGREGATE_TYPE_P (ft)
> 		&& type_internals_preclude_sra_p (ft))
> 	      return true;
> 	  }
> 
>       return false;
> 
>     case ARRAY_TYPE:
>       et = TREE_TYPE (type);
> 
>       if (AGGREGATE_TYPE_P (et))
> 	return type_internals_preclude_sra_p (et);
>       else
> 	return false;
> 
>     default:
>       return false;
>     }
> }
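
(An illustration, not from the patch: the kinds of types this rejects are e.g.

  struct no_sra_1 { volatile int v; };       /* volatile field */
  struct no_sra_2 { int n; char tail[]; };   /* field without DECL_SIZE */

while plain nested records and fixed-size arrays pass the check.)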
> 
> /* Create and insert access for EXPR. Return created access, or NULL if it is
>    not possible.  */
> 
> static struct access *
> create_access (tree expr, bool write)
> {
>   struct access *access;
>   void **slot;
>   VEC (access_p,heap) *vec;
>   HOST_WIDE_INT offset, size, max_size;
>   tree base = expr;
>   bool unscalarizable_region = false;
> 
>   if (handled_component_p (expr))
>     base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
>   else
>     {
>       tree tree_size;
> 
>       tree_size = TYPE_SIZE (TREE_TYPE (base));
>       if (tree_size && host_integerp (tree_size, 1))
> 	size = max_size = tree_low_cst (tree_size, 1);
>       else
> 	size = max_size = -1;
> 
>       offset = 0;
>     }

get_ref_base_and_extent should now also work on plain DECLs
(non-handled_component_p) and base should never be NULL.

>   if (!base || !DECL_P (base)
>       || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
>     return NULL;
> 
>   if (size != max_size)
>     {
>       size = max_size;
>       unscalarizable_region = true;
>     }
> 
>   if (size < 0)
>     {
>       disqualify_candidate (base, "Encountered an ultra variable sized "
> 			    "access.");

ultra variable sized?  I would name it 'unconstrained access'.  Note
that there is still useful information in this case if offset is non-zero
(namely accesses before [offset, -1] may still be scalarized).

Maybe something for further improvements.  This for example would
happen for structs with trailing arrays.

>       return NULL;
>     }
> 
>   access = (struct access *) pool_alloc (access_pool);
>   memset (access, 0, sizeof (struct access));
> 
>   access->base = base;
>   access->offset = offset;
>   access->size = size;
>   access->expr = expr;
>   access->type = TREE_TYPE (expr);
>   access->write = write;
>   access->grp_unscalarizable_region = unscalarizable_region;
> 
>   slot = pointer_map_contains (base_access_vec, base);
>   if (slot)
>     vec = (VEC (access_p, heap) *) *slot;
>   else
>     vec = VEC_alloc (access_p, heap, 32);
> 
>   VEC_safe_push (access_p, heap, vec, access);
> 
>   *((struct VEC (access_p,heap) **)
> 	pointer_map_insert (base_access_vec, base)) = vec;
> 
>   return access;
> }
> 
> 
> /* Callback of walk_tree.  Search the given tree for a declaration and exclude
>    it from the candidates.  */
> 
> static tree
> disqualify_all (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
> {
>   tree base = *tp;
> 
> 
>   if (TREE_CODE (base) == SSA_NAME)
>     base = SSA_NAME_VAR (base);

Err ... for SSA_NAME bases there is nothing to scalarize?  So just
bail out in that case?  In fact, using walk_tree for disqualify_all looks
a bit expensive (it also walks types).

>   if (DECL_P (base))
>     {
>       disqualify_candidate (base, "From within disqualify_all().");
>       *walk_subtrees = 0;
>     }
>   else
>     *walk_subtrees = 1;
> 
> 
>   return NULL_TREE;
> }
> 
> /* Scan expression EXPR and create access structures for all accesses to
>    candidates for scalarization.  Return the created access or NULL if none is
>    created.  */
> 
> static struct access *
> build_access_from_expr_1 (tree *expr_ptr,
> 			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write)
> {
>   struct access *ret = NULL;
>   tree expr = *expr_ptr;
>   tree safe_expr = expr;
>   bool bit_ref;
> 
>   if (TREE_CODE (expr) == BIT_FIELD_REF)
>     {
>       expr = TREE_OPERAND (expr, 0);
>       bit_ref = true;
>     }
>   else
>     bit_ref = false;
> 
>   while (TREE_CODE (expr) == NOP_EXPR

CONVERT_EXPR_P (expr)

> 	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
> 	 || TREE_CODE (expr) == REALPART_EXPR
> 	 || TREE_CODE (expr) == IMAGPART_EXPR)
>     expr = TREE_OPERAND (expr, 0);

Why do this here btw, and not just lump ...

>   switch (TREE_CODE (expr))
>     {
>     case ADDR_EXPR:
>     case SSA_NAME:
>     case INDIRECT_REF:
>       break;
> 
>     case VAR_DECL:
>     case PARM_DECL:
>     case RESULT_DECL:
>     case COMPONENT_REF:
>     case ARRAY_REF:
>       ret = create_access (expr, write);
>       break;

... this ...

>     case REALPART_EXPR:
>     case IMAGPART_EXPR:
>       expr = TREE_OPERAND (expr, 0);
>       ret = create_access (expr, write);

... and this together?  Won't you create bogus accesses if you
strip for example IMAGPART_EXPR (which has non-zero offset)?

>       break;
> 
>     case ARRAY_RANGE_REF:

it should just be handled fine I think.

>     default:
>       walk_tree (&safe_expr, disqualify_all, NULL, NULL);

and if not, this should just disqualify the base of the access, like
get_base_address (safe_expr) (save_expr you mean?) and then if that
is a DECL, disqualify that decl.

>       break;
>     }
> 
>   if (write && bit_ref && ret)
>     ret->grp_bfr_lhs = 1;
> 
>   return ret;
> }
> 
> /* Scan expression EXPR and create access structures for all accesses to
>    candidates for scalarization.  Return true if any access has been
>    inserted.  */
> 
> static bool
> build_access_from_expr (tree *expr_ptr,
> 			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write,
> 			void *data ATTRIBUTE_UNUSED)
> {
>   return build_access_from_expr_1 (expr_ptr, gsi, write) != NULL;
> }
> 
> /* Disqualify LHS and RHS for scalarization if STMT must end its basic block in
>    modes in which it matters, return true iff they have been disqualified.  RHS
>    may be NULL, in that case ignore it.  If we scalarize an aggregate in
>    intra-SRA we may need to add statements after each statement.  This is not
>    possible if a statement unconditionally has to end the basic block.  */
> static bool
> disqualify_ops_if_throwing_stmt (gimple stmt, tree *lhs, tree *rhs)
> {
>   if (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))
>     {
>       walk_tree (lhs, disqualify_all, NULL, NULL);
>       if (rhs)
> 	walk_tree (rhs, disqualify_all, NULL, NULL);
>       return true;
>     }
>   return false;
> }
> 
> 
> /* Result code for scan_assign callback for scan_function.  */
> enum scan_assign_result {SRA_SA_NONE,       /* nothing done for the stmt */
> 			 SRA_SA_PROCESSED,  /* stmt analyzed/changed */
> 			 SRA_SA_REMOVED};   /* stmt redundant and eliminated */

space after { and before }.

> 
> 
> /* Scan expressions occurring in the statement pointed to by STMT_EXPR, create
>    access structures for all accesses to candidates for scalarization and
>    remove those candidates which occur in statements or expressions that
>    prevent them from being split apart.  Return true if any access has been
>    inserted.  */
> 
> static enum scan_assign_result
> build_accesses_from_assign (gimple *stmt_ptr,
> 			    gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
> 			    void *data ATTRIBUTE_UNUSED)
> {
>   gimple stmt = *stmt_ptr;
>   tree *lhs_ptr, *rhs_ptr;
>   struct access *lacc, *racc;
> 
>   if (gimple_assign_rhs2 (stmt))

!gimple_assign_single_p (stmt)

>     return SRA_SA_NONE;
>
>   lhs_ptr = gimple_assign_lhs_ptr (stmt);
>   rhs_ptr = gimple_assign_rhs1_ptr (stmt);

you probably don't need to pass pointers to trees everywhere as you
are not changing them.

>   if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
>     return SRA_SA_NONE;
> 
>   racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
>   lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);

just avoid calling into build_access_from_expr_1 for SSA_NAMEs
or is_gimple_min_invariant lhs/rhs, that should make that
function more regular.

>   if (lacc && racc
>       && !lacc->grp_unscalarizable_region
>       && !racc->grp_unscalarizable_region
>       && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
>       && lacc->size <= racc->size
>       && useless_type_conversion_p (lacc->type, racc->type))

useless_type_conversion_p should be always true here.

>     {
>       struct assign_link *link;
> 
>       link = (struct assign_link *) pool_alloc (link_pool);
>       memset (link, 0, sizeof (struct assign_link));
> 
>       link->lacc = lacc;
>       link->racc = racc;
> 
>       add_link_to_rhs (racc, link);
>     }
> 
>   return (lacc || racc) ? SRA_SA_PROCESSED : SRA_SA_NONE;
> }
> 
> /* Scan function and look for interesting statements. Return true if any has
>    been found or processed, as indicated by callbacks.  SCAN_EXPR is a callback
>    called on all expressions within statements except assign statements and
>    those deemed entirely unsuitable for some reason (all operands in such
>    statements and expression are removed from candidate_bitmap).  SCAN_ASSIGN
>    is a callback called on all assign statements, HANDLE_SSA_DEFS is a callback
>    called on assign statements and those call statements which have a lhs and
>    it is the only callback which can be NULL. ANALYSIS_STAGE is true when
>    running in the analysis stage of a pass and thus no statement is being
>    modified.  DATA is a pointer passed to all callbacks.  If any single
>    callback returns true, this function also returns true, otherwise it returns
>    false.  */
> 
> static bool
> scan_function (bool (*scan_expr) (tree *, gimple_stmt_iterator *, bool, void *),
> 	       enum scan_assign_result (*scan_assign) (gimple *,
> 						       gimple_stmt_iterator *,
> 						       void *),
> 	       bool (*handle_ssa_defs)(gimple, void *),
> 	       bool analysis_stage, void *data)
> {
>   gimple_stmt_iterator gsi;
>   basic_block bb;
>   unsigned i;
>   tree *t;
>   bool ret = false;
> 
>   FOR_EACH_BB (bb)
>     {
>       bool bb_changed = false;
> 
>       gsi = gsi_start_bb (bb);
>       while (!gsi_end_p (gsi))
> 	{
> 	  gimple stmt = gsi_stmt (gsi);
> 	  enum scan_assign_result assign_result;
> 	  bool any = false, deleted = false;
> 
> 	  switch (gimple_code (stmt))
> 	    {
> 	    case GIMPLE_RETURN:
> 	      t = gimple_return_retval_ptr (stmt);
> 	      if (*t != NULL_TREE)
> 		{
> 		  if (DECL_P (*t))
> 		    {
> 		      tree ret_type = TREE_TYPE (*t);
> 		      if (sra_mode == SRA_MODE_EARLY_INTRA
> 			  && (TREE_CODE (ret_type) == UNION_TYPE
> 			      || TREE_CODE (ret_type) == QUAL_UNION_TYPE))
> 			disqualify_candidate (*t,
> 					      "Union in a return statement.");
> 		      else
> 			bitmap_set_bit (retvals_bitmap, DECL_UID (*t));
> 		    }
> 		  any |= scan_expr (t, &gsi, false, data);
> 		}

Likewise for passing pointers (why is gsi necessary and passing a stmt
does not work?)

> 	      break;
> 
> 	    case GIMPLE_ASSIGN:
> 	      assign_result = scan_assign (&stmt, &gsi, data);
> 	      any |= assign_result == SRA_SA_PROCESSED;
> 	      deleted = assign_result == SRA_SA_REMOVED;
> 	      if (handle_ssa_defs && assign_result != SRA_SA_REMOVED)
> 		any |= handle_ssa_defs (stmt, data);
> 	      break;
> 
> 	    case GIMPLE_CALL:
> 	      /* Operands must be processed before the lhs.  */
> 	      for (i = 0; i < gimple_call_num_args (stmt); i++)
> 		{
> 		  tree *argp = gimple_call_arg_ptr (stmt, i);
> 		  any |= scan_expr (argp, &gsi, false, data);
> 		}
> 
> 	      if (gimple_call_lhs (stmt))
> 		{
> 		  tree *lhs_ptr = gimple_call_lhs_ptr (stmt);
> 		  if (!analysis_stage ||
> 		      !disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, NULL))
> 		    {
> 		      any |= scan_expr (lhs_ptr, &gsi, true, data);
> 		      if (handle_ssa_defs)
> 			any |= handle_ssa_defs (stmt, data);
> 		    }
> 		}
> 	      break;
> 
> 	    case GIMPLE_ASM:
> 	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
> 		{
> 		  tree *op = &TREE_VALUE (gimple_asm_input_op (stmt, i));
> 		  any |= scan_expr (op, &gsi, false, data);
> 		}
> 	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
> 		{
> 		  tree *op = &TREE_VALUE (gimple_asm_output_op (stmt, i));
> 		  any |= scan_expr (op, &gsi, true, data);
> 		}

asm operands with memory constraints should be disqualified from SRA
(see walk_stmt_load_store_addr_ops and/or the operand scanner).

> 	    default:
> 	      if (analysis_stage)
> 		walk_gimple_op (stmt, disqualify_all, NULL);

You seem to be very eager to disqualify anything unknown ;)  (But I
have yet to come across a TREE_ADDRESSABLE check ...)

> 	      break;
> 	    }
> 
> 	  if (any)
> 	    {
> 	      ret = true;
> 	      bb_changed = true;
> 
> 	      if (!analysis_stage)

Oh.  So we reuse this function.  Hmm.

> 		{
> 		  update_stmt (stmt);
> 		  if (!stmt_could_throw_p (stmt))
> 		    remove_stmt_from_eh_region (stmt);

Usually

  if (maybe_clean_or_replace_eh_stmt (stmt, stmt)
      && gimple_purge_dead_eh_edges (gimple_bb (stmt)))

is the pattern for this.  But then you disqualified all throwing
expressions, no?

> 		}
> 	    }
> 	  if (deleted)
> 	    bb_changed = true;
> 	  else
> 	    {
> 	      gsi_next (&gsi);
> 	      ret = true;
> 	    }
> 	}
>       if (!analysis_stage && bb_changed)
> 	gimple_purge_dead_eh_edges (bb);
>     }
> 
>   return ret;
> }
> 
> /* Helper of the qsort function.  The array contains pointers to accesses.  An
>    access is considered smaller than another if it has a smaller offset or if
>    the offsets are the same but its size is bigger.  */
> 
> static int
> compare_access_positions (const void *a, const void *b)
> {
>   const access_p *fp1 = (const access_p *) a;
>   const access_p *fp2 = (const access_p *) b;
>   const access_p f1 = *fp1;
>   const access_p f2 = *fp2;
> 
>   if (f1->offset != f2->offset)
>     return f1->offset < f2->offset ? -1 : 1;
> 
>   if (f1->size == f2->size)
>     return 0;
>   /* We want the bigger accesses first, thus the opposite operator in the next
>      line: */
>   return f1->size > f2->size ? -1 : 1;
> }
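
(So, for example, representatives with (offset, size) pairs (32, 32), (0, 32)
and (0, 64) sort as (0, 64), (0, 32), (32, 32); an enclosing access thus
always comes before the accesses contained in it.)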
> 
> 
> /* Append a name of the declaration to the name obstack.  A helper function for
>    make_fancy_name.  */
> 
> static void
> make_fancy_decl_name (tree decl)
> {
>   char buffer[32];
> 
>   tree name = DECL_NAME (decl);
>   if (name)
>     obstack_grow (&name_obstack, IDENTIFIER_POINTER (name),
> 		  IDENTIFIER_LENGTH (name));
>   else
>     {
>       sprintf (buffer, "D%u", DECL_UID (decl));

That would just be useless information.  I guess you copied this
from old SRA?

>       obstack_grow (&name_obstack, buffer, strlen (buffer));
>     }
> }
> 
> /* Helper for make_fancy_name.  */
> 
> static void
> make_fancy_name_1 (tree expr)
> {
>   char buffer[32];
>   tree index;
> 
>   if (DECL_P (expr))
>     {
>       make_fancy_decl_name (expr);
>       return;
>     }
> 
>   switch (TREE_CODE (expr))
>     {
>     case COMPONENT_REF:
>       make_fancy_name_1 (TREE_OPERAND (expr, 0));
>       obstack_1grow (&name_obstack, '$');
>       make_fancy_decl_name (TREE_OPERAND (expr, 1));
>       break;
> 
>     case ARRAY_REF:
>       make_fancy_name_1 (TREE_OPERAND (expr, 0));
>       obstack_1grow (&name_obstack, '$');
>       /* Arrays with only one element may not have a constant as their
> 	 index. */
>       index = TREE_OPERAND (expr, 1);
>       if (TREE_CODE (index) != INTEGER_CST)
> 	break;
>       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (index));
>       obstack_grow (&name_obstack, buffer, strlen (buffer));
> 
>       break;
> 
>     case BIT_FIELD_REF:
>     case REALPART_EXPR:
>     case IMAGPART_EXPR:
>       gcc_unreachable (); 	/* we treat these as scalars.  */
>       break;
>     default:
>       break;
>     }
> }
> 
> /* Create a human readable name for replacement variable of ACCESS.  */
> 
> static char *
> make_fancy_name (tree expr)
> {
>   make_fancy_name_1 (expr);
>   obstack_1grow (&name_obstack, '\0');
>   return XOBFINISH (&name_obstack, char *);
> }

As all new scalars are DECL_ARTIFICIAL anyway why bother to create
a fancy name? ....

> /* Helper function for build_ref_for_offset.  */
> 
> static bool
> build_ref_for_offset_1 (tree *res, tree type, HOST_WIDE_INT offset,
> 			tree exp_type)
> {
>   while (1)
>     {
>       tree fld;
>       tree tr_size, index;
>       HOST_WIDE_INT el_size;
> 
>       if (offset == 0 && exp_type
> 	  && useless_type_conversion_p (exp_type, type))
> 	return true;
> 
>       switch (TREE_CODE (type))
> 	{
> 	case UNION_TYPE:
> 	case QUAL_UNION_TYPE:
> 	case RECORD_TYPE:
> 	  /* Some ADA records are half-unions, treat all of them the same.  */
> 	  for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
> 	    {
> 	      HOST_WIDE_INT pos, size;
> 	      tree expr, *expr_ptr;
> 
> 	      if (TREE_CODE (fld) != FIELD_DECL)
> 		continue;
> 
> 	      pos = int_bit_position (fld);
> 	      gcc_assert (TREE_CODE (type) == RECORD_TYPE || pos == 0);
> 	      size = tree_low_cst (DECL_SIZE (fld), 1);
> 	      if (pos > offset || (pos + size) <= offset)
> 		continue;
> 
> 	      if (res)
> 		{
> 		  expr = build3 (COMPONENT_REF, TREE_TYPE (fld), *res, fld,
> 				 NULL_TREE);
> 		  expr_ptr = &expr;
> 		}
> 	      else
> 		expr_ptr = NULL;
> 	      if (build_ref_for_offset_1 (expr_ptr, TREE_TYPE (fld),
> 					  offset - pos, exp_type))
> 		{
> 		  if (res)
> 		    *res = expr;
> 		  return true;
> 		}
> 	    }
> 	  return false;
> 
> 	case ARRAY_TYPE:
> 	  tr_size = TYPE_SIZE (TREE_TYPE (type));
> 	  if (!tr_size || !host_integerp (tr_size, 1))
> 	    return false;
> 	  el_size = tree_low_cst (tr_size, 1);
> 
> 	  index = build_int_cst (TYPE_DOMAIN (type), offset / el_size);
> 	  if (!integer_zerop (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
> 	    index = int_const_binop (PLUS_EXPR, index,
> 				     TYPE_MIN_VALUE (TYPE_DOMAIN (type)), 0);
> 	  if (res)
> 	    *res = build4 (ARRAY_REF, TREE_TYPE (type), *res, index, NULL_TREE,
> 			   NULL_TREE);
> 	  offset = offset % el_size;
> 	  type = TREE_TYPE (type);
> 	  break;
> 
> 	default:
> 	  if (offset != 0)
> 	    return false;
> 
> 	  if (exp_type)
> 	    return false;
> 	  else
> 	    return true;
> 	}
>     }
> }
> 
> /* Construct an expression that would reference a part of aggregate *EXPR of
>    type TYPE at the given OFFSET of the type EXP_TYPE.  If EXPR is NULL, the
>    function only determines whether it can build such a reference without
>    actually doing it.
> 
>    FIXME: Eventually this should be replaced with
>    maybe_fold_offset_to_reference() from tree-ssa-ccp.c but that requires a
>    minor rewrite of fold_stmt.
>  */
> 
> static bool
> build_ref_for_offset (tree *expr, tree type, HOST_WIDE_INT offset,
> 		      tree exp_type, bool allow_ptr)
> {
>   if (allow_ptr && POINTER_TYPE_P (type))
>     {
>       type = TREE_TYPE (type);
>       if (expr)
> 	*expr = fold_build1 (INDIRECT_REF, type, *expr);
>     }
> 
>   return build_ref_for_offset_1 (expr, type, offset, exp_type);
> }
> 
> /* The very first phase of intraprocedural SRA.  It marks in candidate_bitmap
>    those variables whose type is suitable for scalarization.  */
> 
> static bool
> find_var_candidates (void)
> {
>   tree var, type;
>   referenced_var_iterator rvi;
>   bool ret = false;
> 
>   FOR_EACH_REFERENCED_VAR (var, rvi)
>     {
>       if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
>         continue;
>       type = TREE_TYPE (var);
> 
>       if (!AGGREGATE_TYPE_P (type)
> 	  || needs_to_live_in_memory (var)

Ok, here's the TREE_ADDRESSABLE check.  I'm finally convinced that
disqualify_all should go ;)

> 	  || TREE_THIS_VOLATILE (var)
> 	  || !COMPLETE_TYPE_P (type)
> 	  || !host_integerp (TYPE_SIZE (type), 1)
>           || tree_low_cst (TYPE_SIZE (type), 1) == 0
> 	  || type_internals_preclude_sra_p (type))
> 	continue;
> 
>       bitmap_set_bit (candidate_bitmap, DECL_UID (var));
> 
>       if (dump_file)

&& (dump_flags & TDF_DETAILS)
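
i.e. the usual

  if (dump_file && (dump_flags & TDF_DETAILS))

guard, here and for the other unconditional dump_file checks below.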

> 	{
> 	  fprintf (dump_file, "Candidate (%d): ", DECL_UID (var));
> 	  print_generic_expr (dump_file, var, 0);
> 	  fprintf (dump_file, "\n");
> 	}
>       ret = true;
>     }
> 
>   return ret;
> }
> 
> /* Return true if TYPE should be considered a scalar type by SRA.  */
> 
> static bool
> is_sra_scalar_type (tree type)
> {
>   enum tree_code code = TREE_CODE (type);
>   return (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type)
> 	  || FIXED_POINT_TYPE_P (type) || POINTER_TYPE_P (type)
> 	  || code == VECTOR_TYPE || code == COMPLEX_TYPE
> 	  || code == OFFSET_TYPE);
> }

Why is this anything different from is_gimple_reg_type ()?

> /* Sort all accesses for the given variable, check for partial overlaps and
>    return NULL if there are any.  If there are none, pick a representative for
>    each combination of offset and size and create a linked list out of them.
>    Return the pointer to the first representative and make sure it is the first
>    one in the vector of accesses.  */
> 
> static struct access *
> sort_and_splice_var_accesses (tree var)
> {
>   int i, j, access_count;
>   struct access *res, **prev_acc_ptr = &res;
>   VEC (access_p, heap) *access_vec;
>   bool first = true;
>   HOST_WIDE_INT low = -1, high = 0;
> 
>   access_vec = get_base_access_vector (var);
>   if (!access_vec)
>     return NULL;
>   access_count = VEC_length (access_p, access_vec);
> 
>   /* Sort by <OFFSET, SIZE>.  */
>   qsort (VEC_address (access_p, access_vec), access_count, sizeof (access_p),
> 	 compare_access_positions);
> 
>   i = 0;
>   while (i < access_count)
>     {
>       struct access *access = VEC_index (access_p, access_vec, i);
>       bool modification = access->write;
>       bool grp_read = !access->write;
>       bool grp_bfr_lhs = access->grp_bfr_lhs;
>       bool first_scalar = is_sra_scalar_type (access->type);
>       bool unscalarizable_region = access->grp_unscalarizable_region;
> 
>       if (first || access->offset >= high)
> 	{
> 	  first = false;
> 	  low = access->offset;
> 	  high = access->offset + access->size;
> 	}
>       else if (access->offset > low && access->offset + access->size > high)
> 	return NULL;
>       else
> 	gcc_assert (access->offset >= low
> 		    && access->offset + access->size <= high);
> 
>       j = i + 1;
>       while (j < access_count)
> 	{
> 	  struct access *ac2 = VEC_index (access_p, access_vec, j);
> 	  if (ac2->offset != access->offset || ac2->size != access->size)
> 	    break;
> 	  modification |= ac2->write;
> 	  grp_read |= !ac2->write;
> 	  grp_bfr_lhs |= ac2->grp_bfr_lhs;
> 	  unscalarizable_region |= ac2->grp_unscalarizable_region;
> 	  relink_to_new_repr (access, ac2);
> 
> 	  /* If one of the equivalent accesses is scalar, use it as a
> 	     representative (this happens when there is, for example, only a
> 	     single scalar field in a structure).  */
> 	  if (!first_scalar && is_sra_scalar_type (ac2->type))
> 	    {
> 	      struct access tmp_acc;
> 	      first_scalar = true;
> 
> 	      memcpy (&tmp_acc, ac2, sizeof (struct access));
> 	      memcpy (ac2, access,  sizeof (struct access));
> 	      memcpy (access, &tmp_acc, sizeof (struct access));
> 	    }
> 	  ac2->group_representative = access;
> 	  j++;
> 	}
> 
>       i = j;
> 
>       access->group_representative = access;
>       access->grp_write = modification;
>       access->grp_read = grp_read;
>       access->grp_bfr_lhs = grp_bfr_lhs;
>       access->grp_unscalarizable_region = unscalarizable_region;
>       if (access->first_link)
> 	add_access_to_work_queue (access);
> 
>       *prev_acc_ptr = access;
>       prev_acc_ptr = &access->next_grp;
>     }
> 
>   gcc_assert (res == VEC_index (access_p, access_vec, 0));
>   return res;
> }
> 
> /* Create a variable for the given ACCESS which determines the type, name and a
>    few other properties.  Return the variable declaration and store it also to
>    ACCESS->replacement.  */
> 
> static tree
> create_access_replacement (struct access *access)
> {
>   tree repl;
> 
>   repl = make_rename_temp (access->type, "SR");
>   get_var_ann (repl);
>   add_referenced_var (repl);
> 
>   DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
>   DECL_ARTIFICIAL (repl) = 1;
> 
>   if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))

at least && !DECL_ARTIFICIAL (access->base) I think.
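
i.e. something like

  if (DECL_NAME (access->base)
      && !DECL_IGNORED_P (access->base)
      && !DECL_ARTIFICIAL (access->base))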

>     {
>       char *pretty_name = make_fancy_name (access->expr);
> 
>       DECL_NAME (repl) = get_identifier (pretty_name);
>       obstack_free (&name_obstack, pretty_name);
> 
>       SET_DECL_DEBUG_EXPR (repl, access->expr);
>       DECL_DEBUG_EXPR_IS_FROM (repl) = 1;
>       DECL_IGNORED_P (repl) = 0;
>       TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);
>     }
>   else
>     {
>       DECL_IGNORED_P (repl) = 1;
>       TREE_NO_WARNING (repl) = 1;
>     }

So just copy DECL_IGNORED_P and TREE_NO_WARNING from access->base
unconditionally?

>   if (access->grp_bfr_lhs)
>     DECL_GIMPLE_REG_P (repl) = 0;

But you never set it (see update_address_taken for more cases,
most notably VIEW_CONVERT_EXPR on the lhs which need to be taken
care of).  You should set it for COMPLEX_TYPE and VECTOR_TYPE 
replacements.
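
Concretely, something like (sketch):

  else if (TREE_CODE (access->type) == COMPLEX_TYPE
           || TREE_CODE (access->type) == VECTOR_TYPE)
    DECL_GIMPLE_REG_P (repl) = 1;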

>   if (dump_file)
>     {
>       fprintf (dump_file, "Created a replacement for ");
>       print_generic_expr (dump_file, access->base, 0);
>       fprintf (dump_file, " offset: %u, size: %u: ",
> 	       (unsigned) access->offset, (unsigned) access->size);
>       print_generic_expr (dump_file, repl, 0);
>       fprintf (dump_file, "\n");
>     }
> 
>   return repl;
> }
> 
> /* Return ACCESS scalar replacement, create it if it does not exist yet.  */
> 
> static inline tree
> get_access_replacement (struct access *access)
> {
>   gcc_assert (access->grp_to_be_replaced);
> 
>   if (access->replacement_decl)
>     return access->replacement_decl;
> 
>   access->replacement_decl = create_access_replacement (access);
>   return access->replacement_decl;
> }
> 
> /* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
>    linked list along the way.  Stop when *ACCESS is NULL or the access pointed
>    to by it is not "within" the root.  */
> 
> static void
> build_access_subtree (struct access **access)
> {
>   struct access *root = *access, *last_child = NULL;
>   HOST_WIDE_INT limit = root->offset + root->size;
> 
>   *access = (*access)->next_grp;
>   while  (*access && (*access)->offset + (*access)->size <= limit)
>     {
>       if (!last_child)
> 	root->first_child = *access;
>       else
> 	last_child->next_sibling = *access;
>       last_child = *access;
> 
>       build_access_subtree (access);
>     }
> }
> 
> /* Build a tree of access representatives, ACCESS is the pointer to the first
>    one, others are linked in a list by the next_grp field.  Decide about scalar
>    replacements on the way.  */
> 
> static void
> build_access_trees (struct access *access)
> {
>   while (access)
>     {
>       struct access *root = access;
> 
>       build_access_subtree (&access);
>       root->next_grp = access;
>     }
> }
> 
> /* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
>    they seem beneficial and when ALLOW_REPLACEMENTS allows it.  Also set all
>    sorts of access flags appropriately along the way, notably always set
>    grp_read when MARK_READ is true and grp_write when MARK_WRITE is true.  */
> 
> static bool
> analyze_access_subtree (struct access *root, bool allow_replacements,
> 			bool mark_read, bool mark_write)
> {
>   struct access *child;
>   HOST_WIDE_INT limit = root->offset + root->size;
>   HOST_WIDE_INT covered_to = root->offset;
>   bool scalar = is_sra_scalar_type (root->type);
>   bool hole = false, sth_created = false;
> 
>   if (mark_read)
>     root->grp_read = true;
>   else if (root->grp_read)
>     mark_read = true;
> 
>   if (mark_write)
>     root->grp_write = true;
>   else if (root->grp_write)
>     mark_write = true;
> 
>   if (root->grp_unscalarizable_region)
>     allow_replacements = false;
> 
>   for (child = root->first_child; child; child = child->next_sibling)
>     {
>       if (!hole && child->offset < covered_to)
> 	hole = true;
>       else
> 	covered_to += child->size;
> 
>       sth_created |= analyze_access_subtree (child,
> 					     allow_replacements && !scalar,
> 					     mark_read, mark_write);
> 
>       root->grp_unscalarized_data |= child->grp_unscalarized_data;
>       hole |= !child->grp_covered;
>     }
> 
>   if (allow_replacements && scalar && !root->first_child)
>     {
>       if (dump_file)

&& (dump_flags & TDF_DETAILS)

> 	{
> 	  fprintf (dump_file, "Marking ");
> 	  print_generic_expr (dump_file, root->base, 0);
> 	  fprintf (dump_file, " offset: %u, size: %u: ",
> 		   (unsigned) root->offset, (unsigned) root->size);
> 	  fprintf (dump_file, " to be replaced.\n");
> 	}
> 
>       root->grp_to_be_replaced = 1;
>       sth_created = true;
>       hole = false;
>     }
>   else if (covered_to < limit)
>     hole = true;
> 
>   if (sth_created && !hole)
>     {
>       root->grp_covered = 1;
>       return true;
>     }
>   if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
>     root->grp_unscalarized_data = 1; /* not covered and written to */
>   if (sth_created)
>     return true;
>   return false;
> }
> 
> /* Analyze all access trees linked by next_grp by the means of
>    analyze_access_subtree.  */
> static bool
> analyze_access_trees (struct access *access)
> {
>   bool ret = false;
> 
>   while (access)
>     {
>       if (analyze_access_subtree (access, true, false, false))
> 	ret = true;
>       access = access->next_grp;
>     }
> 
>   return ret;
> }
> 
> /* Return true iff a potential new child of LACC at offset NORM_OFFSET and
>    with size SIZE would conflict with an already existing one.  If exactly such
>    a child already exists in LACC, store a pointer to it in EXACT_MATCH.  */
> 
> static bool
> child_would_conflict_in_lacc (struct access *lacc, HOST_WIDE_INT norm_offset,
> 			      HOST_WIDE_INT size, struct access **exact_match)
> {
>   struct access *child;
> 
>   for (child = lacc->first_child; child; child = child->next_sibling)
>     {
>       if (child->offset == norm_offset && child->size == size)
> 	{
> 	  *exact_match = child;
> 	  return true;
> 	}
> 
>       if (child->offset < norm_offset + size
> 	  && child->offset + child->size > norm_offset)
> 	return true;
>     }
> 
>   return false;
> }
> 
> /* Set the expr of TARGET to one just like MODEL but with its own base at the
>    bottom of the handled components.  */
> 
> static void
> duplicate_expr_for_different_base (struct access *target,
> 				   struct access *model)
> {
>   tree t, expr = unshare_expr (model->expr);
> 
>   gcc_assert (handled_component_p (expr));
>   t = expr;
>   while (handled_component_p (TREE_OPERAND (t, 0)))
>     t = TREE_OPERAND (t, 0);
>   gcc_assert (TREE_OPERAND (t, 0) == model->base);
>   TREE_OPERAND (t, 0) = target->base;
> 
>   target->expr = expr;
> }
> 
> 
> /* Create a new child access of PARENT, with all properties just like MODEL
>    except for its offset and with its grp_write false and grp_read true.
>    Return the new access. Note that this access is created long after all
>    splicing and sorting; it is not located in any access vector and is
>    automatically a representative of its group.  */
> 
> static struct access *
> create_artificial_child_access (struct access *parent, struct access *model,
> 				HOST_WIDE_INT new_offset)
> {
>   struct access *access;
>   struct access **child;
> 
>   gcc_assert (!model->grp_unscalarizable_region);
> 
>   access = (struct access *) pool_alloc (access_pool);
>   memset (access, 0, sizeof (struct access));
>   access->base = parent->base;
>   access->offset = new_offset;
>   access->size = model->size;
>   duplicate_expr_for_different_base (access, model);
>   access->type = model->type;
>   access->grp_write = true;
>   access->grp_read = false;
> 
>   child = &parent->first_child;
>   while (*child && (*child)->offset < new_offset)
>     child = &(*child)->next_sibling;
> 
>   access->next_sibling = *child;
>   *child = access;
> 
>   return access;
> }
> 
> 
> /* Propagate all subaccesses of RACC across an assignment link to LACC. Return
>    true if any new subaccess was created.  Additionally, if RACC is a scalar
>    access but LACC is not, change the type of the latter.  */
> 
> static bool
> propagate_subacesses_accross_link (struct access *lacc, struct access *racc)
> {
>   struct access *rchild;
>   HOST_WIDE_INT norm_delta = lacc->offset - racc->offset;
>   bool ret = false;
> 
>   if (is_sra_scalar_type (lacc->type)
>       || lacc->grp_unscalarizable_region
>       || racc->grp_unscalarizable_region)
>     return false;
> 
>   if (!lacc->first_child && !racc->first_child
>       && is_sra_scalar_type (racc->type)
>       && (sra_mode == SRA_MODE_INTRA
>           || !bitmap_bit_p (retvals_bitmap, DECL_UID (lacc->base))))
>     {
>       duplicate_expr_for_different_base (lacc, racc);
>       lacc->type = racc->type;
>       return false;
>     }
> 
>   gcc_assert (lacc->size <= racc->size);
> 
>   for (rchild = racc->first_child; rchild; rchild = rchild->next_sibling)
>     {
>       struct access *new_acc = NULL;
>       HOST_WIDE_INT norm_offset = rchild->offset + norm_delta;
> 
>       if (rchild->grp_unscalarizable_region)
> 	continue;
> 
>       if (child_would_conflict_in_lacc (lacc, norm_offset, rchild->size,
> 					&new_acc))
> 	{
> 	  if (new_acc && rchild->first_child)
> 	    ret |= propagate_subacesses_accross_link (new_acc, rchild);
> 	  continue;
> 	}
> 
>       new_acc = create_artificial_child_access (lacc, rchild, norm_offset);
>       if (racc->first_child)
> 	propagate_subacesses_accross_link (new_acc, rchild);
> 
>       ret = true;
>     }
> 
>   return ret;
> }
> 
> /* Propagate all subaccesses across assignment links.  */
> 
> static void
> propagate_all_subaccesses (void)
> {
>   while (work_queue_head)
>     {
>       struct access *racc = pop_access_from_work_queue ();
>       struct assign_link *link;
> 
>       gcc_assert (racc->first_link);
> 
>       for (link = racc->first_link; link; link = link->next)
> 	{
> 	  struct access *lacc = link->lacc;
> 
> 	  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
> 	    continue;
> 	  lacc = lacc->group_representative;
> 	  if (propagate_subacesses_accross_link (lacc, racc)
> 	      && lacc->first_link)
> 	    add_access_to_work_queue (lacc);
> 	}
>     }
> }
> 
> /* Go through all accesses collected throughout the (intraprocedural) analysis
>    stage, exclude overlapping ones, identify representatives and build trees
>    out of them, making decisions about scalarization on the way.  Return true
>    iff there are any to-be-scalarized variables after this stage. */
> 
> static bool
> analyze_all_variable_accesses (void)
> {
>   tree var;
>   referenced_var_iterator rvi;
>   bool res = false;
> 
>   FOR_EACH_REFERENCED_VAR (var, rvi)
>     if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
>       {
> 	struct access *access;
> 
> 	access = sort_and_splice_var_accesses (var);
> 	if (access)
> 	  build_access_trees (access);
> 	else
> 	  disqualify_candidate (var,
> 				"No or inhibitingly overlapping accesses.");
>       }
> 
>   propagate_all_subaccesses ();
> 
>   FOR_EACH_REFERENCED_VAR (var, rvi)
>     if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
>       {
> 	struct access *access = get_first_repr_for_decl (var);
> 
> 	if (analyze_access_trees (access))
> 	  {
> 	    res = true;
> 	    if (dump_file)

&& (dump_flags & TDF_DETAILS)

> 	      {
> 		fprintf (dump_file, "\nAccess trees for ");
> 		print_generic_expr (dump_file, var, 0);
> 		fprintf (dump_file, " (UID: %u): \n", DECL_UID (var));
> 		dump_access_tree (dump_file, access);
> 		fprintf (dump_file, "\n");
> 	      }
> 	  }
> 	else
> 	  disqualify_candidate (var, "No scalar replacements to be created.");
>       }
> 
>   return res;
> }
> 
> /* Return true iff a reference expression into aggregate AGG can be built for
>    every single to-be-replaced access that is a child of ACCESS, its sibling
>    or a child of its sibling.  TOP_OFFSET is the offset of the processed
>    access subtree that has to be subtracted from the offset of each access.  */
> 
> static bool
> ref_expr_for_all_replacements_p (struct access *access, tree agg,
> 				 HOST_WIDE_INT top_offset)
> {
>   do
>     {
>       if (access->grp_to_be_replaced
> 	  && !build_ref_for_offset (NULL, TREE_TYPE (agg),
> 				    access->offset - top_offset,
> 				    access->type, false))
> 	return false;
> 
>       if (access->first_child
> 	  && !ref_expr_for_all_replacements_p (access->first_child, agg,
> 					       top_offset))
> 	return false;
> 
>       access = access->next_sibling;
>     }
>   while (access);
> 
>   return true;
> }
> 
> 
> /* Generate statements copying scalar replacements of accesses within a subtree
>    into or out of AGG.  ACCESS is the first child of the root of the subtree to
>    be processed.  AGG is an aggregate type expression (can be a declaration but
>    does not have to be, it can for example also be an indirect_ref).
>    TOP_OFFSET is the offset of the processed subtree which has to be subtracted
>    from offsets of individual accesses to get corresponding offsets for AGG.
>    If CHUNK_SIZE is non-zero, copy only replacements in the interval
>    <start_offset, start_offset + chunk_size>, otherwise copy all.  GSI is a
>    statement iterator used to place the new statements.  WRITE should be true
>    when the statements should write from AGG to the replacement and false if
>    vice versa.  If INSERT_AFTER is true, new statements will be added after the
>    current statement in GSI, they will be added before the statement
>    otherwise.  */
> 
> static void
> generate_subtree_copies (struct access *access, tree agg,
> 			 HOST_WIDE_INT top_offset,
> 			 HOST_WIDE_INT start_offset, HOST_WIDE_INT chunk_size,
> 			 gimple_stmt_iterator *gsi, bool write,
> 			 bool insert_after)
> {
>   do
>     {
>       tree expr = unshare_expr (agg);
> 
>       if (chunk_size && access->offset >= start_offset + chunk_size)
> 	return;
> 
>       if (access->grp_to_be_replaced
> 	  && (chunk_size == 0
> 	      || access->offset + access->size > start_offset))
> 	{
> 	  bool repl_found;
> 	  gimple stmt;
> 
> 	  repl_found = build_ref_for_offset (&expr, TREE_TYPE (agg),
> 					     access->offset - top_offset,
> 					     access->type, false);
> 	  gcc_assert (repl_found);
> 
> 	  if (write)
> 	    stmt = gimple_build_assign (get_access_replacement (access), expr);
> 	  else
> 	    {
> 	      tree repl = get_access_replacement (access);
> 	      TREE_NO_WARNING (repl) = 1;
> 	      stmt = gimple_build_assign (expr, repl);
> 	    }
> 
> 	  if (insert_after)
> 	    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> 	  else
> 	    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> 	  update_stmt (stmt);
> 	}
> 
>       if (access->first_child)
> 	generate_subtree_copies (access->first_child, agg, top_offset,
> 				 start_offset, chunk_size, gsi,
> 				 write, insert_after);
> 
>       access = access->next_sibling;
>     }
>   while (access);
> }
> 
> /* Assign zero to all scalar replacements in an access subtree.  ACCESS is the
>    root of the subtree to be processed.  GSI is the statement iterator used
>    for inserting statements which are added after the current statement if
>    INSERT_AFTER is true or before it otherwise.  */
> 
> static void
> init_subtree_with_zero (struct access *access, gimple_stmt_iterator *gsi,
> 			bool insert_after)
> 
> {
>   struct access *child;
> 
>   if (access->grp_to_be_replaced)
>     {
>       gimple stmt;
> 
>       stmt = gimple_build_assign (get_access_replacement (access),
> 				  fold_convert (access->type,
> 						integer_zero_node));
>       if (insert_after)
> 	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
>       else
> 	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
>       update_stmt (stmt);
>     }
> 
>   for (child = access->first_child; child; child = child->next_sibling)
>     init_subtree_with_zero (child, gsi, insert_after);
> }
> 
> /* Search for an access representative for the given expression EXPR and
>    return it or NULL if it cannot be found.  */
> 
> static struct access *
> get_access_for_expr (tree expr)
> {
>   HOST_WIDE_INT offset, size, max_size;
>   tree base;
> 
>   if (TREE_CODE (expr) == NOP_EXPR

CONVERT_EXPR_P (expr)

>       || TREE_CODE (expr) == VIEW_CONVERT_EXPR)

VIEW_CONVERT_EXPR is also a handled_component_p.

Note that NOP_EXPR should never occur here - that would be invalid
gimple.  So I think you can (and should) just delete the above.

>     expr = TREE_OPERAND (expr, 0);
> 
>   if (handled_component_p (expr))
>     {
>       base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
>       size = max_size;
>       if (size == -1 || !base || !DECL_P (base))
> 	return NULL;
>     }
>   else if (DECL_P (expr))
>     {
>       tree tree_size;
> 
>       base = expr;
>       tree_size = TYPE_SIZE (TREE_TYPE (base));
>       if (tree_size && host_integerp (tree_size, 1))
> 	size = max_size = tree_low_cst (tree_size, 1);
>       else
> 	return NULL;
> 
>       offset = 0;

See above.  get_ref_base_and_extent handles plain DECLs just fine.
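
That is, with handled components and plain DECLs unified, the prologue
could shrink to something like (untested sketch):

  base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
  size = max_size;
  if (size == -1 || !base || !DECL_P (base))
    return NULL;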

>     }
>   else
>     return NULL;
> 
>   if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
>     return NULL;
> 
>   return get_var_base_offset_size_access (base, offset, size);
> }
> 
> /* Substitute into *EXPR an expression of type TYPE with the value of the
>    replacement of ACCESS.  This is done either by producing a special V_C_E
>    assignment statement converting the replacement to a new temporary of the
>    requested type if TYPE is not TREE_ADDRESSABLE or by going through the base
>    aggregate if it is.  */
> 
> static void
> fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
> 				 gimple_stmt_iterator *gsi, bool write)
> {
>   tree repl = get_access_replacement (access);
>   if (!TREE_ADDRESSABLE (type))
>     {
>       tree tmp = create_tmp_var (type, "SRvce");
> 
>       add_referenced_var (tmp);
>       if (is_gimple_reg_type (type))
> 	tmp = make_ssa_name (tmp, NULL);

Should be always is_gimple_reg_type () if it is a type suitable for
a SRA scalar replacement.  But you should set DECL_GIMPLE_REG_P for
VECTOR and COMPLEX types here.

>       if (write)
> 	{
> 	  gimple stmt;
> 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);

This needs to either always fold to plain 'tmp' or tmp has to be a
non-register.  Otherwise you will create invalid gimple.

> 	  *expr = tmp;
> 	  if (is_gimple_reg_type (type))
> 	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);

See above.

> 	  stmt = gimple_build_assign (repl, conv);
> 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> 	  update_stmt (stmt);
> 	}
>       else
> 	{
> 	  gimple stmt;
> 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
> 
> 	  stmt = gimple_build_assign (tmp, conv);
> 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> 	  if (is_gimple_reg_type (type))
> 	    SSA_NAME_DEF_STMT (tmp) = stmt;

See above.  (I wonder if the patch still passes bootstrap & regtest
after the typechecking patch)

> 	  *expr = tmp;
> 	  update_stmt (stmt);
> 	}
>     }
>   else
>     {
>       if (write)
> 	{
> 	  gimple stmt;
> 
> 	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
> 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> 	  update_stmt (stmt);
> 	}
>       else
> 	{
> 	  gimple stmt;
> 
> 	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
> 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> 	  update_stmt (stmt);
> 	}

I don't understand this path.  Are the types here always compatible?

>     }
> }
> 
> 
> /* Callback for scan_function.  Replace the expression EXPR with a scalar
>    replacement if there is one and generate other statements to do type
>    conversion or subtree copying if necessary.  GSI is used to place newly
>    created statements, WRITE is true if the expression is being written to (it
>    is on a LHS of a statement or output in an assembly statement).  */
> 
> static bool
> sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write,
> 		 void *data ATTRIBUTE_UNUSED)
> {
>   struct access *access;
>   tree type, bfr;
> 
>   if (TREE_CODE (*expr) == BIT_FIELD_REF)
>     {
>       bfr = *expr;
>       expr = &TREE_OPERAND (*expr, 0);
>     }
>   else
>     bfr = NULL_TREE;
> 
>   if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
>     expr = &TREE_OPERAND (*expr, 0);

Why strip these early?  I think this is wrong (or do you always want to
produce complex type replacements, even if only the real or imaginary part
is used?  If so, a strategic comment somewhere is missing.)

>   type = TREE_TYPE (*expr);
> 
>   access = get_access_for_expr (*expr);
>   if (!access)
>     return false;
> 
>   if (access->grp_to_be_replaced)
>     {
>       if (!useless_type_conversion_p (type, access->type))
> 	fix_incompatible_types_for_expr (expr, type, access, gsi, write);
>       else
> 	*expr = get_access_replacement (access);
>     }
> 
>   if (access->first_child)
>     {
>       HOST_WIDE_INT start_offset, chunk_size;
>       if (bfr
> 	  && host_integerp (TREE_OPERAND (bfr, 1), 1)
> 	  && host_integerp (TREE_OPERAND (bfr, 2), 1))
> 	{
> 	  start_offset = tree_low_cst (TREE_OPERAND (bfr, 1), 1);
> 	  chunk_size = tree_low_cst (TREE_OPERAND (bfr, 2), 1);
> 	}
>       else
> 	start_offset = chunk_size = 0;
> 
>       generate_subtree_copies (access->first_child, access->base, 0,
> 			       start_offset, chunk_size, gsi, write, write);
>     }
>   return true;
> }
> 
> /* Store all replacements in the access tree rooted in TOP_RACC either to their
>    base aggregate if there are unscalarized data or directly to LHS
>    otherwise.  */
> 
> static void
> handle_unscalarized_data_in_subtree (struct access *top_racc, tree lhs,
> 				     gimple_stmt_iterator *gsi)
> {
>   if (top_racc->grp_unscalarized_data)
>     generate_subtree_copies (top_racc->first_child, top_racc->base, 0, 0, 0,
> 			     gsi, false, false);
>   else
>     generate_subtree_copies (top_racc->first_child, lhs, top_racc->offset,
> 			     0, 0, gsi, false, false);
> }
> 
> 
> /* Try to generate statements to load all sub-replacements in an access
>    (sub)tree (LACC is the first child) from scalar replacements in the TOP_RACC
>    (sub)tree.  If that is not possible, refresh the TOP_RACC base aggregate and
>    load the accesses from it.  LEFT_OFFSET is the offset of the left whole
>    subtree being copied, RIGHT_OFFSET is the same thing for the right subtree.
>    GSI is stmt iterator used for statement insertions.  *REFRESHED is true iff
>    the rhs top aggregate has already been refreshed by contents of its scalar
>    reductions and is set to true if this function has to do it.  */
> 
> static void
> load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
> 				 HOST_WIDE_INT left_offset,
> 				 HOST_WIDE_INT right_offset,
> 				 gimple_stmt_iterator *old_gsi,
> 				 gimple_stmt_iterator *new_gsi,
> 				 bool *refreshed, tree lhs)
> {
>   do
>     {
>       if (lacc->grp_to_be_replaced)
> 	{
> 	  struct access *racc;
> 	  HOST_WIDE_INT offset = lacc->offset - left_offset + right_offset;
> 
> 	  racc = find_access_in_subtree (top_racc, offset, lacc->size);
> 	  if (racc && racc->grp_to_be_replaced)
> 	    {
> 	      gimple stmt;
> 
> 	      if (useless_type_conversion_p (lacc->type, racc->type))
> 		stmt = gimple_build_assign (get_access_replacement (lacc),
> 					    get_access_replacement (racc));
> 	      else
> 		{
> 		  tree rhs = fold_build1 (VIEW_CONVERT_EXPR, lacc->type,
> 					  get_access_replacement (racc));
> 		  stmt = gimple_build_assign (get_access_replacement (lacc),
> 					      rhs);
> 		}
> 
> 	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
> 	      update_stmt (stmt);
> 	    }
> 	  else
> 	    {
> 	      tree expr = unshare_expr (top_racc->base);
> 	      bool repl_found;
> 	      gimple stmt;
> 
> 	      /* No suitable access on the right hand side, need to load from
> 		 the aggregate.  See if we have to update it first... */
> 	      if (!*refreshed)
> 		{
> 		  gcc_assert (top_racc->first_child);
> 		  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
> 		  *refreshed = true;
> 		}
> 
> 	      repl_found = build_ref_for_offset (&expr,
> 						 TREE_TYPE (top_racc->base),
> 						 lacc->offset - left_offset,
> 						 lacc->type, false);
> 	      gcc_assert (repl_found);
> 	      stmt = gimple_build_assign (get_access_replacement (lacc),
> 					  expr);
> 	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
> 	      update_stmt (stmt);
> 	    }
> 	}
>       else if (lacc->grp_read && !lacc->grp_covered && !*refreshed)
> 	{
> 	  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
> 	  *refreshed = true;
> 	}
> 
>       if (lacc->first_child)
> 	load_assign_lhs_subreplacements (lacc->first_child, top_racc,
> 					 left_offset, right_offset,
> 					 old_gsi, new_gsi, refreshed, lhs);
>       lacc = lacc->next_sibling;
>     }
>   while (lacc);
> }
> 
> /* Return true iff ACC is non-NULL and has subaccesses.  */
> 
> static inline bool
> access_has_children_p (struct access *acc)
> {
>   return acc && acc->first_child;
> }
> 
> /* Modify assignments with a CONSTRUCTOR on their RHS.  STMT contains a pointer
>    to the assignment and GSI is the statement iterator pointing at it.  Returns
>    the same values as sra_modify_assign.  */
> 
> static enum scan_assign_result
> sra_modify_constructor_assign (gimple *stmt, gimple_stmt_iterator *gsi)
> {
>   tree lhs = gimple_assign_lhs (*stmt);
>   struct access *acc;
> 
>   gcc_assert (TREE_CODE (lhs) != REALPART_EXPR
> 	      && TREE_CODE (lhs) != IMAGPART_EXPR);
>   acc = get_access_for_expr (lhs);
>   if (!acc)
>     return SRA_SA_NONE;
> 
>   if (VEC_length (constructor_elt,
> 		  CONSTRUCTOR_ELTS (gimple_assign_rhs1 (*stmt))) > 0)
>     {
>       /* I have never seen this code path trigger but if it can happen the
> 	 following should handle it gracefully.  */

It can trigger for vector constants.

>       if (access_has_children_p (acc))
> 	generate_subtree_copies (acc->first_child, acc->base, 0, 0, 0, gsi,
> 				 true, true);
>       return SRA_SA_PROCESSED;
>     }
> 
>   if (acc->grp_covered)
>     {
>       init_subtree_with_zero (acc, gsi, false);
>       unlink_stmt_vdef (*stmt);
>       gsi_remove (gsi, true);
>       return SRA_SA_REMOVED;
>     }
>   else
>     {
>       init_subtree_with_zero (acc, gsi, true);
>       return SRA_SA_PROCESSED;
>     }
> }
> 
> 
> /* Modify statements which have an IMAGPART_EXPR or REALPART_EXPR of a
>    to-be-scalarized expression on their lhs.  STMT is the statement and GSI is
>    the iterator used to place new helper statements.  Returns the same values
>    as sra_modify_assign.  */
> 
> static enum scan_assign_result
> sra_modify_partially_complex_lhs (gimple stmt, gimple_stmt_iterator *gsi)
> {
>   tree lhs, complex, ptype, rp, ip;
>   struct access *access;
>   gimple new_stmt, aux_stmt;
> 
>   lhs = gimple_assign_lhs (stmt);
>   complex = TREE_OPERAND (lhs, 0);
> 
>   access = get_access_for_expr (complex);
> 
>   if (!access || !access->grp_to_be_replaced)
>     return SRA_SA_NONE;
> 
>   ptype = TREE_TYPE (TREE_TYPE (complex));
>   rp = create_tmp_var (ptype, "SRr");
>   add_referenced_var (rp);
>   rp = make_ssa_name (rp, NULL);
> 
>   ip = create_tmp_var (ptype, "SRp");
>   add_referenced_var (ip);
>   ip = make_ssa_name (ip, NULL);
> 
>   if (TREE_CODE (lhs) == IMAGPART_EXPR)
>     {
>       aux_stmt = gimple_build_assign (rp, fold_build1 (REALPART_EXPR, ptype,
> 					     get_access_replacement (access)));
>       SSA_NAME_DEF_STMT (rp) = aux_stmt;
>       gimple_assign_set_lhs (stmt, ip);
>       SSA_NAME_DEF_STMT (ip) = stmt;
>     }
>   else
>     {
>       aux_stmt = gimple_build_assign (ip, fold_build1 (IMAGPART_EXPR, ptype,
> 					     get_access_replacement (access)));
>       SSA_NAME_DEF_STMT (ip) = aux_stmt;
>       gimple_assign_set_lhs (stmt, rp);
>       SSA_NAME_DEF_STMT (rp) = stmt;
>     }
> 
>   gsi_insert_before (gsi, aux_stmt, GSI_SAME_STMT);
>   update_stmt (aux_stmt);
>   new_stmt = gimple_build_assign (get_access_replacement (access),
> 				  fold_build2 (COMPLEX_EXPR, access->type,
> 					       rp, ip));
>   gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
>   update_stmt (new_stmt);

Hm.  So you do what complex lowering does here.  Note that this may
create loads from uninitialized memory with all its problems.

WRT the complex stuff.  If you did scalarization and analysis just on
the components (not special-casing REAL/IMAGPART_EXPR everywhere), it
should work better, correct?  You could still handle group
scalarization for cases such as passing a complex argument to a
function.

void bar(_Complex float);
void foo(float x, float y)
{
  _Complex float z = x;
  __imag z = y;
  bar(z);
}

The same applies for vectors - the REAL/IMAGPART_EXPRs equivalent
there is BIT_FIELD_REF.

>   return SRA_SA_PROCESSED;
> }
> 
> /* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */
> 
> static bool
> contains_view_convert_expr_p (tree t)
> {
>   while (1)
>     {
>       if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
> 	return true;
>       if (!handled_component_p (t))
> 	return false;
>       t = TREE_OPERAND (t, 0);
>     }
> }

Place this in tree-flow-inline.h next to ref_contains_array_ref, also
structure the loop in the same way.
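
Presumably like this (untested sketch, mirroring the structure of
ref_contains_array_ref):

  static inline bool
  contains_view_convert_expr_p (const_tree ref)
  {
    while (handled_component_p (ref))
      {
        if (TREE_CODE (ref) == VIEW_CONVERT_EXPR)
          return true;
        ref = TREE_OPERAND (ref, 0);
      }
    return false;
  }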

> /* Change STMT to assign compatible types by means of adding component or array
>    references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
>    variables with the same names in sra_modify_assign.  This is done in
>    such a complicated way in order to make
>    testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy and so it helps in at least some
>    cases.  */
> 
> static void
> fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
> 				   struct access *lacc, struct access *racc,
> 				   tree lhs, tree *rhs, tree ltype, tree rtype)
> {
>   if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
>       && !access_has_children_p (lacc))
>     {
>       tree expr = unshare_expr (lhs);
>       bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
> 					 false);
>       if (found)
> 	{
> 	  gimple_assign_set_lhs (*stmt, expr);
> 	  return;
> 	}
>     }
> 
>   if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
>       && !access_has_children_p (racc))
>     {
>       tree expr = unshare_expr (*rhs);
>       bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
> 					 false);
>       if (found)
> 	{
> 	  gimple_assign_set_rhs1 (*stmt, expr);
> 	  return;
> 	}
>     }
> 
>   *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
>   gimple_assign_set_rhs_from_tree (gsi, *rhs);
>   *stmt = gsi_stmt (*gsi);

Reading this I have a deja-vu - isn't there another function in this
file doing the same thing?  You are doing much unsharing even though
you re-build the access tree from scratch?

> }
> 
> /* Callback of scan_function to process assign statements.  It examines both
>    sides of the statement, replaces them with a scalare replacement if there is
>    one and generating copying of replacements if scalarized aggregates have been
>    used in the assignment.  STMT is a pointer to the assign statement, GSI is
>    used to hold generated statements for type conversions and subtree
>    copying.  */
> 
> static enum scan_assign_result
> sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
> 		   void *data ATTRIBUTE_UNUSED)
> {
>   struct access *lacc, *racc;
>   tree ltype, rtype;
>   tree lhs, rhs;
>   bool modify_this_stmt;
> 
>   if (gimple_assign_rhs2 (*stmt))

!gimple_assign_single_p (*stmt)

(the only gimple assign that may access memory)

>     return SRA_SA_NONE;
>   lhs = gimple_assign_lhs (*stmt);
>   rhs = gimple_assign_rhs1 (*stmt);
> 
>   if (TREE_CODE (rhs) == CONSTRUCTOR)
>     return sra_modify_constructor_assign (stmt, gsi);
> 
>   if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
>     return sra_modify_partially_complex_lhs (*stmt, gsi);
> 
>   if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
>       || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
>     {
>       modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
> 					  gsi, false, data);
>       modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
> 					   gsi, true, data);
>       return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
>     }
> 
>   lacc = get_access_for_expr (lhs);
>   racc = get_access_for_expr (rhs);
>   if (!lacc && !racc)
>     return SRA_SA_NONE;
> 
>   modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
> 		      || (racc && racc->grp_to_be_replaced));
> 
>   if (lacc && lacc->grp_to_be_replaced)
>     {
>       lhs = get_access_replacement (lacc);
>       gimple_assign_set_lhs (*stmt, lhs);
>       ltype = lacc->type;
>     }
>   else
>     ltype = TREE_TYPE (lhs);
> 
>   if (racc && racc->grp_to_be_replaced)
>     {
>       rhs = get_access_replacement (racc);
>       gimple_assign_set_rhs1 (*stmt, rhs);
>       rtype = racc->type;
>     }
>   else
>     rtype = TREE_TYPE (rhs);
> 
>   /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate
>      the statement makes the position of this pop_stmt_changes() a bit awkward
>      but hopefully makes some sense.  */

I don't see pop_stmt_changes().

>   if (modify_this_stmt)
>     {
>       if (!useless_type_conversion_p (ltype, rtype))
> 	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
> 					   lhs, &rhs, ltype, rtype);
>     }
> 
>   if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
>       || (access_has_children_p (racc)
> 	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
>       || (access_has_children_p (lacc)
> 	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))

?  A comment explaining what this case is about is missing ...

(this smells like fixup that could be avoided by doing things correctly
in the first place)

>     {
>       if (access_has_children_p (racc))
> 	generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
> 				 gsi, false, false);
>       if (access_has_children_p (lacc))
> 	generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
> 				 gsi, true, true);
>     }
>   else
>     {
>       if (access_has_children_p (lacc) && access_has_children_p (racc))
> 	{
> 	  gimple_stmt_iterator orig_gsi = *gsi;
> 	  bool refreshed;
> 
> 	  if (lacc->grp_read && !lacc->grp_covered)
> 	    {
> 	      handle_unscalarized_data_in_subtree (racc, lhs, gsi);
> 	      refreshed = true;
> 	    }
> 	  else
> 	    refreshed = false;
> 
> 	  load_assign_lhs_subreplacements (lacc->first_child, racc,
> 					   lacc->offset, racc->offset,
> 					   &orig_gsi, gsi, &refreshed, lhs);
> 	  if (!refreshed || !racc->grp_unscalarized_data)
> 	    {
> 	      if (*stmt == gsi_stmt (*gsi))
> 		gsi_next (gsi);
> 
> 	      unlink_stmt_vdef (*stmt);
> 	      gsi_remove (&orig_gsi, true);
> 	      return SRA_SA_REMOVED;
> 	    }
> 	}
>       else
> 	{
> 	  if (access_has_children_p (racc))
> 	    {
> 	      if (!racc->grp_unscalarized_data)
> 		{
> 		  generate_subtree_copies (racc->first_child, lhs,
> 					   racc->offset, 0, 0, gsi,
> 					   false, false);
> 		  gcc_assert (*stmt == gsi_stmt (*gsi));
> 		  unlink_stmt_vdef (*stmt);
> 		  gsi_remove (gsi, true);
> 		  return SRA_SA_REMOVED;
> 		}
> 	      else
> 		generate_subtree_copies (racc->first_child, lhs,
> 					 racc->offset, 0, 0, gsi, false, true);
> 	    }
> 	  else if (access_has_children_p (lacc))
> 	    generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
> 				     0, 0, gsi, true, true);
> 	}
>     }
> 
>   return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
> }
> 
> /* Generate statements initializing scalar replacements of parts of function
>    parameters.  */
> 
> static void
> initialize_parameter_reductions (void)
> {
>   gimple_stmt_iterator gsi;
>   gimple_seq seq = NULL;
>   tree parm;
> 
>   for (parm = DECL_ARGUMENTS (current_function_decl);
>        parm;
>        parm = TREE_CHAIN (parm))
>     {
>       VEC (access_p, heap) *access_vec;
>       struct access *access;
> 
>       if (!bitmap_bit_p (candidate_bitmap, DECL_UID (parm)))
> 	continue;
>       access_vec = get_base_access_vector (parm);
>       if (!access_vec)
> 	continue;
> 
>       if (!seq)
> 	{
> 	  seq = gimple_seq_alloc ();
> 	  gsi = gsi_start (seq);
> 	}
> 
>       for (access = VEC_index (access_p, access_vec, 0);
> 	   access;
> 	   access = access->next_grp)
> 	generate_subtree_copies (access, parm, 0, 0, 0, &gsi, true, true);
>     }
> 
>   if (seq)
>     gsi_insert_seq_on_edge_immediate (single_succ_edge (ENTRY_BLOCK_PTR), seq);
> }
> 
> /* The "main" function of intraprocedural SRA passes.  Runs the analysis and if
>    it reveals there are components of some aggregates to be scalarized, it runs
>    the required transformations.  */
> static unsigned int
> perform_intra_sra (void)
> {
>   int ret = 0;
>   sra_initialize ();
> 
>   if (!find_var_candidates ())
>     goto out;
> 
>   if (!scan_function (build_access_from_expr, build_accesses_from_assign, NULL,
> 		      true, NULL))
>     goto out;
> 
>   if (!analyze_all_variable_accesses ())
>     goto out;
> 
>   scan_function (sra_modify_expr, sra_modify_assign, NULL,
> 		 false, NULL);
>   initialize_parameter_reductions ();
> 
>   ret = TODO_update_ssa;

redundant set.

>   if (sra_mode == SRA_MODE_EARLY_INTRA)
>     ret = TODO_update_ssa;
>   else
>     ret = TODO_update_ssa | TODO_rebuild_alias;

in fact you shouldn't (need to) rebuild alias.
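
IOW the tail could simply be (sketch):

  ret = TODO_update_ssa;

 out:
  sra_deinitialize ();
  return ret;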

>  out:
>   sra_deinitialize ();
>   return ret;
> }
> 
> /* Perform early intraprocedural SRA.  */
> static unsigned int
> early_intra_sra (void)
> {
>   sra_mode = SRA_MODE_EARLY_INTRA;
>   return perform_intra_sra ();
> }
> 
> /* Perform "late" intraprocedural SRA.  */
> static unsigned int
> late_intra_sra (void)
> {
>   sra_mode = SRA_MODE_INTRA;
>   return perform_intra_sra ();
> }
> 
> 
> static bool
> gate_intra_sra (void)
> {
>   return flag_tree_sra != 0;
> }
> 
> 
> struct gimple_opt_pass pass_sra_early =
> {
>  {
>   GIMPLE_PASS,
>   "esra",	 			/* name */
>   gate_intra_sra,			/* gate */
>   early_intra_sra,			/* execute */
>   NULL,					/* sub */
>   NULL,					/* next */
>   0,					/* static_pass_number */
>   TV_TREE_SRA,				/* tv_id */
>   PROP_cfg | PROP_ssa,                  /* properties_required */
>   0,					/* properties_provided */
>   0,					/* properties_destroyed */
>   0,					/* todo_flags_start */
>   TODO_dump_func
>   | TODO_update_ssa
>   | TODO_ggc_collect
>   | TODO_verify_ssa			/* todo_flags_finish */
>  }
> };
> 
> 
> struct gimple_opt_pass pass_sra =
> {
>  {
>   GIMPLE_PASS,
>   "sra",	 			/* name */
>   gate_intra_sra,			/* gate */
>   late_intra_sra,			/* execute */
>   NULL,					/* sub */
>   NULL,					/* next */
>   0,					/* static_pass_number */
>   TV_TREE_SRA,				/* tv_id */
>   PROP_cfg | PROP_ssa,                  /* properties_required */
>   0,					/* properties_provided */
>   0,					/* properties_destroyed */
>   TODO_update_address_taken,		/* todo_flags_start */
>   TODO_dump_func
>   | TODO_update_ssa
>   | TODO_ggc_collect
>   | TODO_verify_ssa			/* todo_flags_finish */
>  }
> };


Overall it looks good - I'm still a little bit confused, but that's
likely because reading code from top to bottom doesn't make the
most sense in all cases ;)

Looking forward to a second look on a revised version.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA
  2009-04-29 12:39     ` Martin Jambor
@ 2009-04-29 13:13       ` Richard Guenther
  2009-05-20 10:23         ` Martin Jambor
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Guenther @ 2009-04-29 13:13 UTC (permalink / raw)
  To: Martin Jambor; +Cc: Richard Guenther, GCC Patches, Jan Hubicka

On Wed, 29 Apr 2009, Martin Jambor wrote:

> Hi,
> 
> On Tue, Apr 28, 2009 at 01:48:55PM +0200, Richard Guenther wrote:
> > On Tue, Apr 28, 2009 at 12:04 PM, Martin Jambor <mjambor@suse.cz> wrote:
> > > The new intra-SRA produces an extra copy assignment and that breaks
> > > ipa-prop.c pattern matching.  The following patch fixes that.
> > >
> > > Thanks,
> > >
> > > Martin
> > >
> > >
> > > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> > >
> > >        * ipa-prop.c (get_ssa_def_if_simple_copy): New function.
> > >        (determine_cst_member_ptr): Call get_ssa_def_if_simple_copy to skip
> > >        simple copies.
> > >
> > >
> > > Index: mine/gcc/ipa-prop.c
> > > ===================================================================
> > > --- mine.orig/gcc/ipa-prop.c
> > > +++ mine/gcc/ipa-prop.c
> > > @@ -456,6 +456,22 @@ fill_member_ptr_cst_jump_function (struc
> > >   jfunc->value.member_cst.delta = delta;
> > >  }
> > >
> > > +/* If RHS is an SSA_NAME and it is defined by a simple copy assign statement,
> > > +   return the rhs of its defining statement.  */
> > > +
> > > +static inline tree
> > > +get_ssa_def_if_simple_copy (tree rhs)
> > > +{
> > > +  if (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
> > > +    {
> > > +      gimple def_stmt = SSA_NAME_DEF_STMT (rhs);
> > > +
> > > +      if (is_gimple_assign (def_stmt) && gimple_num_ops (def_stmt) == 2)
> > > +       rhs = gimple_assign_rhs1 (def_stmt);
> > > +    }
> > > +  return rhs;
> > > +}
> > 
> > IMHO this function should loop.  Also use gimple_assign_single_p
> > instead of the assign && num_ops check. 
> 
> OK
> 
> > You also  have to check  the gimple_assign_rhs_code to  be SSA_NAME,
> > otherwise you happily look through all unary operations.
> > 
> 
> Will the RHS code be SSA_NAME even when the RHS is an invariant? (I am
> eventually looking  for an invariant,  specifically an ADDR_EXPR  of a
> FUNCTION_DECL and an integer constant, not an ssa name.)

No.  In that case you want to check

  if (gimple_assign_single_p (def_stmt)
      && (gimple_assign_rhs_code (def_stmt) == SSA_NAME
          || is_gimple_min_invariant (gimple_assign_rhs1 (def_stmt))))
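
Putting those suggestions together, the helper might then look like
(untested sketch):

  static inline tree
  get_ssa_def_if_simple_copy (tree rhs)
  {
    while (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
      {
        gimple def_stmt = SSA_NAME_DEF_STMT (rhs);

        if (gimple_assign_single_p (def_stmt)
            && (gimple_assign_rhs_code (def_stmt) == SSA_NAME
                || is_gimple_min_invariant (gimple_assign_rhs1 (def_stmt))))
          rhs = gimple_assign_rhs1 (def_stmt);
        else
          break;
      }
    return rhs;
  }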

Richard.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-04-29 12:56     ` Richard Guenther
@ 2009-05-10 10:33       ` Martin Jambor
  2009-05-10 11:48         ` Richard Guenther
  2009-05-10 10:39       ` Martin Jambor
  1 sibling, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-05-10 10:33 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches, Jan Hubicka

Hi,

thank you very much for a detailed review.

You are  probably correct that the  functions are in  a rather strange
order  for a  reader.  This  is because  IPA-SRA was  removed  and the
things it shares with intra-SRA remained in the same order.

I will send a new tree-sra.c as a separate reply to your mail in a few
moments.  A few reactions to your comments are below.

On Wed, Apr 29, 2009 at 02:48:12PM +0200, Richard Guenther wrote:
> On Tue, 28 Apr 2009, Martin Jambor wrote:
> 
> > On Tue, Apr 28, 2009 at 12:04:32PM +0200, Martin Jambor wrote:
> > > This  is  the  new  intraprocedural  SRA.  I  have  stripped  off  the
> > > interprocedural part  and will propose to commit  it separately later.
> > > I have  tried to  remove almost every  trace of IPA-SRA,  however, two
> > > provisions for it  have remained in the patch.   First, an enumeration
> > > (rather than  a boolean) is  used to distuinguish between  "early" and
> > > "late" SRA  so that other  SRA modes can  be added later  on.  Second,
> > > scan_function()  has a  hook parameter  and a  void  pointer parameter
> > > which are not used in this patch but will be by IPA-SRA.
> > > 
> > > Otherwise, the patch is hopefully self-contained and the bases of its
> > > operation is described by the initial comment.
> > > 
> > > The patch bootstraps (on x86_64-linux-gnu but I am about to try it on
> > > hppa-linux-gnu too) but produces a small number of testsuite failures
> > > which are handled by the two following patches.
> > > 
> > > Thanks,
> > > 
> > > Martin
> > > 
> > > 
> > > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> > > 
> > > 	* tree-sra.c (enum sra_mode): The whole contents of the file was
> > > 	replaced.
> > 
> > Hm, the  patch is quite unreadable,  below is the  new tree-sra.c file
> > which entirely replaces the old one (note that the patch also modifies
> > the Makefile though):
> 
> Ah.  Here it is ... the comment to the changelog still applies.

Sure.

> > /* Scalar Replacement of Aggregates (SRA) converts some structure
> >    references into scalar references, exposing them to the scalar
> >    optimizers.
> >    Copyright (C) 2008, 2009 Free Software Foundation, Inc.
> >    Contributed by Martin Jambor <mjambor@suse.cz>
> > 
> > This file is part of GCC.
> > 
> > GCC is free software; you can redistribute it and/or modify it under
> > the terms of the GNU General Public License as published by the Free
> > Software Foundation; either version 3, or (at your option) any later
> > version.
> > 
> > GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> > WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> > for more details.
> > 
> > You should have received a copy of the GNU General Public License
> > along with GCC; see the file COPYING3.  If not see
> > <http://www.gnu.org/licenses/>.  */
> > 
> > /* This file implements Scalar Reduction of Aggregates (SRA).  SRA is run
> >    twice, once in the early stages of compilation (early SRA) and once in the
> >    late stages (late SRA).  The aim of both is to turn references to scalar
> >    parts of aggregates into uses of independent scalar variables.
> > 
> >    The two passes are nearly identical, the only difference is that early SRA
> >    does not scalarize unions which are used as the result in a GIMPLE_RETURN
> >    statement because together with inlining this can lead to weird type
> >    conversions.
> 
> Do you happen to have a testcase for this or can you describe that problem
> some more?

The testcase  is gfortran.fortran-torture/execute/entry_4.f90 from our
testsuite.  IIRC  there is a  union with a  float and a boolean  in it
which is returned in a  number of functions.  In different functions I
selected one of these fields as scalar replacements (because the other
one was not used in the function; I refuse to create replacements of
accesses  that are  scalar but  have children  and children  of scalar
accesses because of similar problems)  and used a V_C_E to construct a
return value  of the correct  type.  However, inlining  combined these
together and wanted to cast a boolean to a float.  That made the expander
ICE on an assert.

> >    Both passes operate in four stages:
> > 
> >    1. The declarations that have properties which make them candidates for
> >       scalarization are identified in function find_var_candidates().  The
> >       candidates are stored in candidate_bitmap.
> > 
> >    2. The function body is scanned.  In the process, declarations which are
> >       used in a manner that prevent their scalarization are removed from the
> >       candidate bitmap.  More importantly, for every access into an aggregate,
> >       an access structure (struct access) is created by create_access() and
> >       stored in a vector associated with the aggregate.  Among other
> >       information, the aggregate declaration, the offset and size of the access
> >       and its type are stored in the structure.
> > 
> >       On a related note, assign_link structures are created for every assign
> >       statement between candidate aggregates and attached to the related
> >       accesses.
> > 
> >    3. The vectors of accesses are analyzed.  They are first sorted according to
> >       their offset and size and then scanned for partially overlapping accesses
> >       (i.e. those which overlap but one is not entirely within another).  Such
> >       an access disqualifies the whole aggregate from being scalarized.
> 
> This happens only when get_ref_base_and_extent punts and returns -1 for
> the access size?  And of course with struct field accesses of different
> structs that are inside the same union.

This was specifically aimed  at through-union accesses which could not
be (both) represented  by independent scalars.  As you  noted below, I
bail out if get_ref_base_and_extent returned -1 for max_size.
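
A minimal (hypothetical) example of such a partial overlap would be:

  union u
  {
    struct { char c; int i; } __attribute__ ((packed)) p;
    int j;
  };

Here p.i occupies bits [8, 40) while j occupies bits [0, 32); the two
accesses overlap but neither is contained in the other, so no pair of
independent scalars can represent both of them and the whole union
has to be disqualified.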

> 
> >       If there is no such inhibiting overlap, a representative access structure
> >       is chosen for every unique combination of offset and size.  Afterwards,
> >       the pass builds a set of trees from these structures, in which children
> >       of an access are within their parent (in terms of offset and size).
> > 
> >       Then accesses  are propagated  whenever possible (i.e.  in cases  when it
> >       does not create a partially overlapping access) across assign_links from
> >       the right hand side to the left hand side.
> > 
> >       Then the set of trees for each declaration is traversed again and those
> >       accesses which should be replaced by a scalar are identified.
> > 
> >    4. The function is traversed again, and for every reference into an
> >       aggregate that has some component which is about to be scalarized,
> >       statements are amended and new statements are created as necessary.
> >       Finally, if a parameter got scalarized, the scalar replacements are
> >       initialized with values from respective parameter aggregates.
> > */
> 
> Closing */ goes to the previous line.

OK

> > #include "config.h"
> > #include "system.h"
> > #include "coretypes.h"
> > #include "alloc-pool.h"
> > #include "tm.h"
> > #include "tree.h"
> > #include "gimple.h"
> > #include "tree-flow.h"
> > #include "diagnostic.h"
> > #include "tree-dump.h"
> > #include "timevar.h"
> > #include "params.h"
> > #include "target.h"
> > #include "flags.h"
> > 
> > /* Enumeration of all aggregate reductions we can do.  */
> > enum sra_mode {SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
> > 	       SRA_MODE_INTRA};	     /* late intraprocedural SRA */
> 
> Spaces after { and before }.
> 

OK

> > /* Global variable describing which aggregate reduction we are performing at
> >    the moment.  */
> > static enum sra_mode sra_mode;
> > 
> > struct assign_link;
> > 
> > /* ACCESS represents each access to an aggregate variable (as a whole or a
> >    part).  It can also represent a group of accesses that refer to exactly the
> >    same fragment of an aggregate (i.e. those that have exactly the same offset
> >    and size).  Such representatives for a single aggregate, once determined,
> >    are linked in a linked list and have the group fields set.
> > 
> >    Moreover, when doing intraprocedural SRA, a tree is built from those
> >    representatives (by the means of first_child and next_sibling pointers), in
> >    which all items in a subtree are "within" the root, i.e. their offset is
> >    greater or equal to offset of the root and offset+size is smaller or equal
> >    to offset+size of the root.  Children of an access are sorted by offset.
> > */
> 
> */ to previous line.

OK

> > struct access
> > {
> >   /* Values returned by `get_ref_base_and_extent' for each COMPONENT_REF
> >      If EXPR isn't a COMPONENT_REF just set `BASE = EXPR', `OFFSET = 0',
> >      `SIZE = TREE_SIZE (TREE_TYPE (expr))'.  */
> 
> s/COMPONENT_REF/component reference/g - it's not only COMPONENT_REF
> trees we handle.

OK

> >   HOST_WIDE_INT offset;
> >   HOST_WIDE_INT size;
> >   tree base;
> > 
> >   /* Expression.  */
> >   tree expr;
> >   /* Type.  */
> >   tree type;
> > 
> >   /* Next group representative for this aggregate. */
> >   struct access *next_grp;
> > 
> >   /* Pointer to the group representative.  Pointer to itself if the struct is
> >      the representative.  */
> >   struct access *group_representative;
> > 
> >   /* If this access has any children (in terms of the definition above), this
> >      points to the first one.  */
> >   struct access *first_child;
> > 
> >   /* Pointer to the next sibling in the access tree as described above.  */
> >   struct access *next_sibling;
> > 
> >   /* Pointers to the first and last element in the linked list of assign
> >      links.  */
> >   struct assign_link *first_link, *last_link;
> 
> vertical space missing.

OK

> >   /* Pointer to the next access in the work queue.  */
> >   struct access *next_queued;
> > 
> >   /* Replacement variable for this access "region."  Never to be accessed
> >      directly, always only by the means of get_access_replacement() and only
> >      when grp_to_be_replaced flag is set.  */
> >   tree replacement_decl;
> > 
> >   /* Is this particular access a write access? */
> >   unsigned write : 1;
> > 
> >   /* Is this access currently in the work queue?  */
> >   unsigned grp_queued : 1;
> >   /* Does this group contain a write access?  This flag is propagated down the
> >      access tree.  */
> >   unsigned grp_write : 1;
> >   /* Does this group contain a read access?  This flag is propagated down the
> >      access tree.  */
> >   unsigned grp_read : 1;
> >   /* Is the subtree rooted in this access fully covered by scalar
> >      replacements?  */
> >   unsigned grp_covered : 1;
> >   /* If set to true, this access and all below it in an access tree must not be
> >      scalarized.  */
> >   unsigned grp_unscalarizable_region : 1;
> >   /* Whether data have been written to parts of the aggregate covered by this
> >      access which is not to be scalarized.  This flag is propagated up in the
> >      access tree.  */
> >   unsigned grp_unscalarized_data : 1;
> >   /* Does this access and/or group contain a write access through a
> >      BIT_FIELD_REF?  */
> >   unsigned grp_bfr_lhs : 1;
> > 
> >   /* Set when a scalar replacement should be created for this variable.  We
> >      make the decision and create the replacement at different places because
> >      create_tmp_var
> >      cannot be called from within FOR_EACH_REFERENCED_VAR. */
> >   unsigned grp_to_be_replaced : 1;
> > };
> > 
> > typedef struct access *access_p;
> > 
> > DEF_VEC_P (access_p);
> > DEF_VEC_ALLOC_P (access_p, heap);
> > 
> > /* Alloc pool for allocating access structures.  */
> > static alloc_pool access_pool;
> > 
> > /* A structure linking lhs and rhs accesses from an aggregate assignment.  They
> >    are used to propagate subaccesses from rhs to lhs as long as they don't
> >    conflict with what is already there.  */
> > struct assign_link
> > {
> >   struct access *lacc, *racc;
> >   struct assign_link *next;
> > };
> > 
> > /* Alloc pool for allocating assign link structures.  */
> > static alloc_pool link_pool;
> > 
> > /* Base (tree) -> Vector (VEC(access_p,heap) *) map.  */
> > static struct pointer_map_t *base_access_vec;
> > 
> > /* Bitmap of bases (candidates).  */
> > static bitmap candidate_bitmap;
> > /* Bitmap of declarations used in a return statement.  */
> > static bitmap retvals_bitmap;
> > /* Obstack for creation of fancy names.  */
> > static struct obstack name_obstack;
> > 
> > /* Head of a linked list of accesses that need to have their subaccesses
> >    propagated to their assignment counterparts. */
> > static struct access *work_queue_head;
> > 
> > /* Dump contents of ACCESS to file F in a human friendly way.  If GRP is true,
> >    representative fields are dumped, otherwise those which only describe the
> >    individual access are.  */
> > 
> > static void
> > dump_access (FILE *f, struct access *access, bool grp)
> > {
> >   fprintf (f, "access { ");
> >   fprintf (f, "base = (%d)'", DECL_UID (access->base));
> >   print_generic_expr (f, access->base, 0);
> >   fprintf (f, "', offset = %d", (int) access->offset);
> >   fprintf (f, ", size = %d", (int) access->size);
> 
> you can use ", offset = "HOST_WIDE_INT_PRINT_DEC, access->offset here.
> 

OK

> >   fprintf (f, ", expr = ");
> >   print_generic_expr (f, access->expr, 0);
> >   fprintf (f, ", type = ");
> >   print_generic_expr (f, access->type, 0);
> >   if (grp)
> >     fprintf (f, ", grp_write = %d, grp_read = %d, grp_covered = %d, "
> > 	     "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
> > 	     "grp_to_be_replaced = %d\n",
> > 	     access->grp_write, access->grp_read, access->grp_covered,
> > 	     access->grp_unscalarizable_region, access->grp_unscalarized_data,
> > 	     access->grp_to_be_replaced);
> >   else
> >     fprintf (f, ", write = %d'\n", access->write);
> > }
> > 
> > /* Dump a subtree rooted in ACCESS to file F, indent by LEVEL.  */
> > 
> > static void
> > dump_access_tree_1 (FILE *f, struct access *access, int level)
> > {
> >   do
> >     {
> >       int i;
> > 
> >       for (i = 0; i < level; i++)
> > 	fputs ("* ", dump_file);
> > 
> >       dump_access (f, access, true);
> > 
> >       if (access->first_child)
> > 	dump_access_tree_1 (f, access->first_child, level + 1);
> > 
> >       access = access->next_sibling;
> >     }
> >   while (access);
> > }
> > 
> > /* Dump all access trees for a variable, given the pointer to the first root in
> >    ACCESS.  */
> > 
> > static void
> > dump_access_tree (FILE *f, struct access *access)
> > {
> >   for (; access; access = access->next_grp)
> >     dump_access_tree_1 (f, access, 0);
> > }
> > 
> > /* Return a vector of pointers to accesses for the variable given in BASE or
> >    NULL if there is none.  */
> > 
> > static VEC (access_p, heap) *
> > get_base_access_vector (tree base)
> > {
> >   void **slot;
> > 
> >   slot = pointer_map_contains (base_access_vec, base);
> >   if (!slot)
> >     return NULL;
> >   else
> >     return *(VEC (access_p, heap) **) slot;
> > }
> > 
> > /* Find an access with required OFFSET and SIZE in a subtree of accesses rooted
> >    in ACCESS.  Return NULL if it cannot be found.  */
> > 
> > static struct access *
> > find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
> > 			HOST_WIDE_INT size)
> > {
> >   while (access && (access->offset != offset || access->size != size))
> >     {
> >       struct access *child = access->first_child;
> > 
> >       while (child && (child->offset + child->size <= offset))
> > 	child = child->next_sibling;
> 
> Do you limit the number of siblings?  We should keep an eye on this
> for potential compile-time issues (not that I expect some).
> 

Not artificially and given how I construct them (from representatives),
I doubt that  it is easily possible.  They are limited  by the size and
the  structure of  the  aggregate though.   If  ever we  come across  a
problem  like this,  we  might disable  scalarization  of grossly  huge
aggregates.


> >       access = child;
> >     }
> > 
> >   return access;
> > }
> > 
> > /* Return the first group representative for DECL or NULL if none exists.  */
> > 
> > static struct access *
> > get_first_repr_for_decl (tree base)
> > {
> >   VEC (access_p, heap) *access_vec;
> > 
> >   access_vec = get_base_access_vector (base);
> >   if (!access_vec)
> >     return NULL;
> > 
> >   return VEC_index (access_p, access_vec, 0);
> > }
> > 
> > /* Find an access representative for the variable BASE and given OFFSET and
> >    SIZE.  Requires that access trees have already been built.  Return NULL if
> >    it cannot be found.  */
> > 
> > static struct access *
> > get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
> > 				 HOST_WIDE_INT size)
> > {
> >   struct access *access;
> > 
> >   access = get_first_repr_for_decl (base);
> >   while (access && (access->offset + access->size <= offset))
> >     access = access->next_grp;
> >   if (!access)
> >     return NULL;
> > 
> >   return find_access_in_subtree (access, offset, size);
> > }
> > 
> > /* Add LINK to the linked list of assign links of RACC.  */
> > static void
> > add_link_to_rhs (struct access *racc, struct assign_link *link)
> > {
> >   gcc_assert (link->racc == racc);
> > 
> >   if (!racc->first_link)
> >     {
> >       gcc_assert (!racc->last_link);
> >       racc->first_link = link;
> >     }
> >   else
> >     racc->last_link->next = link;
> > 
> >   racc->last_link = link;
> >   link->next = NULL;
> > }
> > 
> > /* Move all link structures in their linked list in OLD_RACC to the linked list
> >    in NEW_RACC.  */
> > static void
> > relink_to_new_repr (struct access *new_racc, struct access *old_racc)
> > {
> >   if (!old_racc->first_link)
> >     {
> >       gcc_assert (!old_racc->last_link);
> >       return;
> >     }
> > 
> >   if (new_racc->first_link)
> >     {
> >       gcc_assert (!new_racc->last_link->next);
> >       gcc_assert (!old_racc->last_link || !old_racc->last_link->next);
> > 
> >       new_racc->last_link->next = old_racc->first_link;
> >       new_racc->last_link = old_racc->last_link;
> >     }
> >   else
> >     {
> >       gcc_assert (!new_racc->last_link);
> > 
> >       new_racc->first_link = old_racc->first_link;
> >       new_racc->last_link = old_racc->last_link;
> >     }
> >   old_racc->first_link = old_racc->last_link = NULL;
> > }
> > 
> > /* Add ACCESS to the work queue (which is actually a stack).  */
> > 
> > static void
> > add_access_to_work_queue (struct access *access)
> > {
> >   if (!access->grp_queued)
> >     {
> >       gcc_assert (!access->next_queued);
> >       access->next_queued = work_queue_head;
> >       access->grp_queued = 1;
> >       work_queue_head = access;
> >     }
> > }
> > 
> > /* Pop an access from the work queue, and return it, assuming there is one.  */
> > 
> > static struct access *
> > pop_access_from_work_queue (void)
> > {
> >   struct access *access = work_queue_head;
> > 
> >   work_queue_head = access->next_queued;
> >   access->next_queued = NULL;
> >   access->grp_queued = 0;
> >   return access;
> > }
> > 
> > 
> > /* Allocate necessary structures.  */
> > 
> > static void
> > sra_initialize (void)
> > {
> >   candidate_bitmap = BITMAP_ALLOC (NULL);
> >   retvals_bitmap = BITMAP_ALLOC (NULL);
> >   gcc_obstack_init (&name_obstack);
> >   access_pool = create_alloc_pool ("SRA accesses", sizeof (struct access), 16);
> >   link_pool = create_alloc_pool ("SRA links", sizeof (struct assign_link), 16);
> >   base_access_vec = pointer_map_create ();
> > }
> > 
> > /* Hook fed to pointer_map_traverse, deallocate stored vectors.  */
> > 
> > static bool
> > delete_base_accesses (const void *key ATTRIBUTE_UNUSED, void **value,
> > 		     void *data ATTRIBUTE_UNUSED)
> > {
> >   VEC (access_p, heap) *access_vec;
> >   access_vec = (VEC (access_p, heap) *) *value;
> >   VEC_free (access_p, heap, access_vec);
> > 
> >   return true;
> > }
> > 
> > /* Deallocate all general structures.  */
> > 
> > static void
> > sra_deinitialize (void)
> > {
> >   BITMAP_FREE (candidate_bitmap);
> >   BITMAP_FREE (retvals_bitmap);
> >   free_alloc_pool (access_pool);
> >   free_alloc_pool (link_pool);
> >   obstack_free (&name_obstack, NULL);
> > 
> >   pointer_map_traverse (base_access_vec, delete_base_accesses, NULL);
> >   pointer_map_destroy (base_access_vec);
> > }
> > 
> > /* Remove DECL from candidates for SRA and write REASON to the dump file if
> >    there is one.  */
> > static void
> > disqualify_candidate (tree decl, const char *reason)
> > {
> >   bitmap_clear_bit (candidate_bitmap, DECL_UID (decl));
> > 
> >   if (dump_file)
> 
> && (dump_flags & TDF_DETAILS)

OK

> >     {
> >       fprintf (dump_file, "! Disqualifying ");
> >       print_generic_expr (dump_file, decl, 0);
> >       fprintf (dump_file, " - %s\n", reason);
> >     }
> > }
> > 
> > /* Return true iff the type contains a field or an element which does not allow
> >    scalarization.  */
> > 
> > static bool
> > type_internals_preclude_sra_p (tree type)
> > {
> >   tree fld;
> >   tree et;
> > 
> >   switch (TREE_CODE (type))
> >     {
> >     case RECORD_TYPE:
> >     case UNION_TYPE:
> >     case QUAL_UNION_TYPE:
> >       for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
> > 	if (TREE_CODE (fld) == FIELD_DECL)
> > 	  {
> > 	    tree ft = TREE_TYPE (fld);
> > 
> > 	    if (TREE_THIS_VOLATILE (fld)
> > 		|| !DECL_FIELD_OFFSET (fld) || !DECL_SIZE (fld)
> > 		|| !host_integerp (DECL_FIELD_OFFSET (fld), 1)
> > 		|| !host_integerp (DECL_SIZE (fld), 1))
> > 	      return true;
> > 
> > 	    if (AGGREGATE_TYPE_P (ft)
> > 		&& type_internals_preclude_sra_p (ft))
> > 	      return true;
> > 	  }
> > 
> >       return false;
> > 
> >     case ARRAY_TYPE:
> >       et = TREE_TYPE (type);
> > 
> >       if (AGGREGATE_TYPE_P (et))
> > 	return type_internals_preclude_sra_p (et);
> >       else
> > 	return false;
> > 
> >     default:
> >       return false;
> >     }
> > }
> > 
> > /* Create and insert access for EXPR. Return created access, or NULL if it is
> >    not possible.  */
> > 
> > static struct access *
> > create_access (tree expr, bool write)
> > {
> >   struct access *access;
> >   void **slot;
> >   VEC (access_p,heap) *vec;
> >   HOST_WIDE_INT offset, size, max_size;
> >   tree base = expr;
> >   bool unscalarizable_region = false;
> > 
> >   if (handled_component_p (expr))
> >     base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
> >   else
> >     {
> >       tree tree_size;
> > 
> >       tree_size = TYPE_SIZE (TREE_TYPE (base));
> >       if (tree_size && host_integerp (tree_size, 1))
> > 	size = max_size = tree_low_cst (tree_size, 1);
> >       else
> > 	size = max_size = -1;
> > 
> >       offset = 0;
> >     }
> 
> get_ref_base_and_extent should now also work on plain DECLs
> (non-handled_component_p) and base should never be NULL.

Good point.  (However,  I seem to remember I put  the !base check there
for a reason... it might have  been IPA-SRA though.  I'll put an assert
there for the next few days at least.)

> >   if (!base || !DECL_P (base)
> >       || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
> >     return NULL;
> > 
> >   if (size != max_size)
> >     {
> >       size = max_size;
> >       unscalarizable_region = true;
> >     }
> > 
> >   if (size < 0)
> >     {
> >       disqualify_candidate (base, "Encountered an ultra variable sized "
> > 			    "access.");
> 
> ultra variable sized?  I would name it 'unconstrained access'.

Well, IPA-SRA does not even like normal variable-sized accesses :-)  But yes,
I'll change the reason string.

>  Note that there is still useful information in this case if offset
> is non-zero (namely accesses before [offset, -1] may still be
> scalarized).  Maybe something for further improvements.  This for
> example would happen for structs with trailing arrays.

Yes, we could make an unscalarizable region from it.  It's now on my
TODO list.
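
For reference, a sketch of the trailing-array situation (with a local
variable so that the base is a candidate DECL):

  struct s { int n; int data[1]; } v;
  ... = v.data[i];	/* variable index i */

get_ref_base_and_extent returns offset 32 and max_size -1 for the
array access, so the field n in bits [0, 32) could in principle still
be scalarized; currently I disqualify the whole of v.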

> >       return NULL;
> >     }
> > 
> >   access = (struct access *) pool_alloc (access_pool);
> >   memset (access, 0, sizeof (struct access));
> > 
> >   access->base = base;
> >   access->offset = offset;
> >   access->size = size;
> >   access->expr = expr;
> >   access->type = TREE_TYPE (expr);
> >   access->write = write;
> >   access->grp_unscalarizable_region = unscalarizable_region;
> > 
> >   slot = pointer_map_contains (base_access_vec, base);
> >   if (slot)
> >     vec = (VEC (access_p, heap) *) *slot;
> >   else
> >     vec = VEC_alloc (access_p, heap, 32);
> > 
> >   VEC_safe_push (access_p, heap, vec, access);
> > 
> >   *((struct VEC (access_p,heap) **)
> > 	pointer_map_insert (base_access_vec, base)) = vec;
> > 
> >   return access;
> > }
> > 
> > 
> > /* Callback of walk_tree.  Search the given tree for a declaration and exclude
> >    it from the candidates.  */
> > 
> > static tree
> > disqualify_all (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
> > {
> >   tree base = *tp;
> > 
> > 
> >   if (TREE_CODE (base) == SSA_NAME)
> >     base = SSA_NAME_VAR (base);
> 
> Err ... for SSA_NAME bases there is nothing to scalarize?  So just
> bail out in that case? 

You are right, this is also a leftover from IPA-SRA. 

> In fact, using walk_tree for disqualify_all looks
> a bit expensive (it also walks types).

OK, it  seems quite possible to  get rid of  it and use a  simple dive
through handled_components where still necessary.

> >   if (DECL_P (base))
> >     {
> >       disqualify_candidate (base, "From within disqualify_all().");
> >       *walk_subtrees = 0;
> >     }
> >   else
> >     *walk_subtrees = 1;
> > 
> > 
> >   return NULL_TREE;
> > }
> > 
> > /* Scan expression EXPR and create access structures for all accesses to
> >    candidates for scalarization.  Return the created access or NULL if none is
> >    created.  */
> > 
> > static struct access *
> > build_access_from_expr_1 (tree *expr_ptr,
> > 			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write)
> > {
> >   struct access *ret = NULL;
> >   tree expr = *expr_ptr;
> >   tree safe_expr = expr;
> >   bool bit_ref;
> > 
> >   if (TREE_CODE (expr) == BIT_FIELD_REF)
> >     {
> >       expr = TREE_OPERAND (expr, 0);
> >       bit_ref = true;
> >     }
> >   else
> >     bit_ref = false;
> > 
> >   while (TREE_CODE (expr) == NOP_EXPR
> 
> CONVERT_EXPR_P (expr)

OK... but  at another place  in the email  you said it might  not even
appear in a valid gimple statement?  Should I remove it altogether?

> > 	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
> > 	 || TREE_CODE (expr) == REALPART_EXPR
> > 	 || TREE_CODE (expr) == IMAGPART_EXPR)
> >     expr = TREE_OPERAND (expr, 0);
> 
> Why do this here btw, and not just lump ...
> 
> >   switch (TREE_CODE (expr))
> >     {
> >     case ADDR_EXPR:
> >     case SSA_NAME:
> >     case INDIRECT_REF:
> >       break;
> > 
> >     case VAR_DECL:
> >     case PARM_DECL:
> >     case RESULT_DECL:
> >     case COMPONENT_REF:
> >     case ARRAY_REF:
> >       ret = create_access (expr, write);
> >       break;
> 
> ... this ...
> 
> >     case REALPART_EXPR:
> >     case IMAGPART_EXPR:
> >       expr = TREE_OPERAND (expr, 0);
> >       ret = create_access (expr, write);
> 
> ... and this together?  Won't you create bogus accesses if you
> strip for example IMAGPART_EXPR (which has non-zero offset)?

That would  break the complex  number into its components.   I thought
that they are  meant to stay together for  some reason, otherwise they
would not be represented explicitly  in gimple... do you think it does
not matter?  What about vectors then?

The access is not bogus because modification functions take care of
these statements in a special way.  However, if it is indeed OK to
split complex numbers into their components, I will gladly simplify
this as you suggested.
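
To illustrate what the current code does (hypothetical example):

  _Complex float c;
  ... = __imag__ c;

The IMAGPART_EXPR itself denotes bits [32, 64) of c, but after
stripping it I record an access to the whole of c at offset 0 with
type _Complex float, i.e. the complex number is kept together as one
unit and the modification code re-creates the __imag__ on top of the
replacement.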

> >       break;
> > 
> >     case ARRAY_RANGE_REF:
> 
> it should just be handled fine I think.

OK, I will try that at some later stage.

> >     default:
> >       walk_tree (&safe_expr, disqualify_all, NULL, NULL);
> 
> and if not, this should just disqualify the base of the access, like
> get_base_address (safe_expr) (save_expr you mean?) and then if that
> is a DECL, disqualify that decl.

I'll test just doing nothing, things handled by get_base_address are
either fine or already accounted for.

> >       break;
> >     }
> > 
> >   if (write && bit_ref && ret)
> >     ret->grp_bfr_lhs = 1;
> > 
> >   return ret;
> > }
> > 
> > /* Scan expression EXPR and create access structures for all accesses to
> >    candidates for scalarization.  Return true if any access has been
> >    inserted.  */
> > 
> > static bool
> > build_access_from_expr (tree *expr_ptr,
> > 			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write,
> > 			void *data ATTRIBUTE_UNUSED)
> > {
> >   return build_access_from_expr_1 (expr_ptr, gsi, write) != NULL;
> > }
> > 
> > /* Disqualify LHS and RHS for scalarization if STMT must end its basic block in
> >    modes in which it matters; return true iff they have been disqualified.  RHS
> >    may be NULL, in that case ignore it.  If we scalarize an aggregate in
> >    intra-SRA we may need to add statements after each statement.  This is not
> >    possible if a statement unconditionally has to end the basic block.  */
> > static bool
> > disqualify_ops_if_throwing_stmt (gimple stmt, tree *lhs, tree *rhs)
> > {
> >   if (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))
> >     {
> >       walk_tree (lhs, disqualify_all, NULL, NULL);
> >       if (rhs)
> > 	walk_tree (rhs, disqualify_all, NULL, NULL);
> >       return true;
> >     }
> >   return false;
> > }
> > 
> > 
> > /* Result code for scan_assign callback for scan_function.  */
> > enum scan_assign_result {SRA_SA_NONE,       /* nothing done for the stmt */
> > 			 SRA_SA_PROCESSED,  /* stmt analyzed/changed */
> > 			 SRA_SA_REMOVED};   /* stmt redundant and eliminated */
> 
> space after { and before }.

OK
 
> > 
> > 
> > /* Scan expressions occurring in the statement pointed to by STMT_EXPR, create
> >    access structures for all accesses to candidates for scalarization and
> >    remove those candidates which occur in statements or expressions that
> >    prevent them from being split apart.  Return true if any access has been
> >    inserted.  */
> > 
> > static enum scan_assign_result
> > build_accesses_from_assign (gimple *stmt_ptr,
> > 			    gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
> > 			    void *data ATTRIBUTE_UNUSED)
> > {
> >   gimple stmt = *stmt_ptr;
> >   tree *lhs_ptr, *rhs_ptr;
> >   struct access *lacc, *racc;
> > 
> >   if (gimple_assign_rhs2 (stmt))
> 
> !gimple_assign_single_p (stmt)
> 
> >     return SRA_SA_NONE;
> >
> >   lhs_ptr = gimple_assign_lhs_ptr (stmt);
> >   rhs_ptr = gimple_assign_rhs1_ptr (stmt);
> 
> you probably don't need to pass pointers to trees everywhere as you
> are not changing them.

Well, this  function is a  callback called by scan_function  which can
also  call  sra_modify_expr  in  the  last  stage  of  the  pass  when
statements  are modified.   I have  considered splitting  the function
into two but  in the end I  thought they would be too  similar and the
overhead is hopefully manageable.
 
> >   if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
> >     return SRA_SA_NONE;
> > 
> >   racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
> >   lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);
> 
> just avoid calling into build_access_from_expr_1 for SSA_NAMEs
> or is_gimple_min_invariant lhs/rhs, that should make that
> function more regular.

In what sense?  build_access_from_expr_1 looks at TREE_CODE anyway and
can discard the two cases, without for example looking into ADDR_EXPRs
like is_gimple_min_invariant().

But if you really think it is indeed beneficial, I can do that, sure
(to me it just looks ugly).

 
> >   if (lacc && racc
> >       && !lacc->grp_unscalarizable_region
> >       && !racc->grp_unscalarizable_region
> >       && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
> >       && lacc->size <= racc->size
> >       && useless_type_conversion_p (lacc->type, racc->type))
> 
> useless_type_conversion_p should be always true here.

I don't think so, build_access_from_expr_1 can look through V_C_Es and
the types of accesses are the type of the operand in such cases.
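
For instance (a hypothetical example, assuming struct A and struct B
have the same size):

  struct A a;
  struct B b;
  ...
  a = VIEW_CONVERT_EXPR <struct A> (b);

Here lacc->type is struct A but racc->type is struct B - the type of
the V_C_E operand that build_access_from_expr_1 looked through - and
so useless_type_conversion_p returns false.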
 
> >     {
> >       struct assign_link *link;
> > 
> >       link = (struct assign_link *) pool_alloc (link_pool);
> >       memset (link, 0, sizeof (struct assign_link));
> > 
> >       link->lacc = lacc;
> >       link->racc = racc;
> > 
> >       add_link_to_rhs (racc, link);
> >     }
> > 
> >   return (lacc || racc) ? SRA_SA_PROCESSED : SRA_SA_NONE;
> > }
> > 
> > /* Scan function and look for interesting statements. Return true if any has
> >    been found or processed, as indicated by callbacks.  SCAN_EXPR is a callback
> >    called on all expressions within statements except assign statements and
> >    those deemed entirely unsuitable for some reason (all operands in such
> >    statements and expression are removed from candidate_bitmap).  SCAN_ASSIGN
> >    is a callback called on all assign statements, HANDLE_SSA_DEFS is a callback
> >    called on assign statements and those call statements which have a lhs and
> >    it is the only callback which can be NULL. ANALYSIS_STAGE is true when
> >    running in the analysis stage of a pass and thus no statement is being
> >    modified.  DATA is a pointer passed to all callbacks.  If any single
> >    callback returns true, this function also returns true, otherwise it returns
> >    false.  */
> > 
> > static bool
> > scan_function (bool (*scan_expr) (tree *, gimple_stmt_iterator *, bool, void *),
> > 	       enum scan_assign_result (*scan_assign) (gimple *,
> > 						       gimple_stmt_iterator *,
> > 						       void *),
> > 	       bool (*handle_ssa_defs)(gimple, void *),
> > 	       bool analysis_stage, void *data)
> > {
> >   gimple_stmt_iterator gsi;
> >   basic_block bb;
> >   unsigned i;
> >   tree *t;
> >   bool ret = false;
> > 
> >   FOR_EACH_BB (bb)
> >     {
> >       bool bb_changed = false;
> > 
> >       gsi = gsi_start_bb (bb);
> >       while (!gsi_end_p (gsi))
> > 	{
> > 	  gimple stmt = gsi_stmt (gsi);
> > 	  enum scan_assign_result assign_result;
> > 	  bool any = false, deleted = false;
> > 
> > 	  switch (gimple_code (stmt))
> > 	    {
> > 	    case GIMPLE_RETURN:
> > 	      t = gimple_return_retval_ptr (stmt);
> > 	      if (*t != NULL_TREE)
> > 		{
> > 		  if (DECL_P (*t))
> > 		    {
> > 		      tree ret_type = TREE_TYPE (*t);
> > 		      if (sra_mode == SRA_MODE_EARLY_INTRA
> > 			  && (TREE_CODE (ret_type) == UNION_TYPE
> > 			      || TREE_CODE (ret_type) == QUAL_UNION_TYPE))
> > 			disqualify_candidate (*t,
> > 					      "Union in a return statement.");
> > 		      else
> > 			bitmap_set_bit (retvals_bitmap, DECL_UID (*t));
> > 		    }
> > 		  any |= scan_expr (t, &gsi, false, data);
> > 		}
> 
> Likewise for passing pointers (why is gsi necessary and passing a stmt
> does not work?)

So that scan_expr can insert V_C_E statements before or after the
statement.  I'll explain more when commenting on the
fix_incompatible_types_for_expr() function.

 
> > 	      break;
> > 
> > 	    case GIMPLE_ASSIGN:
> > 	      assign_result = scan_assign (&stmt, &gsi, data);
> > 	      any |= assign_result == SRA_SA_PROCESSED;
> > 	      deleted = assign_result == SRA_SA_REMOVED;
> > 	      if (handle_ssa_defs && assign_result != SRA_SA_REMOVED)
> > 		any |= handle_ssa_defs (stmt, data);
> > 	      break;
> > 
> > 	    case GIMPLE_CALL:
> > 	      /* Operands must be processed before the lhs.  */
> > 	      for (i = 0; i < gimple_call_num_args (stmt); i++)
> > 		{
> > 		  tree *argp = gimple_call_arg_ptr (stmt, i);
> > 		  any |= scan_expr (argp, &gsi, false, data);
> > 		}
> > 
> > 	      if (gimple_call_lhs (stmt))
> > 		{
> > 		  tree *lhs_ptr = gimple_call_lhs_ptr (stmt);
> > 		  if (!analysis_stage ||
> > 		      !disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, NULL))
> > 		    {
> > 		      any |= scan_expr (lhs_ptr, &gsi, true, data);
> > 		      if (handle_ssa_defs)
> > 			any |= handle_ssa_defs (stmt, data);
> > 		    }
> > 		}
> > 	      break;
> > 
> > 	    case GIMPLE_ASM:
> > 	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
> > 		{
> > 		  tree *op = &TREE_VALUE (gimple_asm_input_op (stmt, i));
> > 		  any |= scan_expr (op, &gsi, false, data);
> > 		}
> > 	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
> > 		{
> > 		  tree *op = &TREE_VALUE (gimple_asm_output_op (stmt, i));
> > 		  any |= scan_expr (op, &gsi, true, data);
> > 		}
> 
> asm operands with memory constraints should be disqualified from SRA
> (see walk_stmt_load_store_addr_ops and/or the operand scanner).

Hm, identifying those with that constraint seems quite complicated and
scan_function is complex enough as it is.  I guess I'll use
walk_stmt_load_store_addr_ops to determine this for me then.
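
Something along these lines, I suppose (only a sketch, untested):

  static bool
  asm_visit_addr (gimple stmt ATTRIBUTE_UNUSED, tree op,
		  void *data ATTRIBUTE_UNUSED)
  {
    op = get_base_address (op);
    if (op && DECL_P (op))
      disqualify_candidate (op, "Non-scalarizable GIMPLE_ASM operand.");

    return false;
  }

and then in the GIMPLE_ASM case of scan_function, during the analysis
stage:

  walk_stmt_load_store_addr_ops (stmt, NULL, NULL, NULL,
				 asm_visit_addr);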

> > 	    default:
> > 	      if (analysis_stage)
> > 		walk_gimple_op (stmt, disqualify_all, NULL);
> 
> You seem to be very eager to disqualify anything unknown ;)  (But I
> yet have to come across a TREE_ADDRESSABLE check ...)

It is in find_var_candidates.
 
> > 	      break;
> > 	    }
> > 
> > 	  if (any)
> > 	    {
> > 	      ret = true;
> > 	      bb_changed = true;
> > 
> > 	      if (!analysis_stage)
> 
> Oh.  So we reuse this function.  Hmm.

Yes.   And with  IPA-SRA it's  reused again  with another  few special
cases.  As I  have written above, I  am no longer sure this  is a good
idea but so far it's been manageable and it has a few advantages too.
 
> > 		{
> > 		  update_stmt (stmt);
> > 		  if (!stmt_could_throw_p (stmt))
> > 		    remove_stmt_from_eh_region (stmt);
> 
> Usually
> 
>   if (maybe_clean_or_replace_eh_stmt (stmt, stmt)
>       && gimple_purge_dead_eh_edges (gimple_bb (stmt)))
> 
> is the pattern for this.  But then you disqualified all throwing
> expressions, no?

I'll experiment with removing this.  So far it's on my TODO list so
that it does not hold me up sending an updated version to you.
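
I.e. presumably replacing the stmt_could_throw_p/
remove_stmt_from_eh_region pair with something like (untested):

  update_stmt (stmt);
  if (maybe_clean_or_replace_eh_stmt (stmt, stmt))
    gimple_purge_dead_eh_edges (gimple_bb (stmt));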

 
> > 		}
> > 	    }
> > 	  if (deleted)
> > 	    bb_changed = true;
> > 	  else
> > 	    {
> > 	      gsi_next (&gsi);
> > 	      ret = true;
> > 	    }
> > 	}
> >       if (!analysis_stage && bb_changed)
> > 	gimple_purge_dead_eh_edges (bb);
> >     }
> > 
> >   return ret;
> > }
> > 
> > /* Helper of QSORT function. There are pointers to accesses in the array.  An
> >    access is considered smaller than another if it has smaller offset or if the
> >    offsets are the same but its size is bigger. */
> > 
> > static int
> > compare_access_positions (const void *a, const void *b)
> > {
> >   const access_p *fp1 = (const access_p *) a;
> >   const access_p *fp2 = (const access_p *) b;
> >   const access_p f1 = *fp1;
> >   const access_p f2 = *fp2;
> > 
> >   if (f1->offset != f2->offset)
> >     return f1->offset < f2->offset ? -1 : 1;
> > 
> >   if (f1->size == f2->size)
> >     return 0;
> >   /* We want the bigger accesses first, thus the opposite operator in the next
> >      line: */
> >   return f1->size > f2->size ? -1 : 1;
> > }
> > 
> > 
> > /* Append a name of the declaration to the name obstack.  A helper function for
> >    make_fancy_name.  */
> > 
> > static void
> > make_fancy_decl_name (tree decl)
> > {
> >   char buffer[32];
> > 
> >   tree name = DECL_NAME (decl);
> >   if (name)
> >     obstack_grow (&name_obstack, IDENTIFIER_POINTER (name),
> > 		  IDENTIFIER_LENGTH (name));
> >   else
> >     {
> >       sprintf (buffer, "D%u", DECL_UID (decl));
> 
> That would just be useless information.  I guess you copied this
> from old SRA?

Yes.  All this fancy naming stuff  is quite useless but I find it very
handy when debugging SRA issues.
 
> >       obstack_grow (&name_obstack, buffer, strlen (buffer));
> >     }
> > }
> > 
> > /* Helper for make_fancy_name.  */
> > 
> > static void
> > make_fancy_name_1 (tree expr)
> > {
> >   char buffer[32];
> >   tree index;
> > 
> >   if (DECL_P (expr))
> >     {
> >       make_fancy_decl_name (expr);
> >       return;
> >     }
> > 
> >   switch (TREE_CODE (expr))
> >     {
> >     case COMPONENT_REF:
> >       make_fancy_name_1 (TREE_OPERAND (expr, 0));
> >       obstack_1grow (&name_obstack, '$');
> >       make_fancy_decl_name (TREE_OPERAND (expr, 1));
> >       break;
> > 
> >     case ARRAY_REF:
> >       make_fancy_name_1 (TREE_OPERAND (expr, 0));
> >       obstack_1grow (&name_obstack, '$');
> >       /* Arrays with only one element may not have a constant as their
> > 	 index. */
> >       index = TREE_OPERAND (expr, 1);
> >       if (TREE_CODE (index) != INTEGER_CST)
> > 	break;
> >       sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (index));
> >       obstack_grow (&name_obstack, buffer, strlen (buffer));
> > 
> >       break;
> > 
> >     case BIT_FIELD_REF:
> >     case REALPART_EXPR:
> >     case IMAGPART_EXPR:
> >       gcc_unreachable (); 	/* we treat these as scalars.  */
> >       break;
> >     default:
> >       break;
> >     }
> > }
> > 
> > /* Create a human readable name for replacement variable of ACCESS.  */
> > 
> > static char *
> > make_fancy_name (tree expr)
> > {
> >   make_fancy_name_1 (expr);
> >   obstack_1grow (&name_obstack, '\0');
> >   return XOBFINISH (&name_obstack, char *);
> > }
> 
> As all new scalars are DECL_ARTIFICIAL anyway why bother to create
> a fancy name? ....

Well, it helps when reading dumps in order to look into what SRA did.
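
E.g. the replacement for an access like agg.inner.data[2] is named
agg$inner$data$2 in the dumps, which is much easier to map back to
the source than a bare SR.42 would be.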
 
> > /* Helper function for build_ref_for_offset.  */
> > 
> > static bool
> > build_ref_for_offset_1 (tree *res, tree type, HOST_WIDE_INT offset,
> > 			tree exp_type)
> > {
> >   while (1)
> >     {
> >       tree fld;
> >       tree tr_size, index;
> >       HOST_WIDE_INT el_size;
> > 
> >       if (offset == 0 && exp_type
> > 	  && useless_type_conversion_p (exp_type, type))
> > 	return true;
> > 
> >       switch (TREE_CODE (type))
> > 	{
> > 	case UNION_TYPE:
> > 	case QUAL_UNION_TYPE:
> > 	case RECORD_TYPE:
> > 	  /* Some Ada records are half-unions, treat all of them the same.  */
> > 	  for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
> > 	    {
> > 	      HOST_WIDE_INT pos, size;
> > 	      tree expr, *expr_ptr;
> > 
> > 	      if (TREE_CODE (fld) != FIELD_DECL)
> > 		continue;
> > 
> > 	      pos = int_bit_position (fld);
> > 	      gcc_assert (TREE_CODE (type) == RECORD_TYPE || pos == 0);
> > 	      size = tree_low_cst (DECL_SIZE (fld), 1);
> > 	      if (pos > offset || (pos + size) <= offset)
> > 		continue;
> > 
> > 	      if (res)
> > 		{
> > 		  expr = build3 (COMPONENT_REF, TREE_TYPE (fld), *res, fld,
> > 				 NULL_TREE);
> > 		  expr_ptr = &expr;
> > 		}
> > 	      else
> > 		expr_ptr = NULL;
> > 	      if (build_ref_for_offset_1 (expr_ptr, TREE_TYPE (fld),
> > 					  offset - pos, exp_type))
> > 		{
> > 		  if (res)
> > 		    *res = expr;
> > 		  return true;
> > 		}
> > 	    }
> > 	  return false;
> > 
> > 	case ARRAY_TYPE:
> > 	  tr_size = TYPE_SIZE (TREE_TYPE (type));
> > 	  if (!tr_size || !host_integerp (tr_size, 1))
> > 	    return false;
> > 	  el_size = tree_low_cst (tr_size, 1);
> > 
> > 	  index = build_int_cst (TYPE_DOMAIN (type), offset / el_size);
> > 	  if (!integer_zerop (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
> > 	    index = int_const_binop (PLUS_EXPR, index,
> > 				     TYPE_MIN_VALUE (TYPE_DOMAIN (type)), 0);
> > 	  if (res)
> > 	    *res = build4 (ARRAY_REF, TREE_TYPE (type), *res, index, NULL_TREE,
> > 			   NULL_TREE);
> > 	  offset = offset % el_size;
> > 	  type = TREE_TYPE (type);
> > 	  break;
> > 
> > 	default:
> > 	  if (offset != 0)
> > 	    return false;
> > 
> > 	  if (exp_type)
> > 	    return false;
> > 	  else
> > 	    return true;
> > 	}
> >     }
> > }
> > 
> > /* Construct an expression that would reference a part of aggregate *EXPR of
> >    type TYPE at the given OFFSET of the type EXP_TYPE.  If EXPR is NULL, the
> >    function only determines whether it can build such a reference without
> >    actually doing it.
> > 
> >    FIXME: Eventually this should be replaced with
> >    maybe_fold_offset_to_reference() from tree-ssa-ccp.c but that requires a
> >    minor rewrite of fold_stmt.
> >  */
> > 
> > static bool
> > build_ref_for_offset (tree *expr, tree type, HOST_WIDE_INT offset,
> > 		      tree exp_type, bool allow_ptr)
> > {
> >   if (allow_ptr && POINTER_TYPE_P (type))
> >     {
> >       type = TREE_TYPE (type);
> >       if (expr)
> > 	*expr = fold_build1 (INDIRECT_REF, type, *expr);
> >     }
> > 
> >   return build_ref_for_offset_1 (expr, type, offset, exp_type);
> > }
> > 
> > /* The very first phase of intraprocedural SRA.  It marks in candidate_bitmap
> >    those with type which is suitable for scalarization.  */
> > 
> > static bool
> > find_var_candidates (void)
> > {
> >   tree var, type;
> >   referenced_var_iterator rvi;
> >   bool ret = false;
> > 
> >   FOR_EACH_REFERENCED_VAR (var, rvi)
> >     {
> >       if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
> >         continue;
> >       type = TREE_TYPE (var);
> > 
> >       if (!AGGREGATE_TYPE_P (type)
> > 	  || needs_to_live_in_memory (var)
> 
> Ok, here's the TREE_ADDRESSABLE check.  I'm finally convinced that
> disqualify_all should go ;)

OK
 
> > 	  || TREE_THIS_VOLATILE (var)
> > 	  || !COMPLETE_TYPE_P (type)
> > 	  || !host_integerp (TYPE_SIZE (type), 1)
> >           || tree_low_cst (TYPE_SIZE (type), 1) == 0
> > 	  || type_internals_preclude_sra_p (type))
> > 	continue;
> > 
> >       bitmap_set_bit (candidate_bitmap, DECL_UID (var));
> > 
> >       if (dump_file)
> 
> && (dump_flags & TDF_DETAILS)

OK
 
> > 	{
> > 	  fprintf (dump_file, "Candidate (%d): ", DECL_UID (var));
> > 	  print_generic_expr (dump_file, var, 0);
> > 	  fprintf (dump_file, "\n");
> > 	}
> >       ret = true;
> >     }
> > 
> >   return ret;
> > }
> > 
> > /* Return true if TYPE should be considered a scalar type by SRA.  */
> > 
> > static bool
> > is_sra_scalar_type (tree type)
> > {
> >   enum tree_code code = TREE_CODE (type);
> >   return (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type)
> > 	  || FIXED_POINT_TYPE_P (type) || POINTER_TYPE_P (type)
> > 	  || code == VECTOR_TYPE || code == COMPLEX_TYPE
> > 	  || code == OFFSET_TYPE);
> > }
> 
> Why is this anything different from is_gimple_reg_type ()?

The old SRA had a function like this and I kept it.  But it makes no
sense, so I've removed it and replaced calls to it with
is_gimple_reg_type().
 
> > /* Sort all accesses for the given variable, check for partial overlaps and
> >    return NULL if there are any.  If there are none, pick a representative for
> >    each combination of offset and size and create a linked list out of them.
> >    Return the pointer to the first representative and make sure it is the first
> >    one in the vector of accesses.  */
> > 
> > static struct access *
> > sort_and_splice_var_accesses (tree var)
> > {
> >   int i, j, access_count;
> >   struct access *res, **prev_acc_ptr = &res;
> >   VEC (access_p, heap) *access_vec;
> >   bool first = true;
> >   HOST_WIDE_INT low = -1, high = 0;
> > 
> >   access_vec = get_base_access_vector (var);
> >   if (!access_vec)
> >     return NULL;
> >   access_count = VEC_length (access_p, access_vec);
> > 
> >   /* Sort by <OFFSET, SIZE>.  */
> >   qsort (VEC_address (access_p, access_vec), access_count, sizeof (access_p),
> > 	 compare_access_positions);
> > 
> >   i = 0;
> >   while (i < access_count)
> >     {
> >       struct access *access = VEC_index (access_p, access_vec, i);
> >       bool modification = access->write;
> >       bool grp_read = !access->write;
> >       bool grp_bfr_lhs = access->grp_bfr_lhs;
> >       bool first_scalar = is_sra_scalar_type (access->type);
> >       bool unscalarizable_region = access->grp_unscalarizable_region;
> > 
> >       if (first || access->offset >= high)
> > 	{
> > 	  first = false;
> > 	  low = access->offset;
> > 	  high = access->offset + access->size;
> > 	}
> >       else if (access->offset > low && access->offset + access->size > high)
> > 	return NULL;
> >       else
> > 	gcc_assert (access->offset >= low
> > 		    && access->offset + access->size <= high);
> > 
> >       j = i + 1;
> >       while (j < access_count)
> > 	{
> > 	  struct access *ac2 = VEC_index (access_p, access_vec, j);
> > 	  if (ac2->offset != access->offset || ac2->size != access->size)
> > 	    break;
> > 	  modification |= ac2->write;
> > 	  grp_read |= !ac2->write;
> > 	  grp_bfr_lhs |= ac2->grp_bfr_lhs;
> > 	  unscalarizable_region |= ac2->grp_unscalarizable_region;
> > 	  relink_to_new_repr (access, ac2);
> > 
> > 	  /* If one of the equivalent accesses is scalar, use it as a
> > 	     representative (this happens when there is for example only a
> > 	     single scalar field in a structure).  */
> > 	  if (!first_scalar && is_sra_scalar_type (ac2->type))
> > 	    {
> > 	      struct access tmp_acc;
> > 	      first_scalar = true;
> > 
> > 	      memcpy (&tmp_acc, ac2, sizeof (struct access));
> > 	      memcpy (ac2, access,  sizeof (struct access));
> > 	      memcpy (access, &tmp_acc, sizeof (struct access));
> > 	    }
> > 	  ac2->group_representative = access;
> > 	  j++;
> > 	}
> > 
> >       i = j;
> > 
> >       access->group_representative = access;
> >       access->grp_write = modification;
> >       access->grp_read = grp_read;
> >       access->grp_bfr_lhs = grp_bfr_lhs;
> >       access->grp_unscalarizable_region = unscalarizable_region;
> >       if (access->first_link)
> > 	add_access_to_work_queue (access);
> > 
> >       *prev_acc_ptr = access;
> >       prev_acc_ptr = &access->next_grp;
> >     }
> > 
> >   gcc_assert (res == VEC_index (access_p, access_vec, 0));
> >   return res;
> > }
> > 
> > /* Create a variable for the given ACCESS which determines the type, name and a
> >    few other properties.  Return the variable declaration and store it also to
> >    ACCESS->replacement.  */
> > 
> > static tree
> > create_access_replacement (struct access *access)
> > {
> >   tree repl;
> > 
> >   repl = make_rename_temp (access->type, "SR");
> >   get_var_ann (repl);
> >   add_referenced_var (repl);
> > 
> >   DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
> >   DECL_ARTIFICIAL (repl) = 1;
> > 
> >   if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
> 
> at least && !DECL_ARTIFICIAL (access->base) I think.

This part is also largely copied from the old SRA.  So far it seems to
work nicely, replacements of artificial declarations get plain
SR.<number> names instead of fancy ones and that makes them easy to
distinguish.  Nevertheless, I
can change the condition if it is somehow wrong.  Or do you expect any
other problems beside not-so-fancy fancy names?
 
> >     {
> >       char *pretty_name = make_fancy_name (access->expr);
> > 
> >       DECL_NAME (repl) = get_identifier (pretty_name);
> >       obstack_free (&name_obstack, pretty_name);
> > 
> >       SET_DECL_DEBUG_EXPR (repl, access->expr);
> >       DECL_DEBUG_EXPR_IS_FROM (repl) = 1;
> >       DECL_IGNORED_P (repl) = 0;
> >       TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);
> >     }
> >   else
> >     {
> >       DECL_IGNORED_P (repl) = 1;
> >       TREE_NO_WARNING (repl) = 1;
> >     }
> 
> So just copy DECL_IGNORED_P and TREE_NO_WARNING from access->base
> unconditionally?

Yeah, that makes sense.
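
I.e. simply (keeping the fancy-name block itself conditional):

  DECL_IGNORED_P (repl) = DECL_IGNORED_P (access->base);
  TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);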
 
> >   if (access->grp_bfr_lhs)
> >     DECL_GIMPLE_REG_P (repl) = 0;
> 
> But you never set it (see update_address_taken for more cases,
> most notably VIEW_CONVERT_EXPR on the lhs which need to be taken
> care of).  You should set it for COMPLEX_TYPE and VECTOR_TYPE 
> replacements.

This function  is the  only place where  I still  use make_rename_temp
which sets it  exactly in these two cases.  I did  not really know why
it is  required in these two  cases and only  in these two cases  so I
left it there, at least for  now.  I guess I understand that now after
seeing update_address_taken.

I can  replace this  with calling create_tmp_var()  and doing  all the
rest  that make_rename_temp does  - I  believe that  you intend  to
remove it - I have just not found out why it is so bad.
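
For reference, this is roughly all that make_rename_temp does (from
tree-dfa.c, abbreviated from memory):

  tree t = create_tmp_var (type, prefix);

  if (TREE_CODE (type) == COMPLEX_TYPE
      || TREE_CODE (type) == VECTOR_TYPE)
    DECL_GIMPLE_REG_P (t) = 1;

  add_referenced_var (t);
  mark_sym_for_renaming (t);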
 
> >   if (dump_file)
> >     {
> >       fprintf (dump_file, "Created a replacement for ");
> >       print_generic_expr (dump_file, access->base, 0);
> >       fprintf (dump_file, " offset: %u, size: %u: ",
> > 	       (unsigned) access->offset, (unsigned) access->size);
> >       print_generic_expr (dump_file, repl, 0);
> >       fprintf (dump_file, "\n");
> >     }
> > 
> >   return repl;
> > }
> > 
> > /* Return ACCESS scalar replacement, create it if it does not exist yet.  */
> > 
> > static inline tree
> > get_access_replacement (struct access *access)
> > {
> >   gcc_assert (access->grp_to_be_replaced);
> > 
> >   if (access->replacement_decl)
> >     return access->replacement_decl;
> > 
> >   access->replacement_decl = create_access_replacement (access);
> >   return access->replacement_decl;
> > }
> > 
> > /* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
> >    linked list along the way.  Stop when *ACCESS is NULL or the access pointed
> >    to by it is not "within" the root.  */
> > 
> > static void
> > build_access_subtree (struct access **access)
> > {
> >   struct access *root = *access, *last_child = NULL;
> >   HOST_WIDE_INT limit = root->offset + root->size;
> > 
> >   *access = (*access)->next_grp;
> >   while  (*access && (*access)->offset + (*access)->size <= limit)
> >     {
> >       if (!last_child)
> > 	root->first_child = *access;
> >       else
> > 	last_child->next_sibling = *access;
> >       last_child = *access;
> > 
> >       build_access_subtree (access);
> >     }
> > }
> > 
> > /* Build a tree of access representatives, ACCESS is the pointer to the first
> >    one, others are linked in a list by the next_grp field.  Decisions about
> >    scalar replacements are made later, in analyze_access_subtree.  */
> > 
> > static void
> > build_access_trees (struct access *access)
> > {
> >   while (access)
> >     {
> >       struct access *root = access;
> > 
> >       build_access_subtree (&access);
> >       root->next_grp = access;
> >     }
> > }
> > 
> > /* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
> >    doing so seems beneficial and when ALLOW_REPLACEMENTS allows it.  Also set
> >    all sorts of access flags appropriately along the way, notably always set
> >    grp_read when MARK_READ is true and grp_write when MARK_WRITE is true.  */
> > 
> > static bool
> > analyze_access_subtree (struct access *root, bool allow_replacements,
> > 			bool mark_read, bool mark_write)
> > {
> >   struct access *child;
> >   HOST_WIDE_INT limit = root->offset + root->size;
> >   HOST_WIDE_INT covered_to = root->offset;
> >   bool scalar = is_sra_scalar_type (root->type);
> >   bool hole = false, sth_created = false;
> > 
> >   if (mark_read)
> >     root->grp_read = true;
> >   else if (root->grp_read)
> >     mark_read = true;
> > 
> >   if (mark_write)
> >     root->grp_write = true;
> >   else if (root->grp_write)
> >     mark_write = true;
> > 
> >   if (root->grp_unscalarizable_region)
> >     allow_replacements = false;
> > 
> >   for (child = root->first_child; child; child = child->next_sibling)
> >     {
> >       if (!hole && child->offset < covered_to)
> > 	hole = true;
> >       else
> > 	covered_to += child->size;
> > 
> >       sth_created |= analyze_access_subtree (child,
> > 					     allow_replacements && !scalar,
> > 					     mark_read, mark_write);
> > 
> >       root->grp_unscalarized_data |= child->grp_unscalarized_data;
> >       hole |= !child->grp_covered;
> >     }
> > 
> >   if (allow_replacements && scalar && !root->first_child)
> >     {
> >       if (dump_file)
> 
> && (dump_flags & TDF_DETAILS)

OK
 
> > 	{
> > 	  fprintf (dump_file, "Marking ");
> > 	  print_generic_expr (dump_file, root->base, 0);
> > 	  fprintf (dump_file, " offset: %u, size: %u: ",
> > 		   (unsigned) root->offset, (unsigned) root->size);
> > 	  fprintf (dump_file, " to be replaced.\n");
> > 	}
> > 
> >       root->grp_to_be_replaced = 1;
> >       sth_created = true;
> >       hole = false;
> >     }
> >   else if (covered_to < limit)
> >     hole = true;
> > 
> >   if (sth_created && !hole)
> >     {
> >       root->grp_covered = 1;
> >       return true;
> >     }
> >   if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
> >     root->grp_unscalarized_data = 1; /* not covered and written to */
> >   if (sth_created)
> >     return true;
> >   return false;
> > }
> > 
> > /* Analyze all access trees linked by next_grp by the means of
> >    analyze_access_subtree.  */
> > static bool
> > analyze_access_trees (struct access *access)
> > {
> >   bool ret = false;
> > 
> >   while (access)
> >     {
> >       if (analyze_access_subtree (access, true, false, false))
> > 	ret = true;
> >       access = access->next_grp;
> >     }
> > 
> >   return ret;
> > }
> > 
> > /* Return true iff a potential new child of LACC at offset NORM_OFFSET and
> >    with size SIZE would conflict with an already existing one.  If exactly
> >    such a child already exists in LACC, store a pointer to it in EXACT_MATCH.  */
> > 
> > static bool
> > child_would_conflict_in_lacc (struct access *lacc, HOST_WIDE_INT norm_offset,
> > 			      HOST_WIDE_INT size, struct access **exact_match)
> > {
> >   struct access *child;
> > 
> >   for (child = lacc->first_child; child; child = child->next_sibling)
> >     {
> >       if (child->offset == norm_offset && child->size == size)
> > 	{
> > 	  *exact_match = child;
> > 	  return true;
> > 	}
> > 
> >       if (child->offset < norm_offset + size
> > 	  && child->offset + child->size > norm_offset)
> > 	return true;
> >     }
> > 
> >   return false;
> > }
> > 
> > /* Set the expr of TARGET to one just like MODEL but with its own base at the
> >    bottom of the handled components.  */
> > 
> > static void
> > duplicate_expr_for_different_base (struct access *target,
> > 				   struct access *model)
> > {
> >   tree t, expr = unshare_expr (model->expr);
> > 
> >   gcc_assert (handled_component_p (expr));
> >   t = expr;
> >   while (handled_component_p (TREE_OPERAND (t, 0)))
> >     t = TREE_OPERAND (t, 0);
> >   gcc_assert (TREE_OPERAND (t, 0) == model->base);
> >   TREE_OPERAND (t, 0) = target->base;
> > 
> >   target->expr = expr;
> > }
> > 
> > 
> > /* Create a new child access of PARENT, with all properties just like MODEL
> >    except for its offset and with its grp_write false and grp_read true.
> >    Return the new access. Note that this access is created long after all
> >    splicing and sorting, it's not located in any access vector and is
> >    automatically a representative of its group.  */
> > 
> > static struct access *
> > create_artificial_child_access (struct access *parent, struct access *model,
> > 				HOST_WIDE_INT new_offset)
> > {
> >   struct access *access;
> >   struct access **child;
> > 
> >   gcc_assert (!model->grp_unscalarizable_region);
> > 
> >   access = (struct access *) pool_alloc (access_pool);
> >   memset (access, 0, sizeof (struct access));
> >   access->base = parent->base;
> >   access->offset = new_offset;
> >   access->size = model->size;
> >   duplicate_expr_for_different_base (access, model);
> >   access->type = model->type;
> >   access->grp_write = true;
> >   access->grp_read = false;
> > 
> >   child = &parent->first_child;
> >   while (*child && (*child)->offset < new_offset)
> >     child = &(*child)->next_sibling;
> > 
> >   access->next_sibling = *child;
> >   *child = access;
> > 
> >   return access;
> > }
> > 
> > 
> > /* Propagate all subaccesses of RACC across an assignment link to LACC. Return
> >    true if any new subaccess was created.  Additionally, if RACC is a scalar
> >    access but LACC is not, change the type of the latter.  */
> > 
> > static bool
> > propagate_subacesses_accross_link (struct access *lacc, struct access *racc)
> > {
> >   struct access *rchild;
> >   HOST_WIDE_INT norm_delta = lacc->offset - racc->offset;
> >   bool ret = false;
> > 
> >   if (is_sra_scalar_type (lacc->type)
> >       || lacc->grp_unscalarizable_region
> >       || racc->grp_unscalarizable_region)
> >     return false;
> > 
> >   if (!lacc->first_child && !racc->first_child
> >       && is_sra_scalar_type (racc->type)
> >       && (sra_mode == SRA_MODE_INTRA
> >           || !bitmap_bit_p (retvals_bitmap, DECL_UID (lacc->base))))
> >     {
> >       duplicate_expr_for_different_base (lacc, racc);
> >       lacc->type = racc->type;
> >       return false;
> >     }
> > 
> >   gcc_assert (lacc->size <= racc->size);
> > 
> >   for (rchild = racc->first_child; rchild; rchild = rchild->next_sibling)
> >     {
> >       struct access *new_acc = NULL;
> >       HOST_WIDE_INT norm_offset = rchild->offset + norm_delta;
> > 
> >       if (rchild->grp_unscalarizable_region)
> > 	continue;
> > 
> >       if (child_would_conflict_in_lacc (lacc, norm_offset, rchild->size,
> > 					&new_acc))
> > 	{
> > 	  if (new_acc && rchild->first_child)
> > 	    ret |= propagate_subacesses_accross_link (new_acc, rchild);
> > 	  continue;
> > 	}
> > 
> >       new_acc = create_artificial_child_access (lacc, rchild, norm_offset);
> >       if (racc->first_child)
> > 	propagate_subacesses_accross_link (new_acc, rchild);
> > 
> >       ret = true;
> >     }
> > 
> >   return ret;
> > }
> > 
> > /* Propagate all subaccesses across assignment links.  */
> > 
> > static void
> > propagate_all_subaccesses (void)
> > {
> >   while (work_queue_head)
> >     {
> >       struct access *racc = pop_access_from_work_queue ();
> >       struct assign_link *link;
> > 
> >       gcc_assert (racc->first_link);
> > 
> >       for (link = racc->first_link; link; link = link->next)
> > 	{
> > 	  struct access *lacc = link->lacc;
> > 
> > 	  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
> > 	    continue;
> > 	  lacc = lacc->group_representative;
> > 	  if (propagate_subacesses_accross_link (lacc, racc)
> > 	      && lacc->first_link)
> > 	    add_access_to_work_queue (lacc);
> > 	}
> >     }
> > }
> > 
> > /* Go through all accesses collected throughout the (intraprocedural) analysis
> >    stage, exclude overlapping ones, identify representatives and build trees
> >    out of them, making decisions about scalarization on the way.  Return true
> >    iff there are any to-be-scalarized variables after this stage. */
> > 
> > static bool
> > analyze_all_variable_accesses (void)
> > {
> >   tree var;
> >   referenced_var_iterator rvi;
> >   bool res = false;
> > 
> >   FOR_EACH_REFERENCED_VAR (var, rvi)
> >     if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
> >       {
> > 	struct access *access;
> > 
> > 	access = sort_and_splice_var_accesses (var);
> > 	if (access)
> > 	  build_access_trees (access);
> > 	else
> > 	  disqualify_candidate (var,
> > 				"No or inhibitingly overlapping accesses.");
> >       }
> > 
> >   propagate_all_subaccesses ();
> > 
> >   FOR_EACH_REFERENCED_VAR (var, rvi)
> >     if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
> >       {
> > 	struct access *access = get_first_repr_for_decl (var);
> > 
> > 	if (analyze_access_trees (access))
> > 	  {
> > 	    res = true;
> > 	    if (dump_file)
> 
> && (dump_flags & TDF_DETAILS)

OK
 
> > 	      {
> > 		fprintf (dump_file, "\nAccess trees for ");
> > 		print_generic_expr (dump_file, var, 0);
> > 		fprintf (dump_file, " (UID: %u): \n", DECL_UID (var));
> > 		dump_access_tree (dump_file, access);
> > 		fprintf (dump_file, "\n");
> > 	      }
> > 	  }
> > 	else
> > 	  disqualify_candidate (var, "No scalar replacements to be created.");
> >       }
> > 
> >   return res;
> > }
> > 
> > /* Return true iff a reference statement into aggregate AGG can be built for
> >    every single to-be-replaced access that is a child of ACCESS, its sibling
> >    or a child of its sibling.  TOP_OFFSET is the offset of the processed
> >    access subtree which has to be subtracted from the offset of each access.  */
> > 
> > static bool
> > ref_expr_for_all_replacements_p (struct access *access, tree agg,
> > 				 HOST_WIDE_INT top_offset)
> > {
> >   do
> >     {
> >       if (access->grp_to_be_replaced
> > 	  && !build_ref_for_offset (NULL, TREE_TYPE (agg),
> > 				    access->offset - top_offset,
> > 				    access->type, false))
> > 	return false;
> > 
> >       if (access->first_child
> > 	  && !ref_expr_for_all_replacements_p (access->first_child, agg,
> > 					       top_offset))
> > 	return false;
> > 
> >       access = access->next_sibling;
> >     }
> >   while (access);
> > 
> >   return true;
> > }
> > 
> > 
> > /* Generate statements copying scalar replacements of accesses within a subtree
> >    into or out of AGG.  ACCESS is the first child of the root of the subtree to
> >    be processed.  AGG is an aggregate type expression (can be a declaration but
> >    does not have to be, it can for example also be an indirect_ref).
> >    TOP_OFFSET is the offset of the processed subtree which has to be subtracted
> >    from offsets of individual accesses to get corresponding offsets for AGG.
> >    If CHUNK_SIZE is non-zero, copy only replacements in the interval
> >    <start_offset, start_offset + chunk_size>, otherwise copy all.  GSI is a
> >    statement iterator used to place the new statements.  WRITE should be true
> >    when the statements should write from AGG to the replacement and false if
> >    vice versa.  If INSERT_AFTER is true, new statements will be added after the
> >    current statement in GSI, they will be added before the statement
> >    otherwise.  */
> > 
> > static void
> > generate_subtree_copies (struct access *access, tree agg,
> > 			 HOST_WIDE_INT top_offset,
> > 			 HOST_WIDE_INT start_offset, HOST_WIDE_INT chunk_size,
> > 			 gimple_stmt_iterator *gsi, bool write,
> > 			 bool insert_after)
> > {
> >   do
> >     {
> >       tree expr = unshare_expr (agg);
> > 
> >       if (chunk_size && access->offset >= start_offset + chunk_size)
> > 	return;
> > 
> >       if (access->grp_to_be_replaced
> > 	  && (chunk_size == 0
> > 	      || access->offset + access->size > start_offset))
> > 	{
> > 	  bool repl_found;
> > 	  gimple stmt;
> > 
> > 	  repl_found = build_ref_for_offset (&expr, TREE_TYPE (agg),
> > 					     access->offset - top_offset,
> > 					     access->type, false);
> > 	  gcc_assert (repl_found);
> > 
> > 	  if (write)
> > 	    stmt = gimple_build_assign (get_access_replacement (access), expr);
> > 	  else
> > 	    {
> > 	      tree repl = get_access_replacement (access);
> > 	      TREE_NO_WARNING (repl) = 1;
> > 	      stmt = gimple_build_assign (expr, repl);
> > 	    }
> > 
> > 	  if (insert_after)
> > 	    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > 	  else
> > 	    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > 	  update_stmt (stmt);
> > 	}
> > 
> >       if (access->first_child)
> > 	generate_subtree_copies (access->first_child, agg, top_offset,
> > 				 start_offset, chunk_size, gsi,
> > 				 write, insert_after);
> > 
> >       access = access->next_sibling;
> >     }
> >   while (access);
> > }
> > 
> > /* Assign zero to all scalar replacements in an access subtree.  ACCESS is the
> >    root of the subtree to be processed.  GSI is the statement iterator used
> >    for inserting statements which are added after the current statement if
> >    INSERT_AFTER is true or before it otherwise.  */
> > 
> > static void
> > init_subtree_with_zero (struct access *access, gimple_stmt_iterator *gsi,
> > 			bool insert_after)
> > 
> > {
> >   struct access *child;
> > 
> >   if (access->grp_to_be_replaced)
> >     {
> >       gimple stmt;
> > 
> >       stmt = gimple_build_assign (get_access_replacement (access),
> > 				  fold_convert (access->type,
> > 						integer_zero_node));
> >       if (insert_after)
> > 	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> >       else
> > 	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> >       update_stmt (stmt);
> >     }
> > 
> >   for (child = access->first_child; child; child = child->next_sibling)
> >     init_subtree_with_zero (child, gsi, insert_after);
> > }
> > 
> > /* Search for an access representative for the given expression EXPR and
> >    return it or NULL if it cannot be found.  */
> > 
> > static struct access *
> > get_access_for_expr (tree expr)
> > {
> >   HOST_WIDE_INT offset, size, max_size;
> >   tree base;
> > 
> >   if (TREE_CODE (expr) == NOP_EXPR
> 
> CONVERT_EXPR_P (expr)
> 
> >       || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
> 
> VIEW_CONVERT_EXPR is also a handled_component_p.
> 
> Note that NOP_EXPR should never occur here - that would be invalid
> gimple.  So I think you can (and should) just delete the above.

I haven't seen a NOP_EXPR for a while; do they still exist in lowered
gimple?  Thus I have removed their handling.

Removing diving through V_C_E breaks Ada, though.  The reason is that
we get a different size (and max_size) when calling
get_ref_base_and_extent on the V_C_E and on its argument.  However, I
believe both should be represented by a single access representative.
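
To illustrate with a made-up example (not an actual Ada reduction):
given something like

    struct big { int i[3]; } s;                    /* 96 bits */
    ... = VIEW_CONVERT_EXPR<struct small>(s);      /* 64 bits */

get_ref_base_and_extent returns size 64 when called on the V_C_E but
96 when called on s itself, even though both refer to the same region
and should share one representative.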
 
> >     expr = TREE_OPERAND (expr, 0);
> > 
> >   if (handled_component_p (expr))
> >     {
> >       base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
> >       size = max_size;
> >       if (size == -1 || !base || !DECL_P (base))
> > 	return NULL;
> >     }
> >   else if (DECL_P (expr))
> >     {
> >       tree tree_size;
> > 
> >       base = expr;
> >       tree_size = TYPE_SIZE (TREE_TYPE (base));
> >       if (tree_size && host_integerp (tree_size, 1))
> > 	size = max_size = tree_low_cst (tree_size, 1);
> >       else
> > 	return NULL;
> > 
> >       offset = 0;
> 
> See above.  get_ref_base_and_extent handles plain DECLs just fine.

OK
 
> >     }
> >   else
> >     return NULL;
> > 
> >   if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
> >     return NULL;
> > 
> >   return get_var_base_offset_size_access (base, offset, size);
> > }
> > 
> > /* Substitute into *EXPR an expression of type TYPE with the value of the
> >    replacement of ACCESS.  This is done either by producing a special V_C_E
> >    assignment statement converting the replacement to a new temporary of the
> >    requested type if TYPE is not TREE_ADDRESSABLE or by going through the base
> >    aggregate if it is.  */
> > 
> > static void
> > fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
> > 				 gimple_stmt_iterator *gsi, bool write)
> > {
> >   tree repl = get_access_replacement (access);
> >   if (!TREE_ADDRESSABLE (type))
> >     {
> >       tree tmp = create_tmp_var (type, "SRvce");
> > 
> >       add_referenced_var (tmp);
> >       if (is_gimple_reg_type (type))
> > 	tmp = make_ssa_name (tmp, NULL);
> 
> Should be always is_gimple_reg_type () if it is a type suitable for
> a SRA scalar replacement. 

No, it is the type suitable for  the statement, it can be a union type
or a record with only one field. But see the more thorough explanation
below...
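
A made-up example of what I mean: for

    struct S { float f; };   /* record with a single scalar field */
    struct S s;
    ...
    bar (s);

the scalar replacement of s has type float but the expression in the
call statement has type struct S, so TYPE is an aggregate type here
and not a register type.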

> But you should set DECL_GIMPLE_REG_P for
> VECTOR and COMPLEX types here.
> 
> >       if (write)
> > 	{
> > 	  gimple stmt;
> > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);
> 
> This needs to either always fold to plain 'tmp' or tmp has to be a
> non-register.  Otherwise you will create invalid gimple.
> 
> > 	  *expr = tmp;
> > 	  if (is_gimple_reg_type (type))
> > 	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
> 
> See above.
> 
> > 	  stmt = gimple_build_assign (repl, conv);
> > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > 	  update_stmt (stmt);
> > 	}
> >       else
> > 	{
> > 	  gimple stmt;
> > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
> > 
> > 	  stmt = gimple_build_assign (tmp, conv);
> > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > 	  if (is_gimple_reg_type (type))
> > 	    SSA_NAME_DEF_STMT (tmp) = stmt;
> 
> See above.  (I wonder if the patch still passes bootstrap & regtest
> after the typecking patch)
> 
> > 	  *expr = tmp;
> > 	  update_stmt (stmt);
> > 	}
> >     }
> >   else
> >     {
> >       if (write)
> > 	{
> > 	  gimple stmt;
> > 
> > 	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
> > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > 	  update_stmt (stmt);
> > 	}
> >       else
> > 	{
> > 	  gimple stmt;
> > 
> > 	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
> > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > 	  update_stmt (stmt);
> > 	}
> 
> I don't understand this path.  Are the types here always compatible?

And I don't really understand the comments.  The function is called by
sra_modify_expr (the function doing the replacements in all non-assign
statements) when it  needs to replace a reference by  a scalar but the
types don't  match.  This can happen  when replacing a  V_C_E, a union
access when we picked a different type than the one used in the
statement or (and this case can be remarkably irritating) an access to
a record with only one (scalar) field.

My original idea  was to simply put a V_C_E in  the place.  However, I
believe there are places where this  is not possible - or at least one
case, a  LHS of  a call statement  because V_C_Es  of gimple_registers
(ssa_names) are not allowed on  LHSs.  My initial idea to handle these
cases was to create a new temporary with a matching type and a V_C_E
assign statement  (with the V_C_E always  on the RHS -  I believe that
works even  with gimple  registers) that would  do the  conversion and
load/store  it   to  the  replacement  variable  (this   is  what  the
!TREE_ADDRESSABLE branch does).

The problem with this idea is TREE_ADDRESSABLE types.  These types
need to be  constructed and thus we cannot  create temporary variables
of these types.   On the other hand they absolutely  need to be SRAed,
not doing  so slows down tramp3d by  a factor of two  (and the current
SRA also breaks them up).  And  quite a few C++ classes are such types
that   are  "non-addressable"   and  have   only  one   scalar  field.
Identifying such records is possible, but I soon realized that I can
simply leave the statement as it  is and produce a new statement to do
load/store  from the original  field (that's  what the  outermost else
branch does).

Does this make sense or is there some fundamental flaw in my reasoning
about gimple again?  Does this explain what the function does?

It certainly passes bootstrap and testing, I use --enable-checking=yes.
 
> >     }
> > }
> > 
> > 
> > /* Callback for scan_function.  Replace the expression EXPR with a scalar
> >    replacement if there is one and generate other statements to do type
> >    conversion or subtree copying if necessary.  GSI is used to place newly
> >    created statements, WRITE is true if the expression is being written to (it
> >    is on a LHS of a statement or output in an assembly statement).  */
> > 
> > static bool
> > sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write,
> > 		 void *data ATTRIBUTE_UNUSED)
> > {
> >   struct access *access;
> >   tree type, bfr;
> > 
> >   if (TREE_CODE (*expr) == BIT_FIELD_REF)
> >     {
> >       bfr = *expr;
> >       expr = &TREE_OPERAND (*expr, 0);
> >     }
> >   else
> >     bfr = NULL_TREE;
> > 
> >   if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
> >     expr = &TREE_OPERAND (*expr, 0);
> 
> Why strip these early?  I think this is wrong (or do you always want to
> produce complex type replacements, even if only the real or imaginary part
> is used?  If so a strategic comment somewhere is missing.)

As I have already written above,  my intention was to keep the complex
replacements complex.   Again, not doing so  would definitely simplify
the code, so if you think there is no benefit in doing so I may get rid
of all their  special handling.  I also treat  vector accesses through
BIT_FIELD_REFs specially...

ATM, I'll add a comment to the access structure description.

> >   type = TREE_TYPE (*expr);
> > 
> >   access = get_access_for_expr (*expr);
> >   if (!access)
> >     return false;
> > 
> >   if (access->grp_to_be_replaced)
> >     {
> >       if (!useless_type_conversion_p (type, access->type))
> > 	fix_incompatible_types_for_expr (expr, type, access, gsi, write);
> >       else
> > 	*expr = get_access_replacement (access);
> >     }
> > 
> >   if (access->first_child)
> >     {
> >       HOST_WIDE_INT start_offset, chunk_size;
> >       if (bfr
> > 	  && host_integerp (TREE_OPERAND (bfr, 1), 1)
> > 	  && host_integerp (TREE_OPERAND (bfr, 2), 1))
> > 	{
> > 	  start_offset = tree_low_cst (TREE_OPERAND (bfr, 1), 1);
> > 	  chunk_size = tree_low_cst (TREE_OPERAND (bfr, 2), 1);
> > 	}
> >       else
> > 	start_offset = chunk_size = 0;
> > 
> >       generate_subtree_copies (access->first_child, access->base, 0,
> > 			       start_offset, chunk_size, gsi, write, write);
> >     }
> >   return true;
> > }
> > 
> > /* Store all replacements in the access tree rooted in TOP_RACC either to their
> >    base aggregate if there are unscalarized data or directly to LHS
> >    otherwise.  */
> > 
> > static void
> > handle_unscalarized_data_in_subtree (struct access *top_racc, tree lhs,
> > 				     gimple_stmt_iterator *gsi)
> > {
> >   if (top_racc->grp_unscalarized_data)
> >     generate_subtree_copies (top_racc->first_child, top_racc->base, 0, 0, 0,
> > 			     gsi, false, false);
> >   else
> >     generate_subtree_copies (top_racc->first_child, lhs, top_racc->offset,
> > 			     0, 0, gsi, false, false);
> > }
> > 
> > 
> > /* Try to generate statements to load all sub-replacements in an access
> >    (sub)tree (LACC is the first child) from scalar replacements in the TOP_RACC
> >    (sub)tree.  If that is not possible, refresh the TOP_RACC base aggregate and
> >    load the accesses from it.  LEFT_OFFSET is the offset of the left whole
> >    subtree being copied, RIGHT_OFFSET is the same thing for the right subtree.
> >    GSI is stmt iterator used for statement insertions.  *REFRESHED is true iff
> >    the rhs top aggregate has already been refreshed by contents of its scalar
> >    reductions and is set to true if this function has to do it.  */
> > 
> > static void
> > load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
> > 				 HOST_WIDE_INT left_offset,
> > 				 HOST_WIDE_INT right_offset,
> > 				 gimple_stmt_iterator *old_gsi,
> > 				 gimple_stmt_iterator *new_gsi,
> > 				 bool *refreshed, tree lhs)
> > {
> >   do
> >     {
> >       if (lacc->grp_to_be_replaced)
> > 	{
> > 	  struct access *racc;
> > 	  HOST_WIDE_INT offset = lacc->offset - left_offset + right_offset;
> > 
> > 	  racc = find_access_in_subtree (top_racc, offset, lacc->size);
> > 	  if (racc && racc->grp_to_be_replaced)
> > 	    {
> > 	      gimple stmt;
> > 
> > 	      if (useless_type_conversion_p (lacc->type, racc->type))
> > 		stmt = gimple_build_assign (get_access_replacement (lacc),
> > 					    get_access_replacement (racc));
> > 	      else
> > 		{
> > 		  tree rhs = fold_build1 (VIEW_CONVERT_EXPR, lacc->type,
> > 					  get_access_replacement (racc));
> > 		  stmt = gimple_build_assign (get_access_replacement (lacc),
> > 					      rhs);
> > 		}
> > 
> > 	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
> > 	      update_stmt (stmt);
> > 	    }
> > 	  else
> > 	    {
> > 	      tree expr = unshare_expr (top_racc->base);
> > 	      bool repl_found;
> > 	      gimple stmt;
> > 
> > 	      /* No suitable access on the right hand side, need to load from
> > 		 the aggregate.  See if we have to update it first... */
> > 	      if (!*refreshed)
> > 		{
> > 		  gcc_assert (top_racc->first_child);
> > 		  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
> > 		  *refreshed = true;
> > 		}
> > 
> > 	      repl_found = build_ref_for_offset (&expr,
> > 						 TREE_TYPE (top_racc->base),
> > 						 lacc->offset - left_offset,
> > 						 lacc->type, false);
> > 	      gcc_assert (repl_found);
> > 	      stmt = gimple_build_assign (get_access_replacement (lacc),
> > 					  expr);
> > 	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
> > 	      update_stmt (stmt);
> > 	    }
> > 	}
> >       else if (lacc->grp_read && !lacc->grp_covered && !*refreshed)
> > 	{
> > 	  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
> > 	  *refreshed = true;
> > 	}
> > 
> >       if (lacc->first_child)
> > 	load_assign_lhs_subreplacements (lacc->first_child, top_racc,
> > 					 left_offset, right_offset,
> > 					 old_gsi, new_gsi, refreshed, lhs);
> >       lacc = lacc->next_sibling;
> >     }
> >   while (lacc);
> > }
> > 
> > /* Return true iff ACC is non-NULL and has subaccesses.  */
> > 
> > static inline bool
> > access_has_children_p (struct access *acc)
> > {
> >   return acc && acc->first_child;
> > }
> > 
> > /* Modify assignments with a CONSTRUCTOR on their RHS.  STMT contains a pointer
> >    to the assignment and GSI is the statement iterator pointing at it.  Returns
> >    the same values as sra_modify_assign.  */
> > 
> > static enum scan_assign_result
> > sra_modify_constructor_assign (gimple *stmt, gimple_stmt_iterator *gsi)
> > {
> >   tree lhs = gimple_assign_lhs (*stmt);
> >   struct access *acc;
> > 
> >   gcc_assert (TREE_CODE (lhs) != REALPART_EXPR
> > 	      && TREE_CODE (lhs) != IMAGPART_EXPR);
> >   acc = get_access_for_expr (lhs);
> >   if (!acc)
> >     return SRA_SA_NONE;
> > 
> >   if (VEC_length (constructor_elt,
> > 		  CONSTRUCTOR_ELTS (gimple_assign_rhs1 (*stmt))) > 0)
> >     {
> >       /* I have never seen this code path trigger but if it can happen the
> > 	 following should handle it gracefully.  */
> 
> It can trigger for vector constants.

OK, I'll remove the comment.  Apparently there are no such vector
constants in the testsuite; I believe I tested with a gcc_unreachable
here.

 
> >       if (access_has_children_p (acc))
> > 	generate_subtree_copies (acc->first_child, acc->base, 0, 0, 0, gsi,
> > 				 true, true);
> >       return SRA_SA_PROCESSED;
> >     }
> > 
> >   if (acc->grp_covered)
> >     {
> >       init_subtree_with_zero (acc, gsi, false);
> >       unlink_stmt_vdef (*stmt);
> >       gsi_remove (gsi, true);
> >       return SRA_SA_REMOVED;
> >     }
> >   else
> >     {
> >       init_subtree_with_zero (acc, gsi, true);
> >       return SRA_SA_PROCESSED;
> >     }
> > }
> > 
> > 
> > /* Modify statements which have an IMAGPART_EXPR or REALPART_EXPR of a
> >    to-be-scalarized expression on their LHS.  STMT is the statement and GSI is
> >    the iterator used to place new helper statements.  Returns the same values
> >    as sra_modify_assign.  */
> > 
> > static enum scan_assign_result
> > sra_modify_partially_complex_lhs (gimple stmt, gimple_stmt_iterator *gsi)
> > {
> >   tree lhs, complex, ptype, rp, ip;
> >   struct access *access;
> >   gimple new_stmt, aux_stmt;
> > 
> >   lhs = gimple_assign_lhs (stmt);
> >   complex = TREE_OPERAND (lhs, 0);
> > 
> >   access = get_access_for_expr (complex);
> > 
> >   if (!access || !access->grp_to_be_replaced)
> >     return SRA_SA_NONE;
> > 
> >   ptype = TREE_TYPE (TREE_TYPE (complex));
> >   rp = create_tmp_var (ptype, "SRr");
> >   add_referenced_var (rp);
> >   rp = make_ssa_name (rp, NULL);
> > 
> >   ip = create_tmp_var (ptype, "SRp");
> >   add_referenced_var (ip);
> >   ip = make_ssa_name (ip, NULL);
> > 
> >   if (TREE_CODE (lhs) == IMAGPART_EXPR)
> >     {
> >       aux_stmt = gimple_build_assign (rp, fold_build1 (REALPART_EXPR, ptype,
> > 					     get_access_replacement (access)));
> >       SSA_NAME_DEF_STMT (rp) = aux_stmt;
> >       gimple_assign_set_lhs (stmt, ip);
> >       SSA_NAME_DEF_STMT (ip) = stmt;
> >     }
> >   else
> >     {
> >       aux_stmt = gimple_build_assign (ip, fold_build1 (IMAGPART_EXPR, ptype,
> > 					     get_access_replacement (access)));
> >       SSA_NAME_DEF_STMT (ip) = aux_stmt;
> >       gimple_assign_set_lhs (stmt, rp);
> >       SSA_NAME_DEF_STMT (rp) = stmt;
> >     }
> > 
> >   gsi_insert_before (gsi, aux_stmt, GSI_SAME_STMT);
> >   update_stmt (aux_stmt);
> >   new_stmt = gimple_build_assign (get_access_replacement (access),
> > 				  fold_build2 (COMPLEX_EXPR, access->type,
> > 					       rp, ip));
> >   gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
> >   update_stmt (new_stmt);
> 
> Hm.  So you do what complex lowering does here.  Note that this may
> create loads from uninitialized memory with all its problems.

Yes,  but I  have not  had any  such problems  with complex  types (as
opposed to  simple loads from half-initialized  records, for example).
OTOH, I have also contemplated setting DECL_GIMPLE_REG_P to zero for
complex replacements which appear in an IMAGPART_EXPR or REALPART_EXPR
on the LHS of a statement.

> 
> WRT the complex stuff.  If you would do scalarization and analysis
> just on the components (not special case REAL/IMAGPART_EXPR everywhere)
> it should work better, correct?  You still could handle group
> scalarization for the case of for example passing a complex argument
> to a function.

Well, my reasoning was that since complex types are first-class
citizens in gimple (as opposed to being just a record), there probably
is a reason to keep them together and so I attempted that.  But again,
if that is a misconception of mine and there is no point in keeping
them together, I will gladly remove this.
 
> void bar(_Complex float);
> void foo(float x, float y)
> {
>   _Complex float z = x;
>   __imag z = y;
>   bar(z);
> }
> 
> The same applies for vectors - the REAL/IMAGPART_EXPRs equivalent
> there is BIT_FIELD_REF.

These are handled  by setting DECL_GIMPLE_REG_P to zero  if a B_F_R is
on a LHS.  I believe the current SRA does the same.  It works fine and
there's a lot less fuss about them.
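
A small made-up illustration: for a write to one vector element like

    BIT_FIELD_REF <v, 32, 64> = x_1;

clearing DECL_GIMPLE_REG_P on the whole-vector replacement keeps it a
non-register, so such partial stores into it remain valid gimple.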
 
> >   return SRA_SA_PROCESSED;
> > }
> > 
> > /* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */
> > 
> > static bool
> > contains_view_convert_expr_p (tree t)
> > {
> >   while (1)
> >     {
> >       if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
> > 	return true;
> >       if (!handled_component_p (t))
> > 	return false;
> >       t = TREE_OPERAND (t, 0);
> >     }
> > }
> 
> Place this in tree-flow-inline.h next to ref_contains_array_ref, also
> structure the loop in the same way.

OK,  but I'd like  the function  to work  if passed  declarations too.
Thus I cannot really use a  do-while loop.  I'll send it in a separate
patch.

> > /* Change STMT to assign compatible types by means of adding component or array
> >    references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
> >    variables with the same names in sra_modify_assign.  This is done in
> >    such a complicated way in order to make
> >    testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy and so it helps in at least some
> >    cases.  */
> > 
> > static void
> > fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
> > 				   struct access *lacc, struct access *racc,
> > 				   tree lhs, tree *rhs, tree ltype, tree rtype)
> > {
> >   if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
> >       && !access_has_children_p (lacc))
> >     {
> >       tree expr = unshare_expr (lhs);
> >       bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
> > 					 false);
> >       if (found)
> > 	{
> > 	  gimple_assign_set_lhs (*stmt, expr);
> > 	  return;
> > 	}
> >     }
> > 
> >   if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
> >       && !access_has_children_p (racc))
> >     {
> >       tree expr = unshare_expr (*rhs);
> >       bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
> > 					 false);
> >       if (found)
> > 	{
> > 	  gimple_assign_set_rhs1 (*stmt, expr);
> > 	  return;
> > 	}
> >     }
> > 
> >   *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
> >   gimple_assign_set_rhs_from_tree (gsi, *rhs);
> >   *stmt = gsi_stmt (*gsi);
> 
> Reading this I have a deja-vu - isn't there another function in this
> file doing the same thing?  You are doing much unsharing even though
> you re-build the access tree from scratch?

This function has a similar purpose as fix_incompatible_types_for_expr
but this time  only for assign statements.  That  is easier because we
can always put the  V_C_E on the RHS and be safe  and so no additional
statements need to be generated.

However, V_C_Es rather than COMPONENT_REFs and ARRAY_REFs feel
unnatural for accessing fields of single-field records and unions
and single-element arrays.  According to the comment I used to have
problems of some sort with that  in the ssa-sra-2.C testcase but I can
no longer reproduce them (and don't remember them).
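
A made-up example of what the first two branches attempt instead: for

    struct S { int i; } s;
    ...
    s = SR_x;            /* the RHS was replaced by an int scalar */

build_ref_for_offset lets us rewrite the LHS so that the statement
becomes

    s.i = SR_x;

rather than going through a V_C_E of the whole record (SR_x being just
an illustrative name for a replacement).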

I  call  unshare_expr  in  this  context  only when  one  side  of  an
assignment statement is  a scalar replacement and the  other one is an
aggregate (but not necessarily a declaration) which can happen only in
the cases listed  above.  That is not very many  calls and chances are
good that build_ref_for_offset succeeds.

Does that explain what is going on here?
 
> > }
> > 
> > /* Callback of scan_function to process assign statements.  It examines both
> >    sides of the statement, replaces them with a scalar replacement if there is
> >    one and generates copying of replacements if scalarized aggregates have been
> >    used in the assignment.  STMT is a pointer to the assign statement, GSI is
> >    used to hold generated statements for type conversions and subtree
> >    copying.  */
> > 
> > static enum scan_assign_result
> > sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
> > 		   void *data ATTRIBUTE_UNUSED)
> > {
> >   struct access *lacc, *racc;
> >   tree ltype, rtype;
> >   tree lhs, rhs;
> >   bool modify_this_stmt;
> > 
> >   if (gimple_assign_rhs2 (*stmt))
> 
> !gimple_assign_single_p (*stmt)
> 
> (the only gimple assign that may access memory)

OK

> 
> >     return SRA_SA_NONE;
> >   lhs = gimple_assign_lhs (*stmt);
> >   rhs = gimple_assign_rhs1 (*stmt);
> > 
> >   if (TREE_CODE (rhs) == CONSTRUCTOR)
> >     return sra_modify_constructor_assign (stmt, gsi);
> > 
> >   if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
> >     return sra_modify_partially_complex_lhs (*stmt, gsi);
> > 
> >   if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
> >       || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
> >     {
> >       modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
> > 					  gsi, false, data);
> >       modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
> > 					   gsi, true, data);
> >       return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
> >     }
> > 
> >   lacc = get_access_for_expr (lhs);
> >   racc = get_access_for_expr (rhs);
> >   if (!lacc && !racc)
> >     return SRA_SA_NONE;
> > 
> >   modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
> > 		      || (racc && racc->grp_to_be_replaced));
> > 
> >   if (lacc && lacc->grp_to_be_replaced)
> >     {
> >       lhs = get_access_replacement (lacc);
> >       gimple_assign_set_lhs (*stmt, lhs);
> >       ltype = lacc->type;
> >     }
> >   else
> >     ltype = TREE_TYPE (lhs);
> > 
> >   if (racc && racc->grp_to_be_replaced)
> >     {
> >       rhs = get_access_replacement (racc);
> >       gimple_assign_set_rhs1 (*stmt, rhs);
> >       rtype = racc->type;
> >     }
> >   else
> >     rtype = TREE_TYPE (rhs);
> > 
> >   /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate
> >      the statement makes the position of this pop_stmt_changes() a bit awkward
> >      but hopefully make some sense.  */
> 
> I don't see pop_stmt_changes().

Yeah, the comment is outdated. I've removed it.
 
> >   if (modify_this_stmt)
> >     {
> >       if (!useless_type_conversion_p (ltype, rtype))
> > 	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
> > 					   lhs, &rhs, ltype, rtype);
> >     }
> > 
> >   if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
> >       || (access_has_children_p (racc)
> > 	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
> >       || (access_has_children_p (lacc)
> > 	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
> 
> ?  A comment is missing what this case is about ...
> 
> (this smells like fixup that could be avoided by doing things correct
> in the first place)

From this point on, the function deals with assignments between
aggregates when at least one side has scalar reductions of some of its
components.  There are three possible scenarios: 1) both the LHS and
RHS have to-be-scalarized components, 2) only the RHS has them, or
3) only the LHS has them.
In the first  case, we would like to load the  LHS components from RHS
components whenever possible.  If that  is not possible, we would like
to read it  directly from the RHS (after updating it  by storing in it
its own components).  If there are some necessary unscalarized data in
the  LHS, those will  be loaded  by the  original assignment  too.  If
neither of these cases happen,  the original statement can be removed.
Most of this is done by load_assign_lhs_subreplacements.

In  the  second  case, we  would  like  to  store all  RHS  scalarized
components  directly  into  LHS   and  if  they  cover  the  aggregate
completely, remove the statement too.   In the third case, we want the
LHS components to be loaded directly from the RHS (DSE will remove the
original statement if it becomes redundant).
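
A made-up example of the second scenario: given

    struct S { int i; float f; } a, b;
    ...
    a = b;               /* only b has scalar replacements */

and no unscalarized data in b, the copy is replaced by

    a.i = SR_b_i;
    a.f = SR_b_f;

and the original statement is removed (SR_b_* again being only
illustrative names).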

This is a bit complex but  manageable when types match and when unions
do not cause confusion in a way that we cannot really load a component
of LHS from the RHS or  vice versa (the access representing this level
can  have subaccesses  that are  accessible only  through  a different
union field  at a higher  level - different  from the one used  in the
examined expression).  Unions are fun.

Therefore, I specially handle a fourth case, happening when there is a
specific  type  cast  or  it  is impossible  to  locate  a  scalarized
subaccess on  the other  side of the  expression.  If that  happens, I
simply "refresh"  the RHS  by storing in  it is  scalarized components
leave the original statement there to do the copying and then load the
scalar replacements of the LHS.  This is what the first branch does.

Is it  clearer now?  Perhaps I  should put these five  paragraphs as a
comment into the function?
 
> >     {
> >       if (access_has_children_p (racc))
> > 	generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
> > 				 gsi, false, false);
> >       if (access_has_children_p (lacc))
> > 	generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
> > 				 gsi, true, true);
> >     }
> >   else
> >     {
> >       if (access_has_children_p (lacc) && access_has_children_p (racc))
> > 	{
> > 	  gimple_stmt_iterator orig_gsi = *gsi;
> > 	  bool refreshed;
> > 
> > 	  if (lacc->grp_read && !lacc->grp_covered)
> > 	    {
> > 	      handle_unscalarized_data_in_subtree (racc, lhs, gsi);
> > 	      refreshed = true;
> > 	    }
> > 	  else
> > 	    refreshed = false;
> > 
> > 	  load_assign_lhs_subreplacements (lacc->first_child, racc,
> > 					   lacc->offset, racc->offset,
> > 					   &orig_gsi, gsi, &refreshed, lhs);
> > 	  if (!refreshed || !racc->grp_unscalarized_data)
> > 	    {
> > 	      if (*stmt == gsi_stmt (*gsi))
> > 		gsi_next (gsi);
> > 
> > 	      unlink_stmt_vdef (*stmt);
> > 	      gsi_remove (&orig_gsi, true);
> > 	      return SRA_SA_REMOVED;
> > 	    }
> > 	}
> >       else
> > 	{
> > 	  if (access_has_children_p (racc))
> > 	    {
> > 	      if (!racc->grp_unscalarized_data)
> > 		{
> > 		  generate_subtree_copies (racc->first_child, lhs,
> > 					   racc->offset, 0, 0, gsi,
> > 					   false, false);
> > 		  gcc_assert (*stmt == gsi_stmt (*gsi));
> > 		  unlink_stmt_vdef (*stmt);
> > 		  gsi_remove (gsi, true);
> > 		  return SRA_SA_REMOVED;
> > 		}
> > 	      else
> > 		generate_subtree_copies (racc->first_child, lhs,
> > 					 racc->offset, 0, 0, gsi, false, true);
> > 	    }
> > 	  else if (access_has_children_p (lacc))
> > 	    generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
> > 				     0, 0, gsi, true, true);
> > 	}
> >     }
> > 
> >   return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
> > }
> > 
> > /* Generate statements initializing scalar replacements of parts of function
> >    parameters.  */
> > 
> > static void
> > initialize_parameter_reductions (void)
> > {
> >   gimple_stmt_iterator gsi;
> >   gimple_seq seq = NULL;
> >   tree parm;
> > 
> >   for (parm = DECL_ARGUMENTS (current_function_decl);
> >        parm;
> >        parm = TREE_CHAIN (parm))
> >     {
> >       VEC (access_p, heap) *access_vec;
> >       struct access *access;
> > 
> >       if (!bitmap_bit_p (candidate_bitmap, DECL_UID (parm)))
> > 	continue;
> >       access_vec = get_base_access_vector (parm);
> >       if (!access_vec)
> > 	continue;
> > 
> >       if (!seq)
> > 	{
> > 	  seq = gimple_seq_alloc ();
> > 	  gsi = gsi_start (seq);
> > 	}
> > 
> >       for (access = VEC_index (access_p, access_vec, 0);
> > 	   access;
> > 	   access = access->next_grp)
> > 	generate_subtree_copies (access, parm, 0, 0, 0, &gsi, true, true);
> >     }
> > 
> >   if (seq)
> >     gsi_insert_seq_on_edge_immediate (single_succ_edge (ENTRY_BLOCK_PTR), seq);
> > }
> > 
> > /* The "main" function of intraprocedural SRA passes.  Runs the analysis and if
> >    it reveals there are components of some aggregates to be scalarized, it runs
> >    the required transformations.  */
> > static unsigned int
> > perform_intra_sra (void)
> > {
> >   int ret = 0;
> >   sra_initialize ();
> > 
> >   if (!find_var_candidates ())
> >     goto out;
> > 
> >   if (!scan_function (build_access_from_expr, build_accesses_from_assign, NULL,
> > 		      true, NULL))
> >     goto out;
> > 
> >   if (!analyze_all_variable_accesses ())
> >     goto out;
> > 
> >   scan_function (sra_modify_expr, sra_modify_assign, NULL,
> > 		 false, NULL);
> >   initialize_parameter_reductions ();
> > 
> >   ret = TODO_update_ssa;
> 
> redundant set.
> 
> >   if (sra_mode == SRA_MODE_EARLY_INTRA)
> >     ret = TODO_update_ssa;
> >   else
> >     ret = TODO_update_ssa | TODO_rebuild_alias;
> 
> in fact you shouldn't (need to) rebuild alias.

OK
 
> >  out:
> >   sra_deinitialize ();
> >   return ret;
> > }
> > 
> > /* Perform early intraprocedural SRA.  */
> > static unsigned int
> > early_intra_sra (void)
> > {
> >   sra_mode = SRA_MODE_EARLY_INTRA;
> >   return perform_intra_sra ();
> > }
> > 
> > /* Perform "late" intraprocedural SRA.  */
> > static unsigned int
> > late_intra_sra (void)
> > {
> >   sra_mode = SRA_MODE_INTRA;
> >   return perform_intra_sra ();
> > }
> > 
> > 
> > static bool
> > gate_intra_sra (void)
> > {
> >   return flag_tree_sra != 0;
> > }
> > 
> > 
> > struct gimple_opt_pass pass_sra_early =
> > {
> >  {
> >   GIMPLE_PASS,
> >   "esra",	 			/* name */
> >   gate_intra_sra,			/* gate */
> >   early_intra_sra,			/* execute */
> >   NULL,					/* sub */
> >   NULL,					/* next */
> >   0,					/* static_pass_number */
> >   TV_TREE_SRA,				/* tv_id */
> >   PROP_cfg | PROP_ssa,                  /* properties_required */
> >   0,					/* properties_provided */
> >   0,					/* properties_destroyed */
> >   0,					/* todo_flags_start */
> >   TODO_dump_func
> >   | TODO_update_ssa
> >   | TODO_ggc_collect
> >   | TODO_verify_ssa			/* todo_flags_finish */
> >  }
> > };
> > 
> > 
> > struct gimple_opt_pass pass_sra =
> > {
> >  {
> >   GIMPLE_PASS,
> >   "sra",	 			/* name */
> >   gate_intra_sra,			/* gate */
> >   late_intra_sra,			/* execute */
> >   NULL,					/* sub */
> >   NULL,					/* next */
> >   0,					/* static_pass_number */
> >   TV_TREE_SRA,				/* tv_id */
> >   PROP_cfg | PROP_ssa,                  /* properties_required */
> >   0,					/* properties_provided */
> >   0,					/* properties_destroyed */
> >   TODO_update_address_taken,		/* todo_flags_start */
> >   TODO_dump_func
> >   | TODO_update_ssa
> >   | TODO_ggc_collect
> >   | TODO_verify_ssa			/* todo_flags_finish */
> >  }
> > };
> 
> 
> Overall it looks good - I'm still a little bit confused, but that's
> likely because reading code from top to bottom doesn't make the
> most sense in all cases ;)
> 
> Looking forward to a second look on a revised version.
> 

Thanks again,

Martin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-04-29 12:56     ` Richard Guenther
  2009-05-10 10:33       ` Martin Jambor
@ 2009-05-10 10:39       ` Martin Jambor
  2009-05-12  9:49         ` Martin Jambor
  1 sibling, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-05-10 10:39 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches, Jan Hubicka

Hi,

this is the  new tree-sra.c file which incorporates  most of your
concerns.  It  passes bootstrap and all tests  (including Ada) on
x86_64-linux, revision 147317.

I'm sorry it took so long,  I was on vacation and then I stumbled
upon a few new weird regressions (usually results of new
oversights).

I'm looking forward to your comments.

Martin


/* Scalar Replacement of Aggregates (SRA) converts some structure
   references into scalar references, exposing them to the scalar
   optimizers.
   Copyright (C) 2008, 2009 Free Software Foundation, Inc.
   Contributed by Martin Jambor <mjambor@suse.cz>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* This file implements Scalar Reduction of Aggregates (SRA).  SRA is run
   twice, once in the early stages of compilation (early SRA) and once in the
   late stages (late SRA).  The aim of both is to turn references to scalar
   parts of aggregates into uses of independent scalar variables.

   The two passes are nearly identical, the only difference is that early SRA
   does not scalarize unions which are used as the result in a GIMPLE_RETURN
   statement because together with inlining this can lead to weird type
   conversions.

   Both passes operate in four stages:

   1. The declarations that have properties which make them candidates for
      scalarization are identified in function find_var_candidates().  The
      candidates are stored in candidate_bitmap.

   2. The function body is scanned.  In the process, declarations which are
      used in a manner that prevents their scalarization are removed from the
      candidate bitmap.  More importantly, for every access into an aggregate,
      an access structure (struct access) is created by create_access() and
      stored in a vector associated with the aggregate.  Among other
      information, the aggregate declaration, the offset and size of the access
      and its type are stored in the structure.

      On a related note, assign_link structures are created for every assign
      statement between candidate aggregates and attached to the related
      accesses.

   3. The vectors of accesses are analyzed.  They are first sorted according to
      their offset and size and then scanned for partially overlapping accesses
      (i.e. those which overlap but one is not entirely within another).  Such
      an access disqualifies the whole aggregate from being scalarized.

      If there is no such inhibiting overlap, a representative access structure
      is chosen for every unique combination of offset and size.  Afterwards,
      the pass builds a set of trees from these structures, in which children
      of an access are within their parent (in terms of offset and size).

      Then accesses  are propagated  whenever possible (i.e.  in cases  when it
      does not create a partially overlapping access) across assign_links from
      the right hand side to the left hand side.

      Then the set of trees for each declaration is traversed again and those
      accesses which should be replaced by a scalar are identified.

   4. The function is traversed again, and for every reference into an
      aggregate that has some component which is about to be scalarized,
      statements are amended and new statements are created as necessary.
      Finally, if a parameter got scalarized, the scalar replacements are
      initialized with values from respective parameter aggregates.  */
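
/* As a (hypothetical) illustration of the whole process, a function like

     struct S { int i; float f; };

     float
     foo (struct S s)
     {
       return s.f + s.i;
     }

   should end up roughly as

     SR_s_i = s.i;
     SR_s_f = s.f;
     return SR_s_f + (float) SR_s_i;

   where SR_s_i and SR_s_f stand for the created scalar replacements;
   the names are only illustrative.  */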

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "alloc-pool.h"
#include "tm.h"
#include "tree.h"
#include "gimple.h"
#include "tree-flow.h"
#include "diagnostic.h"
#include "tree-dump.h"
#include "timevar.h"
#include "params.h"
#include "target.h"
#include "flags.h"

/* Enumeration of all aggregate reductions we can do.  */
enum sra_mode { SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
		SRA_MODE_INTRA };	     /* late intraprocedural SRA */

/* Global variable describing which aggregate reduction we are performing at
   the moment.  */
static enum sra_mode sra_mode;

struct assign_link;

/* ACCESS represents each access to an aggregate variable (as a whole or a
   part).  It can also represent a group of accesses that refer to exactly the
   same fragment of an aggregate (i.e. those that have exactly the same offset
   and size).  Such representatives for a single aggregate, once determined,
   are linked in a linked list and have the group fields set.

   Moreover, when doing intraprocedural SRA, a tree is built from those
   representatives (by the means of first_child and next_sibling pointers), in
   which all items in a subtree are "within" the root, i.e. their offset is
   greater or equal to offset of the root and offset+size is smaller or equal
   to offset+size of the root.  Children of an access are sorted by offset.

   Note that accesses to parts of vector and complex number types are always
   represented by an access to the whole complex number or vector.  It is the
   duty of the modifying functions to replace them appropriately.  */

struct access
{
  /* Values returned by `get_ref_base_and_extent' for each component reference.
     If EXPR isn't a component reference, just set `BASE = EXPR', `OFFSET = 0'
     and `SIZE = TYPE_SIZE (TREE_TYPE (expr))'.  */
  HOST_WIDE_INT offset;
  HOST_WIDE_INT size;
  tree base;

  /* Expression.  */
  tree expr;
  /* Type.  */
  tree type;

  /* Next group representative for this aggregate. */
  struct access *next_grp;

  /* Pointer to the group representative.  Pointer to itself if the struct is
     the representative.  */
  struct access *group_representative;

  /* If this access has any children (in terms of the definition above), this
     points to the first one.  */
  struct access *first_child;

  /* Pointer to the next sibling in the access tree as described above.  */
  struct access *next_sibling;

  /* Pointers to the first and last element in the linked list of assign
     links.  */
  struct assign_link *first_link, *last_link;

  /* Pointer to the next access in the work queue.  */
  struct access *next_queued;

  /* Replacement variable for this access "region."  Never to be accessed
     directly, always only by the means of get_access_replacement() and only
     when grp_to_be_replaced flag is set.  */
  tree replacement_decl;

  /* Is this particular access a write access?  */
  unsigned write : 1;

  /* Is this access currently in the work queue?  */
  unsigned grp_queued : 1;
  /* Does this group contain a write access?  This flag is propagated down the
     access tree.  */
  unsigned grp_write : 1;
  /* Does this group contain a read access?  This flag is propagated down the
     access tree.  */
  unsigned grp_read : 1;
  /* Is the subtree rooted in this access fully covered by scalar
     replacements?  */
  unsigned grp_covered : 1;
  /* If set to true, this access and all below it in an access tree must not be
     scalarized.  */
  unsigned grp_unscalarizable_region : 1;
  /* Whether data have been written to parts of the aggregate covered by this
     access which are not to be scalarized.  This flag is propagated up the
     access tree.  */
  unsigned grp_unscalarized_data : 1;
  /* Does this access and/or group contain a write access through a
     BIT_FIELD_REF?  */
  unsigned grp_bfr_lhs : 1;

  /* Set when a scalar replacement should be created for this variable.  We
     make the decision and do the creation at different places because
     create_tmp_var cannot be called from within FOR_EACH_REFERENCED_VAR.  */
  unsigned grp_to_be_replaced : 1;
};

typedef struct access *access_p;

DEF_VEC_P (access_p);
DEF_VEC_ALLOC_P (access_p, heap);

/* Alloc pool for allocating access structures.  */
static alloc_pool access_pool;

/* A structure linking lhs and rhs accesses from an aggregate assignment.  They
   are used to propagate subaccesses from rhs to lhs as long as they don't
   conflict with what is already there.  */
struct assign_link
{
  struct access *lacc, *racc;
  struct assign_link *next;
};

/* Alloc pool for allocating assign link structures.  */
static alloc_pool link_pool;

/* Base (tree) -> Vector (VEC(access_p,heap) *) map.  */
static struct pointer_map_t *base_access_vec;

/* Bitmap of bases (candidates).  */
static bitmap candidate_bitmap;
/* Bitmap of declarations used in a return statement.  */
static bitmap retvals_bitmap;
/* Obstack for creation of fancy names.  */
static struct obstack name_obstack;

/* Head of a linked list of accesses that need to have their subaccesses
   propagated to their assignment counterparts.  */
static struct access *work_queue_head;

/* Dump contents of ACCESS to file F in a human friendly way.  If GRP is true,
   representative fields are dumped, otherwise those which only describe the
   individual access are.  */

static void
dump_access (FILE *f, struct access *access, bool grp)
{
  fprintf (f, "access { ");
  fprintf (f, "base = (%d)'", DECL_UID (access->base));
  print_generic_expr (f, access->base, 0);
  fprintf (f, "', offset = " HOST_WIDE_INT_PRINT_DEC, access->offset);
  fprintf (f, ", size = " HOST_WIDE_INT_PRINT_DEC, access->size);
  fprintf (f, ", expr = ");
  print_generic_expr (f, access->expr, 0);
  fprintf (f, ", type = ");
  print_generic_expr (f, access->type, 0);
  if (grp)
    fprintf (f, ", grp_write = %d, grp_read = %d, grp_covered = %d, "
	     "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
	     "grp_to_be_replaced = %d\n",
	     access->grp_write, access->grp_read, access->grp_covered,
	     access->grp_unscalarizable_region, access->grp_unscalarized_data,
	     access->grp_to_be_replaced);
  else
    fprintf (f, ", write = %d'\n", access->write);
}

/* Dump a subtree rooted in ACCESS to file F, indent by LEVEL.  */

static void
dump_access_tree_1 (FILE *f, struct access *access, int level)
{
  do
    {
      int i;

      for (i = 0; i < level; i++)
	fputs ("* ", dump_file);

      dump_access (f, access, true);

      if (access->first_child)
	dump_access_tree_1 (f, access->first_child, level + 1);

      access = access->next_sibling;
    }
  while (access);
}

/* Dump all access trees for a variable, given the pointer to the first root in
   ACCESS.  */

static void
dump_access_tree (FILE *f, struct access *access)
{
  for (; access; access = access->next_grp)
    dump_access_tree_1 (f, access, 0);
}

/* Return a vector of pointers to accesses for the variable given in BASE or
   NULL if there is none.  */

static VEC (access_p, heap) *
get_base_access_vector (tree base)
{
  void **slot;

  slot = pointer_map_contains (base_access_vec, base);
  if (!slot)
    return NULL;
  else
    return *(VEC (access_p, heap) **) slot;
}

/* Find an access with required OFFSET and SIZE in a subtree of accesses rooted
   in ACCESS.  Return NULL if it cannot be found.  */

static struct access *
find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
			HOST_WIDE_INT size)
{
  while (access && (access->offset != offset || access->size != size))
    {
      struct access *child = access->first_child;

      while (child && (child->offset + child->size <= offset))
	child = child->next_sibling;
      access = child;
    }

  return access;
}

/* Return the first group representative for DECL or NULL if none exists.  */

static struct access *
get_first_repr_for_decl (tree base)
{
  VEC (access_p, heap) *access_vec;

  access_vec = get_base_access_vector (base);
  if (!access_vec)
    return NULL;

  return VEC_index (access_p, access_vec, 0);
}

/* Find an access representative for the variable BASE and given OFFSET and
   SIZE.  Requires that access trees have already been built.  Return NULL if
   it cannot be found.  */

static struct access *
get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
				 HOST_WIDE_INT size)
{
  struct access *access;

  access = get_first_repr_for_decl (base);
  while (access && (access->offset + access->size <= offset))
    access = access->next_grp;
  if (!access)
    return NULL;

  return find_access_in_subtree (access, offset, size);
}

/* Add LINK to the linked list of assign links of RACC.  */
static void
add_link_to_rhs (struct access *racc, struct assign_link *link)
{
  gcc_assert (link->racc == racc);

  if (!racc->first_link)
    {
      gcc_assert (!racc->last_link);
      racc->first_link = link;
    }
  else
    racc->last_link->next = link;

  racc->last_link = link;
  link->next = NULL;
}

/* Move all link structures from the linked list in OLD_RACC to the linked
   list in NEW_RACC.  */
static void
relink_to_new_repr (struct access *new_racc, struct access *old_racc)
{
  if (!old_racc->first_link)
    {
      gcc_assert (!old_racc->last_link);
      return;
    }

  if (new_racc->first_link)
    {
      gcc_assert (!new_racc->last_link->next);
      gcc_assert (!old_racc->last_link || !old_racc->last_link->next);

      new_racc->last_link->next = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  else
    {
      gcc_assert (!new_racc->last_link);

      new_racc->first_link = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  old_racc->first_link = old_racc->last_link = NULL;
}

/* Add ACCESS to the work queue (which is actually a stack).  */

static void
add_access_to_work_queue (struct access *access)
{
  if (!access->grp_queued)
    {
      gcc_assert (!access->next_queued);
      access->next_queued = work_queue_head;
      access->grp_queued = 1;
      work_queue_head = access;
    }
}

/* Pop an access from the work queue, and return it, assuming there is one.  */

static struct access *
pop_access_from_work_queue (void)
{
  struct access *access = work_queue_head;

  work_queue_head = access->next_queued;
  access->next_queued = NULL;
  access->grp_queued = 0;
  return access;
}


/* Allocate necessary structures.  */

static void
sra_initialize (void)
{
  candidate_bitmap = BITMAP_ALLOC (NULL);
  retvals_bitmap = BITMAP_ALLOC (NULL);
  gcc_obstack_init (&name_obstack);
  access_pool = create_alloc_pool ("SRA accesses", sizeof (struct access), 16);
  link_pool = create_alloc_pool ("SRA links", sizeof (struct assign_link), 16);
  base_access_vec = pointer_map_create ();
}

/* Hook fed to pointer_map_traverse, deallocate stored vectors.  */

static bool
delete_base_accesses (const void *key ATTRIBUTE_UNUSED, void **value,
		     void *data ATTRIBUTE_UNUSED)
{
  VEC (access_p, heap) *access_vec;
  access_vec = (VEC (access_p, heap) *) *value;
  VEC_free (access_p, heap, access_vec);

  return true;
}

/* Deallocate all general structures.  */

static void
sra_deinitialize (void)
{
  BITMAP_FREE (candidate_bitmap);
  BITMAP_FREE (retvals_bitmap);
  free_alloc_pool (access_pool);
  free_alloc_pool (link_pool);
  obstack_free (&name_obstack, NULL);

  pointer_map_traverse (base_access_vec, delete_base_accesses, NULL);
  pointer_map_destroy (base_access_vec);
}

/* Remove DECL from candidates for SRA and write REASON to the dump file if
   there is one.  */
static void
disqualify_candidate (tree decl, const char *reason)
{
  bitmap_clear_bit (candidate_bitmap, DECL_UID (decl));

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "! Disqualifying ");
      print_generic_expr (dump_file, decl, 0);
      fprintf (dump_file, " - %s\n", reason);
    }
}

/* Return true iff the type contains a field or an element which does not allow
   scalarization.  */

static bool
type_internals_preclude_sra_p (tree type)
{
  tree fld;
  tree et;

  switch (TREE_CODE (type))
    {
    case RECORD_TYPE:
    case UNION_TYPE:
    case QUAL_UNION_TYPE:
      for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	if (TREE_CODE (fld) == FIELD_DECL)
	  {
	    tree ft = TREE_TYPE (fld);

	    if (TREE_THIS_VOLATILE (fld)
		|| !DECL_FIELD_OFFSET (fld) || !DECL_SIZE (fld)
		|| !host_integerp (DECL_FIELD_OFFSET (fld), 1)
		|| !host_integerp (DECL_SIZE (fld), 1))
	      return true;

	    if (AGGREGATE_TYPE_P (ft)
		&& type_internals_preclude_sra_p (ft))
	      return true;
	  }

      return false;

    case ARRAY_TYPE:
      et = TREE_TYPE (type);

      if (AGGREGATE_TYPE_P (et))
	return type_internals_preclude_sra_p (et);
      else
	return false;

    default:
      return false;
    }
}

/* Create and insert an access for EXPR.  Return the created access or NULL
   if it is not possible.  */

static struct access *
create_access (tree expr, bool write)
{
  struct access *access;
  void **slot;
  VEC (access_p,heap) *vec;
  HOST_WIDE_INT offset, size, max_size;
  tree base = expr;
  bool unscalarizable_region = false;

  base = get_ref_base_and_extent (expr, &offset, &size, &max_size);

  /* !!! Assert for testing only, remove after some time.  */
  gcc_assert (base);
  if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  if (size != max_size)
    {
      size = max_size;
      unscalarizable_region = true;
    }

  if (size < 0)
    {
      disqualify_candidate (base, "Encountered an unconstrained access.");
      return NULL;
    }

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));

  access->base = base;
  access->offset = offset;
  access->size = size;
  access->expr = expr;
  access->type = TREE_TYPE (expr);
  access->write = write;
  access->grp_unscalarizable_region = unscalarizable_region;

  slot = pointer_map_contains (base_access_vec, base);
  if (slot)
    vec = (VEC (access_p, heap) *) *slot;
  else
    vec = VEC_alloc (access_p, heap, 32);

  VEC_safe_push (access_p, heap, vec, access);

  *((struct VEC (access_p,heap) **)
	pointer_map_insert (base_access_vec, base)) = vec;

  return access;
}


/* Search the given tree for a declaration by skipping handled components and
   exclude it from the candidates.  */

static void
disqualify_base_of_expr (tree t, const char *reason)
{
  while (handled_component_p (t))
    t = TREE_OPERAND (t, 0);

  if (DECL_P (t))
    disqualify_candidate (t, reason);
}

/* Scan expression EXPR and create access structures for all accesses to
   candidates for scalarization.  Return the created access or NULL if none is
   created.  */

static struct access *
build_access_from_expr_1 (tree *expr_ptr,
			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write)
{
  struct access *ret = NULL;
  tree expr = *expr_ptr;
  bool bit_ref;

  if (TREE_CODE (expr) == BIT_FIELD_REF)
    {
      expr = TREE_OPERAND (expr, 0);
      bit_ref = true;
    }
  else
    bit_ref = false;

  if (CONVERT_EXPR_P (expr)
      || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    expr = TREE_OPERAND (expr, 0);

  if (contains_view_convert_expr_p (expr))
    {
      disqualify_base_of_expr (expr, "V_C_E under a different handled "
			       "component.");
      return NULL;
    }

  switch (TREE_CODE (expr))
    {
    case VAR_DECL:
    case PARM_DECL:
    case RESULT_DECL:
    case COMPONENT_REF:
    case ARRAY_REF:
    case ARRAY_RANGE_REF:
      ret = create_access (expr, write);
      break;

    case REALPART_EXPR:
    case IMAGPART_EXPR:
      expr = TREE_OPERAND (expr, 0);
      ret = create_access (expr, write);
      break;

    default:
      break;
    }

  if (write && bit_ref && ret)
    ret->grp_bfr_lhs = 1;

  return ret;
}

/* Scan expression EXPR and create access structures for all accesses to
   candidates for scalarization.  Return true if any access has been
   inserted.  */

static bool
build_access_from_expr (tree *expr_ptr,
			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write,
			void *data ATTRIBUTE_UNUSED)
{
  return build_access_from_expr_1 (expr_ptr, gsi, write) != NULL;
}

/* Disqualify LHS and RHS for scalarization if STMT has to end its basic block
   in modes in which it matters; return true iff they have been disqualified.
   RHS may be NULL, in that case it is ignored.  If we scalarize an aggregate
   in intra-SRA we may need to add statements after each statement.  This is
   not possible if a statement unconditionally has to end the basic block.  */
static bool
disqualify_ops_if_throwing_stmt (gimple stmt, tree lhs, tree rhs)
{
  if (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))
    {
      disqualify_base_of_expr (lhs, "LHS of a throwing stmt.");
      if (rhs)
	disqualify_base_of_expr (rhs, "RHS of a throwing stmt.");
      return true;
    }
  return false;
}


/* Result code for scan_assign callback for scan_function.  */
enum scan_assign_result { SRA_SA_NONE,       /* nothing done for the stmt */
			  SRA_SA_PROCESSED,  /* stmt analyzed/changed */
			  SRA_SA_REMOVED };  /* stmt redundant and eliminated */


/* Scan expressions occurring in the statement pointed to by STMT_PTR, create
   access structures for all accesses to candidates for scalarization and
   remove those candidates which occur in statements or expressions that
   prevent them from being split apart.  Return true if any access has been
   inserted.  */

static enum scan_assign_result
build_accesses_from_assign (gimple *stmt_ptr,
			    gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
			    void *data ATTRIBUTE_UNUSED)
{
  gimple stmt = *stmt_ptr;
  tree *lhs_ptr, *rhs_ptr;
  struct access *lacc, *racc;

  if (!gimple_assign_single_p (stmt))
    return SRA_SA_NONE;

  lhs_ptr = gimple_assign_lhs_ptr (stmt);
  rhs_ptr = gimple_assign_rhs1_ptr (stmt);

  if (disqualify_ops_if_throwing_stmt (stmt, *lhs_ptr, *rhs_ptr))
    return SRA_SA_NONE;

  racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
  lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);

  if (lacc && racc
      && !lacc->grp_unscalarizable_region
      && !racc->grp_unscalarizable_region
      && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
      /* FIXME: Turn the following line into an assert after PR 40058 is
	 fixed.  */
      && lacc->size == racc->size
      && useless_type_conversion_p (lacc->type, racc->type))
    {
      struct assign_link *link;

      link = (struct assign_link *) pool_alloc (link_pool);
      memset (link, 0, sizeof (struct assign_link));

      link->lacc = lacc;
      link->racc = racc;

      add_link_to_rhs (racc, link);
    }

  return (lacc || racc) ? SRA_SA_PROCESSED : SRA_SA_NONE;
}
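
/* Illustration (not from the patch): for an aggregate copy such as

     dst = src;   // both sides aggregates of the same size and type

   the code above records an assign_link hanging off src's access, which
   propagate_all_subaccesses later walks to mirror subaccesses of src
   onto dst.  */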

/* Callback of walk_stmt_load_store_addr_ops visit_addr used to determine
   GIMPLE_ASM operands with memory constraints which cannot be scalarized.  */

static bool
asm_visit_addr (gimple stmt ATTRIBUTE_UNUSED, tree op,
		void *data ATTRIBUTE_UNUSED)
{
  if (DECL_P (op))
    disqualify_candidate (op, "Non-scalarizable GIMPLE_ASM operand.");

  return false;
}


/* Scan the function and look for interesting statements.  Return true if any
   has been found or processed, as indicated by the callbacks.  SCAN_EXPR is a
   callback called on all expressions within statements except assign
   statements and those deemed entirely unsuitable for some reason (all
   operands in such statements and expressions are removed from
   candidate_bitmap).  SCAN_ASSIGN is a callback called on all assign
   statements, HANDLE_SSA_DEFS is a callback called on assign statements and
   those call statements which have a LHS and it is the only callback which
   can be NULL.  ANALYSIS_STAGE is true when running in the analysis stage of
   a pass and thus no statement is being modified.  DATA is a pointer passed
   to all callbacks.  If any single callback returns true, this function also
   returns true, otherwise it returns false.  */

static bool
scan_function (bool (*scan_expr) (tree *, gimple_stmt_iterator *, bool, void *),
	       enum scan_assign_result (*scan_assign) (gimple *,
						       gimple_stmt_iterator *,
						       void *),
	       bool (*handle_ssa_defs)(gimple, void *),
	       bool analysis_stage, void *data)
{
  gimple_stmt_iterator gsi;
  basic_block bb;
  unsigned i;
  tree *t;
  bool ret = false;

  FOR_EACH_BB (bb)
    {
      bool bb_changed = false;

      gsi = gsi_start_bb (bb);
      while (!gsi_end_p (gsi))
	{
	  gimple stmt = gsi_stmt (gsi);
	  enum scan_assign_result assign_result;
	  bool any = false, deleted = false;

	  switch (gimple_code (stmt))
	    {
	    case GIMPLE_RETURN:
	      t = gimple_return_retval_ptr (stmt);
	      if (*t != NULL_TREE)
		{
		  if (DECL_P (*t))
		    {
		      tree ret_type = TREE_TYPE (*t);
		      if (analysis_stage
			  && sra_mode == SRA_MODE_EARLY_INTRA
			  && (TREE_CODE (ret_type) == UNION_TYPE
			      || TREE_CODE (ret_type) == QUAL_UNION_TYPE))
			disqualify_candidate (*t,
					      "Union in a return statement.");
		      else
			bitmap_set_bit (retvals_bitmap, DECL_UID (*t));
		    }
		  any |= scan_expr (t, &gsi, false, data);
		}
	      break;

	    case GIMPLE_ASSIGN:
	      assign_result = scan_assign (&stmt, &gsi, data);
	      any |= assign_result == SRA_SA_PROCESSED;
	      deleted = assign_result == SRA_SA_REMOVED;
	      if (handle_ssa_defs && assign_result != SRA_SA_REMOVED)
		any |= handle_ssa_defs (stmt, data);
	      break;

	    case GIMPLE_CALL:
	      /* Operands must be processed before the lhs.  */
	      for (i = 0; i < gimple_call_num_args (stmt); i++)
		{
		  tree *argp = gimple_call_arg_ptr (stmt, i);
		  any |= scan_expr (argp, &gsi, false, data);
		}

	      if (gimple_call_lhs (stmt))
		{
		  tree *lhs_ptr = gimple_call_lhs_ptr (stmt);
		  if (!analysis_stage ||
		      !disqualify_ops_if_throwing_stmt (stmt, *lhs_ptr, NULL))
		    {
		      any |= scan_expr (lhs_ptr, &gsi, true, data);
		      if (handle_ssa_defs)
			any |= handle_ssa_defs (stmt, data);
		    }
		}
	      break;

	    case GIMPLE_ASM:

	      if (analysis_stage)
		walk_stmt_load_store_addr_ops (stmt, NULL, NULL, NULL,
					       asm_visit_addr);
	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_input_op (stmt, i));
		  any |= scan_expr (op, &gsi, false, data);
		}
	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_output_op (stmt, i));
		  any |= scan_expr (op, &gsi, true, data);
		}
	      break;

	    default:
	      break;
	    }

	  if (any)
	    {
	      ret = true;
	      bb_changed = true;

	      if (!analysis_stage)
		{
		  update_stmt (stmt);
		  if (!stmt_could_throw_p (stmt))
		    remove_stmt_from_eh_region (stmt);
		}
	    }
	  if (deleted)
	    bb_changed = true;
	  else
	    gsi_next (&gsi);
	}
      if (!analysis_stage && bb_changed)
	gimple_purge_dead_eh_edges (bb);
    }

  return ret;
}

/* qsort comparator.  The array contains pointers to accesses.  An access is
   considered smaller than another if it has a smaller offset or if the
   offsets are the same but its size is bigger.  */

static int
compare_access_positions (const void *a, const void *b)
{
  const access_p *fp1 = (const access_p *) a;
  const access_p *fp2 = (const access_p *) b;
  const access_p f1 = *fp1;
  const access_p f2 = *fp2;

  if (f1->offset != f2->offset)
    return f1->offset < f2->offset ? -1 : 1;

  if (f1->size == f2->size)
    return 0;
  /* We want the bigger accesses first, thus the opposite operator in the next
     line: */
  return f1->size > f2->size ? -1 : 1;
}
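
/* Illustration (not from the patch): under the comparator above, accesses
   described as (offset, size) pairs sort like

     (0, 64)  (0, 32)  (32, 32)

   i.e. accesses to an enclosing aggregate come before accesses to the
   scalars it contains, which sort_and_splice_var_accesses and
   build_access_subtree below rely on.  */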


/* Append the name of declaration DECL to the name obstack.  A helper function
   for make_fancy_name.  */

static void
make_fancy_decl_name (tree decl)
{
  char buffer[32];

  tree name = DECL_NAME (decl);
  if (name)
    obstack_grow (&name_obstack, IDENTIFIER_POINTER (name),
		  IDENTIFIER_LENGTH (name));
  else
    {
      sprintf (buffer, "D%u", DECL_UID (decl));
      obstack_grow (&name_obstack, buffer, strlen (buffer));
    }
}

/* Helper for make_fancy_name.  */

static void
make_fancy_name_1 (tree expr)
{
  char buffer[32];
  tree index;

  if (DECL_P (expr))
    {
      make_fancy_decl_name (expr);
      return;
    }

  switch (TREE_CODE (expr))
    {
    case COMPONENT_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      make_fancy_decl_name (TREE_OPERAND (expr, 1));
      break;

    case ARRAY_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      /* Arrays with only one element may not have a constant as their
	 index. */
      index = TREE_OPERAND (expr, 1);
      if (TREE_CODE (index) != INTEGER_CST)
	break;
      sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (index));
      obstack_grow (&name_obstack, buffer, strlen (buffer));

      break;

    case BIT_FIELD_REF:
    case REALPART_EXPR:
    case IMAGPART_EXPR:
      gcc_unreachable (); 	/* we treat these as scalars.  */
      break;
    default:
      break;
    }
}

/* Create a human readable name for a replacement variable from EXPR.  */

static char *
make_fancy_name (tree expr)
{
  make_fancy_name_1 (expr);
  obstack_1grow (&name_obstack, '\0');
  return XOBFINISH (&name_obstack, char *);
}
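
/* Illustration (not from the patch): for an access expression like
   param.inner.data[3], make_fancy_name produces "param$inner$data$3",
   which create_access_replacement below then uses as the DECL_NAME of
   the new scalar so that dumps and debug information stay readable.  */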

/* Helper function for build_ref_for_offset.  */

static bool
build_ref_for_offset_1 (tree *res, tree type, HOST_WIDE_INT offset,
			tree exp_type)
{
  while (1)
    {
      tree fld;
      tree tr_size, index;
      HOST_WIDE_INT el_size;

      if (offset == 0 && exp_type
	  && useless_type_conversion_p (exp_type, type))
	return true;

      switch (TREE_CODE (type))
	{
	case UNION_TYPE:
	case QUAL_UNION_TYPE:
	case RECORD_TYPE:
	  /* Some Ada records are half-unions, treat all of them the same.  */
	  for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	    {
	      HOST_WIDE_INT pos, size;
	      tree expr, *expr_ptr;

	      if (TREE_CODE (fld) != FIELD_DECL)
		continue;

	      pos = int_bit_position (fld);
	      gcc_assert (TREE_CODE (type) == RECORD_TYPE || pos == 0);
	      size = tree_low_cst (DECL_SIZE (fld), 1);
	      if (pos > offset || (pos + size) <= offset)
		continue;

	      if (res)
		{
		  expr = build3 (COMPONENT_REF, TREE_TYPE (fld), *res, fld,
				 NULL_TREE);
		  expr_ptr = &expr;
		}
	      else
		expr_ptr = NULL;
	      if (build_ref_for_offset_1 (expr_ptr, TREE_TYPE (fld),
					  offset - pos, exp_type))
		{
		  if (res)
		    *res = expr;
		  return true;
		}
	    }
	  return false;

	case ARRAY_TYPE:
	  tr_size = TYPE_SIZE (TREE_TYPE (type));
	  if (!tr_size || !host_integerp (tr_size, 1))
	    return false;
	  el_size = tree_low_cst (tr_size, 1);

	  index = build_int_cst (TYPE_DOMAIN (type), offset / el_size);
	  if (!integer_zerop (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
	    index = int_const_binop (PLUS_EXPR, index,
				     TYPE_MIN_VALUE (TYPE_DOMAIN (type)), 0);
	  if (res)
	    *res = build4 (ARRAY_REF, TREE_TYPE (type), *res, index, NULL_TREE,
			   NULL_TREE);
	  offset = offset % el_size;
	  type = TREE_TYPE (type);
	  break;

	default:
	  if (offset != 0)
	    return false;

	  if (exp_type)
	    return false;
	  else
	    return true;
	}
    }
}

/* Construct an expression that would reference a part of aggregate *EXPR of
   type TYPE at the given OFFSET of the type EXP_TYPE.  If EXPR is NULL, the
   function only determines whether it can build such a reference without
   actually doing it.

   FIXME: Eventually this should be replaced with
   maybe_fold_offset_to_reference() from tree-ssa-ccp.c but that requires a
   minor rewrite of fold_stmt.
 */

static bool
build_ref_for_offset (tree *expr, tree type, HOST_WIDE_INT offset,
		      tree exp_type, bool allow_ptr)
{
  if (allow_ptr && POINTER_TYPE_P (type))
    {
      type = TREE_TYPE (type);
      if (expr)
	*expr = fold_build1 (INDIRECT_REF, type, *expr);
    }

  return build_ref_for_offset_1 (expr, type, offset, exp_type);
}
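
/* Illustration (not from the patch): given

     struct in { float f; };
     struct out { int i; struct in n; } x;

   a request for bit offset 32 (assuming 32-bit int) with EXP_TYPE float
   turns *EXPR from x into x.n.f by descending the fields as above.  */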

/* The very first phase of intraprocedural SRA.  It marks in candidate_bitmap
   those with type which is suitable for scalarization.  */

static bool
find_var_candidates (void)
{
  tree var, type;
  referenced_var_iterator rvi;
  bool ret = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    {
      if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
        continue;
      type = TREE_TYPE (var);

      if (!AGGREGATE_TYPE_P (type)
	  || needs_to_live_in_memory (var)
	  || TREE_THIS_VOLATILE (var)
	  || !COMPLETE_TYPE_P (type)
	  || !host_integerp (TYPE_SIZE (type), 1)
          || tree_low_cst (TYPE_SIZE (type), 1) == 0
	  || type_internals_preclude_sra_p (type))
	continue;

      bitmap_set_bit (candidate_bitmap, DECL_UID (var));

      if (dump_file && (dump_flags & TDF_DETAILS))
	{
	  fprintf (dump_file, "Candidate (%d): ", DECL_UID (var));
	  print_generic_expr (dump_file, var, 0);
	  fprintf (dump_file, "\n");
	}
      ret = true;
    }

  return ret;
}

/* Sort all accesses for the given variable, check for partial overlaps and
   return NULL if there are any.  If there are none, pick a representative for
   each combination of offset and size and create a linked list out of them.
   Return the pointer to the first representative and make sure it is the first
   one in the vector of accesses.  */

static struct access *
sort_and_splice_var_accesses (tree var)
{
  int i, j, access_count;
  struct access *res, **prev_acc_ptr = &res;
  VEC (access_p, heap) *access_vec;
  bool first = true;
  HOST_WIDE_INT low = -1, high = 0;

  access_vec = get_base_access_vector (var);
  if (!access_vec)
    return NULL;
  access_count = VEC_length (access_p, access_vec);

  /* Sort by <OFFSET, SIZE>.  */
  qsort (VEC_address (access_p, access_vec), access_count, sizeof (access_p),
	 compare_access_positions);

  i = 0;
  while (i < access_count)
    {
      struct access *access = VEC_index (access_p, access_vec, i);
      bool modification = access->write;
      bool grp_read = !access->write;
      bool grp_bfr_lhs = access->grp_bfr_lhs;
      bool first_scalar = is_gimple_reg_type (access->type);
      bool unscalarizable_region = access->grp_unscalarizable_region;

      if (first || access->offset >= high)
	{
	  first = false;
	  low = access->offset;
	  high = access->offset + access->size;
	}
      else if (access->offset > low && access->offset + access->size > high)
	return NULL;
      else
	gcc_assert (access->offset >= low
		    && access->offset + access->size <= high);

      j = i + 1;
      while (j < access_count)
	{
	  struct access *ac2 = VEC_index (access_p, access_vec, j);
	  if (ac2->offset != access->offset || ac2->size != access->size)
	    break;
	  modification |= ac2->write;
	  grp_read |= !ac2->write;
	  grp_bfr_lhs |= ac2->grp_bfr_lhs;
	  unscalarizable_region |= ac2->grp_unscalarizable_region;
	  relink_to_new_repr (access, ac2);

	  /* If one of the equivalent accesses is scalar, use it as a
	     representative (this happens when there is for example only a
	     single scalar field in a structure).  */
	  if (!first_scalar && is_gimple_reg_type (ac2->type))
	    {
	      struct access tmp_acc;
	      first_scalar = true;

	      memcpy (&tmp_acc, ac2, sizeof (struct access));
	      memcpy (ac2, access,  sizeof (struct access));
	      memcpy (access, &tmp_acc, sizeof (struct access));
	    }
	  ac2->group_representative = access;
	  j++;
	}

      i = j;

      access->group_representative = access;
      access->grp_write = modification;
      access->grp_read = grp_read;
      access->grp_bfr_lhs = grp_bfr_lhs;
      access->grp_unscalarizable_region = unscalarizable_region;
      if (access->first_link)
	add_access_to_work_queue (access);

      *prev_acc_ptr = access;
      prev_acc_ptr = &access->next_grp;
    }

  gcc_assert (res == VEC_index (access_p, access_vec, 0));
  return res;
}

/* Create a variable for the given ACCESS which determines the type, name and a
   few other properties.  Return the variable declaration; the caller
   (get_access_replacement) stores it to ACCESS->replacement_decl.  */

static tree
create_access_replacement (struct access *access)
{
  tree repl;

  repl = make_rename_temp (access->type, "SR");
  get_var_ann (repl);
  add_referenced_var (repl);

  DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
  DECL_ARTIFICIAL (repl) = 1;

  if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
    {
      char *pretty_name = make_fancy_name (access->expr);

      DECL_NAME (repl) = get_identifier (pretty_name);
      obstack_free (&name_obstack, pretty_name);

      SET_DECL_DEBUG_EXPR (repl, access->expr);
      DECL_DEBUG_EXPR_IS_FROM (repl) = 1;
      DECL_IGNORED_P (repl) = 0;
    }

  DECL_IGNORED_P (repl) = DECL_IGNORED_P (access->base);
  TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);

  if (access->grp_bfr_lhs)
    DECL_GIMPLE_REG_P (repl) = 0;

  if (dump_file)
    {
      fprintf (dump_file, "Created a replacement for ");
      print_generic_expr (dump_file, access->base, 0);
      fprintf (dump_file, " offset: %u, size: %u: ",
	       (unsigned) access->offset, (unsigned) access->size);
      print_generic_expr (dump_file, repl, 0);
      fprintf (dump_file, "\n");
    }

  return repl;
}

/* Return ACCESS scalar replacement, create it if it does not exist yet.  */

static inline tree
get_access_replacement (struct access *access)
{
  gcc_assert (access->grp_to_be_replaced);

  if (access->replacement_decl)
    return access->replacement_decl;

  access->replacement_decl = create_access_replacement (access);
  return access->replacement_decl;
}

/* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
   linked list along the way.  Stop when *ACCESS is NULL or the access pointed
   to by it is not "within" the root.  */

static void
build_access_subtree (struct access **access)
{
  struct access *root = *access, *last_child = NULL;
  HOST_WIDE_INT limit = root->offset + root->size;

  *access = (*access)->next_grp;
  while (*access && (*access)->offset + (*access)->size <= limit)
    {
      if (!last_child)
	root->first_child = *access;
      else
	last_child->next_sibling = *access;
      last_child = *access;

      build_access_subtree (access);
    }
}

/* Build a tree of access representatives, ACCESS is the pointer to the first
   one, others are linked in a list by the next_grp field.  Decisions about
   scalar replacements are made later in analyze_access_subtree.  */

static void
build_access_trees (struct access *access)
{
  while (access)
    {
      struct access *root = access;

      build_access_subtree (&access);
      root->next_grp = access;
    }
}
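
/* Illustration (not from the patch): for representatives with
   (offset, size) of (0, 64), (0, 32) and (32, 32), already sorted by
   compare_access_positions, the functions above produce

     (0, 64)              <- root, next_grp points past the tree
	(0, 32)           <- first_child of the root
	(32, 32)          <- next_sibling of the child above
*/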

/* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
   both seeming beneficial and when ALLOW_REPLACEMENTS allows it.  Also set all
   sorts of access flags appropriately along the way, notably always setting
   grp_read when MARK_READ is true and grp_write when MARK_WRITE is true.  */

static bool
analyze_access_subtree (struct access *root, bool allow_replacements,
			bool mark_read, bool mark_write)
{
  struct access *child;
  HOST_WIDE_INT limit = root->offset + root->size;
  HOST_WIDE_INT covered_to = root->offset;
  bool scalar = is_gimple_reg_type (root->type);
  bool hole = false, sth_created = false;

  if (mark_read)
    root->grp_read = true;
  else if (root->grp_read)
    mark_read = true;

  if (mark_write)
    root->grp_write = true;
  else if (root->grp_write)
    mark_write = true;

  if (root->grp_unscalarizable_region)
    allow_replacements = false;

  for (child = root->first_child; child; child = child->next_sibling)
    {
      if (!hole && child->offset < covered_to)
	hole = true;
      else
	covered_to += child->size;

      sth_created |= analyze_access_subtree (child,
					     allow_replacements && !scalar,
					     mark_read, mark_write);

      root->grp_unscalarized_data |= child->grp_unscalarized_data;
      hole |= !child->grp_covered;
    }

  if (allow_replacements && scalar && !root->first_child)
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	{
	  fprintf (dump_file, "Marking ");
	  print_generic_expr (dump_file, root->base, 0);
	  fprintf (dump_file, " offset: %u, size: %u: ",
		   (unsigned) root->offset, (unsigned) root->size);
	  fprintf (dump_file, " to be replaced.\n");
	}

      root->grp_to_be_replaced = 1;
      sth_created = true;
      hole = false;
    }
  else if (covered_to < limit)
    hole = true;

  if (sth_created && !hole)
    {
      root->grp_covered = 1;
      return true;
    }
  if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
    root->grp_unscalarized_data = 1; /* not covered and written to */
  if (sth_created)
    return true;
  return false;
}

/* Analyze all access trees linked by next_grp by means of
   analyze_access_subtree.  */
static bool
analyze_access_trees (struct access *access)
{
  bool ret = false;

  while (access)
    {
      if (analyze_access_subtree (access, true, false, false))
	ret = true;
      access = access->next_grp;
    }

  return ret;
}

/* Return true iff a potential new child of LACC at offset OFFSET and with size
   SIZE would conflict with an already existing one.  If exactly such a child
   already exists in LACC, store a pointer to it in EXACT_MATCH.  */

static bool
child_would_conflict_in_lacc (struct access *lacc, HOST_WIDE_INT norm_offset,
			      HOST_WIDE_INT size, struct access **exact_match)
{
  struct access *child;

  for (child = lacc->first_child; child; child = child->next_sibling)
    {
      if (child->offset == norm_offset && child->size == size)
	{
	  *exact_match = child;
	  return true;
	}

      if (child->offset < norm_offset + size
	  && child->offset + child->size > norm_offset)
	return true;
    }

  return false;
}

/* Set the expr of TARGET to one just like MODEL's but with its own base at the
   bottom of the handled components.  */

static void
duplicate_expr_for_different_base (struct access *target,
				   struct access *model)
{
  tree t, expr = unshare_expr (model->expr);

  gcc_assert (handled_component_p (expr));
  t = expr;
  while (handled_component_p (TREE_OPERAND (t, 0)))
    t = TREE_OPERAND (t, 0);
  gcc_assert (TREE_OPERAND (t, 0) == model->base);
  TREE_OPERAND (t, 0) = target->base;

  target->expr = expr;
}


/* Create a new child access of PARENT, with all properties just like MODEL
   except for its offset and with its grp_write false and grp_read true.
   Return the new access.  Note that this access is created long after all
   splicing and sorting, so it is not located in any access vector and is
   automatically a representative of its group.  */

static struct access *
create_artificial_child_access (struct access *parent, struct access *model,
				HOST_WIDE_INT new_offset)
{
  struct access *access;
  struct access **child;

  gcc_assert (!model->grp_unscalarizable_region);

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));
  access->base = parent->base;
  access->offset = new_offset;
  access->size = model->size;
  duplicate_expr_for_different_base (access, model);
  access->type = model->type;
  access->grp_write = true;
  access->grp_read = false;

  child = &parent->first_child;
  while (*child && (*child)->offset < new_offset)
    child = &(*child)->next_sibling;

  access->next_sibling = *child;
  *child = access;

  return access;
}


/* Propagate all subaccesses of RACC across an assignment link to LACC. Return
   true if any new subaccess was created.  Additionally, if RACC is a scalar
   access but LACC is not, change the type of the latter.  */

static bool
propagate_subacesses_accross_link (struct access *lacc, struct access *racc)
{
  struct access *rchild;
  HOST_WIDE_INT norm_delta = lacc->offset - racc->offset;

  bool ret = false;

  if (is_gimple_reg_type (lacc->type)
      || lacc->grp_unscalarizable_region
      || racc->grp_unscalarizable_region)
    return false;

  if (!lacc->first_child && !racc->first_child
      && is_gimple_reg_type (racc->type)
      && (sra_mode == SRA_MODE_INTRA
          || !bitmap_bit_p (retvals_bitmap, DECL_UID (lacc->base))))
    {
      duplicate_expr_for_different_base (lacc, racc);
      lacc->type = racc->type;
      return false;
    }

  for (rchild = racc->first_child; rchild; rchild = rchild->next_sibling)
    {
      struct access *new_acc = NULL;
      HOST_WIDE_INT norm_offset = rchild->offset + norm_delta;

      if (rchild->grp_unscalarizable_region)
	continue;

      if (child_would_conflict_in_lacc (lacc, norm_offset, rchild->size,
					&new_acc))
	{
	  if (new_acc && rchild->first_child)
	    ret |= propagate_subacesses_accross_link (new_acc, rchild);
	  continue;
	}

      new_acc = create_artificial_child_access (lacc, rchild, norm_offset);
      if (racc->first_child)
	propagate_subacesses_accross_link (new_acc, rchild);

      ret = true;
    }

  return ret;
}

/* Propagate all subaccesses across assignment links.  */

static void
propagate_all_subaccesses (void)
{
  while (work_queue_head)
    {
      struct access *racc = pop_access_from_work_queue ();
      struct assign_link *link;

      gcc_assert (racc->first_link);

      for (link = racc->first_link; link; link = link->next)
	{
	  struct access *lacc = link->lacc;

	  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
	    continue;
	  lacc = lacc->group_representative;
	  if (propagate_subacesses_accross_link (lacc, racc)
	      && lacc->first_link)
	    add_access_to_work_queue (lacc);
	}
    }
}

/* Go through all accesses collected throughout the (intraprocedural) analysis
   stage, exclude overlapping ones, identify representatives and build trees
   out of them, making decisions about scalarization on the way.  Return true
   iff there are any to-be-scalarized variables after this stage. */

static bool
analyze_all_variable_accesses (void)
{
  tree var;
  referenced_var_iterator rvi;
  bool res = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access;

	access = sort_and_splice_var_accesses (var);
	if (access)
	  build_access_trees (access);
	else
	  disqualify_candidate (var,
				"No or inhibitingly overlapping accesses.");
      }

  propagate_all_subaccesses ();

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access = get_first_repr_for_decl (var);

	if (analyze_access_trees (access))
	  {
	    res = true;
	    if (dump_file && (dump_flags & TDF_DETAILS))
	      {
		fprintf (dump_file, "\nAccess trees for ");
		print_generic_expr (dump_file, var, 0);
		fprintf (dump_file, " (UID: %u): \n", DECL_UID (var));
		dump_access_tree (dump_file, access);
		fprintf (dump_file, "\n");
	      }
	  }
	else
	  disqualify_candidate (var, "No scalar replacements to be created.");
      }

  return res;
}

/* Return true iff a reference expression into aggregate AGG can be built for
   every single to-be-replaced access that is a child of ACCESS, its sibling
   or a child of its sibling.  TOP_OFFSET is the offset of the processed
   access subtree which has to be subtracted from the offset of each access.  */

static bool
ref_expr_for_all_replacements_p (struct access *access, tree agg,
				 HOST_WIDE_INT top_offset)
{
  do
    {
      if (access->grp_to_be_replaced
	  && !build_ref_for_offset (NULL, TREE_TYPE (agg),
				    access->offset - top_offset,
				    access->type, false))
	return false;

      if (access->first_child
	  && !ref_expr_for_all_replacements_p (access->first_child, agg,
					       top_offset))
	return false;

      access = access->next_sibling;
    }
  while (access);

  return true;
}


/* Generate statements copying scalar replacements of accesses within a subtree
   into or out of AGG.  ACCESS is the first child of the root of the subtree to
   be processed.  AGG is an aggregate type expression (can be a declaration but
   does not have to be, it can for example also be an indirect_ref).
   TOP_OFFSET is the offset of the processed subtree which has to be subtracted
   from offsets of individual accesses to get corresponding offsets for AGG.
   If CHUNK_SIZE is non-zero, copy only replacements in the interval
   <start_offset, start_offset + chunk_size>, otherwise copy all.  GSI is a
   statement iterator used to place the new statements.  WRITE should be true
   when the statements should write from AGG to the replacement and false if
   vice versa.  If INSERT_AFTER is true, new statements will be added after the
   current statement in GSI, otherwise they will be added before it.  */

static void
generate_subtree_copies (struct access *access, tree agg,
			 HOST_WIDE_INT top_offset,
			 HOST_WIDE_INT start_offset, HOST_WIDE_INT chunk_size,
			 gimple_stmt_iterator *gsi, bool write,
			 bool insert_after)
{
  do
    {
      tree expr = unshare_expr (agg);

      if (chunk_size && access->offset >= start_offset + chunk_size)
	return;

      if (access->grp_to_be_replaced
	  && (chunk_size == 0
	      || access->offset + access->size > start_offset))
	{
	  bool repl_found;
	  gimple stmt;

	  repl_found = build_ref_for_offset (&expr, TREE_TYPE (agg),
					     access->offset - top_offset,
					     access->type, false);
	  gcc_assert (repl_found);

	  if (write)
	    stmt = gimple_build_assign (get_access_replacement (access), expr);
	  else
	    {
	      tree repl = get_access_replacement (access);
	      TREE_NO_WARNING (repl) = 1;
	      stmt = gimple_build_assign (expr, repl);
	    }

	  if (insert_after)
	    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  else
	    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}

      if (access->first_child)
	generate_subtree_copies (access->first_child, agg, top_offset,
				 start_offset, chunk_size, gsi,
				 write, insert_after);

      access = access->next_sibling;
    }
  while (access);
}

/* Assign zero to all scalar replacements in an access subtree.  ACCESS is the
   root of the subtree to be processed.  GSI is the statement iterator used
   for inserting statements which are added after the current statement if
   INSERT_AFTER is true or before it otherwise.  */

static void
init_subtree_with_zero (struct access *access, gimple_stmt_iterator *gsi,
			bool insert_after)

{
  struct access *child;

  if (access->grp_to_be_replaced)
    {
      gimple stmt;

      stmt = gimple_build_assign (get_access_replacement (access),
				  fold_convert (access->type,
						integer_zero_node));
      if (insert_after)
	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
      else
	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
      update_stmt (stmt);
    }

  for (child = access->first_child; child; child = child->next_sibling)
    init_subtree_with_zero (child, gsi, insert_after);
}

/* Search for an access representative for the given expression EXPR and
   return it or NULL if it cannot be found.  */

static struct access *
get_access_for_expr (tree expr)
{
  HOST_WIDE_INT offset, size, max_size;
  tree base;

  if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    expr = TREE_OPERAND (expr, 0);

  base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
  /* !!! Assert for testing only, remove after some time.  */
  gcc_assert (base);
  if (max_size == -1 || !DECL_P (base))
    return NULL;

  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  return get_var_base_offset_size_access (base, offset, max_size);
}

/* Substitute the value of the replacement of ACCESS into *EXPR, which has type
   TYPE.  This is done either by producing a special V_C_E assignment statement
   converting the replacement to a new temporary of the requested type if TYPE
   is not TREE_ADDRESSABLE, or by going through the base aggregate if it is.  */

static void
fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
				 gimple_stmt_iterator *gsi, bool write)
{
  tree repl = get_access_replacement (access);
  if (!TREE_ADDRESSABLE (type))
    {
      tree tmp = create_tmp_var (type, "SRvce");

      add_referenced_var (tmp);
      if (is_gimple_reg_type (type))
	tmp = make_ssa_name (tmp, NULL);

      if (write)
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);

	  *expr = tmp;
	  if (is_gimple_reg_type (type))
	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
	  stmt = gimple_build_assign (repl, conv);
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);

	  stmt = gimple_build_assign (tmp, conv);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  if (is_gimple_reg_type (type))
	    SSA_NAME_DEF_STMT (tmp) = stmt;
	  *expr = tmp;
	  update_stmt (stmt);
	}
    }
  else
    {
      if (write)
	{
	  gimple stmt;

	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;

	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}
    }
}


/* Callback for scan_function.  Replace the expression EXPR with a scalar
   replacement if there is one and generate other statements to do type
   conversion or subtree copying if necessary.  GSI is used to place newly
   created statements, WRITE is true if the expression is being written to (it
   is on a LHS of a statement or output in an assembly statement).  */

static bool
sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write,
		 void *data ATTRIBUTE_UNUSED)
{
  struct access *access;
  tree type, bfr;

  if (TREE_CODE (*expr) == BIT_FIELD_REF)
    {
      bfr = *expr;
      expr = &TREE_OPERAND (*expr, 0);
    }
  else
    bfr = NULL_TREE;

  if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
    expr = &TREE_OPERAND (*expr, 0);
  type = TREE_TYPE (*expr);

  access = get_access_for_expr (*expr);
  if (!access)
    return false;

  if (access->grp_to_be_replaced)
    {
      if (!useless_type_conversion_p (type, access->type))
	fix_incompatible_types_for_expr (expr, type, access, gsi, write);
      else
	*expr = get_access_replacement (access);
    }

  if (access->first_child)
    {
      HOST_WIDE_INT start_offset, chunk_size;
      if (bfr
	  && host_integerp (TREE_OPERAND (bfr, 1), 1)
	  && host_integerp (TREE_OPERAND (bfr, 2), 1))
	{
	  start_offset = tree_low_cst (TREE_OPERAND (bfr, 1), 1);
	  chunk_size = tree_low_cst (TREE_OPERAND (bfr, 2), 1);
	}
      else
	start_offset = chunk_size = 0;

      generate_subtree_copies (access->first_child, access->base, 0,
			       start_offset, chunk_size, gsi, write, write);
    }
  return true;
}

/* Store all replacements in the access tree rooted in TOP_RACC to their base
   aggregate if there are unscalarized data or directly to LHS otherwise.  */

static void
handle_unscalarized_data_in_subtree (struct access *top_racc, tree lhs,
				     gimple_stmt_iterator *gsi)
{
  if (top_racc->grp_unscalarized_data)
    generate_subtree_copies (top_racc->first_child, top_racc->base, 0, 0, 0,
			     gsi, false, false);
  else
    generate_subtree_copies (top_racc->first_child, lhs, top_racc->offset,
			     0, 0, gsi, false, false);
}


/* Try to generate statements to load all sub-replacements in an access
   (sub)tree (LACC is the first child) from scalar replacements in the TOP_RACC
   (sub)tree.  If that is not possible, refresh the TOP_RACC base aggregate and
   load the accesses from it.  LEFT_OFFSET is the offset of the left whole
   subtree being copied, RIGHT_OFFSET is the same thing for the right subtree.
   OLD_GSI and NEW_GSI are stmt iterators used for statement insertions.  LHS
   is the left hand side of the original assignment.  *REFRESHED is true iff
   the rhs top aggregate has already been refreshed by contents of its scalar
   reductions; this function sets it to true if it has to do the refresh.  */

static void
load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
				 HOST_WIDE_INT left_offset,
				 HOST_WIDE_INT right_offset,
				 gimple_stmt_iterator *old_gsi,
				 gimple_stmt_iterator *new_gsi,
				 bool *refreshed, tree lhs)
{
  do
    {
      if (lacc->grp_to_be_replaced)
	{
	  struct access *racc;
	  HOST_WIDE_INT offset = lacc->offset - left_offset + right_offset;

	  racc = find_access_in_subtree (top_racc, offset, lacc->size);
	  if (racc && racc->grp_to_be_replaced)
	    {
	      gimple stmt;

	      if (useless_type_conversion_p (lacc->type, racc->type))
		stmt = gimple_build_assign (get_access_replacement (lacc),
					    get_access_replacement (racc));
	      else
		{
		  tree rhs = fold_build1 (VIEW_CONVERT_EXPR, lacc->type,
					  get_access_replacement (racc));
		  stmt = gimple_build_assign (get_access_replacement (lacc),
					      rhs);
		}

	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	  else
	    {
	      tree expr = unshare_expr (top_racc->base);
	      bool repl_found;
	      gimple stmt;

	      /* No suitable access on the right hand side, need to load from
		 the aggregate.  See if we have to update it first... */
	      if (!*refreshed)
		{
		  gcc_assert (top_racc->first_child);
		  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
		  *refreshed = true;
		}

	      repl_found = build_ref_for_offset (&expr,
						 TREE_TYPE (top_racc->base),
						 lacc->offset - left_offset,
						 lacc->type, false);
	      gcc_assert (repl_found);
	      stmt = gimple_build_assign (get_access_replacement (lacc),
					  expr);
	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	}
      else if (lacc->grp_read && !lacc->grp_covered && !*refreshed)
	{
	  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
	  *refreshed = true;
	}

      if (lacc->first_child)
	load_assign_lhs_subreplacements (lacc->first_child, top_racc,
					 left_offset, right_offset,
					 old_gsi, new_gsi, refreshed, lhs);
      lacc = lacc->next_sibling;
    }
  while (lacc);
}

/* Return true iff ACC is non-NULL and has subaccesses.  */

static inline bool
access_has_children_p (struct access *acc)
{
  return acc && acc->first_child;
}

/* Modify assignments with a CONSTRUCTOR on their RHS.  STMT contains a pointer
   to the assignment and GSI is the statement iterator pointing at it.  Returns
   the same values as sra_modify_assign.  */

static enum scan_assign_result
sra_modify_constructor_assign (gimple *stmt, gimple_stmt_iterator *gsi)
{
  tree lhs = gimple_assign_lhs (*stmt);
  struct access *acc;

  gcc_assert (TREE_CODE (lhs) != REALPART_EXPR
	      && TREE_CODE (lhs) != IMAGPART_EXPR);
  acc = get_access_for_expr (lhs);
  if (!acc)
    return SRA_SA_NONE;

  if (VEC_length (constructor_elt,
		  CONSTRUCTOR_ELTS (gimple_assign_rhs1 (*stmt))) > 0)
    {
      /* I have never seen this code path trigger but if it can happen the
	 following should handle it gracefully.  */
      if (access_has_children_p (acc))
	generate_subtree_copies (acc->first_child, acc->base, 0, 0, 0, gsi,
				 true, true);
      return SRA_SA_PROCESSED;
    }

  if (acc->grp_covered)
    {
      init_subtree_with_zero (acc, gsi, false);
      unlink_stmt_vdef (*stmt);
      gsi_remove (gsi, true);
      return SRA_SA_REMOVED;
    }
  else
    {
      init_subtree_with_zero (acc, gsi, true);
      return SRA_SA_PROCESSED;
    }
}


/* Modify statements with an IMAGPART_EXPR or REALPART_EXPR of a
   to-be-scalarized aggregate on their LHS.  STMT is the statement and GSI is
   the iterator used to place new helper statements.  Returns the same values
   as sra_modify_assign.  */

static enum scan_assign_result
sra_modify_partially_complex_lhs (gimple stmt, gimple_stmt_iterator *gsi)
{
  tree lhs, complex, ptype, rp, ip;
  struct access *access;
  gimple new_stmt, aux_stmt;

  lhs = gimple_assign_lhs (stmt);
  complex = TREE_OPERAND (lhs, 0);

  access = get_access_for_expr (complex);

  if (!access || !access->grp_to_be_replaced)
    return SRA_SA_NONE;

  ptype = TREE_TYPE (TREE_TYPE (complex));
  rp = create_tmp_var (ptype, "SRr");
  add_referenced_var (rp);
  rp = make_ssa_name (rp, NULL);

  ip = create_tmp_var (ptype, "SRp");
  add_referenced_var (ip);
  ip = make_ssa_name (ip, NULL);

  if (TREE_CODE (lhs) == IMAGPART_EXPR)
    {
      aux_stmt = gimple_build_assign (rp, fold_build1 (REALPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (rp) = aux_stmt;
      gimple_assign_set_lhs (stmt, ip);
      SSA_NAME_DEF_STMT (ip) = stmt;
    }
  else
    {
      aux_stmt = gimple_build_assign (ip, fold_build1 (IMAGPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (ip) = aux_stmt;
      gimple_assign_set_lhs (stmt, rp);
      SSA_NAME_DEF_STMT (rp) = stmt;
    }

  gsi_insert_before (gsi, aux_stmt, GSI_SAME_STMT);
  update_stmt (aux_stmt);
  new_stmt = gimple_build_assign (get_access_replacement (access),
				  fold_build2 (COMPLEX_EXPR, access->type,
					       rp, ip));
  gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
  update_stmt (new_stmt);
  return SRA_SA_PROCESSED;
}

/* Change STMT to assign compatible types by means of adding component or array
   references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
   variables with the same names in sra_modify_assign.  If we can avoid a V_C_E
   used to load a field from a (single field) record or a union or an element
   of a single-element array by producing COMPONENT_REFs and ARRAY_REFs
   instead, do so.  */

static void
fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
				   struct access *lacc, struct access *racc,
				   tree lhs, tree *rhs, tree ltype, tree rtype)
{
  if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
      && !access_has_children_p (lacc))
    {
      tree expr = unshare_expr (lhs);
      bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
					 false);
      if (found)
	{
	  gimple_assign_set_lhs (*stmt, expr);
	  return;
	}
    }

  if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
      && !access_has_children_p (racc))
    {
      tree expr = unshare_expr (*rhs);
      bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
					 false);
      if (found)
	{
	  gimple_assign_set_rhs1 (*stmt, expr);
	  return;
	}
    }

  *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
  gimple_assign_set_rhs_from_tree (gsi, *rhs);
  *stmt = gsi_stmt (*gsi);
}

/* Callback of scan_function to process assign statements.  It examines both
   sides of the statement, replaces them with a scalar replacement if there is
   one and generates copying of replacements if scalarized aggregates have been
   used in the assignment.  STMT is a pointer to the assign statement, GSI is
   used to hold generated statements for type conversions and subtree
   copying.  */

static enum scan_assign_result
sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
		   void *data ATTRIBUTE_UNUSED)
{
  struct access *lacc, *racc;
  tree ltype, rtype;
  tree lhs, rhs;
  bool modify_this_stmt;

  if (!gimple_assign_single_p (*stmt))
    return SRA_SA_NONE;
  lhs = gimple_assign_lhs (*stmt);
  rhs = gimple_assign_rhs1 (*stmt);

  if (TREE_CODE (rhs) == CONSTRUCTOR)
    return sra_modify_constructor_assign (stmt, gsi);

  if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
    return sra_modify_partially_complex_lhs (*stmt, gsi);

  if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
      || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
    {
      modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
					  gsi, false, data);
      modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
					   gsi, true, data);
      return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
    }

  lacc = get_access_for_expr (lhs);
  racc = get_access_for_expr (rhs);
  if (!lacc && !racc)
    return SRA_SA_NONE;

  modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
		      || (racc && racc->grp_to_be_replaced));

  if (lacc && lacc->grp_to_be_replaced)
    {
      lhs = get_access_replacement (lacc);
      gimple_assign_set_lhs (*stmt, lhs);
      ltype = lacc->type;
    }
  else
    ltype = TREE_TYPE (lhs);

  if (racc && racc->grp_to_be_replaced)
    {
      rhs = get_access_replacement (racc);
      gimple_assign_set_rhs1 (*stmt, rhs);
      rtype = racc->type;
    }
  else
    rtype = TREE_TYPE (rhs);

  if (modify_this_stmt)
    {
      if (!useless_type_conversion_p (ltype, rtype))
	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
					   lhs, &rhs, ltype, rtype);
    }

  /* From this point on, the function deals with assignments in between
     aggregates when at least one has scalar reductions of some of its
     components.  There are three possible scenarios: 1) both the LHS and the
     RHS have to-be-scalarized components, 2) only the RHS has or 3) only the
     LHS has.

     In the first case, we would like to load the LHS components from RHS
     components whenever possible.  If that is not possible, we would like to
     read it directly from the RHS (after updating it by storing in it its own
     components).  If there are some necessary unscalarized data in the LHS,
     those will be loaded by the original assignment too.  If neither of these
     cases happen, the original statement can be removed.  Most of this is done
     by load_assign_lhs_subreplacements.

     In the second case, we would like to store all RHS scalarized components
     directly into LHS and if they cover the aggregate completely, remove the
     statement too.  In the third case, we want the LHS components to be loaded
     directly from the RHS (DSE will remove the original statement if it
     becomes redundant).

     This is a bit complex but manageable when types match and when unions do
     not cause confusion in a way that we cannot really load a component of LHS
     from the RHS or vice versa (the access representing this level can have
     subaccesses that are accessible only through a different union field at a
     higher level - different from the one used in the examined expression).
     Unions are fun.

     Therefore, I specially handle a fourth case, happening when there is a
     specific type cast or it is impossible to locate a scalarized subaccess on
     the other side of the expression.  If that happens, I simply "refresh" the
     RHS by storing in it its scalarized components, leave the original
     statement there to do the copying and then load the scalar replacements of
     the LHS.  This is what the first branch does.  */
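
  /* Illustration (not from the patch): given

       struct S { int a; int b; } x, y;
       x = y;

     case 1) arises when both x and y got scalar replacements, 2) when only
     y did and 3) when only x did.  The "fourth case" branch just below
     triggers for instance for x = VIEW_CONVERT_EXPR<struct S>(z).  */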


  if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
      || (access_has_children_p (racc)
	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
      || (access_has_children_p (lacc)
	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
    {
      if (access_has_children_p (racc))
	generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
				 gsi, false, false);
      if (access_has_children_p (lacc))
	generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
				 gsi, true, true);
    }
  else
    {
      if (access_has_children_p (lacc) && access_has_children_p (racc))
	{
	  gimple_stmt_iterator orig_gsi = *gsi;
	  bool refreshed;

	  if (lacc->grp_read && !lacc->grp_covered)
	    {
	      handle_unscalarized_data_in_subtree (racc, lhs, gsi);
	      refreshed = true;
	    }
	  else
	    refreshed = false;

	  load_assign_lhs_subreplacements (lacc->first_child, racc,
					   lacc->offset, racc->offset,
					   &orig_gsi, gsi, &refreshed, lhs);
	  if (!refreshed || !racc->grp_unscalarized_data)
	    {
	      if (*stmt == gsi_stmt (*gsi))
		gsi_next (gsi);

	      unlink_stmt_vdef (*stmt);
	      gsi_remove (&orig_gsi, true);
	      return SRA_SA_REMOVED;
	    }
	}
      else
	{
	  if (access_has_children_p (racc))
	    {
	      if (!racc->grp_unscalarized_data)
		{
		  generate_subtree_copies (racc->first_child, lhs,
					   racc->offset, 0, 0, gsi,
					   false, false);
		  gcc_assert (*stmt == gsi_stmt (*gsi));
		  unlink_stmt_vdef (*stmt);
		  gsi_remove (gsi, true);
		  return SRA_SA_REMOVED;
		}
	      else
		generate_subtree_copies (racc->first_child, lhs,
					 racc->offset, 0, 0, gsi, false, true);
	    }
	  else if (access_has_children_p (lacc))
	    generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
				     0, 0, gsi, true, true);
	}
    }

  return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
}

/* Generate statements initializing scalar replacements of parts of function
   parameters.  */

static void
initialize_parameter_reductions (void)
{
  gimple_stmt_iterator gsi;
  gimple_seq seq = NULL;
  tree parm;

  for (parm = DECL_ARGUMENTS (current_function_decl);
       parm;
       parm = TREE_CHAIN (parm))
    {
      VEC (access_p, heap) *access_vec;
      struct access *access;

      if (!bitmap_bit_p (candidate_bitmap, DECL_UID (parm)))
	continue;
      access_vec = get_base_access_vector (parm);
      if (!access_vec)
	continue;

      if (!seq)
	{
	  seq = gimple_seq_alloc ();
	  gsi = gsi_start (seq);
	}

      for (access = VEC_index (access_p, access_vec, 0);
	   access;
	   access = access->next_grp)
	generate_subtree_copies (access, parm, 0, 0, 0, &gsi, true, true);
    }

  if (seq)
    gsi_insert_seq_on_edge_immediate (single_succ_edge (ENTRY_BLOCK_PTR), seq);
}

/* The "main" function of intraprocedural SRA passes.  Runs the analysis and if
   it reveals there are components of some aggregates to be scalarized, it runs
   the required transformations.  */
static unsigned int
perform_intra_sra (void)
{
  int ret = 0;
  sra_initialize ();

  if (!find_var_candidates ())
    goto out;

  if (!scan_function (build_access_from_expr, build_accesses_from_assign, NULL,
		      true, NULL))
    goto out;

  if (!analyze_all_variable_accesses ())
    goto out;

  scan_function (sra_modify_expr, sra_modify_assign, NULL,
		 false, NULL);
  initialize_parameter_reductions ();
  ret = TODO_update_ssa;

 out:
  sra_deinitialize ();
  return ret;
}

/* Perform early intraprocedural SRA.  */
static unsigned int
early_intra_sra (void)
{
  sra_mode = SRA_MODE_EARLY_INTRA;
  return perform_intra_sra ();
}

/* Perform "late" intraprocedural SRA.  */
static unsigned int
late_intra_sra (void)
{
  sra_mode = SRA_MODE_INTRA;
  return perform_intra_sra ();
}


static bool
gate_intra_sra (void)
{
  return flag_tree_sra != 0;
}


struct gimple_opt_pass pass_sra_early =
{
 {
  GIMPLE_PASS,
  "esra",	 			/* name */
  gate_intra_sra,			/* gate */
  early_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  0,					/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};


struct gimple_opt_pass pass_sra =
{
 {
  GIMPLE_PASS,
  "sra",	 			/* name */
  gate_intra_sra,			/* gate */
  late_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  TODO_update_address_taken,		/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of  Aggregates.
  2009-05-10 10:33       ` Martin Jambor
@ 2009-05-10 11:48         ` Richard Guenther
  2009-05-12  0:24           ` Martin Jambor
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Guenther @ 2009-05-10 11:48 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Jan Hubicka

On Sun, 10 May 2009, Martin Jambor wrote:

> > >       expr = TREE_OPERAND (expr, 0);
> > >       bit_ref = true;
> > >     }
> > >   else
> > >     bit_ref = false;
> > > 
> > >   while (TREE_CODE (expr) == NOP_EXPR
> > 
> > CONVERT_EXPR_P (expr)
> 
> OK... but  at another place  in the email  you said it might  not even
> appear in a valid gimple statement?  Should I remove it altogether?

Indeed.  If you do not build trees from tuple stmts then a
NOP_EXPR cannot appear as rhs1 or rhs2 of an assignment (instead
it is always the subcode of the gimple stmt and the rhs1 is simply
something valid for is_gimple_val).
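
To illustrate (a sketch of mine, not from the patch): with tuples a cast

  D.1 = (int) f_2;

carries NOP_EXPR only as the statement subcode, so one would check

  if (gimple_assign_cast_p (stmt))
    ...

rather than ever finding a NOP_EXPR tree as rhs1 of the assignment.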

> > > 	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
> > > 	 || TREE_CODE (expr) == REALPART_EXPR
> > > 	 || TREE_CODE (expr) == IMAGPART_EXPR)
> > >     expr = TREE_OPERAND (expr, 0);
> > 
> > Why do this here btw, and not just lump ...
> > 
> > >   switch (TREE_CODE (expr))
> > >     {
> > >     case ADDR_EXPR:
> > >     case SSA_NAME:
> > >     case INDIRECT_REF:
> > >       break;
> > > 
> > >     case VAR_DECL:
> > >     case PARM_DECL:
> > >     case RESULT_DECL:
> > >     case COMPONENT_REF:
> > >     case ARRAY_REF:
> > >       ret = create_access (expr, write);
> > >       break;
> > 
> > ... this ...
> > 
> > >     case REALPART_EXPR:
> > >     case IMAGPART_EXPR:
> > >       expr = TREE_OPERAND (expr, 0);
> > >       ret = create_access (expr, write);
> > 
> > ... and this together?  Won't you create bogus accesses if you
> > strip for example IMAGPART_EXPR (which has non-zero offset)?
> 
> That would  break the complex  number into its components.   I thought
> that they are  meant to stay together for  some reason, otherwise they
> would not be represented explicitly  in gimple... do you think it does
> not matter?  What about vectors then?
> 
> The access is not bogus because modification functions take care of
> these statements in a special way.  However, if it is indeed OK to
> split complex numbers into their components, I will gladly simplify
> this as you suggested.

Yes, it is valid to split them (and complex lowering indeed does that).
It _might_ be useful to keep a complex together in a single SSA_NAME
for optimization purposes, but I guess you detect that anyway if there
is a read of the whole complex element into a register and keep it
that way.

I would favor simplifying SRA in this case and just split them if
that is valid.
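
For example (a hedged GIMPLE-style illustration with invented names),
splitting means turning uses of a complex variable z into

  z$real_1 = REALPART_EXPR <z>;
  z$imag_2 = IMAGPART_EXPR <z>;

so that each component becomes an independent scalar candidate.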

> > >       break;
> > > 
> > >     case ARRAY_RANGE_REF:
> > 
> > it should just be handled fine I think.
> 
> OK, I will try that at some later stage.
> 
> > >     default:
> > >       walk_tree (&safe_expr, disqualify_all, NULL, NULL);
> > 
> > and if not, this should just disqualify the base of the access, like
> > get_base_address (safe_expr) (save_expr you mean?) and then if that
> > is a DECL, disqualify that decl.
> 
> I'll test just doing nothing, things handled by get_base_address are
> either fine or already accounted for.

Fine with me.

> > >     return SRA_SA_NONE;
> > >
> > >   lhs_ptr = gimple_assign_lhs_ptr (stmt);
> > >   rhs_ptr = gimple_assign_rhs1_ptr (stmt);
> > 
> > you probably don't need to pass pointers to trees everywhere as you
> > are not changing them.
> 
> Well, this  function is a  callback called by scan_function  which can
> also  call  sra_modify_expr  in  the  last  stage  of  the  pass  when
> statements  are modified.   I have  considered splitting  the function
> into two but  in the end I  thought they would be too  similar and the
> overhead is hopefully manageable.

Yeah, I noticed this later.  It is somewhat confusing at first sight,
so maybe just amending the comment before this function could
clarify things.

> > >   if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
> > >     return SRA_SA_NONE;
> > > 
> > >   racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
> > >   lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);
> > 
> > just avoid calling into build_access_from_expr_1 for SSA_NAMEs
> > or is_gimple_min_invariant lhs/rhs, that should make that
> > function more regular.
> 
> In what sense?  build_access_from_expr_1 looks at TREE_CODE anyway and
> can discard the two cases, without for example looking into ADDR_EXPRs
> like is_gimple_min_invariant().
> 
> But if you really think it is indeed beneficial, I can do that, sure -
> to me it just looks ugly.

Ok, just keep it as is.

> > >   if (lacc && racc
> > >       && !lacc->grp_unscalarizable_region
> > >       && !racc->grp_unscalarizable_region
> > >       && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
> > >       && lacc->size <= racc->size
> > >       && useless_type_conversion_p (lacc->type, racc->type))
> > 
> > useless_type_conversion_p should be always true here.
> 
> I don't think so, build_access_from_expr_1 can look through V_C_Es and
> the types of accesses are the type of the operand in such cases..

Ok, but what is the point of looking through V_C_Es there if it makes
this test fail?  Hmm, IIRC this was only to track struct copies, right?
I guess it's ok then.

> > That would just be useless information.  I guess you copied this
> > from old SRA?
> 
> Yes.  All this fancy naming stuff  is quite useless but I find it very
> handy when debugging SRA issues.

Yeah, sort of.  Still using no name in that case will do exactly
the same thing ;)

> > > static tree
> > > create_access_replacement (struct access *access)
> > > {
> > >   tree repl;
> > > 
> > >   repl = make_rename_temp (access->type, "SR");
> > >   get_var_ann (repl);
> > >   add_referenced_var (repl);
> > > 
> > >   DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
> > >   DECL_ARTIFICIAL (repl) = 1;
> > > 
> > >   if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
> > 
> > at least && !DECL_ARTIFICIAL (access->base) I think.
> 
> This part is also largely copied from the old SRA.  So far it seems to
> work nicely, replacements of artificial declarations get SRsome_number
> fancy names and that makes  them easy to distinguish.  Nevertheless, I
> can change the condition if it is somehow wrong.  Or do you expect any
> other problems beside not-so-fancy fancy names?

No, it merely uses up memory.  Not that other passes do not do this ...

Thus, I probably do not care too much.

> > >   if (access->grp_bfr_lhs)
> > >     DECL_GIMPLE_REG_P (repl) = 0;
> > 
> > But you never set it (see update_address_taken for more cases,
> > most notably VIEW_CONVERT_EXPR on the lhs which need to be taken
> > care of).  You should set it for COMPLEX_TYPE and VECTOR_TYPE 
> > replacements.
> 
> This function  is the  only place where  I still  use make_rename_temp
> which sets it  exactly in these two cases.  I did  not really know why
> it is  required in these two  cases and only  in these two cases  so I
> left it there, at least for  now.  I guess I understand that now after
> seeing update_address_taken.
> 
> I can  replace this  with calling create_tmp_var()  and doing  all the
> rest  that make_rename_temp does  - I  believe that  you intend  to
> remove it - I have just not found out why it is so bad.

The bad thing about it is that it supports using the SSA renamer
to write a single variable into SSA.  That is usually more costly
than just manually allocating SSA_NAMEs and updating SSA form,
which is usually very easy.

It's not used much, in which case the easiest thing might be to
fix all remaining uses to manually update SSA form.

But yes, I now see why that zeroing is necessary.
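
A minimal sketch of the manual way (assuming the defining statement
stmt has already been built; names are invented):

  tree var, name;

  var = create_tmp_var (access->type, "SR");
  add_referenced_var (var);
  name = make_ssa_name (var, stmt);  /* stmt becomes SSA_NAME_DEF_STMT */

i.e. no run of the renamer is needed for the new name.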

> > 
> > CONVERT_EXPR_P (expr)
> > 
> > >       || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
> > 
> > VIEW_CONVERT_EXPR is also a handled_component_p.
> > 
> > Note that NOP_EXPR should never occur here - that would be invalid
> > gimple.  So I think you can (and should) just delete the above.
> 
> I haven't seen a NOP_EXPR for a while; do they still exist in lowered
> gimple?  I have thus removed their handling.
> 
> Removing diving through V_C_E breaks ADA, though.  The reason is that
> we get a different size (and max_size) when calling
> get_ref_base_and_extent on the V_C_E and on its argument.  However, I
> believe both should be represented by a single access representative.

Yeah, I remember this :/  It is technically invalid GIMPLE that the
Ada FE generates though.  The size of the V_C_E result has to match
that of the operand.

Please add a FIXME before this stripping referring to the Ada problem.

> > >   tree repl = get_access_replacement (access);
> > >   if (!TREE_ADDRESSABLE (type))
> > >     {
> > >       tree tmp = create_tmp_var (type, "SRvce");
> > > 
> > >       add_referenced_var (tmp);
> > >       if (is_gimple_reg_type (type))
> > > 	tmp = make_ssa_name (tmp, NULL);
> > 
> > Should be always is_gimple_reg_type () if it is a type suitable for
> > a SRA scalar replacement. 
> 
> No, it is the type suitable for  the statement, it can be a union type
> or a record with only one field. But see the more thorough explanation
> below...

I think it should be always a register type, but see below... ;)

> > But you should set DECL_GIMPLE_REG_P for
> > VECTOR and COMPLEX types here.
> > 
> > >       if (write)
> > > 	{
> > > 	  gimple stmt;
> > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);
> > 
> > This needs to either always fold to plain 'tmp' or tmp has to be a
> > non-register.  Otherwise you will create invalid gimple.
> > 
> > > 	  *expr = tmp;
> > > 	  if (is_gimple_reg_type (type))
> > > 	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
> > 
> > See above.
> > 
> > > 	  stmt = gimple_build_assign (repl, conv);
> > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > 	  update_stmt (stmt);
> > > 	}
> > >       else
> > > 	{
> > > 	  gimple stmt;
> > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
> > > 
> > > 	  stmt = gimple_build_assign (tmp, conv);
> > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > 	  if (is_gimple_reg_type (type))
> > > 	    SSA_NAME_DEF_STMT (tmp) = stmt;
> > 
> > See above.  (I wonder if the patch still passes bootstrap & regtest
> > after the typechecking patch)
> > 
> > > 	  *expr = tmp;
> > > 	  update_stmt (stmt);
> > > 	}
> > >     }
> > >   else
> > >     {
> > >       if (write)
> > > 	{
> > > 	  gimple stmt;
> > > 
> > > 	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
> > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > 	  update_stmt (stmt);
> > > 	}
> > >       else
> > > 	{
> > > 	  gimple stmt;
> > > 
> > > 	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
> > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > 	  update_stmt (stmt);
> > > 	}
> > 
> > I don't understand this path.  Are the types here always compatible?
> 
> And I don't really understand the comments.  The function is called by
> sra_modify_expr (the function doing the replacements in all non-assign
> statements) when it  needs to replace a reference by  a scalar but the
> types don't match.  This can happen when replacing a V_C_E, a union
> access when we picked a different type than the one used in the
> statement or (and this case can be remarkably irritating) an access to
> a record with only one (scalar) field.
> 
> My original idea was to simply put a V_C_E in its place.  However, I
> believe there are places where this is not possible - or at least one
> case, the LHS of a call statement, because V_C_Es of gimple registers
> (ssa_names) are not allowed on LHSs.  My initial idea to handle these
> cases was to create a new temporary with a matching type and a V_C_E
> assign statement (with the V_C_E always on the RHS - I believe that
> works even with gimple registers) that would do the conversion and
> load/store it to the replacement variable (this is what the
> !TREE_ADDRESSABLE branch does).
> 
> The problem with this idea is TREE_ADDRESSABLE types.  These types
> need to be constructed and thus we cannot create temporary variables
> of these types.  On the other hand they absolutely need to be SRAed;
> not doing so slows down tramp3d by a factor of two (and the current
> SRA also breaks them up).  And quite a few C++ classes are such types
> that are "non-addressable" and have only one scalar field.
> Identifying such records is possible, but I soon realized that I can
> simply leave the statement as it is and produce a new statement to do
> the load/store from the original field (that's what the outermost else
> branch does).
> 
> Does this make sense or is there some fundamental flaw in my reasoning
> about gimple again?  Does this explain what the function does?

Ok, so the case in question is

  struct X { int i; } x;
  x = foo ();

where you want to scalarize x.  Indeed the obvious scalarization would
be

  x = foo ();
  SR_1 = x.i;

For all other LHS cases (not calls) you can move the V_C_E to the RHS
and should be fine.
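
For instance (a sketch with invented SSA names), instead of the
problematic

  VIEW_CONVERT_EXPR<struct X>(SR_1) = y_2;

one would emit

  SR_1 = VIEW_CONVERT_EXPR<int>(y_2);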

I don't understand the TREE_ADDRESSABLE thingy yet.  Especially why
the type should be a non-register type.  My understanding was that
in reads the replacement LHS will be a register, for mismatched types
you simply put a V_C_E around the RHS.  For writes the replacement
RHS will be a register, possibly surrounded by a V_C_E.  The special
case is calls where you cannot put a V_C_E around the "RHS".

So, can you explain with an example, how it comes that the scalar
replacement type is a non-register type?  (Whatever our conclusion
will be, the function should have a big comment enumerating the
cases we have to deal with)

> It certainly passes bootstrap and testing, I use --enable-checking=yes.

That's good.

> > >     {
> > >       /* I have never seen this code path trigger but if it can happen the
> > > 	 following should handle it gracefully.  */
> > 
> > It can trigger for vector constants.
> 
> OK, I'll remove the comment.  Apparently there are none in the
> testsuite, I believe I tested with a gcc_unreachable here.

Err, for vector constants we have VECTOR_CST, so it triggers for
non-constant vector constructors like

 vector int x = { a, b, c, d };

> > >   update_stmt (aux_stmt);
> > >   new_stmt = gimple_build_assign (get_access_replacement (access),
> > > 				  fold_build2 (COMPLEX_EXPR, access->type,
> > > 					       rp, ip));
> > >   gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
> > >   update_stmt (new_stmt);
> > 
> > Hm.  So you do what complex lowering does here.  Note that this may
> > create loads from uninitialized memory with all its problems.
> 
> Yes,  but I  have not  had any  such problems  with complex  types (as
> opposed to  simple loads from half-initialized  records, for example).
> OTOH, I have also contemplated setting DECL_GIMPLE_REG_P to zero for
> complex replacements which appear in an IMAGPART_EXPR or REALPART_EXPR
> on the LHS of a statement.

Yes, that's necessary.  I still think SRA should not bother about
this at all ;)

> > WRT the complex stuff.  If you would do scalarization and analysis
> > just on the components (not special case REAL/IMAGPART_EXPR everywhere)
> > it should work better, correct?  You still could handle group
> > scalarization for the case of for example passing a complex argument
> > to a function.
> 
> Well, my reasoning was that if complex types were first-class citizens
> in gimple  (as opposed to a record),  there was a reason  to keep them
> together  and  so  I  attempted   that.   But  again,  if  that  is  a
> misconception of mine and there  is no point in keeping them together,
> I will gladly remove this.

It's not clear.  Complex lowering decomposes all complex variables
to components if possible.  Again, simplifying SRA is probably better.

> > void bar(_Complex float);
> > void foo(float x, float y)
> > {
> >   _Complex float z = x;
> >   __imag z = y;
> >   bar(z);
> > }
> > 
> > The same applies for vectors - the REAL/IMAGPART_EXPRs equivalent
> > there is BIT_FIELD_REF.
> 
> These are handled  by setting DECL_GIMPLE_REG_P to zero  if a B_F_R is
> on a LHS.  I believe the current SRA does the same.  It works fine and
> there's a lot less fuss about them.
>  
> > >   return SRA_SA_PROCESSED;
> > > }
> > > 
> > > /* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */
> > > 
> > > static bool
> > > contains_view_convert_expr_p (tree t)
> > > {
> > >   while (1)
> > >     {
> > >       if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
> > > 	return true;
> > >       if (!handled_component_p (t))
> > > 	return false;
> > >       t = TREE_OPERAND (t, 0);
> > >     }
> > > }
> > 
> > Place this in tree-flow-inline.h next to ref_contains_array_ref, also
> > structure the loop in the same way.
> 
> OK,  but I'd like  the function  to work  if passed  declarations too.
> Thus I cannot really use a  do-while loop.  I'll send it in a separate
> patch.
> 
> > > /* Change STMT to assign compatible types by means of adding component or array
> > >    references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning
> > >    as variables with the same names in sra_modify_assign.  This is done in
> > >    such a complicated way in order to make
> > >    testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy and so it helps in at least some
> > >    cases.  */
> > > 
> > > static void
> > > fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
> > > 				   struct access *lacc, struct access *racc,
> > > 				   tree lhs, tree *rhs, tree ltype, tree rtype)
> > > {
> > >   if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
> > >       && !access_has_children_p (lacc))
> > >     {
> > >       tree expr = unshare_expr (lhs);
> > >       bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
> > > 					 false);
> > >       if (found)
> > > 	{
> > > 	  gimple_assign_set_lhs (*stmt, expr);
> > > 	  return;
> > > 	}
> > >     }
> > > 
> > >   if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
> > >       && !access_has_children_p (racc))
> > >     {
> > >       tree expr = unshare_expr (*rhs);
> > >       bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
> > > 					 false);
> > >       if (found)
> > > 	{
> > > 	  gimple_assign_set_rhs1 (*stmt, expr);
> > > 	  return;
> > > 	}
> > >     }
> > > 
> > >   *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
> > >   gimple_assign_set_rhs_from_tree (gsi, *rhs);
> > >   *stmt = gsi_stmt (*gsi);
> > 
> > Reading this I have a deja-vu - isn't there another function in this
> > file doing the same thing?  You are doing much unsharing even though
> > you re-build the access tree from scratch?
> 
> This function has a similar purpose as fix_incompatible_types_for_expr
> but this time  only for assign statements.  That  is easier because we
> can always put the  V_C_E on the RHS and be safe  and so no additional
> statements need to be generated.
> 
> However,  the V_C_Es  rather than  COMPONENT_REFs and  ARRAY_REFs feel
> unnatural for  accessing fields from  single field records  and unions
> and single  element arrays.  According to  the comment I  used to have
> problems of some sort with that  in the ssa-sra-2.C testcase but I can
> no longer reproduce them (and don't remember them).
> 
> I  call  unshare_expr  in  this  context  only when  one  side  of  an
> assignment statement is  a scalar replacement and the  other one is an
> aggregate (but not necessarily a declaration) which can happen only in
> the cases listed  above.  That is not very many  calls and chances are
> good that build_ref_for_offset succeeds.
> 
> Does that explain what is going on here?

Yes.  I guess merging the two functions would make sense to me
(or putting them next to each other at least).  The function signatures
also look seemingly weird (that you store into *rhs but still set
the stmt rhs, etc.).  I'll have another look here with the new version.

> > > }
> > > 
> > > /* Callback of scan_function to process assign statements.  It examines both
> > >    sides of the statement, replaces them with a scalar replacement if there is
> > >    one and generates copying of replacements if scalarized aggregates have been
> > >    used in the assignment.  STMT is a pointer to the assign statement, GSI is
> > >    used to hold generated statements for type conversions and subtree
> > >    copying.  */
> > > 
> > > static enum scan_assign_result
> > > sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
> > > 		   void *data ATTRIBUTE_UNUSED)
> > > {
> > >   struct access *lacc, *racc;
> > >   tree ltype, rtype;
> > >   tree lhs, rhs;
> > >   bool modify_this_stmt;
> > > 
> > >   if (gimple_assign_rhs2 (*stmt))
> > 
> > !gimple_assign_single_p (*stmt)
> > 
> > (the only gimple assign that may access memory)
> 
> OK
> 
> > 
> > >     return SRA_SA_NONE;
> > >   lhs = gimple_assign_lhs (*stmt);
> > >   rhs = gimple_assign_rhs1 (*stmt);
> > > 
> > >   if (TREE_CODE (rhs) == CONSTRUCTOR)
> > >     return sra_modify_constructor_assign (stmt, gsi);
> > > 
> > >   if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
> > >     return sra_modify_partially_complex_lhs (*stmt, gsi);
> > > 
> > >   if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
> > >       || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
> > >     {
> > >       modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
> > > 					  gsi, false, data);
> > >       modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
> > > 					   gsi, true, data);
> > >       return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
> > >     }
> > > 
> > >   lacc = get_access_for_expr (lhs);
> > >   racc = get_access_for_expr (rhs);
> > >   if (!lacc && !racc)
> > >     return SRA_SA_NONE;
> > > 
> > >   modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
> > > 		      || (racc && racc->grp_to_be_replaced));
> > > 
> > >   if (lacc && lacc->grp_to_be_replaced)
> > >     {
> > >       lhs = get_access_replacement (lacc);
> > >       gimple_assign_set_lhs (*stmt, lhs);
> > >       ltype = lacc->type;
> > >     }
> > >   else
> > >     ltype = TREE_TYPE (lhs);
> > > 
> > >   if (racc && racc->grp_to_be_replaced)
> > >     {
> > >       rhs = get_access_replacement (racc);
> > >       gimple_assign_set_rhs1 (*stmt, rhs);
> > >       rtype = racc->type;
> > >     }
> > >   else
> > >     rtype = TREE_TYPE (rhs);
> > > 
> > >   /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate
> > >      the statement makes the position of this pop_stmt_changes() a bit awkward
> > >      but hopefully make some sense.  */
> > 
> > I don't see pop_stmt_changes().
> 
> Yeah, the comment is outdated. I've removed it.
>  
> > >   if (modify_this_stmt)
> > >     {
> > >       if (!useless_type_conversion_p (ltype, rtype))
> > > 	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
> > > 					   lhs, &rhs, ltype, rtype);
> > >     }
> > > 
> > >   if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
> > >       || (access_has_children_p (racc)
> > > 	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
> > >       || (access_has_children_p (lacc)
> > > 	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
> > 
> > ?  A comment is missing what this case is about ...
> > 
> > (this smells like fixup that could be avoided by doing things correct
> > in the first place)
> 
> From this  point on,  the function deals  with assignments  in between
> aggregates  when at least  one has  scalar reductions  of some  of its
> components.  There are three possible scenarios: 1) both the LHS and
> RHS have to-be-scalarized components, 2) only the RHS has, or 3) only
> the LHS has.
> 
> In the first  case, we would like to load the  LHS components from RHS
> components whenever possible.  If that  is not possible, we would like
> to read it  directly from the RHS (after updating it  by storing in it
> its own components).  If there are some necessary unscalarized data in
> the  LHS, those will  be loaded  by the  original assignment  too.  If
> neither of these cases happen,  the original statement can be removed.
> Most of this is done by load_assign_lhs_subreplacements.
> 
> In  the  second  case, we  would  like  to  store all  RHS  scalarized
> components  directly  into  LHS   and  if  they  cover  the  aggregate
> completely, remove the statement too.   In the third case, we want the
> LHS components to be loaded directly from the RHS (DSE will remove the
> original statement if it becomes redundant).
> 
> This is a bit complex but  manageable when types match and when unions
> do not cause confusion in a way that we cannot really load a component
> of LHS from the RHS or  vice versa (the access representing this level
> can  have subaccesses  that are  accessible only  through  a different
> union field  at a higher  level - different  from the one used  in the
> examined expression).  Unions are fun.
> 
> Therefore, I specially handle a fourth case, happening when there is a
> specific  type  cast  or  it  is impossible  to  locate  a  scalarized
> subaccess on  the other  side of the  expression.  If that  happens, I
> simply "refresh"  the RHS  by storing in  it is  scalarized components
> leave the original statement there to do the copying and then load the
> scalar replacements of the LHS.  This is what the first branch does.
> 
> Is it  clearer now?  Perhaps I  should put these five  paragraphs as a
> comment into the function?

Yes - that would be nice.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-05-10 11:48         ` Richard Guenther
@ 2009-05-12  0:24           ` Martin Jambor
  2009-05-18 13:26             ` Richard Guenther
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Jambor @ 2009-05-12  0:24 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches, Jan Hubicka

Hi,

thanks for a quick reply.  Some clarifications below:

On Sun, May 10, 2009 at 01:48:01PM +0200, Richard Guenther wrote:
> On Sun, 10 May 2009, Martin Jambor wrote:
> 
> > > >       expr = TREE_OPERAND (expr, 0);
> > > >       bit_ref = true;
> > > >     }
> > > >   else
> > > >     bit_ref = false;
> > > > 
> > > >   while (TREE_CODE (expr) == NOP_EXPR
> > > 
> > > CONVERT_EXPR_P (expr)
> > 
> > OK... but  at another place  in the email  you said it might  not even
> > appear in a valid gimple statement?  Should I remove it altogether?
> 
> Indeed.  If you do not build trees from tuple stmts, then a
> NOP_EXPR cannot appear as the rhs1 or rhs2 of an assignment (instead
> it is always the subcode of the gimple stmt and the rhs1 is simply
> something valid for is_gimple_val).

OK

> 
> > > > 	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
> > > > 	 || TREE_CODE (expr) == REALPART_EXPR
> > > > 	 || TREE_CODE (expr) == IMAGPART_EXPR)
> > > >     expr = TREE_OPERAND (expr, 0);
> > > 
> > > Why do this here btw, and not just lump ...
> > > 
> > > >   switch (TREE_CODE (expr))
> > > >     {
> > > >     case ADDR_EXPR:
> > > >     case SSA_NAME:
> > > >     case INDIRECT_REF:
> > > >       break;
> > > > 
> > > >     case VAR_DECL:
> > > >     case PARM_DECL:
> > > >     case RESULT_DECL:
> > > >     case COMPONENT_REF:
> > > >     case ARRAY_REF:
> > > >       ret = create_access (expr, write);
> > > >       break;
> > > 
> > > ... this ...
> > > 
> > > >     case REALPART_EXPR:
> > > >     case IMAGPART_EXPR:
> > > >       expr = TREE_OPERAND (expr, 0);
> > > >       ret = create_access (expr, write);
> > > 
> > > ... and this together?  Won't you create bogus accesses if you
> > > strip for example IMAGPART_EXPR (which has non-zero offset)?
> > 
> > That would  break the complex  number into its components.   I thought
> > that they are  meant to stay together for  some reason, otherwise they
> > would not be represented explicitly  in gimple... do you think it does
> > not matter?  What about vectors then?
> > 
> > The access is not bogus because modification functions take care of
> > these statements in a special way.  However, if it is indeed OK to
> > split complex numbers into their components, I will gladly simplify
> > this as you suggested.
> 
> Yes, it is valid to split them (and complex lowering indeed does that).
> It _might_ be useful to keep a complex together in a single SSA_NAME
> for optimization purposes, but I guess you detect that anyway if there
> is a read of the whole complex element into a register and keep it
> that way.
> 
> I would favor simplifying SRA in this case and just split them if
> that is valid.

OK, I will try to do that and incorporate it into the patch if it
indeed simplifies things.  OTOH, at least the analysis of access trees
will become more complex.  Unfortunately, I need to concentrate on
another two things this week, so I'll do that early in the next one.

> > > >     return SRA_SA_NONE;
> > > >
> > > >   lhs_ptr = gimple_assign_lhs_ptr (stmt);
> > > >   rhs_ptr = gimple_assign_rhs1_ptr (stmt);
> > > 
> > > you probably don't need to pass pointers to trees everywhere as you
> > > are not changing them.
> > 
> > Well, this  function is a  callback called by scan_function  which can
> > also  call  sra_modify_expr  in  the  last  stage  of  the  pass  when
> > statements  are modified.   I have  considered splitting  the function
> > into two but  in the end I  thought they would be too  similar and the
> > overhead is hopefully manageable.
> 
> Yeah, I noticed this later.  It is somewhat confusing at first sight,
> so maybe just amending the comment before this function could
> clarify things.

OK, for my part, I've realized that build_access_from_expr_1 indeed
does not use the gsi parameter (and is not a callback, unlike
build_access_from_expr).

> > > >   if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
> > > >     return SRA_SA_NONE;
> > > > 
> > > >   racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
> > > >   lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);
> > > 
> > > just avoid calling into build_access_from_expr_1 for SSA_NAMEs
> > > or is_gimple_min_invariant lhs/rhs, that should make that
> > > function more regular.
> > 
> > In what sense?  build_access_from_expr_1 looks at TREE_CODE anyway and
> > can discard the two cases, without for example looking into ADDR_EXPRs
> > like is_gimple_min_invariant().
> > 
> > But if you really think it is indeed beneficial, I can do that, sure -
> > to me it just looks ugly.
> 
> Ok, just keep it as is.
> 
> > > >   if (lacc && racc
> > > >       && !lacc->grp_unscalarizable_region
> > > >       && !racc->grp_unscalarizable_region
> > > >       && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
> > > >       && lacc->size <= racc->size
> > > >       && useless_type_conversion_p (lacc->type, racc->type))
> > > 
> > > useless_type_conversion_p should be always true here.
> > 
> > I don't think so, build_access_from_expr_1 can look through V_C_Es and
> > the types of accesses are the type of the operand in such cases..
> 
> Ok, but what is the point of looking through V_C_Es there if it makes
> this test fail?  Hmm, IIRC this was only to track struct copies, right?
> I guess it's ok then.

Yes, only for that.  The point of looking through V_C_Es is the size
inconsistency.
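
A sketch of the check in question (vce and the locals are hypothetical
names):

  HOST_WIDE_INT off1, size1, max1, off2, size2, max2;

  get_ref_base_and_extent (vce, &off1, &size1, &max1);
  get_ref_base_and_extent (TREE_OPERAND (vce, 0), &off2, &size2, &max2);

With the Ada-generated V_C_Es, size1 can differ from size2 even though
the conversion is supposed to be size-preserving.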

> 
> > > That would just be useless information.  I guess you copied this
> > > from old SRA?
> > 
> > Yes.  All this fancy naming stuff  is quite useless but I find it very
> > handy when debugging SRA issues.
> 
> Yeah, sort of.  Still using no name in that case will do exactly
> the same thing ;)
> 
> > > > static tree
> > > > create_access_replacement (struct access *access)
> > > > {
> > > >   tree repl;
> > > > 
> > > >   repl = make_rename_temp (access->type, "SR");
> > > >   get_var_ann (repl);
> > > >   add_referenced_var (repl);
> > > > 
> > > >   DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
> > > >   DECL_ARTIFICIAL (repl) = 1;
> > > > 
> > > >   if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
> > > 
> > > at least && !DECL_ARTIFICIAL (access->base) I think.
> > 
> > This part is also largely copied from the old SRA.  So far it seems to
> > work nicely, replacements of artificial declarations get SRsome_number
> > fancy names and that makes  them easy to distinguish.  Nevertheless, I
> > can change the condition if it is somehow wrong.  Or do you expect any
> > other problems beside not-so-fancy fancy names?
> 
> No, it merely uses up memory.  Not that other passes do not do this ...
> 
> Thus, I probably do not care too much.

Well, I'd like to have the fancy names when SRA gets merged and new
issues are likely to come up.  We can always remove this later on.

 
> > > >   if (access->grp_bfr_lhs)
> > > >     DECL_GIMPLE_REG_P (repl) = 0;
> > > 
> > > But you never set it (see update_address_taken for more cases,
> > > most notably VIEW_CONVERT_EXPR on the lhs which need to be taken
> > > care of).  You should set it for COMPLEX_TYPE and VECTOR_TYPE 
> > > replacements.
> > 
> > This function  is the  only place where  I still  use make_rename_temp
> > which sets it  exactly in these two cases.  I did  not really know why
> > it is  required in these two  cases and only  in these two cases  so I
> > left it there, at least for  now.  I guess I understand that now after
> > seeing update_address_taken.
> > 
> > I can  replace this  with calling create_tmp_var()  and doing  all the
> > rest  that make_rename_temp does  - I  believe that  you intend  to
> > remove it - I have just not found out why it is so bad.
> 
> The bad thing about it is that it supports using the SSA renamer
> to write a single variable into SSA.  That is usually more costly
> than just manually allocating SSA_NAMEs and updating SSA form,
> which is usually very easy.
> 
> It's not used much, in which case the easiest thing might be to
> fix all remaining uses to manually update SSA form.
> 
> But yes, I now see why that zeroing is necessary.

OK, I'll use create_tmp_var here too.  But at this point I cannot
create SSA_NAMEs manually and  will basically have to do all that
make_rename_temp does.
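
Roughly, a sketch of what make_rename_temp amounts to, if I read it
correctly:

  tree t = create_tmp_var (type, "SR");

  if (TREE_CODE (type) == COMPLEX_TYPE || TREE_CODE (type) == VECTOR_TYPE)
    DECL_GIMPLE_REG_P (t) = 1;
  add_referenced_var (t);
  mark_sym_for_renaming (t);

i.e. the renamer is asked to rewrite t into SSA form later.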
 
> > > 
> > > CONVERT_EXPR_P (expr)
> > > 
> > > >       || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
> > > 
> > > VIEW_CONVERT_EXPR is also a handled_component_p.
> > > 
> > > Note that NOP_EXPR should never occur here - that would be invalid
> > > gimple.  So I think you can (and should) just delete the above.
> > 
> > I haven't seen a NOP_EXPR for a while; do they still exist in lowered
> > gimple?  I have thus removed their handling.
> > 
> > Removing diving through V_C_E breaks ADA, though.  The reason is that
> > we get a different size (and max_size) when calling
> > get_ref_base_and_extent on the V_C_E and on its argument.  However, I
> > believe both should be represented by a single access representative.
> 
> Yeah, I remember this :/  It is technically invalid GIMPLE that the
> Ada FE generates though.  The size of the V_C_E result has to match
> that of the operand.
> 
> Please add a FIXME before this stripping referring to the Ada problem.

OK

> > > >   tree repl = get_access_replacement (access);
> > > >   if (!TREE_ADDRESSABLE (type))
> > > >     {
> > > >       tree tmp = create_tmp_var (type, "SRvce");
> > > > 
> > > >       add_referenced_var (tmp);
> > > >       if (is_gimple_reg_type (type))
> > > > 	tmp = make_ssa_name (tmp, NULL);
> > > 
> > > Should be always is_gimple_reg_type () if it is a type suitable for
> > > a SRA scalar replacement. 
> > 
> > No, it is the type suitable for  the statement, it can be a union type
> > or a record with only one field. But see the more thorough explanation
> > below...
> 
> I think it should be always a register type, but see below... ;)
> 
> > > But you should set DECL_GIMPLE_REG_P for
> > > VECTOR and COMPLEX types here.
> > > 
> > > >       if (write)
> > > > 	{
> > > > 	  gimple stmt;
> > > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);
> > > 
> > > This needs to either always fold to plain 'tmp' or tmp has to be a
> > > non-register.  Otherwise you will create invalid gimple.
> > > 
> > > > 	  *expr = tmp;
> > > > 	  if (is_gimple_reg_type (type))
> > > > 	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
> > > 
> > > See above.
> > > 
> > > > 	  stmt = gimple_build_assign (repl, conv);
> > > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > >       else
> > > > 	{
> > > > 	  gimple stmt;
> > > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
> > > > 
> > > > 	  stmt = gimple_build_assign (tmp, conv);
> > > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > > 	  if (is_gimple_reg_type (type))
> > > > 	    SSA_NAME_DEF_STMT (tmp) = stmt;
> > > 
> > > See above.  (I wonder if the patch still passes bootstrap & regtest
> > > after the typechecking patch)
> > > 
> > > > 	  *expr = tmp;
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > >     }
> > > >   else
> > > >     {
> > > >       if (write)
> > > > 	{
> > > > 	  gimple stmt;
> > > > 
> > > > 	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
> > > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > >       else
> > > > 	{
> > > > 	  gimple stmt;
> > > > 
> > > > 	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
> > > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > 
> > > I don't understand this path.  Are the types here always compatible?
> > 
> > And I don't really understand the comments.  The function is called by
> > sra_modify_expr (the function doing the replacements in all non-assign
> > statements) when it  needs to replace a reference by  a scalar but the
> > types don't match.  This can happen when replacing a V_C_E, a union
> > access when we picked a different type than the one used in the
> > statement or (and this case can be remarkably irritating) an access to
> > a record with only one (scalar) field.
> > 
> > My original idea was to simply put a V_C_E in its place.  However, I
> > believe there are places where this is not possible - or at least one
> > case, the LHS of a call statement, because V_C_Es of gimple registers
> > (ssa_names) are not allowed on LHSs.  My initial idea to handle these
> > cases was to create a new temporary with a matching type and a V_C_E
> > assign statement (with the V_C_E always on the RHS - I believe that
> > works even with gimple registers) that would do the conversion and
> > load/store it to the replacement variable (this is what the
> > !TREE_ADDRESSABLE branch does).
> > 
> > The problem with this idea is TREE_ADDRESSABLE types.  These types
> > need to be constructed and thus we cannot create temporary variables
> > of these types.  On the other hand they absolutely need to be SRAed;
> > not doing so slows down tramp3d by a factor of two (and the current
> > SRA also breaks them up).  And quite a few C++ classes are such types
> > that are "non-addressable" and have only one scalar field.
> > Identifying such records is possible, but I soon realized that I can
> > simply leave the statement as it is and produce a new statement to do
> > the load/store from the original field (that's what the outermost else
> > branch does).
> > 
> > Does this make sense or is there some fundamental flaw in my reasoning
> > about gimple again?  Does this explain what the function does?
> 
> Ok, so the case in question is
> 
>   struct X { int i; } x;
>   x = foo ();
> 
> where you want to scalarize x.  Indeed the obvious scalarization would
> be
> 
>   x = foo ();
>   SR_1 = x.i;
> 
> For all other LHS cases (not calls) you can move the V_C_E to the RHS
> and should be fine.

Well, I  have tried  to remove the  function and  have sra_modify_expr
handle this  particular case only  to discover another one  where it's
probably required  (given that  gimple verification is  correct).  The
new  problem  is  that  I  cannot   put  a  V_C_E  as  an  operand  of
GIMPLE_RETURN.   Specifically  I got  an  "invalid  operand in  return
statement" failure when verifying:

return VIEW_CONVERT_EXPR<struct gnat__perfect_hash_generators__key_type>(SR.2108_12);

Thus I will probably keep the function to be always safe but change it
to the following:

/* Substitute into *EXPR an expression of type TYPE with the value of the
   replacement of ACCESS.  This is done either by producing a special V_C_E
   assignment statement converting the replacement to a new temporary of the
   requested type if TYPE satisfies is_gimple_reg_type or by going
   through the base aggregate if it is not.  */

static void
fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
				 gimple_stmt_iterator *gsi, bool write)
{
  tree repl = get_access_replacement (access);

  if (is_gimple_reg_type (type))
    {
      tree tmp = create_tmp_var (type, "SRvce");

      add_referenced_var (tmp);
      tmp = make_ssa_name (tmp, NULL);

      if (write)
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);

	  *expr = tmp;
	  SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
	  stmt = gimple_build_assign (repl, conv);
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);

	  stmt = gimple_build_assign (tmp, conv);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  SSA_NAME_DEF_STMT (tmp) = stmt;
	  *expr = tmp;
	  update_stmt (stmt);
	}
    }
  else
    {
      if (write)
	{
	  gimple stmt;

	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;

	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}
    }
}
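
So for a read (write being false) of a register TYPE this produces
something like

  SRvce.1_3 = VIEW_CONVERT_EXPR<type>(SR.2_1);

(with invented SSA names) and substitutes SRvce.1_3 into the original
statement, while the non-register path loads/stores through the
original aggregate reference (access->expr) instead.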



> I don't understand the TREE_ADDRESSABLE thingy yet.  Especially why
> the type should be a non-register type.

I guess it does not matter now, but in your example above type would be
struct X, not int.
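
I.e., repeating your example, for

  struct X { int i; } x;
  x = foo ();

the type at the call LHS is struct X, so the non-register branch above
has to go through the aggregate:

  x = foo ();
  SR_1 = x.i;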

> > It certainly passes bootstrap and testing, I use --enable-checking=yes.
> 
> That's good.
> 
> > > >     {
> > > >       /* I have never seen this code path trigger but if it can happen the
> > > > 	 following should handle it gracefully.  */
> > > 
> > > It can trigger for vector constants.
> > 
> > OK, I'll remove the comment.  Apparently there are none in the
> > testsuite, I believe I tested with a gcc_unreachable here.
> 
> Err, for vector constants we have VECTOR_CST, so it triggers for
> non-constant vector constructors like
> 
>  vector int x = { a, b, c, d };

I see.
 
> > > >   update_stmt (aux_stmt);
> > > >   new_stmt = gimple_build_assign (get_access_replacement (access),
> > > > 				  fold_build2 (COMPLEX_EXPR, access->type,
> > > > 					       rp, ip));
> > > >   gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
> > > >   update_stmt (new_stmt);
> > > 
> > > Hm.  So you do what complex lowering does here.  Note that this may
> > > create loads from uninitialized memory with all its problems.
> > 
> > Yes,  but I  have not  had any  such problems  with complex  types (as
> > opposed to  simple loads from half-initialized  records, for example).
> > OTOH, I have also contemplated setting DECL_GIMPLE_REG_P to zero for
> > complex replacements which appear in an IMAGPART_EXPR or REALPART_EXPR
> > on the LHS of a statement.
> 
> Yes, that's necessary.  I still think SRA should not bother about
> this at all ;)
> 
> > > WRT the complex stuff.  If you would do scalarization and analysis
> > > just on the components (not special case REAL/IMAGPART_EXPR everywhere)
> > > it should work better, correct?  You still could handle group
> > > scalarization for the case of for example passing a complex argument
> > > to a function.
> > 
> > Well, my reasoning was that if complex types were first-class citizens
> > in gimple  (as opposed to a record),  there was a reason  to keep them
> > together  and  so  I  attempted   that.   But  again,  if  that  is  a
> > misconception of mine and there  is no point in keeping them together,
> > I will gladly remove this.
> 
> It's not clear.  Complex lowering decomposes all complex variables
> to components if possible.  Again, simplifying SRA is probably better.

OK, I will find out how much this would actually simplify things and
what new problems might arise.  

I have already tried relaxing the access tree analysis so that it does
not  prevent scalarization  of  subaccesses of  scalar accesses  which
would be necessary for decomposing complex components and uncovered an
unrelated problem, again in my favorite testcase entry_4.f90.

Can you please check whether the following snippet is a valid gimple?
The expander ICEs on an assert when trying to crunch the V_C_E at the
end.  Looking at it myself, I start to doubt that I can always handle
union type-punning with V_C_Es.

----------------------------------------------------------------------
master.2.f3 (integer(kind=8) __entry, integer(kind=4) * b, integer(kind=4) * a)
{
  real(kind=4) __result_master.2.f3$r;
  union munion.2.f3 __result_master.2.f3;
  union munion.2.f3 D.1641;
  complex(kind=4) D.1640;
  real(kind=4) D.1639;
  integer(kind=4) D.1638;
  logical(kind=4) D.1637;
  integer(kind=4) D.1636;
  real(kind=4) D.1635;
  integer(kind=4) D.1634;
  integer(kind=4) D.1633;

<bb 2>:
  switch (__entry_1(D)) <default: <L3>, case 0: <L3>, case 1: <L10>, case 2: <L9>>

<L3>:
  D.1633_3 = *a_2(D);
  D.1634_4 = D.1633_3 + 15;
  D.1635_5 = (real(kind=4)) D.1634_4;
  __result_master.2.f3$r_15 = D.1635_5;
  goto <bb 6>;

<L10>:
  D.1636_7 = *b_6(D);
  D.1637_8 = D.1636_7 == 42;
  __result_master.2.f3$r_19 = VIEW_CONVERT_EXPR<real(kind=4)>(D.1637_8);
----------------------------------------------------------------------

> 
> > > void bar(_Complex float);
> > > void foo(float x, float y)
> > > {
> > >   _Complex float z = x;
> > >   __imag z = y;
> > >   bar(z);
> > > }
> > > 
> > > The same applies for vectors - the REAL/IMAGPART_EXPRs equivalent
> > > there is BIT_FIELD_REF.
> > 
> > These are handled  by setting DECL_GIMPLE_REG_P to zero  if a B_F_R is
> > on a LHS.  I believe the current SRA does the same.  It works fine and
> > there's a lot less fuss about them.
> >  
> > > >   return SRA_SA_PROCESSED;
> > > > }
> > > > 
> > > > /* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */
> > > > 
> > > > static bool
> > > > contains_view_convert_expr_p (tree t)
> > > > {
> > > >   while (1)
> > > >     {
> > > >       if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
> > > > 	return true;
> > > >       if (!handled_component_p (t))
> > > > 	return false;
> > > >       t = TREE_OPERAND (t, 0);
> > > >     }
> > > > }
> > > 
> > > Place this in tree-flow-inline.h next to ref_contains_array_ref, also
> > > structure the loop in the same way.
> > 
> > OK,  but I'd like  the function  to work  if passed  declarations too.
> > Thus I cannot really use a  do-while loop.  I'll send it in a separate
> > patch.
> > 
> > > > /* Change STMT to assign compatible types by means of adding component or array
> > > >    references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning
> > > >    as variables with the same names in sra_modify_assign.  This is done in
> > > >    such a complicated way in order to make
> > > >    testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy and so it helps in at least some
> > > >    cases.  */
> > > > 
> > > > static void
> > > > fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
> > > > 				   struct access *lacc, struct access *racc,
> > > > 				   tree lhs, tree *rhs, tree ltype, tree rtype)
> > > > {
> > > >   if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
> > > >       && !access_has_children_p (lacc))
> > > >     {
> > > >       tree expr = unshare_expr (lhs);
> > > >       bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
> > > > 					 false);
> > > >       if (found)
> > > > 	{
> > > > 	  gimple_assign_set_lhs (*stmt, expr);
> > > > 	  return;
> > > > 	}
> > > >     }
> > > > 
> > > >   if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
> > > >       && !access_has_children_p (racc))
> > > >     {
> > > >       tree expr = unshare_expr (*rhs);
> > > >       bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
> > > > 					 false);
> > > >       if (found)
> > > > 	{
> > > > 	  gimple_assign_set_rhs1 (*stmt, expr);
> > > > 	  return;
> > > > 	}
> > > >     }
> > > > 
> > > >   *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
> > > >   gimple_assign_set_rhs_from_tree (gsi, *rhs);
> > > >   *stmt = gsi_stmt (*gsi);
> > > 
> > > Reading this I have a deja-vu - isn't there another function in this
> > > file doing the same thing?  You are doing much unsharing even though
> > > you re-build the access tree from scratch?
> > 
> > This function has a similar purpose as fix_incompatible_types_for_expr
> > but this time  only for assign statements.  That  is easier because we
> > can always put the  V_C_E on the RHS and be safe  and so no additional
> > statements need to be generated.
> > 
> > However,  the V_C_Es  rather than  COMPONENT_REFs and  ARRAY_REFs feel
> > unnatural for  accessing fields from  single field records  and unions
> > and single  element arrays.  According to  the comment I  used to have
> > problems of some sort with that  in the ssa-sra-2.C testcase but I can
> > no longer reproduce them (and don't remember them).
> > 
> > I  call  unshare_expr  in  this  context  only when  one  side  of  an
> > assignment statement is  a scalar replacement and the  other one is an
> > aggregate (but not necessarily a declaration) which can happen only in
> > the cases listed  above.  That is not very many  calls and chances are
> > good that build_ref_for_offset succeeds.
> > 
> > Does that explain what is going on here?
> 
> Yes.  I guess merging the two functions would make sense to me
> (or putting them next to each other at least).  The function signatures
> also look seemingly weird (that you store into *rhs but still set
> the stmt rhs, etc.).  I'll have another look here with the new version.

*rhs is a local variable of the caller, not a pointer in the
statement.  (It has only one callsite and thus this double indirection
will be inlined away).  I tried using gimple_assign_set_rhs1 instead
but it ICEs when passed a V_C_E.

The two functions have similar goal but work differently, in
particular fix_modified_assign_compatibility never generates new
statements while fix_incompatible_types_for_expr always does.  So I
don't think it makes sense to merge them.  But I can move this one
upward, sure.

> > > > }
> > > > 
> > > > /* Callback of scan_function to process assign statements.  It examines both
> > > >    sides of the statement, replaces them with a scalar replacement if there is
> > > >    one and generates copying of replacements if scalarized aggregates have been
> > > >    used in the assignment.  STMT is a pointer to the assign statement, GSI is
> > > >    used to hold generated statements for type conversions and subtree
> > > >    copying.  */
> > > > 
> > > > static enum scan_assign_result
> > > > sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
> > > > 		   void *data ATTRIBUTE_UNUSED)
> > > > {
> > > >   struct access *lacc, *racc;
> > > >   tree ltype, rtype;
> > > >   tree lhs, rhs;
> > > >   bool modify_this_stmt;
> > > > 
> > > >   if (gimple_assign_rhs2 (*stmt))
> > > 
> > > !gimple_assign_single_p (*stmt)
> > > 
> > > (the only gimple assign that may access memory)
> > 
> > OK
> > 
> > > 
> > > >     return SRA_SA_NONE;
> > > >   lhs = gimple_assign_lhs (*stmt);
> > > >   rhs = gimple_assign_rhs1 (*stmt);
> > > > 
> > > >   if (TREE_CODE (rhs) == CONSTRUCTOR)
> > > >     return sra_modify_constructor_assign (stmt, gsi);
> > > > 
> > > >   if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
> > > >     return sra_modify_partially_complex_lhs (*stmt, gsi);
> > > > 
> > > >   if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
> > > >       || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
> > > >     {
> > > >       modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
> > > > 					  gsi, false, data);
> > > >       modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
> > > > 					   gsi, true, data);
> > > >       return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
> > > >     }
> > > > 
> > > >   lacc = get_access_for_expr (lhs);
> > > >   racc = get_access_for_expr (rhs);
> > > >   if (!lacc && !racc)
> > > >     return SRA_SA_NONE;
> > > > 
> > > >   modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
> > > > 		      || (racc && racc->grp_to_be_replaced));
> > > > 
> > > >   if (lacc && lacc->grp_to_be_replaced)
> > > >     {
> > > >       lhs = get_access_replacement (lacc);
> > > >       gimple_assign_set_lhs (*stmt, lhs);
> > > >       ltype = lacc->type;
> > > >     }
> > > >   else
> > > >     ltype = TREE_TYPE (lhs);
> > > > 
> > > >   if (racc && racc->grp_to_be_replaced)
> > > >     {
> > > >       rhs = get_access_replacement (racc);
> > > >       gimple_assign_set_rhs1 (*stmt, rhs);
> > > >       rtype = racc->type;
> > > >     }
> > > >   else
> > > >     rtype = TREE_TYPE (rhs);
> > > > 
> > > >   /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate
> > > >      the statement makes the position of this pop_stmt_changes() a bit awkward
> > > >      but hopefully make some sense.  */
> > > 
> > > I don't see pop_stmt_changes().
> > 
> > Yeah, the comment is outdated. I've removed it.
> >  
> > > >   if (modify_this_stmt)
> > > >     {
> > > >       if (!useless_type_conversion_p (ltype, rtype))
> > > > 	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
> > > > 					   lhs, &rhs, ltype, rtype);
> > > >     }
> > > > 
> > > >   if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
> > > >       || (access_has_children_p (racc)
> > > > 	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
> > > >       || (access_has_children_p (lacc)
> > > > 	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
> > > 
> > > ?  A comment is missing what this case is about ...
> > > 
> > > (this smells like fixup that could be avoided by doing things correct
> > > in the first place)
> > 
> > From this  point on,  the function deals  with assignments  in between
> > aggregates  when at least  one has  scalar reductions  of some  of its
> > components.  There are three possible scenarios: 1) both the LHS and
> > RHS have to-be-scalarized components, 2) only the RHS has, or 3) only
> > the LHS has.
> > 
> > In the first  case, we would like to load the  LHS components from RHS
> > components whenever possible.  If that  is not possible, we would like
> > to read it  directly from the RHS (after updating it  by storing in it
> > its own components).  If there are some necessary unscalarized data in
> > the  LHS, those will  be loaded  by the  original assignment  too.  If
> > neither of these cases happen,  the original statement can be removed.
> > Most of this is done by load_assign_lhs_subreplacements.
> > 
> > In  the  second  case, we  would  like  to  store all  RHS  scalarized
> > components  directly  into  LHS   and  if  they  cover  the  aggregate
> > completely, remove the statement too.   In the third case, we want the
> > LHS components to be loaded directly from the RHS (DSE will remove the
> > original statement if it becomes redundant).
> > 
> > This is a bit complex but  manageable when types match and when unions
> > do not cause confusion in a way that we cannot really load a component
> > of LHS from the RHS or  vice versa (the access representing this level
> > can  have subaccesses  that are  accessible only  through  a different
> > union field  at a higher  level - different  from the one used  in the
> > examined expression).  Unions are fun.
> > 
> > Therefore, I specially handle a fourth case, happening when there is a
> > specific  type  cast  or  it  is impossible  to  locate  a  scalarized
> > subaccess on  the other  side of the  expression.  If that  happens, I
> > simply "refresh"  the RHS  by storing in  it is  scalarized components
> > leave the original statement there to do the copying and then load the
> > scalar replacements of the LHS.  This is what the first branch does.
> > 
> > Is it  clearer now?  Perhaps I  should put these five  paragraphs as a
> > comment into the function?
> 
> Yes - that would be nice.
> 

I have  just started a bootstrap  of my latest  version (still keeping
complex  numbers  together).  If  it  passes,  I'll  mail it  tomorrow
(otherwise, I'll have to wait for a couple of days).

Thanks again,

Martin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-05-10 10:39       ` Martin Jambor
@ 2009-05-12  9:49         ` Martin Jambor
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Jambor @ 2009-05-12  9:49 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches, Jan Hubicka

Hi,

this  is  a slightly  updated  version  of  the new  tree-sra.c  file,
including changes that I did yesterday on a train and just before I
went to bed (so no huge changes but a few more-or-less simple cleanups
and clarifications).

I believe this should address all your concerns except for complex
component decoupling - I will experiment with that early next week.

Please have a look at my previous mail too
(http://gcc.gnu.org/ml/gcc-patches/2009-05/msg00594.html)

Thanks a lot,

Martin



/* Scalar Replacement of Aggregates (SRA) converts some structure
   references into scalar references, exposing them to the scalar
   optimizers.
   Copyright (C) 2008, 2009 Free Software Foundation, Inc.
   Contributed by Martin Jambor <mjambor@suse.cz>

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* This file implements Scalar Reduction of Aggregates (SRA).  SRA is run
   twice, once in the early stages of compilation (early SRA) and once in the
   late stages (late SRA).  The aim of both is to turn references to scalar
   parts of aggregates into uses of independent scalar variables.

   The two passes are nearly identical; the only difference is that early SRA
   does not scalarize unions which are used as the result in a GIMPLE_RETURN
   statement because together with inlining this can lead to weird type
   conversions.

   Both passes operate in four stages:

   1. The declarations that have properties which make them candidates for
      scalarization are identified in function find_var_candidates().  The
      candidates are stored in candidate_bitmap.

   2. The function body is scanned.  In the process, declarations which are
      used in a manner that prevents their scalarization are removed from the
      candidate bitmap.  More importantly, for every access into an aggregate,
      an access structure (struct access) is created by create_access() and
      stored in a vector associated with the aggregate.  Among other
      information, the aggregate declaration, the offset and size of the access
      and its type are stored in the structure.

      On a related note, assign_link structures are created for every assign
      statement between candidate aggregates and attached to the related
      accesses.

   3. The vectors of accesses are analyzed.  They are first sorted according to
      their offset and size and then scanned for partially overlapping accesses
      (i.e. those which overlap but one is not entirely within another).  Such
      an access disqualifies the whole aggregate from being scalarized.

      If there is no such inhibiting overlap, a representative access structure
      is chosen for every unique combination of offset and size.  Afterwards,
      the pass builds a set of trees from these structures, in which children
      of an access are within their parent (in terms of offset and size).

      Then accesses  are propagated  whenever possible (i.e.  in cases  when it
      does not create a partially overlapping access) across assign_links from
      the right hand side to the left hand side.

      Then the set of trees for each declaration is traversed again and those
      accesses which should be replaced by a scalar are identified.

   4. The function is traversed again, and for every reference into an
      aggregate that has some component which is about to be scalarized,
      statements are amended and new statements are created as necessary.
      Finally, if a parameter got scalarized, the scalar replacements are
      initialized with values from respective parameter aggregates.  */
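
/* As a hypothetical illustration (not taken from any particular testcase),
   given

     struct S { int i; float f; } s;
     s.i = 1;
     s.f = 2.0f;
     use_int (s.i);
     use_float (s.f);

   the pass is intended to create an independent replacement variable for each
   of s.i and s.f and to rewrite the component references to use them, thereby
   exposing the values to the scalar optimizers.  */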

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "alloc-pool.h"
#include "tm.h"
#include "tree.h"
#include "gimple.h"
#include "tree-flow.h"
#include "diagnostic.h"
#include "tree-dump.h"
#include "timevar.h"
#include "params.h"
#include "target.h"
#include "flags.h"

/* Enumeration of all aggregate reductions we can do.  */
enum sra_mode { SRA_MODE_EARLY_INTRA, /* early intraprocedural SRA */
		SRA_MODE_INTRA };	     /* late intraprocedural SRA */

/* Global variable describing which aggregate reduction we are performing at
   the moment.  */
static enum sra_mode sra_mode;

struct assign_link;

/* ACCESS represents each access to an aggregate variable (as a whole or a
   part).  It can also represent a group of accesses that refer to exactly the
   same fragment of an aggregate (i.e. those that have exactly the same offset
   and size).  Such representatives for a single aggregate, once determined,
   are linked in a linked list and have the group fields set.

   Moreover, when doing intraprocedural SRA, a tree is built from those
   representatives (by means of first_child and next_sibling pointers), in
   which all items in a subtree are "within" the root, i.e. their offset is
   greater than or equal to the offset of the root and their offset+size is
   smaller than or equal to the offset+size of the root.  Children of an
   access are sorted by offset.

   Note that accesses to parts of vector and complex number types are always
   represented by an access to the whole vector or complex number.  It is the
   duty of the modifying functions to replace them appropriately.  */
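
/* For example (a made-up case assuming 32-bit ints), for

     struct In { int a; int b; };
     struct Out { struct In in; int c; } out;

   with accesses to out.in, out.in.a, out.in.b and out.c, the representative
   of out.in <offset 0, size 64> has first_child out.in.a <0, 32> whose
   next_sibling is out.in.b <32, 32>, while the representative of out.c
   <64, 32> is a separate tree root reached through the next_grp field.  */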

struct access
{
  /* Values returned by `get_ref_base_and_extent' for each component reference.
     If EXPR isn't a component reference, just set `BASE = EXPR', `OFFSET = 0',
     `SIZE = TREE_SIZE (TREE_TYPE (expr))'.  */
  HOST_WIDE_INT offset;
  HOST_WIDE_INT size;
  tree base;

  /* Expression.  */
  tree expr;
  /* Type.  */
  tree type;

  /* Next group representative for this aggregate. */
  struct access *next_grp;

  /* Pointer to the group representative.  Pointer to itself if the struct is
     the representative.  */
  struct access *group_representative;

  /* If this access has any children (in terms of the definition above), this
     points to the first one.  */
  struct access *first_child;

  /* Pointer to the next sibling in the access tree as described above.  */
  struct access *next_sibling;

  /* Pointers to the first and last element in the linked list of assign
     links.  */
  struct assign_link *first_link, *last_link;

  /* Pointer to the next access in the work queue.  */
  struct access *next_queued;

  /* Replacement variable for this access "region."  Never to be accessed
     directly, always only by the means of get_access_replacement() and only
     when grp_to_be_replaced flag is set.  */
  tree replacement_decl;

  /* Is this particular access a write access?  */
  unsigned write : 1;

  /* Is this access currently in the work queue?  */
  unsigned grp_queued : 1;
  /* Does this group contain a write access?  This flag is propagated down the
     access tree.  */
  unsigned grp_write : 1;
  /* Does this group contain a read access?  This flag is propagated down the
     access tree.  */
  unsigned grp_read : 1;
  /* Is the subtree rooted in this access fully covered by scalar
     replacements?  */
  unsigned grp_covered : 1;
  /* If set to true, this access and all below it in an access tree must not be
     scalarized.  */
  unsigned grp_unscalarizable_region : 1;
  /* Whether data have been written to parts of the aggregate covered by this
     access which is not to be scalarized.  This flag is propagated up in the
     access tree.  */
  unsigned grp_unscalarized_data : 1;
  /* Does this access and/or group contain a write access through a
     BIT_FIELD_REF?  */
  unsigned grp_bfr_lhs : 1;

  /* Set when a scalar replacement should be created for this variable.  We
     make the decision and create the replacement at different places because
     create_tmp_var cannot be called from within FOR_EACH_REFERENCED_VAR.  */
  unsigned grp_to_be_replaced : 1;
};

typedef struct access *access_p;

DEF_VEC_P (access_p);
DEF_VEC_ALLOC_P (access_p, heap);

/* Alloc pool for allocating access structures.  */
static alloc_pool access_pool;

/* A structure linking lhs and rhs accesses from an aggregate assignment.  They
   are used to propagate subaccesses from rhs to lhs as long as they don't
   conflict with what is already there.  */
struct assign_link
{
  struct access *lacc, *racc;
  struct assign_link *next;
};

/* Alloc pool for allocating assign link structures.  */
static alloc_pool link_pool;

/* Base (tree) -> Vector (VEC(access_p,heap) *) map.  */
static struct pointer_map_t *base_access_vec;

/* Bitmap of bases (candidates).  */
static bitmap candidate_bitmap;
/* Bitmap of declarations used in a return statement.  */
static bitmap retvals_bitmap;
/* Obstack for creation of fancy names.  */
static struct obstack name_obstack;

/* Head of a linked list of accesses that need to have their subaccesses
   propagated to their assignment counterparts.  */
static struct access *work_queue_head;

/* Dump contents of ACCESS to file F in a human friendly way.  If GRP is true,
   representative fields are dumped, otherwise those which only describe the
   individual access are.  */

static void
dump_access (FILE *f, struct access *access, bool grp)
{
  fprintf (f, "access { ");
  fprintf (f, "base = (%d)'", DECL_UID (access->base));
  print_generic_expr (f, access->base, 0);
  fprintf (f, "', offset = " HOST_WIDE_INT_PRINT_DEC, access->offset);
  fprintf (f, ", size = " HOST_WIDE_INT_PRINT_DEC, access->size);
  fprintf (f, ", expr = ");
  print_generic_expr (f, access->expr, 0);
  fprintf (f, ", type = ");
  print_generic_expr (f, access->type, 0);
  if (grp)
    fprintf (f, ", grp_write = %d, grp_read = %d, grp_covered = %d, "
	     "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
	     "grp_to_be_replaced = %d\n",
	     access->grp_write, access->grp_read, access->grp_covered,
	     access->grp_unscalarizable_region, access->grp_unscalarized_data,
	     access->grp_to_be_replaced);
  else
    fprintf (f, ", write = %d'\n", access->write);
}

/* Dump a subtree rooted in ACCESS to file F, indent by LEVEL.  */

static void
dump_access_tree_1 (FILE *f, struct access *access, int level)
{
  do
    {
      int i;

      for (i = 0; i < level; i++)
	fputs ("* ", dump_file);

      dump_access (f, access, true);

      if (access->first_child)
	dump_access_tree_1 (f, access->first_child, level + 1);

      access = access->next_sibling;
    }
  while (access);
}

/* Dump all access trees for a variable, given the pointer to the first root in
   ACCESS.  */

static void
dump_access_tree (FILE *f, struct access *access)
{
  for (; access; access = access->next_grp)
    dump_access_tree_1 (f, access, 0);
}

/* Return true iff ACC is non-NULL and has subaccesses.  */

static inline bool
access_has_children_p (struct access *acc)
{
  return acc && acc->first_child;
}

/* Return a vector of pointers to accesses for the variable given in BASE or
   NULL if there is none.  */

static VEC (access_p, heap) *
get_base_access_vector (tree base)
{
  void **slot;

  slot = pointer_map_contains (base_access_vec, base);
  if (!slot)
    return NULL;
  else
    return *(VEC (access_p, heap) **) slot;
}

/* Find an access with required OFFSET and SIZE in a subtree of accesses rooted
   in ACCESS.  Return NULL if it cannot be found.  */

static struct access *
find_access_in_subtree (struct access *access, HOST_WIDE_INT offset,
			HOST_WIDE_INT size)
{
  while (access && (access->offset != offset || access->size != size))
    {
      struct access *child = access->first_child;

      while (child && (child->offset + child->size <= offset))
	child = child->next_sibling;
      access = child;
    }

  return access;
}

/* Return the first group representative for BASE or NULL if none exists.  */

static struct access *
get_first_repr_for_decl (tree base)
{
  VEC (access_p, heap) *access_vec;

  access_vec = get_base_access_vector (base);
  if (!access_vec)
    return NULL;

  return VEC_index (access_p, access_vec, 0);
}

/* Find an access representative for the variable BASE and given OFFSET and
   SIZE.  Requires that access trees have already been built.  Return NULL if
   it cannot be found.  */

static struct access *
get_var_base_offset_size_access (tree base, HOST_WIDE_INT offset,
				 HOST_WIDE_INT size)
{
  struct access *access;

  access = get_first_repr_for_decl (base);
  while (access && (access->offset + access->size <= offset))
    access = access->next_grp;
  if (!access)
    return NULL;

  return find_access_in_subtree (access, offset, size);
}

/* Add LINK to the linked list of assign links of RACC.  */
static void
add_link_to_rhs (struct access *racc, struct assign_link *link)
{
  gcc_assert (link->racc == racc);

  if (!racc->first_link)
    {
      gcc_assert (!racc->last_link);
      racc->first_link = link;
    }
  else
    racc->last_link->next = link;

  racc->last_link = link;
  link->next = NULL;
}

/* Move all link structures from the linked list in OLD_RACC to the linked list
   in NEW_RACC.  */
static void
relink_to_new_repr (struct access *new_racc, struct access *old_racc)
{
  if (!old_racc->first_link)
    {
      gcc_assert (!old_racc->last_link);
      return;
    }

  if (new_racc->first_link)
    {
      gcc_assert (!new_racc->last_link->next);
      gcc_assert (!old_racc->last_link || !old_racc->last_link->next);

      new_racc->last_link->next = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  else
    {
      gcc_assert (!new_racc->last_link);

      new_racc->first_link = old_racc->first_link;
      new_racc->last_link = old_racc->last_link;
    }
  old_racc->first_link = old_racc->last_link = NULL;
}

/* Add ACCESS to the work queue (which is actually a stack).  */

static void
add_access_to_work_queue (struct access *access)
{
  if (!access->grp_queued)
    {
      gcc_assert (!access->next_queued);
      access->next_queued = work_queue_head;
      access->grp_queued = 1;
      work_queue_head = access;
    }
}

/* Pop an access from the work queue, and return it, assuming there is one.  */

static struct access *
pop_access_from_work_queue (void)
{
  struct access *access = work_queue_head;

  work_queue_head = access->next_queued;
  access->next_queued = NULL;
  access->grp_queued = 0;
  return access;
}


/* Allocate necessary structures.  */

static void
sra_initialize (void)
{
  candidate_bitmap = BITMAP_ALLOC (NULL);
  retvals_bitmap = BITMAP_ALLOC (NULL);
  gcc_obstack_init (&name_obstack);
  access_pool = create_alloc_pool ("SRA accesses", sizeof (struct access), 16);
  link_pool = create_alloc_pool ("SRA links", sizeof (struct assign_link), 16);
  base_access_vec = pointer_map_create ();
}

/* Hook fed to pointer_map_traverse, deallocate stored vectors.  */

static bool
delete_base_accesses (const void *key ATTRIBUTE_UNUSED, void **value,
		     void *data ATTRIBUTE_UNUSED)
{
  VEC (access_p, heap) *access_vec;
  access_vec = (VEC (access_p, heap) *) *value;
  VEC_free (access_p, heap, access_vec);

  return true;
}

/* Deallocate all general structures.  */

static void
sra_deinitialize (void)
{
  BITMAP_FREE (candidate_bitmap);
  BITMAP_FREE (retvals_bitmap);
  free_alloc_pool (access_pool);
  free_alloc_pool (link_pool);
  obstack_free (&name_obstack, NULL);

  pointer_map_traverse (base_access_vec, delete_base_accesses, NULL);
  pointer_map_destroy (base_access_vec);
}

/* Remove DECL from candidates for SRA and write REASON to the dump file if
   there is one.  */
static void
disqualify_candidate (tree decl, const char *reason)
{
  bitmap_clear_bit (candidate_bitmap, DECL_UID (decl));

  if (dump_file && (dump_flags & TDF_DETAILS))
    {
      fprintf (dump_file, "! Disqualifying ");
      print_generic_expr (dump_file, decl, 0);
      fprintf (dump_file, " - %s\n", reason);
    }
}

/* Return true iff the type contains a field or an element which does not allow
   scalarization.  */

static bool
type_internals_preclude_sra_p (tree type)
{
  tree fld;
  tree et;

  switch (TREE_CODE (type))
    {
    case RECORD_TYPE:
    case UNION_TYPE:
    case QUAL_UNION_TYPE:
      for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	if (TREE_CODE (fld) == FIELD_DECL)
	  {
	    tree ft = TREE_TYPE (fld);

	    if (TREE_THIS_VOLATILE (fld)
		|| !DECL_FIELD_OFFSET (fld) || !DECL_SIZE (fld)
		|| !host_integerp (DECL_FIELD_OFFSET (fld), 1)
		|| !host_integerp (DECL_SIZE (fld), 1))
	      return true;

	    if (AGGREGATE_TYPE_P (ft)
		&& type_internals_preclude_sra_p (ft))
	      return true;
	  }

      return false;

    case ARRAY_TYPE:
      et = TREE_TYPE (type);

      if (AGGREGATE_TYPE_P (et))
	return type_internals_preclude_sra_p (et);
      else
	return false;

    default:
      return false;
    }
}

/* Create and insert an access for EXPR.  Return the created access, or NULL
   if it is not possible.  */

static struct access *
create_access (tree expr, bool write)
{
  struct access *access;
  void **slot;
  VEC (access_p,heap) *vec;
  HOST_WIDE_INT offset, size, max_size;
  tree base = expr;
  bool unscalarizable_region = false;

  base = get_ref_base_and_extent (expr, &offset, &size, &max_size);

  /* !!! Assert for testing only, remove after some time.  */
  gcc_assert (base);
  if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  if (size != max_size)
    {
      size = max_size;
      unscalarizable_region = true;
    }

  if (size < 0)
    {
      disqualify_candidate (base, "Encountered an unconstrained access.");
      return NULL;
    }

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));

  access->base = base;
  access->offset = offset;
  access->size = size;
  access->expr = expr;
  access->type = TREE_TYPE (expr);
  access->write = write;
  access->grp_unscalarizable_region = unscalarizable_region;

  slot = pointer_map_contains (base_access_vec, base);
  if (slot)
    vec = (VEC (access_p, heap) *) *slot;
  else
    vec = VEC_alloc (access_p, heap, 32);

  VEC_safe_push (access_p, heap, vec, access);

  *((struct VEC (access_p,heap) **)
	pointer_map_insert (base_access_vec, base)) = vec;

  return access;
}
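
/* For instance (continuing the made-up struct Out example above), for the
   out.in.b reference get_ref_base_and_extent returns the declaration of out
   as the base with offset 32 and size 32.  For a reference with a variable
   array index the returned size and max_size differ, which is what makes the
   code above mark the whole region as unscalarizable.  */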


/* Search the given tree T for a declaration by skipping handled components
   and exclude it from the candidates.  */

static void
disqualify_base_of_expr (tree t, const char *reason)
{
  while (handled_component_p (t))
    t = TREE_OPERAND (t, 0);

  if (DECL_P (t))
    disqualify_candidate (t, reason);
}

/* Scan expression EXPR and create access structures for all accesses to
   candidates for scalarization.  Return the created access or NULL if none is
   created.  */

static struct access *
build_access_from_expr_1 (tree *expr_ptr, bool write)
{
  struct access *ret = NULL;
  tree expr = *expr_ptr;
  bool bit_ref;

  if (TREE_CODE (expr) == BIT_FIELD_REF)
    {
      expr = TREE_OPERAND (expr, 0);
      bit_ref = true;
    }
  else
    bit_ref = false;

  /* We need to dive through V_C_Es in order to get the size of their operand
     and not the result type.  Ada produces such statements.  We are also
     capable of handling the topmost V_C_E but not any of those buried in other
     handled components.  */
  if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    expr = TREE_OPERAND (expr, 0);

  if (contains_view_convert_expr_p (expr))
    {
      disqualify_base_of_expr (expr, "V_C_E under a different handled "
			       "component.");
      return NULL;
    }

  switch (TREE_CODE (expr))
    {
    case VAR_DECL:
    case PARM_DECL:
    case RESULT_DECL:
    case COMPONENT_REF:
    case ARRAY_REF:
    case ARRAY_RANGE_REF:
      ret = create_access (expr, write);
      break;

    case REALPART_EXPR:
    case IMAGPART_EXPR:
      expr = TREE_OPERAND (expr, 0);
      ret = create_access (expr, write);
      break;

    default:
      break;
    }

  if (write && bit_ref && ret)
    ret->grp_bfr_lhs = 1;

  return ret;
}

/* Callback of scan_function.  Scan expression EXPR and create access
   structures for all accesses to candidates for scalarization.  Return true if
   any access has been inserted.  */

static bool
build_access_from_expr (tree *expr_ptr,
			gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED, bool write,
			void *data ATTRIBUTE_UNUSED)
{
  return build_access_from_expr_1 (expr_ptr, write) != NULL;
}

/* Disqualify LHS and RHS for scalarization if STMT must end its basic block in
   modes in which it matters; return true iff they have been disqualified.  RHS
   may be NULL, in which case it is ignored.  If we scalarize an aggregate in
   intra-SRA we may need to add statements after each statement.  This is not
   possible if a statement unconditionally has to end the basic block.  */
static bool
disqualify_ops_if_throwing_stmt (gimple stmt, tree lhs, tree rhs)
{
  if (stmt_can_throw_internal (stmt) || stmt_ends_bb_p (stmt))
    {
      disqualify_base_of_expr (lhs, "LHS of a throwing stmt.");
      if (rhs)
	disqualify_base_of_expr (rhs, "RHS of a throwing stmt.");
      return true;
    }
  return false;
}


/* Result code for scan_assign callback for scan_function.  */
enum scan_assign_result { SRA_SA_NONE,       /* nothing done for the stmt */
			  SRA_SA_PROCESSED,  /* stmt analyzed/changed */
			  SRA_SA_REMOVED };  /* stmt redundant and eliminated */


/* Callback of scan_function.  Scan expressions occurring in the statement
   pointed to by STMT_PTR, create access structures for all accesses to
   candidates for scalarization and remove those candidates which occur in
   statements or expressions that prevent them from being split apart.  Return
   true if any access has been inserted.  */

static enum scan_assign_result
build_accesses_from_assign (gimple *stmt_ptr,
			    gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
			    void *data ATTRIBUTE_UNUSED)
{
  gimple stmt = *stmt_ptr;
  tree *lhs_ptr, *rhs_ptr;
  struct access *lacc, *racc;

  if (!gimple_assign_single_p (stmt))
    return SRA_SA_NONE;

  lhs_ptr = gimple_assign_lhs_ptr (stmt);
  rhs_ptr = gimple_assign_rhs1_ptr (stmt);

  if (disqualify_ops_if_throwing_stmt (stmt, *lhs_ptr, *rhs_ptr))
    return SRA_SA_NONE;

  racc = build_access_from_expr_1 (rhs_ptr, false);
  lacc = build_access_from_expr_1 (lhs_ptr, true);

  if (lacc && racc
      && !lacc->grp_unscalarizable_region
      && !racc->grp_unscalarizable_region
      && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
      /* FIXME: Turn the following line into an assert after PR 40058 is
	 fixed.  */
      && lacc->size == racc->size
      && useless_type_conversion_p (lacc->type, racc->type))
    {
      struct assign_link *link;

      link = (struct assign_link *) pool_alloc (link_pool);
      memset (link, 0, sizeof (struct assign_link));

      link->lacc = lacc;
      link->racc = racc;

      add_link_to_rhs (racc, link);
    }

  return (lacc || racc) ? SRA_SA_PROCESSED : SRA_SA_NONE;
}

/* Callback of walk_stmt_load_store_addr_ops visit_addr used to determine
   GIMPLE_ASM operands with memory constraints which cannot be scalarized.  */

static bool
asm_visit_addr (gimple stmt ATTRIBUTE_UNUSED, tree op,
		void *data ATTRIBUTE_UNUSED)
{
  if (DECL_P (op))
    disqualify_candidate (op, "Non-scalarizable GIMPLE_ASM operand.");

  return false;
}


/* Scan the function body and look for interesting statements.  Return true if
   any has been found or processed, as indicated by the callbacks.  SCAN_EXPR
   is a callback called on all expressions within statements except assign
   statements and those deemed entirely unsuitable for some reason (all
   operands of such statements and expressions are removed from
   candidate_bitmap).  SCAN_ASSIGN is a callback called on all assign
   statements.  HANDLE_SSA_DEFS is a callback called on assign statements and
   those call statements which have a LHS; it is the only callback which can
   be NULL.  ANALYSIS_STAGE is true when running in the analysis stage of a
   pass and thus no statement is being modified.  DATA is a pointer passed to
   all callbacks.  If any single callback returns true, this function also
   returns true, otherwise it returns false.  */

static bool
scan_function (bool (*scan_expr) (tree *, gimple_stmt_iterator *, bool, void *),
	       enum scan_assign_result (*scan_assign) (gimple *,
						       gimple_stmt_iterator *,
						       void *),
	       bool (*handle_ssa_defs)(gimple, void *),
	       bool analysis_stage, void *data)
{
  gimple_stmt_iterator gsi;
  basic_block bb;
  unsigned i;
  tree *t;
  bool ret = false;

  FOR_EACH_BB (bb)
    {
      bool bb_changed = false;

      gsi = gsi_start_bb (bb);
      while (!gsi_end_p (gsi))
	{
	  gimple stmt = gsi_stmt (gsi);
	  enum scan_assign_result assign_result;
	  bool any = false, deleted = false;

	  switch (gimple_code (stmt))
	    {
	    case GIMPLE_RETURN:
	      t = gimple_return_retval_ptr (stmt);
	      if (*t != NULL_TREE)
		{
		  if (DECL_P (*t))
		    {
		      tree ret_type = TREE_TYPE (*t);
		      if (analysis_stage
			  && sra_mode == SRA_MODE_EARLY_INTRA
			  && (TREE_CODE (ret_type) == UNION_TYPE
			      || TREE_CODE (ret_type) == QUAL_UNION_TYPE))
			disqualify_candidate (*t,
					      "Union in a return statement.");
		      else
			bitmap_set_bit (retvals_bitmap, DECL_UID (*t));
		    }
		  any |= scan_expr (t, &gsi, false, data);
		}
	      break;

	    case GIMPLE_ASSIGN:
	      assign_result = scan_assign (&stmt, &gsi, data);
	      any |= assign_result == SRA_SA_PROCESSED;
	      deleted = assign_result == SRA_SA_REMOVED;
	      if (handle_ssa_defs && assign_result != SRA_SA_REMOVED)
		any |= handle_ssa_defs (stmt, data);
	      break;

	    case GIMPLE_CALL:
	      /* Operands must be processed before the lhs.  */
	      for (i = 0; i < gimple_call_num_args (stmt); i++)
		{
		  tree *argp = gimple_call_arg_ptr (stmt, i);
		  any |= scan_expr (argp, &gsi, false, data);
		}

	      if (gimple_call_lhs (stmt))
		{
		  tree *lhs_ptr = gimple_call_lhs_ptr (stmt);
		  if (!analysis_stage
		      || !disqualify_ops_if_throwing_stmt (stmt, *lhs_ptr, NULL))
		    {
		      any |= scan_expr (lhs_ptr, &gsi, true, data);
		      if (handle_ssa_defs)
			any |= handle_ssa_defs (stmt, data);
		    }
		}
	      break;

	    case GIMPLE_ASM:

	      if (analysis_stage)
		walk_stmt_load_store_addr_ops (stmt, NULL, NULL, NULL,
					       asm_visit_addr);
	      for (i = 0; i < gimple_asm_ninputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_input_op (stmt, i));
		  any |= scan_expr (op, &gsi, false, data);
		}
	      for (i = 0; i < gimple_asm_noutputs (stmt); i++)
		{
		  tree *op = &TREE_VALUE (gimple_asm_output_op (stmt, i));
		  any |= scan_expr (op, &gsi, true, data);
		}
	      break;

	    default:
	      break;
	    }

	  if (any)
	    {
	      ret = true;
	      bb_changed = true;

	      if (!analysis_stage)
		{
		  update_stmt (stmt);
		  if (!stmt_could_throw_p (stmt))
		    remove_stmt_from_eh_region (stmt);
		}
	    }
	  if (deleted)
	    bb_changed = true;
	  else
	    {
	      gsi_next (&gsi);
	      ret = true;
	    }
	}
      if (!analysis_stage && bb_changed)
	gimple_purge_dead_eh_edges (bb);
    }

  return ret;
}

/* Helper function for qsort.  The array contains pointers to accesses.  An
   access is considered smaller than another if it has a smaller offset or if
   the offsets are the same but its size is bigger.  */

static int
compare_access_positions (const void *a, const void *b)
{
  const access_p *fp1 = (const access_p *) a;
  const access_p *fp2 = (const access_p *) b;
  const access_p f1 = *fp1;
  const access_p f2 = *fp2;

  if (f1->offset != f2->offset)
    return f1->offset < f2->offset ? -1 : 1;

  if (f1->size == f2->size)
    return 0;
  /* We want the bigger accesses first, thus the opposite operator in the next
     line: */
  return f1->size > f2->size ? -1 : 1;
}
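
/* E.g. (a made-up sequence) accesses with <offset, size> pairs <0, 32>,
   <32, 32> and <0, 64> sort as <0, 64> <0, 32> <32, 32>: equal offsets place
   the bigger access first, so an access enclosing others always precedes
   them, which the tree-building code below relies on.  */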


/* Append the name of declaration DECL to the name obstack.  A helper function
   for make_fancy_name.  */

static void
make_fancy_decl_name (tree decl)
{
  char buffer[32];

  tree name = DECL_NAME (decl);
  if (name)
    obstack_grow (&name_obstack, IDENTIFIER_POINTER (name),
		  IDENTIFIER_LENGTH (name));
  else
    {
      sprintf (buffer, "D%u", DECL_UID (decl));
      obstack_grow (&name_obstack, buffer, strlen (buffer));
    }
}

/* Helper for make_fancy_name.  */

static void
make_fancy_name_1 (tree expr)
{
  char buffer[32];
  tree index;

  if (DECL_P (expr))
    {
      make_fancy_decl_name (expr);
      return;
    }

  switch (TREE_CODE (expr))
    {
    case COMPONENT_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      make_fancy_decl_name (TREE_OPERAND (expr, 1));
      break;

    case ARRAY_REF:
      make_fancy_name_1 (TREE_OPERAND (expr, 0));
      obstack_1grow (&name_obstack, '$');
      /* Arrays with only one element may not have a constant as their
	 index. */
      index = TREE_OPERAND (expr, 1);
      if (TREE_CODE (index) != INTEGER_CST)
	break;
      sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, TREE_INT_CST_LOW (index));
      obstack_grow (&name_obstack, buffer, strlen (buffer));

      break;

    case BIT_FIELD_REF:
    case REALPART_EXPR:
    case IMAGPART_EXPR:
      gcc_unreachable (); 	/* we treat these as scalars.  */
      break;
    default:
      break;
    }
}

/* Create a human readable name for a replacement variable based on EXPR.  */

static char *
make_fancy_name (tree expr)
{
  make_fancy_name_1 (expr);
  obstack_1grow (&name_obstack, '\0');
  return XOBFINISH (&name_obstack, char *);
}
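
/* As an illustration (hypothetical expressions): for out.in.a the functions
   above produce "out$in$a", and a constant array index contributes its value,
   e.g. arr[4].fld yields "arr$4$fld".  The names only serve to make dumps and
   debug information easier to read.  */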

/* Helper function for build_ref_for_offset.  */

static bool
build_ref_for_offset_1 (tree *res, tree type, HOST_WIDE_INT offset,
			tree exp_type)
{
  while (1)
    {
      tree fld;
      tree tr_size, index;
      HOST_WIDE_INT el_size;

      if (offset == 0 && exp_type
	  && useless_type_conversion_p (exp_type, type))
	return true;

      switch (TREE_CODE (type))
	{
	case UNION_TYPE:
	case QUAL_UNION_TYPE:
	case RECORD_TYPE:
	  /* Some Ada records are half-unions, treat all of them the same.  */
	  for (fld = TYPE_FIELDS (type); fld; fld = TREE_CHAIN (fld))
	    {
	      HOST_WIDE_INT pos, size;
	      tree expr, *expr_ptr;

	      if (TREE_CODE (fld) != FIELD_DECL)
		continue;

	      pos = int_bit_position (fld);
	      gcc_assert (TREE_CODE (type) == RECORD_TYPE || pos == 0);
	      size = tree_low_cst (DECL_SIZE (fld), 1);
	      if (pos > offset || (pos + size) <= offset)
		continue;

	      if (res)
		{
		  expr = build3 (COMPONENT_REF, TREE_TYPE (fld), *res, fld,
				 NULL_TREE);
		  expr_ptr = &expr;
		}
	      else
		expr_ptr = NULL;
	      if (build_ref_for_offset_1 (expr_ptr, TREE_TYPE (fld),
					  offset - pos, exp_type))
		{
		  if (res)
		    *res = expr;
		  return true;
		}
	    }
	  return false;

	case ARRAY_TYPE:
	  tr_size = TYPE_SIZE (TREE_TYPE (type));
	  if (!tr_size || !host_integerp (tr_size, 1))
	    return false;
	  el_size = tree_low_cst (tr_size, 1);

	  index = build_int_cst (TYPE_DOMAIN (type), offset / el_size);
	  if (!integer_zerop (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
	    index = int_const_binop (PLUS_EXPR, index,
				     TYPE_MIN_VALUE (TYPE_DOMAIN (type)), 0);
	  if (res)
	    *res = build4 (ARRAY_REF, TREE_TYPE (type), *res, index, NULL_TREE,
			   NULL_TREE);
	  offset = offset % el_size;
	  type = TREE_TYPE (type);
	  break;

	default:
	  if (offset != 0)
	    return false;

	  if (exp_type)
	    return false;
	  else
	    return true;
	}
    }
}

/* Construct an expression that would reference a part of aggregate *EXPR of
   type TYPE at the given OFFSET of the type EXP_TYPE.  If ALLOW_PTR is true
   and TYPE is a pointer type, *EXPR is dereferenced first.  If EXPR is NULL,
   the function only determines whether it can build such a reference without
   actually doing it.

   FIXME: Eventually this should be replaced with
   maybe_fold_offset_to_reference() from tree-ssa-ccp.c but that requires a
   minor rewrite of fold_stmt.  */

static bool
build_ref_for_offset (tree *expr, tree type, HOST_WIDE_INT offset,
		      tree exp_type, bool allow_ptr)
{
  if (allow_ptr && POINTER_TYPE_P (type))
    {
      type = TREE_TYPE (type);
      if (expr)
	*expr = fold_build1 (INDIRECT_REF, type, *expr);
    }

  return build_ref_for_offset_1 (expr, type, offset, exp_type);
}
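
/* To sketch the intended use (types made up): asked for offset 32 and
   EXP_TYPE int within struct Out from the example above, the walk descends
   through the COMPONENT_REF for the field in and returns a reference to
   out.in.b; within an array of 32-bit ints the same offset becomes an
   ARRAY_REF with index 32 / 32 == 1.  */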

/* The very first phase of intraprocedural SRA.  It marks in candidate_bitmap
   those variables which are suitable for scalarization.  */

static bool
find_var_candidates (void)
{
  tree var, type;
  referenced_var_iterator rvi;
  bool ret = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    {
      if (TREE_CODE (var) != VAR_DECL && TREE_CODE (var) != PARM_DECL)
        continue;
      type = TREE_TYPE (var);

      if (!AGGREGATE_TYPE_P (type)
	  || needs_to_live_in_memory (var)
	  || TREE_THIS_VOLATILE (var)
	  || !COMPLETE_TYPE_P (type)
	  || !host_integerp (TYPE_SIZE (type), 1)
          || tree_low_cst (TYPE_SIZE (type), 1) == 0
	  || type_internals_preclude_sra_p (type))
	continue;

      bitmap_set_bit (candidate_bitmap, DECL_UID (var));

      if (dump_file && (dump_flags & TDF_DETAILS))
	{
	  fprintf (dump_file, "Candidate (%d): ", DECL_UID (var));
	  print_generic_expr (dump_file, var, 0);
	  fprintf (dump_file, "\n");
	}
      ret = true;
    }

  return ret;
}

/* Sort all accesses for the given variable, check for partial overlaps and
   return NULL if there are any.  If there are none, pick a representative for
   each combination of offset and size and create a linked list out of them.
   Return the pointer to the first representative and make sure it is the first
   one in the vector of accesses.  */

static struct access *
sort_and_splice_var_accesses (tree var)
{
  int i, j, access_count;
  struct access *res, **prev_acc_ptr = &res;
  VEC (access_p, heap) *access_vec;
  bool first = true;
  HOST_WIDE_INT low = -1, high = 0;

  access_vec = get_base_access_vector (var);
  if (!access_vec)
    return NULL;
  access_count = VEC_length (access_p, access_vec);

  /* Sort by <OFFSET, SIZE>.  */
  qsort (VEC_address (access_p, access_vec), access_count, sizeof (access_p),
	 compare_access_positions);

  i = 0;
  while (i < access_count)
    {
      struct access *access = VEC_index (access_p, access_vec, i);
      bool modification = access->write;
      bool grp_read = !access->write;
      bool grp_bfr_lhs = access->grp_bfr_lhs;
      bool first_scalar = is_gimple_reg_type (access->type);
      bool unscalarizable_region = access->grp_unscalarizable_region;

      if (first || access->offset >= high)
	{
	  first = false;
	  low = access->offset;
	  high = access->offset + access->size;
	}
      else if (access->offset > low && access->offset + access->size > high)
	return NULL;
      else
	gcc_assert (access->offset >= low
		    && access->offset + access->size <= high);

      j = i + 1;
      while (j < access_count)
	{
	  struct access *ac2 = VEC_index (access_p, access_vec, j);
	  if (ac2->offset != access->offset || ac2->size != access->size)
	    break;
	  modification |= ac2->write;
	  grp_read |= !ac2->write;
	  grp_bfr_lhs |= ac2->grp_bfr_lhs;
	  unscalarizable_region |= ac2->grp_unscalarizable_region;
	  relink_to_new_repr (access, ac2);

	  /* If one of the equivalent accesses is scalar, use it as a
	     representative (this happens when there is, for example, only a
	     single scalar field in a structure).  */
	  if (!first_scalar && is_gimple_reg_type (ac2->type))
	    {
	      struct access tmp_acc;
	      first_scalar = true;

	      memcpy (&tmp_acc, ac2, sizeof (struct access));
	      memcpy (ac2, access,  sizeof (struct access));
	      memcpy (access, &tmp_acc, sizeof (struct access));
	    }
	  ac2->group_representative = access;
	  j++;
	}

      i = j;

      access->group_representative = access;
      access->grp_write = modification;
      access->grp_read = grp_read;
      access->grp_bfr_lhs = grp_bfr_lhs;
      access->grp_unscalarizable_region = unscalarizable_region;
      if (access->first_link)
	add_access_to_work_queue (access);

      *prev_acc_ptr = access;
      prev_acc_ptr = &access->next_grp;
    }

  gcc_assert (res == VEC_index (access_p, access_vec, 0));
  return res;
}

/* Create a variable for the given ACCESS which determines the type, name and a
   few other properties.  Return the variable declaration and also store it in
   ACCESS->replacement_decl.  */

static tree
create_access_replacement (struct access *access)
{
  tree repl;

  repl = create_tmp_var (access->type, "SR");
  get_var_ann (repl);
  add_referenced_var (repl);
  mark_sym_for_renaming (repl);

  if (!access->grp_bfr_lhs
      && (TREE_CODE (access->type) == COMPLEX_TYPE
	  || TREE_CODE (access->type) == VECTOR_TYPE))
    DECL_GIMPLE_REG_P (repl) = 1;

  DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
  DECL_ARTIFICIAL (repl) = 1;

  if (DECL_NAME (access->base)
      && !DECL_IGNORED_P (access->base)
      && !DECL_ARTIFICIAL (access->base))
    {
      char *pretty_name = make_fancy_name (access->expr);

      DECL_NAME (repl) = get_identifier (pretty_name);
      obstack_free (&name_obstack, pretty_name);

      SET_DECL_DEBUG_EXPR (repl, access->expr);
      DECL_DEBUG_EXPR_IS_FROM (repl) = 1;
      DECL_IGNORED_P (repl) = 0;
    }

  DECL_IGNORED_P (repl) = DECL_IGNORED_P (access->base);
  TREE_NO_WARNING (repl) = TREE_NO_WARNING (access->base);

  if (dump_file)
    {
      fprintf (dump_file, "Created a replacement for ");
      print_generic_expr (dump_file, access->base, 0);
      fprintf (dump_file, " offset: %u, size: %u: ",
	       (unsigned) access->offset, (unsigned) access->size);
      print_generic_expr (dump_file, repl, 0);
      fprintf (dump_file, "\n");
    }

  return repl;
}
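
/* E.g. (hypothetical) a replacement for out.in.a would be a new artificial
   int variable named "out$in$a" whose debug expression points back at the
   original reference, so that debug information can still describe the user
   variable.  */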

/* Return ACCESS scalar replacement, create it if it does not exist yet.  */

static inline tree
get_access_replacement (struct access *access)
{
  gcc_assert (access->grp_to_be_replaced);

  if (access->replacement_decl)
    return access->replacement_decl;

  access->replacement_decl = create_access_replacement (access);
  return access->replacement_decl;
}

/* Build a subtree of accesses rooted in *ACCESS, and move the pointer in the
   linked list along the way.  Stop when *ACCESS is NULL or the access it
   points to is not "within" the root.  */

static void
build_access_subtree (struct access **access)
{
  struct access *root = *access, *last_child = NULL;
  HOST_WIDE_INT limit = root->offset + root->size;

  *access = (*access)->next_grp;
  while  (*access && (*access)->offset + (*access)->size <= limit)
    {
      if (!last_child)
	root->first_child = *access;
      else
	last_child->next_sibling = *access;
      last_child = *access;

      build_access_subtree (access);
    }
}

/* Build a tree of access representatives.  ACCESS is the pointer to the first
   one, others are linked in a list by the next_grp field.  Decisions about
   creating scalar replacements are made later in analyze_access_subtree.  */

static void
build_access_trees (struct access *access)
{
  while (access)
    {
      struct access *root = access;

      build_access_subtree (&access);
      root->next_grp = access;
    }
}
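
/* Continuing the made-up struct Out example: the sorted representative chain
   <0, 64> <0, 32> <32, 32> <64, 32> is consumed by the recursion in
   build_access_subtree so that <0, 32> and <32, 32> become children of
   <0, 64>, leaving <64, 32> as the next tree root on the next_grp list.  */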

/* Analyze the subtree of accesses rooted in ROOT, scheduling replacements when
   they both seem beneficial and ALLOW_REPLACEMENTS allows it.  Also set all
   sorts of access flags appropriately along the way, notably always set
   grp_read when MARK_READ is true and grp_write when MARK_WRITE is true.  */

static bool
analyze_access_subtree (struct access *root, bool allow_replacements,
			bool mark_read, bool mark_write)
{
  struct access *child;
  HOST_WIDE_INT limit = root->offset + root->size;
  HOST_WIDE_INT covered_to = root->offset;
  bool scalar = is_gimple_reg_type (root->type);
  bool hole = false, sth_created = false;

  if (mark_read)
    root->grp_read = true;
  else if (root->grp_read)
    mark_read = true;

  if (mark_write)
    root->grp_write = true;
  else if (root->grp_write)
    mark_write = true;

  if (root->grp_unscalarizable_region)
    allow_replacements = false;

  for (child = root->first_child; child; child = child->next_sibling)
    {
      if (!hole && child->offset < covered_to)
	hole = true;
      else
	covered_to += child->size;

      sth_created |= analyze_access_subtree (child,
					     allow_replacements && !scalar,
					     mark_read, mark_write);

      root->grp_unscalarized_data |= child->grp_unscalarized_data;
      hole |= !child->grp_covered;
    }

  if (allow_replacements && scalar && !root->first_child)
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
	{
	  fprintf (dump_file, "Marking ");
	  print_generic_expr (dump_file, root->base, 0);
	  fprintf (dump_file, " offset: %u, size: %u: ",
		   (unsigned) root->offset, (unsigned) root->size);
	  fprintf (dump_file, " to be replaced.\n");
	}

      root->grp_to_be_replaced = 1;
      sth_created = true;
      hole = false;
    }
  else if (covered_to < limit)
    hole = true;

  if (sth_created && !hole)
    {
      root->grp_covered = 1;
      return true;
    }
  if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
    root->grp_unscalarized_data = 1; /* not covered and written to */
  if (sth_created)
    return true;
  return false;
}

/* Analyze all access trees linked by next_grp by means of
   analyze_access_subtree.  */
static bool
analyze_access_trees (struct access *access)
{
  bool ret = false;

  while (access)
    {
      if (analyze_access_subtree (access, true, false, false))
	ret = true;
      access = access->next_grp;
    }

  return ret;
}

/* Return true iff a potential new child of LACC at offset OFFSET and with size
   SIZE would conflict with an already existing one.  If exactly such a child
   already exists in LACC, store a pointer to it in EXACT_MATCH.  */

static bool
child_would_conflict_in_lacc (struct access *lacc, HOST_WIDE_INT norm_offset,
			      HOST_WIDE_INT size, struct access **exact_match)
{
  struct access *child;

  for (child = lacc->first_child; child; child = child->next_sibling)
    {
      if (child->offset == norm_offset && child->size == size)
	{
	  *exact_match = child;
	  return true;
	}

      if (child->offset < norm_offset + size
	  && child->offset + child->size > norm_offset)
	return true;
    }

  return false;
}

/* Set the expr of TARGET to one just like MODEL but with its own base at the
   bottom of the handled components.  */

static void
duplicate_expr_for_different_base (struct access *target,
				   struct access *model)
{
  tree t, expr = unshare_expr (model->expr);

  gcc_assert (handled_component_p (expr));
  t = expr;
  while (handled_component_p (TREE_OPERAND (t, 0)))
    t = TREE_OPERAND (t, 0);
  gcc_assert (TREE_OPERAND (t, 0) == model->base);
  TREE_OPERAND (t, 0) = target->base;

  target->expr = expr;
}


/* Create a new child access of PARENT, with all properties just like MODEL
   except for its offset and with its grp_write false and grp_read true.
   Return the new access.  Note that this access is created long after all
   splicing and sorting; it is not located in any access vector and is
   automatically a representative of its group.  */

static struct access *
create_artificial_child_access (struct access *parent, struct access *model,
				HOST_WIDE_INT new_offset)
{
  struct access *access;
  struct access **child;

  gcc_assert (!model->grp_unscalarizable_region);

  access = (struct access *) pool_alloc (access_pool);
  memset (access, 0, sizeof (struct access));
  access->base = parent->base;
  access->offset = new_offset;
  access->size = model->size;
  duplicate_expr_for_different_base (access, model);
  access->type = model->type;
  access->grp_write = true;
  access->grp_read = false;

  child = &parent->first_child;
  while (*child && (*child)->offset < new_offset)
    child = &(*child)->next_sibling;

  access->next_sibling = *child;
  *child = access;

  return access;
}


/* Propagate all subaccesses of RACC across an assignment link to LACC. Return
   true if any new subaccess was created.  Additionally, if RACC is a scalar
   access but LACC is not, change the type of the latter.  */

static bool
propagate_subaccesses_across_link (struct access *lacc, struct access *racc)
{
  struct access *rchild;
  HOST_WIDE_INT norm_delta = lacc->offset - racc->offset;

  bool ret = false;

  if (is_gimple_reg_type (lacc->type)
      || lacc->grp_unscalarizable_region
      || racc->grp_unscalarizable_region)
    return false;

  if (!lacc->first_child && !racc->first_child
      && is_gimple_reg_type (racc->type)
      && (sra_mode == SRA_MODE_INTRA
          || !bitmap_bit_p (retvals_bitmap, DECL_UID (lacc->base))))
    {
      duplicate_expr_for_different_base (lacc, racc);
      lacc->type = racc->type;
      return false;
    }

  for (rchild = racc->first_child; rchild; rchild = rchild->next_sibling)
    {
      struct access *new_acc = NULL;
      HOST_WIDE_INT norm_offset = rchild->offset + norm_delta;

      if (rchild->grp_unscalarizable_region)
	continue;

      if (child_would_conflict_in_lacc (lacc, norm_offset, rchild->size,
					&new_acc))
	{
	  if (new_acc && rchild->first_child)
	    ret |= propagate_subaccesses_across_link (new_acc, rchild);
	  continue;
	}

      new_acc = create_artificial_child_access (lacc, rchild, norm_offset);
      if (racc->first_child)
	propagate_subaccesses_across_link (new_acc, rchild);

      ret = true;
    }

  return ret;
}

/* Propagate all subaccesses across assignment links.  */

static void
propagate_all_subaccesses (void)
{
  while (work_queue_head)
    {
      struct access *racc = pop_access_from_work_queue ();
      struct assign_link *link;

      gcc_assert (racc->first_link);

      for (link = racc->first_link; link; link = link->next)
	{
	  struct access *lacc = link->lacc;

	  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (lacc->base)))
	    continue;
	  lacc = lacc->group_representative;
	  if (propagate_subaccesses_across_link (lacc, racc)
	      && lacc->first_link)
	    add_access_to_work_queue (lacc);
	}
    }
}
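
/* For example (hypothetical), given an aggregate copy lhs = rhs where only
   rhs.f has a scalar-typed representative, the propagation above creates an
   artificial child of the lhs access at the corresponding offset, so that the
   copy can later be carried out as an assignment between the two scalar
   replacements instead of through memory.  */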

/* Go through all accesses collected throughout the (intraprocedural) analysis
   stage, exclude overlapping ones, identify representatives and build trees
   out of them, making decisions about scalarization on the way.  Return true
   iff there are any to-be-scalarized variables after this stage. */

static bool
analyze_all_variable_accesses (void)
{
  tree var;
  referenced_var_iterator rvi;
  bool res = false;

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access;

	access = sort_and_splice_var_accesses (var);
	if (access)
	  build_access_trees (access);
	else
	  disqualify_candidate (var,
				"No or inhibitingly overlapping accesses.");
      }

  propagate_all_subaccesses ();

  FOR_EACH_REFERENCED_VAR (var, rvi)
    if (bitmap_bit_p (candidate_bitmap, DECL_UID (var)))
      {
	struct access *access = get_first_repr_for_decl (var);

	if (analyze_access_trees (access))
	  {
	    res = true;
	    if (dump_file && (dump_flags & TDF_DETAILS))
	      {
		fprintf (dump_file, "\nAccess trees for ");
		print_generic_expr (dump_file, var, 0);
		fprintf (dump_file, " (UID: %u): \n", DECL_UID (var));
		dump_access_tree (dump_file, access);
		fprintf (dump_file, "\n");
	      }
	  }
	else
	  disqualify_candidate (var, "No scalar replacements to be created.");
      }

  return res;
}

/* Return true iff a reference expression into aggregate AGG can be built for
   every single to-be-replaced access that is a child of ACCESS, its sibling
   or a child of its sibling.  TOP_OFFSET is the offset of the processed
   access subtree that has to be subtracted from the offset of each access.  */

static bool
ref_expr_for_all_replacements_p (struct access *access, tree agg,
				 HOST_WIDE_INT top_offset)
{
  do
    {
      if (access->grp_to_be_replaced
	  && !build_ref_for_offset (NULL, TREE_TYPE (agg),
				    access->offset - top_offset,
				    access->type, false))
	return false;

      if (access->first_child
	  && !ref_expr_for_all_replacements_p (access->first_child, agg,
					       top_offset))
	return false;

      access = access->next_sibling;
    }
  while (access);

  return true;
}


/* Generate statements copying scalar replacements of accesses within a subtree
   into or out of AGG.  ACCESS is the first child of the root of the subtree to
   be processed.  AGG is an aggregate type expression (can be a declaration but
   does not have to be, it can for example also be an indirect_ref).
   TOP_OFFSET is the offset of the processed subtree which has to be subtracted
   from offsets of individual accesses to get corresponding offsets for AGG.
   If CHUNK_SIZE is non-zero, copy only replacements in the interval
   <START_OFFSET, START_OFFSET + CHUNK_SIZE>, otherwise copy all.  GSI is a
   statement iterator used to place the new statements.  WRITE should be true
   when the statements should write from AGG to the replacement and false if
   vice versa.  If INSERT_AFTER is true, new statements will be added after the
   current statement in GSI, otherwise they will be added before it.  */

static void
generate_subtree_copies (struct access *access, tree agg,
			 HOST_WIDE_INT top_offset,
			 HOST_WIDE_INT start_offset, HOST_WIDE_INT chunk_size,
			 gimple_stmt_iterator *gsi, bool write,
			 bool insert_after)
{
  do
    {
      tree expr = unshare_expr (agg);

      if (chunk_size && access->offset >= start_offset + chunk_size)
	return;

      if (access->grp_to_be_replaced
	  && (chunk_size == 0
	      || access->offset + access->size > start_offset))
	{
	  bool repl_found;
	  gimple stmt;

	  repl_found = build_ref_for_offset (&expr, TREE_TYPE (agg),
					     access->offset - top_offset,
					     access->type, false);
	  gcc_assert (repl_found);

	  if (write)
	    stmt = gimple_build_assign (get_access_replacement (access), expr);
	  else
	    {
	      tree repl = get_access_replacement (access);
	      TREE_NO_WARNING (repl) = 1;
	      stmt = gimple_build_assign (expr, repl);
	    }

	  if (insert_after)
	    gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  else
	    gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}

      if (access->first_child)
	generate_subtree_copies (access->first_child, agg, top_offset,
				 start_offset, chunk_size, gsi,
				 write, insert_after);

      access = access->next_sibling;
    }
  while (access);
}

/* Assign zero to all scalar replacements in an access subtree.  ACCESS is the
   root of the subtree to be processed.  GSI is the statement iterator used
   for inserting statements which are added after the current statement if
   INSERT_AFTER is true or before it otherwise.  */

static void
init_subtree_with_zero (struct access *access, gimple_stmt_iterator *gsi,
			bool insert_after)

{
  struct access *child;

  if (access->grp_to_be_replaced)
    {
      gimple stmt;

      stmt = gimple_build_assign (get_access_replacement (access),
				  fold_convert (access->type,
						integer_zero_node));
      if (insert_after)
	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
      else
	gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
      update_stmt (stmt);
    }

  for (child = access->first_child; child; child = child->next_sibling)
    init_subtree_with_zero (child, gsi, insert_after);
}

/* Search for an access representative for the given expression EXPR and
   return it or NULL if it cannot be found.  */

static struct access *
get_access_for_expr (tree expr)
{
  HOST_WIDE_INT offset, size, max_size;
  tree base;

  /* FIXME: This should not be necessary but Ada produces V_C_Es with a type of
     a different size than the size of its argument and we need the latter
     one.  */
  if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    expr = TREE_OPERAND (expr, 0);

  base = get_ref_base_and_extent (expr, &offset, &size, &max_size);
  /* !!! Assert for testing only, remove after some time.  */
  gcc_assert (base);
  if (max_size == -1 || !DECL_P (base))
    return NULL;

  if (!bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
    return NULL;

  return get_var_base_offset_size_access (base, offset, max_size);
}


/* Substitute into *EXPR an expression of type TYPE with the value of the
   replacement of ACCESS.  This is done either by producing a special V_C_E
   assignment statement converting the replacement to a new temporary of the
   requested type if TYPE satisfies is_gimple_reg_type, or by going through the
   base aggregate if it does not.  */

static void
fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
				 gimple_stmt_iterator *gsi, bool write)
{
  tree repl = get_access_replacement (access);

  if (is_gimple_reg_type (type))
    {
      tree tmp = create_tmp_var (type, "SRvce");

      add_referenced_var (tmp);
      tmp = make_ssa_name (tmp, NULL);

      if (write)
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);

	  *expr = tmp;
	  SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
	  stmt = gimple_build_assign (repl, conv);
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);

	  stmt = gimple_build_assign (tmp, conv);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  SSA_NAME_DEF_STMT (tmp) = stmt;
	  *expr = tmp;
	  update_stmt (stmt);
	}
    }
  else
    {
      if (write)
	{
	  gimple stmt;

	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;

	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}
    }
}

/* Change STMT to assign compatible types by means of adding component or array
   references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
   variables with the same names in sra_modify_assign.  If we can avoid a V_C_E
   used to load a field from a (single field) record or a union or an element
   of a single-element array by producing COMPONENT_REFs and ARRAY_REFs
   instead, do so.  */

static void
fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple stmt,
				   struct access *lacc, struct access *racc,
				   tree lhs, tree *rhs, tree ltype, tree rtype)
{
  if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
      && !access_has_children_p (lacc))
    {
      tree expr = unshare_expr (lhs);
      bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
					 false);
      if (found)
	{
	  gimple_assign_set_lhs (stmt, expr);
	  return;
	}
    }

  if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
      && !access_has_children_p (racc))
    {
      tree expr = unshare_expr (*rhs);
      bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
					 false);
      if (found)
	{
	  gimple_assign_set_rhs1 (stmt, expr);
	  return;
	}
    }

  /* Note that *rhs is a local variable of the caller, it is not a pointer to
     the operand within the gimple statement:  */
  *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
  /* gimple_assign_set_rhs1 ICEs when passed V_C_E so we have to resort to
     this: */
  gimple_assign_set_rhs_from_tree (gsi, *rhs);
  gcc_assert (stmt == gsi_stmt (*gsi));
}
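
/* A made-up example of the V_C_E fallback above: if a 32-bit int replacement
   must be assigned to a float lhs, the statement becomes
   lhs = VIEW_CONVERT_EXPR <float> (replacement), reinterpreting the bits
   rather than converting the value.  */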

/* Callback for scan_function.  Replace the expression EXPR with a scalar
   replacement if there is one and generate other statements to do type
   conversion or subtree copying if necessary.  GSI is used to place newly
   created statements, WRITE is true if the expression is being written to (it
   is on a LHS of a statement or output in an assembly statement).  */

static bool
sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write,
		 void *data ATTRIBUTE_UNUSED)
{
  struct access *access;
  tree type, bfr;

  if (TREE_CODE (*expr) == BIT_FIELD_REF)
    {
      bfr = *expr;
      expr = &TREE_OPERAND (*expr, 0);
    }
  else
    bfr = NULL_TREE;

  if (TREE_CODE (*expr) == REALPART_EXPR || TREE_CODE (*expr) == IMAGPART_EXPR)
    expr = &TREE_OPERAND (*expr, 0);
  type = TREE_TYPE (*expr);

  access = get_access_for_expr (*expr);
  if (!access)
    return false;

  if (access->grp_to_be_replaced)
    {
      if (!useless_type_conversion_p (type, access->type))
	fix_incompatible_types_for_expr (expr, type, access, gsi, write);
      else
	*expr = get_access_replacement (access);
    }

  if (access->first_child)
    {
      HOST_WIDE_INT start_offset, chunk_size;
      if (bfr
	  && host_integerp (TREE_OPERAND (bfr, 1), 1)
	  && host_integerp (TREE_OPERAND (bfr, 2), 1))
	{
	  start_offset = tree_low_cst (TREE_OPERAND (bfr, 1), 1);
	  chunk_size = tree_low_cst (TREE_OPERAND (bfr, 2), 1);
	}
      else
	start_offset = chunk_size = 0;

      generate_subtree_copies (access->first_child, access->base, 0,
			       start_offset, chunk_size, gsi, write, write);
    }
  return true;
}
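
/* E.g. (made-up): a read of out.in.a whose access is to be replaced simply
   becomes a read of the replacement variable, whereas a read of the whole
   out.in, which has scalarized children but no replacement of its own, is
   instead preceded by statements, generated by generate_subtree_copies,
   storing the children back into the aggregate.  */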

/* Store all replacements in the access tree rooted in TOP_RACC either to their
   base aggregate if there are unscalarized data or directly to LHS
   otherwise.  */

static void
handle_unscalarized_data_in_subtree (struct access *top_racc, tree lhs,
				     gimple_stmt_iterator *gsi)
{
  if (top_racc->grp_unscalarized_data)
    generate_subtree_copies (top_racc->first_child, top_racc->base, 0, 0, 0,
			     gsi, false, false);
  else
    generate_subtree_copies (top_racc->first_child, lhs, top_racc->offset,
			     0, 0, gsi, false, false);
}


/* Try to generate statements to load all sub-replacements in an access
   (sub)tree (LACC is the first child) from scalar replacements in the TOP_RACC
   (sub)tree.  If that is not possible, refresh the TOP_RACC base aggregate and
   load the accesses from it.  LEFT_OFFSET is the offset of the left whole
   subtree being copied, RIGHT_OFFSET is the same thing for the right subtree.
   OLD_GSI and NEW_GSI are stmt iterators used for statement insertions.
   *REFRESHED is true iff the rhs top aggregate has already been refreshed by
   the contents of its scalar replacements; it is set to true if this function
   has to do the refreshing.  */
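
/* A sketch of the common case: for "d = s" where both d.i and s.i have
   scalar replacements, the loop below emits "SR_d_i = SR_s_i"; when s.i
   has no replacement, s is refreshed from its scalar replacements first
   and "SR_d_i = s.i" is emitted instead.  */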

static void
load_assign_lhs_subreplacements (struct access *lacc, struct access *top_racc,
				 HOST_WIDE_INT left_offset,
				 HOST_WIDE_INT right_offset,
				 gimple_stmt_iterator *old_gsi,
				 gimple_stmt_iterator *new_gsi,
				 bool *refreshed, tree lhs)
{
  do
    {
      if (lacc->grp_to_be_replaced)
	{
	  struct access *racc;
	  HOST_WIDE_INT offset = lacc->offset - left_offset + right_offset;

	  racc = find_access_in_subtree (top_racc, offset, lacc->size);
	  if (racc && racc->grp_to_be_replaced)
	    {
	      gimple stmt;

	      if (useless_type_conversion_p (lacc->type, racc->type))
		stmt = gimple_build_assign (get_access_replacement (lacc),
					    get_access_replacement (racc));
	      else
		{
		  tree rhs = fold_build1 (VIEW_CONVERT_EXPR, lacc->type,
					  get_access_replacement (racc));
		  stmt = gimple_build_assign (get_access_replacement (lacc),
					      rhs);
		}

	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	  else
	    {
	      tree expr = unshare_expr (top_racc->base);
	      bool repl_found;
	      gimple stmt;

	      /* No suitable access on the right hand side, need to load from
		 the aggregate.  See if we have to update it first... */
	      if (!*refreshed)
		{
		  gcc_assert (top_racc->first_child);
		  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
		  *refreshed = true;
		}

	      repl_found = build_ref_for_offset (&expr,
						 TREE_TYPE (top_racc->base),
						 lacc->offset - left_offset,
						 lacc->type, false);
	      gcc_assert (repl_found);
	      stmt = gimple_build_assign (get_access_replacement (lacc),
					  expr);
	      gsi_insert_after (new_gsi, stmt, GSI_NEW_STMT);
	      update_stmt (stmt);
	    }
	}
      else if (lacc->grp_read && !lacc->grp_covered && !*refreshed)
	{
	  handle_unscalarized_data_in_subtree (top_racc, lhs, old_gsi);
	  *refreshed = true;
	}

      if (lacc->first_child)
	load_assign_lhs_subreplacements (lacc->first_child, top_racc,
					 left_offset, right_offset,
					 old_gsi, new_gsi, refreshed, lhs);
      lacc = lacc->next_sibling;
    }
  while (lacc);
}

/* Modify assignments with a CONSTRUCTOR on their RHS.  STMT is a pointer to
   the assignment and GSI is the statement iterator pointing at it.  Returns
   the same values as sra_modify_assign.  */

static enum scan_assign_result
sra_modify_constructor_assign (gimple *stmt, gimple_stmt_iterator *gsi)
{
  tree lhs = gimple_assign_lhs (*stmt);
  struct access *acc;

  gcc_assert (TREE_CODE (lhs) != REALPART_EXPR
	      && TREE_CODE (lhs) != IMAGPART_EXPR);
  acc = get_access_for_expr (lhs);
  if (!acc)
    return SRA_SA_NONE;

  if (VEC_length (constructor_elt,
		  CONSTRUCTOR_ELTS (gimple_assign_rhs1 (*stmt))) > 0)
    {
      /* I have never seen this code path trigger but if it can happen the
	 following should handle it gracefully.  */
      if (access_has_children_p (acc))
	generate_subtree_copies (acc->first_child, acc->base, 0, 0, 0, gsi,
				 true, true);
      return SRA_SA_PROCESSED;
    }

  if (acc->grp_covered)
    {
      init_subtree_with_zero (acc, gsi, false);
      unlink_stmt_vdef (*stmt);
      gsi_remove (gsi, true);
      return SRA_SA_REMOVED;
    }
  else
    {
      init_subtree_with_zero (acc, gsi, true);
      return SRA_SA_PROCESSED;
    }
}


/* Modify statements that have an IMAGPART_EXPR or REALPART_EXPR of a
   to-be-scalarized expression on their LHS.  STMT is the statement and GSI is
   the iterator used to place new helper statements.  Returns the same values
   as sra_modify_assign.  */
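
/* For example (a sketch, with SR_c standing for the scalar replacement of
   the complex variable c), the statement

     __real c = x_1;

   is rewritten to

     SRp_2 = __imag SR_c;
     SRr_3 = x_1;
     SR_c = COMPLEX_EXPR <SRr_3, SRp_2>;

   so that the whole complex replacement stays up to date.  */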

static enum scan_assign_result
sra_modify_partially_complex_lhs (gimple stmt, gimple_stmt_iterator *gsi)
{
  tree lhs, complex, ptype, rp, ip;
  struct access *access;
  gimple new_stmt, aux_stmt;

  lhs = gimple_assign_lhs (stmt);
  complex = TREE_OPERAND (lhs, 0);

  access = get_access_for_expr (complex);

  if (!access || !access->grp_to_be_replaced)
    return SRA_SA_NONE;

  ptype = TREE_TYPE (TREE_TYPE (complex));
  rp = create_tmp_var (ptype, "SRr");
  add_referenced_var (rp);
  rp = make_ssa_name (rp, NULL);

  ip = create_tmp_var (ptype, "SRp");
  add_referenced_var (ip);
  ip = make_ssa_name (ip, NULL);

  if (TREE_CODE (lhs) == IMAGPART_EXPR)
    {
      aux_stmt = gimple_build_assign (rp, fold_build1 (REALPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (rp) = aux_stmt;
      gimple_assign_set_lhs (stmt, ip);
      SSA_NAME_DEF_STMT (ip) = stmt;
    }
  else
    {
      aux_stmt = gimple_build_assign (ip, fold_build1 (IMAGPART_EXPR, ptype,
					     get_access_replacement (access)));
      SSA_NAME_DEF_STMT (ip) = aux_stmt;
      gimple_assign_set_lhs (stmt, rp);
      SSA_NAME_DEF_STMT (rp) = stmt;
    }

  gsi_insert_before (gsi, aux_stmt, GSI_SAME_STMT);
  update_stmt (aux_stmt);
  new_stmt = gimple_build_assign (get_access_replacement (access),
				  fold_build2 (COMPLEX_EXPR, access->type,
					       rp, ip));
  gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
  update_stmt (new_stmt);
  return SRA_SA_PROCESSED;
}

/* Callback of scan_function to process assign statements.  It examines both
   sides of the statement, replaces them with a scalar replacement if there is
   one and generates copying of replacements if scalarized aggregates have been
   used in the assignment.  STMT is a pointer to the assign statement, GSI is
   used to hold generated statements for type conversions and subtree
   copying.  */

static enum scan_assign_result
sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
		   void *data ATTRIBUTE_UNUSED)
{
  struct access *lacc, *racc;
  tree ltype, rtype;
  tree lhs, rhs;
  bool modify_this_stmt;

  if (!gimple_assign_single_p (*stmt))
    return SRA_SA_NONE;
  lhs = gimple_assign_lhs (*stmt);
  rhs = gimple_assign_rhs1 (*stmt);

  if (TREE_CODE (rhs) == CONSTRUCTOR)
    return sra_modify_constructor_assign (stmt, gsi);

  if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
    return sra_modify_partially_complex_lhs (*stmt, gsi);

  if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
      || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
    {
      modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
					  gsi, false, data);
      modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
					   gsi, true, data);
      return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
    }

  lacc = get_access_for_expr (lhs);
  racc = get_access_for_expr (rhs);
  if (!lacc && !racc)
    return SRA_SA_NONE;

  modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
		      || (racc && racc->grp_to_be_replaced));

  if (lacc && lacc->grp_to_be_replaced)
    {
      lhs = get_access_replacement (lacc);
      gimple_assign_set_lhs (*stmt, lhs);
      ltype = lacc->type;
    }
  else
    ltype = TREE_TYPE (lhs);

  if (racc && racc->grp_to_be_replaced)
    {
      rhs = get_access_replacement (racc);
      gimple_assign_set_rhs1 (*stmt, rhs);
      rtype = racc->type;
    }
  else
    rtype = TREE_TYPE (rhs);

  if (modify_this_stmt)
    {
      if (!useless_type_conversion_p (ltype, rtype))
	fix_modified_assign_compatibility (gsi, *stmt, lacc, racc,
					   lhs, &rhs, ltype, rtype);
    }

  /* From this point on, the function deals with assignments between
     aggregates when at least one has scalar reductions of some of its
     components.  There are three possible scenarios: 1) both the LHS and the
     RHS have to-be-scalarized components, 2) only the RHS does, or 3) only
     the LHS does.

     In the first case, we would like to load the LHS components from RHS
     components whenever possible.  If that is not possible, we would like to
     read it directly from the RHS (after updating it by storing in it its own
     components).  If there are some necessary unscalarized data in the LHS,
     those will be loaded by the original assignment too.  If neither of these
     cases happen, the original statement can be removed.  Most of this is done
     by load_assign_lhs_subreplacements.

     In the second case, we would like to store all RHS scalarized components
     directly into LHS and if they cover the aggregate completely, remove the
     statement too.  In the third case, we want the LHS components to be loaded
     directly from the RHS (DSE will remove the original statement if it
     becomes redundant).

     This is a bit complex but manageable when types match and when unions do
     not cause confusion in a way that we cannot really load a component of LHS
     from the RHS or vice versa (the access representing this level can have
     subaccesses that are accessible only through a different union field at a
     higher level - different from the one used in the examined expression).
     Unions are fun.

     Therefore, I specially handle a fourth case, happening when there is a
     specific type cast or it is impossible to locate a scalarized subaccess on
     the other side of the expression.  If that happens, I simply "refresh" the
     RHS by storing in it its scalarized components, leave the original
     statement there to do the copying and then load the scalar replacements
     of the LHS.
     This is what the first branch does.  */
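
  /* A sketch of the second case: for "d = s" where replacements SR_s_i and
     SR_s_f cover s completely, this becomes "d.i = SR_s_i; d.f = SR_s_f;"
     and the original statement is removed.  */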


  if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
      || (access_has_children_p (racc)
	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
      || (access_has_children_p (lacc)
	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
    {
      if (access_has_children_p (racc))
	generate_subtree_copies (racc->first_child, racc->base, 0, 0, 0,
				 gsi, false, false);
      if (access_has_children_p (lacc))
	generate_subtree_copies (lacc->first_child, lacc->base, 0, 0, 0,
				 gsi, true, true);
    }
  else
    {
      if (access_has_children_p (lacc) && access_has_children_p (racc))
	{
	  gimple_stmt_iterator orig_gsi = *gsi;
	  bool refreshed;

	  if (lacc->grp_read && !lacc->grp_covered)
	    {
	      handle_unscalarized_data_in_subtree (racc, lhs, gsi);
	      refreshed = true;
	    }
	  else
	    refreshed = false;

	  load_assign_lhs_subreplacements (lacc->first_child, racc,
					   lacc->offset, racc->offset,
					   &orig_gsi, gsi, &refreshed, lhs);
	  if (!refreshed || !racc->grp_unscalarized_data)
	    {
	      if (*stmt == gsi_stmt (*gsi))
		gsi_next (gsi);

	      unlink_stmt_vdef (*stmt);
	      gsi_remove (&orig_gsi, true);
	      return SRA_SA_REMOVED;
	    }
	}
      else
	{
	  if (access_has_children_p (racc))
	    {
	      if (!racc->grp_unscalarized_data)
		{
		  generate_subtree_copies (racc->first_child, lhs,
					   racc->offset, 0, 0, gsi,
					   false, false);
		  gcc_assert (*stmt == gsi_stmt (*gsi));
		  unlink_stmt_vdef (*stmt);
		  gsi_remove (gsi, true);
		  return SRA_SA_REMOVED;
		}
	      else
		generate_subtree_copies (racc->first_child, lhs,
					 racc->offset, 0, 0, gsi, false, true);
	    }
	  else if (access_has_children_p (lacc))
	    generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
				     0, 0, gsi, true, true);
	}
    }

  return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
}

/* Generate statements initializing scalar replacements of parts of function
   parameters.  */
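
/* E.g. (a sketch): for a parameter "struct S p" whose field p.i has the
   scalar replacement SR_p_i, this emits "SR_p_i = p.i" on the single
   successor edge of the entry block.  */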

static void
initialize_parameter_reductions (void)
{
  gimple_stmt_iterator gsi;
  gimple_seq seq = NULL;
  tree parm;

  for (parm = DECL_ARGUMENTS (current_function_decl);
       parm;
       parm = TREE_CHAIN (parm))
    {
      VEC (access_p, heap) *access_vec;
      struct access *access;

      if (!bitmap_bit_p (candidate_bitmap, DECL_UID (parm)))
	continue;
      access_vec = get_base_access_vector (parm);
      if (!access_vec)
	continue;

      if (!seq)
	{
	  seq = gimple_seq_alloc ();
	  gsi = gsi_start (seq);
	}

      for (access = VEC_index (access_p, access_vec, 0);
	   access;
	   access = access->next_grp)
	generate_subtree_copies (access, parm, 0, 0, 0, &gsi, true, true);
    }

  if (seq)
    gsi_insert_seq_on_edge_immediate (single_succ_edge (ENTRY_BLOCK_PTR), seq);
}

/* The "main" function of intraprocedural SRA passes.  Runs the analysis and if
   it reveals there are components of some aggregates to be scalarized, it runs
   the required transformations.  */

static unsigned int
perform_intra_sra (void)
{
  int ret = 0;
  sra_initialize ();

  if (!find_var_candidates ())
    goto out;

  if (!scan_function (build_access_from_expr, build_accesses_from_assign, NULL,
		      true, NULL))
    goto out;

  if (!analyze_all_variable_accesses ())
    goto out;

  scan_function (sra_modify_expr, sra_modify_assign, NULL,
		 false, NULL);
  initialize_parameter_reductions ();
  ret = TODO_update_ssa;

 out:
  sra_deinitialize ();
  return ret;
}

/* Perform early intraprocedural SRA.  */

static unsigned int
early_intra_sra (void)
{
  sra_mode = SRA_MODE_EARLY_INTRA;
  return perform_intra_sra ();
}

/* Perform "late" intraprocedural SRA.  */
static unsigned int
late_intra_sra (void)
{
  sra_mode = SRA_MODE_INTRA;
  return perform_intra_sra ();
}


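/* Return true iff the intraprocedural SRA passes are enabled.  */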
static bool
gate_intra_sra (void)
{
  return flag_tree_sra != 0;
}


struct gimple_opt_pass pass_sra_early =
{
 {
  GIMPLE_PASS,
  "esra",	 			/* name */
  gate_intra_sra,			/* gate */
  early_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  0,					/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};


struct gimple_opt_pass pass_sra =
{
 {
  GIMPLE_PASS,
  "sra",	 			/* name */
  gate_intra_sra,			/* gate */
  late_intra_sra,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SRA,				/* tv_id */
  PROP_cfg | PROP_ssa,                  /* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  TODO_update_address_taken,		/* todo_flags_start */
  TODO_dump_func
  | TODO_update_ssa
  | TODO_ggc_collect
  | TODO_verify_ssa			/* todo_flags_finish */
 }
};


* Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.
  2009-05-12  0:24           ` Martin Jambor
@ 2009-05-18 13:26             ` Richard Guenther
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Guenther @ 2009-05-18 13:26 UTC (permalink / raw)
  To: Martin Jambor; +Cc: GCC Patches, Jan Hubicka

On Tue, 12 May 2009, Martin Jambor wrote:

> Hi,
> 
> thanks for a quick reply.  Some clarifications below:
> 
> On Sun, May 10, 2009 at 01:48:01PM +0200, Richard Guenther wrote:
> > On Sun, 10 May 2009, Martin Jambor wrote:
> > 
> > > > >       expr = TREE_OPERAND (expr, 0);
> > > > >       bit_ref = true;
> > > > >     }
> > > > >   else
> > > > >     bit_ref = false;
> > > > > 
> > > > >   while (TREE_CODE (expr) == NOP_EXPR
> > > > 
> > > > CONVERT_EXPR_P (expr)
> > > 
> > > OK... but  at another place  in the email  you said it might  not even
> > > appear in a valid gimple statement?  Should I remove it altogether?
> > 
> > Indeed.  If you do not build trees from tuple stmts then a
> > NOP_EXPR cannot appear as a rhs1 or rhs2 of an assignment (instead
> > it is always the subcode in gimple stmt and the rhs1 is simply sth
> > valid for is_gimple_val).
> 
> OK
> 
> > 
> > > > > 	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
> > > > > 	 || TREE_CODE (expr) == REALPART_EXPR
> > > > > 	 || TREE_CODE (expr) == IMAGPART_EXPR)
> > > > >     expr = TREE_OPERAND (expr, 0);
> > > > 
> > > > Why do this here btw, and not just lump ...
> > > > 
> > > > >   switch (TREE_CODE (expr))
> > > > >     {
> > > > >     case ADDR_EXPR:
> > > > >     case SSA_NAME:
> > > > >     case INDIRECT_REF:
> > > > >       break;
> > > > > 
> > > > >     case VAR_DECL:
> > > > >     case PARM_DECL:
> > > > >     case RESULT_DECL:
> > > > >     case COMPONENT_REF:
> > > > >     case ARRAY_REF:
> > > > >       ret = create_access (expr, write);
> > > > >       break;
> > > > 
> > > > ... this ...
> > > > 
> > > > >     case REALPART_EXPR:
> > > > >     case IMAGPART_EXPR:
> > > > >       expr = TREE_OPERAND (expr, 0);
> > > > >       ret = create_access (expr, write);
> > > > 
> > > > ... and this together?  Won't you create bogus accesses if you
> > > > strip for example IMAGPART_EXPR (which has non-zero offset)?
> > > 
> > > That would  break the complex  number into its components.   I thought
> > > that they are  meant to stay together for  some reason, otherwise they
> > > would not be represented explicitly  in gimple... do you think it does
> > > not matter?  What about vectors then?
> > > 
> > > The access is not bogus because modification functions take care of
> > > these statements in a special way.  However, if it is indeed OK to
> > > split complex numbers into their components, I will gladly simplify
> > > this as you suggested.
> > 
> > Yes, it is valid to split them (and complex lowering indeed does that).
> > It _might_ be useful to keep a complex together in a single SSA_NAME
> > for optimization purposes, but I guess you detect that anyway if there
> > is a read of the whole complex element into a register and keep it
> > that way.
> > 
> > I would favor simplifying SRA in this case and just split them if
> > that is valid.
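> >
> > E.g. (a rough sketch):
> >
> >   _Complex float c;
> >   float r = __real c;
> >   float i = __imag c;
> >
> > can then use two independent scalar replacements for the two components,
> > just as complex lowering would produce.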
> 
> OK, I will try to do that and incorporate it into the patch if it
> indeed simplifies things.  At least analysis of access trees will OTOH
> become more complex.  Unfortunately, I need to concentrate on another
> two things this week so I'll do that early next week.

Ok.

> > > > >     return SRA_SA_NONE;
> > > > >
> > > > >   lhs_ptr = gimple_assign_lhs_ptr (stmt);
> > > > >   rhs_ptr = gimple_assign_rhs1_ptr (stmt);
> > > > 
> > > > you probably don't need to pass pointers to trees everywhere as you
> > > > are not changing them.
> > > 
> > > Well, this  function is a  callback called by scan_function  which can
> > > also  call  sra_modify_expr  in  the  last  stage  of  the  pass  when
> > > statements  are modified.   I have  considered splitting  the function
> > > into two but  in the end I  thought they would be too  similar and the
> > > overhead is hopefully manageable.
> > 
> > Yeah, I noticed this later.  It is somewhat confusing at first sight,
> > so maybe just amending the comment before this function could
> > clarify things.
> 
> OK, for my part, I've realized that build_access_from_expr_1 indeed
> does not use the gsi parameter (and is not a callback, unlike
> build_access_from_expr).
> 
> > > > >   if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
> > > > >     return SRA_SA_NONE;
> > > > > 
> > > > >   racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
> > > > >   lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);
> > > > 
> > > > just avoid calling into build_access_from_expr_1 for SSA_NAMEs
> > > > or is_gimple_min_invariant lhs/rhs, that should make that
> > > > function more regular.
> > > 
> > > In what sense?  build_access_from_expr_1 looks at TREE_CODE anyway and
> > > can discard the two cases, without for example looking into ADDR_EXPRs
> > > like is_gimple_min_invariant().
> > > 
> > > But if you really think it is indeed beneficial, I can do that, sure -
> > > to me it just looks ugly.
> > 
> > Ok, just keep it as is.
> > 
> > > > >   if (lacc && racc
> > > > >       && !lacc->grp_unscalarizable_region
> > > > >       && !racc->grp_unscalarizable_region
> > > > >       && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
> > > > >       && lacc->size <= racc->size
> > > > >       && useless_type_conversion_p (lacc->type, racc->type))
> > > > 
> > > > useless_type_conversion_p should be always true here.
> > > 
> > > I don't think so, build_access_from_expr_1 can look through V_C_Es and
> > > the types of accesses are the type of the operand in such cases.
> > 
> > Ok, but what is the point of looking through V_C_Es there if it makes
> > this test fail?  Hmm, IIRC this was only to track struct copies, right?
> > I guess it's ok then.
> 
> Yes, only for that.  The point of looking through V_C_Es is the size
> inconsistency.
> 
> > 
> > > > That would just be useless information.  I guess you copied this
> > > > from old SRA?
> > > 
> > > Yes.  All this fancy naming stuff  is quite useless but I find it very
> > > handy when debugging SRA issues.
> > 
> > Yeah, sort of.  Still using no name in that case will do exactly
> > the same thing ;)
> > 
> > > > > static tree
> > > > > create_access_replacement (struct access *access)
> > > > > {
> > > > >   tree repl;
> > > > > 
> > > > >   repl = make_rename_temp (access->type, "SR");
> > > > >   get_var_ann (repl);
> > > > >   add_referenced_var (repl);
> > > > > 
> > > > >   DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
> > > > >   DECL_ARTIFICIAL (repl) = 1;
> > > > > 
> > > > >   if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
> > > > 
> > > > at least && !DECL_ARTIFICIAL (access->base) I think.
> > > 
> > > This part is also largely copied from the old SRA.  So far it seems to
> > > work nicely, replacements of artificial declarations get SRsome_number
> > > fancy names and that makes  them easy to distinguish.  Nevertheless, I
> > > can change the condition if it is somehow wrong.  Or do you expect any
> > > other problems beside not-so-fancy fancy names?
> > 
> > No, it merely uses up memory.  Not that other passes do not do this ...
> > 
> > Thus, I probably do not care too much.
> 
> Well, I'd like to have the fancy names when SRA gets merged and new
> issues are likely to come up.  We can always remove this later on.

Ok.

>  
> > > > >   if (access->grp_bfr_lhs)
> > > > >     DECL_GIMPLE_REG_P (repl) = 0;
> > > > 
> > > > But you never set it (see update_address_taken for more cases,
> > > > most notably VIEW_CONVERT_EXPR on the lhs which need to be taken
> > > > care of).  You should set it for COMPLEX_TYPE and VECTOR_TYPE 
> > > > replacements.
> > > 
> > > This function  is the  only place where  I still  use make_rename_temp
> > > which sets it  exactly in these two cases.  I did  not really know why
> > > it is  required in these two  cases and only  in these two cases  so I
> > > left it there, at least for  now.  I guess I understand that now after
> > > seeing update_address_taken.
> > > 
> > > I can  replace this  with calling create_tmp_var()  and doing  all the
> > > rest that make_rename_temp does - I believe that you intend to
> > > remove it - I have just not found out why it is so bad.
> > 
> > The bad thing about it is that it supports using the SSA renamer
> > to write a single variable into SSA.  That is usually more costly
> > than just manually allocating SSA_NAMEs and updating SSA form,
> > which is usually very easy.
> > 
> > It's not used much, in which case the easiest thing might be to
> > fix all remaining uses to manually update SSA form.
> > 
> > But yes, I now see why that zeroing is necessary.
> 
> OK, I'll use create_tmp_var here too.  But at this point I cannot
> create SSA_NAMEs manually and  will basically have to do all that
> make_rename_temp does.

Ok.

> > > > 
> > > > CONVERT_EXPR_P (expr)
> > > > 
> > > > >       || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
> > > > 
> > > > VIEW_CONVERT_EXPR is also a handled_component_p.
> > > > 
> > > > Note that NOP_EXPR should never occur here - that would be invalid
> > > > gimple.  So I think you can (and should) just delete the above.
> > > 
> > > I haven't  seen a NOP_EXPR for a  while, do they still  exist in lower
> > > gimple?  Thus I have removed their handling.
> > > 
> > > Removing the diving through V_C_Es breaks Ada, though.  The reason is that
> > > we get a different size (and max_size) when calling
> > > get_ref_base_and_extent on the V_C_E and on its argument.  However, I
> > > believe both should be represented by a single access representative.
> > 
> > Yeah, I remember this :/  It is technically invalid GIMPLE that the
> > Ada FE generates though.  The size of the V_C_E result has to match
> > that of the operand.
> > 
> > Please add a FIXME before this stripping referring to the Ada problem.
> 
> OK
> 
> > > > >   tree repl = get_access_replacement (access);
> > > > >   if (!TREE_ADDRESSABLE (type))
> > > > >     {
> > > > >       tree tmp = create_tmp_var (type, "SRvce");
> > > > > 
> > > > >       add_referenced_var (tmp);
> > > > >       if (is_gimple_reg_type (type))
> > > > > 	tmp = make_ssa_name (tmp, NULL);
> > > > 
> > > > Should be always is_gimple_reg_type () if it is a type suitable for
> > > > a SRA scalar replacement. 
> > > 
> > > No, it is the type suitable for  the statement, it can be a union type
> > > or a record with only one field. But see the more thorough explanation
> > > below...
> > 
> > I think it should be always a register type, but see below... ;)
> > 
> > > > But you should set DECL_GIMPLE_REG_P for
> > > > VECTOR and COMPLEX types here.
> > > > 
> > > > >       if (write)
> > > > > 	{
> > > > > 	  gimple stmt;
> > > > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);
> > > > 
> > > > This needs to either always fold to plain 'tmp' or tmp has to be a
> > > > non-register.  Otherwise you will create invalid gimple.
> > > > 
> > > > > 	  *expr = tmp;
> > > > > 	  if (is_gimple_reg_type (type))
> > > > > 	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
> > > > 
> > > > See above.
> > > > 
> > > > > 	  stmt = gimple_build_assign (repl, conv);
> > > > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > > > 	  update_stmt (stmt);
> > > > > 	}
> > > > >       else
> > > > > 	{
> > > > > 	  gimple stmt;
> > > > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
> > > > > 
> > > > > 	  stmt = gimple_build_assign (tmp, conv);
> > > > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > > > 	  if (is_gimple_reg_type (type))
> > > > > 	    SSA_NAME_DEF_STMT (tmp) = stmt;
> > > > 
> > > > See above.  (I wonder if the patch still passes bootstrap & regtest
> > > > after the typecking patch)
> > > > 
> > > > > 	  *expr = tmp;
> > > > > 	  update_stmt (stmt);
> > > > > 	}
> > > > >     }
> > > > >   else
> > > > >     {
> > > > >       if (write)
> > > > > 	{
> > > > > 	  gimple stmt;
> > > > > 
> > > > > 	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
> > > > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > > > 	  update_stmt (stmt);
> > > > > 	}
> > > > >       else
> > > > > 	{
> > > > > 	  gimple stmt;
> > > > > 
> > > > > 	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
> > > > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > > > 	  update_stmt (stmt);
> > > > > 	}
> > > > 
> > > > I don't understand this path.  Are the types here always compatible?
> > > 
> > > And I don't really understand the comments.  The function is called by
> > > sra_modify_expr (the function doing the replacements in all non-assign
> > > statements) when it  needs to replace a reference by  a scalar but the
> > > types don't  match.  This can happen  when replacing a  V_C_E, a union
> > > access  when we  picked a  different  type that  the one  used in  the
> > > statement or (and this case can be remarkably irritating) an access to
> > > a records with only one (scalar) field.
> > > 
> > > My original idea was to simply put a V_C_E in place.  However, I
> > > believe there are places where this  is not possible - or at least one
> > > case, a  LHS of  a call statement  because V_C_Es  of gimple_registers
> > > (ssa_names) are not allowed on LHSs.  My initial idea to handle these
> > > cases was to create a new temporary with a matching type and a V_C_E
> > > assign statement  (with the V_C_E always  on the RHS -  I believe that
> > > works even  with gimple  registers) that would  do the  conversion and
> > > load/store  it   to  the  replacement  variable  (this   is  what  the
> > > !TREE_ADDRESSABLE branch does).
> > > 
> > > The problem with this idea is TREE_ADDRESSABLE types.  These types
> > > need to be  constructed and thus we cannot  create temporary variables
> > > of these types.   On the other hand they absolutely  need to be SRAed,
> > > not doing  so slows down tramp3d by  a factor of two  (and the current
> > > SRA also breaks them up).  And  quite a few C++ classes are such types
> > > that   are  "non-addressable"   and  have   only  one   scalar  field.
> > > Identifying  such records  is possible,  I  soon realized  that I  can
> > > simply leave the statement as it  is and produce a new statement to do
> > > load/store  from the original  field (that's  what the  outermost else
> > > branch does).
> > > 
> > > Does this make sense or is there some fundamental flaw in my reasoning
> > > about gimple again?  Does this explain what the function does?
> > 
> > Ok, so the case in question is
> > 
> >   struct X { int i; } x;
> >   x = foo ();
> > 
> > where you want to scalarize x.  Indeed the obvious scalarization would
> > be
> > 
> >   x = foo ();
> >   SR_1 = x.i;
> > 
> > For all other LHS cases (not calls) you can move the V_C_E to the RHS
> > and should be fine.
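> >
> > I.e. (a sketch) instead of the invalid
> >
> >   VIEW_CONVERT_EXPR<T>(SR_1) = y_2;
> >
> > one can emit
> >
> >   SR_1 = VIEW_CONVERT_EXPR<T'>(y_2);
> >
> > where T' is the type of the replacement.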
> 
> Well, I  have tried  to remove the  function and  have sra_modify_expr
> handle this  particular case only  to discover another one  where it's
> probably required  (given that  gimple verification is  correct).  The
> new  problem  is  that  I  cannot   put  a  V_C_E  as  an  operand  of
> GIMPLE_RETURN.   Specifically  I got  an  "invalid  operand in  return
> statement" failure when verifying:
> 
> return VIEW_CONVERT_EXPR<struct gnat__perfect_hash_generators__key_type>(SR.2108_12);

Indeed.  GIMPLE_RETURN needs a register argument, so

  tmp = VIEW_CONVERT_EXPR<struct gnat__perfect_hash_generators__key_type>(SR.2...;
  return tmp;

> Thus I will probably keep the function to be always safe but change it
> to the following:
> 
> /* Substitute into *EXPR an expression of type TYPE with the value of the
>    replacement of ACCESS.  This is done either by producing a special V_C_E
>    assignment statement converting the replacement to a new temporary of the
>    requested type if TYPE satisfies is_gimple_reg_type or by going through
>    the base aggregate if it is not.  */
> 
> static void
> fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
> 				 gimple_stmt_iterator *gsi, bool write)
> {
>   tree repl = get_access_replacement (access);
> 
>   if (is_gimple_reg_type (type))
>     {
>       tree tmp = create_tmp_var (type, "SRvce");
> 
>       add_referenced_var (tmp);
>       tmp = make_ssa_name (tmp, NULL);
> 
>       if (write)
> 	{
> 	  gimple stmt;
> 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);
> 
> 	  *expr = tmp;
> 	  SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
> 	  stmt = gimple_build_assign (repl, conv);
> 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> 	  update_stmt (stmt);
> 	}
>       else
> 	{
> 	  gimple stmt;
> 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
> 
> 	  stmt = gimple_build_assign (tmp, conv);
> 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> 	  SSA_NAME_DEF_STMT (tmp) = stmt;
> 	  *expr = tmp;
> 	  update_stmt (stmt);
> 	}
>     }
>   else
>     {
>       if (write)
> 	{
> 	  gimple stmt;
> 
> 	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
> 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> 	  update_stmt (stmt);
> 	}
>       else
> 	{
> 	  gimple stmt;
> 
> 	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
> 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> 	  update_stmt (stmt);
> 	}
>     }
> }

I will have a look at the current version you sent me an play with
it a bit.


> > I don't understand the TREE_ADDRESSABLE thingy yet.  Especially why
> > the type should be a non-register type.
> 
> I guess it does not matter now, but in your example above type would be
> struct x, not int.
> 
> > > It certainly passes bootstrap and testing, I use --enable-checking=yes.
> > 
> > That's good.
> > 
> > > > >     {
> > > > >       /* I have never seen this code path trigger but if it can happen the
> > > > > 	 following should handle it gracefully.  */
> > > > 
> > > > It can trigger for vector constants.
> > > 
> > > OK, I'll remove the comment.  Apparently there are none in the
> > > testsuite, I believe I tested with a gcc_unreachable here.
> > 
> > Err, for vector constants we have VECTOR_CST, so it triggers for
> > non-constant vector constructors like
> > 
> >  vector int x = { a, b, c, d };
> 
> I see.
>  
> > > > >   update_stmt (aux_stmt);
> > > > >   new_stmt = gimple_build_assign (get_access_replacement (access),
> > > > > 				  fold_build2 (COMPLEX_EXPR, access->type,
> > > > > 					       rp, ip));
> > > > >   gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
> > > > >   update_stmt (new_stmt);
> > > > 
> > > > Hm.  So you do what complex lowering does here.  Note that this may
> > > > create loads from uninitialized memory with all its problems.
> > > 
> > > Yes,  but I  have not  had any  such problems  with complex  types (as
> > > opposed to  simple loads from half-initialized  records, for example).
> > > OTOH, I have also contemplated setting DECL_GIMPLE_REG_P to zero for
> > > complex replacements which appear in an IMAGPART_EXPR or REALPART_EXPR
> > > on the LHS of a statement.
> > 
> > Yes, that's necessary.  I still think SRA should not bother about
> > this at all ;)
> > 
> > > > WRT the complex stuff.  If you would do scalarization and analysis
> > > > just on the components (not special case REAL/IMAGPART_EXPR everywhere)
> > > > it should work better, correct?  You still could handle group
> > > > scalarization for the case of for example passing a complex argument
> > > > to a function.
> > > 
> > > Well, my reasoning was that if complex types were first-class citizens
> > > in gimple  (as opposed to a record),  there was a reason  to keep them
> > > together  and  so  I  attempted   that.   But  again,  if  that  is  a
> > > misconception of mine and there  is no point in keeping them together,
> > > I will gladly remove this.
> > 
> > It's not clear.  Complex lowering decomposes all complex variables
> > to components if possible.  Again, simplifying SRA is probably better.
> 
> OK, I will find out how much this would actually simplify things and
> what new problems might arise.  
> 
> I have already tried relaxing the access tree analysis so that it does
> not  prevent scalarization  of  subaccesses of  scalar accesses  which
> would be necessary for decomposing complex components and uncovered an
> unrelated problem, again in my favorite testcase entry_4.f90.
> 
> Can you please check whether the following snippet is a valid gimple?
> The expander ICEs on an assert when trying to crunch the V_C_E at the
> end.  Looking at it myself, I start to doubt that I can always handle
> union type-punning with V_C_Es.

That's a bug in the expander.  C testcase that ICEs even w/o SRA:

float foo(int i)
{
  int j = i == 42;
  return *(float *)&j;
}

I have a fix in testing.

Richard.


* Re: [PATCH 1/5] Get rid off old external tree-sra.c stuff
  2009-04-28 12:55   ` Richard Guenther
@ 2009-05-20 10:19     ` Martin Jambor
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Jambor @ 2009-05-20 10:19 UTC (permalink / raw)
  To: GCC Patches

On Tue, Apr 28, 2009 at 02:46:45PM +0200, Richard Guenther wrote:
> On Tue, 28 Apr 2009, Martin Jambor wrote:
> 
> > This patch gets rid of all external things in the old tree-sra.
> > sra_insert_before, sra_insert_after, sra_init_cache and
> > sra_type_can_be_decomposed_p are not actually used anywhere so they
> > are made static.  insert_edge_copies_seq is used in mudflap and so I
> > copied the function there and made it static too.  The original one had
> > to be moved upwards in the file so that tree-sra compiles.  Yes, this
> > patch duplicates the function but the original copy is nuked with the
> > rest of the file by the next patch.
> > 
> > Thanks,
> > 
> > Martin
> 
> This is ok.

Re-bootstrapped, re-tested and committed as revision 147733.

Thanks, 

Martin

> 
> Thanks,
> Richard.
> 
> > 
> > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> > 
> > 	* tree-flow.h (insert_edge_copies_seq): Undeclare.
> > 	(sra_insert_before): Likewise.
> > 	(sra_insert_after): Likewise.
> > 	(sra_init_cache): Likewise.
> > 	(sra_type_can_be_decomposed_p): Likewise.
> > 
> > 	* tree-mudflap.c (insert_edge_copies_seq): Copied here from tree-sra.c
> > 
> > 	* tree-sra.c (sra_type_can_be_decomposed_p): Made static.
> > 	(sra_insert_before): Likewise.
> > 	(sra_insert_after): Likewise.
> > 	(sra_init_cache): Likewise.
> > 	(insert_edge_copies_seq): Made static and moved upwards.
> > 
> > 
> > Index: mine/gcc/tree-flow.h
> > ===================================================================
> > --- mine.orig/gcc/tree-flow.h
> > +++ mine/gcc/tree-flow.h
> > @@ -873,13 +873,6 @@ tree vn_lookup_with_vuses (tree, VEC (tr
> >  /* In tree-ssa-sink.c  */
> >  bool is_hidden_global_store (gimple);
> >  
> > -/* In tree-sra.c  */
> > -void insert_edge_copies_seq (gimple_seq, basic_block);
> > -void sra_insert_before (gimple_stmt_iterator *, gimple_seq);
> > -void sra_insert_after (gimple_stmt_iterator *, gimple_seq);
> > -void sra_init_cache (void);
> > -bool sra_type_can_be_decomposed_p (tree);
> > -
> >  /* In tree-loop-linear.c  */
> >  extern void linear_transform_loops (void);
> >  extern unsigned perfect_loop_nest_depth (struct loop *);
> > Index: mine/gcc/tree-mudflap.c
> > ===================================================================
> > --- mine.orig/gcc/tree-mudflap.c
> > +++ mine/gcc/tree-mudflap.c
> > @@ -447,6 +447,26 @@ execute_mudflap_function_ops (void)
> >    return 0;
> >  }
> >  
> > +/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
> > +   if BB has more than one edge, STMT will be replicated for each edge.
> > +   Also, abnormal edges will be ignored.  */
> > +
> > +static void
> > +insert_edge_copies_seq (gimple_seq seq, basic_block bb)
> > +{
> > +  edge e;
> > +  edge_iterator ei;
> > +  unsigned n_copies = -1;
> > +
> > +  FOR_EACH_EDGE (e, ei, bb->succs)
> > +    if (!(e->flags & EDGE_ABNORMAL))
> > +      n_copies++;
> > +
> > +  FOR_EACH_EDGE (e, ei, bb->succs)
> > +    if (!(e->flags & EDGE_ABNORMAL))
> > +      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
> > +}
> > +
> >  /* Create and initialize local shadow variables for the lookup cache
> >     globals.  Put their decls in the *_l globals for use by
> >     mf_build_check_statement_for.  */
> > Index: mine/gcc/tree-sra.c
> > ===================================================================
> > --- mine.orig/gcc/tree-sra.c
> > +++ mine/gcc/tree-sra.c
> > @@ -236,7 +236,7 @@ is_sra_scalar_type (tree type)
> >     instantiated, just that if we decide to break up the type into
> >     separate pieces that it can be done.  */
> >  
> > -bool
> > +static bool
> >  sra_type_can_be_decomposed_p (tree type)
> >  {
> >    unsigned int cache = TYPE_UID (TYPE_MAIN_VARIANT (type)) * 2;
> > @@ -1263,6 +1263,26 @@ build_element_name (struct sra_elt *elt)
> >    return XOBFINISH (&sra_obstack, char *);
> >  }
> >  
> > +/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
> > +   if BB has more than one edge, STMT will be replicated for each edge.
> > +   Also, abnormal edges will be ignored.  */
> > +
> > +static void
> > +insert_edge_copies_seq (gimple_seq seq, basic_block bb)
> > +{
> > +  edge e;
> > +  edge_iterator ei;
> > +  unsigned n_copies = -1;
> > +
> > +  FOR_EACH_EDGE (e, ei, bb->succs)
> > +    if (!(e->flags & EDGE_ABNORMAL))
> > +      n_copies++;
> > +
> > +  FOR_EACH_EDGE (e, ei, bb->succs)
> > +    if (!(e->flags & EDGE_ABNORMAL))
> > +      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
> > +}
> > +
> >  /* Instantiate an element as an independent variable.  */
> >  
> >  static void
> > @@ -2785,29 +2805,9 @@ generate_element_init (struct sra_elt *e
> >    return ret;
> >  }
> >  
> > -/* Insert a gimple_seq SEQ on all the outgoing edges out of BB.  Note that
> > -   if BB has more than one edge, STMT will be replicated for each edge.
> > -   Also, abnormal edges will be ignored.  */
> > -
> > -void
> > -insert_edge_copies_seq (gimple_seq seq, basic_block bb)
> > -{
> > -  edge e;
> > -  edge_iterator ei;
> > -  unsigned n_copies = -1;
> > -
> > -  FOR_EACH_EDGE (e, ei, bb->succs)
> > -    if (!(e->flags & EDGE_ABNORMAL)) 
> > -      n_copies++;
> > -
> > -  FOR_EACH_EDGE (e, ei, bb->succs)
> > -    if (!(e->flags & EDGE_ABNORMAL)) 
> > -      gsi_insert_seq_on_edge (e, n_copies-- > 0 ? gimple_seq_copy (seq) : seq);
> > -}
> > -
> >  /* Helper function to insert LIST before GSI, and set up line number info.  */
> >  
> > -void
> > +static void
> >  sra_insert_before (gimple_stmt_iterator *gsi, gimple_seq seq)
> >  {
> >    gimple stmt = gsi_stmt (*gsi);
> > @@ -2819,7 +2819,7 @@ sra_insert_before (gimple_stmt_iterator
> >  
> >  /* Similarly, but insert after GSI.  Handles insertion onto edges as well.  */
> >  
> > -void
> > +static void
> >  sra_insert_after (gimple_stmt_iterator *gsi, gimple_seq seq)
> >  {
> >    gimple stmt = gsi_stmt (*gsi);
> > @@ -3597,7 +3597,7 @@ debug_sra_elt_name (struct sra_elt *elt)
> >    fputc ('\n', stderr);
> >  }
> >  
> > -void 
> > +static void
> >  sra_init_cache (void)
> >  {
> >    if (sra_type_decomp_cache)
> > 
> > 
> 
> -- 
> Richard Guenther <rguenther@suse.de>
> Novell / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex


* Re: [PATCH 2/5] Make tree-complex.c:extract_component() handle V_C_Es
  2009-04-28 11:52   ` Richard Guenther
@ 2009-05-20 10:20     ` Martin Jambor
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Jambor @ 2009-05-20 10:20 UTC (permalink / raw)
  To: GCC Patches

On Tue, Apr 28, 2009 at 01:43:16PM +0200, Richard Guenther wrote:
> On Tue, 28 Apr 2009, Martin Jambor wrote:
> 
> > Currently     tree-complex.c:extract_component()     cannot     handle
> > VIEW_CONVERT_EXPRs  which  makes  the  new SRA  ICE  during  bootstrap
> > (IIRC).  This seems to be an omission so I added a label for this code
> > so that they are handled just like other handled components.
> > 
> > Thanks,
> 
> Ok.
> 
> Thanks,
> Richard.

Re-bootstrapped, re-tested and committed as revision 147733.

Thanks,

Martin

> 
> > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> > 
> > 	* tree-complex.c (extract_component): Added VIEW_CONVERT_EXPR switch
> > 	case.
> > 
> > 
> > Index: mine/gcc/tree-complex.c
> > ===================================================================
> > --- mine.orig/gcc/tree-complex.c	2009-04-25 19:11:37.000000000 +0200
> > +++ mine/gcc/tree-complex.c	2009-04-25 19:11:47.000000000 +0200
> > @@ -601,6 +601,7 @@ extract_component (gimple_stmt_iterator 
> >      case INDIRECT_REF:
> >      case COMPONENT_REF:
> >      case ARRAY_REF:
> > +    case VIEW_CONVERT_EXPR:
> >        {
> >  	tree inner_type = TREE_TYPE (TREE_TYPE (t));


* Re: [PATCH 4/5] Fix indirect inlining fallout with new intra-SRA
  2009-04-29 13:13       ` Richard Guenther
@ 2009-05-20 10:23         ` Martin Jambor
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Jambor @ 2009-05-20 10:23 UTC (permalink / raw)
  To: GCC Patches

On Wed, Apr 29, 2009 at 02:57:01PM +0200, Richard Guenther wrote:
> On Wed, 29 Apr 2009, Martin Jambor wrote:
> 
> > Hi,
> > 
> > On Tue, Apr 28, 2009 at 01:48:55PM +0200, Richard Guenther wrote:
> > > On Tue, Apr 28, 2009 at 12:04 PM, Martin Jambor <mjambor@suse.cz> wrote:
> > > > The new intra-SRA produces an extra copy assignment and that breaks
> > > > ipa-prop.c pattern matching.  The following patch fixes that.
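> > > > (Roughly: where the matched sequence used to contain e.g.
> > > > "ptr.__pfn = foo;" directly, there is now
> > > > "SR_1 = foo; ptr.__pfn = SR_1;", so the matching has to look through
> > > > the copy.)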
> > > >
> > > > Thanks,
> > > >
> > > > Martin
> > > >
> > > >
> > > > 2009-04-27  Martin Jambor  <mjambor@suse.cz>
> > > >
> > > >        * ipa-prop.c (get_ssa_def_if_simple_copy): New function.
> > > >        (determine_cst_member_ptr): Call get_ssa_def_if_simple_copy to skip
> > > >        simple copies.
> > > >
> > > >
> > > > Index: mine/gcc/ipa-prop.c
> > > > ===================================================================
> > > > --- mine.orig/gcc/ipa-prop.c
> > > > +++ mine/gcc/ipa-prop.c
> > > > @@ -456,6 +456,22 @@ fill_member_ptr_cst_jump_function (struc
> > > >   jfunc->value.member_cst.delta = delta;
> > > >  }
> > > >
> > > > +/* If RHS is an SSA_NAME and it is defined by a simple copy assign statement,
> > > > +   return the rhs of its defining statement.  */
> > > > +
> > > > +static inline tree
> > > > +get_ssa_def_if_simple_copy (tree rhs)
> > > > +{
> > > > +  if (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
> > > > +    {
> > > > +      gimple def_stmt = SSA_NAME_DEF_STMT (rhs);
> > > > +
> > > > +      if (is_gimple_assign (def_stmt) && gimple_num_ops (def_stmt) == 2)
> > > > +       rhs = gimple_assign_rhs1 (def_stmt);
> > > > +    }
> > > > +  return rhs;
> > > > +}
> > > 
> > > IMHO this function should loop.  Also use gimple_assign_single_p
> > > instead of the assign && num_ops check. 
> > 
> > OK
> > 
> > > You also  have to check  the gimple_assign_rhs_code to  be SSA_NAME,
> > > otherwise you happily look through all unary operations.
> > > 
> > 
> > Will the RHS code be SSA_NAME even when the RHS is an invariant? (I am
> > eventually looking  for an invariant,  specifically an ADDR_EXPR  of a
> > FUNCTION_DECL and an integer constant, not an ssa name.)
> 
> No.  In that case you want to check
> 
>   if (gimple_assign_single_p (def_stmt)
>       && (gimple_assign_rhs_code (def_stmt) == SSA_NAME
>           || is_gimple_min_invariant (gimple_assign_rhs1 (def_stmt))))
> 
> Richard.

The following was approved by Richi on IRC and so I re-bootstrapped,
re-tested and committed as revision 147733.

Thanks,

Martin


2009-05-20  Martin Jambor  <mjambor@suse.cz>

	* ipa-prop.c (get_ssa_def_if_simple_copy): New function.
	(determine_cst_member_ptr): Call get_ssa_def_if_simple_copy to skip
	simple copies.


Index: mine/gcc/ipa-prop.c
===================================================================
--- mine.orig/gcc/ipa-prop.c
+++ mine/gcc/ipa-prop.c
@@ -428,6 +428,24 @@ fill_member_ptr_cst_jump_function (struc
   jfunc->value.member_cst.delta = delta;
 }
 
+/* If RHS is an SSA_NAME and it is defined by a simple copy assign statement,
+   return the rhs of its defining statement.  */
+
+static inline tree
+get_ssa_def_if_simple_copy (tree rhs)
+{
+  while (TREE_CODE (rhs) == SSA_NAME && !SSA_NAME_IS_DEFAULT_DEF (rhs))
+    {
+      gimple def_stmt = SSA_NAME_DEF_STMT (rhs);
+
+      if (gimple_assign_single_p (def_stmt))
+	rhs = gimple_assign_rhs1 (def_stmt);
+      else
+	break;
+    }
+  return rhs;
+}
+
 /* Traverse statements from CALL backwards, scanning whether the argument ARG
    which is a member pointer is filled in with constant values.  If it is, fill
    the jump function JFUNC in appropriately.  METHOD_FIELD and DELTA_FIELD are
@@ -467,6 +483,7 @@ determine_cst_member_ptr (gimple call, t
       fld = TREE_OPERAND (lhs, 1);
       if (!method && fld == method_field)
 	{
+	  rhs = get_ssa_def_if_simple_copy (rhs);
 	  if (TREE_CODE (rhs) == ADDR_EXPR
 	      && TREE_CODE (TREE_OPERAND (rhs, 0)) == FUNCTION_DECL
 	      && TREE_CODE (TREE_TYPE (TREE_OPERAND (rhs, 0))) == METHOD_TYPE)
@@ -484,6 +501,7 @@ determine_cst_member_ptr (gimple call, t
 
       if (!delta && fld == delta_field)
 	{
+	  rhs = get_ssa_def_if_simple_copy (rhs);
 	  if (TREE_CODE (rhs) == INTEGER_CST)
 	    {
 	      delta = rhs;


