Subject: [autovect] [committed] fixes/cleanups
From: Dorit Nuzman @ 2007-05-22 21:17 UTC
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3993 bytes --]


A few fixups/cleanups. The only functionality change is w.r.t. reduction
handling: as mentioned in
http://gcc.gnu.org/ml/gcc-patches/2007-04/msg00044.html:
"
6) The handling of inner-loop reductions (within the context of outer-loop
vectorization) has a bug and a feature: As explained above, the inner-loop
executes sequentially. However, we use the same routines as we have now to
classify scalar-cycles; this means that (1) we miss optimization
opportunities: we think that we can't vectorize the loop if the inner-loop
operates on floats and fast-math is not enabled, and we don't support
inner-loop reductions when they are used in the inner-loop, although we
could. (2) we have a bug: stmts that are used only by operations that are
classified as a reduction are marked as "used_by_reduction", which
indicates that the order of computation when vectorizing these stmts
doesn't matter. However, in the case of an inner-loop "reduction" it does
matter, because we do preserve the order of computation. So we really need
to classify "inner-loop reductions" differently than regular reductions.
"
This patch fixes the above. W.r.t. (1) - we no longer require fast-math for
inner-loop reductions. We still check for uses inside the inner-loop (yet to
be fixed). W.r.t. (2) - an inner-loop operation, including an inner-loop
reduction, that is used in the outer-loop
("used_in_outer"/"used_in_outer_by_reduction") is treated as an operation
whose computation order needs to be preserved.
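
For illustration (not part of the patch), a minimal sketch of the kind of
loop nest this is about (array names and bounds are made up):

      for (i = 0; i < N; i++)
        {
          float sum = 0.0f;
          for (j = 0; j < M; j++)   /* inner-loop: executes sequentially  */
            sum += a[i][j];         /* inner-loop reduction: fast-math no
                                       longer required when vectorizing the
                                       i-loop  */
          b[i] = sum;               /* "used_in_outer": the computation
                                       order of sum is preserved  */
        }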

The rest is misc. minor cleanups.

Bootstrapped with vectorization enabled and tested on the vectorizer
testcases on i386-linux and ppc-linux.
Committed to autovect branch.

dorit

        * tree-vectorizer.h (nested_in_vect_loop_p): New function.
        * tree-vect-transform.c (vect_init_vector): Call
        nested_in_vect_loop_p.
        (get_initial_def_for_reduction): Likewise.
        (vect_create_epilog_for_reduction): Likewise.
        * tree-vectorizer.c (vect_is_simple_use): Remove multitypes check.
        Moved to vectorizable_* functions.
        * tree-vect-transform.c (vectorizable_reduction): Call
        nested_in_vect_loop_p. Check for multitypes in the inner-loop.
        (vectorizable_call): Likewise.
        (vectorizable_conversion): Likewise.
        (vectorizable_operation): Likewise.
        (vectorizable_type_promotion): Likewise.
        (vectorizable_type_demotion): Likewise.
        (vectorizable_store): Likewise.
        (vectorizable_live_operation): Likewise.

        * tree-vectorizer.c (supportable_widening_operation): Also check if
        used by an outer-loop reduction.
        (vect_is_simple_reduction): Takes a loop_vec_info as argument
        instead of struct loop. Call nested_in_vect_loop_p and don't require
        flag_unsafe_math_optimizations if it returns true.
        * tree-vectorizer.h (vect_is_simple_reduction): Takes a loop_vec_info
        as argument instead of struct loop.
        * tree-vect-analyze.c (vect_analyze_scalar_cycles_1): New function.
        Same code as what used to be vect_analyze_scalar_cycles, only with
        additional argument loop, and loop_info passed to
        vect_is_simple_reduction instead of loop.
        (vect_analyze_scalar_cycles): Code factored out into
        vect_analyze_scalar_cycles_1. Call it for each relevant loop-nest.
        Updated documentation.
        (vect_mark_stmts_to_be_vectorized): 'relevant' doesn't change to
        vect_used_by_reduction in case it is vect_used_in_outer*.
        (vect_analyze_loop_1): Don't call vect_analyze_scalar_cycles.
        * tree-vect-patterns.c (vect_recog_dot_prod_pattern): Check if there
        are uses in the loop.
        * tree-vect-transform.c (vectorizable_reduction): Pass loop_info to
        vect_is_simple_reduction instead of loop.

        * tree-vect-analyze.c (process_use): Remove printouts.
        * tree-vect-transform.c (get_initial_def_for_induction): Fix
        indentation.
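
For reference (also not part of the patch), a sketch of a loop nest that
still hits the FORNOW "multiple types in nested loop" check added above.
Whether a given target actually rejects it depends on its vector sizes:
the check fires only when a stmt in the inner-loop needs more than one
vector copy (ncopies > 1), e.g. when the inner-loop mixes data-types:

      for (i = 0; i < N; i++)
        {
          int sum = 0;
          for (j = 0; j < M; j++)
            sum += (int) s[i][j];   /* short -> int promotion in the
                                       inner-loop; the int operations need
                                       ncopies > 1 if the vectorization
                                       factor is derived from 'short'  */
          out[i] = sum;
        }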


Patch:

(See attached file: clean.may22.txt)

[-- Attachment #2: clean.may22.txt --]
[-- Type: text/plain, Size: 30500 bytes --]

Index: tree-vectorizer.c
===================================================================
*** tree-vectorizer.c	(revision 124473)
--- tree-vectorizer.c	(working copy)
*************** vect_is_simple_use (tree operand, loop_v
*** 1774,1801 ****
    if (vect_print_dump_info (REPORT_DETAILS))
      fprintf (vect_dump, "type of def: %d.",*dt);
  
-   /* FORNOW. We currently don't support multiple data-types in inner-loops
-      during outer-loop vectorization.  This restriction will be relaxed.  */
-   if (*def_stmt
-       && loop->inner 
-       && (loop->inner == (bb_for_stmt (*def_stmt))->loop_father))  
-     {
-       stmt_vec_info stmt_info = vinfo_for_stmt (*def_stmt);
-       tree vectype = STMT_VINFO_VECTYPE (stmt_info);
- 
-       if (vectype)
- 	{
- 	  int nunits = TYPE_VECTOR_SUBPARTS (vectype);
- 	  int ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
- 	  if (ncopies > 1)
- 	    {
- 	      if (vect_print_dump_info (REPORT_DETAILS))
- 		fprintf (vect_dump, "Multiple-types in inner-loop.");
- 	      return false;
- 	    }
- 	}
-     }
- 
    switch (TREE_CODE (*def_stmt))
      {
      case PHI_NODE:
--- 1774,1779 ----
*************** supportable_widening_operation (enum tre
*** 1864,1870 ****
       of {mult_even,mult_odd} generate the following vectors:
          vect1: [res1,res3,res5,res7], vect2: [res2,res4,res6,res8].  */
  
!    if (STMT_VINFO_RELEVANT (stmt_info) == vect_used_by_reduction)
       ordered_p = false;
     else
       ordered_p = true;
--- 1842,1849 ----
       of {mult_even,mult_odd} generate the following vectors:
          vect1: [res1,res3,res5,res7], vect2: [res2,res4,res6,res8].  */
  
!    if (STMT_VINFO_RELEVANT (stmt_info) == vect_used_by_reduction
!        || STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_outer_by_reduction)
       ordered_p = false;
     else
       ordered_p = true;
*************** reduction_code_for_scalar_code (enum tre
*** 1993,2000 ****
     Conditions 2,3 are tested in vect_mark_stmts_to_be_vectorized.  */
  
  tree
! vect_is_simple_reduction (struct loop *loop, tree phi)
  {
    edge latch_e = loop_latch_edge (loop);
    tree loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e);
    tree def_stmt, def1, def2;
--- 1972,1981 ----
     Conditions 2,3 are tested in vect_mark_stmts_to_be_vectorized.  */
  
  tree
! vect_is_simple_reduction (loop_vec_info loop_info, tree phi)
  {
+   struct loop *loop = (bb_for_stmt (phi))->loop_father;
+   struct loop *vect_loop = LOOP_VINFO_LOOP (loop_info);
    edge latch_e = loop_latch_edge (loop);
    tree loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e);
    tree def_stmt, def1, def2;
*************** vect_is_simple_reduction (struct loop *l
*** 2007,2012 ****
--- 1988,1995 ----
    imm_use_iterator imm_iter;
    use_operand_p use_p;
  
+   gcc_assert (loop == vect_loop || flow_loop_nested_p (vect_loop, loop));
+ 
    name = PHI_RESULT (phi);
    nloop_uses = 0;
    FOR_EACH_IMM_USE_FAST (use_p, imm_iter, name)
*************** vect_is_simple_reduction (struct loop *l
*** 2118,2125 ****
        return NULL_TREE;
      }
  
    /* CHECKME: check for !flag_finite_math_only too?  */
!   if (SCALAR_FLOAT_TYPE_P (type) && !flag_unsafe_math_optimizations)
      {
        /* Changing the order of operations changes the semantics.  */
        if (vect_print_dump_info (REPORT_DETAILS))
--- 2101,2116 ----
        return NULL_TREE;
      }
  
+   /* Generally, when vectorizing a reduction we change the order of the
+      computation.  This may change the behavior of the program in some
+      cases, so we need to check that this is ok.  One exception is when 
+      vectorizing an outer-loop: the inner-loop is executed sequentially,
+      and therefore vectorizing reductions in the inner-loop during 
+      outer-loop vectorization is safe.  */
+ 
    /* CHECKME: check for !flag_finite_math_only too?  */
!   if (SCALAR_FLOAT_TYPE_P (type) && !flag_unsafe_math_optimizations
!       && !nested_in_vect_loop_p (vect_loop, def_stmt)) 
      {
        /* Changing the order of operations changes the semantics.  */
        if (vect_print_dump_info (REPORT_DETAILS))
*************** vect_is_simple_reduction (struct loop *l
*** 2129,2135 ****
          }
        return NULL_TREE;
      }
!   else if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_TRAPS (type))
      {
        /* Changing the order of operations changes the semantics.  */
        if (vect_print_dump_info (REPORT_DETAILS))
--- 2120,2127 ----
          }
        return NULL_TREE;
      }
!   else if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_TRAPS (type)
! 	   && !nested_in_vect_loop_p (vect_loop, def_stmt))
      {
        /* Changing the order of operations changes the semantics.  */
        if (vect_print_dump_info (REPORT_DETAILS))
Index: tree-vectorizer.h
===================================================================
*** tree-vectorizer.h	(revision 124473)
--- tree-vectorizer.h	(working copy)
*************** loop_vec_info_for_loop (struct loop *loo
*** 160,165 ****
--- 160,172 ----
    return (loop_vec_info) loop->aux;
  }
  
+ static inline bool
+ nested_in_vect_loop_p (struct loop *loop, tree stmt)
+ {
+   return (loop->inner 
+           && (loop->inner == (bb_for_stmt (stmt))->loop_father));
+ }
+ 
  /*-----------------------------------------------------------------*/
  /* Info on vectorized defs.                                        */
  /*-----------------------------------------------------------------*/
*************** extern tree get_vectype_for_scalar_type 
*** 419,425 ****
  extern bool vect_is_simple_use (tree, loop_vec_info, tree *, tree *,
  				enum vect_def_type *);
  extern bool vect_is_simple_iv_evolution (unsigned, tree, tree *, tree *);
! extern tree vect_is_simple_reduction (struct loop *, tree);
  extern bool vect_can_force_dr_alignment_p (tree, unsigned int);
  extern enum dr_alignment_support vect_supportable_dr_alignment
    (struct data_reference *);
--- 426,432 ----
  extern bool vect_is_simple_use (tree, loop_vec_info, tree *, tree *,
  				enum vect_def_type *);
  extern bool vect_is_simple_iv_evolution (unsigned, tree, tree *, tree *);
! extern tree vect_is_simple_reduction (loop_vec_info, tree);
  extern bool vect_can_force_dr_alignment_p (tree, unsigned int);
  extern enum dr_alignment_support vect_supportable_dr_alignment
    (struct data_reference *);
Index: tree-vect-analyze.c
===================================================================
*** tree-vect-analyze.c	(revision 124473)
--- tree-vect-analyze.c	(working copy)
*************** exist_non_indexing_operands_for_use_p (t
*** 558,607 ****
  }
  
  
! /* Function vect_analyze_scalar_cycles.
! 
!    Examine the cross iteration def-use cycles of scalar variables, by
!    analyzing the loop (scalar) PHIs; Classify each cycle as one of the
!    following: invariant, induction, reduction, unknown.
!    
!    Some forms of scalar cycles are not yet supported.
! 
!    Example1: reduction: (unsupported yet)
! 
!               loop1:
!               for (i=0; i<N; i++)
!                  sum += a[i];
  
!    Example2: induction: (unsupported yet)
! 
!               loop2:
!               for (i=0; i<N; i++)
!                  a[i] = i;
! 
!    Note: the following loop *is* vectorizable:
! 
!               loop3:
!               for (i=0; i<N; i++)
!                  a[i] = b[i];
! 
!          even though it has a def-use cycle caused by the induction variable i:
! 
!               loop: i_2 = PHI (i_0, i_1)
!                     a[i_2] = ...;
!                     i_1 = i_2 + 1;
!                     GOTO loop;
! 
!          because the def-use cycle in loop3 is considered "not relevant" - i.e.,
!          it does not need to be vectorized because it is only used for array
!          indexing (see 'mark_stmts_to_be_vectorized'). The def-use cycle in
!          loop2 on the other hand is relevant (it is being written to memory).
! */
  
  static void
! vect_analyze_scalar_cycles (loop_vec_info loop_vinfo)
  {
    tree phi;
-   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    basic_block bb = loop->header;
    tree dumy;
    VEC(tree,heap) *worklist = VEC_alloc (tree, heap, 64);
--- 558,574 ----
  }
  
  
! /* Function vect_analyze_scalar_cycles_1.
  
!    Examine the cross iteration def-use cycles of scalar variables
!    in LOOP. LOOP_VINFO represents the loop that is now being
!    considered for vectorization (can be LOOP, or an outer-loop
!    enclosing LOOP).  */
  
  static void
! vect_analyze_scalar_cycles_1 (loop_vec_info loop_vinfo, struct loop *loop)
  {
    tree phi;
    basic_block bb = loop->header;
    tree dumy;
    VEC(tree,heap) *worklist = VEC_alloc (tree, heap, 64);
*************** vect_analyze_scalar_cycles (loop_vec_inf
*** 667,673 ****
        gcc_assert (is_gimple_reg (SSA_NAME_VAR (def)));
        gcc_assert (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_unknown_def_type);
  
!       reduc_stmt = vect_is_simple_reduction (loop, phi);
        if (reduc_stmt)
          {
            if (vect_print_dump_info (REPORT_DETAILS))
--- 634,640 ----
        gcc_assert (is_gimple_reg (SSA_NAME_VAR (def)));
        gcc_assert (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_unknown_def_type);
  
!       reduc_stmt = vect_is_simple_reduction (loop_vinfo, phi);
        if (reduc_stmt)
          {
            if (vect_print_dump_info (REPORT_DETAILS))
*************** vect_analyze_scalar_cycles (loop_vec_inf
*** 686,691 ****
--- 653,700 ----
  }
  
  
+ /* Function vect_analyze_scalar_cycles.
+ 
+    Examine the cross iteration def-use cycles of scalar variables, by
+    analyzing the loop-header PHIs of scalar variables; Classify each 
+    cycle as one of the following: invariant, induction, reduction, unknown.
+    We do that for the loop represented by LOOP_VINFO, and also for its
+    inner-loop, if it exists.
+    Examples for scalar cycles:
+ 
+    Example1: reduction:
+ 
+               loop1:
+               for (i=0; i<N; i++)
+                  sum += a[i];
+ 
+    Example2: induction:
+ 
+               loop2:
+               for (i=0; i<N; i++)
+                  a[i] = i;  */
+ 
+ static void
+ vect_analyze_scalar_cycles (loop_vec_info loop_vinfo)
+ {
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+ 
+   vect_analyze_scalar_cycles_1 (loop_vinfo, loop);
+ 
+   /* When vectorizing an outer-loop, the inner-loop is executed sequentially.
+      Reductions in such an inner-loop therefore have different properties than
+      the reductions in the nest that gets vectorized:
+      1. When vectorized, they are executed in the same order as in the original
+         scalar loop, so we can't change the order of computation when
+         vectorizing them.
+      2. FIXME: Inner-loop reductions can be used in the inner-loop, so the 
+         current checks are too strict.  */
+ 
+   if (loop->inner)
+     vect_analyze_scalar_cycles_1 (loop_vinfo, loop->inner);
+ }
+ 
+ 
  /* Function vect_insert_into_interleaving_chain.
  
     Insert DRA into the interleaving chain of DRB according to DRA's INIT.  */
*************** process_use (tree stmt, tree use, loop_v
*** 2343,2354 ****
        && bb->loop_father == def_bb->loop_father)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
! 	{
! 	  fprintf (vect_dump, 
! 		   "reduc-stmt defining reduc-phi in the same nest - skip.");
! 	  print_generic_stmt (vect_dump, def_stmt, TDF_SLIM);
! 	  print_generic_stmt (vect_dump, stmt, TDF_SLIM);
! 	}
        if (STMT_VINFO_IN_PATTERN_P (dstmt_vinfo))
  	dstmt_vinfo = vinfo_for_stmt (STMT_VINFO_RELATED_STMT (dstmt_vinfo));
        gcc_assert (STMT_VINFO_RELEVANT (dstmt_vinfo) < vect_used_by_reduction);
--- 2352,2358 ----
        && bb->loop_father == def_bb->loop_father)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
! 	fprintf (vect_dump, "reduc-stmt defining reduc-phi in the same nest.");
        if (STMT_VINFO_IN_PATTERN_P (dstmt_vinfo))
  	dstmt_vinfo = vinfo_for_stmt (STMT_VINFO_RELATED_STMT (dstmt_vinfo));
        gcc_assert (STMT_VINFO_RELEVANT (dstmt_vinfo) < vect_used_by_reduction);
*************** vect_mark_stmts_to_be_vectorized (loop_v
*** 2547,2563 ****
  
        if (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def)
          {
! 	  switch (relevant)
  	    {
  	    case vect_unused_in_loop:
  	      gcc_assert (TREE_CODE (stmt) != PHI_NODE);
  	      break;
  	    case vect_used_in_outer_by_reduction:
  	    case vect_used_in_outer:
  	      break;
  	    case vect_used_by_reduction:
  	      if (TREE_CODE (stmt) == PHI_NODE)
  		break;
  	    case vect_used_in_loop:
  	    default:
  	      if (vect_print_dump_info (REPORT_DETAILS))
--- 2551,2572 ----
  
        if (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def)
          {
! 	  enum vect_relevant tmp_relevant = relevant;
! 	  switch (tmp_relevant)
  	    {
  	    case vect_unused_in_loop:
  	      gcc_assert (TREE_CODE (stmt) != PHI_NODE);
+ 	      relevant = vect_used_by_reduction;
  	      break;
+ 
  	    case vect_used_in_outer_by_reduction:
  	    case vect_used_in_outer:
  	      break;
+ 
  	    case vect_used_by_reduction:
  	      if (TREE_CODE (stmt) == PHI_NODE)
  		break;
+ 	      /* fall through */
  	    case vect_used_in_loop:
  	    default:
  	      if (vect_print_dump_info (REPORT_DETAILS))
*************** vect_mark_stmts_to_be_vectorized (loop_v
*** 2565,2571 ****
  	      VEC_free (tree, heap, worklist);
  	      return false;
  	    }
- 	  relevant = vect_used_by_reduction;
  	  live_p = false;	
  	}
  
--- 2574,2579 ----
*************** vect_analyze_loop_1 (struct loop *loop)
*** 2729,2753 ****
    if (!loop_vinfo)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
!         fprintf (vect_dump, "bad loop form.");
        return NULL;
      }
  
-   /* Classify all cross-iteration scalar data-flow cycles.  */
-   /* FIXME: Since the inner-loop is executed sequentially (the iterations of
-      the outerloop are combined) inner-loop scalar-cycles that are detected
-      as a "reduction" actually have different properties than the reductions in
-      the nest that gets vectorized:
-      1. When vectorized, they are executed in the same order as in the original
- 	scalar loop, so we can't change the order of computation when 
- 	vectorizing defs that are marked as "used_by_reduction".
-      2. Inner-loop reductions can be used in the inner-loop, so the current
- 	checks are too strict.
-      So we need to differenciate between regular reductions and "inner-loop
-      reductions".  */
-       
-   vect_analyze_scalar_cycles (loop_vinfo);
- 
    return loop_vinfo;
  }
  
--- 2737,2746 ----
    if (!loop_vinfo)
      {
        if (vect_print_dump_info (REPORT_DETAILS))
!         fprintf (vect_dump, "bad inner-loop form.");
        return NULL;
      }
  
    return loop_vinfo;
  }
  
Index: tree-vect-patterns.c
===================================================================
*** tree-vect-patterns.c	(revision 124473)
--- tree-vect-patterns.c	(working copy)
*************** vect_recog_dot_prod_pattern (tree last_s
*** 161,166 ****
--- 161,171 ----
    tree type, half_type;
    tree pattern_expr;
    tree prod_type;
+   tree name;
+   imm_use_iterator imm_iter;
+   use_operand_p use_p;
+   loop_vec_info loop_info = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_info);
  
    if (TREE_CODE (last_stmt) != GIMPLE_MODIFY_STMT)
      return NULL;
*************** vect_recog_dot_prod_pattern (tree last_s
*** 195,200 ****
--- 200,222 ----
    if (TREE_CODE (expr) != PLUS_EXPR)
      return NULL;
  
+   /* A DOT_PROD_EXPR can be vectorized only if it's ok to change the
+      computation order.  Here we check if the summation has any uses
+      in LOOP, which would require maintaining the same computation order.  */
+   name = GIMPLE_STMT_OPERAND (last_stmt, 0);
+   if (TREE_CODE (name) == SSA_NAME)
+     {
+       FOR_EACH_IMM_USE_FAST (use_p, imm_iter, name)
+ 	{
+ 	  tree use_stmt = USE_STMT (use_p);
+ 	  basic_block bb = bb_for_stmt (use_stmt);
+ 	  if (flow_bb_inside_loop_p (loop, bb)
+ 	      && (TREE_CODE (use_stmt) != PHI_NODE
+ 		  || !is_loop_header_bb_p (bb)))
+             return NULL;
+ 	}
+     }
+ 
    if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
      {
        /* Has been detected as widening-summation?  */
Index: tree-vect-transform.c
===================================================================
*** tree-vect-transform.c	(revision 124473)
--- tree-vect-transform.c	(working copy)
*************** vect_init_vector (tree stmt, tree vector
*** 489,495 ****
    tree new_temp;
    basic_block new_bb;
   
!   if (loop->inner && (loop->inner == (bb_for_stmt (stmt))->loop_father))
      loop = loop->inner;
  
    new_var = vect_get_new_vect_var (vector_type, vect_simple_var, "cst_");
--- 489,495 ----
    tree new_temp;
    basic_block new_bb;
   
!   if (nested_in_vect_loop_p (loop, stmt))
      loop = loop->inner;
  
    new_var = vect_get_new_vect_var (vector_type, vect_simple_var, "cst_");
*************** get_initial_def_for_induction (tree iv_p
*** 580,586 ****
      step_expr = build_real (scalar_type, dconst0);
  
    /* Is phi in an inner-loop, while vectorizing an enclosing outer-loop?  */
!   if (loop->inner && (loop->inner == (bb_for_stmt (iv_phi))->loop_father))
      {
        nested_in_vect_loop = true;
        iv_loop = loop->inner;
--- 580,586 ----
      step_expr = build_real (scalar_type, dconst0);
  
    /* Is phi in an inner-loop, while vectorizing an enclosing outer-loop?  */
!   if (nested_in_vect_loop_p (loop, iv_phi))
      {
        nested_in_vect_loop = true;
        iv_loop = loop->inner;
*************** get_initial_def_for_induction (tree iv_p
*** 612,651 ****
      {
        /* iv_loop is the loop to be vectorized. Create:
  	 vec_init = [X, X+S, X+2*S, X+3*S] (S = step_expr, X = init_expr)  */
!   new_var = vect_get_new_vect_var (scalar_type, vect_scalar_var, "var_");
!   add_referenced_var (new_var);
  
!   new_name = force_gimple_operand (init_expr, &stmts, false, new_var);
!   if (stmts)
!     {
!       new_bb = bsi_insert_on_edge_immediate (pe, stmts);
!       gcc_assert (!new_bb);
!     }
  
!   t = NULL_TREE;
        t = tree_cons (NULL_TREE, init_expr, t);
!   for (i = 1; i < nunits; i++)
!     {
!       tree tmp;
  
  	  /* Create: new_name_i = new_name + step_expr  */
!       tmp = fold_build2 (PLUS_EXPR, scalar_type, new_name, step_expr);
!       init_stmt = build_gimple_modify_stmt (new_var, tmp);
!       new_name = make_ssa_name (new_var, init_stmt);
!       GIMPLE_STMT_OPERAND (init_stmt, 0) = new_name;
  
!       new_bb = bsi_insert_on_edge_immediate (pe, init_stmt);
!       gcc_assert (!new_bb);
  
!       if (vect_print_dump_info (REPORT_DETAILS))
!         {
!           fprintf (vect_dump, "created new init_stmt: ");
!           print_generic_expr (vect_dump, init_stmt, TDF_SLIM);
!         }
!       t = tree_cons (NULL_TREE, new_name, t);
!     }
        /* Create a vector from [new_name_0, new_name_1, ..., new_name_nunits-1]  */
!   vec = build_constructor_from_list (vectype, nreverse (t));
        vec_init = vect_init_vector (iv_phi, vec, vectype);
      }
  
--- 612,651 ----
      {
        /* iv_loop is the loop to be vectorized. Create:
  	 vec_init = [X, X+S, X+2*S, X+3*S] (S = step_expr, X = init_expr)  */
!       new_var = vect_get_new_vect_var (scalar_type, vect_scalar_var, "var_");
!       add_referenced_var (new_var);
  
!       new_name = force_gimple_operand (init_expr, &stmts, false, new_var);
!       if (stmts)
! 	{
! 	  new_bb = bsi_insert_on_edge_immediate (pe, stmts);
! 	  gcc_assert (!new_bb);
! 	}
  
!       t = NULL_TREE;
        t = tree_cons (NULL_TREE, init_expr, t);
!       for (i = 1; i < nunits; i++)
! 	{
! 	  tree tmp;
  
  	  /* Create: new_name_i = new_name + step_expr  */
! 	  tmp = fold_build2 (PLUS_EXPR, scalar_type, new_name, step_expr);
! 	  init_stmt = build_gimple_modify_stmt (new_var, tmp);
! 	  new_name = make_ssa_name (new_var, init_stmt);
! 	  GIMPLE_STMT_OPERAND (init_stmt, 0) = new_name;
  
! 	  new_bb = bsi_insert_on_edge_immediate (pe, init_stmt);
! 	  gcc_assert (!new_bb);
  
! 	  if (vect_print_dump_info (REPORT_DETAILS))
! 	    {
! 	      fprintf (vect_dump, "created new init_stmt: ");
! 	      print_generic_expr (vect_dump, init_stmt, TDF_SLIM);
! 	    }
! 	  t = tree_cons (NULL_TREE, new_name, t);
! 	}
        /* Create a vector from [new_name_0, new_name_1, ..., new_name_nunits-1]  */
!       vec = build_constructor_from_list (vectype, nreverse (t));
        vec_init = vect_init_vector (iv_phi, vec, vectype);
      }
  
*************** get_initial_def_for_reduction (tree stmt
*** 1221,1230 ****
    tree t = NULL_TREE;
    int i;
    tree vector_type;
!   bool nested_in_vect_loop= false; 
  
    gcc_assert (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type));
!   if (loop->inner && (loop->inner == (bb_for_stmt (stmt))->loop_father))
      nested_in_vect_loop = true;
    else
      gcc_assert (loop == (bb_for_stmt (stmt))->loop_father);
--- 1221,1230 ----
    tree t = NULL_TREE;
    int i;
    tree vector_type;
!   bool nested_in_vect_loop = false; 
  
    gcc_assert (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type));
!   if (nested_in_vect_loop_p (loop, stmt))
      nested_in_vect_loop = true;
    else
      gcc_assert (loop == (bb_for_stmt (stmt))->loop_father);
*************** vect_create_epilog_for_reduction (tree v
*** 1347,1353 ****
    bool nested_in_vect_loop = false;
    int op_type;
    
!   if (loop->inner && (loop->inner == (bb_for_stmt (stmt))->loop_father))
      {
        loop = loop->inner;
        nested_in_vect_loop = true;
--- 1347,1353 ----
    bool nested_in_vect_loop = false;
    int op_type;
    
!   if (nested_in_vect_loop_p (loop, stmt))
      {
        loop = loop->inner;
        nested_in_vect_loop = true;
*************** vectorizable_reduction (tree stmt, block
*** 1763,1774 ****
    tree new_stmt = NULL_TREE;
    int j;
  
!   if (loop->inner && (loop->inner == (bb_for_stmt (stmt))->loop_father))
      {
        loop = loop->inner;
        /* FORNOW. This restriction should be relaxed.  */
        if (ncopies > 1)
! 	return false;
      }
  
    gcc_assert (ncopies >= 1);
--- 1763,1778 ----
    tree new_stmt = NULL_TREE;
    int j;
  
!   if (nested_in_vect_loop_p (loop, stmt))
      {
        loop = loop->inner;
        /* FORNOW. This restriction should be relaxed.  */
        if (ncopies > 1)
! 	{
! 	  if (vect_print_dump_info (REPORT_DETAILS))
! 	    fprintf (vect_dump, "multiple types in nested loop.");
! 	  return false;
! 	}
      }
  
    gcc_assert (ncopies >= 1);
*************** vectorizable_reduction (tree stmt, block
*** 1839,1847 ****
    gcc_assert (dt == vect_reduction_def);
    gcc_assert (TREE_CODE (def_stmt) == PHI_NODE);
    if (orig_stmt) 
!     gcc_assert (orig_stmt == vect_is_simple_reduction (loop, def_stmt));
    else
!     gcc_assert (stmt == vect_is_simple_reduction (loop, def_stmt));
    
    if (STMT_VINFO_LIVE_P (vinfo_for_stmt (def_stmt)))
      return false;
--- 1843,1851 ----
    gcc_assert (dt == vect_reduction_def);
    gcc_assert (TREE_CODE (def_stmt) == PHI_NODE);
    if (orig_stmt) 
!     gcc_assert (orig_stmt == vect_is_simple_reduction (loop_vinfo, def_stmt));
    else
!     gcc_assert (stmt == vect_is_simple_reduction (loop_vinfo, def_stmt));
    
    if (STMT_VINFO_LIVE_P (vinfo_for_stmt (def_stmt)))
      return false;
*************** vectorizable_call (tree stmt, block_stmt
*** 2062,2067 ****
--- 2066,2072 ----
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt), prev_stmt_info;
    tree vectype_out, vectype_in;
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    tree fndecl, rhs, new_temp, def, def_stmt, rhs_type, lhs_type;
    enum vect_def_type dt[2];
    int ncopies, j, nargs;
*************** vectorizable_call (tree stmt, block_stmt
*** 2169,2174 ****
--- 2174,2187 ----
  	     / TYPE_VECTOR_SUBPARTS (vectype_out));
    gcc_assert (ncopies >= 1);
  
+   /* FORNOW. This restriction should be relaxed.  */
+   if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "multiple types in nested loop.");
+       return false;
+     }
+ 
    /* Handle def.  */
    scalar_dest = GIMPLE_STMT_OPERAND (stmt, 0);
    vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
*************** vectorizable_conversion (tree stmt, bloc
*** 2241,2246 ****
--- 2254,2260 ----
    tree vec_oprnd0 = NULL_TREE;
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    enum tree_code code;
    tree new_temp;
    tree def, def_stmt;
*************** vectorizable_conversion (tree stmt, bloc
*** 2307,2312 ****
--- 2321,2334 ----
    ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
    gcc_assert (ncopies >= 1);
  
+   /* FORNOW. This restriction should be relaxed.  */
+   if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+ 	fprintf (vect_dump, "multiple types in nested loop.");
+       return false;
+     }
+ 
    if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt0))
      {
        if (vect_print_dump_info (REPORT_DETAILS))
*************** vectorizable_operation (tree stmt, block
*** 2555,2560 ****
--- 2577,2583 ----
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    tree vectype = STMT_VINFO_VECTYPE (stmt_info);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    enum tree_code code;
    enum machine_mode vec_mode;
    tree new_temp;
*************** vectorizable_operation (tree stmt, block
*** 2573,2578 ****
--- 2596,2608 ----
    int j;
  
    gcc_assert (ncopies >= 1);
+   /* FORNOW. This restriction should be relaxed.  */
+   if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "multiple types in nested loop.");
+       return false;
+     }
  
    if (!STMT_VINFO_RELEVANT_P (stmt_info))
      return false;
*************** vectorizable_type_demotion (tree stmt, b
*** 2826,2831 ****
--- 2856,2862 ----
    tree vec_oprnd0=NULL, vec_oprnd1=NULL;
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    enum tree_code code;
    tree new_temp;
    tree def, def_stmt;
*************** vectorizable_type_demotion (tree stmt, b
*** 2882,2887 ****
--- 2913,2925 ----
  
    ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out;
    gcc_assert (ncopies >= 1);
+   /* FORNOW. This restriction should be relaxed.  */
+   if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "multiple types in nested loop.");
+       return false;
+     }
  
    if (! ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
  	  && INTEGRAL_TYPE_P (TREE_TYPE (op0)))
*************** vectorizable_type_promotion (tree stmt, 
*** 3040,3045 ****
--- 3078,3084 ----
    tree vec_oprnd0=NULL, vec_oprnd1=NULL;
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    enum tree_code code, code1 = CODE_FOR_nothing, code2 = CODE_FOR_nothing;
    tree decl1 = NULL_TREE, decl2 = NULL_TREE;
    int op_type; 
*************** vectorizable_type_promotion (tree stmt, 
*** 3085,3090 ****
--- 3124,3136 ----
    nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
    ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
    gcc_assert (ncopies >= 1);
+   /* FORNOW. This restriction should be relaxed.  */
+   if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "multiple types in nested loop.");
+       return false;
+     }
  
    scalar_dest = GIMPLE_STMT_OPERAND (stmt, 0);
    vectype_out = get_vectype_for_scalar_type (TREE_TYPE (scalar_dest));
*************** vectorizable_store (tree stmt, block_stm
*** 3381,3386 ****
--- 3427,3433 ----
    struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info), *first_dr = NULL;
    tree vectype = STMT_VINFO_VECTYPE (stmt_info);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    enum machine_mode vec_mode;
    tree dummy;
    enum dr_alignment_support alignment_support_cheme;
*************** vectorizable_store (tree stmt, block_stm
*** 3400,3405 ****
--- 3447,3459 ----
    VEC(tree,heap) *scalar_oprnds = NULL, *vec_oprnds = NULL;
  
    gcc_assert (ncopies >= 1);
+   /* FORNOW. This restriction should be relaxed.  */
+   if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "multiple types in nested loop.");
+       return false;
+     }
  
    if (!STMT_VINFO_RELEVANT_P (stmt_info))
      return false;
*************** vectorizable_load (tree stmt, block_stmt
*** 4156,4161 ****
--- 4211,4225 ----
    bool strided_load = false;
    tree first_stmt;

+   gcc_assert (ncopies >= 1);
+   /* FORNOW. This restriction should be relaxed.  */
+   if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "multiple types in nested loop.");
+       return false;
+     }
+
    if (!STMT_VINFO_RELEVANT_P (stmt_info))
      return false;

*************** vectorizable_live_operation (tree stmt,
*** 4451,4456 ****
--- 4515,4521 ----
    tree operation;
    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
    int i;
    int op_type;
    tree op;
*************** vectorizable_live_operation (tree stmt,
*** 4468,4473 ****
--- 4533,4542 ----
    if (TREE_CODE (GIMPLE_STMT_OPERAND (stmt, 0)) != SSA_NAME)
      return false;
  
+   /* FORNOW. CHECKME. */
+   if (nested_in_vect_loop_p (loop, stmt))
+     return false;
+ 
    operation = GIMPLE_STMT_OPERAND (stmt, 1);
    op_type = TREE_OPERAND_LENGTH (operation);
  
