public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [0/7] Type promotion pass and elimination of zext/sext
@ 2015-09-07  2:55 Kugan
  2015-09-07  2:57 ` [1/7] Add new tree code SEXT_EXPR Kugan
                   ` (8 more replies)
  0 siblings, 9 replies; 63+ messages in thread
From: Kugan @ 2015-09-07  2:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener


This is a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and split the patch to make it easier to review.
There are still a couple of issues to be addressed and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me that there is a latent
issue which gets exposed by my patch. I can also reproduce this on x86_64
if I use the same PROMOTE_MODE that is used in the aarch64 port. For the
time being, I am using patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This needs to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g ("Error: unaligned opcodes detected in executable segment"). It
works fine if I remove the -g. I am looking into it; this needs to be
fixed as well.

In the meantime, I would appreciate it if you could take some time to review this.

I have bootstrapped and regression tested on x86_64-linux-gnu,
arm-linux-gnu and aarch64-linux-gnu (with the workaround).

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [1/7] Add new tree code SEXT_EXPR
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
@ 2015-09-07  2:57 ` Kugan
  2015-09-15 13:20   ` Richard Biener
  2015-09-07  2:58 ` [2/7] Add new type promotion pass Kugan
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-07  2:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 507 bytes --]


This patch adds support for the new tree code SEXT_EXPR.
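
For reference (my reading of the patch; the C function below is only an
illustration and not part of the patch), SEXT_EXPR <OP0, N> keeps the
low N bits of OP0 and sign extends the result from bit N - 1 to the
full precision of the type.  For a 32-bit int and N == 8 that is
roughly:

  /* Hypothetical helper showing what SEXT_EXPR <x, 8> computes:
     keep the low 8 bits of X and sign extend from bit 7.  */
  int
  sext_from_bit_8 (int x)
  {
    return (x & 0x80) ? (x | ~0xff) : (x & 0xff);
  }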

gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
	* expr.c (expand_expr_real_2): Likewise.
	* fold-const.c (int_const_binop_1): Likewise.
	* tree-cfg.c (verify_gimple_assign_binary): Likewise.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	(op_symbol_code): Likewise.
	* tree.def: Define new tree code SEXT_EXPR.

[-- Attachment #2: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 4579 bytes --]

From 9e9fd271b84580ae40ce21eb39f9be8072e6dd12 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:37:15 +1000
Subject: [PATCH 1/8] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         |  4 ++++
 gcc/expr.c              | 16 ++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  4 ++++
 7 files changed, 52 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index d567a87..bbc3c10 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5071,6 +5071,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      return op0;
+
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index 1e820b4..bcd87c0 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9273,6 +9273,22 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  rtx op0 = expand_normal (treeop0);
+	  rtx temp;
+	  if (!target)
+	    target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+	  machine_mode inner_mode
+	    = smallest_mode_for_size (tree_to_shwi (treeop1),
+				      MODE_INT);
+	  temp = convert_modes (inner_mode,
+				TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+	  convert_move (target, temp, 0);
+	  return target;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index c826e67..473f930 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -984,6 +984,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5ac73b3..c9ad28d 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3756,6 +3756,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index e1ceea4..272c409 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3884,6 +3884,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..04f6777 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1794,6 +1794,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext from bit";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..d614544 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,10 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It sign extends the first operand from
+   the sign bit specified by the second operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [2/7] Add new type promotion pass
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
  2015-09-07  2:57 ` [1/7] Add new tree code SEXT_EXPR Kugan
@ 2015-09-07  2:58 ` Kugan
  2015-10-15  5:52   ` Kugan
  2015-09-07  3:00 ` [3/7] Optimize ZEXT_EXPR with tree-vrp Kugan
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-07  2:58 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 751 bytes --]


This pass applies type promotion to SSA names in the function and
inserts appropriate truncations to preserve the semantics.  The idea of
this pass is to promote operations in such a way that we can minimize
the generation of subregs in RTL, which in turn results in the removal
of redundant zero/sign extensions.
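
As a rough illustration (made-up pseudo-GIMPLE, assuming a target whose
PROMOTE_MODE widens short to int), something like

  short _1, _2, _3;
  ...
  _3 = _1 + _2;
  if (_3 > 10)
    ...

is rewritten along the lines of

  int _1, _2, _3, _4;
  ...
  _3 = _1 + _2;
  _4 = SEXT_EXPR <_3, 16>;   /* restore short semantics */
  if (_4 > 10)
    ...

so the arithmetic happens in the promoted (word) mode and only the
statements that depend on the original precision (comparisons, shifts,
divisions, ...) get an explicit sign/zero extension of the value.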

gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* Makefile.in: Add gimple-ssa-type-promote.o.
	* common.opt: New option -ftree-type-promote.
	* doc/invoke.texi: Document -ftree-type-promote.
	* gimple-ssa-type-promote.c: New file.
	* passes.def: Define new pass_type_promote.
	* timevar.def: Define new TV_TREE_TYPE_PROMOTE.
	* tree-pass.h (make_pass_type_promote): New.
	* tree-ssanames.c (set_range_info): Adjust range_info.

[-- Attachment #2: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 28479 bytes --]

From c63cc2e1253a7d3544ba35a15dda2fde0d0380e4 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:44:50 +1000
Subject: [PATCH 2/8] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 809 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 gcc/tree-ssanames.c           |   3 +-
 8 files changed, 829 insertions(+), 1 deletion(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 3d1c1e5..2fb5174 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1494,6 +1494,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 94d1d88..b5a93b0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2378,6 +2378,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c0ec0fd..7eeabcd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8956,6 +8956,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea
+of this pass is to promote operations in such a way that we can
+minimise the generation of subregs in RTL, which in turn results in
+the removal of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..62b5fdc
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,809 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of
+   subregs in RTL, which in turn results in the removal of redundant
+   zero/sign extensions.  This pass runs prior to VRP and DOM so that
+   they are able to optimise redundant truncations and extensions.
+   This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || TYPE_PRECISION (type) % 8 != 0)
+    return type;
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple stmt, gimple copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      ||code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines def
+   is def_stmt, make the type of def promoted_type.  If the stmt is such
+   that the result of the def_stmt cannot be of promoted_type, create a new_def
+   of the original_type and make the def_stmt assign its value to new_def.
+   Then, create a CONVERT_EXPR to convert new_def to def of promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, arguments
+		     of def_stmt will be visited before visiting def.  If RHS
+		     is already promoted and its type is compatible, we can
+		     convert this into a ZERO/SIGN EXTEND stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		    type = original_type;
+		  gcc_assert (type != NULL_TREE);
+		  TREE_TYPE (def) = promoted_type;
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, rhs,
+					   TYPE_PRECISION (type));
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_replace (&gsi, copy_stmt, false);
+		}
+	      else
+		{
+		  /* If RHS is not promoted OR their types are not
+		     compatible, create CONVERT_EXPR that converts
+		     RHS to  promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0)
+		    insert_stmt_on_edge (def_stmt, copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* Type is now promoted.  Due to this, some of the value ranges computed
+	 by VRP1 will be invalid.  TODO: We could be smarter in deciding
+	 which ranges to invalidate instead of invalidating everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt))
+		{
+		  /* In some stmts, value in USE has to be ZERO/SIGN
+		     Extended based on the original type for correct
+		     result.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* Type of LHS and promoted RHS are compatible, we can
+			 convert this into ZERO/SIGN EXTEND stmt.  */
+		      gimple copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can convert it to ZERO/SIGN
+			 EXTEND when LHS is promoted.  */
+		      tree rhs = gimple_assign_rhs1 (stmt);
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      type = TREE_TYPE (old_type);
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	    {
+	      /* In GIMPLE_COND, value in USE has to be ZERO/SIGN
+		 Extended based on the original type for correct
+		 result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	      update_stmt (stmt);
+	      break;
+	    }
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create an
+	     original type copy.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_definition (name, type);
+      fixup_uses (name, type, old_type);
+      set_ssa_promoted (name);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 64fc4d9..254496b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index ac41075..80171ec 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -276,6 +276,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 7b66a1c..7ddb55c 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -431,6 +431,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 910cb19..19aa918 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -190,7 +190,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
  2015-09-07  2:57 ` [1/7] Add new tree code SEXT_EXPR Kugan
  2015-09-07  2:58 ` [2/7] Add new type promotion pass Kugan
@ 2015-09-07  3:00 ` Kugan
  2015-09-15 13:18   ` Richard Biener
  2015-09-07  3:01 ` [4/7] Use correct promoted mode sign for result of GIMPLE_CALL Kugan
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-07  3:00 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 290 bytes --]

This patch adds tree-vrp handling and optimization for the new SEXT_EXPR
(zero extensions are expressed as BIT_AND_EXPR, which VRP already handles).
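
As a rough illustration (made-up SSA names and ranges, not taken from
the patch), consider

  _1 = x_2 & 127;            /* VRP knows _1 is in [0, 127]  */
  _3 = SEXT_EXPR <_1, 8>;    /* sign extend _1 from bit 7    */

Since the range of _1 shows that bit 7 and all higher bits are already
zero, the sign extension cannot change the value and the SEXT_EXPR can
be simplified to a plain copy _3 = _1.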



gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* tree-vrp.c (extract_range_from_binary_expr_1): Handle SEXT_EXPR.
	(simplify_bit_ops_using_ranges): Likewise.
	(simplify_stmt_using_ranges): Likewise.

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3693 bytes --]

From 7143e0575f309f70d838edf436b555fb93a6c4bb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:45:52 +1000
Subject: [PATCH 3/8] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 21fbed0..d579b49 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2327,6 +2327,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2887,6 +2888,55 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));
+      type_min = wi::shwi (1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  HOST_WIDE_INT int_may_be_nonzero = may_be_nonzero.to_uhwi ();
+	  HOST_WIDE_INT int_must_be_nonzero = must_be_nonzero.to_uhwi ();
+
+	  if (int_must_be_nonzero & (1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero;
+	    }
+	  else if ((int_may_be_nonzero & (1 << (prec - 1))) == 0)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero;
+	      tmax = may_be_nonzero;
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      tmin = wi::sext (tmin, prec - 1);
+      tmax = wi::sext (tmax, prec - 1);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9254,6 +9304,30 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  gcc_assert (is_gimple_min_invariant (op1));
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int mask;
+	  HOST_WIDE_INT may_be_nonzero = may_be_nonzero0.to_uhwi ();
+	  HOST_WIDE_INT must_be_nonzero = must_be_nonzero0.to_uhwi ();
+	  mask = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+	  mask = wi::bit_not (mask);
+	  if (must_be_nonzero & (1 << (prec - 1)))
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if ((may_be_nonzero & (1 << (prec - 1))) == 0)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9955,6 +10029,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [5/7] Allow gimple debug stmt in widen mode
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
                   ` (3 preceding siblings ...)
  2015-09-07  3:01 ` [4/7] Use correct promoted mode sign for result of GIMPLE_CALL Kugan
@ 2015-09-07  3:01 ` Kugan
  2015-09-07 13:46   ` Michael Matz
  2015-09-07  3:03 ` Kugan
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-07  3:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 282 bytes --]

Allow GIMPLE_DEBUG stmts with values in promoted registers.
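
For example (hypothetical dump, not from the patch): after the
promotion pass, the value of a char variable may only be available in
an int-typed SSA name, e.g.

  int _5;
  ...
  # DEBUG c => _5

where c is declared char.  Previously the promotion pass simply removed
such debug stmts and expand asserted that the mode of the debug location
matches the mode of the value; with this change the debug stmt is kept
and the wider promoted value is accepted when expanding debug locations.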


gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
	SSA_NAME that was set by GIMPLE_CALL and assigned to another
	SSA_NAME of same type.

[-- Attachment #2: 0005-debug-stmt-in-widen-mode.patch --]
[-- Type: text/x-diff, Size: 2222 bytes --]

From a28de63bcbb9f315cee7e41be11b65b3ff521a91 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Tue, 1 Sep 2015 08:40:40 +1000
Subject: [PATCH 5/8] debug stmt in widen mode

---
 gcc/cfgexpand.c               | 11 -----------
 gcc/gimple-ssa-type-promote.c |  7 -------
 gcc/rtl.h                     |  2 ++
 3 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index bbc3c10..036085a 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5240,7 +5240,6 @@ expand_debug_locations (void)
 	tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
 	rtx val;
 	rtx_insn *prev_insn, *insn2;
-	machine_mode mode;
 
 	if (value == NULL_TREE)
 	  val = NULL_RTX;
@@ -5275,16 +5274,6 @@ expand_debug_locations (void)
 
 	if (!val)
 	  val = gen_rtx_UNKNOWN_VAR_LOC ();
-	else
-	  {
-	    mode = GET_MODE (INSN_VAR_LOCATION (insn));
-
-	    gcc_assert (mode == GET_MODE (val)
-			|| (GET_MODE (val) == VOIDmode
-			    && (CONST_SCALAR_INT_P (val)
-				|| GET_CODE (val) == CONST_FIXED
-				|| GET_CODE (val) == LABEL_REF)));
-	  }
 
 	INSN_VAR_LOCATION_LOC (insn) = val;
 	prev_insn = PREV_INSN (insn);
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index 62b5fdc..6805b9c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -570,13 +570,6 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
       bool do_not_promote = false;
       switch (gimple_code (stmt))
 	{
-	case GIMPLE_DEBUG:
-	    {
-	      gsi = gsi_for_stmt (stmt);
-	      gsi_remove (&gsi, true);
-	      break;
-	    }
-
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
 	case GIMPLE_RETURN:
diff --git a/gcc/rtl.h b/gcc/rtl.h
index ac56133..c3cdf96 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2100,6 +2100,8 @@ wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT *,
 	   targets is 1 rather than -1.  */
 	gcc_checking_assert (INTVAL (x.first)
 			     == sext_hwi (INTVAL (x.first), precision)
+			     || INTVAL (x.first)
+			     == (INTVAL (x.first) & ((1 << precision) - 1))
 			     || (x.second == BImode && INTVAL (x.first) == 1));
 
       return wi::storage_ref (&INTVAL (x.first), 1, precision);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [4/7] Use correct promoted mode sign for result of GIMPLE_CALL
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
                   ` (2 preceding siblings ...)
  2015-09-07  3:00 ` [3/7] Optimize ZEXT_EXPR with tree-vrp Kugan
@ 2015-09-07  3:01 ` Kugan
  2015-09-07 13:16   ` Michael Matz
  2015-09-07  3:01 ` [5/7] Allow gimple debug stmt in widen mode Kugan
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-07  3:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]



For the following testcase (compiling with -O1; -O2 works fine), we have
a stmt with stmt_code SSA_NAME (_7 = _6) for which _6 is defined by a
GIMPLE_CALL. In this case, we are using the wrong SUBREG promoted mode,
resulting in wrong code. Simple SSA_NAME copies are generally optimized
away, but when they are not, we can end up using the wrong promoted mode.
The attached patch fixes the case where we have one such copy. I think it
might be better to do this in a while loop, but I don't think longer copy
chains can happen in practice. Please let me know what you think.

  _6 = bar5 (-10);
  ...
  _7 = _6;
  _3 = (long unsigned int) _6;
  ...
  if (_3 != l5.0_4)


for
extern void abort (void);

__attribute__ ((noinline))
static unsigned short int foo5 (int x)
{
  return x;
}

__attribute__ ((noinline))
short int bar5 (int x)
{
  return foo5 (x + 6);
}

unsigned long l5 = (short int) -4;

int
main (void)
{
  if (bar5 (-10) != l5)
    abort ();
  return 0;
}

gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
	SSA_NAME that was set by GIMPLE_CALL and assigned to another
	SSA_NAME of same type.

[-- Attachment #2: 0004-use-correct-promoted-sign-for-result-of-GIMPLE_CALL.patch --]
[-- Type: text/x-diff, Size: 1231 bytes --]

From 64ac68bfda1d3e8487827512e6d163b384e8a1cf Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Wed, 2 Sep 2015 12:18:41 +1000
Subject: [PATCH 4/8] use correct promoted sign for result of GIMPLE_CALL

---
 gcc/expr.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index bcd87c0..6dac3cf 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9633,7 +9633,22 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 					   gimple_call_fntype (g),
 					   2);
 	  else
-	    pmode = promote_ssa_mode (ssa_name, &unsignedp);
+	    {
+	      tree rhs;
+	      gimple stmt;
+	      if (code == SSA_NAME
+		  && is_gimple_assign (g)
+		  && (rhs = gimple_assign_rhs1 (g))
+		  && TREE_CODE (rhs) == SSA_NAME
+		  && (stmt = SSA_NAME_DEF_STMT (rhs))
+		  && gimple_code (stmt) == GIMPLE_CALL
+		  && !gimple_call_internal_p (stmt))
+		pmode = promote_function_mode (type, mode, &unsignedp,
+					       gimple_call_fntype (stmt),
+					       2);
+	      else
+		pmode = promote_ssa_mode (ssa_name, &unsignedp);
+	    }
 	  gcc_assert (GET_MODE (decl_rtl) == pmode);
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [6/7] Temporary workaround to get aarch64 bootstrap
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
                   ` (5 preceding siblings ...)
  2015-09-07  3:03 ` Kugan
@ 2015-09-07  3:03 ` Kugan
  2015-09-07  5:54 ` [7/7] Adjust-arm-test cases Kugan
  2015-10-20 20:13 ` [0/7] Type promotion pass and elimination of zext/sext Kugan
  8 siblings, 0 replies; 63+ messages in thread
From: Kugan @ 2015-09-07  3:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 469 bytes --]


This works around an AARCH64 bootstrap problem that started happening
with the commit 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c
is mis-compiled in stage2 and due to this fwprop.c fails. It looks to me
that there is a latent issue which gets exposed by my patch. I can also
reproduce this on x86_64 if I use the same PROMOTE_MODE that is used in
the aarch64 port. For the time being, I am using patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround.

[-- Attachment #2: 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch --]
[-- Type: text/x-diff, Size: 1200 bytes --]

From 6a10c856374446ab6d18eb9ce840c08cac440a61 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Tue, 1 Sep 2015 08:44:59 +1000
Subject: [PATCH 6/8] temporary workaround for bootstrap failure due to copy
 coalescing

---
 gcc/tree-ssa-coalesce.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 6468012..b18f0b8 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -1384,11 +1384,13 @@ gimple_can_coalesce_p (tree name1, tree name2)
 	 SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
 	 coalesce its SSA versions with those of any other variables,
 	 because it may be passed by reference.  */
-      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
+      return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)));
+#if 0
 	|| (/* The case var1 == var2 is already covered above.  */
 	    !parm_in_stack_slot_p (var1)
 	    && !parm_in_stack_slot_p (var2)
 	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
+#endif
     }
 
   /* If the types are not the same, check for a canonical type match.  This
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [5/7] Allow gimple debug stmt in widen mode
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
                   ` (4 preceding siblings ...)
  2015-09-07  3:01 ` [5/7] Allow gimple debug stmt in widen mode Kugan
@ 2015-09-07  3:03 ` Kugan
  2015-09-07  3:03 ` [6/7] Temporary workaround to get aarch64 bootstrap Kugan
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 63+ messages in thread
From: Kugan @ 2015-09-07  3:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 282 bytes --]

Allow GIMPLE_DEBUG stmts with values in promoted registers.


gcc/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
	SSA_NAME that was set by GIMPLE_CALL and assigned to another
	SSA_NAME of same type.

[-- Attachment #2: 0005-debug-stmt-in-widen-mode.patch --]
[-- Type: text/x-diff, Size: 2222 bytes --]

From a28de63bcbb9f315cee7e41be11b65b3ff521a91 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Tue, 1 Sep 2015 08:40:40 +1000
Subject: [PATCH 5/8] debug stmt in widen mode

---
 gcc/cfgexpand.c               | 11 -----------
 gcc/gimple-ssa-type-promote.c |  7 -------
 gcc/rtl.h                     |  2 ++
 3 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index bbc3c10..036085a 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5240,7 +5240,6 @@ expand_debug_locations (void)
 	tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
 	rtx val;
 	rtx_insn *prev_insn, *insn2;
-	machine_mode mode;
 
 	if (value == NULL_TREE)
 	  val = NULL_RTX;
@@ -5275,16 +5274,6 @@ expand_debug_locations (void)
 
 	if (!val)
 	  val = gen_rtx_UNKNOWN_VAR_LOC ();
-	else
-	  {
-	    mode = GET_MODE (INSN_VAR_LOCATION (insn));
-
-	    gcc_assert (mode == GET_MODE (val)
-			|| (GET_MODE (val) == VOIDmode
-			    && (CONST_SCALAR_INT_P (val)
-				|| GET_CODE (val) == CONST_FIXED
-				|| GET_CODE (val) == LABEL_REF)));
-	  }
 
 	INSN_VAR_LOCATION_LOC (insn) = val;
 	prev_insn = PREV_INSN (insn);
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index 62b5fdc..6805b9c 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -570,13 +570,6 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
       bool do_not_promote = false;
       switch (gimple_code (stmt))
 	{
-	case GIMPLE_DEBUG:
-	    {
-	      gsi = gsi_for_stmt (stmt);
-	      gsi_remove (&gsi, true);
-	      break;
-	    }
-
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
 	case GIMPLE_RETURN:
diff --git a/gcc/rtl.h b/gcc/rtl.h
index ac56133..c3cdf96 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2100,6 +2100,8 @@ wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT *,
 	   targets is 1 rather than -1.  */
 	gcc_checking_assert (INTVAL (x.first)
 			     == sext_hwi (INTVAL (x.first), precision)
+			     || INTVAL (x.first)
+			     == (INTVAL (x.first) & ((1 << precision) - 1))
 			     || (x.second == BImode && INTVAL (x.first) == 1));
 
       return wi::storage_ref (&INTVAL (x.first), 1, precision);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [7/7] Adjust-arm-test cases
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
                   ` (6 preceding siblings ...)
  2015-09-07  3:03 ` [6/7] Temporary workaround to get aarch64 bootstrap Kugan
@ 2015-09-07  5:54 ` Kugan
  2015-11-02 11:43   ` Richard Earnshaw
  2015-10-20 20:13 ` [0/7] Type promotion pass and elimination of zext/sext Kugan
  8 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-07  5:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 295 bytes --]



gcc/testsuite/ChangeLog:

2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* gcc.target/arm/mla-2.c: Scan for wider mode operation.
	* gcc.target/arm/wmul-1.c: Likewise.
	* gcc.target/arm/wmul-2.c: Likewise.
	* gcc.target/arm/wmul-3.c: Likewise.
	* gcc.target/arm/wmul-9.c: Likewise.

[-- Attachment #2: 0007-adjust-arm-testcases.patch --]
[-- Type: text/x-diff, Size: 2556 bytes --]

From 305c526b4019fc11260c474143f6829be2cc3f54 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Wed, 2 Sep 2015 12:21:46 +1000
Subject: [PATCH 7/8] adjust arm testcases

---
 gcc/testsuite/gcc.target/arm/mla-2.c  | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-1.c | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-2.c | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-3.c | 2 +-
 gcc/testsuite/gcc.target/arm/wmul-9.c | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mla-2.c b/gcc/testsuite/gcc.target/arm/mla-2.c
index 1e3ca20..474bce0 100644
--- a/gcc/testsuite/gcc.target/arm/mla-2.c
+++ b/gcc/testsuite/gcc.target/arm/mla-2.c
@@ -7,4 +7,4 @@ long long foolong (long long x, short *a, short *b)
     return x + *a * *b;
 }
 
-/* { dg-final { scan-assembler "smlalbb" } } */
+/* { dg-final { scan-assembler "smla" } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-1.c b/gcc/testsuite/gcc.target/arm/wmul-1.c
index ddddd50..d4e7b41 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-1.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-1.c
@@ -16,4 +16,4 @@ int mac(const short *a, const short *b, int sqr, int *sum)
   return sqr;
 }
 
-/* { dg-final { scan-assembler-times "smlabb" 2 } } */
+/* { dg-final { scan-assembler-times "mla" 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-2.c b/gcc/testsuite/gcc.target/arm/wmul-2.c
index 2ea55f9..0e32674 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-2.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-2.c
@@ -10,4 +10,4 @@ void vec_mpy(int y[], const short x[], short scaler)
    y[i] += ((scaler * x[i]) >> 31);
 }
 
-/* { dg-final { scan-assembler-times "smulbb" 1 } } */
+/* { dg-final { scan-assembler-times "mul" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-3.c b/gcc/testsuite/gcc.target/arm/wmul-3.c
index 144b553..46d709c 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-3.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-3.c
@@ -16,4 +16,4 @@ int mac(const short *a, const short *b, int sqr, int *sum)
   return sqr;
 }
 
-/* { dg-final { scan-assembler-times "smulbb" 2 } } */
+/* { dg-final { scan-assembler-times "mul" 2 } } */
diff --git a/gcc/testsuite/gcc.target/arm/wmul-9.c b/gcc/testsuite/gcc.target/arm/wmul-9.c
index 40ed021..415a114 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-9.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-9.c
@@ -8,4 +8,4 @@ foo (long long a, short *b, char *c)
   return a + *b * *c;
 }
 
-/* { dg-final { scan-assembler "smlalbb" } } */
+/* { dg-final { scan-assembler "mlal" } } */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL
  2015-09-07  3:01 ` [4/7] Use correct promoted mode sign for result of GIMPLE_CALL Kugan
@ 2015-09-07 13:16   ` Michael Matz
  2015-09-08  0:00     ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Michael Matz @ 2015-09-07 13:16 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches, Richard Biener

Hi,

On Mon, 7 Sep 2015, Kugan wrote:

> For the following testcase (compiling with -O1; -O2 works fine), we have
> a stmt with stm_code SSA_NAME (_7 = _ 6) and for which _6 is defined by
> a GIMPLE_CALL. In this case, we are using wrong SUNREG promoted mode
> resulting in wrong code.

And why is that?

> Simple SSA_NAME copes are generally optimized
> but when they are not, we can end up using the wrong promoted mode.
> Attached patch fixes when we have one copy.

I think it's the wrong place to fix this up.  Where does the wrong use come 
from?  At that place it should be fixed, not after the fact.

>   _6 = bar5 (-10);
>   ...
>   _7 = _6;
>   _3 = (long unsigned int) _6;
>   ...
>   if (_3 != l5.0_4)

There is no use of '_7' in this snippet so I don't see the relevance of 
SUBREG_PROMOTED_MODE on it.

But whatever you do, please make sure you include the testcase for the 
problem as a regression test:

> extern void abort (void);
> 
> __attribute__ ((noinline))
> static unsigned short int foo5 (int x)
> {
>   return x;
> }
> 
> __attribute__ ((noinline))
> short int bar5 (int x)
> {
>   return foo5 (x + 6);
> }
> 
> unsigned long l5 = (short int) -4;
> 
> int
> main (void)
> {
>   if (bar5 (-10) != l5)
>     abort ();
>   return 0;
> }


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [5/7] Allow gimple debug stmt in widen mode
  2015-09-07  3:01 ` [5/7] Allow gimple debug stmt in widen mode Kugan
@ 2015-09-07 13:46   ` Michael Matz
  2015-09-08  0:01     ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Michael Matz @ 2015-09-07 13:46 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches, Richard Biener

Hi,

On Mon, 7 Sep 2015, Kugan wrote:

> Allow GIMPLE_DEBUG with values in promoted register.

Patch does much more.

> gcc/ChangeLog:
> 
> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
> 
> 	* expr.c (expand_expr_real_1): Set proper SUNREG_PROMOTED_MODE for
> 	SSA_NAME that was set by GIMPLE_CALL and assigned to another
> 	SSA_NAME of same type.

ChangeLog doesn't match patch, and patch contains dubious changes:

> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
>         tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
>         rtx val;
>         rtx_insn *prev_insn, *insn2;
> -       machine_mode mode;
>  
>         if (value == NULL_TREE)
>           val = NULL_RTX;
> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>  
>         if (!val)
>           val = gen_rtx_UNKNOWN_VAR_LOC ();
> -       else
> -         {
> -           mode = GET_MODE (INSN_VAR_LOCATION (insn));
> -
> -           gcc_assert (mode == GET_MODE (val)
> -                       || (GET_MODE (val) == VOIDmode
> -                           && (CONST_SCALAR_INT_P (val)
> -                               || GET_CODE (val) == CONST_FIXED
> -                               || GET_CODE (val) == LABEL_REF)));
> -         }
>  
>         INSN_VAR_LOCATION_LOC (insn) = val;
>         prev_insn = PREV_INSN (insn);

So it seems that the modes of the value's location and the value itself 
don't have to match anymore, which seems dubious when considering how a 
debugger should load the value in question from the given location.  So, 
how is it supposed to work?

And this change:

> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -2100,6 +2100,8 @@ wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT*,
>            targets is 1 rather than -1.  */
>         gcc_checking_assert (INTVAL (x.first)
>                              == sext_hwi (INTVAL (x.first), precision)
> +                            || INTVAL (x.first)
> +                            == (INTVAL (x.first) & ((1 << precision) - 1))
>                              || (x.second == BImode && INTVAL (x.first) == 1));
>  
>        return wi::storage_ref (&INTVAL (x.first), 1, precision);

implies that wide_ints are not always sign-extended anymore after your 
changes.  That's a fundamental assumption, so removing that assert implies 
that you somehow created non-canonical wide_ints, and those will cause 
bugs elsewhere in the code.  Don't just remove asserts, they are usually 
there for a reason, and without accompanying changes those reasons don't 
go away.
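
For reference, the invariant in question can be illustrated with a few lines
of stand-alone C (not GCC's wide-int code; the helper name is made up): a
constant stored for 16-bit precision must equal its own sign-extension from
bit 16, so -4 is canonical while the zero-extended 0xfffc is not.

#include <stdint.h>
#include <stdio.h>

/* Sign-extend V from bit 16, written with the xor/subtract idiom so it
   is fully defined C; illustrative only.  */
static int64_t
sext_from_bit_16 (int64_t v)
{
  return ((v & 0xffff) ^ 0x8000) - 0x8000;
}

int
main (void)
{
  int64_t canonical = -4;      /* equals its own sign-extension        */
  int64_t zero_ext = 0xfffc;   /* sext_from_bit_16 (0xfffc) == -4      */
  printf ("canonical ok: %d, zero-extended ok: %d\n",
          canonical == sext_from_bit_16 (canonical),
          zero_ext == sext_from_bit_16 (zero_ext));
  return 0;
}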


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL
  2015-09-07 13:16   ` Michael Matz
@ 2015-09-08  0:00     ` Kugan
  2015-09-08 15:45       ` Jeff Law
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-08  0:00 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc-patches, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 2173 bytes --]



On 07/09/15 23:10, Michael Matz wrote:
> Hi,
> 
> On Mon, 7 Sep 2015, Kugan wrote:
> 
>> For the following testcase (compiling with -O1; -O2 works fine), we have
>> a stmt with stm_code SSA_NAME (_7 = _ 6) and for which _6 is defined by
>> a GIMPLE_CALL. In this case, we are using wrong SUNREG promoted mode
>> resulting in wrong code.
> 
> And why is that?
> 
>> Simple SSA_NAME copes are generally optimized
>> but when they are not, we can end up using the wrong promoted mode.
>> Attached patch fixes when we have one copy.
> 
> I think it's the wrong place to fixing up.  Where does the wrong use come 
> from?  At that place it should be fixed, not after the fact.
> 
>>   _6 = bar5 (-10);
>>   ...
>>   _7 = _6;
>>   _3 = (long unsigned int) _6;
>>   ...
>>   if (_3 != l5.0_4)
> 
> There is no use of '_7' in this snippet so I don't see the relevance of 
> SUBREG_PROMOTED_MODE on it.
> 
> But whatever you do, please make sure you include the testcase for the 
> problem as a regression test:
> 

Thanks for the review.

This happens on ARM, where the definition of PROMOTE_MODE also changes the
sign. I am attaching the cfgdump for the test-case. This is part of the
existing test-case; that's why I didn't include it as part of this patch.

For ;; _7 = _6; we have:

(subreg:HI (reg:SI 113) 0)
 <ssa_name 0x7fd672c3ad38
    type <integer_type 0x7fd672c36540 short int HI
        size <integer_cst 0x7fd672c430c0 constant 16>
        unit size <integer_cst 0x7fd672c430d8 constant 2>
        align 16 symtab 0 alias set -1 canonical type 0x7fd672c36540
precision 16 min <integer_cst 0x7fd672c43078 -32768> max <integer_cst
0x7fd672c43090 32767>>
   def_stmt _7 = _6;

    version 7>
decl_rtl -> (reg:SI 113)
temp -> (subreg:HI (reg:SI 113) 0)
Unsignedp = 1

and we expand it to:

;; _7 = _6;

(insn 10 9 0 (set (reg:SI 113)
        (zero_extend:SI (subreg/u:HI (reg:SI 113) 0))) -1
     (nil))

but:

short int _6;
short int _7;

insn 10 above is wrong: _6 is defined by a call, and therefore the sign
change implied by PROMOTE_MODE does not hold for it.

We should probably rearrange things or add a copy propagation to remove this
unnecessary copy, but even so this expansion looks wrong to me.
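
To make the effect concrete, here is a stand-alone C illustration (ordinary
user code, not GCC internals; it models 32-bit arm, where unsigned long is 32
bits): bar5 (-10) returns the short int -4, which the ABI leaves sign-extended
in r0 as 0xfffffffc, whereas insn 10 keeps only the zero-extended low 16 bits,
0x0000fffc, so the later comparison against l5 fails.

#include <stdio.h>

int
main (void)
{
  short ret = -4;                                   /* what bar5 (-10) returns   */
  unsigned int abi_reg = (unsigned int) (int) ret;  /* sign-extended: 0xfffffffc */
  unsigned int insn10 = (unsigned short) ret;       /* zero-extended: 0x0000fffc */
  unsigned int l5 = (unsigned int) (short) -4;      /* 0xfffffffc on 32-bit arm  */

  printf ("abi_reg = %#x, insn10 = %#x, l5 = %#x\n", abi_reg, insn10, l5);
  printf ("abi_reg == l5: %d, insn10 == l5: %d\n", abi_reg == l5, insn10 == l5);
  return 0;
}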

Thanks,
Kugan

[-- Attachment #2: pr39240.c.190r.expand --]
[-- Type: text/plain, Size: 16496 bytes --]


;; Function foo5 (foo5, funcdef_no=0, decl_uid=4147, cgraph_uid=0, symbol_order=0)

foo5 (int x)
{
  unsigned int _2;
  unsigned int _4;
  short unsigned int _5;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _4 = (unsigned int) x_1(D);
  _2 = _4 & 65535;
  _5 = (short unsigned int) x_1(D);
  return _5;
;;    succ:       EXIT

}



Partition map 

Partition 1 (x_1(D) - 1 )
Partition 2 (_2 - 2 )
Partition 4 (_4 - 4 )
Partition 5 (_5 - 5 )


Coalescible Partition map 

Partition 0, base 0 (x_1(D) - 1 )


Partition map 

Partition 0 (x_1(D) - 1 )


Conflict graph:

After sorting:
Coalesce List:

Partition map 

Partition 0 (x_1(D) - 1 )

After Coalescing:

Partition map 

Partition 0 (x_1(D) - 1 )
Partition 1 (_2 - 2 )
Partition 2 (_4 - 4 )
Partition 3 (_5 - 5 )


Replacing Expressions
_4 replace with --> _4 = (unsigned int) x_1(D);

_5 replace with --> _5 = (short unsigned int) x_1(D);


foo5 (int x)
{
  unsigned int _2;
  unsigned int _4;
  short unsigned int _5;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _4 = (unsigned int) x_1(D);
  _2 = _4 & 65535;
  _5 = (short unsigned int) x_1(D);
  return _5;
;;    succ:       EXIT

}



;; Generating RTL for gimple basic block 2
(const_int 65535 [0xffff])

Hot cost: 4 (final)
(const_int 65535 [0xffff])

Hot cost: 4 (final)

;; _2 = _4 & 65535;

(insn 6 5 0 (set (reg:SI 111)
        (zero_extend:SI (subreg:HI (reg/v:SI 110 [ x ]) 0))) -1
     (nil))

;; return _5;

(insn 7 6 8 (set (reg:HI 115)
        (subreg:HI (reg/v:SI 110 [ x ]) 0)) pr39240.c:6 -1
     (nil))

(insn 8 7 9 (set (reg:SI 116)
        (zero_extend:SI (reg:HI 115))) pr39240.c:6 -1
     (nil))

(insn 9 8 10 (set (reg:SI 114 [ <retval> ])
        (reg:SI 116)) pr39240.c:6 -1
     (nil))

(jump_insn 10 9 11 (set (pc)
        (label_ref 0)) pr39240.c:6 -1
     (nil))

(barrier 11 10 0)


try_optimize_cfg iteration 1

Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Removing jump 10.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.


try_optimize_cfg iteration 2

fix_loop_structure: fixing up loops for function


;;
;; Full RTL generated for this function:
;;
(note 1 0 4 NOTE_INSN_DELETED)
;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;;  prev block 0, next block 1, flags: (NEW, REACHABLE, RTL)
;;  pred:       ENTRY [100.0%]  (FALLTHRU)
(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (reg/v:SI 110 [ x ])
        (reg:SI 0 r0 [ x ])) pr39240.c:5 -1
     (nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 111)
        (zero_extend:SI (subreg:HI (reg/v:SI 110 [ x ]) 0))) -1
     (nil))
(insn 7 6 8 2 (set (reg:HI 115)
        (subreg:HI (reg/v:SI 110 [ x ]) 0)) pr39240.c:6 -1
     (nil))
(insn 8 7 9 2 (set (reg:SI 116)
        (zero_extend:SI (reg:HI 115))) pr39240.c:6 -1
     (nil))
(insn 9 8 13 2 (set (reg:SI 114 [ <retval> ])
        (reg:SI 116)) pr39240.c:6 -1
     (nil))
(insn 13 9 14 2 (set (reg/i:SI 0 r0)
        (reg:SI 114 [ <retval> ])) pr39240.c:7 -1
     (nil))
(insn 14 13 0 2 (use (reg/i:SI 0 r0)) pr39240.c:7 -1
     (nil))
;;  succ:       EXIT [100.0%]  (FALLTHRU)


;; Function bar5 (bar5, funcdef_no=1, decl_uid=4150, cgraph_uid=1, symbol_order=1)

bar5 (int x)
{
  int _2;
  unsigned int _4;
  int _5;
  short unsigned int _6;
  int _7;
  short int _8;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _2 = x_1(D) + 6;
  _6 = foo5 (_2);
  _4 = (unsigned int) _6;
  _7 = (int) _6;
  _5 = (_7) sext from bit (16);
  _8 = (short int) _5;
  return _8;
;;    succ:       EXIT

}



Partition map 

Partition 1 (x_1(D) - 1 )
Partition 2 (_2 - 2 )
Partition 4 (_4 - 4 )
Partition 5 (_5 - 5 )
Partition 6 (_6 - 6 )
Partition 7 (_7 - 7 )
Partition 8 (_8 - 8 )


Coalescible Partition map 

Partition 0, base 0 (x_1(D) - 1 )


Partition map 

Partition 0 (x_1(D) - 1 )


Conflict graph:

After sorting:
Coalesce List:

Partition map 

Partition 0 (x_1(D) - 1 )

After Coalescing:

Partition map 

Partition 0 (x_1(D) - 1 )
Partition 1 (_2 - 2 )
Partition 2 (_4 - 4 )
Partition 3 (_5 - 5 )
Partition 4 (_6 - 6 )
Partition 5 (_7 - 7 )
Partition 6 (_8 - 8 )


Replacing Expressions
_2 replace with --> _2 = x_1(D) + 6;

_5 replace with --> _5 = (_7) sext from bit (16);

_7 replace with --> _7 = (int) _6;

_8 replace with --> _8 = (short int) _5;


bar5 (int x)
{
  int _2;
  unsigned int _4;
  int _5;
  short unsigned int _6;
  int _7;
  short int _8;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _2 = x_1(D) + 6;
  _6 = foo5 (_2);
  _4 = (unsigned int) _6;
  _7 = (int) _6;
  _5 = (_7) sext from bit (16);
  _8 = (short int) _5;
  return _8;
;;    succ:       EXIT

}



;; Generating RTL for gimple basic block 2
(const_int 6 [0x6])

Hot cost: 4 (final)
(const_int 6 [0x6])

Hot cost: 4 (final)

;; _6 = foo5 (_2);

(insn 6 5 7 (set (reg:SI 118)
        (plus:SI (reg/v:SI 110 [ x ])
            (const_int 6 [0x6]))) pr39240.c:12 -1
     (nil))

(insn 7 6 8 (set (reg:SI 0 r0)
        (reg:SI 118)) pr39240.c:12 -1
     (nil))

(call_insn/u 8 7 9 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref:SI ("foo5") [flags 0x3]  <function_decl 0x7f3734ef7300 foo5>) [0 foo5 S4 A32])
                    (const_int 0 [0])))
            (use (const_int 0 [0]))
            (clobber (reg:SI 14 lr))
        ]) pr39240.c:12 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (expr_list (clobber (reg:SI 12 ip))
        (expr_list:SI (use (reg:SI 0 r0))
            (nil))))

(insn 9 8 10 (set (reg:SI 119)
        (reg:SI 0 r0)) pr39240.c:12 -1
     (nil))

(insn 10 9 0 (set (reg:SI 114)
        (reg:SI 119)) pr39240.c:12 -1
     (nil))

;; _4 = (unsigned int) _6;

(insn 11 10 0 (set (reg:SI 112)
        (reg:SI 114)) -1
     (nil))

;; return _8;

(insn 12 11 13 (set (reg:SI 121)
        (sign_extend:SI (subreg:HI (reg:SI 114) 0))) pr39240.c:12 -1
     (nil))

(insn 13 12 14 (set (reg:HI 120)
        (subreg:HI (reg:SI 121) 0)) pr39240.c:12 -1
     (nil))

(insn 14 13 15 (set (reg:SI 122)
        (sign_extend:SI (reg:HI 120))) pr39240.c:12 -1
     (nil))

(insn 15 14 16 (set (reg:SI 117 [ <retval> ])
        (reg:SI 122)) pr39240.c:12 -1
     (nil))

(jump_insn 16 15 17 (set (pc)
        (label_ref 0)) pr39240.c:12 -1
     (nil))

(barrier 17 16 0)


try_optimize_cfg iteration 1

Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Removing jump 16.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.


try_optimize_cfg iteration 2

fix_loop_structure: fixing up loops for function


;;
;; Full RTL generated for this function:
;;
(note 1 0 4 NOTE_INSN_DELETED)
;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;;  prev block 0, next block 1, flags: (NEW, REACHABLE, RTL)
;;  pred:       ENTRY [100.0%]  (FALLTHRU)
(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (reg/v:SI 110 [ x ])
        (reg:SI 0 r0 [ x ])) pr39240.c:11 -1
     (nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 118)
        (plus:SI (reg/v:SI 110 [ x ])
            (const_int 6 [0x6]))) pr39240.c:12 -1
     (nil))
(insn 7 6 8 2 (set (reg:SI 0 r0)
        (reg:SI 118)) pr39240.c:12 -1
     (nil))
(call_insn/u 8 7 9 2 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref:SI ("foo5") [flags 0x3]  <function_decl 0x7f3734ef7300 foo5>) [0 foo5 S4 A32])
                    (const_int 0 [0])))
            (use (const_int 0 [0]))
            (clobber (reg:SI 14 lr))
        ]) pr39240.c:12 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (expr_list (clobber (reg:SI 12 ip))
        (expr_list:SI (use (reg:SI 0 r0))
            (nil))))
(insn 9 8 10 2 (set (reg:SI 119)
        (reg:SI 0 r0)) pr39240.c:12 -1
     (nil))
(insn 10 9 11 2 (set (reg:SI 114)
        (reg:SI 119)) pr39240.c:12 -1
     (nil))
(insn 11 10 12 2 (set (reg:SI 112)
        (reg:SI 114)) -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 121)
        (sign_extend:SI (subreg:HI (reg:SI 114) 0))) pr39240.c:12 -1
     (nil))
(insn 13 12 14 2 (set (reg:HI 120)
        (subreg:HI (reg:SI 121) 0)) pr39240.c:12 -1
     (nil))
(insn 14 13 15 2 (set (reg:SI 122)
        (sign_extend:SI (reg:HI 120))) pr39240.c:12 -1
     (nil))
(insn 15 14 19 2 (set (reg:SI 117 [ <retval> ])
        (reg:SI 122)) pr39240.c:12 -1
     (nil))
(insn 19 15 20 2 (set (reg/i:SI 0 r0)
        (reg:SI 117 [ <retval> ])) pr39240.c:13 -1
     (nil))
(insn 20 19 0 2 (use (reg/i:SI 0 r0)) pr39240.c:13 -1
     (nil))
;;  succ:       EXIT [100.0%]  (FALLTHRU)


;; Function main (main, funcdef_no=2, decl_uid=4154, cgraph_uid=2, symbol_order=3) (executed once)

main ()
{
  int _2;
  long unsigned int _3;
  long unsigned int l5.0_4;
  short int _6;
  short int _7;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _6 = bar5 (-10);
  _2 = (int) _6;
  _7 = _6;
  _3 = (long unsigned int) _6;
  l5.0_4 = l5;
  if (_3 != l5.0_4)
    goto <bb 3>;
  else
    goto <bb 4>;
;;    succ:       3
;;                4

;;   basic block 3, loop depth 0
;;    pred:       2
  abort ();
;;    succ:      

;;   basic block 4, loop depth 0
;;    pred:       2
  return 0;
;;    succ:       EXIT

}



Partition map 

Partition 2 (_2 - 2 )
Partition 3 (_3 - 3 )
Partition 4 (l5.0_4 - 4 )
Partition 6 (_6 - 6 )
Partition 7 (_7 - 7 )


Coalescible Partition map 

Partition 0, base 0 (_6 - 6 7 )


Partition map 

Partition 0 (_6 - 6 )
Partition 1 (_7 - 7 )


Conflict graph:

After sorting:
Sorted Coalesce list:
(10000) _6 <-> _7

Partition map 

Partition 0 (_6 - 6 )
Partition 1 (_7 - 7 )

Coalesce list: (6)_6 & (7)_7 [map: 0, 1] : Success -> 0
After Coalescing:

Partition map 

Partition 0 (_2 - 2 )
Partition 1 (_3 - 3 )
Partition 2 (l5.0_4 - 4 )
Partition 3 (_6 - 6 7 )


Replacing Expressions
_3 replace with --> _3 = (long unsigned int) _6;

l5.0_4 replace with --> l5.0_4 = l5;


main ()
{
  int _2;
  long unsigned int _3;
  long unsigned int l5.0_4;
  short int _6;
  short int _7;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _6 = bar5 (-10);
  _2 = (int) _6;
  _7 = _6;
  _3 = (long unsigned int) _6;
  l5.0_4 = l5;
  if (_3 != l5.0_4)
    goto <bb 3>;
  else
    goto <bb 4>;
;;    succ:       3
;;                4

;;   basic block 3, loop depth 0
;;    pred:       2
  abort ();
;;    succ:      

;;   basic block 4, loop depth 0
;;    pred:       2
  return 0;
;;    succ:       EXIT

}



;; Generating RTL for gimple basic block 2
(const_int -10 [0xfffffffffffffff6])

Hot cost: 4 (final)

;; _6 = bar5 (-10);

(insn 5 4 6 (set (reg:SI 0 r0)
        (const_int -10 [0xfffffffffffffff6])) pr39240.c:20 -1
     (nil))

(call_insn/u 6 5 7 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref:SI ("bar5") [flags 0x3]  <function_decl 0x7f3734ef7500 bar5>) [0 bar5 S4 A32])
                    (const_int 0 [0])))
            (use (const_int 0 [0]))
            (clobber (reg:SI 14 lr))
        ]) pr39240.c:20 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (expr_list (clobber (reg:SI 12 ip))
        (expr_list:SI (use (reg:SI 0 r0))
            (nil))))

(insn 7 6 8 (set (reg:SI 115)
        (reg:SI 0 r0)) pr39240.c:20 -1
     (nil))

(insn 8 7 0 (set (reg:SI 113)
        (reg:SI 115)) pr39240.c:20 -1
     (nil))

;; _2 = (int) _6;

(insn 9 8 0 (set (reg:SI 110)
        (reg:SI 113)) -1
     (nil))

;; _7 = _6;

(insn 10 9 0 (set (reg:SI 113)
        (zero_extend:SI (subreg/u:HI (reg:SI 113) 0))) -1
     (nil))

;; if (_3 != l5.0_4)

(insn 11 10 12 (set (reg/f:SI 116)
        (symbol_ref:SI ("*.LANCHOR0") [flags 0x182])) pr39240.c:20 -1
     (nil))

(insn 12 11 13 (set (reg:SI 117)
        (mem/c:SI (reg/f:SI 116) [0 l5+0 S4 A32])) pr39240.c:20 -1
     (nil))

(insn 13 12 14 (set (reg:CC 100 cc)
        (compare:CC (reg:SI 113)
            (reg:SI 117))) pr39240.c:20 -1
     (nil))

(jump_insn 14 13 0 (set (pc)
        (if_then_else (eq (reg:CC 100 cc)
                (const_int 0 [0]))
            (label_ref 0)
            (pc))) pr39240.c:20 -1
     (int_list:REG_BR_PROB 9996 (nil)))

;; Generating RTL for gimple basic block 3

;; abort ();

(call_insn 16 15 17 (parallel [
            (call (mem:SI (symbol_ref:SI ("abort") [flags 0x41]  <function_decl 0x7f373511b400 abort>) [0 __builtin_abort S4 A32])
                (const_int 0 [0]))
            (use (const_int 0 [0]))
            (clobber (reg:SI 14 lr))
        ]) pr39240.c:21 -1
     (expr_list:REG_NORETURN (const_int 0 [0])
        (expr_list:REG_EH_REGION (const_int 0 [0])
            (nil)))
    (expr_list (clobber (reg:SI 12 ip))
        (nil)))

(barrier 17 16 0)

;; Generating RTL for gimple basic block 4

;; 

(code_label 18 17 19 5 "" [0 uses])

(note 19 18 0 NOTE_INSN_BASIC_BLOCK)

;; return 0;

(insn 20 19 21 (set (reg:SI 114 [ <retval> ])
        (const_int 0 [0])) -1
     (nil))

(jump_insn 21 20 22 (set (pc)
        (label_ref 0)) -1
     (nil))

(barrier 22 21 0)


try_optimize_cfg iteration 1

Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Removing jump 21.
Merging block 6 into block 5...
Merged blocks 5 and 6.
Merged 5 and 6 without moving.


try_optimize_cfg iteration 2

fix_loop_structure: fixing up loops for function


;;
;; Full RTL generated for this function:
;;
(note 1 0 3 NOTE_INSN_DELETED)
;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;;  prev block 0, next block 4, flags: (NEW, REACHABLE, RTL)
;;  pred:       ENTRY [100.0%]  (FALLTHRU)
(note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 (set (reg:SI 0 r0)
        (const_int -10 [0xfffffffffffffff6])) pr39240.c:20 -1
     (nil))
(call_insn/u 6 5 7 2 (parallel [
            (set (reg:SI 0 r0)
                (call (mem:SI (symbol_ref:SI ("bar5") [flags 0x3]  <function_decl 0x7f3734ef7500 bar5>) [0 bar5 S4 A32])
                    (const_int 0 [0])))
            (use (const_int 0 [0]))
            (clobber (reg:SI 14 lr))
        ]) pr39240.c:20 -1
     (expr_list:REG_EH_REGION (const_int 0 [0])
        (nil))
    (expr_list (clobber (reg:SI 12 ip))
        (expr_list:SI (use (reg:SI 0 r0))
            (nil))))
(insn 7 6 8 2 (set (reg:SI 115)
        (reg:SI 0 r0)) pr39240.c:20 -1
     (nil))
(insn 8 7 9 2 (set (reg:SI 113)
        (reg:SI 115)) pr39240.c:20 -1
     (nil))
(insn 9 8 10 2 (set (reg:SI 110)
        (reg:SI 113)) -1
     (nil))
(insn 10 9 11 2 (set (reg:SI 113)
        (zero_extend:SI (subreg/u:HI (reg:SI 113) 0))) -1
     (nil))
(insn 11 10 12 2 (set (reg/f:SI 116)
        (symbol_ref:SI ("*.LANCHOR0") [flags 0x182])) pr39240.c:20 -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 117)
        (mem/c:SI (reg/f:SI 116) [0 l5+0 S4 A32])) pr39240.c:20 -1
     (nil))
(insn 13 12 14 2 (set (reg:CC 100 cc)
        (compare:CC (reg:SI 113)
            (reg:SI 117))) pr39240.c:20 -1
     (nil))
(jump_insn 14 13 15 2 (set (pc)
        (if_then_else (eq (reg:CC 100 cc)
                (const_int 0 [0]))
            (label_ref 18)
            (pc))) pr39240.c:20 -1
     (int_list:REG_BR_PROB 9996 (nil))
 -> 18)
;;  succ:       4 [0.0%]  (FALLTHRU)
;;              5 [100.0%] 

;; basic block 4, loop depth 0, count 0, freq 4
;;  prev block 2, next block 5, flags: (NEW, REACHABLE, RTL)
;;  pred:       2 [0.0%]  (FALLTHRU)
(note 15 14 16 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(call_insn 16 15 17 4 (parallel [
            (call (mem:SI (symbol_ref:SI ("abort") [flags 0x41]  <function_decl 0x7f373511b400 abort>) [0 __builtin_abort S4 A32])
                (const_int 0 [0]))
            (use (const_int 0 [0]))
            (clobber (reg:SI 14 lr))
        ]) pr39240.c:21 -1
     (expr_list:REG_NORETURN (const_int 0 [0])
        (expr_list:REG_EH_REGION (const_int 0 [0])
            (nil)))
    (expr_list (clobber (reg:SI 12 ip))
        (nil)))
;;  succ:      

(barrier 17 16 18)
;; basic block 5, loop depth 0, count 0, freq 9996, maybe hot
;;  prev block 4, next block 1, flags: (NEW, REACHABLE, RTL)
;;  pred:       2 [100.0%] 
(code_label 18 17 19 5 5 "" [1 uses])
(note 19 18 20 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 20 19 24 5 (set (reg:SI 114 [ <retval> ])
        (const_int 0 [0])) -1
     (nil))
(insn 24 20 25 5 (set (reg/i:SI 0 r0)
        (reg:SI 114 [ <retval> ])) pr39240.c:23 -1
     (nil))
(insn 25 24 0 5 (use (reg/i:SI 0 r0)) pr39240.c:23 -1
     (nil))
;;  succ:       EXIT [100.0%]  (FALLTHRU)


[-- Attachment #3: pr39240.c --]
[-- Type: text/x-csrc, Size: 294 bytes --]

extern void abort (void);

__attribute__ ((noinline))
static unsigned short int foo5 (int x)
{
  return x;
}

__attribute__ ((noinline))
short int bar5 (int x)
{
  return foo5 (x + 6);
}

unsigned long l5 = (short int) -4;

int
main (void)
{
  if (bar5 (-10) != l5)
    abort ();
  return 0;
}

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [5/7] Allow gimple debug stmt in widen mode
  2015-09-07 13:46   ` Michael Matz
@ 2015-09-08  0:01     ` Kugan
  2015-09-15 13:02       ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-08  0:01 UTC (permalink / raw)
  To: Michael Matz; +Cc: gcc-patches, Richard Biener


Thanks for the review.

On 07/09/15 23:20, Michael Matz wrote:
> Hi,
> 
> On Mon, 7 Sep 2015, Kugan wrote:
> 
>> Allow GIMPLE_DEBUG with values in promoted register.
> 
> Patch does much more.
> 

Oops sorry. Copy and paste mistake.

gcc/ChangeLog:

2015-09-07 Kugan Vivekanandarajah <kuganv@linaro.org>

	* cfgexpand.c (expand_debug_locations): Remove the assert as we now
	also allow values in promoted registers.
	* gimple-ssa-type-promote.c (fixup_uses): Allow GIMPLE_DEBUG to bind
	values in promoted registers.
	* rtl.h (wi::int_traits <rtx_mode_t>::decompose): Accept zero-extended
	values as well.


>> gcc/ChangeLog:
>>
>> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> 	* expr.c (expand_expr_real_1): Set proper SUNREG_PROMOTED_MODE for
>> 	SSA_NAME that was set by GIMPLE_CALL and assigned to another
>> 	SSA_NAME of same type.
> 
> ChangeLog doesn't match patch, and patch contains dubious changes:
> 
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
>>         tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
>>         rtx val;
>>         rtx_insn *prev_insn, *insn2;
>> -       machine_mode mode;
>>  
>>         if (value == NULL_TREE)
>>           val = NULL_RTX;
>> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>>  
>>         if (!val)
>>           val = gen_rtx_UNKNOWN_VAR_LOC ();
>> -       else
>> -         {
>> -           mode = GET_MODE (INSN_VAR_LOCATION (insn));
>> -
>> -           gcc_assert (mode == GET_MODE (val)
>> -                       || (GET_MODE (val) == VOIDmode
>> -                           && (CONST_SCALAR_INT_P (val)
>> -                               || GET_CODE (val) == CONST_FIXED
>> -                               || GET_CODE (val) == LABEL_REF)));
>> -         }
>>  
>>         INSN_VAR_LOCATION_LOC (insn) = val;
>>         prev_insn = PREV_INSN (insn);
> 
> So it seems that the modes of the values location and the value itself 
> don't have to match anymore, which seems dubious when considering how a 
> debugger should load the value in question from the given location.  So, 
> how is it supposed to work?

For example (simplified test-case from creduce):

fn1() {
  char a = fn1;
  return a;
}

--- test.c.142t.veclower21	2015-09-07 23:47:26.362201640 +0000
+++ test.c.143t.promotion	2015-09-07 23:47:26.362201640 +0000
@@ -5,13 +5,18 @@
 {
   char a;
   long int fn1.0_1;
+  unsigned int _2;
   int _3;
+  unsigned int _5;
+  char _6;

   <bb 2>:
   fn1.0_1 = (long int) fn1;
-  a_2 = (char) fn1.0_1;
-  # DEBUG a => a_2
-  _3 = (int) a_2;
+  _5 = (unsigned int) fn1.0_1;
+  _2 = _5 & 255;
+  # DEBUG a => _2
+  _6 = (char) _2;
+  _3 = (int) _6;
   return _3;

 }

Please see that the DEBUG bind now points to _2, which is in a promoted
mode. I am assuming that the debugger would load the required precision from
the promoted register. Maybe I am missing the details, but how else can we
handle this? Any suggestions?

In this particular simplified case we do have _6, but we might not in
all cases.


> 
> And this change:
> 
>> --- a/gcc/rtl.h
>> +++ b/gcc/rtl.h
>> @@ -2100,6 +2100,8 @@ wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT*,
>>            targets is 1 rather than -1.  */
>>         gcc_checking_assert (INTVAL (x.first)
>>                              == sext_hwi (INTVAL (x.first), precision)
>> +                            || INTVAL (x.first)
>> +                            == (INTVAL (x.first) & ((1 << precision) - 1))
>>                              || (x.second == BImode && INTVAL (x.first) == 1));
>>  
>>        return wi::storage_ref (&INTVAL (x.first), 1, precision);
> 
> implies that wide_ints are not always sign-extended anymore after you 
> changes.  That's a fundamental assumption, so removing that assert implies 
> that you somehow created non-canonical wide_ints, and those will cause 
> bugs elsewhere in the code.  Don't just remove asserts, they are usually 
> there for a reason, and without accompanying changes those reasons don't 
> go away.
> 


This comes from GIMPLE_DEBUG. If this assumption should always hold, I
will fix it there.

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL
  2015-09-08  0:00     ` Kugan
@ 2015-09-08 15:45       ` Jeff Law
  2015-09-08 22:09         ` Jim Wilson
  0 siblings, 1 reply; 63+ messages in thread
From: Jeff Law @ 2015-09-08 15:45 UTC (permalink / raw)
  To: Kugan, Michael Matz; +Cc: gcc-patches, Richard Biener

On 09/07/2015 03:27 PM, Kugan wrote:
>
>
> On 07/09/15 23:10, Michael Matz wrote:
>> Hi,
>>
>> On Mon, 7 Sep 2015, Kugan wrote:
>>
>>> For the following testcase (compiling with -O1; -O2 works fine), we have
>>> a stmt with stm_code SSA_NAME (_7 = _ 6) and for which _6 is defined by
>>> a GIMPLE_CALL. In this case, we are using wrong SUNREG promoted mode
>>> resulting in wrong code.
>>
>> And why is that?
>>
>>> Simple SSA_NAME copes are generally optimized
>>> but when they are not, we can end up using the wrong promoted mode.
>>> Attached patch fixes when we have one copy.
>>
>> I think it's the wrong place to fixing up.  Where does the wrong use come
>> from?  At that place it should be fixed, not after the fact.
>>
>>>    _6 = bar5 (-10);
>>>    ...
>>>    _7 = _6;
>>>    _3 = (long unsigned int) _6;
>>>    ...
>>>    if (_3 != l5.0_4)
>>
>> There is no use of '_7' in this snippet so I don't see the relevance of
>> SUBREG_PROMOTED_MODE on it.
>>
>> But whatever you do, please make sure you include the testcase for the
>> problem as a regression test:
>>
>
> Thanks for the review.
>
> This happens in ARM where definition of PROMOTED_MODE also changes the
> sign. I am attaching the cfgdump for the test-case. This is part of the
> existing test-case thats why I didn't include it as part of this patch.
Is this another instance of the PROMOTE_MODE issue that was raised by 
Jim Wilson a couple months ago?

jeff
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL
  2015-09-08 15:45       ` Jeff Law
@ 2015-09-08 22:09         ` Jim Wilson
  2015-09-15 12:51           ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Jim Wilson @ 2015-09-08 22:09 UTC (permalink / raw)
  To: Jeff Law, Kugan, Michael Matz; +Cc: gcc-patches, Richard Biener

On 09/08/2015 08:39 AM, Jeff Law wrote:
> Is this another instance of the PROMOTE_MODE issue that was raised by
> Jim Wilson a couple months ago?

It looks like a closely related problem.  The one I am looking at has
confusion with a function arg and a local variable as they have
different sign extension promotion rules.  Kugan's is with a function
return value and a local variable as they have different sign extension
promotion rules.

The bug report is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932

The gcc-patches thread spans a month end boundary, so it has multiple heads
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02132.html
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00112.html
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00524.html

Function args and function return values get the same sign extension
treatment when promoted; this is handled by
TARGET_PROMOTE_FUNCTION_MODE.  Local variables are treated differently,
via PROMOTE_MODE.  I think the function arg/return treatment is wrong,
but changing that is an ABI change, which is undesirable.  I suppose we
could change local variables to match function args and return values,
but I think that is moving in the wrong direction.  Kugan's new
optimization pass will remove some of the extra unnecessary sign/zero
extensions added by the arm TARGET_PROMOTE_FUNCTION_MODE definition,
though, so maybe it won't matter enough to worry about any more.
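
As a stand-alone C reminder of the two rule sets (plain user code; the
comments restate the arm behaviour discussed in this thread, nothing in the
code itself is target-specific):

/* On arm, the incoming short argument and the short return value are
   sign-extended per TARGET_PROMOTE_FUNCTION_MODE, while PROMOTE_MODE
   widens the short local as unsigned; copying between the two is where
   the extra (or missing) extensions come from.  */
short
copy_through_local (short a)
{
  short local = a;   /* ABI (sign-extended) value copied into a
                        PROMOTE_MODE (zero-extended on arm) register  */
  return local;      /* and back out through the sign-extended ABI    */
}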

If we can't fix this in the arm backend, then we may need different
middle-end fixes for these two cases.  I was looking at ways to fix this in
the tree-out-of-ssa pass.  I don't know if this will work for Kugan's
testcase, I'd need time to look at it.

Jim

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL
  2015-09-08 22:09         ` Jim Wilson
@ 2015-09-15 12:51           ` Richard Biener
  2015-10-07  1:03             ` kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-09-15 12:51 UTC (permalink / raw)
  To: Jim Wilson; +Cc: Jeff Law, Kugan, Michael Matz, gcc-patches

On Tue, Sep 8, 2015 at 11:50 PM, Jim Wilson <jim.wilson@linaro.org> wrote:
> On 09/08/2015 08:39 AM, Jeff Law wrote:
>> Is this another instance of the PROMOTE_MODE issue that was raised by
>> Jim Wilson a couple months ago?
>
> It looks like a closely related problem.  The one I am looking at has
> confusion with a function arg and a local variable as they have
> different sign extension promotion rules.  Kugan's is with a function
> return value and a local variable as they have different sign extension
> promotion rules.
>
> The bug report is
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
>
> The gcc-patches thread spans a month end boundary, so it has multiple heads
> https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02132.html
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00112.html
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00524.html
>
> Function args and function return values get the same sign extension
> treatment when promoted, this is handled by
> TARGET_PROMOTE_FUNCTION_MODE. Local variables are treated differently,
> via PROMOTE_MODE. I think the function arg/return treatment is wrong,
> but changing that is an ABI change which is undesirable.  I suppose we
> could change local variables to match function args and return values,
> but I think that is moving in the wrong direction.  Though Kugan's new
> optimization pass will remove some of the extra unnecessary sign/zero
> extensions added by the arm TARGET_PROMOTE_FUNCTION_MODE definition, so
> maybe it won't matter enough to worry about any more.
>
> If we can't fix this in the arm backend, then we may need different
> middle fixes for these two cases.  I was looking at ways to fix this in
> the tree-out-of-ssa pass.  I don't know if this will work for Kugan's
> testcase, I'd need time to look at it.

I think the function return value should have been "promoted" according to
the ABI by the lowering pass.  Thus the call stmt return type would be changed,
exposing the "mismatch" and compensating the IL with a sign-conversion.

As for your original issue with function arguments, they should really get
similar treatment, perhaps already during function arg gimplification, by
making the PARM_DECLs promoted and using a local variable with the "local"
type for further uses.  Perhaps one can use DECL_VALUE_EXPR to fix up
the IL; I'm not sure.  Or we can do this in the promotion pass as well.

Richard.

> Jim
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [5/7] Allow gimple debug stmt in widen mode
  2015-09-08  0:01     ` Kugan
@ 2015-09-15 13:02       ` Richard Biener
  2015-10-15  5:45         ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-09-15 13:02 UTC (permalink / raw)
  To: Kugan; +Cc: Michael Matz, gcc-patches

On Tue, Sep 8, 2015 at 2:00 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
> Thanks for the review.
>
> On 07/09/15 23:20, Michael Matz wrote:
>> Hi,
>>
>> On Mon, 7 Sep 2015, Kugan wrote:
>>
>>> Allow GIMPLE_DEBUG with values in promoted register.
>>
>> Patch does much more.
>>
>
> Oops sorry. Copy and paste mistake.
>
> gcc/ChangeLog:
>
> 2015-09-07 Kugan Vivekanandarajah <kuganv@linaro.org>
>
> * cfgexpand.c (expand_debug_locations): Remove assert as now we are
> also allowing values in promoted register.
> * gimple-ssa-type-promote.c (fixup_uses): Allow GIMPLE_DEBUG to bind
> values in promoted register.
> * rtl.h (wi::int_traits ::decompose): Accept zero extended value
> also.
>
>
>>> gcc/ChangeLog:
>>>
>>> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>>      * expr.c (expand_expr_real_1): Set proper SUNREG_PROMOTED_MODE for
>>>      SSA_NAME that was set by GIMPLE_CALL and assigned to another
>>>      SSA_NAME of same type.
>>
>> ChangeLog doesn't match patch, and patch contains dubious changes:
>>
>>> --- a/gcc/cfgexpand.c
>>> +++ b/gcc/cfgexpand.c
>>> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
>>>         tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
>>>         rtx val;
>>>         rtx_insn *prev_insn, *insn2;
>>> -       machine_mode mode;
>>>
>>>         if (value == NULL_TREE)
>>>           val = NULL_RTX;
>>> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>>>
>>>         if (!val)
>>>           val = gen_rtx_UNKNOWN_VAR_LOC ();
>>> -       else
>>> -         {
>>> -           mode = GET_MODE (INSN_VAR_LOCATION (insn));
>>> -
>>> -           gcc_assert (mode == GET_MODE (val)
>>> -                       || (GET_MODE (val) == VOIDmode
>>> -                           && (CONST_SCALAR_INT_P (val)
>>> -                               || GET_CODE (val) == CONST_FIXED
>>> -                               || GET_CODE (val) == LABEL_REF)));
>>> -         }
>>>
>>>         INSN_VAR_LOCATION_LOC (insn) = val;
>>>         prev_insn = PREV_INSN (insn);
>>
>> So it seems that the modes of the values location and the value itself
>> don't have to match anymore, which seems dubious when considering how a
>> debugger should load the value in question from the given location.  So,
>> how is it supposed to work?
>
> For example (simplified test-case from creduce):
>
> fn1() {
>   char a = fn1;
>   return a;
> }
>
> --- test.c.142t.veclower21      2015-09-07 23:47:26.362201640 +0000
> +++ test.c.143t.promotion       2015-09-07 23:47:26.362201640 +0000
> @@ -5,13 +5,18 @@
>  {
>    char a;
>    long int fn1.0_1;
> +  unsigned int _2;
>    int _3;
> +  unsigned int _5;
> +  char _6;
>
>    <bb 2>:
>    fn1.0_1 = (long int) fn1;
> -  a_2 = (char) fn1.0_1;
> -  # DEBUG a => a_2
> -  _3 = (int) a_2;
> +  _5 = (unsigned int) fn1.0_1;
> +  _2 = _5 & 255;
> +  # DEBUG a => _2
> +  _6 = (char) _2;
> +  _3 = (int) _6;
>    return _3;
>
>  }
>
> Please see that DEBUG now points to _2 which is a promoted mode. I am
> assuming that the debugger would load required precision from promoted
> register. May be I am missing the details but how else we can handle
> this? Any suggestions?

I would have expected the DEBUG insn to be adjusted as

# DEBUG a => (char)_2

Btw, why do we have

> +  _6 = (char) _2;
> +  _3 = (int) _6;

?  I'd have expected

 unsigned int _6 = SEXT <_2, 8>
 _3 = (int) _6;
 return _3;

see my other mail about promotion of PARM_DECLs and RESULT_DECLs -- we should
promote those as well.

Richard.

> In this particular simplified case, we do have _6 but we might not in
> all the case.
>
>
>>
>> And this change:
>>
>>> --- a/gcc/rtl.h
>>> +++ b/gcc/rtl.h
>>> @@ -2100,6 +2100,8 @@ wi::int_traits <rtx_mode_t>::decompose (HOST_WIDE_INT*,
>>>            targets is 1 rather than -1.  */
>>>         gcc_checking_assert (INTVAL (x.first)
>>>                              == sext_hwi (INTVAL (x.first), precision)
>>> +                            || INTVAL (x.first)
>>> +                            == (INTVAL (x.first) & ((1 << precision) - 1))
>>>                              || (x.second == BImode && INTVAL (x.first) == 1));
>>>
>>>        return wi::storage_ref (&INTVAL (x.first), 1, precision);
>>
>> implies that wide_ints are not always sign-extended anymore after you
>> changes.  That's a fundamental assumption, so removing that assert implies
>> that you somehow created non-canonical wide_ints, and those will cause
>> bugs elsewhere in the code.  Don't just remove asserts, they are usually
>> there for a reason, and without accompanying changes those reasons don't
>> go away.
>>
>
>
> This comes from GIMPLE_DEBUG. If this assumption should always hold, I
> will fix it there.
>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-09-07  3:00 ` [3/7] Optimize ZEXT_EXPR with tree-vrp Kugan
@ 2015-09-15 13:18   ` Richard Biener
  2015-10-06 23:12     ` kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-09-15 13:18 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Mon, Sep 7, 2015 at 4:58 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> This patch tree-vrp handling and optimization for ZEXT_EXPR.

+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      gcc_assert (!TYPE_UNSIGNED (expr_type));

hmm, I don't think we should restrict SEXT_EXPR this way.  SEXT_EXPR
should operate on both signed and unsigned types and the result type
should be the same as the type of operand 0.

+      type_min = wi::shwi (1 << (prec - 1),
+                          TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi (((1 << (prec - 1)) - 1),
+                          TYPE_PRECISION (TREE_TYPE (vr0.max)));

there are wi::min_value and wi::max_value for this.

+         HOST_WIDE_INT int_may_be_nonzero = may_be_nonzero.to_uhwi ();
+         HOST_WIDE_INT int_must_be_nonzero = must_be_nonzero.to_uhwi ();

this doesn't need to fit a HOST_WIDE_INT, please use wi::bit_and (can't
find a test_bit with a quick search).

+      tmin = wi::sext (tmin, prec - 1);
+      tmax = wi::sext (tmax, prec - 1);
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);

not sure why you need the extra sign-extensions here.

+    case SEXT_EXPR:
+       {
+         gcc_assert (is_gimple_min_invariant (op1));
+         unsigned int prec = tree_to_uhwi (op1);

no need to assert, tree_to_uhwi will do that for you.

+         HOST_WIDE_INT may_be_nonzero = may_be_nonzero0.to_uhwi ();
+         HOST_WIDE_INT must_be_nonzero = must_be_nonzero0.to_uhwi ();

likewise with the HOST_WIDE_INT issue.

Otherwise looks ok to me.  Btw, this and adding of SEXT_EXPR could be
accompanied with a match.pd pattern detecting sign-extension patterns,
that would give some extra test coverage.
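
For reference, the kind of source-level idiom such a pattern would have to
recognise can be sketched in stand-alone C (illustrative only, not match.pd
code; the three variants below all compute the same sign-extension from
bit 8):

#include <assert.h>
#include <stdint.h>

int
main (void)
{
  uint32_t x = 0xabcd;

  /* Three equivalent ways of sign-extending the low 8 bits; a match.pd
     rule could canonicalise the first two to the SEXT_EXPR form.  The
     shift variant relies on GCC's arithmetic right shift of signed
     values.  */
  int32_t via_shifts = (int32_t) (x << 24) >> 24;
  int32_t via_cast = (int32_t) (int8_t) x;
  int32_t via_xor = (int32_t) ((x & 0xff) ^ 0x80) - 0x80;

  assert (via_shifts == via_cast && via_cast == via_xor);  /* all -51 */
  return 0;
}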

Thanks,
Richard.

>
>
> gcc/ChangeLog:
>
> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * tree-vrp.c (extract_range_from_binary_expr_1): Handle SEXT_EXPR.
>         (simplify_bit_ops_using_ranges): Likewise.
>         (simplify_stmt_using_ranges): Likewise.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [1/7] Add new tree code SEXT_EXPR
  2015-09-07  2:57 ` [1/7] Add new tree code SEXT_EXPR Kugan
@ 2015-09-15 13:20   ` Richard Biener
  2015-10-11 10:35     ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-09-15 13:20 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Mon, Sep 7, 2015 at 4:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
> This patch adds support for new tree code SEXT_EXPR.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index d567a87..bbc3c10 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5071,6 +5071,10 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);

+    case SEXT_EXPR:
+      return op0;

that looks wrong.  Generate (sext:... ) here?

+    case SEXT_EXPR:
+       {
+         rtx op0 = expand_normal (treeop0);
+         rtx temp;
+         if (!target)
+           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+         machine_mode inner_mode
+           = smallest_mode_for_size (tree_to_shwi (treeop1),
+                                     MODE_INT);
+         temp = convert_modes (inner_mode,
+                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+         convert_move (target, temp, 0);
+         return target;
+       }

Humm - is that really how we expand sign extensions right now?  No helper
that would generate (sext ...) directly?  I wouldn't try using 'target' btw but
simply return (sext:mode op0 op1) or so.  But I am in no way an RTL expert.

Note that if we don't disallow arbitrary precision SEXT_EXPRs we have to
fall back to using shifts (and smallest_mode_for_size is simply wrong).
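
The shift fallback itself is easy to sketch in plain C (a stand-alone
illustration only; in the expander this would of course be emitted as a pair
of shift insns on the promoted mode rather than written like this):

#include <stdint.h>
#include <stdio.h>

/* Sign-extend the low PREC bits of VAL (1 <= PREC <= 32), i.e. what
   SEXT_EXPR <val, prec> computes, using only a left/right shift pair.
   Relies on GCC's arithmetic right shift of signed values.  */
static int32_t
sext_from_bit (uint32_t val, unsigned prec)
{
  unsigned shift = 32 - prec;
  return (int32_t) (val << shift) >> shift;
}

int
main (void)
{
  printf ("%d %d %d\n",
          (int) sext_from_bit (0xff, 8),      /* -1   */
          (int) sext_from_bit (0x7f, 8),      /* 127  */
          (int) sext_from_bit (0xfffc, 16));  /* -4   */
  return 0;
}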

+    case SEXT_EXPR:
+      {
+       if (!INTEGRAL_TYPE_P (lhs_type)
+           || !INTEGRAL_TYPE_P (rhs1_type)
+           || TREE_CODE (rhs2) != INTEGER_CST)

please constrain this some more, with

   || !useless_type_conversion_p (lhs_type, rhs1_type)

+         {
+           error ("invalid operands in sext expr");
+           return true;
+         }
+       return false;
+      }

@@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";

+    case SEXT_EXPR:
+      return "sext from bit";
+

just "sext" please.

+/*  Sign-extend operation.  It will sign extend first operand from
+ the sign bit specified by the second operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)

"from the INTEGER_CST sign bit specified"

Also add "The type of the result is that of the first operand."

Otherwise looks good to me - of course the two RTL expansion related
parts need auditing by somebody speaking more RTL than me.

Richard.

> gcc/ChangeLog:
>
> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
>         * expr.c (expand_expr_real_2): Likewise.
>         * fold-const.c (int_const_binop_1): Likewise.
>         * tree-cfg.c (verify_gimple_assign_binary): Likewise.
>         * tree-inline.c (estimate_operator_cost): Likewise.
>         * tree-pretty-print.c (dump_generic_node): Likewise.
>         (op_symbol_code): Likewise.
>         * tree.def: Define new tree code SEXT_EXPR.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-09-15 13:18   ` Richard Biener
@ 2015-10-06 23:12     ` kugan
  2015-10-07  8:20       ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: kugan @ 2015-10-06 23:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2781 bytes --]


Hi Richard,

Thanks for the review.

On 15/09/15 23:08, Richard Biener wrote:
> On Mon, Sep 7, 2015 at 4:58 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>> This patch tree-vrp handling and optimization for ZEXT_EXPR.
>
> +  else if (code == SEXT_EXPR)
> +    {
> +      gcc_assert (range_int_cst_p (&vr1));
> +      unsigned int prec = tree_to_uhwi (vr1.min);
> +      type = vr0.type;
> +      wide_int tmin, tmax;
> +      wide_int type_min, type_max;
> +      wide_int may_be_nonzero, must_be_nonzero;
> +
> +      gcc_assert (!TYPE_UNSIGNED (expr_type));
>
> hmm, I don't think we should restrict SEXT_EXPR this way.  SEXT_EXPR
> should operate on both signed and unsigned types and the result type
> should be the same as the type of operand 0.
>
> +      type_min = wi::shwi (1 << (prec - 1),
> +                          TYPE_PRECISION (TREE_TYPE (vr0.min)));
> +      type_max = wi::shwi (((1 << (prec - 1)) - 1),
> +                          TYPE_PRECISION (TREE_TYPE (vr0.max)));
>
> there is wi::min_value and max_value for this.

As of now, SEXT_EXPR in gimple is of the form x = y sext 8, and the types of 
all the operands and the result are of the wider type. Therefore we can't use 
wi::min_value. Or do you want to convert this precision (in this 
case 8) to a type and use wi::min_value?

Please find attached a patch that addresses the other comments.

Thanks,
Kugan

>
> +         HOST_WIDE_INT int_may_be_nonzero = may_be_nonzero.to_uhwi ();
> +         HOST_WIDE_INT int_must_be_nonzero = must_be_nonzero.to_uhwi ();
>
> this doesn't need to fit a HOST_WIDE_INT, please use wi::bit_and (can't
> find a test_bit with a quick search).
>
> +      tmin = wi::sext (tmin, prec - 1);
> +      tmax = wi::sext (tmax, prec - 1);
> +      min = wide_int_to_tree (expr_type, tmin);
> +      max = wide_int_to_tree (expr_type, tmax);
>
> not sure why you need the extra sign-extensions here.
>
> +    case SEXT_EXPR:
> +       {
> +         gcc_assert (is_gimple_min_invariant (op1));
> +         unsigned int prec = tree_to_uhwi (op1);
>
> no need to assert, tree_to_uhwi will do that for you.
>
> +         HOST_WIDE_INT may_be_nonzero = may_be_nonzero0.to_uhwi ();
> +         HOST_WIDE_INT must_be_nonzero = must_be_nonzero0.to_uhwi ();
>
> likewise with HOST_WIDE__INT issue.
>
> Otherwise looks ok to me.  Btw, this and adding of SEXT_EXPR could be
> accompanied with a match.pd pattern detecting sign-extension patterns,
> that would give some extra test coverage.
>
> Thanks,
> Richard.
>
>>
>>
>> gcc/ChangeLog:
>>
>> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>>          * tree-vrp.c (extract_range_from_binary_expr_1): Handle SEXT_EXPR.
>>          (simplify_bit_ops_using_ranges): Likewise.
>>          (simplify_stmt_using_ranges): Likewise.

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3569 bytes --]

From 75fb9b8bcacd36a1409bf94c38048de83a5eab62 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:45:52 +1000
Subject: [PATCH 3/7] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 2cd71a2..9c7d8d8 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2317,6 +2317,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2877,6 +2878,53 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int sign_bit;
+      wide_int type_min, type_max;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      type_min = wi::shwi (1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      type_max = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+      sign_bit = wi::shwi (1 << (prec - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero;
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero;
+	      tmax = may_be_nonzero;
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9244,6 +9292,30 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  gcc_assert (is_gimple_min_invariant (op1));
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int mask, sign_bit;
+	  mask = wi::shwi (((1 << (prec - 1)) - 1),
+			   TYPE_PRECISION (TREE_TYPE (vr0.max)));
+	  mask = wi::bit_not (mask);
+	  sign_bit = wi::shwi (1 << (prec - 1),
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  if (wi::bit_and (must_be_nonzero0, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if (wi::bit_and (may_be_nonzero0, sign_bit) != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9946,6 +10018,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL
  2015-09-15 12:51           ` Richard Biener
@ 2015-10-07  1:03             ` kugan
  0 siblings, 0 replies; 63+ messages in thread
From: kugan @ 2015-10-07  1:03 UTC (permalink / raw)
  To: Richard Biener, Jim Wilson; +Cc: Jeff Law, Michael Matz, gcc-patches



On 15/09/15 22:47, Richard Biener wrote:
> On Tue, Sep 8, 2015 at 11:50 PM, Jim Wilson <jim.wilson@linaro.org> wrote:
>> On 09/08/2015 08:39 AM, Jeff Law wrote:
>>> Is this another instance of the PROMOTE_MODE issue that was raised by
>>> Jim Wilson a couple months ago?
>>
>> It looks like a closely related problem.  The one I am looking at has
>> confusion with a function arg and a local variable as they have
>> different sign extension promotion rules.  Kugan's is with a function
>> return value and a local variable as they have different sign extension
>> promotion rules.
>>
>> The bug report is
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
>>
>> The gcc-patches thread spans a month end boundary, so it has multiple heads
>> https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02132.html
>> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00112.html
>> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00524.html
>>
>> Function args and function return values get the same sign extension
>> treatment when promoted, this is handled by
>> TARGET_PROMOTE_FUNCTION_MODE. Local variables are treated differently,
>> via PROMOTE_MODE. I think the function arg/return treatment is wrong,
>> but changing that is an ABI change which is undesirable.  I suppose we
>> could change local variables to match function args and return values,
>> but I think that is moving in the wrong direction.  Though Kugan's new
>> optimization pass will remove some of the extra unnecessary sign/zero
>> extensions added by the arm TARGET_PROMOTE_FUNCTION_MODE definition, so
>> maybe it won't matter enough to worry about any more.
>>
>> If we can't fix this in the arm backend, then we may need different
>> middle fixes for these two cases.  I was looking at ways to fix this in
>> the tree-out-of-ssa pass.  I don't know if this will work for Kugan's
>> testcase, I'd need time to look at it.

As you said, I don't think a fix in the tree-out-of-ssa pass will fix 
this case. Kyrill also saw the same problem on trunk, as in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714
>
> I think the function return value should have been "promoted" according to
> the ABI by the lowering pass.  Thus the call stmt return type be changed,
> exposing the "mismatch" and compensating the IL with a sign-conversion.
>

The function return value is promoted as per the ABI.
In the example from PR67714
....
  _8 = fn1D.5055 ();
   e_9 = (charD.4) _8;
   f_13 = _8;
...

_8 is sign extended correctly. But in f_13 = _8, it is promoted to 
unsigned and zero extended due to the backend PROMOTE_MODE. We thus have:

The zero-extension during expand:
;; f_13 = _8;

(insn 15 14 0 (set (reg/v:SI 110 [ f ])
         (zero_extend:SI (subreg/u:QI (reg/v:SI 110 [ f ]) 0))) 
arm-zext.c:18 -1
      (nil))

This is wrong.
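
For reference, the local-variable promotion comes from an arm-style
PROMOTE_MODE definition, roughly of this shape (a sketch only, not the
exact macro from config/arm/arm.h):

#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)		\
  if (GET_MODE_CLASS (MODE) == MODE_INT			\
      && GET_MODE_SIZE (MODE) < 4)			\
    {							\
      if (MODE == QImode || MODE == HImode)		\
	(UNSIGNEDP) = 1;				\
      (MODE) = SImode;					\
    }

That is, sub-word integer locals are widened to SImode and treated as
unsigned, which is why f is zero-extended here even though the call
result _8 was sign-extended as per TARGET_PROMOTE_FUNCTION_MODE.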

> As for your original issue with function arguments they should really
> get similar
> treatment, eventually in function arg gimplification already, by making
> the PARM_DECLs promoted and using a local variable for further uses
> with the "local" type.  Eventually one can use DECL_VALUE_EXPR to fixup
> the IL, not sure.  Or we can do this in the promotion pass as well.
>

I will try doing this and see if it works.

Thanks,
Kugan

> Richard.
>
>> Jim
>>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-10-06 23:12     ` kugan
@ 2015-10-07  8:20       ` Richard Biener
  2015-10-07 23:40         ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-10-07  8:20 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Wed, Oct 7, 2015 at 1:12 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
> Hi Richard,
>
> Thanks for the review.
>
> On 15/09/15 23:08, Richard Biener wrote:
>>
>> On Mon, Sep 7, 2015 at 4:58 AM, Kugan <kugan.vivekanandarajah@linaro.org>
>> wrote:
>>>
>>> This patch tree-vrp handling and optimization for ZEXT_EXPR.
>>
>>
>> +  else if (code == SEXT_EXPR)
>> +    {
>> +      gcc_assert (range_int_cst_p (&vr1));
>> +      unsigned int prec = tree_to_uhwi (vr1.min);
>> +      type = vr0.type;
>> +      wide_int tmin, tmax;
>> +      wide_int type_min, type_max;
>> +      wide_int may_be_nonzero, must_be_nonzero;
>> +
>> +      gcc_assert (!TYPE_UNSIGNED (expr_type));
>>
>> hmm, I don't think we should restrict SEXT_EXPR this way.  SEXT_EXPR
>> should operate on both signed and unsigned types and the result type
>> should be the same as the type of operand 0.
>>
>> +      type_min = wi::shwi (1 << (prec - 1),
>> +                          TYPE_PRECISION (TREE_TYPE (vr0.min)));
>> +      type_max = wi::shwi (((1 << (prec - 1)) - 1),
>> +                          TYPE_PRECISION (TREE_TYPE (vr0.max)));
>>
>> there is wi::min_value and max_value for this.
>
>
> As of now, SEXT_EXPR in gimple is of the form: x = y sext 8 and types of all
> the operand and results are of the wider type. Therefore we cant use the
> wi::min_value. Or do you want to convert this precision (in this case 8) to
> a type and use wi::min_value?

I don't understand - wi::min/max_value get a precision and sign, not a type.
your 1 << (prec - 1) is even wrong for prec > 32 (it's an integer type
expression).
Thus

  type_min = wi::min_value (prec, SIGNED);
  type_max = wi::max_value (prec, SIGNED);

?
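
For example (a sketch, assuming prec == 8 as in SEXT <x, 8>), these give
the bounds of the narrow signed range regardless of the host word size:

  wide_int type_min = wi::min_value (8, SIGNED);   /* -128 */
  wide_int type_max = wi::max_value (8, SIGNED);   /*  127 */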

> Please find the patch that addresses the other comments.

I'll have a look later.

> Thanks,
> Kugan
>
>
>>
>> +         HOST_WIDE_INT int_may_be_nonzero = may_be_nonzero.to_uhwi ();
>> +         HOST_WIDE_INT int_must_be_nonzero = must_be_nonzero.to_uhwi ();
>>
>> this doesn't need to fit a HOST_WIDE_INT, please use wi::bit_and (can't
>> find a test_bit with a quick search).
>>
>> +      tmin = wi::sext (tmin, prec - 1);
>> +      tmax = wi::sext (tmax, prec - 1);
>> +      min = wide_int_to_tree (expr_type, tmin);
>> +      max = wide_int_to_tree (expr_type, tmax);
>>
>> not sure why you need the extra sign-extensions here.
>>
>> +    case SEXT_EXPR:
>> +       {
>> +         gcc_assert (is_gimple_min_invariant (op1));
>> +         unsigned int prec = tree_to_uhwi (op1);
>>
>> no need to assert, tree_to_uhwi will do that for you.
>>
>> +         HOST_WIDE_INT may_be_nonzero = may_be_nonzero0.to_uhwi ();
>> +         HOST_WIDE_INT must_be_nonzero = must_be_nonzero0.to_uhwi ();
>>
>> likewise with HOST_WIDE__INT issue.
>>
>> Otherwise looks ok to me.  Btw, this and adding of SEXT_EXPR could be
>> accompanied with a match.pd pattern detecting sign-extension patterns,
>> that would give some extra test coverage.
>>
>> Thanks,
>> Richard.
>>
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>>          * tree-vrp.c (extract_range_from_binary_expr_1): Handle
>>> SEXT_EXPR.
>>>          (simplify_bit_ops_using_ranges): Likewise.
>>>          (simplify_stmt_using_ranges): Likewise.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-10-07  8:20       ` Richard Biener
@ 2015-10-07 23:40         ` Kugan
  2015-10-09 10:29           ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-07 23:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2179 bytes --]



On 07/10/15 19:20, Richard Biener wrote:
> On Wed, Oct 7, 2015 at 1:12 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> Hi Richard,
>>
>> Thanks for the review.
>>
>> On 15/09/15 23:08, Richard Biener wrote:
>>>
>>> On Mon, Sep 7, 2015 at 4:58 AM, Kugan <kugan.vivekanandarajah@linaro.org>
>>> wrote:
>>>>
>>>> This patch tree-vrp handling and optimization for ZEXT_EXPR.
>>>
>>>
>>> +  else if (code == SEXT_EXPR)
>>> +    {
>>> +      gcc_assert (range_int_cst_p (&vr1));
>>> +      unsigned int prec = tree_to_uhwi (vr1.min);
>>> +      type = vr0.type;
>>> +      wide_int tmin, tmax;
>>> +      wide_int type_min, type_max;
>>> +      wide_int may_be_nonzero, must_be_nonzero;
>>> +
>>> +      gcc_assert (!TYPE_UNSIGNED (expr_type));
>>>
>>> hmm, I don't think we should restrict SEXT_EXPR this way.  SEXT_EXPR
>>> should operate on both signed and unsigned types and the result type
>>> should be the same as the type of operand 0.
>>>
>>> +      type_min = wi::shwi (1 << (prec - 1),
>>> +                          TYPE_PRECISION (TREE_TYPE (vr0.min)));
>>> +      type_max = wi::shwi (((1 << (prec - 1)) - 1),
>>> +                          TYPE_PRECISION (TREE_TYPE (vr0.max)));
>>>
>>> there is wi::min_value and max_value for this.
>>
>>
>> As of now, SEXT_EXPR in gimple is of the form: x = y sext 8 and types of all
>> the operand and results are of the wider type. Therefore we cant use the
>> wi::min_value. Or do you want to convert this precision (in this case 8) to
>> a type and use wi::min_value?
> 
> I don't understand - wi::min/max_value get a precision and sign, not a type.
> your 1 << (prec - 1) is even wrong for prec > 32 (it's an integer type
> expression).
> Thus
> 
>   type_min = wi::min_value (prec, SIGNED);
>   type_max = wi::max_value (prec, SIGNED);
> 

Thanks for the comments. Does the attached patch look better? It is based
on the above. I am still assuming the position of the sign bit in SEXT_EXPR
will be less than 64 bits (for calculating sign_bit in wide_int format). I
think this will always be the case, but please let me know if it is not OK.

Thanks,
Kugan


> 
>> Please find the patch that addresses the other comments.

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3505 bytes --]

From 963e5ed4576bd7f82e83b21f35c58e9962dbbc74 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:45:52 +1000
Subject: [PATCH 3/7] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 2cd71a2..ada1c9f 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2317,6 +2317,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2877,6 +2878,51 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      unsigned int prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
+				    TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = may_be_nonzero;
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = must_be_nonzero;
+	      tmax = may_be_nonzero;
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9244,6 +9290,28 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
+					TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  wide_int mask = wi::shwi (((1ULL << (prec - 1)) - 1),
+				    TYPE_PRECISION (TREE_TYPE (vr0.max)));
+	  mask = wi::bit_not (mask);
+	  if (wi::bit_and (must_be_nonzero0, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if (wi::bit_and (may_be_nonzero0, sign_bit) != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9946,6 +10014,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-10-07 23:40         ` Kugan
@ 2015-10-09 10:29           ` Richard Biener
  2015-10-11  2:56             ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-10-09 10:29 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Oct 8, 2015 at 1:40 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 07/10/15 19:20, Richard Biener wrote:
>> On Wed, Oct 7, 2015 at 1:12 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>>
>>> On 15/09/15 23:08, Richard Biener wrote:
>>>>
>>>> On Mon, Sep 7, 2015 at 4:58 AM, Kugan <kugan.vivekanandarajah@linaro.org>
>>>> wrote:
>>>>>
>>>>> This patch tree-vrp handling and optimization for ZEXT_EXPR.
>>>>
>>>>
>>>> +  else if (code == SEXT_EXPR)
>>>> +    {
>>>> +      gcc_assert (range_int_cst_p (&vr1));
>>>> +      unsigned int prec = tree_to_uhwi (vr1.min);
>>>> +      type = vr0.type;
>>>> +      wide_int tmin, tmax;
>>>> +      wide_int type_min, type_max;
>>>> +      wide_int may_be_nonzero, must_be_nonzero;
>>>> +
>>>> +      gcc_assert (!TYPE_UNSIGNED (expr_type));
>>>>
>>>> hmm, I don't think we should restrict SEXT_EXPR this way.  SEXT_EXPR
>>>> should operate on both signed and unsigned types and the result type
>>>> should be the same as the type of operand 0.
>>>>
>>>> +      type_min = wi::shwi (1 << (prec - 1),
>>>> +                          TYPE_PRECISION (TREE_TYPE (vr0.min)));
>>>> +      type_max = wi::shwi (((1 << (prec - 1)) - 1),
>>>> +                          TYPE_PRECISION (TREE_TYPE (vr0.max)));
>>>>
>>>> there is wi::min_value and max_value for this.
>>>
>>>
>>> As of now, SEXT_EXPR in gimple is of the form: x = y sext 8 and types of all
>>> the operand and results are of the wider type. Therefore we cant use the
>>> wi::min_value. Or do you want to convert this precision (in this case 8) to
>>> a type and use wi::min_value?
>>
>> I don't understand - wi::min/max_value get a precision and sign, not a type.
>> your 1 << (prec - 1) is even wrong for prec > 32 (it's an integer type
>> expression).
>> Thus
>>
>>   type_min = wi::min_value (prec, SIGNED);
>>   type_max = wi::max_value (prec, SIGNED);
>>
>
> Thanks for the comments. Does the attached patch look better? It is based
> on the above. I am still assuming the position of the sign bit in SEXT_EXPR
> will be less than 64 bits (for calculating sign_bit in wide_int format). I
> think this will always be the case, but please let me know if it is not OK.

+      unsigned int prec = tree_to_uhwi (vr1.min);

this should use unsigned HOST_WIDE_INT

+      wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
+                                   TYPE_PRECISION (TREE_TYPE (vr0.min)));

use wi::one (TYPE_PRECISION (TREE_TYPE (vr0.min))) << (prec - 1);

That is, you really need to handle precisions bigger than HOST_WIDE_INT.

But I suppose wide_int really misses a test_bit function (it has a set_bit
one already).

+         if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+           {
+             /* If to-be-extended sign bit is one.  */
+             tmin = type_min;
+             tmax = may_be_nonzero;

I think tmax should be zero-extended may_be_nonzero from prec.

+         else if (wi::bit_and (may_be_nonzero, sign_bit)
+                  != sign_bit)
+           {
+             /* If to-be-extended sign bit is zero.  */
+             tmin = must_be_nonzero;
+             tmax = may_be_nonzero;

likewise here tmin/tmax should be zero-extended may/must_be_nonzero from prec.

+    case SEXT_EXPR:
+       {
+         unsigned int prec = tree_to_uhwi (op1);
+         wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
+                                       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+         wide_int mask = wi::shwi (((1ULL << (prec - 1)) - 1),
+                                   TYPE_PRECISION (TREE_TYPE (vr0.max)));

this has the same host precision issues of 1ULL (HOST_WIDE_INT).
There is wi::mask, eventually you can use wi::set_bit_in_zero to
produce the sign-bit wide_int (also above).
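
Concretely, a host-independent construction could look like this (a
sketch only; dprec is just shorthand here for the precision of vr0's
type):

  unsigned int dprec = TYPE_PRECISION (TREE_TYPE (vr0.min));
  wide_int sign_bit  = wi::set_bit_in_zero (prec - 1, dprec);
  wide_int high_bits = wi::mask (prec, true, dprec);
  wide_int low_bits  = wi::zext (may_be_nonzero, prec);

sign_bit has only bit prec-1 set, high_bits has all bits from prec
upwards set, and wi::zext clears everything at or above prec, so none
of these depend on the width of HOST_WIDE_INT.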

The rest of the patch looks ok.

Richard.


> Thanks,
> Kugan
>
>
>>
>>> Please find the patch that addresses the other comments.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-10-09 10:29           ` Richard Biener
@ 2015-10-11  2:56             ` Kugan
  2015-10-12 12:13               ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-11  2:56 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1776 bytes --]



On 09/10/15 21:29, Richard Biener wrote:
> +      unsigned int prec = tree_to_uhwi (vr1.min);
> 
> this should use unsigned HOST_WIDE_INT
> 
> +      wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
> +                                   TYPE_PRECISION (TREE_TYPE (vr0.min)));
> 
> use wi::one (TYPE_PRECISION (TREE_TYPE (vr0.min))) << (prec - 1);
> 
> That is, you really need to handle precisions bigger than HOST_WIDE_INT.
> 
> But I suppose wide_int really misses a test_bit function (it has a set_bit
> one already).
> 
> +         if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
> +           {
> +             /* If to-be-extended sign bit is one.  */
> +             tmin = type_min;
> +             tmax = may_be_nonzero;
> 
> I think tmax should be zero-extended may_be_nonzero from prec.
> 
> +         else if (wi::bit_and (may_be_nonzero, sign_bit)
> +                  != sign_bit)
> +           {
> +             /* If to-be-extended sign bit is zero.  */
> +             tmin = must_be_nonzero;
> +             tmax = may_be_nonzero;
> 
> likewise here tmin/tmax should be zero-extended may/must_be_nonzero from prec.
> 
> +    case SEXT_EXPR:
> +       {
> +         unsigned int prec = tree_to_uhwi (op1);
> +         wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
> +                                       TYPE_PRECISION (TREE_TYPE (vr0.min)));
> +         wide_int mask = wi::shwi (((1ULL << (prec - 1)) - 1),
> +                                   TYPE_PRECISION (TREE_TYPE (vr0.max)));
> 
> this has the same host precision issues of 1ULL (HOST_WIDE_INT).
> There is wi::mask, eventually you can use wi::set_bit_in_zero to
> produce the sign-bit wide_int (also above).


Thanks Richard. Does the attached patch look better?

Thanks,
Kugan

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3524 bytes --]

From cf5f75f5c96d30cdd968e71035a398cb0d5fcff7 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:45:52 +1000
Subject: [PATCH 3/7] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 2cd71a2..c04d290 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2317,6 +2317,7 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2877,6 +2878,52 @@ extract_range_from_binary_expr_1 (value_range_t *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9244,6 +9291,28 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int sign_bit
+	    = wi::set_bit_in_zero (prec - 1,
+				   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  wide_int mask = wi::mask (prec, true,
+				    TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  if (wi::bit_and (must_be_nonzero0, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if (wi::bit_and (may_be_nonzero0, sign_bit) != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9946,6 +10015,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [1/7] Add new tree code SEXT_EXPR
  2015-09-15 13:20   ` Richard Biener
@ 2015-10-11 10:35     ` Kugan
  2015-10-12 12:22       ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-11 10:35 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2562 bytes --]



On 15/09/15 23:18, Richard Biener wrote:
> On Mon, Sep 7, 2015 at 4:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> This patch adds support for new tree code SEXT_EXPR.
> 
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index d567a87..bbc3c10 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -5071,6 +5071,10 @@ expand_debug_expr (tree exp)
>      case FMA_EXPR:
>        return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
> 
> +    case SEXT_EXPR:
> +      return op0;
> 
> that looks wrong.  Generate (sext:... ) here?
> 
> +    case SEXT_EXPR:
> +       {
> +         rtx op0 = expand_normal (treeop0);
> +         rtx temp;
> +         if (!target)
> +           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
> +
> +         machine_mode inner_mode
> +           = smallest_mode_for_size (tree_to_shwi (treeop1),
> +                                     MODE_INT);
> +         temp = convert_modes (inner_mode,
> +                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
> +         convert_move (target, temp, 0);
> +         return target;
> +       }
> 
> Humm - is that really how we expand sign extensions right now?  No helper
> that would generate (sext ...) directly?  I wouldn't try using 'target' btw but
> simply return (sext:mode op0 op1) or so.  But I am no way an RTL expert.
> 
> Note that if we don't disallow arbitrary precision SEXT_EXPRs we have to
> fall back to using shifts (and smallest_mode_for_size is simply wrong).
> 
> +    case SEXT_EXPR:
> +      {
> +       if (!INTEGRAL_TYPE_P (lhs_type)
> +           || !INTEGRAL_TYPE_P (rhs1_type)
> +           || TREE_CODE (rhs2) != INTEGER_CST)
> 
> please constrain this some more, with
> 
>    || !useless_type_conversion_p (lhs_type, rhs1_type)
> 
> +         {
> +           error ("invalid operands in sext expr");
> +           return true;
> +         }
> +       return false;
> +      }
> 
> @@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
>      case MIN_EXPR:
>        return "min";
> 
> +    case SEXT_EXPR:
> +      return "sext from bit";
> +
> 
> just "sext" please.
> 
> +/*  Sign-extend operation.  It will sign extend first operand from
> + the sign bit specified by the second operand.  */
> +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
> 
> "from the INTEGER_CST sign bit specified"
> 
> Also add "The type of the result is that of the first operand."
> 



Thanks for the review. Attached patch attempts to address the above
comments. Does this look better?
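
For reference, the constant folding case in the patch simply defers to
wi::sext; a small sketch of the intended semantics (values illustrative):

  /* SEXT <x, 8> sign-extends x in place from bit 7.  */
  wide_int v = wi::shwi (0xff, 32);
  wide_int r = wi::sext (v, 8);                    /* all ones, i.e. -1  */
  wide_int u = wi::sext (wi::shwi (0x7f, 32), 8);  /* stays 0x7f         */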


Thanks,
Kugan

[-- Attachment #2: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5041 bytes --]

From 2326daf0e7088e01e87574c9824b1c7248395798 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:37:15 +1000
Subject: [PATCH 1/7] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 10 ++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 13 +++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 64 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 58e55d2..dea0e37 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5057,6 +5057,16 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index 0bbfccd..30898a2 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9296,6 +9296,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_shwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 7231fd6..d693b42 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -984,6 +984,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 712d8cc..97da2f3 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3756,6 +3756,19 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !INTEGRAL_TYPE_P (rhs1_type)
+	    || TREE_CODE (rhs2) != INTEGER_CST)
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index ac9586e..0975730 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3884,6 +3884,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..efd8d5b 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1794,6 +1794,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..48e7413 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  It will sign extend first operand from
+ the sign bit specified by the second operand.  The type of the
+ result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [3/7] Optimize ZEXT_EXPR with tree-vrp
  2015-10-11  2:56             ` Kugan
@ 2015-10-12 12:13               ` Richard Biener
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Biener @ 2015-10-12 12:13 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Sun, Oct 11, 2015 at 4:56 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 09/10/15 21:29, Richard Biener wrote:
>> +      unsigned int prec = tree_to_uhwi (vr1.min);
>>
>> this should use unsigned HOST_WIDE_INT
>>
>> +      wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
>> +                                   TYPE_PRECISION (TREE_TYPE (vr0.min)));
>>
>> use wi::one (TYPE_PRECISION (TREE_TYPE (vr0.min))) << (prec - 1);
>>
>> That is, you really need to handle precisions bigger than HOST_WIDE_INT.
>>
>> But I suppose wide_int really misses a test_bit function (it has a set_bit
>> one already).
>>
>> +         if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
>> +           {
>> +             /* If to-be-extended sign bit is one.  */
>> +             tmin = type_min;
>> +             tmax = may_be_nonzero;
>>
>> I think tmax should be zero-extended may_be_nonzero from prec.
>>
>> +         else if (wi::bit_and (may_be_nonzero, sign_bit)
>> +                  != sign_bit)
>> +           {
>> +             /* If to-be-extended sign bit is zero.  */
>> +             tmin = must_be_nonzero;
>> +             tmax = may_be_nonzero;
>>
>> likewise here tmin/tmax should be zero-extended may/must_be_nonzero from prec.
>>
>> +    case SEXT_EXPR:
>> +       {
>> +         unsigned int prec = tree_to_uhwi (op1);
>> +         wide_int sign_bit = wi::shwi (1ULL << (prec - 1),
>> +                                       TYPE_PRECISION (TREE_TYPE (vr0.min)));
>> +         wide_int mask = wi::shwi (((1ULL << (prec - 1)) - 1),
>> +                                   TYPE_PRECISION (TREE_TYPE (vr0.max)));
>>
>> this has the same host precision issues of 1ULL (HOST_WIDE_INT).
>> There is wi::mask, eventually you can use wi::set_bit_in_zero to
>> produce the sign-bit wide_int (also above).
>
>
> Thanks Richard. Does the attached patch look better?

Yes.  That variant is ok once the prerequisites have been approved.

Thanks,
Richard.


> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [1/7] Add new tree code SEXT_EXPR
  2015-10-11 10:35     ` Kugan
@ 2015-10-12 12:22       ` Richard Biener
  2015-10-15  5:49         ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-10-12 12:22 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Sun, Oct 11, 2015 at 12:35 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 15/09/15 23:18, Richard Biener wrote:
>> On Mon, Sep 7, 2015 at 4:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> This patch adds support for new tree code SEXT_EXPR.
>>
>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>> index d567a87..bbc3c10 100644
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -5071,6 +5071,10 @@ expand_debug_expr (tree exp)
>>      case FMA_EXPR:
>>        return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
>>
>> +    case SEXT_EXPR:
>> +      return op0;
>>
>> that looks wrong.  Generate (sext:... ) here?
>>
>> +    case SEXT_EXPR:
>> +       {
>> +         rtx op0 = expand_normal (treeop0);
>> +         rtx temp;
>> +         if (!target)
>> +           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
>> +
>> +         machine_mode inner_mode
>> +           = smallest_mode_for_size (tree_to_shwi (treeop1),
>> +                                     MODE_INT);
>> +         temp = convert_modes (inner_mode,
>> +                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
>> +         convert_move (target, temp, 0);
>> +         return target;
>> +       }
>>
>> Humm - is that really how we expand sign extensions right now?  No helper
>> that would generate (sext ...) directly?  I wouldn't try using 'target' btw but
>> simply return (sext:mode op0 op1) or so.  But I am no way an RTL expert.
>>
>> Note that if we don't disallow arbitrary precision SEXT_EXPRs we have to
>> fall back to using shifts (and smallest_mode_for_size is simply wrong).
>>
>> +    case SEXT_EXPR:
>> +      {
>> +       if (!INTEGRAL_TYPE_P (lhs_type)
>> +           || !INTEGRAL_TYPE_P (rhs1_type)
>> +           || TREE_CODE (rhs2) != INTEGER_CST)
>>
>> please constrain this some more, with
>>
>>    || !useless_type_conversion_p (lhs_type, rhs1_type)
>>
>> +         {
>> +           error ("invalid operands in sext expr");
>> +           return true;
>> +         }
>> +       return false;
>> +      }
>>
>> @@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
>>      case MIN_EXPR:
>>        return "min";
>>
>> +    case SEXT_EXPR:
>> +      return "sext from bit";
>> +
>>
>> just "sext" please.
>>
>> +/*  Sign-extend operation.  It will sign extend first operand from
>> + the sign bit specified by the second operand.  */
>> +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
>>
>> "from the INTEGER_CST sign bit specified"
>>
>> Also add "The type of the result is that of the first operand."
>>
>
>
>
> Thanks for the review. Attached patch attempts to address the above
> comments. Does this look better?

+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);

We should add

        gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));

+      if (mode != inner_mode)
+       op0 = simplify_gen_unary (SIGN_EXTEND,
+                                 mode,
+                                 gen_lowpart_SUBREG (inner_mode, op0),
+                                 inner_mode);

as we're otherwise silently dropping things like SEXT (short-typed-var, 13)

+    case SEXT_EXPR:
+       {
+         machine_mode inner_mode = mode_for_size (tree_to_shwi (treeop1),
+                                                  MODE_INT, 0);

Likewise.  Also treeop1 should be unsigned, thus tree_to_uhwi?

+         rtx temp, result;
+         rtx op0 = expand_normal (treeop0);
+         op0 = force_reg (mode, op0);
+         if (mode != inner_mode)
+           {

Again, for the RTL bits I'm not sure they are correct.  For example I don't
see why we need a lowpart SUBREG, isn't a "regular" SUBREG enough?

+    case SEXT_EXPR:
+      {
+       if (!INTEGRAL_TYPE_P (lhs_type)
+           || !useless_type_conversion_p (lhs_type, rhs1_type)
+           || !INTEGRAL_TYPE_P (rhs1_type)
+           || TREE_CODE (rhs2) != INTEGER_CST)

the INTEGRAL_TYPE_P (rhs1_type) check is redundant with
the useless_type_Conversion_p one.  Please check
tree_fits_uhwi (rhs2) instead of != INTEGER_CST.

Otherwise ok for trunk.

Thanks,
Richard.



>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [5/7] Allow gimple debug stmt in widen mode
  2015-09-15 13:02       ` Richard Biener
@ 2015-10-15  5:45         ` Kugan
  2015-10-16  9:27           ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-15  5:45 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4197 bytes --]



On 15/09/15 22:57, Richard Biener wrote:
> On Tue, Sep 8, 2015 at 2:00 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> Thanks for the review.
>>
>> On 07/09/15 23:20, Michael Matz wrote:
>>> Hi,
>>>
>>> On Mon, 7 Sep 2015, Kugan wrote:
>>>
>>>> Allow GIMPLE_DEBUG with values in promoted register.
>>>
>>> Patch does much more.
>>>
>>
>> Oops sorry. Copy and paste mistake.
>>
>> gcc/ChangeLog:
>>
>> 2015-09-07 Kugan Vivekanandarajah <kuganv@linaro.org>
>>
>> * cfgexpand.c (expand_debug_locations): Remove assert as now we are
>> also allowing values in promoted register.
>> * gimple-ssa-type-promote.c (fixup_uses): Allow GIMPLE_DEBUG to bind
>> values in promoted register.
>> * rtl.h (wi::int_traits ::decompose): Accept zero extended value
>> also.
>>
>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>
>>>>      * expr.c (expand_expr_real_1): Set proper SUNREG_PROMOTED_MODE for
>>>>      SSA_NAME that was set by GIMPLE_CALL and assigned to another
>>>>      SSA_NAME of same type.
>>>
>>> ChangeLog doesn't match patch, and patch contains dubious changes:
>>>
>>>> --- a/gcc/cfgexpand.c
>>>> +++ b/gcc/cfgexpand.c
>>>> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
>>>>         tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
>>>>         rtx val;
>>>>         rtx_insn *prev_insn, *insn2;
>>>> -       machine_mode mode;
>>>>
>>>>         if (value == NULL_TREE)
>>>>           val = NULL_RTX;
>>>> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>>>>
>>>>         if (!val)
>>>>           val = gen_rtx_UNKNOWN_VAR_LOC ();
>>>> -       else
>>>> -         {
>>>> -           mode = GET_MODE (INSN_VAR_LOCATION (insn));
>>>> -
>>>> -           gcc_assert (mode == GET_MODE (val)
>>>> -                       || (GET_MODE (val) == VOIDmode
>>>> -                           && (CONST_SCALAR_INT_P (val)
>>>> -                               || GET_CODE (val) == CONST_FIXED
>>>> -                               || GET_CODE (val) == LABEL_REF)));
>>>> -         }
>>>>
>>>>         INSN_VAR_LOCATION_LOC (insn) = val;
>>>>         prev_insn = PREV_INSN (insn);
>>>
>>> So it seems that the modes of the values location and the value itself
>>> don't have to match anymore, which seems dubious when considering how a
>>> debugger should load the value in question from the given location.  So,
>>> how is it supposed to work?
>>
>> For example (simplified test-case from creduce):
>>
>> fn1() {
>>   char a = fn1;
>>   return a;
>> }
>>
>> --- test.c.142t.veclower21      2015-09-07 23:47:26.362201640 +0000
>> +++ test.c.143t.promotion       2015-09-07 23:47:26.362201640 +0000
>> @@ -5,13 +5,18 @@
>>  {
>>    char a;
>>    long int fn1.0_1;
>> +  unsigned int _2;
>>    int _3;
>> +  unsigned int _5;
>> +  char _6;
>>
>>    <bb 2>:
>>    fn1.0_1 = (long int) fn1;
>> -  a_2 = (char) fn1.0_1;
>> -  # DEBUG a => a_2
>> -  _3 = (int) a_2;
>> +  _5 = (unsigned int) fn1.0_1;
>> +  _2 = _5 & 255;
>> +  # DEBUG a => _2
>> +  _6 = (char) _2;
>> +  _3 = (int) _6;
>>    return _3;
>>
>>  }
>>
>> Please see that DEBUG now points to _2 which is a promoted mode. I am
>> assuming that the debugger would load required precision from promoted
>> register. May be I am missing the details but how else we can handle
>> this? Any suggestions?
> 
> I would have expected the DEBUG insn to be adjusted as
> 
> # DEBUG a => (char)_2

Thanks for the review. Please find the attached patch that attempts to
do this. I have also tested a version of this patch with the gdb testsuite.

As Michael wanted, I have also removed the changes in rtl.h and the
promotion of constants in GIMPLE_DEBUG.


> Btw, why do we have
> 
>> +  _6 = (char) _2;
>> +  _3 = (int) _6;
> 
> ?  I'd have expected
> 
>  unsigned int _6 = SEXT <_2, 8>
>  _3 = (int) _6;
>  return _3;

I am looking into it.

> 
> see my other mail about promotion of PARM_DECLs and RESULT_DECLs -- we should
> promote those as well.
> 

Just to be sure, are you referring to
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00244.html
where you wanted an IPA pass to perform this? This is one of my todos
after this. Please let me know if what you wanted here is a different issue.


Thanks,
Kugan

[-- Attachment #2: 0005-debug-stmt-in-widen-mode.patch --]
[-- Type: text/x-diff, Size: 3483 bytes --]

From 7cbcca8ebd03f60e16a55da4af3fc573f98d4086 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Tue, 1 Sep 2015 08:40:40 +1000
Subject: [PATCH 5/7] debug stmt in widen mode

---
 gcc/cfgexpand.c               | 11 -------
 gcc/gimple-ssa-type-promote.c | 77 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 74 insertions(+), 14 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 357710b..be43f46 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5234,7 +5234,6 @@ expand_debug_locations (void)
 	tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
 	rtx val;
 	rtx_insn *prev_insn, *insn2;
-	machine_mode mode;
 
 	if (value == NULL_TREE)
 	  val = NULL_RTX;
@@ -5269,16 +5268,6 @@ expand_debug_locations (void)
 
 	if (!val)
 	  val = gen_rtx_UNKNOWN_VAR_LOC ();
-	else
-	  {
-	    mode = GET_MODE (INSN_VAR_LOCATION (insn));
-
-	    gcc_assert (mode == GET_MODE (val)
-			|| (GET_MODE (val) == VOIDmode
-			    && (CONST_SCALAR_INT_P (val)
-				|| GET_CODE (val) == CONST_FIXED
-				|| GET_CODE (val) == LABEL_REF)));
-	  }
 
 	INSN_VAR_LOCATION_LOC (insn) = val;
 	prev_insn = PREV_INSN (insn);
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index 513d20d..4034203 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -585,10 +585,81 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
 	{
 	case GIMPLE_DEBUG:
 	    {
-	      gsi = gsi_for_stmt (stmt);
-	      gsi_remove (&gsi, true);
-	      break;
+	      tree op, new_op = NULL_TREE;
+	      gdebug *copy = NULL, *gs = as_a <gdebug *> (stmt);
+	      enum tree_code code;
+
+	      switch (gs->subcode)
+		{
+		case GIMPLE_DEBUG_BIND:
+		  op = gimple_debug_bind_get_value (gs);
+		  break;
+		case GIMPLE_DEBUG_SOURCE_BIND:
+		  op = gimple_debug_source_bind_get_value (gs);
+		  break;
+		default:
+		  gcc_unreachable ();
+		}
+
+	      switch (TREE_CODE_CLASS (TREE_CODE (op)))
+		{
+		case tcc_exceptional:
+		case tcc_unary:
+		    {
+			/* Convert DEBUG stmt of the form:
+				# DEBUG a => _2
+				to
+				# DEBUG a => (char)_2 */
+		      new_op = build1 (CONVERT_EXPR, old_type, use);
+		      break;
+		    }
+		case tcc_binary:
+		  code = TREE_CODE (op);
+		  /* Convert the INTEGER_CST in tcc_binary to promoted_type,
+		     if the expression is of kind that will be promoted.  */
+		  if (code == LROTATE_EXPR
+		      || code == RROTATE_EXPR
+		      || code == COMPLEX_EXPR)
+		    break;
+		  else
+		    {
+		      tree op0 = TREE_OPERAND (op, 0);
+		      tree op1 = TREE_OPERAND (op, 1);
+		      if (TREE_CODE (op0) == INTEGER_CST)
+			op0 = convert_int_cst (promoted_type, op0, SIGNED);
+		      if (TREE_CODE (op1) == INTEGER_CST)
+			op1 = convert_int_cst (promoted_type, op1, SIGNED);
+		      new_op = build2 (TREE_CODE (op), promoted_type, op0, op1);
+		      break;
+		    }
+		default:
+		  break;
+		}
+
+	      if (new_op)
+		{
+		  if (gimple_debug_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_bind
+			(gimple_debug_bind_get_var (stmt),
+			 new_op,
+			 stmt);
+		    }
+		  if (gimple_debug_source_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_source_bind
+			(gimple_debug_source_bind_get_var (stmt), new_op,
+			 stmt);
+		    }
+
+		  if (copy)
+		    {
+		      gsi = gsi_for_stmt (stmt);
+		      gsi_replace (&gsi, copy, false);
+		    }
+		}
 	    }
+	  break;
 
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [1/7] Add new tree code SEXT_EXPR
  2015-10-12 12:22       ` Richard Biener
@ 2015-10-15  5:49         ` Kugan
  2015-10-21 10:49           ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-15  5:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4462 bytes --]



On 12/10/15 23:21, Richard Biener wrote:
> On Sun, Oct 11, 2015 at 12:35 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 15/09/15 23:18, Richard Biener wrote:
>>> On Mon, Sep 7, 2015 at 4:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>> This patch adds support for new tree code SEXT_EXPR.
>>>
>>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>>> index d567a87..bbc3c10 100644
>>> --- a/gcc/cfgexpand.c
>>> +++ b/gcc/cfgexpand.c
>>> @@ -5071,6 +5071,10 @@ expand_debug_expr (tree exp)
>>>      case FMA_EXPR:
>>>        return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
>>>
>>> +    case SEXT_EXPR:
>>> +      return op0;
>>>
>>> that looks wrong.  Generate (sext:... ) here?
>>>
>>> +    case SEXT_EXPR:
>>> +       {
>>> +         rtx op0 = expand_normal (treeop0);
>>> +         rtx temp;
>>> +         if (!target)
>>> +           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
>>> +
>>> +         machine_mode inner_mode
>>> +           = smallest_mode_for_size (tree_to_shwi (treeop1),
>>> +                                     MODE_INT);
>>> +         temp = convert_modes (inner_mode,
>>> +                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
>>> +         convert_move (target, temp, 0);
>>> +         return target;
>>> +       }
>>>
>>> Humm - is that really how we expand sign extensions right now?  No helper
>>> that would generate (sext ...) directly?  I wouldn't try using 'target' btw but
>>> simply return (sext:mode op0 op1) or so.  But I am no way an RTL expert.
>>>
>>> Note that if we don't disallow arbitrary precision SEXT_EXPRs we have to
>>> fall back to using shifts (and smallest_mode_for_size is simply wrong).
>>>
>>> +    case SEXT_EXPR:
>>> +      {
>>> +       if (!INTEGRAL_TYPE_P (lhs_type)
>>> +           || !INTEGRAL_TYPE_P (rhs1_type)
>>> +           || TREE_CODE (rhs2) != INTEGER_CST)
>>>
>>> please constrain this some more, with
>>>
>>>    || !useless_type_conversion_p (lhs_type, rhs1_type)
>>>
>>> +         {
>>> +           error ("invalid operands in sext expr");
>>> +           return true;
>>> +         }
>>> +       return false;
>>> +      }
>>>
>>> @@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
>>>      case MIN_EXPR:
>>>        return "min";
>>>
>>> +    case SEXT_EXPR:
>>> +      return "sext from bit";
>>> +
>>>
>>> just "sext" please.
>>>
>>> +/*  Sign-extend operation.  It will sign extend first operand from
>>> + the sign bit specified by the second operand.  */
>>> +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
>>>
>>> "from the INTEGER_CST sign bit specified"
>>>
>>> Also add "The type of the result is that of the first operand."
>>>
>>
>>
>>
>> Thanks for the review. Attached patch attempts to address the above
>> comments. Does this look better?
> 
> +    case SEXT_EXPR:
> +      gcc_assert (CONST_INT_P (op1));
> +      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
> 
> We should add
> 
>         gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
> 
> +      if (mode != inner_mode)
> +       op0 = simplify_gen_unary (SIGN_EXTEND,
> +                                 mode,
> +                                 gen_lowpart_SUBREG (inner_mode, op0),
> +                                 inner_mode);
> 
> as we're otherwise silently dropping things like SEXT (short-typed-var, 13)
> 
> +    case SEXT_EXPR:
> +       {
> +         machine_mode inner_mode = mode_for_size (tree_to_shwi (treeop1),
> +                                                  MODE_INT, 0);
> 
> Likewise.  Also treeop1 should be unsigned, thus tree_to_uhwi?
> 
> +         rtx temp, result;
> +         rtx op0 = expand_normal (treeop0);
> +         op0 = force_reg (mode, op0);
> +         if (mode != inner_mode)
> +           {
> 
> Again, for the RTL bits I'm not sure they are correct.  For example I don't
> see why we need a lowpart SUBREG, isn't a "regular" SUBREG enough?
> 
> +    case SEXT_EXPR:
> +      {
> +       if (!INTEGRAL_TYPE_P (lhs_type)
> +           || !useless_type_conversion_p (lhs_type, rhs1_type)
> +           || !INTEGRAL_TYPE_P (rhs1_type)
> +           || TREE_CODE (rhs2) != INTEGER_CST)
> 
> the INTEGRAL_TYPE_P (rhs1_type) check is redundant with
> the useless_type_Conversion_p one.  Please check
> tree_fits_uhwi (rhs2) instead of != INTEGER_CST.
> 

Thanks for the review. Please find the updated patch based on the review
comments.

Thanks,
Kugan

[-- Attachment #2: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5066 bytes --]

From e600c266ac7932ffe4cb36830e8d62c90f6e26ee Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:37:15 +1000
Subject: [PATCH 1/7] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 58e55d2..357710b 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5057,6 +5057,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index 0bbfccd..63bd1b6 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9296,6 +9296,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 7231fd6..d693b42 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -984,6 +984,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 712d8cc..03ae758 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3756,6 +3756,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index ac9586e..0975730 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3884,6 +3884,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..efd8d5b 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1794,6 +1794,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index 56580af..48e7413 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -752,6 +752,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  It will sign extend first operand from
+ the sign bit specified by the second operand.  The type of the
+ result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [2/7] Add new type promotion pass
  2015-09-07  2:58 ` [2/7] Add new type promotion pass Kugan
@ 2015-10-15  5:52   ` Kugan
  2015-10-15 22:47     ` Richard Henderson
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-15  5:52 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

[-- Attachment #1: Type: text/plain, Size: 879 bytes --]



On 07/09/15 12:56, Kugan wrote:
> 
> This pass applies type promotion to SSA names in the function and
> inserts appropriate truncations to preserve the semantics.  Idea of this
> pass is to promote operations such a way that we can minimize generation
> of subreg in RTL, that intern results in removal of redundant zero/sign
> extensions.
> 
> gcc/ChangeLog:
> 
> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
> 
> 	* Makefile.in: Add gimple-ssa-type-promote.o.
> 	* common.opt: New option -ftree-type-promote.
> 	* doc/invoke.texi: Document -ftree-type-promote.
> 	* gimple-ssa-type-promote.c: New file.
> 	* passes.def: Define new pass_type_promote.
> 	* timevar.def: Define new TV_TREE_TYPE_PROMOTE.
> 	* tree-pass.h (make_pass_type_promote): New.
> 	* tree-ssanames.c (set_range_info): Adjust range_info.
> 

Here is the latest version of the patch.

Thanks,
Kugan

[-- Attachment #2: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 28923 bytes --]

From 69c05e27b39cd9977e1a412e1c1b3255409ba351 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Mon, 17 Aug 2015 13:44:50 +1000
Subject: [PATCH 2/7] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 827 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 gcc/tree-ssanames.c           |   3 +-
 8 files changed, 847 insertions(+), 1 deletion(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 009c745..0946055 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1498,6 +1498,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 94d1d88..b5a93b0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2378,6 +2378,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 50cc520..e6f0ce1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9051,6 +9051,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  Idea of
+this pass is to promote operations such a way that we can minimise
+generation of subreg in RTL, that intern results in removal of
+redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..513d20d
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,827 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  Idea of this pass is to promote operations
+   such a way that we can minimise generation of subreg in RTL,
+   that intern results in removal of redundant zero/sign extensions.  This pass
+   will run prior to VRP and DOM so that they will be able to optimise
+   redundant truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+	  || TYPE_PRECISION (type) == 16
+	  || TYPE_PRECISION (type) == 32);
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple *stmt, gimple *copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gcc_assert (width != 1);
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    /* Zero extend.  */
+    stmt = gimple_build_assign (new_var,
+				BIT_AND_EXPR,
+				var, build_int_cst (TREE_TYPE (var),
+						    ((1ULL << width) - 1)));
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines def
+   is def_stmt, make the type of def promoted_type.  If the stmt is such
+   that the result of the def_stmt cannot be of promoted_type, create a new_def
+   of the original_type and make the def_stmt assign its value to new_def.
+   Then, create a CONVERT_EXPR to convert new_def to def of promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (!type_precision_ok (TREE_TYPE (rhs)))
+		{
+		  do_not_promote = true;
+		}
+	      else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, arguments
+		     of def_stmt will be visited before def itself.  If RHS
+		     is already promoted and its type is compatible, we can
+		     convert this into a ZERO/SIGN EXTEND stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		    type = original_type;
+		  gcc_assert (type != NULL_TREE);
+		  TREE_TYPE (def) = promoted_type;
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, rhs,
+					   TYPE_PRECISION (type));
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_replace (&gsi, copy_stmt, false);
+		}
+	      else {
+		  /* If RHS is not promoted OR their types are not
+		     compatible, create CONVERT_EXPR that converts
+		     RHS to  promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0)
+		    insert_stmt_on_edge (def_stmt, copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* Type is now promoted.  Due to this, some of the value ranges computed
+	 by VRP1 will be invalid.  TODO: We could be more intelligent in deciding
+	 which ranges to invalidate instead of invalidating everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt))
+		{
+		  /* In some stmts, value in USE has to be ZERO/SIGN
+		     Extended based on the original type for correct
+		     result.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  tree rhs = gimple_assign_rhs1 (stmt);
+		  if (!type_precision_ok (TREE_TYPE (rhs)))
+		    {
+		      do_not_promote = true;
+		    }
+		  else if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* Type of LHS and promoted RHS are compatible, we can
+			 convert this into ZERO/SIGN EXTEND stmt.  */
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can convert it to ZERO/SIGN
+			 EXTEND when LHS is promoted.  */
+		      tree rhs = gimple_assign_rhs1 (stmt);
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      type = TREE_TYPE (old_type);
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	    {
+	      /* In GIMPLE_COND, value in USE has to be ZERO/SIGN
+		 Extended based on the original type for correct
+		 result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple *copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	      update_stmt (stmt);
+	      break;
+	    }
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create an
+	     original type copy.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple *copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_definition (name, type);
+      fixup_uses (name, type, old_type);
+      set_ssa_promoted (name);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 64fc4d9..254496b 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index ac41075..80171ec 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -276,6 +276,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 3c913ea..d4cc485 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -431,6 +431,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 4199290..43b46d9 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -190,7 +190,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [2/7] Add new type promotion pass
  2015-10-15  5:52   ` Kugan
@ 2015-10-15 22:47     ` Richard Henderson
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Henderson @ 2015-10-15 22:47 UTC (permalink / raw)
  To: Kugan, gcc-patches; +Cc: Richard Biener

On 10/15/2015 04:51 PM, Kugan wrote:
> +generation of subreg in RTL, that intern results in removal of

s/intern/in turn/


r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [5/7] Allow gimple debug stmt in widen mode
  2015-10-15  5:45         ` Kugan
@ 2015-10-16  9:27           ` Richard Biener
  2015-10-18 20:51             ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-10-16  9:27 UTC (permalink / raw)
  To: Kugan; +Cc: Michael Matz, gcc-patches

On Thu, Oct 15, 2015 at 7:44 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 15/09/15 22:57, Richard Biener wrote:
>> On Tue, Sep 8, 2015 at 2:00 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Thanks for the review.
>>>
>>> On 07/09/15 23:20, Michael Matz wrote:
>>>> Hi,
>>>>
>>>> On Mon, 7 Sep 2015, Kugan wrote:
>>>>
>>>>> Allow GIMPLE_DEBUG with values in promoted register.
>>>>
>>>> Patch does much more.
>>>>
>>>
>>> Oops sorry. Copy and paste mistake.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2015-09-07 Kugan Vivekanandarajah <kuganv@linaro.org>
>>>
>>> * cfgexpand.c (expand_debug_locations): Remove assert as now we are
>>> also allowing values in promoted register.
>>> * gimple-ssa-type-promote.c (fixup_uses): Allow GIMPLE_DEBUG to bind
>>> values in promoted register.
>>> * rtl.h (wi::int_traits ::decompose): Accept zero extended value
>>> also.
>>>
>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>>>
>>>>>      * expr.c (expand_expr_real_1): Set proper SUNREG_PROMOTED_MODE for
>>>>>      SSA_NAME that was set by GIMPLE_CALL and assigned to another
>>>>>      SSA_NAME of same type.
>>>>
>>>> ChangeLog doesn't match patch, and patch contains dubious changes:
>>>>
>>>>> --- a/gcc/cfgexpand.c
>>>>> +++ b/gcc/cfgexpand.c
>>>>> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
>>>>>         tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
>>>>>         rtx val;
>>>>>         rtx_insn *prev_insn, *insn2;
>>>>> -       machine_mode mode;
>>>>>
>>>>>         if (value == NULL_TREE)
>>>>>           val = NULL_RTX;
>>>>> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>>>>>
>>>>>         if (!val)
>>>>>           val = gen_rtx_UNKNOWN_VAR_LOC ();
>>>>> -       else
>>>>> -         {
>>>>> -           mode = GET_MODE (INSN_VAR_LOCATION (insn));
>>>>> -
>>>>> -           gcc_assert (mode == GET_MODE (val)
>>>>> -                       || (GET_MODE (val) == VOIDmode
>>>>> -                           && (CONST_SCALAR_INT_P (val)
>>>>> -                               || GET_CODE (val) == CONST_FIXED
>>>>> -                               || GET_CODE (val) == LABEL_REF)));
>>>>> -         }
>>>>>
>>>>>         INSN_VAR_LOCATION_LOC (insn) = val;
>>>>>         prev_insn = PREV_INSN (insn);
>>>>
>>>> So it seems that the modes of the values location and the value itself
>>>> don't have to match anymore, which seems dubious when considering how a
>>>> debugger should load the value in question from the given location.  So,
>>>> how is it supposed to work?
>>>
>>> For example (simplified test-case from creduce):
>>>
>>> fn1() {
>>>   char a = fn1;
>>>   return a;
>>> }
>>>
>>> --- test.c.142t.veclower21      2015-09-07 23:47:26.362201640 +0000
>>> +++ test.c.143t.promotion       2015-09-07 23:47:26.362201640 +0000
>>> @@ -5,13 +5,18 @@
>>>  {
>>>    char a;
>>>    long int fn1.0_1;
>>> +  unsigned int _2;
>>>    int _3;
>>> +  unsigned int _5;
>>> +  char _6;
>>>
>>>    <bb 2>:
>>>    fn1.0_1 = (long int) fn1;
>>> -  a_2 = (char) fn1.0_1;
>>> -  # DEBUG a => a_2
>>> -  _3 = (int) a_2;
>>> +  _5 = (unsigned int) fn1.0_1;
>>> +  _2 = _5 & 255;
>>> +  # DEBUG a => _2
>>> +  _6 = (char) _2;
>>> +  _3 = (int) _6;
>>>    return _3;
>>>
>>>  }
>>>
>>> Please see that DEBUG now points to _2 which is a promoted mode. I am
>>> assuming that the debugger would load required precision from promoted
>>> register. May be I am missing the details but how else we can handle
>>> this? Any suggestions?
>>
>> I would have expected the DEBUG insn to be adjusted as
>>
>> # DEBUG a => (char)_2
>
> Thanks for the review. Please find the attached patch that attempts to
> do this. I have also tested a version of this patch with gdb testsuite.
>
> As Michael wanted, I have also removed the changes in rtl.h and
> promoting constants in GIMPLE_DEBUG.
>
>
>> Btw, why do we have
>>
>>> +  _6 = (char) _2;
>>> +  _3 = (int) _6;
>>
>> ?  I'd have expected
>>
>>  unsigned int _6 = SEXT <_2, 8>
>>  _3 = (int) _6;
>>  return _3;
>
> I am looking into it.
>
>>
>> see my other mail about promotion of PARM_DECLs and RESULT_DECLs -- we should
>> promote those as well.
>>
>
> Just to be sure, are you referring to
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00244.html
> where you wanted an IPA pass to perform this. This is one of my dodo
> after this. Please let me know if you wanted here is a different issue.

No, that's the same issue.

You remove


@@ -5269,16 +5268,6 @@ expand_debug_locations (void)

        if (!val)
          val = gen_rtx_UNKNOWN_VAR_LOC ();
-       else
-         {
-           mode = GET_MODE (INSN_VAR_LOCATION (insn));
-
-           gcc_assert (mode == GET_MODE (val)
-                       || (GET_MODE (val) == VOIDmode
-                           && (CONST_SCALAR_INT_P (val)
-                               || GET_CODE (val) == CONST_FIXED
-                               || GET_CODE (val) == LABEL_REF)));
-         }

which is in place to ensure the debug insns are "valid" in some form(?)
On what kind of insn does the assert trigger with your patch so that
you have to remove it?

+
+             switch (TREE_CODE_CLASS (TREE_CODE (op)))
+               {
+               case tcc_exceptional:
+               case tcc_unary:
+                   {

Hmm.  So when we promote _1 in

  _1 = ...;
 # DEBUG i = _1 + 7;

to sth else it would probably be best, instead of doing conversion of operands
where necessary, to introduce a debug temporary like

 # DEBUG D#1 = (type-of-_1) replacement-of-_1;

and replace debug uses of _1 with D#1
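
Spelled out on the example above (a sketch with illustrative names:
assume _1 originally had type short and _4 is its promoted replacement):

 # DEBUG D#1 = (short) _4;
 # DEBUG i = D#1 + 7;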

Richard.

>
> Thanks,
> Kuganb

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [5/7] Allow gimple debug stmt in widen mode
  2015-10-16  9:27           ` Richard Biener
@ 2015-10-18 20:51             ` Kugan
  0 siblings, 0 replies; 63+ messages in thread
From: Kugan @ 2015-10-18 20:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1589 bytes --]


> You remove
> 
> 
> @@ -5269,16 +5268,6 @@ expand_debug_locations (void)
> 
>         if (!val)
>           val = gen_rtx_UNKNOWN_VAR_LOC ();
> -       else
> -         {
> -           mode = GET_MODE (INSN_VAR_LOCATION (insn));
> -
> -           gcc_assert (mode == GET_MODE (val)
> -                       || (GET_MODE (val) == VOIDmode
> -                           && (CONST_SCALAR_INT_P (val)
> -                               || GET_CODE (val) == CONST_FIXED
> -                               || GET_CODE (val) == LABEL_REF)));
> -         }
> 
> which is in place to ensure the debug insns are "valid" in some form(?)
> On what kind of insn does the assert trigger with your patch so that
> you have to remove it?

Thanks for the review. Please find the attached patch, which drops that
change and instead does the conversion as part of the GIMPLE_DEBUG
handling.

Does this look better?
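
(For the reduced test case earlier in this thread the bind now reads
roughly

  # DEBUG a => (char) _2

i.e. the value is still computed in the promoted register, but the
debug expression converts it back to the original type.)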


Thanks,
Kugan



gcc/ChangeLog:

2015-10-19  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* gimple-ssa-type-promote.c (fixup_uses): For GIMPLE_DEBUG stmts,
	convert the values computed in promoted_type to the original type and bind them.


> 
> +
> +             switch (TREE_CODE_CLASS (TREE_CODE (op)))
> +               {
> +               case tcc_exceptional:
> +               case tcc_unary:
> +                   {
> 
> Hmm.  So when we promote _1 in
> 
>   _1 = ...;
>  # DEBUG i = _1 + 7;
> 
> to sth else it would probably best to instead of doing conversion of operands
> where necessary introduce a debug temporary like
> 
>  # DEBUG D#1 = (type-of-_1) replacement-of-_1;
> 
> and replace debug uses of _1 with D#1


[-- Attachment #2: 0005-debug-stmt-in-widen-mode.patch --]
[-- Type: text/x-diff, Size: 3165 bytes --]

From 47469bb461dcafdf0ce5fe5f020faed0e8d6d4d9 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Tue, 1 Sep 2015 08:40:40 +1000
Subject: [PATCH 5/7] debug stmt in widen mode

---
 gcc/gimple-ssa-type-promote.c | 82 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index d4ca1a3..660bd3f 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -589,10 +589,86 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
 	{
 	case GIMPLE_DEBUG:
 	    {
-	      gsi = gsi_for_stmt (stmt);
-	      gsi_remove (&gsi, true);
-	      break;
+	      /* Change the GIMPLE_DEBUG stmt such that the value bound is
+		 computed in promoted_type and then converted to the required
+		 type.  */
+	      tree op, new_op = NULL_TREE;
+	      gdebug *copy = NULL, *gs = as_a <gdebug *> (stmt);
+	      enum tree_code code;
+
+	      /* Get the value that is bound in debug stmt.  */
+	      switch (gs->subcode)
+		{
+		case GIMPLE_DEBUG_BIND:
+		  op = gimple_debug_bind_get_value (gs);
+		  break;
+		case GIMPLE_DEBUG_SOURCE_BIND:
+		  op = gimple_debug_source_bind_get_value (gs);
+		  break;
+		default:
+		  gcc_unreachable ();
+		}
+
+	      code = TREE_CODE (op);
+	      /* Convert the value computed in promoted_type to
+		 old_type.  */
+	      if (code == SSA_NAME && use == op)
+		new_op = build1 (NOP_EXPR, old_type, use);
+	      else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_unary
+		       && code != NOP_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  if (op0 == use)
+		    {
+		      tree temp = build1 (code, promoted_type, op0);
+		      new_op = build1 (NOP_EXPR, old_type, temp);
+		    }
+		}
+	      else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_binary
+		       /* Skip codes that are rejected in safe_to_promote_use_p.  */
+		       && code != LROTATE_EXPR
+		       && code != RROTATE_EXPR
+		       && code != COMPLEX_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  tree op1 = TREE_OPERAND (op, 1);
+		  if (op0 == use || op1 == use)
+		    {
+		      if (TREE_CODE (op0) == INTEGER_CST)
+			op0 = convert_int_cst (promoted_type, op0, SIGNED);
+		      if (TREE_CODE (op1) == INTEGER_CST)
+			op1 = convert_int_cst (promoted_type, op1, SIGNED);
+		      tree temp = build2 (code, promoted_type, op0, op1);
+		      new_op = build1 (NOP_EXPR, old_type, temp);
+		    }
+		}
+
+	      /* Create a new GIMPLE_DEBUG stmt with the new value (new_op) to
+		 be bound, if a new value has been calculated.  */
+	      if (new_op)
+		{
+		  if (gimple_debug_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_bind
+			(gimple_debug_bind_get_var (stmt),
+			 new_op,
+			 stmt);
+		    }
+		  if (gimple_debug_source_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_source_bind
+			(gimple_debug_source_bind_get_var (stmt), new_op,
+			 stmt);
+		    }
+
+		  if (copy)
+		    {
+		      gsi = gsi_for_stmt (stmt);
+		      gsi_replace (&gsi, copy, false);
+		    }
+		}
 	    }
+	  break;
 
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
                   ` (7 preceding siblings ...)
  2015-09-07  5:54 ` [7/7] Adjust-arm-test cases Kugan
@ 2015-10-20 20:13 ` Kugan
  2015-10-21 12:56   ` Richard Biener
  8 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-20 20:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener



On 07/09/15 12:53, Kugan wrote:
> 
> This a new version of the patch posted in
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
> more testing and spitted the patch to make it more easier to review.
> There are still couple of issues to be addressed and I am working on them.
> 
> 1. AARCH64 bootstrap now fails with the commit
> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
> in stage2 and fwprop.c is failing. It looks to me that there is a latent
> issue which gets exposed my patch. I can also reproduce this in x86_64
> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
> time being, I am using  patch
> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
> workaround. This meeds to be fixed before the patches are ready to be
> committed.
> 
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it and needs to be fixed as well.

Hi Richard,

Now that stage 1 is going to close, I would like to get these patches
accepted for stage1. I will try my best to address your review comments
ASAP.

* Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
longer present as it is fixed in trunk. Patch-6 is no longer needed.

* Issue 2 is also reported as a known issue.

*  Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and match.pd
patterns for SEXT_EXPR: I would like to propose these as a follow-up
patch once this is accepted (see the sketch after this list).

* I am happy to turn this pass off by default till IPA and match.pd
changes are accepted. I can do regular testing to make sure that this
pass works properly till we enable it by default.
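
As an example of the kind of match.pd pattern meant above (an untested
sketch; it assumes SEXT_EXPR is exposed to match.pd as "sext" and folds
a nested sign-extension whose outer extension starts at an equal or
higher bit position):

(simplify
 (sext (sext @0 INTEGER_CST@1) INTEGER_CST@2)
 (if (tree_to_uhwi (@2) >= tree_to_uhwi (@1))
  (sext @0 @1)))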


Please let me know what you think,

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [1/7] Add new tree code SEXT_EXPR
  2015-10-15  5:49         ` Kugan
@ 2015-10-21 10:49           ` Richard Biener
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Biener @ 2015-10-21 10:49 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Oct 15, 2015 at 7:49 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 12/10/15 23:21, Richard Biener wrote:
>> On Sun, Oct 11, 2015 at 12:35 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 15/09/15 23:18, Richard Biener wrote:
>>>> On Mon, Sep 7, 2015 at 4:55 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>> This patch adds support for new tree code SEXT_EXPR.
>>>>
>>>> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
>>>> index d567a87..bbc3c10 100644
>>>> --- a/gcc/cfgexpand.c
>>>> +++ b/gcc/cfgexpand.c
>>>> @@ -5071,6 +5071,10 @@ expand_debug_expr (tree exp)
>>>>      case FMA_EXPR:
>>>>        return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
>>>>
>>>> +    case SEXT_EXPR:
>>>> +      return op0;
>>>>
>>>> that looks wrong.  Generate (sext:... ) here?
>>>>
>>>> +    case SEXT_EXPR:
>>>> +       {
>>>> +         rtx op0 = expand_normal (treeop0);
>>>> +         rtx temp;
>>>> +         if (!target)
>>>> +           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
>>>> +
>>>> +         machine_mode inner_mode
>>>> +           = smallest_mode_for_size (tree_to_shwi (treeop1),
>>>> +                                     MODE_INT);
>>>> +         temp = convert_modes (inner_mode,
>>>> +                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
>>>> +         convert_move (target, temp, 0);
>>>> +         return target;
>>>> +       }
>>>>
>>>> Humm - is that really how we expand sign extensions right now?  No helper
>>>> that would generate (sext ...) directly?  I wouldn't try using 'target' btw but
>>>> simply return (sext:mode op0 op1) or so.  But I am no way an RTL expert.
>>>>
>>>> Note that if we don't disallow arbitrary precision SEXT_EXPRs we have to
>>>> fall back to using shifts (and smallest_mode_for_size is simply wrong).
>>>>
>>>> +    case SEXT_EXPR:
>>>> +      {
>>>> +       if (!INTEGRAL_TYPE_P (lhs_type)
>>>> +           || !INTEGRAL_TYPE_P (rhs1_type)
>>>> +           || TREE_CODE (rhs2) != INTEGER_CST)
>>>>
>>>> please constrain this some more, with
>>>>
>>>>    || !useless_type_conversion_p (lhs_type, rhs1_type)
>>>>
>>>> +         {
>>>> +           error ("invalid operands in sext expr");
>>>> +           return true;
>>>> +         }
>>>> +       return false;
>>>> +      }
>>>>
>>>> @@ -3414,6 +3422,9 @@ op_symbol_code (enum tree_code code)
>>>>      case MIN_EXPR:
>>>>        return "min";
>>>>
>>>> +    case SEXT_EXPR:
>>>> +      return "sext from bit";
>>>> +
>>>>
>>>> just "sext" please.
>>>>
>>>> +/*  Sign-extend operation.  It will sign extend first operand from
>>>> + the sign bit specified by the second operand.  */
>>>> +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
>>>>
>>>> "from the INTEGER_CST sign bit specified"
>>>>
>>>> Also add "The type of the result is that of the first operand."
>>>>
>>>
>>>
>>>
>>> Thanks for the review. Attached patch attempts to address the above
>>> comments. Does this look better?
>>
>> +    case SEXT_EXPR:
>> +      gcc_assert (CONST_INT_P (op1));
>> +      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
>>
>> We should add
>>
>>         gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
>>
>> +      if (mode != inner_mode)
>> +       op0 = simplify_gen_unary (SIGN_EXTEND,
>> +                                 mode,
>> +                                 gen_lowpart_SUBREG (inner_mode, op0),
>> +                                 inner_mode);
>>
>> as we're otherwise silently dropping things like SEXT (short-typed-var, 13)
>>
>> +    case SEXT_EXPR:
>> +       {
>> +         machine_mode inner_mode = mode_for_size (tree_to_shwi (treeop1),
>> +                                                  MODE_INT, 0);
>>
>> Likewise.  Also treeop1 should be unsigned, thus tree_to_uhwi?
>>
>> +         rtx temp, result;
>> +         rtx op0 = expand_normal (treeop0);
>> +         op0 = force_reg (mode, op0);
>> +         if (mode != inner_mode)
>> +           {
>>
>> Again, for the RTL bits I'm not sure they are correct.  For example I don't
>> see why we need a lowpart SUBREG, isn't a "regular" SUBREG enough?
>>
>> +    case SEXT_EXPR:
>> +      {
>> +       if (!INTEGRAL_TYPE_P (lhs_type)
>> +           || !useless_type_conversion_p (lhs_type, rhs1_type)
>> +           || !INTEGRAL_TYPE_P (rhs1_type)
>> +           || TREE_CODE (rhs2) != INTEGER_CST)
>>
>> the INTEGRAL_TYPE_P (rhs1_type) check is redundant with
>> the useless_type_Conversion_p one.  Please check
>> tree_fits_uhwi (rhs2) instead of != INTEGER_CST.
>>
>
> Thanks for the review, Please find the updated patch based on the review
> comments.

Ok.

Thanks,
Richard.


> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-20 20:13 ` [0/7] Type promotion pass and elimination of zext/sext Kugan
@ 2015-10-21 12:56   ` Richard Biener
  2015-10-21 13:57     ` Richard Biener
  2015-10-22 11:01     ` Kugan
  0 siblings, 2 replies; 63+ messages in thread
From: Richard Biener @ 2015-10-21 12:56 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Tue, Oct 20, 2015 at 10:03 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 07/09/15 12:53, Kugan wrote:
>>
>> This a new version of the patch posted in
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>> more testing and spitted the patch to make it more easier to review.
>> There are still couple of issues to be addressed and I am working on them.
>>
>> 1. AARCH64 bootstrap now fails with the commit
>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>> issue which gets exposed my patch. I can also reproduce this in x86_64
>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>> time being, I am using  patch
>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>> workaround. This meeds to be fixed before the patches are ready to be
>> committed.
>>
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>
> Hi Richard,
>
> Now that stage 1 is going to close, I would like to get these patches
> accepted for stage1. I will try my best to address your review comments
> ASAP.

Ok, can you make the whole patch series available so I can poke at the
implementation a bit?  Please state the revision it was rebased on
(or point me to a git/svn branch the work resides on).

> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>
> * Issue 2 is also reported as known issue
>
> *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
> match.pd for SEXT_EXPR, I would like to propose them as a follow up
> patch once this is accepted.

I thought more about this and don't think it can be made to work without a lot
of hassle.  Instead, to get rid of the remaining "badly" typed registers in the
function we can key different type requirements on a pass property
(PROP_promoted_regs), thus simply change the expectation of the
types of function parameters / results according to their promotion.

The promotion pass would set PROP_promoted_regs then.

I will look over the patch(es) this week but as said I'd like to play with
some code examples myself and thus like to have the current patchset
in a more easily accessible form (and sure to apply to some rev.).

Thanks,
Richard.

> * I am happy to turn this pass off by default till IPA and match.pd
> changes are accepted. I can do regular testing to make sure that this
> pass works properly till we enable it by default.
>
>
> Please let me know what you think,
>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 12:56   ` Richard Biener
@ 2015-10-21 13:57     ` Richard Biener
  2015-10-21 17:17       ` Joseph Myers
  2015-10-21 18:11       ` Richard Henderson
  2015-10-22 11:01     ` Kugan
  1 sibling, 2 replies; 63+ messages in thread
From: Richard Biener @ 2015-10-21 13:57 UTC (permalink / raw)
  To: Kugan, Richard Henderson; +Cc: gcc-patches

On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and spitted the patch to make it more easier to review.
>>> There are still couple of issues to be addressed and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>> time being, I am using  patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This meeds to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
>
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
>
>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
>> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>>
>> * Issue 2 is also reported as known issue
>>
>> *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
>> match.pd for SEXT_EXPR, I would like to propose them as a follow up
>> patch once this is accepted.
>
> I thought more about this and don't think it can be made work without a lot of
> hassle.  Instead to get rid of the remaining "badly" typed registers in the
> function we can key different type requirements on a pass property
> (PROP_promoted_regs), thus simply change the expectation of the
> types of function parameters / results according to their promotion.

Or maybe we should simply make GIMPLE _always_ adhere to the ABI
details from the start (gimplification).  Note that this does not only involve
PROMOTE_MODE.  Note that as far as GIMPLE is concerned I'd only
"lower" passing / returning in registers (whee, and then we have
things like targetm.calls.split_complex_arg ... not to mention passing
GIMPLE memory in registers).

Maybe I'm shooting too far here in the attempt to make GIMPLE closer
to the target (to expose those redundant extensions on GIMPLE) and
we'll end up with a bigger mess than with not doing this?

Richard.

> The promotion pass would set PROP_promoted_regs then.
>
> I will look over the patch(es) this week but as said I'd like to play with
> some code examples myself and thus like to have the current patchset
> in a more easily accessible form (and sure to apply to some rev.).
>
> Thanks,
> Richard.
>
>> * I am happy to turn this pass off by default till IPA and match.pd
>> changes are accepted. I can do regular testing to make sure that this
>> pass works properly till we enable it by default.
>>
>>
>> Please let me know what you think,
>>
>> Thanks,
>> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 13:57     ` Richard Biener
@ 2015-10-21 17:17       ` Joseph Myers
  2015-10-21 18:11       ` Richard Henderson
  1 sibling, 0 replies; 63+ messages in thread
From: Joseph Myers @ 2015-10-21 17:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kugan, Richard Henderson, gcc-patches

On Wed, 21 Oct 2015, Richard Biener wrote:

> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
> details from the start (gimplification).  Note that this does not only involve
> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
> "lower" passing / returning in registers (whee, and then we have
> things like targetm.calls.split_complex_arg ... not to mention passing
> GIMPLE memory in registers).
> 
> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
> to the target (to expose those redundant extensions on GIMPLE) and
> we'll end up with a bigger mess than with not doing this?

I don't know at what point target-specific promotion should appear, but 
right now it's visible before then (front ends use 
targetm.calls.promote_prototypes), which is definitely too early.
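
For example, given

  void f (short s);
  ...
  f (s);

a target whose targetm.calls.promote_prototypes hook returns true has
the front end widen the argument already, roughly as if the caller had
written f ((int) s), so the narrower type is gone well before any GIMPLE
pass runs (a sketch of the effect, not of the implementation).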

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 13:57     ` Richard Biener
  2015-10-21 17:17       ` Joseph Myers
@ 2015-10-21 18:11       ` Richard Henderson
  2015-10-22 12:48         ` Richard Biener
  1 sibling, 1 reply; 63+ messages in thread
From: Richard Henderson @ 2015-10-21 18:11 UTC (permalink / raw)
  To: Richard Biener, Kugan; +Cc: gcc-patches

On 10/21/2015 03:56 AM, Richard Biener wrote:
> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:
>>>>
>>>> This a new version of the patch posted in
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>> more testing and spitted the patch to make it more easier to review.
>>>> There are still couple of issues to be addressed and I am working on them.
>>>>
>>>> 1. AARCH64 bootstrap now fails with the commit
>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>> time being, I am using  patch
>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>> committed.
>>>>
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit?  Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
>>> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>>>
>>> * Issue 2 is also reported as known issue
>>>
>>> *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
>>> match.pd for SEXT_EXPR, I would like to propose them as a follow up
>>> patch once this is accepted.
>>
>> I thought more about this and don't think it can be made work without a lot of
>> hassle.  Instead to get rid of the remaining "badly" typed registers in the
>> function we can key different type requirements on a pass property
>> (PROP_promoted_regs), thus simply change the expectation of the
>> types of function parameters / results according to their promotion.
>
> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
> details from the start (gimplification).  Note that this does not only involve
> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
> "lower" passing / returning in registers (whee, and then we have
> things like targetm.calls.split_complex_arg ... not to mention passing
> GIMPLE memory in registers).
>
> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
> to the target (to expose those redundant extensions on GIMPLE) and
> we'll end up with a bigger mess than with not doing this?

I'm leery of building this in as early as gimplification, lest we get into 
trouble with splitting out bits of the current function for off-loading.  What 
happens when the CPU and GPU have different promotion rules?


r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 12:56   ` Richard Biener
  2015-10-21 13:57     ` Richard Biener
@ 2015-10-22 11:01     ` Kugan
  2015-10-22 14:24       ` Richard Biener
  1 sibling, 1 reply; 63+ messages in thread
From: Kugan @ 2015-10-22 11:01 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1751 bytes --]



On 21/10/15 23:45, Richard Biener wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and spitted the patch to make it more easier to review.
>>> There are still couple of issues to be addressed and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>> time being, I am using  patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This meeds to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage1. I will try my best to address your review comments
>> ASAP.
> 
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit?  Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
> 

Thanks. Please find the patches rebased against trunk@229156. I have
skipped the test-case readjustment patches.


Thanks,
Kugan

[-- Attachment #2: 0004-debug-stmt-in-widen-mode.patch --]
[-- Type: text/x-diff, Size: 3166 bytes --]

From 2dc1cccfc59ae6967928b52396227b52a50803d9 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:54:31 +1100
Subject: [PATCH 4/4] debug stmt in widen mode

---
 gcc/gimple-ssa-type-promote.c | 82 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e62a7c6..c0b6aa1 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -589,10 +589,86 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
 	{
 	case GIMPLE_DEBUG:
 	    {
-	      gsi = gsi_for_stmt (stmt);
-	      gsi_remove (&gsi, true);
-	      break;
+	      /* Change the GIMPLE_DEBUG stmt such that the value bound is
+		 computed in promoted_type and then converted to required
+		 type.  */
+	      tree op, new_op = NULL_TREE;
+	      gdebug *copy = NULL, *gs = as_a <gdebug *> (stmt);
+	      enum tree_code code;
+
+	      /* Get the value that is bound in debug stmt.  */
+	      switch (gs->subcode)
+		{
+		case GIMPLE_DEBUG_BIND:
+		  op = gimple_debug_bind_get_value (gs);
+		  break;
+		case GIMPLE_DEBUG_SOURCE_BIND:
+		  op = gimple_debug_source_bind_get_value (gs);
+		  break;
+		default:
+		  gcc_unreachable ();
+		}
+
+	      code = TREE_CODE (op);
+	      /* Convert the value computed in promoted_type to
+		 old_type.  */
+	      if (code == SSA_NAME && use == op)
+		new_op = build1 (NOP_EXPR, old_type, use);
+	      else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_unary
+		       && code != NOP_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  if (op0 == use)
+		    {
+		      tree temp = build1 (code, promoted_type, op0);
+		      new_op = build1 (NOP_EXPR, old_type, temp);
+		    }
+		}
+	      else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_binary
+		       /* Skip codes that are rejected in safe_to_promote_use_p.  */
+		       && code != LROTATE_EXPR
+		       && code != RROTATE_EXPR
+		       && code != COMPLEX_EXPR)
+		{
+		  tree op0 = TREE_OPERAND (op, 0);
+		  tree op1 = TREE_OPERAND (op, 1);
+		  if (op0 == use || op1 == use)
+		    {
+		      if (TREE_CODE (op0) == INTEGER_CST)
+			op0 = convert_int_cst (promoted_type, op0, SIGNED);
+		      if (TREE_CODE (op1) == INTEGER_CST)
+			op1 = convert_int_cst (promoted_type, op1, SIGNED);
+		      tree temp = build2 (code, promoted_type, op0, op1);
+		      new_op = build1 (NOP_EXPR, old_type, temp);
+		    }
+		}
+
+	      /* Create new GIMPLE_DEBUG stmt with the new value (new_op) to
+		 be bound, if a new value has been calculated.  */
+	      if (new_op)
+		{
+		  if (gimple_debug_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_bind
+			(gimple_debug_bind_get_var (stmt),
+			 new_op,
+			 stmt);
+		    }
+		  if (gimple_debug_source_bind_p (stmt))
+		    {
+		      copy = gimple_build_debug_source_bind
+			(gimple_debug_source_bind_get_var (stmt), new_op,
+			 stmt);
+		    }
+
+		  if (copy)
+		    {
+		      gsi = gsi_for_stmt (stmt);
+		      gsi_replace (&gsi, copy, false);
+		    }
+		}
 	    }
+	  break;
 
 	case GIMPLE_ASM:
 	case GIMPLE_CALL:
-- 
1.9.1


[-- Attachment #3: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]

From 1044b1b5ebf8ad696a942207b031e3668ab2a0de Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/4] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..cdff9c0 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9213,28 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int sign_bit
+	    = wi::set_bit_in_zero (prec - 1,
+				   TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  wide_int mask = wi::mask (prec, true,
+				    TYPE_PRECISION (TREE_TYPE (vr0.min)));
+	  if (wi::bit_and (must_be_nonzero0, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      if (wi::bit_and (must_be_nonzero0, mask) == mask)
+		op = op0;
+	    }
+	  else if (wi::bit_and (may_be_nonzero0, sign_bit) != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      if (wi::bit_and (may_be_nonzero0, mask) == 0)
+		op = op0;
+	    }
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9937,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #4: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 29013 bytes --]

From 0cd8d75c4130639f4a3fe8294bcbfdf4f2d3e4eb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/4] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 831 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 gcc/tree-ssanames.c           |   3 +-
 8 files changed, 851 insertions(+), 1 deletion(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that we can minimise
+the generation of subregs in RTL, which in turn results in removal of
+redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..e62a7c6
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,831 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in removal of redundant zero/sign
+   extensions.  This pass runs prior to VRP and DOM so that they can optimise
+   redundant truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+	  || TYPE_PRECISION (type) == 16
+	  || TYPE_PRECISION (type) == 32);
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;
+  return type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Insert COPY_STMT along the edge from STMT to its successor.  */
+static void
+insert_stmt_on_edge (gimple *stmt, gimple *copy_stmt)
+{
+  edge_iterator ei;
+  edge e, edge = NULL;
+  basic_block bb = gimple_bb (stmt);
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (!(e->flags & EDGE_EH))
+      {
+	gcc_assert (edge == NULL);
+	edge = e;
+      }
+
+  gcc_assert (edge);
+  gsi_insert_on_edge_immediate (edge, copy_stmt);
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gcc_assert (width != 1);
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines def
+   is def_stmt, make the type of def promoted_type.  If the def_stmt is such
+   that its result cannot be of promoted_type, create a new_def
+   of the original_type and make the def_stmt assign its value to new_def.
+   Then, create a CONVERT_EXPR to convert new_def to def of promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_definition (tree def,
+		    tree promoted_type)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (!type_precision_ok (TREE_TYPE (rhs)))
+		{
+		  do_not_promote = true;
+		}
+	      else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, arguments
+		     of def_stmt will be visited before visiting def.  If RHS
+		     is already promoted and type is compatible, we can convert
+		     them into ZERO/SIGN EXTEND stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		    type = original_type;
+		  gcc_assert (type != NULL_TREE);
+		  TREE_TYPE (def) = promoted_type;
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, rhs,
+					   TYPE_PRECISION (type));
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  gsi_replace (&gsi, copy_stmt, false);
+		}
+	      else {
+		  /* If RHS is not promoted OR their types are not
+		     compatible, create CONVERT_EXPR that converts
+		     RHS to  promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0)
+		    insert_stmt_on_edge (def_stmt, copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		}
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0)
+	insert_stmt_on_edge (def_stmt, copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* Type is now promoted.  Due to this, some of the value ranges computed
+	 by VRP1 will be invalid.  TODO: We can be intelligent in deciding
+	 which ranges to invalidate instead of invalidating everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_remove (&gsi, true);
+	      break;
+	    }
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt))
+		{
+		  /* In some stmts, value in USE has to be ZERO/SIGN
+		     Extended based on the original type for correct
+		     result.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  tree rhs = gimple_assign_rhs1 (stmt);
+		  if (!type_precision_ok (TREE_TYPE (rhs)))
+		    {
+		      do_not_promote = true;
+		    }
+		  else if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* Type of LHS and promoted RHS are compatible, we can
+			 convert this into ZERO/SIGN EXTEND stmt.  */
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can convert it to ZERO/SIGN
+			 EXTEND when LHS is promoted.  */
+		      tree rhs = gimple_assign_rhs1 (stmt);
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      type = TREE_TYPE (old_type);
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	    {
+	      /* In GIMPLE_COND, value in USE has to be ZERO/SIGN
+		 Extended based on the original type for correct
+		 result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple *copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	      update_stmt (stmt);
+	      break;
+	    }
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create an
+	     original type copy.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple *copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_def_and_uses (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_definition (name, type);
+      fixup_uses (name, type, old_type);
+      set_ssa_promoted (name);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_def_and_uses (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_def_and_uses (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  free_dominance_info (CDI_DOMINATORS);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 82fd4a1..80fcf70 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
-- 
1.9.1


[-- Attachment #5: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/4] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  It will sign extend the first operand from
+ the sign bit specified by the second operand.  The type of the
+ result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 18:11       ` Richard Henderson
@ 2015-10-22 12:48         ` Richard Biener
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Biener @ 2015-10-22 12:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Kugan, gcc-patches

On Wed, Oct 21, 2015 at 7:55 PM, Richard Henderson <rth@redhat.com> wrote:
> On 10/21/2015 03:56 AM, Richard Biener wrote:
>>
>> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>
>>>>>
>>>>> This a new version of the patch posted in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>> more testing and spitted the patch to make it more easier to review.
>>>>> There are still couple of issues to be addressed and I am working on
>>>>> them.
>>>>>
>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>> mis-compiled
>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>> latent
>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>> time being, I am using  patch
>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>> committed.
>>>>>
>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>> works
>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>> well.
>>>>
>>>>
>>>> Hi Richard,
>>>>
>>>> Now that stage 1 is going to close, I would like to get these patches
>>>> accepted for stage1. I will try my best to address your review comments
>>>> ASAP.
>>>
>>>
>>> Ok, can you make the whole patch series available so I can poke at the
>>> implementation a bit?  Please state the revision it was rebased on
>>> (or point me to a git/svn branch the work resides on).
>>>
>>>> * Issue 1 above (AARCH64 bootstrap now fails with the commit) is no
>>>> longer present as it is fixed in trunk. Patch-6 is no longer needed.
>>>>
>>>> * Issue 2 is also reported as known issue
>>>>
>>>> *  Promotion of PARM_DECLs and RESULT_DECLs in IPA pass and patterns in
>>>> match.pd for SEXT_EXPR, I would like to propose them as a follow up
>>>> patch once this is accepted.
>>>
>>>
>>> I thought more about this and don't think it can be made work without a
>>> lot of
>>> hassle.  Instead to get rid of the remaining "badly" typed registers in
>>> the
>>> function we can key different type requirements on a pass property
>>> (PROP_promoted_regs), thus simply change the expectation of the
>>> types of function parameters / results according to their promotion.
>>
>>
>> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
>> details from the start (gimplification).  Note that this does not only
>> involve
>> PROMOTE_MODE.  Note that for what GIMPLE is concerned I'd only
>> "lower" passing / returning in registers (whee, and then we have
>> things like targetm.calls.split_complex_arg ... not to mention passing
>> GIMPLE memory in registers).
>>
>> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
>> to the target (to expose those redundant extensions on GIMPLE) and
>> we'll end up with a bigger mess than with not doing this?
>
>
> I'm leary of building this in as early as gimplification, lest we get into
> trouble with splitting out bits of the current function for off-loading.
> What happens when the cpu and gpu have different promotion rules?

Ah, of course.  I tend to forget these issues.

Richard.

>
> r~

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-22 11:01     ` Kugan
@ 2015-10-22 14:24       ` Richard Biener
  2015-10-27  1:48         ` kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-10-22 14:24 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Oct 22, 2015 at 12:50 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 21/10/15 23:45, Richard Biener wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:
>>>>
>>>> This a new version of the patch posted in
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>> more testing and spitted the patch to make it more easier to review.
>>>> There are still couple of issues to be addressed and I am working on them.
>>>>
>>>> 1. AARCH64 bootstrap now fails with the commit
>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>> time being, I am using  patch
>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>> committed.
>>>>
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit?  Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>
> Thanks. Please find the patched rebated against trunk@229156. I have
> skipped the test-case readjustment patches.

Some quick observations.  On x86_64 when building

short bar (short y);
int foo (short x)
{
  short y = bar (x) + 15;
  return y;
}

with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
I get

  <bb 2>:
  _1 = (int) x_10(D);
  _2 = (_1) sext (16);
  _11 = bar (_2);
  _5 = (int) _11;
  _12 = (unsigned int) _5;
  _6 = _12 & 65535;
  _7 = _6 + 15;
  _13 = (int) _7;
  _8 = (_13) sext (16);
  _9 = (_8) sext (16);
  return _9;

which looks fine but the VRP optimization doesn't trigger for the redundant sext
(ranges are computed correctly but the 2nd extension is not removed).

This also makes me notice trivial match.pd patterns are missing, like
for example

(simplify
 (sext (sext@2 @0 @1) @3)
 (if (tree_int_cst_compare (@1, @3) <= 0)
  @2
  (sext @0 @3)))

as VRP doesn't run at -O1 we must rely on those to remove redundant extensions,
otherwise generated code might get worse compared to without the pass(?)
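
For example, on the dump above that pattern should turn

  _8 = (_13) sext (16);
  _9 = (_8) sext (16);
  return _9;

into roughly

  _8 = (_13) sext (16);
  return _8;

once the resulting copy is propagated.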

I also notice that the 'short' argument does not get its sign-extension removed
as redundant either even though we have

_1 = (int) x_8(D);
Found new range for _1: [-32768, 32767]

In the end I suspect that keeping track of the "simple" cases in the promotion
pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
its work).  Knowing whether the ABI guarantees promoted argument
registers might also need some other target hook queries.

Now onto the 0002 patch.

+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)  == 8
+         || TYPE_PRECISION (type) == 16
+         || TYPE_PRECISION (type) == 32);
+}

that's a weird function to me.  You probably want
TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
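
Something like this untested sketch is all I mean here:

static bool
type_precision_ok (tree type)
{
  return ((POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type))
	  && TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type)));
}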

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;

I think what you want to verify is that TYPE_PRECISION (promoted_type)
== GET_MODE_PRECISION (mode).
And to not even bother with this simply use

promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);
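
i.e. roughly (only a sketch, keeping the PROMOTE_MODE dance from your patch):

static tree
get_promoted_type (tree type)
{
  if (!INTEGRAL_TYPE_P (type) || !type_precision_ok (type))
    return type;

  machine_mode mode = TYPE_MODE (type);
  int uns = TYPE_SIGN (type);
#ifdef PROMOTE_MODE
  PROMOTE_MODE (mode, uns, type);
#endif
  uns = TYPE_SIGN (type);
  if (GET_MODE_PRECISION (mode) <= TYPE_PRECISION (type))
    return type;
  return build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);
}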

You use a domwalk but also might create new basic-blocks during it
(insert_on_edge_immediate); that's a
no-no: commit edge inserts after the domwalk (sketch below).
ssa_sets_higher_bits_bitmap looks unused and
we generally don't free dominance info, so please don't do that.
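
For the edge inserts, that is: only queue the compensation copies during
the walk and commit them once the walk is done, e.g. (sketch only):

  gsi_insert_on_edge (e, copy_stmt);
  ...
  /* After the dom walk, e.g. at the end of execute_type_promotion.  */
  gsi_commit_edge_inserts ();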

I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with

/abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
-B/usr/local/powerpc64-unknown-linux-gnu/bin/
-B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
/usr/local/powerpc64-unknown-linux-gnu/include -isystem
/usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
-O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
-Wno-format -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
-mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
-I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
-I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
-I../../../trunk/libgcc/../libdecnumber/dpd
-I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
-MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
../../../trunk/libgcc/libgcc2.c \
          -fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS
In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
expand_debug_locations, at cfgexpand.c:5277

as hinted at above a bootstrap on i?86 (yes, 32bit) with
--with-tune=pentiumpro might be another good testing candidate.

+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+       promote_def_and_uses (def);

it looks like you are doing some redundant work by walking both defs
and uses of each stmt.  I'd say you should separate
def and use processing and use

  FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
    promote_use (use);
  FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
    promote_def (def);

this should make processing more efficient (memory local) compared to
doing the split handling
in promote_def_and_uses.

I think it will be convenient to have an SSA name info structure where
you can remember the original
type a name was promoted from as well as whether it was promoted or
not.  This way adjusting
debug uses should be "trivial":

+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+       {
+       case GIMPLE_DEBUG:
+           {
+             gsi = gsi_for_stmt (stmt);
+             gsi_remove (&gsi, true);

rather than doing the above you'd do sth like

  SET_USE (use, fold_convert (old_type, new_def));
  update_stmt (stmt);
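
One possible shape for that per-name bookkeeping (names and fields purely
illustrative, indexed by SSA_NAME_VERSION):

  /* Recorded for every SSA name the pass looks at.  */
  struct ssa_name_info
  {
    tree original_type;   /* Type before promotion, NULL_TREE if untouched.  */
    tree promoted_def;    /* SSA name carrying the promoted value, if any.  */
  };

so a debug use can be rewritten into a conversion back to original_type
instead of simply being dropped.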

note that while you may not be able to use promoted regs at all uses
(like calls or asms) you can promote all defs, if only with a compensation
statement after the original def.  The SSA name info struct can be used
to note down the actual SSA name holding the promoted def.
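
To illustrate with GIMPLE-style notation (sketch only, the names are made up):
if the result of a call cannot itself be produced in the promoted mode, the
def can still be promoted through an extra copy

  short _2 = bar (...);     /* original def keeps its type */
  int _1 = (int) _2;        /* compensation stmt; _1 is the promoted def */

with the info struct recording that _1 is the promoted variant of _2.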

The pass looks a lot better than last time (it's way smaller!) but
still needs some
improvements.  There are some more fishy details with respect to how you
allocate/change SSA names but I think those can be dealt with once the
basic structure looks how I like it to be.

Thanks,
Richard.



>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-22 14:24       ` Richard Biener
@ 2015-10-27  1:48         ` kugan
  2015-10-28 15:51           ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: kugan @ 2015-10-27  1:48 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches



On 23/10/15 01:23, Richard Biener wrote:
> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 21/10/15 23:45, Richard Biener wrote:
>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>>
>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>
>>>>> This a new version of the patch posted in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>> more testing and spitted the patch to make it more easier to review.
>>>>> There are still couple of issues to be addressed and I am working on them.
>>>>>
>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>> time being, I am using  patch
>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>> committed.
>>>>>
>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>>>
>>>> Hi Richard,
>>>>
>>>> Now that stage 1 is going to close, I would like to get these patches
>>>> accepted for stage1. I will try my best to address your review comments
>>>> ASAP.
>>>
>>> Ok, can you make the whole patch series available so I can poke at the
>>> implementation a bit?  Please state the revision it was rebased on
>>> (or point me to a git/svn branch the work resides on).
>>>
>>
>> Thanks. Please find the patched rebated against trunk@229156. I have
>> skipped the test-case readjustment patches.
>
> Some quick observations.  On x86_64 when building

Hi Richard,

Thanks for the review.
>
> short bar (short y);
> int foo (short x)
> {
>    short y = bar (x) + 15;
>    return y;
> }
>
> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
> I get
>
>    <bb 2>:
>    _1 = (int) x_10(D);
>    _2 = (_1) sext (16);
>    _11 = bar (_2);
>    _5 = (int) _11;
>    _12 = (unsigned int) _5;
>    _6 = _12 & 65535;
>    _7 = _6 + 15;
>    _13 = (int) _7;
>    _8 = (_13) sext (16);
>    _9 = (_8) sext (16);
>    return _9;
>
> which looks fine but the VRP optimization doesn't trigger for the redundant sext
> (ranges are computed correctly but the 2nd extension is not removed).
>
> This also makes me notice trivial match.pd patterns are missing, like
> for example
>
> (simplify
>   (sext (sext@2 @0 @1) @3)
>   (if (tree_int_cst_compare (@1, @3) <= 0)
>    @2
>    (sext @0 @3)))
>
> as VRP doesn't run at -O1 we must rely on those to remove rendudant extensions,
> otherwise generated code might get worse compared to without the pass(?)

Do you think that we should enable this pass only when VRP is enabled? 
Otherwise, even when we do the simple optimizations you mentioned below, 
we might not be able to remove all the redundancies.

>
> I also notice that the 'short' argument does not get it's sign-extension removed
> as redundand either even though we have
>
> _1 = (int) x_8(D);
> Found new range for _1: [-32768, 32767]
>

I am looking into it.

> In the end I suspect that keeping track of the "simple" cases in the promotion
> pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
> its work).  In some way whether the ABI guarantees promoted argument
> registers might need some other target hook queries.
>
> Now onto the 0002 patch.
>
> +static bool
> +type_precision_ok (tree type)
> +{
> +  return (TYPE_PRECISION (type)  == 8
> +         || TYPE_PRECISION (type) == 16
> +         || TYPE_PRECISION (type) == 32);
> +}
>
> that's a weird function to me.  You probably want
> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>

I will change this. (I have a patch which I am testing with other 
changes you have asked for)

> +/* Return the promoted type for TYPE.  */
> +static tree
> +get_promoted_type (tree type)
> +{
> +  tree promoted_type;
> +  enum machine_mode mode;
> +  int uns;
> +  if (POINTER_TYPE_P (type)
> +      || !INTEGRAL_TYPE_P (type)
> +      || !type_precision_ok (type))
> +    return type;
> +
> +  mode = TYPE_MODE (type);
> +#ifdef PROMOTE_MODE
> +  uns = TYPE_SIGN (type);
> +  PROMOTE_MODE (mode, uns, type);
> +#endif
> +  uns = TYPE_SIGN (type);
> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
> +  if (promoted_type
> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
> +    type = promoted_type;
>
> I think what you want to verify is that TYPE_PRECISION (promoted_type)
> == GET_MODE_PRECISION (mode).
> And to not even bother with this simply use
>
> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);
>

I am changing this too.

> You use a domwalk but also might create new basic-blocks during it
> (insert_on_edge_immediate), that's a
> no-no, commit edge inserts after the domwalk.

I am sorry, I don't understand "commit edge inserts after the domwalk".  Is 
there a way to do this in the current implementation?

> ssa_sets_higher_bits_bitmap looks unused and
> we generally don't free dominance info, so please don't do that.
>
> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with
>
> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
> -I../../../trunk/libgcc/../libdecnumber/dpd
> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
> ../../../trunk/libgcc/libgcc2.c \
>            -fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS
> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
> expand_debug_locations, at cfgexpand.c:5277
>

I am testing on the GCC compile farm. I will get it to bootstrap and will do 
the regression testing before posting the next version.

> as hinted at above a bootstrap on i?86 (yes, 32bit) with
> --with-tune=pentiumpro might be another good testing candidate.
>
> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
> +       promote_def_and_uses (def);
>
> it looks like you are doing some redundant work by walking both defs
> and uses of each stmt.  I'd say you should separate
> def and use processing and use
>
>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>      promote_use (use);
>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
>      promote_def (def);
>

The name promote_def_and_uses in my implementation is a bit confusing. It is 
promoting the SSA_NAMEs. We only have to do that for the definitions, if we 
also handle the SSA_NAMEs defined by parameters.

I also have a bitmap to see if we have promoted a variable and avoid 
doing it again. I will try to improve this.


> this should make processing more efficient (memory local) compared to
> doing the split handling
> in promote_def_and_uses.
>
> I think it will be convenient to have a SSA name info structure where
> you can remember the original
> type a name was promoted from as well as whether it was promoted or
> not.  This way adjusting
> debug uses should be "trivial":
>
> +static unsigned int
> +fixup_uses (tree use, tree promoted_type, tree old_type)
> +{
> +  gimple *stmt;
> +  imm_use_iterator ui;
> +  gimple_stmt_iterator gsi;
> +  use_operand_p op;
> +
> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
> +    {
> +      bool do_not_promote = false;
> +      switch (gimple_code (stmt))
> +       {
> +       case GIMPLE_DEBUG:
> +           {
> +             gsi = gsi_for_stmt (stmt);
> +             gsi_remove (&gsi, true);
>
> rather than doing the above you'd do sth like
>
>    SET_USE (use, fold_convert (old_type, new_def));
>    update_stmt (stmt);
>

We do have this information (the original type a name was promoted from as 
well as whether it was promoted or not). To make it easy to review, in 
the patch that adds the pass, I am removing these debug stmts. But in 
patch 4, I am trying to handle this properly. Maybe I should combine them.

> note that while you may not be able to use promoted regs at all uses
> (like calls or asms) you can promote all defs, if only with a compensation
> statement after the original def.  The SSA name info struct can be used
> to note down the actual SSA name holding the promoted def.
>
> The pass looks a lot better than last time (it's way smaller!) but
> still needs some
> improvements.  There are some more fishy details with respect to how you
> allocate/change SSA names but I think those can be dealt with once the
> basic structure looks how I like it to be.
>

I will post an updated patch in a day or two.

Thanks again,
Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-27  1:48         ` kugan
@ 2015-10-28 15:51           ` Richard Biener
  2015-11-02  9:17             ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-10-28 15:51 UTC (permalink / raw)
  To: kugan; +Cc: gcc-patches

On Tue, Oct 27, 2015 at 1:50 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 23/10/15 01:23, Richard Biener wrote:
>>
>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>>
>>> On 21/10/15 23:45, Richard Biener wrote:
>>>>
>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>>
>>>>>>
>>>>>> This a new version of the patch posted in
>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>>> more testing and spitted the patch to make it more easier to review.
>>>>>> There are still couple of issues to be addressed and I am working on
>>>>>> them.
>>>>>>
>>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>>> mis-compiled
>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>>> latent
>>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>>> time being, I am using  patch
>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>>> committed.
>>>>>>
>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>>> works
>>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>>> well.
>>>>>
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Now that stage 1 is going to close, I would like to get these patches
>>>>> accepted for stage1. I will try my best to address your review comments
>>>>> ASAP.
>>>>
>>>>
>>>> Ok, can you make the whole patch series available so I can poke at the
>>>> implementation a bit?  Please state the revision it was rebased on
>>>> (or point me to a git/svn branch the work resides on).
>>>>
>>>
>>> Thanks. Please find the patched rebated against trunk@229156. I have
>>> skipped the test-case readjustment patches.
>>
>>
>> Some quick observations.  On x86_64 when building
>
>
> Hi Richard,
>
> Thanks for the review.
>
>>
>> short bar (short y);
>> int foo (short x)
>> {
>>    short y = bar (x) + 15;
>>    return y;
>> }
>>
>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>> I get
>>
>>    <bb 2>:
>>    _1 = (int) x_10(D);
>>    _2 = (_1) sext (16);
>>    _11 = bar (_2);
>>    _5 = (int) _11;
>>    _12 = (unsigned int) _5;
>>    _6 = _12 & 65535;
>>    _7 = _6 + 15;
>>    _13 = (int) _7;
>>    _8 = (_13) sext (16);
>>    _9 = (_8) sext (16);
>>    return _9;
>>
>> which looks fine but the VRP optimization doesn't trigger for the
>> redundant sext
>> (ranges are computed correctly but the 2nd extension is not removed).
>>
>> This also makes me notice trivial match.pd patterns are missing, like
>> for example
>>
>> (simplify
>>   (sext (sext@2 @0 @1) @3)
>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>    @2
>>    (sext @0 @3)))
>>
>> as VRP doesn't run at -O1 we must rely on those to remove rendudant
>> extensions,
>> otherwise generated code might get worse compared to without the pass(?)
>
>
> Do you think that we should enable this pass only when vrp is enabled.
> Otherwise, even when we do the simple optimizations you mentioned below, we
> might not be able to remove all the redundancies.
>
>>
>> I also notice that the 'short' argument does not get it's sign-extension
>> removed
>> as redundand either even though we have
>>
>> _1 = (int) x_8(D);
>> Found new range for _1: [-32768, 32767]
>>
>
> I am looking into it.
>
>> In the end I suspect that keeping track of the "simple" cases in the
>> promotion
>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>> to do
>> its work).  In some way whether the ABI guarantees promoted argument
>> registers might need some other target hook queries.
>>
>> Now onto the 0002 patch.
>>
>> +static bool
>> +type_precision_ok (tree type)
>> +{
>> +  return (TYPE_PRECISION (type)  == 8
>> +         || TYPE_PRECISION (type) == 16
>> +         || TYPE_PRECISION (type) == 32);
>> +}
>>
>> that's a weird function to me.  You probably want
>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>>
>
> I will change this. (I have a patch which I am testing with other changes
> you have asked for)
>
>
>> +/* Return the promoted type for TYPE.  */
>> +static tree
>> +get_promoted_type (tree type)
>> +{
>> +  tree promoted_type;
>> +  enum machine_mode mode;
>> +  int uns;
>> +  if (POINTER_TYPE_P (type)
>> +      || !INTEGRAL_TYPE_P (type)
>> +      || !type_precision_ok (type))
>> +    return type;
>> +
>> +  mode = TYPE_MODE (type);
>> +#ifdef PROMOTE_MODE
>> +  uns = TYPE_SIGN (type);
>> +  PROMOTE_MODE (mode, uns, type);
>> +#endif
>> +  uns = TYPE_SIGN (type);
>> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
>> +  if (promoted_type
>> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
>> +    type = promoted_type;
>>
>> I think what you want to verify is that TYPE_PRECISION (promoted_type)
>> == GET_MODE_PRECISION (mode).
>> And to not even bother with this simply use
>>
>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
>> uns);
>>
>
> I am changing this too.
>
>> You use a domwalk but also might create new basic-blocks during it
>> (insert_on_edge_immediate), that's a
>> no-no, commit edge inserts after the domwalk.
>
>
> I am sorry, I dont understand "commit edge inserts after the domwalk" Is
> there a way to do this in the current implementation?

Yes, simply use gsi_insert_on_edge () and after the domwalk is done do
gsi_commit_edge_inserts ().
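
In other words, the pattern is roughly (sketch; 'e' being the edge and
'copy_stmt' the statement you want on it):

  /* During the dominator walk: only queue the stmt, do not split the edge.  */
  gsi_insert_on_edge (e, copy_stmt);

  /* After the walk over the whole function has finished; this is the
     point where new basic blocks may get created.  */
  gsi_commit_edge_inserts ();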

>> ssa_sets_higher_bits_bitmap looks unused and
>> we generally don't free dominance info, so please don't do that.
>>
>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc
>> with
>>
>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
>> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
>> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
>> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
>> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
>> -I../../../trunk/libgcc/../libdecnumber/dpd
>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
>> ../../../trunk/libgcc/libgcc2.c \
>>            -fexceptions -fnon-call-exceptions -fvisibility=hidden
>> -DHIDE_EXPORTS
>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
>> expand_debug_locations, at cfgexpand.c:5277
>>
>
> I am testing on gcc computefarm. I will get it to bootstrap and will do the
> regression testing before posting the next version.
>
>> as hinted at above a bootstrap on i?86 (yes, 32bit) with
>> --with-tune=pentiumpro might be another good testing candidate.
>>
>> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE |
>> SSA_OP_DEF)
>> +       promote_def_and_uses (def);
>>
>> it looks like you are doing some redundant work by walking both defs
>> and uses of each stmt.  I'd say you should separate
>> def and use processing and use
>>
>>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>>      promote_use (use);
>>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
>>      promote_def (def);
>>
>
> Name promote_def_and_uses in my implementation is a bit confusing. It is
> promoting the SSA_NAMEs. We only have to do that for the definitions if we
> can do the SSA_NAMEs defined by parameters.
>
> I also have a bitmap to see if we have promoted a variable and avoid doing
> it again. I will try to improve this.
>
>
>
>> this should make processing more efficient (memory local) compared to
>> doing the split handling
>> in promote_def_and_uses.
>>
>> I think it will be convenient to have a SSA name info structure where
>> you can remember the original
>> type a name was promoted from as well as whether it was promoted or
>> not.  This way adjusting
>> debug uses should be "trivial":
>>
>> +static unsigned int
>> +fixup_uses (tree use, tree promoted_type, tree old_type)
>> +{
>> +  gimple *stmt;
>> +  imm_use_iterator ui;
>> +  gimple_stmt_iterator gsi;
>> +  use_operand_p op;
>> +
>> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
>> +    {
>> +      bool do_not_promote = false;
>> +      switch (gimple_code (stmt))
>> +       {
>> +       case GIMPLE_DEBUG:
>> +           {
>> +             gsi = gsi_for_stmt (stmt);
>> +             gsi_remove (&gsi, true);
>>
>> rather than doing the above you'd do sth like
>>
>>    SET_USE (use, fold_convert (old_type, new_def));
>>    update_stmt (stmt);
>>
>
> We do have these information (original type a name was promoted from as well
> as whether it was promoted or not). To make it easy to review, in the patch
> that adds the pass,I am removing these debug stmts. But in patch 4, I am
> trying to handle this properly. Maybe  I should combine them.

Yeah, it's a bit confusing otherwise.

>> note that while you may not be able to use promoted regs at all uses
>> (like calls or asms) you can promote all defs, if only with a compensation
>> statement after the original def.  The SSA name info struct can be used
>> to note down the actual SSA name holding the promoted def.
>>
>> The pass looks a lot better than last time (it's way smaller!) but
>> still needs some
>> improvements.  There are some more fishy details with respect to how you
>> allocate/change SSA names but I think those can be dealt with once the
>> basic structure looks how I like it to be.
>>
>
> I will post an updated patch in a day or two.

Thanks,
Richard.

> Thanks again,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-28 15:51           ` Richard Biener
@ 2015-11-02  9:17             ` Kugan
  2015-11-03 14:40               ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-11-02  9:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 12098 bytes --]



On 29/10/15 02:45, Richard Biener wrote:
> On Tue, Oct 27, 2015 at 1:50 AM, kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 23/10/15 01:23, Richard Biener wrote:
>>>
>>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>>
>>>>
>>>> On 21/10/15 23:45, Richard Biener wrote:
>>>>>
>>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>>>
>>>>>>>
>>>>>>> This a new version of the patch posted in
>>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>>>> more testing and spitted the patch to make it more easier to review.
>>>>>>> There are still couple of issues to be addressed and I am working on
>>>>>>> them.
>>>>>>>
>>>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>>>> mis-compiled
>>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>>>> latent
>>>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>>>> time being, I am using  patch
>>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>>>> committed.
>>>>>>>
>>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>>>> works
>>>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>>>> well.
>>>>>>
>>>>>>
>>>>>> Hi Richard,
>>>>>>
>>>>>> Now that stage 1 is going to close, I would like to get these patches
>>>>>> accepted for stage1. I will try my best to address your review comments
>>>>>> ASAP.
>>>>>
>>>>>
>>>>> Ok, can you make the whole patch series available so I can poke at the
>>>>> implementation a bit?  Please state the revision it was rebased on
>>>>> (or point me to a git/svn branch the work resides on).
>>>>>
>>>>
>>>> Thanks. Please find the patched rebated against trunk@229156. I have
>>>> skipped the test-case readjustment patches.
>>>
>>>
>>> Some quick observations.  On x86_64 when building
>>
>>
>> Hi Richard,
>>
>> Thanks for the review.
>>
>>>
>>> short bar (short y);
>>> int foo (short x)
>>> {
>>>    short y = bar (x) + 15;
>>>    return y;
>>> }
>>>
>>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>>> I get
>>>
>>>    <bb 2>:
>>>    _1 = (int) x_10(D);
>>>    _2 = (_1) sext (16);
>>>    _11 = bar (_2);
>>>    _5 = (int) _11;
>>>    _12 = (unsigned int) _5;
>>>    _6 = _12 & 65535;
>>>    _7 = _6 + 15;
>>>    _13 = (int) _7;
>>>    _8 = (_13) sext (16);
>>>    _9 = (_8) sext (16);
>>>    return _9;
>>>
>>> which looks fine but the VRP optimization doesn't trigger for the
>>> redundant sext
>>> (ranges are computed correctly but the 2nd extension is not removed).

Thanks for the comments. Please find the attached patches, with which I
am now getting the following:
cat .192t.optimized

;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0,
symbol_order=0)

foo (short int x)
{
  signed int _1;
  int _2;
  signed int _5;
  unsigned int _6;
  unsigned int _7;
  signed int _8;
  int _9;
  short int _11;
  unsigned int _12;
  signed int _13;

  <bb 2>:
  _1 = (signed int) x_10(D);
  _2 = _1;
  _11 = bar (_2);
  _5 = (signed int) _11;
  _12 = (unsigned int) _11;
  _6 = _12 & 65535;
  _7 = _6 + 15;
  _13 = (signed int) _7;
  _8 = (_13) sext (16);
  _9 = _8;
  return _9;

}


There are still some redundancies. The asm difference after RTL
optimizations is

-	addl	$15, %eax
+	addw	$15, %ax


>>>
>>> This also makes me notice trivial match.pd patterns are missing, like
>>> for example
>>>
>>> (simplify
>>>   (sext (sext@2 @0 @1) @3)
>>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>>    @2
>>>    (sext @0 @3)))
>>>
>>> as VRP doesn't run at -O1 we must rely on those to remove rendudant
>>> extensions,
>>> otherwise generated code might get worse compared to without the pass(?)
>>
>>
>> Do you think that we should enable this pass only when vrp is enabled.
>> Otherwise, even when we do the simple optimizations you mentioned below, we
>> might not be able to remove all the redundancies.
>>
>>>
>>> I also notice that the 'short' argument does not get it's sign-extension
>>> removed
>>> as redundand either even though we have
>>>
>>> _1 = (int) x_8(D);
>>> Found new range for _1: [-32768, 32767]
>>>
>>
>> I am looking into it.
>>
>>> In the end I suspect that keeping track of the "simple" cases in the
>>> promotion
>>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>>> to do
>>> its work).  In some way whether the ABI guarantees promoted argument
>>> registers might need some other target hook queries.

I tried adding it in the attached patch with record_visit_stmt to track
whether an SSA name would have its value overflow or be properly zero/sign
extended in the promoted mode. We can use this to eliminate some of the
zero/sign extensions at the gimple level. As it is, it doesn't do much. If
this is what you had in mind, I will extend it based on your feedback.


>>>
>>> Now onto the 0002 patch.
>>>
>>> +static bool
>>> +type_precision_ok (tree type)
>>> +{
>>> +  return (TYPE_PRECISION (type)  == 8
>>> +         || TYPE_PRECISION (type) == 16
>>> +         || TYPE_PRECISION (type) == 32);
>>> +}
>>>
>>> that's a weird function to me.  You probably want
>>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>>> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>>>
>>
>> I will change this. (I have a patch which I am testing with other changes
>> you have asked for)
>>
>>
>>> +/* Return the promoted type for TYPE.  */
>>> +static tree
>>> +get_promoted_type (tree type)
>>> +{
>>> +  tree promoted_type;
>>> +  enum machine_mode mode;
>>> +  int uns;
>>> +  if (POINTER_TYPE_P (type)
>>> +      || !INTEGRAL_TYPE_P (type)
>>> +      || !type_precision_ok (type))
>>> +    return type;
>>> +
>>> +  mode = TYPE_MODE (type);
>>> +#ifdef PROMOTE_MODE
>>> +  uns = TYPE_SIGN (type);
>>> +  PROMOTE_MODE (mode, uns, type);
>>> +#endif
>>> +  uns = TYPE_SIGN (type);
>>> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
>>> +  if (promoted_type
>>> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
>>> +    type = promoted_type;
>>>
>>> I think what you want to verify is that TYPE_PRECISION (promoted_type)
>>> == GET_MODE_PRECISION (mode).
>>> And to not even bother with this simply use
>>>
>>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
>>> uns);
>>>
>>
>> I am changing this too.
>>
>>> You use a domwalk but also might create new basic-blocks during it
>>> (insert_on_edge_immediate), that's a
>>> no-no, commit edge inserts after the domwalk.
>>
>>
>> I am sorry, I dont understand "commit edge inserts after the domwalk" Is
>> there a way to do this in the current implementation?
> 
> Yes, simply use gsi_insert_on_edge () and after the domwalk is done do
> gsi_commit_edge_inserts ().
> 
>>> ssa_sets_higher_bits_bitmap looks unused and
>>> we generally don't free dominance info, so please don't do that.
>>>
>>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc
>>> with
>>>
>>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
>>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
>>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
>>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
>>> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
>>> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
>>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
>>> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
>>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
>>> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
>>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
>>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
>>> -I../../../trunk/libgcc/../libdecnumber/dpd
>>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
>>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
>>> ../../../trunk/libgcc/libgcc2.c \
>>>            -fexceptions -fnon-call-exceptions -fvisibility=hidden
>>> -DHIDE_EXPORTS
>>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
>>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
>>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
>>> expand_debug_locations, at cfgexpand.c:5277
>>>

With the attached patch, I am now running into a bootstrap comparison
failure. I am looking into it. Please review this version so that I can
address your comments while fixing this issue.

Thanks,
Kugan

>>
>> I am testing on gcc computefarm. I will get it to bootstrap and will do the
>> regression testing before posting the next version.
>>
>>> as hinted at above a bootstrap on i?86 (yes, 32bit) with
>>> --with-tune=pentiumpro might be another good testing candidate.
>>>
>>> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE |
>>> SSA_OP_DEF)
>>> +       promote_def_and_uses (def);
>>>
>>> it looks like you are doing some redundant work by walking both defs
>>> and uses of each stmt.  I'd say you should separate
>>> def and use processing and use
>>>
>>>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>>>      promote_use (use);
>>>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
>>>      promote_def (def);
>>>
>>
>> Name promote_def_and_uses in my implementation is a bit confusing. It is
>> promoting the SSA_NAMEs. We only have to do that for the definitions if we
>> can do the SSA_NAMEs defined by parameters.
>>
>> I also have a bitmap to see if we have promoted a variable and avoid doing
>> it again. I will try to improve this.
>>
>>
>>
>>> this should make processing more efficient (memory local) compared to
>>> doing the split handling
>>> in promote_def_and_uses.
>>>
>>> I think it will be convenient to have a SSA name info structure where
>>> you can remember the original
>>> type a name was promoted from as well as whether it was promoted or
>>> not.  This way adjusting
>>> debug uses should be "trivial":
>>>
>>> +static unsigned int
>>> +fixup_uses (tree use, tree promoted_type, tree old_type)
>>> +{
>>> +  gimple *stmt;
>>> +  imm_use_iterator ui;
>>> +  gimple_stmt_iterator gsi;
>>> +  use_operand_p op;
>>> +
>>> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
>>> +    {
>>> +      bool do_not_promote = false;
>>> +      switch (gimple_code (stmt))
>>> +       {
>>> +       case GIMPLE_DEBUG:
>>> +           {
>>> +             gsi = gsi_for_stmt (stmt);
>>> +             gsi_remove (&gsi, true);
>>>
>>> rather than doing the above you'd do sth like
>>>
>>>    SET_USE (use, fold_convert (old_type, new_def));
>>>    update_stmt (stmt);
>>>
>>
>> We do have these information (original type a name was promoted from as well
>> as whether it was promoted or not). To make it easy to review, in the patch
>> that adds the pass,I am removing these debug stmts. But in patch 4, I am
>> trying to handle this properly. Maybe  I should combine them.
> 
> Yeah, it's a bit confusing otherwise.
> 
>>> note that while you may not be able to use promoted regs at all uses
>>> (like calls or asms) you can promote all defs, if only with a compensation
>>> statement after the original def.  The SSA name info struct can be used
>>> to note down the actual SSA name holding the promoted def.
>>>
>>> The pass looks a lot better than last time (it's way smaller!) but
>>> still needs some
>>> improvements.  There are some more fishy details with respect to how you
>>> allocate/change SSA names but I think those can be dealt with once the
>>> basic structure looks how I like it to be.
>>>
>>
>> I will post an updated patch in a day or two.
> 
> Thanks,
> Richard.
> 
>> Thanks again,
>> Kugan

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]

From 355a6ebe7cc2548417e2e4976b842fbbf5e93224 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/3] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..671a388 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9213,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9926,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 33011 bytes --]

From 8b2256e4787adb05ac9c439ef54d5befe035915d Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/3] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 997 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 gcc/tree-ssanames.c           |   3 +-
 8 files changed, 1017 insertions(+), 1 deletion(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  Idea of
+this pass is to promote operations such a way that we can minimise
+generation of subreg in RTL, that intern results in removal of
+redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..2831fec
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,997 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  Idea of this pass is to promote operations
+   such a way that we can minimise generation of subreg in RTL,
+   that in turn results in removal of redundant zero/sign extensions.  This pass
+   will run prior to The VRP and DOM such that they will be able to optimise
+   redundant truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+
+*/
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, tree>  *original_type_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Set ssa NAME will have higher bits if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+
+/* Return true if ssa NAME will have higher bits if promoted.  */
+static bool
+ssa_overflows_p (tree name ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return true;
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple *stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  gphi *phi = as_a <gphi *> (stmt);
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple *stmt)
+{
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    default:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+    }
+  return changed;
+}
+
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple *> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *phi = gsi_stmt (gsi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple *stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+
+	default:
+	  gcc_assert (false);
+	  break;
+	}
+
+      if (changed)
+	{
+	  gimple *use_stmt;
+	  imm_use_iterator ui;
+
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code)
+      == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value in NEW_VAR.  gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines def
+   is def_stmt, make the type of def promoted_type.  If the stmt is such
+   that, result of the def_stmt cannot be of promoted_type, create a new_def
+   of the original_type and make the def_stmt assign its value to newdef.
+   Then, create a CONVERT_EXPR to convert new_def to def of promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_ssa (tree def,
+	     tree promoted_type)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+						   new_def, NULL_TREE);
+		  gsi = gsi_for_stmt (def_stmt);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!safe_to_promote_def_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (!type_precision_ok (TREE_TYPE (rhs)))
+		{
+		  do_not_promote = true;
+		}
+	      else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, arguments
+		     of def_stmt will be visited before def itself.  If RHS
+		     is already promoted and its type is compatible, we can
+		     convert this into a ZERO/SIGN EXTEND stmt.  */
+		  tree &type = original_type_map->get_or_insert (rhs);
+		  if (type == NULL_TREE)
+		    type = TREE_TYPE (rhs);
+		  if ((TYPE_PRECISION (original_type) > TYPE_PRECISION (type))
+		      || (TYPE_UNSIGNED (original_type) != TYPE_UNSIGNED (type)))
+		    {
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      if (type == NULL_TREE)
+			type = TREE_TYPE (rhs);
+		      if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+			type = original_type;
+		      gcc_assert (type != NULL_TREE);
+		      TREE_TYPE (def) = promoted_type;
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (def, rhs,
+					       TYPE_PRECISION (type));
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		      gsi = gsi_for_stmt (def_stmt);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else
+		    {
+		      TREE_TYPE (def) = promoted_type;
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    }
+		}
+	      else
+		{
+		  /* If RHS is not promoted or the types are not
+		     compatible, create a CONVERT_EXPR that converts
+		     RHS to the promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0
+		      || (gimple_code (def_stmt) == GIMPLE_CALL
+			  && gimple_call_ctrl_altering_p (def_stmt)))
+		    gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+						  copy_stmt);
+		  else
+		    gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+	      }
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, CONVERT_EXPR,
+				       new_def, NULL_TREE);
+      gsi = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+				      copy_stmt);
+      else
+	gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT);
+    }
+  else
+    {
+      /* Type is now promoted.  Due to this, some of the value ranges computed
+	 by VRP1 will be invalid.  TODO: We can be intelligent in deciding
+	 which ranges to invalidate instead of invalidating everything.  */
+      SSA_NAME_RANGE_INFO (def) = NULL;
+    }
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	    {
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, fold_convert (old_type, use));
+	      update_stmt (stmt);
+	    }
+	  break;
+
+	case GIMPLE_ASM:
+	case GIMPLE_CALL:
+	case GIMPLE_RETURN:
+	    {
+	      /* USE cannot be promoted here.  */
+	      do_not_promote = true;
+	      break;
+	    }
+
+	case GIMPLE_ASSIGN:
+	    {
+	      enum tree_code code = gimple_assign_rhs_code (stmt);
+	      tree lhs = gimple_assign_lhs (stmt);
+	      if (!safe_to_promote_use_p (stmt))
+		{
+		  do_not_promote = true;
+		}
+	      else if (truncate_use_p (stmt)
+			 || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+		{
+		  if (TREE_CODE_CLASS (code)
+		      == tcc_comparison)
+		    promote_cst_in_stmt (stmt, promoted_type, true);
+		  if (!ssa_overflows_p (use))
+		    break;
+		  /* In some stmts, the value in USE has to be zero/sign
+		     extended based on the original type to get the correct
+		     result.  */
+		  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (temp, use,
+					   TYPE_PRECISION (old_type));
+		  gsi = gsi_for_stmt (stmt);
+		  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+		  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		    SET_USE (op, temp);
+		  update_stmt (stmt);
+		}
+	      else if (CONVERT_EXPR_CODE_P (code))
+		{
+		  tree rhs = gimple_assign_rhs1 (stmt);
+		  if (!type_precision_ok (TREE_TYPE (rhs)))
+		    {
+		      do_not_promote = true;
+		    }
+		  else if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		    {
+		      /* The types of LHS and the promoted RHS are compatible,
+			 so we can convert this into a ZERO/SIGN EXTEND stmt.  */
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (lhs, use,
+					       TYPE_PRECISION (old_type));
+		      gsi = gsi_for_stmt (stmt);
+		      set_ssa_promoted (lhs);
+		      gsi_replace (&gsi, copy_stmt, false);
+		    }
+		  else if (tobe_promoted_p (lhs))
+		    {
+		      /* If LHS will be promoted later, store the original
+			 type of RHS so that we can convert it to ZERO/SIGN
+			 EXTEND when LHS is promoted.  */
+		      tree rhs = gimple_assign_rhs1 (stmt);
+		      tree &type = original_type_map->get_or_insert (rhs);
+		      type = TREE_TYPE (old_type);
+		    }
+		  else
+		    {
+		      do_not_promote = true;
+		    }
+		}
+	      break;
+	    }
+
+	case GIMPLE_COND:
+	  if (ssa_overflows_p (use))
+	    {
+	      /* In GIMPLE_COND, the value in USE has to be zero/sign
+		 extended based on the original type to get the correct
+		 result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple *copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi = gsi_for_stmt (stmt);
+	      gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+
+	      FOR_EACH_IMM_USE_ON_STMT (op, ui)
+		SET_USE (op, temp);
+	    }
+	  promote_cst_in_stmt (stmt, promoted_type, true);
+	  update_stmt (stmt);
+	  break;
+
+	default:
+	  break;
+	}
+
+      if (do_not_promote)
+	{
+	  /* For stmts where USE cannot be promoted, create a
+	     copy in the original type.  */
+	  tree temp;
+	  temp = copy_ssa_name (use);
+	  set_ssa_promoted (temp);
+	  TREE_TYPE (temp) = old_type;
+	  gimple *copy_stmt = gimple_build_assign (temp, CONVERT_EXPR,
+						  use, NULL_TREE);
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT);
+	  FOR_EACH_IMM_USE_ON_STMT (op, ui)
+	    SET_USE (op, temp);
+	  update_stmt (stmt);
+	}
+    }
+  return 0;
+}
+void debug_tree (tree);
+
+/* Promote definition of NAME and adjust its uses if necessary.  */
+static unsigned int
+promote_ssa_if_not_promoted (tree name)
+{
+  tree type;
+  if (tobe_promoted_p (name))
+    {
+      type = get_promoted_type (TREE_TYPE (name));
+      tree old_type = TREE_TYPE (name);
+      promote_ssa (name, type);
+      set_ssa_promoted (name);
+      fixup_uses (name, type, old_type);
+    }
+  return 0;
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      use_operand_p op;
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  def = USE_FROM_PTR (op);
+	  promote_ssa_if_not_promoted (def);
+	}
+      def = PHI_RESULT (phi);
+      promote_ssa_if_not_promoted (def);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_ssa_if_not_promoted (def);
+    }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  original_type_map = new hash_map<tree, tree>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_commit_edge_inserts ();
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  delete original_type_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 82fd4a1..80fcf70 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
 
   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))
     {
       size_t size = (sizeof (range_info_def)
 		     + trailing_wide_ints <3>::extra_size (precision));
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/3] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/*  Sign-extend operation.  It sign extends the first operand from
+ the sign bit specified by the second operand.  The type of the
+ result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [7/7] Adjust-arm-test cases
  2015-09-07  5:54 ` [7/7] Adjust-arm-test cases Kugan
@ 2015-11-02 11:43   ` Richard Earnshaw
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Earnshaw @ 2015-11-02 11:43 UTC (permalink / raw)
  To: Kugan, gcc-patches; +Cc: Richard Biener

On 07/09/15 04:03, Kugan wrote:
> 
> 
> gcc/testsuite/ChangeLog:
> 
> 2015-09-07  Kugan Vivekanandarajah  <kuganv@linaro.org>
> 
> 	* gcc.target/arm/mla-2.c: Scan for wider mode operation.
> 	* gcc.target/arm/wmul-1.c: Likewise.
> 	* gcc.target/arm/wmul-2.c: Likewise.
> 	* gcc.target/arm/wmul-3.c: Likewise.
> 	* gcc.target/arm/wmul-9.c: Likewise.
> 
> 

This looks undesirable.  These tests are supposed to be checking that
widening multiplies can be generated.  If your other changes mean that
this no longer happens, then we should be asking whether this is the correct
behaviour now?  If it is the correct behaviour for these cases, then the
tests need enhancing so that they do still result in the correct
widening multiply instructions being generated.

R.


> 0007-adjust-arm-testcases.patch
> 
> 
> From 305c526b4019fc11260c474143f6829be2cc3f54 Mon Sep 17 00:00:00 2001
> From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
> Date: Wed, 2 Sep 2015 12:21:46 +1000
> Subject: [PATCH 7/8] adjust arm testcases
> 
> ---
>  gcc/testsuite/gcc.target/arm/mla-2.c  | 2 +-
>  gcc/testsuite/gcc.target/arm/wmul-1.c | 2 +-
>  gcc/testsuite/gcc.target/arm/wmul-2.c | 2 +-
>  gcc/testsuite/gcc.target/arm/wmul-3.c | 2 +-
>  gcc/testsuite/gcc.target/arm/wmul-9.c | 2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/mla-2.c b/gcc/testsuite/gcc.target/arm/mla-2.c
> index 1e3ca20..474bce0 100644
> --- a/gcc/testsuite/gcc.target/arm/mla-2.c
> +++ b/gcc/testsuite/gcc.target/arm/mla-2.c
> @@ -7,4 +7,4 @@ long long foolong (long long x, short *a, short *b)
>      return x + *a * *b;
>  }
>  
> -/* { dg-final { scan-assembler "smlalbb" } } */
> +/* { dg-final { scan-assembler "smla" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/wmul-1.c b/gcc/testsuite/gcc.target/arm/wmul-1.c
> index ddddd50..d4e7b41 100644
> --- a/gcc/testsuite/gcc.target/arm/wmul-1.c
> +++ b/gcc/testsuite/gcc.target/arm/wmul-1.c
> @@ -16,4 +16,4 @@ int mac(const short *a, const short *b, int sqr, int *sum)
>    return sqr;
>  }
>  
> -/* { dg-final { scan-assembler-times "smlabb" 2 } } */
> +/* { dg-final { scan-assembler-times "mla" 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/wmul-2.c b/gcc/testsuite/gcc.target/arm/wmul-2.c
> index 2ea55f9..0e32674 100644
> --- a/gcc/testsuite/gcc.target/arm/wmul-2.c
> +++ b/gcc/testsuite/gcc.target/arm/wmul-2.c
> @@ -10,4 +10,4 @@ void vec_mpy(int y[], const short x[], short scaler)
>     y[i] += ((scaler * x[i]) >> 31);
>  }
>  
> -/* { dg-final { scan-assembler-times "smulbb" 1 } } */
> +/* { dg-final { scan-assembler-times "mul" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/wmul-3.c b/gcc/testsuite/gcc.target/arm/wmul-3.c
> index 144b553..46d709c 100644
> --- a/gcc/testsuite/gcc.target/arm/wmul-3.c
> +++ b/gcc/testsuite/gcc.target/arm/wmul-3.c
> @@ -16,4 +16,4 @@ int mac(const short *a, const short *b, int sqr, int *sum)
>    return sqr;
>  }
>  
> -/* { dg-final { scan-assembler-times "smulbb" 2 } } */
> +/* { dg-final { scan-assembler-times "mul" 2 } } */
> diff --git a/gcc/testsuite/gcc.target/arm/wmul-9.c b/gcc/testsuite/gcc.target/arm/wmul-9.c
> index 40ed021..415a114 100644
> --- a/gcc/testsuite/gcc.target/arm/wmul-9.c
> +++ b/gcc/testsuite/gcc.target/arm/wmul-9.c
> @@ -8,4 +8,4 @@ foo (long long a, short *b, char *c)
>    return a + *b * *c;
>  }
>  
> -/* { dg-final { scan-assembler "smlalbb" } } */
> +/* { dg-final { scan-assembler "mlal" } } */
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-02  9:17             ` Kugan
@ 2015-11-03 14:40               ` Richard Biener
  2015-11-08  9:43                 ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-11-03 14:40 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Mon, Nov 2, 2015 at 10:17 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 29/10/15 02:45, Richard Biener wrote:
>> On Tue, Oct 27, 2015 at 1:50 AM, kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 23/10/15 01:23, Richard Biener wrote:
>>>>
>>>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 21/10/15 23:45, Richard Biener wrote:
>>>>>>
>>>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> This a new version of the patch posted in
>>>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>>>>> more testing and spitted the patch to make it more easier to review.
>>>>>>>> There are still couple of issues to be addressed and I am working on
>>>>>>>> them.
>>>>>>>>
>>>>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>>>>> mis-compiled
>>>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a
>>>>>>>> latent
>>>>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64
>>>>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the
>>>>>>>> time being, I am using  patch
>>>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>>>>> workaround. This meeds to be fixed before the patches are ready to be
>>>>>>>> committed.
>>>>>>>>
>>>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It
>>>>>>>> works
>>>>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as
>>>>>>>> well.
>>>>>>>
>>>>>>>
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> Now that stage 1 is going to close, I would like to get these patches
>>>>>>> accepted for stage1. I will try my best to address your review comments
>>>>>>> ASAP.
>>>>>>
>>>>>>
>>>>>> Ok, can you make the whole patch series available so I can poke at the
>>>>>> implementation a bit?  Please state the revision it was rebased on
>>>>>> (or point me to a git/svn branch the work resides on).
>>>>>>
>>>>>
>>>>> Thanks. Please find the patched rebated against trunk@229156. I have
>>>>> skipped the test-case readjustment patches.
>>>>
>>>>
>>>> Some quick observations.  On x86_64 when building
>>>
>>>
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>>
>>>>
>>>> short bar (short y);
>>>> int foo (short x)
>>>> {
>>>>    short y = bar (x) + 15;
>>>>    return y;
>>>> }
>>>>
>>>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
>>>> I get
>>>>
>>>>    <bb 2>:
>>>>    _1 = (int) x_10(D);
>>>>    _2 = (_1) sext (16);
>>>>    _11 = bar (_2);
>>>>    _5 = (int) _11;
>>>>    _12 = (unsigned int) _5;
>>>>    _6 = _12 & 65535;
>>>>    _7 = _6 + 15;
>>>>    _13 = (int) _7;
>>>>    _8 = (_13) sext (16);
>>>>    _9 = (_8) sext (16);
>>>>    return _9;
>>>>
>>>> which looks fine but the VRP optimization doesn't trigger for the
>>>> redundant sext
>>>> (ranges are computed correctly but the 2nd extension is not removed).
>
> Thanks for the comments. Please find the attached patches with which I
> am now getting
> cat .192t.optimized
>
> ;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0,
> symbol_order=0)
>
> foo (short int x)
> {
>   signed int _1;
>   int _2;
>   signed int _5;
>   unsigned int _6;
>   unsigned int _7;
>   signed int _8;
>   int _9;
>   short int _11;
>   unsigned int _12;
>   signed int _13;
>
>   <bb 2>:
>   _1 = (signed int) x_10(D);
>   _2 = _1;
>   _11 = bar (_2);
>   _5 = (signed int) _11;
>   _12 = (unsigned int) _11;
>   _6 = _12 & 65535;
>   _7 = _6 + 15;
>   _13 = (signed int) _7;
>   _8 = (_13) sext (16);
>   _9 = _8;
>   return _9;
>
> }
>
>
> There are still some redundancies. The asm difference after RTL
> optimizations is
>
> -       addl    $15, %eax
> +       addw    $15, %ax
>
>
>>>>
>>>> This also makes me notice trivial match.pd patterns are missing, like
>>>> for example
>>>>
>>>> (simplify
>>>>   (sext (sext@2 @0 @1) @3)
>>>>   (if (tree_int_cst_compare (@1, @3) <= 0)
>>>>    @2
>>>>    (sext @0 @3)))
>>>>
>>>> as VRP doesn't run at -O1 we must rely on those to remove rendudant
>>>> extensions,
>>>> otherwise generated code might get worse compared to without the pass(?)
>>>
>>>
>>> Do you think that we should enable this pass only when vrp is enabled.
>>> Otherwise, even when we do the simple optimizations you mentioned below, we
>>> might not be able to remove all the redundancies.
>>>
>>>>
>>>> I also notice that the 'short' argument does not get it's sign-extension
>>>> removed
>>>> as redundand either even though we have
>>>>
>>>> _1 = (int) x_8(D);
>>>> Found new range for _1: [-32768, 32767]
>>>>
>>>
>>> I am looking into it.
>>>
>>>> In the end I suspect that keeping track of the "simple" cases in the
>>>> promotion
>>>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
>>>> to do
>>>> its work).  In some way whether the ABI guarantees promoted argument
>>>> registers might need some other target hook queries.
>
> I tried adding it in the attached patch with record_visit_stmt to track
> whether an ssa would have value overflow or properly zero/sign extended
> in promoted mode. We can use this to eliminate some of the zero/sign
> extension at gimple level. As it is, it doesn't do much. If this is what
> you had in mind, I will extend it based on your feedback.
>
>
>>>>
>>>> Now onto the 0002 patch.
>>>>
>>>> +static bool
>>>> +type_precision_ok (tree type)
>>>> +{
>>>> +  return (TYPE_PRECISION (type)  == 8
>>>> +         || TYPE_PRECISION (type) == 16
>>>> +         || TYPE_PRECISION (type) == 32);
>>>> +}
>>>>
>>>> that's a weird function to me.  You probably want
>>>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>>>> here?  And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>>>>
>>>
>>> I will change this. (I have a patch which I am testing with other changes
>>> you have asked for)
>>>
>>>
>>>> +/* Return the promoted type for TYPE.  */
>>>> +static tree
>>>> +get_promoted_type (tree type)
>>>> +{
>>>> +  tree promoted_type;
>>>> +  enum machine_mode mode;
>>>> +  int uns;
>>>> +  if (POINTER_TYPE_P (type)
>>>> +      || !INTEGRAL_TYPE_P (type)
>>>> +      || !type_precision_ok (type))
>>>> +    return type;
>>>> +
>>>> +  mode = TYPE_MODE (type);
>>>> +#ifdef PROMOTE_MODE
>>>> +  uns = TYPE_SIGN (type);
>>>> +  PROMOTE_MODE (mode, uns, type);
>>>> +#endif
>>>> +  uns = TYPE_SIGN (type);
>>>> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
>>>> +  if (promoted_type
>>>> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
>>>> +    type = promoted_type;
>>>>
>>>> I think what you want to verify is that TYPE_PRECISION (promoted_type)
>>>> == GET_MODE_PRECISION (mode).
>>>> And to not even bother with this simply use
>>>>
>>>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
>>>> uns);
>>>>
>>>
>>> I am changing this too.
>>>
>>>> You use a domwalk but also might create new basic-blocks during it
>>>> (insert_on_edge_immediate), that's a
>>>> no-no, commit edge inserts after the domwalk.
>>>
>>>
>>> I am sorry, I dont understand "commit edge inserts after the domwalk" Is
>>> there a way to do this in the current implementation?
>>
>> Yes, simply use gsi_insert_on_edge () and after the domwalk is done do
>> gsi_commit_edge_inserts ().
>>
>>>> ssa_sets_higher_bits_bitmap looks unused and
>>>> we generally don't free dominance info, so please don't do that.
>>>>
>>>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc
>>>> with
>>>>
>>>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
>>>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
>>>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
>>>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
>>>> /usr/local/powerpc64-unknown-linux-gnu/sys-include    -g -O2 -O2  -g
>>>> -O2 -DIN_GCC    -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
>>>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
>>>> -Wold-style-definition  -isystem ./include   -fPIC -mlong-double-128
>>>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
>>>> -fno-stack-protector   -fPIC -mlong-double-128 -mno-minimal-toc -I.
>>>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
>>>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
>>>> -I../../../trunk/libgcc/../libdecnumber/dpd
>>>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS  -o _divdi3.o
>>>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
>>>> ../../../trunk/libgcc/libgcc2.c \
>>>>            -fexceptions -fnon-call-exceptions -fvisibility=hidden
>>>> -DHIDE_EXPORTS
>>>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
>>>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
>>>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
>>>> expand_debug_locations, at cfgexpand.c:5277
>>>>
>
> With the attached patch, now I am running into Bootstrap comparison
> failure. I am looking into it. Please review this version so that I can
> address them while fixing this issue.

I notice

diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 82fd4a1..80fcf70 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
   unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));

   /* Allocate if not available.  */
-  if (ri == NULL)
+  if (ri == NULL
+      || (precision != ri->get_min ().get_precision ()))

and I think you need to clear range info on promoted SSA vars in the
promotion pass.
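
Concretely (just a sketch, reusing what promote_ssa already does for the
in-place case), that would be something like

      /* The old range was computed for the narrower type; drop it.  */
      SSA_NAME_RANGE_INFO (def) = NULL;

for every SSA name whose type the pass changes.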

The basic "structure" thing still remains.  You walk over all uses and
defs in all stmts in promote_all_stmts, which ends up calling
promote_ssa_if_not_promoted on all uses and defs, which in turn promotes
the "def" and then fixes up all uses in all stmts.

Instead of this you should, in promote_all_stmts, walk over all uses doing what
fixup_uses does and then walk over all defs, doing what promote_ssa does.
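
A rough sketch of that shape (illustrative only -- fixup_use_in_stmt and
promote_def are made-up names for the per-use and per-def work that
fixup_uses/promote_ssa do today; PHIs would be walked the same way):

      static void
      promote_all_stmts (basic_block bb)
      {
        ssa_op_iter iter;
        tree op;

        for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
             gsi_next (&gsi))
          {
            gimple *stmt = gsi_stmt (gsi);

            /* First fix up the uses of this stmt ...  */
            FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
              fixup_use_in_stmt (stmt, op);

            /* ... and only then promote the defs of this stmt.  */
            FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
              promote_def (op);
          }
      }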

+    case GIMPLE_NOP:
+       {
+         if (SSA_NAME_VAR (def) == NULL)
+           {
+             /* Promote def by fixing its type for anonymous def.  */
+             TREE_TYPE (def) = promoted_type;
+           }
+         else
+           {
+             /* Create a promoted copy of parameters.  */
+             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));

I think the uninitialized vars are somewhat tricky and it would be best
to create a new uninit anonymous SSA name for them.  You can
have SSA_NAME_VAR != NULL and def _not_ being a parameter
btw.
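
(For illustration: in

      int foo (short s) { short t; return s + t; }

both s_1(D) and t_2(D) have a GIMPLE_NOP defining stmt, but only s_1(D)
belongs to a PARM_DECL; t_2(D) has a non-NULL SSA_NAME_VAR that is just an
uninitialized local.)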

+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+safe_to_promote_def_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == ARRAY_REF
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == BIT_FIELD_REF
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;

huh, I think this function has an odd name, maybe
can_promote_operation ()?  Please
use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees.

Note that as followup things like the rotates should be "expanded" like
we'd do on RTL (open-coding the thing).  And we'd need a way to
specify zero-/sign-extended loads.
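
(For illustration, the usual open-coding of a rotate-left in the promoted
type, for an unsigned value X of original precision P and 0 < N < P, would
be something like

      (X << N) | (X >> (P - N))

with the result then truncated back to P bits like any other promoted
operation.)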

+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE

I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
_2 = a[i_3];

+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || code == ASM_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;

ASM_EXPR can never appear here.  I think PROMOTE_MODE never
promotes vector types - what cases did you need to add VECTOR_TYPE_P for?

+/* Return true if the SSA_NAME has to be truncated to preserve the
+   semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);

I think the description can be improved.  This is about stray bits set
beyond the original type, correct?
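
(A small example of what goes wrong otherwise, assuming char is promoted
to int:

      signed char c = -1;
      int q = c / 2;      /* must be 0, but 0x000000ff / 2 == 127    */
      int b = c < 0;      /* must be 1, but 0x000000ff < 0 is false  */

so divisions, shifts and comparisons need the promoted operand
sign-/zero-extended from the original precision first.)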

Please use NOP_EXPR wherever you use CONVERT_EXPR right now.

+                 if (TREE_CODE_CLASS (code)
+                     == tcc_comparison)
+                   promote_cst_in_stmt (stmt, promoted_type, true);

don't you always need to promote constant operands?

Richard.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-03 14:40               ` Richard Biener
@ 2015-11-08  9:43                 ` Kugan
  2015-11-10 14:13                   ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-11-08  9:43 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5762 bytes --]


Thanks Richard for the comments.  Please find the attached patches, which
now pass bootstrap on x86_64-none-linux-gnu, aarch64-linux-gnu and
ppc64-linux-gnu.  Regression testing is ongoing.  Please find my answers
to your questions/suggestions below.

> 
> I notice
> 
> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
> index 82fd4a1..80fcf70 100644
> --- a/gcc/tree-ssanames.c
> +++ b/gcc/tree-ssanames.c
> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
>    unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
> 
>    /* Allocate if not available.  */
> -  if (ri == NULL)
> +  if (ri == NULL
> +      || (precision != ri->get_min ().get_precision ()))
> 
> and I think you need to clear range info on promoted SSA vars in the
> promotion pass.

Done.

> 
> The basic "structure" thing still remains.  You walk over all uses and
> defs in all stmts
> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
> uses and defs which in turn promotes (the "def") and then fixes up all
> uses in all stmts.

Done.

> 
> Instead of this you should, in promote_all_stmts, walk over all uses doing what
> fixup_uses does and then walk over all defs, doing what promote_ssa does.
> 
> +    case GIMPLE_NOP:
> +       {
> +         if (SSA_NAME_VAR (def) == NULL)
> +           {
> +             /* Promote def by fixing its type for anonymous def.  */
> +             TREE_TYPE (def) = promoted_type;
> +           }
> +         else
> +           {
> +             /* Create a promoted copy of parameters.  */
> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> 
> I think the uninitialized vars are somewhat tricky and it would be best
> to create a new uninit anonymous SSA name for them.  You can
> have SSA_NAME_VAR != NULL and def _not_ being a parameter
> btw.

Done.  I also had to make some changes in a couple of other places to
reflect this.
They are:
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
     {
       tree arg = gimple_phi_arg_def (stmt, i);
       if (TREE_CODE (arg) == SSA_NAME
+	  && SSA_NAME_VAR (arg)
 	  && !SSA_NAME_IS_DEFAULT_DEF (arg))
 	{
 	  gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
@@ -434,7 +435,8 @@ get_rank (tree e)
       if (gimple_code (stmt) == GIMPLE_PHI)
 	return phi_rank (stmt);

-      if (!is_gimple_assign (stmt))
+      if (!is_gimple_assign (stmt)
+	  && !gimple_nop_p (stmt))
 	return bb_rank[gimple_bb (stmt)->index];

and

--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
use_operand_p use_p,
   TREE_VISITED (ssa_name) = 1;

   if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
-      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
+      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
+	  || SSA_NAME_VAR (ssa_name) == NULL))
     ; /* Default definitions have empty statements.  Nothing to do.  */
   else if (!def_bb)
     {

Does this look OK?

> 
> +/* Return true if it is safe to promote the defined SSA_NAME in the STMT
> +   itself.  */
> +static bool
> +safe_to_promote_def_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  if (gimple_vuse (stmt) != NULL_TREE
> +      || gimple_vdef (stmt) != NULL_TREE
> +      || code == ARRAY_REF
> +      || code == LROTATE_EXPR
> +      || code == RROTATE_EXPR
> +      || code == VIEW_CONVERT_EXPR
> +      || code == BIT_FIELD_REF
> +      || code == REALPART_EXPR
> +      || code == IMAGPART_EXPR
> +      || code == REDUC_MAX_EXPR
> +      || code == REDUC_PLUS_EXPR
> +      || code == REDUC_MIN_EXPR)
> +    return false;
> +  return true;
> 
> huh, I think this function has an odd name, maybe
> can_promote_operation ()?  Please
> use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees.

Done.

> 
> Note that as followup things like the rotates should be "expanded" like
> we'd do on RTL (open-coding the thing).  And we'd need a way to
> specify zero-/sign-extended loads.
> 
> +/* Return true if it is safe to promote the use in the STMT.  */
> +static bool
> +safe_to_promote_use_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  tree lhs = gimple_assign_lhs (stmt);
> +
> +  if (gimple_vuse (stmt) != NULL_TREE
> +      || gimple_vdef (stmt) != NULL_TREE
> 
> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
> _2 = a[i_3];
> 
When I remove this, I see errors in stmts like:

unsigned char
unsigned int
# .MEM_197 = VDEF <.MEM_187>
fs_9(D)->fde_encoding = _154;


> +      || code == VIEW_CONVERT_EXPR
> +      || code == LROTATE_EXPR
> +      || code == RROTATE_EXPR
> +      || code == CONSTRUCTOR
> +      || code == BIT_FIELD_REF
> +      || code == COMPLEX_EXPR
> +      || code == ASM_EXPR
> +      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
> +    return false;
> +  return true;
> 
> ASM_EXPR can never appear here.  I think PROMOTE_MODE never
> promotes vector types - what cases did you need to add VECTOR_TYPE_P for?

Done
> 
> +/* Return true if the SSA_NAME has to be truncated to preserve the
> +   semantics.  */
> +static bool
> +truncate_use_p (gimple *stmt)
> +{
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> 
> I think the description can be improved.  This is about stray bits set
> beyond the original type, correct?
> 
> Please use NOP_EXPR wherever you use CONVERT_EXPR right how.
> 
> +                 if (TREE_CODE_CLASS (code)
> +                     == tcc_comparison)
> +                   promote_cst_in_stmt (stmt, promoted_type, true);
> 
> don't you always need to promote constant operands?

I am promoting all the constants.  Here, I am promoting the constants
that are part of the conditions.
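
For example (an illustration, not from the patch): for "unsigned char x"
promoted to int, "if (x == 255)" is compared in the promoted width, so the
constant has to be represented in the promoted type consistently with how
the use of x was zero-extended; promote_cst_in_stmt takes care of that for
conditions and comparisons.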


Thanks,
Kugan

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]

From a25f711713778cd3ed3d0976cc3f37d541479afb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/4] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..671a388 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9213,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9926,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 37083 bytes --]

From f1b226443b63eda75f38f204a0befa5578e6df0f Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/4] Add type promotion pass

---
 gcc/Makefile.in               |    1 +
 gcc/auto-profile.c            |    2 +-
 gcc/common.opt                |    4 +
 gcc/doc/invoke.texi           |   10 +
 gcc/gimple-ssa-type-promote.c | 1026 +++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |    1 +
 gcc/timevar.def               |    1 +
 gcc/tree-pass.h               |    1 +
 gcc/tree-ssa-reassoc.c        |    4 +-
 gcc/tree-ssa-uninit.c         |   23 +-
 gcc/tree-ssa.c                |    3 +-
 libiberty/cp-demangle.c       |    2 +-
 12 files changed, 1064 insertions(+), 14 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 25202c5..d32c3b6 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that we can minimise
+the generation of subregs in RTL, which in turn results in the removal
+of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..1d24566
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,1026 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass runs prior to VRP and DOM so that they are able
+   to optimise redundant truncations and extensions.  This is based on the
+   discussion from https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+
+*/
+
+struct ssa_name_info
+{
+  tree ssa;
+  tree type;
+  tree promoted_type;
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static sbitmap ssa_sets_higher_bits_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Record that ssa NAME may have its higher bits set if promoted.  */
+static void
+set_ssa_overflows (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_sets_higher_bits_bitmap, index);
+    }
+}
+
+
+/* Return true if ssa NAME will have higher bits if promoted.  */
+static bool
+ssa_overflows_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+
+      if (gimple_code (def_stmt) == GIMPLE_NOP
+	  && SSA_NAME_VAR (name)
+	  && TREE_CODE (SSA_NAME_VAR (name)) != PARM_DECL)
+	return true;
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index);
+    }
+  return true;
+}
+
+/* Visit PHI stmt and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_phi_node (gimple *stmt)
+{
+  tree def;
+  ssa_op_iter i;
+  use_operand_p op;
+  bool high_bits_set = false;
+  gphi *phi = as_a <gphi *> (stmt);
+  tree lhs = PHI_RESULT (phi);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || ssa_overflows_p (lhs))
+    return false;
+
+  FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE)
+    {
+      def = USE_FROM_PTR (op);
+      if (ssa_overflows_p (def))
+	high_bits_set = true;
+    }
+
+  if (high_bits_set)
+    {
+      set_ssa_overflows (lhs);
+      return true;
+    }
+  else
+    return false;
+}
+
+/* Visit STMT and record if variables might have higher bits set if
+   promoted.  */
+static bool
+record_visit_stmt (gimple *stmt)
+{
+  bool changed = false;
+  gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN);
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (lhs) != SSA_NAME
+      || POINTER_TYPE_P (TREE_TYPE (lhs))
+      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  switch (code)
+    {
+    case SSA_NAME:
+      if (!ssa_overflows_p (lhs)
+	  && ssa_overflows_p (rhs1))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+
+    default:
+      if (!ssa_overflows_p (lhs))
+	{
+	  set_ssa_overflows (lhs);
+	  changed = true;
+	}
+      break;
+    }
+  return changed;
+}
+
+static void
+process_all_stmts_for_unsafe_promotion ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  auto_vec<gimple *> work_list;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *phi = gsi_stmt (gsi);
+	  work_list.safe_push (phi);
+	}
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+	    work_list.safe_push (stmt);
+	}
+    }
+
+  while (work_list.length () > 0)
+    {
+      bool changed;
+      gimple *stmt = work_list.pop ();
+      tree lhs;
+
+      switch (gimple_code (stmt))
+	{
+
+	case GIMPLE_ASSIGN:
+	  changed = record_visit_stmt (stmt);
+	  lhs = gimple_assign_lhs (stmt);
+	  break;
+
+	case GIMPLE_PHI:
+	  changed = record_visit_phi_node (stmt);
+	  lhs = PHI_RESULT (stmt);
+	  break;
+
+	default:
+	  gcc_assert (false);
+	  break;
+	}
+
+      if (changed)
+	{
+	  gimple *use_stmt;
+	  imm_use_iterator ui;
+
+	  FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs)
+	    {
+	      if (gimple_code (use_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (use_stmt) == GIMPLE_PHI)
+		work_list.safe_push (use_stmt);
+	    }
+	}
+    }
+}
+
+/* Return true if it is safe to promote the defined SSA_NAME in the STMT
+   itself.  */
+static bool
+can_promote_operation_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || TREE_CODE_CLASS (code) == tcc_reference
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == VIEW_CONVERT_EXPR
+      || code == REALPART_EXPR
+      || code == IMAGPART_EXPR
+      || code == REDUC_MAX_EXPR
+      || code == REDUC_PLUS_EXPR
+      || code == REDUC_MIN_EXPR)
+    return false;
+  return true;
+}
+
+/* Return true if it is safe to promote the use in the STMT.  */
+static bool
+safe_to_promote_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (gimple_vuse (stmt) != NULL_TREE
+      || gimple_vdef (stmt) != NULL_TREE
+      || code == VIEW_CONVERT_EXPR
+      || code == LROTATE_EXPR
+      || code == RROTATE_EXPR
+      || code == CONSTRUCTOR
+      || code == BIT_FIELD_REF
+      || code == COMPLEX_EXPR
+      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+  return true;
+}
+
+/* Return true if the SSA_NAME used in STMT has to be truncated (stray bits
+   set beyond the original type in the promoted mode) to preserve semantics.  */
+static bool
+truncate_use_p (gimple *stmt)
+{
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  if (TREE_CODE_CLASS (code) == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR)
+    return true;
+  return false;
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.  */
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  sign = TYPE_SIGN (type);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+	  op = gimple_cond_rhs (cond);
+
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (TREE_TYPE (new_var)))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+void duplicate_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from);
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote the definition DEF to PROMOTED_TYPE.  If the stmt that defines def
+   is def_stmt, make the type of def promoted_type.  If the stmt is such
+   that the result of def_stmt cannot be of promoted_type, create a new_def
+   of the original_type and make def_stmt assign its value to new_def.
+   Then create a CONVERT_EXPR to convert new_def to def of promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  bool do_not_promote = false;
+  if (!tobe_promoted_p (def))
+    return;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+  ssa_name_info *info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+						       sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+	{
+	  /* Promote def by fixing its type and make def anonymous.  */
+	  TREE_TYPE (def) = promoted_type;
+	  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	  promote_cst_in_stmt (def_stmt, promoted_type);
+	  break;
+	}
+
+    case GIMPLE_ASM:
+	{
+	  gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	  for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	    {
+	      /* Promote def and copy (i.e. convert) the value defined
+		 by asm to def.  */
+	      tree link = gimple_asm_output_op (asm_stmt, i);
+	      tree op = TREE_VALUE (link);
+	      if (op == def)
+		{
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  duplicate_default_ssa (new_def, def);
+		  TREE_VALUE (link) = new_def;
+		  gimple_asm_set_output_op (asm_stmt, i, link);
+
+		  TREE_TYPE (def) = promoted_type;
+		  copy_stmt = gimple_build_assign (def, NOP_EXPR,
+						   new_def, NULL_TREE);
+		  SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		  gsi2 = gsi_for_stmt (def_stmt);
+		  gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		  break;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_NOP:
+	{
+	  if (SSA_NAME_VAR (def) == NULL
+	      || TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      if (SSA_NAME_VAR (def))
+		{
+		  set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
+		  SSA_NAME_IS_DEFAULT_DEF (def) = 0;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		}
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi2 = gsi_after_labels (bb);
+	      new_def = copy_ssa_name (def);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	      duplicate_default_ssa (new_def, def);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+					       new_def, NULL_TREE);
+	      SSA_NAME_DEF_STMT (def) = copy_stmt;
+	      gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	    }
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	  if (!can_promote_operation_p (def_stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      tree rhs = gimple_assign_rhs1 (def_stmt);
+	      if (!type_precision_ok (TREE_TYPE (rhs)))
+		{
+		  do_not_promote = true;
+		}
+	      else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+		{
+		  /* As we traverse statements in dominator order, arguments
+		     of def_stmt will be visited before visiting def.  If RHS
+		     is already promoted and type is compatible, we can convert
+		     them into ZERO/SIGN EXTEND stmt.  */
+		  ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		  tree type;
+		  if (info == NULL)
+		    type = TREE_TYPE (rhs);
+		  else
+		    type = info->type;
+		  if ((TYPE_PRECISION (original_type)
+		       > TYPE_PRECISION (type))
+		      || (TYPE_UNSIGNED (original_type)
+			  != TYPE_UNSIGNED (type)))
+		    {
+		      if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+			type = original_type;
+		      gcc_assert (type != NULL_TREE);
+		      TREE_TYPE (def) = promoted_type;
+		      gimple *copy_stmt =
+			zero_sign_extend_stmt (def, rhs,
+					       TYPE_PRECISION (type));
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		      gsi_replace (gsi, copy_stmt, false);
+		    }
+		  else
+		    {
+		      TREE_TYPE (def) = promoted_type;
+		      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    }
+		}
+	      else
+		{
+		  /* If RHS is not promoted OR their types are not
+		     compatible, create CONVERT_EXPR that converts
+		     RHS to  promoted DEF type and perform a
+		     ZERO/SIGN EXTEND to get the required value
+		     from RHS.  */
+		  tree s = (TYPE_PRECISION (TREE_TYPE (def))
+			    < TYPE_PRECISION (TREE_TYPE (rhs)))
+		    ? TREE_TYPE (def) : TREE_TYPE (rhs);
+		  new_def = copy_ssa_name (def);
+		  set_ssa_promoted (new_def);
+		  TREE_TYPE (def) = promoted_type;
+		  TREE_TYPE (new_def) = promoted_type;
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		  gimple_set_lhs (def_stmt, new_def);
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (def, new_def,
+					   TYPE_PRECISION (s));
+		  gsi2 = gsi_for_stmt (def_stmt);
+		  if (lookup_stmt_eh_lp (def_stmt) > 0
+		      || (gimple_code (def_stmt) == GIMPLE_CALL
+			  && gimple_call_ctrl_altering_p (def_stmt)))
+		    gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+					copy_stmt);
+		  else
+		    gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+	      }
+	    }
+	  else
+	    {
+	      /* Promote def by fixing its type and make def anonymous.  */
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      promote_cst_in_stmt (def_stmt, promoted_type);
+	      TREE_TYPE (def) = promoted_type;
+	    }
+	  break;
+	}
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+				       new_def, NULL_TREE);
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+
+  SSA_NAME_RANGE_INFO (def) = NULL;
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_uses (gimple *stmt, gimple_stmt_iterator *gsi,
+	    use_operand_p op, tree use)
+{
+  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
+  if (!info)
+    return 0;
+
+  tree promoted_type = info->promoted_type;
+  tree old_type = info->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+	{
+	  SET_USE (op, fold_convert (old_type, use));
+	  update_stmt (stmt);
+	}
+      break;
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+	{
+	  /* USE cannot be promoted here.  */
+	  do_not_promote = true;
+	  break;
+	}
+
+    case GIMPLE_ASSIGN:
+	{
+	  enum tree_code code = gimple_assign_rhs_code (stmt);
+	  tree lhs = gimple_assign_lhs (stmt);
+	  if (!safe_to_promote_use_p (stmt))
+	    {
+	      do_not_promote = true;
+	    }
+	  else if (truncate_use_p (stmt)
+		   || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+	    {
+	      /* Promote the constant in a comparison when the other comparison
+		 operand is promoted.  All other constants are promoted as
+		 part of promoting definition in promote_ssa.  */
+	      if (TREE_CODE_CLASS (code) == tcc_comparison)
+		promote_cst_in_stmt (stmt, promoted_type, true);
+	      if (!ssa_overflows_p (use))
+		break;
+	      /* In some stmts, value in USE has to be ZERO/SIGN
+		 Extended based on the original type for correct
+		 result.  */
+	      tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	      gimple *copy_stmt =
+		zero_sign_extend_stmt (temp, use,
+				       TYPE_PRECISION (old_type));
+	      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	      SET_USE (op, temp);
+	      update_stmt (stmt);
+	    }
+	  else if (CONVERT_EXPR_CODE_P (code))
+	    {
+	      if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+		{
+		  /* Type of LHS and promoted RHS are compatible, we can
+		     convert this into ZERO/SIGN EXTEND stmt.  */
+		  gimple *copy_stmt =
+		    zero_sign_extend_stmt (lhs, use,
+					   TYPE_PRECISION (old_type));
+		  set_ssa_promoted (lhs);
+		  gsi_replace (gsi, copy_stmt, false);
+		}
+	      else if (tobe_promoted_p (lhs));
+	      else
+		{
+		  do_not_promote = true;
+		}
+	    }
+	  break;
+	}
+
+    case GIMPLE_COND:
+      if (ssa_overflows_p (use))
+	{
+	  /* In GIMPLE_COND, value in USE has to be ZERO/SIGN
+	     Extended based on the original type for correct
+	     result.  */
+	  tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	  gimple *copy_stmt =
+	    zero_sign_extend_stmt (temp, use,
+				   TYPE_PRECISION (old_type));
+	  gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	  SET_USE (op, temp);
+	}
+      promote_cst_in_stmt (stmt, promoted_type, true);
+      update_stmt (stmt);
+      break;
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create an
+	 original type copy.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
+					       use, NULL_TREE);
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_uses (phi, &gsi, op, use);
+	}
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	continue;
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	    && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_uses (stmt, &gsi, op, use);
+	}
+    }
+}
+
+void promote_debug_stmts ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree use;
+  use_operand_p op;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple *stmt = gsi_stmt (gsi);
+	if (!is_gimple_debug (stmt))
+	  continue;
+	FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	  {
+	    use = USE_FROM_PTR (op);
+	    fixup_uses (stmt, &gsi, op, use);
+	  }
+      }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+  ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_sets_higher_bits_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  process_all_stmts_for_unsafe_promotion ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  promote_debug_stmts ();
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  sbitmap_free (ssa_sets_higher_bits_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 45b8d46..07845e3 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
     {
       tree arg = gimple_phi_arg_def (stmt, i);
       if (TREE_CODE (arg) == SSA_NAME
+	  && SSA_NAME_VAR (arg)
 	  && !SSA_NAME_IS_DEFAULT_DEF (arg))
 	{
 	  gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
@@ -434,7 +435,8 @@ get_rank (tree e)
       if (gimple_code (stmt) == GIMPLE_PHI)
 	return phi_rank (stmt);
 
-      if (!is_gimple_assign (stmt))
+      if (!is_gimple_assign (stmt)
+	  && !gimple_nop_p (stmt))
 	return bb_rank[gimple_bb (stmt)->index];
 
       /* If we already have a rank for this expression, use that.  */
diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c
index 3f7dbcf..93422ac 100644
--- a/gcc/tree-ssa-uninit.c
+++ b/gcc/tree-ssa-uninit.c
@@ -201,16 +201,19 @@ warn_uninitialized_vars (bool warn_possibly_uninitialized)
 	  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, op_iter, SSA_OP_USE)
 	    {
 	      use = USE_FROM_PTR (use_p);
-	      if (always_executed)
-		warn_uninit (OPT_Wuninitialized, use,
-			     SSA_NAME_VAR (use), SSA_NAME_VAR (use),
-			     "%qD is used uninitialized in this function",
-			     stmt, UNKNOWN_LOCATION);
-	      else if (warn_possibly_uninitialized)
-		warn_uninit (OPT_Wmaybe_uninitialized, use,
-			     SSA_NAME_VAR (use), SSA_NAME_VAR (use),
-			     "%qD may be used uninitialized in this function",
-			     stmt, UNKNOWN_LOCATION);
+	      if (SSA_NAME_VAR (use))
+		{
+		  if (always_executed)
+		    warn_uninit (OPT_Wuninitialized, use,
+				 SSA_NAME_VAR (use), SSA_NAME_VAR (use),
+				 "%qD is used uninitialized in this function",
+				 stmt, UNKNOWN_LOCATION);
+		  else if (warn_possibly_uninitialized)
+		    warn_uninit (OPT_Wmaybe_uninitialized, use,
+				 SSA_NAME_VAR (use), SSA_NAME_VAR (use),
+				 "%qD may be used uninitialized in this function",
+				 stmt, UNKNOWN_LOCATION);
+		}
 	    }
 
 	  /* For memory the only cheap thing we can do is see if we
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index 4b869be..3e520fc 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, use_operand_p use_p,
   TREE_VISITED (ssa_name) = 1;
 
   if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
-      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
+      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
+	  || SSA_NAME_VAR (ssa_name) == NULL))
     ; /* Default definitions have empty statements.  Nothing to do.  */
   else if (!def_bb)
     {
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/4] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It sign-extends the first operand from
+   the sign bit specified by the second operand.  The type of the
+   result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-08  9:43                 ` Kugan
@ 2015-11-10 14:13                   ` Richard Biener
  2015-11-12  6:08                     ` Kugan
  2015-11-14  1:15                     ` Kugan
  0 siblings, 2 replies; 63+ messages in thread
From: Richard Biener @ 2015-11-10 14:13 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Sun, Nov 8, 2015 at 10:43 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
> Thanks Richard for the comments.  Please find the attached patches which
> now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu  and
> ppc64-linux-gnu. Regression testing is ongoing. Please find the comments
> for your questions/suggestions below.
>
>>
>> I notice
>>
>> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
>> index 82fd4a1..80fcf70 100644
>> --- a/gcc/tree-ssanames.c
>> +++ b/gcc/tree-ssanames.c
>> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type,
>>    unsigned int precision = TYPE_PRECISION (TREE_TYPE (name));
>>
>>    /* Allocate if not available.  */
>> -  if (ri == NULL)
>> +  if (ri == NULL
>> +      || (precision != ri->get_min ().get_precision ()))
>>
>> and I think you need to clear range info on promoted SSA vars in the
>> promotion pass.
>
> Done.
>
>>
>> The basic "structure" thing still remains.  You walk over all uses and
>> defs in all stmts
>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
>> uses and defs which in turn promotes (the "def") and then fixes up all
>> uses in all stmts.
>
> Done.

Not exactly.  I still see

/* Promote all the stmts in the basic block.  */
static void
promote_all_stmts (basic_block bb)
{
  gimple_stmt_iterator gsi;
  ssa_op_iter iter;
  tree def, use;
  use_operand_p op;

  for (gphi_iterator gpi = gsi_start_phis (bb);
       !gsi_end_p (gpi); gsi_next (&gpi))
    {
      gphi *phi = gpi.phi ();
      def = PHI_RESULT (phi);
      promote_ssa (def, &gsi);

      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
        {
          use = USE_FROM_PTR (op);
          if (TREE_CODE (use) == SSA_NAME
              && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
            promote_ssa (use, &gsi);
          fixup_uses (phi, &gsi, op, use);
        }

you still call promote_ssa on both DEFs and USEs and promote_ssa looks
at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
on DEFs and fixup_uses on USEs.
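
To make that concrete, this is roughly the structure I have in mind
(completely untested; it just illustrates calling fixup_uses for every
USE and promote_ssa only for DEFs):

  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
    {
      gimple *stmt = gsi_stmt (gsi);
      if (is_gimple_debug (stmt))
        continue;

      /* First fix up all the uses ...  */
      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
        fixup_uses (stmt, &gsi, op, USE_FROM_PTR (op));

      /* ... and only then promote the defs.  */
      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
        promote_ssa (def, &gsi);
    }

with the same split in the PHI loop (promote_ssa on the PHI_RESULT,
fixup_uses on the PHI arguments).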

Any reason you do not promote debug stmts during the DOM walk?

So for each DEF you record in ssa_name_info

struct ssa_name_info
{
  tree ssa;
  tree type;
  tree promoted_type;
};

(the fields need documenting).  Add a tree promoted_def to it which you
can replace any use of the DEF with.
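
Documented, that struct would look something like this (illustrative
only):

struct ssa_name_info
{
  /* The SSA name this entry describes.  */
  tree ssa;
  /* Its type before promotion.  */
  tree type;
  /* The type it gets promoted to.  */
  tree promoted_type;
  /* A def of promoted_type that any use of SSA can be replaced with.  */
  tree promoted_def;
};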

Currently as you call promote_ssa for DEFs and USEs you repeatedly
overwrite the entry in ssa_name_info_map with a new copy.  So you
should assert it wasn't already there.

  switch (gimple_code (def_stmt))
    {
    case GIMPLE_PHI:
        {

the last { is indented too much it should be indented 2 spaces
relative to the 'case'


  SSA_NAME_RANGE_INFO (def) = NULL;

only needed in the case 'def' was promoted itself.  Please use
reset_flow_sensitive_info (def).

>>
>> Instead of this you should, in promote_all_stmts, walk over all uses doing what
>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>
>> +    case GIMPLE_NOP:
>> +       {
>> +         if (SSA_NAME_VAR (def) == NULL)
>> +           {
>> +             /* Promote def by fixing its type for anonymous def.  */
>> +             TREE_TYPE (def) = promoted_type;
>> +           }
>> +         else
>> +           {
>> +             /* Create a promoted copy of parameters.  */
>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>
>> I think the uninitialized vars are somewhat tricky and it would be best
>> to create a new uninit anonymous SSA name for them.  You can
>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>> btw.
>
> Done. I also had to do some changes to in couple of other places to
> reflect this.
> They are:
> --- a/gcc/tree-ssa-reassoc.c
> +++ b/gcc/tree-ssa-reassoc.c
> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
>      {
>        tree arg = gimple_phi_arg_def (stmt, i);
>        if (TREE_CODE (arg) == SSA_NAME
> +         && SSA_NAME_VAR (arg)
>           && !SSA_NAME_IS_DEFAULT_DEF (arg))
>         {
>           gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
> @@ -434,7 +435,8 @@ get_rank (tree e)
>        if (gimple_code (stmt) == GIMPLE_PHI)
>         return phi_rank (stmt);
>
> -      if (!is_gimple_assign (stmt))
> +      if (!is_gimple_assign (stmt)
> +         && !gimple_nop_p (stmt))
>         return bb_rank[gimple_bb (stmt)->index];
>
> and
>
> --- a/gcc/tree-ssa.c
> +++ b/gcc/tree-ssa.c
> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
> use_operand_p use_p,
>    TREE_VISITED (ssa_name) = 1;
>
>    if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
> -      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
> +      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
> +         || SSA_NAME_VAR (ssa_name) == NULL))
>      ; /* Default definitions have empty statements.  Nothing to do.  */
>    else if (!def_bb)
>      {
>
> Does this look OK?

Hmm, no, this looks bogus.

I think the best thing to do is not promoting default defs at all and instead
promote at the uses.

              /* Create a promoted copy of parameters.  */
              bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
              gcc_assert (bb);
              gsi2 = gsi_after_labels (bb);
              new_def = copy_ssa_name (def);
              set_ssa_promoted (new_def);
              set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
              duplicate_default_ssa (new_def, def);
              TREE_TYPE (def) = promoted_type;

AFAIK this is just an awkward way of replacing all uses by a new DEF, sth
that should be supported by the machinery so that other default defs can just
do

             new_def = get_or_create_default_def (create_tmp_reg
(promoted_type));

and have all uses ('def') replaced by new_def.

>>
>> +/* Return true if it is safe to promote the defined SSA_NAME in the STMT
>> +   itself.  */
>> +static bool
>> +safe_to_promote_def_p (gimple *stmt)
>> +{
>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>> +  if (gimple_vuse (stmt) != NULL_TREE
>> +      || gimple_vdef (stmt) != NULL_TREE
>> +      || code == ARRAY_REF
>> +      || code == LROTATE_EXPR
>> +      || code == RROTATE_EXPR
>> +      || code == VIEW_CONVERT_EXPR
>> +      || code == BIT_FIELD_REF
>> +      || code == REALPART_EXPR
>> +      || code == IMAGPART_EXPR
>> +      || code == REDUC_MAX_EXPR
>> +      || code == REDUC_PLUS_EXPR
>> +      || code == REDUC_MIN_EXPR)
>> +    return false;
>> +  return true;
>>
>> huh, I think this function has an odd name, maybe
>> can_promote_operation ()?  Please
>> use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees.
>
> Done.
>
>>
>> Note that as followup things like the rotates should be "expanded" like
>> we'd do on RTL (open-coding the thing).  And we'd need a way to
>> specify zero-/sign-extended loads.
>>
>> +/* Return true if it is safe to promote the use in the STMT.  */
>> +static bool
>> +safe_to_promote_use_p (gimple *stmt)
>> +{
>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>> +  tree lhs = gimple_assign_lhs (stmt);
>> +
>> +  if (gimple_vuse (stmt) != NULL_TREE
>> +      || gimple_vdef (stmt) != NULL_TREE
>>
>> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
>> _2 = a[i_3];
>>
> When I remove this, I see errors in stmts like:
>
> unsigned char
> unsigned int
> # .MEM_197 = VDEF <.MEM_187>
> fs_9(D)->fde_encoding = _154;

Yeah, as said a stmt based check is really bogus without context.  As the
predicate is only used in a single place it's better to inline it
there.  In this
case you want to handle loads/stores differently.  From this context it
looks like not iterating over uses in the caller but rather iterating over
uses here makes most sense as you then can do

   if (gimple_store_p (stmt))
     {
        promote all uses that are not gimple_assign_rhs1 ()
     }
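
i.e. something along these lines (untested, just making the pseudo-code
above concrete; the stored value itself has to stay in the original
type):

   if (gimple_store_p (stmt))
     {
       FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
         {
           use = USE_FROM_PTR (op);
           if (use != gimple_assign_rhs1 (stmt))
             fixup_uses (stmt, &gsi, op, use);
         }
     }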

you can also transparently handle constants for the cases where promoting
is required.  At the moment their handling is intertwined with the def promotion
code.  That makes the whole thing hard to follow.

Thanks,
Richard.

>
>> +      || code == VIEW_CONVERT_EXPR
>> +      || code == LROTATE_EXPR
>> +      || code == RROTATE_EXPR
>> +      || code == CONSTRUCTOR
>> +      || code == BIT_FIELD_REF
>> +      || code == COMPLEX_EXPR
>> +      || code == ASM_EXPR
>> +      || VECTOR_TYPE_P (TREE_TYPE (lhs)))
>> +    return false;
>> +  return true;
>>
>> ASM_EXPR can never appear here.  I think PROMOTE_MODE never
>> promotes vector types - what cases did you need to add VECTOR_TYPE_P for?
>
> Done
>>
>> +/* Return true if the SSA_NAME has to be truncated to preserve the
>> +   semantics.  */
>> +static bool
>> +truncate_use_p (gimple *stmt)
>> +{
>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>>
>> I think the description can be improved.  This is about stray bits set
>> beyond the original type, correct?
>>
>> Please use NOP_EXPR wherever you use CONVERT_EXPR right how.
>>
>> +                 if (TREE_CODE_CLASS (code)
>> +                     == tcc_comparison)
>> +                   promote_cst_in_stmt (stmt, promoted_type, true);
>>
>> don't you always need to promote constant operands?
>
> I am promoting all the constants. Here, I am promoting the the constants
> that are part of the conditions.
>
>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-10 14:13                   ` Richard Biener
@ 2015-11-12  6:08                     ` Kugan
  2015-11-14  1:15                     ` Kugan
  1 sibling, 0 replies; 63+ messages in thread
From: Kugan @ 2015-11-12  6:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 9053 bytes --]

Hi Richard,

Thanks for the review.

>>>
>>> The basic "structure" thing still remains.  You walk over all uses and
>>> defs in all stmts
>>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all
>>> uses and defs which in turn promotes (the "def") and then fixes up all
>>> uses in all stmts.
>>
>> Done.
> 
> Not exactly.  I still see
> 
> /* Promote all the stmts in the basic block.  */
> static void
> promote_all_stmts (basic_block bb)
> {
>   gimple_stmt_iterator gsi;
>   ssa_op_iter iter;
>   tree def, use;
>   use_operand_p op;
> 
>   for (gphi_iterator gpi = gsi_start_phis (bb);
>        !gsi_end_p (gpi); gsi_next (&gpi))
>     {
>       gphi *phi = gpi.phi ();
>       def = PHI_RESULT (phi);
>       promote_ssa (def, &gsi);
> 
>       FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
>         {
>           use = USE_FROM_PTR (op);
>           if (TREE_CODE (use) == SSA_NAME
>               && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
>             promote_ssa (use, &gsi);
>           fixup_uses (phi, &gsi, op, use);
>         }
> 
> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
> on DEFs and fixup_uses on USEs.

I am doing this to promote SSA names that are defined with GIMPLE_NOP.
Is there any way to iterate over these?  I have added a gcc_assert to
make sure that promote_ssa is called only once.

> 
> Any reason you do not promote debug stmts during the DOM walk?
> 
> So for each DEF you record in ssa_name_info
> 
> struct ssa_name_info
> {
>   tree ssa;
>   tree type;
>   tree promoted_type;
> };
> 
> (the fields need documenting).  Add a tree promoted_def to it which you
> can replace any use of the DEF with.

In this version of the patch, I am promoting the def in place.  If we
decide to change that, I will add it.  If I understand you correctly, this
is to be used when iterating over the uses and fixing them up.

> 
> Currently as you call promote_ssa for DEFs and USEs you repeatedly
> overwrite the entry in ssa_name_info_map with a new copy.  So you
> should assert it wasn't already there.
> 
>   switch (gimple_code (def_stmt))
>     {
>     case GIMPLE_PHI:
>         {
> 
> the last { is indented too much it should be indented 2 spaces
> relative to the 'case'

Done.

> 
> 
>   SSA_NAME_RANGE_INFO (def) = NULL;
> 
> only needed in the case 'def' was promoted itself.  Please use
> reset_flow_sensitive_info (def).

We are promoting all the defs.  In some cases, however, we can reuse the
value ranges on the SSA names just by promoting them to the new type (as
the values will be the same).  Shall I do it as a follow-up?
> 
>>>
>>> Instead of this you should, in promote_all_stmts, walk over all uses doing what
>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>
>>> +    case GIMPLE_NOP:
>>> +       {
>>> +         if (SSA_NAME_VAR (def) == NULL)
>>> +           {
>>> +             /* Promote def by fixing its type for anonymous def.  */
>>> +             TREE_TYPE (def) = promoted_type;
>>> +           }
>>> +         else
>>> +           {
>>> +             /* Create a promoted copy of parameters.  */
>>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>
>>> I think the uninitialized vars are somewhat tricky and it would be best
>>> to create a new uninit anonymous SSA name for them.  You can
>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>> btw.
>>
>> Done. I also had to do some changes to in couple of other places to
>> reflect this.
>> They are:
>> --- a/gcc/tree-ssa-reassoc.c
>> +++ b/gcc/tree-ssa-reassoc.c
>> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt)
>>      {
>>        tree arg = gimple_phi_arg_def (stmt, i);
>>        if (TREE_CODE (arg) == SSA_NAME
>> +         && SSA_NAME_VAR (arg)
>>           && !SSA_NAME_IS_DEFAULT_DEF (arg))
>>         {
>>           gimple *def_stmt = SSA_NAME_DEF_STMT (arg);
>> @@ -434,7 +435,8 @@ get_rank (tree e)
>>        if (gimple_code (stmt) == GIMPLE_PHI)
>>         return phi_rank (stmt);
>>
>> -      if (!is_gimple_assign (stmt))
>> +      if (!is_gimple_assign (stmt)
>> +         && !gimple_nop_p (stmt))
>>         return bb_rank[gimple_bb (stmt)->index];
>>
>> and
>>
>> --- a/gcc/tree-ssa.c
>> +++ b/gcc/tree-ssa.c
>> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb,
>> use_operand_p use_p,
>>    TREE_VISITED (ssa_name) = 1;
>>
>>    if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name))
>> -      && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
>> +      && (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
>> +         || SSA_NAME_VAR (ssa_name) == NULL))
>>      ; /* Default definitions have empty statements.  Nothing to do.  */
>>    else if (!def_bb)
>>      {
>>
>> Does this look OK?
> 
> Hmm, no, this looks bogus.

I have removed all the above.

> 
> I think the best thing to do is not promoting default defs at all and instead
> promote at the uses.
> 
>               /* Create a promoted copy of parameters.  */
>               bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>               gcc_assert (bb);
>               gsi2 = gsi_after_labels (bb);
>               new_def = copy_ssa_name (def);
>               set_ssa_promoted (new_def);
>               set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
>               duplicate_default_ssa (new_def, def);
>               TREE_TYPE (def) = promoted_type;
> 
> AFAIK this is just an awkward way of replacing all uses by a new DEF, sth
> that should be supported by the machinery so that other default defs can just
> do
> 
>              new_def = get_or_create_default_def (create_tmp_reg
> (promoted_type));
> 
> and have all uses ('def') replaced by new_def.

I experimented with get_or_create_default_def.  Here we have to have an
SSA_NAME_VAR (def) of the promoted type.

In the attached patch I am doing the following, and it seems to work.
Does this look OK?

+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }

I prefer to promote the def, as otherwise iterating over the uses and
promoting can get complicated (we would have to look at all the different
kinds of stmts again and do the right thing, as it was in the earlier
version of this before we moved to this approach).

>>>
>>> Note that as followup things like the rotates should be "expanded" like
>>> we'd do on RTL (open-coding the thing).  And we'd need a way to
>>> specify zero-/sign-extended loads.
>>>
>>> +/* Return true if it is safe to promote the use in the STMT.  */
>>> +static bool
>>> +safe_to_promote_use_p (gimple *stmt)
>>> +{
>>> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>>> +  tree lhs = gimple_assign_lhs (stmt);
>>> +
>>> +  if (gimple_vuse (stmt) != NULL_TREE
>>> +      || gimple_vdef (stmt) != NULL_TREE
>>>
>>> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say
>>> _2 = a[i_3];
>>>
>> When I remove this, I see errors in stmts like:
>>
>> unsigned char
>> unsigned int
>> # .MEM_197 = VDEF <.MEM_187>
>> fs_9(D)->fde_encoding = _154;
> 
> Yeah, as said a stmt based check is really bogus without context.  As the
> predicate is only used in a single place it's better to inline it
> there.  In this
> case you want to handle loads/stores differently.  From this context it
> looks like not iterating over uses in the caller but rather iterating over
> uses here makes most sense as you then can do
> 
>    if (gimple_store_p (stmt))
>      {
>         promote all uses that are not gimple_assign_rhs1 ()
>      }
> 
> you can also transparently handle constants for the cases where promoting
> is required.  At the moment their handling is intertwined with the def promotion
> code.  That makes the whole thing hard to follow.


I have updated the comments with:

+/* Promote constants in STMT to TYPE.  If PROMOTE_COND is true,
+   promote only the constants in the condition part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR we
+   can promote only when we promote the other operand.  Therefore, this
+   is done during fixup_use.  */
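
For example (purely illustrative), with char promoted to int, in

  _2 = _1 == -1 ? _3 : _4;

the -1 in the condition can only be rewritten as the equivalent int
constant once we know that _1, the other operand of the comparison, has
been promoted; that is why this case is handled in fixup_use rather than
when promoting the defining stmt.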


I am handling gimple_debug separately to avoid any code differences with
and without the -g option.  I have updated the comments for this.

I have tested the attached patch on ppc64, aarch64 and x86_64-none-linux-gnu;
regression testing for ppc64 is still in progress.  I also noticed that
tree-ssa-uninit sometimes gives false positives due to the assumptions
it makes.  Is it OK to move this pass before type promotion?  I can do the
testing and post a separate patch for this if that is OK.

I also removed the optimization that prevents some of the redundant
truncations/extensions from the type promotion pass, as it doesn't do much
as of now.  I can send a proper follow-up patch.  Is that OK?

Thanks,
Kugan

[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-patch, Size: 3609 bytes --]

From 0eb41ec18322484cf0ae8ca6631ac9dc913576fb Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/5] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..024c8ef 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,54 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      type_min = wi::sext (type_min, prec);
+      type_max = wi::sext (type_max, prec);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9215,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9928,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-patch, Size: 30437 bytes --]

From 31c9caf7b239827ed6ac7ad7f4fe05e0ba4197e2 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/5] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/auto-profile.c            |   2 +-
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 845 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 libiberty/cp-demangle.c       |   2 +-
 9 files changed, 865 insertions(+), 2 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 25202c5..d32c3b6 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that we can minimise
+the generation of subregs in RTL, which in turn results in the removal
+of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..6a8cc06
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,845 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in removal of redundant zero/sign
+   extensions.  This pass will run prior to VRP and DOM so that they can
+   optimise redundant truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+*/
+
+/* Structure to hold the type and promoted type for promoted ssa variables.  */
+struct ssa_name_info
+{
+  tree ssa;		/* Name of the SSA_NAME.  */
+  tree type;		/* Original type of ssa.  */
+  tree promoted_type;	/* Promoted type of ssa.  */
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR, we
+   can promote only when we promote the other operand. Therefore, this
+   is done during fixup_use.  */
+
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	      == tcc_comparison)
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  sign = TYPE_SIGN (type);
+	  op = gimple_cond_lhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+
+	  op = gimple_cond_rhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value in NEW_VAR.  gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (unsigned_p)
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+static void
+copy_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the stmt that defines def
+   is def_stmt, make the type of def promoted_type.  If the stmt is such
+   that the result of the def_stmt cannot be of promoted_type, create a new_def
+   of the original_type and make the def_stmt assign its value to new_def.
+   Then, create a NOP_EXPR to convert new_def to def of promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  ssa_name_info *info;
+  bool do_not_promote = false;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+
+  if (!tobe_promoted_p (def))
+    return;
+
+  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+							 sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  gcc_assert (!ssa_name_info_map->get_or_insert (def));
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      {
+	/* Promote def by fixing its type and make def anonymous.  */
+	TREE_TYPE (def) = promoted_type;
+	SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	promote_cst_in_stmt (def_stmt, promoted_type);
+	break;
+      }
+
+    case GIMPLE_ASM:
+      {
+	gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	  {
+	    /* Promote def and copy (i.e. convert) the value defined
+	       by asm to def.  */
+	    tree link = gimple_asm_output_op (asm_stmt, i);
+	    tree op = TREE_VALUE (link);
+	    if (op == def)
+	      {
+		new_def = copy_ssa_name (def);
+		set_ssa_promoted (new_def);
+		copy_default_ssa (new_def, def);
+		TREE_VALUE (link) = new_def;
+		gimple_asm_set_output_op (asm_stmt, i, link);
+
+		TREE_TYPE (def) = promoted_type;
+		copy_stmt = gimple_build_assign (def, NOP_EXPR,
+						 new_def, NULL_TREE);
+		SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		gsi2 = gsi_for_stmt (def_stmt);
+		gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		break;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_NOP:
+      {
+	if (SSA_NAME_VAR (def) == NULL)
+	  {
+	    /* Promote def by fixing its type for anonymous def.  */
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }
+	else
+	  {
+	    /* Create a promoted copy of parameters.  */
+	    bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	    gcc_assert (bb);
+	    gsi2 = gsi_after_labels (bb);
+	    /* Create new_def of the original type and set that to be the
+	       parameter.  */
+	    new_def = copy_ssa_name (def);
+	    set_ssa_promoted (new_def);
+	    set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	    copy_default_ssa (new_def, def);
+
+	    /* Now promote the def and copy the value from parameter.  */
+	    TREE_TYPE (def) = promoted_type;
+	    copy_stmt = gimple_build_assign (def, NOP_EXPR,
+					     new_def, NULL_TREE);
+	    SSA_NAME_DEF_STMT (def) = copy_stmt;
+	    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	  }
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	if (gimple_vuse (def_stmt) != NULL_TREE
+	    || gimple_vdef (def_stmt) != NULL_TREE
+	    || TREE_CODE_CLASS (code) == tcc_reference
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == VIEW_CONVERT_EXPR
+	    || code == REALPART_EXPR
+	    || code == IMAGPART_EXPR
+	    || code == REDUC_MAX_EXPR
+	    || code == REDUC_PLUS_EXPR
+	    || code == REDUC_MIN_EXPR)
+	  {
+	    do_not_promote = true;
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    tree rhs = gimple_assign_rhs1 (def_stmt);
+	    if (!type_precision_ok (TREE_TYPE (rhs))
+		|| !INTEGRAL_TYPE_P (TREE_TYPE (rhs))
+		|| (TYPE_UNSIGNED (TREE_TYPE (rhs)) != TYPE_UNSIGNED (promoted_type)))
+	      {
+		do_not_promote = true;
+	      }
+	    else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+	      {
+		/* As we travel statements in dominated order, arguments
+		   of def_stmt will be visited before visiting def.  If RHS
+		   is already promoted and type is compatible, we can convert
+		   them into ZERO/SIGN EXTEND stmt.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		tree type;
+		if (info == NULL)
+		  type = TREE_TYPE (rhs);
+		else
+		  type = info->type;
+		if ((TYPE_PRECISION (original_type)
+		     > TYPE_PRECISION (type))
+		    || (TYPE_UNSIGNED (original_type)
+			!= TYPE_UNSIGNED (type)))
+		  {
+		    if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		      type = original_type;
+		    gcc_assert (type != NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gsi_replace (gsi, copy_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	    else
+	      {
+		/* If RHS is not promoted OR their types are not
+		   compatible, create NOP_EXPR that converts
+		   RHS to  promoted DEF type and perform a
+		   ZERO/SIGN EXTEND to get the required value
+		   from RHS.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		if (info != NULL)
+		  {
+		    tree type = info->type;
+		    new_def = copy_ssa_name (rhs);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (new_def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    gsi2 = gsi_for_stmt (def_stmt);
+		    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+		    gassign *new_def_stmt = gimple_build_assign (def, code,
+								 new_def, NULL_TREE);
+		    gsi_replace (gsi, new_def_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	  }
+	else
+	  {
+	    /* Promote def by fixing its type and make def anonymous.  */
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    promote_cst_in_stmt (def_stmt, promoted_type);
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	break;
+      }
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+				       new_def, NULL_TREE);
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+  reset_flow_sensitive_info (def);
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
+	   use_operand_p op, tree use)
+{
+  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
+  /* If USE is not promoted, nothing to do.  */
+  if (!info)
+    return 0;
+
+  tree promoted_type = info->promoted_type;
+  tree old_type = info->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+      {
+	SET_USE (op, fold_convert (old_type, use));
+	update_stmt (stmt);
+	break;
+      }
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+      {
+	/* USE cannot be promoted here.  */
+	do_not_promote = true;
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (stmt);
+	tree lhs = gimple_assign_lhs (stmt);
+	if (gimple_vuse (stmt) != NULL_TREE
+	    || gimple_vdef (stmt) != NULL_TREE
+	    || code == VIEW_CONVERT_EXPR
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == CONSTRUCTOR
+	    || code == BIT_FIELD_REF
+	    || code == COMPLEX_EXPR
+	    || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (TREE_CODE_CLASS (code) == tcc_comparison
+		 || code == TRUNC_DIV_EXPR
+		 || code == CEIL_DIV_EXPR
+		 || code == FLOOR_DIV_EXPR
+		 || code == ROUND_DIV_EXPR
+		 || code == TRUNC_MOD_EXPR
+		 || code == CEIL_MOD_EXPR
+		 || code == FLOOR_MOD_EXPR
+		 || code == ROUND_MOD_EXPR
+		 || code == LSHIFT_EXPR
+		 || code == RSHIFT_EXPR
+		 || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    /* Promote the constant in comparison when other comparison
+	       operand is promoted.  All other constants are promoted as
+	       part of promoting definition in promote_ssa.  */
+	    if (TREE_CODE_CLASS (code) == tcc_comparison)
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	    /* In some stmts, value in USE has to be ZERO/SIGN
+	       Extended based on the original type for correct
+	       result.  */
+	    tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	    gimple *copy_stmt =
+	      zero_sign_extend_stmt (temp, use,
+				     TYPE_UNSIGNED (old_type),
+				     TYPE_PRECISION (old_type));
+	    gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	    SET_USE (op, temp);
+	    update_stmt (stmt);
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+	      {
+		/* Type of LHS and promoted RHS are compatible, we can
+		   convert this into ZERO/SIGN EXTEND stmt.  */
+		gimple *copy_stmt =
+		  zero_sign_extend_stmt (lhs, use,
+					 TYPE_UNSIGNED (old_type),
+					 TYPE_PRECISION (old_type));
+		set_ssa_promoted (lhs);
+		gsi_replace (gsi, copy_stmt, false);
+	      }
+	    else if (tobe_promoted_p (lhs));
+	    else
+	      {
+		do_not_promote = true;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_COND:
+      {
+	/* In GIMPLE_COND, value in USE has to be ZERO/SIGN
+	   Extended based on the original type for correct
+	   result.  */
+	tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	gimple *copy_stmt =
+	  zero_sign_extend_stmt (temp, use,
+				 TYPE_UNSIGNED (old_type),
+				 TYPE_PRECISION (old_type));
+	gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	SET_USE (op, temp);
+	promote_cst_in_stmt (stmt, promoted_type);
+	update_stmt (stmt);
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* FOR stmts where USE cannot be promoted, create an
+	 original type copy.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
+					       use, NULL_TREE);
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (phi, &gsi, op, use);
+	}
+
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	continue;
+
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (stmt, &gsi, op, use);
+	}
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+    }
+}
+
+/* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating
+   different sequences with and without -g.  This can happen when promoting
+   SSA names that are defined with GIMPLE_NOP.  */
+static void
+promote_debug_stmts ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree use;
+  use_operand_p op;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple *stmt = gsi_stmt (gsi);
+	if (!is_gimple_debug (stmt))
+	  continue;
+	FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	  {
+	    use = USE_FROM_PTR (op);
+	    fixup_use (stmt, &gsi, op, use);
+	  }
+      }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  promote_debug_stmts ();
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-patch, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/5] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It will sign-extend the first operand from
+ the sign bit specified by the second operand.  The type of the
+ result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-10 14:13                   ` Richard Biener
  2015-11-12  6:08                     ` Kugan
@ 2015-11-14  1:15                     ` Kugan
  2015-11-18 14:04                       ` Richard Biener
  1 sibling, 1 reply; 63+ messages in thread
From: Kugan @ 2015-11-14  1:15 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5293 bytes --]


Attached is the latest version of the patch, consisting of
0001-Add-new-SEXT_EXPR-tree-code.patch,
0002-Add-type-promotion-pass.patch and
0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.

I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
x86_64-linux-gnu, and regression testing on ppc64-linux-gnu,
aarch64-linux-gnu, arm64-linux-gnu and x86_64-linux-gnu. I ran into three
issues in ppc64-linux-gnu regression testing. There are also some other
test cases which need adjustment because they scan for patterns that are
no longer valid.

1. RTL fwprop was going into an infinite loop. It works with the following patch:
diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 16c7981..9cf4f43 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
new_rtx, rtx_insn *def_insn,
   int old_cost = 0;
   bool ok;

+  /* Value to be substituted is the same, nothing to do.  */
+  if (rtx_equal_p (*loc, new_rtx))
+    return false;
+
   update_df_init (def_insn, insn);

   /* forward_propagate_subreg may be operating on an instruction with

2. gcc.dg/torture/ftrapv-1.c fails
This is because we are checking for the SImode trapping. With the
promotion of the operation to a wider mode, this is, I think, expected,
and the testcase needs updating (see the sketch after this list).
3. gcc.dg/sms-3.c fails
It fails with -fmodulo-sched-allow-regmoves and is OK when I remove that
option. I am looking into it.
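
As a rough, hypothetical sketch of the pattern item 2 is about (this is
not the actual testcase, and the promoted width is an assumption based on
PROMOTE_MODE for a 64-bit target such as ppc64):

/* Hypothetical sketch, not the real ftrapv-1.c: with -ftrapv the signed
   SImode addition below is expected to trap on overflow.  If the pass
   promotes the arithmetic to a wider type (assumed DImode here), the
   value no longer overflows at SImode and no trap is emitted.  */
int
add (int a, int b)
{
  return a + b;   /* e.g. a = __INT_MAX__, b = 1 */
}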


I also have the following open issues from the previous review (as posted
with the previous patch); I am copying them again here for review.

1.
> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
> on DEFs and fixup_uses on USEs.

I am doing this to promote SSA names that are defined with GIMPLE_NOP. Is
there any way to iterate over these?  I have added a gcc_assert to make
sure that promote_ssa is called only once.

2.
> Instead of this you should, in promote_all_stmts, walk over all uses
doing what
> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>
> +    case GIMPLE_NOP:
> +       {
> +         if (SSA_NAME_VAR (def) == NULL)
> +           {
> +             /* Promote def by fixing its type for anonymous def.  */
> +             TREE_TYPE (def) = promoted_type;
> +           }
> +         else
> +           {
> +             /* Create a promoted copy of parameters.  */
> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>
> I think the uninitialized vars are somewhat tricky and it would be best
> to create a new uninit anonymous SSA name for them.  You can
> have SSA_NAME_VAR != NULL and def _not_ being a parameter
> btw.

I experimented with get_or_create_default_def. Here we have to have a
SSA_NAME_VAR (def) of the promoted type.

In the attached patch I am doing the following and it seems to work. Does
this look OK?

+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }

I prefer to promote the def, as otherwise iterating over the uses and
promoting them can get complicated (we would have to look at all the
different kinds of stmts again and do the right thing, as was the case in
the earlier version of this before we moved to this approach).
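
For reference, a rough C-level sketch of the shape I am aiming for with a
promoted parameter (the promoted width and all names below are
illustrative assumptions, not taken from an actual dump):

/* Assume PROMOTE_MODE widens char to int.  The parameter keeps its
   original type and becomes the new default def; a widened copy is
   created at the start of the function and the promoted stmts use it.  */
short
use_param (char c)
{
  int c_w = (int) c;    /* NOP_EXPR copy inserted after the labels of
                           the first block */
  int sum = c_w + 1;    /* arithmetic done in the promoted type */
  return (short) sum;   /* converted back where the original precision
                           is required */
}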

3)
> you can also transparently handle constants for the cases where promoting
> is required.  At the moment their handling is interwinded with the def
promotion
> code.  That makes the whole thing hard to follow.


I have updated the comments with:

+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR, we
+   can promote only when we promote the other operand. Therefore, this
+   is done during fixup_use.  */
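
To illustrate the COND_EXPR case at the C level (a minimal sketch; the
char operand and its promotion to int are assumptions for the example):

/* The constant 10 in the condition can only be widened once the
   promoted operand it is compared against is known, i.e. while fixing
   up the use of c, not while promoting the definition of x.  */
int
pick (char c, int a, int b)
{
  int x = (c < 10) ? a : b;   /* 10 is promoted alongside c */
  return x;
}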


4)
I am handling gimple_debug stmts separately to avoid any code differences
with and without the -g option. I have updated the comments for this.

5)
I also noticed that tree-ssa-uninit sometimes gives false positives due
to the assumptions it makes. Is it OK to move that pass before type
promotion? I can do the testing and post a separate patch for this if
that is OK.

6)
I also removed the optimization that prevents some of the redundant
truncations/extensions generated by the type promotion pass, as it
doesn't do much as of now. I can send a proper follow-up patch. Is that OK?

I also did a simple test with coremark for the latest patch. I compared
the code size of coremark for linux-gcc with -Os. The results are as
reported by the "size" utility. I know this doesn't mean much, but it can
give some indication.
Target		Base		With pass	Percentage improvement
======================================================================
arm		10476		10372		0.99
aarch64		9545		9521		0.25
ppc64		12236		12052		1.50


After resolving the above issues, I would like to propose that we commit
the pass with it not enabled by default (even though the patch as it
stands enables it by default; I am doing that for testing purposes).

Thanks,
Kugan



[-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3609 bytes --]

From 8e71ea17eaf6f282325076f588dbdf4f53c8b865 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/5] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/match.pd   |  6 ++++++
 gcc/tree-vrp.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0a9598e..1b152f1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.  If not see
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
    (op @0 (ext @1 @2)))))
 
+(simplify
+ (sext (sext@2 @0 @1) @3)
+ (if (tree_int_cst_compare (@1, @3) <= 0)
+  @2
+  (sext @0 @3)))
+
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..024c8ef 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,54 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
+  else if (code == SEXT_EXPR)
+    {
+      gcc_assert (range_int_cst_p (&vr1));
+      HOST_WIDE_INT prec = tree_to_uhwi (vr1.min);
+      type = vr0.type;
+      wide_int tmin, tmax;
+      wide_int may_be_nonzero, must_be_nonzero;
+
+      wide_int type_min = wi::min_value (prec, SIGNED);
+      wide_int type_max = wi::max_value (prec, SIGNED);
+      type_min = wide_int_to_tree (expr_type, type_min);
+      type_max = wide_int_to_tree (expr_type, type_max);
+      type_min = wi::sext (type_min, prec);
+      type_max = wi::sext (type_max, prec);
+      wide_int sign_bit
+	= wi::set_bit_in_zero (prec - 1,
+			       TYPE_PRECISION (TREE_TYPE (vr0.min)));
+      if (zero_nonzero_bits_from_vr (expr_type, &vr0,
+				     &may_be_nonzero,
+				     &must_be_nonzero))
+	{
+	  if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit)
+	    {
+	      /* If to-be-extended sign bit is one.  */
+	      tmin = type_min;
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else if (wi::bit_and (may_be_nonzero, sign_bit)
+		   != sign_bit)
+	    {
+	      /* If to-be-extended sign bit is zero.  */
+	      tmin = wi::zext (must_be_nonzero, prec);
+	      tmax = wi::zext (may_be_nonzero, prec);
+	    }
+	  else
+	    {
+	      tmin = type_min;
+	      tmax = type_max;
+	    }
+	}
+      else
+	{
+	  tmin = type_min;
+	  tmax = type_max;
+	}
+      min = wide_int_to_tree (expr_type, tmin);
+      max = wide_int_to_tree (expr_type, tmax);
+    }
   else if (code == RSHIFT_EXPR
 	   || code == LSHIFT_EXPR)
     {
@@ -9166,6 +9215,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
 	  break;
 	}
       break;
+    case SEXT_EXPR:
+	{
+	  unsigned int prec = tree_to_uhwi (op1);
+	  wide_int min = vr0.min;
+	  wide_int max = vr0.max;
+	  wide_int sext_min = wi::sext (min, prec);
+	  wide_int sext_max = wi::sext (max, prec);
+	  if (min == sext_min && max == sext_max)
+	    op = op0;
+	}
+      break;
     default:
       gcc_unreachable ();
     }
@@ -9868,6 +9928,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 
 	case BIT_AND_EXPR:
 	case BIT_IOR_EXPR:
+	case SEXT_EXPR:
 	  /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR
 	     if all the bits being cleared are already cleared or
 	     all the bits being set are already set.  */
-- 
1.9.1


[-- Attachment #3: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-diff, Size: 31165 bytes --]

From 42128668393c32c3860d346ead7b3118a090ffa4 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:52:37 +1100
Subject: [PATCH 2/5] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/auto-profile.c            |   2 +-
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 867 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 libiberty/cp-demangle.c       |   2 +-
 9 files changed, 887 insertions(+), 2 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index b91b8dc..c6aed45 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1499,6 +1499,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 25202c5..d32c3b6 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 12ca0d6..f450428 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2404,6 +2404,10 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform Type Promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cd82544..bc059a0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher.  Null pointer check
 elimination is only done if @option{-fdelete-null-pointer-checks} is
 enabled.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that we can minimise
+the generation of subregs in RTL, which in turn results in the removal
+of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..735e7ee
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,867 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in removal of redundant zero/sign
+   extensions.  This pass will run prior to VRP and DOM so that they can
+   optimise redundant truncations and extensions.  This is based on the discussion from
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+*/
+
+/* Structure to hold the type and promoted type for promoted ssa variables.  */
+struct ssa_name_info
+{
+  tree ssa;		/* Name of the SSA_NAME.  */
+  tree type;		/* Original type of ssa.  */
+  tree promoted_type;	/* Promoted type of ssa.  */
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+    }
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME)
+    {
+      unsigned int index = SSA_NAME_VERSION (name);
+      if (index < n_ssa_val)
+	bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+    }
+}
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Return true if the tree CODE needs the promoted operand to be
+   truncated (when stray bits are set beyond the original type in
+   promoted mode) to preserve the semantics.  */
+static bool
+truncate_use_p (enum tree_code code)
+{
+  if (code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR
+      || code == MAX_EXPR
+      || code == MIN_EXPR)
+    return true;
+  else
+    return false;
+}
+
+/* Convert constant CST to TYPE.  */
+static tree
+convert_int_cst (tree type, tree cst, signop sign = SIGNED)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR, we
+   can promote only when we promote the other operand. Therefore, this
+   is done during fixup_use.  */
+
+static void
+promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+  signop sign = SIGNED;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (promote_cond
+	  && gimple_assign_rhs_code (stmt) == COND_EXPR)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_CODE (op0) == INTEGER_CST)
+	    op0 = convert_int_cst (type, op0, sign);
+	  if (TREE_CODE (op1) == INTEGER_CST)
+	    op1 = convert_int_cst (type, op1, sign);
+	  tree new_op = build2 (TREE_CODE (op), type, op0, op1);
+	  gimple_assign_set_rhs1 (stmt, new_op);
+	}
+      else
+	{
+	  /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign));
+	  if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+	       == tcc_comparison)
+	      || truncate_use_p (gimple_assign_rhs_code (stmt)))
+	    sign = TYPE_SIGN (type);
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  sign = TYPE_SIGN (type);
+	  op = gimple_cond_lhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign));
+
+	  op = gimple_cond_rhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create an ssa with TYPE to copy ssa VAR.  */
+static tree
+make_promoted_copy (tree var, gimple *def_stmt, tree type)
+{
+  tree new_lhs = make_ssa_name (type, def_stmt);
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
+    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
+  return new_lhs;
+}
+
+/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
+   Assign the zero/sign extended value in NEW_VAR.  gimple statement
+   that performs the zero/sign extension is returned.  */
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width);
+  gimple *stmt;
+
+  if (unsigned_p)
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (width, false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var, build_int_cst (TREE_TYPE (var), width));
+  return stmt;
+}
+
+
+static void
+copy_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If DEF_STMT, the stmt that
+   defines DEF, can produce a result of PROMOTED_TYPE, simply change the
+   type of DEF.  If the result of DEF_STMT cannot be of PROMOTED_TYPE,
+   create a NEW_DEF of the original type, make DEF_STMT assign its value
+   to NEW_DEF, and create a NOP_EXPR to convert NEW_DEF to the promoted DEF.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  basic_block bb;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  ssa_name_info *info;
+  bool do_not_promote = false;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+
+  if (!tobe_promoted_p (def))
+    return;
+
+  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+							 sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  gcc_assert (!ssa_name_info_map->get_or_insert (def));
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      {
+	/* Promote def by fixing its type and make def anonymous.  */
+	TREE_TYPE (def) = promoted_type;
+	SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	promote_cst_in_stmt (def_stmt, promoted_type);
+	break;
+      }
+
+    case GIMPLE_ASM:
+      {
+	gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	  {
+	    /* Promote def and copy (i.e. convert) the value defined
+	       by asm to def.  */
+	    tree link = gimple_asm_output_op (asm_stmt, i);
+	    tree op = TREE_VALUE (link);
+	    if (op == def)
+	      {
+		new_def = copy_ssa_name (def);
+		set_ssa_promoted (new_def);
+		copy_default_ssa (new_def, def);
+		TREE_VALUE (link) = new_def;
+		gimple_asm_set_output_op (asm_stmt, i, link);
+
+		TREE_TYPE (def) = promoted_type;
+		copy_stmt = gimple_build_assign (def, NOP_EXPR,
+						 new_def, NULL_TREE);
+		SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		gsi2 = gsi_for_stmt (def_stmt);
+		gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		break;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_NOP:
+      {
+	if (SSA_NAME_VAR (def) == NULL)
+	  {
+	    /* Promote def by fixing its type for anonymous def.  */
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+	  {
+	    tree var = copy_node (SSA_NAME_VAR (def));
+	    TREE_TYPE (var) = promoted_type;
+	    TREE_TYPE (def) = promoted_type;
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+	  }
+	else
+	  {
+	    /* Create a promoted copy of parameters.  */
+	    bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	    gcc_assert (bb);
+	    gsi2 = gsi_after_labels (bb);
+	    /* Create new_def of the original type and set that to be the
+	       parameter.  */
+	    new_def = copy_ssa_name (def);
+	    set_ssa_promoted (new_def);
+	    set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def);
+	    copy_default_ssa (new_def, def);
+
+	    /* Now promote the def and copy the value from parameter.  */
+	    TREE_TYPE (def) = promoted_type;
+	    copy_stmt = gimple_build_assign (def, NOP_EXPR,
+					     new_def, NULL_TREE);
+	    SSA_NAME_DEF_STMT (def) = copy_stmt;
+	    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	  }
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	tree rhs = gimple_assign_rhs1 (def_stmt);
+	if (gimple_vuse (def_stmt) != NULL_TREE
+	    || gimple_vdef (def_stmt) != NULL_TREE
+	    || TREE_CODE_CLASS (code) == tcc_reference
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == VIEW_CONVERT_EXPR
+	    || code == REALPART_EXPR
+	    || code == IMAGPART_EXPR
+	    || code == REDUC_PLUS_EXPR
+	    || code == REDUC_MAX_EXPR
+	    || code == REDUC_MIN_EXPR
+	    || !INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (!type_precision_ok (TREE_TYPE (rhs)))
+	      {
+		do_not_promote = true;
+	      }
+	    else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+	      {
+		/* As we traverse statements in dominator order, the arguments
+		   of def_stmt will be visited before def.  If RHS is already
+		   promoted and its type is compatible, we can convert this
+		   into a ZERO/SIGN EXTEND stmt.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		tree type;
+		if (info == NULL)
+		  type = TREE_TYPE (rhs);
+		else
+		  type = info->type;
+		if ((TYPE_PRECISION (original_type)
+		     > TYPE_PRECISION (type))
+		    || (TYPE_UNSIGNED (original_type)
+			!= TYPE_UNSIGNED (type)))
+		  {
+		    if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		      type = original_type;
+		    gcc_assert (type != NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gsi_replace (gsi, copy_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	    else
+	      {
+		/* If RHS is not promoted or the types are not
+		   compatible, create a NOP_EXPR that converts
+		   RHS to the promoted DEF type and perform a
+		   ZERO/SIGN EXTEND to get the required value
+		   from RHS.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		if (info != NULL)
+		  {
+		    tree type = info->type;
+		    new_def = copy_ssa_name (rhs);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gimple *copy_stmt =
+		      zero_sign_extend_stmt (new_def, rhs,
+					     TYPE_UNSIGNED (type),
+					     TYPE_PRECISION (type));
+		    gsi2 = gsi_for_stmt (def_stmt);
+		    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+		    gassign *new_def_stmt = gimple_build_assign (def, code,
+								 new_def, NULL_TREE);
+		    gsi_replace (gsi, new_def_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	  }
+	else
+	  {
+	    /* Promote def by fixing its type and make def anonymous.  */
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    promote_cst_in_stmt (def_stmt, promoted_type);
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	break;
+      }
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR,
+				       new_def, NULL_TREE);
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+  reset_flow_sensitive_info (def);
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
+	   use_operand_p op, tree use)
+{
+  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
+  /* If USE is not promoted, nothing to do.  */
+  if (!info)
+    return 0;
+
+  tree promoted_type = info->promoted_type;
+  tree old_type = info->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+      {
+	SET_USE (op, fold_convert (old_type, use));
+	update_stmt (stmt);
+	break;
+      }
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+      {
+	/* USE cannot be promoted here.  */
+	do_not_promote = true;
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (stmt);
+	tree lhs = gimple_assign_lhs (stmt);
+	if (gimple_vuse (stmt) != NULL_TREE
+	    || gimple_vdef (stmt) != NULL_TREE
+	    || code == VIEW_CONVERT_EXPR
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == CONSTRUCTOR
+	    || code == BIT_FIELD_REF
+	    || code == COMPLEX_EXPR
+	    || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (TREE_CODE_CLASS (code) == tcc_comparison
+		 || truncate_use_p (code))
+	  {
+	    /* Promote the constant in a comparison when the other comparison
+	       operand is promoted.  All other constants are promoted as
+	       part of promoting the definition in promote_ssa.  */
+	    if (TREE_CODE_CLASS (code) == tcc_comparison)
+	      promote_cst_in_stmt (stmt, promoted_type, true);
+	    /* In some stmts, the value in USE has to be zero/sign
+	       extended based on the original type to get the
+	       correct result.  */
+	    tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	    gimple *copy_stmt =
+	      zero_sign_extend_stmt (temp, use,
+				     TYPE_UNSIGNED (old_type),
+				     TYPE_PRECISION (old_type));
+	    gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	    SET_USE (op, temp);
+	    update_stmt (stmt);
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+	      {
+		/* Type of LHS and promoted RHS are compatible, we can
+		   convert this into ZERO/SIGN EXTEND stmt.  */
+		gimple *copy_stmt =
+		  zero_sign_extend_stmt (lhs, use,
+					 TYPE_UNSIGNED (old_type),
+					 TYPE_PRECISION (old_type));
+		set_ssa_promoted (lhs);
+		gsi_replace (gsi, copy_stmt, false);
+	      }
+	    else if (!tobe_promoted_p (lhs)
+		     || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		     || (TYPE_UNSIGNED (TREE_TYPE (use)) != TYPE_UNSIGNED (TREE_TYPE (lhs))))
+	      {
+		tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+		gimple *copy_stmt =
+		  zero_sign_extend_stmt (temp, use,
+					 TYPE_UNSIGNED (old_type),
+					 TYPE_PRECISION (old_type));
+		gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+		SET_USE (op, temp);
+		update_stmt (stmt);
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_COND:
+      {
+	/* In GIMPLE_COND, the value in USE has to be zero/sign
+	   extended based on the original type to get the
+	   correct result.  */
+	tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use));
+	gimple *copy_stmt =
+	  zero_sign_extend_stmt (temp, use,
+				 TYPE_UNSIGNED (old_type),
+				 TYPE_PRECISION (old_type));
+	gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	SET_USE (op, temp);
+	promote_cst_in_stmt (stmt, promoted_type);
+	update_stmt (stmt);
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create an
+	 original type copy.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
+					       use, NULL_TREE);
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (phi, &gsi, op, use);
+	}
+
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	continue;
+
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  if (TREE_CODE (use) == SSA_NAME
+	      && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
+	    promote_ssa (use, &gsi);
+	  fixup_use (stmt, &gsi, op, use);
+	}
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+    }
+}
+
+/* Promote uses in GIMPLE_DEBUG stmts.  Do this separately to avoid generating
+   a different sequence with and without -g.  This can happen when promoting
+   SSA names that are defined with GIMPLE_NOP.  */
+static void
+promote_debug_stmts ()
+{
+  basic_block bb;
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree use;
+  use_operand_p op;
+
+  FOR_EACH_BB_FN (bb, cfun)
+    for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple *stmt = gsi_stmt (gsi);
+	if (!is_gimple_debug (stmt))
+	  continue;
+	FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	  {
+	    use = USE_FROM_PTR (op);
+	    fixup_use (stmt, &gsi, op, use);
+	  }
+      }
+}
+
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  promote_debug_stmts ();
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 36d2b3b..78c463a 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -272,6 +272,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc);
       NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index b429faf..a8d40c3 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 333b5a7..449dd19 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


[-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --]
[-- Type: text/x-diff, Size: 5067 bytes --]

From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:51:42 +1100
Subject: [PATCH 1/5] Add new SEXT_EXPR tree code

---
 gcc/cfgexpand.c         | 12 ++++++++++++
 gcc/expr.c              | 20 ++++++++++++++++++++
 gcc/fold-const.c        |  4 ++++
 gcc/tree-cfg.c          | 12 ++++++++++++
 gcc/tree-inline.c       |  1 +
 gcc/tree-pretty-print.c | 11 +++++++++++
 gcc/tree.def            |  5 +++++
 7 files changed, 65 insertions(+)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index eaad859..aeb64bb 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp)
     case FMA_EXPR:
       return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2);
 
+    case SEXT_EXPR:
+      gcc_assert (CONST_INT_P (op1));
+      inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0);
+      gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1));
+
+      if (mode != inner_mode)
+	op0 = simplify_gen_unary (SIGN_EXTEND,
+				  mode,
+				  gen_lowpart_SUBREG (inner_mode, op0),
+				  inner_mode);
+      return op0;
+
     default:
     flag_unsupported:
 #ifdef ENABLE_CHECKING
diff --git a/gcc/expr.c b/gcc/expr.c
index da68870..c2f535f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case SEXT_EXPR:
+	{
+	  machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1),
+						   MODE_INT, 0);
+	  rtx temp, result;
+	  rtx op0 = expand_normal (treeop0);
+	  op0 = force_reg (mode, op0);
+	  if (mode != inner_mode)
+	    {
+	      result = gen_reg_rtx (mode);
+	      temp = simplify_gen_unary (SIGN_EXTEND, mode,
+					 gen_lowpart_SUBREG (inner_mode, op0),
+					 inner_mode);
+	      convert_move (result, temp, 0);
+	    }
+	  else
+	    result = op0;
+	  return result;
+	}
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
       }
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It sign-extends the first operand from
+   the sign bit specified by the second operand.  The type of the
+   result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-14  1:15                     ` Kugan
@ 2015-11-18 14:04                       ` Richard Biener
  2015-11-18 15:06                         ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-11-18 14:04 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Sat, Nov 14, 2015 at 2:15 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
> Attached is the latest version of the patch. With the patches
> 0001-Add-new-SEXT_EXPR-tree-code.patch,
> 0002-Add-type-promotion-pass.patch and
> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>
> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
> issues in ppc64-linux-gnu regression testing. There are some other test
> cases which needs adjustment for scanning for some patterns that are not
> valid now.
>
> 1. rtl fwprop was going into infinite loop. Works with the following patch:
> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
> index 16c7981..9cf4f43 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
> new_rtx, rtx_insn *def_insn,
>    int old_cost = 0;
>    bool ok;
>
> +  /* Value to be substituted is the same, nothing to do.  */
> +  if (rtx_equal_p (*loc, new_rtx))
> +    return false;
> +
>    update_df_init (def_insn, insn);
>
>    /* forward_propagate_subreg may be operating on an instruction with

Which testcase was this on?

> 2. gcc.dg/torture/ftrapv-1.c fails
> This is because we are checking for the  SImode trapping. With the
> promotion of the operation to wider mode, this is i think expected. I
> think the testcase needs updating.

No, it is not expected.  As said earlier you need to refrain from promoting
integer operations that trap.  You can use ! operation_no_trapping_overflow
for this.
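
E.g. something like the following (untested sketch, using the existing
operation_no_trapping_overflow predicate from tree.h) in the GIMPLE_ASSIGN
case of promote_ssa:

  if (!operation_no_trapping_overflow (TREE_TYPE (def), code))
    do_not_promote = true;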

> 3. gcc.dg/sms-3.c fails
> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
> am looking into it.
>
>
> I also have the following issues based on the previous review (as posted
> in the previous patch). Copying again for the review purpose.
>
> 1.
>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>> on DEFs and fixup_uses on USEs.
>
> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
> there anyway to iterate over this. I have added gcc_assert to make sure
> that promote_ssa is called only once.

  gcc_assert (!ssa_name_info_map->get_or_insert (def));

with --disable-checking this will be compiled away so you need to do
the assert in a separate statement.
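
E.g. (sketch):

  ssa_name_info **existing = ssa_name_info_map->get (def);
  gcc_assert (!existing);

so nothing with side-effects is left inside the gcc_assert.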

> 2.
>> Instead of this you should, in promote_all_stmts, walk over all uses
> doing what
>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>
>> +    case GIMPLE_NOP:
>> +       {
>> +         if (SSA_NAME_VAR (def) == NULL)
>> +           {
>> +             /* Promote def by fixing its type for anonymous def.  */
>> +             TREE_TYPE (def) = promoted_type;
>> +           }
>> +         else
>> +           {
>> +             /* Create a promoted copy of parameters.  */
>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>
>> I think the uninitialized vars are somewhat tricky and it would be best
>> to create a new uninit anonymous SSA name for them.  You can
>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>> btw.
>
> I experimented with get_or_create_default_def. Here  we have to have a
> SSA_NAME_VAR (def) of promoted type.
>
> In the attached patch I am doing the following and seems to work. Does
> this looks OK?
>
> +         }
> +       else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
> +         {
> +           tree var = copy_node (SSA_NAME_VAR (def));
> +           TREE_TYPE (var) = promoted_type;
> +           TREE_TYPE (def) = promoted_type;
> +           SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
> +         }

I believe this will wreck the SSA default-def map so you should do

  set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
  tree var = create_tmp_reg (promoted_type);
  TREE_TYPE (def) = promoted_type;
  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
  set_ssa_default_def (cfun, var, def);

instead.

> I prefer to promote def as otherwise iterating over the uses and
> promoting can look complicated (have to look at all the different types
> of stmts again and do the right thing as It was in the earlier version
> of this before we move to this approach)
>
> 3)
>> you can also transparently handle constants for the cases where promoting
>> is required.  At the moment their handling is interwinded with the def
> promotion
>> code.  That makes the whole thing hard to follow.
>
>
> I have updated the comments with:
>
> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
> +   promote only the constants in conditions part of the COND_EXPR.
> +
> +   We promote the constants when the associated operands are promoted.
> +   This usually means that we promote the constants when we promote the
> +   defining stmnts (as part of promote_ssa). However for COND_EXPR, we
> +   can promote only when we promote the other operand. Therefore, this
> +   is done during fixup_use.  */
>
>
> 4)
> I am handling gimple_debug separately to avoid any code difference with
> and without -g option. I have updated the comments for this.
>
> 5)
> I also noticed that tree-ssa-uninit sometimes gives false positives due
> to the assumptions
> it makes. Is it OK to move this pass before type promotion? I can do the
> testings and post a separate patch with this if this OK.

Hmm, no, this needs more explanation (like a testcase).

> 6)
> I also removed the optimization that prevents some of the redundant
> truncation/extensions from type promotion pass, as it dosent do much as
> of now. I can send a proper follow up patch. Is that OK?

Yeah, that sounds fine.

> I also did a simple test with coremark for the latest patch. I compared
> the code size for coremark for linux-gcc with -Os. Results are as
> reported by the "size" utility. I know this doesn't mean much but can
> give some indication.
>         Base            with pass       Percentage improvement
> ==============================================================
> arm     10476           10372           0.9927453226
> aarch64 9545            9521            0.2514405448
> ppc64   12236           12052           1.5037593985
>
>
> After resolving the above issues, I would like propose that we  commit
> the pass as not enabled by default (even though the patch as it stands
> enabled by default - I am doing it for testing purposes).

Hmm, we don't like to have passes that are not enabled by default with any
optimization level or for any target.  Those tend to bitrot quickly :(

Did you do any performance measurements yet?

Looking over the pass in detail now (again).

Thanks,
Richard.

> Thanks,
> Kugan
>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-18 14:04                       ` Richard Biener
@ 2015-11-18 15:06                         ` Richard Biener
  2015-11-24  2:52                           ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Richard Biener @ 2015-11-18 15:06 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> Attached is the latest version of the patch. With the patches
>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>> 0002-Add-type-promotion-pass.patch and
>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>
>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
>> issues in ppc64-linux-gnu regression testing. There are some other test
>> cases which needs adjustment for scanning for some patterns that are not
>> valid now.
>>
>> 1. rtl fwprop was going into infinite loop. Works with the following patch:
>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>> index 16c7981..9cf4f43 100644
>> --- a/gcc/fwprop.c
>> +++ b/gcc/fwprop.c
>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>> new_rtx, rtx_insn *def_insn,
>>    int old_cost = 0;
>>    bool ok;
>>
>> +  /* Value to be substituted is the same, nothing to do.  */
>> +  if (rtx_equal_p (*loc, new_rtx))
>> +    return false;
>> +
>>    update_df_init (def_insn, insn);
>>
>>    /* forward_propagate_subreg may be operating on an instruction with
>
> Which testcase was this on?
>
>> 2. gcc.dg/torture/ftrapv-1.c fails
>> This is because we are checking for the  SImode trapping. With the
>> promotion of the operation to wider mode, this is i think expected. I
>> think the testcase needs updating.
>
> No, it is not expected.  As said earlier you need to refrain from promoting
> integer operations that trap.  You can use ! operation_no_trapping_overflow
> for this.
>
>> 3. gcc.dg/sms-3.c fails
>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>> am looking into it.
>>
>>
>> I also have the following issues based on the previous review (as posted
>> in the previous patch). Copying again for the review purpose.
>>
>> 1.
>>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>>> on DEFs and fixup_uses on USEs.
>>
>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
>> there anyway to iterate over this. I have added gcc_assert to make sure
>> that promote_ssa is called only once.
>
>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>
> with --disable-checking this will be compiled away so you need to do
> the assert in a separate statement.
>
>> 2.
>>> Instead of this you should, in promote_all_stmts, walk over all uses
>> doing what
>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>
>>> +    case GIMPLE_NOP:
>>> +       {
>>> +         if (SSA_NAME_VAR (def) == NULL)
>>> +           {
>>> +             /* Promote def by fixing its type for anonymous def.  */
>>> +             TREE_TYPE (def) = promoted_type;
>>> +           }
>>> +         else
>>> +           {
>>> +             /* Create a promoted copy of parameters.  */
>>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>
>>> I think the uninitialized vars are somewhat tricky and it would be best
>>> to create a new uninit anonymous SSA name for them.  You can
>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>> btw.
>>
>> I experimented with get_or_create_default_def. Here  we have to have a
>> SSA_NAME_VAR (def) of promoted type.
>>
>> In the attached patch I am doing the following and seems to work. Does
>> this looks OK?
>>
>> +         }
>> +       else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>> +         {
>> +           tree var = copy_node (SSA_NAME_VAR (def));
>> +           TREE_TYPE (var) = promoted_type;
>> +           TREE_TYPE (def) = promoted_type;
>> +           SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>> +         }
>
> I believe this will wreck the SSA default-def map so you should do
>
>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>   tree var = create_tmp_reg (promoted_type);
>   TREE_TYPE (def) = promoted_type;
>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>   set_ssa_default_def (cfun, var, def);
>
> instead.
>
>> I prefer to promote def as otherwise iterating over the uses and
>> promoting can look complicated (have to look at all the different types
>> of stmts again and do the right thing as It was in the earlier version
>> of this before we move to this approach)
>>
>> 3)
>>> you can also transparently handle constants for the cases where promoting
>>> is required.  At the moment their handling is interwinded with the def
>> promotion
>>> code.  That makes the whole thing hard to follow.
>>
>>
>> I have updated the comments with:
>>
>> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
>> +   promote only the constants in conditions part of the COND_EXPR.
>> +
>> +   We promote the constants when the associated operands are promoted.
>> +   This usually means that we promote the constants when we promote the
>> +   defining stmnts (as part of promote_ssa). However for COND_EXPR, we
>> +   can promote only when we promote the other operand. Therefore, this
>> +   is done during fixup_use.  */
>>
>>
>> 4)
>> I am handling gimple_debug separately to avoid any code difference with
>> and without -g option. I have updated the comments for this.
>>
>> 5)
>> I also noticed that tree-ssa-uninit sometimes gives false positives due
>> to the assumptions
>> it makes. Is it OK to move this pass before type promotion? I can do the
>> testings and post a separate patch with this if this OK.
>
> Hmm, no, this needs more explanation (like a testcase).
>
>> 6)
>> I also removed the optimization that prevents some of the redundant
>> truncation/extensions from type promotion pass, as it dosent do much as
>> of now. I can send a proper follow up patch. Is that OK?
>
> Yeah, that sounds fine.
>
>> I also did a simple test with coremark for the latest patch. I compared
>> the code size for coremark for linux-gcc with -Os. Results are as
>> reported by the "size" utility. I know this doesn't mean much but can
>> give some indication.
>>         Base            with pass       Percentage improvement
>> ==============================================================
>> arm     10476           10372           0.9927453226
>> aarch64 9545            9521            0.2514405448
>> ppc64   12236           12052           1.5037593985
>>
>>
>> After resolving the above issues, I would like propose that we  commit
>> the pass as not enabled by default (even though the patch as it stands
>> enabled by default - I am doing it for testing purposes).
>
> Hmm, we don't like to have passes that are not enabled by default with any
> optimization level or for any target.  Those tend to bitrot quickly :(
>
> Did you do any performance measurements yet?
>
> Looking over the pass in detail now (again).

Ok, so still looking at the basic operation scheme.

      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
        {
          use = USE_FROM_PTR (op);
          if (TREE_CODE (use) == SSA_NAME
              && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
            promote_ssa (use, &gsi);
          fixup_use (stmt, &gsi, op, use);
        }

      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
        promote_ssa (def, &gsi);

the GIMPLE_NOP handling in promote_ssa when processing uses looks
backwards.  As those are implicitly defined in the entry block you may as
well just iterate over all default defs before the dominator walk, like so

  unsigned n = num_ssa_names;
  for (i = 1; i < n; ++i)
    {
      tree name = ssa_name (i);
      if (name
          && SSA_NAME_IS_DEFAULT_DEF (name)
          && ! has_zero_uses (name))
       promote_default_def (name);
    }

I see promote_cst_in_stmt in both promote_ssa and fixup_use.  Logically
it belongs to use processing, but on a stmt granularity.  Thus between
iterating over all uses and iteration over all defs call promote_cst_in_stmt
on all stmts.  It's a bit awkward as it expects to be called from context
that knows whether promotion is necessary or not.

/* Create an ssa with TYPE to copy ssa VAR.  */
static tree
make_promoted_copy (tree var, gimple *def_stmt, tree type)
{
  tree new_lhs = make_ssa_name (type, def_stmt);
  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
    SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
  return new_lhs;
}

as you are generating a copy statement I don't see why you need to copy
SSA_NAME_OCCURS_IN_ABNORMAL_PHI (in no case new_lhs will
be used in a PHI node directly AFAICS).  Merging make_promoted_copy
and the usually following extension stmt generation plus insertion into
a single helper would make that obvious.
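
Something like (untested, name made up):

static tree
insert_promoted_copy (gimple_stmt_iterator *gsi, tree use, tree old_type)
{
  tree temp = make_ssa_name (TREE_TYPE (use));
  gimple *stmt = zero_sign_extend_stmt (temp, use,
					TYPE_UNSIGNED (old_type),
					TYPE_PRECISION (old_type));
  gsi_insert_before (gsi, stmt, GSI_NEW_STMT);
  return temp;
}

covering the common make_promoted_copy + zero_sign_extend_stmt +
gsi_insert_before pattern in fixup_use.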

static unsigned int
fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
           use_operand_p op, tree use)
{
  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
  /* If USE is not promoted, nothing to do.  */
  if (!info)
    return 0;

You should use ->get (), not ->get_or_insert here.

      gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
                                               use, NULL_TREE);

you can avoid the trailing NULL_TREE here.

        gimple *copy_stmt =
          zero_sign_extend_stmt (temp, use,
                                 TYPE_UNSIGNED (old_type),
                                 TYPE_PRECISION (old_type));

coding style says the '=' goes to the next line, thus

    gimple *copy_stmt
       = zero_sign_extend_stmt ...

/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
   Assign the zero/sign extended value in NEW_VAR.  gimple statement
   that performs the zero/sign extension is returned.  */
static gimple *
zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
{

looks like instead of unsigned_p/width you can pass in a type.

    /* Sign extend.  */
    stmt = gimple_build_assign (new_var,
                                SEXT_EXPR,
                                var, build_int_cst (TREE_TYPE (var), width));

use size_int (width) instead.

/* Convert constant CST to TYPE.  */
static tree
convert_int_cst (tree type, tree cst, signop sign = SIGNED)

no need for a default argument

{
  wide_int wi_cons = fold_convert (type, cst);
  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
  return wide_int_to_tree (type, wi_cons);
}

I wonder why this function is needed at all and you don't just call
fold_convert (type, cst)?

/* Return true if the tree CODE needs the propmoted operand to be
   truncated (when stray bits are set beyond the original type in
   promoted mode) to preserve the semantics.  */
static bool
truncate_use_p (enum tree_code code)
{

a conservatively correct predicate would implement the inversion,
not_truncated_use_p, because if you miss any tree code the result will be
an unnecessary truncation rather than a missed one.
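
I.e. something like (the exact set of codes needs to be filled in carefully):

static bool
not_truncated_use_p (enum tree_code code)
{
  /* Codes whose low-order result bits do not depend on stray high
     bits in a promoted operand.  */
  switch (code)
    {
    case PLUS_EXPR:
    case MINUS_EXPR:
    case MULT_EXPR:
    case BIT_AND_EXPR:
    case BIT_IOR_EXPR:
    case BIT_XOR_EXPR:
      return true;
    default:
      return false;
    }
}

so any code you forget gets an unnecessary truncation rather than a wrong result.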

static bool
type_precision_ok (tree type)
{
  return (TYPE_PRECISION (type)
          == GET_MODE_PRECISION (TYPE_MODE (type)));
}

/* Return the promoted type for TYPE.  */
static tree
get_promoted_type (tree type)
{
  tree promoted_type;
  enum machine_mode mode;
  int uns;

  if (POINTER_TYPE_P (type)
      || !INTEGRAL_TYPE_P (type)
      || !type_precision_ok (type))

the type_precision_ok check is because SEXT doesn't work
properly for bitfield types?  I think we want to promote those
to their mode precision anyway.  We just need to use
sth different than SEXT here (the bitwise-and works of course)
or expand SEXT from non-mode precision differently (see
expr.c REDUCE_BIT_FIELD which expands it as a
lshift/rshift combo).  Eventually this can be left for a followup
though it might get you some extra testing coverage on
non-promote-mode targets.
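
For reference, the shift based variant at the GIMPLE level would roughly be
(only valid when the promoted type is signed so the right shift is
arithmetic):

  int shift = TYPE_PRECISION (promoted_type) - width;
  tree tmp = make_ssa_name (promoted_type);
  gimple *g1 = gimple_build_assign (tmp, LSHIFT_EXPR, var, size_int (shift));
  gimple *g2 = gimple_build_assign (new_var, RSHIFT_EXPR, tmp, size_int (shift));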

/* Return true if ssa NAME is already considered for promotion.  */
static bool
ssa_promoted_p (tree name)
{
  if (TREE_CODE (name) == SSA_NAME)
    {
      unsigned int index = SSA_NAME_VERSION (name);
      if (index < n_ssa_val)
        return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
    }
  return true;

better than this default, assert that you pass in an SSA name.

isn't the bitmap somewhat redundant with the hash-map?
And you could combine both by using a vec<ssa_name_info *> indexed
by SSA_NAME_VERSION ()?
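
E.g. (sketch):

  static vec<ssa_name_info *> ssa_name_info_vec;
  ...
  ssa_name_info_vec.safe_grow_cleared (num_ssa_names);
  ...
  ssa_name_info_vec[SSA_NAME_VERSION (def)] = info;

which gives both the "already promoted" test (non-NULL entry) and the
info lookup from a single data structure.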

         if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
               == tcc_comparison)
              || truncate_use_p (gimple_assign_rhs_code (stmt)))

you always check for tcc_comparison when checking for truncate_use_p
so just handle it there (well, as said above, implement conservative
predicates).

  switch (gimple_code (stmt))
    {
    case GIMPLE_ASSIGN:
      if (promote_cond
          && gimple_assign_rhs_code (stmt) == COND_EXPR)
        {

looking at all callers this condition is never true.

          tree new_op = build2 (TREE_CODE (op), type, op0, op1);

as tcc_comparison class trees are not shareable you don't
need to build2 but can directly set TREE_OPERAND (op, ..) to the
promoted value.  Note that rhs1 may still just be an SSA name
and not a comparison.
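
I.e. (sketch):

  if (COMPARISON_CLASS_P (op))
    {
      TREE_OPERAND (op, 0) = op0;
      TREE_OPERAND (op, 1) = op1;
    }

and leave rhs1 alone when it is just an SSA name.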

    case GIMPLE_PHI:
        {
          /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
          gphi *phi = as_a <gphi *> (stmt);
          FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
            {
              op = USE_FROM_PTR (oprnd);
              index = PHI_ARG_INDEX_FROM_USE (oprnd);
              if (TREE_CODE (op) == INTEGER_CST)
                SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
            }

static unsigned int
fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
           use_operand_p op, tree use)
{
  ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
  /* If USE is not promoted, nothing to do.  */
  if (!info)
    return 0;

  tree promoted_type = info->promoted_type;
  tree old_type = info->type;
  bool do_not_promote = false;

  switch (gimple_code (stmt))
    {
 ....
    default:
      break;
    }

do_not_promote = false is not conservative.  Please place a
gcc_unreachable () in the default case.

I see you handle debug stmts here but that case cannot be reached.

/* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating
   different sequence with and without -g.  This can  happen when promoting
   SSA that are defined with GIMPLE_NOP.  */

but that's only because you choose to unconditionally handle GIMPLE_NOP uses...

Richard.


> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>
>>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-18 15:06                         ` Richard Biener
@ 2015-11-24  2:52                           ` Kugan
  2015-12-10  0:27                             ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-11-24  2:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 17718 bytes --]

Hi Richard,

Thanks for your comments. I am attaching an updated patch with details
below.

On 19/11/15 02:06, Richard Biener wrote:
> On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Sat, Nov 14, 2015 at 2:15 AM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Attached is the latest version of the patch. With the patches
>>> 0001-Add-new-SEXT_EXPR-tree-code.patch,
>>> 0002-Add-type-promotion-pass.patch and
>>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch.
>>>
>>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and
>>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu,
>>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three
>>> issues in ppc64-linux-gnu regression testing. There are some other test
>>> cases which needs adjustment for scanning for some patterns that are not
>>> valid now.
>>>
>>> 1. rtl fwprop was going into infinite loop. Works with the following patch:
>>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
>>> index 16c7981..9cf4f43 100644
>>> --- a/gcc/fwprop.c
>>> +++ b/gcc/fwprop.c
>>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx
>>> new_rtx, rtx_insn *def_insn,
>>>    int old_cost = 0;
>>>    bool ok;
>>>
>>> +  /* Value to be substituted is the same, nothing to do.  */
>>> +  if (rtx_equal_p (*loc, new_rtx))
>>> +    return false;
>>> +
>>>    update_df_init (def_insn, insn);
>>>
>>>    /* forward_propagate_subreg may be operating on an instruction with
>>
>> Which testcase was this on?

After rebasing on trunk, I cannot reproduce it anymore.

>>
>>> 2. gcc.dg/torture/ftrapv-1.c fails
>>> This is because we are checking for the  SImode trapping. With the
>>> promotion of the operation to wider mode, this is i think expected. I
>>> think the testcase needs updating.
>>
>> No, it is not expected.  As said earlier you need to refrain from promoting
>> integer operations that trap.  You can use ! operation_no_trapping_overflow
>> for this.
>>

I have changed this.

>>> 3. gcc.dg/sms-3.c fails
>>> It fails with  -fmodulo-sched-allow-regmoves  and OK when I remove it. I
>>> am looking into it.
>>>
>>>
>>> I also have the following issues based on the previous review (as posted
>>> in the previous patch). Copying again for the review purpose.
>>>
>>> 1.
>>>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks
>>>> at SSA_NAME_DEF_STMT of the passed arg.  Please call promote_ssa just
>>>> on DEFs and fixup_uses on USEs.
>>>
>>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is
>>> there anyway to iterate over this. I have added gcc_assert to make sure
>>> that promote_ssa is called only once.
>>
>>   gcc_assert (!ssa_name_info_map->get_or_insert (def));
>>
>> with --disable-checking this will be compiled away so you need to do
>> the assert in a separate statement.
>>
>>> 2.
>>>> Instead of this you should, in promote_all_stmts, walk over all uses
>>> doing what
>>>> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>>>>
>>>> +    case GIMPLE_NOP:
>>>> +       {
>>>> +         if (SSA_NAME_VAR (def) == NULL)
>>>> +           {
>>>> +             /* Promote def by fixing its type for anonymous def.  */
>>>> +             TREE_TYPE (def) = promoted_type;
>>>> +           }
>>>> +         else
>>>> +           {
>>>> +             /* Create a promoted copy of parameters.  */
>>>> +             bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>>>>
>>>> I think the uninitialized vars are somewhat tricky and it would be best
>>>> to create a new uninit anonymous SSA name for them.  You can
>>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter
>>>> btw.
>>>
>>> I experimented with get_or_create_default_def. Here  we have to have a
>>> SSA_NAME_VAR (def) of promoted type.
>>>
>>> In the attached patch I am doing the following and seems to work. Does
>>> this looks OK?
>>>
>>> +         }
>>> +       else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
>>> +         {
>>> +           tree var = copy_node (SSA_NAME_VAR (def));
>>> +           TREE_TYPE (var) = promoted_type;
>>> +           TREE_TYPE (def) = promoted_type;
>>> +           SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>> +         }
>>
>> I believe this will wreck the SSA default-def map so you should do
>>
>>   set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
>>   tree var = create_tmp_reg (promoted_type);
>>   TREE_TYPE (def) = promoted_type;
>>   SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
>>   set_ssa_default_def (cfun, var, def);
>>
>> instead.
I have changed this.

>>
>>> I prefer to promote def as otherwise iterating over the uses and
>>> promoting can look complicated (have to look at all the different types
>>> of stmts again and do the right thing as It was in the earlier version
>>> of this before we move to this approach)
>>>
>>> 3)
>>>> you can also transparently handle constants for the cases where promoting
>>>> is required.  At the moment their handling is interwinded with the def
>>> promotion
>>>> code.  That makes the whole thing hard to follow.
>>>
>>>
>>> I have updated the comments with:
>>>
>>> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
>>> +   promote only the constants in conditions part of the COND_EXPR.
>>> +
>>> +   We promote the constants when the associated operands are promoted.
>>> +   This usually means that we promote the constants when we promote the
>>> +   defining stmnts (as part of promote_ssa). However for COND_EXPR, we
>>> +   can promote only when we promote the other operand. Therefore, this
>>> +   is done during fixup_use.  */
>>>
>>>
>>> 4)
>>> I am handling gimple_debug separately to avoid any code difference with
>>> and without -g option. I have updated the comments for this.
>>>
>>> 5)
>>> I also noticed that tree-ssa-uninit sometimes gives false positives due
>>> to the assumptions
>>> it makes. Is it OK to move this pass before type promotion? I can do the
>>> testings and post a separate patch with this if this OK.
>>
>> Hmm, no, this needs more explanation (like a testcase).
There are a few issues I ran into. I will send a list with more info. For
example:

/* Test we do not warn about initializing variable with self. */
/* { dg-do compile } */
/* { dg-options "-O -Wuninitialized" } */

int f()
{
  int i = i;
  return i;
}


I now get:
kugan@kugan-desktop:~$
/home/kugan/work/builds/gcc-fsf-linaro/tools/bin/ppc64-none-linux-gnu-gcc -O
-Wuninitialized
/home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c
-fdump-tree-all
/home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c: In
function ‘f’:
/home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c:8:10:
warning: ‘i’ is used uninitialized in this function [-Wuninitialized]
   return i;


diff -u uninit-D.c.146t.veclower21  uninit-D.c.147t.promotion is:

--- uninit-D.c.146t.veclower21	2015-11-24 11:30:04.374203197 +1100
+++ uninit-D.c.147t.promotion	2015-11-24 11:30:04.374203197 +1100
@@ -1,13 +1,16 @@

 ;; Function f (f, funcdef_no=0, decl_uid=2271, cgraph_uid=0,
symbol_order=0)

 f ()
 {
+  signed long i;
   int i;
+  int _3;

   <bb 2>:
-  return i_1(D);
+  _3 = (int) i_1(D);
+  return _3;

 }



>>
>>> 6)
>>> I also removed the optimization that prevents some of the redundant
>>> truncation/extensions from type promotion pass, as it dosent do much as
>>> of now. I can send a proper follow up patch. Is that OK?
>>
>> Yeah, that sounds fine.
>>
>>> I also did a simple test with coremark for the latest patch. I compared
>>> the code size for coremark for linux-gcc with -Os. Results are as
>>> reported by the "size" utility. I know this doesn't mean much but can
>>> give some indication.
>>>         Base            with pass       Percentage improvement
>>> ==============================================================
>>> arm     10476           10372           0.9927453226
>>> aarch64 9545            9521            0.2514405448
>>> ppc64   12236           12052           1.5037593985
>>>
>>>
>>> After resolving the above issues, I would like propose that we  commit
>>> the pass as not enabled by default (even though the patch as it stands
>>> enabled by default - I am doing it for testing purposes).
>>
>> Hmm, we don't like to have passes that are not enabled by default with any
>> optimization level or for any target.  Those tend to bitrot quickly :(
>>
>> Did you do any performance measurements yet?

Ok, I understand. I did performance testing on AArch64 and saw some good
improvement for the earlier version. I will do it again for more targets
after getting it reviewed.

>>
>> Looking over the pass in detail now (again).
> 
> Ok, so still looking at the basic operation scheme.
> 
>       FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
>         {
>           use = USE_FROM_PTR (op);
>           if (TREE_CODE (use) == SSA_NAME
>               && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP)
>             promote_ssa (use, &gsi);
>           fixup_use (stmt, &gsi, op, use);
>         }
> 
>       FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
>         promote_ssa (def, &gsi);
> 
> the GIMPLE_NOP handling in promote_ssa but when processing uses looks
> backwards.  As those are implicitely defined in the entry block you may
> better just iterate over all default defs before the dominator walk like so
> 
>   unsigned n = num_ssa_names;
>   for (i = 1; i < n; ++i)
>     {
>       tree name = ssa_name (i);
>       if (name
>           && SSA_NAME_IS_DEFAULT_DEF
>           && ! has_zero_uses (name))
>        promote_default_def (name);
>     }
> 

I have changed this.

> I see promote_cst_in_stmt in both promote_ssa and fixup_use.  Logically
> it belongs to use processing, but on a stmt granularity.  Thus between
> iterating over all uses and iteration over all defs call promote_cst_in_stmt
> on all stmts.  It's a bit awkward as it expects to be called from context
> that knows whether promotion is necessary or not.
> 
> /* Create an ssa with TYPE to copy ssa VAR.  */
> static tree
> make_promoted_copy (tree var, gimple *def_stmt, tree type)
> {
>   tree new_lhs = make_ssa_name (type, def_stmt);
>   if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var))
>     SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1;
>   return new_lhs;
> }
> 
> as you are generating a copy statement I don't see why you need to copy
> SSA_NAME_OCCURS_IN_ABNORMAL_PHI (in no case new_lhs will
> be used in a PHI node directly AFAICS).  Merging make_promoted_copy
> and the usually following extension stmt generation plus insertion into
> a single helper would make that obvious.
> 

I have changed this.

> static unsigned int
> fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
>            use_operand_p op, tree use)
> {
>   ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
>   /* If USE is not promoted, nothing to do.  */
>   if (!info)
>     return 0;
> 
> You should use ->get (), not ->get_or_insert here.
> 
>       gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR,
>                                                use, NULL_TREE);
> 

Changed this.

> you can avoid the trailing NULL_TREE here.
> 
>         gimple *copy_stmt =
>           zero_sign_extend_stmt (temp, use,
>                                  TYPE_UNSIGNED (old_type),
>                                  TYPE_PRECISION (old_type));
> 
> coding style says the '=' goes to the next line, thus
> 
>     gimple *copy_stmt
>        = zero_sign_extend_stmt ...


Changed this.
> 
> /* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits.
>    Assign the zero/sign extended value in NEW_VAR.  gimple statement
>    that performs the zero/sign extension is returned.  */
> static gimple *
> zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width)
> {
> 
> looks like instead of unsigned_p/width you can pass in a type.
> 
>     /* Sign extend.  */
>     stmt = gimple_build_assign (new_var,
>                                 SEXT_EXPR,
>                                 var, build_int_cst (TREE_TYPE (var), width));
> 
> use size_int (width) instead.
> 
> /* Convert constant CST to TYPE.  */
> static tree
> convert_int_cst (tree type, tree cst, signop sign = SIGNED)
> 
> no need for a default argument
> 
> {
>   wide_int wi_cons = fold_convert (type, cst);
>   wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign);
>   return wide_int_to_tree (type, wi_cons);
> }


For some of the operations, sign extended constants are created. For
example:

short unPack( unsigned char c )
{
    /* Only want lower four bit nibble */
    c = c & (unsigned char)0x0F ;

    if( c > 7 ) {
        /* Negative nibble */
        return( ( short )( c - 5 ) ) ;

    }
    else
    {
        /* positive nibble */
        return( ( short )c ) ;
    }
}


The "- 5" above becomes "+ (-5)". Therefore, sign-extending the constant
during promotion (even though the type is unsigned) results in better code.
There is no correctness issue.
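
To make that concrete, here is a minimal stand-alone sketch (plain C,
assuming 8-bit char and 32-bit int; this only illustrates the constant
handling, it is not the patch's wide_int code):

#include <stdio.h>

int
main (void)
{
  unsigned char c = 9;               /* a nibble > 7 from the example above */
  unsigned char narrow = c + 0xFB;   /* c - 5 is folded to c + 251 in 8 bits */
  int promoted = (int) c + (int) (signed char) 0xFB;  /* sign-extended: c + (-5) */
  printf ("%d %d\n", narrow, promoted);   /* both print 4 */
  return 0;
}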

I have now changed it based on your suggestions. Does this look better?


> 
> I wonder why this function is needed at all and you don't just call
> fold_convert (type, cst)?
> 
> /* Return true if the tree CODE needs the promoted operand to be
>    truncated (when stray bits are set beyond the original type in
>    promoted mode) to preserve the semantics.  */
> static bool
> truncate_use_p (enum tree_code code)
> {
> 
> a conservatively correct predicate would implement the inversion,
> not_truncated_use_p, because if you miss any tree code the
> result will be unnecessary truncations rather than missed ones.
> 

Changed it.

> static bool
> type_precision_ok (tree type)
> {
>   return (TYPE_PRECISION (type)
>           == GET_MODE_PRECISION (TYPE_MODE (type)));
> }
> 
> /* Return the promoted type for TYPE.  */
> static tree
> get_promoted_type (tree type)
> {
>   tree promoted_type;
>   enum machine_mode mode;
>   int uns;
> 
>   if (POINTER_TYPE_P (type)
>       || !INTEGRAL_TYPE_P (type)
>       || !type_precision_ok (type))
> 
> the type_precision_ok check is because SEXT doesn't work
> properly for bitfield types?  I think we want to promote those
> to their mode precision anyway.  We just need to use
> sth different than SEXT here (the bitwise-and works of course)
> or expand SEXT from non-mode precision differently (see
> expr.c REDUCE_BIT_FIELD which expands it as a
> lshift/rshift combo).  Eventually this can be left for a followup
> though it might get you some extra testing coverage on
> non-promote-mode targets.

I will have a look at it.
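
As I understand it, the lshift/rshift combo amounts to something like the
following (a minimal C sketch, assuming a 32-bit value and GCC's arithmetic
right shift of signed types; this only illustrates the idea, it is not the
expander code):

#include <stdint.h>

/* Sign-extend the low WIDTH bits of X using a left shift followed by an
   arithmetic right shift.  This works even when WIDTH is not the
   precision of any machine mode.  */
static inline int32_t
sext_via_shifts (uint32_t x, int width)
{
  int shift = 32 - width;
  return (int32_t) (x << shift) >> shift;
}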

> 
> /* Return true if ssa NAME is already considered for promotion.  */
> static bool
> ssa_promoted_p (tree name)
> {
>   if (TREE_CODE (name) == SSA_NAME)
>     {
>       unsigned int index = SSA_NAME_VERSION (name);
>       if (index < n_ssa_val)
>         return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
>     }
>   return true;
> 
> better than this default, assert that you pass in an SSA name.

Changed it.

> 
> isn't the bitmap somewhat redundant with the hash-map?
> And you could combine both by using a vec<ssa_name_info *> indexed
> by SSA_NAME_VERSION ()?
> 
>          if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
>                == tcc_comparison)
>               || truncate_use_p (gimple_assign_rhs_code (stmt)))
> 
> you always check for tcc_comparison when checking for truncate_use_p
> so just handle it there (well, as said above, implement conservative
> predicates).
> 
>   switch (gimple_code (stmt))
>     {
>     case GIMPLE_ASSIGN:
>       if (promote_cond
>           && gimple_assign_rhs_code (stmt) == COND_EXPR)
>         {
> 
> looking at all callers this condition is never true.
> 
>           tree new_op = build2 (TREE_CODE (op), type, op0, op1);
> 
> as tcc_comparison class trees are not shareable you don't
> need to build2 but can directly set TREE_OPERAND (op, ..) to the
> promoted value.  Note that rhs1 may still just be an SSA name
> and not a comparison.

Changed this.

> 
>     case GIMPLE_PHI:
>         {
>           /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
>           gphi *phi = as_a <gphi *> (stmt);
>           FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
>             {
>               op = USE_FROM_PTR (oprnd);
>               index = PHI_ARG_INDEX_FROM_USE (oprnd);
>               if (TREE_CODE (op) == INTEGER_CST)
>                 SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign));
>             }
> 
> static unsigned int
> fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
>            use_operand_p op, tree use)
> {
>   ssa_name_info *info = ssa_name_info_map->get_or_insert (use);
>   /* If USE is not promoted, nothing to do.  */
>   if (!info)
>     return 0;
> 
>   tree promoted_type = info->promoted_type;
>   tree old_type = info->type;
>   bool do_not_promote = false;
> 
>   switch (gimple_code (stmt))
>     {
>  ....
>     default:
>       break;
>     }
> 
> do_not_promote = false is not conservative.  Please place a
> gcc_unreachable () in the default case.

We will have valid statements (which are not handled in the switch) for
which we don't have to do any fix-ups.

> 
> I see you handle debug stmts here but that case cannot be reached.
> 
> /* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating
>    different sequences with and without -g.  This can happen when promoting
>    SSA names that are defined with GIMPLE_NOP.  */
> 
> but that's only because you choose to unconditionally handle GIMPLE_NOP uses...

I have removed this.

Thanks,
Kugan

> 
> Richard.
> 
> 
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Kugan
>>>
>>>

[-- Attachment #2: 0002-Add-type-promotion-pass.patch --]
[-- Type: text/x-patch, Size: 31362 bytes --]

From 89f526ea6f7878879fa65a2b869cac4c21dc7df0 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Fri, 20 Nov 2015 14:14:52 +1100
Subject: [PATCH 2/3] Add type promotion pass

---
 gcc/Makefile.in               |   1 +
 gcc/auto-profile.c            |   2 +-
 gcc/common.opt                |   4 +
 gcc/doc/invoke.texi           |  10 +
 gcc/gimple-ssa-type-promote.c | 849 ++++++++++++++++++++++++++++++++++++++++++
 gcc/passes.def                |   1 +
 gcc/timevar.def               |   1 +
 gcc/tree-pass.h               |   1 +
 libiberty/cp-demangle.c       |   2 +-
 9 files changed, 869 insertions(+), 2 deletions(-)
 create mode 100644 gcc/gimple-ssa-type-promote.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0fd8d99..4e1444c 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1512,6 +1512,7 @@ OBJS = \
 	tree-vect-slp.o \
 	tree-vectorizer.o \
 	tree-vrp.o \
+	gimple-ssa-type-promote.o \
 	tree.o \
 	valtrack.o \
 	value-prof.o \
diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index c7aab42..f214331 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1257,7 +1257,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge)
     FOR_EACH_EDGE (e, ei, bb->succs)
     {
       unsigned i, total = 0;
-      edge only_one;
+      edge only_one = NULL;
       bool check_value_one = (((integer_onep (cmp_rhs))
                                ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR))
                               ^ ((e->flags & EDGE_TRUE_VALUE) != 0));
diff --git a/gcc/common.opt b/gcc/common.opt
index 3eb520e..582e8ee 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2407,6 +2407,10 @@ fsplit-paths
 Common Report Var(flag_split_paths) Init(0) Optimization
 Split paths leading to loop backedges.
 
+ftree-type-promote
+Common Report Var(flag_tree_type_promote) Init(1) Optimization
+Perform type promotion on trees.
+
 funit-at-a-time
 Common Report Var(flag_unit_at_a_time) Init(1)
 Compile whole compilation unit at a time.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7cef176..21f94a6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9142,6 +9142,16 @@ Split paths leading to loop backedges.  This can improve dead code
 elimination and common subexpression elimination.  This is enabled by
 default at @option{-O2} and above.
 
+@item -ftree-type-promote
+@opindex ftree-type-promote
+This pass applies type promotion to SSA names in the function and
+inserts appropriate truncations to preserve the semantics.  The idea of
+this pass is to promote operations in such a way that we can minimise
+the generation of subregs in RTL, which in turn results in the removal
+of redundant zero/sign extensions.
+
+This optimization is enabled by default.
+
 @item -fsplit-ivs-in-unroller
 @opindex fsplit-ivs-in-unroller
 Enables expression of values of induction variables in later iterations
diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
new file mode 100644
index 0000000..5993e89
--- /dev/null
+++ b/gcc/gimple-ssa-type-promote.c
@@ -0,0 +1,849 @@
+/* Type promotion of SSA names to minimise redundant zero/sign extension.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "stor-layout.h"
+#include "predict.h"
+#include "function.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-pass.h"
+#include "gimple-pretty-print.h"
+#include "langhooks.h"
+#include "sbitmap.h"
+#include "domwalk.h"
+#include "tree-dfa.h"
+
+/* This pass applies type promotion to SSA names in the function and
+   inserts appropriate truncations.  The idea of this pass is to promote
+   operations in such a way that we can minimise the generation of subregs
+   in RTL, which in turn results in the removal of redundant zero/sign
+   extensions.  This pass runs prior to VRP and DOM so that they can
+   optimise away redundant truncations and extensions.  This is based on
+   https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html.
+*/
+
+/* Structure to hold the type and promoted type for promoted ssa variables.  */
+struct ssa_name_info
+{
+  tree ssa;		/* Name of the SSA_NAME.  */
+  tree type;		/* Original type of ssa.  */
+  tree promoted_type;	/* Promoted type of ssa.  */
+};
+
+/* Obstack for ssa_name_info.  */
+static struct obstack ssa_name_info_obstack;
+
+static unsigned n_ssa_val;
+static sbitmap ssa_to_be_promoted_bitmap;
+static hash_map <tree, ssa_name_info *>  *ssa_name_info_map;
+
+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type)
+	  == GET_MODE_PRECISION (TYPE_MODE (type)));
+}
+
+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode))
+    return type;
+  promoted_type
+    = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+				      uns);
+  gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode));
+  return promoted_type;
+}
+
+/* Return true if ssa NAME is already considered for promotion.  */
+static bool
+ssa_promoted_p (tree name)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+  unsigned int index = SSA_NAME_VERSION (name);
+  if (index < n_ssa_val)
+    return bitmap_bit_p (ssa_to_be_promoted_bitmap, index);
+  return true;
+}
+
+/* Set ssa NAME to be already considered for promotion.  */
+static void
+set_ssa_promoted (tree name)
+{
+  gcc_assert (TREE_CODE (name) == SSA_NAME);
+  unsigned int index = SSA_NAME_VERSION (name);
+  if (index < n_ssa_val)
+    bitmap_set_bit (ssa_to_be_promoted_bitmap, index);
+}
+
+/* Return true if the tree CODE does not need the promoted operand to be
+   truncated (when stray bits are set beyond the original type in
+   promoted mode) to preserve the semantics.  */
+static bool
+not_truncated_use_p (enum tree_code code)
+{
+  if (TREE_CODE_CLASS (code) == tcc_comparison
+      || code == TRUNC_DIV_EXPR
+      || code == CEIL_DIV_EXPR
+      || code == FLOOR_DIV_EXPR
+      || code == ROUND_DIV_EXPR
+      || code == TRUNC_MOD_EXPR
+      || code == CEIL_MOD_EXPR
+      || code == FLOOR_MOD_EXPR
+      || code == ROUND_MOD_EXPR
+      || code == LSHIFT_EXPR
+      || code == RSHIFT_EXPR
+      || code == MAX_EXPR
+      || code == MIN_EXPR)
+    return false;
+  else
+    return true;
+}
+
+
+/* Return true if LHS will be promoted later.  */
+static bool
+tobe_promoted_p (tree lhs)
+{
+  if (TREE_CODE (lhs) == SSA_NAME
+      && INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      && !VECTOR_TYPE_P (TREE_TYPE (lhs))
+      && !POINTER_TYPE_P (TREE_TYPE (lhs))
+      && !ssa_promoted_p (lhs)
+      && (get_promoted_type (TREE_TYPE (lhs))
+	  != TREE_TYPE (lhs)))
+    return true;
+  else
+    return false;
+}
+
+/* Convert and sign-extend constant CST to TYPE.  */
+static tree
+fold_convert_sext (tree type, tree cst)
+{
+  wide_int wi_cons = fold_convert (type, cst);
+  wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), SIGNED);
+  return wide_int_to_tree (type, wi_cons);
+}
+
+/* Promote constants in STMT to TYPE.  For a COND_EXPR on the RHS of a
+   GIMPLE_ASSIGN, also promote the constants in its condition.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR we
+   can promote only when we promote the other operand.  Therefore, this
+   is done during fixup_use.  */
+
+static void
+promote_cst_in_stmt (gimple *stmt, tree type)
+{
+  tree op;
+  ssa_op_iter iter;
+  use_operand_p oprnd;
+  int index;
+  tree op0, op1;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_ASSIGN:
+      if (gimple_assign_rhs_code (stmt) == COND_EXPR
+	  && TREE_OPERAND_LENGTH (gimple_assign_rhs1 (stmt)) == 2)
+	{
+	  /* Promote INTEGER_CST that are tcc_compare arguments.  */
+	  op = gimple_assign_rhs1 (stmt);
+	  op0 = TREE_OPERAND (op, 0);
+	  op1 = TREE_OPERAND (op, 1);
+	  if (TREE_TYPE (op0) != TREE_TYPE (op1))
+	    {
+	      if (TREE_CODE (op0) == INTEGER_CST)
+		TREE_OPERAND (op, 0) = fold_convert (type, op0);
+	      if (TREE_CODE (op1) == INTEGER_CST)
+		TREE_OPERAND (op, 1) = fold_convert (type, op1);
+	    }
+	}
+      /* Promote INTEGER_CST in GIMPLE_ASSIGN.  */
+      if (not_truncated_use_p (gimple_assign_rhs_code (stmt)))
+	{
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, fold_convert_sext (type, op));
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, fold_convert_sext (type, op));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, fold_convert_sext (type, op));
+	}
+      else
+	{
+	  op = gimple_assign_rhs3 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs3 (stmt, fold_convert (type, op));
+	  op = gimple_assign_rhs1 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs1 (stmt, fold_convert (type, op));
+	  op = gimple_assign_rhs2 (stmt);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_assign_set_rhs2 (stmt, fold_convert (type, op));
+	}
+      break;
+
+    case GIMPLE_PHI:
+	{
+	  /* Promote INTEGER_CST arguments to GIMPLE_PHI.  */
+	  gphi *phi = as_a <gphi *> (stmt);
+	  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
+	    {
+	      op = USE_FROM_PTR (oprnd);
+	      index = PHI_ARG_INDEX_FROM_USE (oprnd);
+	      if (TREE_CODE (op) == INTEGER_CST)
+		SET_PHI_ARG_DEF (phi, index, fold_convert (type, op));
+	    }
+	}
+      break;
+
+    case GIMPLE_COND:
+	{
+	  /* Promote INTEGER_CST that are GIMPLE_COND arguments.  */
+	  gcond *cond = as_a <gcond *> (stmt);
+	  op = gimple_cond_lhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_lhs (cond, fold_convert (type, op));
+
+	  op = gimple_cond_rhs (cond);
+	  if (op && TREE_CODE (op) == INTEGER_CST)
+	    gimple_cond_set_rhs (cond, fold_convert (type, op));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Zero/sign extend VAR and truncate to INNER_TYPE.
+   Assign the zero/sign extended value to NEW_VAR.  The gimple statement
+   that performs the zero/sign extension is returned.  */
+
+static gimple *
+zero_sign_extend_stmt (tree new_var, tree var, tree inner_type)
+{
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var))
+	      == TYPE_PRECISION (TREE_TYPE (new_var)));
+  gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > TYPE_PRECISION (inner_type));
+  gimple *stmt;
+
+  if (TYPE_UNSIGNED (inner_type))
+    {
+      /* Zero extend.  */
+      tree cst
+	= wide_int_to_tree (TREE_TYPE (var),
+			    wi::mask (TYPE_PRECISION (inner_type), false,
+				      TYPE_PRECISION (TREE_TYPE (var))));
+      stmt = gimple_build_assign (new_var, BIT_AND_EXPR,
+				  var, cst);
+    }
+  else
+    /* Sign extend.  */
+    stmt = gimple_build_assign (new_var,
+				SEXT_EXPR,
+				var,
+				build_int_cst (TREE_TYPE (var),
+					       TYPE_PRECISION (inner_type)));
+  return stmt;
+}
+
+static void
+copy_default_ssa (tree to, tree from)
+{
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from));
+  SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from);
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE);
+  SSA_NAME_IS_DEFAULT_DEF (to) = 1;
+  SSA_NAME_IS_DEFAULT_DEF (from) = 0;
+}
+
+/* Promote definition DEF to PROMOTED_TYPE.  If the defining stmt DEF_STMT
+   allows it, simply change the type of DEF to PROMOTED_TYPE.  If the result
+   of DEF_STMT cannot be of PROMOTED_TYPE, create a NEW_DEF of the original
+   type, make DEF_STMT assign its value to NEW_DEF, and then create a
+   NOP_EXPR that converts NEW_DEF to DEF of the promoted type.
+
+   For example, for stmt with original_type char and promoted_type int:
+		char _1 = mem;
+	becomes:
+		char _2 = mem;
+		int _1 = (int)_2;
+
+   If the def_stmt allows def to be promoted, promote def in-place
+   (and its arguments when needed).
+
+   For example:
+		char _3 = _1 + _2;
+	becomes:
+		int _3 = _1 + _2;
+   Here, _1 and _2 will also be promoted.  */
+
+static void
+promote_ssa (tree def, gimple_stmt_iterator *gsi)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (def);
+  gimple *copy_stmt = NULL;
+  gimple_stmt_iterator gsi2;
+  tree original_type = TREE_TYPE (def);
+  tree new_def;
+  ssa_name_info *info;
+  bool do_not_promote = false;
+  tree promoted_type = get_promoted_type (TREE_TYPE (def));
+
+  if (!tobe_promoted_p (def))
+    return;
+
+  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+					  sizeof (ssa_name_info));
+  info->type = original_type;
+  info->promoted_type = promoted_type;
+  info->ssa = def;
+  ssa_name_info_map->put (def, info);
+
+  switch (gimple_code (def_stmt))
+    {
+    case GIMPLE_PHI:
+      {
+	/* Promote def by fixing its type and make def anonymous.  */
+	TREE_TYPE (def) = promoted_type;
+	SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	promote_cst_in_stmt (def_stmt, promoted_type);
+	break;
+      }
+
+    case GIMPLE_ASM:
+      {
+	gasm *asm_stmt = as_a <gasm *> (def_stmt);
+	for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i)
+	  {
+	    /* Promote def and copy (i.e. convert) the value defined
+	       by asm to def.  */
+	    tree link = gimple_asm_output_op (asm_stmt, i);
+	    tree op = TREE_VALUE (link);
+	    if (op == def)
+	      {
+		new_def = copy_ssa_name (def);
+		set_ssa_promoted (new_def);
+		copy_default_ssa (new_def, def);
+		TREE_VALUE (link) = new_def;
+		gimple_asm_set_output_op (asm_stmt, i, link);
+
+		TREE_TYPE (def) = promoted_type;
+		copy_stmt = gimple_build_assign (def, NOP_EXPR, new_def);
+		SSA_NAME_IS_DEFAULT_DEF (new_def) = 0;
+		gimple_set_location (copy_stmt, gimple_location (def_stmt));
+		gsi2 = gsi_for_stmt (def_stmt);
+		gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+		break;
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_NOP:
+      {
+	gcc_unreachable ();
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (def_stmt);
+	tree rhs = gimple_assign_rhs1 (def_stmt);
+	if (gimple_vuse (def_stmt) != NULL_TREE
+	    || gimple_vdef (def_stmt) != NULL_TREE
+	    || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (def))
+		&& !operation_no_trapping_overflow (TREE_TYPE (def), code))
+	    || TREE_CODE_CLASS (code) == tcc_reference
+	    || TREE_CODE_CLASS (code) == tcc_comparison
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == VIEW_CONVERT_EXPR
+	    || code == REALPART_EXPR
+	    || code == IMAGPART_EXPR
+	    || code == REDUC_PLUS_EXPR
+	    || code == REDUC_MAX_EXPR
+	    || code == REDUC_MIN_EXPR
+	    || !INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (CONVERT_EXPR_CODE_P (code))
+	  {
+	    if (!type_precision_ok (TREE_TYPE (rhs)))
+	      {
+		do_not_promote = true;
+	      }
+	    else if (types_compatible_p (TREE_TYPE (rhs), promoted_type))
+	      {
+		/* As we traverse statements in dominator order, arguments
+		   of def_stmt will be visited before visiting def.  If RHS
+		   is already promoted and type is compatible, we can convert
+		   them into ZERO/SIGN EXTEND stmt.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		tree type;
+		if (info == NULL)
+		  type = TREE_TYPE (rhs);
+		else
+		  type = info->type;
+		if ((TYPE_PRECISION (original_type)
+		     > TYPE_PRECISION (type))
+		    || (TYPE_UNSIGNED (original_type)
+			!= TYPE_UNSIGNED (type)))
+		  {
+		    if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		      type = original_type;
+		    gcc_assert (type != NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    copy_stmt = zero_sign_extend_stmt (def, rhs, type);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    gsi_replace (gsi, copy_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	    else
+	      {
+		/* If RHS is not promoted OR their types are not
+		   compatible, create NOP_EXPR that converts
+		   RHS to  promoted DEF type and perform a
+		   ZERO/SIGN EXTEND to get the required value
+		   from RHS.  */
+		ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+		if (info != NULL)
+		  {
+		    tree type = info->type;
+		    new_def = copy_ssa_name (rhs);
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		    copy_stmt = zero_sign_extend_stmt (new_def, rhs, type);
+		    gimple_set_location (copy_stmt, gimple_location (def_stmt));
+		    gsi2 = gsi_for_stmt (def_stmt);
+		    gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+		    gassign *new_def_stmt = gimple_build_assign (def, code, new_def);
+		    gsi_replace (gsi, new_def_stmt, false);
+		  }
+		else
+		  {
+		    TREE_TYPE (def) = promoted_type;
+		    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+		  }
+	      }
+	  }
+	else
+	  {
+	    /* Promote def by fixing its type and make def anonymous.  */
+	    promote_cst_in_stmt (def_stmt, promoted_type);
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    TREE_TYPE (def) = promoted_type;
+	  }
+	break;
+      }
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR, new_def);
+      gimple_set_location (copy_stmt, gimple_location (def_stmt));
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+  reset_flow_sensitive_info (def);
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
+	   use_operand_p op, tree use)
+{
+  gimple *copy_stmt;
+  ssa_name_info **info = ssa_name_info_map->get (use);
+  /* If USE is not promoted, nothing to do.  */
+  if (!info || *info == NULL)
+    return 0;
+
+  tree promoted_type = (*info)->promoted_type;
+  tree old_type = (*info)->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+      {
+	SET_USE (op, fold_convert (old_type, use));
+	update_stmt (stmt);
+	break;
+      }
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+      {
+	/* USE cannot be promoted here.  */
+	do_not_promote = true;
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (stmt);
+	tree lhs = gimple_assign_lhs (stmt);
+	if (gimple_vuse (stmt) != NULL_TREE
+	    || gimple_vdef (stmt) != NULL_TREE
+	    || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		&& !operation_no_trapping_overflow (TREE_TYPE (lhs), code))
+	    || code == VIEW_CONVERT_EXPR
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == CONSTRUCTOR
+	    || code == BIT_FIELD_REF
+	    || code == COMPLEX_EXPR
+	    || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (!not_truncated_use_p (code))
+	  {
+	    /* Promote the constant in comparison when other comparison
+	       operand is promoted.  All other constants are promoted as
+	       part of promoting definition in promote_ssa.  */
+	    if (TREE_CODE_CLASS (code) == tcc_comparison)
+	      promote_cst_in_stmt (stmt, promoted_type);
+	    /* In some stmts, value in USE has to be ZERO/SIGN
+	       Extended based on the original type for correct
+	       result.  */
+	    tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+	    copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+	    gimple_set_location (copy_stmt, gimple_location (stmt));
+	    gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	    SET_USE (op, temp);
+	    update_stmt (stmt);
+	  }
+	else if (CONVERT_EXPR_CODE_P (code)
+	    || code == FLOAT_EXPR)
+	  {
+	    if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+	      {
+		/* Type of LHS and promoted RHS are compatible, we can
+		   convert this into ZERO/SIGN EXTEND stmt.  */
+		copy_stmt = zero_sign_extend_stmt (lhs, use, old_type);
+		gimple_set_location (copy_stmt, gimple_location (stmt));
+		set_ssa_promoted (lhs);
+		gsi_replace (gsi, copy_stmt, false);
+	      }
+	    else if (!tobe_promoted_p (lhs)
+		     || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		     || (TYPE_UNSIGNED (TREE_TYPE (use)) != TYPE_UNSIGNED (TREE_TYPE (lhs))))
+	      {
+		tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+		copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+		gimple_set_location (copy_stmt, gimple_location (stmt));
+		gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+		SET_USE (op, temp);
+		update_stmt (stmt);
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_COND:
+      {
+	/* In GIMPLE_COND, value in USE has to be ZERO/SIGN
+	   Extended based on the original type for correct
+	   result.  */
+	tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+	copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+	gimple_set_location (copy_stmt, gimple_location (stmt));
+	gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	SET_USE (op, temp);
+	promote_cst_in_stmt (stmt, promoted_type);
+	update_stmt (stmt);
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create an
+	 original type copy.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      copy_stmt = gimple_build_assign (temp, NOP_EXPR, use);
+      gimple_set_location (copy_stmt, gimple_location (stmt));
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+static void
+promote_all_ssa_defined_with_nop ()
+{
+  unsigned n = num_ssa_names, i;
+  gimple_stmt_iterator gsi2;
+  tree new_def;
+  basic_block bb;
+  gimple *copy_stmt;
+
+  for (i = 1; i < n; ++i)
+    {
+      tree name = ssa_name (i);
+      if (name
+	  && gimple_code (SSA_NAME_DEF_STMT (name)) == GIMPLE_NOP
+	  && tobe_promoted_p (name)
+	  && !has_zero_uses (name))
+	{
+	  tree promoted_type = get_promoted_type (TREE_TYPE (name));
+	  ssa_name_info *info;
+	  set_ssa_promoted (name);
+	  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+						  sizeof (ssa_name_info));
+	  info->type = TREE_TYPE (name);
+	  info->promoted_type = promoted_type;
+	  info->ssa = name;
+	  ssa_name_info_map->put (name, info);
+
+	  if (SSA_NAME_VAR (name) == NULL)
+	    {
+	      /* Promote def by fixing its type for anonymous def.  */
+	      TREE_TYPE (name) = promoted_type;
+	    }
+	  else if (TREE_CODE (SSA_NAME_VAR (name)) != PARM_DECL)
+	    {
+	      tree var = create_tmp_reg (promoted_type);
+	      DECL_NAME (var) = DECL_NAME (SSA_NAME_VAR (name));
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (name), NULL_TREE);
+	      TREE_TYPE (name) = promoted_type;
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (name, var);
+	      set_ssa_default_def (cfun, var, name);
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi2 = gsi_after_labels (bb);
+	      /* Create new_def of the original type and set that to be the
+		 parameter.  */
+	      new_def = copy_ssa_name (name);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (name), new_def);
+	      copy_default_ssa (new_def, name);
+
+	      /* Now promote the def and copy the value from parameter.  */
+	      TREE_TYPE (name) = promoted_type;
+	      copy_stmt = gimple_build_assign (name, NOP_EXPR, new_def);
+	      SSA_NAME_DEF_STMT (name) = copy_stmt;
+	      gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	    }
+	  reset_flow_sensitive_info (name);
+	}
+    }
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  fixup_use (phi, &gsi, op, use);
+	}
+
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  fixup_use (stmt, &gsi, op, use);
+	}
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+    }
+}
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  promote_all_ssa_defined_with_nop ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all),
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 1702778..26838f3 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -276,6 +276,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_split_paths);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 45e3b70..da7f2d5 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -279,6 +279,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL	     , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index dcd2d5e..376ad7d 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -441,6 +441,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 
   /* Variable used to store the current templates while a previously
      captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
 
   /* Nonzero if templates have been stored in the above variable.  */
   int need_template_restore = 0;
-- 
1.9.1


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-24  2:52                           ` Kugan
@ 2015-12-10  0:27                             ` Kugan
  2015-12-16 13:18                               ` Richard Biener
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-12-10  0:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi Richard,

Thanks for the reviews.

I think since we have some unresolved issues here, it is best to aim for
the next stage1. I would however appreciate any feedback so that I can
continue to improve this.

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think
there is any agreement on this. Or is there any better place to fix this?

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-12-10  0:27                             ` Kugan
@ 2015-12-16 13:18                               ` Richard Biener
  0 siblings, 0 replies; 63+ messages in thread
From: Richard Biener @ 2015-12-16 13:18 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Dec 10, 2015 at 1:27 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
> Thanks for the reviews.
>
> I think since we have some unresolved issues here, it is best to aim for
> the next stage1. I would however appreciate any feedback so that I can
> continue to improve this.

Yeah, sorry I've been distracted lately and am not sure I'll get to
the patch before
Christmas break.

> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
> to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714. I don't think
> there is any agreement on this. Or is there any better place to fix this?

I don't know enough in this area to suggest anything.

Richard.

> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* RE: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-08  8:03       ` Renlin Li
@ 2015-09-08 12:37         ` Wilco Dijkstra
  0 siblings, 0 replies; 63+ messages in thread
From: Wilco Dijkstra @ 2015-09-08 12:37 UTC (permalink / raw)
  To: Renlin Li, pinskia, Kugan; +Cc: GCC Patches, Richard Biener, nickc

> Renlin Li wrote:
> Hi Andrew,
> 
> Previously, there is a discussion thread in binutils mailing list:
> 
> https://sourceware.org/ml/binutils/2015-04/msg00032.html
> 
> Nick proposed a way to fix, Richard Henderson hold similar opinion as you.

Both Nick and Richard H seem to think it is an issue with unaligned instructions 
rather than an alignment bug in the debug code in the assembler (probably due to
the misleading error message). Although it would work, since we don't have or
need unaligned instructions, that proposed patch is not the right fix for this issue.

Anyway aligning the debug tables correctly should be a safe and trivial fix.

Wilco



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 12:17     ` pinskia
  2015-09-07 12:49       ` Wilco Dijkstra
@ 2015-09-08  8:03       ` Renlin Li
  2015-09-08 12:37         ` Wilco Dijkstra
  1 sibling, 1 reply; 63+ messages in thread
From: Renlin Li @ 2015-09-08  8:03 UTC (permalink / raw)
  To: pinskia, Kugan
  Cc: Wilco Dijkstra, GCC Patches, Richard Biener, Nicholas Clifton

Hi Andrew,

Previously, there is a discussion thread in binutils mailing list:

https://sourceware.org/ml/binutils/2015-04/msg00032.html

Nick proposed a way to fix, Richard Henderson hold similar opinion as you.

Regards,
Renlin

On 07/09/15 12:45, pinskia@gmail.com wrote:
>
>
>
>> On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>>
>> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>>> Kugan wrote:
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>>> This is a known assembler bug I found a while back, Renlin is looking into it.
>>> Basically when debug tables are inserted at the end of a code section the
>>> assembler doesn't align to the alignment required by the debug tables.
>> This is precisely what seems to be happening. Renlin, could you please
>> let me know when you have a patch (even if it is a prototype or a hack).
>
> I had noticed that but I read through the assembler code and it sounded very much like it was designed this way and that the compiler was not supposed to emit assembly like this and fix up the alignment.
>
> Thanks,
> Andrew
>
>> Thanks,
>> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* RE: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 12:17     ` pinskia
@ 2015-09-07 12:49       ` Wilco Dijkstra
  2015-09-08  8:03       ` Renlin Li
  1 sibling, 0 replies; 63+ messages in thread
From: Wilco Dijkstra @ 2015-09-07 12:49 UTC (permalink / raw)
  To: pinskia, Kugan; +Cc: Renlin Li, GCC Patches, Richard Biener

> pinskia@gmail.com wrote:
> > On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> >
> >
> >
> > On 07/09/15 20:46, Wilco Dijkstra wrote:
> >>> Kugan wrote:
> >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> >>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> >>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
> >>
> >> This is a known assembler bug I found a while back, Renlin is looking into it.
> >> Basically when debug tables are inserted at the end of a code section the
> >> assembler doesn't align to the alignment required by the debug tables.
> >
> > This is precisely what seems to be happening. Renlin, could you please
> > let me know when you have a patch (even if it is a prototype or a hack).
> 
> 
> I had noticed that but I read through the assembler code and it sounded very much like it was
> designed this way and that the compiler was not supposed to emit assembly like this and fix
> up the alignment.

No, the bug is introduced solely by the assembler - there is no way to avoid it as you can't expect
users to align the end of the code section to an unspecified debug alignment (which could
potentially vary depending on the generated debug info). The assembler aligns unaligned instructions
without a warning, and doesn't require the section size to be a multiple of the section alignment,
i.e. the design is that the assembler can deal with any alignment.

Wilco


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 11:31   ` Kugan
@ 2015-09-07 12:17     ` pinskia
  2015-09-07 12:49       ` Wilco Dijkstra
  2015-09-08  8:03       ` Renlin Li
  0 siblings, 2 replies; 63+ messages in thread
From: pinskia @ 2015-09-07 12:17 UTC (permalink / raw)
  To: Kugan; +Cc: Wilco Dijkstra, Renlin Li, GCC Patches, Richard Biener





> On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> 
> 
> 
> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>> Kugan wrote:
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>> 
>> This is a known assembler bug I found a while back, Renlin is looking into it.
>> Basically when debug tables are inserted at the end of a code section the 
>> assembler doesn't align to the alignment required by the debug tables.
> 
> This is precisely what seems to be happening. Renlin, could you please
> let me know when you have a patch (even if it is a prototype or a hack).


I had noticed that but I read through the assembler code and it sounded very much like it was designed this way and that the compiler was not supposed to emit assembly like this and fix up the alignment.

Thanks,
Andrew

> 
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 10:51 ` Wilco Dijkstra
@ 2015-09-07 11:31   ` Kugan
  2015-09-07 12:17     ` pinskia
  0 siblings, 1 reply; 63+ messages in thread
From: Kugan @ 2015-09-07 11:31 UTC (permalink / raw)
  To: Wilco Dijkstra, Renlin Li; +Cc: 'GCC Patches', 'Richard Biener'



On 07/09/15 20:46, Wilco Dijkstra wrote:
>> Kugan wrote:
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
> 
> This is a known assembler bug I found a while back, Renlin is looking into it.
> Basically when debug tables are inserted at the end of a code section the 
> assembler doesn't align to the alignment required by the debug tables.

This is precisely what seems to be happening. Renlin, could you please
let me know when you have a patch (even if it is a prototype or a hack).

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* RE: [0/7] Type promotion pass and elimination of zext/sext
       [not found] <A610E03AD50BFC4D95529A36D37FA55E8A7AB808CC@GEORGE.Emea.Arm.com>
@ 2015-09-07 10:51 ` Wilco Dijkstra
  2015-09-07 11:31   ` Kugan
  0 siblings, 1 reply; 63+ messages in thread
From: Wilco Dijkstra @ 2015-09-07 10:51 UTC (permalink / raw)
  To: 'Kugan', Renlin Li
  Cc: 'GCC Patches', 'Richard Biener'

> Kugan wrote:
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it and needs to be fixed as well.

This is a known assembler bug I found a while back, Renlin is looking into it.
Basically when debug tables are inserted at the end of a code section the 
assembler doesn't align to the alignment required by the debug tables.

Wilco


^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2015-12-16 13:18 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-07  2:55 [0/7] Type promotion pass and elimination of zext/sext Kugan
2015-09-07  2:57 ` [1/7] Add new tree code SEXT_EXPR Kugan
2015-09-15 13:20   ` Richard Biener
2015-10-11 10:35     ` Kugan
2015-10-12 12:22       ` Richard Biener
2015-10-15  5:49         ` Kugan
2015-10-21 10:49           ` Richard Biener
2015-09-07  2:58 ` [2/7] Add new type promotion pass Kugan
2015-10-15  5:52   ` Kugan
2015-10-15 22:47     ` Richard Henderson
2015-09-07  3:00 ` [3/7] Optimize ZEXT_EXPR with tree-vrp Kugan
2015-09-15 13:18   ` Richard Biener
2015-10-06 23:12     ` kugan
2015-10-07  8:20       ` Richard Biener
2015-10-07 23:40         ` Kugan
2015-10-09 10:29           ` Richard Biener
2015-10-11  2:56             ` Kugan
2015-10-12 12:13               ` Richard Biener
2015-09-07  3:01 ` [4/7] Use correct promoted mode sign for result of GIMPLE_CALL Kugan
2015-09-07 13:16   ` Michael Matz
2015-09-08  0:00     ` Kugan
2015-09-08 15:45       ` Jeff Law
2015-09-08 22:09         ` Jim Wilson
2015-09-15 12:51           ` Richard Biener
2015-10-07  1:03             ` kugan
2015-09-07  3:01 ` [5/7] Allow gimple debug stmt in widen mode Kugan
2015-09-07 13:46   ` Michael Matz
2015-09-08  0:01     ` Kugan
2015-09-15 13:02       ` Richard Biener
2015-10-15  5:45         ` Kugan
2015-10-16  9:27           ` Richard Biener
2015-10-18 20:51             ` Kugan
2015-09-07  3:03 ` Kugan
2015-09-07  3:03 ` [6/7] Temporary workaround to get aarch64 bootstrap Kugan
2015-09-07  5:54 ` [7/7] Adjust-arm-test cases Kugan
2015-11-02 11:43   ` Richard Earnshaw
2015-10-20 20:13 ` [0/7] Type promotion pass and elimination of zext/sext Kugan
2015-10-21 12:56   ` Richard Biener
2015-10-21 13:57     ` Richard Biener
2015-10-21 17:17       ` Joseph Myers
2015-10-21 18:11       ` Richard Henderson
2015-10-22 12:48         ` Richard Biener
2015-10-22 11:01     ` Kugan
2015-10-22 14:24       ` Richard Biener
2015-10-27  1:48         ` kugan
2015-10-28 15:51           ` Richard Biener
2015-11-02  9:17             ` Kugan
2015-11-03 14:40               ` Richard Biener
2015-11-08  9:43                 ` Kugan
2015-11-10 14:13                   ` Richard Biener
2015-11-12  6:08                     ` Kugan
2015-11-14  1:15                     ` Kugan
2015-11-18 14:04                       ` Richard Biener
2015-11-18 15:06                         ` Richard Biener
2015-11-24  2:52                           ` Kugan
2015-12-10  0:27                             ` Kugan
2015-12-16 13:18                               ` Richard Biener
     [not found] <A610E03AD50BFC4D95529A36D37FA55E8A7AB808CC@GEORGE.Emea.Arm.com>
2015-09-07 10:51 ` Wilco Dijkstra
2015-09-07 11:31   ` Kugan
2015-09-07 12:17     ` pinskia
2015-09-07 12:49       ` Wilco Dijkstra
2015-09-08  8:03       ` Renlin Li
2015-09-08 12:37         ` Wilco Dijkstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).