public inbox for gcc-patches@gcc.gnu.org
* [RFC] ipa bitwise constant propagation
@ 2016-08-04  6:36 Prathamesh Kulkarni
  2016-08-04  8:02 ` Richard Biener
  2016-08-05 12:37 ` Martin Jambor
  0 siblings, 2 replies; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-04  6:36 UTC
  To: Richard Biener, Jan Hubicka, Martin Jambor,
	Kugan Vivekanandarajah, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 3501 bytes --]

Hi,
This is a prototype patch for propagating known/unknown bits inter-procedurally
for integral types, using the info obtained from get_nonzero_bits ().

The patch required making the following changes:
a) To make info from get_nonzero_bits() available to ipa, I had to remove the
guard !nonzero_p in ccp_finalize. However, that triggered the following ICE
in get_ptr_info() for default_none.f95 (and several other fortran tests)
with options: -fopenacc -O2
ICE: http://pastebin.com/KjD7HMQi
I confirmed with Richard that this was a latent issue.

b) I chose widest_int for representing value and mask in ipcp_bits_lattice
and correspondingly changed the declarations of
bit_value_unop_1/bit_value_binop_1 to take
precision and sign instead of type (those are the only two fields that
were used). Both these functions are exported by tree-ssa-ccp.h.
I hope that's ok?

c) Changed streamer_read_wi/streamer_write_wi to non-static.
Ah I see Kugan has submitted a patch for this, so I will drop this hunk.

d) We have the following in tree-ssa-ccp.c:get_default_value ():
          if (flag_tree_bit_ccp)
            {
              wide_int nonzero_bits = get_nonzero_bits (var);
              if (nonzero_bits != -1)
                {
                  val.lattice_val = CONSTANT;
                  val.value = build_zero_cst (TREE_TYPE (var));
                  val.mask = extend_mask (nonzero_bits);
                }

extend_mask() sets all upper bits of nonzero_bits to 1, i.e. varying
in terms of bit-ccp.
I suppose in tree-ccp we need to extend the mask if var is a parameter, since
we don't know in advance what values it will receive from different callers,
so all upper bits are marked as 1 to be safe.
However, I suppose with ipa we can determine exactly which bits of the
parameter are constant, so setting all upper bits to 1 becomes unnecessary?

For example, consider the following artificial test-case:
int f(int x)
{
  if (x > 300)
    return 1;
  else
    return 2;
}

int main(int argc, char **argv)
{
  return f(argc & 0xc) + f (argc & 0x3);
}

For x, the mask would be the meet of:
<0, 0xc> meet <0, 0x3> == (0x3 | 0xc) | (0 ^ 0) == 0xf
and ipcp_update_bits() sets nonzero_bits for x to 0xf.
However, get_default_value then calls extend_mask (0xf), resulting in
all upper bits being set to 1, and consequently the condition
if (x > 300) doesn't get folded.
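
To make the arithmetic above concrete, here is a minimal standalone sketch
of the meet on <value, mask> pairs. It is not code from the patch: it uses
plain uint64_t instead of widest_int, and the names known_bits, meet,
max_possible and check_example are hypothetical, chosen only for
illustration. A set bit in mask means the corresponding bit is unknown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a <value, mask> pair.  A 1 bit in MASK
   means the corresponding bit of VALUE is unknown (varying).  */
struct known_bits
{
  uint64_t value;
  uint64_t mask;
};

/* Meet of two pairs: a bit becomes unknown if it is unknown in either
   operand, or if the two known values disagree at that position.
   As in the formula from the mail, only the mask is recomputed; the
   value of the first operand is kept as-is.  */
static struct known_bits
meet (struct known_bits a, struct known_bits b)
{
  struct known_bits r;
  r.mask = (a.mask | b.mask) | (a.value ^ b.value);
  r.value = a.value;
  return r;
}

/* Largest unsigned value consistent with B: every unknown bit is
   assumed to be 1.  */
static uint64_t
max_possible (struct known_bits b)
{
  return b.value | b.mask;
}

/* The example from the mail: f (argc & 0xc) and f (argc & 0x3).  */
static void
check_example (void)
{
  struct known_bits from_first_call = { 0, 0xc };
  struct known_bits from_second_call = { 0, 0x3 };
  struct known_bits x = meet (from_first_call, from_second_call);

  assert (x.mask == 0xf);
  /* With only the low four bits unknown, x cannot exceed 15, so a
     test like x > 300 is decidable.  */
  assert (max_possible (x) == 15);
}
```

With the mask kept at 0xf the upper bound is 15, which is why x > 300 could
be folded; once the upper bits are also marked unknown, that bound is lost.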

To resolve this, I added a new flag "set_by_ipa" to decl_common,
which is set to true if the mask of the parameter is determined by ipa-cp,
and the condition changes to:

if (SSA_NAME_VAR (var)
    && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL
    && DECL_SET_BY_IPA (SSA_NAME_VAR (var)))
  val.mask = widest_int::from (nonzero_bits,
                          TYPE_SIGN (TREE_TYPE (SSA_NAME_VAR (var))));
else
  val.mask = extend_mask (nonzero_bits);

I am not sure if adding a new flag to decl_common is a good idea. How
do other ipa passes deal with this or similar issues?

I suppose we would want to gate this on some flag, say -fipa-bit-cp?
I haven't yet gated it on the flag; I will do so in the next version of
the patch. I have added some very simple test-cases and will try to add
more meaningful ones.

The patch passes bootstrap+test on x86_64-unknown-linux-gnu
and was cross-tested on arm*-*-* and aarch64*-*-*, with the exception
of some fortran tests failing due to the above ICE.

As next steps, I am planning to extend it to handle alignment propagation
and do further testing (lto-bootstrap, chromium).
I would be grateful for feedback on the current patch.

Thanks,
Prathamesh

[-- Attachment #2: 6.diff --]
[-- Type: text/plain, Size: 36967 bytes --]

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 5b6cb9a..b770f6a 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
 
 template <typename valtype> class ipcp_value;
 
@@ -266,6 +267,40 @@ private:
   bool meet_with_1 (unsigned new_align, unsigned new_misalign);
 };
 
+/* Lattice of known bits, only capable of holding one value.
+   Similar to ccp_prop_value_t, mask represents which bits of value are constant.
+   If a bit in mask is set to 0, then the corresponding bit in
+   value is known to be constant.  */
+
+class ipcp_bits_lattice
+{
+public:
+  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
+  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
+  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
+  bool set_to_bottom ();
+  bool set_to_constant (widest_int, widest_int, signop, unsigned);
+ 
+  widest_int get_value () { return value; }
+  widest_int get_mask () { return mask; }
+  signop get_sign () { return sgn; }
+  unsigned get_precision () { return precision; }
+
+  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
+  bool meet_with (widest_int, widest_int, signop, unsigned);
+
+  void print (FILE *);
+
+private:
+  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
+  widest_int value, mask;
+  signop sgn;
+  unsigned precision;
+
+  bool meet_with_1 (widest_int, widest_int); 
+  void get_value_and_mask (tree, widest_int *, widest_int *);
+}; 
+
 /* Structure containing lattices for a parameter itself and for pieces of
    aggregates that are passed in the parameter or by a reference in a parameter
    plus some other useful flags.  */
@@ -281,6 +316,8 @@ public:
   ipcp_agg_lattice *aggs;
   /* Lattice describing known alignment.  */
   ipcp_alignment_lattice alignment;
+  /* Lattice describing known bits.  */
+  ipcp_bits_lattice bits_lattice;
   /* Number of aggregate lattices */
   int aggs_count;
   /* True if aggregate data were passed by reference (as opposed to by
@@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
     fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
 }
 
+void
+ipcp_bits_lattice::print (FILE *f)
+{
+  if (top_p ())
+    fprintf (f, "         Bits unknown (TOP)\n");
+  else if (bottom_p ())
+    fprintf (f, "         Bits unusable (BOTTOM)\n");
+  else
+    {
+      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
+      fprintf (f, ", mask = "); print_hex (get_mask (), f);
+      fprintf (f, "\n");
+    }
+}
+
 /* Print all ipcp_lattices of all functions to F.  */
 
 static void
@@ -484,6 +536,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
 	  fprintf (f, "         ctxs: ");
 	  plats->ctxlat.print (f, dump_sources, dump_benefits);
 	  plats->alignment.print (f);
+	  plats->bits_lattice.print (f);
 	  if (plats->virt_call)
 	    fprintf (f, "        virt_call flag set\n");
 
@@ -911,6 +964,161 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
   return meet_with_1 (other.align, adjusted_misalign);
 }
 
+/* Set lattice value to bottom, if it already isn't the case.  */
+
+bool
+ipcp_bits_lattice::set_to_bottom ()
+{
+  if (bottom_p ())
+    return false;
+  lattice_val = IPA_BITS_VARYING;
+  value = 0;
+  mask = -1;
+  return true;
+}
+
+/* Set to constant if it isn't already. Only meant to be called
+   when switching state from TOP.  */
+
+bool
+ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask,
+				    signop sgn, unsigned precision)
+{
+  gcc_assert (top_p ());
+  this->lattice_val = IPA_BITS_CONSTANT;
+  this->value = value;
+  this->mask = mask;
+  this->sgn = sgn;
+  this->precision = precision;
+  return true;
+}
+
+/* Convert operand to value, mask form.  */
+
+void
+ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
+{
+  wide_int get_nonzero_bits (const_tree);
+
+  if (TREE_CODE (operand) == INTEGER_CST)
+    {
+      *valuep = wi::to_widest (operand); 
+      *maskp = 0;
+    }
+  else if (TREE_CODE (operand) == SSA_NAME)
+    {
+      *valuep = 0;
+      *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
+    }
+  else
+    gcc_unreachable ();
+}  
+
+/* Meet operation, similar to ccp_lattice_meet, we xor values
+   if this->value, value have different values at same bit positions, we want
+   to drop that bit to varying. Return true if mask is changed.
+   This function assumes that the lattice value is in CONSTANT state  */
+
+bool
+ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
+{
+  gcc_assert (constant_p ());
+  
+  widest_int old_mask = this->mask;
+  this->mask = (this->mask | mask) | (this->value ^ value);
+
+  if (wi::sext (this->mask, this->precision) == -1)
+    return set_to_bottom ();
+
+  bool changed = this->mask != old_mask;
+  return changed;
+}
+
+/* Meet the bits lattice with operand
+   described by <value, mask, sgn, precision>.  */
+
+bool
+ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
+			      signop sgn, unsigned precision)
+{
+  if (bottom_p ())
+    return false;
+
+  if (top_p ())
+    {
+      if (wi::sext (mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (value, mask, sgn, precision);
+    }
+
+  return meet_with_1 (value, mask);
+}
+
+/* Meet bits lattice with the result of bit_value_binop_1 (other, operand)
+   if code is binary operation or bit_value_unop_1 (other) if code is unary op.
+   In the case when code is nop_expr, no adjustment is required. */
+
+bool
+ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, enum tree_code code, tree operand)
+{
+  if (other.bottom_p ())
+    return set_to_bottom ();
+
+  if (bottom_p () || other.top_p ())
+    return false;
+
+  widest_int adjusted_value, adjusted_mask;
+
+  if (TREE_CODE_CLASS (code) == tcc_binary)
+    {
+      tree type = TREE_TYPE (operand);
+      gcc_assert (INTEGRAL_TYPE_P (type));
+      widest_int o_value, o_mask;
+      get_value_and_mask (operand, &o_value, &o_mask);
+
+      signop sgn = other.get_sign ();
+      unsigned prec = other.get_precision ();
+
+      bit_value_binop_1 (code, sgn, prec, &adjusted_value, &adjusted_mask,
+			 sgn, prec, other.get_value (), other.get_mask (),
+			 TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
+
+      if (wi::sext (adjusted_mask, prec) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (TREE_CODE_CLASS (code) == tcc_unary)
+    {
+      signop sgn = other.get_sign ();
+      unsigned prec = other.get_precision ();
+
+      bit_value_unop_1 (code, sgn, prec, &adjusted_value,
+			&adjusted_mask, sgn, prec, other.get_value (),
+			other.get_mask ());
+
+      if (wi::sext (adjusted_mask, prec) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (code == NOP_EXPR)
+    {
+      adjusted_value = other.value;
+      adjusted_mask = other.mask;
+    }
+
+  else
+    return set_to_bottom ();
+
+  if (top_p ())
+    {
+      if (wi::sext (adjusted_mask, other.get_precision ()) == -1)
+	return set_to_bottom ();
+      return set_to_constant (adjusted_value, adjusted_mask, other.get_sign (), other.get_precision ());
+    }
+  else
+    return meet_with_1 (adjusted_value, adjusted_mask);
+}
+
 /* Mark bot aggregate and scalar lattices as containing an unknown variable,
    return true is any of them has not been marked as such so far.  */
 
@@ -922,6 +1130,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
   ret |= plats->ctxlat.set_contains_variable ();
   ret |= set_agg_lats_contain_variable (plats);
   ret |= plats->alignment.set_to_bottom ();
+  ret |= plats->bits_lattice.set_to_bottom ();
   return ret;
 }
 
@@ -1003,6 +1212,7 @@ initialize_node_lattices (struct cgraph_node *node)
 	      plats->ctxlat.set_to_bottom ();
 	      set_agg_lats_to_bottom (plats);
 	      plats->alignment.set_to_bottom ();
+	      plats->bits_lattice.set_to_bottom ();
 	    }
 	  else
 	    set_all_contains_variable (plats);
@@ -1621,6 +1831,57 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
     }
 }
 
+/* Propagate bits across jfunc that is associated with
+   edge cs and update dest_lattice accordingly.  */
+
+bool
+propagate_bits_accross_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc,
+				      ipcp_bits_lattice *dest_lattice)
+{
+  if (dest_lattice->bottom_p ())
+    return false;
+
+  if (jfunc->type == IPA_JF_PASS_THROUGH)
+    {
+      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
+      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
+      tree operand = NULL_TREE;
+
+      if (code != NOP_EXPR)
+	operand = ipa_get_jf_pass_through_operand (jfunc);
+
+      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
+      struct ipcp_param_lattices *src_lats
+	= ipa_get_parm_lattices (caller_info, src_idx);
+
+      /* Try to propagate bits if src_lattice is bottom, but jfunc is known.
+	 for eg consider:
+	 int f(int x)
+	 {
+	   g (x & 0xff);
+	 }
+	 Assume lattice for x is bottom, however we can still propagate
+	 result of x & 0xff == 0xff, which gets computed during ccp1 pass
+	 and we store it in jump function during analysis stage.  */
+
+      if (src_lats->bits_lattice.bottom_p ()
+	  && jfunc->bits.known)
+	return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+				        jfunc->bits.sgn, jfunc->bits.precision);
+      else
+	return dest_lattice->meet_with (src_lats->bits_lattice, code, operand);
+    }
+
+  else if (jfunc->type == IPA_JF_ANCESTOR)
+    return dest_lattice->set_to_bottom ();
+
+  else if (jfunc->bits.known) 
+    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+				    jfunc->bits.sgn, jfunc->bits.precision);
+  else
+    return dest_lattice->set_to_bottom ();
+}
+
 /* If DEST_PLATS already has aggregate items, check that aggs_by_ref matches
    NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true (in all
    other cases, return false).  If there are no aggregate items, set
@@ -1968,6 +2229,8 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
 							  &dest_plats->ctxlat);
 	  ret |= propagate_alignment_accross_jump_function (cs, jump_func,
 							 &dest_plats->alignment);
+	  ret |= propagate_bits_accross_jump_function (cs, jump_func,
+						       &dest_plats->bits_lattice);
 	  ret |= propagate_aggs_accross_jump_function (cs, jump_func,
 						       dest_plats);
 	}
@@ -4592,6 +4855,74 @@ ipcp_store_alignment_results (void)
   }
 }
 
+/* Look up all the bits information that we have discovered and copy it over
+   to the transformation summary.  */
+
+static void
+ipcp_store_bits_results (void)
+{
+  cgraph_node *node;
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      ipa_node_params *info = IPA_NODE_REF (node);
+      bool dumped_sth = false;
+      bool found_useful_result = false;
+
+      if (info->ipcp_orig_node)
+	info = IPA_NODE_REF (info->ipcp_orig_node);
+
+      unsigned count = ipa_get_param_count (info);
+      for (unsigned i = 0; i < count; i++)
+	{
+	  ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	  if (plats->bits_lattice.constant_p ())
+	    {
+	      found_useful_result = true;
+	      break;
+	    }
+	}
+
+    if (!found_useful_result)
+      continue;
+
+    ipcp_grow_transformations_if_necessary ();
+    ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+    vec_safe_reserve_exact (ts->bits, count);
+
+    for (unsigned i = 0; i < count; i++)
+      {
+	ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	ipa_bits bits_jfunc;			 
+
+	if (plats->bits_lattice.constant_p ())
+	  {
+	    bits_jfunc.known = true;
+	    bits_jfunc.value = plats->bits_lattice.get_value ();
+	    bits_jfunc.mask = plats->bits_lattice.get_mask ();
+	    bits_jfunc.sgn = plats->bits_lattice.get_sign ();
+	    bits_jfunc.precision = plats->bits_lattice.get_precision ();
+	  }
+	else
+	  bits_jfunc.known = false;
+
+	ts->bits->quick_push (bits_jfunc);
+	if (!dump_file || !bits_jfunc.known)
+	  continue;
+	if (!dumped_sth)
+	  {
+	    fprintf (dump_file, "Propagated bits info for function %s/%i:\n",
+				node->name (), node->order);
+	    dumped_sth = true;
+	  }
+	fprintf (dump_file, " param %i: value = ", i);
+	print_hex (bits_jfunc.value, dump_file);
+	fprintf (dump_file, ", mask = ");
+	print_hex (bits_jfunc.mask, dump_file);
+	fprintf (dump_file, "\n");
+      }
+    }
+}
 /* The IPCP driver.  */
 
 static unsigned int
@@ -4625,6 +4956,8 @@ ipcp_driver (void)
   ipcp_decision_stage (&topo);
   /* Store results of alignment propagation. */
   ipcp_store_alignment_results ();
+  /* Store results of bits propagation.  */
+  ipcp_store_bits_results ();
 
   /* Free all IPCP structures.  */
   free_toporder_info (&topo);
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 132b622..0913cc5 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -302,6 +302,15 @@ ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs)
 	}
       else
 	fprintf (f, "         Unknown alignment\n");
+
+      if (jump_func->bits.known)
+	{
+	  fprintf (f, "         value: "); print_hex (jump_func->bits.value, f);
+	  fprintf (f, ", mask: "); print_hex (jump_func->bits.mask, f);
+	  fprintf (f, "\n");
+	}
+      else
+	fprintf (f, "         Unknown bits\n");
     }
 }
 
@@ -381,6 +390,7 @@ ipa_set_jf_unknown (struct ipa_jump_func *jfunc)
 {
   jfunc->type = IPA_JF_UNKNOWN;
   jfunc->alignment.known = false;
+  jfunc->bits.known = false;
 }
 
 /* Set JFUNC to be a copy of another jmp (to be used by jump function
@@ -1674,6 +1684,27 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
       else
 	gcc_assert (!jfunc->alignment.known);
 
+      if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
+	{
+	  jfunc->bits.known = true;
+	  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
+	  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
+	  
+	  if (TREE_CODE (arg) == SSA_NAME)
+	    {
+	      jfunc->bits.value = 0;
+	      jfunc->bits.mask = widest_int::from (get_nonzero_bits (arg), UNSIGNED); 
+	    }
+	  else
+	    {
+	      jfunc->bits.value = wi::to_widest (arg);
+	      jfunc->bits.mask = 0;
+	    }
+	}
+      else
+	gcc_assert (!jfunc->bits.known);
+
       if (is_gimple_ip_invariant (arg)
 	  || (TREE_CODE (arg) == VAR_DECL
 	      && is_global_var (arg)
@@ -3690,6 +3721,18 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst,
       for (unsigned i = 0; i < src_alignments->length (); ++i)
 	dst_alignments->quick_push ((*src_alignments)[i]);
     }
+
+  if (src_trans && vec_safe_length (src_trans->bits) > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      src_trans = ipcp_get_transformation_summary (src);
+      const vec<ipa_bits, va_gc> *src_bits = src_trans->bits;
+      vec<ipa_bits, va_gc> *&dst_bits
+	= ipcp_get_transformation_summary (dst)->bits;
+      vec_safe_reserve_exact (dst_bits, src_bits->length ());
+      for (unsigned i = 0; i < src_bits->length (); ++i)
+	dst_bits->quick_push ((*src_bits)[i]);
+    }
 }
 
 /* Register our cgraph hooks if they are not already there.  */
@@ -4609,6 +4652,17 @@ ipa_write_jump_function (struct output_block *ob,
       streamer_write_uhwi (ob, jump_func->alignment.align);
       streamer_write_uhwi (ob, jump_func->alignment.misalign);
     }
+
+  bp = bitpack_create (ob->main_stream);
+  bp_pack_value (&bp, jump_func->bits.known, 1);
+  streamer_write_bitpack (&bp);
+  if (jump_func->bits.known)
+    {
+      streamer_write_wi (ob, jump_func->bits.value);
+      streamer_write_wi (ob, jump_func->bits.mask);
+      streamer_write_enum (ob->main_stream, signop, UNSIGNED + 1, jump_func->bits.sgn);
+      streamer_write_uhwi (ob, jump_func->bits.precision);
+    }   
 }
 
 /* Read in jump function JUMP_FUNC from IB.  */
@@ -4685,6 +4739,19 @@ ipa_read_jump_function (struct lto_input_block *ib,
     }
   else
     jump_func->alignment.known = false;
+
+  bp = streamer_read_bitpack (ib);
+  bool bits_known = bp_unpack_value (&bp, 1);
+  if (bits_known)
+    {
+      jump_func->bits.known = true;
+      jump_func->bits.value = streamer_read_wi (ib);
+      jump_func->bits.mask = streamer_read_wi (ib);
+      jump_func->bits.sgn = streamer_read_enum (ib, signop, UNSIGNED + 1);
+      jump_func->bits.precision = streamer_read_uhwi (ib); 
+    }
+  else
+    jump_func->bits.known = false;
 }
 
 /* Stream out parts of cgraph_indirect_call_info corresponding to CS that are
@@ -5050,6 +5117,31 @@ write_ipcp_transformation_info (output_block *ob, cgraph_node *node)
     }
   else
     streamer_write_uhwi (ob, 0);
+
+  ts = ipcp_get_transformation_summary (node);
+  if (ts && vec_safe_length (ts->bits) > 0)
+    {
+      count = ts->bits->length ();
+      streamer_write_uhwi (ob, count);
+
+      for (unsigned i = 0; i < count; ++i)
+	{
+	  const ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = bitpack_create (ob->main_stream);
+	  bp_pack_value (&bp, bits_jfunc.known, 1);
+	  streamer_write_bitpack (&bp);
+	  if (bits_jfunc.known)
+	    {
+	      streamer_write_wi (ob, bits_jfunc.value);
+	      streamer_write_wi (ob, bits_jfunc.mask);
+	      streamer_write_enum (ob->main_stream, signop,
+				   UNSIGNED + 1, bits_jfunc.sgn);
+	      streamer_write_uhwi (ob, bits_jfunc.precision);
+	    }
+	}
+    }
+  else
+    streamer_write_uhwi (ob, 0);
 }
 
 /* Stream in the aggregate value replacement chain for NODE from IB.  */
@@ -5102,6 +5194,28 @@ read_ipcp_transformation_info (lto_input_block *ib, cgraph_node *node,
 	    }
 	}
     }
+
+  count = streamer_read_uhwi (ib);
+  if (count > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+      vec_safe_grow_cleared (ts->bits, count);
+
+      for (i = 0; i < count; i++)
+	{
+	  ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = streamer_read_bitpack (ib);
+	  bits_jfunc.known = bp_unpack_value (&bp, 1);
+	  if (bits_jfunc.known)
+	    {
+	      bits_jfunc.value = streamer_read_wi (ib);
+	      bits_jfunc.mask = streamer_read_wi (ib);
+	      bits_jfunc.sgn = streamer_read_enum (ib, signop, UNSIGNED + 1);
+	      bits_jfunc.precision = streamer_read_uhwi (ib);
+	    }
+	}
+    }
 }
 
 /* Write all aggregate replacement for nodes in set.  */
@@ -5404,6 +5518,55 @@ ipcp_update_alignments (struct cgraph_node *node)
     }
 }
 
+/* Update bits info of formal parameters as described in
+   ipcp_transformation_summary.  */
+
+static void
+ipcp_update_bits (struct cgraph_node *node)
+{
+  tree parm = DECL_ARGUMENTS (node->decl);
+  tree next_parm = parm;
+  ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+
+  if (!ts || vec_safe_length (ts->bits) == 0)
+    return;
+
+  vec<ipa_bits, va_gc> &bits = *ts->bits;
+  unsigned count = bits.length ();
+
+  for (unsigned i = 0; i < count; ++i, parm = next_parm)
+    {
+      if (node->clone.combined_args_to_skip
+	  && bitmap_bit_p (node->clone.combined_args_to_skip, i))
+	continue;
+
+      gcc_checking_assert (parm);
+      next_parm = DECL_CHAIN (parm);
+
+      if (!bits[i].known
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (parm))
+	  || !is_gimple_reg (parm))
+	continue;       
+
+      tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm);
+      if (!ddef)
+	continue;
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Adjusting mask for param %u to ", i); 
+	  print_hex (bits[i].mask, dump_file);
+	  fprintf (dump_file, "\n");
+	}
+
+      unsigned prec = TYPE_PRECISION (TREE_TYPE (ddef));
+      wide_int nonzero_bits = wide_int::from (bits[i].mask, prec, UNSIGNED)
+			      | wide_int::from (bits[i].value, prec, bits[i].sgn);
+      set_nonzero_bits (ddef, nonzero_bits);
+      DECL_SET_BY_IPA (parm) = 1;
+    }
+}
+
 /* IPCP transformation phase doing propagation of aggregate values.  */
 
 unsigned int
@@ -5423,6 +5586,7 @@ ipcp_transform_function (struct cgraph_node *node)
 	     node->name (), node->order);
 
   ipcp_update_alignments (node);
+  ipcp_update_bits (node);
   aggval = ipa_get_agg_replacements_for_node (node);
   if (!aggval)
       return 0;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index e32d078..d69a071 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -154,6 +154,16 @@ struct GTY(()) ipa_alignment
   unsigned misalign;
 };
 
+/* Information about zero/non-zero bits.  */
+struct GTY(()) ipa_bits
+{
+  bool known;
+  widest_int value;
+  widest_int mask;
+  enum signop sgn;
+  unsigned precision;
+};
+
 /* A jump function for a callsite represents the values passed as actual
    arguments of the callsite. See enum jump_func_type for the various
    types of jump functions supported.  */
@@ -166,6 +176,9 @@ struct GTY (()) ipa_jump_func
   /* Information about alignment of pointers. */
   struct ipa_alignment alignment;
 
+  /* Information about zero/non-zero bits.  */
+  struct ipa_bits bits;
+
   enum jump_func_type type;
   /* Represents a value of a jump function.  pass_through is used only in jump
      function context.  constant represents the actual constant in constant jump
@@ -482,6 +495,8 @@ struct GTY(()) ipcp_transformation_summary
   ipa_agg_replacement_value *agg_values;
   /* Alignment information for pointers.  */
   vec<ipa_alignment, va_gc> *alignments;
+  /* Known bits information.  */
+  vec<ipa_bits, va_gc> *bits;
 };
 
 void ipa_set_node_agg_value_chain (struct cgraph_node *node,
diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index 1d56d21..01462e2 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -712,7 +712,7 @@ make_new_block (struct function *fn, unsigned int index)
 
 /* Read a wide-int.  */
 
-static widest_int
+widest_int
 streamer_read_wi (struct lto_input_block *ib)
 {
   HOST_WIDE_INT a[WIDE_INT_MAX_ELTS];
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index aa6b589..8fbd882 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1830,7 +1830,7 @@ output_ssa_names (struct output_block *ob, struct function *fn)
 
 /* Output a wide-int.  */
 
-static void
+void
 streamer_write_wi (struct output_block *ob,
 		   const widest_int &w)
 {
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index ecc1e5d..4da89d0 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -1225,4 +1225,7 @@ DEFINE_DECL_STREAM_FUNCS (TYPE_DECL, type_decl)
 DEFINE_DECL_STREAM_FUNCS (NAMESPACE_DECL, namespace_decl)
 DEFINE_DECL_STREAM_FUNCS (LABEL_DECL, label_decl)
 
+widest_int streamer_read_wi (struct lto_input_block *);
+void streamer_write_wi (struct output_block *, const widest_int &);
+
 #endif /* GCC_LTO_STREAMER_H  */
diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-1.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-1.c
new file mode 100644
index 0000000..2389d9f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-1.c
@@ -0,0 +1,33 @@
+/* Propagate 0xff from main to f3 to f2.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp -fdump-tree-optimized" } */
+
+int pass_test(void);
+int fail_test(void);
+
+__attribute__((noinline, noclone))
+static int f2(int x)
+{
+  if (x > 300)
+    return fail_test();
+  else
+    return pass_test();
+}
+
+__attribute__((noinline, noclone))
+static int f3(int y)
+{
+  int k = f2(y);
+  return k;
+}
+
+int main(int argc)
+{
+  int k = argc & 0xff;
+  int a = f3(k);
+  return a; 
+}
+
+/* { dg-final { scan-ipa-dump-times "Adjusting mask for param 0 to 0xff" 2 "cp" } } */
+/* { dg-final { scan-tree-dump-not "fail_test" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-2.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-2.c
new file mode 100644
index 0000000..8704cce
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-2.c
@@ -0,0 +1,37 @@
+/* x's mask should be meet(0xc, 0x3) == 0xf  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
+
+__attribute__((noinline))
+static int f1(int x)
+{
+  if (x > 300)
+    return 1;
+  else
+    return 2;
+}
+
+__attribute__((noinline))
+static int f2(int y)
+{
+  return f1(y & 0x03);
+}
+
+__attribute__((noinline))
+static int f3(int z)
+{
+  return f1(z & 0xc);
+}
+
+extern int a;
+extern int b;
+
+int main(void)
+{
+  int k = f2(a); 
+  int l = f3(b);
+  return k + l;
+}
+
+/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-3.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-3.c
new file mode 100644
index 0000000..1d0a2ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-3.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
+
+__attribute__((noinline))
+static int f(int x)
+{
+  int f2(int);
+
+  if (x > 300)
+    {
+      int z = f(x + 1);
+      return f2 (z);
+    }
+  else
+    return 2;
+}
+
+int main(int argc, char **argv)
+{
+  int k = f(argc & 0xff); 
+  return k;
+}
+
+/* { dg-final { scan-ipa-dump-not "Adjusting mask for" "cp" } } */  
diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-4.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-4.c
new file mode 100644
index 0000000..135cde9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-4.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
+
+__attribute__((noinline)) 
+static int f(int x)
+{
+  if (x > 300)
+    return 1;
+  else
+    return 2;
+}
+
+int main(void)
+{
+  int a = f(1);
+  int b = f(2);
+  int c = f(4);
+  return a + b + c;
+}
+
+/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0x7" "cp" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/prop-bits-5.c b/gcc/testsuite/gcc.dg/ipa/prop-bits-5.c
new file mode 100644
index 0000000..731f307
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/prop-bits-5.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
+
+__attribute__((noinline))
+static int f1(int x)
+{
+  if (x > 20)
+    return 1;
+  else
+    return 2;
+}
+
+__attribute__((noinline))
+static int f2(int y)
+{
+  return f1 (y & 0x3);
+}
+
+int main(int argc, char **argv)
+{
+  int z = f2 (argc & 0xff);
+  int k = f1 (argc & 0xc);
+  return z + k;
+}
+
+/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 6e8595c..9369cf7 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1558,7 +1558,8 @@ struct GTY(()) tree_decl_common {
   /* DECL_ALIGN.  It should have the same size as TYPE_ALIGN.  */
   unsigned int align : 6;
 
-  /* 20 bits unused.  */
+  unsigned set_by_ipa: 1;
+  /* 19 bits unused.  */
 
   /* UID for points-to sets, stable over copying from inlining.  */
   unsigned int pt_uid;
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index ae120a8..c75e9ce 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -142,7 +142,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "stor-layout.h"
 #include "optabs-query.h"
-
+#include "tree-ssa-ccp.h"
 
 /* Possible lattice values.  */
 typedef enum
@@ -287,7 +287,11 @@ get_default_value (tree var)
 		{
 		  val.lattice_val = CONSTANT;
 		  val.value = build_zero_cst (TREE_TYPE (var));
-		  val.mask = extend_mask (nonzero_bits);
+		  if (SSA_NAME_VAR (var) && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL
+		      && DECL_SET_BY_IPA (SSA_NAME_VAR (var)))
+		    val.mask = widest_int::from (nonzero_bits, TYPE_SIGN (TREE_TYPE (SSA_NAME_VAR (var))));
+		  else
+		    val.mask = extend_mask (nonzero_bits);
 		}
 	    }
 	}
@@ -537,9 +541,9 @@ set_lattice_value (tree var, ccp_prop_value_t *new_val)
 
 static ccp_prop_value_t get_value_for_expr (tree, bool);
 static ccp_prop_value_t bit_value_binop (enum tree_code, tree, tree, tree);
-static void bit_value_binop_1 (enum tree_code, tree, widest_int *, widest_int *,
-			       tree, const widest_int &, const widest_int &,
-			       tree, const widest_int &, const widest_int &);
+void bit_value_binop_1 (enum tree_code, signop, unsigned, widest_int *, widest_int *,
+			signop, unsigned, const widest_int &, const widest_int &,
+			signop, unsigned, const widest_int &, const widest_int &);
 
 /* Return a widest_int that can be used for bitwise simplifications
    from VAL.  */
@@ -895,7 +899,7 @@ do_dbg_cnt (void)
    Return TRUE when something was optimized.  */
 
 static bool
-ccp_finalize (bool nonzero_p)
+ccp_finalize (bool nonzero_p ATTRIBUTE_UNUSED)
 {
   bool something_changed;
   unsigned i;
@@ -913,10 +917,7 @@ ccp_finalize (bool nonzero_p)
 
       if (!name
 	  || (!POINTER_TYPE_P (TREE_TYPE (name))
-	      && (!INTEGRAL_TYPE_P (TREE_TYPE (name))
-		  /* Don't record nonzero bits before IPA to avoid
-		     using too much memory.  */
-		  || !nonzero_p)))
+	      && (!INTEGRAL_TYPE_P (TREE_TYPE (name)))))
 	continue;
 
       val = get_value (name);
@@ -1225,10 +1226,11 @@ ccp_fold (gimple *stmt)
    RVAL and RMASK representing a value of type RTYPE and set
    the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_unop_1 (enum tree_code code, tree type,
+void
+bit_value_unop_1 (enum tree_code code, signop type_sgn, unsigned type_precision, 
 		  widest_int *val, widest_int *mask,
-		  tree rtype, const widest_int &rval, const widest_int &rmask)
+		  signop rtype_sgn, unsigned rtype_precision,
+		  const widest_int &rval, const widest_int &rmask)
 {
   switch (code)
     {
@@ -1241,25 +1243,23 @@ bit_value_unop_1 (enum tree_code code, tree type,
       {
 	widest_int temv, temm;
 	/* Return ~rval + 1.  */
-	bit_value_unop_1 (BIT_NOT_EXPR, type, &temv, &temm, type, rval, rmask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   type, temv, temm, type, 1, 0);
+	bit_value_unop_1 (BIT_NOT_EXPR, type_sgn, type_precision, &temv, &temm,
+			  type_sgn, type_precision, rval, rmask);
+	bit_value_binop_1 (PLUS_EXPR, type_sgn, type_precision, val, mask,
+			   type_sgn, type_precision, temv, temm,
+			   type_sgn, type_precision, 1, 0);
 	break;
       }
 
     CASE_CONVERT:
       {
-	signop sgn;
-
 	/* First extend mask and value according to the original type.  */
-	sgn = TYPE_SIGN (rtype);
-	*mask = wi::ext (rmask, TYPE_PRECISION (rtype), sgn);
-	*val = wi::ext (rval, TYPE_PRECISION (rtype), sgn);
+	*mask = wi::ext (rmask, rtype_precision, rtype_sgn);
+	*val = wi::ext (rval, rtype_precision, rtype_sgn);
 
 	/* Then extend mask and value according to the target type.  */
-	sgn = TYPE_SIGN (type);
-	*mask = wi::ext (*mask, TYPE_PRECISION (type), sgn);
-	*val = wi::ext (*val, TYPE_PRECISION (type), sgn);
+	*mask = wi::ext (*mask, type_precision, type_sgn);
+	*val = wi::ext (*val, type_precision, type_sgn);
 	break;
       }
 
@@ -1273,15 +1273,16 @@ bit_value_unop_1 (enum tree_code code, tree type,
    R1VAL, R1MASK and R2VAL, R2MASK representing a values of type R1TYPE
    and R2TYPE and set the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_binop_1 (enum tree_code code, tree type,
+void
+bit_value_binop_1 (enum tree_code code, signop type_sgn, unsigned type_precision,
 		   widest_int *val, widest_int *mask,
-		   tree r1type, const widest_int &r1val,
-		   const widest_int &r1mask, tree r2type,
+		   signop r1type_sgn, unsigned r1type_precision,
+		   const widest_int &r1val, const widest_int &r1mask,
+		   signop r2type_sgn, unsigned r2type_precision,
 		   const widest_int &r2val, const widest_int &r2mask)
 {
-  signop sgn = TYPE_SIGN (type);
-  int width = TYPE_PRECISION (type);
+  signop sgn = type_sgn;
+  int width = (int) type_precision;
   bool swap_p = false;
 
   /* Assume we'll get a constant result.  Use an initial non varying
@@ -1407,11 +1408,11 @@ bit_value_binop_1 (enum tree_code code, tree type,
     case MINUS_EXPR:
       {
 	widest_int temv, temm;
-	bit_value_unop_1 (NEGATE_EXPR, r2type, &temv, &temm,
-			  r2type, r2val, r2mask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   r1type, r1val, r1mask,
-			   r2type, temv, temm);
+	bit_value_unop_1 (NEGATE_EXPR, r2type_sgn, r2type_precision, &temv, &temm,
+			  r2type_sgn, r2type_precision, r2val, r2mask);
+	bit_value_binop_1 (PLUS_EXPR, type_sgn, type_precision, val, mask,
+			   r1type_sgn, r1type_precision, r1val, r1mask,
+			   r2type_sgn, r2type_precision, temv, temm);
 	break;
       }
 
@@ -1473,7 +1474,7 @@ bit_value_binop_1 (enum tree_code code, tree type,
 	  break;
 
 	/* For comparisons the signedness is in the comparison operands.  */
-	sgn = TYPE_SIGN (r1type);
+	sgn = r1type_sgn;
 
 	/* If we know the most significant bits we know the values
 	   value ranges by means of treating varying bits as zero
@@ -1526,8 +1527,9 @@ bit_value_unop (enum tree_code code, tree type, tree rhs)
   gcc_assert ((rval.lattice_val == CONSTANT
 	       && TREE_CODE (rval.value) == INTEGER_CST)
 	      || wi::sext (rval.mask, TYPE_PRECISION (TREE_TYPE (rhs))) == -1);
-  bit_value_unop_1 (code, type, &value, &mask,
-		    TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask);
+  bit_value_unop_1 (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		    TYPE_SIGN (TREE_TYPE (rhs)), TYPE_PRECISION (TREE_TYPE (rhs)),
+		    value_to_wide_int (rval), rval.mask);
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1572,9 +1574,11 @@ bit_value_binop (enum tree_code code, tree type, tree rhs1, tree rhs2)
 	       && TREE_CODE (r2val.value) == INTEGER_CST)
 	      || wi::sext (r2val.mask,
 			   TYPE_PRECISION (TREE_TYPE (rhs2))) == -1);
-  bit_value_binop_1 (code, type, &value, &mask,
-		     TREE_TYPE (rhs1), value_to_wide_int (r1val), r1val.mask,
-		     TREE_TYPE (rhs2), value_to_wide_int (r2val), r2val.mask);
+  bit_value_binop_1 (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		     TYPE_SIGN (TREE_TYPE (rhs1)), TYPE_PRECISION (TREE_TYPE (rhs1)),
+		     value_to_wide_int (r1val), r1val.mask,
+		     TYPE_SIGN (TREE_TYPE (rhs2)), TYPE_PRECISION (TREE_TYPE (rhs2)),
+		     value_to_wide_int (r2val), r2val.mask);
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1673,9 +1677,9 @@ bit_value_assume_aligned (gimple *stmt, tree attr, ccp_prop_value_t ptrval,
 
   align = build_int_cst_type (type, -aligni);
   alignval = get_value_for_expr (align, true);
-  bit_value_binop_1 (BIT_AND_EXPR, type, &value, &mask,
-		     type, value_to_wide_int (ptrval), ptrval.mask,
-		     type, value_to_wide_int (alignval), alignval.mask);
+  bit_value_binop_1 (BIT_AND_EXPR, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		     TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (ptrval), ptrval.mask,
+		     TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (alignval), alignval.mask);
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
diff --git a/gcc/tree-ssa-ccp.h b/gcc/tree-ssa-ccp.h
new file mode 100644
index 0000000..b76a834
--- /dev/null
+++ b/gcc/tree-ssa-ccp.h
@@ -0,0 +1,30 @@
+/* Copyright (C) 2016-2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_SSA_CCP_H
+#define TREE_SSA_CCP_H
+
+void bit_value_binop_1 (enum tree_code, signop, unsigned, widest_int *, widest_int *,
+			signop, unsigned, const widest_int &, const widest_int &,
+			signop, unsigned, const widest_int &, const widest_int &);
+
+void bit_value_unop_1 (enum tree_code, signop, unsigned, widest_int *, widest_int *,
+		       signop, unsigned, const widest_int &, const widest_int &);
+
+
+#endif
diff --git a/gcc/tree.h b/gcc/tree.h
index fff65d6..e21b31d 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -2346,6 +2346,9 @@ extern machine_mode element_mode (const_tree t);
 #define DECL_IGNORED_P(NODE) \
   (DECL_COMMON_CHECK (NODE)->decl_common.ignored_flag)
 
+#define DECL_SET_BY_IPA(NODE) \
+  (DECL_COMMON_CHECK (NODE)->decl_common.set_by_ipa)
+
 /* Nonzero for a given ..._DECL node means that this node represents an
    "abstract instance" of the given declaration (e.g. in the original
    declaration of an inline function).  When generating symbolic debugging


* Re: [RFC] ipa bitwise constant propagation
  2016-08-04  6:36 [RFC] ipa bitwise constant propagation Prathamesh Kulkarni
@ 2016-08-04  8:02 ` Richard Biener
  2016-08-04  8:57   ` Prathamesh Kulkarni
  2016-08-04 13:05   ` Jan Hubicka
  2016-08-05 12:37 ` Martin Jambor
  1 sibling, 2 replies; 31+ messages in thread
From: Richard Biener @ 2016-08-04  8:02 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Jan Hubicka, Martin Jambor, Kugan Vivekanandarajah, gcc Patches

On Thu, 4 Aug 2016, Prathamesh Kulkarni wrote:

> Hi,
> This is a prototype patch for propagating known/unknown bits inter-procedurally.
> for integral types which propagates info obtained from get_nonzero_bits ().
> 
> Patch required making following changes:
> a) To make info from get_nonzero_bits() available to ipa, I had to remove
> guard !nonzero_p in ccp_finalize. However that triggered the following ICE
> in get_ptr_info() for default_none.f95 (and several other fortran tests)
> with options: -fopenacc -O2
> ICE: http://pastebin.com/KjD7HMQi
> I confirmed with Richard that this was a latent issue.

Can you please bootstrap/test the fix for this separately?  (doesn't
seem to be included in this patch btw)

> b) I chose widest_int for representing value, mask in ipcp_bits_lattice
> and correspondingly changed declarations for
> bit_value_unop_1/bit_value_binop_1 to take
> precision and sign instead of type (those are the only two fields that
> were used). Both these functions are exported by tree-ssa-ccp.h
> I hope that's ok ?

That's ok, but please change the functions to overloads of
bit_value_binop / bit_value_unop to not export ugly _1 names.

-  signop sgn = TYPE_SIGN (type);
-  int width = TYPE_PRECISION (type);
+  signop sgn = type_sgn;
+  int width = (int) type_precision;

please adjust parameter names to get rid of those now unnecessary
locals (and make the precision parameter an 'int').

> c) Changed streamer_read_wi/streamer_write_wi to non-static.
> Ah I see Kugan has submitted a patch for this, so I will drop this hunk.

But he streams wide_int, not widest_int.  I followed up on his
patch.

> d) We have following in tree-ssa-ccp.c:get_default_value ():
>           if (flag_tree_bit_ccp)
>             {
>               wide_int nonzero_bits = get_nonzero_bits (var);
>               if (nonzero_bits != -1)
>                 {
>                   val.lattice_val = CONSTANT;
>                   val.value = build_zero_cst (TREE_TYPE (var));
>                   val.mask = extend_mask (nonzero_bits);
>                 }
> 
> extend_mask() sets all upper bits to 1 in nonzero_bits, ie, varying
> in terms of bit-ccp.
> I suppose in tree-ccp we need to extend mask if var is parameter since we don't
> know in advance what values it will receive from different callers and mark all
> upper bits as 1 to be safe.

Not sure, it seems to me that we can zero-extend for unsigned types
and sign-extend for signed types (if the "sign"-bit of nonzero_bits
is one it properly makes higher bits undefined).  Can you change
the code accordingly?  (simply give extend_mask a sign-op and use
that appropriately?)  Please split out this change so it can be
tested separately.

> However I suppose with ipa, we can determine exactly which bits of
> parameter are constant and
> setting all upper bits to 1 will become unnecessary ?
> 
> For example, consider following artificial test-case:
> int f(int x)
> {
>   if (x > 300)
>     return 1;
>   else
>     return 2;
> }
> 
> int main(int argc, char **argv)
> {
>   return f(argc & 0xc) + f (argc & 0x3);
> }
> 
> For x, the mask would be meet of:
> <0, 0xc> meet <0, 0x3> == (0x3 | 0xc) | (0 ^ 0) == 0xf
> and ipcp_update_bits() sets nonzero_bits for x to 0xf.
> However get_default_value then calls extend_mask (0xf), resulting in
> all upper bits
> being set to 1 and consequently the condition if (x > 300) doesn't get folded.
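
For illustration only (not part of the patch): a minimal sketch of the
meet quoted above, using a plain uint64_t as a stand-in for widest_int.
The struct and function names are hypothetical, and clearing the unknown
bits of the value is an assumption of this sketch, not something the
patch is known to do.  A set mask bit means "this bit is unknown".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a <value, mask> lattice element.
// A mask bit that is set marks the corresponding value bit as unknown.
struct bits_lattice
{
  uint64_t value;
  uint64_t mask;
};

// Meet of two elements: a bit stays known only if it is known on both
// sides and both sides agree on its value.
bits_lattice
meet (bits_lattice a, bits_lattice b)
{
  uint64_t mask = (a.mask | b.mask) | (a.value ^ b.value);
  // Assumption of this sketch: normalise by clearing unknown value bits.
  return { a.value & ~mask, mask };
}

// The example from the mail: <0, 0xc> meet <0, 0x3> gives mask 0xf.
void
check_meet_example ()
{
  bits_lattice r = meet ({ 0, 0xc }, { 0, 0x3 });
  assert (r.value == 0);
  assert (r.mask == 0xf);
}
```

Calling check_meet_example () reproduces the 0xf mask computed for x in
the test case above.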

But then why would the code trying to optimize the comparison look at
bits that are outside of the precision?  (where do we try to use this
info?  I see that VRP misses to use nonzero bits if no range info
is present - I suppose set_nonzero_bits misses to eventually adjust
the range.)

That said, where is the folding code and why does it care for those
"uninteresting" bits at all?

> To resolve this, I added a new flag "set_by_ipa" to decl_common,
> which is set to true if the mask of parameter is determined by ipa-cp,
> and the condition changes to:
> 
> if (SSA_NAME_VAR (var)
>     && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL
>     && DECL_SET_BY_IPA (SSA_NAME_VAR (var))
>   val.mask = widest_int::from (nonzero_bits,
>                           TYPE_SIGN (TREE_TYPE (SSA_NAME_VAR (var)));
> else
>   val.mask = extend_mask (nonzero_bits);
> 
> I am not sure if adding a new flag to decl_common is a good idea. How
> do other ipa passes deal with this/similar issue ?
> 
> I suppose we would want to gate this on some flag, say -fipa-bit-cp ?
> I haven't yet gated it on the flag, will do in next version of patch.
> I have added some very simple test-cases, I will try to add more
> meaningful ones.

See above - we should avoid needing this.

> Patch passes bootstrap+test on x86_64-unknown-linux-gnu
> and cross-tested on arm*-*-* and aarch64*-*-* with the exception
> of some fortran tests failing due to above ICE.
> 
> As next steps, I am planning to extend it to handle alignment propagation,
> and do further testing (lto-bootstrap, chromium).
> I would be grateful for feedback on the current patch.

I see you do

@@ -895,7 +899,7 @@ do_dbg_cnt (void)
    Return TRUE when something was optimized.  */

 static bool
-ccp_finalize (bool nonzero_p)
+ccp_finalize (bool nonzero_p ATTRIBUTE_UNUSED)
 {
   bool something_changed;
   unsigned i;
@@ -913,10 +917,7 @@ ccp_finalize (bool nonzero_p)

       if (!name
          || (!POINTER_TYPE_P (TREE_TYPE (name))
-             && (!INTEGRAL_TYPE_P (TREE_TYPE (name))
-                 /* Don't record nonzero bits before IPA to avoid
-                    using too much memory.  */
-                 || !nonzero_p)))
+             && (!INTEGRAL_TYPE_P (TREE_TYPE (name)))))
        continue;

can you instead adjust the caller to do sth like

  if (ccp_finalize (nonzero_p || flag_ipa_cp))
    {

?  What we miss to optimize memory usage in the early CCP case
(it's run very early, before dead code elimination) is to
avoid setting alignment / nonzero bits for the case of
fully propagatable (and thus dead after substitute_and_fold)
SSA names.

So in ccp_finalize do sth like

      val = get_value (name);
      if (val->lattice_val != CONSTANT
          || TREE_CODE (val->value) != INTEGER_CST
          || val->mask == 0)
        continue;

That should cut down early CCP memory use in case of nonzero
setting significantly.

I didn't look at the propagation part but eventually the IPA-CP
lattice gets quite big.  Also the alignment lattice is very
similar to the bits lattice so why not merge those two?  But
in the end it's Martin's/Honza's call here.  Note there is
trailing_wide_ints <> which could be used to improve memory usage
based on the underlying type.

Thanks,
Richard.


* Re: [RFC] ipa bitwise constant propagation
  2016-08-04  8:02 ` Richard Biener
@ 2016-08-04  8:57   ` Prathamesh Kulkarni
  2016-08-04  9:07     ` kugan
  2016-08-04 10:51     ` Richard Biener
  2016-08-04 13:05   ` Jan Hubicka
  1 sibling, 2 replies; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-04  8:57 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jan Hubicka, Martin Jambor, Kugan Vivekanandarajah, gcc Patches

On 4 August 2016 at 13:31, Richard Biener <rguenther@suse.de> wrote:
> On Thu, 4 Aug 2016, Prathamesh Kulkarni wrote:
>
>> Hi,
>> This is a prototype patch for propagating known/unknown bits inter-procedurally.
>> for integral types which propagates info obtained from get_nonzero_bits ().
>>
>> Patch required making following changes:
>> a) To make info from get_nonzero_bits() available to ipa, I had to remove
>> guard !nonzero_p in ccp_finalize. However that triggered the following ICE
>> in get_ptr_info() for default_none.f95 (and several other fortran tests)
>> with options: -fopenacc -O2
>> ICE: http://pastebin.com/KjD7HMQi
>> I confirmed with Richard that this was a latent issue.
>
> Can you please bootstrap/test the fix for this separately?  (doesn't
> seem to be included in this patch btw)
Well I don't have the fix available -;)
>
>> b) I chose widest_int for representing value, mask in ipcp_bits_lattice
>> and correspondingly changed declarations for
>> bit_value_unop_1/bit_value_binop_1 to take
>> precision and sign instead of type (those are the only two fields that
>> were used). Both these functions are exported by tree-ssa-ccp.h
>> I hope that's ok ?
>
> That's ok, but please change the functions to overloads of
> bit_value_binop / bit_value_unop to not export ugly _1 names.
>
> -  signop sgn = TYPE_SIGN (type);
> -  int width = TYPE_PRECISION (type);
> +  signop sgn = type_sgn;
> +  int width = (int) type_precision;
>
> please adjust parameter names to get rid of those now unnecessary
> locals (and make the precision parameter an 'int').
>
>> c) Changed streamer_read_wi/streamer_write_wi to non-static.
>> Ah I see Kugan has submitted a patch for this, so I will drop this hunk.
>
> But he streams wide_int, not widest_int.  I followed up on his
> patch.
Oops, I got confused, sorry about that.
>
>> d) We have following in tree-ssa-ccp.c:get_default_value ():
>>           if (flag_tree_bit_ccp)
>>             {
>>               wide_int nonzero_bits = get_nonzero_bits (var);
>>               if (nonzero_bits != -1)
>>                 {
>>                   val.lattice_val = CONSTANT;
>>                   val.value = build_zero_cst (TREE_TYPE (var));
>>                   val.mask = extend_mask (nonzero_bits);
>>                 }
>>
>> extend_mask() sets all upper bits to 1 in nonzero_bits, ie, varying
>> in terms of bit-ccp.
>> I suppose in tree-ccp we need to extend mask if var is parameter since we don't
>> know in advance what values it will receive from different callers and mark all
>> upper bits as 1 to be safe.
>
> Not sure, it seems to me that we can zero-extend for unsigned types
> and sign-extend for signed types (if the "sign"-bit of nonzero_bits
> is one it properly makes higher bits undefined).  Can you change
> the code accordingly?  (simply give extend_mask a sign-op and use
> that appropriately?)  Please split out this change so it can be
> tested separately.
>
>> However I suppose with ipa, we can determine exactly which bits of
>> parameter are constant and
>> setting all upper bits to 1 will become unnecessary ?
>>
>> For example, consider following artificial test-case:
>> int f(int x)
>> {
>>   if (x > 300)
>>     return 1;
>>   else
>>     return 2;
>> }
>>
>> int main(int argc, char **argv)
>> {
>>   return f(argc & 0xc) + f (argc & 0x3);
>> }
>>
>> For x, the mask would be meet of:
>> <0, 0xc> meet <0, 0x3> == (0x3 | 0xc) | (0 ^ 0) == 0xf
>> and ipcp_update_bits() sets nonzero_bits for x to 0xf.
>> However get_default_value then calls extend_mask (0xf), resulting in
>> all upper bits
>> being set to 1 and consequently the condition if (x > 300) doesn't get folded.
>
> But then why would the code trying to optimize the comparison look at
> bits that are outside of the precision?  (where do we try to use this
> info?  I see that VRP misses to use nonzero bits if no range info
> is present - I suppose set_nonzero_bits misses to eventually adjust
> the range.)
>
> That said, where is the folding code and why does it care for those
> "uninteresting" bits at all?
Well there is following in bit_value_binop_1 for case LT_EXPR / LE_EXPR:
        /* If the most significant bits are not known we know nothing.  */
        if (wi::neg_p (o1mask) || wi::neg_p (o2mask))
          break;

IIUC extend_mask sets all upper bits to 1, so we hit the break and
thus do not perform the folding.
ccp2 dump shows:
Folding statement: if (x_2(D) > 300)
which is likely CONSTANT
Not folded

Instead if we extend based on signop, then the condition gets folded correctly:
Folding statement: if (x_2(D) > 300)
which is likely CONSTANT
Folding predicate x_2(D) > 300 to 0
gimple_simplified to if (0 != 0)
Folded into: if (0 != 0)

I thought it was unsafe for ccp to extend based on sign-op,
so I guarded that on DECL_SET_BY_IPA.
I will try to change extend_mask to extend the mask based on signop
and get rid of the flag.
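
As a rough illustration of the difference (a simplified model, not the
code in tree-ssa-ccp.c): int64_t stands in for widest_int, the parameter
is assumed to be a 32-bit int, and all function names below are
hypothetical.  The comparison helper only mimics the "most significant
bits unknown" early exit and a min/max range test.

```cpp
#include <cassert>
#include <cstdint>

// Model: a set mask bit means "unknown"; precision is fixed at 32 bits.

// Behaviour described above: every bit above the precision is unknown.
int64_t
extend_all_ones (uint32_t nonzero_bits)
{
  return static_cast<int64_t> (~uint64_t{0xffffffffu} | nonzero_bits);
}

// Alternative: extend the mask according to the signedness of the type.
int64_t
extend_by_sign (uint32_t nonzero_bits, bool is_signed)
{
  if (is_signed)
    return static_cast<int64_t> (static_cast<int32_t> (nonzero_bits));
  return static_cast<int64_t> (nonzero_bits);
}

// Try to fold "x > c" from <value, mask>.
// Returns 1 (always true), 0 (always false) or -1 (not folded).
int
fold_gt (int64_t value, int64_t mask, int64_t c)
{
  // Mirrors the wi::neg_p check: unknown top bits mean we know nothing.
  if (mask < 0)
    return -1;
  int64_t min = value & ~mask;  // all unknown bits treated as zero
  int64_t max = value | mask;   // all unknown bits treated as one
  if (min > c)
    return 1;
  if (max <= c)
    return 0;
  return -1;
}

void
check_fold_example ()
{
  // nonzero_bits == 0xf for x, as in the test case above.
  assert (fold_gt (0, extend_all_ones (0xf), 300) == -1);
  assert (fold_gt (0, extend_by_sign (0xf, true), 300) == 0);
  // If the sign bit itself may be nonzero, sign extension still
  // leaves the upper bits unknown, so nothing is folded.
  assert (fold_gt (0, extend_by_sign (0x80000000u, true), 300) == -1);
}
```

With the all-ones extension the mask is negative and the comparison is
left alone; with sign-based extension x is known to lie in [0, 15] and
x > 300 folds to false, matching the two ccp2 dumps above.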

I will address your other comments in follow-up patch.

Thanks,
Prathamesh
>
>> To resolve this, I added a new flag "set_by_ipa" to decl_common,
>> which is set to true if the mask of parameter is determined by ipa-cp,
>> and the condition changes to:
>>
>> if (SSA_NAME_VAR (var)
>>     && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL
>>     && DECL_SET_BY_IPA (SSA_NAME_VAR (var))
>>   val.mask = widest_int::from (nonzero_bits,
>>                           TYPE_SIGN (TREE_TYPE (SSA_NAME_VAR (var)));
>> else
>>   val.mask = extend_mask (nonzero_bits);
>>
>> I am not sure if adding a new flag to decl_common is a good idea. How
>> do other ipa passes deal with this/similar issue ?
>>
>> I suppose we would want to gate this on some flag, say -fipa-bit-cp ?
>> I haven't yet gated it on the flag, will do in next version of patch.
>> I have added some very simple test-cases, I will try to add more
>> meaningful ones.
>
> See above - we should avoid needing this.
>
>> Patch passes bootstrap+test on x86_64-unknown-linux-gnu
>> and cross-tested on arm*-*-* and aarch64*-*-* with the exception
>> of some fortran tests failing due to above ICE.
>>
>> As next steps, I am planning to extend it to handle alignment propagation,
>> and do further testing (lto-bootstrap, chromium).
>> I would be grateful for feedback on the current patch.
>
> I see you do
>
> @@ -895,7 +899,7 @@ do_dbg_cnt (void)
>     Return TRUE when something was optimized.  */
>
>  static bool
> -ccp_finalize (bool nonzero_p)
> +ccp_finalize (bool nonzero_p ATTRIBUTE_UNUSED)
>  {
>    bool something_changed;
>    unsigned i;
> @@ -913,10 +917,7 @@ ccp_finalize (bool nonzero_p)
>
>        if (!name
>           || (!POINTER_TYPE_P (TREE_TYPE (name))
> -             && (!INTEGRAL_TYPE_P (TREE_TYPE (name))
> -                 /* Don't record nonzero bits before IPA to avoid
> -                    using too much memory.  */
> -                 || !nonzero_p)))
> +             && (!INTEGRAL_TYPE_P (TREE_TYPE (name)))))
>         continue;
>
> can you instead adjust the caller to do sth like
>
>   if (ccp_finalize (nonzero_p || flag_ipa_cp))
>     {
>
> ?  What we miss to optimize memory usage in the early CCP case
> (it's run very early, before dead code elimination) is to
> avoid setting alignment / nonzero bits for the case of
> fully propagatable (and thus dead after substitute_and_fold)
> SSA names.
>
> So in ccp_finalize do sth like
>
>       val = get_value (name);
>       if (val->lattice_val != CONSTANT
>           || TREE_CODE (val->value) != INTEGER_CST
>           || val->mask == 0)
>         continue;
>
> That should cut down early CCP memory use in case of nonzero
> setting significantly.
>
> I didn't look at the propagation part but eventually the IPA-CP
> lattice gets quite big.  Also the alignment lattice is very
> similar to the bits lattice so why not merge those two?  But
> in the end it's Martin's/Honza's call here.  Note there is
> trailing_wide_ints <> which could be used to improve memory usage
> based on the underlying type.
>
> Thanks,
> Richard.


* Re: [RFC] ipa bitwise constant propagation
  2016-08-04  8:57   ` Prathamesh Kulkarni
@ 2016-08-04  9:07     ` kugan
  2016-08-04 10:51     ` Richard Biener
  1 sibling, 0 replies; 31+ messages in thread
From: kugan @ 2016-08-04  9:07 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener
  Cc: Jan Hubicka, Martin Jambor, gcc Patches



On 04/08/16 18:57, Prathamesh Kulkarni wrote:
> On 4 August 2016 at 13:31, Richard Biener <rguenther@suse.de> wrote:
>> On Thu, 4 Aug 2016, Prathamesh Kulkarni wrote:
>>
>>> Hi,
>>> This is a prototype patch for propagating known/unknown bits inter-procedurally.
>>> for integral types which propagates info obtained from get_nonzero_bits ().
>>>
>>> Patch required making following changes:
>>> a) To make info from get_nonzero_bits() available to ipa, I had to remove
>>> guard !nonzero_p in ccp_finalize. However that triggered the following ICE
>>> in get_ptr_info() for default_none.f95 (and several other fortran tests)
>>> with options: -fopenacc -O2
>>> ICE: http://pastebin.com/KjD7HMQi
>>> I confirmed with Richard that this was a latent issue.
>>
>> Can you please bootstrap/test the fix for this separately?  (doesn't
>> seem to be included in this patch btw)
> Well I don't have the fix available -;)

This looks like what I fixed in 
https://patchwork.ozlabs.org/patch/648662/. I will commit that soon.

Thanks,
Kugan


* Re: [RFC] ipa bitwise constant propagation
  2016-08-04  8:57   ` Prathamesh Kulkarni
  2016-08-04  9:07     ` kugan
@ 2016-08-04 10:51     ` Richard Biener
  1 sibling, 0 replies; 31+ messages in thread
From: Richard Biener @ 2016-08-04 10:51 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Jan Hubicka, Martin Jambor, Kugan Vivekanandarajah, gcc Patches

On Thu, 4 Aug 2016, Prathamesh Kulkarni wrote:

> On 4 August 2016 at 13:31, Richard Biener <rguenther@suse.de> wrote:
> > On Thu, 4 Aug 2016, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> This is a prototype patch for propagating known/unknown bits inter-procedurally.
> >> for integral types which propagates info obtained from get_nonzero_bits ().
> >>
> >> Patch required making following changes:
> >> a) To make info from get_nonzero_bits() available to ipa, I had to remove
> >> guard !nonzero_p in ccp_finalize. However that triggered the following ICE
> >> in get_ptr_info() for default_none.f95 (and several other fortran tests)
> >> with options: -fopenacc -O2
> >> ICE: http://pastebin.com/KjD7HMQi
> >> I confirmed with Richard that this was a latent issue.
> >
> > Can you please bootstrap/test the fix for this separately?  (doesn't
> > seem to be included in this patch btw)
> Well I don't have the fix available -;)

Oh, I thought it was obvious:

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c   (revision 239117)
+++ gcc/tree-inline.c   (working copy)
@@ -242,7 +242,8 @@ remap_ssa_name (tree name, copy_body_dat
       SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_tree)
        = SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name);
       /* At least IPA points-to info can be directly transferred.  */
-      if (id->src_cfun->gimple_df
+      if (POINTER_TYPE_P (TREE_TYPE (name))
+         && id->src_cfun->gimple_df
          && id->src_cfun->gimple_df->ipa_pta
          && (pi = SSA_NAME_PTR_INFO (name))
          && !pi->pt.anything)
@@ -274,7 +275,8 @@ remap_ssa_name (tree name, copy_body_dat
       SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_tree)
        = SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name);
       /* At least IPA points-to info can be directly transferred.  */
-      if (id->src_cfun->gimple_df
+      if (POINTER_TYPE_P (TREE_TYPE (name))
+         && id->src_cfun->gimple_df
          && id->src_cfun->gimple_df->ipa_pta
          && (pi = SSA_NAME_PTR_INFO (name))
          && !pi->pt.anything)

Similarly, range info could be transferred, of course.

> >
> >> b) I chose widest_int for representing value, mask in ipcp_bits_lattice
> >> and correspondingly changed declarations for
> >> bit_value_unop_1/bit_value_binop_1 to take
> >> precision and sign instead of type (those are the only two fields that
> >> were used). Both these functions are exported by tree-ssa-ccp.h
> >> I hope that's ok ?
> >
> > That's ok, but please change the functions to overloads of
> > bit_value_binop / bit_value_unop to not export ugly _1 names.
> >
> > -  signop sgn = TYPE_SIGN (type);
> > -  int width = TYPE_PRECISION (type);
> > +  signop sgn = type_sgn;
> > +  int width = (int) type_precision;
> >
> > please adjust parameter names to get rid of those now unnecessary
> > locals (and make the precision parameter an 'int').
> >
> >> c) Changed streamer_read_wi/streamer_write_wi to non-static.
> >> Ah I see Kugan has submitted a patch for this, so I will drop this hunk.
> >
> > But he streams wide_int, not widest_int.  I followed up on his
> > patch.
> Oops, I got confused, sorry about that.
> >
> >> d) We have following in tree-ssa-ccp.c:get_default_value ():
> >>           if (flag_tree_bit_ccp)
> >>             {
> >>               wide_int nonzero_bits = get_nonzero_bits (var);
> >>               if (nonzero_bits != -1)
> >>                 {
> >>                   val.lattice_val = CONSTANT;
> >>                   val.value = build_zero_cst (TREE_TYPE (var));
> >>                   val.mask = extend_mask (nonzero_bits);
> >>                 }
> >>
> >> extend_mask() sets all upper bits to 1 in nonzero_bits, ie, varying
> >> in terms of bit-ccp.
> >> I suppose in tree-ccp we need to extend mask if var is parameter since we don't
> >> know in advance what values it will receive from different callers and mark all
> >> upper bits as 1 to be safe.
> >
> > Not sure, it seems to me that we can zero-extend for unsigned types
> > and sign-extend for signed types (if the "sign"-bit of nonzero_bits
> > is one it properly makes higher bits undefined).  Can you change
> > the code accordingly?  (simply give extend_mask a sign-op and use
> > that appropriately?)  Please split out this change so it can be
> > tested separately.
> >
> >> However I suppose with ipa, we can determine exactly which bits of
> >> parameter are constant and
> >> setting all upper bits to 1 will become unnecessary ?
> >>
> >> For example, consider following artificial test-case:
> >> int f(int x)
> >> {
> >>   if (x > 300)
> >>     return 1;
> >>   else
> >>     return 2;
> >> }
> >>
> >> int main(int argc, char **argv)
> >> {
> >>   return f(argc & 0xc) + f (argc & 0x3);
> >> }
> >>
> >> For x, the mask would be meet of:
> >> <0, 0xc> meet <0, 0x3> == (0x3 | 0xc) | (0 ^ 0) == 0xf
> >> and ipcp_update_bits() sets nonzero_bits for x to 0xf.
> >> However get_default_value then calls extend_mask (0xf), resulting in
> >> all upper bits
> >> being set to 1 and consequently the condition if (x > 300) doesn't get folded.
> >
> > But then why would the code trying to optimize the comparison look at
> > bits that are outside of the precision?  (where do we try to use this
> > info?  I see that VRP misses to use nonzero bits if no range info
> > is present - I suppose set_nonzero_bits misses to eventually adjust
> > the range.)
> >
> > That said, where is the folding code and why does it care for those
> > "uninteresting" bits at all?
> Well there is following in bit_value_binop_1 for case LT_EXPR / LE_EXPR:
>         /* If the most significant bits are not known we know nothing.  */
>         if (wi::neg_p (o1mask) || wi::neg_p (o2mask))
>           break;
> 
> IIUC extend_mask sets all upper bits to 1, so we hit the break and
> thus do not perform folding.

Yeah, this should simply test _the_ most significant bit
(that is, bit number TYPE_PRECISION (r1type)).

But it should be fixed by properly extending the mask.
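
To make the difference concrete, here is a minimal standalone sketch (plain C++, not GCC code) of the two ways of widening a nonzero-bits mask. The 64-bit `wide` type only stands in for widest_int, and `extend_all_ones`, `extend_by_sign` and `top_bit_unknown` are hypothetical names; the last one merely mimics what a wi::neg_p test on the widened mask would see. Bits are numbered from zero in this sketch, so the sign bit sits at position precision - 1.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for widest_int: a 64-bit container for a narrower value.
using wide = std::uint64_t;

// Mark every bit at or above `precision` as unknown (set to 1).
wide extend_all_ones(wide nonzero_bits, unsigned precision) {
    assert(precision >= 1 && precision <= 64);
    if (precision == 64) return nonzero_bits;
    return nonzero_bits | (~wide{0} << precision);
}

// Zero-extend for unsigned types; sign-extend for signed types, so the
// upper bits become unknown only when the sign bit itself is unknown.
wide extend_by_sign(wide nonzero_bits, unsigned precision, bool is_signed) {
    assert(precision >= 1 && precision <= 64);
    if (precision == 64) return nonzero_bits;
    wide low = nonzero_bits & ((wide{1} << precision) - 1);
    bool sign_bit_set = ((low >> (precision - 1)) & 1) != 0;
    if (is_signed && sign_bit_set) return low | (~wide{0} << precision);
    return low;
}

// What a "negative mask" test on the widened value observes.
bool top_bit_unknown(wide mask) { return ((mask >> 63) & 1) != 0; }
```

With the nonzero bits 0xf from the f (argc & 0xc) + f (argc & 0x3) test case and a 32-bit int, the all-ones extension makes the widened mask look negative, while extending by sign leaves it as 0xf, so a comparison against 300 is not blocked by the upper bits.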

> ccp2 dump shows:
> Folding statement: if (x_2(D) > 300)
> which is likely CONSTANT
> Not folded
> 
> Instead if we extend based on signop, then the condition gets folded correctly:
> Folding statement: if (x_2(D) > 300)
> which is likely CONSTANT
> Folding predicate x_2(D) > 300 to 0
> gimple_simplified to if (0 != 0)
> Folded into: if (0 != 0)
> 
> I thought it was unsafe for ccp to extend based on sign-op,
> so I guarded that on DECL_SET_BY_IPA.
> I will try to change extend_mask to extend the mask based on signop
> and get rid of the flag.
> 
> I will address your other comments in follow-up patch.

Thanks,
Richard.

> Thanks,
> Prathamesh
> >
> >> To resolve this, I added a new flag "set_by_ipa" to decl_common,
> >> which is set to true if the mask of parameter is determined by ipa-cp,
> >> and the condition changes to:
> >>
> >> if (SSA_NAME_VAR (var)
> >>     && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL
> >>     && DECL_SET_BY_IPA (SSA_NAME_VAR (var)))
> >>   val.mask = widest_int::from (nonzero_bits,
> >>                           TYPE_SIGN (TREE_TYPE (SSA_NAME_VAR (var))));
> >> else
> >>   val.mask = extend_mask (nonzero_bits);
> >>
> >> I am not sure if adding a new flag to decl_common is a good idea. How
> >> do other ipa passes deal with this/similar issue ?
> >>
> >> I suppose we would want to gate this on some flag, say -fipa-bit-cp ?
> >> I haven't yet gated it on the flag, will do in next version of patch.
> >> I have added some very simple test-cases, I will try to add more
> >> meaningful ones.
> >
> > See above - we should avoid needing this.
> >
> >> Patch passes bootstrap+test on x86_64-unknown-linux-gnu
> >> and cross-tested on arm*-*-* and aarch64*-*-* with the exception
> >> of some fortran tests failing due to above ICE.
> >>
> >> As next steps, I am planning to extend it to handle alignment propagation,
> >> and do further testing (lto-bootstrap, chromium).
> >> I would be grateful for feedback on the current patch.
> >
> > I see you do
> >
> > @@ -895,7 +899,7 @@ do_dbg_cnt (void)
> >     Return TRUE when something was optimized.  */
> >
> >  static bool
> > -ccp_finalize (bool nonzero_p)
> > +ccp_finalize (bool nonzero_p ATTRIBUTE_UNUSED)
> >  {
> >    bool something_changed;
> >    unsigned i;
> > @@ -913,10 +917,7 @@ ccp_finalize (bool nonzero_p)
> >
> >        if (!name
> >           || (!POINTER_TYPE_P (TREE_TYPE (name))
> > -             && (!INTEGRAL_TYPE_P (TREE_TYPE (name))
> > -                 /* Don't record nonzero bits before IPA to avoid
> > -                    using too much memory.  */
> > -                 || !nonzero_p)))
> > +             && (!INTEGRAL_TYPE_P (TREE_TYPE (name)))))
> >         continue;
> >
> > can you instead adjust the caller to do sth like
> >
> >   if (ccp_finalize (nonzero_p || flag_ipa_cp))
> >     {
> >
> > ?  What we are missing to optimize memory usage in the early CCP case
> > (it's run very early, before dead code elimination) is to
> > avoid setting alignment / nonzero bits for
> > fully propagatable (and thus dead after substitute_and_fold)
> > SSA names.
> >
> > So in ccp_finalize do sth like
> >
> >       val = get_value (name);
> >       if (val->lattice_val != CONSTANT
> >           || TREE_CODE (val->value) != INTEGER_CST
> >           || val->mask == 0)
> >         continue;
> >
> > That should cut down early CCP memory use in case of nonzero
> > setting significantly.
> >
> > I didn't look at the propagation part but eventually the IPA-CP
> > lattice gets quite big.  Also the alignment lattice is very
> > similar to the bits lattice so why not merge those two?  But
> > in the end it's Martin's/Honza's call here.  Note there is
> > trailing_wide_ints <> which could be used to improve memory usage
> > based on the underlying type.
> >
> > Thanks,
> > Richard.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


* Re: [RFC] ipa bitwise constant propagation
  2016-08-04  8:02 ` Richard Biener
  2016-08-04  8:57   ` Prathamesh Kulkarni
@ 2016-08-04 13:05   ` Jan Hubicka
  2016-08-04 23:04     ` kugan
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Hubicka @ 2016-08-04 13:05 UTC (permalink / raw)
  To: Richard Biener
  Cc: Prathamesh Kulkarni, Jan Hubicka, Martin Jambor,
	Kugan Vivekanandarajah, gcc Patches

> I didn't look at the propagation part but eventually the IPA-CP
> lattice gets quite big.  Also the alignment lattice is very
> similar to the bits lattice so why not merge those two?  But

The original idea was always to replace alignment propagation with bitwise
ccp.  I suppose we only have an issue here because nonzero bits are not tracked for
pointers, so we need to feed the original lattices by hand?

We could also make use of VR ranges and bits while evaluating predicates
in ipa-inline-analysis. I can look into it after returning from Leeds.

Honza
> in the end it's Martin's/Honza's call here.  Note there is
> trailing_wide_ints <> which could be used to improve memory usage
> based on the underlying type.
> 
> Thanks,
> Richard.


* Re: [RFC] ipa bitwise constant propagation
  2016-08-04 13:05   ` Jan Hubicka
@ 2016-08-04 23:04     ` kugan
  2016-08-05 11:36       ` Jan Hubicka
  0 siblings, 1 reply; 31+ messages in thread
From: kugan @ 2016-08-04 23:04 UTC (permalink / raw)
  To: Jan Hubicka, Richard Biener
  Cc: Prathamesh Kulkarni, Martin Jambor, gcc Patches

Hi Honza,

On 04/08/16 23:05, Jan Hubicka wrote:
>> I didn't look at the propagation part but eventually the IPA-CP
>> lattice gets quite big.  Also the alignment lattice is very
>> similar to the bits lattice so why not merge those two?  But
>
> This was always the original idea to replace alignment propagation by bitwise
> ccp.  I suppose we only have issue here because nonzero bits are not tracked for
> pointers so we need to feed the original lattices by hand?

I also raised this with Prathamesh offline. With the early-vrp, we
should have nonzero_bits for non-pointers. For pointers we should feed
the lattices with get_pointer_alignment_1, as is done in the
ipa-cp alignment propagation.
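
As a rough illustration of what feeding the bits lattice from alignment info could look like: a pointer known to be `misalign` bytes past an `align`-byte boundary has its low log2(align) bits known and everything above unknown. The sketch below is plain C++ with hypothetical names (`KnownBits`, `bits_from_alignment`); it takes the alignment in bytes, assumes it is a power of two, and is not the get_pointer_alignment_1 interface itself.

```cpp
#include <cassert>
#include <cstdint>

// (value, mask) pair in the bit-CCP sense: a 1 in mask means "unknown".
struct KnownBits {
    std::uint64_t value;
    std::uint64_t mask;
};

// Translate "pointer == misalign (mod align)" into known low bits.
// Assumption of this sketch: align is a power of two, given in bytes.
KnownBits bits_from_alignment(std::uint64_t align, std::uint64_t misalign) {
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(misalign < align);
    std::uint64_t low_bits = align - 1;        // bits fixed by the alignment
    return KnownBits{misalign & low_bits, ~low_bits};
}
```

For example, an 8-byte aligned pointer with misalignment 4 gives value 4 and a mask with only the low three bits cleared, while an alignment of 1 leaves every bit unknown.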

> We could also make use of VR ranges and bits while evaluating predicates
> in ipa-inline-analysis. I can look into it after returning from Leeds.

Indeed. With early dom-based VRP (non-iterative at this point), some of
the ranges can be pessimistic and can impact the estimation. Let me have
a look at this.

Thanks,
Kugan



* Re: [RFC] ipa bitwise constant propagation
  2016-08-04 23:04     ` kugan
@ 2016-08-05 11:36       ` Jan Hubicka
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Hubicka @ 2016-08-05 11:36 UTC (permalink / raw)
  To: kugan
  Cc: Jan Hubicka, Richard Biener, Prathamesh Kulkarni, Martin Jambor,
	gcc Patches

> Hi Honza,
> 
> On 04/08/16 23:05, Jan Hubicka wrote:
> >>I didn't look at the propagation part but eventually the IPA-CP
> >>lattice gets quite big.  Also the alignment lattice is very
> >>similar to the bits lattice so why not merge those two?  But
> >
> >This was always the original idea to replace alignment propagation by bitwise
> >ccp.  I suppose we only have issue here because nonzero bits are not tracked for
> >pointers so we need to feed the original lattices by hand?
> 
> I also raised this one with Prathamesh off line. With the early-vrp,
> we should have nonzero_bits for non pointers. For pointers we should
> feed the lattices with get_pointer_alignment_1 as it is done in
> ipa-cp alignment propagation.

Yes, that is the general idea. Note that for pointers it would also be
very useful to track which pointers are non-NULL (C++ multiple inheritance inserts
a lot of NULL pointer checks that confuse us in later analysis, and it would
be nice to optimize them out). I am not convinced that saving a pointer is
worth distinguishing between pointers and non-pointers in all the local
tracking.
> 
> >We could also make use of VR ranges and bits while evaluating predicates
> >in ipa-inline-analysis. I can look into it after returning from Leeds.
> 
> Indeed. With early dom based VRP (non iterative at this point),
> some of the ranges can be pessimistic and can impact the estimation.
> Let me have a look at this.

Yes, but those are independent issues - size/time estimates should
take into account the new info we have and we should work on getting
it better when we can ;)

I will try to revisit the size/time code after returning from
my vacation and turn it into sreals for time, and clean up/generalize the APIs
a bit. I tried to do it last stage1 but got stuck on some ugly corner cases
and on gengtype not liking the sreal type.

Honza
> 
> Thanks,
> Kugan
> 


* Re: [RFC] ipa bitwise constant propagation
  2016-08-04  6:36 [RFC] ipa bitwise constant propagation Prathamesh Kulkarni
  2016-08-04  8:02 ` Richard Biener
@ 2016-08-05 12:37 ` Martin Jambor
  2016-08-07 21:38   ` Prathamesh Kulkarni
  1 sibling, 1 reply; 31+ messages in thread
From: Martin Jambor @ 2016-08-05 12:37 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Richard Biener, Jan Hubicka, Kugan Vivekanandarajah, gcc Patches

Hi,

generally speaking, the ipa-cp.c and ipa-prop.[hc] bits look reasonable,
but I have a few comments:

On Thu, Aug 04, 2016 at 12:06:18PM +0530, Prathamesh Kulkarni wrote:
> Hi,
> This is a prototype patch for propagating known/unknown bits inter-procedurally
> for integral types; it propagates info obtained from get_nonzero_bits ().
> 
> Patch required making following changes:
> a) To make info from get_nonzero_bits() available to ipa

in the IPA phase, you should not be looking at SSA_NAMEs; those will not
be available with LTO, when you do not have access to function bodies
and thus also not to SSA_NAMEs.  In IPA, you should only rely on what
you have in jump functions.

> , I had to remove
> guard !nonzero_p in ccp_finalize. However that triggered the following ICE
> in get_ptr_info() for default_none.f95 (and several other fortran tests)
> with options: -fopenacc -O2
> ICE: http://pastebin.com/KjD7HMQi
> I confirmed with Richard that this was a latent issue.
> 
> 
> I suppose we would want to gate this on some flag, say -fipa-bit-cp ?

Yes, definitely.

> I haven't yet gated it on the flag, will do in next version of patch.
> I have added some very simple test-cases, I will try to add more
> meaningful ones.


> 
> Patch passes bootstrap+test on x86_64-unknown-linux-gnu
> and cross-tested on arm*-*-* and aarch64*-*-* with the exception
> of some fortran tests failing due to above ICE.
> 
> As next steps, I am planning to extend it to handle alignment propagation,
> and do further testing (lto-bootstrap, chromium).
> I would be grateful for feedback on the current patch.
> 
> Thanks,
> Prathamesh

> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 5b6cb9a..b770f6a 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "params.h"
>  #include "ipa-inline.h"
>  #include "ipa-utils.h"
> +#include "tree-ssa-ccp.h"
>  
>  template <typename valtype> class ipcp_value;
>  
> @@ -266,6 +267,40 @@ private:
>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
>  };
>  
> +/* Lattice of known bits, only capable of holding one value.
> +   Similar to ccp_prop_value_t, mask represents which bits of value are constant.
> +   If a bit in mask is set to 0, then the corresponding bit in
> +   value is known to be constant.  */
> +
> +class ipcp_bits_lattice
> +{
> +public:
> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
> +  bool set_to_bottom ();
> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
> + 
> +  widest_int get_value () { return value; }
> +  widest_int get_mask () { return mask; }
> +  signop get_sign () { return sgn; }
> +  unsigned get_precision () { return precision; }
> +
> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
> +  bool meet_with (widest_int, widest_int, signop, unsigned);
> +
> +  void print (FILE *);
> +
> +private:
> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
> +  widest_int value, mask;
> +  signop sgn;
> +  unsigned precision;
> +
> +  bool meet_with_1 (widest_int, widest_int); 
> +  void get_value_and_mask (tree, widest_int *, widest_int *);
> +}; 
> +
>  /* Structure containing lattices for a parameter itself and for pieces of
>     aggregates that are passed in the parameter or by a reference in a parameter
>     plus some other useful flags.  */
> @@ -281,6 +316,8 @@ public:
>    ipcp_agg_lattice *aggs;
>    /* Lattice describing known alignment.  */
>    ipcp_alignment_lattice alignment;
> +  /* Lattice describing known bits.  */
> +  ipcp_bits_lattice bits_lattice;
>    /* Number of aggregate lattices */
>    int aggs_count;
>    /* True if aggregate data were passed by reference (as opposed to by
> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
>      fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
>  }
>  
> +void
> +ipcp_bits_lattice::print (FILE *f)
> +{
> +  if (top_p ())
> +    fprintf (f, "         Bits unknown (TOP)\n");
> +  else if (bottom_p ())
> +    fprintf (f, "         Bits unusable (BOTTOM)\n");
> +  else
> +    {
> +      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
> +      fprintf (f, ", mask = "); print_hex (get_mask (), f);
> +      fprintf (f, "\n");
> +    }
> +}
> +
>  /* Print all ipcp_lattices of all functions to F.  */
>  
>  static void
> @@ -484,6 +536,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
>  	  fprintf (f, "         ctxs: ");
>  	  plats->ctxlat.print (f, dump_sources, dump_benefits);
>  	  plats->alignment.print (f);
> +	  plats->bits_lattice.print (f);
>  	  if (plats->virt_call)
>  	    fprintf (f, "        virt_call flag set\n");
>  
> @@ -911,6 +964,161 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
>    return meet_with_1 (other.align, adjusted_misalign);
>  }
>  
> +/* Set lattice value to bottom, if it already isn't the case.  */
> +
> +bool
> +ipcp_bits_lattice::set_to_bottom ()
> +{
> +  if (bottom_p ())
> +    return false;
> +  lattice_val = IPA_BITS_VARYING;
> +  value = 0;
> +  mask = -1;
> +  return true;
> +}
> +
> +/* Set to constant if it isn't already. Only meant to be called
> +   when switching state from TOP.  */
> +
> +bool
> +ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask,
> +				    signop sgn, unsigned precision)
> +{
> +  gcc_assert (top_p ());
> +  this->lattice_val = IPA_BITS_CONSTANT;
> +  this->value = value;
> +  this->mask = mask;
> +  this->sgn = sgn;
> +  this->precision = precision;
> +  return true;
> +}
> +
> +/* Convert operand to value, mask form.  */
> +
> +void
> +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
> +{
> +  wide_int get_nonzero_bits (const_tree);
> +
> +  if (TREE_CODE (operand) == INTEGER_CST)
> +    {
> +      *valuep = wi::to_widest (operand); 
> +      *maskp = 0;
> +    }
> +  else if (TREE_CODE (operand) == SSA_NAME)

IIUC, operand is the operand from pass-through jump function and that
should never be an SSA_NAME.  I have even looked at how we generate
them and it seems fairly safe to say that they never are.  If you have
seen an SSA_NAME here, it is a bug and please let me know because
sooner or later it will cause an assert.

> +    {
> +      *valuep = 0;
> +      *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
> +    }
> +  else
> +    gcc_unreachable ();

The operand, however, can be any other is_gimple_ip_invariant tree.
I assume that you could hit this gcc_unreachable only in a program
with undefined behavior (or with a Fortran CONST_DECL?) but you should
not ICE here.


> +}  
> +
> +/* Meet operation, similar to ccp_lattice_meet, we xor values
> +   if this->value, value have different values at same bit positions, we want
> +   to drop that bit to varying. Return true if mask is changed.
> +   This function assumes that the lattice value is in CONSTANT state  */
> +
> +bool
> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
> +{
> +  gcc_assert (constant_p ());
> +  
> +  widest_int old_mask = this->mask;
> +  this->mask = (this->mask | mask) | (this->value ^ value);
> +
> +  if (wi::sext (this->mask, this->precision) == -1)
> +    return set_to_bottom ();
> +
> +  bool changed = this->mask != old_mask;
> +  return changed;
> +}
> +
> +/* Meet the bits lattice with operand
> +   described by <value, mask, sgn, precision.  */
> +
> +bool
> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
> +			      signop sgn, unsigned precision)
> +{
> +  if (bottom_p ())
> +    return false;
> +
> +  if (top_p ())
> +    {
> +      if (wi::sext (mask, precision) == -1)
> +	return set_to_bottom ();
> +      return set_to_constant (value, mask, sgn, precision);
> +    }
> +
> +  return meet_with_1 (value, mask);

What if precisions do not match?
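
For reference, a minimal standalone sketch of the meet as I read it from meet_with_1, in plain C++ with 64-bit integers standing in for widest_int. `Bits`, `MeetResult` and `meet` are hypothetical names; the sketch takes one precision for both inputs (so it leaves the mismatch question open), and clearing value bits under the mask is a normalization choice of this sketch, not something the posted patch does.

```cpp
#include <cassert>
#include <cstdint>

// (value, mask) pair: a 1 in mask marks the bit as unknown.
struct Bits {
    std::uint64_t value;
    std::uint64_t mask;
};

struct MeetResult {
    bool bottom;  // true when no bit within the precision is known
    Bits bits;
};

// Meet of two known-bits values that share the same precision.
MeetResult meet(Bits a, Bits b, unsigned precision) {
    assert(precision >= 1 && precision <= 64);
    std::uint64_t in_prec = precision == 64
        ? ~std::uint64_t{0}
        : (std::uint64_t{1} << precision) - 1;
    // A bit stays known only if it is known on both sides and agrees.
    std::uint64_t mask = (a.mask | b.mask) | (a.value ^ b.value);
    if ((mask & in_prec) == in_prec)
        return {true, {0, ~std::uint64_t{0}}};
    return {false, {a.value & ~mask, mask}};
}
```

With the inputs from the cover letter, <0, 0xc> meet <0, 0x3> yields mask 0xf, and two constants that differ in a single bit lose exactly that bit.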

> +}
> +
> +/* Meet bits lattice with the result of bit_value_binop_1 (other, operand)
> +   if code is binary operation or bit_value_unop_1 (other) if code is unary op.
> +   In the case when code is nop_expr, no adjustment is required. */
> +
> +bool
> +ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, enum tree_code code, tree operand)
> +{
> +  if (other.bottom_p ())
> +    return set_to_bottom ();
> +
> +  if (bottom_p () || other.top_p ())
> +    return false;
> +
> +  widest_int adjusted_value, adjusted_mask;
> +
> +  if (TREE_CODE_CLASS (code) == tcc_binary)
> +    {
> +      tree type = TREE_TYPE (operand);
> +      gcc_assert (INTEGRAL_TYPE_P (type));
> +      widest_int o_value, o_mask;
> +      get_value_and_mask (operand, &o_value, &o_mask);
> +
> +      signop sgn = other.get_sign ();
> +      unsigned prec = other.get_precision ();
> +
> +      bit_value_binop_1 (code, sgn, prec, &adjusted_value, &adjusted_mask,
> +			 sgn, prec, other.get_value (), other.get_mask (),
> +			 TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);

It is probably just me not being particularly sharp on a Friday
afternoon and I might not understand the semantics of mask well (also,
you did not document it :-), but... assume that we are looking at a
binary AND operation, other comes from an SSA pointer and its mask
would be binary 100 and its value 0 because that's what you set for
ssa names in ipa-prop.h, and the operand is binary value 101, which
means that get_value_and_mask returns mask 0 and value 101.  Now,
bit_value_binop_1 would return value 0 & 101 = 0 and mask according to

(m1 | m2) & ((v1 | m1) & (v2 | m2))

so in our case

(100b & 0) & ((0 | 100b) & (101b | 0)) = 0 & 100b = 0.

So both value and mask would be zero, which, if this was the only
incoming value to the lattice, I am afraid you would later on in
ipcp_update_bits interpret as that no bits can be non-zero.  Yet the
third bit clearly could.

I think that this is the only place where you interpret mask as a mask
and actually use it for and-ing, everywhere else you interpret it
basically only as the result of get_nonzero_bits and use it for
or-ing.
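
A quick way to check the arithmetic is to evaluate the formula mechanically. The sketch below is plain C++ with hypothetical names (`KnownBits`, `bit_and`); it implements the mask expression exactly as written above, with `|` in the first factor, and I have not verified that bit_value_binop_1 in the patch computes the same thing.

```cpp
#include <cassert>
#include <cstdint>

// (value, mask) pair: a 1 in mask marks the bit as unknown.
struct KnownBits {
    std::uint64_t value;
    std::uint64_t mask;
};

// Transfer function for AND, using
//   mask = (m1 | m2) & ((v1 | m1) & (v2 | m2))
// A result bit is unknown only if at least one input bit is unknown
// and neither input bit is a known zero.
KnownBits bit_and(KnownBits a, KnownBits b) {
    std::uint64_t value = a.value & b.value;
    std::uint64_t mask = (a.mask | b.mask)
                         & ((a.value | a.mask) & (b.value | b.mask));
    return KnownBits{value, mask};
}
```

For value 0 / mask 100b combined with the constant 101b, this gives (100b | 0) & (100b & 101b) = 100b, so under these assumptions the third bit stays unknown and only the other bits are reported as known zero.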

> +
> +      if (wi::sext (adjusted_mask, prec) == -1)
> +	return set_to_bottom ();
> +    }
> +
> +  else if (TREE_CODE_CLASS (code) == tcc_unary)
> +    {
> +      signop sgn = other.get_sign ();
> +      unsigned prec = other.get_precision ();
> +
> +      bit_value_unop_1 (code, sgn, prec, &adjusted_value,
> +			&adjusted_mask, sgn, prec, other.get_value (),
> +			other.get_mask ());
> +
> +      if (wi::sext (adjusted_mask, prec) == -1)
> +	return set_to_bottom ();
> +    }
> +
> +  else if (code == NOP_EXPR)
> +    {
> +      adjusted_value = other.value;
> +      adjusted_mask = other.mask;
> +    }
> +
> +  else
> +    return set_to_bottom ();
> +
> +  if (top_p ())
> +    {
> +      if (wi::sext (adjusted_mask, other.get_precision ()) == -1)
> +	return set_to_bottom ();
> +      return set_to_constant (adjusted_value, adjusted_mask, other.get_sign (), other.get_precision ());
> +    }
> +  else
> +    return meet_with_1 (adjusted_value, adjusted_mask);

Again, What if precisions do not match?

> +}
> +
>  /* Mark bot aggregate and scalar lattices as containing an unknown variable,
>     return true is any of them has not been marked as such so far.  */
>  
> @@ -922,6 +1130,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
>    ret |= plats->ctxlat.set_contains_variable ();
>    ret |= set_agg_lats_contain_variable (plats);
>    ret |= plats->alignment.set_to_bottom ();
> +  ret |= plats->bits_lattice.set_to_bottom ();
>    return ret;
>  }
>  
> @@ -1003,6 +1212,7 @@ initialize_node_lattices (struct cgraph_node *node)
>  	      plats->ctxlat.set_to_bottom ();
>  	      set_agg_lats_to_bottom (plats);
>  	      plats->alignment.set_to_bottom ();
> +	      plats->bits_lattice.set_to_bottom ();
>  	    }
>  	  else
>  	    set_all_contains_variable (plats);
> @@ -1621,6 +1831,57 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
>      }
>  }
>  
> +/* Propagate bits across jfunc that is associated with
> +   edge cs and update dest_lattice accordingly.  */
> +
> +bool
> +propagate_bits_accross_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc,
> +				      ipcp_bits_lattice *dest_lattice)
> +{
> +  if (dest_lattice->bottom_p ())
> +    return false;
> +
> +  if (jfunc->type == IPA_JF_PASS_THROUGH)
> +    {
> +      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
> +      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
> +      tree operand = NULL_TREE;
> +
> +      if (code != NOP_EXPR)
> +	operand = ipa_get_jf_pass_through_operand (jfunc);
> +
> +      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
> +      struct ipcp_param_lattices *src_lats
> +	= ipa_get_parm_lattices (caller_info, src_idx);
> +
> +      /* Try to proapgate bits if src_lattice is bottom, but jfunc is known.

propagate

> +	 for eg consider:
> +	 int f(int x)
> +	 {
> +	   g (x & 0xff);
> +	 }
> +	 Assume lattice for x is bottom, however we can still propagate
> +	 result of x & 0xff == 0xff, which gets computed during ccp1 pass
> +	 and we store it in jump function during analysis stage.  */
> +
> +      if (src_lats->bits_lattice.bottom_p ()
> +	  && jfunc->bits.known)
> +	return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
> +				        jfunc->bits.sgn, jfunc->bits.precision);
> +      else
> +	return dest_lattice->meet_with (src_lats->bits_lattice, code, operand);
> +    }
> +
> +  else if (jfunc->type == IPA_JF_ANCESTOR)
> +    return dest_lattice->set_to_bottom ();
> +
> +  else if (jfunc->bits.known) 
> +    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
> +				    jfunc->bits.sgn, jfunc->bits.precision);
> +  else
> +    return dest_lattice->set_to_bottom ();
> +}
> +

...

> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> index e32d078..d69a071 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -154,6 +154,16 @@ struct GTY(()) ipa_alignment
>    unsigned misalign;
>  };
>  
> +/* Information about zero/non-zero bits.  */
> +struct GTY(()) ipa_bits
> +{
> +  bool known;
> +  widest_int value;
> +  widest_int mask;
> +  enum signop sgn;
> +  unsigned precision;
> +};

Please order the fields according to their size, the largest first and
add a comment describing each one of them.  As I explained above, it
is not immediately clear why the mask would be a mask, for example.

Nevertheless, thanks for looking into this.  It would be nice to have
this for pointers too, not least because we could represent
non-NULL-ness this way, which could be very interesting for cloning
and inlining.  But those are further steps, once normal propagation
works.

Martin


* Re: [RFC] ipa bitwise constant propagation
  2016-08-05 12:37 ` Martin Jambor
@ 2016-08-07 21:38   ` Prathamesh Kulkarni
  2016-08-08 14:04     ` Martin Jambor
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-07 21:38 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener, Jan Hubicka,
	Kugan Vivekanandarajah, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 16684 bytes --]

On 5 August 2016 at 18:06, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
Hi Martin,
Thanks for the review. Please find my responses inline.
>
> generally speaking, the ipa-cp.c and ipa-prop.[hc] bits look reasonable,
> but I have a few comments:
>
> On Thu, Aug 04, 2016 at 12:06:18PM +0530, Prathamesh Kulkarni wrote:
>> Hi,
>> This is a prototype patch for propagating known/unknown bits inter-procedurally
>> for integral types; it propagates info obtained from get_nonzero_bits ().
>>
>> Patch required making following changes:
>> a) To make info from get_nonzero_bits() available to ipa
>
> in IPA phase, you should not be looking at SSA_NAMEs, those will not
> be available with LTO when you do not have access to function bodies
> and thus also not to SSA_NAMEs.  In IPA, you should only rely on what
> you have in jump functions.
>
>> , I had to remove
>> guard !nonzero_p in ccp_finalize. However that triggered the following ICE
>> in get_ptr_info() for default_none.f95 (and several other fortran tests)
>> with options: -fopenacc -O2
>> ICE: http://pastebin.com/KjD7HMQi
>> I confirmed with Richard that this was a latent issue.
>>
>>
>> I suppose we would want to gate this on some flag, say -fipa-bit-cp ?
>
> Yes, definitely.
Added -fipa-cp-bit (mirroring -fipa-cp-alignment)
>
>> I haven't yet gated it on the flag, will do in next version of patch.
>> I have added some very simple test-cases, I will try to add more
>> meaningful ones.
>
>
>>
>> Patch passes bootstrap+test on x86_64-unknown-linux-gnu
>> and cross-tested on arm*-*-* and aarch64*-*-* with the exception
>> of some fortran tests failing due to above ICE.
>>
>> As next steps, I am planning to extend it to handle alignment propagation,
>> and do further testing (lto-bootstrap, chromium).
>> I would be grateful for feedback on the current patch.
>>
>> Thanks,
>> Prathamesh
>
>> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
>> index 5b6cb9a..b770f6a 100644
>> --- a/gcc/ipa-cp.c
>> +++ b/gcc/ipa-cp.c
>> @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "params.h"
>>  #include "ipa-inline.h"
>>  #include "ipa-utils.h"
>> +#include "tree-ssa-ccp.h"
>>
>>  template <typename valtype> class ipcp_value;
>>
>> @@ -266,6 +267,40 @@ private:
>>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
>>  };
>>
>> +/* Lattice of known bits, only capable of holding one value.
>> +   Similar to ccp_prop_value_t, mask represents which bits of value are constant.
>> +   If a bit in mask is set to 0, then the corresponding bit in
>> +   value is known to be constant.  */
>> +
>> +class ipcp_bits_lattice
>> +{
>> +public:
>> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
>> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
>> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
>> +  bool set_to_bottom ();
>> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
>> +
>> +  widest_int get_value () { return value; }
>> +  widest_int get_mask () { return mask; }
>> +  signop get_sign () { return sgn; }
>> +  unsigned get_precision () { return precision; }
>> +
>> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
>> +  bool meet_with (widest_int, widest_int, signop, unsigned);
>> +
>> +  void print (FILE *);
>> +
>> +private:
>> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
>> +  widest_int value, mask;
>> +  signop sgn;
>> +  unsigned precision;
>> +
>> +  bool meet_with_1 (widest_int, widest_int);
>> +  void get_value_and_mask (tree, widest_int *, widest_int *);
>> +};
>> +
>>  /* Structure containing lattices for a parameter itself and for pieces of
>>     aggregates that are passed in the parameter or by a reference in a parameter
>>     plus some other useful flags.  */
>> @@ -281,6 +316,8 @@ public:
>>    ipcp_agg_lattice *aggs;
>>    /* Lattice describing known alignment.  */
>>    ipcp_alignment_lattice alignment;
>> +  /* Lattice describing known bits.  */
>> +  ipcp_bits_lattice bits_lattice;
>>    /* Number of aggregate lattices */
>>    int aggs_count;
>>    /* True if aggregate data were passed by reference (as opposed to by
>> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
>>      fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
>>  }
>>
>> +void
>> +ipcp_bits_lattice::print (FILE *f)
>> +{
>> +  if (top_p ())
>> +    fprintf (f, "         Bits unknown (TOP)\n");
>> +  else if (bottom_p ())
>> +    fprintf (f, "         Bits unusable (BOTTOM)\n");
>> +  else
>> +    {
>> +      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
>> +      fprintf (f, ", mask = "); print_hex (get_mask (), f);
>> +      fprintf (f, "\n");
>> +    }
>> +}
>> +
>>  /* Print all ipcp_lattices of all functions to F.  */
>>
>>  static void
>> @@ -484,6 +536,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
>>         fprintf (f, "         ctxs: ");
>>         plats->ctxlat.print (f, dump_sources, dump_benefits);
>>         plats->alignment.print (f);
>> +       plats->bits_lattice.print (f);
>>         if (plats->virt_call)
>>           fprintf (f, "        virt_call flag set\n");
>>
>> @@ -911,6 +964,161 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
>>    return meet_with_1 (other.align, adjusted_misalign);
>>  }
>>
>> +/* Set lattice value to bottom, if it already isn't the case.  */
>> +
>> +bool
>> +ipcp_bits_lattice::set_to_bottom ()
>> +{
>> +  if (bottom_p ())
>> +    return false;
>> +  lattice_val = IPA_BITS_VARYING;
>> +  value = 0;
>> +  mask = -1;
>> +  return true;
>> +}
>> +
>> +/* Set to constant if it isn't already. Only meant to be called
>> +   when switching state from TOP.  */
>> +
>> +bool
>> +ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask,
>> +                                 signop sgn, unsigned precision)
>> +{
>> +  gcc_assert (top_p ());
>> +  this->lattice_val = IPA_BITS_CONSTANT;
>> +  this->value = value;
>> +  this->mask = mask;
>> +  this->sgn = sgn;
>> +  this->precision = precision;
>> +  return true;
>> +}
>> +
>> +/* Convert operand to value, mask form.  */
>> +
>> +void
>> +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
>> +{
>> +  wide_int get_nonzero_bits (const_tree);
>> +
>> +  if (TREE_CODE (operand) == INTEGER_CST)
>> +    {
>> +      *valuep = wi::to_widest (operand);
>> +      *maskp = 0;
>> +    }
>> +  else if (TREE_CODE (operand) == SSA_NAME)
>
> IIUC, operand is the operand from pass-through jump function and that
> should never be an SSA_NAME.  I have even looked at how we generate
> them and it seems fairly safe to say that they never are.  If you have
> seen an SSA_NAME here, it is a bug and please let me know because
> sooner or later it will cause an assert.
>
>> +    {
>> +      *valuep = 0;
>> +      *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
>> +    }
>> +  else
>> +    gcc_unreachable ();
>
> The operand however can be any other is_gimple_ip_invariant tree.
> I assume that you could hit this gcc_unreachable only in a program
> with undefined behavior (or with a Fortran CONST_DECL?) but you should
> not ICE here.
Changed to:
  if (TREE_CODE (operand) == INTEGER_CST)
    {
      *valuep = wi::to_widest (operand);
      *maskp = 0;
    }
  else
    {
      *valuep = 0;
      *maskp = -1;
    }

I am not sure how to extract the nonzero bits of a gimple_ip_invariant
operand that is not an INTEGER_CST, so I set it to unknown
(value = 0, mask = -1).
Does this look OK?
>
>
>> +}
>> +
>> +/* Meet operation, similar to ccp_lattice_meet, we xor values
>> +   if this->value, value have different values at same bit positions, we want
>> +   to drop that bit to varying. Return true if mask is changed.
>> +   This function assumes that the lattice value is in CONSTANT state  */
>> +
>> +bool
>> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
>> +{
>> +  gcc_assert (constant_p ());
>> +
>> +  widest_int old_mask = this->mask;
>> +  this->mask = (this->mask | mask) | (this->value ^ value);
>> +
>> +  if (wi::sext (this->mask, this->precision) == -1)
>> +    return set_to_bottom ();
>> +
>> +  bool changed = this->mask != old_mask;
>> +  return changed;
>> +}
>> +
>> +/* Meet the bits lattice with operand
>> +   described by <value, mask, sgn, precision.  */
>> +
>> +bool
>> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
>> +                           signop sgn, unsigned precision)
>> +{
>> +  if (bottom_p ())
>> +    return false;
>> +
>> +  if (top_p ())
>> +    {
>> +      if (wi::sext (mask, precision) == -1)
>> +     return set_to_bottom ();
>> +      return set_to_constant (value, mask, sgn, precision);
>> +    }
>> +
>> +  return meet_with_1 (value, mask);
>
> What if precisions do not match?
Sorry, I don't understand. Since we extend to widest_int, wouldn't the
precision be the same?
bit_value_binop_1 requires the original precision in a few cases (shifts,
rotates, plus, mult), so I preserve the original precision in the jump
function.
Later, in ipcp_update_bits (), the mask is narrowed to the precision of
the parameter before it is set.
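
To make the role of the precision concrete, here is a minimal standalone
sketch of the meet and of the final narrowing. It is not part of the patch:
uint64_t stands in for widest_int, and the helper names (low_bits,
meet_mask, varying_p, nonzero_bits) are hypothetical. It assumes the same
convention as the patch, where a mask bit of 1 means "unknown".

```cpp
#include <cassert>
#include <cstdint>

// Illustration only: uint64_t replaces widest_int, all names are made up.

// All-ones pattern covering the low PREC bits.
constexpr uint64_t
low_bits (unsigned prec)
{
  return prec >= 64 ? ~uint64_t (0) : (uint64_t (1) << prec) - 1;
}

// Meet of two <value, mask> pairs: a bit becomes unknown if it is unknown
// on either side or if the two known values disagree on it.
constexpr uint64_t
meet_mask (uint64_t v1, uint64_t m1, uint64_t v2, uint64_t m2)
{
  return (m1 | m2) | (v1 ^ v2);
}

// Same test as "wi::sext (mask, precision) == -1": every bit within the
// original precision is unknown, so the lattice would drop to bottom.
constexpr bool
varying_p (uint64_t mask, unsigned prec)
{
  return (mask & low_bits (prec)) == low_bits (prec);
}

// Narrowing step in the style of ipcp_update_bits: bits that may be
// nonzero are the unknown bits plus the known-one bits, cut to PREC.
constexpr uint64_t
nonzero_bits (uint64_t value, uint64_t mask, unsigned prec)
{
  return (value | mask) & low_bits (prec);
}

// Two callers pass the constants 0x10 and 0x30 to an 8-bit parameter.
static_assert (meet_mask (0x10, 0, 0x30, 0) == 0x20, "bit 5 disagrees");
static_assert (!varying_p (0x20, 8), "still a useful constant state");
static_assert (varying_p (0xff, 8), "all 8 bits unknown means bottom");
static_assert (nonzero_bits (0x10, 0x20, 8) == 0x30, "bits 4 and 5");
```

The third check is the point: a mask of 0xff is far from all-ones in the
wide representation, yet it carries no information for an 8-bit value, so
the bottom test has to know the original precision.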
>
>> +}
>> +
>> +/* Meet bits lattice with the result of bit_value_binop_1 (other, operand)
>> +   if code is binary operation or bit_value_unop_1 (other) if code is unary op.
>> +   In the case when code is nop_expr, no adjustment is required. */
>> +
>> +bool
>> +ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, enum tree_code code, tree operand)
>> +{
>> +  if (other.bottom_p ())
>> +    return set_to_bottom ();
>> +
>> +  if (bottom_p () || other.top_p ())
>> +    return false;
>> +
>> +  widest_int adjusted_value, adjusted_mask;
>> +
>> +  if (TREE_CODE_CLASS (code) == tcc_binary)
>> +    {
>> +      tree type = TREE_TYPE (operand);
>> +      gcc_assert (INTEGRAL_TYPE_P (type));
>> +      widest_int o_value, o_mask;
>> +      get_value_and_mask (operand, &o_value, &o_mask);
>> +
>> +      signop sgn = other.get_sign ();
>> +      unsigned prec = other.get_precision ();
>> +
>> +      bit_value_binop_1 (code, sgn, prec, &adjusted_value, &adjusted_mask,
>> +                      sgn, prec, other.get_value (), other.get_mask (),
>> +                      TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
>
> It is probably just me not being particularly sharp on a Friday
> afternoon and I might not understand the semantics of mask well (also,
> you did not document it :-), but... assume that we are looking at a
> binary and operation, other comes from an SSA pointer and its mask
> would be binary 100 and its value 0 because that's what you set for
> ssa names in ipa-prop.h, and the operand is binary value 101, which
> means that get_value_and_mask returns mask 0 and value 101.  Now,
> bit_value_binop_1 would return value 0 & 101 = 0 and mask according to
>
> (m1 | m2) & ((v1 | m1) & (v2 | m2))
>
> so in our case
>
> (100b & 0) & ((0 | 100b) & (101b | 0)) = 0 & 100b = 0.
Shouldn't this be:
(100b | 0) & ((0 | 100b) & (101b | 0)) = 100b & 100b = 100b ;-)
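
The arithmetic can be checked with a small standalone sketch. It is an
illustration under assumptions, not code from the patch: uint64_t stands in
for widest_int, bits_pair and and_bits are hypothetical names, and the mask
expression is the one quoted above for a bitwise AND.

```cpp
#include <cassert>
#include <cstdint>

// Illustration only: uint64_t replaces widest_int, names are made up.
// Convention: mask bit 1 = unknown, mask bit 0 = value bit is constant.
struct bits_pair
{
  uint64_t value;
  uint64_t mask;
};

// Bitwise AND of two <value, mask> pairs:
//   value = v1 & v2
//   mask  = (m1 | m2) & ((v1 | m1) & (v2 | m2))
constexpr bits_pair
and_bits (bits_pair a, bits_pair b)
{
  return bits_pair{
    a.value & b.value,
    (a.mask | b.mask) & ((a.value | a.mask) & (b.value | b.mask))};
}

// The example from the review: "other" is <value 0, mask 100b> and the
// operand is the constant 101b, i.e. <value 101b, mask 0>.
constexpr bits_pair example = and_bits ({0, 0x4}, {0x5, 0});

static_assert (example.value == 0, "all known bits are zero");
static_assert (example.mask == 0x4, "the third bit stays unknown");
```

With (m1 | m2) the resulting mask is 100b, so the third bit is still
reported as possibly nonzero rather than being treated as known zero.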
>
> So both value and mask would be zero, which, if this was the only
> incoming value to the lattice, I am afraid you would later on in
> ipcp_update_bits interpret as that no bits can be non-zero.  Yet the
> third bit clearly could.
>
> I think that this is the only place where you interpret mask as a mask
> and actually use it for and-ing, everywhere else you interpret it
> basically only as the result of get_nonzero_bits and use it for
> or-ing.
>
>> +
>> +      if (wi::sext (adjusted_mask, prec) == -1)
>> +     return set_to_bottom ();
>> +    }
>> +
>> +  else if (TREE_CODE_CLASS (code) == tcc_unary)
>> +    {
>> +      signop sgn = other.get_sign ();
>> +      unsigned prec = other.get_precision ();
>> +
>> +      bit_value_unop_1 (code, sgn, prec, &adjusted_value,
>> +                     &adjusted_mask, sgn, prec, other.get_value (),
>> +                     other.get_mask ());
>> +
>> +      if (wi::sext (adjusted_mask, prec) == -1)
>> +     return set_to_bottom ();
>> +    }
>> +
>> +  else if (code == NOP_EXPR)
>> +    {
>> +      adjusted_value = other.value;
>> +      adjusted_mask = other.mask;
>> +    }
>> +
>> +  else
>> +    return set_to_bottom ();
>> +
>> +  if (top_p ())
>> +    {
>> +      if (wi::sext (adjusted_mask, other.get_precision ()) == -1)
>> +     return set_to_bottom ();
>> +      return set_to_constant (adjusted_value, adjusted_mask, other.get_sign (), other.get_precision ());
>> +    }
>> +  else
>> +    return meet_with_1 (adjusted_value, adjusted_mask);
>
> Again, What if precisions do not match?
>
>> +}
>> +
>>  /* Mark bot aggregate and scalar lattices as containing an unknown variable,
>>     return true is any of them has not been marked as such so far.  */
>>
>> @@ -922,6 +1130,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
>>    ret |= plats->ctxlat.set_contains_variable ();
>>    ret |= set_agg_lats_contain_variable (plats);
>>    ret |= plats->alignment.set_to_bottom ();
>> +  ret |= plats->bits_lattice.set_to_bottom ();
>>    return ret;
>>  }
>>
>> @@ -1003,6 +1212,7 @@ initialize_node_lattices (struct cgraph_node *node)
>>             plats->ctxlat.set_to_bottom ();
>>             set_agg_lats_to_bottom (plats);
>>             plats->alignment.set_to_bottom ();
>> +           plats->bits_lattice.set_to_bottom ();
>>           }
>>         else
>>           set_all_contains_variable (plats);
>> @@ -1621,6 +1831,57 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
>>      }
>>  }
>>
>> +/* Propagate bits across jfunc that is associated with
>> +   edge cs and update dest_lattice accordingly.  */
>> +
>> +bool
>> +propagate_bits_accross_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc,
>> +                                   ipcp_bits_lattice *dest_lattice)
>> +{
>> +  if (dest_lattice->bottom_p ())
>> +    return false;
>> +
>> +  if (jfunc->type == IPA_JF_PASS_THROUGH)
>> +    {
>> +      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
>> +      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
>> +      tree operand = NULL_TREE;
>> +
>> +      if (code != NOP_EXPR)
>> +     operand = ipa_get_jf_pass_through_operand (jfunc);
>> +
>> +      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
>> +      struct ipcp_param_lattices *src_lats
>> +     = ipa_get_parm_lattices (caller_info, src_idx);
>> +
>> +      /* Try to proapgate bits if src_lattice is bottom, but jfunc is known.
>
> propagate
Oops, corrected in the attached version.
>
>> +      for eg consider:
>> +      int f(int x)
>> +      {
>> +        g (x & 0xff);
>> +      }
>> +      Assume lattice for x is bottom, however we can still propagate
>> +      result of x & 0xff == 0xff, which gets computed during ccp1 pass
>> +      and we store it in jump function during analysis stage.  */
>> +
>> +      if (src_lats->bits_lattice.bottom_p ()
>> +       && jfunc->bits.known)
>> +     return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
>> +                                     jfunc->bits.sgn, jfunc->bits.precision);
>> +      else
>> +     return dest_lattice->meet_with (src_lats->bits_lattice, code, operand);
>> +    }
>> +
>> +  else if (jfunc->type == IPA_JF_ANCESTOR)
>> +    return dest_lattice->set_to_bottom ();
>> +
>> +  else if (jfunc->bits.known)
>> +    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
>> +                                 jfunc->bits.sgn, jfunc->bits.precision);
>> +  else
>> +    return dest_lattice->set_to_bottom ();
>> +}
>> +
>
> ...
>
>> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
>> index e32d078..d69a071 100644
>> --- a/gcc/ipa-prop.h
>> +++ b/gcc/ipa-prop.h
>> @@ -154,6 +154,16 @@ struct GTY(()) ipa_alignment
>>    unsigned misalign;
>>  };
>>
>> +/* Information about zero/non-zero bits.  */
>> +struct GTY(()) ipa_bits
>> +{
>> +  bool known;
>> +  widest_int value;
>> +  widest_int mask;
>> +  enum signop sgn;
>> +  unsigned precision;
>> +};
>
> Please order the fields according to their size, the largest first and
> add a comment describing each one of them.  As I explained above, it
> is not immediately clear why the mask would be a mask, for example.
Reordered the fields.
>
> Nevertheless, thanks for looking into this.  It would be nice to have
> this for pointers too, not least because we could represent
> non-NULL-ness this way, which could be very interesting for cloning
> and inlining.  But those are further steps, once normal propagation
> works.
Does this version look OK?

Thanks,
Prathamesh
>
> Martin

[-- Attachment #2: bits-prop-1-ipa.diff --]
[-- Type: text/plain, Size: 23753 bytes --]

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a292ed..8bac0a2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1561,6 +1561,10 @@ fipa-cp-alignment
 Common Report Var(flag_ipa_cp_alignment) Optimization
 Perform alignment discovery and propagation to make Interprocedural constant propagation stronger.
 
+fipa-cp-bit
+Common Report Var(flag_ipa_cp_bit) Optimization
+Perform interprocedural bitwise constant propagation.
+
 fipa-profile
 Common Report Var(flag_ipa_profile) Init(0) Optimization
 Perform interprocedural profile propagation.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 5b6cb9a..a26a1a6 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
 
 template <typename valtype> class ipcp_value;
 
@@ -266,6 +267,40 @@ private:
   bool meet_with_1 (unsigned new_align, unsigned new_misalign);
 };
 
+/* Lattice of known bits, only capable of holding one value.
+   Similar to ccp_prop_value_t, mask represents which bits of value are constant.
+   If a bit in mask is set to 0, then the corresponding bit in
+   value is known to be constant.  */
+
+class ipcp_bits_lattice
+{
+public:
+  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
+  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
+  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
+  bool set_to_bottom ();
+  bool set_to_constant (widest_int, widest_int, signop, unsigned);
+ 
+  widest_int get_value () { return value; }
+  widest_int get_mask () { return mask; }
+  signop get_sign () { return sgn; }
+  unsigned get_precision () { return precision; }
+
+  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
+  bool meet_with (widest_int, widest_int, signop, unsigned);
+
+  void print (FILE *);
+
+private:
+  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
+  widest_int value, mask;
+  signop sgn;
+  unsigned precision;
+
+  bool meet_with_1 (widest_int, widest_int); 
+  void get_value_and_mask (tree, widest_int *, widest_int *);
+}; 
+
 /* Structure containing lattices for a parameter itself and for pieces of
    aggregates that are passed in the parameter or by a reference in a parameter
    plus some other useful flags.  */
@@ -281,6 +316,8 @@ public:
   ipcp_agg_lattice *aggs;
   /* Lattice describing known alignment.  */
   ipcp_alignment_lattice alignment;
+  /* Lattice describing known bits.  */
+  ipcp_bits_lattice bits_lattice;
   /* Number of aggregate lattices */
   int aggs_count;
   /* True if aggregate data were passed by reference (as opposed to by
@@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
     fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
 }
 
+void
+ipcp_bits_lattice::print (FILE *f)
+{
+  if (top_p ())
+    fprintf (f, "         Bits unknown (TOP)\n");
+  else if (bottom_p ())
+    fprintf (f, "         Bits unusable (BOTTOM)\n");
+  else
+    {
+      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
+      fprintf (f, ", mask = "); print_hex (get_mask (), f);
+      fprintf (f, "\n");
+    }
+}
+
 /* Print all ipcp_lattices of all functions to F.  */
 
 static void
@@ -484,6 +536,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
 	  fprintf (f, "         ctxs: ");
 	  plats->ctxlat.print (f, dump_sources, dump_benefits);
 	  plats->alignment.print (f);
+	  plats->bits_lattice.print (f);
 	  if (plats->virt_call)
 	    fprintf (f, "        virt_call flag set\n");
 
@@ -911,6 +964,159 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
   return meet_with_1 (other.align, adjusted_misalign);
 }
 
+/* Set lattice value to bottom, if it already isn't the case.  */
+
+bool
+ipcp_bits_lattice::set_to_bottom ()
+{
+  if (bottom_p ())
+    return false;
+  lattice_val = IPA_BITS_VARYING;
+  value = 0;
+  mask = -1;
+  return true;
+}
+
+/* Set to constant if it isn't already. Only meant to be called
+   when switching state from TOP.  */
+
+bool
+ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask,
+				    signop sgn, unsigned precision)
+{
+  gcc_assert (top_p ());
+  this->lattice_val = IPA_BITS_CONSTANT;
+  this->value = value;
+  this->mask = mask;
+  this->sgn = sgn;
+  this->precision = precision;
+  return true;
+}
+
+/* Convert operand to value, mask form.  */
+
+void
+ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
+{
+  wide_int get_nonzero_bits (const_tree);
+
+  if (TREE_CODE (operand) == INTEGER_CST)
+    {
+      *valuep = wi::to_widest (operand); 
+      *maskp = 0;
+    }
+  else
+    {
+      *valuep = 0;
+      *maskp = -1;
+    }
+}
+
+/* Meet operation, similar to ccp_lattice_meet, we xor values
+   if this->value, value have different values at same bit positions, we want
+   to drop that bit to varying. Return true if mask is changed.
+   This function assumes that the lattice value is in CONSTANT state  */
+
+bool
+ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
+{
+  gcc_assert (constant_p ());
+  
+  widest_int old_mask = this->mask;
+  this->mask = (this->mask | mask) | (this->value ^ value);
+
+  if (wi::sext (this->mask, this->precision) == -1)
+    return set_to_bottom ();
+
+  bool changed = this->mask != old_mask;
+  return changed;
+}
+
+/* Meet the bits lattice with operand
+   described by <value, mask, sgn, precision.  */
+
+bool
+ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
+			      signop sgn, unsigned precision)
+{
+  if (bottom_p ())
+    return false;
+
+  if (top_p ())
+    {
+      if (wi::sext (mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (value, mask, sgn, precision);
+    }
+
+  return meet_with_1 (value, mask);
+}
+
+/* Meet bits lattice with the result of bit_value_binop (other, operand)
+   if code is binary operation or bit_value_unop (other) if code is unary op.
+   In the case when code is nop_expr, no adjustment is required. */
+
+bool
+ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, enum tree_code code, tree operand)
+{
+  if (other.bottom_p ())
+    return set_to_bottom ();
+
+  if (bottom_p () || other.top_p ())
+    return false;
+
+  widest_int adjusted_value, adjusted_mask;
+
+  if (TREE_CODE_CLASS (code) == tcc_binary)
+    {
+      tree type = TREE_TYPE (operand);
+      gcc_assert (INTEGRAL_TYPE_P (type));
+      widest_int o_value, o_mask;
+      get_value_and_mask (operand, &o_value, &o_mask);
+
+      signop sgn = other.get_sign ();
+      unsigned prec = other.get_precision ();
+
+      bit_value_binop (code, sgn, prec, &adjusted_value, &adjusted_mask,
+		       sgn, prec, other.get_value (), other.get_mask (),
+		       TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
+
+      if (wi::sext (adjusted_mask, prec) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (TREE_CODE_CLASS (code) == tcc_unary)
+    {
+      signop sgn = other.get_sign ();
+      unsigned prec = other.get_precision ();
+
+      bit_value_unop (code, sgn, prec, &adjusted_value,
+		      &adjusted_mask, sgn, prec, other.get_value (),
+		      other.get_mask ());
+
+      if (wi::sext (adjusted_mask, prec) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (code == NOP_EXPR)
+    {
+      adjusted_value = other.value;
+      adjusted_mask = other.mask;
+    }
+
+  else
+    return set_to_bottom ();
+
+  if (top_p ())
+    {
+      if (wi::sext (adjusted_mask, other.get_precision ()) == -1)
+	return set_to_bottom ();
+      return set_to_constant (adjusted_value, adjusted_mask, other.get_sign (), other.get_precision ());
+    }
+  else
+    return meet_with_1 (adjusted_value, adjusted_mask);
+}
+
 /* Mark bot aggregate and scalar lattices as containing an unknown variable,
    return true is any of them has not been marked as such so far.  */
 
@@ -922,6 +1128,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
   ret |= plats->ctxlat.set_contains_variable ();
   ret |= set_agg_lats_contain_variable (plats);
   ret |= plats->alignment.set_to_bottom ();
+  ret |= plats->bits_lattice.set_to_bottom ();
   return ret;
 }
 
@@ -1003,6 +1210,7 @@ initialize_node_lattices (struct cgraph_node *node)
 	      plats->ctxlat.set_to_bottom ();
 	      set_agg_lats_to_bottom (plats);
 	      plats->alignment.set_to_bottom ();
+	      plats->bits_lattice.set_to_bottom ();
 	    }
 	  else
 	    set_all_contains_variable (plats);
@@ -1621,6 +1829,57 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
     }
 }
 
+/* Propagate bits across jfunc that is associated with
+   edge cs and update dest_lattice accordingly.  */
+
+bool
+propagate_bits_accross_jump_function (cgraph_edge *cs, ipa_jump_func *jfunc,
+				      ipcp_bits_lattice *dest_lattice)
+{
+  if (dest_lattice->bottom_p ())
+    return false;
+
+  if (jfunc->type == IPA_JF_PASS_THROUGH)
+    {
+      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
+      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
+      tree operand = NULL_TREE;
+
+      if (code != NOP_EXPR)
+	operand = ipa_get_jf_pass_through_operand (jfunc);
+
+      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
+      struct ipcp_param_lattices *src_lats
+	= ipa_get_parm_lattices (caller_info, src_idx);
+
+      /* Try to propagate bits if src_lattice is bottom, but jfunc is known.
+	 for eg consider:
+	 int f(int x)
+	 {
+	   g (x & 0xff);
+	 }
+	 Assume lattice for x is bottom, however we can still propagate
+	 result of x & 0xff == 0xff, which gets computed during ccp1 pass
+	 and we store it in jump function during analysis stage.  */
+
+      if (src_lats->bits_lattice.bottom_p ()
+	  && jfunc->bits.known)
+	return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+				        jfunc->bits.sgn, jfunc->bits.precision);
+      else
+	return dest_lattice->meet_with (src_lats->bits_lattice, code, operand);
+    }
+
+  else if (jfunc->type == IPA_JF_ANCESTOR)
+    return dest_lattice->set_to_bottom ();
+
+  else if (jfunc->bits.known) 
+    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+				    jfunc->bits.sgn, jfunc->bits.precision);
+  else
+    return dest_lattice->set_to_bottom ();
+}
+
 /* If DEST_PLATS already has aggregate items, check that aggs_by_ref matches
    NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true (in all
    other cases, return false).  If there are no aggregate items, set
@@ -1968,6 +2227,8 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
 							  &dest_plats->ctxlat);
 	  ret |= propagate_alignment_accross_jump_function (cs, jump_func,
 							 &dest_plats->alignment);
+	  ret |= propagate_bits_accross_jump_function (cs, jump_func,
+						       &dest_plats->bits_lattice);
 	  ret |= propagate_aggs_accross_jump_function (cs, jump_func,
 						       dest_plats);
 	}
@@ -4592,6 +4853,83 @@ ipcp_store_alignment_results (void)
   }
 }
 
+/* Look up all the bits information that we have discovered and copy it over
+   to the transformation summary.  */
+
+static void
+ipcp_store_bits_results (void)
+{
+  cgraph_node *node;
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      ipa_node_params *info = IPA_NODE_REF (node);
+      bool dumped_sth = false;
+      bool found_useful_result = false;
+
+      if (!opt_for_fn (node->decl, flag_ipa_cp_bit))
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Not considering %s for ipa bitwise propagation "
+				"; -fipa-cp-bit: disabled.\n",
+				node->name ());
+	  continue;
+	}
+
+      if (info->ipcp_orig_node)
+	info = IPA_NODE_REF (info->ipcp_orig_node);
+
+      unsigned count = ipa_get_param_count (info);
+      for (unsigned i = 0; i < count; i++)
+	{
+	  ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	  if (plats->bits_lattice.constant_p ())
+	    {
+	      found_useful_result = true;
+	      break;
+	    }
+	}
+
+    if (!found_useful_result)
+      continue;
+
+    ipcp_grow_transformations_if_necessary ();
+    ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+    vec_safe_reserve_exact (ts->bits, count);
+
+    for (unsigned i = 0; i < count; i++)
+      {
+	ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	ipa_bits bits_jfunc;			 
+
+	if (plats->bits_lattice.constant_p ())
+	  {
+	    bits_jfunc.known = true;
+	    bits_jfunc.value = plats->bits_lattice.get_value ();
+	    bits_jfunc.mask = plats->bits_lattice.get_mask ();
+	    bits_jfunc.sgn = plats->bits_lattice.get_sign ();
+	    bits_jfunc.precision = plats->bits_lattice.get_precision ();
+	  }
+	else
+	  bits_jfunc.known = false;
+
+	ts->bits->quick_push (bits_jfunc);
+	if (!dump_file || !bits_jfunc.known)
+	  continue;
+	if (!dumped_sth)
+	  {
+	    fprintf (dump_file, "Propagated bits info for function %s/%i:\n",
+				node->name (), node->order);
+	    dumped_sth = true;
+	  }
+	fprintf (dump_file, " param %i: value = ", i);
+	print_hex (bits_jfunc.value, dump_file);
+	fprintf (dump_file, ", mask = ");
+	print_hex (bits_jfunc.mask, dump_file);
+	fprintf (dump_file, "\n");
+      }
+    }
+}
 /* The IPCP driver.  */
 
 static unsigned int
@@ -4625,6 +4963,8 @@ ipcp_driver (void)
   ipcp_decision_stage (&topo);
   /* Store results of alignment propagation. */
   ipcp_store_alignment_results ();
+  /* Store results of bits propagation.  */
+  ipcp_store_bits_results ();
 
   /* Free all IPCP structures.  */
   free_toporder_info (&topo);
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 132b622..46955bb 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -302,6 +302,15 @@ ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs)
 	}
       else
 	fprintf (f, "         Unknown alignment\n");
+
+      if (jump_func->bits.known)
+	{
+	  fprintf (f, "         value: "); print_hex (jump_func->bits.value, f);
+	  fprintf (f, ", mask: "); print_hex (jump_func->bits.mask, f);
+	  fprintf (f, "\n");
+	}
+      else
+	fprintf (f, "         Unknown bits\n");
     }
 }
 
@@ -381,6 +390,7 @@ ipa_set_jf_unknown (struct ipa_jump_func *jfunc)
 {
   jfunc->type = IPA_JF_UNKNOWN;
   jfunc->alignment.known = false;
+  jfunc->bits.known = false;
 }
 
 /* Set JFUNC to be a copy of another jmp (to be used by jump function
@@ -1674,6 +1684,27 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
       else
 	gcc_assert (!jfunc->alignment.known);
 
+      if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
+	{
+	  jfunc->bits.known = true;
+	  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
+	  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
+	  
+	  if (TREE_CODE (arg) == SSA_NAME)
+	    {
+	      jfunc->bits.value = 0;
+	      jfunc->bits.mask = widest_int::from (get_nonzero_bits (arg), UNSIGNED); 
+	    }
+	  else
+	    {
+	      jfunc->bits.value = wi::to_widest (arg);
+	      jfunc->bits.mask = 0;
+	    }
+	}
+      else
+	gcc_assert (!jfunc->bits.known);
+
       if (is_gimple_ip_invariant (arg)
 	  || (TREE_CODE (arg) == VAR_DECL
 	      && is_global_var (arg)
@@ -3690,6 +3721,18 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst,
       for (unsigned i = 0; i < src_alignments->length (); ++i)
 	dst_alignments->quick_push ((*src_alignments)[i]);
     }
+
+  if (src_trans && vec_safe_length (src_trans->bits) > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      src_trans = ipcp_get_transformation_summary (src);
+      const vec<ipa_bits, va_gc> *src_bits = src_trans->bits;
+      vec<ipa_bits, va_gc> *&dst_bits
+	= ipcp_get_transformation_summary (dst)->bits;
+      vec_safe_reserve_exact (dst_bits, src_bits->length ());
+      for (unsigned i = 0; i < src_bits->length (); ++i)
+	dst_bits->quick_push ((*src_bits)[i]);
+    }
 }
 
 /* Register our cgraph hooks if they are not already there.  */
@@ -4609,6 +4652,17 @@ ipa_write_jump_function (struct output_block *ob,
       streamer_write_uhwi (ob, jump_func->alignment.align);
       streamer_write_uhwi (ob, jump_func->alignment.misalign);
     }
+
+  bp = bitpack_create (ob->main_stream);
+  bp_pack_value (&bp, jump_func->bits.known, 1);
+  streamer_write_bitpack (&bp);
+  if (jump_func->bits.known)
+    {
+      streamer_write_widest_int (ob, jump_func->bits.value);
+      streamer_write_widest_int (ob, jump_func->bits.mask);
+      streamer_write_enum (ob->main_stream, signop, UNSIGNED + 1, jump_func->bits.sgn);
+      streamer_write_uhwi (ob, jump_func->bits.precision);
+    }   
 }
 
 /* Read in jump function JUMP_FUNC from IB.  */
@@ -4685,6 +4739,19 @@ ipa_read_jump_function (struct lto_input_block *ib,
     }
   else
     jump_func->alignment.known = false;
+
+  bp = streamer_read_bitpack (ib);
+  bool bits_known = bp_unpack_value (&bp, 1);
+  if (bits_known)
+    {
+      jump_func->bits.known = true;
+      jump_func->bits.value = streamer_read_widest_int (ib);
+      jump_func->bits.mask = streamer_read_widest_int (ib);
+      jump_func->bits.sgn = streamer_read_enum (ib, signop, UNSIGNED + 1);
+      jump_func->bits.precision = streamer_read_uhwi (ib); 
+    }
+  else
+    jump_func->bits.known = false;
 }
 
 /* Stream out parts of cgraph_indirect_call_info corresponding to CS that are
@@ -5050,6 +5117,31 @@ write_ipcp_transformation_info (output_block *ob, cgraph_node *node)
     }
   else
     streamer_write_uhwi (ob, 0);
+
+  ts = ipcp_get_transformation_summary (node);
+  if (ts && vec_safe_length (ts->bits) > 0)
+    {
+      count = ts->bits->length ();
+      streamer_write_uhwi (ob, count);
+
+      for (unsigned i = 0; i < count; ++i)
+	{
+	  const ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = bitpack_create (ob->main_stream);
+	  bp_pack_value (&bp, bits_jfunc.known, 1);
+	  streamer_write_bitpack (&bp);
+	  if (bits_jfunc.known)
+	    {
+	      streamer_write_widest_int (ob, bits_jfunc.value);
+	      streamer_write_widest_int (ob, bits_jfunc.mask);
+	      streamer_write_enum (ob->main_stream, signop,
+				   UNSIGNED + 1, bits_jfunc.sgn);
+	      streamer_write_uhwi (ob, bits_jfunc.precision);
+	    }
+	}
+    }
+  else
+    streamer_write_uhwi (ob, 0);
 }
 
 /* Stream in the aggregate value replacement chain for NODE from IB.  */
@@ -5102,6 +5194,28 @@ read_ipcp_transformation_info (lto_input_block *ib, cgraph_node *node,
 	    }
 	}
     }
+
+  count = streamer_read_uhwi (ib);
+  if (count > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+      vec_safe_grow_cleared (ts->bits, count);
+
+      for (i = 0; i < count; i++)
+	{
+	  ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = streamer_read_bitpack (ib);
+	  bits_jfunc.known = bp_unpack_value (&bp, 1);
+	  if (bits_jfunc.known)
+	    {
+	      bits_jfunc.value = streamer_read_widest_int (ib);
+	      bits_jfunc.mask = streamer_read_widest_int (ib);
+	      bits_jfunc.sgn = streamer_read_enum (ib, signop, UNSIGNED + 1);
+	      bits_jfunc.precision = streamer_read_uhwi (ib);
+	    }
+	}
+    }
 }
 
 /* Write all aggregate replacement for nodes in set.  */
@@ -5404,6 +5518,54 @@ ipcp_update_alignments (struct cgraph_node *node)
     }
 }
 
+/* Update bits info of formal parameters as described in
+   ipcp_transformation_summary.  */
+
+static void
+ipcp_update_bits (struct cgraph_node *node)
+{
+  tree parm = DECL_ARGUMENTS (node->decl);
+  tree next_parm = parm;
+  ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+
+  if (!ts || vec_safe_length (ts->bits) == 0)
+    return;
+
+  vec<ipa_bits, va_gc> &bits = *ts->bits;
+  unsigned count = bits.length ();
+
+  for (unsigned i = 0; i < count; ++i, parm = next_parm)
+    {
+      if (node->clone.combined_args_to_skip
+	  && bitmap_bit_p (node->clone.combined_args_to_skip, i))
+	continue;
+
+      gcc_checking_assert (parm);
+      next_parm = DECL_CHAIN (parm);
+
+      if (!bits[i].known
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (parm))
+	  || !is_gimple_reg (parm))
+	continue;       
+
+      tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm);
+      if (!ddef)
+	continue;
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Adjusting mask for param %u to ", i); 
+	  print_hex (bits[i].mask, dump_file);
+	  fprintf (dump_file, "\n");
+	}
+
+      unsigned prec = TYPE_PRECISION (TREE_TYPE (ddef));
+      wide_int nonzero_bits = wide_int::from (bits[i].mask, prec, UNSIGNED)
+			      | wide_int::from (bits[i].value, prec, bits[i].sgn);
+      set_nonzero_bits (ddef, nonzero_bits);
+    }
+}
+
 /* IPCP transformation phase doing propagation of aggregate values.  */
 
 unsigned int
@@ -5423,6 +5585,7 @@ ipcp_transform_function (struct cgraph_node *node)
 	     node->name (), node->order);
 
   ipcp_update_alignments (node);
+  ipcp_update_bits (node);
   aggval = ipa_get_agg_replacements_for_node (node);
   if (!aggval)
       return 0;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index e32d078..1b9d0ef 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -154,6 +154,23 @@ struct GTY(()) ipa_alignment
   unsigned misalign;
 };
 
+/* Information about zero/non-zero bits.  */
+struct GTY(()) ipa_bits
+{
+  /* The propagated value.  */
+  widest_int value;
+  /* Mask corresponding to the value.
+     Similar to ccp_prop_t, if xth bit of mask is 0,
+     implies xth bit of value is constant.  */
+  widest_int mask;
+  /* Original precision of the value.  */
+  unsigned precision;
+  /* Sign obtained from TYPE_SIGN.  */
+  enum signop sgn;
+  /* True if jump function is known.  */
+  bool known;
+};
+
 /* A jump function for a callsite represents the values passed as actual
    arguments of the callsite. See enum jump_func_type for the various
    types of jump functions supported.  */
@@ -166,6 +183,9 @@ struct GTY (()) ipa_jump_func
   /* Information about alignment of pointers. */
   struct ipa_alignment alignment;
 
+  /* Information about zero/non-zero bits.  */
+  struct ipa_bits bits;
+
   enum jump_func_type type;
   /* Represents a value of a jump function.  pass_through is used only in jump
      function context.  constant represents the actual constant in constant jump
@@ -482,6 +502,8 @@ struct GTY(()) ipcp_transformation_summary
   ipa_agg_replacement_value *agg_values;
   /* Alignment information for pointers.  */
   vec<ipa_alignment, va_gc> *alignments;
+  /* Known bits information.  */
+  vec<ipa_bits, va_gc> *bits;
 };
 
 void ipa_set_node_agg_value_chain (struct cgraph_node *node,
diff --git a/gcc/opts.c b/gcc/opts.c
index 4053fb1..cde9a7b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -505,6 +505,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp_alignment, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fipa_cp_bit, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_sra, NULL, 1 },
@@ -1422,6 +1423,9 @@ enable_fdo_optimizations (struct gcc_options *opts,
   if (!opts_set->x_flag_ipa_cp_alignment
       && value && opts->x_flag_ipa_cp)
     opts->x_flag_ipa_cp_alignment = value;
+  if (!opts_set->x_flag_ipa_cp_bit
+      && value && opts->x_flag_ipa_cp)
+    opts->x_flag_ipa_cp_bit = value;
   if (!opts_set->x_flag_predictive_commoning)
     opts->x_flag_predictive_commoning = value;
   if (!opts_set->x_flag_unswitch_loops)


* Re: [RFC] ipa bitwise constant propagation
  2016-08-07 21:38   ` Prathamesh Kulkarni
@ 2016-08-08 14:04     ` Martin Jambor
  2016-08-08 14:29       ` David Malcolm
  2016-08-09  8:11       ` Prathamesh Kulkarni
  0 siblings, 2 replies; 31+ messages in thread
From: Martin Jambor @ 2016-08-08 14:04 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Richard Biener, Jan Hubicka, Kugan Vivekanandarajah, gcc Patches

Hi,

thanks for following through.  You'll need an approval from Honza, but
I think the code looks good (I have looked at the places that I
believe have changed since last week).  However, I have discovered
one new thing I don't like, and I still believe you need to handle
different precisions in the lattice:

On Mon, Aug 08, 2016 at 03:08:35AM +0530, Prathamesh Kulkarni wrote:
> On 5 August 2016 at 18:06, Martin Jambor <mjambor@suse.cz> wrote:
>
> ...
>
> >> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> >> index 5b6cb9a..b770f6a 100644
> >> --- a/gcc/ipa-cp.c
> >> +++ b/gcc/ipa-cp.c
> >> @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
> >>  #include "params.h"
> >>  #include "ipa-inline.h"
> >>  #include "ipa-utils.h"
> >> +#include "tree-ssa-ccp.h"
> >>
> >>  template <typename valtype> class ipcp_value;
> >>
> >> @@ -266,6 +267,40 @@ private:
> >>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
> >>  };
> >>
> >> +/* Lattice of known bits, only capable of holding one value.
> >> +   Similar to ccp_prop_value_t, mask represents which bits of value are constant.
> >> +   If a bit in mask is set to 0, then the corresponding bit in
> >> +   value is known to be constant.  */
> >> +
> >> +class ipcp_bits_lattice
> >> +{
> >> +public:
> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
> >> +  bool set_to_bottom ();
> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
> >> +
> >> +  widest_int get_value () { return value; }
> >> +  widest_int get_mask () { return mask; }
> >> +  signop get_sign () { return sgn; }
> >> +  unsigned get_precision () { return precision; }
> >> +
> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
> >> +
> >> +  void print (FILE *);
> >> +
> >> +private:
> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
> >> +  widest_int value, mask;
> >> +  signop sgn;
> >> +  unsigned precision;

I know that the existing code in ipa-cp.c does not do this, but please
prefix member variables with "m_" like our coding style guidelines
suggest (or even require?).  You routinely reuse those same names in
names of parameters of meet_with and I believe that is a practice that
will sooner or later lead to confusing the two and bugs.

> >> +
> >> +  bool meet_with_1 (widest_int, widest_int);
> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
> >> +};
> >> +
> >>  /* Structure containing lattices for a parameter itself and for pieces of
> >>     aggregates that are passed in the parameter or by a reference in a parameter
> >>     plus some other useful flags.  */
> >> @@ -281,6 +316,8 @@ public:
> >>    ipcp_agg_lattice *aggs;
> >>    /* Lattice describing known alignment.  */
> >>    ipcp_alignment_lattice alignment;
> >> +  /* Lattice describing known bits.  */
> >> +  ipcp_bits_lattice bits_lattice;
> >>    /* Number of aggregate lattices */
> >>    int aggs_count;
> >>    /* True if aggregate data were passed by reference (as opposed to by
> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
> >>      fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
> >>  }
> >>
>
> ...
>
> >> +/* Convert operand to value, mask form.  */
> >> +
> >> +void
> >> +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
> >> +{
> >> +  wide_int get_nonzero_bits (const_tree);
> >> +
> >> +  if (TREE_CODE (operand) == INTEGER_CST)
> >> +    {
> >> +      *valuep = wi::to_widest (operand);
> >> +      *maskp = 0;
> >> +    }
> >> +  else if (TREE_CODE (operand) == SSA_NAME)
> >
> > IIUC, operand is the operand from pass-through jump function and that
> > should never be an SSA_NAME.  I have even looked at how we generate
> > them and it seems fairly safe to say that they never are.  If you have
> > seen an SSA_NAME here, it is a bug and please let me know because
> > sooner or later it will cause an assert.
> >
> >> +    {
> >> +      *valuep = 0;
> >> +      *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
> >> +    }
> >> +  else
> >> +    gcc_unreachable ();
> >
> > The operand however can be any other is_gimple_ip_invariant tree.
> > I assume that you could hit this gcc_unreachable only in a program
> > with undefined behavior (or with a Fortran CONST_DECL?) but you should
> > not ICE here.
> Changed to:
> if (TREE_CODE (operand) == INTEGER_CST)
>     {
>       *valuep = wi::to_widest (operand);
>       *maskp = 0;
>     }
>   else
>     {
>       *valuep = 0;
>       *maskp = -1;
>     }
> 
> I am not sure how to extract nonzero bits for gimple_ip_invariant if
> it's not INTEGER_CST,

I don't think that you reasonably can.

> so setting to unknown (value = 0, mask = -1).
> Does this look OK ?

Yes.

> >
> >
> >> +}
> >> +
> >> +/* Meet operation, similar to ccp_lattice_meet, we xor values
> >> +   if this->value, value have different values at same bit positions, we want
> >> +   to drop that bit to varying. Return true if mask is changed.
> >> +   This function assumes that the lattice value is in CONSTANT state  */
> >> +
> >> +bool
> >> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
> >> +{
> >> +  gcc_assert (constant_p ());
> >> +
> >> +  widest_int old_mask = this->mask;
> >> +  this->mask = (this->mask | mask) | (this->value ^ value);
> >> +
> >> +  if (wi::sext (this->mask, this->precision) == -1)
> >> +    return set_to_bottom ();
> >> +
> >> +  bool changed = this->mask != old_mask;
> >> +  return changed;
> >> +}
> >> +
> >> +/* Meet the bits lattice with operand
> >> +   described by <value, mask, sgn, precision>.  */
> >> +
> >> +bool
> >> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
> >> +                           signop sgn, unsigned precision)
> >> +{
> >> +  if (bottom_p ())
> >> +    return false;
> >> +
> >> +  if (top_p ())
> >> +    {
> >> +      if (wi::sext (mask, precision) == -1)
> >> +     return set_to_bottom ();
> >> +      return set_to_constant (value, mask, sgn, precision);
> >> +    }
> >> +
> >> +  return meet_with_1 (value, mask);
> >
> > What if precisions do not match?
> Sorry I don't understand. Since we extend to widest_int, precision
> would be same ?

I meant what if:

  this->precision != precision /* the parameter value */

(you see, giving both the same name is error-prone).  You are
propagating the recorded precision gathered from types of the actual
arguments in calls, those can be different.  For example, one caller
can pass a direct integer value with full integer precision, another
caller can pass in that same argument an enum value with a very
limited precision.  Your code ignores that difference and the
resulting precision is the one that arrives first.  If it is the enum,
it might be too small for the integer value from the other call-site?

> bit_value_binop_1 requires original precision for few cases (shifts,
> rotates, plus, mult), so

Yeah, so wrong precision could only be used if it got fed into the
binary operation routines, making the bug much harder to trigger.  But
it would still be a bug (or you do not need to care for original
precisions at all).
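
To make the concern concrete, here is a minimal sketch, not GCC code.  It
assumes a 64-bit word in place of widest_int, a hypothetical toy_bits_lattice
that mirrors the quoted meet_with/meet_with_1 logic, and a made-up narrow
caller with precision 2.  Because the precision is recorded only on the
TOP -> CONSTANT transition, the final state depends on which call site is
visited first:

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low PREC bits of X (1 <= PREC <= 64).  A simplified
// stand-in for wi::sext on a single 64-bit word.
int64_t
sext (uint64_t x, unsigned prec)
{
  if (prec >= 64)
    return (int64_t) x;
  uint64_t sign = uint64_t (1) << (prec - 1);
  x &= (uint64_t (1) << prec) - 1;
  return (int64_t) ((x ^ sign) - sign);
}

// Hypothetical model of the lattice; a mask bit of 1 means "unknown".
struct toy_bits_lattice
{
  enum state { TOP, CONSTANT, BOTTOM };
  state m_state = TOP;
  uint64_t m_value = 0, m_mask = 0;
  unsigned m_precision = 0;

  bool meet_with (uint64_t value, uint64_t mask, unsigned precision)
  {
    if (m_state == BOTTOM)
      return false;
    if (m_state == TOP)
      {
        if (sext (mask, precision) == -1)
          {
            m_state = BOTTOM;
            return true;
          }
        m_state = CONSTANT;
        m_value = value;
        m_mask = mask;
        m_precision = precision;  // recorded once, from the first caller
        return true;
      }
    // As in the quoted meet_with_1, PRECISION is ignored from here on.
    uint64_t old_mask = m_mask;
    m_mask = (m_mask | mask) | (m_value ^ value);
    if (sext (m_mask, m_precision) == -1)
      {
        m_state = BOTTOM;
        return true;
      }
    return m_mask != old_mask;
  }
};

// Feed the same two call sites in both orders and report whether the
// lattice ends up in the same state.
bool
order_independent ()
{
  toy_bits_lattice a, b;
  a.meet_with (0, 0, 2);     // narrow caller (precision 2), constant 0
  a.meet_with (0, 0x3, 32);  // int caller, low two bits unknown
  b.meet_with (0, 0x3, 32);
  b.meet_with (0, 0, 2);
  return a.m_state == b.m_state;
}
```

In this sketch the first order drops to BOTTOM (the mask 0x3 looks
"all unknown" at precision 2) while the second stays CONSTANT, so
order_independent () returns false.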

> I was preserving the original precision in jump function.
> Later in ipcp_update_bits(), the mask is set after narrowing to the
> precision of the parameter.
> >
> >> +}
> >> +
> >> +/* Meet bits lattice with the result of bit_value_binop_1 (other, operand)
> >> +   if code is binary operation or bit_value_unop_1 (other) if code is unary op.
> >> +   In the case when code is nop_expr, no adjustment is required. */
> >> +
> >> +bool
> >> +ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, enum tree_code code, tree operand)
> >> +{
> >> +  if (other.bottom_p ())
> >> +    return set_to_bottom ();
> >> +
> >> +  if (bottom_p () || other.top_p ())
> >> +    return false;
> >> +
> >> +  widest_int adjusted_value, adjusted_mask;
> >> +
> >> +  if (TREE_CODE_CLASS (code) == tcc_binary)
> >> +    {
> >> +      tree type = TREE_TYPE (operand);
> >> +      gcc_assert (INTEGRAL_TYPE_P (type));
> >> +      widest_int o_value, o_mask;
> >> +      get_value_and_mask (operand, &o_value, &o_mask);
> >> +
> >> +      signop sgn = other.get_sign ();
> >> +      unsigned prec = other.get_precision ();
> >> +
> >> +      bit_value_binop_1 (code, sgn, prec, &adjusted_value, &adjusted_mask,
> >> +                      sgn, prec, other.get_value (), other.get_mask (),
> >> +                      TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
> >
> > It is probably just me not being particularly sharp on a Friday
> > afternoon and I might not understand the semantics of mask well (also,
> > you did not document it :-), but... assume that we are looking at a
> > binary and operation, other comes from an SSA pointer and its mask
> > would be binary 100 and its value 0 because that's what you set for
> > ssa names in ipa-prop.h, and the operand is binary value 101, which
> > means that get_value_and_mask returns mask 0 and value 101.  Now,
> > bit_value_binop_1 would return value 0 & 101 = 0 and mask according to
> >
> > (m1 | m2) & ((v1 | m1) & (v2 | m2))
> >
> > so in our case
> >
> > (100b & 0) & ((0 | 100b) & (101b | 0)) = 0 & 100b = 0.
> Shouldn't this be:
> (100b | 0) & ((0 | 100b) & (101b | 0)) = 100 & 100 = 100 -;)

Eh, right, sorry.  I just find the term mask confusing when we do not
actually mask anything with it (but I guess it is good to be
consistent so let's keep it).
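
For reference, the corrected arithmetic can be checked with a small sketch.
This is not the bit_value_binop_1 implementation; bits_pair and and_bits are
hypothetical names, a 64-bit word stands in for widest_int, and only the AND
formula quoted above is modelled:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical <value, mask> pair: a mask bit of 1 means the bit is
// unknown, a mask bit of 0 means the corresponding bit of value is known.
struct bits_pair
{
  uint64_t value;
  uint64_t mask;
};

// Bitwise AND on <value, mask> pairs using the formula from the thread:
//   mask  = (m1 | m2) & ((v1 | m1) & (v2 | m2))
//   value = v1 & v2
bits_pair
and_bits (bits_pair a, bits_pair b)
{
  bits_pair r;
  r.value = a.value & b.value;
  r.mask = (a.mask | b.mask) & ((a.value | a.mask) & (b.value | b.mask));
  return r;
}

// The case discussed above: other = <value 0, mask 100b>,
// operand = <value 101b, mask 0>.
bits_pair
thread_example ()
{
  bits_pair other = { 0x0, 0x4 };
  bits_pair operand = { 0x5, 0x0 };
  return and_bits (other, operand);
}
```

thread_example () yields value 0 and mask 100b: bit 2 stays unknown because
it is unknown in one input and set in the other, while every other bit is
known to be zero.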

> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> index e32d078..1b9d0ef 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -154,6 +154,23 @@ struct GTY(()) ipa_alignment
>    unsigned misalign;
>  };
>  
> +/* Information about zero/non-zero bits.  */
> +struct GTY(()) ipa_bits
> +{
> +  /* The propagated value.  */
> +  widest_int value;
> +  /* Mask corresponding to the value.
> +     Similar to ccp_prop_t, if xth bit of mask is 0,

Is ccp_prop_t a typo? I did not find it anywhere when I grepped for it.

Thanks,

Martin


* Re: [RFC] ipa bitwise constant propagation
  2016-08-08 14:04     ` Martin Jambor
@ 2016-08-08 14:29       ` David Malcolm
  2016-08-09  8:11       ` Prathamesh Kulkarni
  1 sibling, 0 replies; 31+ messages in thread
From: David Malcolm @ 2016-08-08 14:29 UTC (permalink / raw)
  To: Martin Jambor, Prathamesh Kulkarni
  Cc: Richard Biener, Jan Hubicka, Kugan Vivekanandarajah, gcc Patches

On Mon, 2016-08-08 at 16:03 +0200, Martin Jambor wrote:
> Hi,
> 
> thanks for following through.  You'll need an approval from Honza,
> but
> I think the code looks good (I have looked at the places that I
> believe have changed since the last week).  However, I have
> discovered
> one new thing I don't like and still believe you need to handle
> different precisions in lattice need:
> 
> On Mon, Aug 08, 2016 at 03:08:35AM +0530, Prathamesh Kulkarni wrote:
> > On 5 August 2016 at 18:06, Martin Jambor <mjambor@suse.cz> wrote:
> > 
> > ...
> > 
> > > > diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> > > > index 5b6cb9a..b770f6a 100644
> > > > --- a/gcc/ipa-cp.c
> > > > +++ b/gcc/ipa-cp.c
> > > > @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If
> > > > not see
> > > >  #include "params.h"
> > > >  #include "ipa-inline.h"
> > > >  #include "ipa-utils.h"
> > > > +#include "tree-ssa-ccp.h"
> > > > 
> > > >  template <typename valtype> class ipcp_value;
> > > > 
> > > > @@ -266,6 +267,40 @@ private:
> > > >    bool meet_with_1 (unsigned new_align, unsigned
> > > > new_misalign);
> > > >  };
> > > > 
> > > > +/* Lattice of known bits, only capable of holding one value.
> > > > +   Similar to ccp_prop_value_t, mask represents which bits of
> > > > value are constant.
> > > > +   If a bit in mask is set to 0, then the corresponding bit in
> > > > +   value is known to be constant.  */
> > > > +
> > > > +class ipcp_bits_lattice
> > > > +{
> > > > +public:
> > > > +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
> > > > +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
> > > > +  bool constant_p () { return lattice_val ==
> > > > IPA_BITS_CONSTANT; }
> > > > +  bool set_to_bottom ();
> > > > +  bool set_to_constant (widest_int, widest_int, signop,
> > > > unsigned);
> > > > +
> > > > +  widest_int get_value () { return value; }
> > > > +  widest_int get_mask () { return mask; }
> > > > +  signop get_sign () { return sgn; }
> > > > +  unsigned get_precision () { return precision; }
> > > > +
> > > > +  bool meet_with (ipcp_bits_lattice& other, enum tree_code,
> > > > tree);
> > > > +  bool meet_with (widest_int, widest_int, signop, unsigned);
> > > > +
> > > > +  void print (FILE *);
> > > > +
> > > > +private:
> > > > +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT,
> > > > IPA_BITS_VARYING } lattice_val;
> > > > +  widest_int value, mask;
> > > > +  signop sgn;
> > > > +  unsigned precision;
> 
> I know that the existing code in ipa-cp.c does not do this, but
> please
> prefix member variables with "m_" like our coding style guidelines
> suggest (or even require?).  You routinely reuse those same names in
> names of parameters of meet_with and I believe that is a practice
> that
> will sooner or later lead to confusing the two and bugs.

I'm not a reviewer, and not very familiar with this code, but is it
possible to add a couple of examples to the descriptive comment of
class ipcp_bits_lattice?  I'm finding it hard to understand how the
various fields interact, in particular "value" and "mask" (or rather
"m_value" and "m_mask").  I think a concrete example would make
things much clearer.  This thread talked about this below...

[...]

> > > It is probably just me not being particularly sharp on a Friday
> > > afternoon and I might not understand the semantics of mask well
> > > (also,
> > > you did not document it :-), but... assume that we are looking at
> > > a
> > > binary and operation, other comes from an SSA pointer and its
> > > mask
> > > would be binary 100 and its value 0 because that's what you set
> > > for
> > > ssa names in ipa-prop.h, and the operand is binary value 101,
> > > which
> > > means that get_value_and_mask returns mask 0 and value 101.  Now,
> > > bit_value_binop_1 would return value 0 & 101 = 0 and mask
> > > according to
> > > 
> > > (m1 | m2) & ((v1 | m1) & (v2 | m2))
> > > 
> > > so in our case
> > > 
> > > (100b & 0) & ((0 | 100b) & (101b | 0)) = 0 & 100b = 0.
> > Shouldn't this be:
> > (100b | 0) & ((0 | 100b) & (101b | 0)) = 100 & 100 = 100 -;)
> 
> Eh, right, sorry.  I just find the term mask confusing when we do not
> actually mask anything with it (but I guess it is good to be
> consistent so let's keep it).

...so presumably it would be good to capture something like that within
the descriptive comment of the class.
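
As one possible shape for such an example (a sketch only; admits is a
hypothetical helper that is not part of the patch, and a 64-bit word stands
in for widest_int): a <value, mask> pair describes the set of numbers that
agree with value on every bit where mask is 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does the pair <value, mask> allow the concrete
// number X?  Bits where mask is 0 must equal the corresponding bits of
// value; bits where mask is 1 may be anything.
bool
admits (uint64_t value, uint64_t mask, uint64_t x)
{
  return ((x ^ value) & ~mask) == 0;
}

// A few concrete pairs, checked with assertions:
//   <value 5,   mask 0>     only the constant 5
//   <value 100b, mask 011b> bit 2 known to be 1, low two bits unknown
//   <value 0,   mask 0xff>  e.g. the result of "x & 0xff"
void
check_examples ()
{
  assert (admits (5, 0, 5) && !admits (5, 0, 4));
  assert (admits (0x4, 0x3, 4) && admits (0x4, 0x3, 7));
  assert (!admits (0x4, 0x3, 0) && !admits (0x4, 0x3, 8));
  assert (admits (0, 0xff, 255) && !admits (0, 0xff, 256));
}
```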

[...]

Hope this is constructive
Dave


* Re: [RFC] ipa bitwise constant propagation
  2016-08-08 14:04     ` Martin Jambor
  2016-08-08 14:29       ` David Malcolm
@ 2016-08-09  8:11       ` Prathamesh Kulkarni
  2016-08-09  9:24         ` Richard Biener
  2016-08-09 11:09         ` Martin Jambor
  1 sibling, 2 replies; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-09  8:11 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener, Jan Hubicka,
	Kugan Vivekanandarajah, gcc Patches

On 8 August 2016 at 19:33, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> thanks for following through.  You'll need an approval from Honza, but
> I think the code looks good (I have looked at the places that I
> believe have changed since the last week).  However, I have discovered
> one new thing I don't like and still believe you need to handle
> different precisions in lattice need:
>
> On Mon, Aug 08, 2016 at 03:08:35AM +0530, Prathamesh Kulkarni wrote:
>> On 5 August 2016 at 18:06, Martin Jambor <mjambor@suse.cz> wrote:
>>
>> ...
>>
>> >> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
>> >> index 5b6cb9a..b770f6a 100644
>> >> --- a/gcc/ipa-cp.c
>> >> +++ b/gcc/ipa-cp.c
>> >> @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
>> >>  #include "params.h"
>> >>  #include "ipa-inline.h"
>> >>  #include "ipa-utils.h"
>> >> +#include "tree-ssa-ccp.h"
>> >>
>> >>  template <typename valtype> class ipcp_value;
>> >>
>> >> @@ -266,6 +267,40 @@ private:
>> >>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
>> >>  };
>> >>
>> >> +/* Lattice of known bits, only capable of holding one value.
>> >> +   Similar to ccp_prop_value_t, mask represents which bits of value are constant.
>> >> +   If a bit in mask is set to 0, then the corresponding bit in
>> >> +   value is known to be constant.  */
>> >> +
>> >> +class ipcp_bits_lattice
>> >> +{
>> >> +public:
>> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
>> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
>> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
>> >> +  bool set_to_bottom ();
>> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
>> >> +
>> >> +  widest_int get_value () { return value; }
>> >> +  widest_int get_mask () { return mask; }
>> >> +  signop get_sign () { return sgn; }
>> >> +  unsigned get_precision () { return precision; }
>> >> +
>> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
>> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
>> >> +
>> >> +  void print (FILE *);
>> >> +
>> >> +private:
>> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
>> >> +  widest_int value, mask;
>> >> +  signop sgn;
>> >> +  unsigned precision;
>
> I know that the existing code in ipa-cp.c does not do this, but please
> prefix member variables with "m_" like our coding style guidelines
> suggest (or even require?).  You routinely reuse those same names in
> names of parameters of meet_with and I believe that is a practice that
> will sooner or later lead to confusing the two and bugs.
Sorry about this, will change to m_ prefix in followup patch.
>
>> >> +
>> >> +  bool meet_with_1 (widest_int, widest_int);
>> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
>> >> +};
>> >> +
>> >>  /* Structure containing lattices for a parameter itself and for pieces of
>> >>     aggregates that are passed in the parameter or by a reference in a parameter
>> >>     plus some other useful flags.  */
>> >> @@ -281,6 +316,8 @@ public:
>> >>    ipcp_agg_lattice *aggs;
>> >>    /* Lattice describing known alignment.  */
>> >>    ipcp_alignment_lattice alignment;
>> >> +  /* Lattice describing known bits.  */
>> >> +  ipcp_bits_lattice bits_lattice;
>> >>    /* Number of aggregate lattices */
>> >>    int aggs_count;
>> >>    /* True if aggregate data were passed by reference (as opposed to by
>> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
>> >>      fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
>> >>  }
>> >>
>>
>> ...
>>
>> >> +/* Convert operand to value, mask form.  */
>> >> +
>> >> +void
>> >> +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
>> >> +{
>> >> +  wide_int get_nonzero_bits (const_tree);
>> >> +
>> >> +  if (TREE_CODE (operand) == INTEGER_CST)
>> >> +    {
>> >> +      *valuep = wi::to_widest (operand);
>> >> +      *maskp = 0;
>> >> +    }
>> >> +  else if (TREE_CODE (operand) == SSA_NAME)
>> >
>> > IIUC, operand is the operand from pass-through jump function and that
>> > should never be an SSA_NAME.  I have even looked at how we generate
>> > them and it seems fairly safe to say that they never are.  If you have
>> > seen an SSA_NAME here, it is a bug and please let me know because
>> > sooner or later it will cause an assert.
>> >
>> >> +    {
>> >> +      *valuep = 0;
>> >> +      *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
>> >> +    }
>> >> +  else
>> >> +    gcc_unreachable ();
>> >
>> > The operand however can be any other is_gimple_ip_invariant tree.
>> > I assume that you could hit this gcc_unreachable only in a program
>> > with undefined behavior (or with a Fortran CONST_DECL?) but you should
>> > not ICE here.
>> Changed to:
>> if (TREE_CODE (operand) == INTEGER_CST)
>>     {
>>       *valuep = wi::to_widest (operand);
>>       *maskp = 0;
>>     }
>>   else
>>     {
>>       *valuep = 0;
>>       *maskp = -1;
>>     }
>>
>> I am not sure how to extract nonzero bits for gimple_ip_invariant if
>> it's not INTEGER_CST,
>
> I don't think that you reasonably can.
>
>> so setting to unknown (value = 0, mask = -1).
>> Does this look OK ?
>
> Yes.
>
>> >
>> >
>> >> +}
>> >> +
>> >> +/* Meet operation, similar to ccp_lattice_meet, we xor values
>> >> +   if this->value, value have different values at same bit positions, we want
>> >> +   to drop that bit to varying. Return true if mask is changed.
>> >> +   This function assumes that the lattice value is in CONSTANT state  */
>> >> +
>> >> +bool
>> >> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
>> >> +{
>> >> +  gcc_assert (constant_p ());
>> >> +
>> >> +  widest_int old_mask = this->mask;
>> >> +  this->mask = (this->mask | mask) | (this->value ^ value);
>> >> +
>> >> +  if (wi::sext (this->mask, this->precision) == -1)
>> >> +    return set_to_bottom ();
>> >> +
>> >> +  bool changed = this->mask != old_mask;
>> >> +  return changed;
>> >> +}
>> >> +
>> >> +/* Meet the bits lattice with operand
>> >> +   described by <value, mask, sgn, precision>.  */
>> >> +
>> >> +bool
>> >> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
>> >> +                           signop sgn, unsigned precision)
>> >> +{
>> >> +  if (bottom_p ())
>> >> +    return false;
>> >> +
>> >> +  if (top_p ())
>> >> +    {
>> >> +      if (wi::sext (mask, precision) == -1)
>> >> +     return set_to_bottom ();
>> >> +      return set_to_constant (value, mask, sgn, precision);
>> >> +    }
>> >> +
>> >> +  return meet_with_1 (value, mask);
>> >
>> > What if precisions do not match?
>> Sorry I don't understand. Since we extend to widest_int, precision
>> would be same ?
>
> I meant what if:
>
>   this->precision != precision /* the parameter value */
>
> (you see, giving both the same name is error-prone).  You are
> propagating the recorded precision gathered from types of the actual
> arguments in calls, those can be different.  For example, one caller
> can pass a direct integer value with full integer precision, another
> caller can pass in that same argument an enum value with a very
> limited precision.  Your code ignores that difference and the
> resulting precision is the one that arrives first.  If it is the enum,
> it might be too small for the integer value from the other call-site?
Ah, indeed the patch incorrectly propagates the precision of the argument.
So IIUC in ipcp_bits_lattice, we want m_precision to be the precision
of the parameter's type and _not_ of the argument's type.

The patch incorrectly propagates precision in the following case:

__attribute__((noinline))
static int f2(short z)
{
  return z;
}

__attribute__((noinline))
static int f1(int y)
{
  return f2 (y & 0xff);
}

__attribute__((noinline))
int f(int x)
{
  return f1 (x);
}

Precision for 'z' should be 16, while the patch propagates 32, which
is the precision of the type of the argument passed by the caller.
We only set m_precision when changing from TOP to CONSTANT
state.

Instead of storing the argument's precision and sign, we should store
the parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
Diff with respect to the previous patch:

@@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
ipa_func_body_info *fbi,
   && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
  {
   jfunc->bits.known = true;
-  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
-  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
-
+  jfunc->bits.sgn = TYPE_SIGN (param_type);
+  jfunc->bits.precision = TYPE_PRECISION (param_type);
+
   if (TREE_CODE (arg) == SSA_NAME)
     {
       jfunc->bits.value = 0;

So in ipcp_bits_lattice::meet_with (value, mask, signop, precision),
when we propagate into a parameter's lattice for the first time, we will
set m_precision to the precision of the parameter's own type rather than
the precision of the argument.

For example, consider the following test-case:
int f(int x)
{
  return some_operation (x);
}

int f1(short y)
{
  return f (y & 0xf);
}

int f2(char z)
{
  return f (z & 0xff);
}

Assume we first propagate from f2->f.
In this case, jump_function from f2->f is unknown (but bits.known is true),
so we call meet_with (0, 0xff, SIGNED, 32).
The precision and sign are of param's type because we extract them
from param_type as shown above.

(I suppose the reason this is not pass-thru is because result y & 0xf
is wrapped by convert_expr,
which is actually passed to f(), so the parameter y isn't really
involved in the call to f ?)

Since lattice of x is TOP, it will change to CONSTANT,
and m_precision will get assigned 32.

Next, propagate from f1->f.
The jump function from f1->f is unknown (but bits.known is true),
so we call meet_with (0, 0xf, SIGNED, 32).
Since the lattice of x is already CONSTANT, m_precision does not change
on this call or on any subsequent call.

So we set the precision only when we propagate into the callee for the
first time.
Does this look reasonable ?
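
The two propagation steps above can be traced with a small sketch.  It is
not the patch itself: meet_mask and nonzero_bits are hypothetical helper
names, a 64-bit word stands in for widest_int, and the truncation to the
parameter's precision done in ipcp_update_bits is left out.

```cpp
#include <cassert>
#include <cstdint>

// Mask update from the quoted meet_with_1: a bit becomes unknown if it
// is unknown on either side or if the two known values disagree.
uint64_t
meet_mask (uint64_t cur_value, uint64_t cur_mask,
           uint64_t value, uint64_t mask)
{
  return (cur_mask | mask) | (cur_value ^ value);
}

// Possibly-nonzero bits, computed as mask | value like the quoted
// ipcp_update_bits hunk does before calling set_nonzero_bits.
uint64_t
nonzero_bits (uint64_t value, uint64_t mask)
{
  return mask | value;
}

// Mask of parameter x of f after visiting both callers.
uint64_t
mask_of_x_after_both_callers ()
{
  // f2->f: the lattice goes TOP -> CONSTANT with <value 0, mask 0xff>.
  uint64_t value = 0, mask = 0xff;
  // f1->f: meet with <value 0, mask 0xf>; the wider mask wins.
  mask = meet_mask (value, mask, 0, 0xf);
  return mask;
}
```

Under these assumptions the mask for x ends up as 0xff, so the nonzero bits
recorded for x would be 0xff regardless of which caller is visited first.
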
>
>> bit_value_binop_1 requires original precision for few cases (shifts,
>> rotates, plus, mult), so
>
> Yeah, so wrong precision could only be used if it got fed into the
> binary operation routines, making the bug much harder to trigger.  But
> it would still be a bug (or you do not need to care for original
> precisions at all).
>
>> I was preserving the original precision in jump function.
>> Later in ipcp_update_bits(), the mask is set after narrowing to the
>> precision of the parameter.
>> >
>> >> +}
>> >> +
>> >> +/* Meet bits lattice with the result of bit_value_binop_1 (other, operand)
>> >> +   if code is binary operation or bit_value_unop_1 (other) if code is unary op.
>> >> +   In the case when code is nop_expr, no adjustment is required. */
>> >> +
>> >> +bool
>> >> +ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, enum tree_code code, tree operand)
>> >> +{
>> >> +  if (other.bottom_p ())
>> >> +    return set_to_bottom ();
>> >> +
>> >> +  if (bottom_p () || other.top_p ())
>> >> +    return false;
>> >> +
>> >> +  widest_int adjusted_value, adjusted_mask;
>> >> +
>> >> +  if (TREE_CODE_CLASS (code) == tcc_binary)
>> >> +    {
>> >> +      tree type = TREE_TYPE (operand);
>> >> +      gcc_assert (INTEGRAL_TYPE_P (type));
>> >> +      widest_int o_value, o_mask;
>> >> +      get_value_and_mask (operand, &o_value, &o_mask);
>> >> +
>> >> +      signop sgn = other.get_sign ();
>> >> +      unsigned prec = other.get_precision ();
>> >> +
>> >> +      bit_value_binop_1 (code, sgn, prec, &adjusted_value, &adjusted_mask,
>> >> +                      sgn, prec, other.get_value (), other.get_mask (),
>> >> +                      TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
>> >
>> > It is probably just me not being particularly sharp on a Friday
>> > afternoon and I might not understand the semantics of mask well (also,
>> > you did not document it :-), but... assume that we are looking at a
>> > binary and operation, other comes from an SSA pointer and its mask
>> > would be binary 100 and its value 0 because that's what you set for
>> > ssa names in ipa-prop.h, and the operand is binary value 101, which
>> > means that get_value_and_mask returns mask 0 and value 101.  Now,
>> > bit_value_binop_1 would return value 0 & 101 = 0 and mask according to
>> >
>> > (m1 | m2) & ((v1 | m1) & (v2 | m2))
>> >
>> > so in our case
>> >
>> > (100b & 0) & ((0 | 100b) & (101b | 0)) = 0 & 100b = 0.
>> Shouldn't this be:
>> (100b | 0) & ((0 | 100b) & (101b | 0)) = 100 & 100 = 100 -;)
>
> Eh, right, sorry.  I just find the term mask confusing when we do not
> actually mask anything with it (but I guess it is good to be
> consistent so let's keep it).
>
>> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
>> index e32d078..1b9d0ef 100644
>> --- a/gcc/ipa-prop.h
>> +++ b/gcc/ipa-prop.h
>> @@ -154,6 +154,23 @@ struct GTY(()) ipa_alignment
>>    unsigned misalign;
>>  };
>>
>> +/* Information about zero/non-zero bits.  */
>> +struct GTY(()) ipa_bits
>> +{
>> +  /* The propagated value.  */
>> +  widest_int value;
>> +  /* Mask corresponding to the value.
>> +     Similar to ccp_prop_t, if xth bit of mask is 0,
>
> Is ccp_prop_t a typo? I did not find it anywhere when I grepped for it.
ah, it's ccp_lattice_t -;)

Thanks,
Prathamesh
>
> Thanks,
>
> Martin
>


* Re: [RFC] ipa bitwise constant propagation
  2016-08-09  8:11       ` Prathamesh Kulkarni
@ 2016-08-09  9:24         ` Richard Biener
  2016-08-09 11:09         ` Martin Jambor
  1 sibling, 0 replies; 31+ messages in thread
From: Richard Biener @ 2016-08-09  9:24 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: Jan Hubicka, Kugan Vivekanandarajah, gcc Patches

On Tue, 9 Aug 2016, Prathamesh Kulkarni wrote:

> On 8 August 2016 at 19:33, Martin Jambor <mjambor@suse.cz> wrote:
> > Hi,
> >
> > thanks for following through.  You'll need an approval from Honza, but
> > I think the code looks good (I have looked at the places that I
> > believe have changed since the last week).  However, I have discovered
> > one new thing I don't like and still believe you need to handle
> > different precisions in the lattice meet:
> >
> > On Mon, Aug 08, 2016 at 03:08:35AM +0530, Prathamesh Kulkarni wrote:
> >> On 5 August 2016 at 18:06, Martin Jambor <mjambor@suse.cz> wrote:
> >>
> >> ...
> >>
> >> >> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> >> >> index 5b6cb9a..b770f6a 100644
> >> >> --- a/gcc/ipa-cp.c
> >> >> +++ b/gcc/ipa-cp.c
> >> >> @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
> >> >>  #include "params.h"
> >> >>  #include "ipa-inline.h"
> >> >>  #include "ipa-utils.h"
> >> >> +#include "tree-ssa-ccp.h"
> >> >>
> >> >>  template <typename valtype> class ipcp_value;
> >> >>
> >> >> @@ -266,6 +267,40 @@ private:
> >> >>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
> >> >>  };
> >> >>
> >> >> +/* Lattice of known bits, only capable of holding one value.
> >> >> +   Similar to ccp_prop_value_t, mask represents which bits of value are constant.
> >> >> +   If a bit in mask is set to 0, then the corresponding bit in
> >> >> +   value is known to be constant.  */
> >> >> +
> >> >> +class ipcp_bits_lattice
> >> >> +{
> >> >> +public:
> >> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
> >> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
> >> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
> >> >> +  bool set_to_bottom ();
> >> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  widest_int get_value () { return value; }
> >> >> +  widest_int get_mask () { return mask; }
> >> >> +  signop get_sign () { return sgn; }
> >> >> +  unsigned get_precision () { return precision; }
> >> >> +
> >> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
> >> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  void print (FILE *);
> >> >> +
> >> >> +private:
> >> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
> >> >> +  widest_int value, mask;
> >> >> +  signop sgn;
> >> >> +  unsigned precision;
> >
> > I know that the existing code in ipa-cp.c does not do this, but please
> > prefix member variables with "m_" like our coding style guidelines
> > suggest (or even require?).  You routinely reuse those same names in
> > names of parameters of meet_with and I believe that is a practice that
> > will sooner or later lead to confusing the two and bugs.
> Sorry about this, will change to m_ prefix in followup patch.
> >
> >> >> +
> >> >> +  bool meet_with_1 (widest_int, widest_int);
> >> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
> >> >> +};
> >> >> +
> >> >>  /* Structure containing lattices for a parameter itself and for pieces of
> >> >>     aggregates that are passed in the parameter or by a reference in a parameter
> >> >>     plus some other useful flags.  */
> >> >> @@ -281,6 +316,8 @@ public:
> >> >>    ipcp_agg_lattice *aggs;
> >> >>    /* Lattice describing known alignment.  */
> >> >>    ipcp_alignment_lattice alignment;
> >> >> +  /* Lattice describing known bits.  */
> >> >> +  ipcp_bits_lattice bits_lattice;
> >> >>    /* Number of aggregate lattices */
> >> >>    int aggs_count;
> >> >>    /* True if aggregate data were passed by reference (as opposed to by
> >> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
> >> >>      fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
> >> >>  }
> >> >>
> >>
> >> ...
> >>
> >> >> +/* Convert operand to value, mask form.  */
> >> >> +
> >> >> +void
> >> >> +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
> >> >> +{
> >> >> +  wide_int get_nonzero_bits (const_tree);
> >> >> +
> >> >> +  if (TREE_CODE (operand) == INTEGER_CST)
> >> >> +    {
> >> >> +      *valuep = wi::to_widest (operand);
> >> >> +      *maskp = 0;
> >> >> +    }
> >> >> +  else if (TREE_CODE (operand) == SSA_NAME)
> >> >
> >> > IIUC, operand is the operand from pass-through jump function and that
> >> > should never be an SSA_NAME.  I have even looked at how we generate
> >> > them and it seems fairly safe to say that they never are.  If you have
> >> > seen an SSA_NAME here, it is a bug and please let me know because
> >> > sooner or later it will cause an assert.
> >> >
> >> >> +    {
> >> >> +      *valuep = 0;
> >> >> +      *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
> >> >> +    }
> >> >> +  else
> >> >> +    gcc_unreachable ();
> >> >
> >> > The operand however can be any other is_gimple_ip_invariant tree.
> >> > I assume that you could hit this gcc_unreachable only in a program
> >> > with undefined behavior (or with a Fortran CONST_DECL?) but you should
> >> > not ICE here.
> >> Changed to:
> >> if (TREE_CODE (operand) == INTEGER_CST)
> >>     {
> >>       *valuep = wi::to_widest (operand);
> >>       *maskp = 0;
> >>     }
> >>   else
> >>     {
> >>       *valuep = 0;
> >>       *maskp = -1;
> >>     }
> >>
> >> I am not sure how to extract nonzero bits for gimple_ip_invariant if
> >> it's not INTEGER_CST,
> >
> > I don't think that you reasonably can.
> >
> >> so setting to unknown (value = 0, mask = -1).
> >> Does this look OK ?
> >
> > Yes.
> >
> >> >
> >> >
> >> >> +}
> >> >> +
> >> >> +/* Meet operation, similar to ccp_lattice_meet, we xor values
> >> >> +   if this->value, value have different values at same bit positions, we want
> >> >> +   to drop that bit to varying. Return true if mask is changed.
> >> >> +   This function assumes that the lattice value is in CONSTANT state  */
> >> >> +
> >> >> +bool
> >> >> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
> >> >> +{
> >> >> +  gcc_assert (constant_p ());
> >> >> +
> >> >> +  widest_int old_mask = this->mask;
> >> >> +  this->mask = (this->mask | mask) | (this->value ^ value);
> >> >> +
> >> >> +  if (wi::sext (this->mask, this->precision) == -1)
> >> >> +    return set_to_bottom ();
> >> >> +
> >> >> +  bool changed = this->mask != old_mask;
> >> >> +  return changed;
> >> >> +}
> >> >> +
> >> >> +/* Meet the bits lattice with operand
> >> >> +   described by <value, mask, sgn, precision>.  */
> >> >> +
> >> >> +bool
> >> >> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
> >> >> +                           signop sgn, unsigned precision)
> >> >> +{
> >> >> +  if (bottom_p ())
> >> >> +    return false;
> >> >> +
> >> >> +  if (top_p ())
> >> >> +    {
> >> >> +      if (wi::sext (mask, precision) == -1)
> >> >> +     return set_to_bottom ();
> >> >> +      return set_to_constant (value, mask, sgn, precision);
> >> >> +    }
> >> >> +
> >> >> +  return meet_with_1 (value, mask);
> >> >
> >> > What if precisions do not match?
> >> Sorry I don't understand. Since we extend to widest_int, precision
> >> would be same ?
> >
> > I meant what if:
> >
> >   this->precision != precision /* the parameter value */
> >
> > (you see, giving both the same name is error-prone).  You are
> > propagating the recorded precision gathered from types of the actual
> > arguments in calls, those can be different.  For example, one caller
> > can pass a direct integer value with full integer precision, another
> > caller can pass in that same argument an enum value with a very
> > limited precision.  Your code ignores that difference and the
> > resulting precision is the one that arrives first.  If it is the enum,
> > it might be too small for the integer value from the other call-site?
> Ah indeed the patch incorrectly propagates precision of argument.
> So IIUC in ipcp_bits_lattice, we want m_precision to be the precision
> of parameter's type and _not_ of argument's type.
> 
> The patch incorrectly propagates precision in following case:
> 
> __attribute__((noinline))
> static int f2(short z)
> {
>   return z;
> }
> 
> __attribute__((noinline))
> static int f1(int y)
> {
>   return f2 (y & 0xff);
> }
> 
> __attribute__((noinline))
> int f(int x)
> {
>   return f1 (x);
> }
> 
> Precision for 'z' should be 16 while the patch propagates 32, which
> is precision of type of the argument passed by the caller.
> We only set m_precision when changing from TOP to CONSTANT
> state.
> 
> Instead of storing arg's precision and sign, we should store
> parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
> Diff with respect to previous patch:
> 
> @@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
> ipa_func_body_info *fbi,
>    && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
>   {
>    jfunc->bits.known = true;
> -  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
> -  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
> -
> +  jfunc->bits.sgn = TYPE_SIGN (param_type);
> +  jfunc->bits.precision = TYPE_PRECISION (param_type);
> +
>    if (TREE_CODE (arg) == SSA_NAME)
>      {
>        jfunc->bits.value = 0;
> 
> So in ipcp_bits_lattice::meet_with(value, mask, signop, precision)
> when we propagate into the parameter's lattice for the first time,
> we will set m_precision to the precision of its own type
> rather than the precision of the argument.
> 
> For eg, consider following test-case:
> int f(int x)
> {
>   return some_operation (x);
> }
> 
> int f1(short y)
> {
>   return f (y & 0xf);
> }
> 
> int f2(char z)
> {
>   return f (z & 0xff);
> }
> 
> Assume we first propagate from f2->f.
> In this case, jump_function from f2->f is unknown (but bits.known is true),
> so we call meet_with (0, 0xff, SIGNED, 32).
> The precision and sign are of param's type because we extract them
> from param_type as shown above.
> 
> (I suppose the reason this is not pass-thru is because result y & 0xf
> is wrapped by convert_expr,
> which is actually passed to f(), so the parameter y isn't really
> involved in the call to f ?)
> 
> Since lattice of x is TOP, it will change to CONSTANT,
> and m_precision will get assigned 32.
> 
> Next propagate from f1->f
> jump_function from f1->f is unknown (but bits.known is true)
> so we call meet_with (0, 0xf, SIGNED, 32).
> Since lattice of x is already CONSTANT, it doesn't change m_precision anymore
> on this call or any subsequent calls.
> 
> So when we propagate into callee for first time, only then do we set
> the precision.
> Does this look reasonable ?

Just to chime in, please try to produce dg-do run testcase(s) for
the cases you show above that are handled wrong and add them to the
testsuite with the patch.

Richard.
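
A rough sketch of what such a run-time check could look like for the first
example, assuming the real testsuite file carries a /* { dg-do run } */
directive and is built at -O2.  The helper name check_f is invented here;
in an actual test its body would live in main and call __builtin_abort on
a mismatch instead of using assert.

```cpp
#include <cassert>

__attribute__((noinline))
static int f2 (short z)
{
  return z;
}

__attribute__((noinline))
static int f1 (int y)
{
  return f2 (y & 0xff);
}

__attribute__((noinline))
int f (int x)
{
  return f1 (x);
}

// Hypothetical helper.  y & 0xff always fits in a short, so f (x) must
// equal x & 0xff regardless of the precision recorded for z.
void check_f ()
{
  assert (f (0x1234) == 0x34);
  assert (f (-1) == 0xff);
  assert (f (0) == 0);
}
```

If the propagated bits were narrowed with the wrong precision, one of
these comparisons would be expected to fail at run time.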

> >
> >> bit_value_binop_1 requires original precision for few cases (shifts,
> >> rotates, plus, mult), so
> >
> > Yeah, so wrong precision could only be used if it got fed into the
> > binary operation routines, making the bug much harder to trigger.  But
> > it would still be a bug (or you do not need to care for original
> > precisions at all).
> >
> >> I was preserving the original precision in jump function.
> >> Later in ipcp_update_bits(), the mask is set after narrowing to the
> >> precision of the parameter.
> >> >
> >> >> +}
> >> >> +
> >> >> +/* Meet bits lattice with the result of bit_value_binop_1 (other, operand)
> >> >> +   if code is binary operation or bit_value_unop_1 (other) if code is unary op.
> >> >> +   In the case when code is nop_expr, no adjustment is required. */
> >> >> +
> >> >> +bool
> >> >> +ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, enum tree_code code, tree operand)
> >> >> +{
> >> >> +  if (other.bottom_p ())
> >> >> +    return set_to_bottom ();
> >> >> +
> >> >> +  if (bottom_p () || other.top_p ())
> >> >> +    return false;
> >> >> +
> >> >> +  widest_int adjusted_value, adjusted_mask;
> >> >> +
> >> >> +  if (TREE_CODE_CLASS (code) == tcc_binary)
> >> >> +    {
> >> >> +      tree type = TREE_TYPE (operand);
> >> >> +      gcc_assert (INTEGRAL_TYPE_P (type));
> >> >> +      widest_int o_value, o_mask;
> >> >> +      get_value_and_mask (operand, &o_value, &o_mask);
> >> >> +
> >> >> +      signop sgn = other.get_sign ();
> >> >> +      unsigned prec = other.get_precision ();
> >> >> +
> >> >> +      bit_value_binop_1 (code, sgn, prec, &adjusted_value, &adjusted_mask,
> >> >> +                      sgn, prec, other.get_value (), other.get_mask (),
> >> >> +                      TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
> >> >
> >> > It is probably just me not being particularly sharp on a Friday
> >> > afternoon and I might not understand the semantics of mask well (also,
> >> > you did not document it :-), but... assume that we are looking at a
> >> > binary and operation, other comes from an SSA pointer and its mask
> >> > would be binary 100 and its value 0 because that's what you set for
> >> > ssa names in ipa-prop.h, and the operand is binary value 101, which
> >> > means that get_value_and_mask returns mask 0 and value 101.  Now,
> >> > bit_value_binop_1 would return value 0 & 101 = 0 and mask according to
> >> >
> >> > (m1 | m2) & ((v1 | m1) & (v2 | m2))
> >> >
> >> > so in our case
> >> >
> >> > (100b & 0) & ((0 | 100b) & (101b | 0)) = 0 & 100b = 0.
> >> Shouldn't this be:
> >> (100b | 0) & ((0 | 100b) & (101b | 0)) = 100 & 100 = 100 -;)
> >
> > Eh, right, sorry.  I just find the term mask confusing when we do not
> > actually mask anything with it (but I guess it is good to be
> > consistent so let's keep it).
> >
> >> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> >> index e32d078..1b9d0ef 100644
> >> --- a/gcc/ipa-prop.h
> >> +++ b/gcc/ipa-prop.h
> >> @@ -154,6 +154,23 @@ struct GTY(()) ipa_alignment
> >>    unsigned misalign;
> >>  };
> >>
> >> +/* Information about zero/non-zero bits.  */
> >> +struct GTY(()) ipa_bits
> >> +{
> >> +  /* The propagated value.  */
> >> +  widest_int value;
> >> +  /* Mask corresponding to the value.
> >> +     Similar to ccp_prop_t, if xth bit of mask is 0,
> >
> > Is ccp_prop_t a typo? I did not find it anywhere when I grepped for it.
> ah, it's ccp_lattice_t -;)
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> >
> > Martin
> >
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


* Re: [RFC] ipa bitwise constant propagation
  2016-08-09  8:11       ` Prathamesh Kulkarni
  2016-08-09  9:24         ` Richard Biener
@ 2016-08-09 11:09         ` Martin Jambor
  2016-08-09 11:47           ` Prathamesh Kulkarni
  1 sibling, 1 reply; 31+ messages in thread
From: Martin Jambor @ 2016-08-09 11:09 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Richard Biener, Jan Hubicka, Kugan Vivekanandarajah, gcc Patches

Hi,

On Tue, Aug 09, 2016 at 01:41:21PM +0530, Prathamesh Kulkarni wrote:
> On 8 August 2016 at 19:33, Martin Jambor <mjambor@suse.cz> wrote:
> >> >> +class ipcp_bits_lattice
> >> >> +{
> >> >> +public:
> >> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
> >> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
> >> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
> >> >> +  bool set_to_bottom ();
> >> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  widest_int get_value () { return value; }
> >> >> +  widest_int get_mask () { return mask; }
> >> >> +  signop get_sign () { return sgn; }
> >> >> +  unsigned get_precision () { return precision; }
> >> >> +
> >> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
> >> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  void print (FILE *);
> >> >> +
> >> >> +private:
> >> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
> >> >> +  widest_int value, mask;
> >> >> +  signop sgn;
> >> >> +  unsigned precision;
> >
> > I know that the existing code in ipa-cp.c does not do this, but please
> > prefix member variables with "m_" like our coding style guidelines
> > suggest (or even require?).  You routinely reuse those same names in
> > names of parameters of meet_with and I believe that is a practice that
> > will sooner or later lead to confusing the two and bugs.
> Sorry about this, will change to m_ prefix in followup patch.

Thanks a lot.

> >
> >> >> +
> >> >> +  bool meet_with_1 (widest_int, widest_int);
> >> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
> >> >> +};
> >> >> +
> >> >>  /* Structure containing lattices for a parameter itself and for pieces of
> >> >>     aggregates that are passed in the parameter or by a reference in a parameter
> >> >>     plus some other useful flags.  */
> >> >> @@ -281,6 +316,8 @@ public:
> >> >>    ipcp_agg_lattice *aggs;
> >> >>    /* Lattice describing known alignment.  */
> >> >>    ipcp_alignment_lattice alignment;
> >> >> +  /* Lattice describing known bits.  */
> >> >> +  ipcp_bits_lattice bits_lattice;
> >> >>    /* Number of aggregate lattices */
> >> >>    int aggs_count;
> >> >>    /* True if aggregate data were passed by reference (as opposed to by
> >> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
> >> >>      fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
> >> >>  }
> >> >>
> >>
> >> ...
> >>
> >> >> +}
> >> >> +
> >> >> +/* Meet operation, similar to ccp_lattice_meet, we xor values
> >> >> +   if this->value, value have different values at same bit positions, we want
> >> >> +   to drop that bit to varying. Return true if mask is changed.
> >> >> +   This function assumes that the lattice value is in CONSTANT state  */
> >> >> +
> >> >> +bool
> >> >> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
> >> >> +{
> >> >> +  gcc_assert (constant_p ());
> >> >> +
> >> >> +  widest_int old_mask = this->mask;
> >> >> +  this->mask = (this->mask | mask) | (this->value ^ value);
> >> >> +
> >> >> +  if (wi::sext (this->mask, this->precision) == -1)
> >> >> +    return set_to_bottom ();
> >> >> +
> >> >> +  bool changed = this->mask != old_mask;
> >> >> +  return changed;
> >> >> +}
> >> >> +
> >> >> +/* Meet the bits lattice with operand
> >> >> +   described by <value, mask, sgn, precision>.  */
> >> >> +
> >> >> +bool
> >> >> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
> >> >> +                           signop sgn, unsigned precision)
> >> >> +{
> >> >> +  if (bottom_p ())
> >> >> +    return false;
> >> >> +
> >> >> +  if (top_p ())
> >> >> +    {
> >> >> +      if (wi::sext (mask, precision) == -1)
> >> >> +     return set_to_bottom ();
> >> >> +      return set_to_constant (value, mask, sgn, precision);
> >> >> +    }
> >> >> +
> >> >> +  return meet_with_1 (value, mask);
> >> >
> >> > What if precisions do not match?
> >> Sorry I don't understand. Since we extend to widest_int, precision
> >> would be same ?
> >
> > I meant what if:
> >
> >   this->precision != precision /* the parameter value */
> >
> > (you see, giving both the same name is error-prone).  You are
> > propagating the recorded precision gathered from types of the actual
> > arguments in calls, those can be different.  For example, one caller
> > can pass a direct integer value with full integer precision, another
> > caller can pass in that same argument an enum value with a very
> > limited precision.  Your code ignores that difference and the
> > resulting precision is the one that arrives first.  If it is the enum,
> > it might be too small for the integer value from the other call-site?
> Ah indeed the patch incorrectly propagates precision of argument.
> So IIUC in ipcp_bits_lattice, we want m_precision to be the precision
> of parameter's type and _not_ of argument's type.
> 
> The patch incorrectly propagates precision in following case:
> 
> __attribute__((noinline))
> static int f2(short z)
> {
>   return z;
> }
> 
> __attribute__((noinline))
> static int f1(int y)
> {
>   return f2 (y & 0xff);
> }
> 
> __attribute__((noinline))
> int f(int x)
> {
>   return f1 (x);
> }
> 
> Precision for 'z' should be 16 while the patch propagates 32, which
> is precision of type of the argument passed by the caller.

That is true but you never use precision of z (in this example), do
you?  You would only use a precision from a jump function of y in
meet_with if there was a pass-through function, am I right?

> We only set m_precision when changing from TOP to CONSTANT
> state.
> 
> Instead of storing arg's precision and sign, we should store
> parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
> Diff with respect to previous patch:
> 
> @@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
> ipa_func_body_info *fbi,
>    && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
>   {
>    jfunc->bits.known = true;
> -  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
> -  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
> -
> +  jfunc->bits.sgn = TYPE_SIGN (param_type);
> +  jfunc->bits.precision = TYPE_PRECISION (param_type);
> +

If you want to use the precision of the formal parameter then you do
not need to store it to jump functions.  Parameter DECLs along with
their types are readily accessible in IPA (even with LTO).  It would
also be much clearer what is going on, IMHO.

>    if (TREE_CODE (arg) == SSA_NAME)
>      {
>        jfunc->bits.value = 0;
> 
> So in ipcp_bits_lattice::meet_with(value, mask, signop, precision)
> when we propagate into the parameter's lattice for the first time,
> we will set m_precision to the precision of its own type
> rather than the precision of the argument.
> 
> For eg, consider following test-case:
> int f(int x)
> {
>   return some_operation (x);
> }
> 
> int f1(short y)
> {
>   return f (y & 0xf);
> }
> 
> int f2(char z)
> {
>   return f (z & 0xff);
> }
> 
> Assume we first propagate from f2->f.
> In this case, jump_function from f2->f is unknown (but bits.known is true),
> so we call meet_with (0, 0xff, SIGNED, 32).
> The precision and sign are of param's type because we extract them
> from param_type as shown above.
> 
> (I suppose the reason this is not pass-thru is because result y & 0xf
> is wrapped by convert_expr,
> which is actually passed to f(), so the parameter y isn't really
> involved in the call to f ?)

I haven't seen the IL but most probably yes.  If you want to
experiment with this, I believe you need an example with types of
the same size but different precisions (C++ enums might work nicely, I
believe).

If you managed to come up with a testcase with a pass through jump
function with an operation in which the precision actually affects the
result, that would be great.
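
Rotates are one operation where the precision visibly changes the result
in value/mask form.  A minimal sketch, assuming 64-bit integers in place
of widest_int and made-up helper names (this is not bit_value_binop_1,
only an illustration of why the precision matters):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical <value, mask> pair; a 1 bit in mask means "unknown".
struct ValueMask
{
  uint64_t value;
  uint64_t mask;
};

// Rotate X left by N within the low PREC bits (1 <= PREC <= 64).
static uint64_t
rotate_left_bits (uint64_t x, unsigned n, unsigned prec)
{
  const uint64_t all = prec >= 64 ? ~uint64_t (0) : (uint64_t (1) << prec) - 1;
  x &= all;
  n %= prec;
  if (n == 0)
    return x;
  return ((x << n) | (x >> (prec - n))) & all;
}

// Rotating a known-bits pair rotates value and mask the same way.
static ValueMask
rotate_left_value_mask (ValueMask a, unsigned n, unsigned prec)
{
  return { rotate_left_bits (a.value, n, prec),
	   rotate_left_bits (a.mask, n, prec) };
}

static void
rotate_example ()
{
  ValueMask in = {0x80, 0x01};	/* bit 7 known one, bit 0 unknown.  */
  ValueMask r8 = rotate_left_value_mask (in, 1, 8);
  ValueMask r16 = rotate_left_value_mask (in, 1, 16);
  assert (r8.value == 0x01 && r8.mask == 0x02);
  assert (r16.value == 0x100 && r16.mask == 0x02);
}
```

The same input yields value 0x01 at precision 8 but 0x100 at precision 16,
so a pass-through with a rotate is a plausible place to look for a
testcase.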

> 
> Since lattice of x is TOP, it will change to CONSTANT,
> and m_precision will get assigned 32.
> 
> Next propagate from f1->f
> jump_function from f1->f is unknown (but bits.known is true)
> so we call meet_with (0, 0xf, SIGNED, 32).
> Since lattice of x is already CONSTANT, it doesn't change m_precision anymore
> on this call or any subsequent calls.
> 
> So when we propagate into callee for first time, only then do we set
> the precision.
> Does this look reasonable ?

It does, but to summarize what I wrote above, I believe that with this
approach you can and should remove the precision field from jump
functions and lattices.
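
As a sketch of that shape (not the patch itself): a lattice that stores no
precision and instead receives the precision of the formal parameter on
every meet.  It assumes 64-bit integers in place of widest_int, and the
names BitsLattice, low_bits, meet_with and param_prec are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified known-bits lattice without a precision field.
struct BitsLattice
{
  enum State { UNDEFINED, CONSTANT, VARYING };
  State state = UNDEFINED;
  uint64_t value = 0;
  uint64_t mask = 0;	/* 1 = unknown bit.  */
};

static uint64_t
low_bits (unsigned prec)
{
  return prec >= 64 ? ~uint64_t (0) : (uint64_t (1) << prec) - 1;
}

// Meet LAT with <value, mask>.  PARAM_PREC is the precision of the
// parameter's own type, supplied by the caller.  Returns true if LAT changed.
static bool
meet_with (BitsLattice &lat, uint64_t value, uint64_t mask,
	   unsigned param_prec)
{
  const uint64_t all = low_bits (param_prec);
  value &= all;
  mask &= all;

  if (lat.state == BitsLattice::VARYING)
    return false;

  if (lat.state == BitsLattice::UNDEFINED)
    {
      if (mask == all)
	{
	  lat.state = BitsLattice::VARYING;
	  return true;
	}
      lat.state = BitsLattice::CONSTANT;
      lat.value = value;
      lat.mask = mask;
      return true;
    }

  // CONSTANT: bits that differ between the two values become unknown.
  const uint64_t old_mask = lat.mask;
  lat.mask = (lat.mask | mask | (lat.value ^ value)) & all;
  if (lat.mask == all)
    {
      lat.state = BitsLattice::VARYING;
      return true;
    }
  return lat.mask != old_mask;
}
```

With the two calls from the example, meet_with (x, 0, 0xff, 32) for f2->f
moves the lattice from UNDEFINED to CONSTANT with mask 0xff, and the
following meet_with (x, 0, 0xf, 32) for f1->f leaves it unchanged.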

Thanks,


Martin


* Re: [RFC] ipa bitwise constant propagation
  2016-08-09 11:09         ` Martin Jambor
@ 2016-08-09 11:47           ` Prathamesh Kulkarni
  2016-08-09 18:13             ` Martin Jambor
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-09 11:47 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener, Jan Hubicka,
	Kugan Vivekanandarajah, gcc Patches

On 9 August 2016 at 16:39, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Tue, Aug 09, 2016 at 01:41:21PM +0530, Prathamesh Kulkarni wrote:
>> On 8 August 2016 at 19:33, Martin Jambor <mjambor@suse.cz> wrote:
>> >> >> +class ipcp_bits_lattice
>> >> >> +{
>> >> >> +public:
>> >> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
>> >> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
>> >> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
>> >> >> +  bool set_to_bottom ();
>> >> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
>> >> >> +
>> >> >> +  widest_int get_value () { return value; }
>> >> >> +  widest_int get_mask () { return mask; }
>> >> >> +  signop get_sign () { return sgn; }
>> >> >> +  unsigned get_precision () { return precision; }
>> >> >> +
>> >> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
>> >> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
>> >> >> +
>> >> >> +  void print (FILE *);
>> >> >> +
>> >> >> +private:
>> >> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } lattice_val;
>> >> >> +  widest_int value, mask;
>> >> >> +  signop sgn;
>> >> >> +  unsigned precision;
>> >
>> > I know that the existing code in ipa-cp.c does not do this, but please
>> > prefix member variables with "m_" like our coding style guidelines
>> > suggest (or even require?).  You routinely reuse those same names in
>> > names of parameters of meet_with and I believe that is a practice that
>> > will sooner or later lead to confusing the two and bugs.
>> Sorry about this, will change to m_ prefix in followup patch.
>
> Thanks a lot.
>
>> >
>> >> >> +
>> >> >> +  bool meet_with_1 (widest_int, widest_int);
>> >> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
>> >> >> +};
>> >> >> +
>> >> >>  /* Structure containing lattices for a parameter itself and for pieces of
>> >> >>     aggregates that are passed in the parameter or by a reference in a parameter
>> >> >>     plus some other useful flags.  */
>> >> >> @@ -281,6 +316,8 @@ public:
>> >> >>    ipcp_agg_lattice *aggs;
>> >> >>    /* Lattice describing known alignment.  */
>> >> >>    ipcp_alignment_lattice alignment;
>> >> >> +  /* Lattice describing known bits.  */
>> >> >> +  ipcp_bits_lattice bits_lattice;
>> >> >>    /* Number of aggregate lattices */
>> >> >>    int aggs_count;
>> >> >>    /* True if aggregate data were passed by reference (as opposed to by
>> >> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
>> >> >>      fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
>> >> >>  }
>> >> >>
>> >>
>> >> ...
>> >>
>> >> >> +}
>> >> >> +
>> >> >> +/* Meet operation, similar to ccp_lattice_meet, we xor values
>> >> >> +   if this->value, value have different values at same bit positions, we want
>> >> >> +   to drop that bit to varying. Return true if mask is changed.
>> >> >> +   This function assumes that the lattice value is in CONSTANT state  */
>> >> >> +
>> >> >> +bool
>> >> >> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
>> >> >> +{
>> >> >> +  gcc_assert (constant_p ());
>> >> >> +
>> >> >> +  widest_int old_mask = this->mask;
>> >> >> +  this->mask = (this->mask | mask) | (this->value ^ value);
>> >> >> +
>> >> >> +  if (wi::sext (this->mask, this->precision) == -1)
>> >> >> +    return set_to_bottom ();
>> >> >> +
>> >> >> +  bool changed = this->mask != old_mask;
>> >> >> +  return changed;
>> >> >> +}
>> >> >> +
>> >> >> +/* Meet the bits lattice with operand
>> >> >> +   described by <value, mask, sgn, precision>.  */
>> >> >> +
>> >> >> +bool
>> >> >> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
>> >> >> +                           signop sgn, unsigned precision)
>> >> >> +{
>> >> >> +  if (bottom_p ())
>> >> >> +    return false;
>> >> >> +
>> >> >> +  if (top_p ())
>> >> >> +    {
>> >> >> +      if (wi::sext (mask, precision) == -1)
>> >> >> +     return set_to_bottom ();
>> >> >> +      return set_to_constant (value, mask, sgn, precision);
>> >> >> +    }
>> >> >> +
>> >> >> +  return meet_with_1 (value, mask);
>> >> >
>> >> > What if precisions do not match?
>> >> Sorry I don't understand. Since we extend to widest_int, precision
>> >> would be same ?
>> >
>> > I meant what if:
>> >
>> >   this->precision != precision /* the parameter value */
>> >
>> > (you see, giving both the same name is error-prone).  You are
>> > propagating the recorded precision gathered from types of the actual
>> > arguments in calls, those can be different.  For example, one caller
>> > can pass a direct integer value with full integer precision, another
>> > caller can pass in that same argument an enum value with a very
>> > limited precision.  Your code ignores that difference and the
>> > resulting precision is the one that arrives first.  If it is the enum,
>> > it might be too small for the integer value from the other call-site?
>> Ah indeed the patch incorrectly propagates precision of argument.
>> So IIUC in ipcp_bits_lattice, we want m_precision to be the precision
>> of parameter's type and _not_ of argument's type.
>>
>> The patch incorrectly propagates precision in following case:
>>
>> __attribute__((noinline))
>> static int f2(short z)
>> {
>>   return z;
>> }
>>
>> __attribute__((noinline))
>> static int f1(int y)
>> {
>>   return f2 (y & 0xff);
>> }
>>
>> __attribute__((noinline))
>> int f(int x)
>> {
>>   return f1 (x);
>> }
>>
>> Precision for 'z' should be 16 while the patch propagates 32, which
>> is precision of type of the argument passed by the caller.
>
> That is true but you never use precision of z (in this example), do
> you?  You would only use a precision from a jump function of y in
> meet_with if there was a pass-through function, am I right?
Yes, the example was artificial.
>
>> We only set m_precision when changing from TOP to CONSTANT
>> state.
>>
>> Instead of storing arg's precision and sign, we should store
>> parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
>> Diff with respect to previous patch:
>>
>> @@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
>> ipa_func_body_info *fbi,
>>    && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
>>   {
>>    jfunc->bits.known = true;
>> -  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
>> -  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
>> -
>> +  jfunc->bits.sgn = TYPE_SIGN (param_type);
>> +  jfunc->bits.precision = TYPE_PRECISION (param_type);
>> +
>
> If you want to use the precision of the formal parameter then you do
> not need to store it to jump functions.  Parameter DECLs along with
> their types are readily accessible in IPA (even with LTO).  It would
> also be much clearer what is going on, IMHO.
Could you please point out how to access parameter decl in wpa ?
The only reason I ended up putting this in jump function was because
I couldn't figure out how to access param decl during WPA.
I see there's ipa_get_param() in ipa-prop.h however it's gated on
gcc_checking_assert (!flag_wpa), so I suppose I can't use this
during WPA ?

Alternatively I think I could access cs->callee->decl and get to the param decl
by walking DECL_ARGUMENTS ?

Getting param decl would be indeed much clearer. Storing param
precision and sign
in jump function is admittedly quite ugly and as you mentioned below,
we could get rid of precision and signop from ipcp_bits_lattice and ipa_bits.
>
>>    if (TREE_CODE (arg) == SSA_NAME)
>>      {
>>        jfunc->bits.value = 0;
>>
>> So in ipcp_bits_lattice::meet_with(value, mask, signop, precision)
>> when we propagate into the parameter's lattice for the first time,
>> we will set m_precision to the precision of its own type
>> rather than the precision of the argument.
>>
>> For eg, consider following test-case:
>> int f(int x)
>> {
>>   return some_operation (x);
>> }
>>
>> int f1(short y)
>> {
>>   return f (y & 0xf);
>> }
>>
>> int f2(char z)
>> {
>>   return f (z & 0xff);
>> }
>>
>> Assume we first propagate from f2->f.
>> In this case, jump_function from f2->f is unknown (but bits.known is true),
>> so we call meet_with (0, 0xff, SIGNED, 32).
>> The precision and sign are of param's type because we extract them
>> from param_type as shown above.
>>
>> (I suppose the reason this is not pass-thru is because result y & 0xf
>> is wrapped by convert_expr,
>> which is actually passed to f(), so the parameter y isn't really
>> involved in the call to f ?)
>
> I haven't seen the IL but most probably yes.  If you want to
> experiment with this, I believe you need an example with types of
> the same size but different precisions (C++ enums might work nicely, I
> believe).
I will try to come up with a test-case.

Thanks,
Prathamesh
>
> If you managed to come up with a testcase with a pass through jump
> function with an operation in which the precision actually affects the
> result, that would be great.
>
>>
>> Since lattice of x is TOP, it will change to CONSTANT,
>> and m_precision will get assigned 32.
>>
>> Next propagate from f1->f
>> jump_function from f1->f is unknown (but bits.known is true)
>> so we call meet_with (0, 0xf, SIGNED, 32).
>> Since lattice of x is already CONSTANT, it doesn't change m_precision anymore
>> on this call or any subsequent calls.
>>
>> So when we propagate into callee for first time, only then do we set
>> the precision.
>> Does this look reasonable ?
>
> It does, but to summarize what I wrote above, I believe that with this
> approach you can and should remove the precision field from jump
> functions and lattices.
>
> Thanks,
>
>
> Martin


* Re: [RFC] ipa bitwise constant propagation
  2016-08-09 11:47           ` Prathamesh Kulkarni
@ 2016-08-09 18:13             ` Martin Jambor
  2016-08-10  8:45               ` Prathamesh Kulkarni
  0 siblings, 1 reply; 31+ messages in thread
From: Martin Jambor @ 2016-08-09 18:13 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Richard Biener, Jan Hubicka, Kugan Vivekanandarajah, gcc Patches

Hi,

On Tue, Aug 09, 2016 at 05:17:31PM +0530, Prathamesh Kulkarni wrote:
> On 9 August 2016 at 16:39, Martin Jambor <mjambor@suse.cz> wrote:
>
> ...
>
> >> Instead of storing arg's precision and sign, we should store
> >> parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
> >> Diff with respect to previous patch:
> >>
> >> @@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
> >> ipa_func_body_info *fbi,
> >>    && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
> >>   {
> >>    jfunc->bits.known = true;
> >> -  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
> >> -  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
> >> -
> >> +  jfunc->bits.sgn = TYPE_SIGN (param_type);
> >> +  jfunc->bits.precision = TYPE_PRECISION (param_type);
> >> +
> >
> > If you want to use the precision of the formal parameter then you do
> > not need to store it to jump functions.  Parameter DECLs along with
> > their types are readily accessible in IPA (even with LTO).  It would
> > also be much clearer what is going on, IMHO.
> Could you please point out how to access parameter decl in wpa ?
> The only reason I ended up putting this in jump function was because
> I couldn't figure out how to access param decl during WPA.
> I see there's ipa_get_param() in ipa-prop.h however it's gated on
> gcc_checking_assert (!flag_wpa), so I suppose I can't use this
> during WPA ?
> 
> Alternatively I think I could access cs->callee->decl and get to the param decl
> by walking DECL_ARGUMENTS ?

Actually, we no longer have DECL_ARGUMENTS during LTO WPA.  But in
most cases, you can still get at the type with something like the
following (only very lightly tested) patch, if Honza does not think it
is too crazy.

Note that for old K&R C sources we do not have TYPE_ARG_TYPES, and so
ipa_get_type can return NULL(!).  However, I wonder whether for such
programs the type assumptions made in callers when constructing jump
functions can be trusted either.
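
To illustrate the K&R caveat outside of GCC, here is a minimal,
self-contained sketch of filling one type slot per parameter from a
prototype's argument type list.  Everything in it (type_list_node,
fill_param_types, strings standing in for types) is a hypothetical
stand-in and not GCC API; it only mirrors the shape of a
TYPE_ARG_TYPES walk and stops when the list runs out, so a missing
list leaves every slot NULL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GCC TREE_LIST of argument types.
struct type_list_node
{
  const char *type_name;        // plays the role of TREE_VALUE
  const type_list_node *chain;  // plays the role of TREE_CHAIN
};

// Fill one slot per parameter from the prototype's type list.  A null
// list models a K&R-style function without a prototype: the slots then
// stay null and users must treat the parameter type as unknown.
std::vector<const char *>
fill_param_types (const type_list_node *arg_types, std::size_t param_count)
{
  std::vector<const char *> types (param_count, nullptr);
  const type_list_node *t = arg_types;
  // Stop as soon as either the parameters or the type list run out.
  for (std::size_t k = 0; k < param_count && t; k++)
    {
      types[k] = t->type_name;
      t = t->chain;
    }
  return types;
}
```

A consumer of such slots would check for NULL before asking for a
precision or sign, and give up on the parameter otherwise.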

I have to run, we will continue the discussion later.

Martin


2016-08-09  Martin Jambor  <mjambor@suse.cz>

	* ipa-prop.h (ipa_param_descriptor): Renamed decl to decl_or_type.
	Update comment.
	(ipa_get_param): Updated comment, added assert that we have a
	PARM_DECL.
	(ipa_get_type): New function.
	* ipa-cp.c (ipcp_propagate_stage): Fill in argument types in LTO mode.
	* ipa-prop.c (ipa_get_param_decl_index_1): Use decl_or_type
	instead of decl;
	(ipa_populate_param_decls): Likewise.
	(ipa_dump_param): Likewise.


diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 5b6cb9a..3465da5 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1952,11 +1952,21 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
   else
     i = 0;
 
+  /* !!! The following dump is of course only a demonstration that it works: */
+  debug_generic_expr (callee->decl);
+  fprintf (stderr, "\n");
+
   for (; (i < args_count) && (i < parms_count); i++)
     {
       struct ipa_jump_func *jump_func = ipa_get_ith_jump_func (args, i);
       struct ipcp_param_lattices *dest_plats;
 
+      /* !!! The following dump is of course only a demonstration that it
+             works: */
+      fprintf (stderr, "  The type of parameter %i is: ", i);
+      debug_generic_expr (ipa_get_type (callee_info, i));
+      fprintf (stderr, "\n");
+
       dest_plats = ipa_get_parm_lattices (callee_info, i);
       if (availability == AVAIL_INTERPOSABLE)
 	ret |= set_all_contains_variable (dest_plats);
@@ -2936,6 +2946,19 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
   {
     struct ipa_node_params *info = IPA_NODE_REF (node);
 
+    /* In LTO we do not have PARM_DECLs but we would still like to be able to
+       look at types of parameters.  */
+    if (in_lto_p)
+      {
+	tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
+	for (int k = 0; k < ipa_get_param_count (info) && t; k++)
+	  {
+	    gcc_assert (t != void_list_node);
+	    info->descriptors[k].decl_or_type = TREE_VALUE (t);
+	    t = t ? TREE_CHAIN (t) : NULL;
+	  }
+      }
+
     determine_versionability (node, info);
     if (node->has_gimple_body_p ())
       {
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 132b622..1eaccdf 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -103,9 +103,10 @@ ipa_get_param_decl_index_1 (vec<ipa_param_descriptor> descriptors, tree ptree)
 {
   int i, count;
 
+  gcc_checking_assert (!flag_wpa);
   count = descriptors.length ();
   for (i = 0; i < count; i++)
-    if (descriptors[i].decl == ptree)
+    if (descriptors[i].decl_or_type == ptree)
       return i;
 
   return -1;
@@ -138,7 +139,7 @@ ipa_populate_param_decls (struct cgraph_node *node,
   param_num = 0;
   for (parm = fnargs; parm; parm = DECL_CHAIN (parm))
     {
-      descriptors[param_num].decl = parm;
+      descriptors[param_num].decl_or_type = parm;
       descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE (parm),
 							     true);
       param_num++;
@@ -168,10 +169,10 @@ void
 ipa_dump_param (FILE *file, struct ipa_node_params *info, int i)
 {
   fprintf (file, "param #%i", i);
-  if (info->descriptors[i].decl)
+  if (info->descriptors[i].decl_or_type)
     {
       fprintf (file, " ");
-      print_generic_expr (file, info->descriptors[i].decl, 0);
+      print_generic_expr (file, info->descriptors[i].decl_or_type, 0);
     }
 }
 
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index e32d078..1d5ce0b 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -283,8 +283,11 @@ ipa_get_jf_ancestor_type_preserved (struct ipa_jump_func *jfunc)
 
 struct ipa_param_descriptor
 {
-  /* PARAM_DECL of this parameter.  */
-  tree decl;
+  /* In analysis and modification phase, this is the PARM_DECL of this
+     parameter, in IPA LTO phase, this is the type of the described
+     parameter or NULL if not known.  Do not read this field directly but
+     through ipa_get_param and ipa_get_type as appropriate.  */
+  tree decl_or_type;
   /* If all uses of the parameter are described by ipa-prop structures, this
      says how many there are.  If any use could not be described by means of
      ipa-prop structures, this is IPA_UNDESCRIBED_USE.  */
@@ -402,13 +405,31 @@ ipa_get_param_count (struct ipa_node_params *info)
 
 /* Return the declaration of Ith formal parameter of the function corresponding
    to INFO.  Note there is no setter function as this array is built just once
-   using ipa_initialize_node_params. */
+   using ipa_initialize_node_params.  This function should not be called in
+   WPA.  */
 
 static inline tree
 ipa_get_param (struct ipa_node_params *info, int i)
 {
   gcc_checking_assert (!flag_wpa);
-  return info->descriptors[i].decl;
+  tree t = info->descriptors[i].decl_or_type;
+  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
+  return t;
+}
+
+/* Return the type of Ith formal parameter of the function corresponding
+   to INFO if it is known or NULL if not.  */
+
+static inline tree
+ipa_get_type (struct ipa_node_params *info, int i)
+{
+  tree t = info->descriptors[i].decl_or_type;
+  if (!t)
+    return NULL;
+  if (TYPE_P (t))
+    return t;
+  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
+  return TREE_TYPE (t);
 }
 
 /* Return the move cost of Ith formal parameter of the function corresponding


* Re: [RFC] ipa bitwise constant propagation
  2016-08-09 18:13             ` Martin Jambor
@ 2016-08-10  8:45               ` Prathamesh Kulkarni
  2016-08-10 11:35                 ` Prathamesh Kulkarni
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-10  8:45 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener, Jan Hubicka,
	Kugan Vivekanandarajah, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 8236 bytes --]

On 9 August 2016 at 23:43, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Tue, Aug 09, 2016 at 05:17:31PM +0530, Prathamesh Kulkarni wrote:
>> On 9 August 2016 at 16:39, Martin Jambor <mjambor@suse.cz> wrote:
>>
>> ...
>>
>> >> Instead of storing arg's precision and sign, we should store
>> >> parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
>> >> Diff with respect to previous patch:
>> >>
>> >> @@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
>> >> ipa_func_body_info *fbi,
>> >>    && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
>> >>   {
>> >>    jfunc->bits.known = true;
>> >> -  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
>> >> -  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
>> >> -
>> >> +  jfunc->bits.sgn = TYPE_SIGN (param_type);
>> >> +  jfunc->bits.precision = TYPE_PRECISION (param_type);
>> >> +
>> >
>> > If you want to use the precision of the formal parameter then you do
>> > not need to store it to jump functions.  Parameter DECLs along with
>> > their types are readily accessible in IPA (even with LTO).  It would
>> > also be much clearer what is going on, IMHO.
>> Could you please point out how to access parameter decl in wpa ?
>> The only reason I ended up putting this in jump function was because
>> I couldn't figure out how to access param decl during WPA.
>> I see there's ipa_get_param() in ipa-prop.h however it's gated on
>> gcc_checking_assert (!flag_wpa), so I suppose I can't use this
>> during WPA ?
>>
>> Alternatively I think I could access cs->callee->decl and get to the param decl
>> by walking DECL_ARGUMENTS ?
>
> Actually, we no longer have DECL_ARGUMENTS during LTO WPA.  But in
> most cases, you can still get at the type with something like the
> following (only very lightly tested) patch, if Honza does not think it
> is too crazy.
>
> Note that for old K&R C sources we do not have TYPE_ARG_TYPES, and so
> ipa_get_type can return NULL(!).  However, I wonder whether for such
> programs the type assumptions made in callers when constructing jump
> functions can be trusted either.
>
> I have to run, we will continue the discussion later.
Thanks for the patch.
In this version, I updated the patch to use ipa_get_type, removed
precision and sgn from ipcp_bits_lattice and ipa_bits, and renamed the
member variables to add an m_ prefix.
Does it look OK?
I am still looking for a test-case in which the precision affects the
result, and hope to add it along with other test-cases in a follow-up
patch.
Bootstrap+test is in progress on x86_64-unknown-linux-gnu.
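
For readers following along, the meet that the attached patch performs
on <value, mask> pairs can be sketched in standalone C++.  This is only
an illustration under simplifying assumptions: 64-bit words stand in
for widest_int, bits_lattice_sketch is a made-up name, and a plain
all-ones test within the precision replaces
wi::sext (mask, precision) == -1.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for ipcp_bits_lattice (not the GCC class).
struct bits_lattice_sketch
{
  enum state { UNDEFINED, CONSTANT, VARYING };
  state st = UNDEFINED;
  std::uint64_t value = 0;
  std::uint64_t mask = 0;  // a set bit means that bit of value is unknown

  // All-ones within PRECISION bits; precision is assumed to be 1..64.
  static std::uint64_t all_ones (unsigned precision)
  {
    return precision >= 64 ? ~std::uint64_t (0)
                           : (std::uint64_t (1) << precision) - 1;
  }

  // Drop to bottom; return true if the state changed.
  bool set_to_bottom ()
  {
    if (st == VARYING)
      return false;
    st = VARYING;
    value = 0;
    mask = ~std::uint64_t (0);
    return true;
  }

  // Meet with <v, m>; return true if the lattice changed.
  bool meet_with (std::uint64_t v, std::uint64_t m, unsigned precision)
  {
    const std::uint64_t ones = all_ones (precision);
    if (st == VARYING)
      return false;
    if (st == UNDEFINED)
      {
        // First propagation: adopt the incoming pair unless nothing is known.
        if ((m & ones) == ones)
          return set_to_bottom ();
        st = CONSTANT;
        value = v;
        mask = m;
        return true;
      }
    // Bits unknown on either side, or differing in value, become unknown.
    const std::uint64_t old_mask = mask;
    mask = (mask | m) | (value ^ v);
    if ((mask & ones) == ones)
      return set_to_bottom ();
    return mask != old_mask;
  }
};
```

With the f/f1/f2 example from earlier in the thread, meeting <0, 0xff>
and then <0, 0xf> at precision 32 leaves the mask at 0xff, and the
second meet reports no change.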

Thanks,
Prathamesh
>
> Martin
>
>
> 2016-08-09  Martin Jambor  <mjambor@suse.cz>
>
>         * ipa-prop.h (ipa_param_descriptor): Renamed decl to decl_or_type.
>         Update comment.
>         (ipa_get_param): Updated comment, added assert that we have a
>         PARM_DECL.
>         (ipa_get_type): New function.
>         * ipa-cp.c (ipcp_propagate_stage): Fill in argument types in LTO mode.
>         * ipa-prop.c (ipa_get_param_decl_index_1): Use decl_or_type
>         instead of decl;
>         (ipa_populate_param_decls): Likewise.
>         (ipa_dump_param): Likewise.
>
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 5b6cb9a..3465da5 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -1952,11 +1952,21 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
>    else
>      i = 0;
>
> +  /* !!! The following dump is of course only a demonstration that it works: */
> +  debug_generic_expr (callee->decl);
> +  fprintf (stderr, "\n");
> +
>    for (; (i < args_count) && (i < parms_count); i++)
>      {
>        struct ipa_jump_func *jump_func = ipa_get_ith_jump_func (args, i);
>        struct ipcp_param_lattices *dest_plats;
>
> +      /* !!! The following dump is of course only a demonstration that it
> +             works: */
> +      fprintf (stderr, "  The type of parameter %i is: ", i);
> +      debug_generic_expr (ipa_get_type (callee_info, i));
> +      fprintf (stderr, "\n");
> +
>        dest_plats = ipa_get_parm_lattices (callee_info, i);
>        if (availability == AVAIL_INTERPOSABLE)
>         ret |= set_all_contains_variable (dest_plats);
> @@ -2936,6 +2946,19 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
>    {
>      struct ipa_node_params *info = IPA_NODE_REF (node);
>
> +    /* In LTO we do not have PARM_DECLs but we would still like to be able to
> +       look at types of parameters.  */
> +    if (in_lto_p)
> +      {
> +       tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
> +       for (int k = 0; k < ipa_get_param_count (info) && t; k++)
> +         {
> +           gcc_assert (t != void_list_node);
> +           info->descriptors[k].decl_or_type = TREE_VALUE (t);
> +           t = t ? TREE_CHAIN (t) : NULL;
> +         }
> +      }
> +
>      determine_versionability (node, info);
>      if (node->has_gimple_body_p ())
>        {
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 132b622..1eaccdf 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -103,9 +103,10 @@ ipa_get_param_decl_index_1 (vec<ipa_param_descriptor> descriptors, tree ptree)
>  {
>    int i, count;
>
> +  gcc_checking_assert (!flag_wpa);
>    count = descriptors.length ();
>    for (i = 0; i < count; i++)
> -    if (descriptors[i].decl == ptree)
> +    if (descriptors[i].decl_or_type == ptree)
>        return i;
>
>    return -1;
> @@ -138,7 +139,7 @@ ipa_populate_param_decls (struct cgraph_node *node,
>    param_num = 0;
>    for (parm = fnargs; parm; parm = DECL_CHAIN (parm))
>      {
> -      descriptors[param_num].decl = parm;
> +      descriptors[param_num].decl_or_type = parm;
>        descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE (parm),
>                                                              true);
>        param_num++;
> @@ -168,10 +169,10 @@ void
>  ipa_dump_param (FILE *file, struct ipa_node_params *info, int i)
>  {
>    fprintf (file, "param #%i", i);
> -  if (info->descriptors[i].decl)
> +  if (info->descriptors[i].decl_or_type)
>      {
>        fprintf (file, " ");
> -      print_generic_expr (file, info->descriptors[i].decl, 0);
> +      print_generic_expr (file, info->descriptors[i].decl_or_type, 0);
>      }
>  }
>
> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> index e32d078..1d5ce0b 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -283,8 +283,11 @@ ipa_get_jf_ancestor_type_preserved (struct ipa_jump_func *jfunc)
>
>  struct ipa_param_descriptor
>  {
> -  /* PARAM_DECL of this parameter.  */
> -  tree decl;
> +  /* In analysis and modification phase, this is the PARM_DECL of this
> +     parameter, in IPA LTO phase, this is the type of the described
> +     parameter or NULL if not known.  Do not read this field directly but
> +     through ipa_get_param and ipa_get_type as appropriate.  */
> +  tree decl_or_type;
>    /* If all uses of the parameter are described by ipa-prop structures, this
>       says how many there are.  If any use could not be described by means of
>       ipa-prop structures, this is IPA_UNDESCRIBED_USE.  */
> @@ -402,13 +405,31 @@ ipa_get_param_count (struct ipa_node_params *info)
>
>  /* Return the declaration of Ith formal parameter of the function corresponding
>     to INFO.  Note there is no setter function as this array is built just once
> -   using ipa_initialize_node_params. */
> +   using ipa_initialize_node_params.  This function should not be called in
> +   WPA.  */
>
>  static inline tree
>  ipa_get_param (struct ipa_node_params *info, int i)
>  {
>    gcc_checking_assert (!flag_wpa);
> -  return info->descriptors[i].decl;
> +  tree t = info->descriptors[i].decl_or_type;
> +  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
> +  return t;
> +}
> +
> +/* Return the type of Ith formal parameter of the function corresponding
> +   to INFO if it is known or NULL if not.  */
> +
> +static inline tree
> +ipa_get_type (struct ipa_node_params *info, int i)
> +{
> +  tree t = info->descriptors[i].decl_or_type;
> +  if (!t)
> +    return NULL;
> +  if (TYPE_P (t))
> +    return t;
> +  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
> +  return TREE_TYPE (t);
>  }
>
>  /* Return the move cost of Ith formal parameter of the function corresponding
>

[-- Attachment #2: bits-prop-2.diff --]
[-- Type: text/plain, Size: 30667 bytes --]

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a292ed..8bac0a2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1561,6 +1561,10 @@ fipa-cp-alignment
 Common Report Var(flag_ipa_cp_alignment) Optimization
 Perform alignment discovery and propagation to make Interprocedural constant propagation stronger.
 
+fipa-cp-bit
+Common Report Var(flag_ipa_cp_bit) Optimization
+Perform interprocedural bitwise constant propagation.
+
 fipa-profile
 Common Report Var(flag_ipa_profile) Init(0) Optimization
 Perform interprocedural profile propagation.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index b308e01..289d6c3 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
 
 template <typename valtype> class ipcp_value;
 
@@ -266,6 +267,38 @@ private:
   bool meet_with_1 (unsigned new_align, unsigned new_misalign);
 };
 
+/* Lattice of known bits, only capable of holding one value.
+   Similar to ccp_lattice_t, mask represents which bits of value are constant.
+   If a bit in mask is set to 0, then the corresponding bit in
+   value is known to be constant.  */
+
+class ipcp_bits_lattice
+{
+public:
+  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
+  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
+  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
+  bool set_to_bottom ();
+  bool set_to_constant (widest_int, widest_int); 
+ 
+  widest_int get_value () { return m_value; }
+  widest_int get_mask () { return m_mask; }
+
+  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
+		  enum tree_code, tree);
+
+  bool meet_with (widest_int, widest_int, unsigned);
+
+  void print (FILE *);
+
+private:
+  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
+  widest_int m_value, m_mask;
+
+  bool meet_with_1 (widest_int, widest_int, unsigned); 
+  void get_value_and_mask (tree, widest_int *, widest_int *);
+}; 
+
 /* Structure containing lattices for a parameter itself and for pieces of
    aggregates that are passed in the parameter or by a reference in a parameter
    plus some other useful flags.  */
@@ -281,6 +314,8 @@ public:
   ipcp_agg_lattice *aggs;
   /* Lattice describing known alignment.  */
   ipcp_alignment_lattice alignment;
+  /* Lattice describing known bits.  */
+  ipcp_bits_lattice bits_lattice;
   /* Number of aggregate lattices */
   int aggs_count;
   /* True if aggregate data were passed by reference (as opposed to by
@@ -458,6 +493,21 @@ ipcp_alignment_lattice::print (FILE * f)
     fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
 }
 
+void
+ipcp_bits_lattice::print (FILE *f)
+{
+  if (top_p ())
+    fprintf (f, "         Bits unknown (TOP)\n");
+  else if (bottom_p ())
+    fprintf (f, "         Bits unusable (BOTTOM)\n");
+  else
+    {
+      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
+      fprintf (f, ", mask = "); print_hex (get_mask (), f);
+      fprintf (f, "\n");
+    }
+}
+
 /* Print all ipcp_lattices of all functions to F.  */
 
 static void
@@ -484,6 +534,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
 	  fprintf (f, "         ctxs: ");
 	  plats->ctxlat.print (f, dump_sources, dump_benefits);
 	  plats->alignment.print (f);
+	  plats->bits_lattice.print (f);
 	  if (plats->virt_call)
 	    fprintf (f, "        virt_call flag set\n");
 
@@ -911,6 +962,151 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
   return meet_with_1 (other.align, adjusted_misalign);
 }
 
+/* Set lattice value to bottom, if it already isn't the case.  */
+
+bool
+ipcp_bits_lattice::set_to_bottom ()
+{
+  if (bottom_p ())
+    return false;
+  m_lattice_val = IPA_BITS_VARYING;
+  m_value = 0;
+  m_mask = -1;
+  return true;
+}
+
+/* Set to constant if it isn't already. Only meant to be called
+   when switching state from TOP.  */
+
+bool
+ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask)
+{
+  gcc_assert (top_p ());
+  m_lattice_val = IPA_BITS_CONSTANT;
+  m_value = value;
+  m_mask = mask;
+  return true;
+}
+
+/* Convert operand to value, mask form.  */
+
+void
+ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
+{
+  wide_int get_nonzero_bits (const_tree);
+
+  if (TREE_CODE (operand) == INTEGER_CST)
+    {
+      *valuep = wi::to_widest (operand); 
+      *maskp = 0;
+    }
+  else
+    {
+      *valuep = 0;
+      *maskp = -1;
+    }
+}
+
+/* Meet operation, similar to ccp_lattice_meet: xor the values so that
+   any bit position where this->value and value differ is dropped to
+   varying.  Return true if the mask changed.  This function assumes
+   that the lattice is in the CONSTANT state.  */
+
+bool
+ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask,
+				unsigned precision)
+{
+  gcc_assert (constant_p ());
+  
+  widest_int old_mask = m_mask; 
+  m_mask = (m_mask | mask) | (m_value ^ value);
+
+  if (wi::sext (m_mask, precision) == -1)
+    return set_to_bottom ();
+
+  return m_mask != old_mask;
+}
+
+/* Meet the bits lattice with operand
+   described by <value, mask, precision>.  */
+
+bool
+ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
+			      unsigned precision)
+{
+  if (bottom_p ())
+    return false;
+
+  if (top_p ())
+    {
+      if (wi::sext (mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (value, mask); 
+    }
+
+  return meet_with_1 (value, mask, precision);
+}
+
+/* Meet bits lattice with the result of bit_value_binop (other, operand)
+   if code is binary operation or bit_value_unop (other) if code is unary op.
+   In the case when code is nop_expr, no adjustment is required. */
+
+bool
+ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, unsigned precision,
+			      signop sgn, enum tree_code code, tree operand)
+{
+  if (other.bottom_p ())
+    return set_to_bottom ();
+
+  if (bottom_p () || other.top_p ())
+    return false;
+
+  widest_int adjusted_value, adjusted_mask;
+
+  if (TREE_CODE_CLASS (code) == tcc_binary)
+    {
+      tree type = TREE_TYPE (operand);
+      gcc_assert (INTEGRAL_TYPE_P (type));
+      widest_int o_value, o_mask;
+      get_value_and_mask (operand, &o_value, &o_mask);
+
+      bit_value_binop (code, sgn, precision, &adjusted_value, &adjusted_mask,
+		       sgn, precision, other.get_value (), other.get_mask (),
+		       TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (TREE_CODE_CLASS (code) == tcc_unary)
+    {
+      bit_value_unop (code, sgn, precision, &adjusted_value,
+		      &adjusted_mask, sgn, precision, other.get_value (),
+		      other.get_mask ());
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (code == NOP_EXPR)
+    {
+      adjusted_value = other.m_value;
+      adjusted_mask = other.m_mask;
+    }
+
+  else
+    return set_to_bottom ();
+
+  if (top_p ())
+    {
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (adjusted_value, adjusted_mask); 
+    }
+  else
+    return meet_with_1 (adjusted_value, adjusted_mask, precision);
+}
+
 /* Mark bot aggregate and scalar lattices as containing an unknown variable,
    return true is any of them has not been marked as such so far.  */
 
@@ -922,6 +1118,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
   ret |= plats->ctxlat.set_contains_variable ();
   ret |= set_agg_lats_contain_variable (plats);
   ret |= plats->alignment.set_to_bottom ();
+  ret |= plats->bits_lattice.set_to_bottom ();
   return ret;
 }
 
@@ -1003,6 +1200,7 @@ initialize_node_lattices (struct cgraph_node *node)
 	      plats->ctxlat.set_to_bottom ();
 	      set_agg_lats_to_bottom (plats);
 	      plats->alignment.set_to_bottom ();
+	      plats->bits_lattice.set_to_bottom ();
 	    }
 	  else
 	    set_all_contains_variable (plats);
@@ -1621,6 +1819,69 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
     }
 }
 
+/* Propagate bits across jfunc that is associated with
+   edge cs and update dest_lattice accordingly.  */
+
+bool
+propagate_bits_accross_jump_function (cgraph_edge *cs, int idx, ipa_jump_func *jfunc,
+				      ipcp_bits_lattice *dest_lattice)
+{
+  if (dest_lattice->bottom_p ())
+    return false;
+
+  struct ipa_node_params *callee_info = IPA_NODE_REF (cs->callee);
+  tree parm_type = ipa_get_type (callee_info, idx);
+
+  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
+     Avoid the transform for these cases.  */
+  if (!parm_type)
+    return dest_lattice->set_to_bottom ();
+
+  unsigned precision = TYPE_PRECISION (parm_type);
+  signop sgn = TYPE_SIGN (parm_type);
+
+  if (jfunc->type == IPA_JF_PASS_THROUGH)
+    {
+      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
+      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
+      tree operand = NULL_TREE;
+
+      if (code != NOP_EXPR)
+	operand = ipa_get_jf_pass_through_operand (jfunc);
+
+      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
+      struct ipcp_param_lattices *src_lats
+	= ipa_get_parm_lattices (caller_info, src_idx);
+
+      /* Try to propagate bits if src_lattice is bottom, but jfunc is known.
+	 for eg consider:
+	 int f(int x)
+	 {
+	   g (x & 0xff);
+	 }
+	 Assume lattice for x is bottom, however we can still propagate
+	 result of x & 0xff == 0xff, which gets computed during ccp1 pass
+	 and we store it in jump function during analysis stage.  */
+
+      if (src_lats->bits_lattice.bottom_p ()
+	  && jfunc->bits.known)
+	return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+					precision);
+      else
+	return dest_lattice->meet_with (src_lats->bits_lattice, precision, sgn,
+					code, operand);
+    }
+
+  else if (jfunc->type == IPA_JF_ANCESTOR)
+    return dest_lattice->set_to_bottom ();
+
+  else if (jfunc->bits.known) 
+    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask, precision);
+  
+  else
+    return dest_lattice->set_to_bottom ();
+}
+
 /* If DEST_PLATS already has aggregate items, check that aggs_by_ref matches
    NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true (in all
    other cases, return false).  If there are no aggregate items, set
@@ -1968,6 +2229,8 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
 							  &dest_plats->ctxlat);
 	  ret |= propagate_alignment_accross_jump_function (cs, jump_func,
 							 &dest_plats->alignment);
+	  ret |= propagate_bits_accross_jump_function (cs, i, jump_func,
+						       &dest_plats->bits_lattice);
 	  ret |= propagate_aggs_accross_jump_function (cs, jump_func,
 						       dest_plats);
 	}
@@ -4605,6 +4868,81 @@ ipcp_store_alignment_results (void)
   }
 }
 
+/* Look up all the bits information that we have discovered and copy it over
+   to the transformation summary.  */
+
+static void
+ipcp_store_bits_results (void)
+{
+  cgraph_node *node;
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      ipa_node_params *info = IPA_NODE_REF (node);
+      bool dumped_sth = false;
+      bool found_useful_result = false;
+
+      if (!opt_for_fn (node->decl, flag_ipa_cp_bit))
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Not considering %s for ipa bitwise propagation "
+				"; -fipa-cp-bit: disabled.\n",
+				node->name ());
+	  continue;
+	}
+
+      if (info->ipcp_orig_node)
+	info = IPA_NODE_REF (info->ipcp_orig_node);
+
+      unsigned count = ipa_get_param_count (info);
+      for (unsigned i = 0; i < count; i++)
+	{
+	  ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	  if (plats->bits_lattice.constant_p ())
+	    {
+	      found_useful_result = true;
+	      break;
+	    }
+	}
+
+    if (!found_useful_result)
+      continue;
+
+    ipcp_grow_transformations_if_necessary ();
+    ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+    vec_safe_reserve_exact (ts->bits, count);
+
+    for (unsigned i = 0; i < count; i++)
+      {
+	ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	ipa_bits bits_jfunc;			 
+
+	if (plats->bits_lattice.constant_p ())
+	  {
+	    bits_jfunc.known = true;
+	    bits_jfunc.value = plats->bits_lattice.get_value ();
+	    bits_jfunc.mask = plats->bits_lattice.get_mask ();
+	  }
+	else
+	  bits_jfunc.known = false;
+
+	ts->bits->quick_push (bits_jfunc);
+	if (!dump_file || !bits_jfunc.known)
+	  continue;
+	if (!dumped_sth)
+	  {
+	    fprintf (dump_file, "Propagated bits info for function %s/%i:\n",
+				node->name (), node->order);
+	    dumped_sth = true;
+	  }
+	fprintf (dump_file, " param %i: value = ", i);
+	print_hex (bits_jfunc.value, dump_file);
+	fprintf (dump_file, ", mask = ");
+	print_hex (bits_jfunc.mask, dump_file);
+	fprintf (dump_file, "\n");
+      }
+    }
+}
 /* The IPCP driver.  */
 
 static unsigned int
@@ -4638,6 +4976,8 @@ ipcp_driver (void)
   ipcp_decision_stage (&topo);
   /* Store results of alignment propagation. */
   ipcp_store_alignment_results ();
+  /* Store results of bits propagation.  */
+  ipcp_store_bits_results ();
 
   /* Free all IPCP structures.  */
   free_toporder_info (&topo);
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 8fa1350..44ec20a 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -303,6 +303,15 @@ ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs)
 	}
       else
 	fprintf (f, "         Unknown alignment\n");
+
+      if (jump_func->bits.known)
+	{
+	  fprintf (f, "         value: "); print_hex (jump_func->bits.value, f);
+	  fprintf (f, ", mask: "); print_hex (jump_func->bits.mask, f);
+	  fprintf (f, "\n");
+	}
+      else
+	fprintf (f, "         Unknown bits\n");
     }
 }
 
@@ -382,6 +391,7 @@ ipa_set_jf_unknown (struct ipa_jump_func *jfunc)
 {
   jfunc->type = IPA_JF_UNKNOWN;
   jfunc->alignment.known = false;
+  jfunc->bits.known = false;
 }
 
 /* Set JFUNC to be a copy of another jmp (to be used by jump function
@@ -1675,6 +1685,26 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
       else
 	gcc_assert (!jfunc->alignment.known);
 
+      if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
+	{
+	  jfunc->bits.known = true;
+	  
+	  if (TREE_CODE (arg) == SSA_NAME)
+	    {
+	      jfunc->bits.value = 0;
+	      jfunc->bits.mask = widest_int::from (get_nonzero_bits (arg),
+						   TYPE_SIGN (TREE_TYPE (arg)));
+	    }
+	  else
+	    {
+	      jfunc->bits.value = wi::to_widest (arg);
+	      jfunc->bits.mask = 0;
+	    }
+	}
+      else
+	gcc_assert (!jfunc->bits.known);
+
       if (is_gimple_ip_invariant (arg)
 	  || (TREE_CODE (arg) == VAR_DECL
 	      && is_global_var (arg)
@@ -3691,6 +3721,18 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst,
       for (unsigned i = 0; i < src_alignments->length (); ++i)
 	dst_alignments->quick_push ((*src_alignments)[i]);
     }
+
+  if (src_trans && vec_safe_length (src_trans->bits) > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      src_trans = ipcp_get_transformation_summary (src);
+      const vec<ipa_bits, va_gc> *src_bits = src_trans->bits;
+      vec<ipa_bits, va_gc> *&dst_bits
+	= ipcp_get_transformation_summary (dst)->bits;
+      vec_safe_reserve_exact (dst_bits, src_bits->length ());
+      for (unsigned i = 0; i < src_bits->length (); ++i)
+	dst_bits->quick_push ((*src_bits)[i]);
+    }
 }
 
 /* Register our cgraph hooks if they are not already there.  */
@@ -4610,6 +4652,15 @@ ipa_write_jump_function (struct output_block *ob,
       streamer_write_uhwi (ob, jump_func->alignment.align);
       streamer_write_uhwi (ob, jump_func->alignment.misalign);
     }
+
+  bp = bitpack_create (ob->main_stream);
+  bp_pack_value (&bp, jump_func->bits.known, 1);
+  streamer_write_bitpack (&bp);
+  if (jump_func->bits.known)
+    {
+      streamer_write_widest_int (ob, jump_func->bits.value);
+      streamer_write_widest_int (ob, jump_func->bits.mask);
+    }   
 }
 
 /* Read in jump function JUMP_FUNC from IB.  */
@@ -4686,6 +4737,17 @@ ipa_read_jump_function (struct lto_input_block *ib,
     }
   else
     jump_func->alignment.known = false;
+
+  bp = streamer_read_bitpack (ib);
+  bool bits_known = bp_unpack_value (&bp, 1);
+  if (bits_known)
+    {
+      jump_func->bits.known = true;
+      jump_func->bits.value = streamer_read_widest_int (ib);
+      jump_func->bits.mask = streamer_read_widest_int (ib);
+    }
+  else
+    jump_func->bits.known = false;
 }
 
 /* Stream out parts of cgraph_indirect_call_info corresponding to CS that are
@@ -5051,6 +5113,28 @@ write_ipcp_transformation_info (output_block *ob, cgraph_node *node)
     }
   else
     streamer_write_uhwi (ob, 0);
+
+  ts = ipcp_get_transformation_summary (node);
+  if (ts && vec_safe_length (ts->bits) > 0)
+    {
+      count = ts->bits->length ();
+      streamer_write_uhwi (ob, count);
+
+      for (unsigned i = 0; i < count; ++i)
+	{
+	  const ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = bitpack_create (ob->main_stream);
+	  bp_pack_value (&bp, bits_jfunc.known, 1);
+	  streamer_write_bitpack (&bp);
+	  if (bits_jfunc.known)
+	    {
+	      streamer_write_widest_int (ob, bits_jfunc.value);
+	      streamer_write_widest_int (ob, bits_jfunc.mask);
+	    }
+	}
+    }
+  else
+    streamer_write_uhwi (ob, 0);
 }
 
 /* Stream in the aggregate value replacement chain for NODE from IB.  */
@@ -5103,6 +5187,26 @@ read_ipcp_transformation_info (lto_input_block *ib, cgraph_node *node,
 	    }
 	}
     }
+
+  count = streamer_read_uhwi (ib);
+  if (count > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+      vec_safe_grow_cleared (ts->bits, count);
+
+      for (i = 0; i < count; i++)
+	{
+	  ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = streamer_read_bitpack (ib);
+	  bits_jfunc.known = bp_unpack_value (&bp, 1);
+	  if (bits_jfunc.known)
+	    {
+	      bits_jfunc.value = streamer_read_widest_int (ib);
+	      bits_jfunc.mask = streamer_read_widest_int (ib);
+	    }
+	}
+    }
 }
 
 /* Write all aggregate replacement for nodes in set.  */
@@ -5405,6 +5509,56 @@ ipcp_update_alignments (struct cgraph_node *node)
     }
 }
 
+/* Update bits info of formal parameters as described in
+   ipcp_transformation_summary.  */
+
+static void
+ipcp_update_bits (struct cgraph_node *node)
+{
+  tree parm = DECL_ARGUMENTS (node->decl);
+  tree next_parm = parm;
+  ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+
+  if (!ts || vec_safe_length (ts->bits) == 0)
+    return;
+
+  vec<ipa_bits, va_gc> &bits = *ts->bits;
+  unsigned count = bits.length ();
+
+  for (unsigned i = 0; i < count; ++i, parm = next_parm)
+    {
+      if (node->clone.combined_args_to_skip
+	  && bitmap_bit_p (node->clone.combined_args_to_skip, i))
+	continue;
+
+      gcc_checking_assert (parm);
+      next_parm = DECL_CHAIN (parm);
+
+      if (!bits[i].known
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (parm))
+	  || !is_gimple_reg (parm))
+	continue;       
+
+      tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm);
+      if (!ddef)
+	continue;
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Adjusting mask for param %u to ", i); 
+	  print_hex (bits[i].mask, dump_file);
+	  fprintf (dump_file, "\n");
+	}
+
+      unsigned prec = TYPE_PRECISION (TREE_TYPE (ddef));
+      signop sgn = TYPE_SIGN (TREE_TYPE (ddef));
+
+      wide_int nonzero_bits = wide_int::from (bits[i].mask, prec, UNSIGNED)
+			      | wide_int::from (bits[i].value, prec, sgn);
+      set_nonzero_bits (ddef, nonzero_bits);
+    }
+}
+
 /* IPCP transformation phase doing propagation of aggregate values.  */
 
 unsigned int
@@ -5424,6 +5578,7 @@ ipcp_transform_function (struct cgraph_node *node)
 	     node->name (), node->order);
 
   ipcp_update_alignments (node);
+  ipcp_update_bits (node);
   aggval = ipa_get_agg_replacements_for_node (node);
   if (!aggval)
       return 0;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 1d5ce0b..e5a56da 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -154,6 +154,19 @@ struct GTY(()) ipa_alignment
   unsigned misalign;
 };
 
+/* Information about zero/non-zero bits.  */
+struct GTY(()) ipa_bits
+{
+  /* The propagated value.  */
+  widest_int value;
+  /* Mask corresponding to the value.
+     Similar to ccp_lattice_t, if xth bit of mask is 0,
+     implies xth bit of value is constant.  */
+  widest_int mask;
+  /* True if jump function is known.  */
+  bool known;
+};
+
 /* A jump function for a callsite represents the values passed as actual
    arguments of the callsite. See enum jump_func_type for the various
    types of jump functions supported.  */
@@ -166,6 +179,9 @@ struct GTY (()) ipa_jump_func
   /* Information about alignment of pointers. */
   struct ipa_alignment alignment;
 
+  /* Information about zero/non-zero bits.  */
+  struct ipa_bits bits;
+
   enum jump_func_type type;
   /* Represents a value of a jump function.  pass_through is used only in jump
      function context.  constant represents the actual constant in constant jump
@@ -503,6 +519,8 @@ struct GTY(()) ipcp_transformation_summary
   ipa_agg_replacement_value *agg_values;
   /* Alignment information for pointers.  */
   vec<ipa_alignment, va_gc> *alignments;
+  /* Known bits information.  */
+  vec<ipa_bits, va_gc> *bits;
 };
 
 void ipa_set_node_agg_value_chain (struct cgraph_node *node,
diff --git a/gcc/opts.c b/gcc/opts.c
index 4053fb1..cde9a7b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -505,6 +505,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp_alignment, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fipa_cp_bit, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_sra, NULL, 1 },
@@ -1422,6 +1423,9 @@ enable_fdo_optimizations (struct gcc_options *opts,
   if (!opts_set->x_flag_ipa_cp_alignment
       && value && opts->x_flag_ipa_cp)
     opts->x_flag_ipa_cp_alignment = value;
+  if (!opts_set->x_flag_ipa_cp_bit
+      && value && opts->x_flag_ipa_cp)
+    opts->x_flag_ipa_cp_bit = value;
   if (!opts_set->x_flag_predictive_commoning)
     opts->x_flag_predictive_commoning = value;
   if (!opts_set->x_flag_unswitch_loops)
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 5d5386e..d88143b 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -142,7 +142,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "stor-layout.h"
 #include "optabs-query.h"
-
+#include "tree-ssa-ccp.h"
 
 /* Possible lattice values.  */
 typedef enum
@@ -536,9 +536,9 @@ set_lattice_value (tree var, ccp_prop_value_t *new_val)
 
 static ccp_prop_value_t get_value_for_expr (tree, bool);
 static ccp_prop_value_t bit_value_binop (enum tree_code, tree, tree, tree);
-static void bit_value_binop_1 (enum tree_code, tree, widest_int *, widest_int *,
-			       tree, const widest_int &, const widest_int &,
-			       tree, const widest_int &, const widest_int &);
+void bit_value_binop (enum tree_code, signop, int, widest_int *, widest_int *,
+		      signop, int, const widest_int &, const widest_int &,
+		      signop, int, const widest_int &, const widest_int &);
 
 /* Return a widest_int that can be used for bitwise simplifications
    from VAL.  */
@@ -894,7 +894,7 @@ do_dbg_cnt (void)
    Return TRUE when something was optimized.  */
 
 static bool
-ccp_finalize (bool nonzero_p)
+ccp_finalize (bool nonzero_p) 
 {
   bool something_changed;
   unsigned i;
@@ -920,7 +920,8 @@ ccp_finalize (bool nonzero_p)
 
       val = get_value (name);
       if (val->lattice_val != CONSTANT
-	  || TREE_CODE (val->value) != INTEGER_CST)
+	  || TREE_CODE (val->value) != INTEGER_CST
+	  || val->mask == 0)
 	continue;
 
       if (POINTER_TYPE_P (TREE_TYPE (name)))
@@ -1224,10 +1225,11 @@ ccp_fold (gimple *stmt)
    RVAL and RMASK representing a value of type RTYPE and set
    the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_unop_1 (enum tree_code code, tree type,
-		  widest_int *val, widest_int *mask,
-		  tree rtype, const widest_int &rval, const widest_int &rmask)
+void
+bit_value_unop (enum tree_code code, signop type_sgn, int type_precision, 
+		widest_int *val, widest_int *mask,
+		signop rtype_sgn, int rtype_precision,
+		const widest_int &rval, const widest_int &rmask)
 {
   switch (code)
     {
@@ -1240,25 +1242,23 @@ bit_value_unop_1 (enum tree_code code, tree type,
       {
 	widest_int temv, temm;
 	/* Return ~rval + 1.  */
-	bit_value_unop_1 (BIT_NOT_EXPR, type, &temv, &temm, type, rval, rmask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   type, temv, temm, type, 1, 0);
+	bit_value_unop (BIT_NOT_EXPR, type_sgn, type_precision, &temv, &temm,
+			type_sgn, type_precision, rval, rmask);
+	bit_value_binop (PLUS_EXPR, type_sgn, type_precision, val, mask,
+			 type_sgn, type_precision, temv, temm,
+			 type_sgn, type_precision, 1, 0);
 	break;
       }
 
     CASE_CONVERT:
       {
-	signop sgn;
-
 	/* First extend mask and value according to the original type.  */
-	sgn = TYPE_SIGN (rtype);
-	*mask = wi::ext (rmask, TYPE_PRECISION (rtype), sgn);
-	*val = wi::ext (rval, TYPE_PRECISION (rtype), sgn);
+	*mask = wi::ext (rmask, rtype_precision, rtype_sgn);
+	*val = wi::ext (rval, rtype_precision, rtype_sgn);
 
 	/* Then extend mask and value according to the target type.  */
-	sgn = TYPE_SIGN (type);
-	*mask = wi::ext (*mask, TYPE_PRECISION (type), sgn);
-	*val = wi::ext (*val, TYPE_PRECISION (type), sgn);
+	*mask = wi::ext (*mask, type_precision, type_sgn);
+	*val = wi::ext (*val, type_precision, type_sgn);
 	break;
       }
 
@@ -1272,15 +1272,14 @@ bit_value_unop_1 (enum tree_code code, tree type,
    R1VAL, R1MASK and R2VAL, R2MASK representing a values of type R1TYPE
    and R2TYPE and set the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_binop_1 (enum tree_code code, tree type,
-		   widest_int *val, widest_int *mask,
-		   tree r1type, const widest_int &r1val,
-		   const widest_int &r1mask, tree r2type,
-		   const widest_int &r2val, const widest_int &r2mask)
+void
+bit_value_binop (enum tree_code code, signop sgn, int width, 
+		 widest_int *val, widest_int *mask,
+		 signop r1type_sgn, int r1type_precision,
+		 const widest_int &r1val, const widest_int &r1mask,
+		 signop r2type_sgn, int r2type_precision,
+		 const widest_int &r2val, const widest_int &r2mask)
 {
-  signop sgn = TYPE_SIGN (type);
-  int width = TYPE_PRECISION (type);
   bool swap_p = false;
 
   /* Assume we'll get a constant result.  Use an initial non varying
@@ -1406,11 +1405,11 @@ bit_value_binop_1 (enum tree_code code, tree type,
     case MINUS_EXPR:
       {
 	widest_int temv, temm;
-	bit_value_unop_1 (NEGATE_EXPR, r2type, &temv, &temm,
-			  r2type, r2val, r2mask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   r1type, r1val, r1mask,
-			   r2type, temv, temm);
+	bit_value_unop (NEGATE_EXPR, r2type_sgn, r2type_precision, &temv, &temm,
+			  r2type_sgn, r2type_precision, r2val, r2mask);
+	bit_value_binop (PLUS_EXPR, sgn, width, val, mask,
+			 r1type_sgn, r1type_precision, r1val, r1mask,
+			 r2type_sgn, r2type_precision, temv, temm);
 	break;
       }
 
@@ -1472,7 +1471,7 @@ bit_value_binop_1 (enum tree_code code, tree type,
 	  break;
 
 	/* For comparisons the signedness is in the comparison operands.  */
-	sgn = TYPE_SIGN (r1type);
+	sgn = r1type_sgn;
 
 	/* If we know the most significant bits we know the values
 	   value ranges by means of treating varying bits as zero
@@ -1525,8 +1524,9 @@ bit_value_unop (enum tree_code code, tree type, tree rhs)
   gcc_assert ((rval.lattice_val == CONSTANT
 	       && TREE_CODE (rval.value) == INTEGER_CST)
 	      || wi::sext (rval.mask, TYPE_PRECISION (TREE_TYPE (rhs))) == -1);
-  bit_value_unop_1 (code, type, &value, &mask,
-		    TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask);
+  bit_value_unop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		  TYPE_SIGN (TREE_TYPE (rhs)), TYPE_PRECISION (TREE_TYPE (rhs)),
+		  value_to_wide_int (rval), rval.mask);
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1571,9 +1571,12 @@ bit_value_binop (enum tree_code code, tree type, tree rhs1, tree rhs2)
 	       && TREE_CODE (r2val.value) == INTEGER_CST)
 	      || wi::sext (r2val.mask,
 			   TYPE_PRECISION (TREE_TYPE (rhs2))) == -1);
-  bit_value_binop_1 (code, type, &value, &mask,
-		     TREE_TYPE (rhs1), value_to_wide_int (r1val), r1val.mask,
-		     TREE_TYPE (rhs2), value_to_wide_int (r2val), r2val.mask);
+  bit_value_binop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (TREE_TYPE (rhs1)), TYPE_PRECISION (TREE_TYPE (rhs1)),
+		   value_to_wide_int (r1val), r1val.mask,
+		   TYPE_SIGN (TREE_TYPE (rhs2)), TYPE_PRECISION (TREE_TYPE (rhs2)),
+		   value_to_wide_int (r2val), r2val.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1672,9 +1675,10 @@ bit_value_assume_aligned (gimple *stmt, tree attr, ccp_prop_value_t ptrval,
 
   align = build_int_cst_type (type, -aligni);
   alignval = get_value_for_expr (align, true);
-  bit_value_binop_1 (BIT_AND_EXPR, type, &value, &mask,
-		     type, value_to_wide_int (ptrval), ptrval.mask,
-		     type, value_to_wide_int (alignval), alignval.mask);
+  bit_value_binop (BIT_AND_EXPR, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (ptrval), ptrval.mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (alignval), alignval.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -2409,7 +2413,7 @@ do_ssa_ccp (bool nonzero_p)
 
   ccp_initialize ();
   ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
-  if (ccp_finalize (nonzero_p))
+  if (ccp_finalize (nonzero_p || flag_ipa_cp_bit))
     {
       todo = (TODO_cleanup_cfg | TODO_update_ssa);
 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] ipa bitwise constant propagation
  2016-08-10  8:45               ` Prathamesh Kulkarni
@ 2016-08-10 11:35                 ` Prathamesh Kulkarni
  2016-08-11 12:55                   ` Jan Hubicka
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-10 11:35 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Richard Biener, Jan Hubicka,
	Kugan Vivekanandarajah, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 8658 bytes --]

On 10 August 2016 at 14:14, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> On 9 August 2016 at 23:43, Martin Jambor <mjambor@suse.cz> wrote:
>> Hi,
>>
>> On Tue, Aug 09, 2016 at 05:17:31PM +0530, Prathamesh Kulkarni wrote:
>>> On 9 August 2016 at 16:39, Martin Jambor <mjambor@suse.cz> wrote:
>>>
>>> ...
>>>
>>> >> Instead of storing arg's precision and sign, we should store
>>> >> parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
>>> >> Diff with respect to previous patch:
>>> >>
>>> >> @@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
>>> >> ipa_func_body_info *fbi,
>>> >>    && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
>>> >>   {
>>> >>    jfunc->bits.known = true;
>>> >> -  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
>>> >> -  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
>>> >> -
>>> >> +  jfunc->bits.sgn = TYPE_SIGN (param_type);
>>> >> +  jfunc->bits.precision = TYPE_PRECISION (param_type);
>>> >> +
>>> >
>>> > If you want to use the precision of the formal parameter then you do
>>> > not need to store it to jump functions.  Parameter DECLs along with
>>> > their types are readily accessible in IPA (even with LTO).  It would
>>> > also be much clearer what is going on, IMHO.
>>> Could you please point out how to access parameter decl in wpa ?
>>> The only reason I ended up putting this in jump function was because
>>> I couldn't figure out how to access param decl during WPA.
>>> I see there's ipa_get_param() in ipa-prop.h however it's gated on
>>> gcc_checking_assert (!flag_wpa), so I suppose I can't use this
>>> during WPA ?
>>>
>>> Alternatively I think I could access cs->callee->decl and get to the param decl
>>> by walking DECL_ARGUMENTS ?
>>
>> Actually, we no longer have DECL_ARGUMENTS during LTO WPA.  But in
>> most cases, you can still get at the type with something like the
>> following (only very lightly tested) patch, if Honza does not think it
>> is too crazy.
>>
>> Note that for old K&R C sources we do not have TYPE_ARG_TYPES and so
>> ipa_get_type can return NULL(!) ...however I wonder whether for such
>> programs the type assumptions made in callers when constructing jump
>> functions can be trusted either.
>>
>> I have to run, we will continue the discussion later.
> Thanks for the patch.
> In this version, I updated the patch to use ipa_get_type, removed
> precision and sgn from ipcp_bits_lattice and ipa_bits, and renamed
> member variables to add the m_ prefix.
> Does it look OK?
> I am looking for a test-case that affects precision and hope to add
> it along with other test-cases in a follow-up patch.
> Bootstrap+test in progress on x86_64-unknown-linux-gnu.
Oops, I forgot to add the tree-ssa-ccp.h hunk to the patch :/
Attached in this version.
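
For anyone following the attached patch, the meet operation of
ipcp_bits_lattice can be tried outside GCC with the rough sketch below.  It
is only an illustration under stated assumptions: bits_lattice and its
members are hypothetical names, widest_int is replaced by uint64_t, and
inputs are truncated to the given precision instead of going through
wi::sext.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit model of the three-state bits lattice.
enum lattice_state { BITS_TOP, BITS_CONSTANT, BITS_BOTTOM };

struct bits_lattice
{
  lattice_state state = BITS_TOP;
  uint64_t value = 0;
  uint64_t mask = 0; // 1 = bit is varying, 0 = bit is known.

  // Drop to BOTTOM; return true if the lattice changed.
  bool set_to_bottom ()
  {
    if (state == BITS_BOTTOM)
      return false;
    state = BITS_BOTTOM;
    value = 0;
    mask = ~uint64_t (0);
    return true;
  }

  // Meet with an operand described by <v, m, precision>;
  // return true if the lattice changed.
  bool meet_with (uint64_t v, uint64_t m, unsigned precision)
  {
    assert (precision >= 1 && precision <= 64);
    uint64_t all = precision == 64 ? ~uint64_t (0)
				   : (uint64_t (1) << precision) - 1;
    v &= all;
    m &= all;

    if (state == BITS_BOTTOM)
      return false;

    if (state == BITS_TOP)
      {
	// An operand with every bit varying carries no information.
	if (m == all)
	  return set_to_bottom ();
	state = BITS_CONSTANT;
	value = v;
	mask = m;
	return true;
      }

    // CONSTANT state: a bit stays known only if it is known on both
    // sides and the two known values agree.
    uint64_t old_mask = mask;
    mask = (mask | m) | (value ^ v);
    if (mask == all)
      return set_to_bottom ();
    return mask != old_mask;
  }
};
```

In this model, meeting the constants 0x10 and 0x11 at precision 8 leaves
value 0x10 with mask 0x01, i.e. only bit 0 becomes varying.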

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
>>
>> Martin
>>
>>
>> 2016-08-09  Martin Jambor  <mjambor@suse.cz>
>>
>>         * ipa-prop.h (ipa_param_descriptor): Renamed decl to decl_or_type.
>>         Update comment.
>>         (ipa_get_param): Updated comment, added assert that we have a
>>         PARM_DECL.
>>         (ipa_get_type): New function.
>>         * ipa-cp.c (ipcp_propagate_stage): Fill in argument types in LTO mode.
>>         * ipa-prop.c (ipa_get_param_decl_index_1): Use decl_or_type
>>         instead of decl;
>>         (ipa_populate_param_decls): Likewise.
>>         (ipa_dump_param): Likewise.
>>
>>
>> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
>> index 5b6cb9a..3465da5 100644
>> --- a/gcc/ipa-cp.c
>> +++ b/gcc/ipa-cp.c
>> @@ -1952,11 +1952,21 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
>>    else
>>      i = 0;
>>
>> +  /* !!! The following dump is of course only a demonstration that it works: */
>> +  debug_generic_expr (callee->decl);
>> +  fprintf (stderr, "\n");
>> +
>>    for (; (i < args_count) && (i < parms_count); i++)
>>      {
>>        struct ipa_jump_func *jump_func = ipa_get_ith_jump_func (args, i);
>>        struct ipcp_param_lattices *dest_plats;
>>
>> +      /* !!! The following dump is of course only a demonstration that it
>> +             works: */
>> +      fprintf (stderr, "  The type of parameter %i is: ", i);
>> +      debug_generic_expr (ipa_get_type (callee_info, i));
>> +      fprintf (stderr, "\n");
>> +
>>        dest_plats = ipa_get_parm_lattices (callee_info, i);
>>        if (availability == AVAIL_INTERPOSABLE)
>>         ret |= set_all_contains_variable (dest_plats);
>> @@ -2936,6 +2946,19 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
>>    {
>>      struct ipa_node_params *info = IPA_NODE_REF (node);
>>
>> +    /* In LTO we do not have PARM_DECLs but we would still like to be able to
>> +       look at types of parameters.  */
>> +    if (in_lto_p)
>> +      {
>> +       tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
>> +       for (int k = 0; k < ipa_get_param_count (info); k++)
>> +         {
>> +           gcc_assert (t != void_list_node);
>> +           info->descriptors[k].decl_or_type = TREE_VALUE (t);
>> +           t = t ? TREE_CHAIN (t) : NULL;
>> +         }
>> +      }
>> +
>>      determine_versionability (node, info);
>>      if (node->has_gimple_body_p ())
>>        {
>> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
>> index 132b622..1eaccdf 100644
>> --- a/gcc/ipa-prop.c
>> +++ b/gcc/ipa-prop.c
>> @@ -103,9 +103,10 @@ ipa_get_param_decl_index_1 (vec<ipa_param_descriptor> descriptors, tree ptree)
>>  {
>>    int i, count;
>>
>> +  gcc_checking_assert (!flag_wpa);
>>    count = descriptors.length ();
>>    for (i = 0; i < count; i++)
>> -    if (descriptors[i].decl == ptree)
>> +    if (descriptors[i].decl_or_type == ptree)
>>        return i;
>>
>>    return -1;
>> @@ -138,7 +139,7 @@ ipa_populate_param_decls (struct cgraph_node *node,
>>    param_num = 0;
>>    for (parm = fnargs; parm; parm = DECL_CHAIN (parm))
>>      {
>> -      descriptors[param_num].decl = parm;
>> +      descriptors[param_num].decl_or_type = parm;
>>        descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE (parm),
>>                                                              true);
>>        param_num++;
>> @@ -168,10 +169,10 @@ void
>>  ipa_dump_param (FILE *file, struct ipa_node_params *info, int i)
>>  {
>>    fprintf (file, "param #%i", i);
>> -  if (info->descriptors[i].decl)
>> +  if (info->descriptors[i].decl_or_type)
>>      {
>>        fprintf (file, " ");
>> -      print_generic_expr (file, info->descriptors[i].decl, 0);
>> +      print_generic_expr (file, info->descriptors[i].decl_or_type, 0);
>>      }
>>  }
>>
>> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
>> index e32d078..1d5ce0b 100644
>> --- a/gcc/ipa-prop.h
>> +++ b/gcc/ipa-prop.h
>> @@ -283,8 +283,11 @@ ipa_get_jf_ancestor_type_preserved (struct ipa_jump_func *jfunc)
>>
>>  struct ipa_param_descriptor
>>  {
>> -  /* PARAM_DECL of this parameter.  */
>> -  tree decl;
>> +  /* In analysis and modification phase, this is the PARM_DECL of this
>> +     parameter, in IPA LTO phase, this is the type of the described
>> +     parameter or NULL if not known.  Do not read this field directly but
>> +     through ipa_get_param and ipa_get_type as appropriate.  */
>> +  tree decl_or_type;
>>    /* If all uses of the parameter are described by ipa-prop structures, this
>>       says how many there are.  If any use could not be described by means of
>>       ipa-prop structures, this is IPA_UNDESCRIBED_USE.  */
>> @@ -402,13 +405,31 @@ ipa_get_param_count (struct ipa_node_params *info)
>>
>>  /* Return the declaration of Ith formal parameter of the function corresponding
>>     to INFO.  Note there is no setter function as this array is built just once
>> -   using ipa_initialize_node_params. */
>> +   using ipa_initialize_node_params.  This function should not be called in
>> +   WPA.  */
>>
>>  static inline tree
>>  ipa_get_param (struct ipa_node_params *info, int i)
>>  {
>>    gcc_checking_assert (!flag_wpa);
>> -  return info->descriptors[i].decl;
>> +  tree t = info->descriptors[i].decl_or_type;
>> +  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
>> +  return t;
>> +}
>> +
>> +/* Return the type of Ith formal parameter of the function corresponding
>> +   to INFO if it is known or NULL if not.  */
>> +
>> +static inline tree
>> +ipa_get_type (struct ipa_node_params *info, int i)
>> +{
>> +  tree t = info->descriptors[i].decl_or_type;
>> +  if (!t)
>> +    return NULL;
>> +  if (TYPE_P (t))
>> +    return t;
>> +  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
>> +  return TREE_TYPE (t);
>>  }
>>
>>  /* Return the move cost of Ith formal parameter of the function corresponding
>>

[-- Attachment #2: bits-prop-2.diff --]
[-- Type: text/plain, Size: 31929 bytes --]

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a292ed..8bac0a2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1561,6 +1561,10 @@ fipa-cp-alignment
 Common Report Var(flag_ipa_cp_alignment) Optimization
 Perform alignment discovery and propagation to make Interprocedural constant propagation stronger.
 
+fipa-cp-bit
+Common Report Var(flag_ipa_cp_bit) Optimization
+Perform interprocedural bitwise constant propagation.
+
 fipa-profile
 Common Report Var(flag_ipa_profile) Init(0) Optimization
 Perform interprocedural profile propagation.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index b308e01..289d6c3 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
 
 template <typename valtype> class ipcp_value;
 
@@ -266,6 +267,38 @@ private:
   bool meet_with_1 (unsigned new_align, unsigned new_misalign);
 };
 
+/* Lattice of known bits, only capable of holding one value.
+   Similar to ccp_lattice_t, mask represents which bits of value are constant.
+   If a bit in mask is set to 0, then the corresponding bit in
+   value is known to be constant.  */
+
+class ipcp_bits_lattice
+{
+public:
+  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
+  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
+  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
+  bool set_to_bottom ();
+  bool set_to_constant (widest_int, widest_int); 
+ 
+  widest_int get_value () { return m_value; }
+  widest_int get_mask () { return m_mask; }
+
+  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
+		  enum tree_code, tree);
+
+  bool meet_with (widest_int, widest_int, unsigned);
+
+  void print (FILE *);
+
+private:
+  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
+  widest_int m_value, m_mask;
+
+  bool meet_with_1 (widest_int, widest_int, unsigned); 
+  void get_value_and_mask (tree, widest_int *, widest_int *);
+}; 
+
 /* Structure containing lattices for a parameter itself and for pieces of
    aggregates that are passed in the parameter or by a reference in a parameter
    plus some other useful flags.  */
@@ -281,6 +314,8 @@ public:
   ipcp_agg_lattice *aggs;
   /* Lattice describing known alignment.  */
   ipcp_alignment_lattice alignment;
+  /* Lattice describing known bits.  */
+  ipcp_bits_lattice bits_lattice;
   /* Number of aggregate lattices */
   int aggs_count;
   /* True if aggregate data were passed by reference (as opposed to by
@@ -458,6 +493,21 @@ ipcp_alignment_lattice::print (FILE * f)
     fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
 }
 
+void
+ipcp_bits_lattice::print (FILE *f)
+{
+  if (top_p ())
+    fprintf (f, "         Bits unknown (TOP)\n");
+  else if (bottom_p ())
+    fprintf (f, "         Bits unusable (BOTTOM)\n");
+  else
+    {
+      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
+      fprintf (f, ", mask = "); print_hex (get_mask (), f);
+      fprintf (f, "\n");
+    }
+}
+
 /* Print all ipcp_lattices of all functions to F.  */
 
 static void
@@ -484,6 +534,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
 	  fprintf (f, "         ctxs: ");
 	  plats->ctxlat.print (f, dump_sources, dump_benefits);
 	  plats->alignment.print (f);
+	  plats->bits_lattice.print (f);
 	  if (plats->virt_call)
 	    fprintf (f, "        virt_call flag set\n");
 
@@ -911,6 +962,151 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
   return meet_with_1 (other.align, adjusted_misalign);
 }
 
+/* Set lattice value to bottom, if it already isn't the case.  */
+
+bool
+ipcp_bits_lattice::set_to_bottom ()
+{
+  if (bottom_p ())
+    return false;
+  m_lattice_val = IPA_BITS_VARYING;
+  m_value = 0;
+  m_mask = -1;
+  return true;
+}
+
+/* Set to constant if it isn't already. Only meant to be called
+   when switching state from TOP.  */
+
+bool
+ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask)
+{
+  gcc_assert (top_p ());
+  m_lattice_val = IPA_BITS_CONSTANT;
+  m_value = value;
+  m_mask = mask;
+  return true;
+}
+
+/* Convert operand to value, mask form.  */
+
+void
+ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
+{
+  wide_int get_nonzero_bits (const_tree);
+
+  if (TREE_CODE (operand) == INTEGER_CST)
+    {
+      *valuep = wi::to_widest (operand); 
+      *maskp = 0;
+    }
+  else
+    {
+      *valuep = 0;
+      *maskp = -1;
+    }
+}
+
+/* Meet operation, similar to ccp_lattice_meet: xor m_value with VALUE
+   and drop to varying every bit position at which the two differ, in
+   addition to the bits already varying in either mask.  Return true if the
+   mask changed.  This function assumes the lattice is in CONSTANT state.  */
+
+bool
+ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask,
+				unsigned precision)
+{
+  gcc_assert (constant_p ());
+  
+  widest_int old_mask = m_mask; 
+  m_mask = (m_mask | mask) | (m_value ^ value);
+
+  if (wi::sext (m_mask, precision) == -1)
+    return set_to_bottom ();
+
+  return m_mask != old_mask;
+}
+
+/* Meet the bits lattice with operand
+   described by <value, mask, precision>.  */
+
+bool
+ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
+			      unsigned precision)
+{
+  if (bottom_p ())
+    return false;
+
+  if (top_p ())
+    {
+      if (wi::sext (mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (value, mask); 
+    }
+
+  return meet_with_1 (value, mask, precision);
+}
+
+/* Meet bits lattice with the result of bit_value_binop (other, operand)
+   if code is binary operation or bit_value_unop (other) if code is unary op.
+   In the case when code is nop_expr, no adjustment is required. */
+
+bool
+ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, unsigned precision,
+			      signop sgn, enum tree_code code, tree operand)
+{
+  if (other.bottom_p ())
+    return set_to_bottom ();
+
+  if (bottom_p () || other.top_p ())
+    return false;
+
+  widest_int adjusted_value, adjusted_mask;
+
+  if (TREE_CODE_CLASS (code) == tcc_binary)
+    {
+      tree type = TREE_TYPE (operand);
+      gcc_assert (INTEGRAL_TYPE_P (type));
+      widest_int o_value, o_mask;
+      get_value_and_mask (operand, &o_value, &o_mask);
+
+      bit_value_binop (code, sgn, precision, &adjusted_value, &adjusted_mask,
+		       sgn, precision, other.get_value (), other.get_mask (),
+		       TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (TREE_CODE_CLASS (code) == tcc_unary)
+    {
+      bit_value_unop (code, sgn, precision, &adjusted_value,
+		      &adjusted_mask, sgn, precision, other.get_value (),
+		      other.get_mask ());
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (code == NOP_EXPR)
+    {
+      adjusted_value = other.m_value;
+      adjusted_mask = other.m_mask;
+    }
+
+  else
+    return set_to_bottom ();
+
+  if (top_p ())
+    {
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (adjusted_value, adjusted_mask); 
+    }
+  else
+    return meet_with_1 (adjusted_value, adjusted_mask, precision);
+}
+
 /* Mark bot aggregate and scalar lattices as containing an unknown variable,
    return true is any of them has not been marked as such so far.  */
 
@@ -922,6 +1118,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
   ret |= plats->ctxlat.set_contains_variable ();
   ret |= set_agg_lats_contain_variable (plats);
   ret |= plats->alignment.set_to_bottom ();
+  ret |= plats->bits_lattice.set_to_bottom ();
   return ret;
 }
 
@@ -1003,6 +1200,7 @@ initialize_node_lattices (struct cgraph_node *node)
 	      plats->ctxlat.set_to_bottom ();
 	      set_agg_lats_to_bottom (plats);
 	      plats->alignment.set_to_bottom ();
+	      plats->bits_lattice.set_to_bottom ();
 	    }
 	  else
 	    set_all_contains_variable (plats);
@@ -1621,6 +1819,69 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
     }
 }
 
+/* Propagate bits across jfunc that is associated with
+   edge cs and update dest_lattice accordingly.  */
+
+bool
+propagate_bits_accross_jump_function (cgraph_edge *cs, int idx, ipa_jump_func *jfunc,
+				      ipcp_bits_lattice *dest_lattice)
+{
+  if (dest_lattice->bottom_p ())
+    return false;
+
+  struct ipa_node_params *callee_info = IPA_NODE_REF (cs->callee);
+  tree parm_type = ipa_get_type (callee_info, idx);
+
+  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
+     Avoid the transform for these cases.  */
+  if (!parm_type)
+    return dest_lattice->set_to_bottom ();
+
+  unsigned precision = TYPE_PRECISION (parm_type);
+  signop sgn = TYPE_SIGN (parm_type);
+
+  if (jfunc->type == IPA_JF_PASS_THROUGH)
+    {
+      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
+      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
+      tree operand = NULL_TREE;
+
+      if (code != NOP_EXPR)
+	operand = ipa_get_jf_pass_through_operand (jfunc);
+
+      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
+      struct ipcp_param_lattices *src_lats
+	= ipa_get_parm_lattices (caller_info, src_idx);
+
+      /* Try to propagate bits if src_lattice is bottom, but jfunc is known.
+	 For example, consider:
+	 int f(int x)
+	 {
+	   g (x & 0xff);
+	 }
+	 Assume the lattice for x is bottom.  We can still propagate the
+	 known bits of x & 0xff (mask 0xff), which are computed during the
+	 ccp1 pass and stored in the jump function at analysis time.  */
+
+      if (src_lats->bits_lattice.bottom_p ()
+	  && jfunc->bits.known)
+	return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+					precision);
+      else
+	return dest_lattice->meet_with (src_lats->bits_lattice, precision, sgn,
+					code, operand);
+    }
+
+  else if (jfunc->type == IPA_JF_ANCESTOR)
+    return dest_lattice->set_to_bottom ();
+
+  else if (jfunc->bits.known) 
+    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask, precision);
+  
+  else
+    return dest_lattice->set_to_bottom ();
+}
+
 /* If DEST_PLATS already has aggregate items, check that aggs_by_ref matches
    NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true (in all
    other cases, return false).  If there are no aggregate items, set
@@ -1968,6 +2229,8 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
 							  &dest_plats->ctxlat);
 	  ret |= propagate_alignment_accross_jump_function (cs, jump_func,
 							 &dest_plats->alignment);
+	  ret |= propagate_bits_accross_jump_function (cs, i, jump_func,
+						       &dest_plats->bits_lattice);
 	  ret |= propagate_aggs_accross_jump_function (cs, jump_func,
 						       dest_plats);
 	}
@@ -4605,6 +4868,81 @@ ipcp_store_alignment_results (void)
   }
 }
 
+/* Look up all the bits information that we have discovered and copy it over
+   to the transformation summary.  */
+
+static void
+ipcp_store_bits_results (void)
+{
+  cgraph_node *node;
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      ipa_node_params *info = IPA_NODE_REF (node);
+      bool dumped_sth = false;
+      bool found_useful_result = false;
+
+      if (!opt_for_fn (node->decl, flag_ipa_cp_bit))
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Not considering %s for ipa bitwise propagation "
+				"; -fipa-cp-bit: disabled.\n",
+				node->name ());
+	  continue;
+	}
+
+      if (info->ipcp_orig_node)
+	info = IPA_NODE_REF (info->ipcp_orig_node);
+
+      unsigned count = ipa_get_param_count (info);
+      for (unsigned i = 0; i < count; i++)
+	{
+	  ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	  if (plats->bits_lattice.constant_p ())
+	    {
+	      found_useful_result = true;
+	      break;
+	    }
+	}
+
+    if (!found_useful_result)
+      continue;
+
+    ipcp_grow_transformations_if_necessary ();
+    ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+    vec_safe_reserve_exact (ts->bits, count);
+
+    for (unsigned i = 0; i < count; i++)
+      {
+	ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	ipa_bits bits_jfunc;			 
+
+	if (plats->bits_lattice.constant_p ())
+	  {
+	    bits_jfunc.known = true;
+	    bits_jfunc.value = plats->bits_lattice.get_value ();
+	    bits_jfunc.mask = plats->bits_lattice.get_mask ();
+	  }
+	else
+	  bits_jfunc.known = false;
+
+	ts->bits->quick_push (bits_jfunc);
+	if (!dump_file || !bits_jfunc.known)
+	  continue;
+	if (!dumped_sth)
+	  {
+	    fprintf (dump_file, "Propagated bits info for function %s/%i:\n",
+				node->name (), node->order);
+	    dumped_sth = true;
+	  }
+	fprintf (dump_file, " param %i: value = ", i);
+	print_hex (bits_jfunc.value, dump_file);
+	fprintf (dump_file, ", mask = ");
+	print_hex (bits_jfunc.mask, dump_file);
+	fprintf (dump_file, "\n");
+      }
+    }
+}
 /* The IPCP driver.  */
 
 static unsigned int
@@ -4638,6 +4976,8 @@ ipcp_driver (void)
   ipcp_decision_stage (&topo);
   /* Store results of alignment propagation. */
   ipcp_store_alignment_results ();
+  /* Store results of bits propagation.  */
+  ipcp_store_bits_results ();
 
   /* Free all IPCP structures.  */
   free_toporder_info (&topo);
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 8fa1350..44ec20a 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -303,6 +303,15 @@ ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs)
 	}
       else
 	fprintf (f, "         Unknown alignment\n");
+
+      if (jump_func->bits.known)
+	{
+	  fprintf (f, "         value: "); print_hex (jump_func->bits.value, f);
+	  fprintf (f, ", mask: "); print_hex (jump_func->bits.mask, f);
+	  fprintf (f, "\n");
+	}
+      else
+	fprintf (f, "         Unknown bits\n");
     }
 }
 
@@ -382,6 +391,7 @@ ipa_set_jf_unknown (struct ipa_jump_func *jfunc)
 {
   jfunc->type = IPA_JF_UNKNOWN;
   jfunc->alignment.known = false;
+  jfunc->bits.known = false;
 }
 
 /* Set JFUNC to be a copy of another jmp (to be used by jump function
@@ -1675,6 +1685,26 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
       else
 	gcc_assert (!jfunc->alignment.known);
 
+      if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
+	{
+	  jfunc->bits.known = true;
+	  
+	  if (TREE_CODE (arg) == SSA_NAME)
+	    {
+	      jfunc->bits.value = 0;
+	      jfunc->bits.mask = widest_int::from (get_nonzero_bits (arg),
+						   TYPE_SIGN (TREE_TYPE (arg)));
+	    }
+	  else
+	    {
+	      jfunc->bits.value = wi::to_widest (arg);
+	      jfunc->bits.mask = 0;
+	    }
+	}
+      else
+	gcc_assert (!jfunc->bits.known);
+
       if (is_gimple_ip_invariant (arg)
 	  || (TREE_CODE (arg) == VAR_DECL
 	      && is_global_var (arg)
@@ -3691,6 +3721,18 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst,
       for (unsigned i = 0; i < src_alignments->length (); ++i)
 	dst_alignments->quick_push ((*src_alignments)[i]);
     }
+
+  if (src_trans && vec_safe_length (src_trans->bits) > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      src_trans = ipcp_get_transformation_summary (src);
+      const vec<ipa_bits, va_gc> *src_bits = src_trans->bits;
+      vec<ipa_bits, va_gc> *&dst_bits
+	= ipcp_get_transformation_summary (dst)->bits;
+      vec_safe_reserve_exact (dst_bits, src_bits->length ());
+      for (unsigned i = 0; i < src_bits->length (); ++i)
+	dst_bits->quick_push ((*src_bits)[i]);
+    }
 }
 
 /* Register our cgraph hooks if they are not already there.  */
@@ -4610,6 +4652,15 @@ ipa_write_jump_function (struct output_block *ob,
       streamer_write_uhwi (ob, jump_func->alignment.align);
       streamer_write_uhwi (ob, jump_func->alignment.misalign);
     }
+
+  bp = bitpack_create (ob->main_stream);
+  bp_pack_value (&bp, jump_func->bits.known, 1);
+  streamer_write_bitpack (&bp);
+  if (jump_func->bits.known)
+    {
+      streamer_write_widest_int (ob, jump_func->bits.value);
+      streamer_write_widest_int (ob, jump_func->bits.mask);
+    }   
 }
 
 /* Read in jump function JUMP_FUNC from IB.  */
@@ -4686,6 +4737,17 @@ ipa_read_jump_function (struct lto_input_block *ib,
     }
   else
     jump_func->alignment.known = false;
+
+  bp = streamer_read_bitpack (ib);
+  bool bits_known = bp_unpack_value (&bp, 1);
+  if (bits_known)
+    {
+      jump_func->bits.known = true;
+      jump_func->bits.value = streamer_read_widest_int (ib);
+      jump_func->bits.mask = streamer_read_widest_int (ib);
+    }
+  else
+    jump_func->bits.known = false;
 }
 
 /* Stream out parts of cgraph_indirect_call_info corresponding to CS that are
@@ -5051,6 +5113,28 @@ write_ipcp_transformation_info (output_block *ob, cgraph_node *node)
     }
   else
     streamer_write_uhwi (ob, 0);
+
+  ts = ipcp_get_transformation_summary (node);
+  if (ts && vec_safe_length (ts->bits) > 0)
+    {
+      count = ts->bits->length ();
+      streamer_write_uhwi (ob, count);
+
+      for (unsigned i = 0; i < count; ++i)
+	{
+	  const ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = bitpack_create (ob->main_stream);
+	  bp_pack_value (&bp, bits_jfunc.known, 1);
+	  streamer_write_bitpack (&bp);
+	  if (bits_jfunc.known)
+	    {
+	      streamer_write_widest_int (ob, bits_jfunc.value);
+	      streamer_write_widest_int (ob, bits_jfunc.mask);
+	    }
+	}
+    }
+  else
+    streamer_write_uhwi (ob, 0);
 }
 
 /* Stream in the aggregate value replacement chain for NODE from IB.  */
@@ -5103,6 +5187,26 @@ read_ipcp_transformation_info (lto_input_block *ib, cgraph_node *node,
 	    }
 	}
     }
+
+  count = streamer_read_uhwi (ib);
+  if (count > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+      vec_safe_grow_cleared (ts->bits, count);
+
+      for (i = 0; i < count; i++)
+	{
+	  ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = streamer_read_bitpack (ib);
+	  bits_jfunc.known = bp_unpack_value (&bp, 1);
+	  if (bits_jfunc.known)
+	    {
+	      bits_jfunc.value = streamer_read_widest_int (ib);
+	      bits_jfunc.mask = streamer_read_widest_int (ib);
+	    }
+	}
+    }
 }
 
 /* Write all aggregate replacement for nodes in set.  */
@@ -5405,6 +5509,56 @@ ipcp_update_alignments (struct cgraph_node *node)
     }
 }
 
+/* Update bits info of formal parameters as described in
+   ipcp_transformation_summary.  */
+
+static void
+ipcp_update_bits (struct cgraph_node *node)
+{
+  tree parm = DECL_ARGUMENTS (node->decl);
+  tree next_parm = parm;
+  ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+
+  if (!ts || vec_safe_length (ts->bits) == 0)
+    return;
+
+  vec<ipa_bits, va_gc> &bits = *ts->bits;
+  unsigned count = bits.length ();
+
+  for (unsigned i = 0; i < count; ++i, parm = next_parm)
+    {
+      if (node->clone.combined_args_to_skip
+	  && bitmap_bit_p (node->clone.combined_args_to_skip, i))
+	continue;
+
+      gcc_checking_assert (parm);
+      next_parm = DECL_CHAIN (parm);
+
+      if (!bits[i].known
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (parm))
+	  || !is_gimple_reg (parm))
+	continue;       
+
+      tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm);
+      if (!ddef)
+	continue;
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Adjusting mask for param %u to ", i); 
+	  print_hex (bits[i].mask, dump_file);
+	  fprintf (dump_file, "\n");
+	}
+
+      unsigned prec = TYPE_PRECISION (TREE_TYPE (ddef));
+      signop sgn = TYPE_SIGN (TREE_TYPE (ddef));
+
+      wide_int nonzero_bits = wide_int::from (bits[i].mask, prec, UNSIGNED)
+			      | wide_int::from (bits[i].value, prec, sgn);
+      set_nonzero_bits (ddef, nonzero_bits);
+    }
+}
+
 /* IPCP transformation phase doing propagation of aggregate values.  */
 
 unsigned int
@@ -5424,6 +5578,7 @@ ipcp_transform_function (struct cgraph_node *node)
 	     node->name (), node->order);
 
   ipcp_update_alignments (node);
+  ipcp_update_bits (node);
   aggval = ipa_get_agg_replacements_for_node (node);
   if (!aggval)
       return 0;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index 1d5ce0b..e5a56da 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -154,6 +154,19 @@ struct GTY(()) ipa_alignment
   unsigned misalign;
 };
 
+/* Information about zero/non-zero bits.  */
+struct GTY(()) ipa_bits
+{
+  /* The propagated value.  */
+  widest_int value;
+  /* Mask corresponding to the value.
+     Similar to ccp_lattice_t, if xth bit of mask is 0,
+     implies xth bit of value is constant.  */
+  widest_int mask;
+  /* True if jump function is known.  */
+  bool known;
+};
+
 /* A jump function for a callsite represents the values passed as actual
    arguments of the callsite. See enum jump_func_type for the various
    types of jump functions supported.  */
@@ -166,6 +179,9 @@ struct GTY (()) ipa_jump_func
   /* Information about alignment of pointers. */
   struct ipa_alignment alignment;
 
+  /* Information about zero/non-zero bits.  */
+  struct ipa_bits bits;
+
   enum jump_func_type type;
   /* Represents a value of a jump function.  pass_through is used only in jump
      function context.  constant represents the actual constant in constant jump
@@ -503,6 +519,8 @@ struct GTY(()) ipcp_transformation_summary
   ipa_agg_replacement_value *agg_values;
   /* Alignment information for pointers.  */
   vec<ipa_alignment, va_gc> *alignments;
+  /* Known bits information.  */
+  vec<ipa_bits, va_gc> *bits;
 };
 
 void ipa_set_node_agg_value_chain (struct cgraph_node *node,
diff --git a/gcc/opts.c b/gcc/opts.c
index 4053fb1..cde9a7b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -505,6 +505,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp_alignment, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fipa_cp_bit, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_sra, NULL, 1 },
@@ -1422,6 +1423,9 @@ enable_fdo_optimizations (struct gcc_options *opts,
   if (!opts_set->x_flag_ipa_cp_alignment
       && value && opts->x_flag_ipa_cp)
     opts->x_flag_ipa_cp_alignment = value;
+  if (!opts_set->x_flag_ipa_cp_bit
+      && value && opts->x_flag_ipa_cp)
+    opts->x_flag_ipa_cp_bit = value;
   if (!opts_set->x_flag_predictive_commoning)
     opts->x_flag_predictive_commoning = value;
   if (!opts_set->x_flag_unswitch_loops)
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 5d5386e..d88143b 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -142,7 +142,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "stor-layout.h"
 #include "optabs-query.h"
-
+#include "tree-ssa-ccp.h"
 
 /* Possible lattice values.  */
 typedef enum
@@ -536,9 +536,9 @@ set_lattice_value (tree var, ccp_prop_value_t *new_val)
 
 static ccp_prop_value_t get_value_for_expr (tree, bool);
 static ccp_prop_value_t bit_value_binop (enum tree_code, tree, tree, tree);
-static void bit_value_binop_1 (enum tree_code, tree, widest_int *, widest_int *,
-			       tree, const widest_int &, const widest_int &,
-			       tree, const widest_int &, const widest_int &);
+void bit_value_binop (enum tree_code, signop, int, widest_int *, widest_int *,
+		      signop, int, const widest_int &, const widest_int &,
+		      signop, int, const widest_int &, const widest_int &);
 
 /* Return a widest_int that can be used for bitwise simplifications
    from VAL.  */
@@ -894,7 +894,7 @@ do_dbg_cnt (void)
    Return TRUE when something was optimized.  */
 
 static bool
-ccp_finalize (bool nonzero_p)
+ccp_finalize (bool nonzero_p) 
 {
   bool something_changed;
   unsigned i;
@@ -920,7 +920,8 @@ ccp_finalize (bool nonzero_p)
 
       val = get_value (name);
       if (val->lattice_val != CONSTANT
-	  || TREE_CODE (val->value) != INTEGER_CST)
+	  || TREE_CODE (val->value) != INTEGER_CST
+	  || val->mask == 0)
 	continue;
 
       if (POINTER_TYPE_P (TREE_TYPE (name)))
@@ -1224,10 +1225,11 @@ ccp_fold (gimple *stmt)
    RVAL and RMASK representing a value of type RTYPE and set
    the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_unop_1 (enum tree_code code, tree type,
-		  widest_int *val, widest_int *mask,
-		  tree rtype, const widest_int &rval, const widest_int &rmask)
+void
+bit_value_unop (enum tree_code code, signop type_sgn, int type_precision, 
+		widest_int *val, widest_int *mask,
+		signop rtype_sgn, int rtype_precision,
+		const widest_int &rval, const widest_int &rmask)
 {
   switch (code)
     {
@@ -1240,25 +1242,23 @@ bit_value_unop_1 (enum tree_code code, tree type,
       {
 	widest_int temv, temm;
 	/* Return ~rval + 1.  */
-	bit_value_unop_1 (BIT_NOT_EXPR, type, &temv, &temm, type, rval, rmask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   type, temv, temm, type, 1, 0);
+	bit_value_unop (BIT_NOT_EXPR, type_sgn, type_precision, &temv, &temm,
+			type_sgn, type_precision, rval, rmask);
+	bit_value_binop (PLUS_EXPR, type_sgn, type_precision, val, mask,
+			 type_sgn, type_precision, temv, temm,
+			 type_sgn, type_precision, 1, 0);
 	break;
       }
 
     CASE_CONVERT:
       {
-	signop sgn;
-
 	/* First extend mask and value according to the original type.  */
-	sgn = TYPE_SIGN (rtype);
-	*mask = wi::ext (rmask, TYPE_PRECISION (rtype), sgn);
-	*val = wi::ext (rval, TYPE_PRECISION (rtype), sgn);
+	*mask = wi::ext (rmask, rtype_precision, rtype_sgn);
+	*val = wi::ext (rval, rtype_precision, rtype_sgn);
 
 	/* Then extend mask and value according to the target type.  */
-	sgn = TYPE_SIGN (type);
-	*mask = wi::ext (*mask, TYPE_PRECISION (type), sgn);
-	*val = wi::ext (*val, TYPE_PRECISION (type), sgn);
+	*mask = wi::ext (*mask, type_precision, type_sgn);
+	*val = wi::ext (*val, type_precision, type_sgn);
 	break;
       }
 
@@ -1272,15 +1272,14 @@ bit_value_unop_1 (enum tree_code code, tree type,
    R1VAL, R1MASK and R2VAL, R2MASK representing a values of type R1TYPE
    and R2TYPE and set the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_binop_1 (enum tree_code code, tree type,
-		   widest_int *val, widest_int *mask,
-		   tree r1type, const widest_int &r1val,
-		   const widest_int &r1mask, tree r2type,
-		   const widest_int &r2val, const widest_int &r2mask)
+void
+bit_value_binop (enum tree_code code, signop sgn, int width, 
+		 widest_int *val, widest_int *mask,
+		 signop r1type_sgn, int r1type_precision,
+		 const widest_int &r1val, const widest_int &r1mask,
+		 signop r2type_sgn, int r2type_precision,
+		 const widest_int &r2val, const widest_int &r2mask)
 {
-  signop sgn = TYPE_SIGN (type);
-  int width = TYPE_PRECISION (type);
   bool swap_p = false;
 
   /* Assume we'll get a constant result.  Use an initial non varying
@@ -1406,11 +1405,11 @@ bit_value_binop_1 (enum tree_code code, tree type,
     case MINUS_EXPR:
       {
 	widest_int temv, temm;
-	bit_value_unop_1 (NEGATE_EXPR, r2type, &temv, &temm,
-			  r2type, r2val, r2mask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   r1type, r1val, r1mask,
-			   r2type, temv, temm);
+	bit_value_unop (NEGATE_EXPR, r2type_sgn, r2type_precision, &temv, &temm,
+			  r2type_sgn, r2type_precision, r2val, r2mask);
+	bit_value_binop (PLUS_EXPR, sgn, width, val, mask,
+			 r1type_sgn, r1type_precision, r1val, r1mask,
+			 r2type_sgn, r2type_precision, temv, temm);
 	break;
       }
 
@@ -1472,7 +1471,7 @@ bit_value_binop_1 (enum tree_code code, tree type,
 	  break;
 
 	/* For comparisons the signedness is in the comparison operands.  */
-	sgn = TYPE_SIGN (r1type);
+	sgn = r1type_sgn;
 
 	/* If we know the most significant bits we know the values
 	   value ranges by means of treating varying bits as zero
@@ -1525,8 +1524,9 @@ bit_value_unop (enum tree_code code, tree type, tree rhs)
   gcc_assert ((rval.lattice_val == CONSTANT
 	       && TREE_CODE (rval.value) == INTEGER_CST)
 	      || wi::sext (rval.mask, TYPE_PRECISION (TREE_TYPE (rhs))) == -1);
-  bit_value_unop_1 (code, type, &value, &mask,
-		    TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask);
+  bit_value_unop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		  TYPE_SIGN (TREE_TYPE (rhs)), TYPE_PRECISION (TREE_TYPE (rhs)),
+		  value_to_wide_int (rval), rval.mask);
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1571,9 +1571,12 @@ bit_value_binop (enum tree_code code, tree type, tree rhs1, tree rhs2)
 	       && TREE_CODE (r2val.value) == INTEGER_CST)
 	      || wi::sext (r2val.mask,
 			   TYPE_PRECISION (TREE_TYPE (rhs2))) == -1);
-  bit_value_binop_1 (code, type, &value, &mask,
-		     TREE_TYPE (rhs1), value_to_wide_int (r1val), r1val.mask,
-		     TREE_TYPE (rhs2), value_to_wide_int (r2val), r2val.mask);
+  bit_value_binop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (TREE_TYPE (rhs1)), TYPE_PRECISION (TREE_TYPE (rhs1)),
+		   value_to_wide_int (r1val), r1val.mask,
+		   TYPE_SIGN (TREE_TYPE (rhs2)), TYPE_PRECISION (TREE_TYPE (rhs2)),
+		   value_to_wide_int (r2val), r2val.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1672,9 +1675,10 @@ bit_value_assume_aligned (gimple *stmt, tree attr, ccp_prop_value_t ptrval,
 
   align = build_int_cst_type (type, -aligni);
   alignval = get_value_for_expr (align, true);
-  bit_value_binop_1 (BIT_AND_EXPR, type, &value, &mask,
-		     type, value_to_wide_int (ptrval), ptrval.mask,
-		     type, value_to_wide_int (alignval), alignval.mask);
+  bit_value_binop (BIT_AND_EXPR, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (ptrval), ptrval.mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (alignval), alignval.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -2409,7 +2413,7 @@ do_ssa_ccp (bool nonzero_p)
 
   ccp_initialize ();
   ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
-  if (ccp_finalize (nonzero_p))
+  if (ccp_finalize (nonzero_p || flag_ipa_cp_bit))
     {
       todo = (TODO_cleanup_cfg | TODO_update_ssa);
 
diff --git a/gcc/tree-ssa-ccp.h b/gcc/tree-ssa-ccp.h
new file mode 100644
index 0000000..0e619c7
--- /dev/null
+++ b/gcc/tree-ssa-ccp.h
@@ -0,0 +1,29 @@
+/* Copyright (C) 2016-2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_SSA_CCP_H
+#define TREE_SSA_CCP_H
+
+void bit_value_binop (enum tree_code, signop, int, widest_int *, widest_int *,
+		      signop, int, const widest_int &, const widest_int &,
+		      signop, int, const widest_int &, const widest_int &);
+
+void bit_value_unop (enum tree_code, signop, int, widest_int *, widest_int *,
+		     signop, int, const widest_int &, const widest_int &);
+
+#endif
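
For readers following the <value, mask> encoding used throughout the patch, the meet in meet_with_1 and the conversion done in ipcp_update_bits can be sketched in isolation. This is only an illustration under stated assumptions, not part of the patch: uint64_t stands in for widest_int, the names bits, meet, nonzero_bits and demo are hypothetical, and unlike meet_with_1 the sketch also clears the value bits that become unknown.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the <value, mask> pair (uint64_t instead of widest_int).
// A 0 bit in mask means the corresponding bit of value is known.
struct bits
{
  uint64_t value;
  uint64_t mask;
};

// Meet two pairs: a bit stays known only if it is known on both sides
// and the two known values agree at that position.
static bits
meet (bits a, bits b)
{
  uint64_t mask = a.mask | b.mask | (a.value ^ b.value);
  return { a.value & ~mask, mask };
}

// Bits that may be nonzero: the unknown bits plus the known-one bits.
static uint64_t
nonzero_bits (bits b)
{
  return b.value | b.mask;
}

// Small walk-through: two constants, then an argument like x & 0xff.
static void
demo ()
{
  bits c1 = { 0x10, 0 };		// constant 0x10, all bits known
  bits c2 = { 0x30, 0 };		// constant 0x30, all bits known
  bits m = meet (c1, c2);
  assert (m.mask == 0x20);		// the constants differ only in bit 5
  assert (m.value == 0x10);
  assert (nonzero_bits (m) == 0x30);

  bits x = { 0, 0xff };			// only the low 8 bits may be set
  bits m2 = meet (m, x);
  assert (m2.mask == 0xff);
  assert (nonzero_bits (m2) == 0xff);
}
```

A mask of all ones within the precision corresponds to the bottom (varying) state, which is why the patch checks wi::sext (mask, precision) == -1 after each meet.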


* Re: [RFC] ipa bitwise constant propagation
  2016-08-10 11:35                 ` Prathamesh Kulkarni
@ 2016-08-11 12:55                   ` Jan Hubicka
  2016-08-12  9:54                     ` Prathamesh Kulkarni
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Hubicka @ 2016-08-11 12:55 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Richard Biener, Jan Hubicka, Kugan Vivekanandarajah, gcc Patches

> @@ -266,6 +267,38 @@ private:
>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
>  };
>  
> +/* Lattice of known bits, only capable of holding one value.
> +   Similar to ccp_lattice_t, mask represents which bits of value are constant.
> +   If a bit in mask is set to 0, then the corresponding bit in
> +   value is known to be constant.  */
> +
> +class ipcp_bits_lattice
> +{
> +public:
> +  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
> +  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
> +  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
> +  bool set_to_bottom ();
> +  bool set_to_constant (widest_int, widest_int); 
> + 
> +  widest_int get_value () { return m_value; }
> +  widest_int get_mask () { return m_mask; }
> +
> +  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
> +		  enum tree_code, tree);
> +
> +  bool meet_with (widest_int, widest_int, unsigned);
> +
> +  void print (FILE *);
> +
> +private:
> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
> +  widest_int m_value, m_mask;

Please add comment for these, like one in tree-ssa-ccp and mention they are the same
values.

> +
> +  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
> +     Avoid the transform for these cases.  */
> +  if (!parm_type)
> +    return dest_lattice->set_to_bottom ();

Please add TDF_DETAILS dump for this so we notice if we drop useful info for no
good reasons.  It also happens for variadic functions but hopefully not much more.

The patch is OK with those changes.

thanks,
Honza

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] ipa bitwise constant propagation
  2016-08-11 12:55                   ` Jan Hubicka
@ 2016-08-12  9:54                     ` Prathamesh Kulkarni
  2016-08-12 14:04                       ` Jan Hubicka
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-12  9:54 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, Kugan Vivekanandarajah, gcc Patches

On 11 August 2016 at 18:25, Jan Hubicka <hubicka@ucw.cz> wrote:
>> @@ -266,6 +267,38 @@ private:
>>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
>>  };
>>
>> +/* Lattice of known bits, only capable of holding one value.
>> +   Similar to ccp_lattice_t, mask represents which bits of value are constant.
>> +   If a bit in mask is set to 0, then the corresponding bit in
>> +   value is known to be constant.  */
>> +
>> +class ipcp_bits_lattice
>> +{
>> +public:
>> +  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
>> +  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
>> +  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
>> +  bool set_to_bottom ();
>> +  bool set_to_constant (widest_int, widest_int);
>> +
>> +  widest_int get_value () { return m_value; }
>> +  widest_int get_mask () { return m_mask; }
>> +
>> +  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
>> +               enum tree_code, tree);
>> +
>> +  bool meet_with (widest_int, widest_int, unsigned);
>> +
>> +  void print (FILE *);
>> +
>> +private:
>> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
>> +  widest_int m_value, m_mask;
>
> Please add comment for these, like one in tree-ssa-ccp and mention they are the same
> values.
>
>> +
>> +  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
>> +     Avoid the transform for these cases.  */
>> +  if (!parm_type)
>> +    return dest_lattice->set_to_bottom ();
>
> Please add TDF_DETAILS dump for this so we notice if we drop useful info for no
> good reasons.  It also happens for variadic functions but hopefully not much more.
>
> The patch is OK with those changes.
Hi,
The patch broke bootstrap: it segfaults in ipa_get_type() while compiling
libsupc++/eh_alloc.cc, because callee_info->descriptors had 0 length in
propagate_bits_accross_call.

After debugging a bit, I realized it was incorrect to use cs->callee;
using cs->callee->function_symbol() instead fixed it
(that seems to match the value of the 'callee' variable in
propagate_constants_accross_call):

-  struct ipa_node_params *callee_info = IPA_NODE_REF (cs->callee);
+  enum availability availability;
+  cgraph_node *callee = cs->callee->function_symbol (&availability);
+  struct ipa_node_params *callee_info = IPA_NODE_REF (callee);
   tree parm_type = ipa_get_type (callee_info, idx);

Similarly, I wonder whether cs->caller->function_symbol() should be used
instead of cs->caller in the following place, when obtaining the lattices
of the source param?

  if (jfunc->type == IPA_JF_PASS_THROUGH)
    {
      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);


The patch also segfaults with -flto on gcc.c-torture/execute/920302-1.c in
ipcp_propagate_stage () while populating info->descriptors[k].decl_or_type:
t becomes NULL and we dereference it with TREE_VALUE (t).
The test-case has K&R style param declarations.
The following change seems to fix it for me:

@@ -3235,7 +3235,7 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
     if (in_lto_p)
       {
        tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
-       for (int k = 0; k < ipa_get_param_count (info); k++)
+       for (int k = 0; k < ipa_get_param_count (info) && t; k++)
          {
            gcc_assert (t != void_list_node);
            info->descriptors[k].decl_or_type = TREE_VALUE (t);

Is that change OK ?
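
To illustrate the intent of the extra "&& t" check, here is a standalone sketch of the same loop shape. It is not GCC code: type_node and collect_types are made-up names, the linked list only mimics a TYPE_ARG_TYPES chain, and -1 is an arbitrary sentinel for "no type recorded".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a list node carrying one argument type.
struct type_node
{
  int type_id;
  const type_node *next;
};

// Record one type per parameter, stopping early when the type list is
// shorter than the parameter count (as with an unprototyped K&R
// definition).  Parameters without a matching entry keep the sentinel -1.
static std::vector<int>
collect_types (const type_node *t, std::size_t param_count)
{
  std::vector<int> out (param_count, -1);
  for (std::size_t k = 0; k < param_count && t; k++)
    {
      out[k] = t->type_id;	// safe: t is non-null here
      t = t->next;
    }
  return out;
}
```

With the guard, a missing type list simply leaves the remaining entries unset instead of dereferencing a null pointer; the later "if (!parm_type)" check in the propagation code is then what handles such parameters.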

PS: I am on vacation for next week, will get back to working on the
patch after returning.

Thanks,
Prathamesh
>
> thanks,
> Honza

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] ipa bitwise constant propagation
  2016-08-12  9:54                     ` Prathamesh Kulkarni
@ 2016-08-12 14:04                       ` Jan Hubicka
  2016-08-16 13:05                         ` Prathamesh Kulkarni
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Hubicka @ 2016-08-12 14:04 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Jan Hubicka, Richard Biener, Kugan Vivekanandarajah, gcc Patches

> On 11 August 2016 at 18:25, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> @@ -266,6 +267,38 @@ private:
> >>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
> >>  };
> >>
> >> +/* Lattice of known bits, only capable of holding one value.
> >> +   Similar to ccp_lattice_t, mask represents which bits of value are constant.
> >> +   If a bit in mask is set to 0, then the corresponding bit in
> >> +   value is known to be constant.  */
> >> +
> >> +class ipcp_bits_lattice
> >> +{
> >> +public:
> >> +  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
> >> +  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
> >> +  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
> >> +  bool set_to_bottom ();
> >> +  bool set_to_constant (widest_int, widest_int);
> >> +
> >> +  widest_int get_value () { return m_value; }
> >> +  widest_int get_mask () { return m_mask; }
> >> +
> >> +  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
> >> +               enum tree_code, tree);
> >> +
> >> +  bool meet_with (widest_int, widest_int, unsigned);
> >> +
> >> +  void print (FILE *);
> >> +
> >> +private:
> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
> >> +  widest_int m_value, m_mask;
> >
> > Please add comment for these, like one in tree-ssa-ccp and mention they are the same
> > values.
> >
> >> +
> >> +  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
> >> +     Avoid the transform for these cases.  */
> >> +  if (!parm_type)
> >> +    return dest_lattice->set_to_bottom ();
> >
> > Please add TDF_DETAILS dump for this so we notice if we drop useful info for no
> > good reasons.  It also happens for variadic functions but hopefully not much more.
> >
> > The patch is OK with those changes.
> Hi,
> The patch broke bootstrap due to segfault while compiling libsupc++/eh_alloc.cc
> in ipa_get_type() because callee_info->descriptors had 0 length in
> propagate_bits_accross_call.
> 
> After debugging a bit, I realized it was incorrect to use cs->callee and
> using cs->callee->function_symbol() fixed it:
> (that seemed to match with value of 'callee' variable in
> propagate_constants_accross_call).

Yes, the callee may be an alias, and in that case you want to look into its target.
> 
> -  struct ipa_node_params *callee_info = IPA_NODE_REF (cs->callee);
> +  enum availability availability;
> +  cgraph_node *callee = cs->callee->function_symbol (&availability);
> +  struct ipa_node_params *callee_info = IPA_NODE_REF (callee);
>    tree parm_type = ipa_get_type (callee_info, idx);
> 
> Similarly I wonder if cs->caller->function_symbol() should be used
> instead of cs->caller in following place while obtaining lattices of
> source param ?
> 
>   if (jfunc->type == IPA_JF_PASS_THROUGH)
>     {
>       struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);

For callers you do not need to do that, because only real functions
(not aliases) can make calls.
> 
> 
> The patch segfaults with -flto for gcc.c-torture/execute/920302-1.c in
> ipcp_propagate_stage ()
> while populating info->descriptors[k].decl_or_type because t becomes
> NULL and we dereference
> it with TREE_VALUE (t)
> The test-case has K&R style param declaration.
> The following change seems to fix it for me:
> 
> @@ -3235,7 +3235,7 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
>      if (in_lto_p)
>        {
>         tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
> -       for (int k = 0; k < ipa_get_param_count (info); k++)
> +       for (int k = 0; k < ipa_get_param_count (info) && t; k++)
>           {
>             gcc_assert (t != void_list_node);
>             info->descriptors[k].decl_or_type = TREE_VALUE (t);
> 
> Is that change OK ?

Yes, this also looks fine to me.

Honza
> 
> PS: I am on vacation for next week, will get back to working on the
> patch after returning.
> 
> Thanks,
> Prathamesh
> >
> > thanks,
> > Honza

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] ipa bitwise constant propagation
  2016-08-12 14:04                       ` Jan Hubicka
@ 2016-08-16 13:05                         ` Prathamesh Kulkarni
  2016-08-22 13:33                           ` Martin Jambor
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-16 13:05 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, Kugan Vivekanandarajah, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 6182 bytes --]

On 12 August 2016 at 19:33, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On 11 August 2016 at 18:25, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> @@ -266,6 +267,38 @@ private:
>> >>    bool meet_with_1 (unsigned new_align, unsigned new_misalign);
>> >>  };
>> >>
>> >> +/* Lattice of known bits, only capable of holding one value.
>> >> +   Similar to ccp_lattice_t, mask represents which bits of value are constant.
>> >> +   If a bit in mask is set to 0, then the corresponding bit in
>> >> +   value is known to be constant.  */
>> >> +
>> >> +class ipcp_bits_lattice
>> >> +{
>> >> +public:
>> >> +  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
>> >> +  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
>> >> +  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
>> >> +  bool set_to_bottom ();
>> >> +  bool set_to_constant (widest_int, widest_int);
>> >> +
>> >> +  widest_int get_value () { return m_value; }
>> >> +  widest_int get_mask () { return m_mask; }
>> >> +
>> >> +  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
>> >> +               enum tree_code, tree);
>> >> +
>> >> +  bool meet_with (widest_int, widest_int, unsigned);
>> >> +
>> >> +  void print (FILE *);
>> >> +
>> >> +private:
>> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
>> >> +  widest_int m_value, m_mask;
>> >
>> > Please add comment for these, like one in tree-ssa-ccp and mention they are the same
>> > values.
>> >
>> >> +
>> >> +  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
>> >> +     Avoid the transform for these cases.  */
>> >> +  if (!parm_type)
>> >> +    return dest_lattice->set_to_bottom ();
>> >
>> > Please add TDF_DETAILS dump for this so we notice if we drop useful info for no
>> > good reasons.  It also happens for variadic functions but hopefully not much more.
>> >
>> > The patch is OK with those changes.
>> Hi,
>> The patch broke bootstrap due to segfault while compiling libsupc++/eh_alloc.cc
>> in ipa_get_type() because callee_info->descriptors had 0 length in
>> propagate_bits_accross_call.
>>
>> After debugging a bit, I realized it was incorrect to use cs->callee and
>> using cs->callee->function_symbol() fixed it:
>> (that seemed to match with value of 'callee' variable in
>> propagate_constants_accross_call).
>
> Yes, callee may be alias and in that case you want to look into its target.
>>
>> -  struct ipa_node_params *callee_info = IPA_NODE_REF (cs->callee);
>> +  enum availability availability;
>> +  cgraph_node *callee = cs->callee->function_symbol (&availability);
>> +  struct ipa_node_params *callee_info = IPA_NODE_REF (callee);
>>    tree parm_type = ipa_get_type (callee_info, idx);
>>
>> Similarly I wonder if cs->caller->function_symbol() should be used
>> instead of cs->caller in following place while obtaining lattices of
>> source param ?
>>
>>   if (jfunc->type == IPA_JF_PASS_THROUGH)
>>     {
>>       struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
>
> For callers you do not need to do that, because only real functions can call
> (not aliases).
>>
>>
>> The patch segfaults with -flto for gcc.c-torture/execute/920302-1.c in
>> ipcp_propagate_stage ()
>> while populating info->descriptors[k].decl_or_type because t becomes
>> NULL and we dereference
>> it with TREE_VALUE (t)
>> The test-case has K&R style param declaration.
>> The following change seems to fix it for me:
>>
>> @@ -3235,7 +3235,7 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
>>      if (in_lto_p)
>>        {
>>         tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
>> -       for (int k = 0; k < ipa_get_param_count (info); k++)
>> +       for (int k = 0; k < ipa_get_param_count (info) && t; k++)
>>           {
>>             gcc_assert (t != void_list_node);
>>             info->descriptors[k].decl_or_type = TREE_VALUE (t);
>>
>> Is that change OK ?
>
> Yes, this also looks fine to me.
Thanks, I updated the patch to address these issues (attached).
However, the patch caused an ICE while testing
objc.dg/torture/forward-1.m (and a few others, all with the same ICE):

Command line options:
/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/xgcc
-B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/
/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/objc.dg/torture/forward-1.m
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects -fgnu-runtime
-I/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/../../libobjc
-B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
-L/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
-lobjc -lm -o ./forward-1.exe

Backtrace:
0x8c0ed2 ipa_get_param_decl_index_1
../../gcc/gcc/ipa-prop.c:106
0x8b7dbb will_be_nonconstant_predicate
../../gcc/gcc/ipa-inline-analysis.c:2110
0x8b7dbb estimate_function_body_sizes
../../gcc/gcc/ipa-inline-analysis.c:2739
0x8bae26 compute_inline_parameters(cgraph_node*, bool)
../../gcc/gcc/ipa-inline-analysis.c:3030
0x8bb309 inline_analyze_function(cgraph_node*)
../../gcc/gcc/ipa-inline-analysis.c:4157
0x11dc402 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
../../gcc/gcc/ipa-icf.c:1345
0x11d6334 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
../../gcc/gcc/ipa-icf.c:3461
0x11e12c6 ipa_icf::sem_item_optimizer::execute()
../../gcc/gcc/ipa-icf.c:2636
0x11e34d6 ipa_icf_driver
../../gcc/gcc/ipa-icf.c:3538
0x11e34d6 ipa_icf::pass_ipa_icf::execute(function*)
../../gcc/gcc/ipa-icf.c:3585

This appears to be due to the following assert in ipa_get_param_decl_index_1():
gcc_checking_assert (!flag_wpa);
which was added by Martin's patch introducing ipa_get_type().
Removing the assert works; however, I am not sure if that's the correct thing to do.
I would be grateful for suggestions on how to handle this case.

Thanks,
Prathamesh
>
> Honza
>>
>> PS: I am on vacation for next week, will get back to working on the
>> patch after returning.
>>
>> Thanks,
>> Prathamesh
>> >
>> > thanks,
>> > Honza
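
For readers of the attached patch, the meet used by the bits lattice can be
sketched standalone as below. This is a simplified model and not the patch
itself: it uses uint64_t where the patch uses widest_int, ToyBits and
toy_meet are hypothetical names, it ignores precision and the drop to bottom,
and it clears unknown bits in the value, whereas the patch leaves m_value
untouched and documents the propagated value as m_value & ~m_mask.

```cpp
#include <cassert>
#include <cstdint>

// Toy known-bits pair using 64-bit integers instead of widest_int.
// A 1 bit in MASK means "unknown"; a 0 bit means that bit of VALUE is known.
struct ToyBits {
  uint64_t value;
  uint64_t mask;
};

// Meet two known-bits descriptions: a bit stays known only if it is known
// on both sides and both sides agree on its value.
ToyBits toy_meet(ToyBits a, ToyBits b) {
  uint64_t mask = (a.mask | b.mask) | (a.value ^ b.value);
  return ToyBits{a.value & ~mask, mask};
}
```

With the example from the patch comment, meeting the argument of
f (y & 0xff) (value 0, mask 0xff) with that of f (y & 0xf) (value 0,
mask 0xf) leaves mask 0xff, so only the low 8 bits are unknown.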

[-- Attachment #2: bits-prop-5.diff --]
[-- Type: text/plain, Size: 36337 bytes --]

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a292ed..8bac0a2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1561,6 +1561,10 @@ fipa-cp-alignment
 Common Report Var(flag_ipa_cp_alignment) Optimization
 Perform alignment discovery and propagation to make Interprocedural constant propagation stronger.
 
+fipa-cp-bit
+Common Report Var(flag_ipa_cp_bit) Optimization
+Perform interprocedural bitwise constant propagation.
+
 fipa-profile
 Common Report Var(flag_ipa_profile) Init(0) Optimization
 Perform interprocedural profile propagation.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 5b6cb9a..b58fd05 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
 
 template <typename valtype> class ipcp_value;
 
@@ -266,6 +267,60 @@ private:
   bool meet_with_1 (unsigned new_align, unsigned new_misalign);
 };
 
+/* Lattice of known bits, only capable of holding one value.
+   Bitwise constant propagation propagates which bits of a
+   value are constant.
+   For example:
+   int f(int x)
+   {
+     return some_op (x);
+   }
+
+   int f1(int y)
+   {
+     if (cond)
+      return f (y & 0xff);
+     else
+      return f (y & 0xf);
+   }
+
+   In the above case, the param 'x' will always have all
+   the bits (except the lowest 8 bits) set to 0.
+   Hence the mask of 'x' would be 0xff. The mask
+   reflects that the lowest 8 bits are unknown.
+   The actual propagated value is given by m_value & ~m_mask.  */
+
+class ipcp_bits_lattice
+{
+public:
+  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
+  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
+  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
+  bool set_to_bottom ();
+  bool set_to_constant (widest_int, widest_int); 
+ 
+  widest_int get_value () { return m_value; }
+  widest_int get_mask () { return m_mask; }
+
+  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
+		  enum tree_code, tree);
+
+  bool meet_with (widest_int, widest_int, unsigned);
+
+  void print (FILE *);
+
+private:
+  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
+
+  /* Similar to ccp_lattice_t, mask represents which bits of value are constant.
+     If a bit in mask is set to 0, then the corresponding bit in
+     value is known to be constant.  */
+  widest_int m_value, m_mask;
+
+  bool meet_with_1 (widest_int, widest_int, unsigned); 
+  void get_value_and_mask (tree, widest_int *, widest_int *);
+}; 
+
 /* Structure containing lattices for a parameter itself and for pieces of
    aggregates that are passed in the parameter or by a reference in a parameter
    plus some other useful flags.  */
@@ -281,6 +336,8 @@ public:
   ipcp_agg_lattice *aggs;
   /* Lattice describing known alignment.  */
   ipcp_alignment_lattice alignment;
+  /* Lattice describing known bits.  */
+  ipcp_bits_lattice bits_lattice;
   /* Number of aggregate lattices */
   int aggs_count;
   /* True if aggregate data were passed by reference (as opposed to by
@@ -458,6 +515,21 @@ ipcp_alignment_lattice::print (FILE * f)
     fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
 }
 
+void
+ipcp_bits_lattice::print (FILE *f)
+{
+  if (top_p ())
+    fprintf (f, "         Bits unknown (TOP)\n");
+  else if (bottom_p ())
+    fprintf (f, "         Bits unusable (BOTTOM)\n");
+  else
+    {
+      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
+      fprintf (f, ", mask = "); print_hex (get_mask (), f);
+      fprintf (f, "\n");
+    }
+}
+
 /* Print all ipcp_lattices of all functions to F.  */
 
 static void
@@ -484,6 +556,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
 	  fprintf (f, "         ctxs: ");
 	  plats->ctxlat.print (f, dump_sources, dump_benefits);
 	  plats->alignment.print (f);
+	  plats->bits_lattice.print (f);
 	  if (plats->virt_call)
 	    fprintf (f, "        virt_call flag set\n");
 
@@ -911,6 +984,151 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
   return meet_with_1 (other.align, adjusted_misalign);
 }
 
+/* Set lattice value to bottom, if it already isn't the case.  */
+
+bool
+ipcp_bits_lattice::set_to_bottom ()
+{
+  if (bottom_p ())
+    return false;
+  m_lattice_val = IPA_BITS_VARYING;
+  m_value = 0;
+  m_mask = -1;
+  return true;
+}
+
+/* Set to constant if it isn't already. Only meant to be called
+   when switching state from TOP.  */
+
+bool
+ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask)
+{
+  gcc_assert (top_p ());
+  m_lattice_val = IPA_BITS_CONSTANT;
+  m_value = value;
+  m_mask = mask;
+  return true;
+}
+
+/* Convert operand to value, mask form.  */
+
+void
+ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
+{
+  wide_int get_nonzero_bits (const_tree);
+
+  if (TREE_CODE (operand) == INTEGER_CST)
+    {
+      *valuep = wi::to_widest (operand); 
+      *maskp = 0;
+    }
+  else
+    {
+      *valuep = 0;
+      *maskp = -1;
+    }
+}
+
+/* Meet operation, similar to ccp_lattice_meet: xor the values, and where
+   this->value and value differ at a bit position, drop that bit to varying.
+   Return true if the mask changed.
+   This function assumes that the lattice value is in CONSTANT state.  */
+
+bool
+ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask,
+				unsigned precision)
+{
+  gcc_assert (constant_p ());
+  
+  widest_int old_mask = m_mask; 
+  m_mask = (m_mask | mask) | (m_value ^ value);
+
+  if (wi::sext (m_mask, precision) == -1)
+    return set_to_bottom ();
+
+  return m_mask != old_mask;
+}
+
+/* Meet the bits lattice with operand
+   described by <value, mask, precision>.  */
+
+bool
+ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
+			      unsigned precision)
+{
+  if (bottom_p ())
+    return false;
+
+  if (top_p ())
+    {
+      if (wi::sext (mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (value, mask); 
+    }
+
+  return meet_with_1 (value, mask, precision);
+}
+
+/* Meet bits lattice with the result of bit_value_binop (other, operand)
+   if code is binary operation or bit_value_unop (other) if code is unary op.
+   In the case when code is nop_expr, no adjustment is required. */
+
+bool
+ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, unsigned precision,
+			      signop sgn, enum tree_code code, tree operand)
+{
+  if (other.bottom_p ())
+    return set_to_bottom ();
+
+  if (bottom_p () || other.top_p ())
+    return false;
+
+  widest_int adjusted_value, adjusted_mask;
+
+  if (TREE_CODE_CLASS (code) == tcc_binary)
+    {
+      tree type = TREE_TYPE (operand);
+      gcc_assert (INTEGRAL_TYPE_P (type));
+      widest_int o_value, o_mask;
+      get_value_and_mask (operand, &o_value, &o_mask);
+
+      bit_value_binop (code, sgn, precision, &adjusted_value, &adjusted_mask,
+		       sgn, precision, other.get_value (), other.get_mask (),
+		       TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (TREE_CODE_CLASS (code) == tcc_unary)
+    {
+      bit_value_unop (code, sgn, precision, &adjusted_value,
+		      &adjusted_mask, sgn, precision, other.get_value (),
+		      other.get_mask ());
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (code == NOP_EXPR)
+    {
+      adjusted_value = other.m_value;
+      adjusted_mask = other.m_mask;
+    }
+
+  else
+    return set_to_bottom ();
+
+  if (top_p ())
+    {
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (adjusted_value, adjusted_mask); 
+    }
+  else
+    return meet_with_1 (adjusted_value, adjusted_mask, precision);
+}
+
 /* Mark bot aggregate and scalar lattices as containing an unknown variable,
    return true is any of them has not been marked as such so far.  */
 
@@ -922,6 +1140,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
   ret |= plats->ctxlat.set_contains_variable ();
   ret |= set_agg_lats_contain_variable (plats);
   ret |= plats->alignment.set_to_bottom ();
+  ret |= plats->bits_lattice.set_to_bottom ();
   return ret;
 }
 
@@ -1003,6 +1222,7 @@ initialize_node_lattices (struct cgraph_node *node)
 	      plats->ctxlat.set_to_bottom ();
 	      set_agg_lats_to_bottom (plats);
 	      plats->alignment.set_to_bottom ();
+	      plats->bits_lattice.set_to_bottom ();
 	    }
 	  else
 	    set_all_contains_variable (plats);
@@ -1621,6 +1841,78 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
     }
 }
 
+/* Propagate bits across jfunc that is associated with
+   edge cs and update dest_lattice accordingly.  */
+
+bool
+propagate_bits_accross_jump_function (cgraph_edge *cs, int idx, ipa_jump_func *jfunc,
+				      ipcp_bits_lattice *dest_lattice)
+{
+  if (dest_lattice->bottom_p ())
+    return false;
+
+  enum availability availability;
+  cgraph_node *callee = cs->callee->function_symbol (&availability);
+  struct ipa_node_params *callee_info = IPA_NODE_REF (callee);
+  tree parm_type = ipa_get_type (callee_info, idx);
+
+  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
+     Avoid the transform for these cases.  */
+  if (!parm_type)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "Setting dest_lattice to bottom, because "
+			    "param %i type is NULL for %s\n", idx,
+			    cs->callee->name ());
+
+      return dest_lattice->set_to_bottom ();
+    }
+
+  unsigned precision = TYPE_PRECISION (parm_type);
+  signop sgn = TYPE_SIGN (parm_type);
+
+  if (jfunc->type == IPA_JF_PASS_THROUGH)
+    {
+      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
+      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
+      tree operand = NULL_TREE;
+
+      if (code != NOP_EXPR)
+	operand = ipa_get_jf_pass_through_operand (jfunc);
+
+      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
+      struct ipcp_param_lattices *src_lats
+	= ipa_get_parm_lattices (caller_info, src_idx);
+
+      /* Try to propagate bits if src_lattice is bottom, but jfunc is known.
+	 For example, consider:
+	 int f(int x)
+	 {
+	   g (x & 0xff);
+	 }
+	 Assume lattice for x is bottom, however we can still propagate
+	 result of x & 0xff == 0xff, which gets computed during ccp1 pass
+	 and we store it in jump function during analysis stage.  */
+
+      if (src_lats->bits_lattice.bottom_p ()
+	  && jfunc->bits.known)
+	return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+					precision);
+      else
+	return dest_lattice->meet_with (src_lats->bits_lattice, precision, sgn,
+					code, operand);
+    }
+
+  else if (jfunc->type == IPA_JF_ANCESTOR)
+    return dest_lattice->set_to_bottom ();
+
+  else if (jfunc->bits.known) 
+    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask, precision);
+  
+  else
+    return dest_lattice->set_to_bottom ();
+}
+
 /* If DEST_PLATS already has aggregate items, check that aggs_by_ref matches
    NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true (in all
    other cases, return false).  If there are no aggregate items, set
@@ -1968,6 +2260,8 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
 							  &dest_plats->ctxlat);
 	  ret |= propagate_alignment_accross_jump_function (cs, jump_func,
 							 &dest_plats->alignment);
+	  ret |= propagate_bits_accross_jump_function (cs, i, jump_func,
+						       &dest_plats->bits_lattice);
 	  ret |= propagate_aggs_accross_jump_function (cs, jump_func,
 						       dest_plats);
 	}
@@ -2936,6 +3230,19 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
   {
     struct ipa_node_params *info = IPA_NODE_REF (node);
 
+    /* In LTO we do not have PARM_DECLs but we would still like to be able to
+       look at types of parameters.  */
+    if (in_lto_p)
+      {
+	tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
+	for (int k = 0; k < ipa_get_param_count (info) && t; k++)
+	  {
+	    gcc_assert (t != void_list_node);
+	    info->descriptors[k].decl_or_type = TREE_VALUE (t);
+	    t = t ? TREE_CHAIN (t) : NULL;
+	  }
+      }
+
     determine_versionability (node, info);
     if (node->has_gimple_body_p ())
       {
@@ -4592,6 +4899,81 @@ ipcp_store_alignment_results (void)
   }
 }
 
+/* Look up all the bits information that we have discovered and copy it over
+   to the transformation summary.  */
+
+static void
+ipcp_store_bits_results (void)
+{
+  cgraph_node *node;
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      ipa_node_params *info = IPA_NODE_REF (node);
+      bool dumped_sth = false;
+      bool found_useful_result = false;
+
+      if (!opt_for_fn (node->decl, flag_ipa_cp_bit))
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Not considering %s for ipa bitwise propagation"
+				"; -fipa-cp-bit: disabled.\n",
+				node->name ());
+	  continue;
+	}
+
+      if (info->ipcp_orig_node)
+	info = IPA_NODE_REF (info->ipcp_orig_node);
+
+      unsigned count = ipa_get_param_count (info);
+      for (unsigned i = 0; i < count; i++)
+	{
+	  ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	  if (plats->bits_lattice.constant_p ())
+	    {
+	      found_useful_result = true;
+	      break;
+	    }
+	}
+
+    if (!found_useful_result)
+      continue;
+
+    ipcp_grow_transformations_if_necessary ();
+    ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+    vec_safe_reserve_exact (ts->bits, count);
+
+    for (unsigned i = 0; i < count; i++)
+      {
+	ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	ipa_bits bits_jfunc;			 
+
+	if (plats->bits_lattice.constant_p ())
+	  {
+	    bits_jfunc.known = true;
+	    bits_jfunc.value = plats->bits_lattice.get_value ();
+	    bits_jfunc.mask = plats->bits_lattice.get_mask ();
+	  }
+	else
+	  bits_jfunc.known = false;
+
+	ts->bits->quick_push (bits_jfunc);
+	if (!dump_file || !bits_jfunc.known)
+	  continue;
+	if (!dumped_sth)
+	  {
+	    fprintf (dump_file, "Propagated bits info for function %s/%i:\n",
+				node->name (), node->order);
+	    dumped_sth = true;
+	  }
+	fprintf (dump_file, " param %i: value = ", i);
+	print_hex (bits_jfunc.value, dump_file);
+	fprintf (dump_file, ", mask = ");
+	print_hex (bits_jfunc.mask, dump_file);
+	fprintf (dump_file, "\n");
+      }
+    }
+}
 /* The IPCP driver.  */
 
 static unsigned int
@@ -4625,6 +5007,8 @@ ipcp_driver (void)
   ipcp_decision_stage (&topo);
   /* Store results of alignment propagation. */
   ipcp_store_alignment_results ();
+  /* Store results of bits propagation.  */
+  ipcp_store_bits_results ();
 
   /* Free all IPCP structures.  */
   free_toporder_info (&topo);
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 4385614..44ec20a 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -103,9 +103,10 @@ ipa_get_param_decl_index_1 (vec<ipa_param_descriptor> descriptors, tree ptree)
 {
   int i, count;
 
+  gcc_checking_assert (!flag_wpa);
   count = descriptors.length ();
   for (i = 0; i < count; i++)
-    if (descriptors[i].decl == ptree)
+    if (descriptors[i].decl_or_type == ptree)
       return i;
 
   return -1;
@@ -138,7 +139,7 @@ ipa_populate_param_decls (struct cgraph_node *node,
   param_num = 0;
   for (parm = fnargs; parm; parm = DECL_CHAIN (parm))
     {
-      descriptors[param_num].decl = parm;
+      descriptors[param_num].decl_or_type = parm;
       descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE (parm),
 							     true);
       param_num++;
@@ -168,10 +169,10 @@ void
 ipa_dump_param (FILE *file, struct ipa_node_params *info, int i)
 {
   fprintf (file, "param #%i", i);
-  if (info->descriptors[i].decl)
+  if (info->descriptors[i].decl_or_type)
     {
       fprintf (file, " ");
-      print_generic_expr (file, info->descriptors[i].decl, 0);
+      print_generic_expr (file, info->descriptors[i].decl_or_type, 0);
     }
 }
 
@@ -302,6 +303,15 @@ ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs)
 	}
       else
 	fprintf (f, "         Unknown alignment\n");
+
+      if (jump_func->bits.known)
+	{
+	  fprintf (f, "         value: "); print_hex (jump_func->bits.value, f);
+	  fprintf (f, ", mask: "); print_hex (jump_func->bits.mask, f);
+	  fprintf (f, "\n");
+	}
+      else
+	fprintf (f, "         Unknown bits\n");
     }
 }
 
@@ -381,6 +391,7 @@ ipa_set_jf_unknown (struct ipa_jump_func *jfunc)
 {
   jfunc->type = IPA_JF_UNKNOWN;
   jfunc->alignment.known = false;
+  jfunc->bits.known = false;
 }
 
 /* Set JFUNC to be a copy of another jmp (to be used by jump function
@@ -1674,6 +1685,26 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
       else
 	gcc_assert (!jfunc->alignment.known);
 
+      if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
+	{
+	  jfunc->bits.known = true;
+	  
+	  if (TREE_CODE (arg) == SSA_NAME)
+	    {
+	      jfunc->bits.value = 0;
+	      jfunc->bits.mask = widest_int::from (get_nonzero_bits (arg),
+						   TYPE_SIGN (TREE_TYPE (arg)));
+	    }
+	  else
+	    {
+	      jfunc->bits.value = wi::to_widest (arg);
+	      jfunc->bits.mask = 0;
+	    }
+	}
+      else
+	gcc_assert (!jfunc->bits.known);
+
       if (is_gimple_ip_invariant (arg)
 	  || (TREE_CODE (arg) == VAR_DECL
 	      && is_global_var (arg)
@@ -3690,6 +3721,18 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst,
       for (unsigned i = 0; i < src_alignments->length (); ++i)
 	dst_alignments->quick_push ((*src_alignments)[i]);
     }
+
+  if (src_trans && vec_safe_length (src_trans->bits) > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      src_trans = ipcp_get_transformation_summary (src);
+      const vec<ipa_bits, va_gc> *src_bits = src_trans->bits;
+      vec<ipa_bits, va_gc> *&dst_bits
+	= ipcp_get_transformation_summary (dst)->bits;
+      vec_safe_reserve_exact (dst_bits, src_bits->length ());
+      for (unsigned i = 0; i < src_bits->length (); ++i)
+	dst_bits->quick_push ((*src_bits)[i]);
+    }
 }
 
 /* Register our cgraph hooks if they are not already there.  */
@@ -4609,6 +4652,15 @@ ipa_write_jump_function (struct output_block *ob,
       streamer_write_uhwi (ob, jump_func->alignment.align);
       streamer_write_uhwi (ob, jump_func->alignment.misalign);
     }
+
+  bp = bitpack_create (ob->main_stream);
+  bp_pack_value (&bp, jump_func->bits.known, 1);
+  streamer_write_bitpack (&bp);
+  if (jump_func->bits.known)
+    {
+      streamer_write_widest_int (ob, jump_func->bits.value);
+      streamer_write_widest_int (ob, jump_func->bits.mask);
+    }   
 }
 
 /* Read in jump function JUMP_FUNC from IB.  */
@@ -4685,6 +4737,17 @@ ipa_read_jump_function (struct lto_input_block *ib,
     }
   else
     jump_func->alignment.known = false;
+
+  bp = streamer_read_bitpack (ib);
+  bool bits_known = bp_unpack_value (&bp, 1);
+  if (bits_known)
+    {
+      jump_func->bits.known = true;
+      jump_func->bits.value = streamer_read_widest_int (ib);
+      jump_func->bits.mask = streamer_read_widest_int (ib);
+    }
+  else
+    jump_func->bits.known = false;
 }
 
 /* Stream out parts of cgraph_indirect_call_info corresponding to CS that are
@@ -5050,6 +5113,28 @@ write_ipcp_transformation_info (output_block *ob, cgraph_node *node)
     }
   else
     streamer_write_uhwi (ob, 0);
+
+  ts = ipcp_get_transformation_summary (node);
+  if (ts && vec_safe_length (ts->bits) > 0)
+    {
+      count = ts->bits->length ();
+      streamer_write_uhwi (ob, count);
+
+      for (unsigned i = 0; i < count; ++i)
+	{
+	  const ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = bitpack_create (ob->main_stream);
+	  bp_pack_value (&bp, bits_jfunc.known, 1);
+	  streamer_write_bitpack (&bp);
+	  if (bits_jfunc.known)
+	    {
+	      streamer_write_widest_int (ob, bits_jfunc.value);
+	      streamer_write_widest_int (ob, bits_jfunc.mask);
+	    }
+	}
+    }
+  else
+    streamer_write_uhwi (ob, 0);
 }
 
 /* Stream in the aggregate value replacement chain for NODE from IB.  */
@@ -5102,6 +5187,26 @@ read_ipcp_transformation_info (lto_input_block *ib, cgraph_node *node,
 	    }
 	}
     }
+
+  count = streamer_read_uhwi (ib);
+  if (count > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+      vec_safe_grow_cleared (ts->bits, count);
+
+      for (i = 0; i < count; i++)
+	{
+	  ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = streamer_read_bitpack (ib);
+	  bits_jfunc.known = bp_unpack_value (&bp, 1);
+	  if (bits_jfunc.known)
+	    {
+	      bits_jfunc.value = streamer_read_widest_int (ib);
+	      bits_jfunc.mask = streamer_read_widest_int (ib);
+	    }
+	}
+    }
 }
 
 /* Write all aggregate replacement for nodes in set.  */
@@ -5404,6 +5509,56 @@ ipcp_update_alignments (struct cgraph_node *node)
     }
 }
 
+/* Update bits info of formal parameters as described in
+   ipcp_transformation_summary.  */
+
+static void
+ipcp_update_bits (struct cgraph_node *node)
+{
+  tree parm = DECL_ARGUMENTS (node->decl);
+  tree next_parm = parm;
+  ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+
+  if (!ts || vec_safe_length (ts->bits) == 0)
+    return;
+
+  vec<ipa_bits, va_gc> &bits = *ts->bits;
+  unsigned count = bits.length ();
+
+  for (unsigned i = 0; i < count; ++i, parm = next_parm)
+    {
+      if (node->clone.combined_args_to_skip
+	  && bitmap_bit_p (node->clone.combined_args_to_skip, i))
+	continue;
+
+      gcc_checking_assert (parm);
+      next_parm = DECL_CHAIN (parm);
+
+      if (!bits[i].known
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (parm))
+	  || !is_gimple_reg (parm))
+	continue;       
+
+      tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm);
+      if (!ddef)
+	continue;
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Adjusting mask for param %u to ", i); 
+	  print_hex (bits[i].mask, dump_file);
+	  fprintf (dump_file, "\n");
+	}
+
+      unsigned prec = TYPE_PRECISION (TREE_TYPE (ddef));
+      signop sgn = TYPE_SIGN (TREE_TYPE (ddef));
+
+      wide_int nonzero_bits = wide_int::from (bits[i].mask, prec, UNSIGNED)
+			      | wide_int::from (bits[i].value, prec, sgn);
+      set_nonzero_bits (ddef, nonzero_bits);
+    }
+}
+
 /* IPCP transformation phase doing propagation of aggregate values.  */
 
 unsigned int
@@ -5423,6 +5578,7 @@ ipcp_transform_function (struct cgraph_node *node)
 	     node->name (), node->order);
 
   ipcp_update_alignments (node);
+  ipcp_update_bits (node);
   aggval = ipa_get_agg_replacements_for_node (node);
   if (!aggval)
       return 0;
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index e32d078..e5a56da 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -154,6 +154,19 @@ struct GTY(()) ipa_alignment
   unsigned misalign;
 };
 
+/* Information about zero/non-zero bits.  */
+struct GTY(()) ipa_bits
+{
+  /* The propagated value.  */
+  widest_int value;
+  /* Mask corresponding to the value.
+     As in ccp_lattice_t, if the xth bit of mask is 0,
+     the xth bit of value is constant.  */
+  widest_int mask;
+  /* True if jump function is known.  */
+  bool known;
+};
+
 /* A jump function for a callsite represents the values passed as actual
    arguments of the callsite. See enum jump_func_type for the various
    types of jump functions supported.  */
@@ -166,6 +179,9 @@ struct GTY (()) ipa_jump_func
   /* Information about alignment of pointers. */
   struct ipa_alignment alignment;
 
+  /* Information about zero/non-zero bits.  */
+  struct ipa_bits bits;
+
   enum jump_func_type type;
   /* Represents a value of a jump function.  pass_through is used only in jump
      function context.  constant represents the actual constant in constant jump
@@ -283,8 +299,11 @@ ipa_get_jf_ancestor_type_preserved (struct ipa_jump_func *jfunc)
 
 struct ipa_param_descriptor
 {
-  /* PARAM_DECL of this parameter.  */
-  tree decl;
+  /* In analysis and modification phase, this is the PARAM_DECL of this
+     parameter, in IPA LTO phase, this is the type of the described
+     parameter or NULL if not known.  Do not read this field directly but
+     through ipa_get_param and ipa_get_type as appropriate.  */
+  tree decl_or_type;
   /* If all uses of the parameter are described by ipa-prop structures, this
      says how many there are.  If any use could not be described by means of
      ipa-prop structures, this is IPA_UNDESCRIBED_USE.  */
@@ -402,13 +421,31 @@ ipa_get_param_count (struct ipa_node_params *info)
 
 /* Return the declaration of Ith formal parameter of the function corresponding
    to INFO.  Note there is no setter function as this array is built just once
-   using ipa_initialize_node_params. */
+   using ipa_initialize_node_params.  This function should not be called in
+   WPA.  */
 
 static inline tree
 ipa_get_param (struct ipa_node_params *info, int i)
 {
   gcc_checking_assert (!flag_wpa);
-  return info->descriptors[i].decl;
+  tree t = info->descriptors[i].decl_or_type;
+  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
+  return t;
+}
+
+/* Return the type of Ith formal parameter of the function corresponding
+   to INFO if it is known or NULL if not.  */
+
+static inline tree
+ipa_get_type (struct ipa_node_params *info, int i)
+{
+  tree t = info->descriptors[i].decl_or_type;
+  if (!t)
+    return NULL;
+  if (TYPE_P (t))
+    return t;
+  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
+  return TREE_TYPE (t);
 }
 
 /* Return the move cost of Ith formal parameter of the function corresponding
@@ -482,6 +519,8 @@ struct GTY(()) ipcp_transformation_summary
   ipa_agg_replacement_value *agg_values;
   /* Alignment information for pointers.  */
   vec<ipa_alignment, va_gc> *alignments;
+  /* Known bits information.  */
+  vec<ipa_bits, va_gc> *bits;
 };
 
 void ipa_set_node_agg_value_chain (struct cgraph_node *node,
diff --git a/gcc/opts.c b/gcc/opts.c
index 4053fb1..cde9a7b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -505,6 +505,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp_alignment, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fipa_cp_bit, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_sra, NULL, 1 },
@@ -1422,6 +1423,9 @@ enable_fdo_optimizations (struct gcc_options *opts,
   if (!opts_set->x_flag_ipa_cp_alignment
       && value && opts->x_flag_ipa_cp)
     opts->x_flag_ipa_cp_alignment = value;
+  if (!opts_set->x_flag_ipa_cp_bit
+      && value && opts->x_flag_ipa_cp)
+    opts->x_flag_ipa_cp_bit = value;
   if (!opts_set->x_flag_predictive_commoning)
     opts->x_flag_predictive_commoning = value;
   if (!opts_set->x_flag_unswitch_loops)
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 5d5386e..d88143b 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -142,7 +142,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "stor-layout.h"
 #include "optabs-query.h"
-
+#include "tree-ssa-ccp.h"
 
 /* Possible lattice values.  */
 typedef enum
@@ -536,9 +536,9 @@ set_lattice_value (tree var, ccp_prop_value_t *new_val)
 
 static ccp_prop_value_t get_value_for_expr (tree, bool);
 static ccp_prop_value_t bit_value_binop (enum tree_code, tree, tree, tree);
-static void bit_value_binop_1 (enum tree_code, tree, widest_int *, widest_int *,
-			       tree, const widest_int &, const widest_int &,
-			       tree, const widest_int &, const widest_int &);
+void bit_value_binop (enum tree_code, signop, int, widest_int *, widest_int *,
+		      signop, int, const widest_int &, const widest_int &,
+		      signop, int, const widest_int &, const widest_int &);
 
 /* Return a widest_int that can be used for bitwise simplifications
    from VAL.  */
@@ -894,7 +894,7 @@ do_dbg_cnt (void)
    Return TRUE when something was optimized.  */
 
 static bool
-ccp_finalize (bool nonzero_p)
+ccp_finalize (bool nonzero_p) 
 {
   bool something_changed;
   unsigned i;
@@ -920,7 +920,8 @@ ccp_finalize (bool nonzero_p)
 
       val = get_value (name);
       if (val->lattice_val != CONSTANT
-	  || TREE_CODE (val->value) != INTEGER_CST)
+	  || TREE_CODE (val->value) != INTEGER_CST
+	  || val->mask == 0)
 	continue;
 
       if (POINTER_TYPE_P (TREE_TYPE (name)))
@@ -1224,10 +1225,11 @@ ccp_fold (gimple *stmt)
    RVAL and RMASK representing a value of type RTYPE and set
    the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_unop_1 (enum tree_code code, tree type,
-		  widest_int *val, widest_int *mask,
-		  tree rtype, const widest_int &rval, const widest_int &rmask)
+void
+bit_value_unop (enum tree_code code, signop type_sgn, int type_precision, 
+		widest_int *val, widest_int *mask,
+		signop rtype_sgn, int rtype_precision,
+		const widest_int &rval, const widest_int &rmask)
 {
   switch (code)
     {
@@ -1240,25 +1242,23 @@ bit_value_unop_1 (enum tree_code code, tree type,
       {
 	widest_int temv, temm;
 	/* Return ~rval + 1.  */
-	bit_value_unop_1 (BIT_NOT_EXPR, type, &temv, &temm, type, rval, rmask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   type, temv, temm, type, 1, 0);
+	bit_value_unop (BIT_NOT_EXPR, type_sgn, type_precision, &temv, &temm,
+			type_sgn, type_precision, rval, rmask);
+	bit_value_binop (PLUS_EXPR, type_sgn, type_precision, val, mask,
+			 type_sgn, type_precision, temv, temm,
+			 type_sgn, type_precision, 1, 0);
 	break;
       }
 
     CASE_CONVERT:
       {
-	signop sgn;
-
 	/* First extend mask and value according to the original type.  */
-	sgn = TYPE_SIGN (rtype);
-	*mask = wi::ext (rmask, TYPE_PRECISION (rtype), sgn);
-	*val = wi::ext (rval, TYPE_PRECISION (rtype), sgn);
+	*mask = wi::ext (rmask, rtype_precision, rtype_sgn);
+	*val = wi::ext (rval, rtype_precision, rtype_sgn);
 
 	/* Then extend mask and value according to the target type.  */
-	sgn = TYPE_SIGN (type);
-	*mask = wi::ext (*mask, TYPE_PRECISION (type), sgn);
-	*val = wi::ext (*val, TYPE_PRECISION (type), sgn);
+	*mask = wi::ext (*mask, type_precision, type_sgn);
+	*val = wi::ext (*val, type_precision, type_sgn);
 	break;
       }
 
@@ -1272,15 +1272,14 @@ bit_value_unop_1 (enum tree_code code, tree type,
    R1VAL, R1MASK and R2VAL, R2MASK representing a values of type R1TYPE
    and R2TYPE and set the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_binop_1 (enum tree_code code, tree type,
-		   widest_int *val, widest_int *mask,
-		   tree r1type, const widest_int &r1val,
-		   const widest_int &r1mask, tree r2type,
-		   const widest_int &r2val, const widest_int &r2mask)
+void
+bit_value_binop (enum tree_code code, signop sgn, int width, 
+		 widest_int *val, widest_int *mask,
+		 signop r1type_sgn, int r1type_precision,
+		 const widest_int &r1val, const widest_int &r1mask,
+		 signop r2type_sgn, int r2type_precision,
+		 const widest_int &r2val, const widest_int &r2mask)
 {
-  signop sgn = TYPE_SIGN (type);
-  int width = TYPE_PRECISION (type);
   bool swap_p = false;
 
   /* Assume we'll get a constant result.  Use an initial non varying
@@ -1406,11 +1405,11 @@ bit_value_binop_1 (enum tree_code code, tree type,
     case MINUS_EXPR:
       {
 	widest_int temv, temm;
-	bit_value_unop_1 (NEGATE_EXPR, r2type, &temv, &temm,
-			  r2type, r2val, r2mask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   r1type, r1val, r1mask,
-			   r2type, temv, temm);
+	bit_value_unop (NEGATE_EXPR, r2type_sgn, r2type_precision, &temv, &temm,
+			  r2type_sgn, r2type_precision, r2val, r2mask);
+	bit_value_binop (PLUS_EXPR, sgn, width, val, mask,
+			 r1type_sgn, r1type_precision, r1val, r1mask,
+			 r2type_sgn, r2type_precision, temv, temm);
 	break;
       }
 
@@ -1472,7 +1471,7 @@ bit_value_binop_1 (enum tree_code code, tree type,
 	  break;
 
 	/* For comparisons the signedness is in the comparison operands.  */
-	sgn = TYPE_SIGN (r1type);
+	sgn = r1type_sgn;
 
 	/* If we know the most significant bits we know the values
 	   value ranges by means of treating varying bits as zero
@@ -1525,8 +1524,9 @@ bit_value_unop (enum tree_code code, tree type, tree rhs)
   gcc_assert ((rval.lattice_val == CONSTANT
 	       && TREE_CODE (rval.value) == INTEGER_CST)
 	      || wi::sext (rval.mask, TYPE_PRECISION (TREE_TYPE (rhs))) == -1);
-  bit_value_unop_1 (code, type, &value, &mask,
-		    TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask);
+  bit_value_unop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		  TYPE_SIGN (TREE_TYPE (rhs)), TYPE_PRECISION (TREE_TYPE (rhs)),
+		  value_to_wide_int (rval), rval.mask);
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1571,9 +1571,12 @@ bit_value_binop (enum tree_code code, tree type, tree rhs1, tree rhs2)
 	       && TREE_CODE (r2val.value) == INTEGER_CST)
 	      || wi::sext (r2val.mask,
 			   TYPE_PRECISION (TREE_TYPE (rhs2))) == -1);
-  bit_value_binop_1 (code, type, &value, &mask,
-		     TREE_TYPE (rhs1), value_to_wide_int (r1val), r1val.mask,
-		     TREE_TYPE (rhs2), value_to_wide_int (r2val), r2val.mask);
+  bit_value_binop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (TREE_TYPE (rhs1)), TYPE_PRECISION (TREE_TYPE (rhs1)),
+		   value_to_wide_int (r1val), r1val.mask,
+		   TYPE_SIGN (TREE_TYPE (rhs2)), TYPE_PRECISION (TREE_TYPE (rhs2)),
+		   value_to_wide_int (r2val), r2val.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1672,9 +1675,10 @@ bit_value_assume_aligned (gimple *stmt, tree attr, ccp_prop_value_t ptrval,
 
   align = build_int_cst_type (type, -aligni);
   alignval = get_value_for_expr (align, true);
-  bit_value_binop_1 (BIT_AND_EXPR, type, &value, &mask,
-		     type, value_to_wide_int (ptrval), ptrval.mask,
-		     type, value_to_wide_int (alignval), alignval.mask);
+  bit_value_binop (BIT_AND_EXPR, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (ptrval), ptrval.mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (alignval), alignval.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -2409,7 +2413,7 @@ do_ssa_ccp (bool nonzero_p)
 
   ccp_initialize ();
   ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
-  if (ccp_finalize (nonzero_p))
+  if (ccp_finalize (nonzero_p || flag_ipa_cp_bit))
     {
       todo = (TODO_cleanup_cfg | TODO_update_ssa);
 
diff --git a/gcc/tree-ssa-ccp.h b/gcc/tree-ssa-ccp.h
new file mode 100644
index 0000000..0e619c7
--- /dev/null
+++ b/gcc/tree-ssa-ccp.h
@@ -0,0 +1,29 @@
+/* Copyright (C) 2016-2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_SSA_CCP_H
+#define TREE_SSA_CCP_H
+
+void bit_value_binop (enum tree_code, signop, int, widest_int *, widest_int *,
+		      signop, int, const widest_int &, const widest_int &,
+		      signop, int, const widest_int &, const widest_int &);
+
+void bit_value_unop (enum tree_code, signop, int, widest_int *, widest_int *,
+		     signop, int, const widest_int &, const widest_int &);
+
+#endif
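
Note on ipcp_update_bits above: it hands set_nonzero_bits the union of
the propagated mask and value, truncated to the parameter's precision.
The sketch below illustrates that arithmetic only. It uses 64-bit words
in place of widest_int/wide_int, and the names truncate_to, nonzero_bits
and demo_nonzero_bits are hypothetical, not GCC functions. It also
assumes wide_int::from acts as plain truncation here, which holds when
the target precision is no wider than the source.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low PREC bits of X (assumes 1 <= PREC <= 64).
constexpr std::uint64_t
truncate_to (std::uint64_t x, unsigned prec)
{
  return prec >= 64 ? x : x & ((std::uint64_t{1} << prec) - 1);
}

// A bit may be nonzero if it is unknown (mask bit set) or if it is
// known to be 1 (value bit set).  Every other bit is known to be zero.
constexpr std::uint64_t
nonzero_bits (std::uint64_t value, std::uint64_t mask, unsigned prec)
{
  return truncate_to (mask | value, prec);
}

// Value 0 with the low 8 bits unknown: only those 8 bits may be set.
static_assert (nonzero_bits (0, 0xff, 32) == 0xff, "low byte only");
// Bit 8 known to be 1, low 4 bits unknown.
static_assert (nonzero_bits (0x100, 0x0f, 32) == 0x10f, "known one kept");
// Everything unknown, 8-bit parameter: result is clipped to 8 bits.
static_assert (nonzero_bits (0, ~std::uint64_t{0}, 8) == 0xff, "clipped");

// Runtime usage example; call it from main () to exercise the sketch.
inline void
demo_nonzero_bits ()
{
  assert (nonzero_bits (0x3, 0x30, 16) == 0x33);
}
```

The result is an over-approximation: a bit that is clear in it is
guaranteed to be zero, while a set bit only means "possibly nonzero".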

* Re: [RFC] ipa bitwise constant propagation
  2016-08-16 13:05                         ` Prathamesh Kulkarni
@ 2016-08-22 13:33                           ` Martin Jambor
  2016-08-22 13:55                             ` Prathamesh Kulkarni
  0 siblings, 1 reply; 31+ messages in thread
From: Martin Jambor @ 2016-08-22 13:33 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Jan Hubicka, Richard Biener, Kugan Vivekanandarajah, gcc Patches

Hi,

On Tue, Aug 16, 2016 at 06:34:48PM +0530, Prathamesh Kulkarni wrote:
> Thanks, I updated the patch to address these issues (attached).
> However the patch caused ICE during testing
> objc.dg/torture/forward-1.m (and few others but with same ICE):
> 
> Command line options:
> /home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/xgcc
> -B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/
> /home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/objc.dg/torture/forward-1.m
> -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -flto
> -fuse-linker-plugin -fno-fat-lto-objects -fgnu-runtime
> -I/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/../../libobjc
> -B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
> -L/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
> -lobjc -lm -o ./forward-1.exe
> 
> Backtrace:
> 0x8c0ed2 ipa_get_param_decl_index_1
> ../../gcc/gcc/ipa-prop.c:106
> 0x8b7dbb will_be_nonconstant_predicate
> ../../gcc/gcc/ipa-inline-analysis.c:2110
> 0x8b7dbb estimate_function_body_sizes
> ../../gcc/gcc/ipa-inline-analysis.c:2739
> 0x8bae26 compute_inline_parameters(cgraph_node*, bool)
> ../../gcc/gcc/ipa-inline-analysis.c:3030
> 0x8bb309 inline_analyze_function(cgraph_node*)
> ../../gcc/gcc/ipa-inline-analysis.c:4157
> 0x11dc402 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
> ../../gcc/gcc/ipa-icf.c:1345
> 0x11d6334 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
> ../../gcc/gcc/ipa-icf.c:3461
> 0x11e12c6 ipa_icf::sem_item_optimizer::execute()
> ../../gcc/gcc/ipa-icf.c:2636
> 0x11e34d6 ipa_icf_driver
> ../../gcc/gcc/ipa-icf.c:3538
> 0x11e34d6 ipa_icf::pass_ipa_icf::execute(function*)
> ../../gcc/gcc/ipa-icf.c:3585
> 
> This appears due to following assert in ipa_get_param_decl_index_1():
> gcc_checking_assert (!flag_wpa);
> which was added by Martin's patch introducing ipa_get_type().
> Removing the assert works, however I am not sure if that's the correct thing.
> I would be grateful for suggestions on how to handle this case.
> 

I wrote that the patch was not really tested, I did not think about
ICF loading bodies and re-running body-analyses at WPO time.
Nevertheless, after some consideration, I think that just removing the
assert is fine.  After all, the caller must have passed a PARM_DECL if
it is doing anything sensible at all and that means we have access to
the function body.

Thanks,

Martin

* Re: [RFC] ipa bitwise constant propagation
  2016-08-22 13:33                           ` Martin Jambor
@ 2016-08-22 13:55                             ` Prathamesh Kulkarni
  2016-08-24 12:07                               ` Prathamesh Kulkarni
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-22 13:55 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Jan Hubicka, Richard Biener,
	Kugan Vivekanandarajah, gcc Patches

On 22 August 2016 at 19:03, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Tue, Aug 16, 2016 at 06:34:48PM +0530, Prathamesh Kulkarni wrote:
>> Thanks, I updated the patch to address these issues (attached).
>> However the patch caused ICE during testing
>> objc.dg/torture/forward-1.m (and few others but with same ICE):
>>
>> Command line options:
>> /home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/xgcc
>> -B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/
>> /home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/objc.dg/torture/forward-1.m
>> -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -flto
>> -fuse-linker-plugin -fno-fat-lto-objects -fgnu-runtime
>> -I/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/../../libobjc
>> -B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
>> -L/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
>> -lobjc -lm -o ./forward-1.exe
>>
>> Backtrace:
>> 0x8c0ed2 ipa_get_param_decl_index_1
>> ../../gcc/gcc/ipa-prop.c:106
>> 0x8b7dbb will_be_nonconstant_predicate
>> ../../gcc/gcc/ipa-inline-analysis.c:2110
>> 0x8b7dbb estimate_function_body_sizes
>> ../../gcc/gcc/ipa-inline-analysis.c:2739
>> 0x8bae26 compute_inline_parameters(cgraph_node*, bool)
>> ../../gcc/gcc/ipa-inline-analysis.c:3030
>> 0x8bb309 inline_analyze_function(cgraph_node*)
>> ../../gcc/gcc/ipa-inline-analysis.c:4157
>> 0x11dc402 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
>> ../../gcc/gcc/ipa-icf.c:1345
>> 0x11d6334 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
>> ../../gcc/gcc/ipa-icf.c:3461
>> 0x11e12c6 ipa_icf::sem_item_optimizer::execute()
>> ../../gcc/gcc/ipa-icf.c:2636
>> 0x11e34d6 ipa_icf_driver
>> ../../gcc/gcc/ipa-icf.c:3538
>> 0x11e34d6 ipa_icf::pass_ipa_icf::execute(function*)
>> ../../gcc/gcc/ipa-icf.c:3585
>>
>> This appears due to following assert in ipa_get_param_decl_index_1():
>> gcc_checking_assert (!flag_wpa);
>> which was added by Martin's patch introducing ipa_get_type().
>> Removing the assert works, however I am not sure if that's the correct thing.
>> I would be grateful for suggestions on how to handle this case.
>>
>
> I wrote that the patch was not really tested, I did not think about
> ICF loading bodies and re-running body-analyses at WPO time.
> Nevertheless, after some consideration, I think that just removing the
> assert is fine.  After all, the caller must have passed a PARM_DECL if
> it is doing anything sensible at all and that means we have access to
> the function body.
Thanks for the pointers. I will validate the patch after removing assert,
and get back.

Thanks,
Prathamesh
>
> Thanks,
>
> Martin

* Re: [RFC] ipa bitwise constant propagation
  2016-08-22 13:55                             ` Prathamesh Kulkarni
@ 2016-08-24 12:07                               ` Prathamesh Kulkarni
  2016-08-25 13:44                                 ` Jan Hubicka
  2016-08-26 16:23                                 ` Rainer Orth
  0 siblings, 2 replies; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-24 12:07 UTC (permalink / raw)
  To: Prathamesh Kulkarni, Jan Hubicka, Richard Biener,
	Kugan Vivekanandarajah, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 3268 bytes --]

On 22 August 2016 at 19:24, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> On 22 August 2016 at 19:03, Martin Jambor <mjambor@suse.cz> wrote:
>> Hi,
>>
>> On Tue, Aug 16, 2016 at 06:34:48PM +0530, Prathamesh Kulkarni wrote:
>>> Thanks, I updated the patch to address these issues (attached).
>>> However the patch caused ICE during testing
>>> objc.dg/torture/forward-1.m (and few others but with same ICE):
>>>
>>> Command line options:
>>> /home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/xgcc
>>> -B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/gcc/
>>> /home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/objc.dg/torture/forward-1.m
>>> -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -flto
>>> -fuse-linker-plugin -fno-fat-lto-objects -fgnu-runtime
>>> -I/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/gcc/gcc/testsuite/../../libobjc
>>> -B/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
>>> -L/home/prathamesh.kulkarni/gnu-toolchain/gcc/bits-prop-5/bootstrap-build/x86_64-pc-linux-gnu/./libobjc/.libs
>>> -lobjc -lm -o ./forward-1.exe
>>>
>>> Backtrace:
>>> 0x8c0ed2 ipa_get_param_decl_index_1
>>> ../../gcc/gcc/ipa-prop.c:106
>>> 0x8b7dbb will_be_nonconstant_predicate
>>> ../../gcc/gcc/ipa-inline-analysis.c:2110
>>> 0x8b7dbb estimate_function_body_sizes
>>> ../../gcc/gcc/ipa-inline-analysis.c:2739
>>> 0x8bae26 compute_inline_parameters(cgraph_node*, bool)
>>> ../../gcc/gcc/ipa-inline-analysis.c:3030
>>> 0x8bb309 inline_analyze_function(cgraph_node*)
>>> ../../gcc/gcc/ipa-inline-analysis.c:4157
>>> 0x11dc402 ipa_icf::sem_function::merge(ipa_icf::sem_item*)
>>> ../../gcc/gcc/ipa-icf.c:1345
>>> 0x11d6334 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
>>> ../../gcc/gcc/ipa-icf.c:3461
>>> 0x11e12c6 ipa_icf::sem_item_optimizer::execute()
>>> ../../gcc/gcc/ipa-icf.c:2636
>>> 0x11e34d6 ipa_icf_driver
>>> ../../gcc/gcc/ipa-icf.c:3538
>>> 0x11e34d6 ipa_icf::pass_ipa_icf::execute(function*)
>>> ../../gcc/gcc/ipa-icf.c:3585
>>>
>>> This appears due to following assert in ipa_get_param_decl_index_1():
>>> gcc_checking_assert (!flag_wpa);
>>> which was added by Martin's patch introducing ipa_get_type().
>>> Removing the assert works, however I am not sure if that's the correct thing.
>>> I would be grateful for suggestions on how to handle this case.
>>>
>>
>> I wrote that the patch was not really tested, I did not think about
>> ICF loading bodies and re-running body-analyses at WPO time.
>> Nevertheless, after some consideration, I think that just removing the
>> assert is fine.  After all, the caller must have passed a PARM_DECL if
>> it is doing anything sensible at all and that means we have access to
>> the function body.
> Thanks for the pointers. I will validate the patch after removing assert,
> and get back.
Hi,
The attached version passes bootstrap+test on
x86_64-unknown-linux-gnu, ppc64le-linux-gnu,
and with c,c++,fortran on armv8l-linux-gnueabihf.
Cross-tested on arm*-*-* and aarch64*-*-*.
Verified the patch survives lto-bootstrap on x86_64-unknown-linux-gnu.
Ok to commit ?

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
>>
>> Thanks,
>>
>> Martin

[-- Attachment #2: bits-prop-7.txt --]
[-- Type: text/plain, Size: 42731 bytes --]

Patch for performing interprocedural bitwise constant propagation.

2016-08-23  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
	    Martin Jambor  <mjambor@suse.cz>

	* common.opt: New option -fipa-cp-bit.
	* doc/invoke.texi: Document -fipa-cp-bit.
	* opts.c (default_options_table): Add entry for -fipa-cp-bit.
	(enable_fdo_optimizations): Check for flag_ipa_cp_bit.
	* tree-ssa-ccp.h: New header file.
	* tree-ssa-ccp.c: Include tree-ssa-ccp.h
	(bit_value_binop_1): Change to bit_value_binop and export it.
	Replace all occurrences of tree parameter by two new params: signop, int.
	(bit_value_unop_1): Change to bit_value_unop and export it.
	Replace all occurrences of tree parameter by two new params: signop,
	int.
	(bit_value_binop): Change call from bit_value_binop_1 to
	bit_value_binop.
	(bit_value_assume_aligned): Likewise.
	(bit_value_unop): Change call from bit_value_unop_1 to bit_value_unop.
	(do_ssa_ccp): Pass nonzero_p || flag_ipa_cp_bit instead of nonzero_p
	to ccp_finalize.
	(ccp_finalize): Skip processing if val->mask == 0.
	* ipa-cp.c: Include tree-ssa-ccp.h
	(ipcp_bits_lattice): New class.
	(ipcp_param_lattices): New member bits_lattice.
	(print_all_lattices): Call ipcp_bits_lattice::print.
	(set_all_contains_variable): Call ipcp_bits_lattice::set_to_bottom. 
	(initialize_node_lattices): Likewise.
	(propagate_bits_accross_jump_function): New function.
	(propagate_constants_accross_call): Call
	propagate_bits_accross_jump_function.
	(ipcp_propagate_stage): Store parameter types when in_lto_p is true.
	(ipcp_store_bits_results): New function.
	(ipcp_driver): Call ipcp_store_bits_results.
	* ipa-prop.h (ipa_bits): New struct.
	(ipa_jump_func): Add new member bits of type ipa_bits.
	(ipa_param_descriptor): Change decl to decl_or_type.
	(ipa_get_param): Change decl to decl_or_type and assert on
	PARM_DECL.
	(ipa_get_type): New function.
	(ipcp_transformation_summary): New member bits.
	* ipa-prop.c (ipa_get_param_decl_index_1): s/decl/decl_or_type.
	(ipa_populate_param_decls): Likewise.
	(ipa_dump_param): Likewise.
	(ipa_print_node_jump_functions_for_edge): Pretty-print ipa_bits jump
	function.
	(ipa_set_jf_unknown): Set ipa_bits::known to false.
	(ipa_compute_jump_functions_for_edge): Compute jump function for bits
	propagation.
	(ipa_node_params_t::duplicate): Copy src->bits into dst->bits.
	(ipa_write_jump_function): Add streaming for ipa_bits.
	(ipa_read_jump_function): Add support for reading streamed ipa_bits.
	(write_ipcp_transformation_info): Add streaming for ipa_bits
	summary for ltrans.
	(read_ipcp_transformation_info): Add support for reading streamed ipa_bits.
	(ipcp_update_bits): New function.
	(ipcp_transform_function): Call ipcp_update_bits.

testsuite/
	* gcc.dg/ipa/propbits-1.c: New test-case.
	* gcc.dg/ipa/propbits-2.c: Likewise.
	* gcc.dg/ipa/propbits-3.c: Likewise.
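
Illustration of the (value, mask) lattice that the patch below adds as
ipcp_bits_lattice: a mask bit of 0 means the matching value bit is
known, and a mask bit of 1 means it is unknown. The sketch is a
simplified stand-in, not the code in the patch. It uses 64-bit words
instead of widest_int, leaves out the TOP/BOTTOM states, and the names
KnownBits, meet, and_with_constant and demo_meet are hypothetical. The
meet rule shown (a bit stays known only if both sides know it and
agree) is the usual one for this kind of lattice and is an assumption
about what meet_with_1 computes.

```cpp
#include <cassert>
#include <cstdint>

struct KnownBits
{
  std::uint64_t value;  // bit values, meaningful only where mask is 0
  std::uint64_t mask;   // 1 = bit unknown, 0 = bit known
};

// Mask of the meet: unknown on either side, or known but different.
constexpr std::uint64_t
meet_mask (KnownBits a, KnownBits b)
{
  return a.mask | b.mask | (a.value ^ b.value);
}

// Meet of two facts about the same parameter.  Value bits under the
// resulting mask are cleared so the pair stays in a canonical form.
constexpr KnownBits
meet (KnownBits a, KnownBits b)
{
  return KnownBits{ a.value & ~meet_mask (a, b), meet_mask (a, b) };
}

// Fact for the argument (y & c) when nothing is known about y:
// bits outside c are known zero, bits inside c are unknown.
constexpr KnownBits
and_with_constant (std::uint64_t c)
{
  return KnownBits{ 0, c };
}

// The two call sites f (y & 0xff) and f (y & 0xf) from the comment in
// ipa-cp.c combine to "value 0, mask 0xff".
constexpr KnownBits merged
  = meet (and_with_constant (0xff), and_with_constant (0xf));
static_assert (merged.value == 0 && merged.mask == 0xff, "mask is 0xff");

// Runtime usage example; call it from main () to exercise the sketch.
inline void
demo_meet ()
{
  // Two fully known constants that differ only in bit 5.
  KnownBits m = meet (KnownBits{ 0x10, 0 }, KnownBits{ 0x30, 0 });
  assert (m.mask == 0x20 && m.value == 0x10);
}
```

In this model a mask with every bit set carries no information, which
is the point where the real lattice would drop to BOTTOM.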

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a292ed..8bac0a2 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1561,6 +1561,10 @@ fipa-cp-alignment
 Common Report Var(flag_ipa_cp_alignment) Optimization
 Perform alignment discovery and propagation to make Interprocedural constant propagation stronger.
 
+fipa-cp-bit
+Common Report Var(flag_ipa_cp_bit) Optimization
+Perform interprocedural bitwise constant propagation.
+
 fipa-profile
 Common Report Var(flag_ipa_profile) Init(0) Optimization
 Perform interprocedural profile propagation.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 22001f9..ebbf4ee 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -358,7 +358,7 @@ Objective-C and Objective-C++ Dialects}.
 -fgcse-sm -fhoist-adjacent-loads -fif-conversion @gol
 -fif-conversion2 -findirect-inlining @gol
 -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
--finline-small-functions -fipa-cp -fipa-cp-clone -fipa-cp-alignment @gol
+-finline-small-functions -fipa-cp -fipa-cp-clone -fipa-cp-alignment -fipa-cp-bit @gol
 -fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-icf @gol
 -fira-algorithm=@var{algorithm} @gol
 -fira-region=@var{region} -fira-hoist-pressure @gol
@@ -6370,6 +6370,7 @@ also turns on the following optimization flags:
 -findirect-inlining @gol
 -fipa-cp @gol
 -fipa-cp-alignment @gol
+-fipa-cp-bit @gol
 -fipa-sra @gol
 -fipa-icf @gol
 -fisolate-erroneous-paths-dereference @gol
@@ -7378,6 +7379,12 @@ parameters to support better vectorization and string operations.
 This flag is enabled by default at @option{-O2} and @option{-Os}.  It
 requires that @option{-fipa-cp} is enabled.
 
+@item -fipa-cp-bit
+@opindex fipa-cp-bit
+When enabled, perform interprocedural bitwise constant propagation.  This
+flag is enabled by default at @option{-O2}.  It requires that
+@option{-fipa-cp} is enabled.
+
 @item -fipa-icf
 @opindex fipa-icf
 Perform Identical Code Folding for functions and read-only variables.
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 5b6cb9a..7e740f9 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "ipa-inline.h"
 #include "ipa-utils.h"
+#include "tree-ssa-ccp.h"
 
 template <typename valtype> class ipcp_value;
 
@@ -266,6 +267,60 @@ private:
   bool meet_with_1 (unsigned new_align, unsigned new_misalign);
 };
 
+/* Lattice of known bits, only capable of holding one value.
+   Bitwise constant propagation propagates which bits of a
+   value are constant.
+   For example:
+   int f(int x)
+   {
+     return some_op (x);
+   }
+
+   int f1(int y)
+   {
+     if (cond)
+      return f (y & 0xff);
+     else
+      return f (y & 0xf);
+   }
+
+   In the above case, the param 'x' will always have all
+   the bits above the low 8 set to 0, so the value is 0 and
+   the mask of 'x' is 0xff: a set mask bit marks the
+   corresponding (here, low-order) bit as unknown.
+   The actual propagated value is given by m_value & ~m_mask.  */
+
+class ipcp_bits_lattice
+{
+public:
+  bool bottom_p () { return m_lattice_val == IPA_BITS_VARYING; }
+  bool top_p () { return m_lattice_val == IPA_BITS_UNDEFINED; }
+  bool constant_p () { return m_lattice_val == IPA_BITS_CONSTANT; }
+  bool set_to_bottom ();
+  bool set_to_constant (widest_int, widest_int);
+
+  widest_int get_value () { return m_value; }
+  widest_int get_mask () { return m_mask; }
+
+  bool meet_with (ipcp_bits_lattice& other, unsigned, signop,
+		  enum tree_code, tree);
+
+  bool meet_with (widest_int, widest_int, unsigned);
+
+  void print (FILE *);
+
+private:
+  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } m_lattice_val;
+
+  /* Similar to ccp_lattice_t, mask represents which bits of value are constant.
+     If a bit in mask is set to 0, then the corresponding bit in
+     value is known to be constant.  */
+  widest_int m_value, m_mask;
+
+  bool meet_with_1 (widest_int, widest_int, unsigned);
+  void get_value_and_mask (tree, widest_int *, widest_int *);
+};
+
 /* Structure containing lattices for a parameter itself and for pieces of
    aggregates that are passed in the parameter or by a reference in a parameter
    plus some other useful flags.  */
@@ -281,6 +336,8 @@ public:
   ipcp_agg_lattice *aggs;
   /* Lattice describing known alignment.  */
   ipcp_alignment_lattice alignment;
+  /* Lattice describing known bits.  */
+  ipcp_bits_lattice bits_lattice;
   /* Number of aggregate lattices */
   int aggs_count;
   /* True if aggregate data were passed by reference (as opposed to by
@@ -458,6 +515,21 @@ ipcp_alignment_lattice::print (FILE * f)
     fprintf (f, "         Alignment %u, misalignment %u\n", align, misalign);
 }
 
+void
+ipcp_bits_lattice::print (FILE *f)
+{
+  if (top_p ())
+    fprintf (f, "         Bits unknown (TOP)\n");
+  else if (bottom_p ())
+    fprintf (f, "         Bits unusable (BOTTOM)\n");
+  else
+    {
+      fprintf (f, "         Bits: value = "); print_hex (get_value (), f);
+      fprintf (f, ", mask = "); print_hex (get_mask (), f);
+      fprintf (f, "\n");
+    }
+}
+
 /* Print all ipcp_lattices of all functions to F.  */
 
 static void
@@ -484,6 +556,7 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
 	  fprintf (f, "         ctxs: ");
 	  plats->ctxlat.print (f, dump_sources, dump_benefits);
 	  plats->alignment.print (f);
+	  plats->bits_lattice.print (f);
 	  if (plats->virt_call)
 	    fprintf (f, "        virt_call flag set\n");
 
@@ -911,6 +984,151 @@ ipcp_alignment_lattice::meet_with (const ipcp_alignment_lattice &other,
   return meet_with_1 (other.align, adjusted_misalign);
 }
 
+/* Set lattice value to bottom, if that is not already the case.  */
+
+bool
+ipcp_bits_lattice::set_to_bottom ()
+{
+  if (bottom_p ())
+    return false;
+  m_lattice_val = IPA_BITS_VARYING;
+  m_value = 0;
+  m_mask = -1;
+  return true;
+}
+
+/* Set the lattice to the constant <VALUE, MASK>.  Only meant to be called
+   when switching state from TOP.  */
+
+bool
+ipcp_bits_lattice::set_to_constant (widest_int value, widest_int mask)
+{
+  gcc_assert (top_p ());
+  m_lattice_val = IPA_BITS_CONSTANT;
+  m_value = value;
+  m_mask = mask;
+  return true;
+}
+
+/* Convert operand to value, mask form.  */
+
+void
+ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, widest_int *maskp)
+{
+  wide_int get_nonzero_bits (const_tree);
+
+  if (TREE_CODE (operand) == INTEGER_CST)
+    {
+      *valuep = wi::to_widest (operand);
+      *maskp = 0;
+    }
+  else
+    {
+      *valuep = 0;
+      *maskp = -1;
+    }
+}
+
+/* Meet operation, similar to ccp_lattice_meet.  We xor the values: if
+   this->m_value and VALUE differ at some bit position, that bit is dropped
+   to varying.  Return true if the mask changed.
+   This function assumes that the lattice value is in CONSTANT state.  */
+
+bool
+ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask,
+				unsigned precision)
+{
+  gcc_assert (constant_p ());
+
+  widest_int old_mask = m_mask;
+  m_mask = (m_mask | mask) | (m_value ^ value);
+
+  if (wi::sext (m_mask, precision) == -1)
+    return set_to_bottom ();
+
+  return m_mask != old_mask;
+}
+
+/* Meet the bits lattice with operand
+   described by <VALUE, MASK, PRECISION>.  */
+
+bool
+ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
+			      unsigned precision)
+{
+  if (bottom_p ())
+    return false;
+
+  if (top_p ())
+    {
+      if (wi::sext (mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (value, mask);
+    }
+
+  return meet_with_1 (value, mask, precision);
+}
+
+/* Meet bits lattice with the result of bit_value_binop (other, operand)
+   if code is binary operation or bit_value_unop (other) if code is unary op.
+   In the case when code is nop_expr, no adjustment is required. */
+
+bool
+ipcp_bits_lattice::meet_with (ipcp_bits_lattice& other, unsigned precision,
+			      signop sgn, enum tree_code code, tree operand)
+{
+  if (other.bottom_p ())
+    return set_to_bottom ();
+
+  if (bottom_p () || other.top_p ())
+    return false;
+
+  widest_int adjusted_value, adjusted_mask;
+
+  if (TREE_CODE_CLASS (code) == tcc_binary)
+    {
+      tree type = TREE_TYPE (operand);
+      gcc_assert (INTEGRAL_TYPE_P (type));
+      widest_int o_value, o_mask;
+      get_value_and_mask (operand, &o_value, &o_mask);
+
+      bit_value_binop (code, sgn, precision, &adjusted_value, &adjusted_mask,
+		       sgn, precision, other.get_value (), other.get_mask (),
+		       TYPE_SIGN (type), TYPE_PRECISION (type), o_value, o_mask);
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (TREE_CODE_CLASS (code) == tcc_unary)
+    {
+      bit_value_unop (code, sgn, precision, &adjusted_value,
+		      &adjusted_mask, sgn, precision, other.get_value (),
+		      other.get_mask ());
+
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+    }
+
+  else if (code == NOP_EXPR)
+    {
+      adjusted_value = other.m_value;
+      adjusted_mask = other.m_mask;
+    }
+
+  else
+    return set_to_bottom ();
+
+  if (top_p ())
+    {
+      if (wi::sext (adjusted_mask, precision) == -1)
+	return set_to_bottom ();
+      return set_to_constant (adjusted_value, adjusted_mask);
+    }
+  else
+    return meet_with_1 (adjusted_value, adjusted_mask, precision);
+}
+
 /* Mark bot aggregate and scalar lattices as containing an unknown variable,
    return true is any of them has not been marked as such so far.  */
 
@@ -922,6 +1140,7 @@ set_all_contains_variable (struct ipcp_param_lattices *plats)
   ret |= plats->ctxlat.set_contains_variable ();
   ret |= set_agg_lats_contain_variable (plats);
   ret |= plats->alignment.set_to_bottom ();
+  ret |= plats->bits_lattice.set_to_bottom ();
   return ret;
 }
 
@@ -1003,6 +1222,7 @@ initialize_node_lattices (struct cgraph_node *node)
 	      plats->ctxlat.set_to_bottom ();
 	      set_agg_lats_to_bottom (plats);
 	      plats->alignment.set_to_bottom ();
+	      plats->bits_lattice.set_to_bottom ();
 	    }
 	  else
 	    set_all_contains_variable (plats);
@@ -1621,6 +1841,78 @@ propagate_alignment_accross_jump_function (cgraph_edge *cs,
     }
 }
 
+/* Propagate bits across jfunc that is associated with
+   edge cs and update dest_lattice accordingly.  */
+
+static bool
+propagate_bits_accross_jump_function (cgraph_edge *cs, int idx, ipa_jump_func *jfunc,
+				      ipcp_bits_lattice *dest_lattice)
+{
+  if (dest_lattice->bottom_p ())
+    return false;
+
+  enum availability availability;
+  cgraph_node *callee = cs->callee->function_symbol (&availability);
+  struct ipa_node_params *callee_info = IPA_NODE_REF (callee);
+  tree parm_type = ipa_get_type (callee_info, idx);
+
+  /* For K&R C programs, ipa_get_type() could return NULL_TREE.
+     Avoid the transform for these cases.  */
+  if (!parm_type)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "Setting dest_lattice to bottom, because "
+			    "param %i type is NULL for %s\n", idx,
+			    cs->callee->name ());
+
+      return dest_lattice->set_to_bottom ();
+    }
+
+  unsigned precision = TYPE_PRECISION (parm_type);
+  signop sgn = TYPE_SIGN (parm_type);
+
+  if (jfunc->type == IPA_JF_PASS_THROUGH)
+    {
+      struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller);
+      enum tree_code code = ipa_get_jf_pass_through_operation (jfunc);
+      tree operand = NULL_TREE;
+
+      if (code != NOP_EXPR)
+	operand = ipa_get_jf_pass_through_operand (jfunc);
+
+      int src_idx = ipa_get_jf_pass_through_formal_id (jfunc);
+      struct ipcp_param_lattices *src_lats
+	= ipa_get_parm_lattices (caller_info, src_idx);
+
+      /* Try to propagate bits if src_lattice is bottom, but jfunc is known.
+	 For example, consider:
+	 int f(int x)
+	 {
+	   g (x & 0xff);
+	 }
+	 Assume the lattice for x is bottom; we can still propagate the
+	 mask 0xff of x & 0xff, which gets computed during the ccp1 pass
+	 and which we store in the jump function during the analysis stage.  */
+
+      if (src_lats->bits_lattice.bottom_p ()
+	  && jfunc->bits.known)
+	return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask,
+					precision);
+      else
+	return dest_lattice->meet_with (src_lats->bits_lattice, precision, sgn,
+					code, operand);
+    }
+
+  else if (jfunc->type == IPA_JF_ANCESTOR)
+    return dest_lattice->set_to_bottom ();
+
+  else if (jfunc->bits.known)
+    return dest_lattice->meet_with (jfunc->bits.value, jfunc->bits.mask, precision);
+
+  else
+    return dest_lattice->set_to_bottom ();
+}
+
 /* If DEST_PLATS already has aggregate items, check that aggs_by_ref matches
    NEW_AGGS_BY_REF and if not, mark all aggs as bottoms and return true (in all
    other cases, return false).  If there are no aggregate items, set
@@ -1968,6 +2260,8 @@ propagate_constants_accross_call (struct cgraph_edge *cs)
 							  &dest_plats->ctxlat);
 	  ret |= propagate_alignment_accross_jump_function (cs, jump_func,
 							 &dest_plats->alignment);
+	  ret |= propagate_bits_accross_jump_function (cs, i, jump_func,
+						       &dest_plats->bits_lattice);
 	  ret |= propagate_aggs_accross_jump_function (cs, jump_func,
 						       dest_plats);
 	}
@@ -2936,6 +3230,19 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
   {
     struct ipa_node_params *info = IPA_NODE_REF (node);
 
+    /* In LTO we do not have PARM_DECLs but we would still like to be able to
+       look at types of parameters.  */
+    if (in_lto_p)
+      {
+	tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
+	for (int k = 0; k < ipa_get_param_count (info) && t; k++)
+	  {
+	    gcc_assert (t != void_list_node);
+	    info->descriptors[k].decl_or_type = TREE_VALUE (t);
+	    t = t ? TREE_CHAIN (t) : NULL;
+	  }
+      }
+
     determine_versionability (node, info);
     if (node->has_gimple_body_p ())
       {
@@ -4592,6 +4899,81 @@ ipcp_store_alignment_results (void)
   }
 }
 
+/* Look up all the bits information that we have discovered and copy it over
+   to the transformation summary.  */
+
+static void
+ipcp_store_bits_results (void)
+{
+  cgraph_node *node;
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      ipa_node_params *info = IPA_NODE_REF (node);
+      bool dumped_sth = false;
+      bool found_useful_result = false;
+
+      if (!opt_for_fn (node->decl, flag_ipa_cp_bit))
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Not considering %s for ipa bitwise propagation"
+				"; -fipa-cp-bit: disabled.\n",
+				node->name ());
+	  continue;
+	}
+
+      if (info->ipcp_orig_node)
+	info = IPA_NODE_REF (info->ipcp_orig_node);
+
+      unsigned count = ipa_get_param_count (info);
+      for (unsigned i = 0; i < count; i++)
+	{
+	  ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	  if (plats->bits_lattice.constant_p ())
+	    {
+	      found_useful_result = true;
+	      break;
+	    }
+	}
+
+    if (!found_useful_result)
+      continue;
+
+    ipcp_grow_transformations_if_necessary ();
+    ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+    vec_safe_reserve_exact (ts->bits, count);
+
+    for (unsigned i = 0; i < count; i++)
+      {
+	ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i);
+	ipa_bits bits_jfunc;
+
+	if (plats->bits_lattice.constant_p ())
+	  {
+	    bits_jfunc.known = true;
+	    bits_jfunc.value = plats->bits_lattice.get_value ();
+	    bits_jfunc.mask = plats->bits_lattice.get_mask ();
+	  }
+	else
+	  bits_jfunc.known = false;
+
+	ts->bits->quick_push (bits_jfunc);
+	if (!dump_file || !bits_jfunc.known)
+	  continue;
+	if (!dumped_sth)
+	  {
+	    fprintf (dump_file, "Propagated bits info for function %s/%i:\n",
+				node->name (), node->order);
+	    dumped_sth = true;
+	  }
+	fprintf (dump_file, " param %i: value = ", i);
+	print_hex (bits_jfunc.value, dump_file);
+	fprintf (dump_file, ", mask = ");
+	print_hex (bits_jfunc.mask, dump_file);
+	fprintf (dump_file, "\n");
+      }
+    }
+}
 /* The IPCP driver.  */
 
 static unsigned int
@@ -4625,6 +5007,8 @@ ipcp_driver (void)
   ipcp_decision_stage (&topo);
   /* Store results of alignment propagation. */
   ipcp_store_alignment_results ();
+  /* Store results of bits propagation.  */
+  ipcp_store_bits_results ();
 
   /* Free all IPCP structures.  */
   free_toporder_info (&topo);
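
For readers following the lattice logic in the ipa-cp.c changes above, the meet step can be sketched outside of GCC as follows. This is only an illustration under stated assumptions: `bits_sketch`, `meet` and `check_examples` are hypothetical names, `uint64_t` stands in for `widest_int`, and the precision check that drops the lattice to bottom is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ipcp_bits_lattice in the CONSTANT state.
// uint64_t replaces widest_int; precision handling is omitted.
struct bits_sketch
{
  uint64_t value;  // known bit values, meaningful where the mask bit is 0
  uint64_t mask;   // 1 = bit unknown, 0 = bit known
};

// Same update as in meet_with_1: a bit becomes unknown if it was unknown on
// either side, or if the two known values disagree at that position.
// The stored value is left untouched, as in the patch.
static bits_sketch
meet (bits_sketch a, bits_sketch b)
{
  a.mask = (a.mask | b.mask) | (a.value ^ b.value);
  return a;
}

// Replays the call sites of propbits-1.c and propbits-2.c.
static void
check_examples ()
{
  // f(1), f(2), f(4): each argument is a full constant (mask 0).
  bits_sketch x = { 1, 0 };
  x = meet (x, bits_sketch { 2, 0 });
  x = meet (x, bits_sketch { 4, 0 });
  assert (x.mask == 0x7);

  // f1(y & 0x03) and f1(z & 0xc): value 0, mask = possibly nonzero bits.
  bits_sketch p = { 0, 0x3 };
  p = meet (p, bits_sketch { 0, 0xc });
  assert (p.mask == 0xf);
}
```

Under these assumptions the resulting masks agree with the values the propbits-1.c and propbits-2.c tests scan for in the dump.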
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 4385614..1629781 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -105,7 +105,7 @@ ipa_get_param_decl_index_1 (vec<ipa_param_descriptor> descriptors, tree ptree)
 
   count = descriptors.length ();
   for (i = 0; i < count; i++)
-    if (descriptors[i].decl == ptree)
+    if (descriptors[i].decl_or_type == ptree)
       return i;
 
   return -1;
@@ -138,7 +138,7 @@ ipa_populate_param_decls (struct cgraph_node *node,
   param_num = 0;
   for (parm = fnargs; parm; parm = DECL_CHAIN (parm))
     {
-      descriptors[param_num].decl = parm;
+      descriptors[param_num].decl_or_type = parm;
       descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE (parm),
 							     true);
       param_num++;
@@ -168,10 +168,10 @@ void
 ipa_dump_param (FILE *file, struct ipa_node_params *info, int i)
 {
   fprintf (file, "param #%i", i);
-  if (info->descriptors[i].decl)
+  if (info->descriptors[i].decl_or_type)
     {
       fprintf (file, " ");
-      print_generic_expr (file, info->descriptors[i].decl, 0);
+      print_generic_expr (file, info->descriptors[i].decl_or_type, 0);
     }
 }
 
@@ -302,6 +302,15 @@ ipa_print_node_jump_functions_for_edge (FILE *f, struct cgraph_edge *cs)
 	}
       else
 	fprintf (f, "         Unknown alignment\n");
+
+      if (jump_func->bits.known)
+	{
+	  fprintf (f, "         value: "); print_hex (jump_func->bits.value, f);
+	  fprintf (f, ", mask: "); print_hex (jump_func->bits.mask, f);
+	  fprintf (f, "\n");
+	}
+      else
+	fprintf (f, "         Unknown bits\n");
     }
 }
 
@@ -381,6 +390,7 @@ ipa_set_jf_unknown (struct ipa_jump_func *jfunc)
 {
   jfunc->type = IPA_JF_UNKNOWN;
   jfunc->alignment.known = false;
+  jfunc->bits.known = false;
 }
 
 /* Set JFUNC to be a copy of another jmp (to be used by jump function
@@ -1674,6 +1684,26 @@ ipa_compute_jump_functions_for_edge (struct ipa_func_body_info *fbi,
       else
 	gcc_assert (!jfunc->alignment.known);
 
+      if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	  && (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
+	{
+	  jfunc->bits.known = true;
+
+	  if (TREE_CODE (arg) == SSA_NAME)
+	    {
+	      jfunc->bits.value = 0;
+	      jfunc->bits.mask = widest_int::from (get_nonzero_bits (arg),
+						   TYPE_SIGN (TREE_TYPE (arg)));
+	    }
+	  else
+	    {
+	      jfunc->bits.value = wi::to_widest (arg);
+	      jfunc->bits.mask = 0;
+	    }
+	}
+      else
+	gcc_assert (!jfunc->bits.known);
+
       if (is_gimple_ip_invariant (arg)
 	  || (TREE_CODE (arg) == VAR_DECL
 	      && is_global_var (arg)
@@ -3690,6 +3720,18 @@ ipa_node_params_t::duplicate(cgraph_node *src, cgraph_node *dst,
       for (unsigned i = 0; i < src_alignments->length (); ++i)
 	dst_alignments->quick_push ((*src_alignments)[i]);
     }
+
+  if (src_trans && vec_safe_length (src_trans->bits) > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      src_trans = ipcp_get_transformation_summary (src);
+      const vec<ipa_bits, va_gc> *src_bits = src_trans->bits;
+      vec<ipa_bits, va_gc> *&dst_bits
+	= ipcp_get_transformation_summary (dst)->bits;
+      vec_safe_reserve_exact (dst_bits, src_bits->length ());
+      for (unsigned i = 0; i < src_bits->length (); ++i)
+	dst_bits->quick_push ((*src_bits)[i]);
+    }
 }
 
 /* Register our cgraph hooks if they are not already there.  */
@@ -4609,6 +4651,15 @@ ipa_write_jump_function (struct output_block *ob,
       streamer_write_uhwi (ob, jump_func->alignment.align);
       streamer_write_uhwi (ob, jump_func->alignment.misalign);
     }
+
+  bp = bitpack_create (ob->main_stream);
+  bp_pack_value (&bp, jump_func->bits.known, 1);
+  streamer_write_bitpack (&bp);
+  if (jump_func->bits.known)
+    {
+      streamer_write_widest_int (ob, jump_func->bits.value);
+      streamer_write_widest_int (ob, jump_func->bits.mask);
+    }
 }
 
 /* Read in jump function JUMP_FUNC from IB.  */
@@ -4685,6 +4736,17 @@ ipa_read_jump_function (struct lto_input_block *ib,
     }
   else
     jump_func->alignment.known = false;
+
+  bp = streamer_read_bitpack (ib);
+  bool bits_known = bp_unpack_value (&bp, 1);
+  if (bits_known)
+    {
+      jump_func->bits.known = true;
+      jump_func->bits.value = streamer_read_widest_int (ib);
+      jump_func->bits.mask = streamer_read_widest_int (ib);
+    }
+  else
+    jump_func->bits.known = false;
 }
 
 /* Stream out parts of cgraph_indirect_call_info corresponding to CS that are
@@ -5050,6 +5112,28 @@ write_ipcp_transformation_info (output_block *ob, cgraph_node *node)
     }
   else
     streamer_write_uhwi (ob, 0);
+
+  ts = ipcp_get_transformation_summary (node);
+  if (ts && vec_safe_length (ts->bits) > 0)
+    {
+      count = ts->bits->length ();
+      streamer_write_uhwi (ob, count);
+
+      for (unsigned i = 0; i < count; ++i)
+	{
+	  const ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = bitpack_create (ob->main_stream);
+	  bp_pack_value (&bp, bits_jfunc.known, 1);
+	  streamer_write_bitpack (&bp);
+	  if (bits_jfunc.known)
+	    {
+	      streamer_write_widest_int (ob, bits_jfunc.value);
+	      streamer_write_widest_int (ob, bits_jfunc.mask);
+	    }
+	}
+    }
+  else
+    streamer_write_uhwi (ob, 0);
 }
 
 /* Stream in the aggregate value replacement chain for NODE from IB.  */
@@ -5102,6 +5186,26 @@ read_ipcp_transformation_info (lto_input_block *ib, cgraph_node *node,
 	    }
 	}
     }
+
+  count = streamer_read_uhwi (ib);
+  if (count > 0)
+    {
+      ipcp_grow_transformations_if_necessary ();
+      ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+      vec_safe_grow_cleared (ts->bits, count);
+
+      for (i = 0; i < count; i++)
+	{
+	  ipa_bits& bits_jfunc = (*ts->bits)[i];
+	  struct bitpack_d bp = streamer_read_bitpack (ib);
+	  bits_jfunc.known = bp_unpack_value (&bp, 1);
+	  if (bits_jfunc.known)
+	    {
+	      bits_jfunc.value = streamer_read_widest_int (ib);
+	      bits_jfunc.mask = streamer_read_widest_int (ib);
+	    }
+	}
+    }
 }
 
 /* Write all aggregate replacement for nodes in set.  */
@@ -5404,6 +5508,56 @@ ipcp_update_alignments (struct cgraph_node *node)
     }
 }
 
+/* Update bits info of formal parameters as described in
+   ipcp_transformation_summary.  */
+
+static void
+ipcp_update_bits (struct cgraph_node *node)
+{
+  tree parm = DECL_ARGUMENTS (node->decl);
+  tree next_parm = parm;
+  ipcp_transformation_summary *ts = ipcp_get_transformation_summary (node);
+
+  if (!ts || vec_safe_length (ts->bits) == 0)
+    return;
+
+  vec<ipa_bits, va_gc> &bits = *ts->bits;
+  unsigned count = bits.length ();
+
+  for (unsigned i = 0; i < count; ++i, parm = next_parm)
+    {
+      if (node->clone.combined_args_to_skip
+	  && bitmap_bit_p (node->clone.combined_args_to_skip, i))
+	continue;
+
+      gcc_checking_assert (parm);
+      next_parm = DECL_CHAIN (parm);
+
+      if (!bits[i].known
+	  || !INTEGRAL_TYPE_P (TREE_TYPE (parm))
+	  || !is_gimple_reg (parm))
+	continue;
+
+      tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm);
+      if (!ddef)
+	continue;
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Adjusting mask for param %u to ", i);
+	  print_hex (bits[i].mask, dump_file);
+	  fprintf (dump_file, "\n");
+	}
+
+      unsigned prec = TYPE_PRECISION (TREE_TYPE (ddef));
+      signop sgn = TYPE_SIGN (TREE_TYPE (ddef));
+
+      wide_int nonzero_bits = wide_int::from (bits[i].mask, prec, UNSIGNED)
+			      | wide_int::from (bits[i].value, prec, sgn);
+      set_nonzero_bits (ddef, nonzero_bits);
+    }
+}
+
 /* IPCP transformation phase doing propagation of aggregate values.  */
 
 unsigned int
@@ -5423,6 +5577,7 @@ ipcp_transform_function (struct cgraph_node *node)
 	     node->name (), node->order);
 
   ipcp_update_alignments (node);
+  ipcp_update_bits (node);
   aggval = ipa_get_agg_replacements_for_node (node);
   if (!aggval)
       return 0;
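
The conversion done in ipcp_update_bits, from a propagated <value, mask> pair back to the nonzero-bits form that set_nonzero_bits expects, can be sketched as below. This is a simplified illustration, not the GCC code: `nonzero_bits_sketch` is a hypothetical helper, `uint64_t` stands in for `widest_int` and `wide_int`, and `prec` plays the role of TYPE_PRECISION and is assumed to be between 1 and 64.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the last step of ipcp_update_bits.
// A bit may be nonzero if it is unknown (set in MASK) or if it is known
// to be 1 (set in VALUE); everything above PREC bits is truncated.
static uint64_t
nonzero_bits_sketch (uint64_t value, uint64_t mask, unsigned prec)
{
  uint64_t prec_mask
    = prec >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << prec) - 1;
  return (mask | value) & prec_mask;
}
```

For the propbits-1.c case (value 1, mask 0x7, 32-bit int) this yields 0x7, so only the three low bits can be set in the parameter.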
diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
index e32d078..e5a56da 100644
--- a/gcc/ipa-prop.h
+++ b/gcc/ipa-prop.h
@@ -154,6 +154,19 @@ struct GTY(()) ipa_alignment
   unsigned misalign;
 };
 
+/* Information about zero/non-zero bits.  */
+struct GTY(()) ipa_bits
+{
+  /* The propagated value.  */
+  widest_int value;
+  /* Mask corresponding to the value.
+     Similar to ccp_lattice_t, if the xth bit of mask is 0,
+     then the xth bit of value is known to be constant.  */
+  widest_int mask;
+  /* True if the bits information is known.  */
+  bool known;
+};
+
 /* A jump function for a callsite represents the values passed as actual
    arguments of the callsite. See enum jump_func_type for the various
    types of jump functions supported.  */
@@ -166,6 +179,9 @@ struct GTY (()) ipa_jump_func
   /* Information about alignment of pointers. */
   struct ipa_alignment alignment;
 
+  /* Information about zero/non-zero bits.  */
+  struct ipa_bits bits;
+
   enum jump_func_type type;
   /* Represents a value of a jump function.  pass_through is used only in jump
      function context.  constant represents the actual constant in constant jump
@@ -283,8 +299,11 @@ ipa_get_jf_ancestor_type_preserved (struct ipa_jump_func *jfunc)
 
 struct ipa_param_descriptor
 {
-  /* PARAM_DECL of this parameter.  */
-  tree decl;
+  /* In analysis and modification phase, this is the PARM_DECL of this
+     parameter; in IPA LTO phase, this is the type of the described
+     parameter or NULL if not known.  Do not read this field directly but
+     through ipa_get_param and ipa_get_type as appropriate.  */
+  tree decl_or_type;
   /* If all uses of the parameter are described by ipa-prop structures, this
      says how many there are.  If any use could not be described by means of
      ipa-prop structures, this is IPA_UNDESCRIBED_USE.  */
@@ -402,13 +421,31 @@ ipa_get_param_count (struct ipa_node_params *info)
 
 /* Return the declaration of Ith formal parameter of the function corresponding
    to INFO.  Note there is no setter function as this array is built just once
-   using ipa_initialize_node_params. */
+   using ipa_initialize_node_params.  This function should not be called in
+   WPA.  */
 
 static inline tree
 ipa_get_param (struct ipa_node_params *info, int i)
 {
   gcc_checking_assert (!flag_wpa);
-  return info->descriptors[i].decl;
+  tree t = info->descriptors[i].decl_or_type;
+  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
+  return t;
+}
+
+/* Return the type of Ith formal parameter of the function corresponding
+   to INFO if it is known or NULL if not.  */
+
+static inline tree
+ipa_get_type (struct ipa_node_params *info, int i)
+{
+  tree t = info->descriptors[i].decl_or_type;
+  if (!t)
+    return NULL;
+  if (TYPE_P (t))
+    return t;
+  gcc_checking_assert (TREE_CODE (t) == PARM_DECL);
+  return TREE_TYPE (t);
 }
 
 /* Return the move cost of Ith formal parameter of the function corresponding
@@ -482,6 +519,8 @@ struct GTY(()) ipcp_transformation_summary
   ipa_agg_replacement_value *agg_values;
   /* Alignment information for pointers.  */
   vec<ipa_alignment, va_gc> *alignments;
+  /* Known bits information.  */
+  vec<ipa_bits, va_gc> *bits;
 };
 
 void ipa_set_node_agg_value_chain (struct cgraph_node *node,
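
As a reading aid for the ipa_bits structure, the following sketch shows how a <value, mask> pair is formed for the two argument kinds handled in ipa_compute_jump_functions_for_edge, and what the recurring `wi::sext (mask, precision) == -1` test checks. All names here are hypothetical, `uint64_t` stands in for `widest_int`, and `prec` is assumed to be between 1 and 64.

```cpp
#include <cstdint>

// Hypothetical stand-in for struct ipa_bits.
struct ipa_bits_sketch
{
  uint64_t value;  // known bit values
  uint64_t mask;   // 1 = bit unknown, 0 = bit known
  bool known;      // whether the pair carries any information
};

// An INTEGER_CST argument: every bit is known, so the mask is 0.
static ipa_bits_sketch
from_constant (uint64_t c)
{
  return { c, 0, true };
}

// An SSA_NAME argument: bits outside NONZERO are known to be 0, bits
// inside NONZERO are unknown.
static ipa_bits_sketch
from_nonzero_bits (uint64_t nonzero)
{
  return { 0, nonzero, true };
}

// Counterpart of wi::sext (mask, prec) == -1: true when all PREC low bits
// are unknown, in which case the lattice is dropped to bottom.
static bool
all_bits_unknown (uint64_t mask, unsigned prec)
{
  uint64_t prec_mask
    = prec >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << prec) - 1;
  return (mask & prec_mask) == prec_mask;
}
```

In this encoding an argument such as `y & 0xff` would carry value 0 and mask 0xff, which is not bottom for a 32-bit int but would be for an 8-bit type.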
diff --git a/gcc/opts.c b/gcc/opts.c
index 4053fb1..cde9a7b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -505,6 +505,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_cp_alignment, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fipa_cp_bit, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_sra, NULL, 1 },
@@ -1422,6 +1423,9 @@ enable_fdo_optimizations (struct gcc_options *opts,
   if (!opts_set->x_flag_ipa_cp_alignment
       && value && opts->x_flag_ipa_cp)
     opts->x_flag_ipa_cp_alignment = value;
+  if (!opts_set->x_flag_ipa_cp_bit
+      && value && opts->x_flag_ipa_cp)
+    opts->x_flag_ipa_cp_bit = value;
   if (!opts_set->x_flag_predictive_commoning)
     opts->x_flag_predictive_commoning = value;
   if (!opts_set->x_flag_unswitch_loops)
diff --git a/gcc/testsuite/gcc.dg/ipa/propbits-1.c b/gcc/testsuite/gcc.dg/ipa/propbits-1.c
new file mode 100644
index 0000000..8ec372d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/propbits-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
+
+__attribute__((noinline))
+static int f(int x)
+{
+  int some_op(int);
+  return some_op (x);
+}
+
+int main(void)
+{
+  int a = f(1);
+  int b = f(2);
+  int c = f(4);
+  return a + b + c;
+}
+
+/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0x7" "cp" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/propbits-2.c b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
new file mode 100644
index 0000000..3a960f0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
@@ -0,0 +1,41 @@
+/* x's mask should be meet(0xc, 0x3) == 0xf  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp -fdump-tree-optimized" } */
+
+extern int pass_test ();
+extern int fail_test ();
+
+__attribute__((noinline))
+static int f1(int x)
+{
+  if ((x & ~0xf) == 0)
+    return pass_test ();
+  else
+    return fail_test ();
+}
+
+__attribute__((noinline))
+static int f2(int y)
+{
+  return f1(y & 0x03);
+}
+
+__attribute__((noinline))
+static int f3(int z)
+{
+  return f1(z & 0xc);
+}
+
+extern int a;
+extern int b;
+
+int main(void)
+{
+  int k = f2(a);
+  int l = f3(b);
+  return k + l;
+}
+
+/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */
+/* { dg-final { scan-tree-dump-not "fail_test" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/propbits-3.c b/gcc/testsuite/gcc.dg/ipa/propbits-3.c
new file mode 100644
index 0000000..44744cd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/propbits-3.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
+
+__attribute__((noinline))
+static int f(int x)
+{
+  extern int limit;
+  extern int f2(int);
+
+  if (x == limit)
+    return x;
+  int k = f(x + 1);
+  return f2 (k);
+}
+
+int main(int argc, char **argv)
+{
+  int k = f(argc & 0xff);
+  return k;
+}
+
+/* { dg-final { scan-ipa-dump-not "Adjusting mask for" "cp" } } */
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 5d5386e..d88143b 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -142,7 +142,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "stor-layout.h"
 #include "optabs-query.h"
-
+#include "tree-ssa-ccp.h"
 
 /* Possible lattice values.  */
 typedef enum
@@ -536,9 +536,9 @@ set_lattice_value (tree var, ccp_prop_value_t *new_val)
 
 static ccp_prop_value_t get_value_for_expr (tree, bool);
 static ccp_prop_value_t bit_value_binop (enum tree_code, tree, tree, tree);
-static void bit_value_binop_1 (enum tree_code, tree, widest_int *, widest_int *,
-			       tree, const widest_int &, const widest_int &,
-			       tree, const widest_int &, const widest_int &);
+void bit_value_binop (enum tree_code, signop, int, widest_int *, widest_int *,
+		      signop, int, const widest_int &, const widest_int &,
+		      signop, int, const widest_int &, const widest_int &);
 
 /* Return a widest_int that can be used for bitwise simplifications
    from VAL.  */
@@ -894,7 +894,7 @@ do_dbg_cnt (void)
    Return TRUE when something was optimized.  */
 
 static bool
-ccp_finalize (bool nonzero_p)
+ccp_finalize (bool nonzero_p)
 {
   bool something_changed;
   unsigned i;
@@ -920,7 +920,8 @@ ccp_finalize (bool nonzero_p)
 
       val = get_value (name);
       if (val->lattice_val != CONSTANT
-	  || TREE_CODE (val->value) != INTEGER_CST)
+	  || TREE_CODE (val->value) != INTEGER_CST
+	  || val->mask == 0)
 	continue;
 
       if (POINTER_TYPE_P (TREE_TYPE (name)))
@@ -1224,10 +1225,11 @@ ccp_fold (gimple *stmt)
    RVAL and RMASK representing a value of type RTYPE and set
    the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_unop_1 (enum tree_code code, tree type,
-		  widest_int *val, widest_int *mask,
-		  tree rtype, const widest_int &rval, const widest_int &rmask)
+void
+bit_value_unop (enum tree_code code, signop type_sgn, int type_precision,
+		widest_int *val, widest_int *mask,
+		signop rtype_sgn, int rtype_precision,
+		const widest_int &rval, const widest_int &rmask)
 {
   switch (code)
     {
@@ -1240,25 +1242,23 @@ bit_value_unop_1 (enum tree_code code, tree type,
       {
 	widest_int temv, temm;
 	/* Return ~rval + 1.  */
-	bit_value_unop_1 (BIT_NOT_EXPR, type, &temv, &temm, type, rval, rmask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   type, temv, temm, type, 1, 0);
+	bit_value_unop (BIT_NOT_EXPR, type_sgn, type_precision, &temv, &temm,
+			type_sgn, type_precision, rval, rmask);
+	bit_value_binop (PLUS_EXPR, type_sgn, type_precision, val, mask,
+			 type_sgn, type_precision, temv, temm,
+			 type_sgn, type_precision, 1, 0);
 	break;
       }
 
     CASE_CONVERT:
       {
-	signop sgn;
-
 	/* First extend mask and value according to the original type.  */
-	sgn = TYPE_SIGN (rtype);
-	*mask = wi::ext (rmask, TYPE_PRECISION (rtype), sgn);
-	*val = wi::ext (rval, TYPE_PRECISION (rtype), sgn);
+	*mask = wi::ext (rmask, rtype_precision, rtype_sgn);
+	*val = wi::ext (rval, rtype_precision, rtype_sgn);
 
 	/* Then extend mask and value according to the target type.  */
-	sgn = TYPE_SIGN (type);
-	*mask = wi::ext (*mask, TYPE_PRECISION (type), sgn);
-	*val = wi::ext (*val, TYPE_PRECISION (type), sgn);
+	*mask = wi::ext (*mask, type_precision, type_sgn);
+	*val = wi::ext (*val, type_precision, type_sgn);
 	break;
       }
 
@@ -1272,15 +1272,14 @@ bit_value_unop_1 (enum tree_code code, tree type,
    R1VAL, R1MASK and R2VAL, R2MASK representing a values of type R1TYPE
    and R2TYPE and set the value, mask pair *VAL and *MASK to the result.  */
 
-static void
-bit_value_binop_1 (enum tree_code code, tree type,
-		   widest_int *val, widest_int *mask,
-		   tree r1type, const widest_int &r1val,
-		   const widest_int &r1mask, tree r2type,
-		   const widest_int &r2val, const widest_int &r2mask)
+void
+bit_value_binop (enum tree_code code, signop sgn, int width,
+		 widest_int *val, widest_int *mask,
+		 signop r1type_sgn, int r1type_precision,
+		 const widest_int &r1val, const widest_int &r1mask,
+		 signop r2type_sgn, int r2type_precision,
+		 const widest_int &r2val, const widest_int &r2mask)
 {
-  signop sgn = TYPE_SIGN (type);
-  int width = TYPE_PRECISION (type);
   bool swap_p = false;
 
   /* Assume we'll get a constant result.  Use an initial non varying
@@ -1406,11 +1405,11 @@ bit_value_binop_1 (enum tree_code code, tree type,
     case MINUS_EXPR:
       {
 	widest_int temv, temm;
-	bit_value_unop_1 (NEGATE_EXPR, r2type, &temv, &temm,
-			  r2type, r2val, r2mask);
-	bit_value_binop_1 (PLUS_EXPR, type, val, mask,
-			   r1type, r1val, r1mask,
-			   r2type, temv, temm);
+	bit_value_unop (NEGATE_EXPR, r2type_sgn, r2type_precision, &temv, &temm,
+			  r2type_sgn, r2type_precision, r2val, r2mask);
+	bit_value_binop (PLUS_EXPR, sgn, width, val, mask,
+			 r1type_sgn, r1type_precision, r1val, r1mask,
+			 r2type_sgn, r2type_precision, temv, temm);
 	break;
       }
 
@@ -1472,7 +1471,7 @@ bit_value_binop_1 (enum tree_code code, tree type,
 	  break;
 
 	/* For comparisons the signedness is in the comparison operands.  */
-	sgn = TYPE_SIGN (r1type);
+	sgn = r1type_sgn;
 
 	/* If we know the most significant bits we know the values
 	   value ranges by means of treating varying bits as zero
@@ -1525,8 +1524,9 @@ bit_value_unop (enum tree_code code, tree type, tree rhs)
   gcc_assert ((rval.lattice_val == CONSTANT
 	       && TREE_CODE (rval.value) == INTEGER_CST)
 	      || wi::sext (rval.mask, TYPE_PRECISION (TREE_TYPE (rhs))) == -1);
-  bit_value_unop_1 (code, type, &value, &mask,
-		    TREE_TYPE (rhs), value_to_wide_int (rval), rval.mask);
+  bit_value_unop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		  TYPE_SIGN (TREE_TYPE (rhs)), TYPE_PRECISION (TREE_TYPE (rhs)),
+		  value_to_wide_int (rval), rval.mask);
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1571,9 +1571,12 @@ bit_value_binop (enum tree_code code, tree type, tree rhs1, tree rhs2)
 	       && TREE_CODE (r2val.value) == INTEGER_CST)
 	      || wi::sext (r2val.mask,
 			   TYPE_PRECISION (TREE_TYPE (rhs2))) == -1);
-  bit_value_binop_1 (code, type, &value, &mask,
-		     TREE_TYPE (rhs1), value_to_wide_int (r1val), r1val.mask,
-		     TREE_TYPE (rhs2), value_to_wide_int (r2val), r2val.mask);
+  bit_value_binop (code, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (TREE_TYPE (rhs1)), TYPE_PRECISION (TREE_TYPE (rhs1)),
+		   value_to_wide_int (r1val), r1val.mask,
+		   TYPE_SIGN (TREE_TYPE (rhs2)), TYPE_PRECISION (TREE_TYPE (rhs2)),
+		   value_to_wide_int (r2val), r2val.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -1672,9 +1675,10 @@ bit_value_assume_aligned (gimple *stmt, tree attr, ccp_prop_value_t ptrval,
 
   align = build_int_cst_type (type, -aligni);
   alignval = get_value_for_expr (align, true);
-  bit_value_binop_1 (BIT_AND_EXPR, type, &value, &mask,
-		     type, value_to_wide_int (ptrval), ptrval.mask,
-		     type, value_to_wide_int (alignval), alignval.mask);
+  bit_value_binop (BIT_AND_EXPR, TYPE_SIGN (type), TYPE_PRECISION (type), &value, &mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (ptrval), ptrval.mask,
+		   TYPE_SIGN (type), TYPE_PRECISION (type), value_to_wide_int (alignval), alignval.mask);
+
   if (wi::sext (mask, TYPE_PRECISION (type)) != -1)
     {
       val.lattice_val = CONSTANT;
@@ -2409,7 +2413,7 @@ do_ssa_ccp (bool nonzero_p)
 
   ccp_initialize ();
   ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
-  if (ccp_finalize (nonzero_p))
+  if (ccp_finalize (nonzero_p || flag_ipa_cp_bit))
     {
       todo = (TODO_cleanup_cfg | TODO_update_ssa);
 
diff --git a/gcc/tree-ssa-ccp.h b/gcc/tree-ssa-ccp.h
new file mode 100644
index 0000000..35383c5
--- /dev/null
+++ b/gcc/tree-ssa-ccp.h
@@ -0,0 +1,29 @@
+/* Copyright (C) 2016 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_SSA_CCP_H
+#define TREE_SSA_CCP_H
+
+void bit_value_binop (enum tree_code, signop, int, widest_int *, widest_int *,
+		      signop, int, const widest_int &, const widest_int &,
+		      signop, int, const widest_int &, const widest_int &);
+
+void bit_value_unop (enum tree_code, signop, int, widest_int *, widest_int *,
+		     signop, int, const widest_int &, const widest_int &);
+
+#endif
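
For readers following the signature change, the effect of passing a
(signop, precision) pair instead of a tree type can be modelled without
any GCC types.  The snippet below is only an illustrative sketch, not code
from the patch: it assumes std::uint64_t in place of widest_int, a
hypothetical ext helper in place of wi::ext, and hypothetical convert and
bit_and helpers.  As in tree-ssa-ccp.c, a set mask bit is taken to mean
"this bit is unknown".  The BIT_AND formula is stated as an assumption
about this lattice rather than quoted from the patch.

```cpp
#include <cstdint>

// Illustrative stand-ins, not GCC API.  A set mask bit means "bit unknown".
enum Sign { SIGNED_T, UNSIGNED_T };

struct BitValue {
  std::uint64_t value;
  std::uint64_t mask;
};

// Extend X from PRECISION bits to 64 bits: sign-extend when SGN is
// signed and the top bit is set, otherwise zero-extend.
inline std::uint64_t ext(std::uint64_t x, unsigned precision, Sign sgn) {
  if (precision == 0) return 0;
  if (precision >= 64) return x;
  const std::uint64_t low_bits = (std::uint64_t{1} << precision) - 1;
  const std::uint64_t low = x & low_bits;
  const bool top_set = (low >> (precision - 1)) & 1;
  return (sgn == SIGNED_T && top_set) ? (low | ~low_bits) : low;
}

// Conversion: extend according to the source type first, then according
// to the target type, mirroring the two steps in the CASE_CONVERT hunk.
inline BitValue convert(BitValue r, unsigned rprec, Sign rsgn,
                        unsigned prec, Sign sgn) {
  return { ext(ext(r.value, rprec, rsgn), prec, sgn),
           ext(ext(r.mask, rprec, rsgn), prec, sgn) };
}

// Bitwise AND: a result bit is known when either operand has a known
// zero there, or when both operand bits are known.
inline BitValue bit_and(BitValue a, BitValue b) {
  return { a.value & b.value,
           (a.mask | b.mask) & (a.value | a.mask) & (b.value | b.mask) };
}
```

With this model, ANDing a fully unknown 32-bit value with the constant
0x3 leaves only the low two bits unknown, and converting the signed 8-bit
constant 0x80 to an unsigned 32-bit type gives 0xffffff80.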


* Re: [RFC] ipa bitwise constant propagation
  2016-08-24 12:07                               ` Prathamesh Kulkarni
@ 2016-08-25 13:44                                 ` Jan Hubicka
  2016-08-26 12:31                                   ` Prathamesh Kulkarni
  2016-08-26 16:23                                 ` Rainer Orth
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Hubicka @ 2016-08-25 13:44 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Jan Hubicka, Richard Biener, Kugan Vivekanandarajah, gcc Patches

> Patch for performing interprocedural bitwise constant propagation.
> 
> 2016-08-23  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
> 	    Martin Jambor  <mjambor@suse.cz>
> 
> 	* common.opt: New option -fipa-cp-bit.
> 	* doc/invoke.texi: Document -fipa-cp-bit.
> 	* opts.c (default_options_table): Add entry for -fipa-cp-bit.

Bitwise intraprocedural ccp is enabled by -ftree-bit-ccp, so I think the
option name should be -fipa-bit-cp so things are more consistent.

Patch is OK with this change.

Thanks!
Honza
> 	(enable_fdo_optimizations): Check for flag_ipa_cp_bit.
> 	* tree-ssa-ccp.h: New header file.
> 	* tree-ssa-ccp.c: Include tree-ssa-ccp.h
> 	(bit_value_binop_1): Change to bit_value_binop and export it.
> 	Replace all occurrences of tree parameter by two new params: signop, int.
> 	(bit_value_unop_1): Change to bit_value_unop and export it.
> 	Replace all occurrences of tree parameter by two new params: signop,
> 	int.
> 	(bit_value_binop): Change call from bit_value_binop_1 to
> 	bit_value_binop.
> 	(bit_value_assume_aligned): Likewise.
> 	(bit_value_unop): Change call from bit_value_unop_1 to bit_value_unop.
> 	(do_ssa_ccp): Pass nonzero_p || flag_ipa_cp_bit instead of nonzero_p
> 	to ccp_finalize.
> 	(ccp_finalize): Skip processing if val->mask == 0.
> 	* ipa-cp.c: Include tree-ssa-ccp.h
> 	(ipcp_bits_lattice): New class.
> 	(ipcp_param_lattice (bits_lattice): New member.
> 	(print_all_lattices): Call ipcp_bits_lattice::print.
> 	(set_all_contains_variable): Call ipcp_bits_lattice::set_to_bottom. 
> 	(initialize_node_lattices): Likewise.
> 	(propagate_bits_accross_jump_function): New function.
> 	(propagate_constants_accross_call): Call
> 	propagate_bits_accross_jump_function.
> 	(ipcp_propagate_stage): Store parameter types when in_lto_p is true.
> 	(ipcp_store_bits_results): New function.
> 	(ipcp_driver): Call ipcp_store_bits_results.
> 	* ipa-prop.h (ipa_bits): New struct.
> 	(ipa_jump_func): Add new member bits of type ipa_bits.
> 	(ipa_param_descriptor): Change decl to decl_or_type.
> 	(ipa_get_param): Change decl to decl_or_type and assert on
> 	PARM_DECL.
> 	(ipa_get_type): New function.
> 	(ipcp_transformation_summary): New member bits.
> 	* ipa-prop.c (ipa_get_param_decl_index_1): s/decl/decl_or_type.
> 	(ipa_populate_param_decls): Likewise.
> 	(ipa_dump_param): Likewise.
> 	(ipa_print_node_jump_functions_for_edge): Pretty-print ipa_bits jump
> 	function.
> 	(ipa_set_jf_unknown): Set ipa_bits::known to false.
> 	(ipa_compute_jump_functions_for_edge): Compute jump function for bits
> 	propagation.
> 	(ipa_node_params_t::duplicate): Copy src->bits into dst->bits.
> 	(ipa_write_jump_function): Add streaming for ipa_bits.
> 	(ipa_read_jump_function): Add support for reading streamed ipa_bits.
> 	(write_ipcp_transformation_info): Add streaming for ipa_bits
> 	summary for ltrans.
> 	(read_ipcp_transformation_info): Add support for reading streamed ipa_bits.
> 	(ipcp_update_bits): New function.
> 	(ipcp_transform_function): Call ipcp_update_bits.
> 


* Re: [RFC] ipa bitwise constant propagation
  2016-08-25 13:44                                 ` Jan Hubicka
@ 2016-08-26 12:31                                   ` Prathamesh Kulkarni
  0 siblings, 0 replies; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-26 12:31 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, Kugan Vivekanandarajah, gcc Patches

On 25 August 2016 at 19:14, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Patch for performing interprocedural bitwise constant propagation.
>>
>> 2016-08-23  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
>>           Martin Jambor  <mjambor@suse.cz>
>>
>>       * common.opt: New option -fipa-cp-bit.
>>       * doc/invoke.texi: Document -fipa-cp-bit.
>>       * opts.c (default_options_table): Add entry for -fipa-cp-bit.
>
> Bitwise intraprocedural ccp is enabled by -ftree-bit-ccp, so I think the
> option name should be -fipa-bit-cp so things are more consistent.
>
> Patch is OK with this change.
Thanks, committed the patch as r239769.
As a next step, I will try to merge bitwise and pointer-alignment propagation.

Thanks,
Prathamesh
>
> Thanks!
> Honza
>>       (enable_fdo_optimizations): Check for flag_ipa_cp_bit.
>>       * tree-ssa-ccp.h: New header file.
>>       * tree-ssa-ccp.c: Include tree-ssa-ccp.h
>>       (bit_value_binop_1): Change to bit_value_binop and export it.
>>       Replace all occurrences of tree parameter by two new params: signop, int.
>>       (bit_value_unop_1): Change to bit_value_unop and export it.
>>       Replace all occurrences of tree parameter by two new params: signop,
>>       int.
>>       (bit_value_binop): Change call from bit_value_binop_1 to
>>       bit_value_binop.
>>       (bit_value_assume_aligned): Likewise.
>>       (bit_value_unop): Change call from bit_value_unop_1 to bit_value_unop.
>>       (do_ssa_ccp): Pass nonzero_p || flag_ipa_cp_bit instead of nonzero_p
>>       to ccp_finalize.
>>       (ccp_finalize): Skip processing if val->mask == 0.
>>       * ipa-cp.c: Include tree-ssa-ccp.h
>>       (ipcp_bits_lattice): New class.
>>       (ipcp_param_lattice (bits_lattice): New member.
>>       (print_all_lattices): Call ipcp_bits_lattice::print.
>>       (set_all_contains_variable): Call ipcp_bits_lattice::set_to_bottom.
>>       (initialize_node_lattices): Likewise.
>>       (propagate_bits_accross_jump_function): New function.
>>       (propagate_constants_accross_call): Call
>>       propagate_bits_accross_jump_function.
>>       (ipcp_propagate_stage): Store parameter types when in_lto_p is true.
>>       (ipcp_store_bits_results): New function.
>>       (ipcp_driver): Call ipcp_store_bits_results.
>>       * ipa-prop.h (ipa_bits): New struct.
>>       (ipa_jump_func): Add new member bits of type ipa_bits.
>>       (ipa_param_descriptor): Change decl to decl_or_type.
>>       (ipa_get_param): Change decl to decl_or_type and assert on
>>       PARM_DECL.
>>       (ipa_get_type): New function.
>>       (ipcp_transformation_summary): New member bits.
>>       * ipa-prop.c (ipa_get_param_decl_index_1): s/decl/decl_or_type.
>>       (ipa_populate_param_decls): Likewise.
>>       (ipa_dump_param): Likewise.
>>       (ipa_print_node_jump_functions_for_edge): Pretty-print ipa_bits jump
>>       function.
>>       (ipa_set_jf_unknown): Set ipa_bits::known to false.
>>       (ipa_compute_jump_functions_for_edge): Compute jump function for bits
>>       propagation.
>>       (ipa_node_params_t::duplicate): Copy src->bits into dst->bits.
>>       (ipa_write_jump_function): Add streaming for ipa_bits.
>>       (ipa_read_jump_function): Add support for reading streamed ipa_bits.
>>       (write_ipcp_transformation_info): Add streaming for ipa_bits
>>       summary for ltrans.
>>       (read_ipcp_transformation_info): Add support for reading streamed ipa_bits.
>>       (ipcp_update_bits): New function.
>>       (ipcp_transform_function): Call ipcp_update_bits.
>>
>


* Re: [RFC] ipa bitwise constant propagation
  2016-08-24 12:07                               ` Prathamesh Kulkarni
  2016-08-25 13:44                                 ` Jan Hubicka
@ 2016-08-26 16:23                                 ` Rainer Orth
  2016-08-26 17:23                                   ` Prathamesh Kulkarni
  1 sibling, 1 reply; 31+ messages in thread
From: Rainer Orth @ 2016-08-26 16:23 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Jan Hubicka, Richard Biener, Kugan Vivekanandarajah, gcc Patches

[-- Attachment #1: Type: text/plain, Size: 2614 bytes --]

Hi Prathamesh,

> The attached version passes bootstrap+test on
> x86_64-unknown-linux-gnu, ppc64le-linux-gnu,
> and with c,c++,fortran on armv8l-linux-gnueabihf.
> Cross-tested on arm*-*-* and aarch64*-*-*.
> Verified the patch survives lto-bootstrap on x86_64-unknown-linux-gnu.
> Ok to commit ?
[...]
> testsuite/
> 	* gcc.dg/ipa/propbits-1.c: New test-case.
> 	* gcc.dg/ipa/propbits-2.c: Likewise.
> 	* gcc.dg/ipa/propbits-3.c: Likewise.
[...]
> diff --git a/gcc/testsuite/gcc.dg/ipa/propbits-2.c b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
> new file mode 100644
> index 0000000..3a960f0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
> @@ -0,0 +1,41 @@
> +/* x's mask should be meet(0xc, 0x3) == 0xf  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
> +
> +extern int pass_test ();
> +extern int fail_test ();
> +
> +__attribute__((noinline))
> +static int f1(int x)
> +{
> +  if ((x & ~0xf) == 0)
> +    return pass_test ();
> +  else
> +    return fail_test ();
> +}
> +
> +__attribute__((noinline))
> +static int f2(int y)
> +{
> +  return f1(y & 0x03);
> +}
> +
> +__attribute__((noinline))
> +static int f3(int z)
> +{
> +  return f1(z & 0xc);
> +}
> +
> +extern int a;
> +extern int b;
> +
> +int main(void)
> +{
> +  int k = f2(a); 
> +  int l = f3(b);
> +  return k + l;
> +}
> +
> +/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */
> +/* { dg-final { scan-dump-tree-not "fail_test" "optimized" } } */
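
The expectation in the quoted test comment, meet(0xc, 0x3) == 0xf, can be
reproduced with a small model.  This is a sketch under stated assumptions,
not the ipa-cp.c interface: the names Bits, meet and known_zero are
hypothetical, a set mask bit means "unknown", and meet is assumed to treat
a bit as unknown when either side is unknown or when the two known values
disagree.

```cpp
#include <cstdint>

// Hypothetical model of a known-bits lattice element.
// A set mask bit means the corresponding value bit is unknown.
struct Bits {
  std::uint32_t value;
  std::uint32_t mask;
};

// Meet of two elements: a bit stays known only if it is known on both
// sides and both sides agree on its value.
inline Bits meet(Bits a, Bits b) {
  const std::uint32_t mask = a.mask | b.mask | (a.value ^ b.value);
  return { a.value & ~mask, mask };
}

// True when (x & test_mask) == 0 holds for every x consistent with B.
inline bool known_zero(Bits b, std::uint32_t test_mask) {
  return ((b.value | b.mask) & test_mask) == 0;
}
```

In this model f2 passes {value 0, mask 0x3} and f3 passes {value 0, mask
0xc}; their meet has mask 0xf, so (x & ~0xf) is known to be zero in f1 and
the fail_test branch is dead.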

This testcase thoroughly broke make check-gcc:

At first, runtest errors out with

ERROR: (DejaGnu) proc "scan-dump-tree-not fail_test optimized" does not exist.

The resulting incomplete gcc.sum files confuse dg-extract-results.py

testsuite/gcc6/gcc.sum.sep: no recognised summary line
testsuite/gcc6/gcc.log.sep: no recognised summary line

and cause it to emit an empty gcc.sum, effectively losing all gcc
test results in mail-report.log.

This cannot have been tested in any reasonable way.

Once you fix the typo (scan-dump-tree-not -> scan-tree-dump-not), at
least we get a complete gcc.sum again, but the testcase still shows up as

UNRESOLVED: gcc.dg/ipa/propbits-2.c scan-tree-dump-not optimized "fail_test"

and gcc.log shows

gcc.dg/ipa/propbits-2.c: dump file does not exist

Adding -fdump-tree-optimized creates the necessary dump and finally lets
the test pass.

Here's the resulting patch.  Unless there are objections, I plan to
commit it soon.

	Rainer


2016-08-26  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	* gcc.dg/ipa/propbits-2.c: Add -fdump-tree-optimized to dg-options.
	Fix typo.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: pb2.patch --]
[-- Type: text/x-patch, Size: 722 bytes --]

diff --git a/gcc/testsuite/gcc.dg/ipa/propbits-2.c b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
--- a/gcc/testsuite/gcc.dg/ipa/propbits-2.c
+++ b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
@@ -1,7 +1,7 @@
 /* x's mask should be meet(0xc, 0x3) == 0xf  */
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp -fdump-tree-optimized" } */
 
 extern int pass_test ();
 extern int fail_test ();
@@ -38,4 +38,4 @@ int main(void)
 }
 
 /* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */
-/* { dg-final { scan-dump-tree-not "fail_test" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "fail_test" "optimized" } } */

[-- Attachment #3: Type: text/plain, Size: 143 bytes --]


-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University


* Re: [RFC] ipa bitwise constant propagation
  2016-08-26 16:23                                 ` Rainer Orth
@ 2016-08-26 17:23                                   ` Prathamesh Kulkarni
  2016-08-29 10:53                                     ` Christophe Lyon
  0 siblings, 1 reply; 31+ messages in thread
From: Prathamesh Kulkarni @ 2016-08-26 17:23 UTC (permalink / raw)
  To: Rainer Orth
  Cc: Jan Hubicka, Richard Biener, Kugan Vivekanandarajah, gcc Patches

On 26 August 2016 at 21:53, Rainer Orth <ro@cebitec.uni-bielefeld.de> wrote:
> Hi Prathamesh,
>
>> The attached version passes bootstrap+test on
>> x86_64-unknown-linux-gnu, ppc64le-linux-gnu,
>> and with c,c++,fortran on armv8l-linux-gnueabihf.
>> Cross-tested on arm*-*-* and aarch64*-*-*.
>> Verified the patch survives lto-bootstrap on x86_64-unknown-linux-gnu.
>> Ok to commit ?
> [...]
>> testsuite/
>>       * gcc.dg/ipa/propbits-1.c: New test-case.
>>       * gcc.dg/ipa/propbits-2.c: Likewise.
>>       * gcc.dg/ipa/propbits-3.c: Likewise.
> [...]
>> diff --git a/gcc/testsuite/gcc.dg/ipa/propbits-2.c b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
>> new file mode 100644
>> index 0000000..3a960f0
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
>> @@ -0,0 +1,41 @@
>> +/* x's mask should be meet(0xc, 0x3) == 0xf  */
>> +
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
>> +
>> +extern int pass_test ();
>> +extern int fail_test ();
>> +
>> +__attribute__((noinline))
>> +static int f1(int x)
>> +{
>> +  if ((x & ~0xf) == 0)
>> +    return pass_test ();
>> +  else
>> +    return fail_test ();
>> +}
>> +
>> +__attribute__((noinline))
>> +static int f2(int y)
>> +{
>> +  return f1(y & 0x03);
>> +}
>> +
>> +__attribute__((noinline))
>> +static int f3(int z)
>> +{
>> +  return f1(z & 0xc);
>> +}
>> +
>> +extern int a;
>> +extern int b;
>> +
>> +int main(void)
>> +{
>> +  int k = f2(a);
>> +  int l = f3(b);
>> +  return k + l;
>> +}
>> +
>> +/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */
>> +/* { dg-final { scan-dump-tree-not "fail_test" "optimized" } } */
>
> This testcase thoroughly broke make check-gcc:
Oops, sorry for the breakage. I am not sure how this missed my testing :/
I obtained test results using test_summary script with and without patch,
and compared the results with compare_tests which apparently showed no
regressions...
Thanks for the fix.

Thanks,
Prathamesh
>
> At first, runtest errors out with
>
> ERROR: (DejaGnu) proc "scan-dump-tree-not fail_test optimized" does not exist.
>
> The resulting incomplete gcc.sum files confuse dg-extract-results.py
>
> testsuite/gcc6/gcc.sum.sep: no recognised summary line
> testsuite/gcc6/gcc.log.sep: no recognised summary line
>
> and cause it to emit an empty gcc.sum, effectively losing all gcc
> test results in mail-report.log.
>
> This cannot have been tested in any reasonable way.
>
> Once you fix the typo (scan-dump-tree-not -> scan-tree-dump-not), at
> least we get a complete gcc.sum again, but the testcase still shows up as
>
> UNRESOLVED: gcc.dg/ipa/propbits-2.c scan-tree-dump-not optimized "fail_test"
>
> and gcc.log shows
>
> gcc.dg/ipa/propbits-2.c: dump file does not exist
>
> Adding -fdump-tree-optimized creates the necessary dump and finally lets
> the test pass.
>
> Here's the resulting patch.  Unless there are objections, I plan to
> commit it soon.
>
>         Rainer
>
>
> 2016-08-26  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>
>         * gcc.dg/ipa/propbits-2.c: Add -fdump-tree-optimized to dg-options.
>         Fix typo.
>
>
>
> --
> -----------------------------------------------------------------------------
> Rainer Orth, Center for Biotechnology, Bielefeld University
>


* Re: [RFC] ipa bitwise constant propagation
  2016-08-26 17:23                                   ` Prathamesh Kulkarni
@ 2016-08-29 10:53                                     ` Christophe Lyon
  0 siblings, 0 replies; 31+ messages in thread
From: Christophe Lyon @ 2016-08-29 10:53 UTC (permalink / raw)
  To: Prathamesh Kulkarni
  Cc: Rainer Orth, Jan Hubicka, Richard Biener, Kugan Vivekanandarajah,
	gcc Patches

On 26 August 2016 at 19:22, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> On 26 August 2016 at 21:53, Rainer Orth <ro@cebitec.uni-bielefeld.de> wrote:
>> Hi Prathamesh,
>>
>>> The attached version passes bootstrap+test on
>>> x86_64-unknown-linux-gnu, ppc64le-linux-gnu,
>>> and with c,c++,fortran on armv8l-linux-gnueabihf.
>>> Cross-tested on arm*-*-* and aarch64*-*-*.
>>> Verified the patch survives lto-bootstrap on x86_64-unknown-linux-gnu.
>>> Ok to commit ?
>> [...]
>>> testsuite/
>>>       * gcc.dg/ipa/propbits-1.c: New test-case.
>>>       * gcc.dg/ipa/propbits-2.c: Likewise.
>>>       * gcc.dg/ipa/propbits-3.c: Likewise.
>> [...]
>>> diff --git a/gcc/testsuite/gcc.dg/ipa/propbits-2.c b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
>>> new file mode 100644
>>> index 0000000..3a960f0
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/ipa/propbits-2.c
>>> @@ -0,0 +1,41 @@
>>> +/* x's mask should be meet(0xc, 0x3) == 0xf  */
>>> +
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-cp" } */
>>> +
>>> +extern int pass_test ();
>>> +extern int fail_test ();
>>> +
>>> +__attribute__((noinline))
>>> +static int f1(int x)
>>> +{
>>> +  if ((x & ~0xf) == 0)
>>> +    return pass_test ();
>>> +  else
>>> +    return fail_test ();
>>> +}
>>> +
>>> +__attribute__((noinline))
>>> +static int f2(int y)
>>> +{
>>> +  return f1(y & 0x03);
>>> +}
>>> +
>>> +__attribute__((noinline))
>>> +static int f3(int z)
>>> +{
>>> +  return f1(z & 0xc);
>>> +}
>>> +
>>> +extern int a;
>>> +extern int b;
>>> +
>>> +int main(void)
>>> +{
>>> +  int k = f2(a);
>>> +  int l = f3(b);
>>> +  return k + l;
>>> +}
>>> +
>>> +/* { dg-final { scan-ipa-dump "Adjusting mask for param 0 to 0xf" "cp" } } */
>>> +/* { dg-final { scan-dump-tree-not "fail_test" "optimized" } } */
>>
>> This testcase thoroughly broke make check-gcc:
> Oops, sorry for the breakage. I am not sure how this missed my testing :/
> I obtained test results using test_summary script with and without patch,
> and compared the results with compare_tests which apparently showed no
> regressions...
> Thanks for the fix.
>

Hmmm that's weird indeed.

> Thanks,
> Prathamesh
>>
>> At first, runtest errors out with
>>
>> ERROR: (DejaGnu) proc "scan-dump-tree-not fail_test optimized" does not exist.

I do see this message in gcc.log (and in gcc.sum), but...
>>
>> The resulting incomplete gcc.sum files confuse dg-extract-results.py
>>
>> testsuite/gcc6/gcc.sum.sep: no recognised summary line
>> testsuite/gcc6/gcc.log.sep: no recognised summary line
>>
.... I do not see this...

>> and cause it to emit an empty gcc.sum, effectively losing all gcc
>> test results in mail-report.log.
and gcc.sum looks quite good (except for the ERROR: message
which is not noticed by the comparison tools).

It could be an effect of a different 'make -j' value, resulting
in a different split of gcc.sum.sep, so that the error goes
unnoticed.
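
Since the comparison tools apparently skip such lines, one way to catch
them is to scan the summary for lines beginning with "ERROR:" before
comparing.  The helper below is a hypothetical sketch, not part of the
contrib scripts; the function names are made up, and the only assumption
about the format is that such lines start with the literal text "ERROR:",
as in the message quoted above.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Count lines that start with "ERROR:" in a DejaGnu-style summary stream.
inline std::size_t count_error_lines(std::istream &in) {
  std::size_t n = 0;
  std::string line;
  while (std::getline(in, line))
    if (line.compare(0, 6, "ERROR:") == 0)
      ++n;
  return n;
}

// Convenience wrapper for text already held in memory.
inline std::size_t count_error_lines_in(const std::string &text) {
  std::istringstream in(text);
  return count_error_lines(in);
}
```

A non-zero count for a gcc.sum that otherwise compares clean would flag
the kind of breakage discussed here.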

Christophe

>> This cannot have been tested in any reasonable way.
>>
>> Once you fix the typo (scan-dump-tree-not -> scan-tree-dump-not), at
>> least we get a complete gcc.sum again, but the testcase still shows up as
>>
>> UNRESOLVED: gcc.dg/ipa/propbits-2.c scan-tree-dump-not optimized "fail_test"
>>
>> and gcc.log shows
>>
>> gcc.dg/ipa/propbits-2.c: dump file does not exist
>>
>> Adding -fdump-tree-optimized creates the necessary dump and finally lets
>> the test pass.
>>
>> Here's the resulting patch.  Unless there are objections, I plan to
>> commit it soon.
>>
>>         Rainer
>>
>>
>> 2016-08-26  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>>
>>         * gcc.dg/ipa/propbits-2.c: Add -fdump-tree-optimized to dg-options.
>>         Fix typo.
>>
>>
>>
>> --
>> -----------------------------------------------------------------------------
>> Rainer Orth, Center for Biotechnology, Bielefeld University
>>


end of thread, other threads:[~2016-08-29 10:53 UTC | newest]

Thread overview: 31+ messages
2016-08-04  6:36 [RFC] ipa bitwise constant propagation Prathamesh Kulkarni
2016-08-04  8:02 ` Richard Biener
2016-08-04  8:57   ` Prathamesh Kulkarni
2016-08-04  9:07     ` kugan
2016-08-04 10:51     ` Richard Biener
2016-08-04 13:05   ` Jan Hubicka
2016-08-04 23:04     ` kugan
2016-08-05 11:36       ` Jan Hubicka
2016-08-05 12:37 ` Martin Jambor
2016-08-07 21:38   ` Prathamesh Kulkarni
2016-08-08 14:04     ` Martin Jambor
2016-08-08 14:29       ` David Malcolm
2016-08-09  8:11       ` Prathamesh Kulkarni
2016-08-09  9:24         ` Richard Biener
2016-08-09 11:09         ` Martin Jambor
2016-08-09 11:47           ` Prathamesh Kulkarni
2016-08-09 18:13             ` Martin Jambor
2016-08-10  8:45               ` Prathamesh Kulkarni
2016-08-10 11:35                 ` Prathamesh Kulkarni
2016-08-11 12:55                   ` Jan Hubicka
2016-08-12  9:54                     ` Prathamesh Kulkarni
2016-08-12 14:04                       ` Jan Hubicka
2016-08-16 13:05                         ` Prathamesh Kulkarni
2016-08-22 13:33                           ` Martin Jambor
2016-08-22 13:55                             ` Prathamesh Kulkarni
2016-08-24 12:07                               ` Prathamesh Kulkarni
2016-08-25 13:44                                 ` Jan Hubicka
2016-08-26 12:31                                   ` Prathamesh Kulkarni
2016-08-26 16:23                                 ` Rainer Orth
2016-08-26 17:23                                   ` Prathamesh Kulkarni
2016-08-29 10:53                                     ` Christophe Lyon
