public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Fix PR61306: improve handling of sign and cast in bswap
@ 2014-06-04  5:59 Thomas Preud'homme
  2014-06-04  9:39 ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-04  5:59 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3272 bytes --]

When the bswap pass replaces a bitwise expression involving a memory source with a load, possibly followed by a bswap, the load may have a size smaller than that of the target to which the bitwise expression is assigned, so some sort of cast is needed. But there might also be a difference between the size of the expression that was assigned and the size of the load, so three different sizes might be involved. Consider the following example from binutils:

bfd_vma
bfd_getl16 (const void *p)
{
  const bfd_byte *addr = (const bfd_byte *) p;
  return (addr[1] << 8) | addr[0];
}

Here the load has a size of 16 bits, while the bitwise expression is an int (don't ask me why) but is returned as a 64-bit quantity (bfd_vma maps to the size of host registers). In this case we need two separate casts. The first goes from the 16-bit quantity to int with zero extension, as the high bits are 0. It is always a zero extension because the bswap pass will not do anything in the presence of a sign extension, since the result would then depend on the initial value (maybe a bswap for a positive number and a random value for a negative one). Then we need to cast respecting the extension that would have happened had we not replaced the bitwise expression. Here, since the bitwise expression is int, that means we sign extend and then treat the contents as unsigned (bfd_vma is an unsigned quantity).

When a bswap is necessary, we need to do this double cast after the bswap, as the bswap must be done on the loaded value since that's the size expected by the bswap builtin. Finally, this patch also forbids any sign extension *in* the bitwise expression, as the result of the expression would then be unpredictable (it would depend on the initial value).
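
To make the sequence of casts concrete, here is a rough, self-contained C sketch of what the replacement has to compute for bfd_getl16 on a little-endian host (the typedef and the memcpy-based load are assumptions for illustration, not the pass's actual output):

typedef unsigned long long bfd_vma;   /* assumption: 64-bit host registers */

bfd_vma
getl16_sketch (const void *p)
{
  unsigned short v;
  __builtin_memcpy (&v, p, sizeof v);   /* the 16-bit load */
  int widened = v;                      /* first cast: zero extension to int,
                                           the type of the bitwise expression */
  return (bfd_vma) widened;             /* second cast: follows int, then the
                                           value is used as unsigned */
}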

The patch works this way:

1) prevent size extension of a bitwise expression
2) record the type of the bitwise expression instead of its size (the size can be determined from the type)
3) use this type to perform a double cast as explained above


2014-05-30  Thomas Preud'homme  <thomas.preudhomme@arm.com>

	PR tree-optimization/61306
	* tree-ssa-math-opts.c (struct symbolic_number): Store type of
	expression instead of its size.
	(do_shift_rotate): Adapt to change in struct symbolic_number.
	(verify_symbolic_number_p): Likewise.
	(init_symbolic_number): Likewise.
	(find_bswap_or_nop_1): Likewise. Also prevent optimization when the
	result of the expression is unpredictable due to sign extension.
	(convert_via): New function to deal with the casts needed from the
	loaded value to the target SSA name.
	(bswap_replace): Rename load_type to range_type to reflect that it is
	the type the accessed memory shall have before being cast. Select the
	load type according to whether a bswap needs to be done. Cast first to
	the rhs type with zero extension and then to the lhs type with sign
	extension to keep the semantics of the original stmts.
	(pass_optimize_bswap::execute): Adapt to change in struct
	symbolic_number. Decide whether the accessed range should be signed or
	unsigned before being cast to the lhs type, based on the rhs type and
	size.

2014-05-29  Thomas Preud'homme  <thomas.preudhomme@arm.com>

	* gcc.c-torture/execute/pr61306.c: New test.

Patch is in attachment. Is this ok for trunk?

Best regards,

Thomas

[-- Attachment #2: PR61306.1.0.diff --]
[-- Type: application/octet-stream, Size: 13810 bytes --]

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306.c b/gcc/testsuite/gcc.c-torture/execute/pr61306.c
new file mode 100644
index 0000000..6086e27
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306.c
@@ -0,0 +1,13 @@
+short a = -1;
+int b;
+char c;
+
+int
+main ()
+{
+  c = a;
+  b = a | c;
+  if (b != -1)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 658b341..4c0b808 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1622,7 +1622,7 @@ make_pass_cse_sincos (gcc::context *ctxt)
 
 struct symbolic_number {
   uint64_t n;
-  int size;
+  tree type;
   tree base_addr;
   tree offset;
   HOST_WIDE_INT bytepos;
@@ -1652,13 +1652,15 @@ do_shift_rotate (enum tree_code code,
 		 struct symbolic_number *n,
 		 int count)
 {
+  int bitsize = TYPE_PRECISION (n->type);
+
   if (count % 8 != 0)
     return false;
 
   /* Zero out the extra bits of N in order to avoid them being shifted
      into the significant bits.  */
-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize / BITS_PER_UNIT < (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << bitsize) - 1;
 
   switch (code)
     {
@@ -1669,17 +1671,17 @@ do_shift_rotate (enum tree_code code,
       n->n >>= count;
       break;
     case LROTATE_EXPR:
-      n->n = (n->n << count) | (n->n >> ((n->size * BITS_PER_UNIT) - count));
+      n->n = (n->n << count) | (n->n >> (bitsize - count));
       break;
     case RROTATE_EXPR:
-      n->n = (n->n >> count) | (n->n << ((n->size * BITS_PER_UNIT) - count));
+      n->n = (n->n >> count) | (n->n << (bitsize - count));
       break;
     default:
       return false;
     }
   /* Zero unused bits for size.  */
-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize / BITS_PER_UNIT < (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << bitsize) - 1;
   return true;
 }
 
@@ -1696,7 +1698,7 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
   if (TREE_CODE (lhs_type) != INTEGER_TYPE)
     return false;
 
-  if (TYPE_PRECISION (lhs_type) != n->size * BITS_PER_UNIT)
+  if (TYPE_PRECISION (lhs_type) != TYPE_PRECISION (n->type))
     return false;
 
   return true;
@@ -1708,20 +1710,23 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
 static bool
 init_symbolic_number (struct symbolic_number *n, tree src)
 {
+  int size;
+
   n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
 
   /* Set up the symbolic number N by setting each byte to a value between 1 and
      the byte size of rhs1.  The highest order byte is set to n->size and the
      lowest order byte to 1.  */
-  n->size = TYPE_PRECISION (TREE_TYPE (src));
-  if (n->size % BITS_PER_UNIT != 0)
+  n->type = TREE_TYPE (src);
+  size = TYPE_PRECISION (n->type);
+  if (size % BITS_PER_UNIT != 0)
     return false;
-  n->size /= BITS_PER_UNIT;
-  n->range = n->size;
+  size /= BITS_PER_UNIT;
+  n->range = size;
   n->n = CMPNOP;
 
-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (size < (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << (size * BITS_PER_UNIT)) - 1;
 
   return true;
 }
@@ -1857,12 +1862,12 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	{
 	case BIT_AND_EXPR:
 	  {
-	    int i;
+	    int i, size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
 	    uint64_t val = int_cst_value (rhs2);
 	    uint64_t tmp = val;
 
 	    /* Only constants masking full bytes are allowed.  */
-	    for (i = 0; i < n->size; i++, tmp >>= BITS_PER_UNIT)
+	    for (i = 0; i < size; i++, tmp >>= BITS_PER_UNIT)
 	      if ((tmp & 0xff) != 0 && (tmp & 0xff) != 0xff)
 		return NULL_TREE;
 
@@ -1879,20 +1884,27 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	CASE_CONVERT:
 	  {
 	    int type_size;
+	    tree type;
 
-	    type_size = TYPE_PRECISION (gimple_expr_type (stmt));
+	    type = gimple_expr_type (stmt);
+	    type_size = TYPE_PRECISION (type);
 	    if (type_size % BITS_PER_UNIT != 0)
 	      return NULL_TREE;
 
+	    /* Sign extension: result is dependent on the value.  */
+	    if (!TYPE_UNSIGNED (type) && !TYPE_UNSIGNED (n->type)
+		&& type_size > TYPE_PRECISION (n->type))
+	      return NULL_TREE;
+
 	    if (type_size / BITS_PER_UNIT < (int)(sizeof (int64_t)))
 	      {
 		/* If STMT casts to a smaller type mask out the bits not
 		   belonging to the target type.  */
 		n->n &= ((uint64_t)1 << type_size) - 1;
 	      }
-	    n->size = type_size / BITS_PER_UNIT;
+	    n->type = type;
 	    if (!n->base_addr)
-	      n->range = n->size;
+	      n->range = type_size / BITS_PER_UNIT;
 	  }
 	  break;
 	default:
@@ -1905,7 +1917,7 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 
   if (rhs_class == GIMPLE_BINARY_RHS)
     {
-      int i;
+      int i, size;
       struct symbolic_number n1, n2;
       uint64_t mask;
       tree source_expr2;
@@ -1931,7 +1943,7 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	  if (!source_expr2)
 	    return NULL_TREE;
 
-	  if (n1.size != n2.size)
+	  if (TYPE_PRECISION (n1.type) != TYPE_PRECISION (n2.type))
 	    return NULL_TREE;
 
 	  if (!n1.vuse != !n2.vuse ||
@@ -1997,8 +2009,9 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	  n->base_addr = n1.base_addr;
 	  n->offset = n1.offset;
 	  n->bytepos = n1.bytepos;
-	  n->size = n1.size;
-	  for (i = 0, mask = 0xff; i < n->size; i++, mask <<= BITS_PER_UNIT)
+	  n->type = n1.type;
+	  size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
+	  for (i = 0, mask = 0xff; i < size; i++, mask <<= BITS_PER_UNIT)
 	    {
 	      uint64_t masked1, masked2;
 
@@ -2123,6 +2136,68 @@ public:
 
 }; // class pass_optimize_bswap
 
+/* Convert *unsigned* expression FROM to an expression of type TO_TYPE.
+   Expression is first zero extended to VIA_TYPE if different from the type of
+   FROM and then sign extended according to the sign of TO.  If ASSIGN_TO is
+   true, then TO is an SSA and the final cast result is assigned to it.  Else,
+   TO is a tree type.  INSERT_BEFORE controls whether the stmt generated for
+   the cast are inserted before or after the stmt pointed at by GSI.
+
+   This function returns the stmt corresponding to the last conversion, or NULL
+   if no conversion was needed.
+
+   The purpose of this function is to cast the result of a load created by the
+   bswap pass to the type of the lhs of the statement replaced. To keep the
+   semantic of the initial set of stmt, it is necessary to zero extend to the
+   size of the rhs and then sign extend to the lhs as would have happened in
+   the original stmt.  */
+
+static gimple
+convert_via (gimple_stmt_iterator *gsi, tree from, tree via_type, tree to,
+	 bool insert_before, bool assign_to)
+{
+  tree via_tmp;
+  gimple_seq seq = NULL;
+  gimple last_cast_stmt = NULL;
+
+  gcc_assert (TYPE_UNSIGNED (TREE_TYPE (from)));
+  gcc_assert (assign_to || TYPE_CHECK (to));
+
+  /* First zero extend to size of via_type if necessary.  */
+  if (!useless_type_conversion_p (via_type, TREE_TYPE (from)))
+    {
+      gimple via_cast_stmt;
+
+      via_tmp = make_temp_ssa_name (via_type, NULL, "bswap_via_cast");
+      via_cast_stmt = gimple_build_assign_with_ops (CONVERT_EXPR, via_tmp,
+						    from, NULL);
+      gimple_seq_add_stmt (&seq, via_cast_stmt);
+      last_cast_stmt = via_cast_stmt;
+    }
+  else
+    via_tmp = from;
+
+  /* Then sign extend to size of to_type.  */
+  if (assign_to && !useless_type_conversion_p (TREE_TYPE (to), via_type))
+    {
+      gimple to_cast_stmt;
+
+      to_cast_stmt = gimple_build_assign_with_ops (CONVERT_EXPR, to, via_tmp,
+						   NULL);
+      gimple_seq_add_stmt (&seq, to_cast_stmt);
+      last_cast_stmt = to_cast_stmt;
+    }
+  else if (via_tmp != from && assign_to)
+    gimple_assign_set_lhs (last_cast_stmt, to);
+
+  if (insert_before)
+    gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+  else
+    gsi_insert_seq_after (gsi, seq, GSI_SAME_STMT);
+
+  return last_cast_stmt;
+}
+
 /* Perform the bswap optimization: replace the statement STMT at GSI
    with load type, VUSE and set-alias as described by N if a memory
    source is involved (N->base_addr is non null), followed by the
@@ -2132,7 +2207,7 @@ public:
 
 static bool
 bswap_replace (gimple stmt, gimple_stmt_iterator *gsi, tree src, tree fndecl,
-	       tree bswap_type, tree load_type, struct symbolic_number *n,
+	       tree bswap_type, tree range_type, struct symbolic_number *n,
 	       bool bswap)
 {
   tree tmp, tgt;
@@ -2144,12 +2219,12 @@ bswap_replace (gimple stmt, gimple_stmt_iterator *gsi, tree src, tree fndecl,
   if (n->base_addr)
     {
       tree addr_expr, addr_tmp, val_expr, val_tmp;
-      tree load_offset_ptr, aligned_load_type;
+      tree load_offset_ptr, load_type, aligned_load_type;
       gimple addr_stmt, load_stmt;
       unsigned align;
 
       align = get_object_alignment (src);
-      if (bswap && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align))
+      if (bswap && SLOW_UNALIGNED_ACCESS (TYPE_MODE (range_type), align))
 	return false;
 
       /*  Compute address to load from and cast according to the size
@@ -2166,6 +2241,7 @@ bswap_replace (gimple stmt, gimple_stmt_iterator *gsi, tree src, tree fndecl,
 	}
 
       /* Perform the load.  */
+      load_type = bswap ? bswap_type : range_type;
       aligned_load_type = load_type;
       if (align < TYPE_ALIGN (load_type))
 	aligned_load_type = build_aligned_type (load_type, align);
@@ -2186,14 +2262,23 @@ bswap_replace (gimple stmt, gimple_stmt_iterator *gsi, tree src, tree fndecl,
 	    }
 
 	  /* Convert the result of load if necessary.  */
-	  if (!useless_type_conversion_p (TREE_TYPE (tgt), load_type))
+	  if (!useless_type_conversion_p (TREE_TYPE (tgt),
+					  TREE_TYPE (val_expr)))
 	    {
-	      val_tmp = make_temp_ssa_name (aligned_load_type, NULL,
+	      gimple cast_stmt;
+	      tree cast_tmp;
+
+	      val_tmp = make_temp_ssa_name (TREE_TYPE (val_expr), NULL,
 					    "load_dst");
 	      load_stmt = gimple_build_assign (val_tmp, val_expr);
 	      gimple_set_vuse (load_stmt, n->vuse);
 	      gsi_insert_before (gsi, load_stmt, GSI_SAME_STMT);
-	      gimple_assign_set_rhs_with_ops_1 (gsi, NOP_EXPR, val_tmp,
+
+	      cast_stmt = convert_via (gsi, val_tmp, n->type, TREE_TYPE (tgt),
+				       true, false);
+	      cast_tmp = gimple_assign_lhs (cast_stmt);
+
+	      gimple_assign_set_rhs_with_ops_1 (gsi, CONVERT_EXPR, cast_tmp,
 						NULL_TREE, NULL_TREE);
 	    }
 	  else
@@ -2248,10 +2333,27 @@ bswap_replace (gimple stmt, gimple_stmt_iterator *gsi, tree src, tree fndecl,
   /* Convert the result if necessary.  */
   if (!useless_type_conversion_p (TREE_TYPE (tgt), bswap_type))
     {
-      gimple convert_stmt;
-      tmp = make_temp_ssa_name (bswap_type, NULL, "bswapdst");
-      convert_stmt = gimple_build_assign_with_ops (NOP_EXPR, tgt, tmp, NULL);
-      gsi_insert_after (gsi, convert_stmt, GSI_SAME_STMT);
+      gimple_stmt_iterator cast_gsi = *gsi;
+      gimple cast_stmt = NULL, last_stmt;
+      tree cast_tmp;
+
+      cast_tmp = make_temp_ssa_name (range_type, NULL, "unsigned_bswapdst");
+
+      /* Cast to unsigned as required by convert_via.  */
+      if (!useless_type_conversion_p (range_type, bswap_type))
+	{
+	  tmp = make_temp_ssa_name (bswap_type, NULL, "bswapdst");
+	  cast_stmt = gimple_build_assign_with_ops (NOP_EXPR, cast_tmp, tmp,
+						    NULL);
+	  gsi_insert_after (&cast_gsi, cast_stmt, GSI_NEW_STMT);
+	}
+      else
+	tmp = cast_tmp;
+
+      last_stmt = convert_via (&cast_gsi, cast_tmp, n->type, tgt, false, true);
+      if (last_stmt == NULL && cast_stmt)
+	gimple_assign_set_lhs (cast_stmt, tgt);
+
     }
 
   gimple_call_set_lhs (call, tmp);
@@ -2327,9 +2429,10 @@ pass_optimize_bswap::execute (function *fun)
         {
 	  gimple stmt = gsi_stmt (gsi);
 	  tree fndecl = NULL_TREE, bswap_type = NULL_TREE;
-	  tree src, load_type;
+	  tree src, range_type;
 	  struct symbolic_number n;
-	  bool bswap;
+	  bool bswap, unsigned_load;
+	  int size;
 
 	  if (!is_gimple_assign (stmt)
 	      || gimple_assign_rhs_code (stmt) != BIT_IOR_EXPR)
@@ -2340,10 +2443,15 @@ pass_optimize_bswap::execute (function *fun)
 	  if (!src)
 	    continue;
 
+	  size = TYPE_PRECISION (n.type) / BITS_PER_UNIT;
+	  unsigned_load = TYPE_UNSIGNED (n.type) || (int)n.range != size;
 	  switch (n.range)
 	    {
 	    case 16:
-	      load_type = uint16_type_node;
+	      if (unsigned_load)
+		range_type = unsigned_intHI_type_node;
+	      else
+		range_type = intHI_type_node;
 	      if (bswap16_p)
 		{
 		  fndecl = builtin_decl_explicit (BUILT_IN_BSWAP16);
@@ -2351,7 +2459,10 @@ pass_optimize_bswap::execute (function *fun)
 		}
 	      break;
 	    case 32:
-	      load_type = uint32_type_node;
+	      if (unsigned_load)
+		range_type = unsigned_intSI_type_node;
+	      else
+		range_type = intSI_type_node;
 	      if (bswap32_p)
 		{
 		  fndecl = builtin_decl_explicit (BUILT_IN_BSWAP32);
@@ -2359,7 +2470,10 @@ pass_optimize_bswap::execute (function *fun)
 		}
 	      break;
 	    case 64:
-	      load_type = uint64_type_node;
+	      if (unsigned_load)
+		range_type = unsigned_intDI_type_node;
+	      else
+		range_type = intDI_type_node;
 	      if (bswap64_p)
 		{
 		  fndecl = builtin_decl_explicit (BUILT_IN_BSWAP64);
@@ -2373,7 +2487,7 @@ pass_optimize_bswap::execute (function *fun)
 	  if (bswap && !fndecl)
 	    continue;
 
-	  if (bswap_replace (stmt, &gsi, src, fndecl, bswap_type, load_type,
+	  if (bswap_replace (stmt, &gsi, src, fndecl, bswap_type, range_type,
 			     &n, bswap))
 	    changed = true;
 	}


* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-04  5:59 [PATCH] Fix PR61306: improve handling of sign and cast in bswap Thomas Preud'homme
@ 2014-06-04  9:39 ` Richard Biener
  2014-06-04 10:42   ` Thomas Preud'homme
  2014-06-10  8:50   ` Thomas Preud'homme
  0 siblings, 2 replies; 17+ messages in thread
From: Richard Biener @ 2014-06-04  9:39 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: GCC Patches

On Wed, Jun 4, 2014 at 7:59 AM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
> When bswap replace a bitwise expression involving a memory source by a load possibly followed by a bswap, it is possible that the load has a size smaller than that of the target expression where the bitwise expression was affected. So some sort of cast is needed. But there might also be a difference between the size of the expression that was affected and the size of the load. So 3 different sizes might be involved. Consider the following example from binutils:
>
> bfd_vma
> bfd_getl16 (const void *p)
> {
>   const bfd_byte *addr = (*const bfd_byte *) p;
>   return (addr[1] << 8) | addr[0];
> }
>
> Here the load will have a size of 16 bits, while the bitwise expression is an int (don't ask me why) but is returned as a 64 bits quantity (bfd_vma maps to the size of host registers). In this case we need 2 separate cast. One from 16 bit quantity to int with zero extension as the high bits are 0. It is always a zero extension because bswap will not do anything in the presence of a sign extension as depending on the initial value the result would be different (maybe a bswap if positive number and random value if negative number). Then, we need to cast respecting the extension that would have happen had we not replaced the bitwise extension. Here since the bitwise expression is int, it means we sign extend and then consider the content as being unsigned (bfd_vma is an unsigned quantity).
>
> When a bswap is necessary we need to do this double cast after doing the bswap as the bswap must be done on the loaded value since a that's the size expected by the bswap builtin. Finally, this patch also forbit any sign extension *in* the bitwise expression as the result of the expression would then be unpredictable (depend on the initial value).
>
> The patch works this way:
>
> 1) prevent size extension of a bitwise expression
> 2) record the type of the bitwise expression instead of its size (the size can be determined from the type)
> 3) use this type to perform a double cast as explained above
>
>
> 2014-05-30  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>         PR tree-optimization/61306
>         * tree-ssa-math-opts.c (struct symbolic_number): Store type of
>         expression instead of its size.
>         (do_shift_rotate): Adapt to change in struct symbolic_number.
>         (verify_symbolic_number_p): Likewise.
>         (init_symbolic_number): Likewise.
>         (find_bswap_or_nop_1): Likewise. Also prevent optimization when the
>         result of the expressions is unpredictable due to sign extension.
>         (convert_via): New function to deal with the casting involved from the
>         loaded value to the target SSA.
>         (bswap_replace): Rename load_type in range_type to reflect it's the
>         type the memory accessed shall have before being casted. Select load
>         type according to whether a bswap needs to be done. Cast first to rhs
>         with zero extend and then to lhs with sign extend to keep semantic of
>         original stmts.
>         (pass_optimize_bswap::execute): Adapt to change in struct
>         symbolic_number. Decide if the range accessed should be signed or
>         unsigned before being casted to lhs type based on rhs type and size.
>
> 2014-05-29  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>         * gcc.c-torture/execute/pr61306.c: New test.
>
> Patch is in attachment. Is this ok for trunk?

I'd rather change the comparisons

-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize / BITS_PER_UNIT < (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << bitsize) - 1;

to work in bits, thus bitsize < 8 * sizeof (int64_t) (note that using
BITS_PER_UNIT is bogus here - you are dealing with 8-bit bytes
on the host, not whatever the target uses).  Otherwise it smells
like truncation may lose bits (probably not in practice).

+           /* Sign extension: result is dependent on the value.  */
+           if (!TYPE_UNSIGNED (type) && !TYPE_UNSIGNED (n->type)
+               && type_size > TYPE_PRECISION (n->type))
+             return NULL_TREE;

whether it's sign-extended depends solely on the sign of the
converted entity, so I'm not sure why you are restricting this
to signed n->type.  Consider

signed char *p;
((unsigned int)p[0]) << 8 | ((unsigned int)p[1]) << 16

the loads are still sign-extended but n->type is unsigned.
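
For example (made-up byte values, standalone code):

#include <stdio.h>

int
main (void)
{
  signed char p[2] = { 0x12, -0x80 };   /* hypothetical values; p[1] is byte 0x80 */
  unsigned int r = ((unsigned int) p[0]) << 8 | ((unsigned int) p[1]) << 16;
  /* p[1] is sign-extended to -128, i.e. 0xffffff80, so r is 0xff801200 and
     not 0x00801200, even though the result type is unsigned.  */
  printf ("%#x\n", r);
  return 0;
}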

I'm failing to get why you possibly need two casts ... you should
only need one, from the bswap/load result to the final type
(zero-extended as you say - so the load type should simply be
unsigned which it is already).
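
Something along these lines should be enough (a rough sketch under the same 64-bit bfd_vma assumption as the example earlier in the thread, not actual pass output):

typedef unsigned long long bfd_vma;   /* assumption: 64-bit result type */

bfd_vma
getl16_one_cast (const void *p)
{
  unsigned short v;                    /* unsigned load type */
  __builtin_memcpy (&v, p, sizeof v);
  return (bfd_vma) v;                  /* one cast, zero extension */
}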

So I think that the testcase in the patch is fixed already by
doing the n->type change (and a proper sign-extension detection).

Can you please split that part out?

That range_type and convert_via look wrong and unnecessary to me,
and it doesn't look like you have a testcase exercising it?

Thanks,
Richard.

> Best regards,
>
> Thomas


* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-04  9:39 ` Richard Biener
@ 2014-06-04 10:42   ` Thomas Preud'homme
  2014-06-04 11:04     ` Richard Biener
  2014-06-10  8:50   ` Thomas Preud'homme
  1 sibling, 1 reply; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-04 10:42 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: GCC Patches

> From: Richard Biener [mailto:richard.guenther@gmail.com]
> 
> I'd rather change the comparisons
> 
> -  if (n->size < (int)sizeof (int64_t))
> -    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
> +  if (bitsize / BITS_PER_UNIT < (int)sizeof (int64_t))
> +    n->n &= ((uint64_t)1 << bitsize) - 1;
> 
> to work in bits, thus bitsize < 8 * sizeof (int64_t) (note that using
> BITS_PER_UNIT is bogus here - you are dealing with 8-bit bytes
> on the host, not whatever the target uses).  Otherwise it smells
> like truncation may lose bits (probably not in practice).

Ah yes, right.

> 
> +           /* Sign extension: result is dependent on the value.  */
> +           if (!TYPE_UNSIGNED (type) && !TYPE_UNSIGNED (n->type)
> +               && type_size > TYPE_PRECISION (n->type))
> +             return NULL_TREE;
> 
> whether it's sign-extended depends solely on the sign of the
> converted entity, so I'm not sure why you are restricting this
> to signed n->type.  Consider
> 
> signed char *p;
> ((unsigned int)p[0]) << 8 | ((unsigned int)p[1]) << 16
> 
> the loads are still sign-extended but n->type is unsigned.

Indeed, I understood it for convert_via (the requirement to be
unsigned) but got it wrong here.

> 
> I'm failing to get why you possibly need two casts ... you should
> only need one, from the bswap/load result to the final type
> (zero-extended as you say - so the load type should simply be
> unsigned which it is already).

Because of the type of the shift constant, the bitwise expression
is usually of type int. However, if you write a function that does
a 32 bit load in host endianness (like a 32 bit little endian load on
x86_64) with a result of size 64 bits, then you need to sign
extend, since the source type is signed. This is a situation I
encountered in bfd_getl32 in binutils I think.
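
Roughly, as a made-up sketch (not the actual binutils code): the replacement load result has to stay in a signed 32-bit type so that the widening to the 64-bit result sign extends, as the original int-typed OR expression would.

typedef unsigned long long bfd_vma;   /* assumption: 64-bit result type */

bfd_vma
getl32_sketch (const void *p)
{
  int v;                               /* the type of the bitwise OR */
  __builtin_memcpy (&v, p, sizeof v);  /* the 32-bit replacement load */
  return (bfd_vma) v;                  /* widening follows int: sign extend */
}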

Now if you consider bfd_getl16 instead a direct sign extension
is out of the question. Suppose bfd_getl16 is called to read from
a memory address that contains 0xff 0xfe. The bitwise expression
would thus be equivalent to the value 0x0000feff since it's of type
int. Then after the sign extension to 64bits you'd have
0x000000000000feff. But after replacing the bitwise expression
you end up with a 16bit load into a 16bit SSA variable. If you sign
extend that directly to 64 bits you'll end up with
0xfffffffffffffeff which is wrong. But if you zero extend to an
int value (the type of the bitwise OR expression) and then sign
extend to the target type you'll have the correct result.
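
To spell out the arithmetic, a standalone (hypothetical) check of the values above:

#include <stdint.h>
#include <stdio.h>

int
main (void)
{
  int16_t loaded = -0x101;                           /* bit pattern 0xfeff */
  uint64_t direct = (uint64_t) (int64_t) loaded;     /* 0xfffffffffffffeff */
  int widened = (uint16_t) loaded;                   /* zero extend: 0xfeff */
  uint64_t two_step = (uint64_t) (int64_t) widened;  /* 0x000000000000feff */
  printf ("%#llx %#llx\n", (unsigned long long) direct,
	  (unsigned long long) two_step);
  return 0;
}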

But you're right, we can do something simpler: sign extend if the
load size equals the size of the bitwise expression, and zero extend
otherwise. The change of load_type to range_type would still be needed
because, in the case of a load + bswap, it's better to load in the
same type as the type of the parameter of bswap. After the bswap
you'd need to convert to a signed or unsigned value according
to the logic above (load size == size of bitwise expression).

In the original statement, the bitwise OR
expression would have the 2 higher-weight bytes be 0 while
the 2 lower-weight bytes would be the value read from
memory. The result of the sign extension would therefore be the same
as that of a zero extension.

> 
> So I think that the testcase in the patch is fixed already by
> doing the n->type change (and a proper sign-extension detection).

I don't remember exactly but I think it didn't fix this bug
(but it is a necessary fix anyway).
 
> 
> Can you please split that part out?

Sure, that part would need to be applied on gcc 4.9 too. I'll try
to construct a testcase for that.

> 
> That range_type and convert_via looks wrong and unnecessary to me,
> and it doesn't look like you have a testcase excercising it?

See above.



* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-04 10:42   ` Thomas Preud'homme
@ 2014-06-04 11:04     ` Richard Biener
  2014-06-04 11:16       ` Thomas Preud'homme
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2014-06-04 11:04 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: GCC Patches

On Wed, Jun 4, 2014 at 12:42 PM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>>
>> I'd rather change the comparisons
>>
>> -  if (n->size < (int)sizeof (int64_t))
>> -    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
>> +  if (bitsize / BITS_PER_UNIT < (int)sizeof (int64_t))
>> +    n->n &= ((uint64_t)1 << bitsize) - 1;
>>
>> to work in bits, thus bitsize < 8 * sizeof (int64_t) (note that using
>> BITS_PER_UNIT is bogus here - you are dealing with 8-bit bytes
>> on the host, not whatever the target uses).  Otherwise it smells
>> like truncation may lose bits (probably not in practice).
>
> Ah yes, right.
>
>>
>> +           /* Sign extension: result is dependent on the value.  */
>> +           if (!TYPE_UNSIGNED (type) && !TYPE_UNSIGNED (n->type)
>> +               && type_size > TYPE_PRECISION (n->type))
>> +             return NULL_TREE;
>>
>> whether it's sign-extended depends solely on the sign of the
>> converted entity, so I'm not sure why you are restricting this
>> to signed n->type.  Consider
>>
>> signed char *p;
>> ((unsigned int)p[0]) << 8 | ((unsigned int)p[1]) << 16
>>
>> the loads are still sign-extended but n->type is unsigned.
>
> Indeed, I understood it for convert_via (the requirement to be
> unsigned) but got it wrong here.
>
>>
>> I'm failing to get why you possibly need two casts ... you should
>> only need one, from the bswap/load result to the final type
>> (zero-extended as you say - so the load type should simply be
>> unsigned which it is already).
>
> Because of the type of the shift constant, the bitwise expression
> is usually of type int. However, if you write a function that does
> a 32 bit load in host endianness (like a 32 bit little endian load on
> x86_64) with a result of size 64 bits, then you need to sign
> extend, since the source type is signed. This is a situation I
> encountered in bfd_getl32 in binutils I think.
>
> Now if you consider bfd_getl16 instead a direct sign extension
> is out of the question. Suppose bfd_getl16 is called to read from
> a memory address that contains 0xff 0xfe. The bitwise expression
> would thus be equivalent to the value 0x0000feff since it's of type
> int. Then after the sign extension to 64bits you'd have
> 0x000000000000feff. But after replacing the bitwise expression
> you end up with a 16bit load into a 16bit SSA variable. If you sign
> extend that directly to 64 bits you'll end up with
> 0xfffffffffffffeff which is wrong. But if you zero extend to an
> int value (the type of the bitwise OR expression) and then sign
> extend to the target type you'll have the correct result.

Err, but if you zero-extend directly to the target type you have the
correct result, too.

> But you're right, we can do simpler by sign extending if
> load size == size of bitwise expression and zero extend else. The
> change of load_type to range_type would still be needed
> because in case of a load + bswap it's better to load in the
> same type as the type of the parameter of bswap. After bswap
> you'd need to convert to a signed or unsigned value according
> to the logic above (load size == size of bitwise expression)
>
> In the original statement, the bitwise OR
> expression would have the 2 bytes of higher weight be 0 while
> the 2 bytes of lower weight would be the value read from
> memory. The result of the sign extension would be
>
>>
>> So I think that the testcase in the patch is fixed already by
>> doing the n->type change (and a proper sign-extension detection).
>
> I don't remember exactly but I think it didn't fix this bug
> (but it is a necessary fix anyway).
>
>>
>> Can you please split that part out?
>
> Sure, that part would need to be applied on gcc 4.9 too. I'll try
> to construct a testcase for that.
>
>>
>> That range_type and convert_via looks wrong and unnecessary to me,
>> and it doesn't look like you have a testcase excercising it?
>
> See above.

But nothing for the testsuite?  The testcase you add falls foul of
sign-extending the loads.

Richard.

>


* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-04 11:04     ` Richard Biener
@ 2014-06-04 11:16       ` Thomas Preud'homme
  2014-06-04 11:18         ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-04 11:16 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: GCC Patches

> From: Richard Biener [mailto:richard.guenther@gmail.com]
> 
> Err, but if you zero-extend directly to the target type you have the
> correct result, too.

Yep, but in some cases we need a sign extension (a 32-bit bitwise OR
stored into a 64-bit result). As I said, the logic could be simplified
by sign extending if load_size == bitwise expression size and zero
extending otherwise. I'll rework the patch in this direction.
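
A minimal sketch of that rule (the name and the explicit signedness check are mine, not from the patch):

#include <stdbool.h>

/* Sign extend the load result only when the load is as wide as the bitwise
   expression it replaces and that expression has a signed type; otherwise
   zero extend (the expression's high bits were known to be zero).  */
static bool
load_needs_sign_extension (unsigned load_size, unsigned expr_size,
			   bool expr_is_signed)
{
  return load_size == expr_size && expr_is_signed;
}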

> 
> But nothing for the testsuite?  The testcase you add fails foul of
> sign-extending the loads.

Ack, I'll add a test for zero extension and one for sign extension.

Cheers,

Thomas




* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-04 11:16       ` Thomas Preud'homme
@ 2014-06-04 11:18         ` Richard Biener
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Biener @ 2014-06-04 11:18 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: GCC Patches

On Wed, Jun 4, 2014 at 1:15 PM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>>
>> Err, but if you zero-extend directly to the target type you have the
>> correct result, too.
>
> Yep but in some case we need sign extend (32 bit bitwise OR stored
> into 64 bit result).

But you'd convert the 32bit bitwise OR to the original type anyway
(no extension, but sign-change).  The extension to the stored type
happens with the original IL.

> As I said, the logic could be simplified by sign
> extending if load_size == bitwise expression size and zero extending
> if not true. I'll rework the patch in this direction.
>
>>
>> But nothing for the testsuite?  The testcase you add fails foul of
>> sign-extending the loads.
>
> Ack, I'll add a test for zero extension and one for sign extension.

Thanks.
Richard.

> Cheers,
>
> Thomas
>
>
>


* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-04  9:39 ` Richard Biener
  2014-06-04 10:42   ` Thomas Preud'homme
@ 2014-06-10  8:50   ` Thomas Preud'homme
  2014-06-10  8:54     ` Thomas Preud'homme
  2014-06-11  8:32     ` Richard Biener
  1 sibling, 2 replies; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-10  8:50 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: GCC Patches

> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Wednesday, June 04, 2014 5:39 PM
> To: Thomas Preud'homme
> 

> 
> I'm failing to get why you possibly need two casts ... you should
> only need one, from the bswap/load result to the final type
> (zero-extended as you say - so the load type should simply be
> unsigned which it is already).

You are right indeed. I failed to realize that the problems I
encountered were caused by an initially wrong understanding of
the reason behind PR61306. All this code is not necessary.

> 
> So I think that the testcase in the patch is fixed already by
> doing the n->type change (and a proper sign-extension detection).
> 
> Can you please split that part out?

Doing so I realized the patch was incomplete. Sign extension can
be triggered in two distinct places in the code (right shift and cast),
both of which can lead to incorrect code being generated. With some
effort I managed to create two testcases that exercise this on
GCC trunk as well as on GCC 4.9 and 4.8.

ChangeLog entries are:

*** gcc/ChangeLog ***

2014-06-05  Thomas Preud'homme  <thomas.preudhomme@arm.com>

	PR tree-optimization/61306
	* tree-ssa-math-opts.c (struct symbolic_number): Store type of
	expression instead of its size.
	(do_shift_rotate): Adapt to change in struct symbolic_number. Return
	false to prevent optimization when the result is unpredictable due to
	an arithmetic right shift of a signed type with its highest byte set.
	(verify_symbolic_number_p): Adapt to change in struct symbolic_number.
	(init_symbolic_number): Likewise.
	(find_bswap_or_nop_1): Likewise. Return NULL to prevent optimization
	when the result is unpredictable due to sign extension.

*** gcc/testsuite/ChangeLog ***

2014-06-05  Thomas Preud'homme  <thomas.preudhomme@arm.com>

	* gcc.c-torture/execute/pr61306-1.c: New test.
	* gcc.c-torture/execute/pr61306-2.c: Likewise.
	* gcc.c-torture/execute/pr61306-3.c: Likewise.


diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
new file mode 100644
index 0000000..f6e8ff3
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
@@ -0,0 +1,39 @@
+#ifdef __INT32_TYPE__
+typedef __INT32_TYPE__ int32_t;
+#else
+typedef int int32_t;
+#endif
+
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#define __fake_const_swab32(x) ((uint32_t)(		      \
+        (((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) |    \
+        (((uint32_t)(x) & (uint32_t)0x0000ff00UL) <<  8) |    \
+        (((uint32_t)(x) & (uint32_t)0x00ff0000UL) >>  8) |    \
+        (( (int32_t)(x) &  (int32_t)0xff000000UL) >> 24)))
+
+/* Previous version of bswap optimization failed to consider sign extension
+   and as a result would replace an expression *not* doing a bswap by a
+   bswap.  */
+
+__attribute__ ((noinline, noclone)) uint32_t
+fake_bswap32 (uint32_t in)
+{
+  return __fake_const_swab32 (in);
+}
+
+int
+main(void)
+{
+  if (sizeof (int32_t) * __CHAR_BIT__ != 32)
+    return 0;
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+    return 0;
+  if (fake_bswap32 (0x87654321) != 0xffffff87)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
new file mode 100644
index 0000000..6cbbd19
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
@@ -0,0 +1,40 @@
+#ifdef __INT16_TYPE__
+typedef __INT16_TYPE__ int16_t;
+#else
+typedef short int16_t;
+#endif
+
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#define __fake_const_swab32(x) ((uint32_t)(			      \
+        (((uint32_t)         (x) & (uint32_t)0x000000ffUL) << 24) |   \
+        (((uint32_t)(int16_t)(x) & (uint32_t)0x00ffff00UL) <<  8) |   \
+        (((uint32_t)         (x) & (uint32_t)0x00ff0000UL) >>  8) |   \
+        (((uint32_t)         (x) & (uint32_t)0xff000000UL) >> 24)))
+
+
+/* Previous version of bswap optimization failed to consider sign extension
+   and as a result would replace an expression *not* doing a bswap by a
+   bswap.  */
+
+__attribute__ ((noinline, noclone)) uint32_t
+fake_bswap32 (uint32_t in)
+{
+  return __fake_const_swab32 (in);
+}
+
+int
+main(void)
+{
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+    return 0;
+  if (sizeof (int16_t) * __CHAR_BIT__ != 16)
+    return 0;
+  if (fake_bswap32 (0x81828384) != 0xff838281)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
new file mode 100644
index 0000000..6086e27
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
@@ -0,0 +1,13 @@
+short a = -1;
+int b;
+char c;
+
+int
+main ()
+{
+  c = a;
+  b = a | c;
+  if (b != -1)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 658b341..d92c433 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1622,7 +1622,7 @@ make_pass_cse_sincos (gcc::context *ctxt)
 
 struct symbolic_number {
   uint64_t n;
-  int size;
+  tree type;
   tree base_addr;
   tree offset;
   HOST_WIDE_INT bytepos;
@@ -1652,13 +1652,15 @@ do_shift_rotate (enum tree_code code,
 		 struct symbolic_number *n,
 		 int count)
 {
+  int bitsize = TYPE_PRECISION (n->type);
+
   if (count % 8 != 0)
     return false;
 
   /* Zero out the extra bits of N in order to avoid them being shifted
      into the significant bits.  */
-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize < 8 * (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << bitsize) - 1;
 
   switch (code)
     {
@@ -1666,20 +1668,23 @@ do_shift_rotate (enum tree_code code,
       n->n <<= count;
       break;
     case RSHIFT_EXPR:
+      /* Arithmetic shift of signed type: result is dependent on the value.  */
+      if (!TYPE_UNSIGNED (n->type) && (n->n & (0xff << (bitsize - 8))))
+	return false;
       n->n >>= count;
       break;
     case LROTATE_EXPR:
-      n->n = (n->n << count) | (n->n >> ((n->size * BITS_PER_UNIT) - count));
+      n->n = (n->n << count) | (n->n >> (bitsize - count));
       break;
     case RROTATE_EXPR:
-      n->n = (n->n >> count) | (n->n << ((n->size * BITS_PER_UNIT) - count));
+      n->n = (n->n >> count) | (n->n << (bitsize - count));
       break;
     default:
       return false;
     }
   /* Zero unused bits for size.  */
-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize < 8 * (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << bitsize) - 1;
   return true;
 }
 
@@ -1696,7 +1701,7 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
   if (TREE_CODE (lhs_type) != INTEGER_TYPE)
     return false;
 
-  if (TYPE_PRECISION (lhs_type) != n->size * BITS_PER_UNIT)
+  if (TYPE_PRECISION (lhs_type) != TYPE_PRECISION (n->type))
     return false;
 
   return true;
@@ -1708,20 +1713,23 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
 static bool
 init_symbolic_number (struct symbolic_number *n, tree src)
 {
+  int size;
+
   n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
 
   /* Set up the symbolic number N by setting each byte to a value between 1 and
      the byte size of rhs1.  The highest order byte is set to n->size and the
      lowest order byte to 1.  */
-  n->size = TYPE_PRECISION (TREE_TYPE (src));
-  if (n->size % BITS_PER_UNIT != 0)
+  n->type = TREE_TYPE (src);
+  size = TYPE_PRECISION (n->type);
+  if (size % BITS_PER_UNIT != 0)
     return false;
-  n->size /= BITS_PER_UNIT;
-  n->range = n->size;
+  size /= BITS_PER_UNIT;
+  n->range = size;
   n->n = CMPNOP;
 
-  if (n->size < (int)sizeof (int64_t))
-    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (size < (int)sizeof (int64_t))
+    n->n &= ((uint64_t)1 << (size * BITS_PER_UNIT)) - 1;
 
   return true;
 }
@@ -1857,12 +1865,12 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	{
 	case BIT_AND_EXPR:
 	  {
-	    int i;
+	    int i, size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
 	    uint64_t val = int_cst_value (rhs2);
 	    uint64_t tmp = val;
 
 	    /* Only constants masking full bytes are allowed.  */
-	    for (i = 0; i < n->size; i++, tmp >>= BITS_PER_UNIT)
+	    for (i = 0; i < size; i++, tmp >>= BITS_PER_UNIT)
 	      if ((tmp & 0xff) != 0 && (tmp & 0xff) != 0xff)
 		return NULL_TREE;
 
@@ -1878,21 +1886,30 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	  break;
 	CASE_CONVERT:
 	  {
-	    int type_size;
+	    int type_size, old_type_size;
+	    tree type;
 
-	    type_size = TYPE_PRECISION (gimple_expr_type (stmt));
+	    type = gimple_expr_type (stmt);
+	    type_size = TYPE_PRECISION (type);
 	    if (type_size % BITS_PER_UNIT != 0)
 	      return NULL_TREE;
 
+	    /* Sign extension: result is dependent on the value.  */
+	    old_type_size = TYPE_PRECISION (n->type);
+	    if (!TYPE_UNSIGNED (n->type)
+		&& type_size > old_type_size
+		&& n->n & (0xff << (old_type_size - 8)))
+	      return NULL_TREE;
+
 	    if (type_size / BITS_PER_UNIT < (int)(sizeof (int64_t)))
 	      {
 		/* If STMT casts to a smaller type mask out the bits not
 		   belonging to the target type.  */
 		n->n &= ((uint64_t)1 << type_size) - 1;
 	      }
-	    n->size = type_size / BITS_PER_UNIT;
+	    n->type = type;
 	    if (!n->base_addr)
-	      n->range = n->size;
+	      n->range = type_size / BITS_PER_UNIT;
 	  }
 	  break;
 	default:
@@ -1905,7 +1922,7 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 
   if (rhs_class == GIMPLE_BINARY_RHS)
     {
-      int i;
+      int i, size;
       struct symbolic_number n1, n2;
       uint64_t mask;
       tree source_expr2;
@@ -1931,7 +1948,7 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	  if (!source_expr2)
 	    return NULL_TREE;
 
-	  if (n1.size != n2.size)
+	  if (TYPE_PRECISION (n1.type) != TYPE_PRECISION (n2.type))
 	    return NULL_TREE;
 
 	  if (!n1.vuse != !n2.vuse ||
@@ -1997,8 +2014,9 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
 	  n->base_addr = n1.base_addr;
 	  n->offset = n1.offset;
 	  n->bytepos = n1.bytepos;
-	  n->size = n1.size;
-	  for (i = 0, mask = 0xff; i < n->size; i++, mask <<= BITS_PER_UNIT)
+	  n->type = n1.type;
+	  size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
+	  for (i = 0, mask = 0xff; i < size; i++, mask <<= BITS_PER_UNIT)
 	    {
 	      uint64_t masked1, masked2;


Is this OK for trunk? Does this bug qualify for a backport patch to
4.8 and 4.9 branches?

Best regards,

Thomas



* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-10  8:50   ` Thomas Preud'homme
@ 2014-06-10  8:54     ` Thomas Preud'homme
  2014-06-11  8:32     ` Richard Biener
  1 sibling, 0 replies; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-10  8:54 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: GCC Patches

> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme

> 
> Is this OK for trunk? Does this bug qualify for a backport patch to
> 4.8 and 4.9 branches?

I forgot to mention that this was tested via bootstrap on an
x86_64-linux-gnu target; the testsuite showed no regressions and
the 3 tests added now pass.

Best regards,

Thomas 




* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-10  8:50   ` Thomas Preud'homme
  2014-06-10  8:54     ` Thomas Preud'homme
@ 2014-06-11  8:32     ` Richard Biener
  2014-06-18  4:55       ` Thomas Preud'homme
  1 sibling, 1 reply; 17+ messages in thread
From: Richard Biener @ 2014-06-11  8:32 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: GCC Patches

On Tue, Jun 10, 2014 at 10:49 AM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>> Sent: Wednesday, June 04, 2014 5:39 PM
>> To: Thomas Preud'homme
>>
>
>>
>> I'm failing to get why you possibly need two casts ... you should
>> only need one, from the bswap/load result to the final type
>> (zero-extended as you say - so the load type should simply be
>> unsigned which it is already).
>
> You are right indeed. I failed to realize that the problems I
> encountered were caused by an initially wrong understanding of
> the reason behind PR61306. All this code is not necessary.
>
>>
>> So I think that the testcase in the patch is fixed already by
>> doing the n->type change (and a proper sign-extension detection).
>>
>> Can you please split that part out?
>
> Doing so I realize the patch was incomplete. Sign extension can
> be triggered in two distinct place in the code (right shift and cast)
> that can both lead to incorrect code being generated. With some
> efforts I managed to create two testcases that work both on
> GCC trunk but also GCC 4.9 and 4.8.
>
> ChangeLog entries are:
>
> *** gcc/ChangeLog ***
>
> 2014-06-05  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>         PR tree-optimization/61306
>         * tree-ssa-math-opts.c (struct symbolic_number): Store type of
>         expression instead of its size.
>         (do_shift_rotate): Adapt to change in struct symbolic_number. Return
>         false to prevent optimization when the result is unpredictable due to
>         arithmetic right shift of signed type with highest byte is set.
>         (verify_symbolic_number_p): Adapt to change in struct symbolic_number.
>         (init_symbolic_number): Likewise.
>         (find_bswap_or_nop_1): Likewise. Return NULL to prevent optimization
>         when the result is unpredictable due to sign extension.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2014-06-05  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>         * gcc.c-torture/execute/pr61306-1.c: New test.
>         * gcc.c-torture/execute/pr61306-2.c: Likewise.
>         * gcc.c-torture/execute/pr61306-3.c: Likewise.
>
>
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
> new file mode 100644
> index 0000000..f6e8ff3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
> @@ -0,0 +1,39 @@
> +#ifdef __INT32_TYPE__
> +typedef __INT32_TYPE__ int32_t;
> +#else
> +typedef int int32_t;
> +#endif
> +
> +#ifdef __UINT32_TYPE__
> +typedef __UINT32_TYPE__ uint32_t;
> +#else
> +typedef unsigned uint32_t;
> +#endif
> +
> +#define __fake_const_swab32(x) ((uint32_t)(                  \
> +        (((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) |    \
> +        (((uint32_t)(x) & (uint32_t)0x0000ff00UL) <<  8) |    \
> +        (((uint32_t)(x) & (uint32_t)0x00ff0000UL) >>  8) |    \
> +        (( (int32_t)(x) &  (int32_t)0xff000000UL) >> 24)))
> +
> +/* Previous version of bswap optimization failed to consider sign extension
> +   and as a result would replace an expression *not* doing a bswap by a
> +   bswap.  */
> +
> +__attribute__ ((noinline, noclone)) uint32_t
> +fake_bswap32 (uint32_t in)
> +{
> +  return __fake_const_swab32 (in);
> +}
> +
> +int
> +main(void)
> +{
> +  if (sizeof (int32_t) * __CHAR_BIT__ != 32)
> +    return 0;
> +  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
> +    return 0;
> +  if (fake_bswap32 (0x87654321) != 0xffffff87)
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
> new file mode 100644
> index 0000000..6cbbd19
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
> @@ -0,0 +1,40 @@
> +#ifdef __INT16_TYPE__
> +typedef __INT16_TYPE__ int16_t;
> +#else
> +typedef short int16_t;
> +#endif
> +
> +#ifdef __UINT32_TYPE__
> +typedef __UINT32_TYPE__ uint32_t;
> +#else
> +typedef unsigned uint32_t;
> +#endif
> +
> +#define __fake_const_swab32(x) ((uint32_t)(                          \
> +        (((uint32_t)         (x) & (uint32_t)0x000000ffUL) << 24) |   \
> +        (((uint32_t)(int16_t)(x) & (uint32_t)0x00ffff00UL) <<  8) |   \
> +        (((uint32_t)         (x) & (uint32_t)0x00ff0000UL) >>  8) |   \
> +        (((uint32_t)         (x) & (uint32_t)0xff000000UL) >> 24)))
> +
> +
> +/* Previous version of bswap optimization failed to consider sign extension
> +   and as a result would replace an expression *not* doing a bswap by a
> +   bswap.  */
> +
> +__attribute__ ((noinline, noclone)) uint32_t
> +fake_bswap32 (uint32_t in)
> +{
> +  return __fake_const_swab32 (in);
> +}
> +
> +int
> +main(void)
> +{
> +  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
> +    return 0;
> +  if (sizeof (int16_t) * __CHAR_BIT__ != 16)
> +    return 0;
> +  if (fake_bswap32 (0x81828384) != 0xff838281)
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
> new file mode 100644
> index 0000000..6086e27
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
> @@ -0,0 +1,13 @@
> +short a = -1;
> +int b;
> +char c;
> +
> +int
> +main ()
> +{
> +  c = a;
> +  b = a | c;
> +  if (b != -1)
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
> index 658b341..d92c433 100644
> --- a/gcc/tree-ssa-math-opts.c
> +++ b/gcc/tree-ssa-math-opts.c
> @@ -1622,7 +1622,7 @@ make_pass_cse_sincos (gcc::context *ctxt)
>
>  struct symbolic_number {
>    uint64_t n;
> -  int size;
> +  tree type;
>    tree base_addr;
>    tree offset;
>    HOST_WIDE_INT bytepos;
> @@ -1652,13 +1652,15 @@ do_shift_rotate (enum tree_code code,
>                  struct symbolic_number *n,
>                  int count)
>  {
> +  int bitsize = TYPE_PRECISION (n->type);
> +
>    if (count % 8 != 0)
>      return false;
>
>    /* Zero out the extra bits of N in order to avoid them being shifted
>       into the significant bits.  */
> -  if (n->size < (int)sizeof (int64_t))
> -    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
> +  if (bitsize < 8 * (int)sizeof (int64_t))
> +    n->n &= ((uint64_t)1 << bitsize) - 1;
>
>    switch (code)
>      {
> @@ -1666,20 +1668,23 @@ do_shift_rotate (enum tree_code code,
>        n->n <<= count;
>        break;
>      case RSHIFT_EXPR:
> +      /* Arithmetic shift of signed type: result is dependent on the value.  */
> +      if (!TYPE_UNSIGNED (n->type) && (n->n & (0xff << (bitsize - 8))))
> +       return false;
>        n->n >>= count;
>        break;
>      case LROTATE_EXPR:
> -      n->n = (n->n << count) | (n->n >> ((n->size * BITS_PER_UNIT) - count));
> +      n->n = (n->n << count) | (n->n >> (bitsize - count));
>        break;
>      case RROTATE_EXPR:
> -      n->n = (n->n >> count) | (n->n << ((n->size * BITS_PER_UNIT) - count));
> +      n->n = (n->n >> count) | (n->n << (bitsize - count));
>        break;
>      default:
>        return false;
>      }
>    /* Zero unused bits for size.  */
> -  if (n->size < (int)sizeof (int64_t))
> -    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
> +  if (bitsize < 8 * (int)sizeof (int64_t))
> +    n->n &= ((uint64_t)1 << bitsize) - 1;
>    return true;
>  }
>
> @@ -1696,7 +1701,7 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
>    if (TREE_CODE (lhs_type) != INTEGER_TYPE)
>      return false;
>
> -  if (TYPE_PRECISION (lhs_type) != n->size * BITS_PER_UNIT)
> +  if (TYPE_PRECISION (lhs_type) != TYPE_PRECISION (n->type))
>      return false;
>
>    return true;
> @@ -1708,20 +1713,23 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
>  static bool
>  init_symbolic_number (struct symbolic_number *n, tree src)
>  {
> +  int size;
> +
>    n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
>
>    /* Set up the symbolic number N by setting each byte to a value between 1 and
>       the byte size of rhs1.  The highest order byte is set to n->size and the
>       lowest order byte to 1.  */
> -  n->size = TYPE_PRECISION (TREE_TYPE (src));
> -  if (n->size % BITS_PER_UNIT != 0)
> +  n->type = TREE_TYPE (src);
> +  size = TYPE_PRECISION (n->type);
> +  if (size % BITS_PER_UNIT != 0)
>      return false;
> -  n->size /= BITS_PER_UNIT;
> -  n->range = n->size;
> +  size /= BITS_PER_UNIT;
> +  n->range = size;
>    n->n = CMPNOP;
>
> -  if (n->size < (int)sizeof (int64_t))
> -    n->n &= ((uint64_t)1 << (n->size * BITS_PER_UNIT)) - 1;
> +  if (size < (int)sizeof (int64_t))
> +    n->n &= ((uint64_t)1 << (size * BITS_PER_UNIT)) - 1;
>
>    return true;
>  }
> @@ -1857,12 +1865,12 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
>         {
>         case BIT_AND_EXPR:
>           {
> -           int i;
> +           int i, size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
>             uint64_t val = int_cst_value (rhs2);
>             uint64_t tmp = val;
>
>             /* Only constants masking full bytes are allowed.  */
> -           for (i = 0; i < n->size; i++, tmp >>= BITS_PER_UNIT)
> +           for (i = 0; i < size; i++, tmp >>= BITS_PER_UNIT)
>               if ((tmp & 0xff) != 0 && (tmp & 0xff) != 0xff)
>                 return NULL_TREE;
>
> @@ -1878,21 +1886,30 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
>           break;
>         CASE_CONVERT:
>           {
> -           int type_size;
> +           int type_size, old_type_size;
> +           tree type;
>
> -           type_size = TYPE_PRECISION (gimple_expr_type (stmt));
> +           type = gimple_expr_type (stmt);
> +           type_size = TYPE_PRECISION (type);
>             if (type_size % BITS_PER_UNIT != 0)
>               return NULL_TREE;
>
> +           /* Sign extension: result is dependent on the value.  */
> +           old_type_size = TYPE_PRECISION (n->type);
> +           if (!TYPE_UNSIGNED (n->type)
> +               && type_size > old_type_size
> +               && n->n & (0xff << (old_type_size - 8)))
> +             return NULL_TREE;
> +
>             if (type_size / BITS_PER_UNIT < (int)(sizeof (int64_t)))
>               {
>                 /* If STMT casts to a smaller type mask out the bits not
>                    belonging to the target type.  */
>                 n->n &= ((uint64_t)1 << type_size) - 1;
>               }
> -           n->size = type_size / BITS_PER_UNIT;
> +           n->type = type;
>             if (!n->base_addr)
> -             n->range = n->size;
> +             n->range = type_size / BITS_PER_UNIT;
>           }
>           break;
>         default:
> @@ -1905,7 +1922,7 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
>
>    if (rhs_class == GIMPLE_BINARY_RHS)
>      {
> -      int i;
> +      int i, size;
>        struct symbolic_number n1, n2;
>        uint64_t mask;
>        tree source_expr2;
> @@ -1931,7 +1948,7 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
>           if (!source_expr2)
>             return NULL_TREE;
>
> -         if (n1.size != n2.size)
> +         if (TYPE_PRECISION (n1.type) != TYPE_PRECISION (n2.type))
>             return NULL_TREE;
>
>           if (!n1.vuse != !n2.vuse ||
> @@ -1997,8 +2014,9 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit)
>           n->base_addr = n1.base_addr;
>           n->offset = n1.offset;
>           n->bytepos = n1.bytepos;
> -         n->size = n1.size;
> -         for (i = 0, mask = 0xff; i < n->size; i++, mask <<= BITS_PER_UNIT)
> +         n->type = n1.type;
> +         size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
> +         for (i = 0, mask = 0xff; i < size; i++, mask <<= BITS_PER_UNIT)
>             {
>               uint64_t masked1, masked2;
>
>
> Is this OK for trunk? Does this bug qualify for a backport patch to
> 4.8 and 4.9 branches?

This is ok for trunk and also for backporting (after a short while to
see if there is any fallout).

Thanks,
Richard.

> Best regards,
>
> Thomas
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-11  8:32     ` Richard Biener
@ 2014-06-18  4:55       ` Thomas Preud'homme
  2014-06-18 10:15         ` Richard Biener
  2014-06-18 15:23         ` Marek Polacek
  0 siblings, 2 replies; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-18  4:55 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: GCC Patches

> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Wednesday, June 11, 2014 4:32 PM
> >
> >
> > Is this OK for trunk? Does this bug qualify for a backport patch to
> > 4.8 and 4.9 branches?
> 
> This is ok for trunk and also for backporting (after a short while to
> see if there is any fallout).

Below is the backported patch for 4.8/4.9. Is this ok for both 4.8 and
4.9? If yes, how much longer should I wait before committing?

Tested on both 4.8 and 4.9 without regression in the testsuite after
a bootstrap.
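
For illustration only (this assumes a 32-bit int and the usual arithmetic
right shift of negative values, which the C standard leaves
implementation-defined), here is how the expected result in pr61306-1.c
comes about for the input 0x87654321:

  ((uint32_t) 0x87654321 & 0x000000ff) << 24                /* 0x21000000 */
  ((uint32_t) 0x87654321 & 0x0000ff00) <<  8                /* 0x00430000 */
  ((uint32_t) 0x87654321 & 0x00ff0000) >>  8                /* 0x00006500 */
  (( int32_t) 0x87654321 & ( int32_t) 0xff000000UL) >> 24   /* 0xffffff87 */

OR-ing the four terms gives 0xffffff87, not the 0x21436587 a real byte swap
would produce, so such an expression must not be rewritten into a bswap.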

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1e35bbe..0559b7f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2014-06-12  Thomas Preud'homme  <thomas.preudhomme@arm.com>
+
+	PR tree-optimization/61306
+	* tree-ssa-math-opts.c (struct symbolic_number): Store type of
+	expression instead of its size.
+	(do_shift_rotate): Adapt to change in struct symbolic_number. Return
+	false to prevent optimization when the result is unpredictable due to
+	arithmetic right shift of signed type with highest byte is set.
+	(verify_symbolic_number_p): Adapt to change in struct symbolic_number.
+	(find_bswap_1): Likewise. Return NULL to prevent optimization when the
+	result is unpredictable due to sign extension.
+	(find_bswap): Adapt to change in struct symbolic_number.
+
 2014-06-12  Alan Modra  <amodra@gmail.com>
 
 	PR target/61300
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 757cb74..139f23c 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2014-06-12  Thomas Preud'homme  <thomas.preudhomme@arm.com>
+
+	* gcc.c-torture/execute/pr61306-1.c: New test.
+	* gcc.c-torture/execute/pr61306-2.c: Likewise.
+	* gcc.c-torture/execute/pr61306-3.c: Likewise.
+
 2014-06-11  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/61452
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
new file mode 100644
index 0000000..ebc90a3
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-1.c
@@ -0,0 +1,39 @@
+#ifdef __INT32_TYPE__
+typedef __INT32_TYPE__ int32_t;
+#else
+typedef int int32_t;
+#endif
+
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#define __fake_const_swab32(x) ((uint32_t)(		      \
+	(((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) |    \
+	(((uint32_t)(x) & (uint32_t)0x0000ff00UL) <<  8) |    \
+	(((uint32_t)(x) & (uint32_t)0x00ff0000UL) >>  8) |    \
+	(( (int32_t)(x) &  (int32_t)0xff000000UL) >> 24)))
+
+/* Previous version of bswap optimization failed to consider sign extension
+   and as a result would replace an expression *not* doing a bswap by a
+   bswap.  */
+
+__attribute__ ((noinline, noclone)) uint32_t
+fake_bswap32 (uint32_t in)
+{
+  return __fake_const_swab32 (in);
+}
+
+int
+main(void)
+{
+  if (sizeof (int32_t) * __CHAR_BIT__ != 32)
+    return 0;
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+    return 0;
+  if (fake_bswap32 (0x87654321) != 0xffffff87)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
new file mode 100644
index 0000000..886ecfd
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-2.c
@@ -0,0 +1,40 @@
+#ifdef __INT16_TYPE__
+typedef __INT16_TYPE__ int16_t;
+#else
+typedef short int16_t;
+#endif
+
+#ifdef __UINT32_TYPE__
+typedef __UINT32_TYPE__ uint32_t;
+#else
+typedef unsigned uint32_t;
+#endif
+
+#define __fake_const_swab32(x) ((uint32_t)(			      \
+	(((uint32_t)         (x) & (uint32_t)0x000000ffUL) << 24) |   \
+	(((uint32_t)(int16_t)(x) & (uint32_t)0x00ffff00UL) <<  8) |   \
+	(((uint32_t)         (x) & (uint32_t)0x00ff0000UL) >>  8) |   \
+	(((uint32_t)         (x) & (uint32_t)0xff000000UL) >> 24)))
+
+
+/* Previous version of bswap optimization failed to consider sign extension
+   and as a result would replace an expression *not* doing a bswap by a
+   bswap.  */
+
+__attribute__ ((noinline, noclone)) uint32_t
+fake_bswap32 (uint32_t in)
+{
+  return __fake_const_swab32 (in);
+}
+
+int
+main(void)
+{
+  if (sizeof (uint32_t) * __CHAR_BIT__ != 32)
+    return 0;
+  if (sizeof (int16_t) * __CHAR_BIT__ != 16)
+    return 0;
+  if (fake_bswap32 (0x81828384) != 0xff838281)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
new file mode 100644
index 0000000..6086e27
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr61306-3.c
@@ -0,0 +1,13 @@
+short a = -1;
+int b;
+char c;
+
+int
+main ()
+{
+  c = a;
+  b = a | c;
+  if (b != -1)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 9ff857c..2b656ae 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1620,7 +1620,7 @@ make_pass_cse_sincos (gcc::context *ctxt)
 
 struct symbolic_number {
   unsigned HOST_WIDEST_INT n;
-  int size;
+  tree type;
 };
 
 /* Perform a SHIFT or ROTATE operation by COUNT bits on symbolic
@@ -1632,13 +1632,15 @@ do_shift_rotate (enum tree_code code,
 		 struct symbolic_number *n,
 		 int count)
 {
+  int bitsize = TYPE_PRECISION (n->type);
+
   if (count % 8 != 0)
     return false;
 
   /* Zero out the extra bits of N in order to avoid them being shifted
      into the significant bits.  */
-  if (n->size < (int)sizeof (HOST_WIDEST_INT))
-    n->n &= ((unsigned HOST_WIDEST_INT)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize < 8 * (int)sizeof (HOST_WIDEST_INT))
+    n->n &= ((unsigned HOST_WIDEST_INT)1 << bitsize) - 1;
 
   switch (code)
     {
@@ -1646,20 +1648,23 @@ do_shift_rotate (enum tree_code code,
       n->n <<= count;
       break;
     case RSHIFT_EXPR:
+      /* Arithmetic shift of signed type: result is dependent on the value.  */
+      if (!TYPE_UNSIGNED (n->type) && (n->n & (0xff << (bitsize - 8))))
+	return false;
       n->n >>= count;
       break;
     case LROTATE_EXPR:
-      n->n = (n->n << count) | (n->n >> ((n->size * BITS_PER_UNIT) - count));
+      n->n = (n->n << count) | (n->n >> (bitsize - count));
       break;
     case RROTATE_EXPR:
-      n->n = (n->n >> count) | (n->n << ((n->size * BITS_PER_UNIT) - count));
+      n->n = (n->n >> count) | (n->n << (bitsize - count));
       break;
     default:
       return false;
     }
   /* Zero unused bits for size.  */
-  if (n->size < (int)sizeof (HOST_WIDEST_INT))
-    n->n &= ((unsigned HOST_WIDEST_INT)1 << (n->size * BITS_PER_UNIT)) - 1;
+  if (bitsize < 8 * (int)sizeof (HOST_WIDEST_INT))
+    n->n &= ((unsigned HOST_WIDEST_INT)1 << bitsize) - 1;
   return true;
 }
 
@@ -1676,7 +1681,7 @@ verify_symbolic_number_p (struct symbolic_number *n, gimple stmt)
   if (TREE_CODE (lhs_type) != INTEGER_TYPE)
     return false;
 
-  if (TYPE_PRECISION (lhs_type) != n->size * BITS_PER_UNIT)
+  if (TYPE_PRECISION (lhs_type) != TYPE_PRECISION (n->type))
     return false;
 
   return true;
@@ -1733,20 +1738,23 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int limit)
 	 to initialize the symbolic number.  */
       if (!source_expr1)
 	{
+	  int size;
+
 	  /* Set up the symbolic number N by setting each byte to a
 	     value between 1 and the byte size of rhs1.  The highest
 	     order byte is set to n->size and the lowest order
 	     byte to 1.  */
-	  n->size = TYPE_PRECISION (TREE_TYPE (rhs1));
-	  if (n->size % BITS_PER_UNIT != 0)
+	  n->type = TREE_TYPE (rhs1);
+	  size = TYPE_PRECISION (n->type);
+	  if (size % BITS_PER_UNIT != 0)
 	    return NULL_TREE;
-	  n->size /= BITS_PER_UNIT;
+	  size /= BITS_PER_UNIT;
 	  n->n = (sizeof (HOST_WIDEST_INT) < 8 ? 0 :
 		  (unsigned HOST_WIDEST_INT)0x08070605 << 32 | 0x04030201);
 
-	  if (n->size < (int)sizeof (HOST_WIDEST_INT))
+	  if (size < (int)sizeof (HOST_WIDEST_INT))
 	    n->n &= ((unsigned HOST_WIDEST_INT)1 <<
-		     (n->size * BITS_PER_UNIT)) - 1;
+		     (size * BITS_PER_UNIT)) - 1;
 
 	  source_expr1 = rhs1;
 	}
@@ -1755,12 +1763,12 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int limit)
 	{
 	case BIT_AND_EXPR:
 	  {
-	    int i;
+	    int i, size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
 	    unsigned HOST_WIDEST_INT val = widest_int_cst_value (rhs2);
 	    unsigned HOST_WIDEST_INT tmp = val;
 
 	    /* Only constants masking full bytes are allowed.  */
-	    for (i = 0; i < n->size; i++, tmp >>= BITS_PER_UNIT)
+	    for (i = 0; i < size; i++, tmp >>= BITS_PER_UNIT)
 	      if ((tmp & 0xff) != 0 && (tmp & 0xff) != 0xff)
 		return NULL_TREE;
 
@@ -1776,19 +1784,28 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int limit)
 	  break;
 	CASE_CONVERT:
 	  {
-	    int type_size;
+	    int type_size, old_type_size;
+	    tree type;
 
-	    type_size = TYPE_PRECISION (gimple_expr_type (stmt));
+	    type = gimple_expr_type (stmt);
+	    type_size = TYPE_PRECISION (type);
 	    if (type_size % BITS_PER_UNIT != 0)
 	      return NULL_TREE;
 
+	    /* Sign extension: result is dependent on the value.  */
+	    old_type_size = TYPE_PRECISION (n->type);
+	    if (!TYPE_UNSIGNED (n->type)
+		&& type_size > old_type_size
+		&& n->n & (0xff << (old_type_size - 8)))
+	      return NULL_TREE;
+
 	    if (type_size / BITS_PER_UNIT < (int)(sizeof (HOST_WIDEST_INT)))
 	      {
 		/* If STMT casts to a smaller type mask out the bits not
 		   belonging to the target type.  */
 		n->n &= ((unsigned HOST_WIDEST_INT)1 << type_size) - 1;
 	      }
-	    n->size = type_size / BITS_PER_UNIT;
+	    n->type = type;
 	  }
 	  break;
 	default:
@@ -1801,7 +1818,7 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int limit)
 
   if (rhs_class == GIMPLE_BINARY_RHS)
     {
-      int i;
+      int i, size;
       struct symbolic_number n1, n2;
       unsigned HOST_WIDEST_INT mask;
       tree source_expr2;
@@ -1825,11 +1842,12 @@ find_bswap_1 (gimple stmt, struct symbolic_number *n, int limit)
 	  source_expr2 = find_bswap_1 (rhs2_stmt, &n2, limit - 1);
 
 	  if (source_expr1 != source_expr2
-	      || n1.size != n2.size)
+	      || TYPE_PRECISION (n1.type) != TYPE_PRECISION (n2.type))
 	    return NULL_TREE;
 
-	  n->size = n1.size;
-	  for (i = 0, mask = 0xff; i < n->size; i++, mask <<= BITS_PER_UNIT)
+	  n->type = n1.type;
+	  size = TYPE_PRECISION (n->type) / BITS_PER_UNIT;
+	  for (i = 0, mask = 0xff; i < size; i++, mask <<= BITS_PER_UNIT)
 	    {
 	      unsigned HOST_WIDEST_INT masked1, masked2;
 
@@ -1868,7 +1886,7 @@ find_bswap (gimple stmt)
 
   struct symbolic_number n;
   tree source_expr;
-  int limit;
+  int limit, bitsize;
 
   /* The last parameter determines the depth search limit.  It usually
      correlates directly to the number of bytes to be touched.  We
@@ -1883,13 +1901,14 @@ find_bswap (gimple stmt)
     return NULL_TREE;
 
   /* Zero out the extra bits of N and CMP.  */
-  if (n.size < (int)sizeof (HOST_WIDEST_INT))
+  bitsize = TYPE_PRECISION (n.type);
+  if (bitsize < 8 * (int)sizeof (HOST_WIDEST_INT))
     {
       unsigned HOST_WIDEST_INT mask =
-	((unsigned HOST_WIDEST_INT)1 << (n.size * BITS_PER_UNIT)) - 1;
+	((unsigned HOST_WIDEST_INT)1 << bitsize) - 1;
 
       n.n &= mask;
-      cmp >>= (sizeof (HOST_WIDEST_INT) - n.size) * BITS_PER_UNIT;
+      cmp >>= sizeof (HOST_WIDEST_INT) * BITS_PER_UNIT - bitsize;
     }
 
   /* A complete byte swap should make the symbolic number to start

Best regards,

Thomas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-18  4:55       ` Thomas Preud'homme
@ 2014-06-18 10:15         ` Richard Biener
  2014-06-18 15:23         ` Marek Polacek
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Biener @ 2014-06-18 10:15 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: GCC Patches

On Wed, Jun 18, 2014 at 6:55 AM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>> Sent: Wednesday, June 11, 2014 4:32 PM
>> >
>> >
>> > Is this OK for trunk? Does this bug qualify for a backport patch to
>> > 4.8 and 4.9 branches?
>>
>> This is ok for trunk and also for backporting (after a short while to
>> see if there is any fallout).
>
> Below is the backported patch for 4.8/4.9. Is this ok for both 4.8 and
> 4.9? If yes, how much longer should I wait before committing?
>
> Tested on both 4.8 and 4.9 without regression in the testsuite after
> a bootstrap.

This is ok to commit now.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-18  4:55       ` Thomas Preud'homme
  2014-06-18 10:15         ` Richard Biener
@ 2014-06-18 15:23         ` Marek Polacek
  2014-06-18 17:53           ` Jakub Jelinek
  1 sibling, 1 reply; 17+ messages in thread
From: Marek Polacek @ 2014-06-18 15:23 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: 'Richard Biener', GCC Patches

On Wed, Jun 18, 2014 at 12:55:01PM +0800, Thomas Preud'homme wrote:
> @@ -1646,20 +1648,23 @@ do_shift_rotate (enum tree_code code,
>        n->n <<= count;
>        break;
>      case RSHIFT_EXPR:
> +      /* Arithmetic shift of signed type: result is dependent on the value.  */
> +      if (!TYPE_UNSIGNED (n->type) && (n->n & (0xff << (bitsize - 8))))
> +	return false;

Looks like undefined behavior happens here:
tree-ssa-math-opts.c:1672:53: runtime error: shift exponent 56 is too
large for 32-bit type 'int'
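
The constant 0xff has type int, so with a 64-bit symbolic number the shift
exponent (bitsize - 8) is 56, which is not smaller than the width of int.
A reduced sketch of the expression in question, assuming such a 64-bit
n->type and a 32-bit int:

  int bitsize = 64;                /* TYPE_PRECISION (n->type) for a 64-bit type */
  n->n & (0xff << (bitsize - 8));  /* 0xff is an int: shifting it by 56 is undefined */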

	Marek

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-18 15:23         ` Marek Polacek
@ 2014-06-18 17:53           ` Jakub Jelinek
  2014-06-19  5:36             ` Thomas Preud'homme
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Jelinek @ 2014-06-18 17:53 UTC (permalink / raw)
  To: Marek Polacek
  Cc: Thomas Preud'homme, 'Richard Biener', GCC Patches

On Wed, Jun 18, 2014 at 05:23:13PM +0200, Marek Polacek wrote:
> On Wed, Jun 18, 2014 at 12:55:01PM +0800, Thomas Preud'homme wrote:
> > @@ -1646,20 +1648,23 @@ do_shift_rotate (enum tree_code code,
> >        n->n <<= count;
> >        break;
> >      case RSHIFT_EXPR:
> > +      /* Arithmetic shift of signed type: result is dependent on the value.  */
> > +      if (!TYPE_UNSIGNED (n->type) && (n->n & (0xff << (bitsize - 8))))
> > +	return false;
> 
> Looks like here an undefined behavior happens:
> tree-ssa-math-opts.c:1672:53: runtime error: shift exponent 56 is too
> large for 32-bit type 'int'

Seems there are actually two spots with this, not just one.

Completely untested fix:

2014-06-18  Jakub Jelinek  <jakub@redhat.com>

	* tree-ssa-math-opts.c (do_shift_rotate, find_bswap_or_nop_1): Cast
	0xff to uint64_t before shifting it up.

--- gcc/tree-ssa-math-opts.c	2014-06-13 08:08:42.354136356 +0200
+++ gcc/tree-ssa-math-opts.c	2014-06-18 19:50:59.486916201 +0200
@@ -1669,7 +1669,8 @@ do_shift_rotate (enum tree_code code,
       break;
     case RSHIFT_EXPR:
       /* Arithmetic shift of signed type: result is dependent on the value.  */
-      if (!TYPE_UNSIGNED (n->type) && (n->n & (0xff << (bitsize - 8))))
+      if (!TYPE_UNSIGNED (n->type)
+	  && (n->n & ((uint64_t) 0xff << (bitsize - 8))))
 	return false;
       n->n >>= count;
       break;
@@ -1903,7 +1904,7 @@ find_bswap_or_nop_1 (gimple stmt, struct
 	    old_type_size = TYPE_PRECISION (n->type);
 	    if (!TYPE_UNSIGNED (n->type)
 		&& type_size > old_type_size
-		&& n->n & (0xff << (old_type_size - 8)))
+		&& n->n & ((uint64_t) 0xff << (old_type_size - 8)))
 	      return NULL_TREE;
 
 	    if (type_size / BITS_PER_UNIT < (int)(sizeof (int64_t)))


	Jakub

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-18 17:53           ` Jakub Jelinek
@ 2014-06-19  5:36             ` Thomas Preud'homme
  2014-06-27  2:30               ` Thomas Preud'homme
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-19  5:36 UTC (permalink / raw)
  To: 'Jakub Jelinek', Marek Polacek
  Cc: 'Richard Biener', GCC Patches

> From: Jakub Jelinek [mailto:jakub@redhat.com]
> Sent: Thursday, June 19, 2014 1:54 AM
> 
> Seems there are actually two spots with this, not just one.
> 
> Completely untested fix:
> 
> 2014-06-18  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* tree-ssa-math-opts.c (do_shift_rotate, find_bswap_or_nop_1):
> Cast
> 	0xff to uint64_t before shifting it up.
> 
> --- gcc/tree-ssa-math-opts.c	2014-06-13 08:08:42.354136356 +0200
> +++ gcc/tree-ssa-math-opts.c	2014-06-18 19:50:59.486916201 +0200
> @@ -1669,7 +1669,8 @@ do_shift_rotate (enum tree_code code,
>        break;
>      case RSHIFT_EXPR:
>        /* Arithmetic shift of signed type: result is dependent on the value.  */
> -      if (!TYPE_UNSIGNED (n->type) && (n->n & (0xff << (bitsize - 8))))
> +      if (!TYPE_UNSIGNED (n->type)
> +	  && (n->n & ((uint64_t) 0xff << (bitsize - 8))))
>  	return false;
>        n->n >>= count;
>        break;
> @@ -1903,7 +1904,7 @@ find_bswap_or_nop_1 (gimple stmt, struct
>  	    old_type_size = TYPE_PRECISION (n->type);
>  	    if (!TYPE_UNSIGNED (n->type)
>  		&& type_size > old_type_size
> -		&& n->n & (0xff << (old_type_size - 8)))
> +		&& n->n & ((uint64_t) 0xff << (old_type_size - 8)))
>  	      return NULL_TREE;
> 
>  	    if (type_size / BITS_PER_UNIT < (int)(sizeof (int64_t)))
> 
> 

Yep, that's the right fix. I tested it both on a bootstrapped gcc on x86_64-linux-gnu
and on an arm-none-eabi cross-compiler, with no regression in the testsuite.

Jakub, since you made the patch, the honor of committing it should be yours.

Richard, given this issue, I think we should wait a few more days before I commit
a backported (and fixed, of course) version to 4.8 and 4.9.

Best regards,

Thomas



^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-19  5:36             ` Thomas Preud'homme
@ 2014-06-27  2:30               ` Thomas Preud'homme
  2014-06-27  8:49                 ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-27  2:30 UTC (permalink / raw)
  To: Thomas Preud'homme, 'Jakub Jelinek', Marek Polacek
  Cc: 'Richard Biener', GCC Patches

> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
> Sent: Thursday, June 19, 2014 1:36 PM
> 
> Richard, given this issue, I think we should wait a few more days before I
> commit
> a backported (and fixed, of course) version to 4.8 and 4.9.

No new issues were reported since then. Is it ok to commit the backport
(with Jakub's fix) now, or should we wait longer?

Best regards,

Thomas


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-27  2:30               ` Thomas Preud'homme
@ 2014-06-27  8:49                 ` Richard Biener
  2014-06-30  3:03                   ` Thomas Preud'homme
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2014-06-27  8:49 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: Jakub Jelinek, Marek Polacek, GCC Patches

On Fri, Jun 27, 2014 at 4:29 AM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
>> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
>> Sent: Thursday, June 19, 2014 1:36 PM
>>
>> Richard, given this issue, I think we should wait a few more days before I
>> commit
>> A backported (and fixed of course) version to 4.8 and 4.9.
>
> No new issues were reported since then. Is it ok to commit the backport
> (with Jakub fix) now or should we wait more?

Fine with me now.

Richard.

> Best regards,
>
> Thomas
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH] Fix PR61306: improve handling of sign and cast in bswap
  2014-06-27  8:49                 ` Richard Biener
@ 2014-06-30  3:03                   ` Thomas Preud'homme
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Preud'homme @ 2014-06-30  3:03 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: Jakub Jelinek, Marek Polacek, GCC Patches

> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Friday, June 27, 2014 4:49 PM
> 
> FIne with me now.

Committed.

Best regards,

Thomas



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2014-06-30  3:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-04  5:59 [PATCH] Fix PR61306: improve handling of sign and cast in bswap Thomas Preud'homme
2014-06-04  9:39 ` Richard Biener
2014-06-04 10:42   ` Thomas Preud'homme
2014-06-04 11:04     ` Richard Biener
2014-06-04 11:16       ` Thomas Preud'homme
2014-06-04 11:18         ` Richard Biener
2014-06-10  8:50   ` Thomas Preud'homme
2014-06-10  8:54     ` Thomas Preud'homme
2014-06-11  8:32     ` Richard Biener
2014-06-18  4:55       ` Thomas Preud'homme
2014-06-18 10:15         ` Richard Biener
2014-06-18 15:23         ` Marek Polacek
2014-06-18 17:53           ` Jakub Jelinek
2014-06-19  5:36             ` Thomas Preud'homme
2014-06-27  2:30               ` Thomas Preud'homme
2014-06-27  8:49                 ` Richard Biener
2014-06-30  3:03                   ` Thomas Preud'homme

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).