public inbox for gcc-patches@gcc.gnu.org
* [PATCH][RFC] Introduce BIT_FIELD_INSERT
@ 2016-05-13 10:51 Richard Biener
  2016-05-16  0:55 ` Bill Schmidt
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Richard Biener @ 2016-05-13 10:51 UTC
  To: gcc-patches


The following patch adds BIT_FIELD_INSERT, an operation to
facilitate doing bitfield inserts on registers (as opposed
to currently where we'd have a BIT_FIELD_REF store).

Originally this was developed as part of bitfield lowering,
where bitfield stores are lowered into read-modify-write
cycles and the modify part, instead of being done by shifting and
masking, is kept in a more high-level form to ease combining them.
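
As a rough illustration (not part of the patch), the shift-and-mask
form of the modify step can be sketched in plain C.  The helper name
insert_bits, the fixed 32-bit word and the numbering of bits from the
least significant end are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: replace BITSIZE bits of
   WORD, starting at bit BITPOS (counted from the least significant bit),
   with the low BITSIZE bits of VALUE.  */
uint32_t
insert_bits (uint32_t word, uint32_t value, unsigned bitpos, unsigned bitsize)
{
  /* The replaced bits have to lie fully inside the word.  */
  assert (bitsize >= 1 && bitsize <= 32 && bitpos <= 32 - bitsize);

  /* Build the field mask without shifting by the full word width.  */
  uint32_t field = bitsize == 32 ? UINT32_MAX : (UINT32_C (1) << bitsize) - 1;
  uint32_t mask = field << bitpos;

  /* Modify step: clear the old field, then OR in the new bits.  */
  return (word & ~mask) | ((value & field) << bitpos);
}
```

A single high-level operation stands for this whole sequence, which
should make two adjacent inserts easier to recognize and combine than
two separate mask/shift/or chains.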

A second use case (the above is still valid) is vector element
inserts which we currently can only do via memory or
by extracting all components and re-building the vector using
a CONSTRUCTOR.  For this second use case I added code
re-writing the BIT_FIELD_REF stores the C family FEs produce
into BIT_FIELD_INSERT when update-address-taken can otherwise
re-write a decl into SSA form (the testcase shows we miss
a similar opportunity with the MEM_REF form of a vector insert;
I plan to fix that for the final submission).
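
To make the vector case concrete, here is a small value-level model
in C (not part of the patch).  The struct v4si_model stands in for a
four-element int vector, and the helper vec_insert is hypothetical;
both are assumptions so that the sketch does not rely on compiler
vector extensions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in for a four-element int vector (an assumption of this sketch).  */
typedef struct { int elt[4]; } v4si_model;

/* Hypothetical model of a vector-element insert: return a copy of VEC in
   which the element starting at bit BITPOS is replaced by VAL.  */
v4si_model
vec_insert (v4si_model vec, int val, size_t bitpos)
{
  size_t elsize = sizeof (int) * CHAR_BIT;

  /* The insert has to start at an element boundary and stay in range.  */
  assert (bitpos % elsize == 0 && bitpos / elsize < 4);

  vec.elt[bitpos / elsize] = val;
  return vec;
}
```

The function takes and returns the vector by value and never takes its
address, which is the property that lets the vector stay in a register.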

One special property of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
is that the size of the insertion is given implicitly via the
type size/precision of the value to insert.  That avoids
introducing ways to have quaternary ops in folding and GIMPLE stmts.
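
The same idea can be mimicked in C, as a sketch only: the macro
INSERT_BITS and the helper insert_bits_sized are hypothetical names,
and the size is taken from sizeof, so it is always a whole number of
bytes, whereas a GCC type precision can be any bit count.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical four-operand form with an explicit size.  */
uint64_t
insert_bits_sized (uint64_t word, uint64_t value, unsigned bitpos,
                   unsigned bitsize)
{
  /* The replaced bits have to lie fully inside the word.  */
  assert (bitsize >= 1 && bitsize <= 64 && bitpos <= 64 - bitsize);

  uint64_t field = bitsize == 64 ? UINT64_MAX : (UINT64_C (1) << bitsize) - 1;
  return (word & ~(field << bitpos)) | ((value & field) << bitpos);
}

/* Three-operand form: the number of replaced bits is not passed by the
   caller but derived from the type of VALUE.  */
#define INSERT_BITS(word, value, bitpos) \
  insert_bits_sized ((word), (uint64_t) (value), (bitpos), \
                     (unsigned) (sizeof (value) * CHAR_BIT))
```

With this shape, INSERT_BITS (w, (uint8_t) x, 8) replaces exactly eight
bits, and changing the type of the value changes the size of the insert
without adding a fourth operand.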

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2011-06-16  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/29756
	* tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
	* expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
	* fold-const.c (operand_equal_p): Likewise.
	(fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
	* gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-ssa-operands.c (get_expr_operands): Likewise.
	* cfgexpand.c (expand_debug_expr): Likewise.
	* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
	* gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
	* tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.

	* tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
	vector inserts using BIT_FIELD_REF on the lhs.
	(execute_update_addresses_taken): Do it.

	* gcc.dg/tree-ssa/vector-6.c: New testcase.

Index: trunk/gcc/expr.c
===================================================================
*** trunk.orig/gcc/expr.c	2016-05-12 13:40:30.704262951 +0200
--- trunk/gcc/expr.c	2016-05-12 15:40:32.481225744 +0200
*************** expand_expr_real_2 (sepops ops, rtx targ
*** 9358,9363 ****
--- 9358,9380 ----
        target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
        return target;
  
+     case BIT_FIELD_INSERT:
+       {
+ 	unsigned bitpos = tree_to_uhwi (treeop2);
+ 	unsigned bitsize;
+ 	if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
+ 	  bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
+ 	else
+ 	  bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
+ 	rtx op0 = expand_normal (treeop0);
+ 	rtx op1 = expand_normal (treeop1);
+ 	rtx dst = gen_reg_rtx (mode);
+ 	emit_move_insn (dst, op0);
+ 	store_bit_field (dst, bitsize, bitpos, 0, 0,
+ 			 TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
+ 	return dst;
+       }
+ 
      default:
        gcc_unreachable ();
      }
Index: trunk/gcc/fold-const.c
===================================================================
*** trunk.orig/gcc/fold-const.c	2016-05-12 13:40:30.704262951 +0200
--- trunk/gcc/fold-const.c	2016-05-13 09:41:13.509812127 +0200
*************** operand_equal_p (const_tree arg0, const_
*** 3163,3168 ****
--- 3163,3169 ----
  
  	case VEC_COND_EXPR:
  	case DOT_PROD_EXPR:
+ 	case BIT_FIELD_INSERT:
  	  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
  
  	default:
*************** fold_ternary_loc (location_t loc, enum t
*** 11870,11875 ****
--- 11871,11916 ----
  	}
        return NULL_TREE;
  
+     case BIT_FIELD_INSERT:
+       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
+       if (TREE_CODE (arg0) == INTEGER_CST
+ 	  && TREE_CODE (arg1) == INTEGER_CST)
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
+ 	  unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
+ 	  wide_int tem = wi::bit_and (arg0,
+ 				      wi::shifted_mask (bitpos, bitsize, true,
+ 							TYPE_PRECISION (type)));
+ 	  wide_int tem2
+ 	    = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
+ 				    bitsize), bitpos);
+ 	  return wide_int_to_tree (type, wi::bit_or (tem, tem2));
+ 	}
+       else if (TREE_CODE (arg0) == VECTOR_CST
+ 	       && CONSTANT_CLASS_P (arg1)
+ 	       && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
+ 				      TREE_TYPE (arg1)))
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
+ 	  unsigned HOST_WIDE_INT elsize
+ 	    = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
+ 	  if (bitpos % elsize == 0)
+ 	    {
+ 	      unsigned k = bitpos / elsize;
+ 	      if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
+ 		return arg0;
+ 	      else
+ 		{
+ 		  tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
+ 		  memcpy (elts, VECTOR_CST_ELTS (arg0),
+ 			  sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
+ 		  elts[k] = arg1;
+ 		  return build_vector (type, elts);
+ 		}
+ 	    }
+ 	}
+       return NULL_TREE;
+ 
      default:
        return NULL_TREE;
      } /* switch (code) */
Index: trunk/gcc/gimplify.c
===================================================================
*** trunk.orig/gcc/gimplify.c	2016-05-12 13:40:30.704262951 +0200
--- trunk/gcc/gimplify.c	2016-05-12 13:56:18.679120641 +0200
*************** gimplify_expr (tree *expr_p, gimple_seq
*** 10936,10941 ****
--- 10936,10945 ----
  	  /* Classified as tcc_expression.  */
  	  goto expr_3;
  
+ 	case BIT_FIELD_INSERT:
+ 	  /* Argument 3 is a constant.  */
+ 	  goto expr_2;
+ 
  	case POINTER_PLUS_EXPR:
  	  {
  	    enum gimplify_status r0, r1;
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2016-05-12 13:40:30.704262951 +0200
--- trunk/gcc/tree-inline.c	2016-05-12 13:42:45.465811959 +0200
*************** estimate_operator_cost (enum tree_code c
*** 3941,3946 ****
--- 3941,3950 ----
          return weights->div_mod_cost;
        return 1;
  
+     /* Bit-field insertion needs several shift and mask operations.  */
+     case BIT_FIELD_INSERT:
+       return 3;
+ 
      default:
        /* We expect a copy assignment with no operator.  */
        gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2016-05-12 13:40:30.704262951 +0200
--- trunk/gcc/tree-pretty-print.c	2016-05-12 14:30:05.781944740 +0200
*************** dump_generic_node (pretty_printer *pp, t
*** 1876,1881 ****
--- 1876,1898 ----
        pp_greater (pp);
        break;
  
+     case BIT_FIELD_INSERT:
+       pp_string (pp, "BIT_FIELD_INSERT <");
+       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+       pp_string (pp, ", ");
+       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+       pp_string (pp, ", ");
+       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
+       pp_string (pp, " (");
+       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
+ 	pp_decimal_int (pp,
+ 			TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
+       else
+ 	dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
+ 			   spc, flags, false);
+       pp_string (pp, " bits)>");
+       break;
+ 
      case ARRAY_REF:
      case ARRAY_RANGE_REF:
        op0 = TREE_OPERAND (node, 0);
Index: trunk/gcc/tree-ssa-operands.c
===================================================================
*** trunk.orig/gcc/tree-ssa-operands.c	2016-05-12 13:42:45.465811959 +0200
--- trunk/gcc/tree-ssa-operands.c	2016-05-12 13:48:26.881736503 +0200
*************** get_expr_operands (struct function *fn,
*** 833,838 ****
--- 833,839 ----
        get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
        return;
  
+     case BIT_FIELD_INSERT:
      case COMPOUND_EXPR:
      case OBJ_TYPE_REF:
      case ASSERT_EXPR:
Index: trunk/gcc/tree.def
===================================================================
*** trunk.orig/gcc/tree.def	2016-05-12 13:40:30.704262951 +0200
--- trunk/gcc/tree.def	2016-05-12 13:47:09.972852423 +0200
*************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
*** 852,857 ****
--- 852,868 ----
     descriptor of type ptr_mode.  */
  DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
  
+ /* Given a word, a value and a bitfield position within the word,
+    produce the value that results if replacing the
+    described parts of word with value.
+    Operand 0 is a tree for the word of integral type;
+    Operand 1 is a tree for the value of integral type;
+    Operand 2 is a tree giving the constant position of the first referenced bit;
+    The number of bits replaced is given by the precision of the value
+    type if that is integral or by its size if it is non-integral.
+    The replaced bits shall be fully inside the word.  */
+ DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
+ 
  /* Given two real or integer operands of the same type,
     returns a complex value of the corresponding complex type.  */
  DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
Index: trunk/gcc/cfgexpand.c
===================================================================
*** trunk.orig/gcc/cfgexpand.c	2016-05-12 13:42:45.469812005 +0200
--- trunk/gcc/cfgexpand.c	2016-05-13 11:48:04.513407495 +0200
*************** expand_debug_expr (tree exp)
*** 5025,5030 ****
--- 5025,5031 ----
      case FIXED_CONVERT_EXPR:
      case OBJ_TYPE_REF:
      case WITH_SIZE_EXPR:
+     case BIT_FIELD_INSERT:
        return NULL;
  
      case DOT_PROD_EXPR:
Index: trunk/gcc/gimple-pretty-print.c
===================================================================
*** trunk.orig/gcc/gimple-pretty-print.c	2016-05-12 11:23:09.261375157 +0200
--- trunk/gcc/gimple-pretty-print.c	2016-05-12 14:57:22.096175579 +0200
*************** dump_ternary_rhs (pretty_printer *buffer
*** 479,484 ****
--- 479,502 ----
        pp_greater (buffer);
        break;
  
+     case BIT_FIELD_INSERT:
+       pp_string (buffer, "BIT_FIELD_INSERT <");
+       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
+       pp_string (buffer, ", ");
+       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
+       pp_string (buffer, ", ");
+       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
+       pp_string (buffer, " (");
+       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
+ 	pp_decimal_int (buffer,
+ 			TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
+       else
+ 	dump_generic_node (buffer,
+ 			   TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
+ 			   spc, flags, false);
+       pp_string (buffer, " bits)>");
+       break;
+ 
      default:
        gcc_unreachable ();
      }
Index: trunk/gcc/gimple.c
===================================================================
*** trunk.orig/gcc/gimple.c	2016-05-12 13:40:30.704262951 +0200
--- trunk/gcc/gimple.c	2016-05-12 14:49:37.066994969 +0200
*************** get_gimple_rhs_num_ops (enum tree_code c
*** 2044,2049 ****
--- 2044,2050 ----
        || (SYM) == REALIGN_LOAD_EXPR					    \
        || (SYM) == VEC_COND_EXPR						    \
        || (SYM) == VEC_PERM_EXPR                                             \
+       || (SYM) == BIT_FIELD_INSERT					    \
        || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS			    \
     : ((SYM) == CONSTRUCTOR						    \
        || (SYM) == OBJ_TYPE_REF						    \
Index: trunk/gcc/tree-cfg.c
===================================================================
*** trunk.orig/gcc/tree-cfg.c	2016-05-06 14:38:33.959495081 +0200
--- trunk/gcc/tree-cfg.c	2016-05-13 09:25:01.670630730 +0200
*************** verify_gimple_assign_ternary (gassign *s
*** 4155,4160 ****
--- 4155,4207 ----
  
        return false;
  
+     case BIT_FIELD_INSERT:
+       if (! useless_type_conversion_p (lhs_type, rhs1_type))
+ 	{
+ 	  error ("type mismatch in BIT_FIELD_INSERT");
+ 	  debug_generic_expr (lhs_type);
+ 	  debug_generic_expr (rhs1_type);
+ 	  return true;
+ 	}
+       if (! ((INTEGRAL_TYPE_P (rhs1_type)
+ 	      && INTEGRAL_TYPE_P (rhs2_type))
+ 	     || (VECTOR_TYPE_P (rhs1_type)
+ 		 && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
+ 	{
+ 	  error ("not allowed type combination in BIT_FIELD_INSERT");
+ 	  debug_generic_expr (rhs1_type);
+ 	  debug_generic_expr (rhs2_type);
+ 	  return true;
+ 	}
+       if (! tree_fits_uhwi_p (rhs3)
+ 	  || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
+ 	{
+ 	  error ("invalid position or size in BIT_FIELD_INSERT");
+ 	  return true;
+ 	}
+       if (INTEGRAL_TYPE_P (rhs1_type))
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
+ 	  if (bitpos >= TYPE_PRECISION (rhs1_type)
+ 	      || (bitpos + TYPE_PRECISION (rhs2_type)
+ 		  > TYPE_PRECISION (rhs1_type)))
+ 	    {
+ 	      error ("insertion out of range in BIT_FIELD_INSERT");
+ 	      return true;
+ 	    }
+ 	}
+       else if (VECTOR_TYPE_P (rhs1_type))
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
+ 	  unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
+ 	  if (bitpos % bitsize != 0)
+ 	    {
+ 	      error ("vector insertion not at element boundary");
+ 	      return true;
+ 	    }
+ 	}
+       return false;
+ 
      case DOT_PROD_EXPR:
      case REALIGN_LOAD_EXPR:
        /* FIXME.  */
Index: trunk/gcc/tree-ssa.c
===================================================================
*** trunk.orig/gcc/tree-ssa.c	2016-05-13 09:38:02.263611726 +0200
--- trunk/gcc/tree-ssa.c	2016-05-13 09:50:31.020226585 +0200
*************** non_rewritable_lvalue_p (tree lhs)
*** 1318,1323 ****
--- 1318,1335 ----
  	return false;
      }
  
+   /* A vector-insert using a BIT_FIELD_REF is rewritable using
+      BIT_FIELD_INSERT.  */
+   if (TREE_CODE (lhs) == BIT_FIELD_REF
+       && DECL_P (TREE_OPERAND (lhs, 0))
+       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
+       /* && bitsize % element-size == 0 */
+       && types_compatible_p (TREE_TYPE (lhs),
+ 			     TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
+       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
+ 	  % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
+     return false;
+ 
    return true;
  }
  
*************** execute_update_addresses_taken (void)
*** 1536,1541 ****
--- 1548,1576 ----
  		    stmt = gsi_stmt (gsi);
  		    unlink_stmt_vdef (stmt);
  		    update_stmt (stmt);
+ 		    continue;
+ 		  }
+ 
+ 		/* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
+ 		   into a BIT_FIELD_INSERT.  */
+ 		if (TREE_CODE (lhs) == BIT_FIELD_REF
+ 		    && DECL_P (TREE_OPERAND (lhs, 0))
+ 		    && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
+ 		    && types_compatible_p (TREE_TYPE (lhs),
+ 					   TREE_TYPE (TREE_TYPE
+ 						       (TREE_OPERAND (lhs, 0))))
+ 		    && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
+ 			% tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
+ 		  {
+ 		    tree var = TREE_OPERAND (lhs, 0);
+ 		    tree val = gimple_assign_rhs1 (stmt);
+ 		    tree bitpos = TREE_OPERAND (lhs, 2);
+ 		    gimple_assign_set_lhs (stmt, var);
+ 		    gimple_assign_set_rhs_with_ops
+ 		      (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
+ 		    stmt = gsi_stmt (gsi);
+ 		    unlink_stmt_vdef (stmt);
+ 		    update_stmt (stmt);
  		    continue;
  		  }
  
Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-13 09:54:16.026814995 +0200
***************
*** 0 ****
--- 1,34 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O -fdump-tree-ccp1" } */
+ 
+ typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
+ 
+ v4si test1 (v4si v, int i)
+ {
+   ((int *)&v)[0] = i;
+   return v;
+ }
+ 
+ v4si test2 (v4si v, int i)
+ {
+   int *p = (int *)&v;
+   *p = i;
+   return v;
+ }
+ 
+ v4si test3 (v4si v, int i)
+ {
+   ((int *)&v)[3] = i;
+   return v;
+ }
+ 
+ v4si test4 (v4si v, int i)
+ {
+   int *p = (int *)&v;
+   p += 3;
+   *p = i;
+   return v;
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
+ /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-13 10:51 [PATCH][RFC] Introduce BIT_FIELD_INSERT Richard Biener
@ 2016-05-16  0:55 ` Bill Schmidt
  2016-05-16 12:37   ` Bill Schmidt
  2016-05-16  8:24 ` Eric Botcazou
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 32+ messages in thread
From: Bill Schmidt @ 2016-05-16  0:55 UTC
  To: Richard Biener; +Cc: gcc-patches

Hi Richard,

(Sorry for duplication to your personal email, I had new-mailer issues.)

The new vector-6 test produces very good code for powerpc64le with this patch:

        addis 9,2,.LC0@toc@ha
        sldi 3,3,32
        addi 9,9,.LC0@toc@l
        rldicl 9,9,0,32
        or 3,9,3
        blr

I did run into some ICEs with bootstrap/regtest, though:

26c26
< /home/wschmidt/gcc/build/gcc-mainline-base/gcc/testsuite/g++/../../xg++  version 7.0.0 20160515 (experimental) [trunk revision 236259] (GCC) 
---
> /home/wschmidt/gcc/build/gcc-mainline-test/gcc/testsuite/g++/../../xg++  version 7.0.0 20160515 (experimental) [trunk revision 236259] (GCC) 
31a32,39
> FAIL: gcc.c-torture/compile/pr70240.c   -O1  (internal compiler error)
> FAIL: gcc.c-torture/compile/pr70240.c   -O1  (test for excess errors)
> FAIL: gcc.c-torture/compile/pr70240.c   -O2  (internal compiler error)
> FAIL: gcc.c-torture/compile/pr70240.c   -O2  (test for excess errors)
> FAIL: gcc.c-torture/compile/pr70240.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> FAIL: gcc.c-torture/compile/pr70240.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> FAIL: gcc.c-torture/compile/pr70240.c   -Os  (internal compiler error)
> FAIL: gcc.c-torture/compile/pr70240.c   -Os  (test for excess errors)
53a62,66
> FAIL: gcc.dg/pr69896.c (internal compiler error)
> FAIL: gcc.dg/pr69896.c (test for excess errors)
> UNRESOLVED: gcc.dg/pr69896.c compilation failed to produce executable
> FAIL: gcc.dg/pr70326.c (internal compiler error)
> FAIL: gcc.dg/pr70326.c (test for excess errors)
281a295,353
> FAIL: gcc.dg/torture/pr69613.c   -O1  (internal compiler error)
> FAIL: gcc.dg/torture/pr69613.c   -O1  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69613.c   -O1  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69613.c   -O2  (internal compiler error)
> FAIL: gcc.dg/torture/pr69613.c   -O2  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69613.c   -O3 -g  (internal compiler error)
> FAIL: gcc.dg/torture/pr69613.c   -O3 -g  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69613.c   -O3 -g  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69613.c   -Os  (internal compiler error)
> FAIL: gcc.dg/torture/pr69613.c   -Os  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69613.c   -Os  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69909.c   -O1  (internal compiler error)
> FAIL: gcc.dg/torture/pr69909.c   -O1  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69909.c   -O1  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69909.c   -O2  (internal compiler error)
> FAIL: gcc.dg/torture/pr69909.c   -O2  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69909.c   -O3 -g  (internal compiler error)
> FAIL: gcc.dg/torture/pr69909.c   -O3 -g  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69909.c   -O3 -g  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr69909.c   -Os  (internal compiler error)
> FAIL: gcc.dg/torture/pr69909.c   -Os  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr69909.c   -Os  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr70083.c   -O1  (internal compiler error)
> FAIL: gcc.dg/torture/pr70083.c   -O2  (internal compiler error)
> FAIL: gcc.dg/torture/pr70083.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> FAIL: gcc.dg/torture/pr70083.c   -O3 -g  (internal compiler error)
> FAIL: gcc.dg/torture/pr70083.c   -Os  (internal compiler error)
> FAIL: gcc.dg/torture/pr70421.c   -O1  (internal compiler error)
> FAIL: gcc.dg/torture/pr70421.c   -O1  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr70421.c   -O1  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr70421.c   -O2  (internal compiler error)
> FAIL: gcc.dg/torture/pr70421.c   -O2  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr70421.c   -O3 -g  (internal compiler error)
> FAIL: gcc.dg/torture/pr70421.c   -O3 -g  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr70421.c   -O3 -g  compilation failed to produce executable
> FAIL: gcc.dg/torture/pr70421.c   -Os  (internal compiler error)
> FAIL: gcc.dg/torture/pr70421.c   -Os  (test for excess errors)
> UNRESOLVED: gcc.dg/torture/pr70421.c   -Os  compilation failed to produce executable
295,296c367,368

Thanks for adding BIT_FIELD_INSERT; I think this will help us in several places.

Bill 

> On May 13, 2016, at 5:51 AM, Richard Biener <rguenther@suse.de> wrote:
> 
> 
> The following patch adds BIT_FIELD_INSERT, an operation to
> facilitate doing bitfield inserts on registers (as opposed
> to currently where we'd have a BIT_FIELD_REF store).
> 
> Originally this was developed as part of bitfield lowering
> where bitfield stores were lowered into read-modify-write
> cycles and the modify part, instead of doing shifting and masking,
> be kept in a more high-level form to ease combining them.
> 
> A second use case (the above is still valid) is vector element
> inserts which we currently can only do via memory or
> by extracting all components and re-building the vector using
> a CONSTRUCTOR.  For this second use case I added code
> re-writing the BIT_FIELD_REF stores the C family FEs produce
> into BIT_FIELD_INSERT when update-address-taken can otherwise
> re-write a decl into SSA form (the testcase shows we miss
> a similar opportunity with the MEM_REF form of a vector insert,
> I plan to fix that for the final submission).
> 
> One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> is that the size of the insertion is given implicitely via the
> type size/precision of the value to insert.  That avoids
> introducing ways to have quaternary ops in folding and GIMPLE stmts.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> Richard.
> 
> 2011-06-16  Richard Guenther  <rguenther@suse.de>
> 
> 	PR tree-optimization/29756
> 	* tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> 	* expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> 	* fold-const.c (operand_equal_p): Likewise.
> 	(fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> 	* gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> 	* tree-inline.c (estimate_operator_cost): Likewise.
> 	* tree-pretty-print.c (dump_generic_node): Likewise.
> 	* tree-ssa-operands.c (get_expr_operands): Likewise.
> 	* cfgexpand.c (expand_debug_expr): Likewise.
> 	* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> 	* gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> 	* tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> 
> 	* tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> 	vector inserts using BIT_FIELD_REF on the lhs.
> 	(execute_update_addresses_taken): Do it.
> 
> 	* gcc.dg/tree-ssa/vector-6.c: New testcase.
> 
> Index: trunk/gcc/expr.c
> ===================================================================
> *** trunk.orig/gcc/expr.c	2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/expr.c	2016-05-12 15:40:32.481225744 +0200
> *************** expand_expr_real_2 (sepops ops, rtx targ
> *** 9358,9363 ****
> --- 9358,9380 ----
>        target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
>        return target;
> 
> +     case BIT_FIELD_INSERT:
> +       {
> + 	unsigned bitpos = tree_to_uhwi (treeop2);
> + 	unsigned bitsize;
> + 	if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> + 	  bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> + 	else
> + 	  bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> + 	rtx op0 = expand_normal (treeop0);
> + 	rtx op1 = expand_normal (treeop1);
> + 	rtx dst = gen_reg_rtx (mode);
> + 	emit_move_insn (dst, op0);
> + 	store_bit_field (dst, bitsize, bitpos, 0, 0,
> + 			 TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> + 	return dst;
> +       }
> + 
>      default:
>        gcc_unreachable ();
>      }
> Index: trunk/gcc/fold-const.c
> ===================================================================
> *** trunk.orig/gcc/fold-const.c	2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/fold-const.c	2016-05-13 09:41:13.509812127 +0200
> *************** operand_equal_p (const_tree arg0, const_
> *** 3163,3168 ****
> --- 3163,3169 ----
> 
>  	case VEC_COND_EXPR:
>  	case DOT_PROD_EXPR:
> + 	case BIT_FIELD_INSERT:
>  	  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> 
>  	default:
> *************** fold_ternary_loc (location_t loc, enum t
> *** 11870,11875 ****
> --- 11871,11916 ----
>  	}
>        return NULL_TREE;
> 
> +     case BIT_FIELD_INSERT:
> +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> +       if (TREE_CODE (arg0) == INTEGER_CST
> + 	  && TREE_CODE (arg1) == INTEGER_CST)
> + 	{
> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> + 	  unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> + 	  wide_int tem = wi::bit_and (arg0,
> + 				      wi::shifted_mask (bitpos, bitsize, true,
> + 							TYPE_PRECISION (type)));
> + 	  wide_int tem2
> + 	    = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> + 				    bitsize), bitpos);
> + 	  return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> + 	}
> +       else if (TREE_CODE (arg0) == VECTOR_CST
> + 	       && CONSTANT_CLASS_P (arg1)
> + 	       && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> + 				      TREE_TYPE (arg1)))
> + 	{
> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> + 	  unsigned HOST_WIDE_INT elsize
> + 	    = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> + 	  if (bitpos % elsize == 0)
> + 	    {
> + 	      unsigned k = bitpos / elsize;
> + 	      if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> + 		return arg0;
> + 	      else
> + 		{
> + 		  tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> + 		  memcpy (elts, VECTOR_CST_ELTS (arg0),
> + 			  sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> + 		  elts[k] = arg1;
> + 		  return build_vector (type, elts);
> + 		}
> + 	    }
> + 	}
> +       return NULL_TREE;
> + 
>      default:
>        return NULL_TREE;
>      } /* switch (code) */
> Index: trunk/gcc/gimplify.c
> ===================================================================
> *** trunk.orig/gcc/gimplify.c	2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/gimplify.c	2016-05-12 13:56:18.679120641 +0200
> *************** gimplify_expr (tree *expr_p, gimple_seq
> *** 10936,10941 ****
> --- 10936,10945 ----
>  	  /* Classified as tcc_expression.  */
>  	  goto expr_3;
> 
> + 	case BIT_FIELD_INSERT:
> + 	  /* Argument 3 is a constant.  */
> + 	  goto expr_2;
> + 
>  	case POINTER_PLUS_EXPR:
>  	  {
>  	    enum gimplify_status r0, r1;
> Index: trunk/gcc/tree-inline.c
> ===================================================================
> *** trunk.orig/gcc/tree-inline.c	2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/tree-inline.c	2016-05-12 13:42:45.465811959 +0200
> *************** estimate_operator_cost (enum tree_code c
> *** 3941,3946 ****
> --- 3941,3950 ----
>          return weights->div_mod_cost;
>        return 1;
> 
> +     /* Bit-field insertion needs several shift and mask operations.  */
> +     case BIT_FIELD_INSERT:
> +       return 3;
> + 
>      default:
>        /* We expect a copy assignment with no operator.  */
>        gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> Index: trunk/gcc/tree-pretty-print.c
> ===================================================================
> *** trunk.orig/gcc/tree-pretty-print.c	2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/tree-pretty-print.c	2016-05-12 14:30:05.781944740 +0200
> *************** dump_generic_node (pretty_printer *pp, t
> *** 1876,1881 ****
> --- 1876,1898 ----
>        pp_greater (pp);
>        break;
> 
> +     case BIT_FIELD_INSERT:
> +       pp_string (pp, "BIT_FIELD_INSERT <");
> +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> +       pp_string (pp, ", ");
> +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> +       pp_string (pp, ", ");
> +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> +       pp_string (pp, " (");
> +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> + 	pp_decimal_int (pp,
> + 			TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> +       else
> + 	dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> + 			   spc, flags, false);
> +       pp_string (pp, " bits)>");
> +       break;
> + 
>      case ARRAY_REF:
>      case ARRAY_RANGE_REF:
>        op0 = TREE_OPERAND (node, 0);
> Index: trunk/gcc/tree-ssa-operands.c
> ===================================================================
> *** trunk.orig/gcc/tree-ssa-operands.c	2016-05-12 13:42:45.465811959 +0200
> --- trunk/gcc/tree-ssa-operands.c	2016-05-12 13:48:26.881736503 +0200
> *************** get_expr_operands (struct function *fn,
> *** 833,838 ****
> --- 833,839 ----
>        get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
>        return;
> 
> +     case BIT_FIELD_INSERT:
>      case COMPOUND_EXPR:
>      case OBJ_TYPE_REF:
>      case ASSERT_EXPR:
> Index: trunk/gcc/tree.def
> ===================================================================
> *** trunk.orig/gcc/tree.def	2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/tree.def	2016-05-12 13:47:09.972852423 +0200
> *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> *** 852,857 ****
> --- 852,868 ----
>     descriptor of type ptr_mode.  */
>  DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> 
> + /* Given a word, a value and a bitfield position within the word,
> +    produce the value that results if replacing the
> +    described parts of word with value.
> +    Operand 0 is a tree for the word of integral type;
> +    Operand 1 is a tree for the value of integral type;
> +    Operand 2 is a tree giving the constant position of the first referenced bit;
> +    The number of bits replaced is given by the precision of the value
> +    type if that is integral or by its size if it is non-integral.
> +    The replaced bits shall be fully inside the word.  */
> + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> + 
>  /* Given two real or integer operands of the same type,
>     returns a complex value of the corresponding complex type.  */
>  DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> Index: trunk/gcc/cfgexpand.c
> ===================================================================
> *** trunk.orig/gcc/cfgexpand.c	2016-05-12 13:42:45.469812005 +0200
> --- trunk/gcc/cfgexpand.c	2016-05-13 11:48:04.513407495 +0200
> *************** expand_debug_expr (tree exp)
> *** 5025,5030 ****
> --- 5025,5031 ----
>      case FIXED_CONVERT_EXPR:
>      case OBJ_TYPE_REF:
>      case WITH_SIZE_EXPR:
> +     case BIT_FIELD_INSERT:
>        return NULL;
> 
>      case DOT_PROD_EXPR:
> Index: trunk/gcc/gimple-pretty-print.c
> ===================================================================
> *** trunk.orig/gcc/gimple-pretty-print.c	2016-05-12 11:23:09.261375157 +0200
> --- trunk/gcc/gimple-pretty-print.c	2016-05-12 14:57:22.096175579 +0200
> *************** dump_ternary_rhs (pretty_printer *buffer
> *** 479,484 ****
> --- 479,502 ----
>        pp_greater (buffer);
>        break;
> 
> +     case BIT_FIELD_INSERT:
> +       pp_string (buffer, "BIT_FIELD_INSERT <");
> +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> +       pp_string (buffer, ", ");
> +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> +       pp_string (buffer, ", ");
> +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> +       pp_string (buffer, " (");
> +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> + 	pp_decimal_int (buffer,
> + 			TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> +       else
> + 	dump_generic_node (buffer,
> + 			   TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> + 			   spc, flags, false);
> +       pp_string (buffer, " bits)>");
> +       break;
> + 
>      default:
>        gcc_unreachable ();
>      }
> Index: trunk/gcc/gimple.c
> ===================================================================
> *** trunk.orig/gcc/gimple.c	2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/gimple.c	2016-05-12 14:49:37.066994969 +0200
> *************** get_gimple_rhs_num_ops (enum tree_code c
> *** 2044,2049 ****
> --- 2044,2050 ----
>        || (SYM) == REALIGN_LOAD_EXPR					    \
>        || (SYM) == VEC_COND_EXPR						    \
>        || (SYM) == VEC_PERM_EXPR                                             \
> +       || (SYM) == BIT_FIELD_INSERT					    \
>        || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS			    \
>     : ((SYM) == CONSTRUCTOR						    \
>        || (SYM) == OBJ_TYPE_REF						    \
> Index: trunk/gcc/tree-cfg.c
> ===================================================================
> *** trunk.orig/gcc/tree-cfg.c	2016-05-06 14:38:33.959495081 +0200
> --- trunk/gcc/tree-cfg.c	2016-05-13 09:25:01.670630730 +0200
> *************** verify_gimple_assign_ternary (gassign *s
> *** 4155,4160 ****
> --- 4155,4207 ----
> 
>        return false;
> 
> +     case BIT_FIELD_INSERT:
> +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> + 	{
> + 	  error ("type mismatch in BIT_FIELD_INSERT");
> + 	  debug_generic_expr (lhs_type);
> + 	  debug_generic_expr (rhs1_type);
> + 	  return true;
> + 	}
> +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> + 	      && INTEGRAL_TYPE_P (rhs2_type))
> + 	     || (VECTOR_TYPE_P (rhs1_type)
> + 		 && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> + 	{
> + 	  error ("not allowed type combination in BIT_FIELD_INSERT");
> + 	  debug_generic_expr (rhs1_type);
> + 	  debug_generic_expr (rhs2_type);
> + 	  return true;
> + 	}
> +       if (! tree_fits_uhwi_p (rhs3)
> + 	  || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> + 	{
> + 	  error ("invalid position or size in BIT_FIELD_INSERT");
> + 	  return true;
> + 	}
> +       if (INTEGRAL_TYPE_P (rhs1_type))
> + 	{
> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> + 	  if (bitpos >= TYPE_PRECISION (rhs1_type)
> + 	      || (bitpos + TYPE_PRECISION (rhs2_type)
> + 		  > TYPE_PRECISION (rhs1_type)))
> + 	    {
> + 	      error ("insertion out of range in BIT_FIELD_INSERT");
> + 	      return true;
> + 	    }
> + 	}
> +       else if (VECTOR_TYPE_P (rhs1_type))
> + 	{
> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> + 	  unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> + 	  if (bitpos % bitsize != 0)
> + 	    {
> + 	      error ("vector insertion not at element boundary");
> + 	      return true;
> + 	    }
> + 	}
> +       return false;
> + 
>      case DOT_PROD_EXPR:
>      case REALIGN_LOAD_EXPR:
>        /* FIXME.  */
> Index: trunk/gcc/tree-ssa.c
> ===================================================================
> *** trunk.orig/gcc/tree-ssa.c	2016-05-13 09:38:02.263611726 +0200
> --- trunk/gcc/tree-ssa.c	2016-05-13 09:50:31.020226585 +0200
> *************** non_rewritable_lvalue_p (tree lhs)
> *** 1318,1323 ****
> --- 1318,1335 ----
>  	return false;
>      }
> 
> +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> +      BIT_FIELD_INSERT.  */
> +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> +       && DECL_P (TREE_OPERAND (lhs, 0))
> +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> +       /* && bitsize % element-size == 0 */
> +       && types_compatible_p (TREE_TYPE (lhs),
> + 			     TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> + 	  % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> +     return false;
> + 
>    return true;
>  }
> 
> *************** execute_update_addresses_taken (void)
> *** 1536,1541 ****
> --- 1548,1576 ----
>  		    stmt = gsi_stmt (gsi);
>  		    unlink_stmt_vdef (stmt);
>  		    update_stmt (stmt);
> + 		    continue;
> + 		  }
> + 
> + 		/* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> + 		   into a BIT_FIELD_INSERT.  */
> + 		if (TREE_CODE (lhs) == BIT_FIELD_REF
> + 		    && DECL_P (TREE_OPERAND (lhs, 0))
> + 		    && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> + 		    && types_compatible_p (TREE_TYPE (lhs),
> + 					   TREE_TYPE (TREE_TYPE
> + 						       (TREE_OPERAND (lhs, 0))))
> + 		    && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> + 			% tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> + 		  {
> + 		    tree var = TREE_OPERAND (lhs, 0);
> + 		    tree val = gimple_assign_rhs1 (stmt);
> + 		    tree bitpos = TREE_OPERAND (lhs, 2);
> + 		    gimple_assign_set_lhs (stmt, var);
> + 		    gimple_assign_set_rhs_with_ops
> + 		      (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> + 		    stmt = gsi_stmt (gsi);
> + 		    unlink_stmt_vdef (stmt);
> + 		    update_stmt (stmt);
>  		    continue;
>  		  }
> 
> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> ===================================================================
> *** /dev/null	1970-01-01 00:00:00.000000000 +0000
> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-13 09:54:16.026814995 +0200
> ***************
> *** 0 ****
> --- 1,34 ----
> + /* { dg-do compile } */
> + /* { dg-options "-O -fdump-tree-ccp1" } */
> + 
> + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> + 
> + v4si test1 (v4si v, int i)
> + {
> +   ((int *)&v)[0] = i;
> +   return v;
> + }
> + 
> + v4si test2 (v4si v, int i)
> + {
> +   int *p = (int *)&v;
> +   *p = i;
> +   return v;
> + }
> + 
> + v4si test3 (v4si v, int i)
> + {
> +   ((int *)&v)[3] = i;
> +   return v;
> + }
> + 
> + v4si test4 (v4si v, int i)
> + {
> +   int *p = (int *)&v;
> +   p += 3;
> +   *p = i;
> +   return v;
> + }
> + 
> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> 


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-13 10:51 [PATCH][RFC] Introduce BIT_FIELD_INSERT Richard Biener
  2016-05-16  0:55 ` Bill Schmidt
@ 2016-05-16  8:24 ` Eric Botcazou
  2016-05-17  7:50   ` Richard Biener
  2016-05-20 14:11 ` Andi Kleen
  2018-11-15  1:27 ` Andrew Pinski
  3 siblings, 1 reply; 32+ messages in thread
From: Eric Botcazou @ 2016-05-16  8:24 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

> The following patch adds BIT_FIELD_INSERT, an operation to
> facilitate doing bitfield inserts on registers (as opposed
> to currently where we'd have a BIT_FIELD_REF store).

Why not call it BIT_FIELD_INSERT_EXPR instead to make it clear that it's an 
expression and not a mere operation?

> Originally this was developed as part of bitfield lowering
> where bitfield stores were lowered into read-modify-write
> cycles and the modify part, instead of doing shifting and masking,
> be kept in a more high-level form to ease combining them.
> 
> A second use case (the above is still valid) is vector element
> inserts which we currently can only do via memory or
> by extracting all components and re-building the vector using
> a CONSTRUCTOR.  For this second use case I added code
> re-writing the BIT_FIELD_REF stores the C family FEs produce
> into BIT_FIELD_INSERT when update-address-taken can otherwise
> re-write a decl into SSA form (the testcase shows we miss
> a similar opportunity with the MEM_REF form of a vector insert,
> I plan to fix that for the final submission).

The description in tree.def looks off then: it only mentions words and
integral types, not the vector case.
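
For reference, the vector form that the comment leaves out behaves roughly as
sketched below.  This is plain C rather than GCC internals: struct v4 stands in
for the v4si of the testcase and vec_insert_at_bitpos is a hypothetical helper
name.  The element-boundary assertion mirrors the bitpos % bitsize check in the
tree-cfg.c hunk; the in-range assertion is an extra assumption of the sketch,
since the posted verifier does not check it for vectors.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a 4 x int vector (v4si in the testcase).  */
struct v4 { int elt[4]; };

/* Return a copy of VEC with the element starting at bit BITPOS replaced
   by VAL.  BITPOS must be a multiple of the element size in bits.  */
static struct v4
vec_insert_at_bitpos (struct v4 vec, int val, size_t bitpos)
{
  const size_t elsize = sizeof (int) * CHAR_BIT;

  /* Same condition as the "vector insertion not at element boundary"
     check in the verifier.  */
  assert (bitpos % elsize == 0);
  /* Assumption of this sketch only: the element lies inside the vector.  */
  assert (bitpos / elsize < 4);

  vec.elt[bitpos / elsize] = val;
  return vec;
}
```

The argument is passed and returned by value, so the input is left unchanged,
which matches the register-style (SSA) use the patch is aiming for.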

> One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> is that the size of the insertion is given implicitly via the
> type size/precision of the value to insert.  That avoids
> introducing ways to have quaternary ops in folding and GIMPLE stmts.

Yes, it's a bit unfortunate, but sensible.  Maybe add a ??? note about that.
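
To make the implicit size concrete, here is a minimal sketch of the integral
case in plain C.  The names bit_field_insert and insert_u8 are hypothetical,
and the sketch assumes bits are counted from the least significant end, as the
shift-based constant folding in the fold-const.c hunk appears to do; expansion
on a real target may number bits differently.

```c
#include <assert.h>
#include <stdint.h>

/* Replace BITSIZE bits of WORD, starting at bit BITPOS counted from the
   least significant bit, with the low BITSIZE bits of VALUE.  */
static uint32_t
bit_field_insert (uint32_t word, uint32_t value,
                  unsigned bitpos, unsigned bitsize)
{
  /* The replaced bits shall be fully inside the word.  */
  assert (bitsize >= 1 && bitsize <= 32);
  assert (bitpos < 32 && bitpos + bitsize <= 32);

  /* Avoid shifting by the full width, which is undefined in C.  */
  uint32_t field = bitsize == 32 ? UINT32_MAX
                                 : ((UINT32_C (1) << bitsize) - 1);
  uint32_t mask = field << bitpos;
  return (word & ~mask) | ((value << bitpos) & mask);
}

/* The width is not passed explicitly: it follows from the type of the
   inserted value, here 8 bits for uint8_t.  */
static uint32_t
insert_u8 (uint32_t word, uint8_t value, unsigned bitpos)
{
  return bit_field_insert (word, value, bitpos, 8);
}
```

With this shape the operation stays ternary (word, value, position), which is
the point of deriving the size from the value's type.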

-- 
Eric Botcazou


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-16  0:55 ` Bill Schmidt
@ 2016-05-16 12:37   ` Bill Schmidt
  2016-05-17  7:52     ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Bill Schmidt @ 2016-05-16 12:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Sorry, that was the wrong vector-6.c; I should have realized.  In any case, for each of the vector tests, we get appropriate use of element-wise loads and no load-hit-store bitfield assignments, so the code generation is what we want to see.  Sorry for the misleading information.

Bill

> On May 15, 2016, at 7:55 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
> 
> Hi Richard,
> 
> (Sorry for duplication to your personal email, I had new-mailer issues.)
> 
> The new vector-6 test produces very good code for powerpc64le with this patch:
> 
>        addis 9,2,.LC0@toc@ha
>        sldi 3,3,32
>        addi 9,9,.LC0@toc@l
>        rldicl 9,9,0,32
>        or 3,9,3
>        blr
> 
> I did run into some ICEs with bootstrap/regtest, though:
> 
> 26c26
> < /home/wschmidt/gcc/build/gcc-mainline-base/gcc/testsuite/g++/../../xg++  version 7.0.0 20160515 (experimental) [trunk revision 236259] (GCC) 
> ---
>> /home/wschmidt/gcc/build/gcc-mainline-test/gcc/testsuite/g++/../../xg++  version 7.0.0 20160515 (experimental) [trunk revision 236259] (GCC) 
> 31a32,39
>> FAIL: gcc.c-torture/compile/pr70240.c   -O1  (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr70240.c   -O1  (test for excess errors)
>> FAIL: gcc.c-torture/compile/pr70240.c   -O2  (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr70240.c   -O2  (test for excess errors)
>> FAIL: gcc.c-torture/compile/pr70240.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr70240.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
>> FAIL: gcc.c-torture/compile/pr70240.c   -Os  (internal compiler error)
>> FAIL: gcc.c-torture/compile/pr70240.c   -Os  (test for excess errors)
> 53a62,66
>> FAIL: gcc.dg/pr69896.c (internal compiler error)
>> FAIL: gcc.dg/pr69896.c (test for excess errors)
>> UNRESOLVED: gcc.dg/pr69896.c compilation failed to produce executable
>> FAIL: gcc.dg/pr70326.c (internal compiler error)
>> FAIL: gcc.dg/pr70326.c (test for excess errors)
> 281a295,353
>> FAIL: gcc.dg/torture/pr69613.c   -O1  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69613.c   -O1  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69613.c   -O1  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69613.c   -O2  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69613.c   -O2  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69613.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69613.c   -O3 -g  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69613.c   -O3 -g  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69613.c   -Os  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69613.c   -Os  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69613.c   -Os  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69909.c   -O1  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69909.c   -O1  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69909.c   -O1  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69909.c   -O2  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69909.c   -O2  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69909.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69909.c   -O3 -g  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69909.c   -O3 -g  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr69909.c   -Os  (internal compiler error)
>> FAIL: gcc.dg/torture/pr69909.c   -Os  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr69909.c   -Os  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr70083.c   -O1  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70083.c   -O2  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70083.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70083.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70083.c   -Os  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70421.c   -O1  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70421.c   -O1  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr70421.c   -O1  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr70421.c   -O2  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70421.c   -O2  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr70421.c   -O3 -g  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70421.c   -O3 -g  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr70421.c   -O3 -g  compilation failed to produce executable
>> FAIL: gcc.dg/torture/pr70421.c   -Os  (internal compiler error)
>> FAIL: gcc.dg/torture/pr70421.c   -Os  (test for excess errors)
>> UNRESOLVED: gcc.dg/torture/pr70421.c   -Os  compilation failed to produce executable
> 295,296c367,368
> 
> Thanks for adding BIT_FIELD_INSERT, I think this will help us in several places.
> 
> Bill 
> 
>> On May 13, 2016, at 5:51 AM, Richard Biener <rguenther@suse.de> wrote:
>> 
>> 
>> The following patch adds BIT_FIELD_INSERT, an operation to
>> facilitate doing bitfield inserts on registers (as opposed
>> to currently where we'd have a BIT_FIELD_REF store).
>> 
>> Originally this was developed as part of bitfield lowering
>> where bitfield stores were lowered into read-modify-write
>> cycles and the modify part, instead of doing shifting and masking,
>> be kept in a more high-level form to ease combining them.
>> 
>> A second use case (the above is still valid) is vector element
>> inserts which we currently can only do via memory or
>> by extracting all components and re-building the vector using
>> a CONSTRUCTOR.  For this second use case I added code
>> re-writing the BIT_FIELD_REF stores the C family FEs produce
>> into BIT_FIELD_INSERT when update-address-taken can otherwise
>> re-write a decl into SSA form (the testcase shows we miss
>> a similar opportunity with the MEM_REF form of a vector insert,
>> I plan to fix that for the final submission).
>> 
>> One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
>> is that the size of the insertion is given implicitly via the
>> type size/precision of the value to insert.  That avoids
>> introducing ways to have quaternary ops in folding and GIMPLE stmts.
>> 
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> 
>> Richard.
>> 
>> 2011-06-16  Richard Guenther  <rguenther@suse.de>
>> 
>> 	PR tree-optimization/29756
>> 	* tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
>> 	* expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
>> 	* fold-const.c (operand_equal_p): Likewise.
>> 	(fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
>> 	* gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
>> 	* tree-inline.c (estimate_operator_cost): Likewise.
>> 	* tree-pretty-print.c (dump_generic_node): Likewise.
>> 	* tree-ssa-operands.c (get_expr_operands): Likewise.
>> 	* cfgexpand.c (expand_debug_expr): Likewise.
>> 	* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
>> 	* gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
>> 	* tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
>> 
>> 	* tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
>> 	vector inserts using BIT_FIELD_REF on the lhs.
>> 	(execute_update_addresses_taken): Do it.
>> 
>> 	* gcc.dg/tree-ssa/vector-6.c: New testcase.
>> 
>> Index: trunk/gcc/expr.c
>> ===================================================================
>> *** trunk.orig/gcc/expr.c	2016-05-12 13:40:30.704262951 +0200
>> --- trunk/gcc/expr.c	2016-05-12 15:40:32.481225744 +0200
>> *************** expand_expr_real_2 (sepops ops, rtx targ
>> *** 9358,9363 ****
>> --- 9358,9380 ----
>>       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
>>       return target;
>> 
>> +     case BIT_FIELD_INSERT:
>> +       {
>> + 	unsigned bitpos = tree_to_uhwi (treeop2);
>> + 	unsigned bitsize;
>> + 	if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
>> + 	  bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
>> + 	else
>> + 	  bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
>> + 	rtx op0 = expand_normal (treeop0);
>> + 	rtx op1 = expand_normal (treeop1);
>> + 	rtx dst = gen_reg_rtx (mode);
>> + 	emit_move_insn (dst, op0);
>> + 	store_bit_field (dst, bitsize, bitpos, 0, 0,
>> + 			 TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
>> + 	return dst;
>> +       }
>> + 
>>     default:
>>       gcc_unreachable ();
>>     }
>> Index: trunk/gcc/fold-const.c
>> ===================================================================
>> *** trunk.orig/gcc/fold-const.c	2016-05-12 13:40:30.704262951 +0200
>> --- trunk/gcc/fold-const.c	2016-05-13 09:41:13.509812127 +0200
>> *************** operand_equal_p (const_tree arg0, const_
>> *** 3163,3168 ****
>> --- 3163,3169 ----
>> 
>> 	case VEC_COND_EXPR:
>> 	case DOT_PROD_EXPR:
>> + 	case BIT_FIELD_INSERT:
>> 	  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
>> 
>> 	default:
>> *************** fold_ternary_loc (location_t loc, enum t
>> *** 11870,11875 ****
>> --- 11871,11916 ----
>> 	}
>>       return NULL_TREE;
>> 
>> +     case BIT_FIELD_INSERT:
>> +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
>> +       if (TREE_CODE (arg0) == INTEGER_CST
>> + 	  && TREE_CODE (arg1) == INTEGER_CST)
>> + 	{
>> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
>> + 	  unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
>> + 	  wide_int tem = wi::bit_and (arg0,
>> + 				      wi::shifted_mask (bitpos, bitsize, true,
>> + 							TYPE_PRECISION (type)));
>> + 	  wide_int tem2
>> + 	    = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
>> + 				    bitsize), bitpos);
>> + 	  return wide_int_to_tree (type, wi::bit_or (tem, tem2));
>> + 	}
>> +       else if (TREE_CODE (arg0) == VECTOR_CST
>> + 	       && CONSTANT_CLASS_P (arg1)
>> + 	       && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
>> + 				      TREE_TYPE (arg1)))
>> + 	{
>> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
>> + 	  unsigned HOST_WIDE_INT elsize
>> + 	    = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
>> + 	  if (bitpos % elsize == 0)
>> + 	    {
>> + 	      unsigned k = bitpos / elsize;
>> + 	      if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
>> + 		return arg0;
>> + 	      else
>> + 		{
>> + 		  tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
>> + 		  memcpy (elts, VECTOR_CST_ELTS (arg0),
>> + 			  sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
>> + 		  elts[k] = arg1;
>> + 		  return build_vector (type, elts);
>> + 		}
>> + 	    }
>> + 	}
>> +       return NULL_TREE;
>> + 
>>     default:
>>       return NULL_TREE;
>>     } /* switch (code) */
>> Index: trunk/gcc/gimplify.c
>> ===================================================================
>> *** trunk.orig/gcc/gimplify.c	2016-05-12 13:40:30.704262951 +0200
>> --- trunk/gcc/gimplify.c	2016-05-12 13:56:18.679120641 +0200
>> *************** gimplify_expr (tree *expr_p, gimple_seq
>> *** 10936,10941 ****
>> --- 10936,10945 ----
>> 	  /* Classified as tcc_expression.  */
>> 	  goto expr_3;
>> 
>> + 	case BIT_FIELD_INSERT:
>> + 	  /* Argument 3 is a constant.  */
>> + 	  goto expr_2;
>> + 
>> 	case POINTER_PLUS_EXPR:
>> 	  {
>> 	    enum gimplify_status r0, r1;
>> Index: trunk/gcc/tree-inline.c
>> ===================================================================
>> *** trunk.orig/gcc/tree-inline.c	2016-05-12 13:40:30.704262951 +0200
>> --- trunk/gcc/tree-inline.c	2016-05-12 13:42:45.465811959 +0200
>> *************** estimate_operator_cost (enum tree_code c
>> *** 3941,3946 ****
>> --- 3941,3950 ----
>>         return weights->div_mod_cost;
>>       return 1;
>> 
>> +     /* Bit-field insertion needs several shift and mask operations.  */
>> +     case BIT_FIELD_INSERT:
>> +       return 3;
>> + 
>>     default:
>>       /* We expect a copy assignment with no operator.  */
>>       gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
>> Index: trunk/gcc/tree-pretty-print.c
>> ===================================================================
>> *** trunk.orig/gcc/tree-pretty-print.c	2016-05-12 13:40:30.704262951 +0200
>> --- trunk/gcc/tree-pretty-print.c	2016-05-12 14:30:05.781944740 +0200
>> *************** dump_generic_node (pretty_printer *pp, t
>> *** 1876,1881 ****
>> --- 1876,1898 ----
>>       pp_greater (pp);
>>       break;
>> 
>> +     case BIT_FIELD_INSERT:
>> +       pp_string (pp, "BIT_FIELD_INSERT <");
>> +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
>> +       pp_string (pp, ", ");
>> +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
>> +       pp_string (pp, ", ");
>> +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
>> +       pp_string (pp, " (");
>> +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
>> + 	pp_decimal_int (pp,
>> + 			TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
>> +       else
>> + 	dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
>> + 			   spc, flags, false);
>> +       pp_string (pp, " bits)>");
>> +       break;
>> + 
>>     case ARRAY_REF:
>>     case ARRAY_RANGE_REF:
>>       op0 = TREE_OPERAND (node, 0);
>> Index: trunk/gcc/tree-ssa-operands.c
>> ===================================================================
>> *** trunk.orig/gcc/tree-ssa-operands.c	2016-05-12 13:42:45.465811959 +0200
>> --- trunk/gcc/tree-ssa-operands.c	2016-05-12 13:48:26.881736503 +0200
>> *************** get_expr_operands (struct function *fn,
>> *** 833,838 ****
>> --- 833,839 ----
>>       get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
>>       return;
>> 
>> +     case BIT_FIELD_INSERT:
>>     case COMPOUND_EXPR:
>>     case OBJ_TYPE_REF:
>>     case ASSERT_EXPR:
>> Index: trunk/gcc/tree.def
>> ===================================================================
>> *** trunk.orig/gcc/tree.def	2016-05-12 13:40:30.704262951 +0200
>> --- trunk/gcc/tree.def	2016-05-12 13:47:09.972852423 +0200
>> *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
>> *** 852,857 ****
>> --- 852,868 ----
>>    descriptor of type ptr_mode.  */
>> DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
>> 
>> + /* Given a word, a value and a bitfield position within the word,
>> +    produce the value that results if replacing the
>> +    described parts of word with value.
>> +    Operand 0 is a tree for the word of integral type;
>> +    Operand 1 is a tree for the value of integral type;
>> +    Operand 2 is a tree giving the constant position of the first referenced bit;
>> +    The number of bits replaced is given by the precision of the value
>> +    type if that is integral or by its size if it is non-integral.
>> +    The replaced bits shall be fully inside the word.  */
>> + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
>> + 
>> /* Given two real or integer operands of the same type,
>>    returns a complex value of the corresponding complex type.  */
>> DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
>> Index: trunk/gcc/cfgexpand.c
>> ===================================================================
>> *** trunk.orig/gcc/cfgexpand.c	2016-05-12 13:42:45.469812005 +0200
>> --- trunk/gcc/cfgexpand.c	2016-05-13 11:48:04.513407495 +0200
>> *************** expand_debug_expr (tree exp)
>> *** 5025,5030 ****
>> --- 5025,5031 ----
>>     case FIXED_CONVERT_EXPR:
>>     case OBJ_TYPE_REF:
>>     case WITH_SIZE_EXPR:
>> +     case BIT_FIELD_INSERT:
>>       return NULL;
>> 
>>     case DOT_PROD_EXPR:
>> Index: trunk/gcc/gimple-pretty-print.c
>> ===================================================================
>> *** trunk.orig/gcc/gimple-pretty-print.c	2016-05-12 11:23:09.261375157 +0200
>> --- trunk/gcc/gimple-pretty-print.c	2016-05-12 14:57:22.096175579 +0200
>> *************** dump_ternary_rhs (pretty_printer *buffer
>> *** 479,484 ****
>> --- 479,502 ----
>>       pp_greater (buffer);
>>       break;
>> 
>> +     case BIT_FIELD_INSERT:
>> +       pp_string (buffer, "BIT_FIELD_INSERT <");
>> +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
>> +       pp_string (buffer, ", ");
>> +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
>> +       pp_string (buffer, ", ");
>> +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
>> +       pp_string (buffer, " (");
>> +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
>> + 	pp_decimal_int (buffer,
>> + 			TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
>> +       else
>> + 	dump_generic_node (buffer,
>> + 			   TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
>> + 			   spc, flags, false);
>> +       pp_string (buffer, " bits)>");
>> +       break;
>> + 
>>     default:
>>       gcc_unreachable ();
>>     }
>> Index: trunk/gcc/gimple.c
>> ===================================================================
>> *** trunk.orig/gcc/gimple.c	2016-05-12 13:40:30.704262951 +0200
>> --- trunk/gcc/gimple.c	2016-05-12 14:49:37.066994969 +0200
>> *************** get_gimple_rhs_num_ops (enum tree_code c
>> *** 2044,2049 ****
>> --- 2044,2050 ----
>>       || (SYM) == REALIGN_LOAD_EXPR					    \
>>       || (SYM) == VEC_COND_EXPR						    \
>>       || (SYM) == VEC_PERM_EXPR                                             \
>> +       || (SYM) == BIT_FIELD_INSERT					    \
>>       || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS			    \
>>    : ((SYM) == CONSTRUCTOR						    \
>>       || (SYM) == OBJ_TYPE_REF						    \
>> Index: trunk/gcc/tree-cfg.c
>> ===================================================================
>> *** trunk.orig/gcc/tree-cfg.c	2016-05-06 14:38:33.959495081 +0200
>> --- trunk/gcc/tree-cfg.c	2016-05-13 09:25:01.670630730 +0200
>> *************** verify_gimple_assign_ternary (gassign *s
>> *** 4155,4160 ****
>> --- 4155,4207 ----
>> 
>>       return false;
>> 
>> +     case BIT_FIELD_INSERT:
>> +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
>> + 	{
>> + 	  error ("type mismatch in BIT_FIELD_INSERT");
>> + 	  debug_generic_expr (lhs_type);
>> + 	  debug_generic_expr (rhs1_type);
>> + 	  return true;
>> + 	}
>> +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
>> + 	      && INTEGRAL_TYPE_P (rhs2_type))
>> + 	     || (VECTOR_TYPE_P (rhs1_type)
>> + 		 && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
>> + 	{
>> + 	  error ("not allowed type combination in BIT_FIELD_INSERT");
>> + 	  debug_generic_expr (rhs1_type);
>> + 	  debug_generic_expr (rhs2_type);
>> + 	  return true;
>> + 	}
>> +       if (! tree_fits_uhwi_p (rhs3)
>> + 	  || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
>> + 	{
>> + 	  error ("invalid position or size in BIT_FIELD_INSERT");
>> + 	  return true;
>> + 	}
>> +       if (INTEGRAL_TYPE_P (rhs1_type))
>> + 	{
>> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
>> + 	  if (bitpos >= TYPE_PRECISION (rhs1_type)
>> + 	      || (bitpos + TYPE_PRECISION (rhs2_type)
>> + 		  > TYPE_PRECISION (rhs1_type)))
>> + 	    {
>> + 	      error ("insertion out of range in BIT_FIELD_INSERT");
>> + 	      return true;
>> + 	    }
>> + 	}
>> +       else if (VECTOR_TYPE_P (rhs1_type))
>> + 	{
>> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
>> + 	  unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
>> + 	  if (bitpos % bitsize != 0)
>> + 	    {
>> + 	      error ("vector insertion not at element boundary");
>> + 	      return true;
>> + 	    }
>> + 	}
>> +       return false;
>> + 
>>     case DOT_PROD_EXPR:
>>     case REALIGN_LOAD_EXPR:
>>       /* FIXME.  */
>> Index: trunk/gcc/tree-ssa.c
>> ===================================================================
>> *** trunk.orig/gcc/tree-ssa.c	2016-05-13 09:38:02.263611726 +0200
>> --- trunk/gcc/tree-ssa.c	2016-05-13 09:50:31.020226585 +0200
>> *************** non_rewritable_lvalue_p (tree lhs)
>> *** 1318,1323 ****
>> --- 1318,1335 ----
>> 	return false;
>>     }
>> 
>> +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
>> +      BIT_FIELD_INSERT.  */
>> +   if (TREE_CODE (lhs) == BIT_FIELD_REF
>> +       && DECL_P (TREE_OPERAND (lhs, 0))
>> +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
>> +       /* && bitsize % element-size == 0 */
>> +       && types_compatible_p (TREE_TYPE (lhs),
>> + 			     TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
>> +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
>> + 	  % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
>> +     return false;
>> + 
>>   return true;
>> }
>> 
>> *************** execute_update_addresses_taken (void)
>> *** 1536,1541 ****
>> --- 1548,1576 ----
>> 		    stmt = gsi_stmt (gsi);
>> 		    unlink_stmt_vdef (stmt);
>> 		    update_stmt (stmt);
>> + 		    continue;
>> + 		  }
>> + 
>> + 		/* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
>> + 		   into a BIT_FIELD_INSERT.  */
>> + 		if (TREE_CODE (lhs) == BIT_FIELD_REF
>> + 		    && DECL_P (TREE_OPERAND (lhs, 0))
>> + 		    && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
>> + 		    && types_compatible_p (TREE_TYPE (lhs),
>> + 					   TREE_TYPE (TREE_TYPE
>> + 						       (TREE_OPERAND (lhs, 0))))
>> + 		    && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
>> + 			% tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
>> + 		  {
>> + 		    tree var = TREE_OPERAND (lhs, 0);
>> + 		    tree val = gimple_assign_rhs1 (stmt);
>> + 		    tree bitpos = TREE_OPERAND (lhs, 2);
>> + 		    gimple_assign_set_lhs (stmt, var);
>> + 		    gimple_assign_set_rhs_with_ops
>> + 		      (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
>> + 		    stmt = gsi_stmt (gsi);
>> + 		    unlink_stmt_vdef (stmt);
>> + 		    update_stmt (stmt);
>> 		    continue;
>> 		  }
>> 
>> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
>> ===================================================================
>> *** /dev/null	1970-01-01 00:00:00.000000000 +0000
>> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-13 09:54:16.026814995 +0200
>> ***************
>> *** 0 ****
>> --- 1,34 ----
>> + /* { dg-do compile } */
>> + /* { dg-options "-O -fdump-tree-ccp1" } */
>> + 
>> + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
>> + 
>> + v4si test1 (v4si v, int i)
>> + {
>> +   ((int *)&v)[0] = i;
>> +   return v;
>> + }
>> + 
>> + v4si test2 (v4si v, int i)
>> + {
>> +   int *p = (int *)&v;
>> +   *p = i;
>> +   return v;
>> + }
>> + 
>> + v4si test3 (v4si v, int i)
>> + {
>> +   ((int *)&v)[3] = i;
>> +   return v;
>> + }
>> + 
>> + v4si test4 (v4si v, int i)
>> + {
>> +   int *p = (int *)&v;
>> +   p += 3;
>> +   *p = i;
>> +   return v;
>> + }
>> + 
>> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
>> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
>> 
> 


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-16  8:24 ` Eric Botcazou
@ 2016-05-17  7:50   ` Richard Biener
  2016-05-17  8:13     ` Eric Botcazou
  2016-05-17 15:19     ` Michael Matz
  0 siblings, 2 replies; 32+ messages in thread
From: Richard Biener @ 2016-05-17  7:50 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Mon, 16 May 2016, Eric Botcazou wrote:

> > The following patch adds BIT_FIELD_INSERT, an operation to
> > facilitate doing bitfield inserts on registers (as opposed
> > to currently where we'd have a BIT_FIELD_REF store).
> 
> Why not call it BIT_FIELD_INSERT_EXPR instead to make it clear that it's an 
> expression and not a mere operation?

Had it like that first but it was so long ... ;)  Originally
I called it BIT_FIELD_EXPR but that's a bit ambiguous (with
BIT_FIELD_REF).

I'm fine with renaming it to BIT_FIELD_INSERT_EXPR, or maybe
BIT_INSERT_EXPR, as it doesn't really have anything to do
with "bitfields".

Any preference?

> > Originally this was developed as part of bitfield lowering
> > where bitfield stores were lowered into read-modify-write
> > cycles and the modify part, instead of doing shifting and masking,
> > be kept in a more high-level form to ease combining them.
> > 
> > A second use case (the above is still valid) is vector element
> > inserts which we currently can only do via memory or
> > by extracting all components and re-building the vector using
> > a CONSTRUCTOR.  For this second use case I added code
> > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > re-write a decl into SSA form (the testcase shows we miss
> > a similar opportunity with the MEM_REF form of a vector insert,
> > I plan to fix that for the final submission).
> 
> The description in tree.def looks off then, it only mentions words and 
> integral types.

I'll fix that up.
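
For the vector case, a rough C model (not GCC code) may help when
rewording that comment: the insert position is a bit offset that has to
fall on an element boundary, and the element index is the bit position
divided by the element size.  The names v4si_model and
vector_insert_model are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a four-element int vector.  */
typedef struct { int elt[4]; } v4si_model;

/* Model of a vector element insert: return a copy of VEC with the
   element at bit offset BITPOS replaced by VAL.  */
static v4si_model
vector_insert_model (v4si_model vec, int val, unsigned bitpos)
{
  const unsigned elsize = sizeof (int) * CHAR_BIT;
  /* The verifier in the patch requires an element boundary; the range
     check is an extra guard added for this model only.  */
  assert (bitpos % elsize == 0 && bitpos / elsize < 4);
  vec.elt[bitpos / elsize] = val;
  return vec;
}
```

Because the vector is passed and returned by value, the input is left
untouched, which roughly matches the register (SSA) form the rewrite in
update-address-taken is after.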

> > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > is that the size of the insertion is given implicitly via the
> > type size/precision of the value to insert.  That avoids
> > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> 
> Yes, it's a bit unfortunate, but sensible.  Maybe add a ??? note about that.

Ok.
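
To make the implicit-size rule concrete, here is a minimal C sketch of
the integer case as described above: the bits of the word in the range
[bitpos, bitpos + bitsize) are replaced by the low bits of the value.
The helper name bit_field_insert_u64 and its explicit bitsize parameter
are assumptions for illustration; in the proposed tree code the size is
not an operand but comes from the precision of the value's type.  The
sketch also assumes bit 0 is the least significant bit of a 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the integer case: replace BITSIZE bits of WORD
   starting at BITPOS with the low BITSIZE bits of VALUE.  BITSIZE stands
   in for the precision of the value's type.  */
static uint64_t
bit_field_insert_u64 (uint64_t word, uint64_t value,
		      unsigned bitpos, unsigned bitsize)
{
  /* Mirror the verifier: the inserted bits must lie fully inside WORD.  */
  assert (bitsize > 0 && bitpos < 64 && bitsize <= 64 - bitpos);
  /* Avoid an undefined 64-bit shift when the whole word is replaced.  */
  uint64_t field = bitsize == 64 ? ~UINT64_C (0)
				 : (UINT64_C (1) << bitsize) - 1;
  uint64_t mask = field << bitpos;
  return (word & ~mask) | ((value & field) << bitpos);
}
```

For example, inserting an 8-bit zero at position 8 into 0xFFFFFFFF gives
0xFFFF00FF.  A quaternary form would carry bitsize as a fourth operand,
which is what the implicit size avoids.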

Thanks,
Richard.


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-16 12:37   ` Bill Schmidt
@ 2016-05-17  7:52     ` Richard Biener
  0 siblings, 0 replies; 32+ messages in thread
From: Richard Biener @ 2016-05-17  7:52 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches


On Mon, 16 May 2016, Bill Schmidt wrote:

> Sorry, that was the wrong vector-6.c — should have realized.  In any 
> case, for each of the vector tests, we get appropriate use of 
> element-wise loads, and no load-hit-store bitfield assignments, so the 
> code generation is what we want to see.  Sorry for the misleading 
> information.

For the simple testcases code-gen on x86_64 doesn't change; it is merely
that we expose more ops to GIMPLE optimizations via SSA form, which is
always good.  On x86_64 the simple testcases were already taken care of
by combine.

I also saw the ICEs and fixed the simple oversight.

I'll post an updated patch this week.

Thanks,
Richard.

> Bill
> 
> > On May 15, 2016, at 7:55 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote:
> > 
> > Hi Richard,
> > 
> > (Sorry for duplication to your personal email, I had new-mailer issues.)
> > 
> > The new vector-6 test produces very good code for powerpc64le with this patch:
> > 
> >        addis 9,2,.LC0@toc@ha
> >        sldi 3,3,32
> >        addi 9,9,.LC0@toc@l
> >        rldicl 9,9,0,32
> >        or 3,9,3
> >        blr
> > 
> > I did run into some ICEs with bootstrap/regtest, though:
> > 
> > 26c26
> > < /home/wschmidt/gcc/build/gcc-mainline-base/gcc/testsuite/g++/../../xg++  version 7.0.0 20160515 (experimental) [trunk revision 236259] (GCC) 
> > ---
> >> /home/wschmidt/gcc/build/gcc-mainline-test/gcc/testsuite/g++/../../xg++  version 7.0.0 20160515 (experimental) [trunk revision 236259] (GCC) 
> > 31a32,39
> >> FAIL: gcc.c-torture/compile/pr70240.c   -O1  (internal compiler error)
> >> FAIL: gcc.c-torture/compile/pr70240.c   -O1  (test for excess errors)
> >> FAIL: gcc.c-torture/compile/pr70240.c   -O2  (internal compiler error)
> >> FAIL: gcc.c-torture/compile/pr70240.c   -O2  (test for excess errors)
> >> FAIL: gcc.c-torture/compile/pr70240.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> >> FAIL: gcc.c-torture/compile/pr70240.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> >> FAIL: gcc.c-torture/compile/pr70240.c   -Os  (internal compiler error)
> >> FAIL: gcc.c-torture/compile/pr70240.c   -Os  (test for excess errors)
> > 53a62,66
> >> FAIL: gcc.dg/pr69896.c (internal compiler error)
> >> FAIL: gcc.dg/pr69896.c (test for excess errors)
> >> UNRESOLVED: gcc.dg/pr69896.c compilation failed to produce executable
> >> FAIL: gcc.dg/pr70326.c (internal compiler error)
> >> FAIL: gcc.dg/pr70326.c (test for excess errors)
> > 281a295,353
> >> FAIL: gcc.dg/torture/pr69613.c   -O1  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69613.c   -O1  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69613.c   -O1  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69613.c   -O2  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69613.c   -O2  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69613.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69613.c   -O3 -g  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69613.c   -O3 -g  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69613.c   -O3 -g  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69613.c   -Os  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69613.c   -Os  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69613.c   -Os  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69909.c   -O1  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69909.c   -O1  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69909.c   -O1  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69909.c   -O2  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69909.c   -O2  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69909.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69909.c   -O3 -g  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69909.c   -O3 -g  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69909.c   -O3 -g  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr69909.c   -Os  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr69909.c   -Os  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr69909.c   -Os  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr70083.c   -O1  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70083.c   -O2  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70083.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70083.c   -O3 -g  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70083.c   -Os  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70421.c   -O1  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70421.c   -O1  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr70421.c   -O1  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr70421.c   -O2  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70421.c   -O2  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr70421.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr70421.c   -O3 -g  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70421.c   -O3 -g  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr70421.c   -O3 -g  compilation failed to produce executable
> >> FAIL: gcc.dg/torture/pr70421.c   -Os  (internal compiler error)
> >> FAIL: gcc.dg/torture/pr70421.c   -Os  (test for excess errors)
> >> UNRESOLVED: gcc.dg/torture/pr70421.c   -Os  compilation failed to produce executable
> > 295,296c367,368
> > 
> > Thanks for adding BIT_FIELD_INSERT, I think this will help us in several places.
> > 
> > Bill 
> > 
> >> On May 13, 2016, at 5:51 AM, Richard Biener <rguenther@suse.de> wrote:
> >> 
> >> 
> >> The following patch adds BIT_FIELD_INSERT, an operation to
> >> facilitate doing bitfield inserts on registers (as opposed
> >> to currently where we'd have a BIT_FIELD_REF store).
> >> 
> >> Originally this was developed as part of bitfield lowering
> >> where bitfield stores were lowered into read-modify-write
> >> cycles and the modify part, instead of doing shifting and masking,
> >> be kept in a more high-level form to ease combining them.
> >> 
> >> A second use case (the above is still valid) is vector element
> >> inserts which we currently can only do via memory or
> >> by extracting all components and re-building the vector using
> >> a CONSTRUCTOR.  For this second use case I added code
> >> re-writing the BIT_FIELD_REF stores the C family FEs produce
> >> into BIT_FIELD_INSERT when update-address-taken can otherwise
> >> re-write a decl into SSA form (the testcase shows we miss
> >> a similar opportunity with the MEM_REF form of a vector insert,
> >> I plan to fix that for the final submission).
> >> 
> >> One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> >> is that the size of the insertion is given implicitly via the
> >> type size/precision of the value to insert.  That avoids
> >> introducing ways to have quaternary ops in folding and GIMPLE stmts.
> >> 
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >> 
> >> Richard.
> >> 
> >> 2011-06-16  Richard Guenther  <rguenther@suse.de>
> >> 
> >> 	PR tree-optimization/29756
> >> 	* tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> >> 	* expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> >> 	* fold-const.c (operand_equal_p): Likewise.
> >> 	(fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> >> 	* gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> >> 	* tree-inline.c (estimate_operator_cost): Likewise.
> >> 	* tree-pretty-print.c (dump_generic_node): Likewise.
> >> 	* tree-ssa-operands.c (get_expr_operands): Likewise.
> >> 	* cfgexpand.c (expand_debug_expr): Likewise.
> >> 	* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> >> 	* gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> >> 	* tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> >> 
> >> 	* tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> >> 	vector inserts using BIT_FIELD_REF on the lhs.
> >> 	(execute_update_addresses_taken): Do it.
> >> 
> >> 	* gcc.dg/tree-ssa/vector-6.c: New testcase.
> >> 
> >> Index: trunk/gcc/expr.c
> >> ===================================================================
> >> *** trunk.orig/gcc/expr.c	2016-05-12 13:40:30.704262951 +0200
> >> --- trunk/gcc/expr.c	2016-05-12 15:40:32.481225744 +0200
> >> *************** expand_expr_real_2 (sepops ops, rtx targ
> >> *** 9358,9363 ****
> >> --- 9358,9380 ----
> >>       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> >>       return target;
> >> 
> >> +     case BIT_FIELD_INSERT:
> >> +       {
> >> + 	unsigned bitpos = tree_to_uhwi (treeop2);
> >> + 	unsigned bitsize;
> >> + 	if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> >> + 	  bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> >> + 	else
> >> + 	  bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> >> + 	rtx op0 = expand_normal (treeop0);
> >> + 	rtx op1 = expand_normal (treeop1);
> >> + 	rtx dst = gen_reg_rtx (mode);
> >> + 	emit_move_insn (dst, op0);
> >> + 	store_bit_field (dst, bitsize, bitpos, 0, 0,
> >> + 			 TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> >> + 	return dst;
> >> +       }
> >> + 
> >>     default:
> >>       gcc_unreachable ();
> >>     }
> >> Index: trunk/gcc/fold-const.c
> >> ===================================================================
> >> *** trunk.orig/gcc/fold-const.c	2016-05-12 13:40:30.704262951 +0200
> >> --- trunk/gcc/fold-const.c	2016-05-13 09:41:13.509812127 +0200
> >> *************** operand_equal_p (const_tree arg0, const_
> >> *** 3163,3168 ****
> >> --- 3163,3169 ----
> >> 
> >> 	case VEC_COND_EXPR:
> >> 	case DOT_PROD_EXPR:
> >> + 	case BIT_FIELD_INSERT:
> >> 	  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> >> 
> >> 	default:
> >> *************** fold_ternary_loc (location_t loc, enum t
> >> *** 11870,11875 ****
> >> --- 11871,11916 ----
> >> 	}
> >>       return NULL_TREE;
> >> 
> >> +     case BIT_FIELD_INSERT:
> >> +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> >> +       if (TREE_CODE (arg0) == INTEGER_CST
> >> + 	  && TREE_CODE (arg1) == INTEGER_CST)
> >> + 	{
> >> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> >> + 	  unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> >> + 	  wide_int tem = wi::bit_and (arg0,
> >> + 				      wi::shifted_mask (bitpos, bitsize, true,
> >> + 							TYPE_PRECISION (type)));
> >> + 	  wide_int tem2
> >> + 	    = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> >> + 				    bitsize), bitpos);
> >> + 	  return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> >> + 	}
> >> +       else if (TREE_CODE (arg0) == VECTOR_CST
> >> + 	       && CONSTANT_CLASS_P (arg1)
> >> + 	       && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> >> + 				      TREE_TYPE (arg1)))
> >> + 	{
> >> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> >> + 	  unsigned HOST_WIDE_INT elsize
> >> + 	    = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> >> + 	  if (bitpos % elsize == 0)
> >> + 	    {
> >> + 	      unsigned k = bitpos / elsize;
> >> + 	      if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> >> + 		return arg0;
> >> + 	      else
> >> + 		{
> >> + 		  tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> >> + 		  memcpy (elts, VECTOR_CST_ELTS (arg0),
> >> + 			  sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> >> + 		  elts[k] = arg1;
> >> + 		  return build_vector (type, elts);
> >> + 		}
> >> + 	    }
> >> + 	}
> >> +       return NULL_TREE;
> >> + 
> >>     default:
> >>       return NULL_TREE;
> >>     } /* switch (code) */
> >> Index: trunk/gcc/gimplify.c
> >> ===================================================================
> >> *** trunk.orig/gcc/gimplify.c	2016-05-12 13:40:30.704262951 +0200
> >> --- trunk/gcc/gimplify.c	2016-05-12 13:56:18.679120641 +0200
> >> *************** gimplify_expr (tree *expr_p, gimple_seq
> >> *** 10936,10941 ****
> >> --- 10936,10945 ----
> >> 	  /* Classified as tcc_expression.  */
> >> 	  goto expr_3;
> >> 
> >> + 	case BIT_FIELD_INSERT:
> >> + 	  /* Argument 3 is a constant.  */
> >> + 	  goto expr_2;
> >> + 
> >> 	case POINTER_PLUS_EXPR:
> >> 	  {
> >> 	    enum gimplify_status r0, r1;
> >> Index: trunk/gcc/tree-inline.c
> >> ===================================================================
> >> *** trunk.orig/gcc/tree-inline.c	2016-05-12 13:40:30.704262951 +0200
> >> --- trunk/gcc/tree-inline.c	2016-05-12 13:42:45.465811959 +0200
> >> *************** estimate_operator_cost (enum tree_code c
> >> *** 3941,3946 ****
> >> --- 3941,3950 ----
> >>         return weights->div_mod_cost;
> >>       return 1;
> >> 
> >> +     /* Bit-field insertion needs several shift and mask operations.  */
> >> +     case BIT_FIELD_INSERT:
> >> +       return 3;
> >> + 
> >>     default:
> >>       /* We expect a copy assignment with no operator.  */
> >>       gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> >> Index: trunk/gcc/tree-pretty-print.c
> >> ===================================================================
> >> *** trunk.orig/gcc/tree-pretty-print.c	2016-05-12 13:40:30.704262951 +0200
> >> --- trunk/gcc/tree-pretty-print.c	2016-05-12 14:30:05.781944740 +0200
> >> *************** dump_generic_node (pretty_printer *pp, t
> >> *** 1876,1881 ****
> >> --- 1876,1898 ----
> >>       pp_greater (pp);
> >>       break;
> >> 
> >> +     case BIT_FIELD_INSERT:
> >> +       pp_string (pp, "BIT_FIELD_INSERT <");
> >> +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> >> +       pp_string (pp, ", ");
> >> +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> >> +       pp_string (pp, ", ");
> >> +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> >> +       pp_string (pp, " (");
> >> +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> >> + 	pp_decimal_int (pp,
> >> + 			TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> >> +       else
> >> + 	dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> >> + 			   spc, flags, false);
> >> +       pp_string (pp, " bits)>");
> >> +       break;
> >> + 
> >>     case ARRAY_REF:
> >>     case ARRAY_RANGE_REF:
> >>       op0 = TREE_OPERAND (node, 0);
> >> Index: trunk/gcc/tree-ssa-operands.c
> >> ===================================================================
> >> *** trunk.orig/gcc/tree-ssa-operands.c	2016-05-12 13:42:45.465811959 +0200
> >> --- trunk/gcc/tree-ssa-operands.c	2016-05-12 13:48:26.881736503 +0200
> >> *************** get_expr_operands (struct function *fn,
> >> *** 833,838 ****
> >> --- 833,839 ----
> >>       get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> >>       return;
> >> 
> >> +     case BIT_FIELD_INSERT:
> >>     case COMPOUND_EXPR:
> >>     case OBJ_TYPE_REF:
> >>     case ASSERT_EXPR:
> >> Index: trunk/gcc/tree.def
> >> ===================================================================
> >> *** trunk.orig/gcc/tree.def	2016-05-12 13:40:30.704262951 +0200
> >> --- trunk/gcc/tree.def	2016-05-12 13:47:09.972852423 +0200
> >> *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> >> *** 852,857 ****
> >> --- 852,868 ----
> >>    descriptor of type ptr_mode.  */
> >> DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> >> 
> >> + /* Given a word, a value and a bitfield position within the word,
> >> +    produce the value that results if replacing the
> >> +    described parts of word with value.
> >> +    Operand 0 is a tree for the word of integral type;
> >> +    Operand 1 is a tree for the value of integral type;
> >> +    Operand 2 is a tree giving the constant position of the first referenced bit;
> >> +    The number of bits replaced is given by the precision of the value
> >> +    type if that is integral or by its size if it is non-integral.
> >> +    The replaced bits shall be fully inside the word.  */
> >> + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> >> + 
> >> /* Given two real or integer operands of the same type,
> >>    returns a complex value of the corresponding complex type.  */
> >> DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> >> Index: trunk/gcc/cfgexpand.c
> >> ===================================================================
> >> *** trunk.orig/gcc/cfgexpand.c	2016-05-12 13:42:45.469812005 +0200
> >> --- trunk/gcc/cfgexpand.c	2016-05-13 11:48:04.513407495 +0200
> >> *************** expand_debug_expr (tree exp)
> >> *** 5025,5030 ****
> >> --- 5025,5031 ----
> >>     case FIXED_CONVERT_EXPR:
> >>     case OBJ_TYPE_REF:
> >>     case WITH_SIZE_EXPR:
> >> +     case BIT_FIELD_INSERT:
> >>       return NULL;
> >> 
> >>     case DOT_PROD_EXPR:
> >> Index: trunk/gcc/gimple-pretty-print.c
> >> ===================================================================
> >> *** trunk.orig/gcc/gimple-pretty-print.c	2016-05-12 11:23:09.261375157 +0200
> >> --- trunk/gcc/gimple-pretty-print.c	2016-05-12 14:57:22.096175579 +0200
> >> *************** dump_ternary_rhs (pretty_printer *buffer
> >> *** 479,484 ****
> >> --- 479,502 ----
> >>       pp_greater (buffer);
> >>       break;
> >> 
> >> +     case BIT_FIELD_INSERT:
> >> +       pp_string (buffer, "BIT_FIELD_INSERT <");
> >> +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> >> +       pp_string (buffer, ", ");
> >> +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> >> +       pp_string (buffer, ", ");
> >> +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> >> +       pp_string (buffer, " (");
> >> +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> >> + 	pp_decimal_int (buffer,
> >> + 			TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> >> +       else
> >> + 	dump_generic_node (buffer,
> >> + 			   TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> >> + 			   spc, flags, false);
> >> +       pp_string (buffer, " bits)>");
> >> +       break;
> >> + 
> >>     default:
> >>       gcc_unreachable ();
> >>     }
> >> Index: trunk/gcc/gimple.c
> >> ===================================================================
> >> *** trunk.orig/gcc/gimple.c	2016-05-12 13:40:30.704262951 +0200
> >> --- trunk/gcc/gimple.c	2016-05-12 14:49:37.066994969 +0200
> >> *************** get_gimple_rhs_num_ops (enum tree_code c
> >> *** 2044,2049 ****
> >> --- 2044,2050 ----
> >>       || (SYM) == REALIGN_LOAD_EXPR					    \
> >>       || (SYM) == VEC_COND_EXPR						    \
> >>       || (SYM) == VEC_PERM_EXPR                                             \
> >> +       || (SYM) == BIT_FIELD_INSERT					    \
> >>       || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS			    \
> >>    : ((SYM) == CONSTRUCTOR						    \
> >>       || (SYM) == OBJ_TYPE_REF						    \
> >> Index: trunk/gcc/tree-cfg.c
> >> ===================================================================
> >> *** trunk.orig/gcc/tree-cfg.c	2016-05-06 14:38:33.959495081 +0200
> >> --- trunk/gcc/tree-cfg.c	2016-05-13 09:25:01.670630730 +0200
> >> *************** verify_gimple_assign_ternary (gassign *s
> >> *** 4155,4160 ****
> >> --- 4155,4207 ----
> >> 
> >>       return false;
> >> 
> >> +     case BIT_FIELD_INSERT:
> >> +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> >> + 	{
> >> + 	  error ("type mismatch in BIT_FIELD_INSERT");
> >> + 	  debug_generic_expr (lhs_type);
> >> + 	  debug_generic_expr (rhs1_type);
> >> + 	  return true;
> >> + 	}
> >> +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> >> + 	      && INTEGRAL_TYPE_P (rhs2_type))
> >> + 	     || (VECTOR_TYPE_P (rhs1_type)
> >> + 		 && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> >> + 	{
> >> + 	  error ("not allowed type combination in BIT_FIELD_INSERT");
> >> + 	  debug_generic_expr (rhs1_type);
> >> + 	  debug_generic_expr (rhs2_type);
> >> + 	  return true;
> >> + 	}
> >> +       if (! tree_fits_uhwi_p (rhs3)
> >> + 	  || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> >> + 	{
> >> + 	  error ("invalid position or size in BIT_FIELD_INSERT");
> >> + 	  return true;
> >> + 	}
> >> +       if (INTEGRAL_TYPE_P (rhs1_type))
> >> + 	{
> >> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> >> + 	  if (bitpos >= TYPE_PRECISION (rhs1_type)
> >> + 	      || (bitpos + TYPE_PRECISION (rhs2_type)
> >> + 		  > TYPE_PRECISION (rhs1_type)))
> >> + 	    {
> >> + 	      error ("insertion out of range in BIT_FIELD_INSERT");
> >> + 	      return true;
> >> + 	    }
> >> + 	}
> >> +       else if (VECTOR_TYPE_P (rhs1_type))
> >> + 	{
> >> + 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> >> + 	  unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> >> + 	  if (bitpos % bitsize != 0)
> >> + 	    {
> >> + 	      error ("vector insertion not at element boundary");
> >> + 	      return true;
> >> + 	    }
> >> + 	}
> >> +       return false;
> >> + 
> >>     case DOT_PROD_EXPR:
> >>     case REALIGN_LOAD_EXPR:
> >>       /* FIXME.  */
> >> Index: trunk/gcc/tree-ssa.c
> >> ===================================================================
> >> *** trunk.orig/gcc/tree-ssa.c	2016-05-13 09:38:02.263611726 +0200
> >> --- trunk/gcc/tree-ssa.c	2016-05-13 09:50:31.020226585 +0200
> >> *************** non_rewritable_lvalue_p (tree lhs)
> >> *** 1318,1323 ****
> >> --- 1318,1335 ----
> >> 	return false;
> >>     }
> >> 
> >> +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> >> +      BIT_FIELD_INSERT.  */
> >> +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> >> +       && DECL_P (TREE_OPERAND (lhs, 0))
> >> +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> >> +       /* && bitsize % element-size == 0 */
> >> +       && types_compatible_p (TREE_TYPE (lhs),
> >> + 			     TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> >> +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> >> + 	  % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> >> +     return false;
> >> + 
> >>   return true;
> >> }
> >> 
> >> *************** execute_update_addresses_taken (void)
> >> *** 1536,1541 ****
> >> --- 1548,1576 ----
> >> 		    stmt = gsi_stmt (gsi);
> >> 		    unlink_stmt_vdef (stmt);
> >> 		    update_stmt (stmt);
> >> + 		    continue;
> >> + 		  }
> >> + 
> >> + 		/* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> >> + 		   into a BIT_FIELD_INSERT.  */
> >> + 		if (TREE_CODE (lhs) == BIT_FIELD_REF
> >> + 		    && DECL_P (TREE_OPERAND (lhs, 0))
> >> + 		    && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> >> + 		    && types_compatible_p (TREE_TYPE (lhs),
> >> + 					   TREE_TYPE (TREE_TYPE
> >> + 						       (TREE_OPERAND (lhs, 0))))
> >> + 		    && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> >> + 			% tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> >> + 		  {
> >> + 		    tree var = TREE_OPERAND (lhs, 0);
> >> + 		    tree val = gimple_assign_rhs1 (stmt);
> >> + 		    tree bitpos = TREE_OPERAND (lhs, 2);
> >> + 		    gimple_assign_set_lhs (stmt, var);
> >> + 		    gimple_assign_set_rhs_with_ops
> >> + 		      (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> >> + 		    stmt = gsi_stmt (gsi);
> >> + 		    unlink_stmt_vdef (stmt);
> >> + 		    update_stmt (stmt);
> >> 		    continue;
> >> 		  }
> >> 
> >> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> >> ===================================================================
> >> *** /dev/null	1970-01-01 00:00:00.000000000 +0000
> >> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-13 09:54:16.026814995 +0200
> >> ***************
> >> *** 0 ****
> >> --- 1,34 ----
> >> + /* { dg-do compile } */
> >> + /* { dg-options "-O -fdump-tree-ccp1" } */
> >> + 
> >> + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> >> + 
> >> + v4si test1 (v4si v, int i)
> >> + {
> >> +   ((int *)&v)[0] = i;
> >> +   return v;
> >> + }
> >> + 
> >> + v4si test2 (v4si v, int i)
> >> + {
> >> +   int *p = (int *)&v;
> >> +   *p = i;
> >> +   return v;
> >> + }
> >> + 
> >> + v4si test3 (v4si v, int i)
> >> + {
> >> +   ((int *)&v)[3] = i;
> >> +   return v;
> >> + }
> >> + 
> >> + v4si test4 (v4si v, int i)
> >> + {
> >> +   int *p = (int *)&v;
> >> +   p += 3;
> >> +   *p = i;
> >> +   return v;
> >> + }
> >> + 
> >> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> >> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> >> 
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-17  7:50   ` Richard Biener
@ 2016-05-17  8:13     ` Eric Botcazou
  2016-05-17 15:19     ` Michael Matz
  1 sibling, 0 replies; 32+ messages in thread
From: Eric Botcazou @ 2016-05-17  8:13 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

> I'm fine with renaming it to BIT_FIELD_INSERT_EXPR, maybe
> BIT_INSERT_EXPR then as it doesn't really have anything to do
> with "bitfields".
> 
> Any preference?

BIT_INSERT_EXPR is fine with me.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-17  7:50   ` Richard Biener
  2016-05-17  8:13     ` Eric Botcazou
@ 2016-05-17 15:19     ` Michael Matz
  2016-05-19 13:23       ` Richard Biener
  1 sibling, 1 reply; 32+ messages in thread
From: Michael Matz @ 2016-05-17 15:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: Eric Botcazou, gcc-patches

Hi,

On Tue, 17 May 2016, Richard Biener wrote:

> BIT_INSERT_EXPR 

This.

> Any preference?


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-17 15:19     ` Michael Matz
@ 2016-05-19 13:23       ` Richard Biener
  2016-05-19 15:21         ` Eric Botcazou
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2016-05-19 13:23 UTC (permalink / raw)
  To: Michael Matz; +Cc: Eric Botcazou, gcc-patches

On Tue, 17 May 2016, Michael Matz wrote:

> Hi,
> 
> On Tue, 17 May 2016, Richard Biener wrote:
> 
> > BIT_INSERT_EXPR 
> 
> This.
> 
> > Any preference?

Here is an updated patch.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

I plan to commit this tomorrow if there are no further comments.

Thanks,
Richard.

2011-06-19  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/29756
	* tree.def (BIT_INSERT_EXPR): New tcc_expression tree code.
	* expr.c (expand_expr_real_2): Handle BIT_INSERT_EXPR.
	* fold-const.c (operand_equal_p): Likewise.
	(fold_ternary_loc): Add constant folding of BIT_INSERT_EXPR.
	* gimplify.c (gimplify_expr): Handle BIT_INSERT_EXPR.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-ssa-operands.c (get_expr_operands): Likewise.
	* cfgexpand.c (expand_debug_expr): Likewise.
	* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
	* gimple.c (get_gimple_rhs_num_ops): Handle BIT_INSERT_EXPR.
	* tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_INSERT_EXPR.

	* tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
	vector inserts using BIT_FIELD_REF or MEM_REF on the lhs.
	(execute_update_addresses_taken): Do it.

	* gcc.dg/tree-ssa/vector-6.c: New testcase.

Index: trunk/gcc/expr.c
===================================================================
*** trunk.orig/gcc/expr.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/expr.c	2016-05-19 10:23:35.667140692 +0200
*************** expand_expr_real_2 (sepops ops, rtx targ
*** 9225,9230 ****
--- 9225,9247 ----
        target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
        return target;
  
+     case BIT_INSERT_EXPR:
+       {
+ 	unsigned bitpos = tree_to_uhwi (treeop2);
+ 	unsigned bitsize;
+ 	if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
+ 	  bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
+ 	else
+ 	  bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
+ 	rtx op0 = expand_normal (treeop0);
+ 	rtx op1 = expand_normal (treeop1);
+ 	rtx dst = gen_reg_rtx (mode);
+ 	emit_move_insn (dst, op0);
+ 	store_bit_field (dst, bitsize, bitpos, 0, 0,
+ 			 TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
+ 	return dst;
+       }
+ 
      default:
        gcc_unreachable ();
      }
Index: trunk/gcc/fold-const.c
===================================================================
*** trunk.orig/gcc/fold-const.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/fold-const.c	2016-05-19 10:23:35.699141058 +0200
*************** operand_equal_p (const_tree arg0, const_
*** 3163,3168 ****
--- 3163,3169 ----
  
  	case VEC_COND_EXPR:
  	case DOT_PROD_EXPR:
+ 	case BIT_INSERT_EXPR:
  	  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
  
  	default:
*************** fold_ternary_loc (location_t loc, enum t
*** 11870,11875 ****
--- 11871,11916 ----
  	}
        return NULL_TREE;
  
+     case BIT_INSERT_EXPR:
+       /* Perform (partial) constant folding of BIT_INSERT_EXPR.  */
+       if (TREE_CODE (arg0) == INTEGER_CST
+ 	  && TREE_CODE (arg1) == INTEGER_CST)
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
+ 	  unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
+ 	  wide_int tem = wi::bit_and (arg0,
+ 				      wi::shifted_mask (bitpos, bitsize, true,
+ 							TYPE_PRECISION (type)));
+ 	  wide_int tem2
+ 	    = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
+ 				    bitsize), bitpos);
+ 	  return wide_int_to_tree (type, wi::bit_or (tem, tem2));
+ 	}
+       else if (TREE_CODE (arg0) == VECTOR_CST
+ 	       && CONSTANT_CLASS_P (arg1)
+ 	       && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
+ 				      TREE_TYPE (arg1)))
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
+ 	  unsigned HOST_WIDE_INT elsize
+ 	    = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
+ 	  if (bitpos % elsize == 0)
+ 	    {
+ 	      unsigned k = bitpos / elsize;
+ 	      if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
+ 		return arg0;
+ 	      else
+ 		{
+ 		  tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
+ 		  memcpy (elts, VECTOR_CST_ELTS (arg0),
+ 			  sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
+ 		  elts[k] = arg1;
+ 		  return build_vector (type, elts);
+ 		}
+ 	    }
+ 	}
+       return NULL_TREE;
+ 
      default:
        return NULL_TREE;
      } /* switch (code) */
Index: trunk/gcc/gimplify.c
===================================================================
*** trunk.orig/gcc/gimplify.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/gimplify.c	2016-05-19 10:23:35.723141333 +0200
*************** gimplify_expr (tree *expr_p, gimple_seq
*** 10950,10955 ****
--- 10950,10959 ----
  	  /* Classified as tcc_expression.  */
  	  goto expr_3;
  
+ 	case BIT_INSERT_EXPR:
+ 	  /* Argument 3 is a constant.  */
+ 	  goto expr_2;
+ 
  	case POINTER_PLUS_EXPR:
  	  {
  	    enum gimplify_status r0, r1;
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2016-05-18 09:42:22.733237510 +0200
--- trunk/gcc/tree-inline.c	2016-05-19 10:23:35.739141516 +0200
*************** estimate_operator_cost (enum tree_code c
*** 3941,3946 ****
--- 3941,3950 ----
          return weights->div_mod_cost;
        return 1;
  
+     /* Bit-field insertion needs several shift and mask operations.  */
+     case BIT_INSERT_EXPR:
+       return 3;
+ 
      default:
        /* We expect a copy assignment with no operator.  */
        gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/tree-pretty-print.c	2016-05-19 10:23:35.763141790 +0200
*************** dump_generic_node (pretty_printer *pp, t
*** 1876,1881 ****
--- 1876,1898 ----
        pp_greater (pp);
        break;
  
+     case BIT_INSERT_EXPR:
+       pp_string (pp, "BIT_INSERT_EXPR <");
+       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+       pp_string (pp, ", ");
+       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+       pp_string (pp, ", ");
+       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
+       pp_string (pp, " (");
+       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
+ 	pp_decimal_int (pp,
+ 			TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
+       else
+ 	dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
+ 			   spc, flags, false);
+       pp_string (pp, " bits)>");
+       break;
+ 
      case ARRAY_REF:
      case ARRAY_RANGE_REF:
        op0 = TREE_OPERAND (node, 0);
Index: trunk/gcc/tree-ssa-operands.c
===================================================================
*** trunk.orig/gcc/tree-ssa-operands.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/tree-ssa-operands.c	2016-05-19 10:23:35.779141973 +0200
*************** get_expr_operands (struct function *fn,
*** 833,838 ****
--- 833,839 ----
        get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
        return;
  
+     case BIT_INSERT_EXPR:
      case COMPOUND_EXPR:
      case OBJ_TYPE_REF:
      case ASSERT_EXPR:
Index: trunk/gcc/tree.def
===================================================================
*** trunk.orig/gcc/tree.def	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/tree.def	2016-05-19 10:23:35.779141973 +0200
*************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
*** 852,857 ****
--- 852,871 ----
     descriptor of type ptr_mode.  */
  DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
  
+ /* Given a word, a value and a bit position within the word,
+    produce the value that results if replacing the parts of word
+    starting at the bit position with value.
+    Operand 0 is a tree for the word of integral or vector type;
+    Operand 1 is a tree for the value of integral or vector element type;
+    Operand 2 is a tree giving the constant position of the first referenced bit;
+    The number of bits replaced is given by the precision of the value
+    type if that is integral or by its size if it is non-integral.
+    ???  The reason to make the size of the replacement implicit is to not
+    have a quaternary operation.
+    The replaced bits shall be fully inside the word.  If the word is of
+    vector type the replaced bits shall be aligned with its elements.  */
+ DEFTREECODE (BIT_INSERT_EXPR, "bit_field_insert", tcc_expression, 3)
+ 
  /* Given two real or integer operands of the same type,
     returns a complex value of the corresponding complex type.  */
  DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
Index: trunk/gcc/cfgexpand.c
===================================================================
*** trunk.orig/gcc/cfgexpand.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/cfgexpand.c	2016-05-19 10:23:35.787142064 +0200
*************** expand_debug_expr (tree exp)
*** 5025,5030 ****
--- 5025,5031 ----
      case FIXED_CONVERT_EXPR:
      case OBJ_TYPE_REF:
      case WITH_SIZE_EXPR:
+     case BIT_INSERT_EXPR:
        return NULL;
  
      case DOT_PROD_EXPR:
Index: trunk/gcc/gimple-pretty-print.c
===================================================================
*** trunk.orig/gcc/gimple-pretty-print.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/gimple-pretty-print.c	2016-05-19 10:23:35.787142064 +0200
*************** dump_ternary_rhs (pretty_printer *buffer
*** 479,484 ****
--- 479,502 ----
        pp_greater (buffer);
        break;
  
+     case BIT_INSERT_EXPR:
+       pp_string (buffer, "BIT_INSERT_EXPR <");
+       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
+       pp_string (buffer, ", ");
+       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
+       pp_string (buffer, ", ");
+       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
+       pp_string (buffer, " (");
+       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
+ 	pp_decimal_int (buffer,
+ 			TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
+       else
+ 	dump_generic_node (buffer,
+ 			   TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
+ 			   spc, flags, false);
+       pp_string (buffer, " bits)>");
+       break;
+ 
      default:
        gcc_unreachable ();
      }
Index: trunk/gcc/gimple.c
===================================================================
*** trunk.orig/gcc/gimple.c	2016-05-18 13:45:15.350948365 +0200
--- trunk/gcc/gimple.c	2016-05-19 10:23:35.787142064 +0200
*************** get_gimple_rhs_num_ops (enum tree_code c
*** 2044,2049 ****
--- 2044,2050 ----
        || (SYM) == REALIGN_LOAD_EXPR					    \
        || (SYM) == VEC_COND_EXPR						    \
        || (SYM) == VEC_PERM_EXPR                                             \
+       || (SYM) == BIT_INSERT_EXPR					    \
        || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS			    \
     : ((SYM) == CONSTRUCTOR						    \
        || (SYM) == OBJ_TYPE_REF						    \
Index: trunk/gcc/tree-cfg.c
===================================================================
*** trunk.orig/gcc/tree-cfg.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/tree-cfg.c	2016-05-19 10:23:35.799142201 +0200
*************** verify_gimple_assign_ternary (gassign *s
*** 4155,4160 ****
--- 4155,4207 ----
  
        return false;
  
+     case BIT_INSERT_EXPR:
+       if (! useless_type_conversion_p (lhs_type, rhs1_type))
+ 	{
+ 	  error ("type mismatch in BIT_INSERT_EXPR");
+ 	  debug_generic_expr (lhs_type);
+ 	  debug_generic_expr (rhs1_type);
+ 	  return true;
+ 	}
+       if (! ((INTEGRAL_TYPE_P (rhs1_type)
+ 	      && INTEGRAL_TYPE_P (rhs2_type))
+ 	     || (VECTOR_TYPE_P (rhs1_type)
+ 		 && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
+ 	{
+ 	  error ("not allowed type combination in BIT_INSERT_EXPR");
+ 	  debug_generic_expr (rhs1_type);
+ 	  debug_generic_expr (rhs2_type);
+ 	  return true;
+ 	}
+       if (! tree_fits_uhwi_p (rhs3)
+ 	  || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
+ 	{
+ 	  error ("invalid position or size in BIT_INSERT_EXPR");
+ 	  return true;
+ 	}
+       if (INTEGRAL_TYPE_P (rhs1_type))
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
+ 	  if (bitpos >= TYPE_PRECISION (rhs1_type)
+ 	      || (bitpos + TYPE_PRECISION (rhs2_type)
+ 		  > TYPE_PRECISION (rhs1_type)))
+ 	    {
+ 	      error ("insertion out of range in BIT_INSERT_EXPR");
+ 	      return true;
+ 	    }
+ 	}
+       else if (VECTOR_TYPE_P (rhs1_type))
+ 	{
+ 	  unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
+ 	  unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
+ 	  if (bitpos % bitsize != 0)
+ 	    {
+ 	      error ("vector insertion not at element boundary");
+ 	      return true;
+ 	    }
+ 	}
+       return false;
+ 
      case DOT_PROD_EXPR:
      case REALIGN_LOAD_EXPR:
        /* FIXME.  */
Index: trunk/gcc/tree-ssa.c
===================================================================
*** trunk.orig/gcc/tree-ssa.c	2016-05-17 17:19:41.783958489 +0200
--- trunk/gcc/tree-ssa.c	2016-05-19 11:17:40.008247968 +0200
*************** non_rewritable_lvalue_p (tree lhs)
*** 1303,1323 ****
        && DECL_P (TREE_OPERAND (lhs, 0)))
      return false;
  
!   /* A decl that is wrapped inside a MEM-REF that covers
!      it full is also rewritable.
!      ???  The following could be relaxed allowing component
       references that do not change the access size.  */
    if (TREE_CODE (lhs) == MEM_REF
!       && TREE_CODE (TREE_OPERAND (lhs, 0)) == ADDR_EXPR
!       && integer_zerop (TREE_OPERAND (lhs, 1)))
      {
        tree decl = TREE_OPERAND (TREE_OPERAND (lhs, 0), 0);
!       if (DECL_P (decl)
  	  && DECL_SIZE (decl) == TYPE_SIZE (TREE_TYPE (lhs))
  	  && (TREE_THIS_VOLATILE (decl) == TREE_THIS_VOLATILE (lhs)))
  	return false;
      }
  
    return true;
  }
  
--- 1303,1350 ----
        && DECL_P (TREE_OPERAND (lhs, 0)))
      return false;
  
!   /* ???  The following could be relaxed allowing component
       references that do not change the access size.  */
    if (TREE_CODE (lhs) == MEM_REF
!       && TREE_CODE (TREE_OPERAND (lhs, 0)) == ADDR_EXPR)
      {
        tree decl = TREE_OPERAND (TREE_OPERAND (lhs, 0), 0);
! 
!       /* A decl that is wrapped inside a MEM-REF that covers
! 	 it full is also rewritable.  */
!       if (integer_zerop (TREE_OPERAND (lhs, 1))
! 	  && DECL_P (decl)
  	  && DECL_SIZE (decl) == TYPE_SIZE (TREE_TYPE (lhs))
  	  && (TREE_THIS_VOLATILE (decl) == TREE_THIS_VOLATILE (lhs)))
  	return false;
+ 
+       /* A vector-insert using a MEM_REF or ARRAY_REF is rewritable
+ 	 using a BIT_INSERT_EXPR.  */
+       if (DECL_P (decl)
+ 	  && VECTOR_TYPE_P (TREE_TYPE (decl))
+ 	  && TYPE_MODE (TREE_TYPE (decl)) != BLKmode
+ 	  && types_compatible_p (TREE_TYPE (lhs),
+ 				 TREE_TYPE (TREE_TYPE (decl)))
+ 	  && tree_fits_uhwi_p (TREE_OPERAND (lhs, 1))
+ 	  && tree_int_cst_lt (TREE_OPERAND (lhs, 1),
+ 			      TYPE_SIZE_UNIT (TREE_TYPE (decl)))
+ 	  && (tree_to_uhwi (TREE_OPERAND (lhs, 1))
+ 	      % tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (lhs)))) == 0)
+ 	return false;
      }
  
+   /* A vector-insert using a BIT_FIELD_REF is rewritable using
+      BIT_INSERT_EXPR.  */
+   if (TREE_CODE (lhs) == BIT_FIELD_REF
+       && DECL_P (TREE_OPERAND (lhs, 0))
+       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
+       && TYPE_MODE (TREE_TYPE (TREE_OPERAND (lhs, 0))) != BLKmode
+       && types_compatible_p (TREE_TYPE (lhs),
+ 			     TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
+       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
+ 	  % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
+     return false;
+ 
    return true;
  }
  
*************** execute_update_addresses_taken (void)
*** 1536,1541 ****
--- 1563,1624 ----
  		    stmt = gsi_stmt (gsi);
  		    unlink_stmt_vdef (stmt);
  		    update_stmt (stmt);
+ 		    continue;
+ 		  }
+ 
+ 		/* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
+ 		   into a BIT_INSERT_EXPR.  */
+ 		if (TREE_CODE (lhs) == BIT_FIELD_REF
+ 		    && DECL_P (TREE_OPERAND (lhs, 0))
+ 		    && bitmap_bit_p (suitable_for_renaming,
+ 				     DECL_UID (TREE_OPERAND (lhs, 0)))
+ 		    && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
+ 		    && TYPE_MODE (TREE_TYPE (TREE_OPERAND (lhs, 0))) != BLKmode
+ 		    && types_compatible_p (TREE_TYPE (lhs),
+ 					   TREE_TYPE (TREE_TYPE
+ 						       (TREE_OPERAND (lhs, 0))))
+ 		    && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
+ 			% tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
+ 		  {
+ 		    tree var = TREE_OPERAND (lhs, 0);
+ 		    tree val = gimple_assign_rhs1 (stmt);
+ 		    tree bitpos = TREE_OPERAND (lhs, 2);
+ 		    gimple_assign_set_lhs (stmt, var);
+ 		    gimple_assign_set_rhs_with_ops
+ 		      (&gsi, BIT_INSERT_EXPR, var, val, bitpos);
+ 		    stmt = gsi_stmt (gsi);
+ 		    unlink_stmt_vdef (stmt);
+ 		    update_stmt (stmt);
+ 		    continue;
+ 		  }
+ 
+ 		/* Rewrite a vector insert using a MEM_REF on the LHS
+ 		   into a BIT_INSERT_EXPR.  */
+ 		if (TREE_CODE (lhs) == MEM_REF
+ 		    && TREE_CODE (TREE_OPERAND (lhs, 0)) == ADDR_EXPR
+ 		    && (sym = TREE_OPERAND (TREE_OPERAND (lhs, 0), 0))
+ 		    && DECL_P (sym)
+ 		    && bitmap_bit_p (suitable_for_renaming, DECL_UID (sym))
+ 		    && VECTOR_TYPE_P (TREE_TYPE (sym))
+ 		    && TYPE_MODE (TREE_TYPE (sym)) != BLKmode
+ 		    && types_compatible_p (TREE_TYPE (lhs),
+ 					   TREE_TYPE (TREE_TYPE (sym)))
+ 		    && tree_fits_uhwi_p (TREE_OPERAND (lhs, 1))
+ 		    && tree_int_cst_lt (TREE_OPERAND (lhs, 1),
+ 					TYPE_SIZE_UNIT (TREE_TYPE (sym)))
+ 		    && (tree_to_uhwi (TREE_OPERAND (lhs, 1))
+ 			% tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (lhs)))) == 0)
+ 		  {
+ 		    tree val = gimple_assign_rhs1 (stmt);
+ 		    tree bitpos
+ 		      = wide_int_to_tree (bitsizetype,
+ 					  mem_ref_offset (lhs) * BITS_PER_UNIT);
+ 		    gimple_assign_set_lhs (stmt, sym);
+ 		    gimple_assign_set_rhs_with_ops
+ 		      (&gsi, BIT_INSERT_EXPR, sym, val, bitpos);
+ 		    stmt = gsi_stmt (gsi);
+ 		    unlink_stmt_vdef (stmt);
+ 		    update_stmt (stmt);
  		    continue;
  		  }
  
Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
===================================================================
*** /dev/null	1970-01-01 00:00:00.000000000 +0000
--- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-19 11:18:53.465098885 +0200
***************
*** 0 ****
--- 1,33 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O -fdump-tree-ccp1" } */
+ 
+ typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
+ 
+ v4si test1 (v4si v, int i)
+ {
+   ((int *)&v)[0] = i;
+   return v;
+ }
+ 
+ v4si test2 (v4si v, int i)
+ {
+   int *p = (int *)&v;
+   *p = i;
+   return v;
+ }
+ 
+ v4si test3 (v4si v, int i)
+ {
+   ((int *)&v)[3] = i;
+   return v;
+ }
+ 
+ v4si test4 (v4si v, int i)
+ {
+   int *p = (int *)&v;
+   p += 3;
+   *p = i;
+   return v;
+ }
+ 
+ /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" } } */
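
For readers following the tree-ssa.c hunks, the MEM_REF case can be
sketched in plain C.  This is only an illustration of the checks and
of the byte-offset to bit-position conversion, not GCC code: the
helper name mem_ref_insert_bitpos and its parameters are hypothetical,
and CHAR_BIT stands in for BITS_PER_UNIT.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not part of the patch.  Decide whether a store
   of ELT_BYTES bytes at byte OFFSET into a vector of VEC_BYTES bytes
   is an element-aligned insert.  On success, store the bit position
   that the rewritten BIT_INSERT_EXPR would carry in *BITPOS.  */
static bool
mem_ref_insert_bitpos (unsigned long offset, unsigned long vec_bytes,
                       unsigned long elt_bytes, unsigned long *bitpos)
{
  /* The offset must lie inside the vector and on an element boundary.  */
  if (elt_bytes == 0 || offset >= vec_bytes || offset % elt_bytes != 0)
    return false;

  /* Assumption: CHAR_BIT plays the role of BITS_PER_UNIT here.  */
  *bitpos = offset * CHAR_BIT;
  return true;
}
```

With 4-byte int and 8-bit bytes, test3 and test4 above store at byte
offset 12 of a 16-byte vector, which this sketch maps to bit position
96.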

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-19 13:23       ` Richard Biener
@ 2016-05-19 15:21         ` Eric Botcazou
  2016-05-20  8:59           ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Botcazou @ 2016-05-19 15:21 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Michael Matz

> Index: trunk/gcc/tree.def
> ===================================================================
> *** trunk.orig/gcc/tree.def	2016-05-17 17:19:41.783958489 +0200
> --- trunk/gcc/tree.def	2016-05-19 10:23:35.779141973 +0200
> *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> *** 852,857 ****
> --- 852,871 ----
>      descriptor of type ptr_mode.  */
>   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> 
> + /* Given a word, a value and a bit position within the word,
> +    produce the value that results if replacing the parts of word
> +    starting at the bit position with value.
> +    Operand 0 is a tree for the word of integral or vector type;
> +    Operand 1 is a tree for the value of integral or vector element type;
> +    Operand 2 is a tree giving the constant position of the first referenced bit;
> +    The number of bits replaced is given by the precision of the value
> +    type if that is integral or by its size if it is non-integral.
> +    ???  The reason to make the size of the replacement implicit is to not
> +    have a quaternary operation.
> +    The replaced bits shall be fully inside the word.  If the word is of
> +    vector type the replaced bits shall be aligned with its elements.  */
> + DEFTREECODE (BIT_INSERT_EXPR, "bit_field_insert", tcc_expression, 3)
> +

"word" is ambiguous (what is a word of vector type?).  What's allowed as 
operand #0 exactly?  If that's anything, I'd call it a value too, possibly 
with a qualifier, for example:

 /* Given a container value, a replacement value and a bit position within
    the container, produce the value that results from replacing the part of
    the container starting at the bit position with the replacement value.
    Operand 0 is a tree for the container value of integral or vector type;
    Operand 1 is a tree for the replacement value of another integral or
    vector element type;
    Operand 2 is a tree giving the constant bit position;
    The number of bits replaced is given by the precision of the type of the
    replacement value if it is integral or by its size if it is non-integral.
    ???  The reason to make the size of the replacement implicit is to avoid
    introducing a quaternary operation.
    The replaced bits shall be fully inside the container.  If the container
    is of vector type, then these bits shall be aligned with its elements.  */
DEFTREECODE (BIT_INSERT_EXPR, "bit_field_insert", tcc_expression, 3)
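
To make the integral case concrete, here is a minimal sketch of the
semantics in plain C.  It is not GCC code: bit_insert_u32 is a
hypothetical name, the container is fixed at 32 bits, and the size is
passed explicitly, whereas the tree code takes it from the precision
of the replacement value's type.  Bit positions count from the least
significant bit, as the shift in the constant-folding hunk of the
patch does.

```c
#include <stdint.h>

/* Hypothetical illustration, not part of the patch.  Replace BITSIZE
   bits of CONTAINER starting at BITPOS with the low BITSIZE bits of
   VALUE.  The caller must ensure 0 < BITSIZE and
   BITPOS + BITSIZE <= 32, matching the "fully inside the container"
   requirement.  */
static uint32_t
bit_insert_u32 (uint32_t container, uint32_t value,
                unsigned bitpos, unsigned bitsize)
{
  /* Low BITSIZE bits set; avoid an undefined shift by 32.  */
  uint32_t field = bitsize >= 32 ? UINT32_MAX
                                 : (UINT32_C (1) << bitsize) - 1;
  uint32_t mask = field << bitpos;

  /* Clear the replaced bits, then OR in the zero-extended value.  */
  return (container & ~mask) | ((value & field) << bitpos);
}
```

For example, inserting the 8-bit value 0xAB at bit position 8 into
0xFFFFFFFF gives 0xFFFFABFF.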

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-19 15:21         ` Eric Botcazou
@ 2016-05-20  8:59           ` Richard Biener
  2016-05-20 11:25             ` Jakub Jelinek
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2016-05-20  8:59 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Michael Matz

On Thu, 19 May 2016, Eric Botcazou wrote:

> > Index: trunk/gcc/tree.def
> > ===================================================================
> > *** trunk.orig/gcc/tree.def	2016-05-17 17:19:41.783958489 +0200
> > --- trunk/gcc/tree.def	2016-05-19 10:23:35.779141973 +0200
> > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > *** 852,857 ****
> > --- 852,871 ----
> >      descriptor of type ptr_mode.  */
> >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> > 
> > + /* Given a word, a value and a bit position within the word,
> > +    produce the value that results if replacing the parts of word
> > +    starting at the bit position with value.
> > +    Operand 0 is a tree for the word of integral or vector type;
> > +    Operand 1 is a tree for the value of integral or vector element type;
> > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > +    The number of bits replaced is given by the precision of the value
> > +    type if that is integral or by its size if it is non-integral.
> > +    ???  The reason to make the size of the replacement implicit is to not
> > +    have a quaternary operation.
> > +    The replaced bits shall be fully inside the word.  If the word is of
> > +    vector type the replaced bits shall be aligned with its elements.  */
> > + DEFTREECODE (BIT_INSERT_EXPR, "bit_field_insert", tcc_expression, 3)
> > +
> 
> "word" is ambiguous (what is a word of vector type?).  What's allowed as 
> operand #0 exactly?  If that's anything, I'd call it a value too, possibly 
> with a qualifier, for example:
> 
>  /* Given a container value, a replacement value and a bit position within
>     the container, produce the value that results from replacing the part of
>     the container starting at the bit position with the replacement value.
>     Operand 0 is a tree for the container value of integral or vector type;
>     Operand 1 is a tree for the replacement value of another integral or
>     vector element type;
>     Operand 2 is a tree giving the constant bit position;
>     The number of bits replaced is given by the precision of the type of the
>     replacement value if it is integral or by its size if it is non-integral.
>     ???  The reason to make the size of the replacement implicit is to avoid
>     introducing a quaternary operation.
>     The replaced bits shall be fully inside the container.  If the container
>     is of vector type, then these bits shall be aligned with its elements.  */
> DEFTREECODE (BIT_INSERT_EXPR, "bit_field_insert", tcc_expression, 3)

Sounds good.  I will commit later with your wording.

Richard.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20  8:59           ` Richard Biener
@ 2016-05-20 11:25             ` Jakub Jelinek
  2016-05-20 11:41               ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Jakub Jelinek @ 2016-05-20 11:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: Eric Botcazou, gcc-patches, Michael Matz

On Fri, May 20, 2016 at 10:59:18AM +0200, Richard Biener wrote:
> Sounds good.  I will commit later with your wording.

Unfortunately, the new testcase fails e.g. on i?86-*-* or on powerpc*.
On i?86-*-* (without -msse) I actually see 2 different issues, one is
extra -Wpsabi warnings, and another is the dump scan, the optimization isn't
used there at all if we don't have SSE HW.
Surprisingly, on powerpc* the only problem is the extra warnings about ABI
compatibility, but the scan matches, even if there is no vector support.
Similarly on s390* too (and there are no warnings even).

So, dunno if we should limit the scan-tree-dump-times only to a few selected
arches (e.g. those where we add dg-additional-options for, plus some where
it is known to work without additional options, like perhaps aarch64*-*-*,
maybe spu*-*-*, what else?).

2016-05-20  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/29756
	* gcc.dg/tree-ssa/vector-6.c: Add -Wno-psabi -w to dg-options.
	Add -msse2 for x86 and -maltivec for powerpc.

--- gcc/testsuite/gcc.dg/tree-ssa/vector-6.c.jj	2016-05-20 12:44:33.000000000 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-20 13:17:08.730168547 +0200
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp1" } */
+/* { dg-options "-O -fdump-tree-ccp1 -Wno-psabi -w" } */
+/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-maltivec" { target powerpc_altivec_ok } } */
 
 typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
 


	Jakub

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 11:25             ` Jakub Jelinek
@ 2016-05-20 11:41               ` Richard Biener
  2016-05-20 11:52                 ` Jakub Jelinek
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2016-05-20 11:41 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Michael Matz

On Fri, 20 May 2016, Jakub Jelinek wrote:

> On Fri, May 20, 2016 at 10:59:18AM +0200, Richard Biener wrote:
> > Sounds good.  I will commit later with your wording.
> 
> Unfortunately, the new testcase fails e.g. on i?86-*-* or on powerpc*.
> On i?86-*-* (without -msse) I actually see 2 different issues, one is
> extra -Wpsabi warnings, and another is the dump scan, the optimization isn't
> used there at all if we don't have SSE HW.
> Surprisingly, on powerpc* the only problem is the extra warnings about ABI
> compatibility, but the scan matches, even if there is no vector support.
> Similarly on s390* too (and there are no warnings even).

I suppose they still have vector modes enabled.

> So, dunno if we should limit the scan-tree-dump-times only to a few selected
> arches (e.g. those where we add dg-additional-options for, plus some where
> it is known to work without additional options, like perhaps aarch64*-*-*,
> maybe spu*-*-*, what else?).

I'd say ppc and aarch64 are fine.  Thanks for noticing.

Richard.

> 2016-05-20  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/29756
> 	gcc.dg/tree-ssa/vector-6.c: Add -Wno-psabi -w to dg-options.
> 	Add -msse2 for x86 and -maltivec for powerpc.
> 
> --- gcc/testsuite/gcc.dg/tree-ssa/vector-6.c.jj	2016-05-20 12:44:33.000000000 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-20 13:17:08.730168547 +0200
> @@ -1,5 +1,7 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-ccp1" } */
> +/* { dg-options "-O -fdump-tree-ccp1 -Wno-psabi -w" } */
> +/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */
> +/* { dg-additional-options "-maltivec" { target powerpc_altivec_ok } } */
>  
>  typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
>  
> 
> 
> 	Jakub


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 11:41               ` Richard Biener
@ 2016-05-20 11:52                 ` Jakub Jelinek
  2016-05-20 11:53                   ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Jakub Jelinek @ 2016-05-20 11:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Eric Botcazou, gcc-patches, Michael Matz

On Fri, May 20, 2016 at 01:41:01PM +0200, Richard Biener wrote:
> I'd say ppc and aarch64 are fine.  Thanks for noticing.

So like this then?

2016-05-20  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/29756
	* gcc.dg/tree-ssa/vector-6.c: Add -Wno-psabi -w to dg-options.
	Add -msse2 for x86 and -maltivec for powerpc.  Use scan-tree-dump-times
	only on selected targets where V4SImode vectors are known to be
	supported.

--- gcc/testsuite/gcc.dg/tree-ssa/vector-6.c.jj	2016-05-20 12:44:33.000000000 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-20 13:49:19.880961132 +0200
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp1" } */
+/* { dg-options "-O -fdump-tree-ccp1 -Wno-psabi -w" } */
+/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-maltivec" { target powerpc_altivec_ok } } */
 
 typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
 
@@ -30,4 +32,4 @@ v4si test4 (v4si v, int i)
   return v;
 }
 
-/* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" } } */
+/* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { target { { i?86-*-* x86_64-*-* aarch64*-*-* spu*-*-* } || { powerpc_altivec_ok } } } } } */


	Jakub


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 11:52                 ` Jakub Jelinek
@ 2016-05-20 11:53                   ` Richard Biener
  0 siblings, 0 replies; 32+ messages in thread
From: Richard Biener @ 2016-05-20 11:53 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Michael Matz

On Fri, 20 May 2016, Jakub Jelinek wrote:

> On Fri, May 20, 2016 at 01:41:01PM +0200, Richard Biener wrote:
> > I'd say ppc and aarch64 are fine.  Thanks for noticing.
> 
> So like this then?

Yes.

Thanks,
Richard.

> 2016-05-20  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/29756
> 	gcc.dg/tree-ssa/vector-6.c: Add -Wno-psabi -w to dg-options.
> 	Add -msse2 for x86 and -maltivec for powerpc.  Use scan-tree-dump-times
> 	only on selected targets where V4SImode vectors are known to be
> 	supported.
> 
> --- gcc/testsuite/gcc.dg/tree-ssa/vector-6.c.jj	2016-05-20 12:44:33.000000000 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/vector-6.c	2016-05-20 13:49:19.880961132 +0200
> @@ -1,5 +1,7 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-ccp1" } */
> +/* { dg-options "-O -fdump-tree-ccp1 -Wno-psabi -w" } */
> +/* { dg-additional-options "-msse2" { target i?86-*-* x86_64-*-* } } */
> +/* { dg-additional-options "-maltivec" { target powerpc_altivec_ok } } */
>  
>  typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
>  
> @@ -30,4 +32,4 @@ v4si test4 (v4si v, int i)
>    return v;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" } } */
> +/* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { target { { i?86-*-* x86_64-*-* aarch64*-*-* spu*-*-* } || { powerpc_altivec_ok } } } } } */
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-13 10:51 [PATCH][RFC] Introduce BIT_FIELD_INSERT Richard Biener
  2016-05-16  0:55 ` Bill Schmidt
  2016-05-16  8:24 ` Eric Botcazou
@ 2016-05-20 14:11 ` Andi Kleen
  2016-05-20 15:12   ` Marc Glisse
  2018-11-15  1:27 ` Andrew Pinski
  3 siblings, 1 reply; 32+ messages in thread
From: Andi Kleen @ 2016-05-20 14:11 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard Biener <rguenther@suse.de> writes:

> The following patch adds BIT_FIELD_INSERT, an operation to
> facilitate doing bitfield inserts on registers (as opposed
> to currently where we'd have a BIT_FIELD_REF store).

I wonder if these patches would make it easier to use the Haswell
bit manipulation instructions on x86 (which act on registers).

I found that gcc makes significantly less use of them than LLVM,
sometimes leading to much bigger code.

-Andi


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 14:11 ` Andi Kleen
@ 2016-05-20 15:12   ` Marc Glisse
  2016-05-20 15:54     ` Andi Kleen
  0 siblings, 1 reply; 32+ messages in thread
From: Marc Glisse @ 2016-05-20 15:12 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches

On Fri, 20 May 2016, Andi Kleen wrote:

> Richard Biener <rguenther@suse.de> writes:
>
>> The following patch adds BIT_FIELD_INSERT, an operation to
>> facilitate doing bitfield inserts on registers (as opposed
>> to currently where we'd have a BIT_FIELD_REF store).
>
> I wonder if these patches would make it easier to use the Haswell
> bit manipulations instructions on x86 (which act on registers).
>
> I found that gcc makes significantly less use of them than LLVM,
> sometimes leading to much bigger code.

Could you point at some bugzilla entries? I don't really see which BMI* 
instruction could be helped by BIT_FIELD_INSERT (PDEP seems too hard). 
There is one BMI1 instruction we don't use much, bextr (only defined with 
an UNSPEC in i386.md, unlike the TBM version), but it is about extracting.

-- 
Marc Glisse


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 15:12   ` Marc Glisse
@ 2016-05-20 15:54     ` Andi Kleen
  2016-05-20 16:08       ` Jakub Jelinek
  2016-05-20 17:08       ` Marc Glisse
  0 siblings, 2 replies; 32+ messages in thread
From: Andi Kleen @ 2016-05-20 15:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andi Kleen

On Fri, May 20, 2016 at 05:11:59PM +0200, Marc Glisse wrote:
> On Fri, 20 May 2016, Andi Kleen wrote:
> 
> >Richard Biener <rguenther@suse.de> writes:
> >
> >>The following patch adds BIT_FIELD_INSERT, an operation to
> >>facilitate doing bitfield inserts on registers (as opposed
> >>to currently where we'd have a BIT_FIELD_REF store).
> >
> >I wonder if these patches would make it easier to use the Haswell
> >bit manipulations instructions on x86 (which act on registers).
> >
> >I found that gcc makes significantly less use of them than LLVM,
> >sometimes leading to much bigger code.
> 
> Could you point at some bugzilla entries? I don't really see which
> BMI* instruction could be helped by BIT_FIELD_INSERT (PDEP seems too
> hard). There is one BMI1 instruction we don't use much, bextr (only
> defined with an UNSPEC in i386.md, unlike the TBM version), but it
> is about extracting.

Ok. Yes I was thinking of BEXTR.

I thought I had filed a bugzilla at some point, but can't
find it right now. If you compare bitfield code
compiled for Haswell on LLVM and GCC it is very visible
how much worse gcc is.

So perhaps it only needs changes in the backend.

-Andi


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 15:54     ` Andi Kleen
@ 2016-05-20 16:08       ` Jakub Jelinek
  2016-05-20 19:25         ` Richard Biener
  2016-05-20 17:08       ` Marc Glisse
  1 sibling, 1 reply; 32+ messages in thread
From: Jakub Jelinek @ 2016-05-20 16:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches

On Fri, May 20, 2016 at 08:54:39AM -0700, Andi Kleen wrote:
> I thought I had filed a bugzilla at some point, but can't
> find it right now. If you compare bitfield code
> compiled for Haswell on LLVM and GCC it is very visible
> how much worse gcc is.

We really need to lower bitfield operations (especially when there are
multiple adjacent ones) to integer arithmetic on the underlying
DECL_BIT_FIELD_REPRESENTATIVE fields somewhere in (late?) gimple and
perform some cleanups after it and only after that expand.
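
To make the idea concrete, here is a rough C model of what such a lowering
could produce for two adjacent bit-field stores.  This is only a sketch:
insert_bits and store_a_and_b are hypothetical names, the layout (two 5-bit
fields at bit 0 and bit 5 of a 32-bit representative, numbered from the least
significant bit) is assumed, and none of this is code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Replace WIDTH bits of WORD, starting at bit POS (counted from the
   least significant bit), with the low WIDTH bits of VAL.  */
uint32_t
insert_bits (uint32_t word, uint32_t val, unsigned pos, unsigned width)
{
  assert (width > 0 && width < 32 && pos + width <= 32);
  uint32_t mask = ((UINT32_C (1) << width) - 1) << pos;
  return (word & ~mask) | ((val << pos) & mask);
}

/* Model of "s.a = x; s.b = y;" after lowering, assuming a is the 5-bit
   field at bit 0 and b the 5-bit field at bit 5: the representative is
   loaded once (REP), both inserts happen in a register, and the caller
   stores the result once.  */
uint32_t
store_a_and_b (uint32_t rep, uint32_t x, uint32_t y)
{
  rep = insert_bits (rep, x, 0, 5);
  rep = insert_bits (rep, y, 5, 5);
  return rep;
}
```

With the two inserts expressed on one register value, later passes have a
chance to merge the masks instead of seeing two separate memory
read-modify-write sequences.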

	Jakub


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 15:54     ` Andi Kleen
  2016-05-20 16:08       ` Jakub Jelinek
@ 2016-05-20 17:08       ` Marc Glisse
  1 sibling, 0 replies; 32+ messages in thread
From: Marc Glisse @ 2016-05-20 17:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gcc-patches

On Fri, 20 May 2016, Andi Kleen wrote:

> On Fri, May 20, 2016 at 05:11:59PM +0200, Marc Glisse wrote:
>> On Fri, 20 May 2016, Andi Kleen wrote:
>>
>>> Richard Biener <rguenther@suse.de> writes:
>>>
>>>> The following patch adds BIT_FIELD_INSERT, an operation to
>>>> facilitate doing bitfield inserts on registers (as opposed
>>>> to currently where we'd have a BIT_FIELD_REF store).
>>>
>>> I wonder if these patches would make it easier to use the Haswell
>>> bit manipulations instructions on x86 (which act on registers).
>>>
>>> I found that gcc makes significantly less use of them than LLVM,
>>> sometimes leading to much bigger code.
>>
>> Could you point at some bugzilla entries? I don't really see which
>> BMI* instruction could be helped by BIT_FIELD_INSERT (PDEP seems too
>> hard). There is one BMI1 instruction we don't use much, bextr (only
>> defined with an UNSPEC in i386.md, unlike the TBM version), but it
>> is about extracting.
>
> Ok. Yes I was thinking of BEXTR.
>
> I thought I had filed a bugzilla at some point, but can't
> find it right now. If you compare bitfield code
> compiled for Haswell on LLVM and GCC it is very visible
> how much worse gcc is.

If I try some simple operations on bitfields, I don't see anything that 
obvious.

 	movl	$1285, %eax             # imm = 0x505
 	bextrl	%eax, %edi, %eax

vs

 	shrl	$5, %eax
 	andl	$31, %eax

is not that much better.
Incrementing a field gives one more shift with gcc and one more 'and' with 
clang, so maybe clang is slightly better there.

> So perhaps it only needs changes in the backend.

With -mtbm we generate

 	bextr	$1285, %edi, %eax

so it shouldn't be hard to generate the same code as clang above, but I 
don't think that's an example of the "much worse" you have in mind.
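
For reference, the constant 1285 in the listings above is 0x505.  BEXTR takes
the start bit in bits 7:0 of its control operand and the field length in bits
15:8, so 0x505 means "5 bits starting at bit 5", which is the same thing the
shrl $5 / andl $31 pair computes.  A small C model, with hypothetical helper
names bextr_control and extract_bits:

```c
#include <assert.h>
#include <stdint.h>

/* Build a BEXTR-style control word: start bit in bits 7:0,
   field length in bits 15:8.  */
uint32_t
bextr_control (unsigned start, unsigned len)
{
  assert (start < 256 && len < 256);
  return (len << 8) | start;
}

/* Portable model of a 32-bit extract driven by such a control word.
   A start beyond the operand or a zero length yields 0; a length of
   32 or more keeps everything above the start bit.  */
uint32_t
extract_bits (uint32_t src, uint32_t control)
{
  unsigned start = control & 0xff;
  unsigned len = (control >> 8) & 0xff;
  if (start >= 32 || len == 0)
    return 0;
  uint32_t shifted = src >> start;
  return len >= 32 ? shifted : shifted & ((UINT32_C (1) << len) - 1);
}
```

So extract_bits (x, bextr_control (5, 5)) and (x >> 5) & 31 agree for every x,
which is why the two instruction sequences are interchangeable here.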

-- 
Marc Glisse


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-20 16:08       ` Jakub Jelinek
@ 2016-05-20 19:25         ` Richard Biener
  0 siblings, 0 replies; 32+ messages in thread
From: Richard Biener @ 2016-05-20 19:25 UTC (permalink / raw)
  To: Jakub Jelinek, Andi Kleen; +Cc: gcc-patches

On May 20, 2016 6:08:34 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote:
>On Fri, May 20, 2016 at 08:54:39AM -0700, Andi Kleen wrote:
>> I thought I had filed a bugzilla at some point, but can't
>> find it right now. If you compare bitfield code
>> compiled for Haswell on LLVM and GCC it is very visible
>> how much worse gcc is.
>
>We really need to lower bitfield operations (especially when there are
>multiple adjacent ones) to integer arithmetics on the underlying
>DECL_BIT_FIELD_REPRESENTATIVE fields somewhere in (late?) gimple and
>perform some cleanups after it and only after that expand.

Yes, I still have patches for this and plan to resurrect them now that one prerequisite is in.

Richard.

>	Jakub



* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2016-05-13 10:51 [PATCH][RFC] Introduce BIT_FIELD_INSERT Richard Biener
                   ` (2 preceding siblings ...)
  2016-05-20 14:11 ` Andi Kleen
@ 2018-11-15  1:27 ` Andrew Pinski
  2018-11-15  8:29   ` Richard Biener
  3 siblings, 1 reply; 32+ messages in thread
From: Andrew Pinski @ 2018-11-15  1:27 UTC (permalink / raw)
  To: Richard Guenther; +Cc: GCC Patches

On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
>
>
> The following patch adds BIT_FIELD_INSERT, an operation to
> facilitate doing bitfield inserts on registers (as opposed
> to currently where we'd have a BIT_FIELD_REF store).
>
> Originally this was developed as part of bitfield lowering
> where bitfield stores were lowered into read-modify-write
> cycles and the modify part, instead of doing shifting and masking,
> be kept in a more high-level form to ease combining them.
>
> A second use case (the above is still valid) is vector element
> inserts which we currently can only do via memory or
> by extracting all components and re-building the vector using
> a CONSTRUCTOR.  For this second use case I added code
> re-writing the BIT_FIELD_REF stores the C family FEs produce
> into BIT_FIELD_INSERT when update-address-taken can otherwise
> re-write a decl into SSA form (the testcase shows we miss
> a similar opportunity with the MEM_REF form of a vector insert,
> I plan to fix that for the final submission).
>
> One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> is that the size of the insertion is given implicitely via the
> type size/precision of the value to insert.  That avoids
> introducing ways to have quaternary ops in folding and GIMPLE stmts.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2011-06-16  Richard Guenther  <rguenther@suse.de>
>
>         PR tree-optimization/29756
>         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
>         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
>         * fold-const.c (operand_equal_p): Likewise.
>         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
>         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
>         * tree-inline.c (estimate_operator_cost): Likewise.
>         * tree-pretty-print.c (dump_generic_node): Likewise.
>         * tree-ssa-operands.c (get_expr_operands): Likewise.
>         * cfgexpand.c (expand_debug_expr): Likewise.
>         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
>         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
>         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
>
>         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
>         vector inserts using BIT_FIELD_REF on the lhs.
>         (execute_update_addresses_taken): Do it.
>
>         * gcc.dg/tree-ssa/vector-6.c: New testcase.
>
> Index: trunk/gcc/expr.c
> ===================================================================
> *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> *************** expand_expr_real_2 (sepops ops, rtx targ
> *** 9358,9363 ****
> --- 9358,9380 ----
>         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
>         return target;
>
> +     case BIT_FIELD_INSERT:
> +       {
> +       unsigned bitpos = tree_to_uhwi (treeop2);
> +       unsigned bitsize;
> +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> +       else
> +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> +       rtx op0 = expand_normal (treeop0);
> +       rtx op1 = expand_normal (treeop1);
> +       rtx dst = gen_reg_rtx (mode);
> +       emit_move_insn (dst, op0);
> +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> +       return dst;
> +       }
> +
>       default:
>         gcc_unreachable ();
>       }
> Index: trunk/gcc/fold-const.c
> ===================================================================
> *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> *************** operand_equal_p (const_tree arg0, const_
> *** 3163,3168 ****
> --- 3163,3169 ----
>
>         case VEC_COND_EXPR:
>         case DOT_PROD_EXPR:
> +       case BIT_FIELD_INSERT:
>           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
>
>         default:
> *************** fold_ternary_loc (location_t loc, enum t
> *** 11870,11875 ****
> --- 11871,11916 ----
>         }
>         return NULL_TREE;
>
> +     case BIT_FIELD_INSERT:
> +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> +       if (TREE_CODE (arg0) == INTEGER_CST
> +         && TREE_CODE (arg1) == INTEGER_CST)
> +       {
> +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> +         wide_int tem = wi::bit_and (arg0,
> +                                     wi::shifted_mask (bitpos, bitsize, true,
> +                                                       TYPE_PRECISION (type)));
> +         wide_int tem2
> +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> +                                   bitsize), bitpos);
> +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> +       }

This seems incorrect for BYTES_BIG_ENDIAN targets as far as I can
tell.  With BYTES_BIG_ENDIAN, bit positions are counted from the most
significant bit rather than the least significant one.  Sorry for
bringing this up after this has been in the tree for a long time, but I
finally got around to testing my bit-field lowering on a few big-endian
targets and ran into this issue.
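
A standalone C model of the folding may help show the difference.  This is
only a sketch under assumptions: fold_bit_insert is a hypothetical name, the
big_endian_p flag stands in for BYTES_BIG_ENDIAN, and mirroring the position
as prec - bitsize - bitpos is one plausible adjustment, not necessarily the
fix that should go into fold-const.c.

```c
#include <assert.h>
#include <stdint.h>

/* Insert the low BITSIZE bits of VAL into the PREC-bit value WORD.
   BITPOS counts from the least significant bit, unless BIG_ENDIAN_P is
   nonzero, in which case it counts from the most significant bit.  */
uint64_t
fold_bit_insert (uint64_t word, uint64_t val, unsigned bitpos,
                 unsigned bitsize, unsigned prec, int big_endian_p)
{
  assert (prec > 0 && prec <= 64);
  assert (bitsize > 0 && bitsize < 64 && bitpos + bitsize <= prec);
  /* Assumed adjustment: mirror the position within the word.  */
  if (big_endian_p)
    bitpos = prec - bitsize - bitpos;
  uint64_t mask = ((UINT64_C (1) << bitsize) - 1) << bitpos;
  return (word & ~mask) | ((val << bitpos) & mask);
}
```

For an 8-bit insert at position 0 of a 32-bit word, the little-endian
numbering replaces the low byte while the big-endian numbering replaces the
high byte, so folding without the adjustment would touch the wrong bits.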

Thanks,
Andrew Pinski

> +       else if (TREE_CODE (arg0) == VECTOR_CST
> +              && CONSTANT_CLASS_P (arg1)
> +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> +                                     TREE_TYPE (arg1)))
> +       {
> +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> +         unsigned HOST_WIDE_INT elsize
> +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> +         if (bitpos % elsize == 0)
> +           {
> +             unsigned k = bitpos / elsize;
> +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> +               return arg0;
> +             else
> +               {
> +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> +                 elts[k] = arg1;
> +                 return build_vector (type, elts);
> +               }
> +           }
> +       }
> +       return NULL_TREE;
> +
>       default:
>         return NULL_TREE;
>       } /* switch (code) */
> Index: trunk/gcc/gimplify.c
> ===================================================================
> *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> *************** gimplify_expr (tree *expr_p, gimple_seq
> *** 10936,10941 ****
> --- 10936,10945 ----
>           /* Classified as tcc_expression.  */
>           goto expr_3;
>
> +       case BIT_FIELD_INSERT:
> +         /* Argument 3 is a constant.  */
> +         goto expr_2;
> +
>         case POINTER_PLUS_EXPR:
>           {
>             enum gimplify_status r0, r1;
> Index: trunk/gcc/tree-inline.c
> ===================================================================
> *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> *************** estimate_operator_cost (enum tree_code c
> *** 3941,3946 ****
> --- 3941,3950 ----
>           return weights->div_mod_cost;
>         return 1;
>
> +     /* Bit-field insertion needs several shift and mask operations.  */
> +     case BIT_FIELD_INSERT:
> +       return 3;
> +
>       default:
>         /* We expect a copy assignment with no operator.  */
>         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> Index: trunk/gcc/tree-pretty-print.c
> ===================================================================
> *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> *************** dump_generic_node (pretty_printer *pp, t
> *** 1876,1881 ****
> --- 1876,1898 ----
>         pp_greater (pp);
>         break;
>
> +     case BIT_FIELD_INSERT:
> +       pp_string (pp, "BIT_FIELD_INSERT <");
> +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> +       pp_string (pp, ", ");
> +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> +       pp_string (pp, ", ");
> +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> +       pp_string (pp, " (");
> +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> +       pp_decimal_int (pp,
> +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> +       else
> +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> +                          spc, flags, false);
> +       pp_string (pp, " bits)>");
> +       break;
> +
>       case ARRAY_REF:
>       case ARRAY_RANGE_REF:
>         op0 = TREE_OPERAND (node, 0);
> Index: trunk/gcc/tree-ssa-operands.c
> ===================================================================
> *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> *************** get_expr_operands (struct function *fn,
> *** 833,838 ****
> --- 833,839 ----
>         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
>         return;
>
> +     case BIT_FIELD_INSERT:
>       case COMPOUND_EXPR:
>       case OBJ_TYPE_REF:
>       case ASSERT_EXPR:
> Index: trunk/gcc/tree.def
> ===================================================================
> *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> *** 852,857 ****
> --- 852,868 ----
>      descriptor of type ptr_mode.  */
>   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
>
> + /* Given a word, a value and a bitfield position within the word,
> +    produce the value that results if replacing the
> +    described parts of word with value.
> +    Operand 0 is a tree for the word of integral type;
> +    Operand 1 is a tree for the value of integral type;
> +    Operand 2 is a tree giving the constant position of the first referenced bit;
> +    The number of bits replaced is given by the precision of the value
> +    type if that is integral or by its size if it is non-integral.
> +    The replaced bits shall be fully inside the word.  */
> + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> +
>   /* Given two real or integer operands of the same type,
>      returns a complex value of the corresponding complex type.  */
>   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> Index: trunk/gcc/cfgexpand.c
> ===================================================================
> *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> *************** expand_debug_expr (tree exp)
> *** 5025,5030 ****
> --- 5025,5031 ----
>       case FIXED_CONVERT_EXPR:
>       case OBJ_TYPE_REF:
>       case WITH_SIZE_EXPR:
> +     case BIT_FIELD_INSERT:
>         return NULL;
>
>       case DOT_PROD_EXPR:
> Index: trunk/gcc/gimple-pretty-print.c
> ===================================================================
> *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> *************** dump_ternary_rhs (pretty_printer *buffer
> *** 479,484 ****
> --- 479,502 ----
>         pp_greater (buffer);
>         break;
>
> +     case BIT_FIELD_INSERT:
> +       pp_string (buffer, "BIT_FIELD_INSERT <");
> +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> +       pp_string (buffer, ", ");
> +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> +       pp_string (buffer, ", ");
> +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> +       pp_string (buffer, " (");
> +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> +       pp_decimal_int (buffer,
> +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> +       else
> +       dump_generic_node (buffer,
> +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> +                          spc, flags, false);
> +       pp_string (buffer, " bits)>");
> +       break;
> +
>       default:
>         gcc_unreachable ();
>       }
> Index: trunk/gcc/gimple.c
> ===================================================================
> *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> *************** get_gimple_rhs_num_ops (enum tree_code c
> *** 2044,2049 ****
> --- 2044,2050 ----
>         || (SYM) == REALIGN_LOAD_EXPR                                       \
>         || (SYM) == VEC_COND_EXPR                                                   \
>         || (SYM) == VEC_PERM_EXPR                                             \
> +       || (SYM) == BIT_FIELD_INSERT                                        \
>         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
>      : ((SYM) == CONSTRUCTOR                                                \
>         || (SYM) == OBJ_TYPE_REF                                                    \
> Index: trunk/gcc/tree-cfg.c
> ===================================================================
> *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> *************** verify_gimple_assign_ternary (gassign *s
> *** 4155,4160 ****
> --- 4155,4207 ----
>
>         return false;
>
> +     case BIT_FIELD_INSERT:
> +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> +       {
> +         error ("type mismatch in BIT_FIELD_INSERT");
> +         debug_generic_expr (lhs_type);
> +         debug_generic_expr (rhs1_type);
> +         return true;
> +       }
> +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> +             && INTEGRAL_TYPE_P (rhs2_type))
> +            || (VECTOR_TYPE_P (rhs1_type)
> +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> +       {
> +         error ("not allowed type combination in BIT_FIELD_INSERT");
> +         debug_generic_expr (rhs1_type);
> +         debug_generic_expr (rhs2_type);
> +         return true;
> +       }
> +       if (! tree_fits_uhwi_p (rhs3)
> +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> +       {
> +         error ("invalid position or size in BIT_FIELD_INSERT");
> +         return true;
> +       }
> +       if (INTEGRAL_TYPE_P (rhs1_type))
> +       {
> +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> +             || (bitpos + TYPE_PRECISION (rhs2_type)
> +                 > TYPE_PRECISION (rhs1_type)))
> +           {
> +             error ("insertion out of range in BIT_FIELD_INSERT");
> +             return true;
> +           }
> +       }
> +       else if (VECTOR_TYPE_P (rhs1_type))
> +       {
> +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> +         if (bitpos % bitsize != 0)
> +           {
> +             error ("vector insertion not at element boundary");
> +             return true;
> +           }
> +       }
> +       return false;
> +
>       case DOT_PROD_EXPR:
>       case REALIGN_LOAD_EXPR:
>         /* FIXME.  */
> Index: trunk/gcc/tree-ssa.c
> ===================================================================
> *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> *************** non_rewritable_lvalue_p (tree lhs)
> *** 1318,1323 ****
> --- 1318,1335 ----
>         return false;
>       }
>
> +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> +      BIT_FIELD_INSERT.  */
> +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> +       && DECL_P (TREE_OPERAND (lhs, 0))
> +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> +       /* && bitsize % element-size == 0 */
> +       && types_compatible_p (TREE_TYPE (lhs),
> +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> +     return false;
> +
>     return true;
>   }
>
> *************** execute_update_addresses_taken (void)
> *** 1536,1541 ****
> --- 1548,1576 ----
>                     stmt = gsi_stmt (gsi);
>                     unlink_stmt_vdef (stmt);
>                     update_stmt (stmt);
> +                   continue;
> +                 }
> +
> +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> +                  into a BIT_FIELD_INSERT.  */
> +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> +                   && DECL_P (TREE_OPERAND (lhs, 0))
> +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> +                   && types_compatible_p (TREE_TYPE (lhs),
> +                                          TREE_TYPE (TREE_TYPE
> +                                                      (TREE_OPERAND (lhs, 0))))
> +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> +                 {
> +                   tree var = TREE_OPERAND (lhs, 0);
> +                   tree val = gimple_assign_rhs1 (stmt);
> +                   tree bitpos = TREE_OPERAND (lhs, 2);
> +                   gimple_assign_set_lhs (stmt, var);
> +                   gimple_assign_set_rhs_with_ops
> +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> +                   stmt = gsi_stmt (gsi);
> +                   unlink_stmt_vdef (stmt);
> +                   update_stmt (stmt);
>                     continue;
>                   }
>
> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> ===================================================================
> *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> ***************
> *** 0 ****
> --- 1,34 ----
> + /* { dg-do compile } */
> + /* { dg-options "-O -fdump-tree-ccp1" } */
> +
> + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +
> + v4si test1 (v4si v, int i)
> + {
> +   ((int *)&v)[0] = i;
> +   return v;
> + }
> +
> + v4si test2 (v4si v, int i)
> + {
> +   int *p = (int *)&v;
> +   *p = i;
> +   return v;
> + }
> +
> + v4si test3 (v4si v, int i)
> + {
> +   ((int *)&v)[3] = i;
> +   return v;
> + }
> +
> + v4si test4 (v4si v, int i)
> + {
> +   int *p = (int *)&v;
> +   p += 3;
> +   *p = i;
> +   return v;
> + }
> +
> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2018-11-15  1:27 ` Andrew Pinski
@ 2018-11-15  8:29   ` Richard Biener
  2018-11-15  8:31     ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2018-11-15  8:29 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: GCC Patches

On Wed, 14 Nov 2018, Andrew Pinski wrote:

> On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
> >
> >
> > The following patch adds BIT_FIELD_INSERT, an operation to
> > facilitate doing bitfield inserts on registers (as opposed
> > to currently where we'd have a BIT_FIELD_REF store).
> >
> > Originally this was developed as part of bitfield lowering
> > where bitfield stores were lowered into read-modify-write
> > cycles and the modify part, instead of doing shifting and masking,
> > was kept in a more high-level form to ease combining them.
> >
> > A second use case (the above is still valid) is vector element
> > inserts which we currently can only do via memory or
> > by extracting all components and re-building the vector using
> > a CONSTRUCTOR.  For this second use case I added code
> > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > re-write a decl into SSA form (the testcase shows we miss
> > a similar opportunity with the MEM_REF form of a vector insert,
> > I plan to fix that for the final submission).
> >
> > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > is that the size of the insertion is given implicitly via the
> > type size/precision of the value to insert.  That avoids
> > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > Richard.
> >
> > 2011-06-16  Richard Guenther  <rguenther@suse.de>
> >
> >         PR tree-optimization/29756
> >         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> >         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> >         * fold-const.c (operand_equal_p): Likewise.
> >         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> >         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> >         * tree-inline.c (estimate_operator_cost): Likewise.
> >         * tree-pretty-print.c (dump_generic_node): Likewise.
> >         * tree-ssa-operands.c (get_expr_operands): Likewise.
> >         * cfgexpand.c (expand_debug_expr): Likewise.
> >         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> >         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> >         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> >
> >         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> >         vector inserts using BIT_FIELD_REF on the lhs.
> >         (execute_update_addresses_taken): Do it.
> >
> >         * gcc.dg/tree-ssa/vector-6.c: New testcase.
> >
> > Index: trunk/gcc/expr.c
> > ===================================================================
> > *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> > *************** expand_expr_real_2 (sepops ops, rtx targ
> > *** 9358,9363 ****
> > --- 9358,9380 ----
> >         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> >         return target;
> >
> > +     case BIT_FIELD_INSERT:
> > +       {
> > +       unsigned bitpos = tree_to_uhwi (treeop2);
> > +       unsigned bitsize;
> > +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > +       else
> > +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > +       rtx op0 = expand_normal (treeop0);
> > +       rtx op1 = expand_normal (treeop1);
> > +       rtx dst = gen_reg_rtx (mode);
> > +       emit_move_insn (dst, op0);
> > +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> > +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > +       return dst;
> > +       }
> > +
> >       default:
> >         gcc_unreachable ();
> >       }
> > Index: trunk/gcc/fold-const.c
> > ===================================================================
> > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> > *************** operand_equal_p (const_tree arg0, const_
> > *** 3163,3168 ****
> > --- 3163,3169 ----
> >
> >         case VEC_COND_EXPR:
> >         case DOT_PROD_EXPR:
> > +       case BIT_FIELD_INSERT:
> >           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> >
> >         default:
> > *************** fold_ternary_loc (location_t loc, enum t
> > *** 11870,11875 ****
> > --- 11871,11916 ----
> >         }
> >         return NULL_TREE;
> >
> > +     case BIT_FIELD_INSERT:
> > +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > +       if (TREE_CODE (arg0) == INTEGER_CST
> > +         && TREE_CODE (arg1) == INTEGER_CST)
> > +       {
> > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > +         wide_int tem = wi::bit_and (arg0,
> > +                                     wi::shifted_mask (bitpos, bitsize, true,
> > +                                                       TYPE_PRECISION (type)));
> > +         wide_int tem2
> > +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> > +                                   bitsize), bitpos);
> > +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> > +       }
> 
> This seems incorrect for the BYTES_BIG_ENDIAN case, as far as I
> can tell.  With BYTES_BIG_ENDIAN, the bit position counts from the most
> significant bit rather than the least significant.

You mean the bitpos operand of BIT_FIELD_INSERT works in a different way?
I see the BIT_FIELD_REF folding uses native_encode/interpret but only
handles byte-aligned references.  I suppose the BIT_INSERT_EXPR case
(you are following up an old patch) could do the same.
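
To make the endianness question concrete, here is a minimal sketch in
plain C (not GCC code) of the mask-and-shift folding done in the
INTEGER_CST case, plus one possible big-endian adjustment.  The function
names, the fixed 32-bit word, and the assumption that a big-endian bit
position counts from the most significant bit are illustrative only; an
actual fix may well look different.

```c
#include <assert.h>
#include <stdint.h>

/* Replace BITSIZE bits of WORD, starting SHIFT bits above the least
   significant bit, with the low BITSIZE bits of VALUE.  This mirrors
   the shifted-mask / zext / lshift sequence in the quoted folding.  */
static uint32_t
insert_bits_lsb0 (uint32_t word, uint32_t value, unsigned shift,
                  unsigned bitsize)
{
  uint32_t field = bitsize >= 32 ? UINT32_MAX
                                 : ((UINT32_C (1) << bitsize) - 1);
  uint32_t mask = field << shift;
  return (word & ~mask) | ((value & field) << shift);
}

/* Hypothetical endian-aware wrapper.  Assumption: with BIG_ENDIAN set,
   BITPOS counts from the most significant bit, so it is converted to
   an LSB-based shift before doing the insert.  */
static uint32_t
insert_bits (uint32_t word, uint32_t value, unsigned bitpos,
             unsigned bitsize, int big_endian)
{
  /* The replaced bits must lie fully inside the word.  */
  assert (bitsize >= 1 && bitpos + bitsize <= 32);
  unsigned shift = big_endian ? 32 - bitpos - bitsize : bitpos;
  return insert_bits_lsb0 (word, value, shift, bitsize);
}
```

Under these assumptions, the same (bitpos, bitsize) pair selects
different bits depending on the flag, which is the mismatch the folding
above would have to account for.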

>  Sorry I am bringing
> this up after this has been in the tree for a long time but I finally
> got around to testing my bit-field lowering on a few big-endian
> targets and ran into this issue.

So - can you fix it please?  Also note that the VECTOR_CST case
(as in BIT_FIELD_REF) seems to be inconsistent here and counts
"bits" in a different way?

Thanks,
Richard.
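
On the VECTOR_CST remark above: that case picks the element purely by
index, bitpos / elsize, with no endianness adjustment.  A rough sketch
in plain C, assuming a hypothetical 4-element vector of 32-bit ints (the
names and sizes are made up for illustration and are not GCC API):

```c
#include <stdint.h>
#include <string.h>

#define NELTS 4   /* hypothetical vector length for this sketch */

/* Copy SRC to DST with the element covering bit offset BITPOS replaced
   by VAL.  Returns 1 on success and 0 when BITPOS is not on an element
   boundary or lies outside the vector (the "do not fold" case).  */
static int
vector_insert_elt (int32_t *dst, const int32_t *src, int32_t val,
                   unsigned bitpos)
{
  const unsigned elsize = 32;   /* element size in bits */
  if (bitpos % elsize != 0)
    return 0;
  unsigned k = bitpos / elsize; /* element index, counted in array order */
  if (k >= NELTS)
    return 0;
  memcpy (dst, src, NELTS * sizeof *src);
  dst[k] = val;
  return 1;
}
```

Here bit position 96 always means element 3 in array order, whatever
the byte order of the target.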

> Thanks,
> Andrew Pinski
> 
> > +       else if (TREE_CODE (arg0) == VECTOR_CST
> > +              && CONSTANT_CLASS_P (arg1)
> > +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> > +                                     TREE_TYPE (arg1)))
> > +       {
> > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > +         unsigned HOST_WIDE_INT elsize
> > +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> > +         if (bitpos % elsize == 0)
> > +           {
> > +             unsigned k = bitpos / elsize;
> > +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> > +               return arg0;
> > +             else
> > +               {
> > +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> > +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> > +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> > +                 elts[k] = arg1;
> > +                 return build_vector (type, elts);
> > +               }
> > +           }
> > +       }
> > +       return NULL_TREE;
> > +
> >       default:
> >         return NULL_TREE;
> >       } /* switch (code) */
> > Index: trunk/gcc/gimplify.c
> > ===================================================================
> > *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> > *************** gimplify_expr (tree *expr_p, gimple_seq
> > *** 10936,10941 ****
> > --- 10936,10945 ----
> >           /* Classified as tcc_expression.  */
> >           goto expr_3;
> >
> > +       case BIT_FIELD_INSERT:
> > +         /* Argument 3 is a constant.  */
> > +         goto expr_2;
> > +
> >         case POINTER_PLUS_EXPR:
> >           {
> >             enum gimplify_status r0, r1;
> > Index: trunk/gcc/tree-inline.c
> > ===================================================================
> > *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> > *************** estimate_operator_cost (enum tree_code c
> > *** 3941,3946 ****
> > --- 3941,3950 ----
> >           return weights->div_mod_cost;
> >         return 1;
> >
> > +     /* Bit-field insertion needs several shift and mask operations.  */
> > +     case BIT_FIELD_INSERT:
> > +       return 3;
> > +
> >       default:
> >         /* We expect a copy assignment with no operator.  */
> >         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> > Index: trunk/gcc/tree-pretty-print.c
> > ===================================================================
> > *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> > *************** dump_generic_node (pretty_printer *pp, t
> > *** 1876,1881 ****
> > --- 1876,1898 ----
> >         pp_greater (pp);
> >         break;
> >
> > +     case BIT_FIELD_INSERT:
> > +       pp_string (pp, "BIT_FIELD_INSERT <");
> > +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> > +       pp_string (pp, ", ");
> > +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> > +       pp_string (pp, ", ");
> > +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> > +       pp_string (pp, " (");
> > +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> > +       pp_decimal_int (pp,
> > +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> > +       else
> > +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> > +                          spc, flags, false);
> > +       pp_string (pp, " bits)>");
> > +       break;
> > +
> >       case ARRAY_REF:
> >       case ARRAY_RANGE_REF:
> >         op0 = TREE_OPERAND (node, 0);
> > Index: trunk/gcc/tree-ssa-operands.c
> > ===================================================================
> > *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> > --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> > *************** get_expr_operands (struct function *fn,
> > *** 833,838 ****
> > --- 833,839 ----
> >         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> >         return;
> >
> > +     case BIT_FIELD_INSERT:
> >       case COMPOUND_EXPR:
> >       case OBJ_TYPE_REF:
> >       case ASSERT_EXPR:
> > Index: trunk/gcc/tree.def
> > ===================================================================
> > *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > *** 852,857 ****
> > --- 852,868 ----
> >      descriptor of type ptr_mode.  */
> >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> >
> > + /* Given a word, a value and a bitfield position within the word,
> > +    produce the value that results if replacing the
> > +    described parts of word with value.
> > +    Operand 0 is a tree for the word of integral type;
> > +    Operand 1 is a tree for the value of integral type;
> > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > +    The number of bits replaced is given by the precision of the value
> > +    type if that is integral or by its size if it is non-integral.
> > +    The replaced bits shall be fully inside the word.  */
> > + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> > +
> >   /* Given two real or integer operands of the same type,
> >      returns a complex value of the corresponding complex type.  */
> >   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> > Index: trunk/gcc/cfgexpand.c
> > ===================================================================
> > *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> > --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> > *************** expand_debug_expr (tree exp)
> > *** 5025,5030 ****
> > --- 5025,5031 ----
> >       case FIXED_CONVERT_EXPR:
> >       case OBJ_TYPE_REF:
> >       case WITH_SIZE_EXPR:
> > +     case BIT_FIELD_INSERT:
> >         return NULL;
> >
> >       case DOT_PROD_EXPR:
> > Index: trunk/gcc/gimple-pretty-print.c
> > ===================================================================
> > *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> > --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> > *************** dump_ternary_rhs (pretty_printer *buffer
> > *** 479,484 ****
> > --- 479,502 ----
> >         pp_greater (buffer);
> >         break;
> >
> > +     case BIT_FIELD_INSERT:
> > +       pp_string (buffer, "BIT_FIELD_INSERT <");
> > +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> > +       pp_string (buffer, ", ");
> > +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> > +       pp_string (buffer, ", ");
> > +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> > +       pp_string (buffer, " (");
> > +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> > +       pp_decimal_int (buffer,
> > +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> > +       else
> > +       dump_generic_node (buffer,
> > +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> > +                          spc, flags, false);
> > +       pp_string (buffer, " bits)>");
> > +       break;
> > +
> >       default:
> >         gcc_unreachable ();
> >       }
> > Index: trunk/gcc/gimple.c
> > ===================================================================
> > *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> > *************** get_gimple_rhs_num_ops (enum tree_code c
> > *** 2044,2049 ****
> > --- 2044,2050 ----
> >         || (SYM) == REALIGN_LOAD_EXPR                                       \
> >         || (SYM) == VEC_COND_EXPR                                                   \
> >         || (SYM) == VEC_PERM_EXPR                                             \
> > +       || (SYM) == BIT_FIELD_INSERT                                        \
> >         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
> >      : ((SYM) == CONSTRUCTOR                                                \
> >         || (SYM) == OBJ_TYPE_REF                                                    \
> > Index: trunk/gcc/tree-cfg.c
> > ===================================================================
> > *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> > --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> > *************** verify_gimple_assign_ternary (gassign *s
> > *** 4155,4160 ****
> > --- 4155,4207 ----
> >
> >         return false;
> >
> > +     case BIT_FIELD_INSERT:
> > +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> > +       {
> > +         error ("type mismatch in BIT_FIELD_INSERT");
> > +         debug_generic_expr (lhs_type);
> > +         debug_generic_expr (rhs1_type);
> > +         return true;
> > +       }
> > +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> > +             && INTEGRAL_TYPE_P (rhs2_type))
> > +            || (VECTOR_TYPE_P (rhs1_type)
> > +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> > +       {
> > +         error ("not allowed type combination in BIT_FIELD_INSERT");
> > +         debug_generic_expr (rhs1_type);
> > +         debug_generic_expr (rhs2_type);
> > +         return true;
> > +       }
> > +       if (! tree_fits_uhwi_p (rhs3)
> > +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> > +       {
> > +         error ("invalid position or size in BIT_FIELD_INSERT");
> > +         return true;
> > +       }
> > +       if (INTEGRAL_TYPE_P (rhs1_type))
> > +       {
> > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> > +             || (bitpos + TYPE_PRECISION (rhs2_type)
> > +                 > TYPE_PRECISION (rhs1_type)))
> > +           {
> > +             error ("insertion out of range in BIT_FIELD_INSERT");
> > +             return true;
> > +           }
> > +       }
> > +       else if (VECTOR_TYPE_P (rhs1_type))
> > +       {
> > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> > +         if (bitpos % bitsize != 0)
> > +           {
> > +             error ("vector insertion not at element boundary");
> > +             return true;
> > +           }
> > +       }
> > +       return false;
> > +
> >       case DOT_PROD_EXPR:
> >       case REALIGN_LOAD_EXPR:
> >         /* FIXME.  */
> > Index: trunk/gcc/tree-ssa.c
> > ===================================================================
> > *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> > --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> > *************** non_rewritable_lvalue_p (tree lhs)
> > *** 1318,1323 ****
> > --- 1318,1335 ----
> >         return false;
> >       }
> >
> > +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> > +      BIT_FIELD_INSERT.  */
> > +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> > +       && DECL_P (TREE_OPERAND (lhs, 0))
> > +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > +       /* && bitsize % element-size == 0 */
> > +       && types_compatible_p (TREE_TYPE (lhs),
> > +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> > +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> > +     return false;
> > +
> >     return true;
> >   }
> >
> > *************** execute_update_addresses_taken (void)
> > *** 1536,1541 ****
> > --- 1548,1576 ----
> >                     stmt = gsi_stmt (gsi);
> >                     unlink_stmt_vdef (stmt);
> >                     update_stmt (stmt);
> > +                   continue;
> > +                 }
> > +
> > +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> > +                  into a BIT_FIELD_INSERT.  */
> > +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> > +                   && DECL_P (TREE_OPERAND (lhs, 0))
> > +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > +                   && types_compatible_p (TREE_TYPE (lhs),
> > +                                          TREE_TYPE (TREE_TYPE
> > +                                                      (TREE_OPERAND (lhs, 0))))
> > +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> > +                 {
> > +                   tree var = TREE_OPERAND (lhs, 0);
> > +                   tree val = gimple_assign_rhs1 (stmt);
> > +                   tree bitpos = TREE_OPERAND (lhs, 2);
> > +                   gimple_assign_set_lhs (stmt, var);
> > +                   gimple_assign_set_rhs_with_ops
> > +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> > +                   stmt = gsi_stmt (gsi);
> > +                   unlink_stmt_vdef (stmt);
> > +                   update_stmt (stmt);
> >                     continue;
> >                   }
> >
> > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> > ===================================================================
> > *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> > ***************
> > *** 0 ****
> > --- 1,34 ----
> > + /* { dg-do compile } */
> > + /* { dg-options "-O -fdump-tree-ccp1" } */
> > +
> > + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> > +
> > + v4si test1 (v4si v, int i)
> > + {
> > +   ((int *)&v)[0] = i;
> > +   return v;
> > + }
> > +
> > + v4si test2 (v4si v, int i)
> > + {
> > +   int *p = (int *)&v;
> > +   *p = i;
> > +   return v;
> > + }
> > +
> > + v4si test3 (v4si v, int i)
> > + {
> > +   ((int *)&v)[3] = i;
> > +   return v;
> > + }
> > +
> > + v4si test4 (v4si v, int i)
> > + {
> > +   int *p = (int *)&v;
> > +   p += 3;
> > +   *p = i;
> > +   return v;
> > + }
> > +
> > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2018-11-15  8:29   ` Richard Biener
@ 2018-11-15  8:31     ` Richard Biener
  2019-12-17  2:41       ` Andrew Pinski
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2018-11-15  8:31 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: GCC Patches

On Thu, 15 Nov 2018, Richard Biener wrote:

> On Wed, 14 Nov 2018, Andrew Pinski wrote:
> 
> > On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
> > >
> > >
> > > The following patch adds BIT_FIELD_INSERT, an operation to
> > > facilitate doing bitfield inserts on registers (as opposed
> > > to currently where we'd have a BIT_FIELD_REF store).
> > >
> > > Originally this was developed as part of bitfield lowering
> > > where bitfield stores were lowered into read-modify-write
> > > cycles and the modify part, instead of doing shifting and masking,
> > > was kept in a more high-level form to ease combining them.
> > >
> > > A second use case (the above is still valid) is vector element
> > > inserts which we currently can only do via memory or
> > > by extracting all components and re-building the vector using
> > > a CONSTRUCTOR.  For this second use case I added code
> > > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > > re-write a decl into SSA form (the testcase shows we miss
> > > a similar opportunity with the MEM_REF form of a vector insert,
> > > I plan to fix that for the final submission).
> > >
> > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > > is that the size of the insertion is given implicitly via the
> > > type size/precision of the value to insert.  That avoids
> > > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > >
> > > Richard.
> > >
> > > 2011-06-16  Richard Guenther  <rguenther@suse.de>
> > >
> > >         PR tree-optimization/29756
> > >         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > >         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > >         * fold-const.c (operand_equal_p): Likewise.
> > >         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > >         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > >         * tree-inline.c (estimate_operator_cost): Likewise.
> > >         * tree-pretty-print.c (dump_generic_node): Likewise.
> > >         * tree-ssa-operands.c (get_expr_operands): Likewise.
> > >         * cfgexpand.c (expand_debug_expr): Likewise.
> > >         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > >         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > >         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> > >
> > >         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > >         vector inserts using BIT_FIELD_REF on the lhs.
> > >         (execute_update_addresses_taken): Do it.
> > >
> > >         * gcc.dg/tree-ssa/vector-6.c: New testcase.
> > >
> > > Index: trunk/gcc/expr.c
> > > ===================================================================
> > > *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> > > *************** expand_expr_real_2 (sepops ops, rtx targ
> > > *** 9358,9363 ****
> > > --- 9358,9380 ----
> > >         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> > >         return target;
> > >
> > > +     case BIT_FIELD_INSERT:
> > > +       {
> > > +       unsigned bitpos = tree_to_uhwi (treeop2);
> > > +       unsigned bitsize;
> > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > > +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > > +       else
> > > +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > > +       rtx op0 = expand_normal (treeop0);
> > > +       rtx op1 = expand_normal (treeop1);
> > > +       rtx dst = gen_reg_rtx (mode);
> > > +       emit_move_insn (dst, op0);
> > > +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> > > +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > > +       return dst;
> > > +       }
> > > +
> > >       default:
> > >         gcc_unreachable ();
> > >       }
> > > Index: trunk/gcc/fold-const.c
> > > ===================================================================
> > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> > > *************** operand_equal_p (const_tree arg0, const_
> > > *** 3163,3168 ****
> > > --- 3163,3169 ----
> > >
> > >         case VEC_COND_EXPR:
> > >         case DOT_PROD_EXPR:
> > > +       case BIT_FIELD_INSERT:
> > >           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> > >
> > >         default:
> > > *************** fold_ternary_loc (location_t loc, enum t
> > > *** 11870,11875 ****
> > > --- 11871,11916 ----
> > >         }
> > >         return NULL_TREE;
> > >
> > > +     case BIT_FIELD_INSERT:
> > > +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > > +       if (TREE_CODE (arg0) == INTEGER_CST
> > > +         && TREE_CODE (arg1) == INTEGER_CST)
> > > +       {
> > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > > +         wide_int tem = wi::bit_and (arg0,
> > > +                                     wi::shifted_mask (bitpos, bitsize, true,
> > > +                                                       TYPE_PRECISION (type)));
> > > +         wide_int tem2
> > > +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> > > +                                   bitsize), bitpos);
> > > +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> > > +       }
> > 
> > This seems incorrect for the BYTES_BIG_ENDIAN case, as far as I
> > can tell.  With BYTES_BIG_ENDIAN, the bit position counts from the most
> > significant bit rather than the least significant.
> 
> You mean the bitpos operand of BIT_FIELD_INSERT works in a different way?
> I see the BIT_FIELD_REF folding uses native_encode/interpret but only
> handles byte-aligned references.  I suppose the BIT_INSERT_EXPR case
> (you are following up an old patch) could do the same.
> 
> >  Sorry I am bringing
> > this up after this has been in the tree for a long time but I finally
> > got around to testing my bit-field lowering on a few big-endian
> > targets and ran into this issue.
> 
> So - can you fix it please?  Also note that the VECTOR_CST case
> (as in BIT_FIELD_REF) seems to be inconsistent here and counts
> "bits" in a different way?

And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
in generic.texi, together with those "details".

Richard.

> Thanks,
> Richard.
> 
> > Thanks,
> > Andrew Pinski
> > 
> > > +       else if (TREE_CODE (arg0) == VECTOR_CST
> > > +              && CONSTANT_CLASS_P (arg1)
> > > +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> > > +                                     TREE_TYPE (arg1)))
> > > +       {
> > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > +         unsigned HOST_WIDE_INT elsize
> > > +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> > > +         if (bitpos % elsize == 0)
> > > +           {
> > > +             unsigned k = bitpos / elsize;
> > > +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> > > +               return arg0;
> > > +             else
> > > +               {
> > > +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> > > +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> > > +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> > > +                 elts[k] = arg1;
> > > +                 return build_vector (type, elts);
> > > +               }
> > > +           }
> > > +       }
> > > +       return NULL_TREE;
> > > +
> > >       default:
> > >         return NULL_TREE;
> > >       } /* switch (code) */
> > > Index: trunk/gcc/gimplify.c
> > > ===================================================================
> > > *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> > > *************** gimplify_expr (tree *expr_p, gimple_seq
> > > *** 10936,10941 ****
> > > --- 10936,10945 ----
> > >           /* Classified as tcc_expression.  */
> > >           goto expr_3;
> > >
> > > +       case BIT_FIELD_INSERT:
> > > +         /* Argument 3 is a constant.  */
> > > +         goto expr_2;
> > > +
> > >         case POINTER_PLUS_EXPR:
> > >           {
> > >             enum gimplify_status r0, r1;
> > > Index: trunk/gcc/tree-inline.c
> > > ===================================================================
> > > *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> > > *************** estimate_operator_cost (enum tree_code c
> > > *** 3941,3946 ****
> > > --- 3941,3950 ----
> > >           return weights->div_mod_cost;
> > >         return 1;
> > >
> > > +     /* Bit-field insertion needs several shift and mask operations.  */
> > > +     case BIT_FIELD_INSERT:
> > > +       return 3;
> > > +
> > >       default:
> > >         /* We expect a copy assignment with no operator.  */
> > >         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> > > Index: trunk/gcc/tree-pretty-print.c
> > > ===================================================================
> > > *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> > > *************** dump_generic_node (pretty_printer *pp, t
> > > *** 1876,1881 ****
> > > --- 1876,1898 ----
> > >         pp_greater (pp);
> > >         break;
> > >
> > > +     case BIT_FIELD_INSERT:
> > > +       pp_string (pp, "BIT_FIELD_INSERT <");
> > > +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> > > +       pp_string (pp, ", ");
> > > +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> > > +       pp_string (pp, ", ");
> > > +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> > > +       pp_string (pp, " (");
> > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> > > +       pp_decimal_int (pp,
> > > +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> > > +       else
> > > +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> > > +                          spc, flags, false);
> > > +       pp_string (pp, " bits)>");
> > > +       break;
> > > +
> > >       case ARRAY_REF:
> > >       case ARRAY_RANGE_REF:
> > >         op0 = TREE_OPERAND (node, 0);
> > > Index: trunk/gcc/tree-ssa-operands.c
> > > ===================================================================
> > > *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> > > --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> > > *************** get_expr_operands (struct function *fn,
> > > *** 833,838 ****
> > > --- 833,839 ----
> > >         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> > >         return;
> > >
> > > +     case BIT_FIELD_INSERT:
> > >       case COMPOUND_EXPR:
> > >       case OBJ_TYPE_REF:
> > >       case ASSERT_EXPR:
> > > Index: trunk/gcc/tree.def
> > > ===================================================================
> > > *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> > > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > > *** 852,857 ****
> > > --- 852,868 ----
> > >      descriptor of type ptr_mode.  */
> > >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> > >
> > > + /* Given a word, a value and a bitfield position within the word,
> > > +    produce the value that results if replacing the
> > > +    described parts of word with value.
> > > +    Operand 0 is a tree for the word of integral type;
> > > +    Operand 1 is a tree for the value of integral type;
> > > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > > +    The number of bits replaced is given by the precision of the value
> > > +    type if that is integral or by its size if it is non-integral.
> > > +    The replaced bits shall be fully inside the word.  */
> > > + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> > > +
> > >   /* Given two real or integer operands of the same type,
> > >      returns a complex value of the corresponding complex type.  */
> > >   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> > > Index: trunk/gcc/cfgexpand.c
> > > ===================================================================
> > > *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> > > --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> > > *************** expand_debug_expr (tree exp)
> > > *** 5025,5030 ****
> > > --- 5025,5031 ----
> > >       case FIXED_CONVERT_EXPR:
> > >       case OBJ_TYPE_REF:
> > >       case WITH_SIZE_EXPR:
> > > +     case BIT_FIELD_INSERT:
> > >         return NULL;
> > >
> > >       case DOT_PROD_EXPR:
> > > Index: trunk/gcc/gimple-pretty-print.c
> > > ===================================================================
> > > *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> > > --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> > > *************** dump_ternary_rhs (pretty_printer *buffer
> > > *** 479,484 ****
> > > --- 479,502 ----
> > >         pp_greater (buffer);
> > >         break;
> > >
> > > +     case BIT_FIELD_INSERT:
> > > +       pp_string (buffer, "BIT_FIELD_INSERT <");
> > > +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> > > +       pp_string (buffer, ", ");
> > > +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> > > +       pp_string (buffer, ", ");
> > > +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> > > +       pp_string (buffer, " (");
> > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> > > +       pp_decimal_int (buffer,
> > > +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> > > +       else
> > > +       dump_generic_node (buffer,
> > > +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> > > +                          spc, flags, false);
> > > +       pp_string (buffer, " bits)>");
> > > +       break;
> > > +
> > >       default:
> > >         gcc_unreachable ();
> > >       }
> > > Index: trunk/gcc/gimple.c
> > > ===================================================================
> > > *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> > > *************** get_gimple_rhs_num_ops (enum tree_code c
> > > *** 2044,2049 ****
> > > --- 2044,2050 ----
> > >         || (SYM) == REALIGN_LOAD_EXPR                                       \
> > >         || (SYM) == VEC_COND_EXPR                                                   \
> > >         || (SYM) == VEC_PERM_EXPR                                             \
> > > +       || (SYM) == BIT_FIELD_INSERT                                        \
> > >         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
> > >      : ((SYM) == CONSTRUCTOR                                                \
> > >         || (SYM) == OBJ_TYPE_REF                                                    \
> > > Index: trunk/gcc/tree-cfg.c
> > > ===================================================================
> > > *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> > > --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> > > *************** verify_gimple_assign_ternary (gassign *s
> > > *** 4155,4160 ****
> > > --- 4155,4207 ----
> > >
> > >         return false;
> > >
> > > +     case BIT_FIELD_INSERT:
> > > +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> > > +       {
> > > +         error ("type mismatch in BIT_FIELD_INSERT");
> > > +         debug_generic_expr (lhs_type);
> > > +         debug_generic_expr (rhs1_type);
> > > +         return true;
> > > +       }
> > > +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> > > +             && INTEGRAL_TYPE_P (rhs2_type))
> > > +            || (VECTOR_TYPE_P (rhs1_type)
> > > +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> > > +       {
> > > +         error ("not allowed type combination in BIT_FIELD_INSERT");
> > > +         debug_generic_expr (rhs1_type);
> > > +         debug_generic_expr (rhs2_type);
> > > +         return true;
> > > +       }
> > > +       if (! tree_fits_uhwi_p (rhs3)
> > > +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> > > +       {
> > > +         error ("invalid position or size in BIT_FIELD_INSERT");
> > > +         return true;
> > > +       }
> > > +       if (INTEGRAL_TYPE_P (rhs1_type))
> > > +       {
> > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> > > +             || (bitpos + TYPE_PRECISION (rhs2_type)
> > > +                 > TYPE_PRECISION (rhs1_type)))
> > > +           {
> > > +             error ("insertion out of range in BIT_FIELD_INSERT");
> > > +             return true;
> > > +           }
> > > +       }
> > > +       else if (VECTOR_TYPE_P (rhs1_type))
> > > +       {
> > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> > > +         if (bitpos % bitsize != 0)
> > > +           {
> > > +             error ("vector insertion not at element boundary");
> > > +             return true;
> > > +           }
> > > +       }
> > > +       return false;
> > > +
> > >       case DOT_PROD_EXPR:
> > >       case REALIGN_LOAD_EXPR:
> > >         /* FIXME.  */
> > > Index: trunk/gcc/tree-ssa.c
> > > ===================================================================
> > > *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> > > --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> > > *************** non_rewritable_lvalue_p (tree lhs)
> > > *** 1318,1323 ****
> > > --- 1318,1335 ----
> > >         return false;
> > >       }
> > >
> > > +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> > > +      BIT_FIELD_INSERT.  */
> > > +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > +       && DECL_P (TREE_OPERAND (lhs, 0))
> > > +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > +       /* && bitsize % element-size == 0 */
> > > +       && types_compatible_p (TREE_TYPE (lhs),
> > > +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> > > +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> > > +     return false;
> > > +
> > >     return true;
> > >   }
> > >
> > > *************** execute_update_addresses_taken (void)
> > > *** 1536,1541 ****
> > > --- 1548,1576 ----
> > >                     stmt = gsi_stmt (gsi);
> > >                     unlink_stmt_vdef (stmt);
> > >                     update_stmt (stmt);
> > > +                   continue;
> > > +                 }
> > > +
> > > +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> > > +                  into a BIT_FIELD_INSERT.  */
> > > +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > +                   && DECL_P (TREE_OPERAND (lhs, 0))
> > > +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > +                   && types_compatible_p (TREE_TYPE (lhs),
> > > +                                          TREE_TYPE (TREE_TYPE
> > > +                                                      (TREE_OPERAND (lhs, 0))))
> > > +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> > > +                 {
> > > +                   tree var = TREE_OPERAND (lhs, 0);
> > > +                   tree val = gimple_assign_rhs1 (stmt);
> > > +                   tree bitpos = TREE_OPERAND (lhs, 2);
> > > +                   gimple_assign_set_lhs (stmt, var);
> > > +                   gimple_assign_set_rhs_with_ops
> > > +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> > > +                   stmt = gsi_stmt (gsi);
> > > +                   unlink_stmt_vdef (stmt);
> > > +                   update_stmt (stmt);
> > >                     continue;
> > >                   }
> > >
> > > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> > > ===================================================================
> > > *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> > > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> > > ***************
> > > *** 0 ****
> > > --- 1,34 ----
> > > + /* { dg-do compile } */
> > > + /* { dg-options "-O -fdump-tree-ccp1" } */
> > > +
> > > + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> > > +
> > > + v4si test1 (v4si v, int i)
> > > + {
> > > +   ((int *)&v)[0] = i;
> > > +   return v;
> > > + }
> > > +
> > > + v4si test2 (v4si v, int i)
> > > + {
> > > +   int *p = (int *)&v;
> > > +   *p = i;
> > > +   return v;
> > > + }
> > > +
> > > + v4si test3 (v4si v, int i)
> > > + {
> > > +   ((int *)&v)[3] = i;
> > > +   return v;
> > > + }
> > > +
> > > + v4si test4 (v4si v, int i)
> > > + {
> > > +   int *p = (int *)&v;
> > > +   p += 3;
> > > +   *p = i;
> > > +   return v;
> > > + }
> > > +
> > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> > 
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2018-11-15  8:31     ` Richard Biener
@ 2019-12-17  2:41       ` Andrew Pinski
  2019-12-17  3:25         ` Andrew Pinski
  2020-01-07  7:37         ` Richard Biener
  0 siblings, 2 replies; 32+ messages in thread
From: Andrew Pinski @ 2019-12-17  2:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
>
> On Thu, 15 Nov 2018, Richard Biener wrote:
>
> > On Wed, 14 Nov 2018, Andrew Pinski wrote:
> >
> > > On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > >
> > > > The following patch adds BIT_FIELD_INSERT, an operation to
> > > > facilitate doing bitfield inserts on registers (as opposed
> > > > to currently where we'd have a BIT_FIELD_REF store).
> > > >
> > > > Originally this was developed as part of bitfield lowering
> > > > where bitfield stores were lowered into read-modify-write
> > > > cycles and the modify part, instead of doing shifting and masking,
> > > > be kept in a more high-level form to ease combining them.
> > > >
> > > > A second use case (the above is still valid) is vector element
> > > > inserts which we currently can only do via memory or
> > > > by extracting all components and re-building the vector using
> > > > a CONSTRUCTOR.  For this second use case I added code
> > > > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > > > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > > > re-write a decl into SSA form (the testcase shows we miss
> > > > a similar opportunity with the MEM_REF form of a vector insert,
> > > > I plan to fix that for the final submission).
> > > >
> > > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > > > is that the size of the insertion is given implicitely via the
> > > > type size/precision of the value to insert.  That avoids
> > > > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> > > >
> > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > >
> > > > Richard.
> > > >
> > > > 2011-06-16  Richard Guenther  <rguenther@suse.de>
> > > >
> > > >         PR tree-optimization/29756
> > > >         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > > >         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > > >         * fold-const.c (operand_equal_p): Likewise.
> > > >         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > > >         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > > >         * tree-inline.c (estimate_operator_cost): Likewise.
> > > >         * tree-pretty-print.c (dump_generic_node): Likewise.
> > > >         * tree-ssa-operands.c (get_expr_operands): Likewise.
> > > >         * cfgexpand.c (expand_debug_expr): Likewise.
> > > >         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > > >         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > > >         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> > > >
> > > >         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > > >         vector inserts using BIT_FIELD_REF on the lhs.
> > > >         (execute_update_addresses_taken): Do it.
> > > >
> > > >         * gcc.dg/tree-ssa/vector-6.c: New testcase.
> > > >
> > > > Index: trunk/gcc/expr.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> > > > --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> > > > *************** expand_expr_real_2 (sepops ops, rtx targ
> > > > *** 9358,9363 ****
> > > > --- 9358,9380 ----
> > > >         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> > > >         return target;
> > > >
> > > > +     case BIT_FIELD_INSERT:
> > > > +       {
> > > > +       unsigned bitpos = tree_to_uhwi (treeop2);
> > > > +       unsigned bitsize;
> > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > > > +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > > > +       else
> > > > +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > > > +       rtx op0 = expand_normal (treeop0);
> > > > +       rtx op1 = expand_normal (treeop1);
> > > > +       rtx dst = gen_reg_rtx (mode);
> > > > +       emit_move_insn (dst, op0);
> > > > +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> > > > +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > > > +       return dst;
> > > > +       }
> > > > +
> > > >       default:
> > > >         gcc_unreachable ();
> > > >       }
> > > > Index: trunk/gcc/fold-const.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > > > --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> > > > *************** operand_equal_p (const_tree arg0, const_
> > > > *** 3163,3168 ****
> > > > --- 3163,3169 ----
> > > >
> > > >         case VEC_COND_EXPR:
> > > >         case DOT_PROD_EXPR:
> > > > +       case BIT_FIELD_INSERT:
> > > >           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> > > >
> > > >         default:
> > > > *************** fold_ternary_loc (location_t loc, enum t
> > > > *** 11870,11875 ****
> > > > --- 11871,11916 ----
> > > >         }
> > > >         return NULL_TREE;
> > > >
> > > > +     case BIT_FIELD_INSERT:
> > > > +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > > > +       if (TREE_CODE (arg0) == INTEGER_CST
> > > > +         && TREE_CODE (arg1) == INTEGER_CST)
> > > > +       {
> > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > > > +         wide_int tem = wi::bit_and (arg0,
> > > > +                                     wi::shifted_mask (bitpos, bitsize, true,
> > > > +                                                       TYPE_PRECISION (type)));
> > > > +         wide_int tem2
> > > > +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> > > > +                                   bitsize), bitpos);
> > > > +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> > > > +       }
> > >
> > > This seems incorrect for the BYTES_BIG_ENDIAN case as far as I
> > > can tell.  With BYTES_BIG_ENDIAN, the bit position counts from the most
> > > significant bit rather than the least significant.
> >
> > You mean the bitpos operand of BIT_FIELD_INSERT works in a different way?
> > I see the BIT_FIELD_REF folding uses native_encode/interpret but only
> > handles byte-aligned references.  I suppose the BIT_INSERT_EXPR case
> > (you are following up an old patch) could do the same.
> >
> > > >  Sorry I am bringing
> > > this up after this has been in the tree for a long time but I finally
> > > got around to testing my bit-field lowering on a few big-endian
> > > targets and ran into this issue.
> >
> > So - can you fix it please?  Also note that the VECTOR_CST case
> > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
> > "bits" in a different way?
>
> And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
> in generic.texi, together with those "details".

This is the fix:
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 8e9e299..a919b63 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
        {
          unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
          unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
+         if (BYTES_BIG_ENDIAN)
+           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
          wide_int tem = (wi::to_wide (arg0)
                          & wi::shifted_mask (bitpos, bitsize, true,
                                              TYPE_PRECISION (type)));

---- CUT ----
I will run a full test shortly, together with the other patch I attached
to the related bugzilla.
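
For illustration only, the position remapping in the fix can be sketched
as standalone C outside of GCC.  The names insert_bits and shifted_mask,
the uint64_t representation and the big_endian flag are assumptions of
this sketch; the real folding code works on wide_int and tests
BYTES_BIG_ENDIAN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: a mask of BITSIZE one bits whose
   lowest set bit is at position START, counted from the least
   significant bit.  */
static uint64_t
shifted_mask (unsigned start, unsigned bitsize)
{
  uint64_t ones
    = bitsize >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << bitsize) - 1;
  return ones << start;
}

/* Hypothetical sketch of the integer constant fold: replace BITSIZE bits
   of WORD, which is PRECISION bits wide, with the low BITSIZE bits of
   VALUE.  BITPOS counts from the least significant bit, unless
   BIG_ENDIAN is nonzero, in which case it counts from the most
   significant bit and is remapped first, as in the fix.  */
static uint64_t
insert_bits (uint64_t word, uint64_t value, unsigned bitpos,
             unsigned bitsize, unsigned precision, int big_endian)
{
  /* The replaced bits must lie fully inside the word.  */
  assert (precision <= 64 && bitsize > 0 && bitpos + bitsize <= precision);

  if (big_endian)
    bitpos = precision - bitpos - bitsize;

  uint64_t mask = shifted_mask (bitpos, bitsize);
  /* Clear the field in WORD, then OR in the shifted VALUE.  */
  return (word & ~mask) | ((value << bitpos) & mask);
}
```

With a 32-bit word of all ones, inserting an 8-bit zero at position 0
clears the low byte under little-endian numbering and the high byte
under big-endian numbering, which is the difference the two added lines
account for.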

Thanks,
Andrew

>
> Richard.
>
> > Thanks,
> > Richard.
> >
> > > Thanks,
> > > Andrew Pinski
> > >
> > > > +       else if (TREE_CODE (arg0) == VECTOR_CST
> > > > +              && CONSTANT_CLASS_P (arg1)
> > > > +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> > > > +                                     TREE_TYPE (arg1)))
> > > > +       {
> > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > +         unsigned HOST_WIDE_INT elsize
> > > > +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> > > > +         if (bitpos % elsize == 0)
> > > > +           {
> > > > +             unsigned k = bitpos / elsize;
> > > > +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> > > > +               return arg0;
> > > > +             else
> > > > +               {
> > > > +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> > > > +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> > > > +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> > > > +                 elts[k] = arg1;
> > > > +                 return build_vector (type, elts);
> > > > +               }
> > > > +           }
> > > > +       }
> > > > +       return NULL_TREE;
> > > > +
> > > >       default:
> > > >         return NULL_TREE;
> > > >       } /* switch (code) */
> > > > Index: trunk/gcc/gimplify.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> > > > --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> > > > *************** gimplify_expr (tree *expr_p, gimple_seq
> > > > *** 10936,10941 ****
> > > > --- 10936,10945 ----
> > > >           /* Classified as tcc_expression.  */
> > > >           goto expr_3;
> > > >
> > > > +       case BIT_FIELD_INSERT:
> > > > +         /* Argument 3 is a constant.  */
> > > > +         goto expr_2;
> > > > +
> > > >         case POINTER_PLUS_EXPR:
> > > >           {
> > > >             enum gimplify_status r0, r1;
> > > > Index: trunk/gcc/tree-inline.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> > > > --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> > > > *************** estimate_operator_cost (enum tree_code c
> > > > *** 3941,3946 ****
> > > > --- 3941,3950 ----
> > > >           return weights->div_mod_cost;
> > > >         return 1;
> > > >
> > > > +     /* Bit-field insertion needs several shift and mask operations.  */
> > > > +     case BIT_FIELD_INSERT:
> > > > +       return 3;
> > > > +
> > > >       default:
> > > >         /* We expect a copy assignment with no operator.  */
> > > >         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> > > > Index: trunk/gcc/tree-pretty-print.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> > > > --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> > > > *************** dump_generic_node (pretty_printer *pp, t
> > > > *** 1876,1881 ****
> > > > --- 1876,1898 ----
> > > >         pp_greater (pp);
> > > >         break;
> > > >
> > > > +     case BIT_FIELD_INSERT:
> > > > +       pp_string (pp, "BIT_FIELD_INSERT <");
> > > > +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> > > > +       pp_string (pp, ", ");
> > > > +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> > > > +       pp_string (pp, ", ");
> > > > +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> > > > +       pp_string (pp, " (");
> > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> > > > +       pp_decimal_int (pp,
> > > > +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> > > > +       else
> > > > +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> > > > +                          spc, flags, false);
> > > > +       pp_string (pp, " bits)>");
> > > > +       break;
> > > > +
> > > >       case ARRAY_REF:
> > > >       case ARRAY_RANGE_REF:
> > > >         op0 = TREE_OPERAND (node, 0);
> > > > Index: trunk/gcc/tree-ssa-operands.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> > > > --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> > > > *************** get_expr_operands (struct function *fn,
> > > > *** 833,838 ****
> > > > --- 833,839 ----
> > > >         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> > > >         return;
> > > >
> > > > +     case BIT_FIELD_INSERT:
> > > >       case COMPOUND_EXPR:
> > > >       case OBJ_TYPE_REF:
> > > >       case ASSERT_EXPR:
> > > > Index: trunk/gcc/tree.def
> > > > ===================================================================
> > > > *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> > > > --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> > > > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > > > *** 852,857 ****
> > > > --- 852,868 ----
> > > >      descriptor of type ptr_mode.  */
> > > >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> > > >
> > > > + /* Given a word, a value and a bitfield position within the word,
> > > > +    produce the value that results if replacing the
> > > > +    described parts of word with value.
> > > > +    Operand 0 is a tree for the word of integral type;
> > > > +    Operand 1 is a tree for the value of integral type;
> > > > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > > > +    The number of bits replaced is given by the precision of the value
> > > > +    type if that is integral or by its size if it is non-integral.
> > > > +    The replaced bits shall be fully inside the word.  */
> > > > + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> > > > +
> > > >   /* Given two real or integer operands of the same type,
> > > >      returns a complex value of the corresponding complex type.  */
> > > >   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> > > > Index: trunk/gcc/cfgexpand.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> > > > --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> > > > *************** expand_debug_expr (tree exp)
> > > > *** 5025,5030 ****
> > > > --- 5025,5031 ----
> > > >       case FIXED_CONVERT_EXPR:
> > > >       case OBJ_TYPE_REF:
> > > >       case WITH_SIZE_EXPR:
> > > > +     case BIT_FIELD_INSERT:
> > > >         return NULL;
> > > >
> > > >       case DOT_PROD_EXPR:
> > > > Index: trunk/gcc/gimple-pretty-print.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> > > > --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> > > > *************** dump_ternary_rhs (pretty_printer *buffer
> > > > *** 479,484 ****
> > > > --- 479,502 ----
> > > >         pp_greater (buffer);
> > > >         break;
> > > >
> > > > +     case BIT_FIELD_INSERT:
> > > > +       pp_string (buffer, "BIT_FIELD_INSERT <");
> > > > +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> > > > +       pp_string (buffer, ", ");
> > > > +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> > > > +       pp_string (buffer, ", ");
> > > > +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> > > > +       pp_string (buffer, " (");
> > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> > > > +       pp_decimal_int (buffer,
> > > > +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> > > > +       else
> > > > +       dump_generic_node (buffer,
> > > > +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> > > > +                          spc, flags, false);
> > > > +       pp_string (buffer, " bits)>");
> > > > +       break;
> > > > +
> > > >       default:
> > > >         gcc_unreachable ();
> > > >       }
> > > > Index: trunk/gcc/gimple.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> > > > --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> > > > *************** get_gimple_rhs_num_ops (enum tree_code c
> > > > *** 2044,2049 ****
> > > > --- 2044,2050 ----
> > > >         || (SYM) == REALIGN_LOAD_EXPR                                       \
> > > >         || (SYM) == VEC_COND_EXPR                                                   \
> > > >         || (SYM) == VEC_PERM_EXPR                                             \
> > > > +       || (SYM) == BIT_FIELD_INSERT                                        \
> > > >         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
> > > >      : ((SYM) == CONSTRUCTOR                                                \
> > > >         || (SYM) == OBJ_TYPE_REF                                                    \
> > > > Index: trunk/gcc/tree-cfg.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> > > > --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> > > > *************** verify_gimple_assign_ternary (gassign *s
> > > > *** 4155,4160 ****
> > > > --- 4155,4207 ----
> > > >
> > > >         return false;
> > > >
> > > > +     case BIT_FIELD_INSERT:
> > > > +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> > > > +       {
> > > > +         error ("type mismatch in BIT_FIELD_INSERT");
> > > > +         debug_generic_expr (lhs_type);
> > > > +         debug_generic_expr (rhs1_type);
> > > > +         return true;
> > > > +       }
> > > > +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> > > > +             && INTEGRAL_TYPE_P (rhs2_type))
> > > > +            || (VECTOR_TYPE_P (rhs1_type)
> > > > +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> > > > +       {
> > > > +         error ("not allowed type combination in BIT_FIELD_INSERT");
> > > > +         debug_generic_expr (rhs1_type);
> > > > +         debug_generic_expr (rhs2_type);
> > > > +         return true;
> > > > +       }
> > > > +       if (! tree_fits_uhwi_p (rhs3)
> > > > +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> > > > +       {
> > > > +         error ("invalid position or size in BIT_FIELD_INSERT");
> > > > +         return true;
> > > > +       }
> > > > +       if (INTEGRAL_TYPE_P (rhs1_type))
> > > > +       {
> > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> > > > +             || (bitpos + TYPE_PRECISION (rhs2_type)
> > > > +                 > TYPE_PRECISION (rhs1_type)))
> > > > +           {
> > > > +             error ("insertion out of range in BIT_FIELD_INSERT");
> > > > +             return true;
> > > > +           }
> > > > +       }
> > > > +       else if (VECTOR_TYPE_P (rhs1_type))
> > > > +       {
> > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> > > > +         if (bitpos % bitsize != 0)
> > > > +           {
> > > > +             error ("vector insertion not at element boundary");
> > > > +             return true;
> > > > +           }
> > > > +       }
> > > > +       return false;
> > > > +
> > > >       case DOT_PROD_EXPR:
> > > >       case REALIGN_LOAD_EXPR:
> > > >         /* FIXME.  */
> > > > Index: trunk/gcc/tree-ssa.c
> > > > ===================================================================
> > > > *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> > > > --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> > > > *************** non_rewritable_lvalue_p (tree lhs)
> > > > *** 1318,1323 ****
> > > > --- 1318,1335 ----
> > > >         return false;
> > > >       }
> > > >
> > > > +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> > > > +      BIT_FIELD_INSERT.  */
> > > > +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > +       && DECL_P (TREE_OPERAND (lhs, 0))
> > > > +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > +       /* && bitsize % element-size == 0 */
> > > > +       && types_compatible_p (TREE_TYPE (lhs),
> > > > +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> > > > +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> > > > +     return false;
> > > > +
> > > >     return true;
> > > >   }
> > > >
> > > > *************** execute_update_addresses_taken (void)
> > > > *** 1536,1541 ****
> > > > --- 1548,1576 ----
> > > >                     stmt = gsi_stmt (gsi);
> > > >                     unlink_stmt_vdef (stmt);
> > > >                     update_stmt (stmt);
> > > > +                   continue;
> > > > +                 }
> > > > +
> > > > +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> > > > +                  into a BIT_FIELD_INSERT.  */
> > > > +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > +                   && DECL_P (TREE_OPERAND (lhs, 0))
> > > > +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > +                   && types_compatible_p (TREE_TYPE (lhs),
> > > > +                                          TREE_TYPE (TREE_TYPE
> > > > +                                                      (TREE_OPERAND (lhs, 0))))
> > > > +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> > > > +                 {
> > > > +                   tree var = TREE_OPERAND (lhs, 0);
> > > > +                   tree val = gimple_assign_rhs1 (stmt);
> > > > +                   tree bitpos = TREE_OPERAND (lhs, 2);
> > > > +                   gimple_assign_set_lhs (stmt, var);
> > > > +                   gimple_assign_set_rhs_with_ops
> > > > +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> > > > +                   stmt = gsi_stmt (gsi);
> > > > +                   unlink_stmt_vdef (stmt);
> > > > +                   update_stmt (stmt);
> > > >                     continue;
> > > >                   }
> > > >
> > > > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> > > > ===================================================================
> > > > *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> > > > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> > > > ***************
> > > > *** 0 ****
> > > > --- 1,34 ----
> > > > + /* { dg-do compile } */
> > > > + /* { dg-options "-O -fdump-tree-ccp1" } */
> > > > +
> > > > + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> > > > +
> > > > + v4si test1 (v4si v, int i)
> > > > + {
> > > > +   ((int *)&v)[0] = i;
> > > > +   return v;
> > > > + }
> > > > +
> > > > + v4si test2 (v4si v, int i)
> > > > + {
> > > > +   int *p = (int *)&v;
> > > > +   *p = i;
> > > > +   return v;
> > > > + }
> > > > +
> > > > + v4si test3 (v4si v, int i)
> > > > + {
> > > > +   ((int *)&v)[3] = i;
> > > > +   return v;
> > > > + }
> > > > +
> > > > + v4si test4 (v4si v, int i)
> > > > + {
> > > > +   int *p = (int *)&v;
> > > > +   p += 3;
> > > > +   *p = i;
> > > > +   return v;
> > > > + }
> > > > +
> > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> > >
> > >
> >
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2019-12-17  2:41       ` Andrew Pinski
@ 2019-12-17  3:25         ` Andrew Pinski
  2020-01-07  7:37         ` Richard Biener
  1 sibling, 0 replies; 32+ messages in thread
From: Andrew Pinski @ 2019-12-17  3:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Mon, Dec 16, 2019 at 6:32 PM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
> >
> > On Thu, 15 Nov 2018, Richard Biener wrote:
> >
> > > On Wed, 14 Nov 2018, Andrew Pinski wrote:
> > >
> > > > On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > >
> > > > > The following patch adds BIT_FIELD_INSERT, an operation to
> > > > > facilitate doing bitfield inserts on registers (as opposed
> > > > > to currently where we'd have a BIT_FIELD_REF store).
> > > > >
> > > > > Originally this was developed as part of bitfield lowering
> > > > > where bitfield stores were lowered into read-modify-write
> > > > > cycles and the modify part, instead of doing shifting and masking,
> > > > > be kept in a more high-level form to ease combining them.
> > > > >
> > > > > A second use case (the above is still valid) is vector element
> > > > > inserts which we currently can only do via memory or
> > > > > by extracting all components and re-building the vector using
> > > > > a CONSTRUCTOR.  For this second use case I added code
> > > > > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > > > > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > > > > re-write a decl into SSA form (the testcase shows we miss
> > > > > a similar opportunity with the MEM_REF form of a vector insert,
> > > > > I plan to fix that for the final submission).
> > > > >
> > > > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > > > > > is that the size of the insertion is given implicitly via the
> > > > > type size/precision of the value to insert.  That avoids
> > > > > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> > > > >
> > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > > >
> > > > > Richard.
> > > > >
> > > > > 2011-06-16  Richard Guenther  <rguenther@suse.de>
> > > > >
> > > > >         PR tree-optimization/29756
> > > > >         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > > > >         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > > > >         * fold-const.c (operand_equal_p): Likewise.
> > > > >         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > > > >         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > > > >         * tree-inline.c (estimate_operator_cost): Likewise.
> > > > >         * tree-pretty-print.c (dump_generic_node): Likewise.
> > > > >         * tree-ssa-operands.c (get_expr_operands): Likewise.
> > > > >         * cfgexpand.c (expand_debug_expr): Likewise.
> > > > >         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > > > >         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > > > >         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> > > > >
> > > > >         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > > > >         vector inserts using BIT_FIELD_REF on the lhs.
> > > > >         (execute_update_addresses_taken): Do it.
> > > > >
> > > > >         * gcc.dg/tree-ssa/vector-6.c: New testcase.
> > > > >
> > > > > Index: trunk/gcc/expr.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> > > > > *************** expand_expr_real_2 (sepops ops, rtx targ
> > > > > *** 9358,9363 ****
> > > > > --- 9358,9380 ----
> > > > >         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> > > > >         return target;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       {
> > > > > +       unsigned bitpos = tree_to_uhwi (treeop2);
> > > > > +       unsigned bitsize;
> > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > > > > +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > > > > +       else
> > > > > +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > > > > +       rtx op0 = expand_normal (treeop0);
> > > > > +       rtx op1 = expand_normal (treeop1);
> > > > > +       rtx dst = gen_reg_rtx (mode);
> > > > > +       emit_move_insn (dst, op0);
> > > > > +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> > > > > +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > > > > +       return dst;
> > > > > +       }
> > > > > +
> > > > >       default:
> > > > >         gcc_unreachable ();
> > > > >       }
> > > > > Index: trunk/gcc/fold-const.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> > > > > *************** operand_equal_p (const_tree arg0, const_
> > > > > *** 3163,3168 ****
> > > > > --- 3163,3169 ----
> > > > >
> > > > >         case VEC_COND_EXPR:
> > > > >         case DOT_PROD_EXPR:
> > > > > +       case BIT_FIELD_INSERT:
> > > > >           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> > > > >
> > > > >         default:
> > > > > *************** fold_ternary_loc (location_t loc, enum t
> > > > > *** 11870,11875 ****
> > > > > --- 11871,11916 ----
> > > > >         }
> > > > >         return NULL_TREE;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > > > > +       if (TREE_CODE (arg0) == INTEGER_CST
> > > > > +         && TREE_CODE (arg1) == INTEGER_CST)
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > > > > +         wide_int tem = wi::bit_and (arg0,
> > > > > +                                     wi::shifted_mask (bitpos, bitsize, true,
> > > > > +                                                       TYPE_PRECISION (type)));
> > > > > +         wide_int tem2
> > > > > +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> > > > > +                                   bitsize), bitpos);
> > > > > +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> > > > > +       }
> > > >
> > > > This seems incorrect when BYTES_BIG_ENDIAN is set, as far as I
> > > > can tell.  With BYTES_BIG_ENDIAN, the bit position counts from the most
> > > > significant bit rather than the least significant.
> > >
> > > You mean the bitpos operand of BIT_FIELD_INSERT works in a different way?
> > > I see the BIT_FIELD_REF folding uses native_encode/interpret but only
> > > handles byte-aligned references.  I suppose the BIT_INSERT_EXPR case
> > > (you are following up an old patch) could do the same.
> > >
> > > > >  Sorry I am bringing
> > > > this up after this has been in the tree for a long time but I finally
> > > > got around to testing my bit-field lowering on a few big-endian
> > > > targets and ran into this issue.
> > >
> > > So - can you fix it please?  Also note that the VECTOR_CST case
> > > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
> > > "bits" in a different way?
> >
> > And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
> > in generic.texi, together with those "details".
>
> This is the fix:
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index 8e9e299..a919b63 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
>         {
>           unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
>           unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> +         if (BYTES_BIG_ENDIAN)
> +           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
>           wide_int tem = (wi::to_wide (arg0)
>                           & wi::shifted_mask (bitpos, bitsize, true,
>                                               TYPE_PRECISION (type)));
>
> ---- CUT ----
> I will do a full test in a little bit with the other patch I attached
> to the related Bugzilla entry.

I forgot to mention gcc.c-torture/execute/pr88904.c is the testcase
implementing my bit-field lowering pass.
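
For illustration only, the bit-position adjustment in the fix quoted above
can be sketched in standalone C, outside of GCC.  The helper name
bit_insert, its parameters and the 64-bit limit are assumptions made for
this sketch and are not GCC API; it only mirrors the mask-and-shift folding
for integer constants, with the position flipped when counting from the
most significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: replace BITSIZE bits of WORD,
   which has PREC significant bits, with the low BITSIZE bits of VALUE.
   BITPOS counts from the least significant bit unless BYTES_BIG_ENDIAN
   is nonzero, in which case it counts from the most significant bit and
   is mirrored first, as in the fold-const.c change.  */
static uint64_t
bit_insert (uint64_t word, unsigned prec, uint64_t value,
            unsigned bitsize, unsigned bitpos, int bytes_big_endian)
{
  /* The replaced bits must lie fully inside the word.  */
  assert (prec <= 64 && bitsize >= 1 && bitsize <= prec);
  assert (bitpos <= prec - bitsize);

  if (bytes_big_endian)
    bitpos = prec - bitpos - bitsize;

  /* BITSIZE low bits set; avoid an undefined shift by 64.  */
  uint64_t field = bitsize == 64 ? ~UINT64_C (0)
                                 : (UINT64_C (1) << bitsize) - 1;
  uint64_t mask = field << bitpos;

  /* Clear the destination field, then OR in the truncated value.  */
  return (word & ~mask) | ((value & field) << bitpos);
}
```

With prec = 32, inserting an 8-bit value at position 0 lands in bits 0..7
under little-endian numbering and in bits 24..31 when the big-endian flag
is set.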

Thanks,
Andrew Pinski

>
> Thanks,
> Andrew
>
> >
> > Richard.
> >
> > > Thanks,
> > > Richard.
> > >
> > > > Thanks,
> > > > Andrew Pinski
> > > >
> > > > > +       else if (TREE_CODE (arg0) == VECTOR_CST
> > > > > +              && CONSTANT_CLASS_P (arg1)
> > > > > +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> > > > > +                                     TREE_TYPE (arg1)))
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > +         unsigned HOST_WIDE_INT elsize
> > > > > +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> > > > > +         if (bitpos % elsize == 0)
> > > > > +           {
> > > > > +             unsigned k = bitpos / elsize;
> > > > > +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> > > > > +               return arg0;
> > > > > +             else
> > > > > +               {
> > > > > +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> > > > > +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> > > > > +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> > > > > +                 elts[k] = arg1;
> > > > > +                 return build_vector (type, elts);
> > > > > +               }
> > > > > +           }
> > > > > +       }
> > > > > +       return NULL_TREE;
> > > > > +
> > > > >       default:
> > > > >         return NULL_TREE;
> > > > >       } /* switch (code) */
> > > > > Index: trunk/gcc/gimplify.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> > > > > *************** gimplify_expr (tree *expr_p, gimple_seq
> > > > > *** 10936,10941 ****
> > > > > --- 10936,10945 ----
> > > > >           /* Classified as tcc_expression.  */
> > > > >           goto expr_3;
> > > > >
> > > > > +       case BIT_FIELD_INSERT:
> > > > > +         /* Argument 3 is a constant.  */
> > > > > +         goto expr_2;
> > > > > +
> > > > >         case POINTER_PLUS_EXPR:
> > > > >           {
> > > > >             enum gimplify_status r0, r1;
> > > > > Index: trunk/gcc/tree-inline.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> > > > > *************** estimate_operator_cost (enum tree_code c
> > > > > *** 3941,3946 ****
> > > > > --- 3941,3950 ----
> > > > >           return weights->div_mod_cost;
> > > > >         return 1;
> > > > >
> > > > > +     /* Bit-field insertion needs several shift and mask operations.  */
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       return 3;
> > > > > +
> > > > >       default:
> > > > >         /* We expect a copy assignment with no operator.  */
> > > > >         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> > > > > Index: trunk/gcc/tree-pretty-print.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> > > > > *************** dump_generic_node (pretty_printer *pp, t
> > > > > *** 1876,1881 ****
> > > > > --- 1876,1898 ----
> > > > >         pp_greater (pp);
> > > > >         break;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       pp_string (pp, "BIT_FIELD_INSERT <");
> > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> > > > > +       pp_string (pp, ", ");
> > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> > > > > +       pp_string (pp, ", ");
> > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> > > > > +       pp_string (pp, " (");
> > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> > > > > +       pp_decimal_int (pp,
> > > > > +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> > > > > +       else
> > > > > +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> > > > > +                          spc, flags, false);
> > > > > +       pp_string (pp, " bits)>");
> > > > > +       break;
> > > > > +
> > > > >       case ARRAY_REF:
> > > > >       case ARRAY_RANGE_REF:
> > > > >         op0 = TREE_OPERAND (node, 0);
> > > > > Index: trunk/gcc/tree-ssa-operands.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> > > > > --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> > > > > *************** get_expr_operands (struct function *fn,
> > > > > *** 833,838 ****
> > > > > --- 833,839 ----
> > > > >         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> > > > >         return;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > >       case COMPOUND_EXPR:
> > > > >       case OBJ_TYPE_REF:
> > > > >       case ASSERT_EXPR:
> > > > > Index: trunk/gcc/tree.def
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> > > > > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > > > > *** 852,857 ****
> > > > > --- 852,868 ----
> > > > >      descriptor of type ptr_mode.  */
> > > > >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> > > > >
> > > > > + /* Given a word, a value and a bitfield position within the word,
> > > > > +    produce the value that results if replacing the
> > > > > +    described parts of word with value.
> > > > > +    Operand 0 is a tree for the word of integral type;
> > > > > +    Operand 1 is a tree for the value of integral type;
> > > > > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > > > > +    The number of bits replaced is given by the precision of the value
> > > > > +    type if that is integral or by its size if it is non-integral.
> > > > > +    The replaced bits shall be fully inside the word.  */
> > > > > + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> > > > > +
> > > > >   /* Given two real or integer operands of the same type,
> > > > >      returns a complex value of the corresponding complex type.  */
> > > > >   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> > > > > Index: trunk/gcc/cfgexpand.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> > > > > --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> > > > > *************** expand_debug_expr (tree exp)
> > > > > *** 5025,5030 ****
> > > > > --- 5025,5031 ----
> > > > >       case FIXED_CONVERT_EXPR:
> > > > >       case OBJ_TYPE_REF:
> > > > >       case WITH_SIZE_EXPR:
> > > > > +     case BIT_FIELD_INSERT:
> > > > >         return NULL;
> > > > >
> > > > >       case DOT_PROD_EXPR:
> > > > > Index: trunk/gcc/gimple-pretty-print.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> > > > > --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> > > > > *************** dump_ternary_rhs (pretty_printer *buffer
> > > > > *** 479,484 ****
> > > > > --- 479,502 ----
> > > > >         pp_greater (buffer);
> > > > >         break;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       pp_string (buffer, "BIT_FIELD_INSERT <");
> > > > > +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> > > > > +       pp_string (buffer, ", ");
> > > > > +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> > > > > +       pp_string (buffer, ", ");
> > > > > +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> > > > > +       pp_string (buffer, " (");
> > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> > > > > +       pp_decimal_int (buffer,
> > > > > +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> > > > > +       else
> > > > > +       dump_generic_node (buffer,
> > > > > +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> > > > > +                          spc, flags, false);
> > > > > +       pp_string (buffer, " bits)>");
> > > > > +       break;
> > > > > +
> > > > >       default:
> > > > >         gcc_unreachable ();
> > > > >       }
> > > > > Index: trunk/gcc/gimple.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> > > > > *************** get_gimple_rhs_num_ops (enum tree_code c
> > > > > *** 2044,2049 ****
> > > > > --- 2044,2050 ----
> > > > >         || (SYM) == REALIGN_LOAD_EXPR                                       \
> > > > >         || (SYM) == VEC_COND_EXPR                                                   \
> > > > >         || (SYM) == VEC_PERM_EXPR                                             \
> > > > > +       || (SYM) == BIT_FIELD_INSERT                                        \
> > > > >         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
> > > > >      : ((SYM) == CONSTRUCTOR                                                \
> > > > >         || (SYM) == OBJ_TYPE_REF                                                    \
> > > > > Index: trunk/gcc/tree-cfg.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> > > > > --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> > > > > *************** verify_gimple_assign_ternary (gassign *s
> > > > > *** 4155,4160 ****
> > > > > --- 4155,4207 ----
> > > > >
> > > > >         return false;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> > > > > +       {
> > > > > +         error ("type mismatch in BIT_FIELD_INSERT");
> > > > > +         debug_generic_expr (lhs_type);
> > > > > +         debug_generic_expr (rhs1_type);
> > > > > +         return true;
> > > > > +       }
> > > > > +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> > > > > +             && INTEGRAL_TYPE_P (rhs2_type))
> > > > > +            || (VECTOR_TYPE_P (rhs1_type)
> > > > > +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> > > > > +       {
> > > > > +         error ("not allowed type combination in BIT_FIELD_INSERT");
> > > > > +         debug_generic_expr (rhs1_type);
> > > > > +         debug_generic_expr (rhs2_type);
> > > > > +         return true;
> > > > > +       }
> > > > > +       if (! tree_fits_uhwi_p (rhs3)
> > > > > +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> > > > > +       {
> > > > > +         error ("invalid position or size in BIT_FIELD_INSERT");
> > > > > +         return true;
> > > > > +       }
> > > > > +       if (INTEGRAL_TYPE_P (rhs1_type))
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> > > > > +             || (bitpos + TYPE_PRECISION (rhs2_type)
> > > > > +                 > TYPE_PRECISION (rhs1_type)))
> > > > > +           {
> > > > > +             error ("insertion out of range in BIT_FIELD_INSERT");
> > > > > +             return true;
> > > > > +           }
> > > > > +       }
> > > > > +       else if (VECTOR_TYPE_P (rhs1_type))
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> > > > > +         if (bitpos % bitsize != 0)
> > > > > +           {
> > > > > +             error ("vector insertion not at element boundary");
> > > > > +             return true;
> > > > > +           }
> > > > > +       }
> > > > > +       return false;
> > > > > +
> > > > >       case DOT_PROD_EXPR:
> > > > >       case REALIGN_LOAD_EXPR:
> > > > >         /* FIXME.  */
> > > > > Index: trunk/gcc/tree-ssa.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> > > > > --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> > > > > *************** non_rewritable_lvalue_p (tree lhs)
> > > > > *** 1318,1323 ****
> > > > > --- 1318,1335 ----
> > > > >         return false;
> > > > >       }
> > > > >
> > > > > +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> > > > > +      BIT_FIELD_INSERT.  */
> > > > > +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > +       && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > +       /* && bitsize % element-size == 0 */
> > > > > +       && types_compatible_p (TREE_TYPE (lhs),
> > > > > +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> > > > > +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> > > > > +     return false;
> > > > > +
> > > > >     return true;
> > > > >   }
> > > > >
> > > > > *************** execute_update_addresses_taken (void)
> > > > > *** 1536,1541 ****
> > > > > --- 1548,1576 ----
> > > > >                     stmt = gsi_stmt (gsi);
> > > > >                     unlink_stmt_vdef (stmt);
> > > > >                     update_stmt (stmt);
> > > > > +                   continue;
> > > > > +                 }
> > > > > +
> > > > > +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> > > > > +                  into a BIT_FIELD_INSERT.  */
> > > > > +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > +                   && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > +                   && types_compatible_p (TREE_TYPE (lhs),
> > > > > +                                          TREE_TYPE (TREE_TYPE
> > > > > +                                                      (TREE_OPERAND (lhs, 0))))
> > > > > +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> > > > > +                 {
> > > > > +                   tree var = TREE_OPERAND (lhs, 0);
> > > > > +                   tree val = gimple_assign_rhs1 (stmt);
> > > > > +                   tree bitpos = TREE_OPERAND (lhs, 2);
> > > > > +                   gimple_assign_set_lhs (stmt, var);
> > > > > +                   gimple_assign_set_rhs_with_ops
> > > > > +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> > > > > +                   stmt = gsi_stmt (gsi);
> > > > > +                   unlink_stmt_vdef (stmt);
> > > > > +                   update_stmt (stmt);
> > > > >                     continue;
> > > > >                   }
> > > > >
> > > > > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> > > > > ===================================================================
> > > > > *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> > > > > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> > > > > ***************
> > > > > *** 0 ****
> > > > > --- 1,34 ----
> > > > > + /* { dg-do compile } */
> > > > > + /* { dg-options "-O -fdump-tree-ccp1" } */
> > > > > +
> > > > > + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> > > > > +
> > > > > + v4si test1 (v4si v, int i)
> > > > > + {
> > > > > +   ((int *)&v)[0] = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + v4si test2 (v4si v, int i)
> > > > > + {
> > > > > +   int *p = (int *)&v;
> > > > > +   *p = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + v4si test3 (v4si v, int i)
> > > > > + {
> > > > > +   ((int *)&v)[3] = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + v4si test4 (v4si v, int i)
> > > > > + {
> > > > > +   int *p = (int *)&v;
> > > > > +   p += 3;
> > > > > +   *p = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> > > >
> > > >
> > >
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2019-12-17  2:41       ` Andrew Pinski
  2019-12-17  3:25         ` Andrew Pinski
@ 2020-01-07  7:37         ` Richard Biener
  2020-01-07  9:40           ` Andrew Pinski
  1 sibling, 1 reply; 32+ messages in thread
From: Richard Biener @ 2020-01-07  7:37 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: GCC Patches


On Mon, 16 Dec 2019, Andrew Pinski wrote:

> On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
> >
> > On Thu, 15 Nov 2018, Richard Biener wrote:
> >
> > > On Wed, 14 Nov 2018, Andrew Pinski wrote:
> > >
> > > > On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
> > > > >
> > > > >
> > > > > The following patch adds BIT_FIELD_INSERT, an operation to
> > > > > facilitate doing bitfield inserts on registers (as opposed
> > > > > to currently where we'd have a BIT_FIELD_REF store).
> > > > >
> > > > > Originally this was developed as part of bitfield lowering
> > > > > where bitfield stores were lowered into read-modify-write
> > > > > cycles and the modify part, instead of doing shifting and masking,
> > > > > > was kept in a higher-level form to ease combining them.
> > > > >
> > > > > A second use case (the above is still valid) is vector element
> > > > > inserts which we currently can only do via memory or
> > > > > by extracting all components and re-building the vector using
> > > > > a CONSTRUCTOR.  For this second use case I added code
> > > > > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > > > > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > > > > re-write a decl into SSA form (the testcase shows we miss
> > > > > a similar opportunity with the MEM_REF form of a vector insert,
> > > > > I plan to fix that for the final submission).
> > > > >
> > > > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > > > > > is that the size of the insertion is given implicitly via the
> > > > > type size/precision of the value to insert.  That avoids
> > > > > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> > > > >
> > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > > >
> > > > > Richard.
> > > > >
> > > > > 2011-06-16  Richard Guenther  <rguenther@suse.de>
> > > > >
> > > > >         PR tree-optimization/29756
> > > > >         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > > > >         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > > > >         * fold-const.c (operand_equal_p): Likewise.
> > > > >         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > > > >         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > > > >         * tree-inline.c (estimate_operator_cost): Likewise.
> > > > >         * tree-pretty-print.c (dump_generic_node): Likewise.
> > > > >         * tree-ssa-operands.c (get_expr_operands): Likewise.
> > > > >         * cfgexpand.c (expand_debug_expr): Likewise.
> > > > >         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > > > >         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > > > >         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> > > > >
> > > > >         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > > > >         vector inserts using BIT_FIELD_REF on the lhs.
> > > > >         (execute_update_addresses_taken): Do it.
> > > > >
> > > > >         * gcc.dg/tree-ssa/vector-6.c: New testcase.
> > > > >
> > > > > Index: trunk/gcc/expr.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> > > > > *************** expand_expr_real_2 (sepops ops, rtx targ
> > > > > *** 9358,9363 ****
> > > > > --- 9358,9380 ----
> > > > >         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> > > > >         return target;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       {
> > > > > +       unsigned bitpos = tree_to_uhwi (treeop2);
> > > > > +       unsigned bitsize;
> > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > > > > +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > > > > +       else
> > > > > +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > > > > +       rtx op0 = expand_normal (treeop0);
> > > > > +       rtx op1 = expand_normal (treeop1);
> > > > > +       rtx dst = gen_reg_rtx (mode);
> > > > > +       emit_move_insn (dst, op0);
> > > > > +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> > > > > +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > > > > +       return dst;
> > > > > +       }
> > > > > +
> > > > >       default:
> > > > >         gcc_unreachable ();
> > > > >       }
> > > > > Index: trunk/gcc/fold-const.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> > > > > *************** operand_equal_p (const_tree arg0, const_
> > > > > *** 3163,3168 ****
> > > > > --- 3163,3169 ----
> > > > >
> > > > >         case VEC_COND_EXPR:
> > > > >         case DOT_PROD_EXPR:
> > > > > +       case BIT_FIELD_INSERT:
> > > > >           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> > > > >
> > > > >         default:
> > > > > *************** fold_ternary_loc (location_t loc, enum t
> > > > > *** 11870,11875 ****
> > > > > --- 11871,11916 ----
> > > > >         }
> > > > >         return NULL_TREE;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > > > > +       if (TREE_CODE (arg0) == INTEGER_CST
> > > > > +         && TREE_CODE (arg1) == INTEGER_CST)
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > > > > +         wide_int tem = wi::bit_and (arg0,
> > > > > +                                     wi::shifted_mask (bitpos, bitsize, true,
> > > > > +                                                       TYPE_PRECISION (type)));
> > > > > +         wide_int tem2
> > > > > +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> > > > > +                                   bitsize), bitpos);
> > > > > +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> > > > > +       }
> > > >
> > > > > This seems incorrect for BYTES_BIG_ENDIAN targets as far as I
> > > > > can tell.  With BYTES_BIG_ENDIAN, the bit position starts at the most
> > > > > significant bit rather than the least significant.
> > >
> > > You mean the bitpos operand of BIT_FIELD_INSERT works in a different way?
> > > I see the BIT_FIELD_REF folding uses native_encode/interpret but only
> > > handles byte-aligned references.  I suppose the BIT_INSERT_EXPR case
> > > (you are following up an old patch) could do the same.
> > >
> > > > >  Sorry I am bringing
> > > > this up after this has been in the tree for a long time but I finally
> > > > got around to testing my bit-field lowering on a few big-endian
> > > > targets and ran into this issue.
> > >
> > > So - can you fix it please?  Also note that the VECTOR_CST case
> > > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
> > > "bits" in a different way?
> >
> > And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
> > in generic.texi, together with those "details".
> 
> This is the fix:
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index 8e9e299..a919b63 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum
> tree_code code, tree type,
>         {
>           unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
>           unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> +         if (BYTES_BIG_ENDIAN)
> +           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
>           wide_int tem = (wi::to_wide (arg0)
>                           & wi::shifted_mask (bitpos, bitsize, true,
>                                               TYPE_PRECISION (type)));

I guess you need to guard against BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
as well.  Also the above only works reliably for mode-precision
integers?  We might want to disallow BIT_FIELD_REF/BIT_INSERT_EXPR
on non-mode-precision entities in the GIMPLE/GENERIC verifiers.
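
To make the bit numbering concrete, here is a minimal standalone
sketch in plain C, not GCC code.  The names insert_bits_le and
insert_bits_be are hypothetical, and the sketch assumes a value of at
most 64 bits whose precision equals its mode size and a target where
BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN.  It only illustrates the
"prec - bitpos - bitsize" flip from the quoted fix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace BITSIZE bits of WORD, starting at bit
   BITPOS counted from the least significant bit, with the low BITSIZE
   bits of VAL.  PREC is the precision of WORD in bits (at most 64).  */
uint64_t
insert_bits_le (uint64_t word, uint64_t val, unsigned bitpos,
                unsigned bitsize, unsigned prec)
{
  assert (bitsize >= 1 && prec <= 64 && bitpos + bitsize <= prec);
  /* BITSIZE low bits set; avoid an undefined shift by 64.  */
  uint64_t field = bitsize == 64 ? ~UINT64_C (0)
                                 : (UINT64_C (1) << bitsize) - 1;
  uint64_t mask = field << bitpos;
  /* Clear the destination bits, then OR in the truncated value.  */
  return (word & ~mask) | ((val & field) << bitpos);
}

/* Same operation, but BITPOS counts from the most significant bit of
   a PREC-bit word, which is the numbering the quoted fix assumes for
   BYTES_BIG_ENDIAN.  */
uint64_t
insert_bits_be (uint64_t word, uint64_t val, unsigned bitpos,
                unsigned bitsize, unsigned prec)
{
  assert (bitpos + bitsize <= prec);
  return insert_bits_le (word, val, prec - bitpos - bitsize,
                         bitsize, prec);
}
```

With a 32-bit word 0x11223344, inserting the 8-bit value 0xAB at
position 8 gives 0x1122AB44 with the little-endian numbering and
0x11AB3344 with the flipped one.  PREC is an explicit parameter here,
so the sketch says nothing about what the right choice is when the
type precision and the mode size differ.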

> ---- CUT ----
> I will do a full test in a little bit with the other patch I attached
> to the related Bugzilla entry.
> 
> Thanks,
> Andrew
> 
> >
> > Richard.
> >
> > > Thanks,
> > > Richard.
> > >
> > > > Thanks,
> > > > Andrew Pinski
> > > >
> > > > > +       else if (TREE_CODE (arg0) == VECTOR_CST
> > > > > +              && CONSTANT_CLASS_P (arg1)
> > > > > +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> > > > > +                                     TREE_TYPE (arg1)))
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > +         unsigned HOST_WIDE_INT elsize
> > > > > +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> > > > > +         if (bitpos % elsize == 0)
> > > > > +           {
> > > > > +             unsigned k = bitpos / elsize;
> > > > > +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> > > > > +               return arg0;
> > > > > +             else
> > > > > +               {
> > > > > +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> > > > > +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> > > > > +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> > > > > +                 elts[k] = arg1;
> > > > > +                 return build_vector (type, elts);
> > > > > +               }
> > > > > +           }
> > > > > +       }
> > > > > +       return NULL_TREE;
> > > > > +
> > > > >       default:
> > > > >         return NULL_TREE;
> > > > >       } /* switch (code) */
> > > > > Index: trunk/gcc/gimplify.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> > > > > *************** gimplify_expr (tree *expr_p, gimple_seq
> > > > > *** 10936,10941 ****
> > > > > --- 10936,10945 ----
> > > > >           /* Classified as tcc_expression.  */
> > > > >           goto expr_3;
> > > > >
> > > > > +       case BIT_FIELD_INSERT:
> > > > > +         /* Argument 3 is a constant.  */
> > > > > +         goto expr_2;
> > > > > +
> > > > >         case POINTER_PLUS_EXPR:
> > > > >           {
> > > > >             enum gimplify_status r0, r1;
> > > > > Index: trunk/gcc/tree-inline.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> > > > > *************** estimate_operator_cost (enum tree_code c
> > > > > *** 3941,3946 ****
> > > > > --- 3941,3950 ----
> > > > >           return weights->div_mod_cost;
> > > > >         return 1;
> > > > >
> > > > > +     /* Bit-field insertion needs several shift and mask operations.  */
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       return 3;
> > > > > +
> > > > >       default:
> > > > >         /* We expect a copy assignment with no operator.  */
> > > > >         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> > > > > Index: trunk/gcc/tree-pretty-print.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> > > > > *************** dump_generic_node (pretty_printer *pp, t
> > > > > *** 1876,1881 ****
> > > > > --- 1876,1898 ----
> > > > >         pp_greater (pp);
> > > > >         break;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       pp_string (pp, "BIT_FIELD_INSERT <");
> > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> > > > > +       pp_string (pp, ", ");
> > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> > > > > +       pp_string (pp, ", ");
> > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> > > > > +       pp_string (pp, " (");
> > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> > > > > +       pp_decimal_int (pp,
> > > > > +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> > > > > +       else
> > > > > +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> > > > > +                          spc, flags, false);
> > > > > +       pp_string (pp, " bits)>");
> > > > > +       break;
> > > > > +
> > > > >       case ARRAY_REF:
> > > > >       case ARRAY_RANGE_REF:
> > > > >         op0 = TREE_OPERAND (node, 0);
> > > > > Index: trunk/gcc/tree-ssa-operands.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> > > > > --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> > > > > *************** get_expr_operands (struct function *fn,
> > > > > *** 833,838 ****
> > > > > --- 833,839 ----
> > > > >         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> > > > >         return;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > >       case COMPOUND_EXPR:
> > > > >       case OBJ_TYPE_REF:
> > > > >       case ASSERT_EXPR:
> > > > > Index: trunk/gcc/tree.def
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> > > > > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > > > > *** 852,857 ****
> > > > > --- 852,868 ----
> > > > >      descriptor of type ptr_mode.  */
> > > > >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> > > > >
> > > > > + /* Given a word, a value and a bitfield position within the word,
> > > > > +    produce the value that results if replacing the
> > > > > +    described parts of word with value.
> > > > > +    Operand 0 is a tree for the word of integral type;
> > > > > +    Operand 1 is a tree for the value of integral type;
> > > > > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > > > > +    The number of bits replaced is given by the precision of the value
> > > > > +    type if that is integral or by its size if it is non-integral.
> > > > > +    The replaced bits shall be fully inside the word.  */
> > > > > + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> > > > > +
> > > > >   /* Given two real or integer operands of the same type,
> > > > >      returns a complex value of the corresponding complex type.  */
> > > > >   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> > > > > Index: trunk/gcc/cfgexpand.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> > > > > --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> > > > > *************** expand_debug_expr (tree exp)
> > > > > *** 5025,5030 ****
> > > > > --- 5025,5031 ----
> > > > >       case FIXED_CONVERT_EXPR:
> > > > >       case OBJ_TYPE_REF:
> > > > >       case WITH_SIZE_EXPR:
> > > > > +     case BIT_FIELD_INSERT:
> > > > >         return NULL;
> > > > >
> > > > >       case DOT_PROD_EXPR:
> > > > > Index: trunk/gcc/gimple-pretty-print.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> > > > > --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> > > > > *************** dump_ternary_rhs (pretty_printer *buffer
> > > > > *** 479,484 ****
> > > > > --- 479,502 ----
> > > > >         pp_greater (buffer);
> > > > >         break;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       pp_string (buffer, "BIT_FIELD_INSERT <");
> > > > > +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> > > > > +       pp_string (buffer, ", ");
> > > > > +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> > > > > +       pp_string (buffer, ", ");
> > > > > +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> > > > > +       pp_string (buffer, " (");
> > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> > > > > +       pp_decimal_int (buffer,
> > > > > +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> > > > > +       else
> > > > > +       dump_generic_node (buffer,
> > > > > +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> > > > > +                          spc, flags, false);
> > > > > +       pp_string (buffer, " bits)>");
> > > > > +       break;
> > > > > +
> > > > >       default:
> > > > >         gcc_unreachable ();
> > > > >       }
> > > > > Index: trunk/gcc/gimple.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> > > > > --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> > > > > *************** get_gimple_rhs_num_ops (enum tree_code c
> > > > > *** 2044,2049 ****
> > > > > --- 2044,2050 ----
> > > > >         || (SYM) == REALIGN_LOAD_EXPR                                       \
> > > > >         || (SYM) == VEC_COND_EXPR                                                   \
> > > > >         || (SYM) == VEC_PERM_EXPR                                             \
> > > > > +       || (SYM) == BIT_FIELD_INSERT                                        \
> > > > >         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
> > > > >      : ((SYM) == CONSTRUCTOR                                                \
> > > > >         || (SYM) == OBJ_TYPE_REF                                                    \
> > > > > Index: trunk/gcc/tree-cfg.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> > > > > --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> > > > > *************** verify_gimple_assign_ternary (gassign *s
> > > > > *** 4155,4160 ****
> > > > > --- 4155,4207 ----
> > > > >
> > > > >         return false;
> > > > >
> > > > > +     case BIT_FIELD_INSERT:
> > > > > +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> > > > > +       {
> > > > > +         error ("type mismatch in BIT_FIELD_INSERT");
> > > > > +         debug_generic_expr (lhs_type);
> > > > > +         debug_generic_expr (rhs1_type);
> > > > > +         return true;
> > > > > +       }
> > > > > +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> > > > > +             && INTEGRAL_TYPE_P (rhs2_type))
> > > > > +            || (VECTOR_TYPE_P (rhs1_type)
> > > > > +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> > > > > +       {
> > > > > +         error ("not allowed type combination in BIT_FIELD_INSERT");
> > > > > +         debug_generic_expr (rhs1_type);
> > > > > +         debug_generic_expr (rhs2_type);
> > > > > +         return true;
> > > > > +       }
> > > > > +       if (! tree_fits_uhwi_p (rhs3)
> > > > > +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> > > > > +       {
> > > > > +         error ("invalid position or size in BIT_FIELD_INSERT");
> > > > > +         return true;
> > > > > +       }
> > > > > +       if (INTEGRAL_TYPE_P (rhs1_type))
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> > > > > +             || (bitpos + TYPE_PRECISION (rhs2_type)
> > > > > +                 > TYPE_PRECISION (rhs1_type)))
> > > > > +           {
> > > > > +             error ("insertion out of range in BIT_FIELD_INSERT");
> > > > > +             return true;
> > > > > +           }
> > > > > +       }
> > > > > +       else if (VECTOR_TYPE_P (rhs1_type))
> > > > > +       {
> > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> > > > > +         if (bitpos % bitsize != 0)
> > > > > +           {
> > > > > +             error ("vector insertion not at element boundary");
> > > > > +             return true;
> > > > > +           }
> > > > > +       }
> > > > > +       return false;
> > > > > +
> > > > >       case DOT_PROD_EXPR:
> > > > >       case REALIGN_LOAD_EXPR:
> > > > >         /* FIXME.  */
> > > > > Index: trunk/gcc/tree-ssa.c
> > > > > ===================================================================
> > > > > *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> > > > > --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> > > > > *************** non_rewritable_lvalue_p (tree lhs)
> > > > > *** 1318,1323 ****
> > > > > --- 1318,1335 ----
> > > > >         return false;
> > > > >       }
> > > > >
> > > > > +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> > > > > +      BIT_FIELD_INSERT.  */
> > > > > +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > +       && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > +       /* && bitsize % element-size == 0 */
> > > > > +       && types_compatible_p (TREE_TYPE (lhs),
> > > > > +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> > > > > +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> > > > > +     return false;
> > > > > +
> > > > >     return true;
> > > > >   }
> > > > >
> > > > > *************** execute_update_addresses_taken (void)
> > > > > *** 1536,1541 ****
> > > > > --- 1548,1576 ----
> > > > >                     stmt = gsi_stmt (gsi);
> > > > >                     unlink_stmt_vdef (stmt);
> > > > >                     update_stmt (stmt);
> > > > > +                   continue;
> > > > > +                 }
> > > > > +
> > > > > +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> > > > > +                  into a BIT_FIELD_INSERT.  */
> > > > > +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > +                   && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > +                   && types_compatible_p (TREE_TYPE (lhs),
> > > > > +                                          TREE_TYPE (TREE_TYPE
> > > > > +                                                      (TREE_OPERAND (lhs, 0))))
> > > > > +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> > > > > +                 {
> > > > > +                   tree var = TREE_OPERAND (lhs, 0);
> > > > > +                   tree val = gimple_assign_rhs1 (stmt);
> > > > > +                   tree bitpos = TREE_OPERAND (lhs, 2);
> > > > > +                   gimple_assign_set_lhs (stmt, var);
> > > > > +                   gimple_assign_set_rhs_with_ops
> > > > > +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> > > > > +                   stmt = gsi_stmt (gsi);
> > > > > +                   unlink_stmt_vdef (stmt);
> > > > > +                   update_stmt (stmt);
> > > > >                     continue;
> > > > >                   }
> > > > >
> > > > > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> > > > > ===================================================================
> > > > > *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> > > > > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> > > > > ***************
> > > > > *** 0 ****
> > > > > --- 1,34 ----
> > > > > + /* { dg-do compile } */
> > > > > + /* { dg-options "-O -fdump-tree-ccp1" } */
> > > > > +
> > > > > + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> > > > > +
> > > > > + v4si test1 (v4si v, int i)
> > > > > + {
> > > > > +   ((int *)&v)[0] = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + v4si test2 (v4si v, int i)
> > > > > + {
> > > > > +   int *p = (int *)&v;
> > > > > +   *p = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + v4si test3 (v4si v, int i)
> > > > > + {
> > > > > +   ((int *)&v)[3] = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + v4si test4 (v4si v, int i)
> > > > > + {
> > > > > +   int *p = (int *)&v;
> > > > > +   p += 3;
> > > > > +   *p = i;
> > > > > +   return v;
> > > > > + }
> > > > > +
> > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> > > >
> > > >
> > >
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2020-01-07  7:37         ` Richard Biener
@ 2020-01-07  9:40           ` Andrew Pinski
  2020-01-07 10:04             ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Andrew Pinski @ 2020-01-07  9:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Mon, Jan 6, 2020 at 11:36 PM Richard Biener <rguenther@suse.de> wrote:
>
> On Mon, 16 Dec 2019, Andrew Pinski wrote:
>
> > On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
> > >
> > > On Thu, 15 Nov 2018, Richard Biener wrote:
> > >
> > > > On Wed, 14 Nov 2018, Andrew Pinski wrote:
> > > >
> > > > > On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
> > > > > >
> > > > > >
> > > > > > The following patch adds BIT_FIELD_INSERT, an operation to
> > > > > > facilitate doing bitfield inserts on registers (as opposed
> > > > > > to currently where we'd have a BIT_FIELD_REF store).
> > > > > >
> > > > > > Originally this was developed as part of bitfield lowering
> > > > > > where bitfield stores were lowered into read-modify-write
> > > > > > cycles and the modify part, instead of doing shifting and masking,
> > > > > > > was kept in a higher-level form to ease combining them.
> > > > > >
> > > > > > A second use case (the above is still valid) is vector element
> > > > > > inserts which we currently can only do via memory or
> > > > > > by extracting all components and re-building the vector using
> > > > > > a CONSTRUCTOR.  For this second use case I added code
> > > > > > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > > > > > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > > > > > re-write a decl into SSA form (the testcase shows we miss
> > > > > > a similar opportunity with the MEM_REF form of a vector insert,
> > > > > > I plan to fix that for the final submission).
> > > > > >
> > > > > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > > > > > > is that the size of the insertion is given implicitly via the
> > > > > > type size/precision of the value to insert.  That avoids
> > > > > > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> > > > > >
> > > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > > > >
> > > > > > Richard.
> > > > > >
> > > > > > 2011-06-16  Richard Guenther  <rguenther@suse.de>
> > > > > >
> > > > > >         PR tree-optimization/29756
> > > > > >         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > > > > >         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > > > > >         * fold-const.c (operand_equal_p): Likewise.
> > > > > >         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > > > > >         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > > > > >         * tree-inline.c (estimate_operator_cost): Likewise.
> > > > > >         * tree-pretty-print.c (dump_generic_node): Likewise.
> > > > > >         * tree-ssa-operands.c (get_expr_operands): Likewise.
> > > > > >         * cfgexpand.c (expand_debug_expr): Likewise.
> > > > > >         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > > > > >         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > > > > >         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> > > > > >
> > > > > >         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > > > > >         vector inserts using BIT_FIELD_REF on the lhs.
> > > > > >         (execute_update_addresses_taken): Do it.
> > > > > >
> > > > > >         * gcc.dg/tree-ssa/vector-6.c: New testcase.
> > > > > >
> > > > > > Index: trunk/gcc/expr.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> > > > > > --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> > > > > > *************** expand_expr_real_2 (sepops ops, rtx targ
> > > > > > *** 9358,9363 ****
> > > > > > --- 9358,9380 ----
> > > > > >         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> > > > > >         return target;
> > > > > >
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > > +       {
> > > > > > +       unsigned bitpos = tree_to_uhwi (treeop2);
> > > > > > +       unsigned bitsize;
> > > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > > > > > +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > > > > > +       else
> > > > > > +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > > > > > +       rtx op0 = expand_normal (treeop0);
> > > > > > +       rtx op1 = expand_normal (treeop1);
> > > > > > +       rtx dst = gen_reg_rtx (mode);
> > > > > > +       emit_move_insn (dst, op0);
> > > > > > +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> > > > > > +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > > > > > +       return dst;
> > > > > > +       }
> > > > > > +
> > > > > >       default:
> > > > > >         gcc_unreachable ();
> > > > > >       }
> > > > > > Index: trunk/gcc/fold-const.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > > > > > --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> > > > > > *************** operand_equal_p (const_tree arg0, const_
> > > > > > *** 3163,3168 ****
> > > > > > --- 3163,3169 ----
> > > > > >
> > > > > >         case VEC_COND_EXPR:
> > > > > >         case DOT_PROD_EXPR:
> > > > > > +       case BIT_FIELD_INSERT:
> > > > > >           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> > > > > >
> > > > > >         default:
> > > > > > *************** fold_ternary_loc (location_t loc, enum t
> > > > > > *** 11870,11875 ****
> > > > > > --- 11871,11916 ----
> > > > > >         }
> > > > > >         return NULL_TREE;
> > > > > >
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > > +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > > > > > +       if (TREE_CODE (arg0) == INTEGER_CST
> > > > > > +         && TREE_CODE (arg1) == INTEGER_CST)
> > > > > > +       {
> > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > > +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > > > > > +         wide_int tem = wi::bit_and (arg0,
> > > > > > +                                     wi::shifted_mask (bitpos, bitsize, true,
> > > > > > +                                                       TYPE_PRECISION (type)));
> > > > > > +         wide_int tem2
> > > > > > +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> > > > > > +                                   bitsize), bitpos);
> > > > > > +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> > > > > > +       }
> > > > >
> > > > > > This seems incorrect for the BYTES_BIG_ENDIAN case, as far as I
> > > > > > can tell.  With BYTES_BIG_ENDIAN, bit positions count from the most
> > > > > > significant bit rather than from the least significant one.
> > > >
> > > > You mean the bitpos operand of BIT_FIELD_INSERT works in a different way?
> > > > I see the BIT_FIELD_REF folding uses native_encode/interpret but only
> > > > handles byte-aligned references.  I suppose the BIT_INSERT_EXPR case
> > > > (you are following up an old patch) could do the same.
> > > >
> > > > > >  Sorry I am bringing
> > > > > this up after this has been in the tree for a long time but I finally
> > > > > got around to testing my bit-field lowering on a few big-endian
> > > > > targets and ran into this issue.
> > > >
> > > > So - can you fix it please?  Also note that the VECTOR_CST case
> > > > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
> > > > "bits" in a different way?
> > >
> > > And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
> > > in generic.texi, together with those "details".
> >
> > This is the fix:
> > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > index 8e9e299..a919b63 100644
> > --- a/gcc/fold-const.c
> > +++ b/gcc/fold-const.c
> > @@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
> >         {
> >           unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> >           unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > +         if (BYTES_BIG_ENDIAN)
> > +           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
> >           wide_int tem = (wi::to_wide (arg0)
> >                           & wi::shifted_mask (bitpos, bitsize, true,
> >                                               TYPE_PRECISION (type)));
>
> I guess you need to guard against BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
> as well.

Yes I will add that check.
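
As a rough illustration of what the bitpos adjustment in the quoted fix
does, here is a stand-alone C model.  It is not GCC code: the name
model_bit_insert and all of its parameters are made up for this sketch.
It assumes a word of at most 64 bits and does not try to cover targets
where BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: insert the low BITSIZE bits of VALUE
   into WORD, which is treated as a PRECISION-bit integer.  BITPOS counts
   from the least significant bit, unless BIG_ENDIAN_BITS is nonzero, in
   which case it counts from the most significant bit.  */
static uint64_t
model_bit_insert (uint64_t word, uint64_t value, unsigned bitpos,
                  unsigned bitsize, unsigned precision, int big_endian_bits)
{
  /* The replaced bits must lie fully inside the word.  */
  assert (bitsize > 0 && bitsize <= precision && precision <= 64);
  assert (bitpos <= precision - bitsize);

  /* Same adjustment as in the quoted fix: convert a position counted
     from the most significant end into a shift count.  */
  if (big_endian_bits)
    bitpos = precision - bitpos - bitsize;

  /* BITSIZE low bits set; avoid shifting a 64-bit value by 64.  */
  uint64_t field = bitsize == 64 ? ~UINT64_C (0)
                                 : (UINT64_C (1) << bitsize) - 1;
  uint64_t mask = field << bitpos;

  /* Clear the field in WORD, then OR in the zero-extended VALUE.  */
  return (word & ~mask) | ((value & field) << bitpos);
}
```

With this model, model_bit_insert (0, 0xAB, 8, 8, 32, 0) gives 0x0000ab00,
while the same call with the last argument set to 1 gives 0x00ab0000.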

>  Also the above only works reliably for mode-precision
> integers?  We might want to disallow BIT_FIELD_REF/BIT_INSERT_EXPR
> on non-mode-precision entities in the GIMPLE/GENERIC verifiers.

You added that check already for gimple in r268332 due to PR88739.
BIT_FIELD_REF around tree-cfg.c:3083
BIT_INSERT_EXPR  around tree-cfg.c:4324

Thanks,
Andrew Pinski

>
> > ---- CUT ----
> > I will do a full test in a little bit with the other patch I attached
> > to related bugzilla.
> >
> > Thanks,
> > Andrew
> >
> > >
> > > Richard.
> > >
> > > > Thanks,
> > > > Richard.
> > > >
> > > > > Thanks,
> > > > > Andrew Pinski
> > > > >
> > > > > > +       else if (TREE_CODE (arg0) == VECTOR_CST
> > > > > > +              && CONSTANT_CLASS_P (arg1)
> > > > > > +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> > > > > > +                                     TREE_TYPE (arg1)))
> > > > > > +       {
> > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > > +         unsigned HOST_WIDE_INT elsize
> > > > > > +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> > > > > > +         if (bitpos % elsize == 0)
> > > > > > +           {
> > > > > > +             unsigned k = bitpos / elsize;
> > > > > > +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> > > > > > +               return arg0;
> > > > > > +             else
> > > > > > +               {
> > > > > > +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> > > > > > +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> > > > > > +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> > > > > > +                 elts[k] = arg1;
> > > > > > +                 return build_vector (type, elts);
> > > > > > +               }
> > > > > > +           }
> > > > > > +       }
> > > > > > +       return NULL_TREE;
> > > > > > +
> > > > > >       default:
> > > > > >         return NULL_TREE;
> > > > > >       } /* switch (code) */
> > > > > > Index: trunk/gcc/gimplify.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> > > > > > --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> > > > > > *************** gimplify_expr (tree *expr_p, gimple_seq
> > > > > > *** 10936,10941 ****
> > > > > > --- 10936,10945 ----
> > > > > >           /* Classified as tcc_expression.  */
> > > > > >           goto expr_3;
> > > > > >
> > > > > > +       case BIT_FIELD_INSERT:
> > > > > > +         /* Argument 3 is a constant.  */
> > > > > > +         goto expr_2;
> > > > > > +
> > > > > >         case POINTER_PLUS_EXPR:
> > > > > >           {
> > > > > >             enum gimplify_status r0, r1;
> > > > > > Index: trunk/gcc/tree-inline.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> > > > > > --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> > > > > > *************** estimate_operator_cost (enum tree_code c
> > > > > > *** 3941,3946 ****
> > > > > > --- 3941,3950 ----
> > > > > >           return weights->div_mod_cost;
> > > > > >         return 1;
> > > > > >
> > > > > > +     /* Bit-field insertion needs several shift and mask operations.  */
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > > +       return 3;
> > > > > > +
> > > > > >       default:
> > > > > >         /* We expect a copy assignment with no operator.  */
> > > > > >         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> > > > > > Index: trunk/gcc/tree-pretty-print.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> > > > > > --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> > > > > > *************** dump_generic_node (pretty_printer *pp, t
> > > > > > *** 1876,1881 ****
> > > > > > --- 1876,1898 ----
> > > > > >         pp_greater (pp);
> > > > > >         break;
> > > > > >
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > > +       pp_string (pp, "BIT_FIELD_INSERT <");
> > > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> > > > > > +       pp_string (pp, ", ");
> > > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> > > > > > +       pp_string (pp, ", ");
> > > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> > > > > > +       pp_string (pp, " (");
> > > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> > > > > > +       pp_decimal_int (pp,
> > > > > > +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> > > > > > +       else
> > > > > > +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> > > > > > +                          spc, flags, false);
> > > > > > +       pp_string (pp, " bits)>");
> > > > > > +       break;
> > > > > > +
> > > > > >       case ARRAY_REF:
> > > > > >       case ARRAY_RANGE_REF:
> > > > > >         op0 = TREE_OPERAND (node, 0);
> > > > > > Index: trunk/gcc/tree-ssa-operands.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> > > > > > --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> > > > > > *************** get_expr_operands (struct function *fn,
> > > > > > *** 833,838 ****
> > > > > > --- 833,839 ----
> > > > > >         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> > > > > >         return;
> > > > > >
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > >       case COMPOUND_EXPR:
> > > > > >       case OBJ_TYPE_REF:
> > > > > >       case ASSERT_EXPR:
> > > > > > Index: trunk/gcc/tree.def
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> > > > > > --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> > > > > > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > > > > > *** 852,857 ****
> > > > > > --- 852,868 ----
> > > > > >      descriptor of type ptr_mode.  */
> > > > > >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> > > > > >
> > > > > > + /* Given a word, a value and a bitfield position within the word,
> > > > > > +    produce the value that results if replacing the
> > > > > > +    described parts of word with value.
> > > > > > +    Operand 0 is a tree for the word of integral type;
> > > > > > +    Operand 1 is a tree for the value of integral type;
> > > > > > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > > > > > +    The number of bits replaced is given by the precision of the value
> > > > > > +    type if that is integral or by its size if it is non-integral.
> > > > > > +    The replaced bits shall be fully inside the word.  */
> > > > > > + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> > > > > > +
> > > > > >   /* Given two real or integer operands of the same type,
> > > > > >      returns a complex value of the corresponding complex type.  */
> > > > > >   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> > > > > > Index: trunk/gcc/cfgexpand.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> > > > > > --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> > > > > > *************** expand_debug_expr (tree exp)
> > > > > > *** 5025,5030 ****
> > > > > > --- 5025,5031 ----
> > > > > >       case FIXED_CONVERT_EXPR:
> > > > > >       case OBJ_TYPE_REF:
> > > > > >       case WITH_SIZE_EXPR:
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > >         return NULL;
> > > > > >
> > > > > >       case DOT_PROD_EXPR:
> > > > > > Index: trunk/gcc/gimple-pretty-print.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> > > > > > --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> > > > > > *************** dump_ternary_rhs (pretty_printer *buffer
> > > > > > *** 479,484 ****
> > > > > > --- 479,502 ----
> > > > > >         pp_greater (buffer);
> > > > > >         break;
> > > > > >
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > > +       pp_string (buffer, "BIT_FIELD_INSERT <");
> > > > > > +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> > > > > > +       pp_string (buffer, ", ");
> > > > > > +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> > > > > > +       pp_string (buffer, ", ");
> > > > > > +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> > > > > > +       pp_string (buffer, " (");
> > > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> > > > > > +       pp_decimal_int (buffer,
> > > > > > +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> > > > > > +       else
> > > > > > +       dump_generic_node (buffer,
> > > > > > +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> > > > > > +                          spc, flags, false);
> > > > > > +       pp_string (buffer, " bits)>");
> > > > > > +       break;
> > > > > > +
> > > > > >       default:
> > > > > >         gcc_unreachable ();
> > > > > >       }
> > > > > > Index: trunk/gcc/gimple.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> > > > > > --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> > > > > > *************** get_gimple_rhs_num_ops (enum tree_code c
> > > > > > *** 2044,2049 ****
> > > > > > --- 2044,2050 ----
> > > > > >         || (SYM) == REALIGN_LOAD_EXPR                                       \
> > > > > >         || (SYM) == VEC_COND_EXPR                                                   \
> > > > > >         || (SYM) == VEC_PERM_EXPR                                             \
> > > > > > +       || (SYM) == BIT_FIELD_INSERT                                        \
> > > > > >         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
> > > > > >      : ((SYM) == CONSTRUCTOR                                                \
> > > > > >         || (SYM) == OBJ_TYPE_REF                                                    \
> > > > > > Index: trunk/gcc/tree-cfg.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> > > > > > --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> > > > > > *************** verify_gimple_assign_ternary (gassign *s
> > > > > > *** 4155,4160 ****
> > > > > > --- 4155,4207 ----
> > > > > >
> > > > > >         return false;
> > > > > >
> > > > > > +     case BIT_FIELD_INSERT:
> > > > > > +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> > > > > > +       {
> > > > > > +         error ("type mismatch in BIT_FIELD_INSERT");
> > > > > > +         debug_generic_expr (lhs_type);
> > > > > > +         debug_generic_expr (rhs1_type);
> > > > > > +         return true;
> > > > > > +       }
> > > > > > +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> > > > > > +             && INTEGRAL_TYPE_P (rhs2_type))
> > > > > > +            || (VECTOR_TYPE_P (rhs1_type)
> > > > > > +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> > > > > > +       {
> > > > > > +         error ("not allowed type combination in BIT_FIELD_INSERT");
> > > > > > +         debug_generic_expr (rhs1_type);
> > > > > > +         debug_generic_expr (rhs2_type);
> > > > > > +         return true;
> > > > > > +       }
> > > > > > +       if (! tree_fits_uhwi_p (rhs3)
> > > > > > +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> > > > > > +       {
> > > > > > +         error ("invalid position or size in BIT_FIELD_INSERT");
> > > > > > +         return true;
> > > > > > +       }
> > > > > > +       if (INTEGRAL_TYPE_P (rhs1_type))
> > > > > > +       {
> > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > > +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> > > > > > +             || (bitpos + TYPE_PRECISION (rhs2_type)
> > > > > > +                 > TYPE_PRECISION (rhs1_type)))
> > > > > > +           {
> > > > > > +             error ("insertion out of range in BIT_FIELD_INSERT");
> > > > > > +             return true;
> > > > > > +           }
> > > > > > +       }
> > > > > > +       else if (VECTOR_TYPE_P (rhs1_type))
> > > > > > +       {
> > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > > +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> > > > > > +         if (bitpos % bitsize != 0)
> > > > > > +           {
> > > > > > +             error ("vector insertion not at element boundary");
> > > > > > +             return true;
> > > > > > +           }
> > > > > > +       }
> > > > > > +       return false;
> > > > > > +
> > > > > >       case DOT_PROD_EXPR:
> > > > > >       case REALIGN_LOAD_EXPR:
> > > > > >         /* FIXME.  */
> > > > > > Index: trunk/gcc/tree-ssa.c
> > > > > > ===================================================================
> > > > > > *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> > > > > > --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> > > > > > *************** non_rewritable_lvalue_p (tree lhs)
> > > > > > *** 1318,1323 ****
> > > > > > --- 1318,1335 ----
> > > > > >         return false;
> > > > > >       }
> > > > > >
> > > > > > +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> > > > > > +      BIT_FIELD_INSERT.  */
> > > > > > +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > > +       && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > > +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > > +       /* && bitsize % element-size == 0 */
> > > > > > +       && types_compatible_p (TREE_TYPE (lhs),
> > > > > > +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> > > > > > +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > > +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> > > > > > +     return false;
> > > > > > +
> > > > > >     return true;
> > > > > >   }
> > > > > >
> > > > > > *************** execute_update_addresses_taken (void)
> > > > > > *** 1536,1541 ****
> > > > > > --- 1548,1576 ----
> > > > > >                     stmt = gsi_stmt (gsi);
> > > > > >                     unlink_stmt_vdef (stmt);
> > > > > >                     update_stmt (stmt);
> > > > > > +                   continue;
> > > > > > +                 }
> > > > > > +
> > > > > > +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> > > > > > +                  into a BIT_FIELD_INSERT.  */
> > > > > > +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > > +                   && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > > +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > > +                   && types_compatible_p (TREE_TYPE (lhs),
> > > > > > +                                          TREE_TYPE (TREE_TYPE
> > > > > > +                                                      (TREE_OPERAND (lhs, 0))))
> > > > > > +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > > +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> > > > > > +                 {
> > > > > > +                   tree var = TREE_OPERAND (lhs, 0);
> > > > > > +                   tree val = gimple_assign_rhs1 (stmt);
> > > > > > +                   tree bitpos = TREE_OPERAND (lhs, 2);
> > > > > > +                   gimple_assign_set_lhs (stmt, var);
> > > > > > +                   gimple_assign_set_rhs_with_ops
> > > > > > +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> > > > > > +                   stmt = gsi_stmt (gsi);
> > > > > > +                   unlink_stmt_vdef (stmt);
> > > > > > +                   update_stmt (stmt);
> > > > > >                     continue;
> > > > > >                   }
> > > > > >
> > > > > > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> > > > > > ===================================================================
> > > > > > *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> > > > > > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> > > > > > ***************
> > > > > > *** 0 ****
> > > > > > --- 1,34 ----
> > > > > > + /* { dg-do compile } */
> > > > > > + /* { dg-options "-O -fdump-tree-ccp1" } */
> > > > > > +
> > > > > > + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> > > > > > +
> > > > > > + v4si test1 (v4si v, int i)
> > > > > > + {
> > > > > > +   ((int *)&v)[0] = i;
> > > > > > +   return v;
> > > > > > + }
> > > > > > +
> > > > > > + v4si test2 (v4si v, int i)
> > > > > > + {
> > > > > > +   int *p = (int *)&v;
> > > > > > +   *p = i;
> > > > > > +   return v;
> > > > > > + }
> > > > > > +
> > > > > > + v4si test3 (v4si v, int i)
> > > > > > + {
> > > > > > +   ((int *)&v)[3] = i;
> > > > > > +   return v;
> > > > > > + }
> > > > > > +
> > > > > > + v4si test4 (v4si v, int i)
> > > > > > + {
> > > > > > +   int *p = (int *)&v;
> > > > > > +   p += 3;
> > > > > > +   *p = i;
> > > > > > +   return v;
> > > > > > + }
> > > > > > +
> > > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> > > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> > > > >
> > > > >
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2020-01-07  9:40           ` Andrew Pinski
@ 2020-01-07 10:04             ` Richard Biener
  2020-01-07 11:14               ` Richard Sandiford
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2020-01-07 10:04 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: GCC Patches


On Tue, 7 Jan 2020, Andrew Pinski wrote:

> On Mon, Jan 6, 2020 at 11:36 PM Richard Biener <rguenther@suse.de> wrote:
> >
> > On Mon, 16 Dec 2019, Andrew Pinski wrote:
> >
> > > On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > > On Thu, 15 Nov 2018, Richard Biener wrote:
> > > >
> > > > > On Wed, 14 Nov 2018, Andrew Pinski wrote:
> > > > >
> > > > > > On Fri, May 13, 2016 at 3:51 AM Richard Biener <rguenther@suse.de> wrote:
> > > > > > >
> > > > > > >
> > > > > > > The following patch adds BIT_FIELD_INSERT, an operation to
> > > > > > > facilitate doing bitfield inserts on registers (as opposed
> > > > > > > to currently where we'd have a BIT_FIELD_REF store).
> > > > > > >
> > > > > > > Originally this was developed as part of bitfield lowering
> > > > > > > where bitfield stores were lowered into read-modify-write
> > > > > > > cycles and the modify part, instead of doing shifting and masking,
> > > > > > > be kept in a more high-level form to ease combining them.
> > > > > > >
> > > > > > > A second use case (the above is still valid) is vector element
> > > > > > > inserts which we currently can only do via memory or
> > > > > > > by extracting all components and re-building the vector using
> > > > > > > a CONSTRUCTOR.  For this second use case I added code
> > > > > > > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > > > > > > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > > > > > > re-write a decl into SSA form (the testcase shows we miss
> > > > > > > a similar opportunity with the MEM_REF form of a vector insert,
> > > > > > > I plan to fix that for the final submission).
> > > > > > >
> > > > > > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > > > > > > > is that the size of the insertion is given implicitly via the
> > > > > > > type size/precision of the value to insert.  That avoids
> > > > > > > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> > > > > > >
> > > > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > > > > > >
> > > > > > > Richard.
> > > > > > >
> > > > > > > 2011-06-16  Richard Guenther  <rguenther@suse.de>
> > > > > > >
> > > > > > >         PR tree-optimization/29756
> > > > > > >         * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > > > > > >         * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > > > > > >         * fold-const.c (operand_equal_p): Likewise.
> > > > > > >         (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > > > > > >         * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > > > > > >         * tree-inline.c (estimate_operator_cost): Likewise.
> > > > > > >         * tree-pretty-print.c (dump_generic_node): Likewise.
> > > > > > >         * tree-ssa-operands.c (get_expr_operands): Likewise.
> > > > > > >         * cfgexpand.c (expand_debug_expr): Likewise.
> > > > > > >         * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > > > > > >         * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > > > > > >         * tree-cfg.c (verify_gimple_assign_ternary): Verify BIT_FIELD_INSERT.
> > > > > > >
> > > > > > >         * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > > > > > >         vector inserts using BIT_FIELD_REF on the lhs.
> > > > > > >         (execute_update_addresses_taken): Do it.
> > > > > > >
> > > > > > >         * gcc.dg/tree-ssa/vector-6.c: New testcase.
> > > > > > >
> > > > > > > Index: trunk/gcc/expr.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/expr.c       2016-05-12 13:40:30.704262951 +0200
> > > > > > > --- trunk/gcc/expr.c    2016-05-12 15:40:32.481225744 +0200
> > > > > > > *************** expand_expr_real_2 (sepops ops, rtx targ
> > > > > > > *** 9358,9363 ****
> > > > > > > --- 9358,9380 ----
> > > > > > >         target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
> > > > > > >         return target;
> > > > > > >
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > > +       {
> > > > > > > +       unsigned bitpos = tree_to_uhwi (treeop2);
> > > > > > > +       unsigned bitsize;
> > > > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > > > > > > +         bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > > > > > > +       else
> > > > > > > +         bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > > > > > > +       rtx op0 = expand_normal (treeop0);
> > > > > > > +       rtx op1 = expand_normal (treeop1);
> > > > > > > +       rtx dst = gen_reg_rtx (mode);
> > > > > > > +       emit_move_insn (dst, op0);
> > > > > > > +       store_bit_field (dst, bitsize, bitpos, 0, 0,
> > > > > > > +                        TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > > > > > > +       return dst;
> > > > > > > +       }
> > > > > > > +
> > > > > > >       default:
> > > > > > >         gcc_unreachable ();
> > > > > > >       }
> > > > > > > Index: trunk/gcc/fold-const.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > > > > > > --- trunk/gcc/fold-const.c      2016-05-13 09:41:13.509812127 +0200
> > > > > > > *************** operand_equal_p (const_tree arg0, const_
> > > > > > > *** 3163,3168 ****
> > > > > > > --- 3163,3169 ----
> > > > > > >
> > > > > > >         case VEC_COND_EXPR:
> > > > > > >         case DOT_PROD_EXPR:
> > > > > > > +       case BIT_FIELD_INSERT:
> > > > > > >           return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> > > > > > >
> > > > > > >         default:
> > > > > > > *************** fold_ternary_loc (location_t loc, enum t
> > > > > > > *** 11870,11875 ****
> > > > > > > --- 11871,11916 ----
> > > > > > >         }
> > > > > > >         return NULL_TREE;
> > > > > > >
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > > +       /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > > > > > > +       if (TREE_CODE (arg0) == INTEGER_CST
> > > > > > > +         && TREE_CODE (arg1) == INTEGER_CST)
> > > > > > > +       {
> > > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > > > +         unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > > > > > > +         wide_int tem = wi::bit_and (arg0,
> > > > > > > +                                     wi::shifted_mask (bitpos, bitsize, true,
> > > > > > > +                                                       TYPE_PRECISION (type)));
> > > > > > > +         wide_int tem2
> > > > > > > +           = wi::lshift (wi::zext (wi::to_wide (arg1, TYPE_PRECISION (type)),
> > > > > > > +                                   bitsize), bitpos);
> > > > > > > +         return wide_int_to_tree (type, wi::bit_or (tem, tem2));
> > > > > > > +       }
> > > > > >
> > > > > > > This seems incorrect for the BYTES_BIG_ENDIAN case, as far as I
> > > > > > > can tell.  With BYTES_BIG_ENDIAN, bit positions count from the most
> > > > > > > significant bit rather than from the least significant one.
> > > > >
> > > > > You mean the bitpos operand of BIT_FIELD_INSERT works in a different way?
> > > > > I see the BIT_FIELD_REF folding uses native_encode/interpret but only
> > > > > handles byte-aligned references.  I suppose the BIT_INSERT_EXPR case
> > > > > (you are following up an old patch) could do the same.
> > > > >
> > > > > > >  Sorry I am bringing
> > > > > > this up after this has been in the tree for a long time but I finally
> > > > > > got around to testing my bit-field lowering on a few big-endian
> > > > > > targets and ran into this issue.
> > > > >
> > > > > So - can you fix it please?  Also note that the VECTOR_CST case
> > > > > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
> > > > > "bits" in a different way?
> > > >
> > > > And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
> > > > in generic.texi, together with those "details".
> > >
> > > This is the fix:
> > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > > index 8e9e299..a919b63 100644
> > > --- a/gcc/fold-const.c
> > > +++ b/gcc/fold-const.c
> > > @@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
> > >         {
> > >           unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > >           unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > > +         if (BYTES_BIG_ENDIAN)
> > > +           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
> > >           wide_int tem = (wi::to_wide (arg0)
> > >                           & wi::shifted_mask (bitpos, bitsize, true,
> > >                                               TYPE_PRECISION (type)));
> >
> > I guess you need to guard against BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
> > as well.
> 
> Yes I will add that check.
> 
> >  Also the above only works reliably for mode-precision
> > integers?  We might want to disallow BIT_FIELD_REF/BIT_INSERT_EXPR
> > on non-mode-precision entities in the GIMPLE/GENERIC verifiers.
> 
> You added that check already for gimple in r268332 due to PR88739.
> BIT_FIELD_REF around tree-cfg.c:3083
> BIT_INSERT_EXPR  around tree-cfg.c:4324

Ah, good ;)  Note that neither BIT_FIELD_REF nor BIT_INSERT_EXPR is
documented in generic.texi, and tree.def documents BIT_FIELD_REF
as operating on structs/unions (well, memory).  For register arguments
we interpret it as storing the register to memory and then interpreting
the bit positions in memory bit terms (with the store doing the endian
fiddling).  For vector (register only?) accesses, however, we interpret
it as specifying lane numbers, yet at least the BIT_FIELD_REF verifier
doesn't barf on bit positions/sizes that do not correspond to exact
vector lanes (and I know we introduce non-matching ones via at least
VIEW_CONVERT "merging" into BIT_FIELD_REFs).  I have this in one of my trees:

Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c      (revision 279944)
+++ gcc/tree-cfg.c      (working copy)
@@ -3094,6 +3094,26 @@ verify_types_in_gimple_reference (tree e
                     "%qs", code_name);
              return true;
            }
+         if (0 && VECTOR_TYPE_P (TREE_TYPE (op)))
+           {
+             if ((!VECTOR_TYPE_P (TREE_TYPE (expr))
+                  && !useless_type_conversion_p (TREE_TYPE (expr),
+                                                 TREE_TYPE (TREE_TYPE (op))))
+                 || (VECTOR_TYPE_P (TREE_TYPE (expr))
+                     && !useless_type_conversion_p (TREE_TYPE (TREE_TYPE (expr)),
+                                                    TREE_TYPE (TREE_TYPE (op)))))
+               {
+                 error ("invalid types in vector extract");
+                 debug_generic_stmt (TREE_TYPE (expr));
+                 debug_generic_stmt (TREE_TYPE (op));
+                 return true;
+               }
+             if (!multiple_p (bit_field_offset (expr), bit_field_size (expr)))
+               {
+                 error ("unaligned vector extract");
+                 return true;
+               }
+           }
        }
 
       if ((TREE_CODE (expr) == REALPART_EXPR

I guess extracting elements of the same mode but different sign should be OK;
I'm not sure about extracting SImode from a V4SFmode vector (I guess that is OK
as well).


> Thanks,
> Andrew Pinski
> 
> >
> > > ---- CUT ----
> > > I will do a full test in a little bit with the other patch I attached
> > > to related bugzilla.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > >
> > > > Richard.
> > > >
> > > > > Thanks,
> > > > > Richard.
> > > > >
> > > > > > Thanks,
> > > > > > Andrew Pinski
> > > > > >
> > > > > > > +       else if (TREE_CODE (arg0) == VECTOR_CST
> > > > > > > +              && CONSTANT_CLASS_P (arg1)
> > > > > > > +              && types_compatible_p (TREE_TYPE (TREE_TYPE (arg0)),
> > > > > > > +                                     TREE_TYPE (arg1)))
> > > > > > > +       {
> > > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > > > > > > +         unsigned HOST_WIDE_INT elsize
> > > > > > > +           = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg1)));
> > > > > > > +         if (bitpos % elsize == 0)
> > > > > > > +           {
> > > > > > > +             unsigned k = bitpos / elsize;
> > > > > > > +             if (operand_equal_p (VECTOR_CST_ELT (arg0, k), arg1, 0))
> > > > > > > +               return arg0;
> > > > > > > +             else
> > > > > > > +               {
> > > > > > > +                 tree *elts = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type));
> > > > > > > +                 memcpy (elts, VECTOR_CST_ELTS (arg0),
> > > > > > > +                         sizeof (tree) * TYPE_VECTOR_SUBPARTS (type));
> > > > > > > +                 elts[k] = arg1;
> > > > > > > +                 return build_vector (type, elts);
> > > > > > > +               }
> > > > > > > +           }
> > > > > > > +       }
> > > > > > > +       return NULL_TREE;
> > > > > > > +
> > > > > > >       default:
> > > > > > >         return NULL_TREE;
> > > > > > >       } /* switch (code) */
> > > > > > > Index: trunk/gcc/gimplify.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/gimplify.c   2016-05-12 13:40:30.704262951 +0200
> > > > > > > --- trunk/gcc/gimplify.c        2016-05-12 13:56:18.679120641 +0200
> > > > > > > *************** gimplify_expr (tree *expr_p, gimple_seq
> > > > > > > *** 10936,10941 ****
> > > > > > > --- 10936,10945 ----
> > > > > > >           /* Classified as tcc_expression.  */
> > > > > > >           goto expr_3;
> > > > > > >
> > > > > > > +       case BIT_FIELD_INSERT:
> > > > > > > +         /* Argument 3 is a constant.  */
> > > > > > > +         goto expr_2;
> > > > > > > +
> > > > > > >         case POINTER_PLUS_EXPR:
> > > > > > >           {
> > > > > > >             enum gimplify_status r0, r1;
> > > > > > > Index: trunk/gcc/tree-inline.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/tree-inline.c        2016-05-12 13:40:30.704262951 +0200
> > > > > > > --- trunk/gcc/tree-inline.c     2016-05-12 13:42:45.465811959 +0200
> > > > > > > *************** estimate_operator_cost (enum tree_code c
> > > > > > > *** 3941,3946 ****
> > > > > > > --- 3941,3950 ----
> > > > > > >           return weights->div_mod_cost;
> > > > > > >         return 1;
> > > > > > >
> > > > > > > +     /* Bit-field insertion needs several shift and mask operations.  */
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > > +       return 3;
> > > > > > > +
> > > > > > >       default:
> > > > > > >         /* We expect a copy assignment with no operator.  */
> > > > > > >         gcc_assert (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS);
> > > > > > > Index: trunk/gcc/tree-pretty-print.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/tree-pretty-print.c  2016-05-12 13:40:30.704262951 +0200
> > > > > > > --- trunk/gcc/tree-pretty-print.c       2016-05-12 14:30:05.781944740 +0200
> > > > > > > *************** dump_generic_node (pretty_printer *pp, t
> > > > > > > *** 1876,1881 ****
> > > > > > > --- 1876,1898 ----
> > > > > > >         pp_greater (pp);
> > > > > > >         break;
> > > > > > >
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > > +       pp_string (pp, "BIT_FIELD_INSERT <");
> > > > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> > > > > > > +       pp_string (pp, ", ");
> > > > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
> > > > > > > +       pp_string (pp, ", ");
> > > > > > > +       dump_generic_node (pp, TREE_OPERAND (node, 2), spc, flags, false);
> > > > > > > +       pp_string (pp, " (");
> > > > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (node, 1))))
> > > > > > > +       pp_decimal_int (pp,
> > > > > > > +                       TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (node, 1))));
> > > > > > > +       else
> > > > > > > +       dump_generic_node (pp, TYPE_SIZE (TREE_TYPE (TREE_OPERAND (node, 1))),
> > > > > > > +                          spc, flags, false);
> > > > > > > +       pp_string (pp, " bits)>");
> > > > > > > +       break;
> > > > > > > +
> > > > > > >       case ARRAY_REF:
> > > > > > >       case ARRAY_RANGE_REF:
> > > > > > >         op0 = TREE_OPERAND (node, 0);
> > > > > > > Index: trunk/gcc/tree-ssa-operands.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/tree-ssa-operands.c  2016-05-12 13:42:45.465811959 +0200
> > > > > > > --- trunk/gcc/tree-ssa-operands.c       2016-05-12 13:48:26.881736503 +0200
> > > > > > > *************** get_expr_operands (struct function *fn,
> > > > > > > *** 833,838 ****
> > > > > > > --- 833,839 ----
> > > > > > >         get_expr_operands (fn, stmt, &TREE_OPERAND (expr, 0), flags);
> > > > > > >         return;
> > > > > > >
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > >       case COMPOUND_EXPR:
> > > > > > >       case OBJ_TYPE_REF:
> > > > > > >       case ASSERT_EXPR:
> > > > > > > Index: trunk/gcc/tree.def
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/tree.def     2016-05-12 13:40:30.704262951 +0200
> > > > > > > --- trunk/gcc/tree.def  2016-05-12 13:47:09.972852423 +0200
> > > > > > > *************** DEFTREECODE (ADDR_EXPR, "addr_expr", tcc
> > > > > > > *** 852,857 ****
> > > > > > > --- 852,868 ----
> > > > > > >      descriptor of type ptr_mode.  */
> > > > > > >   DEFTREECODE (FDESC_EXPR, "fdesc_expr", tcc_expression, 2)
> > > > > > >
> > > > > > > + /* Given a word, a value and a bitfield position within the word,
> > > > > > > +    produce the value that results if replacing the
> > > > > > > +    described parts of word with value.
> > > > > > > +    Operand 0 is a tree for the word of integral type;
> > > > > > > +    Operand 1 is a tree for the value of integral type;
> > > > > > > +    Operand 2 is a tree giving the constant position of the first referenced bit;
> > > > > > > +    The number of bits replaced is given by the precision of the value
> > > > > > > +    type if that is integral or by its size if it is non-integral.
> > > > > > > +    The replaced bits shall be fully inside the word.  */
> > > > > > > + DEFTREECODE (BIT_FIELD_INSERT, "bit_field_insert", tcc_expression, 3)
> > > > > > > +
> > > > > > >   /* Given two real or integer operands of the same type,
> > > > > > >      returns a complex value of the corresponding complex type.  */
> > > > > > >   DEFTREECODE (COMPLEX_EXPR, "complex_expr", tcc_binary, 2)
> > > > > > > Index: trunk/gcc/cfgexpand.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/cfgexpand.c  2016-05-12 13:42:45.469812005 +0200
> > > > > > > --- trunk/gcc/cfgexpand.c       2016-05-13 11:48:04.513407495 +0200
> > > > > > > *************** expand_debug_expr (tree exp)
> > > > > > > *** 5025,5030 ****
> > > > > > > --- 5025,5031 ----
> > > > > > >       case FIXED_CONVERT_EXPR:
> > > > > > >       case OBJ_TYPE_REF:
> > > > > > >       case WITH_SIZE_EXPR:
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > >         return NULL;
> > > > > > >
> > > > > > >       case DOT_PROD_EXPR:
> > > > > > > Index: trunk/gcc/gimple-pretty-print.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/gimple-pretty-print.c        2016-05-12 11:23:09.261375157 +0200
> > > > > > > --- trunk/gcc/gimple-pretty-print.c     2016-05-12 14:57:22.096175579 +0200
> > > > > > > *************** dump_ternary_rhs (pretty_printer *buffer
> > > > > > > *** 479,484 ****
> > > > > > > --- 479,502 ----
> > > > > > >         pp_greater (buffer);
> > > > > > >         break;
> > > > > > >
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > > +       pp_string (buffer, "BIT_FIELD_INSERT <");
> > > > > > > +       dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
> > > > > > > +       pp_string (buffer, ", ");
> > > > > > > +       dump_generic_node (buffer, gimple_assign_rhs2 (gs), spc, flags, false);
> > > > > > > +       pp_string (buffer, ", ");
> > > > > > > +       dump_generic_node (buffer, gimple_assign_rhs3 (gs), spc, flags, false);
> > > > > > > +       pp_string (buffer, " (");
> > > > > > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs2 (gs))))
> > > > > > > +       pp_decimal_int (buffer,
> > > > > > > +                       TYPE_PRECISION (TREE_TYPE (gimple_assign_rhs2 (gs))));
> > > > > > > +       else
> > > > > > > +       dump_generic_node (buffer,
> > > > > > > +                          TYPE_SIZE (TREE_TYPE (gimple_assign_rhs2 (gs))),
> > > > > > > +                          spc, flags, false);
> > > > > > > +       pp_string (buffer, " bits)>");
> > > > > > > +       break;
> > > > > > > +
> > > > > > >       default:
> > > > > > >         gcc_unreachable ();
> > > > > > >       }
> > > > > > > Index: trunk/gcc/gimple.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/gimple.c     2016-05-12 13:40:30.704262951 +0200
> > > > > > > --- trunk/gcc/gimple.c  2016-05-12 14:49:37.066994969 +0200
> > > > > > > *************** get_gimple_rhs_num_ops (enum tree_code c
> > > > > > > *** 2044,2049 ****
> > > > > > > --- 2044,2050 ----
> > > > > > >         || (SYM) == REALIGN_LOAD_EXPR                                       \
> > > > > > >         || (SYM) == VEC_COND_EXPR                                                   \
> > > > > > >         || (SYM) == VEC_PERM_EXPR                                             \
> > > > > > > +       || (SYM) == BIT_FIELD_INSERT                                        \
> > > > > > >         || (SYM) == FMA_EXPR) ? GIMPLE_TERNARY_RHS                          \
> > > > > > >      : ((SYM) == CONSTRUCTOR                                                \
> > > > > > >         || (SYM) == OBJ_TYPE_REF                                                    \
> > > > > > > Index: trunk/gcc/tree-cfg.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/tree-cfg.c   2016-05-06 14:38:33.959495081 +0200
> > > > > > > --- trunk/gcc/tree-cfg.c        2016-05-13 09:25:01.670630730 +0200
> > > > > > > *************** verify_gimple_assign_ternary (gassign *s
> > > > > > > *** 4155,4160 ****
> > > > > > > --- 4155,4207 ----
> > > > > > >
> > > > > > >         return false;
> > > > > > >
> > > > > > > +     case BIT_FIELD_INSERT:
> > > > > > > +       if (! useless_type_conversion_p (lhs_type, rhs1_type))
> > > > > > > +       {
> > > > > > > +         error ("type mismatch in BIT_FIELD_INSERT");
> > > > > > > +         debug_generic_expr (lhs_type);
> > > > > > > +         debug_generic_expr (rhs1_type);
> > > > > > > +         return true;
> > > > > > > +       }
> > > > > > > +       if (! ((INTEGRAL_TYPE_P (rhs1_type)
> > > > > > > +             && INTEGRAL_TYPE_P (rhs2_type))
> > > > > > > +            || (VECTOR_TYPE_P (rhs1_type)
> > > > > > > +                && types_compatible_p (TREE_TYPE (rhs1_type), rhs2_type))))
> > > > > > > +       {
> > > > > > > +         error ("not allowed type combination in BIT_FIELD_INSERT");
> > > > > > > +         debug_generic_expr (rhs1_type);
> > > > > > > +         debug_generic_expr (rhs2_type);
> > > > > > > +         return true;
> > > > > > > +       }
> > > > > > > +       if (! tree_fits_uhwi_p (rhs3)
> > > > > > > +         || ! tree_fits_uhwi_p (TYPE_SIZE (rhs2_type)))
> > > > > > > +       {
> > > > > > > +         error ("invalid position or size in BIT_FIELD_INSERT");
> > > > > > > +         return true;
> > > > > > > +       }
> > > > > > > +       if (INTEGRAL_TYPE_P (rhs1_type))
> > > > > > > +       {
> > > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > > > +         if (bitpos >= TYPE_PRECISION (rhs1_type)
> > > > > > > +             || (bitpos + TYPE_PRECISION (rhs2_type)
> > > > > > > +                 > TYPE_PRECISION (rhs1_type)))
> > > > > > > +           {
> > > > > > > +             error ("insertion out of range in BIT_FIELD_INSERT");
> > > > > > > +             return true;
> > > > > > > +           }
> > > > > > > +       }
> > > > > > > +       else if (VECTOR_TYPE_P (rhs1_type))
> > > > > > > +       {
> > > > > > > +         unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (rhs3);
> > > > > > > +         unsigned HOST_WIDE_INT bitsize = tree_to_uhwi (TYPE_SIZE (rhs2_type));
> > > > > > > +         if (bitpos % bitsize != 0)
> > > > > > > +           {
> > > > > > > +             error ("vector insertion not at element boundary");
> > > > > > > +             return true;
> > > > > > > +           }
> > > > > > > +       }
> > > > > > > +       return false;
> > > > > > > +
> > > > > > >       case DOT_PROD_EXPR:
> > > > > > >       case REALIGN_LOAD_EXPR:
> > > > > > >         /* FIXME.  */
> > > > > > > Index: trunk/gcc/tree-ssa.c
> > > > > > > ===================================================================
> > > > > > > *** trunk.orig/gcc/tree-ssa.c   2016-05-13 09:38:02.263611726 +0200
> > > > > > > --- trunk/gcc/tree-ssa.c        2016-05-13 09:50:31.020226585 +0200
> > > > > > > *************** non_rewritable_lvalue_p (tree lhs)
> > > > > > > *** 1318,1323 ****
> > > > > > > --- 1318,1335 ----
> > > > > > >         return false;
> > > > > > >       }
> > > > > > >
> > > > > > > +   /* A vector-insert using a BIT_FIELD_REF is rewritable using
> > > > > > > +      BIT_FIELD_INSERT.  */
> > > > > > > +   if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > > > +       && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > > > +       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > > > +       /* && bitsize % element-size == 0 */
> > > > > > > +       && types_compatible_p (TREE_TYPE (lhs),
> > > > > > > +                            TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0))))
> > > > > > > +       && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > > > +         % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))) == 0)
> > > > > > > +     return false;
> > > > > > > +
> > > > > > >     return true;
> > > > > > >   }
> > > > > > >
> > > > > > > *************** execute_update_addresses_taken (void)
> > > > > > > *** 1536,1541 ****
> > > > > > > --- 1548,1576 ----
> > > > > > >                     stmt = gsi_stmt (gsi);
> > > > > > >                     unlink_stmt_vdef (stmt);
> > > > > > >                     update_stmt (stmt);
> > > > > > > +                   continue;
> > > > > > > +                 }
> > > > > > > +
> > > > > > > +               /* Rewrite a vector insert via a BIT_FIELD_REF on the LHS
> > > > > > > +                  into a BIT_FIELD_INSERT.  */
> > > > > > > +               if (TREE_CODE (lhs) == BIT_FIELD_REF
> > > > > > > +                   && DECL_P (TREE_OPERAND (lhs, 0))
> > > > > > > +                   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
> > > > > > > +                   && types_compatible_p (TREE_TYPE (lhs),
> > > > > > > +                                          TREE_TYPE (TREE_TYPE
> > > > > > > +                                                      (TREE_OPERAND (lhs, 0))))
> > > > > > > +                   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
> > > > > > > +                       % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
> > > > > > > +                 {
> > > > > > > +                   tree var = TREE_OPERAND (lhs, 0);
> > > > > > > +                   tree val = gimple_assign_rhs1 (stmt);
> > > > > > > +                   tree bitpos = TREE_OPERAND (lhs, 2);
> > > > > > > +                   gimple_assign_set_lhs (stmt, var);
> > > > > > > +                   gimple_assign_set_rhs_with_ops
> > > > > > > +                     (&gsi, BIT_FIELD_INSERT, var, val, bitpos);
> > > > > > > +                   stmt = gsi_stmt (gsi);
> > > > > > > +                   unlink_stmt_vdef (stmt);
> > > > > > > +                   update_stmt (stmt);
> > > > > > >                     continue;
> > > > > > >                   }
> > > > > > >
> > > > > > > Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c
> > > > > > > ===================================================================
> > > > > > > *** /dev/null   1970-01-01 00:00:00.000000000 +0000
> > > > > > > --- trunk/gcc/testsuite/gcc.dg/tree-ssa/vector-6.c      2016-05-13 09:54:16.026814995 +0200
> > > > > > > ***************
> > > > > > > *** 0 ****
> > > > > > > --- 1,34 ----
> > > > > > > + /* { dg-do compile } */
> > > > > > > + /* { dg-options "-O -fdump-tree-ccp1" } */
> > > > > > > +
> > > > > > > + typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> > > > > > > +
> > > > > > > + v4si test1 (v4si v, int i)
> > > > > > > + {
> > > > > > > +   ((int *)&v)[0] = i;
> > > > > > > +   return v;
> > > > > > > + }
> > > > > > > +
> > > > > > > + v4si test2 (v4si v, int i)
> > > > > > > + {
> > > > > > > +   int *p = (int *)&v;
> > > > > > > +   *p = i;
> > > > > > > +   return v;
> > > > > > > + }
> > > > > > > +
> > > > > > > + v4si test3 (v4si v, int i)
> > > > > > > + {
> > > > > > > +   ((int *)&v)[3] = i;
> > > > > > > +   return v;
> > > > > > > + }
> > > > > > > +
> > > > > > > + v4si test4 (v4si v, int i)
> > > > > > > + {
> > > > > > > +   int *p = (int *)&v;
> > > > > > > +   p += 3;
> > > > > > > +   *p = i;
> > > > > > > +   return v;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 2 "ccp1" } } */
> > > > > > > + /* { dg-final { scan-tree-dump-times "Now a gimple register: v" 4 "ccp1" { xfail *-*-* } } } */
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2020-01-07 10:04             ` Richard Biener
@ 2020-01-07 11:14               ` Richard Sandiford
  2020-01-07 11:38                 ` Richard Biener
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Sandiford @ 2020-01-07 11:14 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, GCC Patches

Richard Biener <rguenther@suse.de> writes:
> On Tue, 7 Jan 2020, Andrew Pinski wrote:
>
>> On Mon, Jan 6, 2020 at 11:36 PM Richard Biener <rguenther@suse.de> wrote:
>> >
>> > On Mon, 16 Dec 2019, Andrew Pinski wrote:
>> >
>> > > On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
>> > > >
>> > > > On Thu, 15 Nov 2018, Richard Biener wrote:
>> > > >
>> > > > > So - can you fix it please?  Also note that the VECTOR_CST case
>> > > > > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
>> > > > > "bits" in a different way?
>> > > >
>> > > > And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
>> > > > in generic.texi, together with those "details".
>> > >
>> > > This is the fix:
>> > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
>> > > index 8e9e299..a919b63 100644
>> > > --- a/gcc/fold-const.c
>> > > +++ b/gcc/fold-const.c
>> > > @@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum
>> > > tree_code code, tree type,
>> > >         {
>> > >           unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
>> > >           unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
>> > > +         if (BYTES_BIG_ENDIAN)
>> > > +           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
>> > >           wide_int tem = (wi::to_wide (arg0)
>> > >                           & wi::shifted_mask (bitpos, bitsize, true,
>> > >                                               TYPE_PRECISION (type)));
>> >
>> > I guess you need to guard against BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
>> > as well.
>> 
>> Yes I will add that check.
>> 
>> >  Also the above only works reliably for mode-precision
>> > integers?  We might want to disallow BIT_FIELD_REF/BIT_INSERT_EXPR
>> > on non-mode-precision entities in the GIMPLE/GENERIC verifiers.
>> 
>> You added that check already for gimple in r268332 due to PR88739.
>> BIT_FIELD_REF around tree-cfg.c:3083
>> BIT_INSERT_EXPR  around tree-cfg.c:4324
>
> Ah, good ;)  Note neither BIT_FIELD_REF nor BIT_INSERT_EXPR are
> documented in generic.texi and BIT_FIELD_REF is documented in tree.def
> as operating on structs/unions (well, memory).  And for register args
> we interpret it as storing the register to memory and interpreting
> the bit positions in memory bit terms (with the store doing endian
> fiddling).

Ah, was going to ask what the semantics were. :-)  That sounds good
because it's essentially the same as for non-paradoxical subregs.
We have routines like subreg_lsb_1 and subreg_offset_from_lsb that
convert byte offsets to shift amounts, so maybe we should move them
to code that's common to both gimple and rtl.  The BYTES_BIG_ENDIAN !=
WORDS_BIG_ENDIAN calculation is quite subtle, so I don't think we should
open-code it everywhere we need it.
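
As a minimal sketch of what such a shared helper might look like for the
BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN case, the standalone C below mirrors
the bitpos adjustment from the fold-const.c fix quoted above.  The names
bitpos_to_lsb_shift and insert_bits are hypothetical, not existing GCC
routines, and plain 64-bit integers stand in for wide_int:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): convert a memory-order bit
   position into a shift amount counted from the least significant bit.
   Assumes BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN and precision <= 64.  */
static unsigned
bitpos_to_lsb_shift (unsigned precision, unsigned bitpos,
                     unsigned bitsize, bool big_endian)
{
  assert (precision <= 64 && bitsize >= 1 && bitsize <= precision);
  assert (bitpos <= precision - bitsize);
  return big_endian ? precision - bitpos - bitsize : bitpos;
}

/* Replace BITSIZE bits of WORD at memory-order position BITPOS with the
   low BITSIZE bits of VALUE.  */
static uint64_t
insert_bits (uint64_t word, uint64_t value, unsigned precision,
             unsigned bitpos, unsigned bitsize, bool big_endian)
{
  unsigned shift = bitpos_to_lsb_shift (precision, bitpos, bitsize,
                                        big_endian);
  /* Avoid an undefined 64-bit shift when the field covers the word.  */
  uint64_t mask = bitsize == 64 ? ~UINT64_C (0)
                                : (UINT64_C (1) << bitsize) - 1;
  return (word & ~(mask << shift)) | ((value & mask) << shift);
}
```

With the 32-bit word 0x12345678, inserting the byte 0xAB at bitpos 0
replaces the low byte for little-endian and the high byte for big-endian.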

What about the subbyte part of the bit value?  Does that always count
from the lsb of the containing byte?  E.g. for the four bytes:

  0x12, 0x34, 0x56, 0x78

what does bit offset == 3, bit size == 7 mean for big-endian?
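
For concreteness, here is one candidate reading worked through in C.  It
is purely an assumption for illustration, not a statement of what GCC
implements: the four bytes are treated as the big-endian 32-bit value
0x12345678, bit offsets count from the most significant bit of the first
byte, and the field is located with the same precision - bitpos - bitsize
adjustment as in the fold-const.c fix.  The names example_bytes and
extract_bits_msb_first are made up for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* The four bytes from the question, in memory order.  */
static const unsigned char example_bytes[4] = { 0x12, 0x34, 0x56, 0x78 };

/* Hypothetical: extract BITSIZE bits starting BITPOS bits after the msb
   of BYTES[0], reading the bytes as one big-endian 32-bit value.  */
static uint32_t
extract_bits_msb_first (const unsigned char bytes[4], unsigned bitpos,
                        unsigned bitsize)
{
  assert (bitsize >= 1 && bitsize <= 32 && bitpos <= 32 - bitsize);
  uint32_t value = ((uint32_t) bytes[0] << 24) | ((uint32_t) bytes[1] << 16)
                   | ((uint32_t) bytes[2] << 8) | (uint32_t) bytes[3];
  unsigned shift = 32 - bitpos - bitsize;
  uint32_t mask = bitsize == 32 ? UINT32_MAX
                                : (UINT32_C (1) << bitsize) - 1;
  return (value >> shift) & mask;
}
```

Under that reading, offset 3 with size 7 selects the bit pattern 1001000
(0x48): the low five bits of 0x12 followed by the top two bits of 0x34.
Counting the subbyte part from the lsb of the containing byte would
select different bits, which is the ambiguity being asked about.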

> But for vector (register only?) accesses we interpret
> it as specifying lane numbers but at least BIT_FIELD_REF verifying
> doesn't barf on bit/sizes not corresponding to exact vector lanes
> (and I know we introduce non-matching ones via at least VIEW_CONVERT
> "merging" into BIT_FIELD_REFs).

GCC's vector lane numbering is equivalent to array index numbering for
all endiannesses, so these cases should still be ok for BYTES_BIG_ENDIAN
== WORDS_BIG_ENDIAN, at least on byte boundaries.  Not sure about
subbyte boundaries -- guess it depends on the answer to the question
above.

For BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN, any sequence that crosses
a word boundary can lead to ranges that aren't contiguous in registers,
unless the range starts and ends on a word boundary.  This would include
some of those vector cases, but it could also include bitfield references
to multiword integers.
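
To illustrate the non-contiguity, here is a small sketch assuming 4-byte
words.  The function mem_byte_to_reg_lsb is hypothetical (neither the
name nor the fixed word size comes from GCC); it maps a memory byte
number to the position of that byte's lsb within a multiword register
value:

```c
#include <assert.h>
#include <stdbool.h>

enum { WORD_BYTES = 4 };  /* Assumed word size for this sketch.  */

/* Hypothetical: return the register bit position (counted from the lsb)
   of the lowest bit of memory byte BYTE in a TOTAL_BYTES-wide value.  */
static unsigned
mem_byte_to_reg_lsb (unsigned total_bytes, unsigned byte,
                     bool bytes_big_endian, bool words_big_endian)
{
  assert (total_bytes % WORD_BYTES == 0 && byte < total_bytes);
  unsigned nwords = total_bytes / WORD_BYTES;
  unsigned word = byte / WORD_BYTES;  /* Word number in memory order.  */
  unsigned sub = byte % WORD_BYTES;   /* Byte number within that word.  */
  if (words_big_endian)
    word = nwords - 1 - word;         /* Lowest-numbered word is most significant.  */
  if (bytes_big_endian)
    sub = WORD_BYTES - 1 - sub;       /* Lowest-numbered byte is most significant.  */
  return (word * WORD_BYTES + sub) * 8;
}
```

For an 8-byte value with BYTES_BIG_ENDIAN == 0 and WORDS_BIG_ENDIAN == 1,
memory bytes 3 and 4 map to register bits 56 and 0 respectively, so a
16-bit field covering those two bytes ends up split across both ends of
the register value.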

Subregs that cross a word boundary must start and end on a word boundary,
but I guess that would be too draconian for gimple.

Thanks,
Richard

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2020-01-07 11:14               ` Richard Sandiford
@ 2020-01-07 11:38                 ` Richard Biener
  2020-01-07 11:52                   ` Richard Sandiford
  0 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2020-01-07 11:38 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Andrew Pinski, GCC Patches

On Tue, 7 Jan 2020, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > On Tue, 7 Jan 2020, Andrew Pinski wrote:
> >
> >> On Mon, Jan 6, 2020 at 11:36 PM Richard Biener <rguenther@suse.de> wrote:
> >> >
> >> > On Mon, 16 Dec 2019, Andrew Pinski wrote:
> >> >
> >> > > On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
> >> > > >
> >> > > > On Thu, 15 Nov 2018, Richard Biener wrote:
> >> > > >
> >> > > > > So - can you fix it please?  Also note that the VECTOR_CST case
> >> > > > > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
> >> > > > > "bits" in a different way?
> >> > > >
> >> > > > And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
> >> > > > in generic.texi, together with those "details".
> >> > >
> >> > > This is the fix:
> >> > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> >> > > index 8e9e299..a919b63 100644
> >> > > --- a/gcc/fold-const.c
> >> > > +++ b/gcc/fold-const.c
> >> > > @@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum
> >> > > tree_code code, tree type,
> >> > >         {
> >> > >           unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> >> > >           unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> >> > > +         if (BYTES_BIG_ENDIAN)
> >> > > +           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
> >> > >           wide_int tem = (wi::to_wide (arg0)
> >> > >                           & wi::shifted_mask (bitpos, bitsize, true,
> >> > >                                               TYPE_PRECISION (type)));
> >> >
> >> > I guess you need to guard against BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
> >> > as well.
> >> 
> >> Yes I will add that check.
> >> 
> >> >  Also the above only works reliably for mode-precision
> >> > integers?  We might want to disallow BIT_FIELD_REF/BIT_INSERT_EXPR
> >> > on non-mode-precision entities in the GIMPLE/GENERIC verifiers.
> >> 
> >> You added that check already for gimple in r268332 due to PR88739.
> >> BIT_FIELD_REF around tree-cfg.c:3083
> >> BIT_INSERT_EXPR  around tree-cfg.c:4324
> >
> > Ah, good ;)  Note neither BIT_FIELD_REF nor BIT_INSERT_EXPR are
> > documented in generic.texi and BIT_FIELD_REF is documented in tree.def
> > as operating on structs/unions (well, memory).  And for register args
> > we interpret it as storing the register to memory and interpreting
> > the bit positions in memory bit terms (with the store doing endian
> > fiddling).
> 
> Ah, was going to ask what the semantics were. :-)  That sounds good
> because it's essentially the same as for non-paradoxical subregs.
> We have routines like subreg_lsb_1 and subreg_offset_from_lsb that
> convert byte offsets to shift amounts, so maybe we should move them
> to code that's common to both gimple and rtl.  The BYTES_BIG_ENDIAN !=
> WORDS_BIG_ENDIAN calculation is quite subtle, so I don't think we should
> open-code it everywhere we need it.
> 
> What about the subbyte part of the bit value?  Does that always count
> from the lsb of the containing byte?  E.g. for the four bytes:
> 
>   0x12, 0x34, 0x56, 0x78
> 
> what does bit offset == 3, bit size == 7 mean for big-endian?
> 
> > But for vector (register only?) accesses we interpret
> > it as specifying lane numbers but at least BIT_FIELD_REF verifying
> > doesn't barf on bit/sizes not corresponding to exact vector lanes
> > (and I know we introduce non-matching ones via at least VIEW_CONVERT
> > "merging" into BIT_FIELD_REFs).
> 
> GCC's vector lane numbering is equivalent to array index numbering for
> all endiannesses, so these cases should still be ok for BYTES_BIG_ENDIAN
> == WORDS_BIG_ENDIAN, at least on byte boundaries.  Not sure about
> subbyte boundaries -- guess it depends on the answer to the question
> above.

I was thinking about, say, an SImode extract at offset == 16, size == 32 of
a V4SImode vector.  Is that to be interpreted as some weird shifted vector
lane or as a memory "bit" location after storing the vector to
memory?  The issue I see is that once RTL expansion decides to spill and
interprets the offset/size in non-lane terms, will there ever be a
mismatch between the two interpretations?
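
A minimal sketch of the lane check in question, in standalone C with a
hypothetical name (exact_lane is not a GCC function), following the
bitpos % elsize test used in the patch's VECTOR_CST folding:

```c
#include <assert.h>

/* Hypothetical: return the lane selected by (BITPOS, BITSIZE) in a
   vector of VECSIZE bits with ELSIZE-bit elements, or -1 if the range
   does not cover exactly one lane.  */
static int
exact_lane (unsigned vecsize, unsigned elsize, unsigned bitpos,
            unsigned bitsize)
{
  assert (elsize != 0 && vecsize % elsize == 0);
  if (bitsize != elsize || bitpos % elsize != 0 || bitpos >= vecsize)
    return -1;
  return (int) (bitpos / elsize);
}
```

For V4SImode (128 bits, 32-bit lanes), offset == 16 with size == 32 is
rejected by this check, while offset == 64 selects lane 2.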

> For BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN, any sequence that crosses
> a word boundary can lead to ranges that aren't contiguous in registers,
> unless the range starts and ends on a word boundary.  This would include
> some of those vector cases, but it could also include bitfield references
> to multiword integers.
> 
> Subregs that cross a word boundary must start and end on a word boundary,
> but I guess that would be too draconian for gimple.

Yeah.  

Richard.

* Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT
  2020-01-07 11:38                 ` Richard Biener
@ 2020-01-07 11:52                   ` Richard Sandiford
  0 siblings, 0 replies; 32+ messages in thread
From: Richard Sandiford @ 2020-01-07 11:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andrew Pinski, GCC Patches

Richard Biener <rguenther@suse.de> writes:
> On Tue, 7 Jan 2020, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > On Tue, 7 Jan 2020, Andrew Pinski wrote:
>> >
>> >> On Mon, Jan 6, 2020 at 11:36 PM Richard Biener <rguenther@suse.de> wrote:
>> >> >
>> >> > On Mon, 16 Dec 2019, Andrew Pinski wrote:
>> >> >
>> >> > > On Thu, Nov 15, 2018 at 12:31 AM Richard Biener <rguenther@suse.de> wrote:
>> >> > > >
>> >> > > > On Thu, 15 Nov 2018, Richard Biener wrote:
>> >> > > >
>> >> > > > > So - can you fix it please?  Also note that the VECTOR_CST case
>> >> > > > > (as in BIT_FIELD_REF) seems to be inconsistent here and counts
>> >> > > > > "bits" in a different way?
>> >> > > >
>> >> > > > And bonus points for documenting BIT_FIELD_REF and BIT_INSERT_EXPR
>> >> > > > in generic.texi, together with those "details".
>> >> > >
>> >> > > This is the fix:
>> >> > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
>> >> > > index 8e9e299..a919b63 100644
>> >> > > --- a/gcc/fold-const.c
>> >> > > +++ b/gcc/fold-const.c
>> >> > > @@ -12301,6 +12301,8 @@ fold_ternary_loc (location_t loc, enum
>> >> > > tree_code code, tree type,
>> >> > >         {
>> >> > >           unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
>> >> > >           unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
>> >> > > +         if (BYTES_BIG_ENDIAN)
>> >> > > +           bitpos = TYPE_PRECISION (type) - bitpos - bitsize;
>> >> > >           wide_int tem = (wi::to_wide (arg0)
>> >> > >                           & wi::shifted_mask (bitpos, bitsize, true,
>> >> > >                                               TYPE_PRECISION (type)));
>> >> >
>> >> > I guess you need to guard against BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
>> >> > as well.
>> >> 
>> >> Yes I will add that check.
>> >> 
>> >> >  Also the above only works reliably for mode-precision
>> >> > integers?  We might want to disallow BIT_FIELD_REF/BIT_INSERT_EXPR
>> >> > on non-mode-precision entities in the GIMPLE/GENERIC verifiers.
>> >> 
>> >> You added that check already for gimple in r268332 due to PR88739.
>> >> BIT_FIELD_REF around tree-cfg.c:3083
>> >> BIT_INSERT_EXPR  around tree-cfg.c:4324
>> >
>> > Ah, good ;)  Note neither BIT_FIELD_REF nor BIT_INSERT_EXPR are
>> > documented in generic.texi and BIT_FIELD_REF is documented in tree.def
>> > as operating on structs/unions (well, memory).  And for register args
>> > we interpret it as storing the register to memory and interpreting
>> > the bit positions in memory bit terms (with the store doing endian
>> > fiddling).
>> 
>> Ah, was going to ask what the semantics were. :-)  That sounds good
>> because it's essentially the same as for non-paradoxical subregs.
>> We have routines like subreg_lsb_1 and subreg_offset_from_lsb that
>> convert byte offsets to shift amounts, so maybe we should move them
>> to code that's common to both gimple and rtl.  The BYTES_BIG_ENDIAN !=
>> WORDS_BIG_ENDIAN calculation is quite subtle, so I don't think we should
>> open-code it everywhere we need it.
>> 
>> What about the subbyte part of the bit value?  Does that always count
>> from the lsb of the containing byte?  E.g. for the four bytes:
>> 
>>   0x12, 0x34, 0x56, 0x78
>> 
>> what does bit offset == 3, bit size == 7 mean for big-endian?
>> 
>> > But for vector (register only?) accesses we interpret
>> > it as specifying lane numbers but at least BIT_FIELD_REF verifying
>> > doesn't barf on bit/sizes not corresponding to exact vector lanes
>> > (and I know we introduce non-matching ones via at least VIEW_CONVERT
>> > "merging" into BIT_FIELD_REFs).
>> 
>> GCC's vector lane numbering is equivalent to array index numbering for
>> all endiannesses, so these cases should still be ok for BYTES_BIG_ENDIAN
>> == WORDS_BIG_ENDIAN, at least on byte boundaries.  Not sure about
>> subbyte boundaries -- guess it depends on the answer to the question
>> above.
>
> I was thinking about, say, an SImode extract at offset == 16, size == 32 of
> a V4SImode vector.  Is that to be interpreted as some weird shifted vector
> lane or as a memory "bit" location after storing the vector to
> memory?  The issue I see here is that once RTL expansion decides to
> spill and interprets the offset/size in non-lane terms, will there ever
> be a mismatch between the two interpretations?

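I think they'll be the same for BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN
(or even when the two differ, provided BITS_PER_WORD >= 64).  E.g.,
scaling the example down to four 16-bit lanes with an extract at
bit offset 8, size 16: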

   memory:      0x01 0x23 | 0x45 0x67 | 0x89 0xab | 0xcd 0xef
                     ----   ----

   big-endian: 0x2345, little-endian: 0x4523

                                   0000111122223333 lane
   big-endian register:          0x0123456789abcdef (lsb)
                                     ----

                                   3333222211110000 lane
  little-endian register:        0xefcdab8967452301 (lsb)
                                             ----
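
As a rough cross-check of the diagram above, here is a small standalone
C sketch.  It is not GCC code: the names load_int, extract_via_memory,
extract_via_register, example_bytes and check_example are made up for
illustration.  It assumes BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN,
byte-aligned offsets and sizes, and a value of at most 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* The eight bytes from the diagram, in memory order.  */
static const unsigned char example_bytes[8]
  = { 0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef };

/* Hypothetical helper: read NBYTES bytes at P as one integer,
   using big-endian or little-endian byte order.  */
static uint64_t
load_int (const unsigned char *p, unsigned nbytes, int big_endian)
{
  uint64_t v = 0;
  for (unsigned i = 0; i < nbytes; i++)
    v = (v << 8) | (big_endian ? p[i] : p[nbytes - 1 - i]);
  return v;
}

/* Memory view: BITPOS counts bits from the start of the stored
   value, so a byte-aligned extract is a narrower load.  */
uint64_t
extract_via_memory (const unsigned char *mem, unsigned bitpos,
                    unsigned bitsize, int big_endian)
{
  return load_int (mem + bitpos / 8, bitsize / 8, big_endian);
}

/* Register view: load all TOTAL_BITS, then shift and mask.  For
   big-endian the memory bit position is mirrored, as in the
   fold_ternary_loc fix quoted earlier in this message.  */
uint64_t
extract_via_register (const unsigned char *mem, unsigned total_bits,
                      unsigned bitpos, unsigned bitsize, int big_endian)
{
  uint64_t reg = load_int (mem, total_bits / 8, big_endian);
  unsigned shift = big_endian ? total_bits - bitpos - bitsize : bitpos;
  uint64_t mask = bitsize >= 64 ? ~UINT64_C (0)
                                : (UINT64_C (1) << bitsize) - 1;
  return (reg >> shift) & mask;
}

/* Check that both views agree for offset 8, size 16.  */
void
check_example (void)
{
  for (int be = 0; be <= 1; be++)
    assert (extract_via_memory (example_bytes, 8, 16, be)
            == extract_via_register (example_bytes, 64, 8, 16, be));
}
```

Calling check_example () exercises both endiannesses.  With these
inputs the two views give 0x2345 for big-endian and 0x4523 for
little-endian, matching the values in the diagram.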

Thanks,
Richard

Thread overview: 32+ messages
2016-05-13 10:51 [PATCH][RFC] Introduce BIT_FIELD_INSERT Richard Biener
2016-05-16  0:55 ` Bill Schmidt
2016-05-16 12:37   ` Bill Schmidt
2016-05-17  7:52     ` Richard Biener
2016-05-16  8:24 ` Eric Botcazou
2016-05-17  7:50   ` Richard Biener
2016-05-17  8:13     ` Eric Botcazou
2016-05-17 15:19     ` Michael Matz
2016-05-19 13:23       ` Richard Biener
2016-05-19 15:21         ` Eric Botcazou
2016-05-20  8:59           ` Richard Biener
2016-05-20 11:25             ` Jakub Jelinek
2016-05-20 11:41               ` Richard Biener
2016-05-20 11:52                 ` Jakub Jelinek
2016-05-20 11:53                   ` Richard Biener
2016-05-20 14:11 ` Andi Kleen
2016-05-20 15:12   ` Marc Glisse
2016-05-20 15:54     ` Andi Kleen
2016-05-20 16:08       ` Jakub Jelinek
2016-05-20 19:25         ` Richard Biener
2016-05-20 17:08       ` Marc Glisse
2018-11-15  1:27 ` Andrew Pinski
2018-11-15  8:29   ` Richard Biener
2018-11-15  8:31     ` Richard Biener
2019-12-17  2:41       ` Andrew Pinski
2019-12-17  3:25         ` Andrew Pinski
2020-01-07  7:37         ` Richard Biener
2020-01-07  9:40           ` Andrew Pinski
2020-01-07 10:04             ` Richard Biener
2020-01-07 11:14               ` Richard Sandiford
2020-01-07 11:38                 ` Richard Biener
2020-01-07 11:52                   ` Richard Sandiford
