public inbox for gcc@gcc.gnu.org
* TARGET_SHIFT_TRUNCATION_MASK
From: Uros Bizjak @ 2010-07-15  7:57 UTC
  To: GCC Development

Hello!

I was playing a bit with TARGET_SHIFT_TRUNCATION_MASK on x86 in the
hope that redundant masking would get eliminated from:

int test (int a, int c)
{
	return a << (c & 0x1f);
}

The hook was defined as:

+/* Implement TARGET_SHIFT_TRUNCATION_MASK.  */
+static unsigned HOST_WIDE_INT
+ix86_shift_truncation_mask (enum machine_mode mode)
+{
+  switch (mode)
+    {
+    case QImode:
+    case HImode:
+    case SImode:
+      return 31;
+
+    case DImode:
+      if (TARGET_64BIT)
+	return 63;
+
+    default:
+      return 0;
+    }
+}

However, I was not able to get rid of the masking "and".
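
For concreteness, the code in question comes out roughly as in the
sketch below (illustrative x86-64 output, not verbatim GCC assembly;
test_expected is a made-up name):

/* Illustrative only, not verbatim GCC output: the generated code is
   roughly

	movl	%esi, %ecx
	andl	$31, %ecx	<- redundant; sall masks the count anyway
	movl	%edi, %eax
	sall	%cl, %eax

   so with the hook honored one would hope for the same code minus the
   "andl", i.e. the machine behavior of:  */
int test_expected (int a, int c)
{
  return a << c;	/* same hardware behavior as a << (c & 0x1f);
			   out-of-range c is undefined in C itself */
}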

Uros.

* Re: TARGET_SHIFT_TRUNCATION_MASK
From: Paolo Bonzini @ 2010-07-15 12:17 UTC
  To: GCC Mailing List, Uros Bizjak

On 07/15/2010 09:57 AM, Uros Bizjak wrote:
> I was playing a bit with TARGET_SHIFT_TRUNCATION_MASK on x86 in the
> hope that redundant masking would get eliminated from:
>
> [...]
>
> However, I was not able to get rid of the masking "and".

FWIW, the reason the hook is not implemented on x86 is that some 
variants of bsf/bsr (I think with memory source) do not honor the 
truncation.

Paolo

* Re: TARGET_SHIFT_TRUNCATION_MASK
From: Uros Bizjak @ 2010-07-15 12:26 UTC
  To: Paolo Bonzini; +Cc: GCC Mailing List

On Thu, Jul 15, 2010 at 2:16 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> [...]
>
> FWIW, the reason the hook is not implemented on x86 is that some variants of
> bsf/bsr (I think with memory source) do not honor the truncation.

The reason you pointed out is the one for SHIFT_COUNT_TRUNCATED. Please
note that we don't use memory_operands, but even in the register-operand
case the "bt" insn doesn't truncate the bit-count operand; it performs a
modulo operation on it. For example, "bt %reg, 75" will not return 0,
but a shift insn with the same operands will.
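
That modulo behavior is easy to demonstrate; a minimal standalone
sketch (x86-64 GNU C, with a made-up bt_reg helper):

#include <stdio.h>

/* Set CF to bit IDX of WORD via BT and read it back with SETC.
   With a register destination, BT reduces IDX modulo 64.  */
static int bt_reg (unsigned long word, unsigned long idx)
{
  unsigned char cf;
  __asm__ ("bt %2, %1\n\tsetc %0"
	   : "=r" (cf)
	   : "r" (word), "r" (idx)
	   : "cc");
  return cf;
}

int main (void)
{
  unsigned long x = 1UL << 11;
  printf ("%d\n", bt_reg (x, 75));   /* prints 1: 75 mod 64 == 11 */
  return 0;
}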

Uros.

* Re: TARGET_SHIFT_TRUNCATION_MASK
From: Paolo Bonzini @ 2010-07-16  7:38 UTC
  To: Uros Bizjak; +Cc: GCC Mailing List

On 07/15/2010 02:26 PM, Uros Bizjak wrote:
> The reason you pointed out is the one for SHIFT_COUNT_TRUNCATED. Please
> note that we don't use memory_operands, but even in the register-operand
> case the "bt" insn doesn't truncate the bit-count operand; it performs a
> modulo operation on it. For example, "bt %reg, 75" will not return 0,
> but a shift insn with the same operands will.

Yes, only for memory_operands.

You can take a look at the attached patch.  I never got round to
finishing it, but I think it bootstrapped.  It unifies
SHIFT_COUNT_TRUNCATED and TARGET_SHIFT_TRUNCATION_MASK; I think it can
be useful for x86.
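
For instance, under the patch's convention -- ~(unsigned HOST_WIDE_INT) 0
rather than 0 now meaning "no truncation guaranteed" -- the i386 hook
from the start of this thread would presumably look like the following
sketch (not part of the attached patch):

/* Hypothetical ix86 hook under the patch's convention; illustrative.  */
static unsigned HOST_WIDE_INT
ix86_shift_truncation_mask (enum machine_mode mode)
{
  switch (mode)
    {
    case QImode:
    case HImode:
    case SImode:
      return 31;

    case DImode:
      if (TARGET_64BIT)
	return 63;
      /* Fall through: 32-bit DImode shifts are expanded piecewise
	 and make no truncation guarantee as a unit.  */

    default:
      return ~(unsigned HOST_WIDE_INT) 0;
    }
}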

Paolo

[-- Attachment #2: for-uros.patch --]

patch 1/4:
2009-03-13  Paolo Bonzini  <bonzini@gnu.org>

	* combine.c (expand_compound_operation): Fix thinko.
	(simplify_shift_const_1): Avoid noncanonical rtx.

Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(branch rebase-shift-count-trunc)
+++ gcc/combine.c	(working copy)
@@ -6191,7 +6191,7 @@ expand_compound_operation (rtx x)
      a such a position.  */
 
   modewidth = GET_MODE_BITSIZE (GET_MODE (x));
-  if (modewidth + len >= pos)
+  if (modewidth >= pos + len)
     {
       enum machine_mode mode = GET_MODE (x);
       tem = gen_lowpart (mode, XEXP (x, 0));
@@ -9613,10 +9613,11 @@ simplify_shift_const_1 (enum rtx_code co
 	      rtx varop_inner = XEXP (varop, 0);
 
 	      varop_inner
-		= gen_rtx_LSHIFTRT (GET_MODE (varop_inner),
-				    XEXP (varop_inner, 0),
-				    GEN_INT
-				    (count + INTVAL (XEXP (varop_inner, 1))));
+		= simplify_gen_binary (LSHIFTRT,
+				       GET_MODE (varop_inner),
+				       XEXP (varop_inner, 0),
+				       GEN_INT
+				       (count + INTVAL (XEXP (varop_inner, 1))));
 	      varop = gen_rtx_TRUNCATE (GET_MODE (varop), varop_inner);
 	      count = 0;
 	      continue;
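
(Restating the first hunk outside GCC, as a sketch: a LEN-bit field at
position POS fits in a MODEWIDTH-bit mode only when POS + LEN stays
within the width; the old test was almost always true and accepted
fields extending past the top of the mode.)

/* Standalone restatement of the corrected test; not GCC code.  */
static int
field_fits_p (int modewidth, int pos, int len)
{
  /* Old, buggy test: modewidth + len >= pos.  */
  return modewidth >= pos + len;
}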

patch 2/4:
2009-03-13  Paolo Bonzini  <bonzini@gnu.org>

	* gcc/config/bfin/bfin.h (SHIFT_COUNT_TRUNCATED): Define to 0.
	* gcc/config/cris/cris.h (SHIFT_COUNT_TRUNCATED): Define to 0.
	* gcc/config/h8300/h8300.h (SHIFT_COUNT_TRUNCATED): Define to 0.
	* gcc/config/mcore/mcore.h (SHIFT_COUNT_TRUNCATED): Define to 0.
	* gcc/config/mmix/mmix.h (SHIFT_COUNT_TRUNCATED): Define to 0.
	* gcc/config/pdp11/pdp11.h (SHIFT_COUNT_TRUNCATED): Define to 0.
	* gcc/config/vax/vax.h (SHIFT_COUNT_TRUNCATED): Define to 0.

Index: gcc/config/pdp11/pdp11.h
===================================================================
--- gcc/config/pdp11/pdp11.h	(branch rebase-shift-count-trunc)
+++ gcc/config/pdp11/pdp11.h	(working copy)
@@ -780,6 +780,9 @@ extern int may_call_alloca;
 
 #define MOVE_MAX 2
 
+/* Check later if we can set SHIFT_COUNT_TRUNCATED to 1.  */
+#define SHIFT_COUNT_TRUNCATED 0
+
 /* Nonzero if access to memory by byte is slow and undesirable. -
 */
 #define SLOW_BYTE_ACCESS 0
Index: gcc/config/cris/cris.h
===================================================================
--- gcc/config/cris/cris.h	(branch rebase-shift-count-trunc)
+++ gcc/config/cris/cris.h	(working copy)
@@ -1521,6 +1521,7 @@ enum cris_pic_symbol_type
 #define MOVE_MAX 4
 
 /* Maybe SHIFT_COUNT_TRUNCATED is safe to define?  FIXME: Check later.  */
+#define SHIFT_COUNT_TRUNCATED 0
 
 #define TRULY_NOOP_TRUNCATION(OUTPREC, INPREC) 1
 
Index: gcc/config/mcore/mcore.h
===================================================================
--- gcc/config/mcore/mcore.h	(branch rebase-shift-count-trunc)
+++ gcc/config/mcore/mcore.h	(working copy)
@@ -820,7 +820,7 @@ extern const enum reg_class reg_class_fr
 
 /* Shift counts are truncated to 6-bits (0 to 63) instead of the expected
    5-bits, so we can not define SHIFT_COUNT_TRUNCATED to true for this
-   target.  */
+   target.  TODO: we could define TARGET_SHIFT_TRUNCATION_MASK. */
 #define SHIFT_COUNT_TRUNCATED 0
 
 /* All integers have the same format so truncation is easy.  */
Index: gcc/config/vax/vax.h
===================================================================
--- gcc/config/vax/vax.h	(branch rebase-shift-count-trunc)
+++ gcc/config/vax/vax.h	(working copy)
@@ -627,7 +627,7 @@ enum reg_class { NO_REGS, ALL_REGS, LIM_
 /* Define if shifts truncate the shift count
    which implies one can omit a sign-extension or zero-extension
    of a shift count.  */
-/* #define SHIFT_COUNT_TRUNCATED */
+#define SHIFT_COUNT_TRUNCATED 0
 
 /* Value is 1 if truncating an integer of INPREC bits to OUTPREC bits
    is done just by pretending it is already truncated.  */
Index: gcc/config/h8300/h8300.h
===================================================================
--- gcc/config/h8300/h8300.h	(branch rebase-shift-count-trunc)
+++ gcc/config/h8300/h8300.h	(working copy)
@@ -981,7 +981,7 @@ struct cum_arg
 /* Define if shifts truncate the shift count
    which implies one can omit a sign-extension or zero-extension
    of a shift count.  */
-/* #define SHIFT_COUNT_TRUNCATED */
+#define SHIFT_COUNT_TRUNCATED 0
 
 /* Value is 1 if truncating an integer of INPREC bits to OUTPREC bits
    is done just by pretending it is already truncated.  */
Index: gcc/config/mmix/mmix.h
===================================================================
--- gcc/config/mmix/mmix.h	(branch rebase-shift-count-trunc)
+++ gcc/config/mmix/mmix.h	(working copy)
@@ -973,6 +973,8 @@ typedef struct { int regs; int lib; } CU
 
 #define TRULY_NOOP_TRUNCATION(OUTPREC, INPREC) 1
 
+#define SHIFT_COUNT_TRUNCATED 0
+
 /* ??? MMIX allows a choice of STORE_FLAG_VALUE.  Revisit later,
    we don't have scc expanders yet.  */
 
Index: gcc/config/bfin/bfin.h
===================================================================
--- gcc/config/bfin/bfin.h	(branch rebase-shift-count-trunc)
+++ gcc/config/bfin/bfin.h	(working copy)
@@ -1322,6 +1322,11 @@ do { 						\
     fprintf (FILE, "\tCALL __mcount;\n");	\
   } while(0)
 
+/* The documentation specifies that e.g. for "Dx >>= Dy" instructions,
+   shift counts greater than 31 produce a result of zero.  */
+#undef SHIFT_COUNT_TRUNCATED
+#define SHIFT_COUNT_TRUNCATED 0
+
 #undef NO_PROFILE_COUNTERS
 #define NO_PROFILE_COUNTERS 1
 

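(Following the TODO in the mcore hunk above, a port like that could
later expose its 6-bit count truncation through the hook instead -- a
hypothetical sketch, not part of this patch:)

/* Hypothetical hook for an mcore-like target whose SImode shifts use a
   6-bit count: SHIFT_COUNT_TRUNCATED must stay 0, but the 63 mask is
   still useful information for the middle end.  */
static unsigned HOST_WIDE_INT
mcore_shift_truncation_mask (enum machine_mode mode)
{
  return mode == SImode ? 63 : ~(unsigned HOST_WIDE_INT) 0;
}
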
patch 3/4:
2009-03-13  Paolo Bonzini  <bonzini@gnu.org>

	* simplify-rtx.c (canonical_shift_count): New.
	(simplify_binary_operation_1): Add (ashiftrt (lshiftrt A B) C)
	simplification for nonzero constant B, other (<shift> (<shift> A B) C)
	simplifications for constant B and C, and canonicalization of
	out-of-range shift counts.
	(simplify_const_binary_operation): Canonicalize out-of-range
	shift counts.
	* fold-const.c (lshift_double): Remove truncation of shift count.
	(rshift_double): Likewise, and support negative counts for symmetry.
	(int_const_binop): Truncate shift counts here.

	* expmed.c (expand_shift): Use TARGET_SHIFT_TRUNCATION_MASK
	instead of SHIFT_COUNT_TRUNCATED.  Avoid creating out-of-range
	shifts.
	* optabs.c (expand_binop): Simplify for ~0 being used now to
	indicate no truncation.

	* cse.c (fold_rtx): Use simplify_binary_operation to reassociate
	operations (including shifts).
	* combine.c (combine_simplify_rtx): Use TARGET_SHIFT_TRUNCATION_MASK
        instead of SHIFT_COUNT_TRUNCATED.
	(simplify_shift_const_1): Assert orig_count is in range.
	(simplify_comparison): Use TARGET_EXTRACT_TRUNCATION_MASK instead
	of SHIFT_COUNT_TRUNCATED.

	* target.h (struct gcc_target): Add extract_truncation_mask.
	* target-def.h (TARGET_EXTRACT_TRUNCATION_MASK): Define.
	* targhooks.c (default_shift_truncation_mask): Use ~0 to indicate
	no truncation.

	* config/arm/arm.h (SHIFT_COUNT_TRUNCATED): Remove misleading comment,
	define to 1 for bitfield extraction routines.
	* config/arm/arm.c (arm_shift_truncation_mask): Adjust.

	* doc/tm.texi (SHIFT_COUNT_TRUNCATED): Describe in terms of
	TARGET_SHIFT_TRUNCATION_MASK and TARGET_EXTRACT_TRUNCATION_MASK.
	(TARGET_SHIFT_TRUNCATION_MASK): Simplify.
	(TARGET_EXTRACT_TRUNCATION_MASK): New.

Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(branch rebase-shift-count-trunc)
+++ gcc/simplify-rtx.c	(working copy)
@@ -63,6 +63,35 @@ static rtx simplify_unary_operation_1 (e
 static rtx simplify_binary_operation_1 (enum rtx_code, enum machine_mode,
 					rtx, rtx, rtx, rtx);
 \f
+/* Truncate the shift count in ARG1 according to the given CODE and MODE.
+   Return a value between 0 and GET_MODE_BITSIZE (mode)
+   inclusive (GET_MODE_BITSIZE (mode) is returned only for ASHIFT and
+   LSHIFTRT and means the operation should be folded to zero).  */
+
+static HOST_WIDE_INT
+canonical_shift_count (enum rtx_code code, enum machine_mode mode,
+		       HOST_WIDE_INT arg1)
+{
+  if (arg1 >= GET_MODE_BITSIZE (mode))
+    switch (code)
+      {
+      case ASHIFT:
+      case LSHIFTRT:
+	return GET_MODE_BITSIZE (mode);
+      case SS_ASHIFT:
+      case US_ASHIFT:
+      case ASHIFTRT:
+	return GET_MODE_BITSIZE (mode) - 1;
+      case ROTATE:
+      case ROTATERT:
+	return arg1 % GET_MODE_BITSIZE (mode);
+      default:
+	gcc_unreachable ();
+      }
+
+  return arg1;
+}
+
 /* Negate a CONST_INT rtx, truncating (because a conversion from a
    maximally negative number can overflow).  */
 static rtx
@@ -2581,41 +2610,70 @@ simplify_binary_operation_1 (enum rtx_co
 	}
       break;
 
+    case ASHIFTRT:
+      /* Possibly transform (ashiftrt (lshiftrt A B) C) into a
+	 LSHIFTRT, but only if B is not zero.  */
+      if (code == ASHIFTRT
+	  && GET_CODE (op0) == LSHIFTRT
+	  && GET_CODE (XEXP (op0, 1)) == CONST_INT
+	  && INTVAL (XEXP (op0, 1)))
+	{
+          code = LSHIFTRT;
+	  goto canonicalize_shift;
+	}
+
+      /* otherwise fall through... */
+
     case ROTATERT:
     case ROTATE:
-    case ASHIFTRT:
-      if (trueop1 == CONST0_RTX (mode))
-	return op0;
-      if (trueop0 == CONST0_RTX (mode) && ! side_effects_p (op1))
-	return op0;
-      /* Rotating ~0 always results in ~0.  */
+    case US_ASHIFT:
+      /* Applied to ~0, these codes always give ~0.  */
       if (GET_CODE (trueop0) == CONST_INT && width <= HOST_BITS_PER_WIDE_INT
 	  && (unsigned HOST_WIDE_INT) INTVAL (trueop0) == GET_MODE_MASK (mode)
 	  && ! side_effects_p (op1))
 	return op0;
-    canonicalize_shift:
-      if (SHIFT_COUNT_TRUNCATED && GET_CODE (op1) == CONST_INT)
-	{
-	  val = INTVAL (op1) & (GET_MODE_BITSIZE (mode) - 1);
-	  if (val != INTVAL (op1))
-	    return simplify_gen_binary (code, mode, op0, GEN_INT (val));
-	}
-      break;
 
     case ASHIFT:
     case SS_ASHIFT:
-    case US_ASHIFT:
+    canonicalize_shift:
       if (trueop1 == CONST0_RTX (mode))
 	return op0;
       if (trueop0 == CONST0_RTX (mode) && ! side_effects_p (op1))
 	return op0;
-      goto canonicalize_shift;
+      if (GET_CODE (trueop1) == CONST_INT)
+	{
+	  unsigned HOST_WIDE_INT cnt;
+	  cnt = INTVAL (trueop1) & targetm.shift_truncation_mask (mode);
+          if (GET_CODE (op0) == code
+	      && GET_CODE (XEXP (op0, 1)) == CONST_INT)
+	    {
+	      /* Combine two identical shifts.  Truncate them separately,
+		 then canonicalize the sum.  */
+	      cnt += INTVAL (XEXP (op0, 1)) & targetm.shift_truncation_mask (mode);
+	      op0 = XEXP (op0, 0);
+	    }
+	  else
+	    {
+	      /* Otherwise, only proceed if there is nothing noncanonical.  */
+	      if (INTVAL (trueop1) < GET_MODE_BITSIZE (mode))
+		break;
+	    }
+
+          cnt = canonical_shift_count (code, mode, cnt);
+	  if (cnt < GET_MODE_BITSIZE (mode))
+	    return simplify_gen_binary (code, mode, op0, GEN_INT (cnt));
+	  else
+	    {
+	      /* Use an AND instead of a shift to keep shift counts canonical.  */
+	      if (side_effects_p (op0))
+		return simplify_gen_binary (AND, mode, op0, const0_rtx);
+	      else
+		return CONST0_RTX (mode);
+	    }
+	}
+      break;
 
     case LSHIFTRT:
-      if (trueop1 == CONST0_RTX (mode))
-	return op0;
-      if (trueop0 == CONST0_RTX (mode) && ! side_effects_p (op1))
-	return op0;
       /* Optimize (lshiftrt (clz X) C) as (eq X 0).  */
       if (GET_CODE (op0) == CLZ
 	  && GET_CODE (trueop1) == CONST_INT
@@ -3221,11 +3279,20 @@ simplify_const_binary_operation (enum rt
 	case LSHIFTRT:   case ASHIFTRT:
 	case ASHIFT:
 	case ROTATE:     case ROTATERT:
-	  if (SHIFT_COUNT_TRUNCATED)
-	    l2 &= (GET_MODE_BITSIZE (mode) - 1), h2 = 0;
+	  {
+	    unsigned HOST_WIDE_INT mask = targetm.shift_truncation_mask (mode);
+	    if (~mask)
+	      {
+	        h2 = 0;
+	        l2 &= mask;
+	      }
 
-	  if (h2 != 0 || l2 >= GET_MODE_BITSIZE (mode))
-	    return 0;
+	    /* Ignore h2 for rotates; for other codes, any overflowing count is
+	       the same.  */
+	    if (h2 && code != ROTATE && code != ROTATERT)
+	      l2 = GET_MODE_BITSIZE (mode);
+	    l2 = canonical_shift_count (code, mode, l2);
+	  }
 
 	  if (code == LSHIFTRT || code == ASHIFTRT)
 	    rshift_double (l1, h1, l2, GET_MODE_BITSIZE (mode), &lv, &hv,
@@ -3336,21 +3403,16 @@ simplify_const_binary_operation (enum rt
 	case LSHIFTRT:
 	case ASHIFT:
 	case ASHIFTRT:
-	  /* Truncate the shift if SHIFT_COUNT_TRUNCATED, otherwise make sure
-	     the value is in range.  We can't return any old value for
-	     out-of-range arguments because either the middle-end (via
-	     shift_truncation_mask) or the back-end might be relying on
-	     target-specific knowledge.  Nor can we rely on
-	     shift_truncation_mask, since the shift might not be part of an
-	     ashlM3, lshrM3 or ashrM3 instruction.  */
-	  if (SHIFT_COUNT_TRUNCATED)
-	    arg1 = (unsigned HOST_WIDE_INT) arg1 % width;
-	  else if (arg1 < 0 || arg1 >= GET_MODE_BITSIZE (mode))
-	    return 0;
-	  
-	  val = (code == ASHIFT
-		 ? ((unsigned HOST_WIDE_INT) arg0) << arg1
-		 : ((unsigned HOST_WIDE_INT) arg0) >> arg1);
+	  arg1 = canonical_shift_count (code, mode,
+					(unsigned HOST_WIDE_INT)
+				          (arg1 & targetm.shift_truncation_mask (mode)));
+
+	  if (arg1 >= GET_MODE_BITSIZE (mode))
+	    val = 0;
+	  else 
+	    val = (code == ASHIFT
+		   ? ((unsigned HOST_WIDE_INT) arg0) << arg1
+		   : ((unsigned HOST_WIDE_INT) arg0) >> arg1);
 	  
 	  /* Sign-extend the result for arithmetic right shifts.  */
 	  if (code == ASHIFTRT && arg0s < 0 && arg1 > 0)
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(branch rebase-shift-count-trunc)
+++ gcc/doc/tm.texi	(working copy)
@@ -9745,30 +9745,29 @@ constant value that is the largest value
 at run-time.
 @end defmac
 
-@defmac SHIFT_COUNT_TRUNCATED
-A C expression that is nonzero if on this machine the number of bits
-actually used for the count of a shift operation is equal to the number
-of bits needed to represent the size of the object being shifted.  When
-this macro is nonzero, the compiler will assume that it is safe to omit
-a sign-extend, zero-extend, and certain bitwise `and' instructions that
-truncates the count of a shift operation.  On machines that have
-instructions that act on bit-fields at variable positions, which may
-include `bit test' instructions, a nonzero @code{SHIFT_COUNT_TRUNCATED}
-also enables deletion of truncations of the values that serve as
-arguments to bit-field instructions.
-
-If both types of instructions truncate the count (for shifts) and
-position (for bit-field operations), or if no variable-position bit-field
-instructions exist, you should define this macro.
-
-However, on some machines, such as the 80386 and the 680x0, truncation
-only applies to shift operations and not the (real or pretended)
-bit-field operations.  Define @code{SHIFT_COUNT_TRUNCATED} to be zero on
-such machines.  Instead, add patterns to the @file{md} file that include
-the implied truncation of the shift instructions.
-
-You need not define this macro if it would always have the value of zero.
-@end defmac
+@anchor{TARGET_EXTRACT_TRUNCATION_MASK}
+@deftypefn {Target Hook} int TARGET_EXTRACT_TRUNCATION_MASK (enum machine_mode @var{mode})
+This function describes how the standard extraction patterns for
+@var{mode} deal with extractions by negative amounts or by more than
+the width of the mode.
+
+On many machines, the extraction patterns will apply a mask @var{m} to
+the extraction count, meaning that a fixed-width extraction of @var{x}
+by @var{y} is equivalent to an arbitrary-width extraction of @var{x} by @var{y
+& m}.  If this is true for mode @var{mode}, the function should return
+@var{m}, otherwise it should return @code{~(unsigned HOST_WIDE_INT)0}.
+If no particular behavior is guaranteed, it is possible to return
+@code{~(unsigned HOST_WIDE_INT)0}.
+
+Returning a larger value than the one actually used by the machine
+is always safe.
+
+This function may be left out by a port if it defines
+@code{SHIFT_COUNT_TRUNCATED}.  In this case, the default implementation
+of this function returns @code{GET_MODE_BITSIZE (@var{mode}) - 1}
+if @code{SHIFT_COUNT_TRUNCATED} is nonzero and @code{~(unsigned
+HOST_WIDE_INT)0} otherwise.
+@end deftypefn
 
 @anchor{TARGET_SHIFT_TRUNCATION_MASK}
 @deftypefn {Target Hook} int TARGET_SHIFT_TRUNCATION_MASK (enum machine_mode @var{mode})
@@ -9780,21 +9779,37 @@ On many machines, the shift patterns wil
 shift count, meaning that a fixed-width shift of @var{x} by @var{y} is
 equivalent to an arbitrary-width shift of @var{x} by @var{y & m}.  If
 this is true for mode @var{mode}, the function should return @var{m},
-otherwise it should return 0.  A return value of 0 indicates that no
-particular behavior is guaranteed.
+otherwise it should return @code{~(unsigned HOST_WIDE_INT)0}.
 
-Note that, unlike @code{SHIFT_COUNT_TRUNCATED}, this function does
-@emph{not} apply to general shift rtxes; it applies only to instructions
-that are generated by the named shift patterns.
-
-The default implementation of this function returns
-@code{GET_MODE_BITSIZE (@var{mode}) - 1} if @code{SHIFT_COUNT_TRUNCATED}
-and 0 otherwise.  This definition is always safe, but if
-@code{SHIFT_COUNT_TRUNCATED} is false, and some shift patterns
-nevertheless truncate the shift count, you may get better code
-by overriding it.
+Returning a smaller mask than the one actually used by the machine will
+cause wrong optimizations.  Returning a larger mask than the one actually
+used by the machine is always safe, but may cause undefined programs
+to have different results depending on the compiler's optimizations.
+
+This function may be left out by a port if it defines
+@code{SHIFT_COUNT_TRUNCATED}.  In this case, the default implementation
+of this function returns @code{GET_MODE_BITSIZE (@var{mode}) - 1}
+if @code{SHIFT_COUNT_TRUNCATED} is nonzero and @code{~(unsigned
+HOST_WIDE_INT)0} otherwise.
 @end deftypefn
 
+@defmac SHIFT_COUNT_TRUNCATED
+This macro is just a shortcut for @code{TARGET_SHIFT_TRUNCATION_MASK}
+and @code{TARGET_EXTRACT_TRUNCATION_MASK}.  It can be left undefined if
+your target defines both of those target hooks.  If both types
+of instructions truncate the count (for shifts) and position (for
+bit-field operations), or if no variable-position bit-field instructions
+exist, defining this macro is simpler than defining both target hooks.
+A nonzero value will tell the default implementation of the hooks
+to truncate the count or position.
+
+On some machines such as the 80386 and the 680x0, however,
+truncation only applies to shift operations and not to the bit-field
+operations.  On these machines, you can define this macro and
+make sure that you define @code{TARGET_SHIFT_TRUNCATION_MASK} or
+@code{TARGET_EXTRACT_TRUNCATION_MASK} appropriately.
+@end defmac
+
 @defmac TRULY_NOOP_TRUNCATION (@var{outprec}, @var{inprec})
 A C expression which is nonzero if on this machine it is safe to
 ``convert'' an integer of @var{inprec} bits to one of @var{outprec}
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	(branch rebase-shift-count-trunc)
+++ gcc/targhooks.c	(working copy)
@@ -178,7 +178,9 @@ default_unwind_word_mode (void)
 unsigned HOST_WIDE_INT
 default_shift_truncation_mask (enum machine_mode mode)
 {
-  return SHIFT_COUNT_TRUNCATED ? GET_MODE_BITSIZE (mode) - 1 : 0;
+  return (SHIFT_COUNT_TRUNCATED
+	  ? (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (mode) - 1
+	  : ~(unsigned HOST_WIDE_INT) 0);
 }
 
 /* The default implementation of TARGET_MIN_DIVISIONS_FOR_RECIP_MUL.  */
Index: gcc/target.h
===================================================================
--- gcc/target.h	(branch rebase-shift-count-trunc)
+++ gcc/target.h	(working copy)
@@ -642,6 +642,10 @@ struct gcc_target
   /* Undo the effects of encode_section_info on the symbol string.  */
   const char * (* strip_name_encoding) (const char *);
 
+  /* If extract optabs for MODE are known to always truncate the bit position,
+     return the mask that they apply.  Return ~0 otherwise.  */
+  unsigned HOST_WIDE_INT (* extract_truncation_mask) (enum machine_mode mode);
+
   /* If shift optabs for MODE are known to always truncate the shift count,
      return the mask that they apply.  Return 0 otherwise.  */
   unsigned HOST_WIDE_INT (* shift_truncation_mask) (enum machine_mode mode);
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(branch rebase-shift-count-trunc)
+++ gcc/fold-const.c	(working copy)
@@ -444,9 +444,6 @@ lshift_double (unsigned HOST_WIDE_INT l1
       return;
     }
 
-  if (SHIFT_COUNT_TRUNCATED)
-    count %= prec;
-
   if (count >= 2 * HOST_BITS_PER_WIDE_INT)
     {
       /* Shifting by the host word size is undefined according to the
@@ -501,13 +498,16 @@ rshift_double (unsigned HOST_WIDE_INT l1
 {
   unsigned HOST_WIDE_INT signmask;
 
+  if (count < 0)
+    {
+      lshift_double (l1, h1, -count, prec, lv, hv, arith);
+      return;
+    }
+
   signmask = (arith
 	      ? -((unsigned HOST_WIDE_INT) h1 >> (HOST_BITS_PER_WIDE_INT - 1))
 	      : 0);
 
-  if (SHIFT_COUNT_TRUNCATED)
-    count %= prec;
-
   if (count >= 2 * HOST_BITS_PER_WIDE_INT)
     {
       /* Shifting by the host word size is undefined according to the
@@ -1667,11 +1667,16 @@ int_const_binop (enum tree_code code, co
       break;
 
     case RSHIFT_EXPR:
-      int2l = -int2l;
+      int2l &= targetm.shift_truncation_mask (TYPE_MODE (type));
+      rshift_double (int1l, int1h, int2l, TYPE_PRECISION (type),
+		     &low, &hi, !uns);
+      break;
+
     case LSHIFT_EXPR:
       /* It's unclear from the C standard whether shifts can overflow.
 	 The following code ignores overflow; perhaps a C standard
 	 interpretation ruling is needed.  */
+      int2l &= targetm.shift_truncation_mask (TYPE_MODE (type));
       lshift_double (int1l, int1h, int2l, TYPE_PRECISION (type),
 		     &low, &hi, !uns);
       break;
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	(branch rebase-shift-count-trunc)
+++ gcc/cse.c	(working copy)
@@ -3457,27 +3457,10 @@ fold_rtx (rtx x, rtx insn)
 	     intermediate operation if every use is simplified in this way.
 	     Note that the similar optimization done by combine.c only works
 	     if the intermediate operation's result has only one reference.  */
-
 	  if (REG_P (folded_arg0)
 	      && const_arg1 && GET_CODE (const_arg1) == CONST_INT)
 	    {
-	      int is_shift
-		= (code == ASHIFT || code == ASHIFTRT || code == LSHIFTRT);
-	      rtx y, inner_const, new_const;
-	      rtx canon_const_arg1 = const_arg1;
-	      enum rtx_code associate_code;
-
-	      if (is_shift
-		  && (INTVAL (const_arg1) >= GET_MODE_BITSIZE (mode)
-		      || INTVAL (const_arg1) < 0))
-		{
-		  if (SHIFT_COUNT_TRUNCATED)
-		    canon_const_arg1 = GEN_INT (INTVAL (const_arg1)
-						& (GET_MODE_BITSIZE (mode)
-						   - 1));
-		  else
-		    break;
-		}
+	      rtx y, inner_const;
 
 	      y = lookup_as_function (folded_arg0, code);
 	      if (y == 0)
@@ -3513,71 +3496,32 @@ fold_rtx (rtx x, rtx insn)
 
 	      /* ??? Vector mode shifts by scalar
 		 shift operand are not supported yet.  */
-	      if (is_shift && VECTOR_MODE_P (mode))
+	      if ((code == ASHIFT || code == ASHIFTRT || code == LSHIFTRT)
+		  && VECTOR_MODE_P (mode))
                 break;
 
-	      if (is_shift
-		  && (INTVAL (inner_const) >= GET_MODE_BITSIZE (mode)
-		      || INTVAL (inner_const) < 0))
-		{
-		  if (SHIFT_COUNT_TRUNCATED)
-		    inner_const = GEN_INT (INTVAL (inner_const)
-					   & (GET_MODE_BITSIZE (mode) - 1));
-		  else
-		    break;
-		}
-
-	      /* Compute the code used to compose the constants.  For example,
-		 A-C1-C2 is A-(C1 + C2), so if CODE == MINUS, we want PLUS.  */
-
-	      associate_code = (is_shift || code == MINUS ? PLUS : code);
-
-	      new_const = simplify_binary_operation (associate_code, mode,
-						     canon_const_arg1,
-						     inner_const);
-
-	      if (new_const == 0)
-		break;
-
-	      /* If we are associating shift operations, don't let this
-		 produce a shift of the size of the object or larger.
-		 This could occur when we follow a sign-extend by a right
-		 shift on a machine that does a sign-extend as a pair
-		 of shifts.  */
-
-	      if (is_shift
-		  && GET_CODE (new_const) == CONST_INT
-		  && INTVAL (new_const) >= GET_MODE_BITSIZE (mode))
-		{
-		  /* As an exception, we can turn an ASHIFTRT of this
-		     form into a shift of the number of bits - 1.  */
-		  if (code == ASHIFTRT)
-		    new_const = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
-		  else if (!side_effects_p (XEXP (y, 0)))
-		    return CONST0_RTX (mode);
-		  else
-		    break;
-		}
-
-	      y = copy_rtx (XEXP (y, 0));
-
 	      /* If Y contains our first operand (the most common way this
 		 can happen is if Y is a MEM), we would do into an infinite
 		 loop if we tried to fold it.  So don't in that case.  */
 
 	      if (! reg_mentioned_p (folded_arg0, y))
-		y = fold_rtx (y, insn);
+		y = simplify_gen_binary (code, mode,
+					 fold_rtx (copy_rtx (XEXP (y, 0)), insn),
+					 inner_const);
+	      else
+		y = simplify_gen_binary (code, mode, copy_rtx (XEXP (y, 0)),
+					 inner_const);
 
-	      return simplify_gen_binary (code, mode, y, new_const);
+	      return simplify_gen_binary (code, mode, y, const_arg1);
 	    }
 	  break;
 
 	case DIV:       case UDIV:
 	  /* ??? The associative optimization performed immediately above is
-	     also possible for DIV and UDIV using associate_code of MULT.
-	     However, we would need extra code to verify that the
-	     multiplication does not overflow, that is, there is no overflow
-	     in the calculation of new_const.  */
+	     also possible for DIV and UDIV too.  However, we would need extra
+	     code to verify that the multiplication does not overflow, that
+	     is, there is no overflow in the calculation of new_const.  I
+	     am not sure this check is done by simplify-rtx.c.  */
 	  break;
 
 	default:
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(branch rebase-shift-count-trunc)
+++ gcc/expmed.c	(working copy)
@@ -2107,6 +2107,7 @@ expand_shift (enum tree_code code, enum 
   optab lrotate_optab = rotl_optab;
   optab rrotate_optab = rotr_optab;
   enum machine_mode op1_mode;
+  unsigned HOST_WIDE_INT mask;
   int attempt;
   bool speed = optimize_insn_for_speed_p ();
 
@@ -2128,18 +2129,32 @@ expand_shift (enum tree_code code, enum 
      and shifted in the other direction; but that does not work
      on all machines.  */
 
-  if (SHIFT_COUNT_TRUNCATED)
-    {
-      if (GET_CODE (op1) == CONST_INT
-	  && ((unsigned HOST_WIDE_INT) INTVAL (op1) >=
-	      (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (mode)))
-	op1 = GEN_INT ((unsigned HOST_WIDE_INT) INTVAL (op1)
-		       % GET_MODE_BITSIZE (mode));
-      else if (GET_CODE (op1) == SUBREG
-	       && subreg_lowpart_p (op1)
-	       && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1))))
-	op1 = SUBREG_REG (op1);
+  mask = targetm.shift_truncation_mask (mode);
+  if (GET_CODE (op1) == CONST_INT
+      && ((unsigned HOST_WIDE_INT) INTVAL (op1) >=
+	  (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (mode)))
+    {
+      unsigned HOST_WIDE_INT val = (unsigned HOST_WIDE_INT) INTVAL (op1);
+      val &= mask;
+
+      if (val >= (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (mode))
+	{
+	  /* Avoid out-of-range shifts.  */
+	  if (rotate)
+	    op1 = GEN_INT (INTVAL (op1) & (GET_MODE_BITSIZE (mode) - 1));
+	  else if (!left && !unsignedp)
+	    op1 = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
+	  else
+	    return CONST0_RTX (mode);
+	}
     }
+  else if (GET_CODE (op1) == SUBREG
+	   && subreg_lowpart_p (op1)
+	   && INTEGRAL_MODE_P (GET_MODE (SUBREG_REG (op1)))
+
+	   /* If shifts are not truncated, do not simplify subregs.  */
+	   && mask <= (unsigned HOST_WIDE_INT) GET_MODE_MASK (GET_MODE (op1)))
+    op1 = SUBREG_REG (op1);
 
   if (op1 == const0_rtx)
     return shifted;
Index: gcc/target-def.h
===================================================================
--- gcc/target-def.h	(branch rebase-shift-count-trunc)
+++ gcc/target-def.h	(working copy)
@@ -441,6 +441,10 @@
 #define TARGET_BINDS_LOCAL_P default_binds_local_p
 #endif
 
+#ifndef TARGET_EXTRACT_TRUNCATION_MASK
+#define TARGET_EXTRACT_TRUNCATION_MASK default_shift_truncation_mask
+#endif
+
 #ifndef TARGET_SHIFT_TRUNCATION_MASK
 #define TARGET_SHIFT_TRUNCATION_MASK default_shift_truncation_mask
 #endif
@@ -875,6 +879,7 @@
   TARGET_MANGLE_DECL_ASSEMBLER_NAME,		\
   TARGET_ENCODE_SECTION_INFO,			\
   TARGET_STRIP_NAME_ENCODING,			\
+  TARGET_EXTRACT_TRUNCATION_MASK,		\
   TARGET_SHIFT_TRUNCATION_MASK,			\
   TARGET_MIN_DIVISIONS_FOR_RECIP_MUL,		\
   TARGET_MODE_REP_EXTENDED,			\
Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(branch rebase-shift-count-trunc)
+++ gcc/combine.c	(working copy)
@@ -5225,13 +5225,14 @@ combine_simplify_rtx (rtx x, enum machin
 	return simplify_shift_const (x, code, mode, XEXP (x, 0),
 				     INTVAL (XEXP (x, 1)));
 
-      else if (SHIFT_COUNT_TRUNCATED && !REG_P (XEXP (x, 1)))
-	SUBST (XEXP (x, 1),
-	       force_to_mode (XEXP (x, 1), GET_MODE (XEXP (x, 1)),
-			      ((HOST_WIDE_INT) 1
-			       << exact_log2 (GET_MODE_BITSIZE (GET_MODE (x))))
-			      - 1,
-			      0));
+      else if (!REG_P (XEXP (x, 1)))
+	{
+	  enum machine_mode mode = GET_MODE (x);
+	  unsigned HOST_WIDE_INT mask = targetm.shift_truncation_mask (mode);
+	  if (mask != ~(unsigned HOST_WIDE_INT)0)
+	    SUBST (XEXP (x, 1),
+		   force_to_mode (XEXP (x, 1), GET_MODE (XEXP (x, 1)), mask, 0));
+	}
       break;
 
     default:
@@ -8992,24 +8993,11 @@ simplify_shift_const_1 (enum rtx_code co
   int complement_p = 0;
   rtx new_rtx, x;
 
-  /* Make sure and truncate the "natural" shift on the way in.  We don't
-     want to do this inside the loop as it makes it more difficult to
-     combine shifts.  */
-  if (SHIFT_COUNT_TRUNCATED)
-    orig_count &= GET_MODE_BITSIZE (mode) - 1;
-
-  /* If we were given an invalid count, don't do anything except exactly
-     what was requested.  */
-
-  if (orig_count < 0 || orig_count >= (int) GET_MODE_BITSIZE (mode))
-    return NULL_RTX;
-
-  count = orig_count;
-
-  /* Unless one of the branches of the `if' in this loop does a `continue',
-     we will `break' the loop after the `if'.  */
+  gcc_assert (orig_count >= 0 && orig_count < GET_MODE_BITSIZE (mode));
 
-  while (count != 0)
+  /* Unless the body of the loop does a `continue', we will `break' at
+     the end of the loop.  */
+  for (count = orig_count; count != 0; )
     {
       /* If we have an operand of (clobber (const_int 0)), fail.  */
       if (GET_CODE (varop) == CLOBBER)
@@ -10366,10 +10354,13 @@ simplify_comparison (enum rtx_code code,
 	  /* If we are extracting a single bit from a variable position in
 	     a constant that has only a single bit set and are comparing it
 	     with zero, we can convert this into an equality comparison
-	     between the position and the location of the single bit.  */
-	  /* Except we can't if SHIFT_COUNT_TRUNCATED is set, since we might
-	     have already reduced the shift count modulo the word size.  */
-	  if (!SHIFT_COUNT_TRUNCATED
+	     between the position and the location of the single bit.
+	     Except we can't if shift counts are truncated more strictly
+	     than extractions, since we might have already reduced the
+	     shift count modulo the word size---for example changing
+	     (1 << (A & 31)) to (1 << A) via force_to_mode.  */
+	  if ((targetm.shift_truncation_mask (mode)
+	       >= targetm.extract_truncation_mask (mode))
 	      && GET_CODE (XEXP (op0, 0)) == CONST_INT
 	      && XEXP (op0, 1) == const1_rtx
 	      && equality_comparison_p && const_op == 0
Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	(branch rebase-shift-count-trunc)
+++ gcc/config/arm/arm.h	(working copy)
@@ -2293,11 +2293,7 @@ do {							\
    that the native compiler puts too large (> 32) immediate shift counts
    into a register and shifts by the register, letting the ARM decide what
    to do instead of doing that itself.  */
-/* This is all wrong.  Defining SHIFT_COUNT_TRUNCATED tells combine that
-   code like (X << (Y % 32)) for register X, Y is equivalent to (X << Y).
-   On the arm, Y in a register is used modulo 256 for the shift. Only for
-   rotates is modulo 32 used.  */
-/* #define SHIFT_COUNT_TRUNCATED 1 */
+#define SHIFT_COUNT_TRUNCATED 1
 
 /* All integers have the same format so truncation is easy.  */
 #define TRULY_NOOP_TRUNCATION(OUTPREC, INPREC)  1
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(branch rebase-shift-count-trunc)
+++ gcc/config/arm/arm.c	(working copy)
@@ -19040,7 +19040,7 @@ arm_vector_mode_supported_p (enum machin
 static unsigned HOST_WIDE_INT
 arm_shift_truncation_mask (enum machine_mode mode)
 {
-  return mode == SImode ? 255 : 0;
+  return mode == SImode ? 255 : ~(unsigned HOST_WIDE_INT) 0;
 }
 
 
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(branch rebase-shift-count-trunc)
+++ gcc/optabs.c	(working copy)
@@ -1799,17 +1799,14 @@ expand_binop (enum machine_mode mode, op
       double_shift_mask = targetm.shift_truncation_mask (mode);
       shift_mask = targetm.shift_truncation_mask (word_mode);
       op1_mode = GET_MODE (op1) != VOIDmode ? GET_MODE (op1) : word_mode;
-
-      /* Apply the truncation to constant shifts.  */
-      if (double_shift_mask > 0 && GET_CODE (op1) == CONST_INT)
-	op1 = GEN_INT (INTVAL (op1) & double_shift_mask);
+      op1 = GEN_INT (INTVAL (op1) & double_shift_mask);
 
       if (op1 == CONST0_RTX (op1_mode))
 	return op0;
 
       /* Make sure that this is a combination that expand_doubleword_shift
 	 can handle.  See the comments there for details.  */
-      if (double_shift_mask == 0
+      if (double_shift_mask == ~(unsigned HOST_WIDE_INT) 0
 	  || (shift_mask == BITS_PER_WORD - 1
 	      && double_shift_mask == BITS_PER_WORD * 2 - 1))
 	{

patch 4/4:
2009-03-13  Paolo Bonzini  <bonzini@gnu.org>

	* targhooks.c (default_shift_truncation_mask): Handle vector modes.

Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	(branch rebase-shift-count-trunc)
+++ gcc/targhooks.c	(working copy)
@@ -178,9 +178,14 @@ default_unwind_word_mode (void)
 unsigned HOST_WIDE_INT
 default_shift_truncation_mask (enum machine_mode mode)
 {
-  return (SHIFT_COUNT_TRUNCATED
-	  ? (unsigned HOST_WIDE_INT) GET_MODE_BITSIZE (mode) - 1
-	  : ~(unsigned HOST_WIDE_INT) 0);
+  if (!SHIFT_COUNT_TRUNCATED)
+    return ~(unsigned HOST_WIDE_INT) 0;
+
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT
+      || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
+    return GET_MODE_BITSIZE (mode) / GET_MODE_NUNITS (mode) - 1;
+  else
+    return GET_MODE_BITSIZE (mode) - 1;
 }
 
 /* The default implementation of TARGET_MIN_DIVISIONS_FOR_RECIP_MUL.  */
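
(A quick sanity check of the vector branch in patch 4/4, with
illustrative numbers for a V4SImode-like vector:)

#include <assert.h>

int main (void)
{
  int bitsize = 128, nunits = 4;	/* hypothetical V4SImode figures */
  assert (bitsize / nunits - 1 == 31);	/* per-lane mask matches SImode */
  return 0;
}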

