Re: Add __builtin_clrsb, similar to clz/ctz

From: Bernd Schmidt <bernds@codesourcery.com>
To: Richard Henderson <rth@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: Add __builtin_clrsb, similar to clz/ctz
Date: Mon, 20 Jun 2011 20:32:00 -0000
Message-ID: <4DFFA1AE.7070405@codesourcery.com>
In-Reply-To: <4DFA2E85.2030601@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1835 bytes --]

On 06/16/2011 06:25 PM, Richard Henderson wrote:
> On 06/16/2011 05:44 AM, Bernd Schmidt wrote:
>> +@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
>> +Returns the number of leading redundant sign bits in @var{x}, starting
>> +at the most significant bit position.
>> +@end deftypefn
> 
> Do we want a signed argument, since we're talking about signs?

Err, yes. It's signed everywhere else (builtins.def etc.).

> It would seem that unlike clz, this function is not undefined for zero.
> What about INT_MIN?  Do all cpus handle those edge cases the same way?

-1 and zero should both produce the same value, 31 (for a 32-bit
integer). I don't see why INT_MIN should be special - the return value
is zero. This is true for C6X and Blackfin; ARM documentation suggests
it's also true for their VCLS instruction. I've not found proper
picochip documentation, but some other documents suggest it's also
implemented this way.
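
To make those edge cases concrete, here is a minimal reference model.
It is only a sketch, not the implementation in the patch: the name
ref_clrsb is made up for illustration, and a 32-bit int is assumed.

```c
#include <assert.h>
#include <limits.h>

/* Reference model (hypothetical name, not GCC code): count how many
   bits directly below the sign bit are copies of it.
   Assumes a 32-bit int.  */
int ref_clrsb(int x)
{
    unsigned int u = (unsigned int)x;
    unsigned int sign = u >> 31;
    int n = 0;

    for (int bit = 30; bit >= 0; bit--) {
        if (((u >> bit) & 1u) != sign)
            break;
        n++;
    }
    return n;
}

/* The edge cases discussed above, under the 32-bit assumption.  */
void check_edge_cases(void)
{
    assert(ref_clrsb(0) == 31);        /* all bits equal the sign bit */
    assert(ref_clrsb(-1) == 31);       /* likewise */
    assert(ref_clrsb(INT_MIN) == 0);   /* bit 30 already differs */
    assert(ref_clrsb(INT_MAX) == 0);
    assert(ref_clrsb(1) == 30);
}
```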

> Do you get smaller code in general from
> 
>   if (x < 0)
>     x = ~x;
>   if (x == 0)
>     return W_TYPE_SIZE - 1;
>   count_leading_zeros(ret, x);
>   return ret - 1;

Probably.
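
For reference, the fallback quoted above can be written as a
standalone function. This is a sketch under assumptions: the name
clrsb_via_clz is hypothetical, and __builtin_clz stands in for
libgcc's count_leading_zeros macro.

```c
#include <assert.h>
#include <limits.h>

/* Sketch of the clz-based fallback (hypothetical name).  __builtin_clz
   is used here in place of libgcc's count_leading_zeros.  */
int clrsb_via_clz(int x)
{
    const int width = (int)(CHAR_BIT * sizeof(int));

    if (x < 0)
        x = ~x;            /* map negative inputs onto non-negative ones */
    if (x == 0)
        return width - 1;  /* clz is undefined for 0, so handle it here */
    return __builtin_clz((unsigned int)x) - 1;
}
```

After the complement step the input is non-negative, so it always has
at least one leading zero (the sign bit itself); subtracting one from
the clz result discounts that bit.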

>> -(define_insn "signbitssi2"
>> +(define_insn "clrsbsi2"
>>    [(set (match_operand:HI 0 "register_operand" "=d")
>>  	(if_then_else:HI
>>  	 (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
> 
> No use of the new rtx code?

D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
showed a problem with some of the existing cases in
simplify_const_unary_operation: for ffs/clz/ctz/clrsb/parity/popcount,
we should look at the mode of the operand, rather than the mode of the
operation. This limits what we can do in that function, since op_mode is
sometimes VOIDmode - we really should add builtin folders for these at
some point.
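
To illustrate why the operand width matters, here is a small
standalone sketch. It is not the simplify-rtx.c code: fold_clrsb and
its width parameter are hypothetical, and the constant is simply
passed in as a 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: fold clrsb of a constant whose
   operand is `width` bits wide (1 <= width <= 64).  */
int fold_clrsb(uint64_t value, unsigned width)
{
    uint64_t mask = width == 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
    uint64_t v = value & mask;
    int n = 0;

    /* If the sign bit is set, complement within the operand width so
       that redundant sign bits show up as leading zeros.  */
    if ((v >> (width - 1)) & 1)
        v = ~v & mask;

    /* Count zero bits below the sign position.  */
    for (int bit = (int)width - 2; bit >= 0; bit--) {
        if ((v >> bit) & 1)
            break;
        n++;
    }
    return n;
}
```

For the constant 0x10000 with a 32-bit operand, fold_clrsb (0x10000, 32)
gives 14. Using the 16-bit width of the result instead masks the
constant down to zero and gives 15, which is the kind of wrong answer
that comes from looking at the mode of the operation.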

New patch below. Retested on i686 and bfin.


Bernd

[-- Attachment #2: clrsb0620.diff --]
[-- Type: text/plain, Size: 26109 bytes --]

	libgcc/
	* Makefile.in (lib2funcs): Add _clrsbsi2 and _clrsbdi2.
	* libgcc-std.ver.in (GCC_4.7.0): New section.

	gcc/
	* doc/extend.texi (__builtin_clrsb, __builtin_clrsbl,
	__builtin_clrsbll): Document.
	* doc/rtl.texi (clrsb): New entry.
	* optabs.c (widen_leading): Renamed from widen_clz.  New argument
	UNOPTAB.  All callers changed.  Use UNOPTAB instead of clz_optab.
	(expand_unop): Handle clrsb_optab.
	(init_optabs): Initialize it.
	* optabs.h (enum optab_index): New entry OTI_clrsb.
	(clrsb_optab): Define.
	* genopinit.c (optabs): Add an entry for it.
	* builtins.c (expand_builtin): Handle clrsb builtin functions.
	* builtins.def (BUILT_IN_CLRSB, BUILT_IN_CLRSBIMAX, BUILT_IN_CLRSBL,
	BUILT_IN_CLRSBLL): New.
	* rtl.def (CLRSB): New code.
	* dwarf2out.c (mem_loc_descriptor): Handle it.
	* simplify-rtx.c (simplify_const_unary_operation): Likewise.
	Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
	and popcount.
	* libgcc2.c (__clrsbSI2, __clrsbDI2): New functions.
	* libgcc2.h (__clrsbSI2, __clrsbDI2): Define and declare.
	(__ctzDI2): Move declaration.
	* config/bfin/bfin.md (clrsbsi2): New expander.
	(signbitssi2): Use the CLRSB rtx.
	(clrsbhi2): Renamed from signbitshi2.  Use the CLRSB rtx.
	* config/bfin/bfin.c (bdesc_1arg): Changed accordingly.

	gcc/testsuite/
	* gcc.c-torture/execute/builtin-bitops-1.c (MAKE_FUNS): Make
	my_clrsb test functions.
	(main): Test clrsb.
	* gcc.dg/builtin-protos-1.c (test_s, test_u, test_sl, test_ul,
	test_sll, test_ull): Add clrsb tests.
	* gcc.dg/torture/builtin-attr-1.c: Add tests for clrsb, clrsbl,
	clrsbll.

Index: libgcc/Makefile.in
===================================================================
--- libgcc/Makefile.in	(revision 174339)
+++ libgcc/Makefile.in	(working copy)
@@ -320,7 +320,7 @@ lib2funcs = _muldi3 _negdi2 _lshrdi3 _as
 	    _ctzsi2 _ctzdi2 _popcount_tab _popcountsi2 _popcountdi2	   \
 	    _paritysi2 _paritydi2 _powisf2 _powidf2 _powixf2 _powitf2	   \
 	    _mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3 _divxc3	   \
-	    _divtc3 _bswapsi2 _bswapdi2
+	    _divtc3 _bswapsi2 _bswapdi2 _clrsbsi2 _clrsbdi2
 
 # The floating-point conversion routines that involve a single-word integer.
 # XX stands for the integer mode.
Index: libgcc/libgcc-std.ver.in
===================================================================
--- libgcc/libgcc-std.ver.in	(revision 174339)
+++ libgcc/libgcc-std.ver.in	(working copy)
@@ -1920,3 +1920,10 @@ GCC_4.6.0 {
   __morestack_initial_sp
   __splitstack_find
 }
+
+%inherit GCC_4.7.0 GCC_4.6.0
+GCC_4.7.0 {
+  __PFX__clrsbsi2
+  __PFX__clrsbdi2
+  __PFX__clrsbti2
+}
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 174339)
+++ gcc/doc/extend.texi	(working copy)
@@ -7828,6 +7828,12 @@ Returns the number of trailing 0-bits in
 significant bit position.  If @var{x} is 0, the result is undefined.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsb (int x)
+Returns the number of leading redundant sign bits in @var{x}, i.e. the
+number of bits following the most significant bit which are identical
+to it.  There are no special cases for 0 or other values. 
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcount (unsigned int x)
 Returns the number of 1-bits in @var{x}.
 @end deftypefn
@@ -7852,6 +7858,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbl (long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountl (unsigned long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long}.
@@ -7877,6 +7888,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbll (long long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountll (unsigned long long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long long}.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi	(revision 174339)
+++ gcc/doc/rtl.texi	(working copy)
@@ -2400,6 +2400,14 @@ zero if @var{x} is zero.)  The mode of @
 depending on the target machine, various mode combinations may be
 valid.
 
+@findex clrsb
+@item (clrsb:@var{m} @var{x})
+Represents the number of redundant leading sign bits in @var{x},
+represented as an integer of mode @var{m}, starting at the most
+significant bit position.  This is one less than the number of leading
+sign bits (either 0 or 1), with no special cases.  The mode of @var{x}
+will usually be an integer mode and may differ from @var{m}.
+
 @findex clz
 @item (clz:@var{m} @var{x})
 Represents the number of leading 0-bits in @var{x}, represented as an
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 174339)
+++ gcc/optabs.c	(working copy)
@@ -2317,9 +2317,12 @@ expand_simple_unop (enum machine_mode mo
 /* Try calculating
 	(clz:narrow x)
    as
-	(clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).  */
+	(clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).
+
+   A similar operation can be used for clrsb.  UNOPTAB says which operation
+   we are trying to expand.  */
 static rtx
-widen_clz (enum machine_mode mode, rtx op0, rtx target)
+widen_leading (enum machine_mode mode, rtx op0, rtx target, optab unoptab)
 {
   enum mode_class mclass = GET_MODE_CLASS (mode);
   if (CLASS_HAS_WIDER_MODES_P (mclass))
@@ -2329,7 +2332,7 @@ widen_clz (enum machine_mode mode, rtx o
 	   wider_mode != VOIDmode;
 	   wider_mode = GET_MODE_WIDER_MODE (wider_mode))
 	{
-	  if (optab_handler (clz_optab, wider_mode) != CODE_FOR_nothing)
+	  if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
 	    {
 	      rtx xop0, temp, last;
 
@@ -2338,7 +2341,7 @@ widen_clz (enum machine_mode mode, rtx o
 	      if (target == 0)
 		target = gen_reg_rtx (mode);
 	      xop0 = widen_operand (op0, wider_mode, mode, true, false);
-	      temp = expand_unop (wider_mode, clz_optab, xop0, NULL_RTX, true);
+	      temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX, true);
 	      if (temp != 0)
 		temp = expand_binop (wider_mode, sub_optab, temp,
 				     GEN_INT (GET_MODE_BITSIZE (wider_mode)
@@ -2832,7 +2835,7 @@ expand_unop (enum machine_mode mode, opt
   /* Widening (or narrowing) clz needs special treatment.  */
   if (unoptab == clz_optab)
     {
-      temp = widen_clz (mode, op0, target);
+      temp = widen_leading (mode, op0, target, unoptab);
       if (temp)
 	return temp;
 
@@ -2844,7 +2847,15 @@ expand_unop (enum machine_mode mode, opt
 	    return temp;
 	}
 
-	goto try_libcall;
+      goto try_libcall;
+    }
+
+  if (unoptab == clrsb_optab)
+    {
+      temp = widen_leading (mode, op0, target, unoptab);
+      if (temp)
+	return temp;
+      goto try_libcall;
     }
 
   /* Widening (or narrowing) bswap needs special treatment.  */
@@ -2999,7 +3010,8 @@ expand_unop (enum machine_mode mode, opt
       /* All of these functions return small values.  Thus we choose to
 	 have them return something that isn't a double-word.  */
       if (unoptab == ffs_optab || unoptab == clz_optab || unoptab == ctz_optab
-	  || unoptab == popcount_optab || unoptab == parity_optab)
+	  || unoptab == clrsb_optab || unoptab == popcount_optab
+	  || unoptab == parity_optab)
 	outmode
 	  = GET_MODE (hard_libcall_value (TYPE_MODE (integer_type_node),
 					  optab_libfunc (unoptab, mode)));
@@ -5943,6 +5955,7 @@ init_optabs (void)
   init_optab (ffs_optab, FFS);
   init_optab (clz_optab, CLZ);
   init_optab (ctz_optab, CTZ);
+  init_optab (clrsb_optab, CLRSB);
   init_optab (popcount_optab, POPCOUNT);
   init_optab (parity_optab, PARITY);
   init_optab (sqrt_optab, SQRT);
@@ -6173,6 +6186,9 @@ init_optabs (void)
   ctz_optab->libcall_basename = "ctz";
   ctz_optab->libcall_suffix = '2';
   ctz_optab->libcall_gen = gen_int_libfunc;
+  clrsb_optab->libcall_basename = "clrsb";
+  clrsb_optab->libcall_suffix = '2';
+  clrsb_optab->libcall_gen = gen_int_libfunc;
   popcount_optab->libcall_basename = "popcount";
   popcount_optab->libcall_suffix = '2';
   popcount_optab->libcall_gen = gen_int_libfunc;
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h	(revision 174339)
+++ gcc/optabs.h	(working copy)
@@ -220,6 +220,7 @@ enum optab_index
   OTI_ffs,
   OTI_clz,
   OTI_ctz,
+  OTI_clrsb,
   OTI_popcount,
   OTI_parity,
   /* Square root */
@@ -456,6 +457,7 @@ enum optab_index
 #define ffs_optab (&optab_table[OTI_ffs])
 #define clz_optab (&optab_table[OTI_clz])
 #define ctz_optab (&optab_table[OTI_ctz])
+#define clrsb_optab (&optab_table[OTI_clrsb])
 #define popcount_optab (&optab_table[OTI_popcount])
 #define parity_optab (&optab_table[OTI_parity])
 #define sqrt_optab (&optab_table[OTI_sqrt])
Index: gcc/genopinit.c
===================================================================
--- gcc/genopinit.c	(revision 174339)
+++ gcc/genopinit.c	(working copy)
@@ -199,6 +199,7 @@ static const char * const optabs[] =
   "set_optab_handler (ffs_optab, $A, CODE_FOR_$(ffs$a2$))",
   "set_optab_handler (clz_optab, $A, CODE_FOR_$(clz$a2$))",
   "set_optab_handler (ctz_optab, $A, CODE_FOR_$(ctz$a2$))",
+  "set_optab_handler (clrsb_optab, $A, CODE_FOR_$(clrsb$a2$))",
   "set_optab_handler (popcount_optab, $A, CODE_FOR_$(popcount$a2$))",
   "set_optab_handler (parity_optab, $A, CODE_FOR_$(parity$a2$))",
   "set_optab_handler (mov_optab, $A, CODE_FOR_$(mov$a$))",
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 174339)
+++ gcc/builtins.c	(working copy)
@@ -6068,6 +6068,14 @@ expand_builtin (tree exp, rtx target, rt
 	return target;
       break;
 
+    CASE_INT_FN (BUILT_IN_CLRSB):
+    case BUILT_IN_CLRSBIMAX:
+      target = expand_builtin_unop (target_mode, exp, target,
+				    subtarget, clrsb_optab);
+      if (target)
+	return target;
+      break;
+
     CASE_INT_FN (BUILT_IN_POPCOUNT):
     case BUILT_IN_POPCOUNTIMAX:
       target = expand_builtin_unop (target_mode, exp, target,
Index: gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c	(revision 174339)
+++ gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c	(working copy)
@@ -62,6 +62,16 @@ int my_clz##suffix(type x) {						\
     return i;								\
 }									\
 									\
+int my_clrsb##suffix(type x) {						\
+    int i;								\
+    int leading = (x >> CHAR_BIT * sizeof (type) - 1) & 1;		\
+    for (i = 1; i < CHAR_BIT * sizeof (type); i++)			\
+	if (((x >> ((CHAR_BIT * sizeof (type)) - i - 1)) & 1)		\
+	    != leading)							\
+	    break;							\
+    return i - 1;							\
+}									\
+									\
 int my_popcount##suffix(type x) {					\
     int i;								\
     int count = 0;							\
@@ -176,6 +186,8 @@ main (void)
       if (ints[i] != 0
 	  && __builtin_ctz (ints[i]) != my_ctz (ints[i]))
 	abort ();
+      if (__builtin_clrsb (ints[i]) != my_clrsb (ints[i]))
+	abort ();
       if (__builtin_popcount (ints[i]) != my_popcount (ints[i]))
 	abort ();
       if (__builtin_parity (ints[i]) != my_parity (ints[i]))
@@ -192,6 +204,8 @@ main (void)
       if (longs[i] != 0
 	  && __builtin_ctzl (longs[i]) != my_ctzl (longs[i]))
 	abort ();
+      if (__builtin_clrsbl (longs[i]) != my_clrsbl (longs[i]))
+	abort ();
       if (__builtin_popcountl (longs[i]) != my_popcountl (longs[i]))
 	abort ();
       if (__builtin_parityl (longs[i]) != my_parityl (longs[i]))
@@ -208,6 +222,8 @@ main (void)
       if (longlongs[i] != 0
 	  && __builtin_ctzll (longlongs[i]) != my_ctzll (longlongs[i]))
 	abort ();
+      if (__builtin_clrsbll (longlongs[i]) != my_clrsbll (longlongs[i]))
+	abort ();
       if (__builtin_popcountll (longlongs[i]) != my_popcountll (longlongs[i]))
 	abort ();
       if (__builtin_parityll (longlongs[i]) != my_parityll (longlongs[i]))
@@ -223,6 +239,8 @@ main (void)
     abort ();								\
   if (x != 0 && __builtin_ctz##suffix (x) != my_ctz##suffix (x))	\
     abort ();								\
+  if (__builtin_clrsb##suffix (x) != my_clrsb##suffix (x))		\
+    abort ();								\
   if (__builtin_popcount##suffix (x) != my_popcount##suffix (x))	\
     abort ();								\
   if (__builtin_parity##suffix (x) != my_parity##suffix (x))		\
Index: gcc/testsuite/gcc.dg/builtin-protos-1.c
===================================================================
--- gcc/testsuite/gcc.dg/builtin-protos-1.c	(revision 174339)
+++ gcc/testsuite/gcc.dg/builtin-protos-1.c	(working copy)
@@ -7,6 +7,7 @@ test_s (signed int x)
   return __builtin_abs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clz (x)		/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctz (x)		/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsb (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffs (x)		/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parity (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcount (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -18,6 +19,7 @@ test_u (unsigned int x)
   return __builtin_abs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clz (x)		/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctz (x)		/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsb (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffs (x)		/* { dg-warning "as signed due to prototype" } */
     + __builtin_parity (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcount (x);	/* { dg-bogus "as signed due to prototype" } */
@@ -29,6 +31,7 @@ test_sl (signed long x)
   return __builtin_labs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clzl (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctzl (x)	/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsbl (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffsl (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parityl (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcountl (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -40,6 +43,7 @@ test_ul (unsigned long x)
   return __builtin_labs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clzl (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzl (x)	/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbl (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffsl (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_parityl (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountl (x);	/* { dg-bogus "as signed due to prototype" } */
@@ -51,6 +55,7 @@ test_sll (signed long long x)
   return __builtin_llabs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clzll (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctzll (x)	/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsbll (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffsll (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parityll (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcountll (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -62,6 +67,7 @@ test_ull (unsigned long long x)
   return __builtin_llabs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clzll (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzll (x)	/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbll (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffsll (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_parityll (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountll (x);	/* { dg-bogus "as signed due to prototype" } */
Index: gcc/testsuite/gcc.dg/torture/builtin-attr-1.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/builtin-attr-1.c	(revision 174339)
+++ gcc/testsuite/gcc.dg/torture/builtin-attr-1.c	(working copy)
@@ -416,6 +416,9 @@ BUILTIN_TEST1 (clzll, long long)
 BUILTIN_TEST1 (ctz, int)
 BUILTIN_TEST1 (ctzl, long)
 BUILTIN_TEST1 (ctzll, long long)
+BUILTIN_TEST1 (clrsb, int)
+BUILTIN_TEST1 (clrsbl, long)
+BUILTIN_TEST1 (clrsbll, long long)
 TEST1         (ffs, int, int)
 TEST1         (ffsl, long, int)
 TEST1         (ffsll, long long, int)
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def	(revision 174339)
+++ gcc/builtins.def	(working copy)
@@ -620,6 +620,10 @@ DEF_GCC_BUILTIN        (BUILT_IN_CTZ, "c
 DEF_GCC_BUILTIN        (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
 DEF_GCC_BUILTIN        (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def	(revision 174339)
+++ gcc/rtl.def	(working copy)
@@ -613,6 +613,10 @@ DEF_RTL_EXPR(BSWAP, "bswap", "e", RTX_UN
    or 0 if arg is 0.  */
 DEF_RTL_EXPR(FFS, "ffs", "e", RTX_UNARY)
 
+/* Count number of leading redundant sign bits (number of leading
+   sign bits minus one).  */
+DEF_RTL_EXPR(CLRSB, "clrsb", "e", RTX_UNARY)
+
 /* Count leading zeros.  */
 DEF_RTL_EXPR(CLZ, "clz", "e", RTX_UNARY)
 
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	(revision 174339)
+++ gcc/dwarf2out.c	(working copy)
@@ -14874,6 +14874,7 @@ mem_loc_descriptor (rtx rtl, enum machin
     case FFS:
     case CLZ:
     case CTZ:
+    case CLRSB:
     case POPCOUNT:
     case PARITY:
     case ASM_OPERANDS:
Index: gcc/libgcc2.c
===================================================================
--- gcc/libgcc2.c	(revision 174339)
+++ gcc/libgcc2.c	(working copy)
@@ -762,7 +762,50 @@ __ctzDI2 (UDWtype x)
   return ret + add;
 }
 #endif
+\f
+#ifdef L_clrsbsi2
+#undef int
+int
+__clrsbSI2 (Wtype x)
+{
+  Wtype ret;
 
+  if (x < 0)
+    x = ~x;
+  if (x == 0)
+    return W_TYPE_SIZE - 1;
+  count_leading_zeros (ret, x);
+  return ret - 1;
+}
+#endif
+\f
+#ifdef L_clrsbdi2
+#undef int
+int
+__clrsbDI2 (DWtype x)
+{
+  const DWunion uu = {.ll = x};
+  UWtype word;
+  Wtype ret, add;
+
+  if (uu.s.high == 0)
+    word = uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high == -1)
+    word = ~uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high >= 0)
+    word = uu.s.high, add = 0;
+  else
+    word = ~uu.s.high, add = 0;
+
+  if (word == 0)
+    ret = W_TYPE_SIZE;
+  else
+    count_leading_zeros (ret, word);
+
+  return ret + add - 1;
+}
+#endif
+\f
 #ifdef L_popcount_tab
 const UQItype __popcount_tab[256] =
 {
Index: gcc/libgcc2.h
===================================================================
--- gcc/libgcc2.h	(revision 174339)
+++ gcc/libgcc2.h	(working copy)
@@ -315,11 +315,13 @@ typedef int shift_count_type __attribute
 #define __ffsSI2	__NW(ffs,2)
 #define __clzSI2	__NW(clz,2)
 #define __ctzSI2	__NW(ctz,2)
+#define __clrsbSI2	__NW(clrsb,2)
 #define __popcountSI2	__NW(popcount,2)
 #define __paritySI2	__NW(parity,2)
 #define __ffsDI2	__NDW(ffs,2)
 #define __clzDI2	__NDW(clz,2)
 #define __ctzDI2	__NDW(ctz,2)
+#define __clrsbDI2	__NDW(clrsb,2)
 #define __popcountDI2	__NDW(popcount,2)
 #define __parityDI2	__NDW(parity,2)
 
@@ -508,9 +510,11 @@ extern const UQItype __clz_tab[256];
 extern int __clzDI2 (UDWtype);
 extern int __clzSI2 (UWtype);
 extern int __ctzSI2 (UWtype);
+extern int __ctzDI2 (UDWtype);
+extern int __clrsbSI2 (Wtype);
+extern int __clrsbDI2 (DWtype);
 extern int __ffsSI2 (UWtype);
 extern int __ffsDI2 (DWtype);
-extern int __ctzDI2 (UDWtype);
 extern int __popcountSI2 (UWtype);
 extern int __popcountDI2 (UDWtype);
 extern int __paritySI2 (UWtype);
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 174339)
+++ gcc/simplify-rtx.c	(working copy)
@@ -1127,6 +1127,7 @@ simplify_const_unary_operation (enum rtx
 				rtx op, enum machine_mode op_mode)
 {
   unsigned int width = GET_MODE_BITSIZE (mode);
+  unsigned int op_width = GET_MODE_BITSIZE (op_mode);
 
   if (code == VEC_DUPLICATE)
     {
@@ -1237,7 +1238,8 @@ simplify_const_unary_operation (enum rtx
     }
 
   if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+      && width <= HOST_BITS_PER_WIDE_INT
+      && op_width <= HOST_BITS_PER_WIDE_INT && op_width > 0)
     {
       HOST_WIDE_INT arg0 = INTVAL (op);
       HOST_WIDE_INT val;
@@ -1257,40 +1259,50 @@ simplify_const_unary_operation (enum rtx
 	  break;
 
 	case FFS:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  val = ffs_hwi (arg0);
 	  break;
 
 	case CLZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
+	  arg0 &= GET_MODE_MASK (op_mode);
+	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
 	    ;
 	  else
-	    val = GET_MODE_BITSIZE (mode) - floor_log2 (arg0) - 1;
+	    val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 1;
+	  break;
+
+	case CLRSB:
+	  arg0 &= GET_MODE_MASK (op_mode);
+	  if (arg0 == 0)
+	    val = GET_MODE_BITSIZE (op_mode) - 1;
+	  else if (arg0 >= 0)
+	    val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 2;
+	  else if (arg0 < 0)
+	    val = GET_MODE_BITSIZE (op_mode) - floor_log2 (~arg0) - 2;
 	  break;
 
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  if (arg0 == 0)
 	    {
 	      /* Even if the value at zero is undefined, we have to come
 		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-		val = GET_MODE_BITSIZE (mode);
+	      if (! CTZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
+		val = GET_MODE_BITSIZE (op_mode);
 	    }
 	  else
 	    val = ctz_hwi (arg0);
 	  break;
 
 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  val = 0;
 	  while (arg0)
 	    val++, arg0 &= arg0 - 1;
 	  break;
 
 	case PARITY:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  val = 0;
 	  while (arg0)
 	    val++, arg0 &= arg0 - 1;
Index: gcc/config/bfin/bfin.c
===================================================================
--- gcc/config/bfin/bfin.c	(revision 174339)
+++ gcc/config/bfin/bfin.c	(working copy)
@@ -6254,11 +6254,11 @@ static const struct builtin_description
 
   { CODE_FOR_ones, "__builtin_bfin_ones", BFIN_BUILTIN_ONES, 0 },
 
-  { CODE_FOR_signbitshi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 },
+  { CODE_FOR_clrsbhi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 },
   { CODE_FOR_ssneghi2, "__builtin_bfin_negate_fr1x16", BFIN_BUILTIN_NEG_1X16, 0 },
   { CODE_FOR_abshi2, "__builtin_bfin_abs_fr1x16", BFIN_BUILTIN_ABS_1X16, 0 },
 
-  { CODE_FOR_signbitssi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 },
+  { CODE_FOR_clrsbsi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 },
   { CODE_FOR_ssroundsi2, "__builtin_bfin_round_fr1x32", BFIN_BUILTIN_ROUND_1X32, 0 },
   { CODE_FOR_ssnegsi2, "__builtin_bfin_negate_fr1x32", BFIN_BUILTIN_NEG_1X32, 0 },
   { CODE_FOR_ssabssi2, "__builtin_bfin_abs_fr1x32", BFIN_BUILTIN_ABS_1X32, 0 },
Index: gcc/config/bfin/bfin.md
===================================================================
--- gcc/config/bfin/bfin.md	(revision 174339)
+++ gcc/config/bfin/bfin.md	(working copy)
@@ -1461,12 +1461,19 @@ (define_insn "one_cmplsi2"
   "%0 = ~%1;"
   [(set_attr "type" "alu0")])
 
+(define_expand "clrsbsi2"
+  [(set (match_dup 2)
+	(clrsb:HI (match_operand:SI 1 "register_operand" "d")))
+   (set (match_operand:SI 0 "register_operand")
+	(zero_extend:SI (match_dup 2)))]
+  ""
+{
+  operands[2] = gen_reg_rtx (HImode);
+})
+
 (define_insn "signbitssi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
-	(if_then_else:HI
-	 (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
-	 (clz:HI (not:SI (match_dup 1)))
-	 (clz:HI (match_dup 1))))]
+	(clrsb:HI (match_operand:SI 1 "register_operand" "d")))]
   ""
   "%h0 = signbits %1%!"
   [(set_attr "type" "dsp32")])
@@ -1518,12 +1525,9 @@ (define_insn "ssneghi2"
   "%0 = -%1 (V)%!"
   [(set_attr "type" "dsp32")])
 
-(define_insn "signbitshi2"
+(define_insn "clrsbhi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
-	(if_then_else:HI
-	 (lt (match_operand:HI 1 "register_operand" "d") (const_int 0))
-	 (clz:HI (not:HI (match_dup 1)))
-	 (clz:HI (match_dup 1))))]
+	(clrsb:HI (match_operand:HI 1 "register_operand" "d")))]
   ""
   "%h0 = signbits %h1%!"
   [(set_attr "type" "dsp32")])
