Add __builtin_clrsb, similar to clz/ctz

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Add __builtin_clrsb, similar to clz/ctz
@ 2011-06-16 13:06 Bernd Schmidt
  2011-06-16 13:10 ` Georg-Johann Lay
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Bernd Schmidt @ 2011-06-16 13:06 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 925 bytes --]

Several processors have a "count redundant sign bits" instruction:

* SIGNBITS on Blackfin
* NORM on C6X
* SBC, apparently, on picochip

and probably others. For example, for a value of 0 or -1, it returns 31,
while e.g. an input 0xc0000000 produces 1.

Other compilers tend to define builtin functions called "norm" for this,
and the operation is used in some DSP benchmarks. The patch below adds a
__builtin_clrsb family of functions, similar to __builtin_clz. I'm open
to suggestions about the name; I don't particularly like "norm" but
maybe that's what users expect.

Bootstrapped and tested on i686-linux (libmudflap.cth/pass39-frag.c
apparently fails randomly); also regression tested with a 4.5 c6x-elf
toolchain and the C6X port modified to provide clrsbsi2.  Also tested on
bfin-elf, where the modified builtins tests pass (test results are noisy
otherwise; I should do another run to get a baseline).

Ok?


Bernd

[-- Attachment #2: clrsb.diff --]
[-- Type: text/plain, Size: 23216 bytes --]

	libgcc/
	* Makefile.in (lib2funcs): Add _clrsbsi2 and _clrsbdi2.
	* libgcc-std.ver.in (GCC_4.7.0): New section.

	gcc/
	* doc/extend.texi (__builtin_clrsb, __builtin_clrsbl,
	__builtin_clrsbll): Document.
	* doc/rtl.texi (clrsb): New entry.
	* optabs.c (widen_leading): Renamed from widen_clz.  New argument
	UNOPTAB.  All callers changed.  Use UNOPTAB instead of clz_optab.
	(expand_unop): Handle clrsb_optab.
	(init_optabs): Initialize it.
	* optabs.h (enum optab_index): New entry OTI_clrsb.
	(clrsb_optab): Define.
	* genopinit.c (optabs): Add an entry for it.
	* simplify-rtx.c (simplify_const_unary_operation): Handle CLRSB.
	* builtins.c (expand_builtin): Handle clrsb builtin functions.
	* builtins.def (BUILT_IN_CLRSB, BUILT_IN_CLRSBIMAX, BUILT_IN_CLRSBL,
	BUILT_IN_CLRSBLL): New.
	* rtl.def (CLRSB): New code.
	* libgcc2.c (__clrsbSI2, __clrsbDI2): New functions.
	* libgcc2.h (__clrsbSI2, __clrsbDI2): Define and declare.
	(__ctzDI2): Move declaration.
	* config/bfin/bfin.md (clrsbsi2): Renamed from signbitssi2.
	(clrsbhi2): Renamed from signbitshi2.
	* config/bfin/bfin.c (bdesc_1arg): Changed accordingly.

	gcc/testsuite/
	* gcc.c-torture/excute/builtin-bitops-1.c (MAKE_FUNS): Make
	my_clrsb test functions.
	(main): Test clrsb.
	* gcc.dg/builtin-protos-1.c (test_s, test_u, test_sl, test_ul,
	test_sll, test_ull): Add clrsb tests.
	* gcc.dg/torture/builtin-attr-1.c: Add tests for clrsb, clrsbl,
	clrsbll.
	
Index: libgcc/Makefile.in
===================================================================
--- libgcc/Makefile.in	(revision 174339)
+++ libgcc/Makefile.in	(working copy)
@@ -320,7 +320,7 @@ lib2funcs = _muldi3 _negdi2 _lshrdi3 _as
 	    _ctzsi2 _ctzdi2 _popcount_tab _popcountsi2 _popcountdi2	   \
 	    _paritysi2 _paritydi2 _powisf2 _powidf2 _powixf2 _powitf2	   \
 	    _mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3 _divxc3	   \
-	    _divtc3 _bswapsi2 _bswapdi2
+	    _divtc3 _bswapsi2 _bswapdi2 _clrsbsi2 _clrsbdi2
 
 # The floating-point conversion routines that involve a single-word integer.
 # XX stands for the integer mode.
Index: libgcc/libgcc-std.ver.in
===================================================================
--- libgcc/libgcc-std.ver.in	(revision 174339)
+++ libgcc/libgcc-std.ver.in	(working copy)
@@ -1920,3 +1920,10 @@ GCC_4.6.0 {
   __morestack_initial_sp
   __splitstack_find
 }
+
+%inherit GCC_4.7.0 GCC_4.6.0
+GCC_4.7.0 {
+  __PFX__clrsbsi2
+  __PFX__clrsbdi2
+  __PFX__clrsbti2
+}
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 174339)
+++ gcc/doc/extend.texi	(working copy)
@@ -7828,6 +7828,11 @@ Returns the number of trailing 0-bits in
 significant bit position.  If @var{x} is 0, the result is undefined.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
+Returns the number of leading redundant sign bits in @var{x}, starting
+at the most significant bit position.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcount (unsigned int x)
 Returns the number of 1-bits in @var{x}.
 @end deftypefn
@@ -7852,6 +7857,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbl (unsigned long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{unsigned long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountl (unsigned long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long}.
@@ -7877,6 +7887,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbll (unsigned long long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{unsigned long long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountll (unsigned long long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long long}.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi	(revision 174339)
+++ gcc/doc/rtl.texi	(working copy)
@@ -2400,6 +2408,14 @@ zero if @var{x} is zero.)  The mode of @
 depending on the target machine, various mode combinations may be
 valid.
 
+@findex clrsb
+@item (clrsb:@var{m} @var{x})
+Represents the number of redundant leading sign bits in @var{x},
+represented as an integer of mode @var{m}, starting at the most
+significant bit position.  This is one less than the number of leading
+sign bits (either 0 or 1).  The mode of @var{x} will usually be an
+integer mode.
+
 @findex clz
 @item (clz:@var{m} @var{x})
 Represents the number of leading 0-bits in @var{x}, represented as an
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 174339)
+++ gcc/optabs.c	(working copy)
@@ -2317,9 +2317,12 @@ expand_simple_unop (enum machine_mode mo
 /* Try calculating
 	(clz:narrow x)
    as
-	(clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).  */
+	(clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).
+
+   A similar operation can be used for clrsb.  UNOPTAB says which operation
+   we are trying to expand.  */
 static rtx
-widen_clz (enum machine_mode mode, rtx op0, rtx target)
+widen_leading (enum machine_mode mode, rtx op0, rtx target, optab unoptab)
 {
   enum mode_class mclass = GET_MODE_CLASS (mode);
   if (CLASS_HAS_WIDER_MODES_P (mclass))
@@ -2329,7 +2332,7 @@ widen_clz (enum machine_mode mode, rtx o
 	   wider_mode != VOIDmode;
 	   wider_mode = GET_MODE_WIDER_MODE (wider_mode))
 	{
-	  if (optab_handler (clz_optab, wider_mode) != CODE_FOR_nothing)
+	  if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
 	    {
 	      rtx xop0, temp, last;
 
@@ -2338,7 +2341,7 @@ widen_clz (enum machine_mode mode, rtx o
 	      if (target == 0)
 		target = gen_reg_rtx (mode);
 	      xop0 = widen_operand (op0, wider_mode, mode, true, false);
-	      temp = expand_unop (wider_mode, clz_optab, xop0, NULL_RTX, true);
+	      temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX, true);
 	      if (temp != 0)
 		temp = expand_binop (wider_mode, sub_optab, temp,
 				     GEN_INT (GET_MODE_BITSIZE (wider_mode)
@@ -2832,7 +2835,7 @@ expand_unop (enum machine_mode mode, opt
   /* Widening (or narrowing) clz needs special treatment.  */
   if (unoptab == clz_optab)
     {
-      temp = widen_clz (mode, op0, target);
+      temp = widen_leading (mode, op0, target, unoptab);
       if (temp)
 	return temp;
 
@@ -2844,7 +2847,15 @@ expand_unop (enum machine_mode mode, opt
 	    return temp;
 	}
 
-	goto try_libcall;
+      goto try_libcall;
+    }
+
+  if (unoptab == clrsb_optab)
+    {
+      temp = widen_leading (mode, op0, target, unoptab);
+      if (temp)
+	return temp;
+      goto try_libcall;
     }
 
   /* Widening (or narrowing) bswap needs special treatment.  */
@@ -2999,7 +3010,8 @@ expand_unop (enum machine_mode mode, opt
       /* All of these functions return small values.  Thus we choose to
 	 have them return something that isn't a double-word.  */
       if (unoptab == ffs_optab || unoptab == clz_optab || unoptab == ctz_optab
-	  || unoptab == popcount_optab || unoptab == parity_optab)
+	  || unoptab == clrsb_optab || unoptab == popcount_optab
+	  || unoptab == parity_optab)
 	outmode
 	  = GET_MODE (hard_libcall_value (TYPE_MODE (integer_type_node),
 					  optab_libfunc (unoptab, mode)));
@@ -5943,6 +5955,7 @@ init_optabs (void)
   init_optab (ffs_optab, FFS);
   init_optab (clz_optab, CLZ);
   init_optab (ctz_optab, CTZ);
+  init_optab (clrsb_optab, CLRSB);
   init_optab (popcount_optab, POPCOUNT);
   init_optab (parity_optab, PARITY);
   init_optab (sqrt_optab, SQRT);
@@ -6173,6 +6186,9 @@ init_optabs (void)
   ctz_optab->libcall_basename = "ctz";
   ctz_optab->libcall_suffix = '2';
   ctz_optab->libcall_gen = gen_int_libfunc;
+  clrsb_optab->libcall_basename = "clrsb";
+  clrsb_optab->libcall_suffix = '2';
+  clrsb_optab->libcall_gen = gen_int_libfunc;
   popcount_optab->libcall_basename = "popcount";
   popcount_optab->libcall_suffix = '2';
   popcount_optab->libcall_gen = gen_int_libfunc;
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h	(revision 174339)
+++ gcc/optabs.h	(working copy)
@@ -220,6 +220,7 @@ enum optab_index
   OTI_ffs,
   OTI_clz,
   OTI_ctz,
+  OTI_clrsb,
   OTI_popcount,
   OTI_parity,
   /* Square root */
@@ -456,6 +457,7 @@ enum optab_index
 #define ffs_optab (&optab_table[OTI_ffs])
 #define clz_optab (&optab_table[OTI_clz])
 #define ctz_optab (&optab_table[OTI_ctz])
+#define clrsb_optab (&optab_table[OTI_clrsb])
 #define popcount_optab (&optab_table[OTI_popcount])
 #define parity_optab (&optab_table[OTI_parity])
 #define sqrt_optab (&optab_table[OTI_sqrt])
Index: gcc/genopinit.c
===================================================================
--- gcc/genopinit.c	(revision 174339)
+++ gcc/genopinit.c	(working copy)
@@ -199,6 +199,7 @@ static const char * const optabs[] =
   "set_optab_handler (ffs_optab, $A, CODE_FOR_$(ffs$a2$))",
   "set_optab_handler (clz_optab, $A, CODE_FOR_$(clz$a2$))",
   "set_optab_handler (ctz_optab, $A, CODE_FOR_$(ctz$a2$))",
+  "set_optab_handler (clrsb_optab, $A, CODE_FOR_$(clrsb$a2$))",
   "set_optab_handler (popcount_optab, $A, CODE_FOR_$(popcount$a2$))",
   "set_optab_handler (parity_optab, $A, CODE_FOR_$(parity$a2$))",
   "set_optab_handler (mov_optab, $A, CODE_FOR_$(mov$a$))",
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 174339)
+++ gcc/builtins.c	(working copy)
@@ -6068,6 +6068,14 @@ expand_builtin (tree exp, rtx target, rt
 	return target;
       break;
 
+    CASE_INT_FN (BUILT_IN_CLRSB):
+    case BUILT_IN_CLRSBIMAX:
+      target = expand_builtin_unop (target_mode, exp, target,
+				    subtarget, clrsb_optab);
+      if (target)
+	return target;
+      break;
+
     CASE_INT_FN (BUILT_IN_POPCOUNT):
     case BUILT_IN_POPCOUNTIMAX:
       target = expand_builtin_unop (target_mode, exp, target,
Index: gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c	(revision 174339)
+++ gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c	(working copy)
@@ -62,6 +62,16 @@ int my_clz##suffix(type x) {						\
     return i;								\
 }									\
 									\
+int my_clrsb##suffix(type x) {						\
+    int i;								\
+    int leading = (x >> CHAR_BIT * sizeof (type) - 1) & 1;		\
+    for (i = 1; i < CHAR_BIT * sizeof (type); i++)			\
+	if (((x >> ((CHAR_BIT * sizeof (type)) - i - 1)) & 1)		\
+	    != leading)							\
+	    break;							\
+    return i - 1;							\
+}									\
+									\
 int my_popcount##suffix(type x) {					\
     int i;								\
     int count = 0;							\
@@ -176,6 +186,8 @@ main (void)
       if (ints[i] != 0
 	  && __builtin_ctz (ints[i]) != my_ctz (ints[i]))
 	abort ();
+      if (__builtin_clrsb (ints[i]) != my_clrsb (ints[i]))
+	abort ();
       if (__builtin_popcount (ints[i]) != my_popcount (ints[i]))
 	abort ();
       if (__builtin_parity (ints[i]) != my_parity (ints[i]))
@@ -192,6 +204,8 @@ main (void)
       if (longs[i] != 0
 	  && __builtin_ctzl (longs[i]) != my_ctzl (longs[i]))
 	abort ();
+      if (__builtin_clrsbl (longs[i]) != my_clrsbl (longs[i]))
+	abort ();
       if (__builtin_popcountl (longs[i]) != my_popcountl (longs[i]))
 	abort ();
       if (__builtin_parityl (longs[i]) != my_parityl (longs[i]))
@@ -208,6 +222,8 @@ main (void)
       if (longlongs[i] != 0
 	  && __builtin_ctzll (longlongs[i]) != my_ctzll (longlongs[i]))
 	abort ();
+      if (__builtin_clrsbll (longlongs[i]) != my_clrsbll (longlongs[i]))
+	abort ();
       if (__builtin_popcountll (longlongs[i]) != my_popcountll (longlongs[i]))
 	abort ();
       if (__builtin_parityll (longlongs[i]) != my_parityll (longlongs[i]))
@@ -223,6 +239,8 @@ main (void)
     abort ();								\
   if (x != 0 && __builtin_ctz##suffix (x) != my_ctz##suffix (x))	\
     abort ();								\
+  if (__builtin_clrsb##suffix (x) != my_clrsb##suffix (x))		\
+    abort ();								\
   if (__builtin_popcount##suffix (x) != my_popcount##suffix (x))	\
     abort ();								\
   if (__builtin_parity##suffix (x) != my_parity##suffix (x))		\
Index: gcc/testsuite/gcc.dg/builtin-protos-1.c
===================================================================
--- gcc/testsuite/gcc.dg/builtin-protos-1.c	(revision 174339)
+++ gcc/testsuite/gcc.dg/builtin-protos-1.c	(working copy)
@@ -7,6 +7,7 @@ test_s (signed int x)
   return __builtin_abs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clz (x)		/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctz (x)		/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsb (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffs (x)		/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parity (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcount (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -18,6 +19,7 @@ test_u (unsigned int x)
   return __builtin_abs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clz (x)		/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctz (x)		/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsb (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffs (x)		/* { dg-warning "as signed due to prototype" } */
     + __builtin_parity (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcount (x);	/* { dg-bogus "as signed due to prototype" } */
@@ -29,6 +31,7 @@ test_sl (signed long x)
   return __builtin_labs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clzl (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctzl (x)	/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsbl (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffsl (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parityl (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcountl (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -40,6 +43,7 @@ test_ul (unsigned long x)
   return __builtin_labs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clzl (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzl (x)	/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbl (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffsl (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_parityl (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountl (x);	/* { dg-bogus "as signed due to prototype" } */
@@ -51,6 +55,7 @@ test_sll (signed long long x)
   return __builtin_llabs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clzll (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctzll (x)	/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsbll (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffsll (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parityll (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcountll (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -62,6 +67,7 @@ test_ull (unsigned long long x)
   return __builtin_llabs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clzll (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzll (x)	/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbll (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffsll (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_parityll (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountll (x);	/* { dg-bogus "as signed due to prototype" } */
Index: gcc/testsuite/gcc.dg/torture/builtin-attr-1.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/builtin-attr-1.c	(revision 174339)
+++ gcc/testsuite/gcc.dg/torture/builtin-attr-1.c	(working copy)
@@ -416,6 +416,9 @@ BUILTIN_TEST1 (clzll, long long)
 BUILTIN_TEST1 (ctz, int)
 BUILTIN_TEST1 (ctzl, long)
 BUILTIN_TEST1 (ctzll, long long)
+BUILTIN_TEST1 (clrsb, int)
+BUILTIN_TEST1 (clrsbl, long)
+BUILTIN_TEST1 (clrsbll, long long)
 TEST1         (ffs, int, int)
 TEST1         (ffsl, long, int)
 TEST1         (ffsll, long long, int)
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def	(revision 174339)
+++ gcc/builtins.def	(working copy)
@@ -620,6 +620,10 @@ DEF_GCC_BUILTIN        (BUILT_IN_CTZ, "c
 DEF_GCC_BUILTIN        (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
 DEF_GCC_BUILTIN        (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def	(revision 174339)
+++ gcc/rtl.def	(working copy)
@@ -613,6 +613,10 @@ DEF_RTL_EXPR(BSWAP, "bswap", "e", RTX_UN
    or 0 if arg is 0.  */
 DEF_RTL_EXPR(FFS, "ffs", "e", RTX_UNARY)
 
+/* Count number of leading redundant sign bits (number of leading
+   sign bits minus one).  */
+DEF_RTL_EXPR(CLRSB, "clrsb", "e", RTX_UNARY)
+
 /* Count leading zeros.  */
 DEF_RTL_EXPR(CLZ, "clz", "e", RTX_UNARY)
 
Index: gcc/libgcc2.c
===================================================================
--- gcc/libgcc2.c	(revision 174339)
+++ gcc/libgcc2.c	(working copy)
@@ -762,7 +762,51 @@ __ctzDI2 (UDWtype x)
   return ret + add;
 }
 #endif
+\f
+#ifdef L_clrsbsi2
+#undef int
+int
+__clrsbSI2 (Wtype x)
+{
+  Wtype ret;
+
+  if (x == 0 || x == (Wtype)-1)
+    return W_TYPE_SIZE - 1;
+  if (x > 0)
+    count_leading_zeros (ret, x);
+  else
+    count_leading_zeros (ret, ~x);
+  return ret - 1;
+}
+#endif
+\f
+#ifdef L_clrsbdi2
+#undef int
+int
+__clrsbDI2 (DWtype x)
+{
+  const DWunion uu = {.ll = x};
+  UWtype word;
+  Wtype ret, add;
+
+  if (uu.s.high == 0)
+    word = uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high == -1)
+    word = ~uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high >= 0)
+    word = uu.s.high, add = 0;
+  else
+    word = ~uu.s.high, add = 0;
 
+  if (word == 0)
+    ret = W_TYPE_SIZE;
+  else
+    count_leading_zeros (ret, word);
+
+  return ret + add - 1;
+}
+#endif
+\f
 #ifdef L_popcount_tab
 const UQItype __popcount_tab[256] =
 {
Index: gcc/libgcc2.h
===================================================================
--- gcc/libgcc2.h	(revision 174339)
+++ gcc/libgcc2.h	(working copy)
@@ -315,11 +315,13 @@ typedef int shift_count_type __attribute
 #define __ffsSI2	__NW(ffs,2)
 #define __clzSI2	__NW(clz,2)
 #define __ctzSI2	__NW(ctz,2)
+#define __clrsbSI2	__NW(clrsb,2)
 #define __popcountSI2	__NW(popcount,2)
 #define __paritySI2	__NW(parity,2)
 #define __ffsDI2	__NDW(ffs,2)
 #define __clzDI2	__NDW(clz,2)
 #define __ctzDI2	__NDW(ctz,2)
+#define __clrsbDI2	__NDW(clrsb,2)
 #define __popcountDI2	__NDW(popcount,2)
 #define __parityDI2	__NDW(parity,2)
 
@@ -508,9 +510,11 @@ extern const UQItype __clz_tab[256];
 extern int __clzDI2 (UDWtype);
 extern int __clzSI2 (UWtype);
 extern int __ctzSI2 (UWtype);
+extern int __ctzDI2 (UDWtype);
+extern int __clrsbSI2 (Wtype);
+extern int __clrsbDI2 (DWtype);
 extern int __ffsSI2 (UWtype);
 extern int __ffsDI2 (DWtype);
-extern int __ctzDI2 (UDWtype);
 extern int __popcountSI2 (UWtype);
 extern int __popcountDI2 (UDWtype);
 extern int __paritySI2 (UWtype);
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 174339)
+++ gcc/simplify-rtx.c	(working copy)
@@ -1269,6 +1269,16 @@ simplify_const_unary_operation (enum rtx
 	    val = GET_MODE_BITSIZE (mode) - floor_log2 (arg0) - 1;
 	  break;
 
+	case CLRSB:
+	  arg0 &= GET_MODE_MASK (mode);
+	  if (arg0 == 0)
+	    val = GET_MODE_BITSIZE (mode) - 1;
+	  else if (arg0 >= 0)
+	    val = GET_MODE_BITSIZE (mode) - floor_log2 (arg0) - 2;
+	  else if (arg0 < 0)
+	    val = GET_MODE_BITSIZE (mode) - floor_log2 (~arg0) - 2;
+	  break;
+
 	case CTZ:
 	  arg0 &= GET_MODE_MASK (mode);
 	  if (arg0 == 0)
Index: gcc/config/bfin/bfin.c
===================================================================
--- gcc/config/bfin/bfin.c	(revision 174339)
+++ gcc/config/bfin/bfin.c	(working copy)
@@ -6254,11 +6254,11 @@ static const struct builtin_description
 
   { CODE_FOR_ones, "__builtin_bfin_ones", BFIN_BUILTIN_ONES, 0 },
 
-  { CODE_FOR_signbitshi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 },
+  { CODE_FOR_clrsbhi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 },
   { CODE_FOR_ssneghi2, "__builtin_bfin_negate_fr1x16", BFIN_BUILTIN_NEG_1X16, 0 },
   { CODE_FOR_abshi2, "__builtin_bfin_abs_fr1x16", BFIN_BUILTIN_ABS_1X16, 0 },
 
-  { CODE_FOR_signbitssi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 },
+  { CODE_FOR_clrsbsi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 },
   { CODE_FOR_ssroundsi2, "__builtin_bfin_round_fr1x32", BFIN_BUILTIN_ROUND_1X32, 0 },
   { CODE_FOR_ssnegsi2, "__builtin_bfin_negate_fr1x32", BFIN_BUILTIN_NEG_1X32, 0 },
   { CODE_FOR_ssabssi2, "__builtin_bfin_abs_fr1x32", BFIN_BUILTIN_ABS_1X32, 0 },
Index: gcc/config/bfin/bfin.md
===================================================================
--- gcc/config/bfin/bfin.md	(revision 174339)
+++ gcc/config/bfin/bfin.md	(working copy)
@@ -1461,7 +1461,7 @@ (define_insn "one_cmplsi2"
   "%0 = ~%1;"
   [(set_attr "type" "alu0")])
 
-(define_insn "signbitssi2"
+(define_insn "clrsbsi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
 	(if_then_else:HI
 	 (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
@@ -1518,7 +1518,7 @@ (define_insn "ssneghi2"
   "%0 = -%1 (V)%!"
   [(set_attr "type" "dsp32")])
 
-(define_insn "signbitshi2"
+(define_insn "clrsbhi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
 	(if_then_else:HI
 	 (lt (match_operand:HI 1 "register_operand" "d") (const_int 0))

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-16 13:06 Add __builtin_clrsb, similar to clz/ctz Bernd Schmidt
@ 2011-06-16 13:10 ` Georg-Johann Lay
  2011-06-16 13:56 ` Laurent Desnogues
  2011-06-16 17:03 ` Richard Henderson
  2 siblings, 0 replies; 21+ messages in thread
From: Georg-Johann Lay @ 2011-06-16 13:10 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

Bernd Schmidt schrieb:
> Several processors have a "count redundant sign bits" instruction:
> 
> * SIGNBITS on Blackfin
> * NORM on C6X
> * SBC, apparently, on picochip
> 
> and probably others. For example, for a value of 0 or -1, it returns 31,
> while e.g. an input 0xc0000000 produces 1.
> 
> Other compilers tend to define builtin functions called "norm" for this,
> and the operation is used in some DSP benchmarks. The patch below adds a
> __builtin_clrsb family of functions, similar to __builtin_clz. I'm open
> to suggestions about the name; I don't particularly like "norm" but
> maybe that's what users expect.

Some call it cls "count leading signs". A clrxx reminds me rather of
"clear xx".

Johann

> 
> Bernd
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-16 13:06 Add __builtin_clrsb, similar to clz/ctz Bernd Schmidt
  2011-06-16 13:10 ` Georg-Johann Lay
@ 2011-06-16 13:56 ` Laurent Desnogues
  2011-06-16 13:59   ` Bernd Schmidt
  2011-06-16 17:03 ` Richard Henderson
  2 siblings, 1 reply; 21+ messages in thread
From: Laurent Desnogues @ 2011-06-16 13:56 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Thu, Jun 16, 2011 at 2:44 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> Several processors have a "count redundant sign bits" instruction:
>
> * SIGNBITS on Blackfin
> * NORM on C6X
> * SBC, apparently, on picochip

picoChip defines __builtin_sbc.  Wouldn't that make it redundant
with your __builtin_clrsb?

> and probably others. For example, for a value of 0 or -1, it returns 31,
> while e.g. an input 0xc0000000 produces 1.

FWIW, ARM AdvSIMD has vcls.

> Other compilers tend to define builtin functions called "norm" for this,
> and the operation is used in some DSP benchmarks. The patch below adds a
> __builtin_clrsb family of functions, similar to __builtin_clz. I'm open
> to suggestions about the name; I don't particularly like "norm" but
> maybe that's what users expect.

I'm not sure I like clrsb.  The trailing 'b' looks superfluous if you
compare to clz.  And clr looks too much like clear :-)


Laurent

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-16 13:56 ` Laurent Desnogues
@ 2011-06-16 13:59   ` Bernd Schmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Bernd Schmidt @ 2011-06-16 13:59 UTC (permalink / raw)
  To: Laurent Desnogues; +Cc: GCC Patches

On 06/16/2011 03:10 PM, Laurent Desnogues wrote:
> On Thu, Jun 16, 2011 at 2:44 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> Several processors have a "count redundant sign bits" instruction:
>>
>> * SIGNBITS on Blackfin
>> * NORM on C6X
>> * SBC, apparently, on picochip
> 
> picoChip defines __builtin_sbc.  Wouldn't that make it redundant
> with your __builtin_clrsb?

Yes. But the fact that many processors support the same operation
suggests we shouldn't add a builtin function to every backend, but
rather provide it as a general feature.

> I'm not sure I like clrsb.  The trailing 'b' looks superfluous if you
> compare to clz.  And clr looks too much like clear :-)

Well, you would say "leading zeros" but not "leading signs".  OTOH
"leading" is redundant when talking about sign bits, so maybe "crsb"? I
don't really care as long as the patch is otherwise approved :)

Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-16 13:06 Add __builtin_clrsb, similar to clz/ctz Bernd Schmidt
  2011-06-16 13:10 ` Georg-Johann Lay
  2011-06-16 13:56 ` Laurent Desnogues
@ 2011-06-16 17:03 ` Richard Henderson
  2011-06-20 20:32   ` Bernd Schmidt
  2 siblings, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2011-06-16 17:03 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On 06/16/2011 05:44 AM, Bernd Schmidt wrote:
> +@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
> +Returns the number of leading redundant sign bits in @var{x}, starting
> +at the most significant bit position.
> +@end deftypefn

Do we want a signed argument, since we're talking about signs?

It would seem that unlike clz, this function is not undefined for zero.
What about INT_MIN?  Do all cpus handle those edge cases the same way?
Both should probably be documented both for the builtin and the rtx code.

> +  if (x == 0 || x == (Wtype)-1)
> +    return W_TYPE_SIZE - 1;
> +  if (x > 0)
> +    count_leading_zeros (ret, x);
> +  else
> +    count_leading_zeros (ret, ~x);
> +  return ret - 1;

Do you get smaller code in general from

  if (x < 0)
    x = ~x;
  if (x == 0)
    return W_TYPE_SIZE - 1;
  count_leading_zeros(ret, x);
  return ret - 1;

?

> -(define_insn "signbitssi2"
> +(define_insn "clrsbsi2"
>    [(set (match_operand:HI 0 "register_operand" "=d")
>  	(if_then_else:HI
>  	 (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))

No use of the new rtx code?  Not that this couldn't be handled
with a different patch, but I don't see how you're testing the
code in simplify-rtx without at least one such use.


r~

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-16 17:03 ` Richard Henderson
@ 2011-06-20 20:32   ` Bernd Schmidt
  2011-06-20 20:48     ` Richard Henderson
                       ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Bernd Schmidt @ 2011-06-20 20:32 UTC (permalink / raw)
  To: Richard Henderson; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1835 bytes --]

On 06/16/2011 06:25 PM, Richard Henderson wrote:
> On 06/16/2011 05:44 AM, Bernd Schmidt wrote:
>> +@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
>> +Returns the number of leading redundant sign bits in @var{x}, starting
>> +at the most significant bit position.
>> +@end deftypefn
> 
> Do we want a signed argument, since we're talking about signs?

Err, yes. It's signed everywhere else (builtins.def etc.).

> It would seem that unlike clz, this function is not undefined for zero.
> What about INT_MIN?  Do all cpus handle those edge cases the same way?

-1 and zero should both produce the same value, 31 (for a 32 bit
integer). I don't see why INT_MIN should be special - the return value
is zero. This is true for C6X and Blackfin; ARM documentation suggests
it's also true for their VCLS instruction. I've not found proper
picochip documentation but some other documents that suggest it's also
implemented this way.

> Do you get smaller code in general from
> 
>   if (x < 0)
>     x = ~x;
>   if (x == 0)
>     return W_TYPE_SIZE - 1;
>   count_leading_zeros(ret, x);
>   return ret - 1;

Probably.

>> -(define_insn "signbitssi2"
>> +(define_insn "clrsbsi2"
>>    [(set (match_operand:HI 0 "register_operand" "=d")
>>  	(if_then_else:HI
>>  	 (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
> 
> No use of the new rtx code?

D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
showed a problem with some of the existing simplify_const_unop cases:
for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
operand, rather than the mode of the operation. This limits what we can
do in that function, since op_mode is sometimes VOIDmode - we really
should add builtin folders for these at some point.

New patch below. Retested on i686 and bfin.


Bernd

[-- Attachment #2: clrsb0620.diff --]
[-- Type: text/plain, Size: 26109 bytes --]

	libgcc/
	* Makefile.in (lib2funcs): Add _clrsbsi2 and _clrsbdi2.
	* libgcc-std.ver.in (GCC_4.7.0): New section.

	gcc/
	* doc/extend.texi (__builtin_clrsb, __builtin_clrsbl,
	__builtin_clrsbll): Document.
	* doc/rtl.texi (clrsb): New entry.
	* optabs.c (widen_leading): Renamed from widen_clz.  New argument
	UNOPTAB.  All callers changed.  Use UNOPTAB instead of clz_optab.
	(expand_unop): Handle clrsb_optab.
	(init_optabs): Initialize it.
	* optabs.h (enum optab_index): New entry OTI_clrsb.
	(clrsb_optab): Define.
	* genopinit.c (optabs): Add an entry for it.
	* builtins.c (expand_builtin): Handle clrsb builtin functions.
	* builtins.def (BUILT_IN_CLRSB, BUILT_IN_CLRSBIMAX, BUILT_IN_CLRSBL,
	BUILT_IN_CLRSBLL): New.
	* rtl.def (CLRSB): New code.
	* dwarf2out.c (mem_loc_descriptor): Handle it.
	* simplify-rtx.c (simplify_const_unary_operation): Likewise.
	Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
	and popcount.
	* libgcc2.c (__clrsbSI2, __clrsbDI2): New functions.
	* libgcc2.h (__clrsbSI2, __clrsbDI2): Define and declare.
	(__ctzDI2): Move declaration.
	* config/bfin/bfin.md (clrsbsi2): New expander.
	(signbitssi2): Use the CLRSB rtx.
	(clrsbhi2): Renamed from signbitshi2.  Use the CLRSB rtx.
	* config/bfin/bfin.c (bdesc_1arg): Changed accordingly.

	gcc/testsuite/
	* gcc.c-torture/excute/builtin-bitops-1.c (MAKE_FUNS): Make
	my_clrsb test functions.
	(main): Test clrsb.
	* gcc.dg/builtin-protos-1.c (test_s, test_u, test_sl, test_ul,
	test_sll, test_ull): Add clrsb tests.
	* gcc.dg/torture/builtin-attr-1.c: Add tests for clrsb, clrsbl,
	clrsbll.

Index: libgcc/Makefile.in
===================================================================
--- libgcc/Makefile.in	(revision 174339)
+++ libgcc/Makefile.in	(working copy)
@@ -320,7 +320,7 @@ lib2funcs = _muldi3 _negdi2 _lshrdi3 _as
 	    _ctzsi2 _ctzdi2 _popcount_tab _popcountsi2 _popcountdi2	   \
 	    _paritysi2 _paritydi2 _powisf2 _powidf2 _powixf2 _powitf2	   \
 	    _mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3 _divxc3	   \
-	    _divtc3 _bswapsi2 _bswapdi2
+	    _divtc3 _bswapsi2 _bswapdi2 _clrsbsi2 _clrsbdi2
 
 # The floating-point conversion routines that involve a single-word integer.
 # XX stands for the integer mode.
Index: libgcc/libgcc-std.ver.in
===================================================================
--- libgcc/libgcc-std.ver.in	(revision 174339)
+++ libgcc/libgcc-std.ver.in	(working copy)
@@ -1920,3 +1920,10 @@ GCC_4.6.0 {
   __morestack_initial_sp
   __splitstack_find
 }
+
+%inherit GCC_4.7.0 GCC_4.6.0
+GCC_4.7.0 {
+  __PFX__clrsbsi2
+  __PFX__clrsbdi2
+  __PFX__clrsbti2
+}
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 174339)
+++ gcc/doc/extend.texi	(working copy)
@@ -7828,6 +7828,12 @@ Returns the number of trailing 0-bits in
 significant bit position.  If @var{x} is 0, the result is undefined.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsb (int x)
+Returns the number of leading redundant sign bits in @var{x}, i.e. the
+number of bits following the most significant bit which are identical
+to it.  There are no special cases for 0 or other values. 
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcount (unsigned int x)
 Returns the number of 1-bits in @var{x}.
 @end deftypefn
@@ -7852,6 +7858,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbl (long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountl (unsigned long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long}.
@@ -7877,6 +7888,11 @@ Similar to @code{__builtin_ctz}, except
 @code{unsigned long long}.
 @end deftypefn
 
+@deftypefn {Built-in Function} int __builtin_clrsbll (long long)
+Similar to @code{__builtin_clrsb}, except the argument type is
+@code{long long}.
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_popcountll (unsigned long long)
 Similar to @code{__builtin_popcount}, except the argument type is
 @code{unsigned long long}.
Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi	(revision 174339)
+++ gcc/doc/rtl.texi	(working copy)
@@ -2400,6 +2400,14 @@ zero if @var{x} is zero.)  The mode of @
 depending on the target machine, various mode combinations may be
 valid.
 
+@findex clrsb
+@item (clrsb:@var{m} @var{x})
+Represents the number of redundant leading sign bits in @var{x},
+represented as an integer of mode @var{m}, starting at the most
+significant bit position.  This is one less than the number of leading
+sign bits (either 0 or 1), with no special cases.  The mode of @var{x}
+will usually be an integer mode and may differ from @var{m}.
+
 @findex clz
 @item (clz:@var{m} @var{x})
 Represents the number of leading 0-bits in @var{x}, represented as an
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 174339)
+++ gcc/optabs.c	(working copy)
@@ -2317,9 +2317,12 @@ expand_simple_unop (enum machine_mode mo
 /* Try calculating
 	(clz:narrow x)
    as
-	(clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).  */
+	(clz:wide (zero_extend:wide x)) - ((width wide) - (width narrow)).
+
+   A similar operation can be used for clrsb.  UNOPTAB says which operation
+   we are trying to expand.  */
 static rtx
-widen_clz (enum machine_mode mode, rtx op0, rtx target)
+widen_leading (enum machine_mode mode, rtx op0, rtx target, optab unoptab)
 {
   enum mode_class mclass = GET_MODE_CLASS (mode);
   if (CLASS_HAS_WIDER_MODES_P (mclass))
@@ -2329,7 +2332,7 @@ widen_clz (enum machine_mode mode, rtx o
 	   wider_mode != VOIDmode;
 	   wider_mode = GET_MODE_WIDER_MODE (wider_mode))
 	{
-	  if (optab_handler (clz_optab, wider_mode) != CODE_FOR_nothing)
+	  if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
 	    {
 	      rtx xop0, temp, last;
 
@@ -2338,7 +2341,7 @@ widen_clz (enum machine_mode mode, rtx o
 	      if (target == 0)
 		target = gen_reg_rtx (mode);
 	      xop0 = widen_operand (op0, wider_mode, mode, true, false);
-	      temp = expand_unop (wider_mode, clz_optab, xop0, NULL_RTX, true);
+	      temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX, true);
 	      if (temp != 0)
 		temp = expand_binop (wider_mode, sub_optab, temp,
 				     GEN_INT (GET_MODE_BITSIZE (wider_mode)
@@ -2832,7 +2835,7 @@ expand_unop (enum machine_mode mode, opt
   /* Widening (or narrowing) clz needs special treatment.  */
   if (unoptab == clz_optab)
     {
-      temp = widen_clz (mode, op0, target);
+      temp = widen_leading (mode, op0, target, unoptab);
       if (temp)
 	return temp;
 
@@ -2844,7 +2847,15 @@ expand_unop (enum machine_mode mode, opt
 	    return temp;
 	}
 
-	goto try_libcall;
+      goto try_libcall;
+    }
+
+  if (unoptab == clrsb_optab)
+    {
+      temp = widen_leading (mode, op0, target, unoptab);
+      if (temp)
+	return temp;
+      goto try_libcall;
     }
 
   /* Widening (or narrowing) bswap needs special treatment.  */
@@ -2999,7 +3010,8 @@ expand_unop (enum machine_mode mode, opt
       /* All of these functions return small values.  Thus we choose to
 	 have them return something that isn't a double-word.  */
       if (unoptab == ffs_optab || unoptab == clz_optab || unoptab == ctz_optab
-	  || unoptab == popcount_optab || unoptab == parity_optab)
+	  || unoptab == clrsb_optab || unoptab == popcount_optab
+	  || unoptab == parity_optab)
 	outmode
 	  = GET_MODE (hard_libcall_value (TYPE_MODE (integer_type_node),
 					  optab_libfunc (unoptab, mode)));
@@ -5943,6 +5955,7 @@ init_optabs (void)
   init_optab (ffs_optab, FFS);
   init_optab (clz_optab, CLZ);
   init_optab (ctz_optab, CTZ);
+  init_optab (clrsb_optab, CLRSB);
   init_optab (popcount_optab, POPCOUNT);
   init_optab (parity_optab, PARITY);
   init_optab (sqrt_optab, SQRT);
@@ -6173,6 +6186,9 @@ init_optabs (void)
   ctz_optab->libcall_basename = "ctz";
   ctz_optab->libcall_suffix = '2';
   ctz_optab->libcall_gen = gen_int_libfunc;
+  clrsb_optab->libcall_basename = "clrsb";
+  clrsb_optab->libcall_suffix = '2';
+  clrsb_optab->libcall_gen = gen_int_libfunc;
   popcount_optab->libcall_basename = "popcount";
   popcount_optab->libcall_suffix = '2';
   popcount_optab->libcall_gen = gen_int_libfunc;
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h	(revision 174339)
+++ gcc/optabs.h	(working copy)
@@ -220,6 +220,7 @@ enum optab_index
   OTI_ffs,
   OTI_clz,
   OTI_ctz,
+  OTI_clrsb,
   OTI_popcount,
   OTI_parity,
   /* Square root */
@@ -456,6 +457,7 @@ enum optab_index
 #define ffs_optab (&optab_table[OTI_ffs])
 #define clz_optab (&optab_table[OTI_clz])
 #define ctz_optab (&optab_table[OTI_ctz])
+#define clrsb_optab (&optab_table[OTI_clrsb])
 #define popcount_optab (&optab_table[OTI_popcount])
 #define parity_optab (&optab_table[OTI_parity])
 #define sqrt_optab (&optab_table[OTI_sqrt])
Index: gcc/genopinit.c
===================================================================
--- gcc/genopinit.c	(revision 174339)
+++ gcc/genopinit.c	(working copy)
@@ -199,6 +199,7 @@ static const char * const optabs[] =
   "set_optab_handler (ffs_optab, $A, CODE_FOR_$(ffs$a2$))",
   "set_optab_handler (clz_optab, $A, CODE_FOR_$(clz$a2$))",
   "set_optab_handler (ctz_optab, $A, CODE_FOR_$(ctz$a2$))",
+  "set_optab_handler (clrsb_optab, $A, CODE_FOR_$(clrsb$a2$))",
   "set_optab_handler (popcount_optab, $A, CODE_FOR_$(popcount$a2$))",
   "set_optab_handler (parity_optab, $A, CODE_FOR_$(parity$a2$))",
   "set_optab_handler (mov_optab, $A, CODE_FOR_$(mov$a$))",
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 174339)
+++ gcc/builtins.c	(working copy)
@@ -6068,6 +6068,14 @@ expand_builtin (tree exp, rtx target, rt
 	return target;
       break;
 
+    CASE_INT_FN (BUILT_IN_CLRSB):
+    case BUILT_IN_CLRSBIMAX:
+      target = expand_builtin_unop (target_mode, exp, target,
+				    subtarget, clrsb_optab);
+      if (target)
+	return target;
+      break;
+
     CASE_INT_FN (BUILT_IN_POPCOUNT):
     case BUILT_IN_POPCOUNTIMAX:
       target = expand_builtin_unop (target_mode, exp, target,
Index: gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c	(revision 174339)
+++ gcc/testsuite/gcc.c-torture/execute/builtin-bitops-1.c	(working copy)
@@ -62,6 +62,16 @@ int my_clz##suffix(type x) {						\
     return i;								\
 }									\
 									\
+int my_clrsb##suffix(type x) {						\
+    int i;								\
+    int leading = (x >> CHAR_BIT * sizeof (type) - 1) & 1;		\
+    for (i = 1; i < CHAR_BIT * sizeof (type); i++)			\
+	if (((x >> ((CHAR_BIT * sizeof (type)) - i - 1)) & 1)		\
+	    != leading)							\
+	    break;							\
+    return i - 1;							\
+}									\
+									\
 int my_popcount##suffix(type x) {					\
     int i;								\
     int count = 0;							\
@@ -176,6 +186,8 @@ main (void)
       if (ints[i] != 0
 	  && __builtin_ctz (ints[i]) != my_ctz (ints[i]))
 	abort ();
+      if (__builtin_clrsb (ints[i]) != my_clrsb (ints[i]))
+	abort ();
       if (__builtin_popcount (ints[i]) != my_popcount (ints[i]))
 	abort ();
       if (__builtin_parity (ints[i]) != my_parity (ints[i]))
@@ -192,6 +204,8 @@ main (void)
       if (longs[i] != 0
 	  && __builtin_ctzl (longs[i]) != my_ctzl (longs[i]))
 	abort ();
+      if (__builtin_clrsbl (longs[i]) != my_clrsbl (longs[i]))
+	abort ();
       if (__builtin_popcountl (longs[i]) != my_popcountl (longs[i]))
 	abort ();
       if (__builtin_parityl (longs[i]) != my_parityl (longs[i]))
@@ -208,6 +222,8 @@ main (void)
       if (longlongs[i] != 0
 	  && __builtin_ctzll (longlongs[i]) != my_ctzll (longlongs[i]))
 	abort ();
+      if (__builtin_clrsbll (longlongs[i]) != my_clrsbll (longlongs[i]))
+	abort ();
       if (__builtin_popcountll (longlongs[i]) != my_popcountll (longlongs[i]))
 	abort ();
       if (__builtin_parityll (longlongs[i]) != my_parityll (longlongs[i]))
@@ -223,6 +239,8 @@ main (void)
     abort ();								\
   if (x != 0 && __builtin_ctz##suffix (x) != my_ctz##suffix (x))	\
     abort ();								\
+  if (__builtin_clrsb##suffix (x) != my_clrsb##suffix (x))		\
+    abort ();								\
   if (__builtin_popcount##suffix (x) != my_popcount##suffix (x))	\
     abort ();								\
   if (__builtin_parity##suffix (x) != my_parity##suffix (x))		\
Index: gcc/testsuite/gcc.dg/builtin-protos-1.c
===================================================================
--- gcc/testsuite/gcc.dg/builtin-protos-1.c	(revision 174339)
+++ gcc/testsuite/gcc.dg/builtin-protos-1.c	(working copy)
@@ -7,6 +7,7 @@ test_s (signed int x)
   return __builtin_abs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clz (x)		/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctz (x)		/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsb (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffs (x)		/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parity (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcount (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -18,6 +19,7 @@ test_u (unsigned int x)
   return __builtin_abs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clz (x)		/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctz (x)		/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsb (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffs (x)		/* { dg-warning "as signed due to prototype" } */
     + __builtin_parity (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcount (x);	/* { dg-bogus "as signed due to prototype" } */
@@ -29,6 +31,7 @@ test_sl (signed long x)
   return __builtin_labs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clzl (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctzl (x)	/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsbl (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffsl (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parityl (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcountl (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -40,6 +43,7 @@ test_ul (unsigned long x)
   return __builtin_labs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clzl (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzl (x)	/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbl (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffsl (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_parityl (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountl (x);	/* { dg-bogus "as signed due to prototype" } */
@@ -51,6 +55,7 @@ test_sll (signed long long x)
   return __builtin_llabs (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_clzll (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_ctzll (x)	/* { dg-warning "as unsigned due to prototype" } */
+    + __builtin_clrsbll (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_ffsll (x)	/* { dg-bogus "as unsigned due to prototype" } */
     + __builtin_parityll (x)	/* { dg-warning "as unsigned due to prototype" } */
     + __builtin_popcountll (x);	/* { dg-warning "as unsigned due to prototype" } */
@@ -62,6 +67,7 @@ test_ull (unsigned long long x)
   return __builtin_llabs (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_clzll (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_ctzll (x)	/* { dg-bogus "as signed due to prototype" } */
+    + __builtin_clrsbll (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_ffsll (x)	/* { dg-warning "as signed due to prototype" } */
     + __builtin_parityll (x)	/* { dg-bogus "as signed due to prototype" } */
     + __builtin_popcountll (x);	/* { dg-bogus "as signed due to prototype" } */
Index: gcc/testsuite/gcc.dg/torture/builtin-attr-1.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/builtin-attr-1.c	(revision 174339)
+++ gcc/testsuite/gcc.dg/torture/builtin-attr-1.c	(working copy)
@@ -416,6 +416,9 @@ BUILTIN_TEST1 (clzll, long long)
 BUILTIN_TEST1 (ctz, int)
 BUILTIN_TEST1 (ctzl, long)
 BUILTIN_TEST1 (ctzll, long long)
+BUILTIN_TEST1 (clrsb, int)
+BUILTIN_TEST1 (clrsbl, long)
+BUILTIN_TEST1 (clrsbll, long long)
 TEST1         (ffs, int, int)
 TEST1         (ffsl, long, int)
 TEST1         (ffsll, long long, int)
Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def	(revision 174339)
+++ gcc/builtins.def	(working copy)
@@ -620,6 +620,10 @@ DEF_GCC_BUILTIN        (BUILT_IN_CTZ, "c
 DEF_GCC_BUILTIN        (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
 DEF_GCC_BUILTIN        (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def	(revision 174339)
+++ gcc/rtl.def	(working copy)
@@ -613,6 +613,10 @@ DEF_RTL_EXPR(BSWAP, "bswap", "e", RTX_UN
    or 0 if arg is 0.  */
 DEF_RTL_EXPR(FFS, "ffs", "e", RTX_UNARY)
 
+/* Count number of leading redundant sign bits (number of leading
+   sign bits minus one).  */
+DEF_RTL_EXPR(CLRSB, "clrsb", "e", RTX_UNARY)
+
 /* Count leading zeros.  */
 DEF_RTL_EXPR(CLZ, "clz", "e", RTX_UNARY)
 
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	(revision 174339)
+++ gcc/dwarf2out.c	(working copy)
@@ -14874,6 +14874,7 @@ mem_loc_descriptor (rtx rtl, enum machin
     case FFS:
     case CLZ:
     case CTZ:
+    case CLRSB:
     case POPCOUNT:
     case PARITY:
     case ASM_OPERANDS:
Index: gcc/libgcc2.c
===================================================================
--- gcc/libgcc2.c	(revision 174339)
+++ gcc/libgcc2.c	(working copy)
@@ -762,7 +762,50 @@ __ctzDI2 (UDWtype x)
   return ret + add;
 }
 #endif
+\f
+#ifdef L_clrsbsi2
+#undef int
+int
+__clrsbSI2 (Wtype x)
+{
+  Wtype ret;
 
+  if (x < 0)
+    x = ~x;
+  if (x == 0)
+    return W_TYPE_SIZE - 1;
+  count_leading_zeros (ret, x);
+  return ret - 1;
+}
+#endif
+\f
+#ifdef L_clrsbdi2
+#undef int
+int
+__clrsbDI2 (DWtype x)
+{
+  const DWunion uu = {.ll = x};
+  UWtype word;
+  Wtype ret, add;
+
+  if (uu.s.high == 0)
+    word = uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high == -1)
+    word = ~uu.s.low, add = W_TYPE_SIZE;
+  else if (uu.s.high >= 0)
+    word = uu.s.high, add = 0;
+  else
+    word = ~uu.s.high, add = 0;
+
+  if (word == 0)
+    ret = W_TYPE_SIZE;
+  else
+    count_leading_zeros (ret, word);
+
+  return ret + add - 1;
+}
+#endif
+\f
 #ifdef L_popcount_tab
 const UQItype __popcount_tab[256] =
 {
Index: gcc/libgcc2.h
===================================================================
--- gcc/libgcc2.h	(revision 174339)
+++ gcc/libgcc2.h	(working copy)
@@ -315,11 +315,13 @@ typedef int shift_count_type __attribute
 #define __ffsSI2	__NW(ffs,2)
 #define __clzSI2	__NW(clz,2)
 #define __ctzSI2	__NW(ctz,2)
+#define __clrsbSI2	__NW(clrsb,2)
 #define __popcountSI2	__NW(popcount,2)
 #define __paritySI2	__NW(parity,2)
 #define __ffsDI2	__NDW(ffs,2)
 #define __clzDI2	__NDW(clz,2)
 #define __ctzDI2	__NDW(ctz,2)
+#define __clrsbDI2	__NDW(clrsb,2)
 #define __popcountDI2	__NDW(popcount,2)
 #define __parityDI2	__NDW(parity,2)
 
@@ -508,9 +510,11 @@ extern const UQItype __clz_tab[256];
 extern int __clzDI2 (UDWtype);
 extern int __clzSI2 (UWtype);
 extern int __ctzSI2 (UWtype);
+extern int __ctzDI2 (UDWtype);
+extern int __clrsbSI2 (Wtype);
+extern int __clrsbDI2 (DWtype);
 extern int __ffsSI2 (UWtype);
 extern int __ffsDI2 (DWtype);
-extern int __ctzDI2 (UDWtype);
 extern int __popcountSI2 (UWtype);
 extern int __popcountDI2 (UDWtype);
 extern int __paritySI2 (UWtype);
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 174339)
+++ gcc/simplify-rtx.c	(working copy)
@@ -1127,6 +1127,7 @@ simplify_const_unary_operation (enum rtx
 				rtx op, enum machine_mode op_mode)
 {
   unsigned int width = GET_MODE_BITSIZE (mode);
+  unsigned int op_width = GET_MODE_BITSIZE (op_mode);
 
   if (code == VEC_DUPLICATE)
     {
@@ -1237,7 +1238,8 @@ simplify_const_unary_operation (enum rtx
     }
 
   if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
+      && width <= HOST_BITS_PER_WIDE_INT
+      && op_width <= HOST_BITS_PER_WIDE_INT && op_width > 0)
     {
       HOST_WIDE_INT arg0 = INTVAL (op);
       HOST_WIDE_INT val;
@@ -1257,40 +1259,50 @@ simplify_const_unary_operation (enum rtx
 	  break;
 
 	case FFS:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  val = ffs_hwi (arg0);
 	  break;
 
 	case CLZ:
-	  arg0 &= GET_MODE_MASK (mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
+	  arg0 &= GET_MODE_MASK (op_mode);
+	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
 	    ;
 	  else
-	    val = GET_MODE_BITSIZE (mode) - floor_log2 (arg0) - 1;
+	    val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 1;
+	  break;
+
+	case CLRSB:
+	  arg0 &= GET_MODE_MASK (op_mode);
+	  if (arg0 == 0)
+	    val = GET_MODE_BITSIZE (op_mode) - 1;
+	  else if (arg0 >= 0)
+	    val = GET_MODE_BITSIZE (op_mode) - floor_log2 (arg0) - 2;
+	  else if (arg0 < 0)
+	    val = GET_MODE_BITSIZE (op_mode) - floor_log2 (~arg0) - 2;
 	  break;
 
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  if (arg0 == 0)
 	    {
 	      /* Even if the value at zero is undefined, we have to come
 		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
-		val = GET_MODE_BITSIZE (mode);
+	      if (! CTZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
+		val = GET_MODE_BITSIZE (op_mode);
 	    }
 	  else
 	    val = ctz_hwi (arg0);
 	  break;
 
 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  val = 0;
 	  while (arg0)
 	    val++, arg0 &= arg0 - 1;
 	  break;
 
 	case PARITY:
-	  arg0 &= GET_MODE_MASK (mode);
+	  arg0 &= GET_MODE_MASK (op_mode);
 	  val = 0;
 	  while (arg0)
 	    val++, arg0 &= arg0 - 1;
Index: gcc/config/bfin/bfin.c
===================================================================
--- gcc/config/bfin/bfin.c	(revision 174339)
+++ gcc/config/bfin/bfin.c	(working copy)
@@ -6254,11 +6254,11 @@ static const struct builtin_description
 
   { CODE_FOR_ones, "__builtin_bfin_ones", BFIN_BUILTIN_ONES, 0 },
 
-  { CODE_FOR_signbitshi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 },
+  { CODE_FOR_clrsbhi2, "__builtin_bfin_norm_fr1x16", BFIN_BUILTIN_NORM_1X16, 0 },
   { CODE_FOR_ssneghi2, "__builtin_bfin_negate_fr1x16", BFIN_BUILTIN_NEG_1X16, 0 },
   { CODE_FOR_abshi2, "__builtin_bfin_abs_fr1x16", BFIN_BUILTIN_ABS_1X16, 0 },
 
-  { CODE_FOR_signbitssi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 },
+  { CODE_FOR_clrsbsi2, "__builtin_bfin_norm_fr1x32", BFIN_BUILTIN_NORM_1X32, 0 },
   { CODE_FOR_ssroundsi2, "__builtin_bfin_round_fr1x32", BFIN_BUILTIN_ROUND_1X32, 0 },
   { CODE_FOR_ssnegsi2, "__builtin_bfin_negate_fr1x32", BFIN_BUILTIN_NEG_1X32, 0 },
   { CODE_FOR_ssabssi2, "__builtin_bfin_abs_fr1x32", BFIN_BUILTIN_ABS_1X32, 0 },
Index: gcc/config/bfin/bfin.md
===================================================================
--- gcc/config/bfin/bfin.md	(revision 174339)
+++ gcc/config/bfin/bfin.md	(working copy)
@@ -1461,12 +1461,19 @@ (define_insn "one_cmplsi2"
   "%0 = ~%1;"
   [(set_attr "type" "alu0")])
 
+(define_expand "clrsbsi2"
+  [(set (match_dup 2)
+	(clrsb:HI (match_operand:SI 1 "register_operand" "d")))
+   (set (match_operand:SI 0 "register_operand")
+	(zero_extend:SI (match_dup 2)))]
+  ""
+{
+  operands[2] = gen_reg_rtx (HImode);
+})
+
 (define_insn "signbitssi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
-	(if_then_else:HI
-	 (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
-	 (clz:HI (not:SI (match_dup 1)))
-	 (clz:HI (match_dup 1))))]
+	(clrsb:HI (match_operand:SI 1 "register_operand" "d")))]
   ""
   "%h0 = signbits %1%!"
   [(set_attr "type" "dsp32")])
@@ -1518,12 +1525,9 @@ (define_insn "ssneghi2"
   "%0 = -%1 (V)%!"
   [(set_attr "type" "dsp32")])
 
-(define_insn "signbitshi2"
+(define_insn "clrsbhi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
-	(if_then_else:HI
-	 (lt (match_operand:HI 1 "register_operand" "d") (const_int 0))
-	 (clz:HI (not:HI (match_dup 1)))
-	 (clz:HI (match_dup 1))))]
+	(clrsb:HI (match_operand:HI 1 "register_operand" "d")))]
   ""
   "%h0 = signbits %h1%!"
   [(set_attr "type" "dsp32")])

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-20 20:32   ` Bernd Schmidt
@ 2011-06-20 20:48     ` Richard Henderson
  2011-06-21 16:39     ` [PATCH] Fix __bultin_clrsb* (PR middle-end/49489) Jakub Jelinek
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2011-06-20 20:48 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On 06/20/2011 12:38 PM, Bernd Schmidt wrote:
> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
> showed a problem with some of the existing simplify_const_unop cases:
> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
> operand, rather than the mode of the operation. This limits what we can
> do in that function, since op_mode is sometimes VOIDmode - we really
> should add builtin folders for these at some point.
> 
> New patch below. Retested on i686 and bfin.
> 

Ok.

I'm not going to bike-shed the function name.  If, at some point
before 4.7 is released we agree on another name, we can change it.


r~

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH] Fix __bultin_clrsb* (PR middle-end/49489)
  2011-06-20 20:32   ` Bernd Schmidt
  2011-06-20 20:48     ` Richard Henderson
@ 2011-06-21 16:39     ` Jakub Jelinek
  2011-06-21 16:46       ` Bernd Schmidt
  2011-06-23  6:23     ` Add __builtin_clrsb, similar to clz/ctz H.J. Lu
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Jakub Jelinek @ 2011-06-21 16:39 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Richard Henderson, GCC Patches

On Mon, Jun 20, 2011 at 09:38:22PM +0200, Bernd Schmidt wrote:
> On 06/16/2011 06:25 PM, Richard Henderson wrote:
> > On 06/16/2011 05:44 AM, Bernd Schmidt wrote:
> >> +@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
> >> +Returns the number of leading redundant sign bits in @var{x}, starting
> >> +at the most significant bit position.
> >> +@end deftypefn
> > 
> > Do we want a signed argument, since we're talking about signs?
> 
> Err, yes. It's signed everywhere else (builtins.def etc.).

But unfortunately not during expansion.
gcc.c-torture/execute/builtin-bitops-1.c fails on x86_64-linux, because
there is no SImode libcall, only DImode, and it was zero-extending instead
of sign-extending the argument, and not subtracting the mode difference
afterwards.  So int foo (int x) { return __builtin_clrsb (x); }
expanded the same as
int foo (int x) { return __builtin_clrsbl ((unsigned) x); }
instead of the correct
int foo (int x) { return __builtin_clrsbl (x) - 32; }

Fixed thusly, I've also added folding of constant arguments
during tree optimizations for it.

Ok for trunk?

2011-06-21  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/49489
	* builtins.c (expand_builtin_unop): Call expand_unop with 0 as
	unsignedp argument instead of 1 for clrsb_optab.
	(fold_builtin_bitop): Fix masking for width > HOST_BITS_PER_WIDE_INT
	and < 2 * HOST_BITS_PER_WIDE_INT.  Optimize BUILT_IN_CLRSB*.
	(fold_builtin_1): Call fold_builtin_binop for BUILT_IN_CLRSB*.
	* optabs.c (widen_leading): Call widen_operand and expand_unop
	with 0 as unsignedp argument instead of 1 for clrsb_optab.
	(expand_unop): Subtract difference of mode sizes also for
	clrsb_optab.

--- gcc/builtins.c.jj	2011-06-21 16:46:01.000000000 +0200
+++ gcc/builtins.c	2011-06-21 17:36:38.000000000 +0200
@@ -4578,7 +4578,7 @@ expand_builtin_unop (enum machine_mode t
   /* Compute op, into TARGET if possible.
      Set TARGET to wherever the result comes back.  */
   target = expand_unop (TYPE_MODE (TREE_TYPE (CALL_EXPR_ARG (exp, 0))),
-			op_optab, op0, target, 1);
+			op_optab, op0, target, op_optab != clrsb_optab);
   gcc_assert (target);
 
   return convert_to_mode (target_mode, target, 0);
@@ -7265,7 +7265,8 @@ fold_builtin_bitop (tree fndecl, tree ar
 	{
 	  hi = TREE_INT_CST_HIGH (arg);
 	  if (width < 2 * HOST_BITS_PER_WIDE_INT)
-	    hi &= ~((HOST_WIDE_INT) (-1) >> (width - HOST_BITS_PER_WIDE_INT));
+	    hi &= ~((unsigned HOST_WIDE_INT) (-1)
+		    << (width - HOST_BITS_PER_WIDE_INT));
 	}
       else
 	{
@@ -7303,6 +7304,26 @@ fold_builtin_bitop (tree fndecl, tree ar
 	    result = width;
 	  break;
 
+	CASE_INT_FN (BUILT_IN_CLRSB):
+	  if (width > HOST_BITS_PER_WIDE_INT
+	      && (hi & ((unsigned HOST_WIDE_INT) 1
+			<< (width - HOST_BITS_PER_WIDE_INT - 1))) != 0)
+	    {
+	      hi = ~hi & ~((unsigned HOST_WIDE_INT) (-1)
+			   << (width - HOST_BITS_PER_WIDE_INT - 1));
+	      lo = ~lo;
+	    }
+	  else if (width <= HOST_BITS_PER_WIDE_INT
+		   && (lo & ((unsigned HOST_WIDE_INT) 1 << (width - 1))) != 0)
+	    lo = ~lo & ~((unsigned HOST_WIDE_INT) (-1) << (width - 1));
+	  if (hi != 0)
+	    result = width - floor_log2 (hi) - 2 - HOST_BITS_PER_WIDE_INT;
+	  else if (lo != 0)
+	    result = width - floor_log2 (lo) - 2;
+	  else
+	    result = width - 1;
+	  break;
+
 	CASE_INT_FN (BUILT_IN_POPCOUNT):
 	  result = 0;
 	  while (lo)
@@ -9737,6 +9758,7 @@ fold_builtin_1 (location_t loc, tree fnd
     CASE_INT_FN (BUILT_IN_FFS):
     CASE_INT_FN (BUILT_IN_CLZ):
     CASE_INT_FN (BUILT_IN_CTZ):
+    CASE_INT_FN (BUILT_IN_CLRSB):
     CASE_INT_FN (BUILT_IN_POPCOUNT):
     CASE_INT_FN (BUILT_IN_PARITY):
       return fold_builtin_bitop (fndecl, arg0);
--- gcc/optabs.c.jj	2011-06-21 16:46:01.000000000 +0200
+++ gcc/optabs.c	2011-06-21 17:50:35.000000000 +0200
@@ -2350,8 +2350,10 @@ widen_leading (enum machine_mode mode, r
 
 	      if (target == 0)
 		target = gen_reg_rtx (mode);
-	      xop0 = widen_operand (op0, wider_mode, mode, true, false);
-	      temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX, true);
+	      xop0 = widen_operand (op0, wider_mode, mode,
+				    unoptab != clrsb_optab, false);
+	      temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX,
+				  unoptab != clrsb_optab);
 	      if (temp != 0)
 		temp = expand_binop (wider_mode, sub_optab, temp,
 				     GEN_INT (GET_MODE_BITSIZE (wider_mode)
@@ -3075,8 +3077,9 @@ expand_unop (enum machine_mode mode, opt
 				  unsignedp);
 
 	      /* If we are generating clz using wider mode, adjust the
-		 result.  */
-	      if (unoptab == clz_optab && temp != 0)
+		 result.  Similarly for clrsb.  */
+	      if ((unoptab == clz_optab || unoptab == clrsb_optab)
+		  && temp != 0)
 		temp = expand_binop (wider_mode, sub_optab, temp,
 				     GEN_INT (GET_MODE_BITSIZE (wider_mode)
 					      - GET_MODE_BITSIZE (mode)),

	Jakub

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] Fix __bultin_clrsb* (PR middle-end/49489)
  2011-06-21 16:39     ` [PATCH] Fix __bultin_clrsb* (PR middle-end/49489) Jakub Jelinek
@ 2011-06-21 16:46       ` Bernd Schmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Bernd Schmidt @ 2011-06-21 16:46 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Henderson, GCC Patches

On 06/21/2011 06:10 PM, Jakub Jelinek wrote:
> gcc.c-torture/execute/builtin-bitops-1.c fails on x86_64-linux, because
> there is no SImode libcall, only DImode, and it was zero-extending instead
> of sign-extending the argument, and not subtracting the mode difference
> afterwards.  So int foo (int x) { return __builtin_clrsb (x); }
> expanded the same as
> int foo (int x) { return __builtin_clrsbl ((unsigned) x); }
> instead of the correct
> int foo (int x) { return __builtin_clrsbl (x) - 32; }
> 
> Fixed thusly, I've also added folding of constant arguments
> during tree optimizations for it.
> 
> Ok for trunk?

Ok.

> 	(fold_builtin_bitop): Fix masking for width > HOST_BITS_PER_WIDE_INT
> 	and < 2 * HOST_BITS_PER_WIDE_INT.  Optimize BUILT_IN_CLRSB*.

Oh that's where the folders were hiding? I thought we didn't have any
for this family of builtins.


Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-20 20:32   ` Bernd Schmidt
  2011-06-20 20:48     ` Richard Henderson
  2011-06-21 16:39     ` [PATCH] Fix __bultin_clrsb* (PR middle-end/49489) Jakub Jelinek
@ 2011-06-23  6:23     ` H.J. Lu
  2011-07-12  3:50     ` Hans-Peter Nilsson
  2011-08-23 10:07     ` Jakub Jelinek
  4 siblings, 0 replies; 21+ messages in thread
From: H.J. Lu @ 2011-06-23  6:23 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Richard Henderson, GCC Patches

On Mon, Jun 20, 2011 at 12:38 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 06/16/2011 06:25 PM, Richard Henderson wrote:
>> On 06/16/2011 05:44 AM, Bernd Schmidt wrote:
>>> +@deftypefn {Built-in Function} int __builtin_clrsb (unsigned int x)
>>> +Returns the number of leading redundant sign bits in @var{x}, starting
>>> +at the most significant bit position.
>>> +@end deftypefn
>>
>> Do we want a signed argument, since we're talking about signs?
>
> Err, yes. It's signed everywhere else (builtins.def etc.).
>
>> It would seem that unlike clz, this function is not undefined for zero.
>> What about INT_MIN?  Do all cpus handle those edge cases the same way?
>
> -1 and zero should both produce the same value, 31 (for a 32 bit
> integer). I don't see why INT_MIN should be special - the return value
> is zero. This is true for C6X and Blackfin; ARM documentation suggests
> it's also true for their VCLS instruction. I've not found proper
> picochip documentation but some other documents that suggest it's also
> implemented this way.
>
>> Do you get smaller code in general from
>>
>>   if (x < 0)
>>     x = ~x;
>>   if (x == 0)
>>     return W_TYPE_SIZE - 1;
>>   count_leading_zeros(ret, x);
>>   return ret - 1;
>
> Probably.
>
>>> -(define_insn "signbitssi2"
>>> +(define_insn "clrsbsi2"
>>>    [(set (match_operand:HI 0 "register_operand" "=d")
>>>      (if_then_else:HI
>>>       (lt (match_operand:SI 1 "register_operand" "d") (const_int 0))
>>
>> No use of the new rtx code?
>
> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
> showed a problem with some of the existing simplify_const_unop cases:
> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
> operand, rather than the mode of the operation. This limits what we can
> do in that function, since op_mode is sometimes VOIDmode - we really
> should add builtin folders for these at some point.
>
> New patch below. Retested on i686 and bfin.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49512


-- 
H.J.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-20 20:32   ` Bernd Schmidt
                       ` (2 preceding siblings ...)
  2011-06-23  6:23     ` Add __builtin_clrsb, similar to clz/ctz H.J. Lu
@ 2011-07-12  3:50     ` Hans-Peter Nilsson
  2011-08-23 10:07     ` Jakub Jelinek
  4 siblings, 0 replies; 21+ messages in thread
From: Hans-Peter Nilsson @ 2011-07-12  3:50 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Mon, 20 Jun 2011, Bernd Schmidt wrote:
> New patch below. Retested on i686 and bfin.

Yay, bikeshedding opportunity! :P
Can we call them "leading *repeated* sign bits"? (in docs and
comments)  Calling them "redundant" makes you think the
representation is  not two's complement but new and improved...
like (bitwise) (1<<31) == -1 or something.

brgds, H-P
PS. just a minor change, I can do the legwork.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-06-20 20:32   ` Bernd Schmidt
                       ` (3 preceding siblings ...)
  2011-07-12  3:50     ` Hans-Peter Nilsson
@ 2011-08-23 10:07     ` Jakub Jelinek
  2011-08-23 10:08       ` Bernd Schmidt
  4 siblings, 1 reply; 21+ messages in thread
From: Jakub Jelinek @ 2011-08-23 10:07 UTC (permalink / raw)
  To: Bernd Schmidt, Eric Botcazou, Richard Sandiford
  Cc: Richard Henderson, GCC Patches

On Mon, Jun 20, 2011 at 09:38:22PM +0200, Bernd Schmidt wrote:
> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
> showed a problem with some of the existing simplify_const_unop cases:
> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
> operand, rather than the mode of the operation. This limits what we can
> do in that function, since op_mode is sometimes VOIDmode - we really
> should add builtin folders for these at some point.

> 	* simplify-rtx.c (simplify_const_unary_operation): Likewise.
> 	Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
> 	and popcount.

This change is IMHO wrong, see e.g.
PR50161 where we have (subreg:SI (popcount:DI (const_int -1))).  This
is supposed to yield 64, but with your changes
it yields 128 - the op_mode here is VOIDmode, so the first if that used
to handle it is no longer used, but as width is <= 2 *
HOST_BITS_PER_WIDE_INT, it is treated as TImode constant.

IMHO best would be just to mandate that for these unary ops like
FFS, CLZ, CLRSB, CTZ, POPCOUNT, PARITY, BSWAP the operand has the same mode
(or VOIDmode) as the unary rtx and that the operation is being carried in
the unop's mode, it shouldn't be hard to adjust the few
*.md patterns (mainly in avr.md, bfin.md).
I think it is bad enough that ZERO_EXTEND must not have CONST_INT argument,
making CONST_INT undefined also for all these unary ops is unnecessary.
I think for NEG/NOT we already have such a guarantee (and thus your change
pessimizes it anyway).
avr.md/bfin.md etc. can use (subreg:HI (popcount:SI (match_operand:SI ...)))
(or (zero_extend:HI (popcount:QI (match_operand:QI ...))) and similar.

Or the
  /* We can do some operations on integer CONST_DOUBLEs.  Also allow
     for a DImode operation on a CONST_INT.  */
  else if (GET_MODE (op) == VOIDmode
           && width <= HOST_BITS_PER_WIDE_INT * 2
           && (GET_CODE (op) == CONST_DOUBLE
               || CONST_INT_P (op)))
case would need to change too to test that op_width ==
HOST_BITS_PER_WIDE_INT * 2 (but then, it would again pessimize at least
NEG/NOT/ABS that are defined sanely).  But we'd also need to change many
other places, e.g. cse_process_notes_1, that currently special case
ZERO_EXTEND/SUBREG (and sometimes SIGN_EXTEND) and pessimize them because
those rtxes aren't allowed to have VOIDmode arguments.  cse_process_notes_1
perhaps could be changed for VOIDmode new_rtx to try to
simplify_replace_rtx it...

	Jakub

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-08-23 10:07     ` Jakub Jelinek
@ 2011-08-23 10:08       ` Bernd Schmidt
  2011-08-23 10:19         ` Richard Guenther
                           ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Bernd Schmidt @ 2011-08-23 10:08 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Eric Botcazou, Richard Sandiford, Richard Henderson, GCC Patches

On 08/23/11 11:05, Jakub Jelinek wrote:
> On Mon, Jun 20, 2011 at 09:38:22PM +0200, Bernd Schmidt wrote:
>> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
>> showed a problem with some of the existing simplify_const_unop cases:
>> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
>> operand, rather than the mode of the operation. This limits what we can
>> do in that function, since op_mode is sometimes VOIDmode - we really
>> should add builtin folders for these at some point.
> 
>> 	* simplify-rtx.c (simplify_const_unary_operation): Likewise.
>> 	Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
>> 	and popcount.
> 
> This change is IMHO wrong,

Conceptually, I think it is exactly right. It may however be
inconvenient in some cases.

> see e.g.
> PR50161 where we have (subreg:SI (popcount:DI (const_int -1))).  This
> is supposed to yield 64, but with your changes
> it yields 128 - the op_mode here is VOIDmode,

This is what shouldn't happen.

> cse_process_notes_1
> perhaps could be changed for VOIDmode new_rtx to try to
> simplify_replace_rtx it...

Is this where the problem came from? Sounds like it's worth a try.

Wasn't Richard S. working on a patch to give constants modes?


Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-08-23 10:08       ` Bernd Schmidt
@ 2011-08-23 10:19         ` Richard Guenther
  2011-08-23 10:25         ` Jakub Jelinek
  2011-08-23 10:45         ` Add __builtin_clrsb, similar to clz/ctz Richard Sandiford
  2 siblings, 0 replies; 21+ messages in thread
From: Richard Guenther @ 2011-08-23 10:19 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Jakub Jelinek, Eric Botcazou, Richard Sandiford,
	Richard Henderson, GCC Patches

On Tue, Aug 23, 2011 at 11:35 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 08/23/11 11:05, Jakub Jelinek wrote:
>> On Mon, Jun 20, 2011 at 09:38:22PM +0200, Bernd Schmidt wrote:
>>> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
>>> showed a problem with some of the existing simplify_const_unop cases:
>>> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
>>> operand, rather than the mode of the operation. This limits what we can
>>> do in that function, since op_mode is sometimes VOIDmode - we really
>>> should add builtin folders for these at some point.
>>
>>>      * simplify-rtx.c (simplify_const_unary_operation): Likewise.
>>>      Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
>>>      and popcount.
>>
>> This change is IMHO wrong,
>
> Conceptually, I think it is exactly right. It may however be
> inconvenient in some cases.
>
>> see e.g.
>> PR50161 where we have (subreg:SI (popcount:DI (const_int -1))).  This
>> is supposed to yield 64, but with your changes
>> it yields 128 - the op_mode here is VOIDmode,
>
> This is what shouldn't happen.

If it shouldn't happen, does some verifier catch it?

>> cse_process_notes_1
>> perhaps could be changed for VOIDmode new_rtx to try to
>> simplify_replace_rtx it...
>
> Is this where the problem came from? Sounds like it's worth a try.
>
> Wasn't Richard S. working on a patch to give constants modes?
>
>
> Bernd
>
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-08-23 10:08       ` Bernd Schmidt
  2011-08-23 10:19         ` Richard Guenther
@ 2011-08-23 10:25         ` Jakub Jelinek
  2011-08-23 10:34           ` Bernd Schmidt
  2011-08-23 10:45         ` Add __builtin_clrsb, similar to clz/ctz Richard Sandiford
  2 siblings, 1 reply; 21+ messages in thread
From: Jakub Jelinek @ 2011-08-23 10:25 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Eric Botcazou, Richard Sandiford, Richard Henderson, GCC Patches

On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
> > cse_process_notes_1
> > perhaps could be changed for VOIDmode new_rtx to try to
> > simplify_replace_rtx it...
> 
> Is this where the problem came from? Sounds like it's worth a try.

In this case, yes.  But there are many other places all around the
compiler that need to disallow unary op with VOIDmode operand.
In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
subst, etc.  Do we want to special case all those 7 unary ops there too?
Is it really worth it to save one subreg or truncate in the md patterns
for rarely used rtxes?

> Wasn't Richard S. working on a patch to give constants modes?

I don't think this is achievable for 4.7...

	Jakub

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-08-23 10:25         ` Jakub Jelinek
@ 2011-08-23 10:34           ` Bernd Schmidt
  2011-08-23 13:43             ` [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161) Jakub Jelinek
  0 siblings, 1 reply; 21+ messages in thread
From: Bernd Schmidt @ 2011-08-23 10:34 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Eric Botcazou, Richard Sandiford, Richard Henderson, GCC Patches

On 08/23/11 11:52, Jakub Jelinek wrote:
> On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
>>> cse_process_notes_1
>>> perhaps could be changed for VOIDmode new_rtx to try to
>>> simplify_replace_rtx it...
>>
>> Is this where the problem came from? Sounds like it's worth a try.
> 
> In this case, yes.  But there are many other places all around the
> compiler that need to disallow unary op with VOIDmode operand.
> In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
> subst, etc.  Do we want to special case all those 7 unary ops there too?
> Is it really worth it to save one subreg or truncate in the md patterns
> for rarely used rtxes?

Maybe not. I'll approve a patch to change it back, even if I think it's
not a good representation.


Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Add __builtin_clrsb, similar to clz/ctz
  2011-08-23 10:08       ` Bernd Schmidt
  2011-08-23 10:19         ` Richard Guenther
  2011-08-23 10:25         ` Jakub Jelinek
@ 2011-08-23 10:45         ` Richard Sandiford
  2 siblings, 0 replies; 21+ messages in thread
From: Richard Sandiford @ 2011-08-23 10:45 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Jakub Jelinek, Eric Botcazou, Richard Henderson, GCC Patches

Bernd Schmidt <bernds@codesourcery.com> writes:
> Wasn't Richard S. working on a patch to give constants modes?

A whole series more like.  Don't hold your breath!

Richard

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)
  2011-08-23 10:34           ` Bernd Schmidt
@ 2011-08-23 13:43             ` Jakub Jelinek
  2011-08-23 14:54               ` Georg-Johann Lay
  2011-08-23 15:58               ` Jakub Jelinek
  0 siblings, 2 replies; 21+ messages in thread
From: Jakub Jelinek @ 2011-08-23 13:43 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Eric Botcazou, Richard Sandiford, Richard Henderson, GCC Patches

On Tue, Aug 23, 2011 at 11:57:58AM +0200, Bernd Schmidt wrote:
> On 08/23/11 11:52, Jakub Jelinek wrote:
> > On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
> >>> cse_process_notes_1
> >>> perhaps could be changed for VOIDmode new_rtx to try to
> >>> simplify_replace_rtx it...
> >>
> >> Is this where the problem came from? Sounds like it's worth a try.
> > 
> > In this case, yes.  But there are many other places all around the
> > compiler that need to disallow unary op with VOIDmode operand.
> > In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
> > subst, etc.  Do we want to special case all those 7 unary ops there too?
> > Is it really worth it to save one subreg or truncate in the md patterns
> > for rarely used rtxes?
> 
> Maybe not. I'll approve a patch to change it back, even if I think it's
> not a good representation.

We can remove that restriction again once CONST_INTs are no longer VOIDmode.

Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
and i686-linux, on c6x it should make no difference IMHO (looked like a typo
in the expander which wasn't used anyway), can somebody test it on AVR and
BFIN?  My grepping through *.md didn't find any other places where the
operand wouldn't have the same mode as operation.

2011-08-23  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/50161
	* simplify-rtx.c (simplify_const_unary_operation): If
	op is CONST_INT, don't look at op_mode, but use instead
	mode.
	* optabs.c (add_equal_note): For FFS, CLZ, CTZ,
	CLRSB, POPCOUNT, PARITY and BSWAP use operand mode for
	operation and TRUNCATE/ZERO_EXTEND if needed.
	* doc/rtl.texi (ffs, clrsb, clz, ctz, popcount, parity, bswap):
	Document that operand mode must be same as operation mode,
	or VOIDmode.
	* config/avr/avr.md (paritysi2, *parityqihi2.libgcc,
	*paritysihi2.libgcc, popcountsi2, *popcountsi2.libgcc,
	*popcountqihi2.libgcc, clzsi2, *clzsihi2.libgcc, ctzsi2,
	*ctzsihi2.libgcc, ffssi2, *ffssihi2.libgcc): For unary ops
	use the mode of operand for the operation and add truncate
	or zero_extend around if needed.
	* config/c6x/c6x.md (ctzdi2): Likewise.
	* config/bfin/bfin.md (clrsbsi2, signbitssi2): Likewise.

	* gcc.dg/pr50161.c: New test.

--- gcc/simplify-rtx.c.jj	2011-08-22 08:17:07.000000000 +0200
+++ gcc/simplify-rtx.c	2011-08-23 13:24:08.000000000 +0200
@@ -1373,8 +1373,7 @@ simplify_const_unary_operation (enum rtx
     }
 
   if (CONST_INT_P (op)
-      && width <= HOST_BITS_PER_WIDE_INT
-      && op_width <= HOST_BITS_PER_WIDE_INT && op_width > 0)
+      && width <= HOST_BITS_PER_WIDE_INT && width > 0)
     {
       HOST_WIDE_INT arg0 = INTVAL (op);
       HOST_WIDE_INT val;
@@ -1394,50 +1393,50 @@ simplify_const_unary_operation (enum rtx
 	  break;
 
 	case FFS:
-	  arg0 &= GET_MODE_MASK (op_mode);
+	  arg0 &= GET_MODE_MASK (mode);
 	  val = ffs_hwi (arg0);
 	  break;
 
 	case CLZ:
-	  arg0 &= GET_MODE_MASK (op_mode);
-	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
+	  arg0 &= GET_MODE_MASK (mode);
+	  if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
 	    ;
 	  else
-	    val = GET_MODE_PRECISION (op_mode) - floor_log2 (arg0) - 1;
+	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
 	  break;
 
 	case CLRSB:
-	  arg0 &= GET_MODE_MASK (op_mode);
+	  arg0 &= GET_MODE_MASK (mode);
 	  if (arg0 == 0)
-	    val = GET_MODE_PRECISION (op_mode) - 1;
+	    val = GET_MODE_PRECISION (mode) - 1;
 	  else if (arg0 >= 0)
-	    val = GET_MODE_PRECISION (op_mode) - floor_log2 (arg0) - 2;
+	    val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
 	  else if (arg0 < 0)
-	    val = GET_MODE_PRECISION (op_mode) - floor_log2 (~arg0) - 2;
+	    val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
 	  break;
 
 	case CTZ:
-	  arg0 &= GET_MODE_MASK (op_mode);
+	  arg0 &= GET_MODE_MASK (mode);
 	  if (arg0 == 0)
 	    {
 	      /* Even if the value at zero is undefined, we have to come
 		 up with some replacement.  Seems good enough.  */
-	      if (! CTZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
-		val = GET_MODE_PRECISION (op_mode);
+	      if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
+		val = GET_MODE_PRECISION (mode);
 	    }
 	  else
 	    val = ctz_hwi (arg0);
 	  break;
 
 	case POPCOUNT:
-	  arg0 &= GET_MODE_MASK (op_mode);
+	  arg0 &= GET_MODE_MASK (mode);
 	  val = 0;
 	  while (arg0)
 	    val++, arg0 &= arg0 - 1;
 	  break;
 
 	case PARITY:
-	  arg0 &= GET_MODE_MASK (op_mode);
+	  arg0 &= GET_MODE_MASK (mode);
 	  val = 0;
 	  while (arg0)
 	    val++, arg0 &= arg0 - 1;
--- gcc/optabs.c.jj	2011-08-22 08:17:07.000000000 +0200
+++ gcc/optabs.c	2011-08-23 13:37:20.000000000 +0200
@@ -216,7 +216,32 @@ add_equal_note (rtx insns, rtx target, e
     }
 
   if (GET_RTX_CLASS (code) == RTX_UNARY)
-    note = gen_rtx_fmt_e (code, GET_MODE (target), copy_rtx (op0));
+    switch (code)
+      {
+      case FFS:
+      case CLZ:
+      case CTZ:
+      case CLRSB:
+      case POPCOUNT:
+      case PARITY:
+      case BSWAP:
+	if (GET_MODE (op0) != VOIDmode && GET_MODE (target) != GET_MODE (op0))
+	  {
+	    note = gen_rtx_fmt_e (code, GET_MODE (op0), copy_rtx (op0));
+	    if (GET_MODE_SIZE (GET_MODE (op0))
+		> GET_MODE_SIZE (GET_MODE (target)))
+	      note = simplify_gen_unary (TRUNCATE, GET_MODE (target),
+					 note, GET_MODE (op0));
+	    else
+	      note = simplify_gen_unary (ZERO_EXTEND, GET_MODE (target),
+					 note, GET_MODE (op0));
+	    break;
+	  }
+	/* FALLTHRU */
+      default:
+	note = gen_rtx_fmt_e (code, GET_MODE (target), copy_rtx (op0));
+	break;
+      }
   else
     note = gen_rtx_fmt_ee (code, GET_MODE (target), copy_rtx (op0), copy_rtx (op1));
 
--- gcc/doc/rtl.texi.jj	2011-07-21 09:54:33.000000000 +0200
+++ gcc/doc/rtl.texi	2011-08-23 12:56:44.000000000 +0200
@@ -2408,9 +2408,8 @@ Most often @var{m} will be a floating po
 @item (ffs:@var{m} @var{x})
 Represents one plus the index of the least significant 1-bit in
 @var{x}, represented as an integer of mode @var{m}.  (The value is
-zero if @var{x} is zero.)  The mode of @var{x} need not be @var{m};
-depending on the target machine, various mode combinations may be
-valid.
+zero if @var{x} is zero.)  The mode of @var{x} must be @var{m}
+or @code{VOIDmode}.
 
 @findex clrsb
 @item (clrsb:@var{m} @var{x})
@@ -2418,7 +2417,7 @@ Represents the number of redundant leadi
 represented as an integer of mode @var{m}, starting at the most
 significant bit position.  This is one less than the number of leading
 sign bits (either 0 or 1), with no special cases.  The mode of @var{x}
-will usually be an integer mode and may differ from @var{m}.
+must be @var{m} or @code{VOIDmode}.
 
 @findex clz
 @item (clz:@var{m} @var{x})
@@ -2427,7 +2426,7 @@ integer of mode @var{m}, starting at the
 If @var{x} is zero, the value is determined by
 @code{CLZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}).  Note that this is one of
 the few expressions that is not invariant under widening.  The mode of
-@var{x} will usually be an integer mode.
+@var{x} must be @var{m} or @code{VOIDmode}.
 
 @findex ctz
 @item (ctz:@var{m} @var{x})
@@ -2436,23 +2435,24 @@ integer of mode @var{m}, starting at the
 If @var{x} is zero, the value is determined by
 @code{CTZ_DEFINED_VALUE_AT_ZERO} (@pxref{Misc}).  Except for this case,
 @code{ctz(x)} is equivalent to @code{ffs(@var{x}) - 1}.  The mode of
-@var{x} will usually be an integer mode.
+@var{x} must be @var{m} or @code{VOIDmode}.
 
 @findex popcount
 @item (popcount:@var{m} @var{x})
 Represents the number of 1-bits in @var{x}, represented as an integer of
-mode @var{m}.  The mode of @var{x} will usually be an integer mode.
+mode @var{m}.  The mode of @var{x} must be @var{m} or @code{VOIDmode}.
 
 @findex parity
 @item (parity:@var{m} @var{x})
 Represents the number of 1-bits modulo 2 in @var{x}, represented as an
-integer of mode @var{m}.  The mode of @var{x} will usually be an integer
-mode.
+integer of mode @var{m}.  The mode of @var{x} must be @var{m} or
+@code{VOIDmode}.
 
 @findex bswap
 @item (bswap:@var{m} @var{x})
 Represents the value @var{x} with the order of bytes reversed, carried out
 in mode @var{m}, which must be a fixed-point machine mode.
+The mode of @var{x} must be @var{m} or @code{VOIDmode}.
 @end table
 
 @node Comparisons
--- gcc/config/avr/avr.md.jj	2011-08-18 08:35:54.000000000 +0200
+++ gcc/config/avr/avr.md	2011-08-23 12:46:13.000000000 +0200
@@ -4124,7 +4124,7 @@ (define_expand "paritysi2"
   [(set (reg:SI 22)
         (match_operand:SI 1 "register_operand" ""))
    (set (reg:HI 24)
-        (parity:HI (reg:SI 22)))
+	(truncate:HI (parity:SI (reg:SI 22))))
    (set (match_dup 2)
         (reg:HI 24))
    (set (match_operand:SI 0 "register_operand" "")
@@ -4144,7 +4144,7 @@ (define_insn "*parityhi2.libgcc"
 
 (define_insn "*parityqihi2.libgcc"
   [(set (reg:HI 24)
-        (parity:HI (reg:QI 24)))]
+	(zero_extend:HI (parity:QI (reg:QI 24))))]
   ""
   "%~call __parityqi2"
   [(set_attr "type" "xcall")
@@ -4152,7 +4152,7 @@ (define_insn "*parityqihi2.libgcc"
 
 (define_insn "*paritysihi2.libgcc"
   [(set (reg:HI 24)
-        (parity:HI (reg:SI 22)))]
+	(truncate:HI (parity:SI (reg:SI 22))))]
   ""
   "%~call __paritysi2"
   [(set_attr "type" "xcall")
@@ -4175,7 +4175,7 @@ (define_expand "popcountsi2"
   [(set (reg:SI 22)
         (match_operand:SI 1 "register_operand" ""))
    (set (reg:HI 24)
-        (popcount:HI (reg:SI 22)))
+	(truncate:HI (popcount:SI (reg:SI 22))))
    (set (match_dup 2)
         (reg:HI 24))
    (set (match_operand:SI 0 "register_operand" "")
@@ -4195,7 +4195,7 @@ (define_insn "*popcounthi2.libgcc"
 
 (define_insn "*popcountsi2.libgcc"
   [(set (reg:HI 24)
-        (popcount:HI (reg:SI 22)))]
+	(truncate:HI (popcount:SI (reg:SI 22))))]
   ""
   "%~call __popcountsi2"
   [(set_attr "type" "xcall")
@@ -4211,7 +4211,7 @@ (define_insn "*popcountqi2.libgcc"
 
 (define_insn_and_split "*popcountqihi2.libgcc"
   [(set (reg:HI 24)
-        (popcount:HI (reg:QI 24)))]
+	(zero_extend:HI (popcount:QI (reg:QI 24))))]
   ""
   "#"
   ""
@@ -4238,7 +4238,7 @@ (define_expand "clzsi2"
   [(set (reg:SI 22)
         (match_operand:SI 1 "register_operand" ""))
    (parallel [(set (reg:HI 24)
-                   (clz:HI (reg:SI 22)))
+		   (truncate:HI (clz:SI (reg:SI 22))))
               (clobber (reg:QI 26))])
    (set (match_dup 2)
         (reg:HI 24))
@@ -4260,7 +4260,7 @@ (define_insn "*clzhi2.libgcc"
 
 (define_insn "*clzsihi2.libgcc"
   [(set (reg:HI 24)
-        (clz:HI (reg:SI 22)))
+	(truncate:HI (clz:SI (reg:SI 22))))
    (clobber (reg:QI 26))]
   ""
   "%~call __clzsi2"
@@ -4284,7 +4284,7 @@ (define_expand "ctzsi2"
   [(set (reg:SI 22)
         (match_operand:SI 1 "register_operand" ""))
    (parallel [(set (reg:HI 24)
-                   (ctz:HI (reg:SI 22)))
+		   (truncate:HI (ctz:SI (reg:SI 22))))
               (clobber (reg:QI 22))
               (clobber (reg:QI 26))])
    (set (match_dup 2)
@@ -4307,7 +4307,7 @@ (define_insn "*ctzhi2.libgcc"
 
 (define_insn "*ctzsihi2.libgcc"
   [(set (reg:HI 24)
-        (ctz:HI (reg:SI 22)))
+	(truncate:HI (ctz:SI (reg:SI 22))))
    (clobber (reg:QI 22))
    (clobber (reg:QI 26))]
   ""
@@ -4332,7 +4332,7 @@ (define_expand "ffssi2"
   [(set (reg:SI 22)
         (match_operand:SI 1 "register_operand" ""))
    (parallel [(set (reg:HI 24)
-                   (ffs:HI (reg:SI 22)))
+		   (truncate:HI (ffs:SI (reg:SI 22))))
               (clobber (reg:QI 22))
               (clobber (reg:QI 26))])
    (set (match_dup 2)
@@ -4355,7 +4355,7 @@ (define_insn "*ffshi2.libgcc"
 
 (define_insn "*ffssihi2.libgcc"
   [(set (reg:HI 24)
-        (ffs:HI (reg:SI 22)))
+	(truncate:HI (ffs:SI (reg:SI 22))))
    (clobber (reg:QI 22))
    (clobber (reg:QI 26))]
   ""
--- gcc/config/c6x/c6x.md.jj	2011-07-15 20:46:48.000000000 +0200
+++ gcc/config/c6x/c6x.md	2011-08-23 12:36:20.000000000 +0200
@@ -2012,7 +2012,7 @@ (define_expand "ctzsi2"
 
 (define_expand "ctzdi2"
   [(set (match_operand:DI 0 "register_operand" "")
-	(ctz:SI (match_operand:DI 1 "register_operand" "")))]
+	(ctz:DI (match_operand:DI 1 "register_operand" "")))]
   "TARGET_INSNS_64"
 {
   rtx tmpreg = gen_reg_rtx (DImode);
--- gcc/config/bfin/bfin.md.jj	2011-07-08 15:09:38.000000000 +0200
+++ gcc/config/bfin/bfin.md	2011-08-23 12:39:27.000000000 +0200
@@ -1463,7 +1463,7 @@ (define_insn "one_cmplsi2"
 
 (define_expand "clrsbsi2"
   [(set (match_dup 2)
-	(clrsb:HI (match_operand:SI 1 "register_operand" "d")))
+	(truncate:HI (clrsb:SI (match_operand:SI 1 "register_operand" "d"))))
    (set (match_operand:SI 0 "register_operand")
 	(zero_extend:SI (match_dup 2)))]
   ""
@@ -1473,7 +1473,7 @@ (define_expand "clrsbsi2"
 
 (define_insn "signbitssi2"
   [(set (match_operand:HI 0 "register_operand" "=d")
-	(clrsb:HI (match_operand:SI 1 "register_operand" "d")))]
+	(truncate:HI (clrsb:SI (match_operand:SI 1 "register_operand" "d"))))]
   ""
   "%h0 = signbits %1%!"
   [(set_attr "type" "dsp32")])
--- gcc/testsuite/gcc.dg/pr50161.c.jj	2011-08-23 13:51:59.000000000 +0200
+++ gcc/testsuite/gcc.dg/pr50161.c	2011-08-23 13:31:14.000000000 +0200
@@ -0,0 +1,21 @@
+/* PR middle-end/50161 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-tree-ter -funroll-loops" } */
+
+extern void abort (void);
+
+int
+main ()
+{
+  unsigned i;
+  unsigned long a[16];
+
+  for (i = 0; i < 16; i++)
+    a[i] = ~0UL;
+
+  for (i = 0; i < 16; i++)
+    if (__builtin_popcountl (a[i]) != sizeof (a[i]) * 8)
+      abort ();
+
+  return 0;
+}


	Jakub

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)
  2011-08-23 13:43             ` [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161) Jakub Jelinek
@ 2011-08-23 14:54               ` Georg-Johann Lay
  2011-08-23 15:58               ` Jakub Jelinek
  1 sibling, 0 replies; 21+ messages in thread
From: Georg-Johann Lay @ 2011-08-23 14:54 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Bernd Schmidt, Eric Botcazou, Richard Sandiford,
	Richard Henderson, GCC Patches

Jakub Jelinek wrote:
> On Tue, Aug 23, 2011 at 11:57:58AM +0200, Bernd Schmidt wrote:
>> On 08/23/11 11:52, Jakub Jelinek wrote:
>>> On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
>>>>> cse_process_notes_1
>>>>> perhaps could be changed for VOIDmode new_rtx to try to
>>>>> simplify_replace_rtx it...
>>>> Is this where the problem came from? Sounds like it's worth a try.
>>> In this case, yes.  But there are many other places all around the
>>> compiler that need to disallow unary op with VOIDmode operand.
>>> In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
>>> subst, etc.  Do we want to special case all those 7 unary ops there too?
>>> Is it really worth it to save one subreg or truncate in the md patterns
>>> for rarely used rtxes?
>> Maybe not. I'll approve a patch to change it back, even if I think it's
>> not a good representation.
> 
> We can remove that restriction again once CONST_INTs are no longer VOIDmode.
> 
> Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
> and i686-linux, on c6x it should make no difference IMHO (looked like a typo
> in the expander which wasn't used anyway), can somebody test it on AVR and

Tested you patch against r177949 on avr-unknown-none for C/C++.
There are no regressions and the new test case passes fine.

Johann

> BFIN?  My grepping through *.md didn't find any other places where the
> operand wouldn't have the same mode as operation.
> 
> 2011-08-23  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/50161
> 	* simplify-rtx.c (simplify_const_unary_operation): If
> 	op is CONST_INT, don't look at op_mode, but use instead
> 	mode.
> 	* optabs.c (add_equal_note): For FFS, CLZ, CTZ,
> 	CLRSB, POPCOUNT, PARITY and BSWAP use operand mode for
> 	operation and TRUNCATE/ZERO_EXTEND if needed.
> 	* doc/rtl.texi (ffs, clrsb, clz, ctz, popcount, parity, bswap):
> 	Document that operand mode must be same as operation mode,
> 	or VOIDmode.
> 	* config/avr/avr.md (paritysi2, *parityqihi2.libgcc,
> 	*paritysihi2.libgcc, popcountsi2, *popcountsi2.libgcc,
> 	*popcountqihi2.libgcc, clzsi2, *clzsihi2.libgcc, ctzsi2,
> 	*ctzsihi2.libgcc, ffssi2, *ffssihi2.libgcc): For unary ops
> 	use the mode of operand for the operation and add truncate
> 	or zero_extend around if needed.
> 	* config/c6x/c6x.md (ctzdi2): Likewise.
> 	* config/bfin/bfin.md (clrsbsi2, signbitssi2): Likewise.
> 
> 	* gcc.dg/pr50161.c: New test.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)
  2011-08-23 13:43             ` [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161) Jakub Jelinek
  2011-08-23 14:54               ` Georg-Johann Lay
@ 2011-08-23 15:58               ` Jakub Jelinek
  2011-08-23 16:16                 ` Bernd Schmidt
  1 sibling, 1 reply; 21+ messages in thread
From: Jakub Jelinek @ 2011-08-23 15:58 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Eric Botcazou, Richard Sandiford, Richard Henderson, GCC Patches

On Tue, Aug 23, 2011 at 02:06:05PM +0200, Jakub Jelinek wrote:
> We can remove that restriction again once CONST_INTs are no longer VOIDmode.
> 
> Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
> and i686-linux, on c6x it should make no difference IMHO (looked like a typo
> in the expander which wasn't used anyway), can somebody test it on AVR and
> BFIN?  My grepping through *.md didn't find any other places where the
> operand wouldn't have the same mode as operation.

Now successfully bootstrapped/regtested on x86_64-linux and i686-linux
and tested on AVR by Georg-Johann, ok for trunk?

	Jakub

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)
  2011-08-23 15:58               ` Jakub Jelinek
@ 2011-08-23 16:16                 ` Bernd Schmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Bernd Schmidt @ 2011-08-23 16:16 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Eric Botcazou, Richard Sandiford, Richard Henderson, GCC Patches

On 08/23/11 17:42, Jakub Jelinek wrote:
> On Tue, Aug 23, 2011 at 02:06:05PM +0200, Jakub Jelinek wrote:
>> We can remove that restriction again once CONST_INTs are no longer VOIDmode.
>>
>> Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
>> and i686-linux, on c6x it should make no difference IMHO (looked like a typo
>> in the expander which wasn't used anyway), can somebody test it on AVR and
>> BFIN?  My grepping through *.md didn't find any other places where the
>> operand wouldn't have the same mode as operation.
> 
> Now successfully bootstrapped/regtested on x86_64-linux and i686-linux
> and tested on AVR by Georg-Johann, ok for trunk?

Ok. Will fix up bfin/c6x if necessary.


Bernd

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2011-08-23 15:46 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-16 13:06 Add __builtin_clrsb, similar to clz/ctz Bernd Schmidt
2011-06-16 13:10 ` Georg-Johann Lay
2011-06-16 13:56 ` Laurent Desnogues
2011-06-16 13:59   ` Bernd Schmidt
2011-06-16 17:03 ` Richard Henderson
2011-06-20 20:32   ` Bernd Schmidt
2011-06-20 20:48     ` Richard Henderson
2011-06-21 16:39     ` [PATCH] Fix __bultin_clrsb* (PR middle-end/49489) Jakub Jelinek
2011-06-21 16:46       ` Bernd Schmidt
2011-06-23  6:23     ` Add __builtin_clrsb, similar to clz/ctz H.J. Lu
2011-07-12  3:50     ` Hans-Peter Nilsson
2011-08-23 10:07     ` Jakub Jelinek
2011-08-23 10:08       ` Bernd Schmidt
2011-08-23 10:19         ` Richard Guenther
2011-08-23 10:25         ` Jakub Jelinek
2011-08-23 10:34           ` Bernd Schmidt
2011-08-23 13:43             ` [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161) Jakub Jelinek
2011-08-23 14:54               ` Georg-Johann Lay
2011-08-23 15:58               ` Jakub Jelinek
2011-08-23 16:16                 ` Bernd Schmidt
2011-08-23 10:45         ` Add __builtin_clrsb, similar to clz/ctz Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).