public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
From: Jakub Jelinek @ 2023-11-09 15:02 UTC
  To: Joseph S. Myers, Richard Biener, Jason Merrill; +Cc: gcc-patches

Hi!

The following patch adds 6 new type-generic builtins,
__builtin_clzg
__builtin_ctzg
__builtin_clrsbg
__builtin_ffsg
__builtin_parityg
__builtin_popcountg
The g at the end stands for generic, because the unsuffixed variants
of the builtins already exist and have unsigned int or int arguments.

The main reason to add these is to support arbitrary unsigned (for
clrsb/ffs signed) bit-precise integer types and also __int128, which
isn't supported by the existing builtins, so that e.g. <stdbit.h>
type-generic functions can support not just bit-precise unsigned
integer types whose width matches a standard or extended integer type,
but others too.

None of these new builtins promote their first argument, so the argument
can be e.g. unsigned char or unsigned short or unsigned __int20 etc.
The first 2 support either 1 or 2 arguments.  If only 1 argument is supplied,
the behavior is undefined for argument 0, like for the other __builtin_c[lt]z*
builtins.  If 2 arguments are supplied, the second argument should be an int
that will be returned if the first argument is 0.  All other builtins have
just one argument.  For __builtin_clrsbg and __builtin_ffsg the argument
shall be any signed standard/extended or bit-precise integer, for the others
any unsigned standard/extended or bit-precise integer (bool not allowed).
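
To illustrate the semantics described above, here is a rough sketch of what
the two-argument forms are meant to compute for an unsigned char argument,
written only with the existing __builtin_clz/__builtin_ctz.  The helper names
clzg_uchar and ctzg_uchar are made up for this illustration and are not part
of the patch.

```c
#include <limits.h>

/* Hypothetical stand-in for __builtin_clzg (x, at_zero) with an
   unsigned char X: leading zeros are counted in unsigned char's own
   width, because the argument is not promoted.  */
static int
clzg_uchar (unsigned char x, int at_zero)
{
  if (x == 0)
    return at_zero;
  /* __builtin_clz counts in the width of unsigned int, so subtract the
     extra leading zero bits introduced by the widening conversion.  */
  return __builtin_clz ((unsigned int) x)
	 - (int) (sizeof (unsigned int) * CHAR_BIT - CHAR_BIT);
}

/* Hypothetical stand-in for __builtin_ctzg (x, at_zero); trailing zeros
   do not depend on the width, so no adjustment is needed.  */
static int
ctzg_uchar (unsigned char x, int at_zero)
{
  return x ? __builtin_ctz ((unsigned int) x) : at_zero;
}
```

With these, clzg_uchar (1, CHAR_BIT) yields CHAR_BIT - 1 rather than the
unsigned int based count, and a zero argument simply yields whatever the
caller passed as the second argument.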

One possibility would be to also allow signed integer types for
the clz/ctz/parity/popcount ones (and just cast the argument to
unsigned_type_for during folding) and similarly unsigned integer types
for the clrsb/ffs ones, dunno what is better; for stdbit.h the current
version is sufficient and diagnoses use of the inappropriate sign,
though on the other side I wonder if users won't be confused by
__builtin_clzg (1) being an error and having to write __builtin_clzg (1U).
And I think we don't have anything in C that would allow casting to
corresponding unsigned type (or vice versa) given arbitrary integral type,
one could use _Generic for that for standard and extended types, but not
for arbitrary _BitInt.  What do you think?

The new builtins are lowered to corresponding builtins with other suffixes
or internal calls (plus casts and adjustments where needed) during FE
folding or during gimplification at the latest.  The non-suffixed builtins
handle precisions up to the precision of int, the l ones up to the precision
of long, and the ll ones up to the precision of long long.  Anything wider,
up to __int128 precision, is lowered to a double-word expansion early, and
the rest (which must be _BitInt) is lowered to internal fn calls, which are
then lowered during the bitint lowering pass.
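
As a rough illustration of the double-word expansion, the sketch below
models a double-width value as two unsigned long long halves and combines
the existing ll builtins in the same spirit.  The struct dword type, the
HALF_BITS macro and the dword_* helpers are hypothetical names for this
example only; the patch itself builds equivalent trees in
fold_builtin_bit_query.

```c
#include <limits.h>

/* Number of bits in one half; the patch uses MAX_FIXED_MODE_SIZE / 2.  */
#define HALF_BITS ((int) (sizeof (unsigned long long) * CHAR_BIT))

/* Hypothetical model of a double-word unsigned value.  */
struct dword { unsigned long long hi, lo; };

/* Leading zeros: use the high half if it is nonzero, otherwise the low
   half plus HALF_BITS, otherwise the caller-supplied value for zero.  */
static int
dword_clz (struct dword v, int at_zero)
{
  if (v.hi != 0)
    return __builtin_clzll (v.hi);
  if (v.lo != 0)
    return __builtin_clzll (v.lo) + HALF_BITS;
  return at_zero;
}

/* Trailing zeros: the mirror image, starting from the low half.  */
static int
dword_ctz (struct dword v, int at_zero)
{
  if (v.lo != 0)
    return __builtin_ctzll (v.lo);
  if (v.hi != 0)
    return __builtin_ctzll (v.hi) + HALF_BITS;
  return at_zero;
}

/* Population count is the sum over both halves.  */
static int
dword_popcount (struct dword v)
{
  return __builtin_popcountll (v.hi) + __builtin_popcountll (v.lo);
}

/* Parity of the whole value is the parity of the XOR of the halves.  */
static int
dword_parity (struct dword v)
{
  return __builtin_parityll (v.hi ^ v.lo);
}
```

The one-argument clz/ctz forms would correspond to dropping the at_zero
fallback, leaving the all-zero case undefined.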

The patch also changes the representation of IFN_CLZ and IFN_CTZ calls.
Previously they were in the IL only if they were a directly supported optab,
and depending on C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 they had or didn't
have defined behavior at 0.  Now they are in the IL either if they are a
directly supported optab, or for the large/huge BITINT_TYPEs, and they have
either 1 or 2 arguments.  If one, the behavior is undefined at zero; if 2, the
second argument is an int constant that should be returned for 0.
As there is no extra support during expansion, for a directly supported optab
the second argument, if present, should still match the
C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 value, but for BITINT_TYPE arguments
it can be an arbitrary int INTEGER_CST.

The goal is e.g.
#ifdef __has_builtin
#if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_popcountg)
#define stdc_leading_zeros(x) \
  __builtin_clzg (x, __builtin_popcountg ((__typeof (x)) -1))
#endif
#endif
where __builtin_popcountg ((__typeof (x)) -1) computes the bit precision
of x's type (kind of _Bitwidthof (x) alternative).
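
The all-ones popcount trick can be tried out today with the existing
builtins, under the assumption that the type is unsigned and no wider than
unsigned long long.  The UNSIGNED_WIDTH_OF macro below is a hypothetical
name for this sketch; with the new builtins the widening cast would not be
needed.

```c
#include <limits.h>

/* Hypothetical helper: bit precision of the unsigned type of X, computed
   as the population count of that type's all-ones value.  Only meant for
   unsigned types no wider than unsigned long long.  */
#define UNSIGNED_WIDTH_OF(x) \
  __builtin_popcountll ((unsigned long long) ((__typeof__ (x)) -1))
```

For example, UNSIGNED_WIDTH_OF ((unsigned char) 0) evaluates to CHAR_BIT,
because the cast to the argument's own type keeps only that type's value
bits before the count is taken.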

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-11-09  Jakub Jelinek  <jakub@redhat.com>

	PR c/111309
gcc/
	* builtins.def (BUILT_IN_CLZG, BUILT_IN_CTZG, BUILT_IN_CLRSBG,
	BUILT_IN_FFSG, BUILT_IN_PARITYG, BUILT_IN_POPCOUNTG): New
	builtins.
	* builtins.cc (fold_builtin_bit_query): New function.
	(fold_builtin_1): Use it for
	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
	(fold_builtin_2): Use it for BUILT_IN_{CLZ,CTZ}G.
	* fold-const-call.cc: Fix comment typo on tm.h inclusion.
	(fold_const_call_ss): Handle
	CFN_BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
	(fold_const_call_sss): New function.
	(fold_const_call_1): Call it for 2 argument functions returning
	scalar when passed 2 INTEGER_CSTs.
	* genmatch.cc (cmp_operand): For function calls also compare
	number of arguments.
	(fns_cmp): New function.
	(dt_node::gen_kids): Sort fns and generic_fns.
	(dt_node::gen_kids_1): Handle fns with the same id but different
	number of arguments.
	* match.pd (CLZ simplifications): Drop checks for defined behavior
	at zero.  Add variant of simplifications for IFN_CLZ with 2 arguments.
	(CTZ simplifications): Drop checks for defined behavior at zero,
	don't optimize precisions above MAX_FIXED_MODE_SIZE.  Add variant of
	simplifications for IFN_CTZ with 2 arguments.
	(a != 0 ? CLZ(a) : CST -> .CLZ(a)): Use TREE_TYPE (@3) instead of
	type, add BITINT_TYPE handling, create 2 argument IFN_CLZ rather than
	one argument.  Add variant for matching CLZ with 2 arguments.
	(a != 0 ? CTZ(a) : CST -> .CTZ(a)): Similarly.
	* gimple-lower-bitint.cc (bitint_large_huge::lower_bit_query): New
	method.
	(bitint_large_huge::lower_call): Use it for IFN_{CLZ,CTZ,CLRSB,FFS}
	and IFN_{PARITY,POPCOUNT} calls.
	* gimple-range-op.cc (cfn_clz::fold_range): Don't check
	CLZ_DEFINED_VALUE_AT_ZERO for m_gimple_call_internal_p, instead
	assume defined value at zero if the call has 2 arguments and use
	second argument value for that case.
	(cfn_ctz::fold_range): Similarly.
	(gimple_range_op_handler::maybe_builtin_call): Use op_cfn_clz_internal
	or op_cfn_ctz_internal only if internal fn call has 2 arguments and
	set m_op2 in that case.
	* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern,
	vect_recog_popcount_clz_ctz_ffs_pattern): For value defined at zero
	use second argument of calls if present, otherwise assume UB at zero,
	create 2 argument .CLZ/.CTZ calls if needed.
	* tree-vect-stmts.cc (vectorizable_call): Handle 2 argument .CLZ/.CTZ
	calls.
	* tree-ssa-loop-niter.cc (build_cltz_expr): Create 2 argument
	.CLZ/.CTZ calls if needed.
	* tree-ssa-forwprop.cc (simplify_count_trailing_zeroes): Create 2
	argument .CTZ calls if needed.
	* tree-ssa-phiopt.cc (cond_removal_in_builtin_zero_pattern): Handle
	2 argument .CLZ/.CTZ calls, handle BITINT_TYPE, create 2 argument
	.CLZ/.CTZ calls.
	* doc/extend.texi (__builtin_clzg, __builtin_ctzg, __builtin_clrsbg,
	__builtin_ffsg, __builtin_parityg, __builtin_popcountg): Document.
gcc/c-family/
	* c-common.cc (check_builtin_function_arguments): Handle
	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
	* c-gimplify.cc (c_gimplify_expr): If __builtin_c[lt]zg second
	argument hasn't been folded into constant yet, transform it to one
	argument call inside of a COND_EXPR which for first argument 0
	returns the second argument.
gcc/c/
	* c-typeck.cc (convert_arguments): Don't promote first argument
	of BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
gcc/cp/
	* call.cc (magic_varargs_p): Return 4 for
	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
	(build_over_call): Don't promote first argument of
	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
	* cp-gimplify.cc (cp_gimplify_expr): For BUILT_IN_C{L,T}ZG use
	c_gimplify_expr.
gcc/testsuite/
	* c-c++-common/pr111309-1.c: New test.
	* c-c++-common/pr111309-2.c: New test.
	* gcc.dg/torture/bitint-43.c: New test.
	* gcc.dg/torture/bitint-44.c: New test.

--- gcc/builtins.def.jj	2023-11-09 09:04:18.396546519 +0100
+++ gcc/builtins.def	2023-11-09 09:17:40.235182413 +0100
@@ -962,15 +962,18 @@ DEF_GCC_BUILTIN        (BUILT_IN_CLZ, "c
 DEF_GCC_BUILTIN        (BUILT_IN_CLZIMAX, "clzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CLZL, "clzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CLZLL, "clzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLZG, "clzg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN        (BUILT_IN_CONSTANT_P, "constant_p", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZ, "ctz", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CTZG, "ctzg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN        (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_CLRSBG, "clrsbg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
 DEF_GCC_BUILTIN        (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
@@ -993,6 +996,7 @@ DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFS, "f
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSIMAX, "ffsimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSL, "ffsl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSLL, "ffsll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_FFSG, "ffsg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_EXT_LIB_BUILTIN        (BUILT_IN_FORK, "fork", BT_FN_PID, ATTR_NOTHROW_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_FRAME_ADDRESS, "frame_address", BT_FN_PTR_UINT, ATTR_NULL)
 /* [trans-mem]: Adjust BUILT_IN_TM_FREE if BUILT_IN_FREE is changed.  */
@@ -1041,10 +1045,12 @@ DEF_GCC_BUILTIN        (BUILT_IN_PARITY,
 DEF_GCC_BUILTIN        (BUILT_IN_PARITYIMAX, "parityimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_PARITYL, "parityl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_PARITYLL, "parityll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_PARITYG, "parityg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNT, "popcount", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTIMAX, "popcountimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTL, "popcountl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTLL, "popcountll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTG, "popcountg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_POSIX_MEMALIGN, "posix_memalign", BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF)
 DEF_GCC_BUILTIN        (BUILT_IN_PREFETCH, "prefetch", BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST)
 DEF_LIB_BUILTIN        (BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST)
--- gcc/builtins.cc.jj	2023-11-09 09:03:53.107904770 +0100
+++ gcc/builtins.cc	2023-11-09 09:17:40.230182483 +0100
@@ -9573,6 +9573,271 @@ fold_builtin_arith_overflow (location_t
   return build2_loc (loc, COMPOUND_EXPR, boolean_type_node, store, ovfres);
 }
 
+/* Fold __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g into corresponding
+   internal function.  */
+
+static tree
+fold_builtin_bit_query (location_t loc, enum built_in_function fcode,
+			tree arg0, tree arg1)
+{
+  enum internal_fn ifn;
+  enum built_in_function fcodei, fcodel, fcodell;
+  tree arg0_type = TREE_TYPE (arg0);
+  tree cast_type = NULL_TREE;
+  int addend = 0;
+
+  switch (fcode)
+    {
+    case BUILT_IN_CLZG:
+      if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
+	return NULL_TREE;
+      ifn = IFN_CLZ;
+      fcodei = BUILT_IN_CLZ;
+      fcodel = BUILT_IN_CLZL;
+      fcodell = BUILT_IN_CLZLL;
+      break;
+    case BUILT_IN_CTZG:
+      if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
+	return NULL_TREE;
+      ifn = IFN_CTZ;
+      fcodei = BUILT_IN_CTZ;
+      fcodel = BUILT_IN_CTZL;
+      fcodell = BUILT_IN_CTZLL;
+      break;
+    case BUILT_IN_CLRSBG:
+      ifn = IFN_CLRSB;
+      fcodei = BUILT_IN_CLRSB;
+      fcodel = BUILT_IN_CLRSBL;
+      fcodell = BUILT_IN_CLRSBLL;
+      break;
+    case BUILT_IN_FFSG:
+      ifn = IFN_FFS;
+      fcodei = BUILT_IN_FFS;
+      fcodel = BUILT_IN_FFSL;
+      fcodell = BUILT_IN_FFSLL;
+      break;
+    case BUILT_IN_PARITYG:
+      ifn = IFN_PARITY;
+      fcodei = BUILT_IN_PARITY;
+      fcodel = BUILT_IN_PARITYL;
+      fcodell = BUILT_IN_PARITYLL;
+      break;
+    case BUILT_IN_POPCOUNTG:
+      ifn = IFN_POPCOUNT;
+      fcodei = BUILT_IN_POPCOUNT;
+      fcodel = BUILT_IN_POPCOUNTL;
+      fcodell = BUILT_IN_POPCOUNTLL;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  if (TYPE_PRECISION (arg0_type)
+      <= TYPE_PRECISION (long_long_unsigned_type_node))
+    {
+      if (TYPE_PRECISION (arg0_type) <= TYPE_PRECISION (unsigned_type_node))
+
+	cast_type = (TYPE_UNSIGNED (arg0_type)
+		     ? unsigned_type_node : integer_type_node);
+      else if (TYPE_PRECISION (arg0_type)
+	       <= TYPE_PRECISION (long_unsigned_type_node))
+	{
+	  cast_type = (TYPE_UNSIGNED (arg0_type)
+		       ? long_unsigned_type_node : long_integer_type_node);
+	  fcodei = fcodel;
+	}
+      else
+	{
+	  cast_type = (TYPE_UNSIGNED (arg0_type)
+		       ? long_long_unsigned_type_node
+		       : long_long_integer_type_node);
+	  fcodei = fcodell;
+	}
+    }
+  else if (TYPE_PRECISION (arg0_type) <= MAX_FIXED_MODE_SIZE)
+    {
+      cast_type
+	= build_nonstandard_integer_type (MAX_FIXED_MODE_SIZE,
+					  TYPE_UNSIGNED (arg0_type));
+      gcc_assert (TYPE_PRECISION (cast_type)
+		  == 2 * TYPE_PRECISION (long_long_unsigned_type_node));
+      fcodei = END_BUILTINS;
+    }
+  else
+    fcodei = END_BUILTINS;
+  if (cast_type)
+    {
+      switch (fcode)
+	{
+	case BUILT_IN_CLZG:
+	case BUILT_IN_CLRSBG:
+	  addend = TYPE_PRECISION (arg0_type) - TYPE_PRECISION (cast_type);
+	  break;
+	default:
+	  break;
+	}
+      arg0 = fold_convert (cast_type, arg0);
+      arg0_type = cast_type;
+    }
+
+  if (arg1)
+    arg1 = fold_convert (integer_type_node, arg1);
+
+  tree arg2 = arg1;
+  if (fcode == BUILT_IN_CLZG && addend)
+    {
+      if (arg1)
+	arg0 = save_expr (arg0);
+      arg2 = NULL_TREE;
+    }
+  tree call = NULL_TREE, tem;
+  if (TYPE_PRECISION (arg0_type) == MAX_FIXED_MODE_SIZE
+      && (TYPE_PRECISION (arg0_type)
+	  == 2 * TYPE_PRECISION (long_long_unsigned_type_node)))
+    {
+      /* __int128 expansions using up to 2 long long builtins.  */
+      arg0 = save_expr (arg0);
+      tree type = (TYPE_UNSIGNED (arg0_type)
+		   ? long_long_unsigned_type_node
+		   : long_long_integer_type_node);
+      tree hi = fold_build2 (RSHIFT_EXPR, arg0_type, arg0,
+			     build_int_cst (integer_type_node,
+					    MAX_FIXED_MODE_SIZE / 2));
+      hi = fold_convert (type, hi);
+      tree lo = fold_convert (type, arg0);
+      switch (fcode)
+	{
+	case BUILT_IN_CLZG:
+	  call = fold_builtin_bit_query (loc, fcode, lo, NULL_TREE);
+	  call = fold_build2 (PLUS_EXPR, integer_type_node, call,
+			      build_int_cst (integer_type_node,
+					     MAX_FIXED_MODE_SIZE / 2));
+	  if (arg2)
+	    call = fold_build3 (COND_EXPR, integer_type_node,
+				fold_build2 (NE_EXPR, boolean_type_node,
+					     lo, build_zero_cst (type)),
+				call, arg2);
+	  call = fold_build3 (COND_EXPR, integer_type_node,
+			      fold_build2 (NE_EXPR, boolean_type_node,
+					   hi, build_zero_cst (type)),
+			      fold_builtin_bit_query (loc, fcode, hi,
+						      NULL_TREE),
+			      call);
+	  break;
+	case BUILT_IN_CTZG:
+	  call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
+	  call = fold_build2 (PLUS_EXPR, integer_type_node, call,
+			      build_int_cst (integer_type_node,
+					     MAX_FIXED_MODE_SIZE / 2));
+	  if (arg2)
+	    call = fold_build3 (COND_EXPR, integer_type_node,
+				fold_build2 (NE_EXPR, boolean_type_node,
+					     hi, build_zero_cst (type)),
+				call, arg2);
+	  call = fold_build3 (COND_EXPR, integer_type_node,
+			      fold_build2 (NE_EXPR, boolean_type_node,
+					   lo, build_zero_cst (type)),
+			      fold_builtin_bit_query (loc, fcode, lo,
+						      NULL_TREE),
+			      call);
+	  break;
+	case BUILT_IN_CLRSBG:
+	  tem = fold_builtin_bit_query (loc, fcode, lo, NULL_TREE);
+	  tem = fold_build2 (PLUS_EXPR, integer_type_node, tem,
+			     build_int_cst (integer_type_node,
+					    MAX_FIXED_MODE_SIZE / 2));
+	  tem = fold_build3 (COND_EXPR, integer_type_node,
+			     fold_build2 (LT_EXPR, boolean_type_node,
+					  fold_build2 (BIT_XOR_EXPR, type,
+						       lo, hi),
+					  build_zero_cst (type)),
+			     build_int_cst (integer_type_node,
+					    MAX_FIXED_MODE_SIZE / 2 - 1),
+			     tem);
+	  call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
+	  call = save_expr (call);
+	  call = fold_build3 (COND_EXPR, integer_type_node,
+			      fold_build2 (NE_EXPR, boolean_type_node,
+					   call,
+					   build_int_cst (integer_type_node,
+							  MAX_FIXED_MODE_SIZE
+							  / 2 - 1)),
+			      call, tem);
+	  break;
+	case BUILT_IN_FFSG:
+	  call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
+	  call = fold_build2 (PLUS_EXPR, integer_type_node, call,
+			      build_int_cst (integer_type_node,
+					     MAX_FIXED_MODE_SIZE / 2));
+	  call = fold_build3 (COND_EXPR, integer_type_node,
+			      fold_build2 (NE_EXPR, boolean_type_node,
+					   hi, build_zero_cst (type)),
+			      call, integer_zero_node);
+	  call = fold_build3 (COND_EXPR, integer_type_node,
+			      fold_build2 (NE_EXPR, boolean_type_node,
+					   lo, build_zero_cst (type)),
+			      fold_builtin_bit_query (loc, fcode, lo,
+						      NULL_TREE),
+			      call);
+	  break;
+	case BUILT_IN_PARITYG:
+	  call = fold_builtin_bit_query (loc, fcode,
+					 fold_build2 (BIT_XOR_EXPR, type,
+						      lo, hi), NULL_TREE);
+	  break;
+	case BUILT_IN_POPCOUNTG:
+	  call = fold_build2 (PLUS_EXPR, integer_type_node,
+			      fold_builtin_bit_query (loc, fcode, hi,
+						      NULL_TREE),
+			      fold_builtin_bit_query (loc, fcode, lo,
+						      NULL_TREE));
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+    }
+  else
+    {
+      /* Only keep second argument to IFN_CLZ/IFN_CTZ if it is the
+	 value defined at zero during GIMPLE, or for large/huge _BitInt
+	 (which are then lowered during bitint lowering).  */
+      if (arg2 && TREE_CODE (TREE_TYPE (arg0)) != BITINT_TYPE)
+	{
+	  int val;
+	  if (fcode == BUILT_IN_CLZG)
+	    {
+	      if (CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (arg0_type),
+					     val) != 2
+		  || wi::to_widest (arg2) != val)
+		arg2 = NULL_TREE;
+	    }
+	  else if (CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (arg0_type),
+					      val) != 2
+		   || wi::to_widest (arg2) != val)
+	    arg2 = NULL_TREE;
+	  if (!direct_internal_fn_supported_p (ifn, arg0_type,
+					       OPTIMIZE_FOR_BOTH))
+	    arg2 = NULL_TREE;
+	}
+      if (fcodei == END_BUILTINS || arg2)
+	call = build_call_expr_internal_loc (loc, ifn, integer_type_node,
+					     arg2 ? 2 : 1, arg0, arg2);
+      else
+	call = build_call_expr_loc (loc, builtin_decl_explicit (fcodei), 1,
+				    arg0);
+    }
+  if (addend)
+    call = fold_build2 (PLUS_EXPR, integer_type_node, call,
+			build_int_cst (integer_type_node, addend));
+  if (arg1 && arg2 == NULL_TREE)
+    call = fold_build3 (COND_EXPR, integer_type_node,
+			fold_build2 (NE_EXPR, boolean_type_node,
+				     arg0, build_zero_cst (arg0_type)),
+			call, arg1);
+
+  return call;
+}
+
 /* Fold __builtin_{add,sub}c{,l,ll} into pair of internal functions
    that return both result of arithmetics and overflowed boolean
    flag in a complex integer result.  */
@@ -9824,6 +10089,14 @@ fold_builtin_1 (location_t loc, tree exp
 	return build_empty_stmt (loc);
       break;
 
+    case BUILT_IN_CLZG:
+    case BUILT_IN_CTZG:
+    case BUILT_IN_CLRSBG:
+    case BUILT_IN_FFSG:
+    case BUILT_IN_PARITYG:
+    case BUILT_IN_POPCOUNTG:
+      return fold_builtin_bit_query (loc, fcode, arg0, NULL_TREE);
+
     default:
       break;
     }
@@ -9913,6 +10186,10 @@ fold_builtin_2 (location_t loc, tree exp
     case BUILT_IN_ATOMIC_IS_LOCK_FREE:
       return fold_builtin_atomic_is_lock_free (arg0, arg1);
 
+    case BUILT_IN_CLZG:
+    case BUILT_IN_CTZG:
+      return fold_builtin_bit_query (loc, fcode, arg0, arg1);
+
     default:
       break;
     }
--- gcc/fold-const-call.cc.jj	2023-11-09 09:03:53.368901073 +0100
+++ gcc/fold-const-call.cc	2023-11-09 09:17:40.240182342 +0100
@@ -27,7 +27,7 @@ along with GCC; see the file COPYING3.
 #include "fold-const.h"
 #include "fold-const-call.h"
 #include "case-cfn-macros.h"
-#include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
+#include "tm.h" /* For C[LT]Z_DEFINED_VALUE_AT_ZERO.  */
 #include "builtins.h"
 #include "gimple-expr.h"
 #include "tree-vector-builder.h"
@@ -1017,14 +1017,18 @@ fold_const_call_ss (wide_int *result, co
   switch (fn)
     {
     CASE_CFN_FFS:
+    case CFN_BUILT_IN_FFSG:
       *result = wi::shwi (wi::ffs (arg), precision);
       return true;
 
     CASE_CFN_CLZ:
+    case CFN_BUILT_IN_CLZG:
       {
 	int tmp;
 	if (wi::ne_p (arg, 0))
 	  tmp = wi::clz (arg);
+	else if (TREE_CODE (arg_type) == BITINT_TYPE)
+	  tmp = TYPE_PRECISION (arg_type);
 	else if (!CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (arg_type),
 					     tmp))
 	  tmp = TYPE_PRECISION (arg_type);
@@ -1033,10 +1037,13 @@ fold_const_call_ss (wide_int *result, co
       }
 
     CASE_CFN_CTZ:
+    case CFN_BUILT_IN_CTZG:
       {
 	int tmp;
 	if (wi::ne_p (arg, 0))
 	  tmp = wi::ctz (arg);
+	else if (TREE_CODE (arg_type) == BITINT_TYPE)
+	  tmp = TYPE_PRECISION (arg_type);
 	else if (!CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (arg_type),
 					     tmp))
 	  tmp = TYPE_PRECISION (arg_type);
@@ -1045,14 +1052,17 @@ fold_const_call_ss (wide_int *result, co
       }
 
     CASE_CFN_CLRSB:
+    case CFN_BUILT_IN_CLRSBG:
       *result = wi::shwi (wi::clrsb (arg), precision);
       return true;
 
     CASE_CFN_POPCOUNT:
+    case CFN_BUILT_IN_POPCOUNTG:
       *result = wi::shwi (wi::popcount (arg), precision);
       return true;
 
     CASE_CFN_PARITY:
+    case CFN_BUILT_IN_PARITYG:
       *result = wi::shwi (wi::parity (arg), precision);
       return true;
 
@@ -1531,6 +1541,49 @@ fold_const_call_sss (real_value *result,
 
 /* Try to evaluate:
 
+      *RESULT = FN (ARG0, ARG1)
+
+   where ARG_TYPE is the type of ARG0 and PRECISION is the number of bits in
+   the result.  Return true on success.  */
+
+static bool
+fold_const_call_sss (wide_int *result, combined_fn fn,
+		     const wide_int_ref &arg0, const wide_int_ref &arg1,
+		     unsigned int precision, tree arg_type ATTRIBUTE_UNUSED)
+{
+  switch (fn)
+    {
+    case CFN_CLZ:
+    case CFN_BUILT_IN_CLZG:
+      {
+	int tmp;
+	if (wi::ne_p (arg0, 0))
+	  tmp = wi::clz (arg0);
+	else
+	  tmp = arg1.to_shwi ();
+	*result = wi::shwi (tmp, precision);
+	return true;
+      }
+
+    case CFN_CTZ:
+    case CFN_BUILT_IN_CTZG:
+      {
+	int tmp;
+	if (wi::ne_p (arg0, 0))
+	  tmp = wi::ctz (arg0);
+	else
+	  tmp = arg1.to_shwi ();
+	*result = wi::shwi (tmp, precision);
+	return true;
+      }
+
+    default:
+      return false;
+    }
+}
+
+/* Try to evaluate:
+
       RESULT = fn (ARG0, ARG1)
 
    where FORMAT is the format of the real and imaginary parts of RESULT
@@ -1565,6 +1618,19 @@ fold_const_call_1 (combined_fn fn, tree
   machine_mode arg0_mode = TYPE_MODE (TREE_TYPE (arg0));
   machine_mode arg1_mode = TYPE_MODE (TREE_TYPE (arg1));
 
+  if (integer_cst_p (arg0) && integer_cst_p (arg1))
+    {
+      if (SCALAR_INT_MODE_P (mode))
+	{
+	  wide_int result;
+	  if (fold_const_call_sss (&result, fn, wi::to_wide (arg0),
+				   wi::to_wide (arg1), TYPE_PRECISION (type),
+				   TREE_TYPE (arg0)))
+	    return wide_int_to_tree (type, result);
+	}
+      return NULL_TREE;
+    }
+
   if (mode == arg0_mode
       && real_cst_p (arg0)
       && real_cst_p (arg1))
--- gcc/genmatch.cc.jj	2023-11-09 09:03:53.375900973 +0100
+++ gcc/genmatch.cc	2023-11-09 09:17:40.234182427 +0100
@@ -1895,8 +1895,14 @@ cmp_operand (operand *o1, operand *o2)
     {
       expr *e1 = static_cast<expr *>(o1);
       expr *e2 = static_cast<expr *>(o2);
-      return (e1->operation == e2->operation
-	      && e1->is_generic == e2->is_generic);
+      if (e1->operation != e2->operation
+	  || e1->is_generic != e2->is_generic)
+	return false;
+      if (e1->operation->kind == id_base::FN
+	  /* For function calls also compare number of arguments.  */
+	  && e1->ops.length () != e2->ops.length ())
+	return false;
+      return true;
     }
   else
     return false;
@@ -3070,6 +3076,26 @@ dt_operand::gen_generic_expr (FILE *f, i
   return 0;
 }
 
+/* Compare 2 fns or generic_fns vector entries for vector sorting.
+   Same operation entries with different number of arguments should
+   be adjacent.  */
+
+static int
+fns_cmp (const void *p1, const void *p2)
+{
+  dt_operand *op1 = *(dt_operand *const *) p1;
+  dt_operand *op2 = *(dt_operand *const *) p2;
+  expr *e1 = as_a <expr *> (op1->op);
+  expr *e2 = as_a <expr *> (op2->op);
+  id_base *b1 = e1->operation;
+  id_base *b2 = e2->operation;
+  if (b1->hashval < b2->hashval)
+    return -1;
+  if (b1->hashval > b2->hashval)
+    return 1;
+  return strcmp (b1->id, b2->id);
+}
+
 /* Generate matching code for the children of the decision tree node.  */
 
 void
@@ -3143,6 +3169,8 @@ dt_node::gen_kids (FILE *f, int indent,
 	     Like DT_TRUE, DT_MATCH serves as a barrier as it can cause
 	     dependent matches to get out-of-order.  Generate code now
 	     for what we have collected sofar.  */
+	  fns.qsort (fns_cmp);
+	  generic_fns.qsort (fns_cmp);
 	  gen_kids_1 (f, indent, gimple, depth, gimple_exprs, generic_exprs,
 		      fns, generic_fns, preds, others);
 	  /* And output the true operand itself.  */
@@ -3159,6 +3187,8 @@ dt_node::gen_kids (FILE *f, int indent,
     }
 
   /* Generate code for the remains.  */
+  fns.qsort (fns_cmp);
+  generic_fns.qsort (fns_cmp);
   gen_kids_1 (f, indent, gimple, depth, gimple_exprs, generic_exprs,
 	      fns, generic_fns, preds, others);
 }
@@ -3256,14 +3286,21 @@ dt_node::gen_kids_1 (FILE *f, int indent
 
 	  indent += 4;
 	  fprintf_indent (f, indent, "{\n");
+	  id_base *last_op = NULL;
 	  for (unsigned i = 0; i < fns_len; ++i)
 	    {
 	      expr *e = as_a <expr *>(fns[i]->op);
-	      if (user_id *u = dyn_cast <user_id *> (e->operation))
-		for (auto id : u->substitutes)
-		  fprintf_indent (f, indent, "case %s:\n", id->id);
-	      else
-		fprintf_indent (f, indent, "case %s:\n", e->operation->id);
+	      if (e->operation != last_op)
+		{
+		  if (i)
+		    fprintf_indent (f, indent, "  break;\n");
+		  if (user_id *u = dyn_cast <user_id *> (e->operation))
+		    for (auto id : u->substitutes)
+		      fprintf_indent (f, indent, "case %s:\n", id->id);
+		  else
+		    fprintf_indent (f, indent, "case %s:\n", e->operation->id);
+		}
+	      last_op = e->operation;
 	      /* We need to be defensive against bogus prototypes allowing
 		 calls with not enough arguments.  */
 	      fprintf_indent (f, indent,
@@ -3272,9 +3309,9 @@ dt_node::gen_kids_1 (FILE *f, int indent
 	      fprintf_indent (f, indent, "    {\n");
 	      fns[i]->gen (f, indent + 6, true, depth);
 	      fprintf_indent (f, indent, "    }\n");
-	      fprintf_indent (f, indent, "  break;\n");
 	    }
 
+	  fprintf_indent (f, indent, "  break;\n");
 	  fprintf_indent (f, indent, "default:;\n");
 	  fprintf_indent (f, indent, "}\n");
 	  indent -= 4;
@@ -3334,18 +3371,25 @@ dt_node::gen_kids_1 (FILE *f, int indent
 		      "    {\n");
       indent += 4;
 
+      id_base *last_op = NULL;
       for (unsigned j = 0; j < generic_fns.length (); ++j)
 	{
 	  expr *e = as_a <expr *>(generic_fns[j]->op);
 	  gcc_assert (e->operation->kind == id_base::FN);
 
-	  fprintf_indent (f, indent, "case %s:\n", e->operation->id);
+	  if (e->operation != last_op)
+	    {
+	      if (j)
+		fprintf_indent (f, indent, "  break;\n");
+	      fprintf_indent (f, indent, "case %s:\n", e->operation->id);
+	    }
+	  last_op = e->operation;
 	  fprintf_indent (f, indent, "  if (call_expr_nargs (%s) == %d)\n"
 				     "    {\n", kid_opname, e->ops.length ());
 	  generic_fns[j]->gen (f, indent + 6, false, depth);
-	  fprintf_indent (f, indent, "    }\n"
-				     "  break;\n");
+	  fprintf_indent (f, indent, "    }\n");
 	}
+      fprintf_indent (f, indent, "  break;\n");
       fprintf_indent (f, indent, "default:;\n");
 
       indent -= 4;
--- gcc/match.pd.jj	2023-11-09 09:03:53.490899344 +0100
+++ gcc/match.pd	2023-11-09 09:17:40.231182469 +0100
@@ -8532,31 +8532,34 @@ (define_operator_list SYNC_FETCH_AND_AND
    (op (clz:s@2 @0) INTEGER_CST@1)
    (if (integer_zerop (@1) && single_use (@2))
     /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
-    (with { tree type0 = TREE_TYPE (@0);
-	    tree stype = signed_type_for (type0);
-	    HOST_WIDE_INT val = 0;
-	    /* Punt on hypothetical weird targets.  */
-	    if (clz == CFN_CLZ
-		&& CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
-					      val) == 2
-		&& val == 0)
-	      stype = NULL_TREE;
-	  }
-     (if (stype)
-      (cmp (convert:stype @0) { build_zero_cst (stype); })))
+    (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
+     (cmp (convert:stype @0) { build_zero_cst (stype); }))
     /* clz(X) == (prec-1) is X == 1 and clz(X) != (prec-1) is X != 1.  */
-    (with { bool ok = true;
-	    HOST_WIDE_INT val = 0;
-	    tree type0 = TREE_TYPE (@0);
-	    /* Punt on hypothetical weird targets.  */
-	    if (clz == CFN_CLZ
-		&& CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
-					      val) == 2
-		&& val == TYPE_PRECISION (type0) - 1)
-	      ok = false;
-	  }
-     (if (ok && wi::to_wide (@1) == (TYPE_PRECISION (type0) - 1))
-      (op @0 { build_one_cst (type0); })))))))
+    (if (wi::to_wide (@1) == TYPE_PRECISION (TREE_TYPE (@0)) - 1)
+     (op @0 { build_one_cst (TREE_TYPE (@0)); }))))))
+(for op (eq ne)
+     cmp (lt ge)
+ (simplify
+  (op (IFN_CLZ:s@2 @0 @3) INTEGER_CST@1)
+  (if (integer_zerop (@1) && single_use (@2))
+   /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
+   (with { tree type0 = TREE_TYPE (@0);
+	   tree stype = signed_type_for (TREE_TYPE (@0));
+	   /* Punt if clz(0) == 0.  */
+	   if (integer_zerop (@3))
+	     stype = NULL_TREE;
+	 }
+    (if (stype)
+     (cmp (convert:stype @0) { build_zero_cst (stype); })))
+   /* clz(X) == (prec-1) is X == 1 and clz(X) != (prec-1) is X != 1.  */
+   (with { bool ok = true;
+	   tree type0 = TREE_TYPE (@0);
+	   /* Punt if clz(0) == prec - 1.  */
+	   if (wi::to_widest (@3) == TYPE_PRECISION (type0) - 1)
+	     ok = false;
+	 }
+    (if (ok && wi::to_wide (@1) == (TYPE_PRECISION (type0) - 1))
+     (op @0 { build_one_cst (type0); }))))))
 
 /* CTZ simplifications.  */
 (for ctz (CTZ)
@@ -8581,22 +8584,14 @@ (define_operator_list SYNC_FETCH_AND_AND
 		      val++;
 		  }
 	      }
-	    bool zero_res = false;
-	    HOST_WIDE_INT zero_val = 0;
 	    tree type0 = TREE_TYPE (@0);
 	    int prec = TYPE_PRECISION (type0);
-	    if (ctz == CFN_CTZ
-		&& CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
-					      zero_val) == 2)
-	      zero_res = true;
 	  }
-     (if (val <= 0)
-      (if (ok && (!zero_res || zero_val >= val))
-       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); })
-      (if (val >= prec)
-       (if (ok && (!zero_res || zero_val < val))
-	{ constant_boolean_node (cmp == EQ_EXPR ? false : true, type); })
-       (if (ok && (!zero_res || zero_val < 0 || zero_val >= prec))
+     (if (ok && prec <= MAX_FIXED_MODE_SIZE)
+      (if (val <= 0)
+       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); }
+       (if (val >= prec)
+	{ constant_boolean_node (cmp == EQ_EXPR ? false : true, type); }
 	(cmp (bit_and @0 { wide_int_to_tree (type0,
 					     wi::mask (val, false, prec)); })
 	     { build_zero_cst (type0); })))))))
@@ -8604,19 +8599,68 @@ (define_operator_list SYNC_FETCH_AND_AND
   (simplify
    /* __builtin_ctz (x) == C -> (x & ((1 << (C + 1)) - 1)) == (1 << C).  */
    (op (ctz:s @0) INTEGER_CST@1)
-    (with { bool zero_res = false;
-	    HOST_WIDE_INT zero_val = 0;
-	    tree type0 = TREE_TYPE (@0);
+    (with { tree type0 = TREE_TYPE (@0);
 	    int prec = TYPE_PRECISION (type0);
-	    if (ctz == CFN_CTZ
-		&& CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
-					      zero_val) == 2)
-	      zero_res = true;
 	  }
+     (if (prec <= MAX_FIXED_MODE_SIZE)
+      (if (tree_int_cst_sgn (@1) < 0 || wi::to_widest (@1) >= prec)
+       { constant_boolean_node (op == EQ_EXPR ? false : true, type); }
+       (op (bit_and @0 { wide_int_to_tree (type0,
+					   wi::mask (tree_to_uhwi (@1) + 1,
+						     false, prec)); })
+	   { wide_int_to_tree (type0,
+			       wi::shifted_mask (tree_to_uhwi (@1), 1,
+						 false, prec)); })))))))
+(for op (ge gt le lt)
+     cmp (eq eq ne ne)
+ (simplify
+  /* __builtin_ctz (x) >= C -> (x & ((1 << C) - 1)) == 0.  */
+  (op (IFN_CTZ:s @0 @2) INTEGER_CST@1)
+   (with { bool ok = true;
+	   HOST_WIDE_INT val = 0;
+	   if (!tree_fits_shwi_p (@1))
+	     ok = false;
+	   else
+	     {
+	       val = tree_to_shwi (@1);
+	       /* Canonicalize to >= or <.  */
+	       if (op == GT_EXPR || op == LE_EXPR)
+		 {
+		   if (val == HOST_WIDE_INT_MAX)
+		     ok = false;
+		   else
+		     val++;
+		 }
+	     }
+	   HOST_WIDE_INT zero_val = tree_to_shwi (@2);
+	   tree type0 = TREE_TYPE (@0);
+	   int prec = TYPE_PRECISION (type0);
+	   if (prec > MAX_FIXED_MODE_SIZE)
+	     ok = false;
+	  }
+     (if (val <= 0)
+      (if (ok && zero_val >= val)
+       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); })
+      (if (val >= prec)
+       (if (ok && zero_val < val)
+	{ constant_boolean_node (cmp == EQ_EXPR ? false : true, type); })
+       (if (ok && (zero_val < 0 || zero_val >= prec))
+	(cmp (bit_and @0 { wide_int_to_tree (type0,
+					     wi::mask (val, false, prec)); })
+	     { build_zero_cst (type0); })))))))
+(for op (eq ne)
+ (simplify
+  /* __builtin_ctz (x) == C -> (x & ((1 << (C + 1)) - 1)) == (1 << C).  */
+  (op (IFN_CTZ:s @0 @2) INTEGER_CST@1)
+   (with { HOST_WIDE_INT zero_val = tree_to_shwi (@2);
+	   tree type0 = TREE_TYPE (@0);
+	   int prec = TYPE_PRECISION (type0);
+	 }
+    (if (prec <= MAX_FIXED_MODE_SIZE)
      (if (tree_int_cst_sgn (@1) < 0 || wi::to_widest (@1) >= prec)
-      (if (!zero_res || zero_val != wi::to_widest (@1))
+      (if (zero_val != wi::to_widest (@1))
        { constant_boolean_node (op == EQ_EXPR ? false : true, type); })
-      (if (!zero_res || zero_val < 0 || zero_val >= prec)
+      (if (zero_val < 0 || zero_val >= prec)
        (op (bit_and @0 { wide_int_to_tree (type0,
 					   wi::mask (tree_to_uhwi (@1) + 1,
 						     false, prec)); })
@@ -8753,13 +8797,38 @@ (define_operator_list SYNC_FETCH_AND_AND
   (cond (ne @0 integer_zerop@1) (func (convert?@3 @0)) INTEGER_CST@2)
   (with { int val;
 	  internal_fn ifn = IFN_LAST;
-	  if (direct_internal_fn_supported_p (IFN_CLZ, type, OPTIMIZE_FOR_BOTH)
-	      && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
-					    val) == 2)
+	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
+	    {
+	      if (tree_fits_shwi_p (@2))
+		{
+		  HOST_WIDE_INT valw = tree_to_shwi (@2);
+		  if ((int) valw == valw)
+		    {
+		      val = valw;
+		      ifn = IFN_CLZ;
+		    }
+		}
+	    }
+	  else if (direct_internal_fn_supported_p (IFN_CLZ, TREE_TYPE (@3),
+						   OPTIMIZE_FOR_BOTH)
+		   && CLZ_DEFINED_VALUE_AT_ZERO
+			(SCALAR_INT_TYPE_MODE (TREE_TYPE (@3)), val) == 2)
 	    ifn = IFN_CLZ;
 	}
    (if (ifn == IFN_CLZ && wi::to_widest (@2) == val)
-    (IFN_CLZ @3)))))
+    (IFN_CLZ @3 @2)))))
+(simplify
+ (cond (ne @0 integer_zerop@1) (IFN_CLZ (convert?@3 @0) INTEGER_CST@2) @2)
+  (with { int val;
+	  internal_fn ifn = IFN_LAST;
+	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
+	    ifn = IFN_CLZ;
+	  else if (direct_internal_fn_supported_p (IFN_CLZ, TREE_TYPE (@3),
+						   OPTIMIZE_FOR_BOTH))
+	    ifn = IFN_CLZ;
+	}
+   (if (ifn == IFN_CLZ)
+    (IFN_CLZ @3 @2))))
 
 /* a != 0 ? CTZ(a) : CST -> .CTZ(a) where CST is the result of the internal function for 0. */
 (for func (CTZ)
@@ -8767,13 +8836,38 @@ (define_operator_list SYNC_FETCH_AND_AND
   (cond (ne @0 integer_zerop@1) (func (convert?@3 @0)) INTEGER_CST@2)
   (with { int val;
 	  internal_fn ifn = IFN_LAST;
-	  if (direct_internal_fn_supported_p (IFN_CTZ, type, OPTIMIZE_FOR_BOTH)
-	      && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
-					    val) == 2)
+	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
+	    {
+	      if (tree_fits_shwi_p (@2))
+		{
+		  HOST_WIDE_INT valw = tree_to_shwi (@2);
+		  if ((int) valw == valw)
+		    {
+		      val = valw;
+		      ifn = IFN_CTZ;
+		    }
+		}
+	    }
+	  else if (direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@3),
+						   OPTIMIZE_FOR_BOTH)
+		   && CTZ_DEFINED_VALUE_AT_ZERO
+			(SCALAR_INT_TYPE_MODE (TREE_TYPE (@3)), val) == 2)
 	    ifn = IFN_CTZ;
 	}
    (if (ifn == IFN_CTZ && wi::to_widest (@2) == val)
-    (IFN_CTZ @3)))))
+    (IFN_CTZ @3 @2)))))
+(simplify
+ (cond (ne @0 integer_zerop@1) (IFN_CTZ (convert?@3 @0) INTEGER_CST@2) @2)
+  (with { int val;
+	  internal_fn ifn = IFN_LAST;
+	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
+	    ifn = IFN_CTZ;
+	  else if (direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@3),
+						   OPTIMIZE_FOR_BOTH))
+	    ifn = IFN_CTZ;
+	}
+   (if (ifn == IFN_CTZ)
+    (IFN_CTZ @3 @2))))
 #endif
 
 /* Common POPCOUNT/PARITY simplifications.  */
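
For readers following the match.pd part above, the CLZ/CTZ comparison folds
can be checked against a small reference model.  This is only a sketch under
stated assumptions: clz32, ctz32, low_mask and check_folds are hypothetical
helper names, the precision is fixed at 32 bits, and zero_val models the
second argument of the two-argument .CLZ/.CTZ form.  It is not code from
the patch.

```c
#include <assert.h>
#include <stdint.h>

enum { PREC = 32 };

/* Reference model of .CLZ (x, zero_val): zero_val is returned for x == 0.  */
static int clz32 (uint32_t x, int zero_val)
{
  if (x == 0)
    return zero_val;
  int n = 0;
  while (!(x & 0x80000000u))
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Reference model of .CTZ (x, zero_val).  */
static int ctz32 (uint32_t x, int zero_val)
{
  if (x == 0)
    return zero_val;
  int n = 0;
  while (!(x & 1u))
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Mask with the low N bits set, for 0 <= N <= PREC.  */
static uint32_t low_mask (int n)
{
  return n >= PREC ? UINT32_MAX : (((uint32_t) 1 << n) - 1);
}

/* Return 1 if every folded form agrees with the reference model for X.  */
static int check_folds (uint32_t x, int zero_val)
{
  /* clz (X) == 0  is  (int) X < 0.  */
  if ((clz32 (x, zero_val) == 0) != ((int32_t) x < 0))
    return 0;
  /* clz (X) == prec - 1  is  X == 1.  */
  if ((clz32 (x, zero_val) == PREC - 1) != (x == 1))
    return 0;
  for (int c = 0; c < PREC; c++)
    {
      /* ctz (x) >= C  is  (x & ((1 << C) - 1)) == 0, for 0 < C < prec.  */
      if (c > 0
	  && (ctz32 (x, zero_val) >= c) != ((x & low_mask (c)) == 0))
	return 0;
      /* ctz (x) == C  is  (x & ((1 << (C + 1)) - 1)) == (1 << C).  */
      if ((ctz32 (x, zero_val) == c)
	  != ((x & low_mask (c + 1)) == ((uint32_t) 1 << c)))
	return 0;
    }
  return 1;
}
```

With zero_val == PREC the folded forms agree with the model even for x == 0.
With zero_val == 0 the first fold disagrees at x == 0, which is the case the
"Punt if clz(0) == 0" check guards against.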
--- gcc/gimple-lower-bitint.cc.jj	2023-11-09 09:03:53.423900293 +0100
+++ gcc/gimple-lower-bitint.cc	2023-11-09 09:17:40.242182314 +0100
@@ -427,6 +427,7 @@ struct bitint_large_huge
   void lower_mul_overflow (tree, gimple *);
   void lower_cplxpart_stmt (tree, gimple *);
   void lower_complexexpr_stmt (gimple *);
+  void lower_bit_query (gimple *);
   void lower_call (tree, gimple *);
   void lower_asm (gimple *);
   void lower_stmt (gimple *);
@@ -4455,6 +4456,524 @@ bitint_large_huge::lower_complexexpr_stm
   insert_before (g);
 }
 
+/* Lower a .{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT} call with one large/huge _BitInt
+   argument.  */
+
+void
+bitint_large_huge::lower_bit_query (gimple *stmt)
+{
+  tree arg0 = gimple_call_arg (stmt, 0);
+  tree arg1 = (gimple_call_num_args (stmt) == 2
+	       ? gimple_call_arg (stmt, 1) : NULL_TREE);
+  tree lhs = gimple_call_lhs (stmt);
+  gimple *g;
+
+  if (!lhs)
+    {
+      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+      gsi_remove (&gsi, true);
+      return;
+    }
+  tree type = TREE_TYPE (arg0);
+  gcc_assert (TREE_CODE (type) == BITINT_TYPE);
+  bitint_prec_kind kind = bitint_precision_kind (type);
+  gcc_assert (kind >= bitint_prec_large);
+  enum internal_fn ifn = gimple_call_internal_fn (stmt);
+  enum built_in_function fcode = END_BUILTINS;
+  gcc_assert (TYPE_PRECISION (unsigned_type_node) == limb_prec
+	      || TYPE_PRECISION (long_unsigned_type_node) == limb_prec
+	      || TYPE_PRECISION (long_long_unsigned_type_node) == limb_prec);
+  switch (ifn)
+    {
+    case IFN_CLZ:
+      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_CLZ;
+      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_CLZL;
+      else
+	fcode = BUILT_IN_CLZLL;
+      break;
+    case IFN_FFS:
+      /* .FFS (X) is .CTZ (X, -1) + 1, though under the hood
+	 we don't add the addend at the end.  */
+      arg1 = integer_zero_node;
+      /* FALLTHRU */
+    case IFN_CTZ:
+      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_CTZ;
+      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_CTZL;
+      else
+	fcode = BUILT_IN_CTZLL;
+      m_upwards = true;
+      break;
+    case IFN_CLRSB:
+      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_CLRSB;
+      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_CLRSBL;
+      else
+	fcode = BUILT_IN_CLRSBLL;
+      break;
+    case IFN_PARITY:
+      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_PARITY;
+      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_PARITYL;
+      else
+	fcode = BUILT_IN_PARITYLL;
+      m_upwards = true;
+      break;
+    case IFN_POPCOUNT:
+      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_POPCOUNT;
+      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
+	fcode = BUILT_IN_POPCOUNTL;
+      else
+	fcode = BUILT_IN_POPCOUNTLL;
+      m_upwards = true;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  tree fndecl = builtin_decl_explicit (fcode), res = NULL_TREE;
+  unsigned cnt = 0, rem = 0, end = 0, prec = TYPE_PRECISION (type);
+  struct bq_details { edge e; tree val, addend; } *bqp = NULL;
+  basic_block edge_bb = NULL;
+  if (m_upwards)
+    {
+      tree idx = NULL_TREE, idx_first = NULL_TREE, idx_next = NULL_TREE;
+      if (kind == bitint_prec_large)
+	cnt = CEIL (prec, limb_prec);
+      else
+	{
+	  rem = (prec % (2 * limb_prec));
+	  end = (prec - rem) / limb_prec;
+	  cnt = 2 + CEIL (rem, limb_prec);
+	  idx = idx_first = create_loop (size_zero_node, &idx_next);
+	}
+
+      if (ifn == IFN_CTZ || ifn == IFN_FFS)
+	{
+	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+	  gsi_prev (&gsi);
+	  edge e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
+	  edge_bb = e->src;
+	  if (kind == bitint_prec_large)
+	    {
+	      m_gsi = gsi_last_bb (edge_bb);
+	      if (!gsi_end_p (m_gsi))
+		gsi_next (&m_gsi);
+	    }
+	  bqp = XALLOCAVEC (struct bq_details, cnt);
+	}
+      else
+	m_after_stmt = stmt;
+      if (kind != bitint_prec_large)
+	m_upwards_2limb = end;
+
+      for (unsigned i = 0; i < cnt; i++)
+	{
+	  m_data_cnt = 0;
+	  if (kind == bitint_prec_large)
+	    idx = size_int (i);
+	  else if (i >= 2)
+	    idx = size_int (end + (i > 2));
+
+	  tree rhs1 = handle_operand (arg0, idx);
+	  if (!useless_type_conversion_p (m_limb_type, TREE_TYPE (rhs1)))
+	    {
+	      if (!TYPE_UNSIGNED (TREE_TYPE (rhs1)))
+		rhs1 = add_cast (unsigned_type_for (TREE_TYPE (rhs1)), rhs1);
+	      rhs1 = add_cast (m_limb_type, rhs1);
+	    }
+
+	  tree in, out, tem;
+	  if (ifn == IFN_PARITY)
+	    in = prepare_data_in_out (build_zero_cst (m_limb_type), idx, &out);
+	  else if (ifn == IFN_FFS)
+	    in = prepare_data_in_out (integer_one_node, idx, &out);
+	  else
+	    in = prepare_data_in_out (integer_zero_node, idx, &out);
+
+	  switch (ifn)
+	    {
+	    case IFN_CTZ:
+	    case IFN_FFS:
+	      g = gimple_build_cond (NE_EXPR, rhs1,
+				     build_zero_cst (m_limb_type),
+				     NULL_TREE, NULL_TREE);
+	      insert_before (g);
+	      edge e1, e2;
+	      e1 = split_block (gsi_bb (m_gsi), g);
+	      e1->flags = EDGE_FALSE_VALUE;
+	      e2 = make_edge (e1->src, gimple_bb (stmt), EDGE_TRUE_VALUE);
+	      e1->probability = profile_probability::unlikely ();
+	      e2->probability = e1->probability.invert ();
+	      if (i == 0)
+		set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
+	      m_gsi = gsi_after_labels (e1->dest);
+	      bqp[i].e = e2;
+	      bqp[i].val = rhs1;
+	      if (tree_fits_uhwi_p (idx))
+		bqp[i].addend
+		  = build_int_cst (integer_type_node,
+				   tree_to_uhwi (idx) * limb_prec
+				   + (ifn == IFN_FFS));
+	      else
+		{
+		  bqp[i].addend = in;
+		  if (i == 1)
+		    res = out;
+		  else
+		    res = make_ssa_name (integer_type_node);
+		  g = gimple_build_assign (res, PLUS_EXPR, in,
+					   build_int_cst (integer_type_node,
+							  limb_prec));
+		  insert_before (g);
+		  m_data[m_data_cnt] = res;
+		}
+	      break;
+	    case IFN_PARITY:
+	      if (!integer_zerop (in))
+		{
+		  if (kind == bitint_prec_huge && i == 1)
+		    res = out;
+		  else
+		    res = make_ssa_name (m_limb_type);
+		  g = gimple_build_assign (res, BIT_XOR_EXPR, in, rhs1);
+		  insert_before (g);
+		}
+	      else
+		res = rhs1;
+	      m_data[m_data_cnt] = res;
+	      break;
+	    case IFN_POPCOUNT:
+	      g = gimple_build_call (fndecl, 1, rhs1);
+	      tem = make_ssa_name (integer_type_node);
+	      gimple_call_set_lhs (g, tem);
+	      insert_before (g);
+	      if (!integer_zerop (in))
+		{
+		  if (kind == bitint_prec_huge && i == 1)
+		    res = out;
+		  else
+		    res = make_ssa_name (integer_type_node);
+		  g = gimple_build_assign (res, PLUS_EXPR, in, tem);
+		  insert_before (g);
+		}
+	      else
+		res = tem;
+	      m_data[m_data_cnt] = res;
+	      break;
+	    default:
+	      gcc_unreachable ();
+	    }
+
+	  m_first = false;
+	  if (kind == bitint_prec_huge && i <= 1)
+	    {
+	      if (i == 0)
+		{
+		  idx = make_ssa_name (sizetype);
+		  g = gimple_build_assign (idx, PLUS_EXPR, idx_first,
+					   size_one_node);
+		  insert_before (g);
+		}
+	      else
+		{
+		  g = gimple_build_assign (idx_next, PLUS_EXPR, idx_first,
+					   size_int (2));
+		  insert_before (g);
+		  g = gimple_build_cond (NE_EXPR, idx_next, size_int (end),
+					 NULL_TREE, NULL_TREE);
+		  insert_before (g);
+		  if (ifn == IFN_CTZ || ifn == IFN_FFS)
+		    m_gsi = gsi_after_labels (edge_bb);
+		  else
+		    m_gsi = gsi_for_stmt (stmt);
+		}
+	    }
+	}
+    }
+  else
+    {
+      tree idx = NULL_TREE, idx_next = NULL_TREE, first = NULL_TREE;
+      int sub_one = 0;
+      if (kind == bitint_prec_large)
+	cnt = CEIL (prec, limb_prec);
+      else
+	{
+	  rem = prec % limb_prec;
+	  if (rem == 0 && (!TYPE_UNSIGNED (type) || ifn == IFN_CLRSB))
+	    rem = limb_prec;
+	  end = (prec - rem) / limb_prec;
+	  cnt = 1 + (rem != 0);
+	  if (ifn == IFN_CLRSB)
+	    sub_one = 1;
+	}
+
+      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+      gsi_prev (&gsi);
+      edge e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
+      edge_bb = e->src;
+      m_gsi = gsi_last_bb (edge_bb);
+      if (!gsi_end_p (m_gsi))
+	gsi_next (&m_gsi);
+
+      if (ifn == IFN_CLZ)
+	bqp = XALLOCAVEC (struct bq_details, cnt);
+      else
+	{
+	  gsi = gsi_for_stmt (stmt);
+	  gsi_prev (&gsi);
+	  e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
+	  edge_bb = e->src;
+	  bqp = XALLOCAVEC (struct bq_details, 2 * cnt);
+	}
+
+      for (unsigned i = 0; i < cnt; i++)
+	{
+	  m_data_cnt = 0;
+	  if (kind == bitint_prec_large)
+	    idx = size_int (cnt - i - 1);
+	  else if (i == cnt - 1)
+	    idx = create_loop (size_int (end - 1), &idx_next);
+	  else
+	    idx = size_int (end);
+
+	  tree rhs1 = handle_operand (arg0, idx);
+	  if (!useless_type_conversion_p (m_limb_type, TREE_TYPE (rhs1)))
+	    {
+	      if (ifn == IFN_CLZ && !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
+		rhs1 = add_cast (unsigned_type_for (TREE_TYPE (rhs1)), rhs1);
+	      else if (ifn == IFN_CLRSB && TYPE_UNSIGNED (TREE_TYPE (rhs1)))
+		rhs1 = add_cast (signed_type_for (TREE_TYPE (rhs1)), rhs1);
+	      rhs1 = add_cast (m_limb_type, rhs1);
+	    }
+
+	  if (ifn == IFN_CLZ)
+	    {
+	      g = gimple_build_cond (NE_EXPR, rhs1,
+				     build_zero_cst (m_limb_type),
+				     NULL_TREE, NULL_TREE);
+	      insert_before (g);
+	      edge e1 = split_block (gsi_bb (m_gsi), g);
+	      e1->flags = EDGE_FALSE_VALUE;
+	      edge e2 = make_edge (e1->src, gimple_bb (stmt), EDGE_TRUE_VALUE);
+	      e1->probability = profile_probability::unlikely ();
+	      e2->probability = e1->probability.invert ();
+	      if (i == 0)
+		set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
+	      m_gsi = gsi_after_labels (e1->dest);
+	      bqp[i].e = e2;
+	      bqp[i].val = rhs1;
+	    }
+	  else
+	    {
+	      if (i == 0)
+		{
+		  first = rhs1;
+		  g = gimple_build_assign (make_ssa_name (m_limb_type),
+					   PLUS_EXPR, rhs1,
+					   build_int_cst (m_limb_type, 1));
+		  insert_before (g);
+		  g = gimple_build_cond (GT_EXPR, gimple_assign_lhs (g),
+					 build_int_cst (m_limb_type, 1),
+					 NULL_TREE, NULL_TREE);
+		  insert_before (g);
+		}
+	      else
+		{
+		  g = gimple_build_assign (make_ssa_name (m_limb_type),
+					   BIT_XOR_EXPR, rhs1, first);
+		  insert_before (g);
+		  tree stype = signed_type_for (m_limb_type);
+		  g = gimple_build_cond (LT_EXPR,
+					 add_cast (stype,
+						   gimple_assign_lhs (g)),
+					 build_zero_cst (stype),
+					 NULL_TREE, NULL_TREE);
+		  insert_before (g);
+		  edge e1 = split_block (gsi_bb (m_gsi), g);
+		  e1->flags = EDGE_FALSE_VALUE;
+		  edge e2 = make_edge (e1->src, gimple_bb (stmt),
+				       EDGE_TRUE_VALUE);
+		  e1->probability = profile_probability::unlikely ();
+		  e2->probability = e1->probability.invert ();
+		  if (i == 1)
+		    set_immediate_dominator (CDI_DOMINATORS, e2->dest,
+					     e2->src);
+		  m_gsi = gsi_after_labels (e1->dest);
+		  bqp[2 * i].e = e2;
+		  g = gimple_build_cond (NE_EXPR, rhs1, first,
+					 NULL_TREE, NULL_TREE);
+		  insert_before (g);
+		}
+	      edge e1 = split_block (gsi_bb (m_gsi), g);
+	      e1->flags = EDGE_FALSE_VALUE;
+	      edge e2 = make_edge (e1->src, edge_bb, EDGE_TRUE_VALUE);
+	      e1->probability = profile_probability::unlikely ();
+	      e2->probability = e1->probability.invert ();
+	      if (i == 0)
+		set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
+	      m_gsi = gsi_after_labels (e1->dest);
+	      bqp[2 * i + 1].e = e2;
+	      bqp[i].val = rhs1;
+	    }
+	  if (tree_fits_uhwi_p (idx))
+	    bqp[i].addend
+	      = build_int_cst (integer_type_node,
+			       (int) prec
+			       - (((int) tree_to_uhwi (idx) + 1)
+				  * limb_prec) - sub_one);
+	  else
+	    {
+	      tree in, out;
+	      in = build_int_cst (integer_type_node, rem - sub_one);
+	      m_first = true;
+	      in = prepare_data_in_out (in, idx, &out);
+	      out = m_data[m_data_cnt + 1];
+	      bqp[i].addend = in;
+	      g = gimple_build_assign (out, PLUS_EXPR, in,
+				       build_int_cst (integer_type_node,
+						      limb_prec));
+	      insert_before (g);
+	      m_data[m_data_cnt] = out;
+	    }
+
+	  m_first = false;
+	  if (kind == bitint_prec_huge && i == cnt - 1)
+	    {
+	      g = gimple_build_assign (idx_next, PLUS_EXPR, idx,
+				       size_int (-1));
+	      insert_before (g);
+	      g = gimple_build_cond (NE_EXPR, idx, size_zero_node,
+				     NULL_TREE, NULL_TREE);
+	      insert_before (g);
+	      edge true_edge, false_edge;
+	      extract_true_false_edges_from_block (gsi_bb (m_gsi),
+						   &true_edge, &false_edge);
+	      m_gsi = gsi_after_labels (false_edge->dest);
+	    }
+	}
+    }
+  switch (ifn)
+    {
+    case IFN_CLZ:
+    case IFN_CTZ:
+    case IFN_FFS:
+      gphi *phi1, *phi2, *phi3;
+      basic_block bb;
+      bb = gsi_bb (m_gsi);
+      remove_edge (find_edge (bb, gimple_bb (stmt)));
+      phi1 = create_phi_node (make_ssa_name (m_limb_type),
+			      gimple_bb (stmt));
+      phi2 = create_phi_node (make_ssa_name (integer_type_node),
+			      gimple_bb (stmt));
+      for (unsigned i = 0; i < cnt; i++)
+	{
+	  add_phi_arg (phi1, bqp[i].val, bqp[i].e, UNKNOWN_LOCATION);
+	  add_phi_arg (phi2, bqp[i].addend, bqp[i].e, UNKNOWN_LOCATION);
+	}
+      if (arg1 == NULL_TREE)
+	{
+	  g = gimple_build_builtin_unreachable (m_loc);
+	  insert_before (g);
+	}
+      m_gsi = gsi_for_stmt (stmt);
+      g = gimple_build_call (fndecl, 1, gimple_phi_result (phi1));
+      gimple_call_set_lhs (g, make_ssa_name (integer_type_node));
+      insert_before (g);
+      if (arg1 == NULL_TREE)
+	g = gimple_build_assign (lhs, PLUS_EXPR,
+				 gimple_phi_result (phi2),
+				 gimple_call_lhs (g));
+      else
+	{
+	  g = gimple_build_assign (make_ssa_name (integer_type_node),
+				   PLUS_EXPR, gimple_phi_result (phi2),
+				   gimple_call_lhs (g));
+	  insert_before (g);
+	  edge e1 = split_block (gimple_bb (stmt), g);
+	  edge e2 = make_edge (bb, e1->dest, EDGE_FALLTHRU);
+	  e2->probability = profile_probability::always ();
+	  set_immediate_dominator (CDI_DOMINATORS, e1->dest,
+				   get_immediate_dominator (CDI_DOMINATORS,
+							    e1->src));
+	  phi3 = create_phi_node (make_ssa_name (integer_type_node), e1->dest);
+	  add_phi_arg (phi3, gimple_assign_lhs (g), e1, UNKNOWN_LOCATION);
+	  add_phi_arg (phi3, arg1, e2, UNKNOWN_LOCATION);
+	  m_gsi = gsi_for_stmt (stmt);
+	  g = gimple_build_assign (lhs, gimple_phi_result (phi3));
+	}
+      gsi_replace (&m_gsi, g, true);
+      break;
+    case IFN_CLRSB:
+      bb = gsi_bb (m_gsi);
+      remove_edge (find_edge (bb, edge_bb));
+      edge e;
+      e = make_edge (bb, gimple_bb (stmt), EDGE_FALLTHRU);
+      e->probability = profile_probability::always ();
+      set_immediate_dominator (CDI_DOMINATORS, gimple_bb (stmt),
+			       get_immediate_dominator (CDI_DOMINATORS,
+							edge_bb));
+      phi1 = create_phi_node (make_ssa_name (m_limb_type),
+			      edge_bb);
+      phi2 = create_phi_node (make_ssa_name (integer_type_node),
+			      edge_bb);
+      phi3 = create_phi_node (make_ssa_name (integer_type_node),
+			      gimple_bb (stmt));
+      for (unsigned i = 0; i < cnt; i++)
+	{
+	  add_phi_arg (phi1, bqp[i].val, bqp[2 * i + 1].e, UNKNOWN_LOCATION);
+	  add_phi_arg (phi2, bqp[i].addend, bqp[2 * i + 1].e,
+		       UNKNOWN_LOCATION);
+	  tree a = bqp[i].addend;
+	  if (i && kind == bitint_prec_large)
+	    a = int_const_binop (PLUS_EXPR, a, integer_minus_one_node);
+	  if (i)
+	    add_phi_arg (phi3, a, bqp[2 * i].e, UNKNOWN_LOCATION);
+	}
+      add_phi_arg (phi3, build_int_cst (integer_type_node, prec - 1), e,
+		   UNKNOWN_LOCATION);
+      m_gsi = gsi_after_labels (edge_bb);
+      g = gimple_build_call (fndecl, 1,
+			     add_cast (signed_type_for (m_limb_type),
+				       gimple_phi_result (phi1)));
+      gimple_call_set_lhs (g, make_ssa_name (integer_type_node));
+      insert_before (g);
+      g = gimple_build_assign (make_ssa_name (integer_type_node),
+			       PLUS_EXPR, gimple_call_lhs (g),
+			       gimple_phi_result (phi2));
+      insert_before (g);
+      if (kind != bitint_prec_large)
+	{
+	  g = gimple_build_assign (make_ssa_name (integer_type_node),
+				   PLUS_EXPR, gimple_assign_lhs (g),
+				   integer_one_node);
+	  insert_before (g);
+	}
+      add_phi_arg (phi3, gimple_assign_lhs (g),
+		   find_edge (edge_bb, gimple_bb (stmt)), UNKNOWN_LOCATION);
+      m_gsi = gsi_for_stmt (stmt);
+      g = gimple_build_assign (lhs, gimple_phi_result (phi3));
+      gsi_replace (&m_gsi, g, true);
+      break;
+    case IFN_PARITY:
+      g = gimple_build_call (fndecl, 1, res);
+      gimple_call_set_lhs (g, lhs);
+      gsi_replace (&m_gsi, g, true);
+      break;
+    case IFN_POPCOUNT:
+      g = gimple_build_assign (lhs, res);
+      gsi_replace (&m_gsi, g, true);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Lower a call statement with one or more large/huge _BitInt
    arguments or large/huge _BitInt return value.  */
 
@@ -4476,6 +4995,14 @@ bitint_large_huge::lower_call (tree obj,
       case IFN_UBSAN_CHECK_MUL:
 	lower_mul_overflow (obj, stmt);
 	return;
+      case IFN_CLZ:
+      case IFN_CTZ:
+      case IFN_CLRSB:
+      case IFN_FFS:
+      case IFN_PARITY:
+      case IFN_POPCOUNT:
+	lower_bit_query (stmt);
+	return;
       default:
 	break;
       }
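
As a rough illustration of what lower_bit_query above computes, the limb-wise
scheme can be modelled in plain C.  This is a sketch only, with several
assumptions: limb_t, LIMB_PREC, n_limbs and the *_limbs functions are
hypothetical names, limbs are stored least significant first, the value is
unsigned with zero padding bits above prec in the top limb, and the per-limb
work uses the existing __builtin_*ll functions.  The real lowering emits
GIMPLE with loops and PHI nodes instead.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long limb_t;
enum { LIMB_PREC = 64 };

static size_t n_limbs (unsigned prec)
{
  return (prec + LIMB_PREC - 1) / LIMB_PREC;
}

/* POPCOUNT: walk upwards, summing the per-limb popcounts.  */
static int popcount_limbs (const limb_t *x, unsigned prec)
{
  int res = 0;
  for (size_t i = 0; i < n_limbs (prec); i++)
    res += __builtin_popcountll (x[i]);
  return res;
}

/* PARITY: XOR all limbs together, then take one parity at the end.  */
static int parity_limbs (const limb_t *x, unsigned prec)
{
  limb_t acc = 0;
  for (size_t i = 0; i < n_limbs (prec); i++)
    acc ^= x[i];
  return __builtin_parityll (acc);
}

/* CTZ: walk upwards to the first non-zero limb; the addend is
   idx * limb_prec.  ZERO_VAL is returned if all limbs are zero.  */
static int ctz_limbs (const limb_t *x, unsigned prec, int zero_val)
{
  for (size_t i = 0; i < n_limbs (prec); i++)
    if (x[i] != 0)
      return (int) (i * LIMB_PREC) + __builtin_ctzll (x[i]);
  return zero_val;
}

/* CLZ: walk downwards to the first non-zero limb; the addend is
   prec - (idx + 1) * limb_prec, which is negative for a partial top
   limb and cancels the padding bits counted by the limb clz.  */
static int clz_limbs (const limb_t *x, unsigned prec, int zero_val)
{
  for (size_t i = n_limbs (prec); i-- > 0;)
    if (x[i] != 0)
      return (int) prec - (int) ((i + 1) * LIMB_PREC)
	     + __builtin_clzll (x[i]);
  return zero_val;
}
```

For a 135-bit value with limbs { 0, 0x10, 0x1 }, bits 68 and 128 are set, so
the sketch gives popcount 2, parity 0, ctz 68 and clz 134 - 128 = 6.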
--- gcc/gimple-range-op.cc.jj	2023-11-09 09:03:53.443900010 +0100
+++ gcc/gimple-range-op.cc	2023-11-09 09:17:40.233182441 +0100
@@ -908,39 +908,34 @@ public:
   cfn_clz (bool internal) { m_gimple_call_internal_p = internal; }
   using range_operator::fold_range;
   virtual bool fold_range (irange &r, tree type, const irange &lh,
-			   const irange &, relation_trio) const;
+			   const irange &rh, relation_trio) const;
 private:
   bool m_gimple_call_internal_p;
 } op_cfn_clz (false), op_cfn_clz_internal (true);
 
 bool
 cfn_clz::fold_range (irange &r, tree type, const irange &lh,
-		     const irange &, relation_trio) const
+		     const irange &rh, relation_trio) const
 {
   // __builtin_c[lt]z* return [0, prec-1], except when the
   // argument is 0, but that is undefined behavior.
   //
   // For __builtin_c[lt]z* consider argument of 0 always undefined
-  // behavior, for internal fns depending on C?Z_DEFINED_VALUE_AT_ZERO.
+  // behavior, for internal fns likewise, unless it has 2 arguments,
+  // then the second argument is the value at zero.
   if (lh.undefined_p ())
     return false;
   int prec = TYPE_PRECISION (lh.type ());
   int mini = 0;
   int maxi = prec - 1;
-  int zerov = 0;
-  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (lh.type ());
   if (m_gimple_call_internal_p)
     {
-      if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
-	  && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
-	{
-	  // Only handle the single common value.
-	  if (zerov == prec)
-	    maxi = prec;
-	  else
-	    // Magic value to give up, unless we can prove arg is non-zero.
-	    mini = -2;
-	}
+      // Only handle the single common value.
+      if (rh.lower_bound () == prec)
+	maxi = prec;
+      else
+	// Magic value to give up, unless we can prove arg is non-zero.
+	mini = -2;
     }
 
   // From clz of minimum we can compute result maximum.
@@ -985,37 +980,31 @@ public:
   cfn_ctz (bool internal) { m_gimple_call_internal_p = internal; }
   using range_operator::fold_range;
   virtual bool fold_range (irange &r, tree type, const irange &lh,
-			   const irange &, relation_trio) const;
+			   const irange &rh, relation_trio) const;
 private:
   bool m_gimple_call_internal_p;
 } op_cfn_ctz (false), op_cfn_ctz_internal (true);
 
 bool
 cfn_ctz::fold_range (irange &r, tree type, const irange &lh,
-		     const irange &, relation_trio) const
+		     const irange &rh, relation_trio) const
 {
   if (lh.undefined_p ())
     return false;
   int prec = TYPE_PRECISION (lh.type ());
   int mini = 0;
   int maxi = prec - 1;
-  int zerov = 0;
-  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (lh.type ());
 
   if (m_gimple_call_internal_p)
     {
-      if (optab_handler (ctz_optab, mode) != CODE_FOR_nothing
-	  && CTZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
-	{
-	  // Handle only the two common values.
-	  if (zerov == -1)
-	    mini = -1;
-	  else if (zerov == prec)
-	    maxi = prec;
-	  else
-	    // Magic value to give up, unless we can prove arg is non-zero.
-	    mini = -2;
-	}
+      // Handle only the two common values.
+      if (rh.lower_bound () == -1)
+	mini = -1;
+      else if (rh.lower_bound () == prec)
+	maxi = prec;
+      else
+	// Magic value to give up, unless we can prove arg is non-zero.
+	mini = -2;
     }
   // If arg is non-zero, then use [0, prec - 1].
   if (!range_includes_zero_p (&lh))
@@ -1288,16 +1277,24 @@ gimple_range_op_handler::maybe_builtin_c
 
     CASE_CFN_CLZ:
       m_op1 = gimple_call_arg (call, 0);
-      if (gimple_call_internal_p (call))
-	m_operator = &op_cfn_clz_internal;
+      if (gimple_call_internal_p (call)
+	  && gimple_call_num_args (call) == 2)
+	{
+	  m_op2 = gimple_call_arg (call, 1);
+	  m_operator = &op_cfn_clz_internal;
+	}
       else
 	m_operator = &op_cfn_clz;
       break;
 
     CASE_CFN_CTZ:
       m_op1 = gimple_call_arg (call, 0);
-      if (gimple_call_internal_p (call))
-	m_operator = &op_cfn_ctz_internal;
+      if (gimple_call_internal_p (call)
+	  && gimple_call_num_args (call) == 2)
+	{
+	  m_op2 = gimple_call_arg (call, 1);
+	  m_operator = &op_cfn_ctz_internal;
+	}
       else
 	m_operator = &op_cfn_ctz;
       break;
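
The initial result bounds that cfn_ctz::fold_range derives from the second
argument can be sketched as below.  This is a simplified model, not the
ranger code: struct int_range and ctz_result_range are hypothetical names,
the value at zero is passed as a plain int, and the later refinement from
the operand's own bounds is left out.

```c
#include <assert.h>

struct int_range
{
  int lo, hi;
  int known;	/* 0 means "give up", matching the -2 magic value.  */
};

/* Initial bounds for .CTZ (x, zero_val) on a PREC-bit operand.
   ARG_MAY_BE_ZERO says whether the operand range includes zero.  */
static struct int_range
ctz_result_range (int prec, int zero_val, int arg_may_be_zero)
{
  struct int_range r = { 0, prec - 1, 1 };
  /* Handle only the two common values at zero.  */
  if (zero_val == -1)
    r.lo = -1;
  else if (zero_val == prec)
    r.hi = prec;
  else
    r.known = 0;
  /* If the argument is known to be non-zero, use [0, prec - 1].  */
  if (!arg_may_be_zero)
    {
      r.lo = 0;
      r.hi = prec - 1;
      r.known = 1;
    }
  return r;
}
```

For example, a 32-bit operand with value 32 at zero yields [0, 32], and an
unusual value such as 7 yields a usable range only when the operand is known
to be non-zero.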
--- gcc/tree-vect-patterns.cc.jj	2023-11-09 09:03:53.675896723 +0100
+++ gcc/tree-vect-patterns.cc	2023-11-09 09:17:40.232182455 +0100
@@ -1818,7 +1818,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
   tree new_var;
   internal_fn ifn = IFN_LAST, ifnnew = IFN_LAST;
   bool defined_at_zero = true, defined_at_zero_new = false;
-  int val = 0, val_new = 0;
+  int val = 0, val_new = 0, val_cmp = 0;
   int prec;
   int sub = 0, add = 0;
   location_t loc;
@@ -1826,7 +1826,8 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
   if (!is_gimple_call (call_stmt))
     return NULL;
 
-  if (gimple_call_num_args (call_stmt) != 1)
+  if (gimple_call_num_args (call_stmt) != 1
+      && gimple_call_num_args (call_stmt) != 2)
     return NULL;
 
   rhs_oprnd = gimple_call_arg (call_stmt, 0);
@@ -1846,9 +1847,10 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
     CASE_CFN_CTZ:
       ifn = IFN_CTZ;
       if (!gimple_call_internal_p (call_stmt)
-	  || CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (rhs_type),
-					val) != 2)
+	  || gimple_call_num_args (call_stmt) != 2)
 	defined_at_zero = false;
+      else
+	val = tree_to_shwi (gimple_call_arg (call_stmt, 1));
       break;
     CASE_CFN_FFS:
       ifn = IFN_FFS;
@@ -1907,6 +1909,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
 
   vect_pattern_detected ("vec_recog_ctz_ffs_pattern", call_stmt);
 
+  val_cmp = val_new;
   if ((ifnnew == IFN_CLZ
        && defined_at_zero
        && defined_at_zero_new
@@ -1918,7 +1921,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
 	 .CTZ (X) = .POPCOUNT ((X - 1) & ~X).  */
       if (ifnnew == IFN_CLZ)
 	sub = prec;
-      val_new = prec;
+      val_cmp = prec;
 
       if (!TYPE_UNSIGNED (rhs_type))
 	{
@@ -1955,7 +1958,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
       /* .CTZ (X) = (PREC - 1) - .CLZ (X & -X)
 	 .FFS (X) = PREC - .CLZ (X & -X).  */
       sub = prec - (ifn == IFN_CTZ);
-      val_new = sub - val_new;
+      val_cmp = sub - val_new;
 
       tree neg = vect_recog_temp_ssa_var (rhs_type, NULL);
       pattern_stmt = gimple_build_assign (neg, NEGATE_EXPR, rhs_oprnd);
@@ -1974,7 +1977,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
       /* .CTZ (X) = PREC - .POPCOUNT (X | -X)
 	 .FFS (X) = (PREC + 1) - .POPCOUNT (X | -X).  */
       sub = prec + (ifn == IFN_FFS);
-      val_new = sub;
+      val_cmp = sub;
 
       tree neg = vect_recog_temp_ssa_var (rhs_type, NULL);
       pattern_stmt = gimple_build_assign (neg, NEGATE_EXPR, rhs_oprnd);
@@ -1992,12 +1995,18 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
     {
       /* .FFS (X) = .CTZ (X) + 1.  */
       add = 1;
-      val_new++;
+      val_cmp++;
     }
 
   /* Create B = .IFNNEW (A).  */
   new_var = vect_recog_temp_ssa_var (lhs_type, NULL);
-  pattern_stmt = gimple_build_call_internal (ifnnew, 1, rhs_oprnd);
+  if ((ifnnew == IFN_CLZ || ifnnew == IFN_CTZ) && defined_at_zero_new)
+    pattern_stmt
+      = gimple_build_call_internal (ifnnew, 2, rhs_oprnd,
+				    build_int_cst (integer_type_node,
+						   val_new));
+  else
+    pattern_stmt = gimple_build_call_internal (ifnnew, 1, rhs_oprnd);
   gimple_call_set_lhs (pattern_stmt, new_var);
   gimple_set_location (pattern_stmt, loc);
   *type_out = vec_type;
@@ -2023,7 +2032,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
     }
 
   if (defined_at_zero
-      && (!defined_at_zero_new || val != val_new))
+      && (!defined_at_zero_new || val != val_cmp))
     {
       append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt, vec_type);
       tree ret_var = vect_recog_temp_ssa_var (lhs_type, NULL);
@@ -2143,7 +2152,8 @@ vect_recog_popcount_clz_ctz_ffs_pattern
       return NULL;
     }
 
-  if (gimple_call_num_args (call_stmt) != 1)
+  if (gimple_call_num_args (call_stmt) != 1
+      && gimple_call_num_args (call_stmt) != 2)
     return NULL;
 
   rhs_oprnd = gimple_call_arg (call_stmt, 0);
@@ -2181,17 +2191,14 @@ vect_recog_popcount_clz_ctz_ffs_pattern
 	  return NULL;
 	addend = (TYPE_PRECISION (TREE_TYPE (rhs_oprnd))
 		  - TYPE_PRECISION (lhs_type));
-	if (gimple_call_internal_p (call_stmt))
+	if (gimple_call_internal_p (call_stmt)
+	    && gimple_call_num_args (call_stmt) == 2)
 	  {
 	    int val1, val2;
-	    int d1
-	      = CLZ_DEFINED_VALUE_AT_ZERO
-		  (SCALAR_INT_TYPE_MODE (TREE_TYPE (rhs_oprnd)), val1);
+	    val1 = tree_to_shwi (gimple_call_arg (call_stmt, 1));
 	    int d2
 	      = CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
 					   val2);
-	    if (d1 != 2)
-	      break;
 	    if (d2 != 2 || val1 != val2 + addend)
 	      return NULL;
 	  }
@@ -2200,17 +2207,14 @@ vect_recog_popcount_clz_ctz_ffs_pattern
 	/* ctzll (x) == ctz (x) for unsigned or signed x != 0, so ok
 	   if it is undefined at zero or if it matches also for the
 	   defined value there.  */
-	if (gimple_call_internal_p (call_stmt))
+	if (gimple_call_internal_p (call_stmt)
+	    && gimple_call_num_args (call_stmt) == 2)
 	  {
 	    int val1, val2;
-	    int d1
-	      = CTZ_DEFINED_VALUE_AT_ZERO
-		  (SCALAR_INT_TYPE_MODE (TREE_TYPE (rhs_oprnd)), val1);
+	    val1 = tree_to_shwi (gimple_call_arg (call_stmt, 1));
 	    int d2
 	      = CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
 					   val2);
-	    if (d1 != 2)
-	      break;
 	    if (d2 != 2 || val1 != val2)
 	      return NULL;
 	  }
@@ -2260,7 +2264,20 @@ vect_recog_popcount_clz_ctz_ffs_pattern
 
   /* Create B = .POPCOUNT (A).  */
   new_var = vect_recog_temp_ssa_var (lhs_type, NULL);
-  pattern_stmt = gimple_build_call_internal (ifn, 1, unprom_diff.op);
+  tree arg2 = NULL_TREE;
+  int val;
+  if (ifn == IFN_CLZ
+      && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
+				    val) == 2)
+    arg2 = build_int_cst (integer_type_node, val);
+  else if (ifn == IFN_CTZ
+	   && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
+					 val) == 2)
+    arg2 = build_int_cst (integer_type_node, val);
+  if (arg2)
+    pattern_stmt = gimple_build_call_internal (ifn, 2, unprom_diff.op, arg2);
+  else
+    pattern_stmt = gimple_build_call_internal (ifn, 1, unprom_diff.op);
   gimple_call_set_lhs (pattern_stmt, new_var);
   gimple_set_location (pattern_stmt, gimple_location (last_stmt));
   *type_out = vec_type;
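
[Illustration, not part of the patch.]  The identities quoted in the
vect_recog_ctz_ffs_pattern comments above can be checked with a small
portable sketch.  The helper names (popcount_ref, clz_ref, ctz_ref,
identities_hold) and the PREC macro are hypothetical and exist only for
this example; plain loops are used so that nothing depends on the builtins
being added here.

```c
#include <assert.h>
#include <limits.h>

/* Bit width of unsigned int; stands in for PREC in the comments.  */
#define PREC ((int) (sizeof (unsigned int) * CHAR_BIT))

/* Loop-based reference implementations.  clz_ref and ctz_ref
   require x != 0.  */
static int
popcount_ref (unsigned int x)
{
  int n = 0;
  for (; x; x &= x - 1)
    n++;
  return n;
}

static int
clz_ref (unsigned int x)
{
  int n = 0;
  for (unsigned int bit = 1u << (PREC - 1); !(x & bit); bit >>= 1)
    n++;
  return n;
}

static int
ctz_ref (unsigned int x)
{
  int n = 0;
  for (; !(x & 1u); x >>= 1)
    n++;
  return n;
}

/* Return 1 if every identity from the comments holds for nonzero x.  */
static int
identities_hold (unsigned int x)
{
  unsigned int neg = 0u - x;	/* Unsigned negation wraps.  */
  int ctz = ctz_ref (x);
  int ffs = ctz + 1;		/* .FFS (X) = .CTZ (X) + 1  */
  return (ctz == popcount_ref ((x - 1) & ~x)
	  && ctz == (PREC - 1) - clz_ref (x & neg)
	  && ffs == PREC - clz_ref (x & neg)
	  && ctz == PREC - popcount_ref (x | neg)
	  && ffs == (PREC + 1) - popcount_ref (x | neg));
}
```

For X == 0 the popcount-based forms evaluate to PREC (or PREC + 1 for
FFS), which appears to be why val_cmp is set to prec resp. sub in those
branches.
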
--- gcc/tree-vect-stmts.cc.jj	2023-11-09 09:04:20.349518853 +0100
+++ gcc/tree-vect-stmts.cc	2023-11-09 10:00:01.351992895 +0100
@@ -3266,6 +3266,7 @@ vectorizable_call (vec_info *vinfo,
   enum { NARROW, NONE, WIDEN } modifier;
   size_t i, nargs;
   tree lhs;
+  tree clz_ctz_arg1 = NULL_TREE;
 
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
     return false;
@@ -3311,6 +3312,14 @@ vectorizable_call (vec_info *vinfo,
       nargs = 0;
       rhs_type = unsigned_type_node;
     }
+  /* Similarly pretend IFN_CLZ and IFN_CTZ only have one argument; the second
+     argument just says whether the call is well-defined at zero and what
+     value should be returned for zero.  */
+  if ((cfn == CFN_CLZ || cfn == CFN_CTZ) && nargs == 2)
+    {
+      nargs = 1;
+      clz_ctz_arg1 = gimple_call_arg (stmt, 1);
+    }
 
   int mask_opno = -1;
   if (internal_fn_p (cfn))
@@ -3576,6 +3585,8 @@ vectorizable_call (vec_info *vinfo,
       ifn = cond_fn;
       vect_nargs += 2;
     }
+  if (clz_ctz_arg1)
+    ++vect_nargs;
 
   if (modifier == NONE || ifn != IFN_LAST)
     {
@@ -3613,6 +3624,9 @@ vectorizable_call (vec_info *vinfo,
 		    }
 		  if (masked_loop_p && reduc_idx >= 0)
 		    vargs[varg++] = vargs[reduc_idx + 1];
+		  if (clz_ctz_arg1)
+		    vargs[varg++] = clz_ctz_arg1;
+
 		  gimple *new_stmt;
 		  if (modifier == NARROW)
 		    {
@@ -3699,6 +3713,8 @@ vectorizable_call (vec_info *vinfo,
 	    }
 	  if (masked_loop_p && reduc_idx >= 0)
 	    vargs[varg++] = vargs[reduc_idx + 1];
+	  if (clz_ctz_arg1)
+	    vargs[varg++] = clz_ctz_arg1;
 
 	  if (len_opno >= 0 && len_loop_p)
 	    {
--- gcc/tree-ssa-loop-niter.cc.jj	2023-11-09 09:03:53.592897899 +0100
+++ gcc/tree-ssa-loop-niter.cc	2023-11-09 09:17:40.234182427 +0100
@@ -2235,14 +2235,18 @@ build_cltz_expr (tree src, bool leading,
   tree call;
   if (use_ifn)
     {
-      call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
-					   integer_type_node, 1, src);
       int val;
       int optab_defined_at_zero
 	= (leading
 	   ? CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val)
 	   : CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val));
-      if (define_at_zero && !(optab_defined_at_zero == 2 && val == prec))
+      tree arg2 = NULL_TREE;
+      if (define_at_zero && optab_defined_at_zero == 2 && val == prec)
+	arg2 = build_int_cst (integer_type_node, val);
+      call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
+					   integer_type_node, arg2 ? 2 : 1,
+					   src, arg2);
+      if (define_at_zero && arg2 == NULL_TREE)
 	{
 	  tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
 				      build_zero_cst (TREE_TYPE (src)));
--- gcc/tree-ssa-forwprop.cc.jj	2023-11-09 09:03:53.542898608 +0100
+++ gcc/tree-ssa-forwprop.cc	2023-11-09 09:38:28.895393573 +0100
@@ -2381,6 +2381,7 @@ simplify_count_trailing_zeroes (gimple_s
       HOST_WIDE_INT type_size = tree_to_shwi (TYPE_SIZE (type));
       bool zero_ok
 	= CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type), ctz_val) == 2;
+      int nargs = 2;
 
       /* If the input value can't be zero, don't special case ctz (0).  */
       if (tree_expr_nonzero_p (res_ops[0]))
@@ -2388,6 +2389,7 @@ simplify_count_trailing_zeroes (gimple_s
 	  zero_ok = true;
 	  zero_val = 0;
 	  ctz_val = 0;
+	  nargs = 1;
 	}
 
       /* Skip if there is no value defined at zero, or if we can't easily
@@ -2399,7 +2401,11 @@ simplify_count_trailing_zeroes (gimple_s
 
       gimple_seq seq = NULL;
       gimple *g;
-      gcall *call = gimple_build_call_internal (IFN_CTZ, 1, res_ops[0]);
+      gcall *call
+	= gimple_build_call_internal (IFN_CTZ, nargs, res_ops[0],
+				      nargs == 1 ? NULL_TREE
+				      : build_int_cst (integer_type_node,
+						       ctz_val));
       gimple_set_location (call, gimple_location (stmt));
       gimple_set_lhs (call, make_ssa_name (integer_type_node));
       gimple_seq_add_stmt (&seq, call);
--- gcc/tree-ssa-phiopt.cc.jj	2023-11-09 09:03:53.616897559 +0100
+++ gcc/tree-ssa-phiopt.cc	2023-11-09 09:17:40.241182328 +0100
@@ -2863,18 +2863,26 @@ cond_removal_in_builtin_zero_pattern (ba
     }
 
   /* Check that we have a popcount/clz/ctz builtin.  */
-  if (!is_gimple_call (call) || gimple_call_num_args (call) != 1)
+  if (!is_gimple_call (call))
     return false;
 
-  arg = gimple_call_arg (call, 0);
   lhs = gimple_get_lhs (call);
 
   if (lhs == NULL_TREE)
     return false;
 
   combined_fn cfn = gimple_call_combined_fn (call);
+  if (gimple_call_num_args (call) != 1
+      && (gimple_call_num_args (call) != 2
+	  || cfn == CFN_CLZ
+	  || cfn == CFN_CTZ))
+    return false;
+
+  arg = gimple_call_arg (call, 0);
+
   internal_fn ifn = IFN_LAST;
   int val = 0;
+  bool any_val = false;
   switch (cfn)
     {
     case CFN_BUILT_IN_BSWAP16:
@@ -2889,6 +2897,23 @@ cond_removal_in_builtin_zero_pattern (ba
       if (INTEGRAL_TYPE_P (TREE_TYPE (arg)))
 	{
 	  tree type = TREE_TYPE (arg);
+	  if (TREE_CODE (type) == BITINT_TYPE)
+	    {
+	      if (gimple_call_num_args (call) == 1)
+		{
+		  any_val = true;
+		  ifn = IFN_CLZ;
+		  break;
+		}
+	      if (!tree_fits_shwi_p (gimple_call_arg (call, 1)))
+		return false;
+	      HOST_WIDE_INT at_zero = tree_to_shwi (gimple_call_arg (call, 1));
+	      if ((int) at_zero != at_zero)
+		return false;
+	      ifn = IFN_CLZ;
+	      val = at_zero;
+	      break;
+	    }
 	  if (direct_internal_fn_supported_p (IFN_CLZ, type, OPTIMIZE_FOR_BOTH)
 	      && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
 					    val) == 2)
@@ -2902,6 +2927,23 @@ cond_removal_in_builtin_zero_pattern (ba
       if (INTEGRAL_TYPE_P (TREE_TYPE (arg)))
 	{
 	  tree type = TREE_TYPE (arg);
+	  if (TREE_CODE (type) == BITINT_TYPE)
+	    {
+	      if (gimple_call_num_args (call) == 1)
+		{
+		  any_val = true;
+		  ifn = IFN_CTZ;
+		  break;
+		}
+	      if (!tree_fits_shwi_p (gimple_call_arg (call, 1)))
+		return false;
+	      HOST_WIDE_INT at_zero = tree_to_shwi (gimple_call_arg (call, 1));
+	      if ((int) at_zero != at_zero)
+		return false;
+	      ifn = IFN_CTZ;
+	      val = at_zero;
+	      break;
+	    }
 	  if (direct_internal_fn_supported_p (IFN_CTZ, type, OPTIMIZE_FOR_BOTH)
 	      && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
 					    val) == 2)
@@ -2960,8 +3002,18 @@ cond_removal_in_builtin_zero_pattern (ba
 
   /* Check PHI arguments.  */
   if (lhs != arg0
-      || TREE_CODE (arg1) != INTEGER_CST
-      || wi::to_wide (arg1) != val)
+      || TREE_CODE (arg1) != INTEGER_CST)
+    return false;
+  if (any_val)
+    {
+      if (!tree_fits_shwi_p (arg1))
+	return false;
+      HOST_WIDE_INT at_zero = tree_to_shwi (arg1);
+      if ((int) at_zero != at_zero)
+	return false;
+      val = at_zero;
+    }
+  else if (wi::to_wide (arg1) != val)
     return false;
 
   /* And insert the popcount/clz/ctz builtin and cast stmt before the
@@ -2974,13 +3026,15 @@ cond_removal_in_builtin_zero_pattern (ba
       reset_flow_sensitive_info (gimple_get_lhs (cast));
     }
   gsi_from = gsi_for_stmt (call);
-  if (ifn == IFN_LAST || gimple_call_internal_p (call))
+  if (ifn == IFN_LAST
+      || (gimple_call_internal_p (call) && gimple_call_num_args (call) == 2))
     gsi_move_before (&gsi_from, &gsi);
   else
     {
       /* For __builtin_c[lt]z* force .C[LT]Z ifn, because only
 	 the latter is well defined at zero.  */
-      call = gimple_build_call_internal (ifn, 1, gimple_call_arg (call, 0));
+      call = gimple_build_call_internal (ifn, 2, gimple_call_arg (call, 0),
+					 build_int_cst (integer_type_node, val));
       gimple_call_set_lhs (call, lhs);
       gsi_insert_before (&gsi, call, GSI_SAME_STMT);
       gsi_remove (&gsi_from, true);
--- gcc/doc/extend.texi.jj	2023-11-09 09:04:18.823540470 +0100
+++ gcc/doc/extend.texi	2023-11-09 09:17:40.240182342 +0100
@@ -14960,6 +14960,42 @@ Similar to @code{__builtin_parity}, exce
 @code{unsigned long long}.
 @enddefbuiltin
 
+@defbuiltin{int __builtin_ffsg (...)}
+Similar to @code{__builtin_ffs}, except the argument is a type-generic
+signed integer (standard, extended or bit-precise).
+@enddefbuiltin
+
+@defbuiltin{int __builtin_clzg (...)}
+Similar to @code{__builtin_clz}, except the argument is a type-generic
+unsigned integer (standard, extended or bit-precise) and there is an
+optional second argument with @code{int} type.  If two arguments are
+specified and the first argument is 0, the result is the second argument.
+If only one argument is specified and it is 0, the result is undefined.
+@enddefbuiltin
+
+@defbuiltin{int __builtin_ctzg (...)}
+Similar to @code{__builtin_ctz}, except the argument is a type-generic
+unsigned integer (standard, extended or bit-precise) and there is an
+optional second argument with @code{int} type.  If two arguments are
+specified and the first argument is 0, the result is the second argument.
+If only one argument is specified and it is 0, the result is undefined.
+@enddefbuiltin
+
+@defbuiltin{int __builtin_clrsbg (...)}
+Similar to @code{__builtin_clrsb}, except the argument is a type-generic
+signed integer (standard, extended or bit-precise).
+@enddefbuiltin
+
+@defbuiltin{int __builtin_popcountg (...)}
+Similar to @code{__builtin_popcount}, except the argument is a type-generic
+unsigned integer (standard, extended or bit-precise).
+@enddefbuiltin
+
+@defbuiltin{int __builtin_parityg (...)}
+Similar to @code{__builtin_parity}, except the argument is a type-generic
+unsigned integer (standard, extended or bit-precise).
+@enddefbuiltin
+
 @defbuiltin{double __builtin_powi (double, int)}
 @defbuiltinx{float __builtin_powif (float, int)}
 @defbuiltinx{{long double} __builtin_powil (long double, int)}
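
[Illustration, not part of the patch.]  The two-argument semantics
documented above can be modelled for unsigned int with the existing
builtins.  This is only a sketch: clz_or and ctz_or are hypothetical
helper names, and the conditional mirrors what the c-gimplify.cc hunk in
this patch builds when the second argument is not an INTEGER_CST.

```c
#include <assert.h>

/* Model of __builtin_clzg (x, fallback) for unsigned int: the
   one-argument builtin is only evaluated when x is nonzero.  */
static int
clz_or (unsigned int x, int fallback)
{
  return x != 0 ? __builtin_clz (x) : fallback;
}

/* Model of __builtin_ctzg (x, fallback) for unsigned int.  */
static int
ctz_or (unsigned int x, int fallback)
{
  return x != 0 ? __builtin_ctz (x) : fallback;
}
```

With these, clz_or (0U, 42) yields 42 and ctz_or (16U, -1) yields 4, the
same values pr111309-1.c expects from the new builtins.
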
--- gcc/c-family/c-common.cc.jj	2023-11-09 09:04:18.409546335 +0100
+++ gcc/c-family/c-common.cc	2023-11-09 09:17:40.236182399 +0100
@@ -6475,14 +6475,14 @@ check_builtin_function_arguments (locati
 	      }
 	  if (TREE_CODE (TREE_TYPE (args[2])) == ENUMERAL_TYPE)
 	    {
-	      error_at (ARG_LOCATION (2), "argument 3 in call to function "
-			"%qE has enumerated type", fndecl);
+	      error_at (ARG_LOCATION (2), "argument %u in call to function "
+			"%qE has enumerated type", 3, fndecl);
 	      return false;
 	    }
 	  else if (TREE_CODE (TREE_TYPE (args[2])) == BOOLEAN_TYPE)
 	    {
-	      error_at (ARG_LOCATION (2), "argument 3 in call to function "
-			"%qE has boolean type", fndecl);
+	      error_at (ARG_LOCATION (2), "argument %u in call to function "
+			"%qE has boolean type", 3, fndecl);
 	      return false;
 	    }
 	  return true;
@@ -6522,6 +6522,72 @@ check_builtin_function_arguments (locati
 	}
       return false;
 
+    case BUILT_IN_CLZG:
+    case BUILT_IN_CTZG:
+    case BUILT_IN_CLRSBG:
+    case BUILT_IN_FFSG:
+    case BUILT_IN_PARITYG:
+    case BUILT_IN_POPCOUNTG:
+      if (nargs == 2
+	  && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CLZG
+	      || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CTZG))
+	{
+	  if (!INTEGRAL_TYPE_P (TREE_TYPE (args[1])))
+	    {
+	      error_at (ARG_LOCATION (1), "argument %u in call to function "
+			"%qE does not have integral type", 2, fndecl);
+	      return false;
+	    }
+	  if ((TYPE_PRECISION (TREE_TYPE (args[1]))
+	       > TYPE_PRECISION (integer_type_node))
+	      || (TYPE_PRECISION (TREE_TYPE (args[1]))
+		  == TYPE_PRECISION (integer_type_node)
+		  && TYPE_UNSIGNED (TREE_TYPE (args[1]))))
+	    {
+	      error_at (ARG_LOCATION (1), "argument %u in call to function "
+			"%qE does not have %<int%> type", 2, fndecl);
+	      return false;
+	    }
+	}
+      else if (!builtin_function_validate_nargs (loc, fndecl, nargs, 1))
+	return false;
+
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (args[0])))
+	{
+	  error_at (ARG_LOCATION (0), "argument %u in call to function "
+		    "%qE does not have integral type", 1, fndecl);
+	  return false;
+	}
+      if (TREE_CODE (TREE_TYPE (args[0])) == ENUMERAL_TYPE)
+	{
+	  error_at (ARG_LOCATION (0), "argument %u in call to function "
+		    "%qE has enumerated type", 1, fndecl);
+	  return false;
+	}
+      if (TREE_CODE (TREE_TYPE (args[0])) == BOOLEAN_TYPE)
+	{
+	  error_at (ARG_LOCATION (0), "argument %u in call to function "
+		    "%qE has boolean type", 1, fndecl);
+	  return false;
+	}
+      if (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_FFSG
+	  || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CLRSBG)
+	{
+	  if (TYPE_UNSIGNED (TREE_TYPE (args[0])))
+	    {
+	      error_at (ARG_LOCATION (0), "argument %u in call to function "
+			"%qE has unsigned type", 1, fndecl);
+	      return false;
+	    }
+	}
+      else if (!TYPE_UNSIGNED (TREE_TYPE (args[0])))
+	{
+	  error_at (ARG_LOCATION (0), "argument %u in call to function "
+		    "%qE has signed type", 1, fndecl);
+	  return false;
+	}
+      return true;
+
     default:
       return true;
     }
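
[Illustration, not part of the patch.]  The precision test applied to the
second argument of __builtin_c[lt]zg in the hunk above accepts any
integral type whose values all fit in int.  A minimal model, with a
hypothetical function name and the type reduced to a precision plus a
signedness flag:

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if an integral type with ARG_PREC bits and the given
   signedness would pass the check, given INT_PREC bits in int.  */
static bool
second_arg_fits_int (int arg_prec, bool arg_unsigned, int int_prec)
{
  if (arg_prec > int_prec)
    return false;		/* E.g. long long.  */
  if (arg_prec == int_prec && arg_unsigned)
    return false;		/* E.g. unsigned int.  */
  return true;			/* E.g. int, short, bool.  */
}
```

Assuming a 32-bit int, this rejects 2LL and 2U but accepts a bool second
argument, in line with the diagnostics pr111309-2.c expects.
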
--- gcc/c-family/c-gimplify.cc.jj	2023-11-09 09:03:53.251902730 +0100
+++ gcc/c-family/c-gimplify.cc	2023-11-09 09:17:40.237182384 +0100
@@ -818,6 +818,28 @@ c_gimplify_expr (tree *expr_p, gimple_se
 	break;
       }
 
+    case CALL_EXPR:
+      {
+	tree fndecl = get_callee_fndecl (*expr_p);
+	if (fndecl
+	    && fndecl_built_in_p (fndecl, BUILT_IN_CLZG, BUILT_IN_CTZG)
+	    && call_expr_nargs (*expr_p) == 2
+	    && TREE_CODE (CALL_EXPR_ARG (*expr_p, 1)) != INTEGER_CST)
+	  {
+	    tree a = save_expr (CALL_EXPR_ARG (*expr_p, 0));
+	    tree c = build_call_expr_loc (EXPR_LOCATION (*expr_p),
+					  fndecl, 1, a);
+	    *expr_p = build3_loc (EXPR_LOCATION (*expr_p), COND_EXPR,
+				  integer_type_node,
+				  build2_loc (EXPR_LOCATION (*expr_p),
+					      NE_EXPR, boolean_type_node, a,
+					      build_zero_cst (TREE_TYPE (a))),
+				  c, CALL_EXPR_ARG (*expr_p, 1));
+	    return GS_OK;
+	  }
+	break;
+      }
+
     default:;
     }
 
--- gcc/c/c-typeck.cc.jj	2023-11-09 09:04:18.537544522 +0100
+++ gcc/c/c-typeck.cc	2023-11-09 10:57:28.672517220 +0100
@@ -3560,6 +3560,7 @@ convert_arguments (location_t loc, vec<l
     && lookup_attribute ("type generic", TYPE_ATTRIBUTES (TREE_TYPE (fundecl)));
   bool type_generic_remove_excess_precision = false;
   bool type_generic_overflow_p = false;
+  bool type_generic_bit_query = false;
   tree selector;
 
   /* Change pointer to function to the function itself for
@@ -3615,6 +3616,17 @@ convert_arguments (location_t loc, vec<l
 	    type_generic_overflow_p = true;
 	    break;
 
+	  case BUILT_IN_CLZG:
+	  case BUILT_IN_CTZG:
+	  case BUILT_IN_CLRSBG:
+	  case BUILT_IN_FFSG:
+	  case BUILT_IN_PARITYG:
+	  case BUILT_IN_POPCOUNTG:
+	    /* The first argument of these type-generic builtins
+	       should not be promoted.  */
+	    type_generic_bit_query = true;
+	    break;
+
 	  default:
 	    break;
 	  }
@@ -3750,11 +3762,13 @@ convert_arguments (location_t loc, vec<l
 	    }
 	}
       else if ((excess_precision && !type_generic)
-	       || (type_generic_overflow_p && parmnum == 2))
+	       || (type_generic_overflow_p && parmnum == 2)
+	       || (type_generic_bit_query && parmnum == 0))
 	/* A "double" argument with excess precision being passed
 	   without a prototype or in variable arguments.
 	   The last argument of __builtin_*_overflow_p should not be
-	   promoted.  */
+	   promoted, similarly the first argument of
+	   __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g.  */
 	parmval = convert (valtype, val);
       else if ((invalid_func_diag =
 		targetm.calls.invalid_arg_for_unprototyped_fn (typelist, fundecl, val)))
--- gcc/cp/call.cc.jj	2023-11-04 09:02:35.376001531 +0100
+++ gcc/cp/call.cc	2023-11-09 11:03:06.687737428 +0100
@@ -9290,7 +9290,9 @@ convert_for_arg_passing (tree type, tree
    This is true for some builtins which don't act like normal functions.
    Return 2 if just decay_conversion and removal of excess precision should
    be done, 1 if just decay_conversion.  Return 3 for special treatment of
-   the 3rd argument for __builtin_*_overflow_p.  */
+   the 3rd argument for __builtin_*_overflow_p.  Return 4 for special
+   treatment of the 1st argument for
+   __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g.  */
 
 int
 magic_varargs_p (tree fn)
@@ -9317,6 +9319,14 @@ magic_varargs_p (tree fn)
       case BUILT_IN_FPCLASSIFY:
 	return 2;
 
+      case BUILT_IN_CLZG:
+      case BUILT_IN_CTZG:
+      case BUILT_IN_CLRSBG:
+      case BUILT_IN_FFSG:
+      case BUILT_IN_PARITYG:
+      case BUILT_IN_POPCOUNTG:
+	return 4;
+
       default:
 	return lookup_attribute ("type generic",
 				 TYPE_ATTRIBUTES (TREE_TYPE (fn))) != 0;
@@ -10122,7 +10132,7 @@ build_over_call (struct z_candidate *can
   for (; arg_index < vec_safe_length (args); ++arg_index)
     {
       tree a = (*args)[arg_index];
-      if (magic == 3 && arg_index == 2)
+      if ((magic == 3 && arg_index == 2) || (magic == 4 && arg_index == 0))
 	{
 	  /* Do no conversions for certain magic varargs.  */
 	  a = mark_type_use (a);
--- gcc/cp/cp-gimplify.cc.jj	2023-11-02 07:49:15.839882778 +0100
+++ gcc/cp/cp-gimplify.cc	2023-11-09 12:11:59.834140462 +0100
@@ -771,6 +771,10 @@ cp_gimplify_expr (tree *expr_p, gimple_s
 	      default:
 		break;
 	      }
+	  else if (decl
+		   && fndecl_built_in_p (decl, BUILT_IN_CLZG, BUILT_IN_CTZG))
+	    ret = (enum gimplify_status) c_gimplify_expr (expr_p, pre_p,
+							  post_p);
 	}
       break;
 
--- gcc/testsuite/c-c++-common/pr111309-1.c.jj	2023-11-09 10:35:28.974541671 +0100
+++ gcc/testsuite/c-c++-common/pr111309-1.c	2023-11-09 11:54:02.817389761 +0100
@@ -0,0 +1,470 @@
+/* PR c/111309 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+__attribute__((noipa)) int
+clzc (unsigned char x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzc2 (unsigned char x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+
+__attribute__((noipa)) int
+clzs (unsigned short x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzs2 (unsigned short x)
+{
+  return __builtin_clzg (x, -2);
+}
+
+__attribute__((noipa)) int
+clzi (unsigned int x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzi2 (unsigned int x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+
+__attribute__((noipa)) int
+clzl (unsigned long x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzl2 (unsigned long x)
+{
+  return __builtin_clzg (x, -1);
+}
+
+__attribute__((noipa)) int
+clzL (unsigned long long x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzL2 (unsigned long long x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+
+#ifdef __SIZEOF_INT128__
+__attribute__((noipa)) int
+clzI (unsigned __int128 x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzI2 (unsigned __int128 x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+#endif
+
+__attribute__((noipa)) int
+ctzc (unsigned char x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzc2 (unsigned char x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctzs (unsigned short x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzs2 (unsigned short x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctzi (unsigned int x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzi2 (unsigned int x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctzl (unsigned long x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzl2 (unsigned long x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctzL (unsigned long long x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzL2 (unsigned long long x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+#ifdef __SIZEOF_INT128__
+__attribute__((noipa)) int
+ctzI (unsigned __int128 x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzI2 (unsigned __int128 x)
+{
+  return __builtin_ctzg (x, __SIZEOF_INT128__ * __CHAR_BIT__);
+}
+#endif
+
+__attribute__((noipa)) int
+clrsbc (signed char x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+clrsbs (signed short x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+clrsbi (signed int x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+clrsbl (signed long x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+clrsbL (signed long long x)
+{
+  return __builtin_clrsbg (x);
+}
+
+#ifdef __SIZEOF_INT128__
+__attribute__((noipa)) int
+clrsbI (signed __int128 x)
+{
+  return __builtin_clrsbg (x);
+}
+#endif
+
+__attribute__((noipa)) int
+ffsc (signed char x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+ffss (signed short x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+ffsi (signed int x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+ffsl (signed long x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+ffsL (signed long long x)
+{
+  return __builtin_ffsg (x);
+}
+
+#ifdef __SIZEOF_INT128__
+__attribute__((noipa)) int
+ffsI (signed __int128 x)
+{
+  return __builtin_ffsg (x);
+}
+#endif
+
+__attribute__((noipa)) int
+parityc (unsigned char x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+paritys (unsigned short x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+parityi (unsigned int x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+parityl (unsigned long x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+parityL (unsigned long long x)
+{
+  return __builtin_parityg (x);
+}
+
+#ifdef __SIZEOF_INT128__
+__attribute__((noipa)) int
+parityI (unsigned __int128 x)
+{
+  return __builtin_parityg (x);
+}
+#endif
+
+__attribute__((noipa)) int
+popcountc (unsigned char x)
+{
+  return __builtin_popcountg (x);
+}
+
+__attribute__((noipa)) int
+popcounts (unsigned short x)
+{
+  return __builtin_popcountg (x);
+}
+
+__attribute__((noipa)) int
+popcounti (unsigned int x)
+{
+  return __builtin_popcountg (x);
+}
+
+__attribute__((noipa)) int
+popcountl (unsigned long x)
+{
+  return __builtin_popcountg (x);
+}
+
+__attribute__((noipa)) int
+popcountL (unsigned long long x)
+{
+  return __builtin_popcountg (x);
+}
+
+#ifdef __SIZEOF_INT128__
+__attribute__((noipa)) int
+popcountI (unsigned __int128 x)
+{
+  return __builtin_popcountg (x);
+}
+#endif
+
+int
+main ()
+{
+  if (__builtin_clzg ((unsigned char) 1) != __CHAR_BIT__ - 1
+      || __builtin_clzg ((unsigned short) 2, -2) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
+      || __builtin_clzg (0U, 42) != 42
+      || __builtin_clzg (0U, -1) != -1
+      || __builtin_clzg (1U) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
+      || __builtin_clzg (2UL, -1) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
+      || __builtin_clzg (5ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
+#ifdef __SIZEOF_INT128__
+      || __builtin_clzg ((unsigned __int128) 9) != __SIZEOF_INT128__ * __CHAR_BIT__ - 4
+#endif
+      || __builtin_clzg (~0U, -5) != 0
+      || __builtin_clzg (~0ULL >> 2) != 2
+      || __builtin_ctzg ((unsigned char) 1) != 0
+      || __builtin_ctzg ((unsigned short) 28) != 2
+      || __builtin_ctzg (0U, 32) != 32
+      || __builtin_ctzg (0U, -42) != -42
+      || __builtin_ctzg (1U) != 0
+      || __builtin_ctzg (16UL, -1) != 4
+      || __builtin_ctzg (5ULL << 52, 0) != 52
+#ifdef __SIZEOF_INT128__
+      || __builtin_ctzg (((unsigned __int128) 9) << 72) != 72
+#endif
+      || __builtin_clrsbg ((signed char) 0) != __CHAR_BIT__ - 1
+      || __builtin_clrsbg ((signed short) -1) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 1
+      || __builtin_clrsbg (0) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
+      || __builtin_clrsbg (-1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 1
+      || __builtin_clrsbg (0LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1
+#ifdef __SIZEOF_INT128__
+      || __builtin_clrsbg ((__int128) -1) != __SIZEOF_INT128__ * __CHAR_BIT__ - 1
+#endif
+      || __builtin_clrsbg (0x1afb) != __SIZEOF_INT__ * __CHAR_BIT__ - 14
+      || __builtin_clrsbg (-2) != __SIZEOF_INT__ * __CHAR_BIT__ - 2
+      || __builtin_clrsbg (1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
+      || __builtin_clrsbg (-4LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
+      || __builtin_ffsg ((signed char) 0) != 0
+      || __builtin_ffsg ((signed short) 0) != 0
+      || __builtin_ffsg (0) != 0
+      || __builtin_ffsg (0L) != 0
+      || __builtin_ffsg (0LL) != 0
+#ifdef __SIZEOF_INT128__
+      || __builtin_ffsg ((__int128) 0) != 0
+#endif
+      || __builtin_ffsg ((signed char) 4) != 3
+      || __builtin_ffsg ((signed short) 8) != 4
+      || __builtin_ffsg (1) != 1
+      || __builtin_ffsg (2L) != 2
+      || __builtin_ffsg (28LL) != 3
+      || __builtin_parityg ((unsigned char) 1) != 1
+      || __builtin_parityg ((unsigned short) 2) != 1
+      || __builtin_parityg (0U) != 0
+      || __builtin_parityg (3U) != 0
+      || __builtin_parityg (0UL) != 0
+      || __builtin_parityg (7UL) != 1
+      || __builtin_parityg (0ULL) != 0
+#ifdef __SIZEOF_INT128__
+      || __builtin_parityg ((unsigned __int128) 0) != 0
+#endif
+      || __builtin_parityg ((unsigned char) ~0U) != 0
+      || __builtin_parityg ((unsigned short) ~0U) != 0
+      || __builtin_parityg (~0U) != 0
+      || __builtin_parityg (~0UL) != 0
+      || __builtin_parityg (~0ULL) != 0
+#ifdef __SIZEOF_INT128__
+      || __builtin_parityg (~(unsigned __int128) 0) != 0
+#endif
+      || __builtin_popcountg (0U) != 0
+      || __builtin_popcountg (0UL) != 0
+      || __builtin_popcountg (0ULL) != 0
+#ifdef __SIZEOF_INT128__
+      || __builtin_popcountg ((unsigned __int128) 0) != 0
+#endif
+      || __builtin_popcountg ((unsigned char) ~0U) != __CHAR_BIT__
+      || __builtin_popcountg ((unsigned short) ~0U) != __SIZEOF_SHORT__ * __CHAR_BIT__
+      || __builtin_popcountg (~0U) != __SIZEOF_INT__ * __CHAR_BIT__
+      || __builtin_popcountg (~0UL) != __SIZEOF_LONG__ * __CHAR_BIT__
+      || __builtin_popcountg (~0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
+#ifdef __SIZEOF_INT128__
+      || __builtin_popcountg (~(unsigned __int128) 0) != __SIZEOF_INT128__ * __CHAR_BIT__
+#endif
+      || 0)
+  __builtin_abort ();
+  if (clzc (1) != __CHAR_BIT__ - 1
+      || clzs2 (2) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
+      || clzi2 (0U, 42) != 42
+      || clzi2 (0U, -1) != -1
+      || clzi (1U) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
+      || clzl2 (2UL) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
+      || clzL (5ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
+#ifdef __SIZEOF_INT128__
+      || clzI ((unsigned __int128) 9) != __SIZEOF_INT128__ * __CHAR_BIT__ - 4
+#endif
+      || clzi2 (~0U, -5) != 0
+      || clzL (~0ULL >> 2) != 2
+      || ctzc (1) != 0
+      || ctzs (28) != 2
+      || ctzi2 (0U, 32) != 32
+      || ctzi2 (0U, -42) != -42
+      || ctzi (1U) != 0
+      || ctzl2 (16UL, -1) != 4
+      || ctzL2 (5ULL << 52, 0) != 52
+#ifdef __SIZEOF_INT128__
+      || ctzI (((unsigned __int128) 9) << 72) != 72
+#endif
+      || clrsbc (0) != __CHAR_BIT__ - 1
+      || clrsbs (-1) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 1
+      || clrsbi (0) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
+      || clrsbl (-1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 1
+      || clrsbL (0LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1
+#ifdef __SIZEOF_INT128__
+      || clrsbI (-1) != __SIZEOF_INT128__ * __CHAR_BIT__ - 1
+#endif
+      || clrsbi (0x1afb) != __SIZEOF_INT__ * __CHAR_BIT__ - 14
+      || clrsbi (-2) != __SIZEOF_INT__ * __CHAR_BIT__ - 2
+      || clrsbl (1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
+      || clrsbL (-4LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
+      || ffsc (0) != 0
+      || ffss (0) != 0
+      || ffsi (0) != 0
+      || ffsl (0L) != 0
+      || ffsL (0LL) != 0
+#ifdef __SIZEOF_INT128__
+      || ffsI (0) != 0
+#endif
+      || ffsc (4) != 3
+      || ffss (8) != 4
+      || ffsi (1) != 1
+      || ffsl (2L) != 2
+      || ffsL (28LL) != 3
+      || parityc (1) != 1
+      || paritys (2) != 1
+      || parityi (0U) != 0
+      || parityi (3U) != 0
+      || parityl (0UL) != 0
+      || parityl (7UL) != 1
+      || parityL (0ULL) != 0
+#ifdef __SIZEOF_INT128__
+      || parityI (0) != 0
+#endif
+      || parityc ((unsigned char) ~0U) != 0
+      || paritys ((unsigned short) ~0U) != 0
+      || parityi (~0U) != 0
+      || parityl (~0UL) != 0
+      || parityL (~0ULL) != 0
+#ifdef __SIZEOF_INT128__
+      || parityI (~(unsigned __int128) 0) != 0
+#endif
+      || popcounti (0U) != 0
+      || popcountl (0UL) != 0
+      || popcountL (0ULL) != 0
+#ifdef __SIZEOF_INT128__
+      || popcountI (0) != 0
+#endif
+      || popcountc ((unsigned char) ~0U) != __CHAR_BIT__
+      || popcounts ((unsigned short) ~0U) != __SIZEOF_SHORT__ * __CHAR_BIT__
+      || popcounti (~0U) != __SIZEOF_INT__ * __CHAR_BIT__
+      || popcountl (~0UL) != __SIZEOF_LONG__ * __CHAR_BIT__
+      || popcountL (~0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
+#ifdef __SIZEOF_INT128__
+      || popcountI (~(unsigned __int128) 0) != __SIZEOF_INT128__ * __CHAR_BIT__
+#endif
+      || 0)
+  __builtin_abort ();
+}
--- gcc/testsuite/c-c++-common/pr111309-2.c.jj	2023-11-09 11:33:42.680632470 +0100
+++ gcc/testsuite/c-c++-common/pr111309-2.c	2023-11-09 12:03:11.062619162 +0100
@@ -0,0 +1,85 @@
+/* PR c/111309 */
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99" { target c } } */
+
+#ifndef __cplusplus
+#define bool _Bool
+#define true ((_Bool) 1)
+#define false ((_Bool) 0)
+#endif
+
+void
+foo (void)
+{
+  enum E { E0 = 0 };
+  struct S { int s; } s;
+  __builtin_clzg ();		/* { dg-error "too few arguments" } */
+  __builtin_clzg (0U, 1, 2);	/* { dg-error "too many arguments" } */
+  __builtin_clzg (0);		/* { dg-error "has signed type" } */
+  __builtin_clzg (0.0);		/* { dg-error "does not have integral type" } */
+  __builtin_clzg (s);		/* { dg-error "does not have integral type" } */
+  __builtin_clzg (true);	/* { dg-error "has boolean type" } */
+  __builtin_clzg (E0);		/* { dg-error "has signed type" "" { target c } } */
+				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
+  __builtin_clzg (0, 0);	/* { dg-error "has signed type" } */
+  __builtin_clzg (0.0, 0);	/* { dg-error "does not have integral type" } */
+  __builtin_clzg (s, 0);	/* { dg-error "does not have integral type" } */
+  __builtin_clzg (true, 0);	/* { dg-error "has boolean type" } */
+  __builtin_clzg (E0, 0);	/* { dg-error "has signed type" "" { target c } } */
+				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
+  __builtin_clzg (0U, 2.0);	/* { dg-error "does not have integral type" } */
+  __builtin_clzg (0U, s);	/* { dg-error "does not have integral type" } */
+  __builtin_clzg (0U, 2LL);	/* { dg-error "does not have 'int' type" } */
+  __builtin_clzg (0U, 2U);	/* { dg-error "does not have 'int' type" } */
+  __builtin_clzg (0U, true);
+  __builtin_clzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target c++ } } */
+  __builtin_ctzg ();		/* { dg-error "too few arguments" } */
+  __builtin_ctzg (0U, 1, 2);	/* { dg-error "too many arguments" } */
+  __builtin_ctzg (0);		/* { dg-error "has signed type" } */
+  __builtin_ctzg (0.0);		/* { dg-error "does not have integral type" } */
+  __builtin_ctzg (s);		/* { dg-error "does not have integral type" } */
+  __builtin_ctzg (true);	/* { dg-error "has boolean type" } */
+  __builtin_ctzg (E0);		/* { dg-error "has signed type" "" { target c } } */
+				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
+  __builtin_ctzg (0, 0);	/* { dg-error "has signed type" } */
+  __builtin_ctzg (0.0, 0);	/* { dg-error "does not have integral type" } */
+  __builtin_ctzg (s, 0);	/* { dg-error "does not have integral type" } */
+  __builtin_ctzg (true, 0);	/* { dg-error "has boolean type" } */
+  __builtin_ctzg (E0, 0);	/* { dg-error "has signed type" "" { target c } } */
+				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
+  __builtin_ctzg (0U, 2.0);	/* { dg-error "does not have integral type" } */
+  __builtin_ctzg (0U, 2LL);	/* { dg-error "does not have 'int' type" } */
+  __builtin_ctzg (0U, 2U);	/* { dg-error "does not have 'int' type" } */
+  __builtin_ctzg (0U, true);
+  __builtin_ctzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target c++ } } */
+  __builtin_clrsbg ();		/* { dg-error "too few arguments" } */
+  __builtin_clrsbg (0, 1);	/* { dg-error "too many arguments" } */
+  __builtin_clrsbg (0U);	/* { dg-error "has unsigned type" } */
+  __builtin_clrsbg (0.0);	/* { dg-error "does not have integral type" } */
+  __builtin_clrsbg (s);		/* { dg-error "does not have integral type" } */
+  __builtin_clrsbg (true);	/* { dg-error "has boolean type" } */
+  __builtin_clrsbg (E0);	/* { dg-error "has enumerated type" "" { target c++ } } */
+  __builtin_ffsg ();		/* { dg-error "too few arguments" } */
+  __builtin_ffsg (0, 1);	/* { dg-error "too many arguments" } */
+  __builtin_ffsg (0U);		/* { dg-error "has unsigned type" } */
+  __builtin_ffsg (0.0);		/* { dg-error "does not have integral type" } */
+  __builtin_ffsg (s);		/* { dg-error "does not have integral type" } */
+  __builtin_ffsg (true);	/* { dg-error "has boolean type" } */
+  __builtin_ffsg (E0);		/* { dg-error "has enumerated type" "" { target c++ } } */
+  __builtin_parityg ();		/* { dg-error "too few arguments" } */
+  __builtin_parityg (0U, 1);	/* { dg-error "too many arguments" } */
+  __builtin_parityg (0);	/* { dg-error "has signed type" } */
+  __builtin_parityg (0.0);	/* { dg-error "does not have integral type" } */
+  __builtin_parityg (s);	/* { dg-error "does not have integral type" } */
+  __builtin_parityg (true);	/* { dg-error "has boolean type" } */
+  __builtin_parityg (E0);	/* { dg-error "has signed type" "" { target c } } */
+				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
+  __builtin_popcountg ();	/* { dg-error "too few arguments" } */
+  __builtin_popcountg (0U, 1);	/* { dg-error "too many arguments" } */
+  __builtin_popcountg (0);	/* { dg-error "has signed type" } */
+  __builtin_popcountg (0.0);	/* { dg-error "does not have integral type" } */
+  __builtin_popcountg (s);	/* { dg-error "does not have integral type" } */
+  __builtin_popcountg (true);	/* { dg-error "has boolean type" } */
+  __builtin_popcountg (E0);	/* { dg-error "has signed type" "" { target c } } */
+				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
+}
--- gcc/testsuite/gcc.dg/torture/bitint-43.c.jj	2023-11-09 09:17:40.233182441 +0100
+++ gcc/testsuite/gcc.dg/torture/bitint-43.c	2023-11-09 12:16:51.757013390 +0100
@@ -0,0 +1,306 @@
+/* PR c/111309 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 156
+__attribute__((noipa)) int
+clz156 (unsigned _BitInt(156) x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzd156 (unsigned _BitInt(156) x)
+{
+  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+clzD156 (unsigned _BitInt(156) x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctz156 (unsigned _BitInt(156) x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzd156 (unsigned _BitInt(156) x)
+{
+  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+ctzD156 (unsigned _BitInt(156) x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+clrsb156 (_BitInt(156) x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+ffs156 (_BitInt(156) x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+parity156 (unsigned _BitInt(156) x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+popcount156 (unsigned _BitInt(156) x)
+{
+  return __builtin_popcountg (x);
+}
+#endif
+
+#if __BITINT_MAXWIDTH__ >= 192
+__attribute__((noipa)) int
+clz192 (unsigned _BitInt(192) x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzd192 (unsigned _BitInt(192) x)
+{
+  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+clzD192 (unsigned _BitInt(192) x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctz192 (unsigned _BitInt(192) x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzd192 (unsigned _BitInt(192) x)
+{
+  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+ctzD192 (unsigned _BitInt(192) x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+clrsb192 (_BitInt(192) x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+ffs192 (_BitInt(192) x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+parity192 (unsigned _BitInt(192) x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+popcount192 (unsigned _BitInt(192) x)
+{
+  return __builtin_popcountg (x);
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 156
+  if (clzd156 (0) != 156
+      || clzD156 (0, -1) != -1
+      || ctzd156 (0) != 156
+      || ctzD156 (0, 42) != 42
+      || clrsb156 (0) != 156 - 1
+      || ffs156 (0) != 0
+      || parity156 (0) != 0
+      || popcount156 (0) != 0
+      || __builtin_clzg ((unsigned _BitInt(156)) 0, 156 + 32) != 156 + 32
+      || __builtin_ctzg ((unsigned _BitInt(156)) 0, 156) != 156
+      || __builtin_clrsbg ((_BitInt(156)) 0) != 156 - 1
+      || __builtin_ffsg ((_BitInt(156)) 0) != 0
+      || __builtin_parityg ((unsigned _BitInt(156)) 0) != 0
+      || __builtin_popcountg ((unsigned _BitInt(156)) 0) != 0)
+    __builtin_abort ();
+  if (clz156 (-1) != 0
+      || clzd156 (-1) != 0
+      || clzD156 (-1, 0) != 0
+      || ctz156 (-1) != 0
+      || ctzd156 (-1) != 0
+      || ctzD156 (-1, 17) != 0
+      || clrsb156 (-1) != 156 - 1
+      || ffs156 (-1) != 1
+      || parity156 (-1) != 0
+      || popcount156 (-1) != 156
+      || __builtin_clzg ((unsigned _BitInt(156)) -1) != 0
+      || __builtin_clzg ((unsigned _BitInt(156)) -1, 156 + 32) != 0
+      || __builtin_ctzg ((unsigned _BitInt(156)) -1) != 0
+      || __builtin_ctzg ((unsigned _BitInt(156)) -1, 156) != 0
+      || __builtin_clrsbg ((_BitInt(156)) -1) != 156 - 1
+      || __builtin_ffsg ((_BitInt(156)) -1) != 1
+      || __builtin_parityg ((unsigned _BitInt(156)) -1) != 0
+      || __builtin_popcountg ((unsigned _BitInt(156)) -1) != 156)
+    __builtin_abort ();
+  if (clz156 (((unsigned _BitInt(156)) -1) >> 24) != 24
+      || clz156 (((unsigned _BitInt(156)) -1) >> 79) != 79
+      || clz156 (1) != 156 - 1
+      || clzd156 (((unsigned _BitInt(156)) -1) >> 139) != 139
+      || clzd156 (2) != 156 - 2
+      || ctz156 (((unsigned _BitInt(156)) -1) << 42) != 42
+      || ctz156 (((unsigned _BitInt(156)) -1) << 57) != 57
+      || ctz156 (0x4000000000000000000000uwb) != 86
+      || ctzd156 (((unsigned _BitInt(156)) -1) << 149) != 149
+      || ctzd156 (2) != 1
+      || clrsb156 ((unsigned _BitInt(156 - 4)) -1) != 3
+      || clrsb156 ((unsigned _BitInt(156 - 28)) -1) != 27
+      || clrsb156 ((unsigned _BitInt(156 - 29)) -1) != 28
+      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 68)) -1) != 67
+      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 92)) -1) != 91
+      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 93)) -1) != 92
+      || ffs156 (((unsigned _BitInt(156)) -1) << 42) != 43
+      || ffs156 (((unsigned _BitInt(156)) -1) << 57) != 58
+      || ffs156 (0x4000000000000000000000uwb) != 87
+      || ffs156 (((unsigned _BitInt(156)) -1) << 149) != 150
+      || ffs156 (2) != 2
+      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 24) != 24
+      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 79) != 79
+      || __builtin_clzg ((unsigned _BitInt(156)) 1) != 156 - 1
+      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 139, 156) != 139
+      || __builtin_clzg ((unsigned _BitInt(156)) 2, 156) != 156 - 2
+      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 42) != 42
+      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 57) != 57
+      || __builtin_ctzg ((unsigned _BitInt(156)) 0x4000000000000000000000uwb) != 86
+      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 149, 156) != 149
+      || __builtin_ctzg ((unsigned _BitInt(156)) 2, 156) != 1
+      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 4)) -1) != 3
+      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 28)) -1) != 27
+      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 29)) -1) != 28
+      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 68)) -1) != 67
+      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 92)) -1) != 91
+      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 93)) -1) != 92
+      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 42)) != 43
+      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 57)) != 58
+      || __builtin_ffsg ((_BitInt(156)) 0x4000000000000000000000uwb) != 87
+      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 149)) != 150
+      || __builtin_ffsg ((_BitInt(156)) 2) != 2)
+    __builtin_abort ();
+  if (parity156 (23008250258685373142923325827291949461178444434uwb) != __builtin_parityg (23008250258685373142923325827291949461178444434uwb)
+      || parity156 (41771568792516301628132437740665810252917251244uwb) != __builtin_parityg (41771568792516301628132437740665810252917251244uwb)
+      || parity156 (5107402473866766219120283991834936835726115452uwb) != __builtin_parityg (5107402473866766219120283991834936835726115452uwb)
+      || popcount156 (50353291748276374580944955711958129678996395562uwb) != __builtin_popcountg (50353291748276374580944955711958129678996395562uwb)
+      || popcount156 (29091263616891212550063067166307725491211684496uwb) != __builtin_popcountg (29091263616891212550063067166307725491211684496uwb)
+      || popcount156 (64973284306583205619384799873110935608793072026uwb) != __builtin_popcountg (64973284306583205619384799873110935608793072026uwb))
+    __builtin_abort ();
+#endif
+#if __BITINT_MAXWIDTH__ >= 192
+  if (clzd192 (0) != 192
+      || clzD192 (0, 42) != 42
+      || ctzd192 (0) != 192
+      || ctzD192 (0, -1) != -1
+      || clrsb192 (0) != 192 - 1
+      || ffs192 (0) != 0
+      || parity192 (0) != 0
+      || popcount192 (0) != 0
+      || __builtin_clzg ((unsigned _BitInt(192)) 0, 192 + 32) != 192 + 32
+      || __builtin_ctzg ((unsigned _BitInt(192)) 0, 192) != 192
+      || __builtin_clrsbg ((_BitInt(192)) 0) != 192 - 1
+      || __builtin_ffsg ((_BitInt(192)) 0) != 0
+      || __builtin_parityg ((unsigned _BitInt(192)) 0) != 0
+      || __builtin_popcountg ((unsigned _BitInt(192)) 0) != 0)
+    __builtin_abort ();
+  if (clz192 (-1) != 0
+      || clzd192 (-1) != 0
+      || clzD192 (-1, 15) != 0
+      || ctz192 (-1) != 0
+      || ctzd192 (-1) != 0
+      || ctzD192 (-1, -57) != 0
+      || clrsb192 (-1) != 192 - 1
+      || ffs192 (-1) != 1
+      || parity192 (-1) != 0
+      || popcount192 (-1) != 192
+      || __builtin_clzg ((unsigned _BitInt(192)) -1) != 0
+      || __builtin_clzg ((unsigned _BitInt(192)) -1, 192 + 32) != 0
+      || __builtin_ctzg ((unsigned _BitInt(192)) -1) != 0
+      || __builtin_ctzg ((unsigned _BitInt(192)) -1, 192) != 0
+      || __builtin_clrsbg ((_BitInt(192)) -1) != 192 - 1
+      || __builtin_ffsg ((_BitInt(192)) -1) != 1
+      || __builtin_parityg ((unsigned _BitInt(192)) -1) != 0
+      || __builtin_popcountg ((unsigned _BitInt(192)) -1) != 192)
+    __builtin_abort ();
+  if (clz192 (((unsigned _BitInt(192)) -1) >> 24) != 24
+      || clz192 (((unsigned _BitInt(192)) -1) >> 79) != 79
+      || clz192 (1) != 192 - 1
+      || clzd192 (((unsigned _BitInt(192)) -1) >> 139) != 139
+      || clzd192 (2) != 192 - 2
+      || ctz192 (((unsigned _BitInt(192)) -1) << 42) != 42
+      || ctz192 (((unsigned _BitInt(192)) -1) << 57) != 57
+      || ctz192 (0x4000000000000000000000uwb) != 86
+      || ctzd192 (((unsigned _BitInt(192)) -1) << 149) != 149
+      || ctzd192 (2) != 1
+      || clrsb192 ((unsigned _BitInt(192 - 4)) -1) != 3
+      || clrsb192 ((unsigned _BitInt(192 - 28)) -1) != 27
+      || clrsb192 ((unsigned _BitInt(192 - 29)) -1) != 28
+      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 68)) -1) != 67
+      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 92)) -1) != 91
+      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 93)) -1) != 92
+      || ffs192 (((unsigned _BitInt(192)) -1) << 42) != 43
+      || ffs192 (((unsigned _BitInt(192)) -1) << 57) != 58
+      || ffs192 (0x4000000000000000000000uwb) != 87
+      || ffs192 (((unsigned _BitInt(192)) -1) << 149) != 150
+      || ffs192 (2) != 2
+      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 24) != 24
+      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 79) != 79
+      || __builtin_clzg ((unsigned _BitInt(192)) 1) != 192 - 1
+      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 139, 192) != 139
+      || __builtin_clzg ((unsigned _BitInt(192)) 2, 192) != 192 - 2
+      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 42) != 42
+      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 57) != 57
+      || __builtin_ctzg ((unsigned _BitInt(192)) 0x4000000000000000000000uwb) != 86
+      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 149, 192) != 149
+      || __builtin_ctzg ((unsigned _BitInt(192)) 2, 192) != 1
+      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 4)) -1) != 3
+      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 28)) -1) != 27
+      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 29)) -1) != 28
+      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 68)) -1) != 67
+      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 92)) -1) != 91
+      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 93)) -1) != 92
+      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 42)) != 43
+      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 57)) != 58
+      || __builtin_ffsg ((_BitInt(192)) 0x4000000000000000000000uwb) != 87
+      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 149)) != 150
+      || __builtin_ffsg ((_BitInt(192)) 2) != 2)
+    __builtin_abort ();
+  if (parity192 (4692147078159863499615754634965484598760535154638668598762uwb) != __builtin_parityg (4692147078159863499615754634965484598760535154638668598762uwb)
+      || parity192 (1669461228546917627909935444501097256112222796898845183538uwb) != __builtin_parityg (1669461228546917627909935444501097256112222796898845183538uwb)
+      || parity192 (5107402473866766219120283991834936835726115452uwb) != __builtin_parityg (5107402473866766219120283991834936835726115452uwb)
+      || popcount192 (4033871057575185619108386380181511734118888391160164588976uwb) != __builtin_popcountg (4033871057575185619108386380181511734118888391160164588976uwb)
+      || popcount192 (58124766715713711628758119849579188845074973856704521119uwb) != __builtin_popcountg (58124766715713711628758119849579188845074973856704521119uwb)
+      || popcount192 (289948065236269174335700831610076764076947650072787325852uwb) != __builtin_popcountg (289948065236269174335700831610076764076947650072787325852uwb))
+    __builtin_abort ();
+#endif
+}
--- gcc/testsuite/gcc.dg/torture/bitint-44.c.jj	2023-11-09 09:17:40.232182455 +0100
+++ gcc/testsuite/gcc.dg/torture/bitint-44.c	2023-11-09 12:21:32.376046129 +0100
@@ -0,0 +1,306 @@
+/* PR c/111309 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 512
+__attribute__((noipa)) int
+clz512 (unsigned _BitInt(512) x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzd512 (unsigned _BitInt(512) x)
+{
+  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+clzD512 (unsigned _BitInt(512) x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctz512 (unsigned _BitInt(512) x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzd512 (unsigned _BitInt(512) x)
+{
+  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+ctzD512 (unsigned _BitInt(512) x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+clrsb512 (_BitInt(512) x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+ffs512 (_BitInt(512) x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+parity512 (unsigned _BitInt(512) x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+popcount512 (unsigned _BitInt(512) x)
+{
+  return __builtin_popcountg (x);
+}
+#endif
+
+#if __BITINT_MAXWIDTH__ >= 523
+__attribute__((noipa)) int
+clz523 (unsigned _BitInt(523) x)
+{
+  return __builtin_clzg (x);
+}
+
+__attribute__((noipa)) int
+clzd523 (unsigned _BitInt(523) x)
+{
+  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+clzD523 (unsigned _BitInt(523) x, int y)
+{
+  return __builtin_clzg (x, y);
+}
+
+__attribute__((noipa)) int
+ctz523 (unsigned _BitInt(523) x)
+{
+  return __builtin_ctzg (x);
+}
+
+__attribute__((noipa)) int
+ctzd523 (unsigned _BitInt(523) x)
+{
+  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
+}
+
+__attribute__((noipa)) int
+ctzD523 (unsigned _BitInt(523) x, int y)
+{
+  return __builtin_ctzg (x, y);
+}
+
+__attribute__((noipa)) int
+clrsb523 (_BitInt(523) x)
+{
+  return __builtin_clrsbg (x);
+}
+
+__attribute__((noipa)) int
+ffs523 (_BitInt(523) x)
+{
+  return __builtin_ffsg (x);
+}
+
+__attribute__((noipa)) int
+parity523 (unsigned _BitInt(523) x)
+{
+  return __builtin_parityg (x);
+}
+
+__attribute__((noipa)) int
+popcount523 (unsigned _BitInt(523) x)
+{
+  return __builtin_popcountg (x);
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 512
+  if (clzd512 (0) != 512
+      || clzD512 (0, -1) != -1
+      || ctzd512 (0) != 512
+      || ctzD512 (0, 42) != 42
+      || clrsb512 (0) != 512 - 1
+      || ffs512 (0) != 0
+      || parity512 (0) != 0
+      || popcount512 (0) != 0
+      || __builtin_clzg ((unsigned _BitInt(512)) 0, 512 + 32) != 512 + 32
+      || __builtin_ctzg ((unsigned _BitInt(512)) 0, 512) != 512
+      || __builtin_clrsbg ((_BitInt(512)) 0) != 512 - 1
+      || __builtin_ffsg ((_BitInt(512)) 0) != 0
+      || __builtin_parityg ((unsigned _BitInt(512)) 0) != 0
+      || __builtin_popcountg ((unsigned _BitInt(512)) 0) != 0)
+    __builtin_abort ();
+  if (clz512 (-1) != 0
+      || clzd512 (-1) != 0
+      || clzD512 (-1, 0) != 0
+      || ctz512 (-1) != 0
+      || ctzd512 (-1) != 0
+      || ctzD512 (-1, 17) != 0
+      || clrsb512 (-1) != 512 - 1
+      || ffs512 (-1) != 1
+      || parity512 (-1) != 0
+      || popcount512 (-1) != 512
+      || __builtin_clzg ((unsigned _BitInt(512)) -1) != 0
+      || __builtin_clzg ((unsigned _BitInt(512)) -1, 512 + 32) != 0
+      || __builtin_ctzg ((unsigned _BitInt(512)) -1) != 0
+      || __builtin_ctzg ((unsigned _BitInt(512)) -1, 512) != 0
+      || __builtin_clrsbg ((_BitInt(512)) -1) != 512 - 1
+      || __builtin_ffsg ((_BitInt(512)) -1) != 1
+      || __builtin_parityg ((unsigned _BitInt(512)) -1) != 0
+      || __builtin_popcountg ((unsigned _BitInt(512)) -1) != 512)
+    __builtin_abort ();
+  if (clz512 (((unsigned _BitInt(512)) -1) >> 24) != 24
+      || clz512 (((unsigned _BitInt(512)) -1) >> 79) != 79
+      || clz512 (1) != 512 - 1
+      || clzd512 (((unsigned _BitInt(512)) -1) >> 139) != 139
+      || clzd512 (2) != 512 - 2
+      || ctz512 (((unsigned _BitInt(512)) -1) << 42) != 42
+      || ctz512 (((unsigned _BitInt(512)) -1) << 57) != 57
+      || ctz512 (0x4000000000000000000000uwb) != 86
+      || ctzd512 (((unsigned _BitInt(512)) -1) << 149) != 149
+      || ctzd512 (2) != 1
+      || clrsb512 ((unsigned _BitInt(512 - 4)) -1) != 3
+      || clrsb512 ((unsigned _BitInt(512 - 28)) -1) != 27
+      || clrsb512 ((unsigned _BitInt(512 - 29)) -1) != 28
+      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 68)) -1) != 67
+      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 92)) -1) != 91
+      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 93)) -1) != 92
+      || ffs512 (((unsigned _BitInt(512)) -1) << 42) != 43
+      || ffs512 (((unsigned _BitInt(512)) -1) << 57) != 58
+      || ffs512 (0x4000000000000000000000uwb) != 87
+      || ffs512 (((unsigned _BitInt(512)) -1) << 149) != 150
+      || ffs512 (2) != 2
+      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 24) != 24
+      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 79) != 79
+      || __builtin_clzg ((unsigned _BitInt(512)) 1) != 512 - 1
+      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 139, 512) != 139
+      || __builtin_clzg ((unsigned _BitInt(512)) 2, 512) != 512 - 2
+      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 42) != 42
+      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 57) != 57
+      || __builtin_ctzg ((unsigned _BitInt(512)) 0x4000000000000000000000uwb) != 86
+      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 149, 512) != 149
+      || __builtin_ctzg ((unsigned _BitInt(512)) 2, 512) != 1
+      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 4)) -1) != 3
+      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 28)) -1) != 27
+      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 29)) -1) != 28
+      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 68)) -1) != 67
+      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 92)) -1) != 91
+      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 93)) -1) != 92
+      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 42)) != 43
+      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 57)) != 58
+      || __builtin_ffsg ((_BitInt(512)) 0x4000000000000000000000uwb) != 87
+      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 149)) != 150
+      || __builtin_ffsg ((_BitInt(512)) 2) != 2)
+    __builtin_abort ();
+  if (parity512 (8278593062772967967574644592392030907507244457324713380127157444008480135136016412791369421272159911061801023217823646324038055629840240503699995274750141uwb) != __builtin_parityg (8278593062772967967574644592392030907507244457324713380127157444008480135136016412791369421272159911061801023217823646324038055629840240503699995274750141uwb)
+      || parity512 (663951521760319802637316646127146913163123967584512032007606686578544864655291546789196279408181546344880831465704154822174055168766759305688225967189384uwb) != __builtin_parityg (663951521760319802637316646127146913163123967584512032007606686578544864655291546789196279408181546344880831465704154822174055168766759305688225967189384uwb)
+      || parity512 (8114152627481936575035564712656624361256533214211179387274127464949371919139038942819974113641465089580051998523156404968195970853124179018281296621919217uwb) != __builtin_parityg (8114152627481936575035564712656624361256533214211179387274127464949371919139038942819974113641465089580051998523156404968195970853124179018281296621919217uwb)
+      || popcount512 (697171368046392901434470580443928282938585745214587494987284546386421344865289735592202298494880955572094546861862007016154025065165834164941207378563932uwb) != __builtin_popcountg (697171368046392901434470580443928282938585745214587494987284546386421344865289735592202298494880955572094546861862007016154025065165834164941207378563932uwb)
+      || popcount512 (12625357869391866487124235043239209385173615631331705015179232007319637649427586947822360147798041278948617160703315666047585702906648747835331939389354450uwb) != __builtin_popcountg (12625357869391866487124235043239209385173615631331705015179232007319637649427586947822360147798041278948617160703315666047585702906648747835331939389354450uwb)
+      || popcount512 (12989863959706456104163426941303698078341934896544520782734564901708926112239778316241786242633862403309192697330635825122310265805838908726925342761646021uwb) != __builtin_popcountg (12989863959706456104163426941303698078341934896544520782734564901708926112239778316241786242633862403309192697330635825122310265805838908726925342761646021uwb))
+    __builtin_abort ();
+#endif
+#if __BITINT_MAXWIDTH__ >= 523
+  if (clzd523 (0) != 523
+      || clzD523 (0, 42) != 42
+      || ctzd523 (0) != 523
+      || ctzD523 (0, -1) != -1
+      || clrsb523 (0) != 523 - 1
+      || ffs523 (0) != 0
+      || parity523 (0) != 0
+      || popcount523 (0) != 0
+      || __builtin_clzg ((unsigned _BitInt(523)) 0, 523 + 32) != 523 + 32
+      || __builtin_ctzg ((unsigned _BitInt(523)) 0, 523) != 523
+      || __builtin_clrsbg ((_BitInt(523)) 0) != 523 - 1
+      || __builtin_ffsg ((_BitInt(523)) 0) != 0
+      || __builtin_parityg ((unsigned _BitInt(523)) 0) != 0
+      || __builtin_popcountg ((unsigned _BitInt(523)) 0) != 0)
+    __builtin_abort ();
+  if (clz523 (-1) != 0
+      || clzd523 (-1) != 0
+      || clzD523 (-1, 15) != 0
+      || ctz523 (-1) != 0
+      || ctzd523 (-1) != 0
+      || ctzD523 (-1, -57) != 0
+      || clrsb523 (-1) != 523 - 1
+      || ffs523 (-1) != 1
+      || parity523 (-1) != 1
+      || popcount523 (-1) != 523
+      || __builtin_clzg ((unsigned _BitInt(523)) -1) != 0
+      || __builtin_clzg ((unsigned _BitInt(523)) -1, 523 + 32) != 0
+      || __builtin_ctzg ((unsigned _BitInt(523)) -1) != 0
+      || __builtin_ctzg ((unsigned _BitInt(523)) -1, 523) != 0
+      || __builtin_clrsbg ((_BitInt(523)) -1) != 523 - 1
+      || __builtin_ffsg ((_BitInt(523)) -1) != 1
+      || __builtin_parityg ((unsigned _BitInt(523)) -1) != 1
+      || __builtin_popcountg ((unsigned _BitInt(523)) -1) != 523)
+    __builtin_abort ();
+  if (clz523 (((unsigned _BitInt(523)) -1) >> 24) != 24
+      || clz523 (((unsigned _BitInt(523)) -1) >> 79) != 79
+      || clz523 (1) != 523 - 1
+      || clzd523 (((unsigned _BitInt(523)) -1) >> 139) != 139
+      || clzd523 (2) != 523 - 2
+      || ctz523 (((unsigned _BitInt(523)) -1) << 42) != 42
+      || ctz523 (((unsigned _BitInt(523)) -1) << 57) != 57
+      || ctz523 (0x4000000000000000000000uwb) != 86
+      || ctzd523 (((unsigned _BitInt(523)) -1) << 149) != 149
+      || ctzd523 (2) != 1
+      || clrsb523 ((unsigned _BitInt(523 - 4)) -1) != 3
+      || clrsb523 ((unsigned _BitInt(523 - 28)) -1) != 27
+      || clrsb523 ((unsigned _BitInt(523 - 29)) -1) != 28
+      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 68)) -1) != 67
+      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 92)) -1) != 91
+      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 93)) -1) != 92
+      || ffs523 (((unsigned _BitInt(523)) -1) << 42) != 43
+      || ffs523 (((unsigned _BitInt(523)) -1) << 57) != 58
+      || ffs523 (0x4000000000000000000000uwb) != 87
+      || ffs523 (((unsigned _BitInt(523)) -1) << 149) != 150
+      || ffs523 (2) != 2
+      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 24) != 24
+      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 79) != 79
+      || __builtin_clzg ((unsigned _BitInt(523)) 1) != 523 - 1
+      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 139, 523) != 139
+      || __builtin_clzg ((unsigned _BitInt(523)) 2, 523) != 523 - 2
+      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 42) != 42
+      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 57) != 57
+      || __builtin_ctzg ((unsigned _BitInt(523)) 0x4000000000000000000000uwb) != 86
+      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 149, 523) != 149
+      || __builtin_ctzg ((unsigned _BitInt(523)) 2, 523) != 1
+      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 4)) -1) != 3
+      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 28)) -1) != 27
+      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 29)) -1) != 28
+      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 68)) -1) != 67
+      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 92)) -1) != 91
+      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 93)) -1) != 92
+      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 42)) != 43
+      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 57)) != 58
+      || __builtin_ffsg ((_BitInt(523)) 0x4000000000000000000000uwb) != 87
+      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 149)) != 150
+      || __builtin_ffsg ((_BitInt(523)) 2) != 2)
+    __builtin_abort ();
+  if (parity523 (14226628251091586975416900831427560438504550751597528218770815297642064445318137709184907300499591292677456563377096100346699421879373024906380724757049700104uwb) != __builtin_parityg (14226628251091586975416900831427560438504550751597528218770815297642064445318137709184907300499591292677456563377096100346699421879373024906380724757049700104uwb)
+      || parity523 (20688958227123188226117538663818621034852702121556301239818743230005799574164516085541310491875153692467123662601853835357822935286851364843928714141587045255uwb) != __builtin_parityg (20688958227123188226117538663818621034852702121556301239818743230005799574164516085541310491875153692467123662601853835357822935286851364843928714141587045255uwb)
+      || parity523 (8927708174664018648856542263215989788443763271738485875573765922613438023117960552135374015673598803453205044464280019640319125968982118836809392169156450404uwb) != __builtin_parityg (8927708174664018648856542263215989788443763271738485875573765922613438023117960552135374015673598803453205044464280019640319125968982118836809392169156450404uwb)
+      || popcount523 (27178327344587654457581274852432957423537947348354896748701960885269035920194935311522194372418922852798513401240689173265979378157685169921449935364246334672uwb) != __builtin_popcountg (27178327344587654457581274852432957423537947348354896748701960885269035920194935311522194372418922852798513401240689173265979378157685169921449935364246334672uwb)
+      || popcount523 (5307736750284212829931201546806718535860789684371772688568780952567669490917265125893664418036905110148872995350655890585853451175740907670080602411287166989uwb) != __builtin_popcountg (5307736750284212829931201546806718535860789684371772688568780952567669490917265125893664418036905110148872995350655890585853451175740907670080602411287166989uwb)
+      || popcount523 (21261096432069432668470452941790780841888331284195411465624030283325239673941548816191698556934198698768393659379577567450765073013688585051560340496749593370uwb) != __builtin_popcountg (21261096432069432668470452941790780841888331284195411465624030283325239673941548816191698556934198698768393659379577567450765073013688585051560340496749593370uwb))
+    __builtin_abort ();
+#endif
+}

	Jakub



* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-09 15:02 [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309] Jakub Jelinek
@ 2023-11-09 21:43 ` Joseph Myers
  2023-11-10  8:09 ` Richard Biener
  2023-12-16  5:51 ` Andrew Pinski
  2 siblings, 0 replies; 10+ messages in thread
From: Joseph Myers @ 2023-11-09 21:43 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, Jason Merrill, gcc-patches

On Thu, 9 Nov 2023, Jakub Jelinek wrote:

> The main reason to add these is to support arbitrary unsigned (for
> clrsb/ffs signed) bit-precise integer types and also __int128 which
> wasn't supported by the existing builtins, so that e.g. <stdbit.h>
> type-generic functions could then support not just bit-precise unsigned
> integer type whose width matches a standard or extended integer type,
> but others too.

Thanks for working on this.  My plan for the <stdbit.h> implementation I'm 
working on for glibc is to start with implementations using existing 
built-in functions (and only handling types whose width matches standard 
types), then supporting _BitInt with these new built-in functions will be 
suitable for a followup.
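
As a rough illustration of that first step (not the actual glibc code), a
type-generic leading-zero count restricted to standard-width unsigned types
could dispatch over the existing built-ins with _Generic.  The names
lz_narrow, lz_long, lz_llong and sketch_leading_zeros are hypothetical, and
the sketch assumes the unsigned types have no padding bits:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers; the names are illustrative only.  */

/* Leading zeros of X viewed as a WIDTH-bit value, where WIDTH is at most
   the width of unsigned int.  __builtin_clz (0) is undefined, so zero is
   handled explicitly.  */
static inline unsigned int
lz_narrow (unsigned int x, unsigned int width)
{
  unsigned int uint_bits = (unsigned int) (sizeof (unsigned int) * CHAR_BIT);
  if (x == 0)
    return width;
  return (unsigned int) __builtin_clz (x) - (uint_bits - width);
}

static inline unsigned int
lz_long (unsigned long x)
{
  return x ? (unsigned int) __builtin_clzl (x)
	   : (unsigned int) (sizeof (unsigned long) * CHAR_BIT);
}

static inline unsigned int
lz_llong (unsigned long long x)
{
  return x ? (unsigned int) __builtin_clzll (x)
	   : (unsigned int) (sizeof (unsigned long long) * CHAR_BIT);
}

/* Only types whose width matches a standard type are accepted; anything
   else (including _BitInt) is a compile-time error in this sketch.  */
#define sketch_leading_zeros(x) \
  _Generic ((x), \
    unsigned char: lz_narrow ((x), CHAR_BIT), \
    unsigned short: lz_narrow ((x), sizeof (unsigned short) * CHAR_BIT), \
    unsigned int: lz_narrow ((x), sizeof (unsigned int) * CHAR_BIT), \
    unsigned long: lz_long (x), \
    unsigned long long: lz_llong (x))

/* Small self-check of the sketch.  */
void
sketch_self_check (void)
{
  assert (sketch_leading_zeros ((unsigned char) 1) == CHAR_BIT - 1);
  assert (sketch_leading_zeros (0u) == sizeof (unsigned int) * CHAR_BIT);
}
```

The explicit zero checks are what the two-argument __builtin_clzg form would
make unnecessary.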

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-09 15:02 [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309] Jakub Jelinek
  2023-11-09 21:43 ` Joseph Myers
@ 2023-11-10  8:09 ` Richard Biener
  2023-11-10  9:10   ` Jakub Jelinek
  2023-12-16  5:51 ` Andrew Pinski
  2 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2023-11-10  8:09 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Joseph S. Myers, Jason Merrill, gcc-patches

On Thu, 9 Nov 2023, Jakub Jelinek wrote:

> Hi!
> 
> The following patch adds 6 new type-generic builtins,
> __builtin_clzg
> __builtin_ctzg
> __builtin_clrsbg
> __builtin_ffsg
> __builtin_parityg
> __builtin_popcountg
> The g at the end stands for generic because the unsuffixed variant
> of the builtins already have unsigned int or int arguments.
> 
> The main reason to add these is to support arbitrary unsigned (for
> clrsb/ffs signed) bit-precise integer types and also __int128 which
> wasn't supported by the existing builtins, so that e.g. <stdbit.h>
> type-generic functions could then support not just bit-precise unsigned
> integer type whose width matches a standard or extended integer type,
> but others too.
> 
> None of these new builtins promote their first argument, so the argument
> can be e.g. unsigned char or unsigned short or unsigned __int20 etc.

But is that a good idea?  Is that how type generic functions work in C?
I think it introduces non-obvious/unexpected behavior in user code.
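
To make the concern concrete, here is a small sketch using only the existing
__builtin_clz.  It contrasts the count obtained after the argument has been
widened to unsigned int with the count a non-promoting builtin would report
for the same unsigned char.  The function names are hypothetical, and both
functions are undefined for 0, like __builtin_clz itself:

```c
#include <assert.h>
#include <limits.h>

/* The argument is converted to unsigned int before the count, so the
   zero bits introduced by widening are included in the result.  */
int
clz_after_promotion (unsigned char x)
{
  return __builtin_clz (x);
}

/* The count within the argument's own width, which is what a
   non-promoting builtin would return for an unsigned char operand.  */
int
clz_in_own_width (unsigned char x)
{
  int extra = (int) ((sizeof (unsigned int) - sizeof (unsigned char))
		     * CHAR_BIT);
  return __builtin_clz (x) - extra;
}

/* Small self-check of the sketch.  */
void
promotion_self_check (void)
{
  assert (clz_in_own_width (1) == CHAR_BIT - 1);
  assert (clz_after_promotion (1)
	  == (int) (sizeof (unsigned int) * CHAR_BIT) - 1);
}
```

On a target with 8-bit char and 32-bit int the two results for the value 1
are 31 and 7, so whether the builtin promotes is visible to users.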

If people do not want to "compensate" for this maybe instead also add
__builtin_*{8,16} (like we have for the bswap variants)?

Otherwise this looks reasonable.  I'm not sure why we need separate
CFN_CLZ and CFN_BUILT_IN_CLZG?  (why CFN_BUILT_IN_CLZG and not CFN_CLZG?)
That is, I'm confused about

     CASE_CFN_CLRSB:
+    case CFN_BUILT_IN_CLRSBG:

why does CASE_CFN_CLRSB not include CLRSBG?  It includes IFN_CLRSB, no?
And IFN_CLRSB already has the two and one arg case and thus encompasses
some BUILT_IN_CLRSBG cases?

> The first 2 support either 1 or 2 arguments, if only 1 argument is supplied,
> the behavior is undefined for argument 0 like for other __builtin_c[lt]z*
> builtins, if 2 arguments are supplied, the second argument should be int
> that will be returned if the argument is 0.  All other builtins have
> just one argument.  For __builtin_clrsbg and __builtin_ffsg the argument
> shall be any signed standard/extended or bit-precise integer, for the others
> any unsigned standard/extended or bit-precise integer (bool not allowed).
> 
> One possibility would be to also allow signed integer types for
> the clz/ctz/parity/popcount ones (and just cast the argument to
> unsigned_type_for during folding) and similarly unsigned integer types
> for the clrsb/ffs ones, dunno what is better; for stdbit.h the current
> version is sufficient and diagnoses use of the inappropriate sign,
> though on the other side I wonder if users won't be confused by
> __builtin_clzg (1) being an error and having to write __builtin_clzg (1U).
> And I think we don't have anything in C that would allow casting to
> corresponding unsigned type (or vice versa) given arbitrary integral type,
> one could use _Generic for that for standard and extended types, but not
> for arbitrary _BitInt.  What do you think?
> 
> The new builtins are lowered to corresponding builtins with other suffixes
> or internal calls (plus casts and adjustments where needed) during FE
> folding or during gimplification at latest, the non-suffixed builtins
> handling precisions up to precision of int, l up to precision of long,
> ll up to precision of long long, up to __int128 precision lowered to
> double-word expansion early and the rest (which must be _BitInt) lowered
> to internal fn calls - those are then lowered during bitint lowering pass.
> 
> The patch also changes representation of IFN_CLZ and IFN_CTZ calls,
> previously they were in the IL only if they are directly supported optab
> and depending on C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 they had or didn't
> have defined behavior at 0, now they are in the IL either if directly
> supported optab, or for the large/huge BITINT_TYPEs and they have either
> 1 or 2 arguments.  If one, the behavior is undefined at zero, if 2, the
> second argument is an int constant that should be returned for 0.
> As there is no extra support during expansion, for directly supported optab
> the second argument if present should still match the
> C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 value, but for BITINT_TYPE arguments
> it can be arbitrary int INTEGER_CST.
> 
> The goal is e.g.
> #ifdef __has_builtin
> #if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_popcountg)
> #define stdc_leading_zeros(x) \
>   __builtin_clzg (x, __builtin_popcountg ((__typeof (x)) -1))
> #endif
> #endif
> where __builtin_popcountg ((__typeof (x)) -1) computes the bit precision
> of x's type (kind of _Bitwidthof (x) alternative).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Besides the above question I'd say OK (I assume Joseph's reply is a
general ack from his side).

Thanks,
Richard.
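
For readers without a compiler that has the new builtins, the semantics of
the quoted stdc_leading_zeros goal can be modelled in portable code.  This is
only a sketch: model_popcount, model_clz, MODEL_WIDTH and model_leading_zeros
are hypothetical stand-ins, they handle unsigned types up to unsigned long
long only, and __typeof__ is a GNU extension:

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for __builtin_popcountg: number of set bits in X.  */
static inline int
model_popcount (unsigned long long x)
{
  int n = 0;
  while (x)
    {
      x &= x - 1;	/* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* Stand-in for the two-argument __builtin_clzg: leading zeros of X
   viewed as a WIDTH-bit value, or AT_ZERO when X is 0.  */
static inline int
model_clz (unsigned long long x, int width, int at_zero)
{
  if (x == 0)
    return at_zero;
  int n = 0;
  for (int bit = width - 1; bit >= 0 && !((x >> bit) & 1); bit--)
    n++;
  return n;
}

/* Converting -1 to an unsigned type sets every bit, so its population
   count is the bit precision of that type.  */
#define MODEL_WIDTH(x) model_popcount ((__typeof__ (x)) -1)

#define model_leading_zeros(x) \
  model_clz ((x), MODEL_WIDTH (x), MODEL_WIDTH (x))

/* Small self-check of the sketch.  */
void
model_self_check (void)
{
  assert (MODEL_WIDTH (0u) == sizeof (unsigned int) * CHAR_BIT);
  assert (model_leading_zeros ((unsigned char) 0) == CHAR_BIT);
}
```

Passing the type's precision as the value at zero is what makes the result
for 0 equal to the full width.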

> 2023-11-09  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c/111309
> gcc/
> 	* builtins.def (BUILT_IN_CLZG, BUILT_IN_CTZG, BUILT_IN_CLRSBG,
> 	BUILT_IN_FFSG, BUILT_IN_PARITYG, BUILT_IN_POPCOUNTG): New
> 	builtins.
> 	* builtins.cc (fold_builtin_bit_query): New function.
> 	(fold_builtin_1): Use it for
> 	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
> 	(fold_builtin_2): Use it for BUILT_IN_{CLZ,CTZ}G.
> 	* fold-const-call.cc: Fix comment typo on tm.h inclusion.
> 	(fold_const_call_ss): Handle
> 	CFN_BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
> 	(fold_const_call_sss): New function.
> 	(fold_const_call_1): Call it for 2 argument functions returning
> 	scalar when passed 2 INTEGER_CSTs.
> 	* genmatch.cc (cmp_operand): For function calls also compare
> 	number of arguments.
> 	(fns_cmp): New function.
> 	(dt_node::gen_kids): Sort fns and generic_fns.
> 	(dt_node::gen_kids_1): Handle fns with the same id but different
> 	number of arguments.
> 	* match.pd (CLZ simplifications): Drop checks for defined behavior
> 	at zero.  Add variant of simplifications for IFN_CLZ with 2 arguments.
> 	(CTZ simplifications): Drop checks for defined behavior at zero,
> 	don't optimize precisions above MAX_FIXED_MODE_SIZE.  Add variant of
> 	simplifications for IFN_CTZ with 2 arguments.
> 	(a != 0 ? CLZ(a) : CST -> .CLZ(a)): Use TREE_TYPE (@3) instead of
> 	type, add BITINT_TYPE handling, create 2 argument IFN_CLZ rather than
> 	one argument.  Add variant for matching CLZ with 2 arguments.
> 	(a != 0 ? CTZ(a) : CST -> .CTZ(a)): Similarly.
> 	* gimple-lower-bitint.cc (bitint_large_huge::lower_bit_query): New
> 	method.
> 	(bitint_large_huge::lower_call): Use it for IFN_{CLZ,CTZ,CLRSB,FFS}
> 	and IFN_{PARITY,POPCOUNT} calls.
> 	* gimple-range-op.cc (cfn_clz::fold_range): Don't check
> 	CLZ_DEFINED_VALUE_AT_ZERO for m_gimple_call_internal_p, instead
> 	assume defined value at zero if the call has 2 arguments and use
> 	second argument value for that case.
> 	(cfn_ctz::fold_range): Similarly.
> 	(gimple_range_op_handler::maybe_builtin_call): Use op_cfn_clz_internal
> 	or op_cfn_ctz_internal only if internal fn call has 2 arguments and
> 	set m_op2 in that case.
> 	* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern,
> 	vect_recog_popcount_clz_ctz_ffs_pattern): For value defined at zero
> 	use second argument of calls if present, otherwise assume UB at zero,
> 	create 2 argument .CLZ/.CTZ calls if needed.
> 	* tree-vect-stmts.cc (vectorizable_call): Handle 2 argument .CLZ/.CTZ
> 	calls.
> 	* tree-ssa-loop-niter.cc (build_cltz_expr): Create 2 argument
> 	.CLZ/.CTZ calls if needed.
> 	* tree-ssa-forwprop.cc (simplify_count_trailing_zeroes): Create 2
> 	argument .CTZ calls if needed.
> 	* tree-ssa-phiopt.cc (cond_removal_in_builtin_zero_pattern): Handle
> 	2 argument .CLZ/.CTZ calls, handle BITINT_TYPE, create 2 argument
> 	.CLZ/.CTZ calls.
> 	* doc/extend.texi (__builtin_clzg, __builtin_ctzg, __builtin_clrsbg,
> 	__builtin_ffsg, __builtin_parityg, __builtin_popcountg): Document.
> gcc/c-family/
> 	* c-common.cc (check_builtin_function_arguments): Handle
> 	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
> 	* c-gimplify.cc (c_gimplify_expr): If __builtin_c[lt]zg second
> 	argument hasn't been folded into constant yet, transform it to one
> 	argument call inside of a COND_EXPR which for first argument 0
> 	returns the second argument.
> gcc/c/
> 	* c-typeck.cc (convert_arguments): Don't promote first argument
> 	of BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
> gcc/cp/
> 	* call.cc (magic_varargs_p): Return 4 for
> 	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
> 	(build_over_call): Don't promote first argument of
> 	BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
> 	* cp-gimplify.cc (cp_gimplify_expr): For BUILT_IN_C{L,T}ZG use
> 	c_gimplify_expr.
> gcc/testsuite/
> 	* c-c++-common/pr111309-1.c: New test.
> 	* c-c++-common/pr111309-2.c: New test.
> 	* gcc.dg/torture/bitint-43.c: New test.
> 	* gcc.dg/torture/bitint-44.c: New test.
> 
> --- gcc/builtins.def.jj	2023-11-09 09:04:18.396546519 +0100
> +++ gcc/builtins.def	2023-11-09 09:17:40.235182413 +0100
> @@ -962,15 +962,18 @@ DEF_GCC_BUILTIN        (BUILT_IN_CLZ, "c
>  DEF_GCC_BUILTIN        (BUILT_IN_CLZIMAX, "clzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLZL, "clzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLZLL, "clzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_CLZG, "clzg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_CONSTANT_P, "constant_p", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZ, "ctz", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_CTZG, "ctzg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_CLRSBG, "clrsbg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
>  DEF_GCC_BUILTIN        (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
> @@ -993,6 +996,7 @@ DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFS, "f
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSIMAX, "ffsimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSL, "ffsl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSLL, "ffsll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_FFSG, "ffsg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_EXT_LIB_BUILTIN        (BUILT_IN_FORK, "fork", BT_FN_PID, ATTR_NOTHROW_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_FRAME_ADDRESS, "frame_address", BT_FN_PTR_UINT, ATTR_NULL)
>  /* [trans-mem]: Adjust BUILT_IN_TM_FREE if BUILT_IN_FREE is changed.  */
> @@ -1041,10 +1045,12 @@ DEF_GCC_BUILTIN        (BUILT_IN_PARITY,
>  DEF_GCC_BUILTIN        (BUILT_IN_PARITYIMAX, "parityimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_PARITYL, "parityl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_PARITYLL, "parityll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_PARITYG, "parityg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNT, "popcount", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTIMAX, "popcountimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTL, "popcountl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTLL, "popcountll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTG, "popcountg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_POSIX_MEMALIGN, "posix_memalign", BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_PREFETCH, "prefetch", BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST)
>  DEF_LIB_BUILTIN        (BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST)
> --- gcc/builtins.cc.jj	2023-11-09 09:03:53.107904770 +0100
> +++ gcc/builtins.cc	2023-11-09 09:17:40.230182483 +0100
> @@ -9573,6 +9573,271 @@ fold_builtin_arith_overflow (location_t
>    return build2_loc (loc, COMPOUND_EXPR, boolean_type_node, store, ovfres);
>  }
>  
> +/* Fold __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g into corresponding
> +   internal function.  */
> +
> +static tree
> +fold_builtin_bit_query (location_t loc, enum built_in_function fcode,
> +			tree arg0, tree arg1)
> +{
> +  enum internal_fn ifn;
> +  enum built_in_function fcodei, fcodel, fcodell;
> +  tree arg0_type = TREE_TYPE (arg0);
> +  tree cast_type = NULL_TREE;
> +  int addend = 0;
> +
> +  switch (fcode)
> +    {
> +    case BUILT_IN_CLZG:
> +      if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
> +	return NULL_TREE;
> +      ifn = IFN_CLZ;
> +      fcodei = BUILT_IN_CLZ;
> +      fcodel = BUILT_IN_CLZL;
> +      fcodell = BUILT_IN_CLZLL;
> +      break;
> +    case BUILT_IN_CTZG:
> +      if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
> +	return NULL_TREE;
> +      ifn = IFN_CTZ;
> +      fcodei = BUILT_IN_CTZ;
> +      fcodel = BUILT_IN_CTZL;
> +      fcodell = BUILT_IN_CTZLL;
> +      break;
> +    case BUILT_IN_CLRSBG:
> +      ifn = IFN_CLRSB;
> +      fcodei = BUILT_IN_CLRSB;
> +      fcodel = BUILT_IN_CLRSBL;
> +      fcodell = BUILT_IN_CLRSBLL;
> +      break;
> +    case BUILT_IN_FFSG:
> +      ifn = IFN_FFS;
> +      fcodei = BUILT_IN_FFS;
> +      fcodel = BUILT_IN_FFSL;
> +      fcodell = BUILT_IN_FFSLL;
> +      break;
> +    case BUILT_IN_PARITYG:
> +      ifn = IFN_PARITY;
> +      fcodei = BUILT_IN_PARITY;
> +      fcodel = BUILT_IN_PARITYL;
> +      fcodell = BUILT_IN_PARITYLL;
> +      break;
> +    case BUILT_IN_POPCOUNTG:
> +      ifn = IFN_POPCOUNT;
> +      fcodei = BUILT_IN_POPCOUNT;
> +      fcodel = BUILT_IN_POPCOUNTL;
> +      fcodell = BUILT_IN_POPCOUNTLL;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  if (TYPE_PRECISION (arg0_type)
> +      <= TYPE_PRECISION (long_long_unsigned_type_node))
> +    {
> +      if (TYPE_PRECISION (arg0_type) <= TYPE_PRECISION (unsigned_type_node))
> +
> +	cast_type = (TYPE_UNSIGNED (arg0_type)
> +		     ? unsigned_type_node : integer_type_node);
> +      else if (TYPE_PRECISION (arg0_type)
> +	       <= TYPE_PRECISION (long_unsigned_type_node))
> +	{
> +	  cast_type = (TYPE_UNSIGNED (arg0_type)
> +		       ? long_unsigned_type_node : long_integer_type_node);
> +	  fcodei = fcodel;
> +	}
> +      else
> +	{
> +	  cast_type = (TYPE_UNSIGNED (arg0_type)
> +		       ? long_long_unsigned_type_node
> +		       : long_long_integer_type_node);
> +	  fcodei = fcodell;
> +	}
> +    }
> +  else if (TYPE_PRECISION (arg0_type) <= MAX_FIXED_MODE_SIZE)
> +    {
> +      cast_type
> +	= build_nonstandard_integer_type (MAX_FIXED_MODE_SIZE,
> +					  TYPE_UNSIGNED (arg0_type));
> +      gcc_assert (TYPE_PRECISION (cast_type)
> +		  == 2 * TYPE_PRECISION (long_long_unsigned_type_node));
> +      fcodei = END_BUILTINS;
> +    }
> +  else
> +    fcodei = END_BUILTINS;
> +  if (cast_type)
> +    {
> +      switch (fcode)
> +	{
> +	case BUILT_IN_CLZG:
> +	case BUILT_IN_CLRSBG:
> +	  addend = TYPE_PRECISION (arg0_type) - TYPE_PRECISION (cast_type);
> +	  break;
> +	default:
> +	  break;
> +	}
> +      arg0 = fold_convert (cast_type, arg0);
> +      arg0_type = cast_type;
> +    }
> +
> +  if (arg1)
> +    arg1 = fold_convert (integer_type_node, arg1);
> +
> +  tree arg2 = arg1;
> +  if (fcode == BUILT_IN_CLZG && addend)
> +    {
> +      if (arg1)
> +	arg0 = save_expr (arg0);
> +      arg2 = NULL_TREE;
> +    }
> +  tree call = NULL_TREE, tem;
> +  if (TYPE_PRECISION (arg0_type) == MAX_FIXED_MODE_SIZE
> +      && (TYPE_PRECISION (arg0_type)
> +	  == 2 * TYPE_PRECISION (long_long_unsigned_type_node)))
> +    {
> +      /* __int128 expansions using up to 2 long long builtins.  */
> +      arg0 = save_expr (arg0);
> +      tree type = (TYPE_UNSIGNED (arg0_type)
> +		   ? long_long_unsigned_type_node
> +		   : long_long_integer_type_node);
> +      tree hi = fold_build2 (RSHIFT_EXPR, arg0_type, arg0,
> +			     build_int_cst (integer_type_node,
> +					    MAX_FIXED_MODE_SIZE / 2));
> +      hi = fold_convert (type, hi);
> +      tree lo = fold_convert (type, arg0);
> +      switch (fcode)
> +	{
> +	case BUILT_IN_CLZG:
> +	  call = fold_builtin_bit_query (loc, fcode, lo, NULL_TREE);
> +	  call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +			      build_int_cst (integer_type_node,
> +					     MAX_FIXED_MODE_SIZE / 2));
> +	  if (arg2)
> +	    call = fold_build3 (COND_EXPR, integer_type_node,
> +				fold_build2 (NE_EXPR, boolean_type_node,
> +					     lo, build_zero_cst (type)),
> +				call, arg2);
> +	  call = fold_build3 (COND_EXPR, integer_type_node,
> +			      fold_build2 (NE_EXPR, boolean_type_node,
> +					   hi, build_zero_cst (type)),
> +			      fold_builtin_bit_query (loc, fcode, hi,
> +						      NULL_TREE),
> +			      call);
> +	  break;
> +	case BUILT_IN_CTZG:
> +	  call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
> +	  call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +			      build_int_cst (integer_type_node,
> +					     MAX_FIXED_MODE_SIZE / 2));
> +	  if (arg2)
> +	    call = fold_build3 (COND_EXPR, integer_type_node,
> +				fold_build2 (NE_EXPR, boolean_type_node,
> +					     hi, build_zero_cst (type)),
> +				call, arg2);
> +	  call = fold_build3 (COND_EXPR, integer_type_node,
> +			      fold_build2 (NE_EXPR, boolean_type_node,
> +					   lo, build_zero_cst (type)),
> +			      fold_builtin_bit_query (loc, fcode, lo,
> +						      NULL_TREE),
> +			      call);
> +	  break;
> +	case BUILT_IN_CLRSBG:
> +	  tem = fold_builtin_bit_query (loc, fcode, lo, NULL_TREE);
> +	  tem = fold_build2 (PLUS_EXPR, integer_type_node, tem,
> +			     build_int_cst (integer_type_node,
> +					    MAX_FIXED_MODE_SIZE / 2));
> +	  tem = fold_build3 (COND_EXPR, integer_type_node,
> +			     fold_build2 (LT_EXPR, boolean_type_node,
> +					  fold_build2 (BIT_XOR_EXPR, type,
> +						       lo, hi),
> +					  build_zero_cst (type)),
> +			     build_int_cst (integer_type_node,
> +					    MAX_FIXED_MODE_SIZE / 2 - 1),
> +			     tem);
> +	  call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
> +	  call = save_expr (call);
> +	  call = fold_build3 (COND_EXPR, integer_type_node,
> +			      fold_build2 (NE_EXPR, boolean_type_node,
> +					   call,
> +					   build_int_cst (integer_type_node,
> +							  MAX_FIXED_MODE_SIZE
> +							  / 2 - 1)),
> +			      call, tem);
> +	  break;
> +	case BUILT_IN_FFSG:
> +	  call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
> +	  call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +			      build_int_cst (integer_type_node,
> +					     MAX_FIXED_MODE_SIZE / 2));
> +	  call = fold_build3 (COND_EXPR, integer_type_node,
> +			      fold_build2 (NE_EXPR, boolean_type_node,
> +					   hi, build_zero_cst (type)),
> +			      call, integer_zero_node);
> +	  call = fold_build3 (COND_EXPR, integer_type_node,
> +			      fold_build2 (NE_EXPR, boolean_type_node,
> +					   lo, build_zero_cst (type)),
> +			      fold_builtin_bit_query (loc, fcode, lo,
> +						      NULL_TREE),
> +			      call);
> +	  break;
> +	case BUILT_IN_PARITYG:
> +	  call = fold_builtin_bit_query (loc, fcode,
> +					 fold_build2 (BIT_XOR_EXPR, type,
> +						      lo, hi), NULL_TREE);
> +	  break;
> +	case BUILT_IN_POPCOUNTG:
> +	  call = fold_build2 (PLUS_EXPR, integer_type_node,
> +			      fold_builtin_bit_query (loc, fcode, hi,
> +						      NULL_TREE),
> +			      fold_builtin_bit_query (loc, fcode, lo,
> +						      NULL_TREE));
> +	  break;
> +	default:
> +	  gcc_unreachable ();
> +	}
> +    }
> +  else
> +    {
> +      /* Only keep second argument to IFN_CLZ/IFN_CTZ if it is the
> +	 value defined at zero during GIMPLE, or for large/huge _BitInt
> +	 (which are then lowered during bitint lowering).  */
> +      if (arg2 && TREE_CODE (TREE_TYPE (arg0)) != BITINT_TYPE)
> +	{
> +	  int val;
> +	  if (fcode == BUILT_IN_CLZG)
> +	    {
> +	      if (CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (arg0_type),
> +					     val) != 2
> +		  || wi::to_widest (arg2) != val)
> +		arg2 = NULL_TREE;
> +	    }
> +	  else if (CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (arg0_type),
> +					      val) != 2
> +		   || wi::to_widest (arg2) != val)
> +	    arg2 = NULL_TREE;
> +	  if (!direct_internal_fn_supported_p (ifn, arg0_type,
> +					       OPTIMIZE_FOR_BOTH))
> +	    arg2 = NULL_TREE;
> +	}
> +      if (fcodei == END_BUILTINS || arg2)
> +	call = build_call_expr_internal_loc (loc, ifn, integer_type_node,
> +					     arg2 ? 2 : 1, arg0, arg2);
> +      else
> +	call = build_call_expr_loc (loc, builtin_decl_explicit (fcodei), 1,
> +				    arg0);
> +    }
> +  if (addend)
> +    call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +			build_int_cst (integer_type_node, addend));
> +  if (arg1 && arg2 == NULL_TREE)
> +    call = fold_build3 (COND_EXPR, integer_type_node,
> +			fold_build2 (NE_EXPR, boolean_type_node,
> +				     arg0, build_zero_cst (arg0_type)),
> +			call, arg1);
> +
> +  return call;
> +}
> +
>  /* Fold __builtin_{add,sub}c{,l,ll} into pair of internal functions
>     that return both result of arithmetics and overflowed boolean
>     flag in a complex integer result.  */
> @@ -9824,6 +10089,14 @@ fold_builtin_1 (location_t loc, tree exp
>  	return build_empty_stmt (loc);
>        break;
>  
> +    case BUILT_IN_CLZG:
> +    case BUILT_IN_CTZG:
> +    case BUILT_IN_CLRSBG:
> +    case BUILT_IN_FFSG:
> +    case BUILT_IN_PARITYG:
> +    case BUILT_IN_POPCOUNTG:
> +      return fold_builtin_bit_query (loc, fcode, arg0, NULL_TREE);
> +
>      default:
>        break;
>      }
> @@ -9913,6 +10186,10 @@ fold_builtin_2 (location_t loc, tree exp
>      case BUILT_IN_ATOMIC_IS_LOCK_FREE:
>        return fold_builtin_atomic_is_lock_free (arg0, arg1);
>  
> +    case BUILT_IN_CLZG:
> +    case BUILT_IN_CTZG:
> +      return fold_builtin_bit_query (loc, fcode, arg0, arg1);
> +
>      default:
>        break;
>      }
> --- gcc/fold-const-call.cc.jj	2023-11-09 09:03:53.368901073 +0100
> +++ gcc/fold-const-call.cc	2023-11-09 09:17:40.240182342 +0100
> @@ -27,7 +27,7 @@ along with GCC; see the file COPYING3.
>  #include "fold-const.h"
>  #include "fold-const-call.h"
>  #include "case-cfn-macros.h"
> -#include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
> +#include "tm.h" /* For C[LT]Z_DEFINED_VALUE_AT_ZERO.  */
>  #include "builtins.h"
>  #include "gimple-expr.h"
>  #include "tree-vector-builder.h"
> @@ -1017,14 +1017,18 @@ fold_const_call_ss (wide_int *result, co
>    switch (fn)
>      {
>      CASE_CFN_FFS:
> +    case CFN_BUILT_IN_FFSG:
>        *result = wi::shwi (wi::ffs (arg), precision);
>        return true;
>  
>      CASE_CFN_CLZ:
> +    case CFN_BUILT_IN_CLZG:
>        {
>  	int tmp;
>  	if (wi::ne_p (arg, 0))
>  	  tmp = wi::clz (arg);
> +	else if (TREE_CODE (arg_type) == BITINT_TYPE)
> +	  tmp = TYPE_PRECISION (arg_type);
>  	else if (!CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (arg_type),
>  					     tmp))
>  	  tmp = TYPE_PRECISION (arg_type);
> @@ -1033,10 +1037,13 @@ fold_const_call_ss (wide_int *result, co
>        }
>  
>      CASE_CFN_CTZ:
> +    case CFN_BUILT_IN_CTZG:
>        {
>  	int tmp;
>  	if (wi::ne_p (arg, 0))
>  	  tmp = wi::ctz (arg);
> +	else if (TREE_CODE (arg_type) == BITINT_TYPE)
> +	  tmp = TYPE_PRECISION (arg_type);
>  	else if (!CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (arg_type),
>  					     tmp))
>  	  tmp = TYPE_PRECISION (arg_type);
> @@ -1045,14 +1052,17 @@ fold_const_call_ss (wide_int *result, co
>        }
>  
>      CASE_CFN_CLRSB:
> +    case CFN_BUILT_IN_CLRSBG:
>        *result = wi::shwi (wi::clrsb (arg), precision);
>        return true;
>  
>      CASE_CFN_POPCOUNT:
> +    case CFN_BUILT_IN_POPCOUNTG:
>        *result = wi::shwi (wi::popcount (arg), precision);
>        return true;
>  
>      CASE_CFN_PARITY:
> +    case CFN_BUILT_IN_PARITYG:
>        *result = wi::shwi (wi::parity (arg), precision);
>        return true;
>  
> @@ -1531,6 +1541,49 @@ fold_const_call_sss (real_value *result,
>  
>  /* Try to evaluate:
>  
> +      *RESULT = FN (ARG0, ARG1)
> +
> +   where ARG_TYPE is the type of ARG0 and PRECISION is the number of bits in
> +   the result.  Return true on success.  */
> +
> +static bool
> +fold_const_call_sss (wide_int *result, combined_fn fn,
> +		     const wide_int_ref &arg0, const wide_int_ref &arg1,
> +		     unsigned int precision, tree arg_type ATTRIBUTE_UNUSED)
> +{
> +  switch (fn)
> +    {
> +    case CFN_CLZ:
> +    case CFN_BUILT_IN_CLZG:
> +      {
> +	int tmp;
> +	if (wi::ne_p (arg0, 0))
> +	  tmp = wi::clz (arg0);
> +	else
> +	  tmp = arg1.to_shwi ();
> +	*result = wi::shwi (tmp, precision);
> +	return true;
> +      }
> +
> +    case CFN_CTZ:
> +    case CFN_BUILT_IN_CTZG:
> +      {
> +	int tmp;
> +	if (wi::ne_p (arg0, 0))
> +	  tmp = wi::ctz (arg0);
> +	else
> +	  tmp = arg1.to_shwi ();
> +	*result = wi::shwi (tmp, precision);
> +	return true;
> +      }
> +
> +    default:
> +      return false;
> +    }
> +}
> +
> +/* Try to evaluate:
> +
>        RESULT = fn (ARG0, ARG1)
>  
>     where FORMAT is the format of the real and imaginary parts of RESULT
> @@ -1565,6 +1618,19 @@ fold_const_call_1 (combined_fn fn, tree
>    machine_mode arg0_mode = TYPE_MODE (TREE_TYPE (arg0));
>    machine_mode arg1_mode = TYPE_MODE (TREE_TYPE (arg1));
>  
> +  if (integer_cst_p (arg0) && integer_cst_p (arg1))
> +    {
> +      if (SCALAR_INT_MODE_P (mode))
> +	{
> +	  wide_int result;
> +	  if (fold_const_call_sss (&result, fn, wi::to_wide (arg0),
> +				   wi::to_wide (arg1), TYPE_PRECISION (type),
> +				   TREE_TYPE (arg0)))
> +	    return wide_int_to_tree (type, result);
> +	}
> +      return NULL_TREE;
> +    }
> +
>    if (mode == arg0_mode
>        && real_cst_p (arg0)
>        && real_cst_p (arg1))
> --- gcc/genmatch.cc.jj	2023-11-09 09:03:53.375900973 +0100
> +++ gcc/genmatch.cc	2023-11-09 09:17:40.234182427 +0100
> @@ -1895,8 +1895,14 @@ cmp_operand (operand *o1, operand *o2)
>      {
>        expr *e1 = static_cast<expr *>(o1);
>        expr *e2 = static_cast<expr *>(o2);
> -      return (e1->operation == e2->operation
> -	      && e1->is_generic == e2->is_generic);
> +      if (e1->operation != e2->operation
> +	  || e1->is_generic != e2->is_generic)
> +	return false;
> +      if (e1->operation->kind == id_base::FN
> +	  /* For function calls also compare number of arguments.  */
> +	  && e1->ops.length () != e2->ops.length ())
> +	return false;
> +      return true;
>      }
>    else
>      return false;
> @@ -3070,6 +3076,26 @@ dt_operand::gen_generic_expr (FILE *f, i
>    return 0;
>  }
>  
> +/* Compare 2 fns or generic_fns vector entries for vector sorting.
> +   Same operation entries with different number of arguments should
> +   be adjacent.  */
> +
> +static int
> +fns_cmp (const void *p1, const void *p2)
> +{
> +  dt_operand *op1 = *(dt_operand *const *) p1;
> +  dt_operand *op2 = *(dt_operand *const *) p2;
> +  expr *e1 = as_a <expr *> (op1->op);
> +  expr *e2 = as_a <expr *> (op2->op);
> +  id_base *b1 = e1->operation;
> +  id_base *b2 = e2->operation;
> +  if (b1->hashval < b2->hashval)
> +    return -1;
> +  if (b1->hashval > b2->hashval)
> +    return 1;
> +  return strcmp (b1->id, b2->id);
> +}
> +
>  /* Generate matching code for the children of the decision tree node.  */
>  
>  void
> @@ -3143,6 +3169,8 @@ dt_node::gen_kids (FILE *f, int indent,
>  	     Like DT_TRUE, DT_MATCH serves as a barrier as it can cause
>  	     dependent matches to get out-of-order.  Generate code now
>  	     for what we have collected sofar.  */
> +	  fns.qsort (fns_cmp);
> +	  generic_fns.qsort (fns_cmp);
>  	  gen_kids_1 (f, indent, gimple, depth, gimple_exprs, generic_exprs,
>  		      fns, generic_fns, preds, others);
>  	  /* And output the true operand itself.  */
> @@ -3159,6 +3187,8 @@ dt_node::gen_kids (FILE *f, int indent,
>      }
>  
>    /* Generate code for the remains.  */
> +  fns.qsort (fns_cmp);
> +  generic_fns.qsort (fns_cmp);
>    gen_kids_1 (f, indent, gimple, depth, gimple_exprs, generic_exprs,
>  	      fns, generic_fns, preds, others);
>  }
> @@ -3256,14 +3286,21 @@ dt_node::gen_kids_1 (FILE *f, int indent
>  
>  	  indent += 4;
>  	  fprintf_indent (f, indent, "{\n");
> +	  id_base *last_op = NULL;
>  	  for (unsigned i = 0; i < fns_len; ++i)
>  	    {
>  	      expr *e = as_a <expr *>(fns[i]->op);
> -	      if (user_id *u = dyn_cast <user_id *> (e->operation))
> -		for (auto id : u->substitutes)
> -		  fprintf_indent (f, indent, "case %s:\n", id->id);
> -	      else
> -		fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +	      if (e->operation != last_op)
> +		{
> +		  if (i)
> +		    fprintf_indent (f, indent, "  break;\n");
> +		  if (user_id *u = dyn_cast <user_id *> (e->operation))
> +		    for (auto id : u->substitutes)
> +		      fprintf_indent (f, indent, "case %s:\n", id->id);
> +		  else
> +		    fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +		}
> +	      last_op = e->operation;
>  	      /* We need to be defensive against bogus prototypes allowing
>  		 calls with not enough arguments.  */
>  	      fprintf_indent (f, indent,
> @@ -3272,9 +3309,9 @@ dt_node::gen_kids_1 (FILE *f, int indent
>  	      fprintf_indent (f, indent, "    {\n");
>  	      fns[i]->gen (f, indent + 6, true, depth);
>  	      fprintf_indent (f, indent, "    }\n");
> -	      fprintf_indent (f, indent, "  break;\n");
>  	    }
>  
> +	  fprintf_indent (f, indent, "  break;\n");
>  	  fprintf_indent (f, indent, "default:;\n");
>  	  fprintf_indent (f, indent, "}\n");
>  	  indent -= 4;
> @@ -3334,18 +3371,25 @@ dt_node::gen_kids_1 (FILE *f, int indent
>  		      "    {\n");
>        indent += 4;
>  
> +      id_base *last_op = NULL;
>        for (unsigned j = 0; j < generic_fns.length (); ++j)
>  	{
>  	  expr *e = as_a <expr *>(generic_fns[j]->op);
>  	  gcc_assert (e->operation->kind == id_base::FN);
>  
> -	  fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +	  if (e->operation != last_op)
> +	    {
> +	      if (j)
> +		fprintf_indent (f, indent, "  break;\n");
> +	      fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +	    }
> +	  last_op = e->operation;
>  	  fprintf_indent (f, indent, "  if (call_expr_nargs (%s) == %d)\n"
>  				     "    {\n", kid_opname, e->ops.length ());
>  	  generic_fns[j]->gen (f, indent + 6, false, depth);
> -	  fprintf_indent (f, indent, "    }\n"
> -				     "  break;\n");
> +	  fprintf_indent (f, indent, "    }\n");
>  	}
> +      fprintf_indent (f, indent, "  break;\n");
>        fprintf_indent (f, indent, "default:;\n");
>  
>        indent -= 4;
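
As I read the genmatch.cc hunks, the goal is that patterns for the same function with different argument counts share one case label, with a single break after the last of them. A rough sketch of the shape of the generated code follows; the enum, the function name and the returned ids are all made up for illustration, and the real output calls the matchers and falls through on failure instead of returning directly.

```c
/* Hypothetical stand-ins for combined_fn codes.  */
enum fn_code { FN_CLZ, FN_CTZ, FN_OTHER };

/* Returns an id for the arm that matched, or 0 if none did.  */
static int
match_call (enum fn_code fn, int nargs)
{
  switch (fn)
    {
    case FN_CLZ:
      /* Two entries for the same operation sit under one label;
	 sorting with fns_cmp is what makes them adjacent.  */
      if (nargs == 1)
	{
	  return 1;
	}
      if (nargs == 2)
	{
	  return 2;
	}
      break;
    case FN_CTZ:
      if (nargs == 2)
	{
	  return 3;
	}
      break;
    default:;
    }
  return 0;
}
```

Without the grouping, the second FN_CLZ entry would need its own case label, which would be a duplicate case in the generated switch.
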
> --- gcc/match.pd.jj	2023-11-09 09:03:53.490899344 +0100
> +++ gcc/match.pd	2023-11-09 09:17:40.231182469 +0100
> @@ -8532,31 +8532,34 @@ (define_operator_list SYNC_FETCH_AND_AND
>     (op (clz:s@2 @0) INTEGER_CST@1)
>     (if (integer_zerop (@1) && single_use (@2))
>      /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
> -    (with { tree type0 = TREE_TYPE (@0);
> -	    tree stype = signed_type_for (type0);
> -	    HOST_WIDE_INT val = 0;
> -	    /* Punt on hypothetical weird targets.  */
> -	    if (clz == CFN_CLZ
> -		&& CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -					      val) == 2
> -		&& val == 0)
> -	      stype = NULL_TREE;
> -	  }
> -     (if (stype)
> -      (cmp (convert:stype @0) { build_zero_cst (stype); })))
> +    (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
> +     (cmp (convert:stype @0) { build_zero_cst (stype); }))
>      /* clz(X) == (prec-1) is X == 1 and clz(X) != (prec-1) is X != 1.  */
> -    (with { bool ok = true;
> -	    HOST_WIDE_INT val = 0;
> -	    tree type0 = TREE_TYPE (@0);
> -	    /* Punt on hypothetical weird targets.  */
> -	    if (clz == CFN_CLZ
> -		&& CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -					      val) == 2
> -		&& val == TYPE_PRECISION (type0) - 1)
> -	      ok = false;
> -	  }
> -     (if (ok && wi::to_wide (@1) == (TYPE_PRECISION (type0) - 1))
> -      (op @0 { build_one_cst (type0); })))))))
> +    (if (wi::to_wide (@1) == TYPE_PRECISION (TREE_TYPE (@0)) - 1)
> +     (op @0 { build_one_cst (TREE_TYPE (@0)); }))))))
> +(for op (eq ne)
> +     cmp (lt ge)
> + (simplify
> +  (op (IFN_CLZ:s@2 @0 @3) INTEGER_CST@1)
> +  (if (integer_zerop (@1) && single_use (@2))
> +   /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
> +   (with { tree type0 = TREE_TYPE (@0);
> +	   tree stype = signed_type_for (TREE_TYPE (@0));
> +	   /* Punt if clz(0) == 0.  */
> +	   if (integer_zerop (@3))
> +	     stype = NULL_TREE;
> +	 }
> +    (if (stype)
> +     (cmp (convert:stype @0) { build_zero_cst (stype); })))
> +   /* clz(X) == (prec-1) is X == 1 and clz(X) != (prec-1) is X != 1.  */
> +   (with { bool ok = true;
> +	   tree type0 = TREE_TYPE (@0);
> +	   /* Punt if clz(0) == prec - 1.  */
> +	   if (wi::to_widest (@3) == TYPE_PRECISION (type0) - 1)
> +	     ok = false;
> +	 }
> +    (if (ok && wi::to_wide (@1) == (TYPE_PRECISION (type0) - 1))
> +     (op @0 { build_one_cst (type0); }))))))
>  
>  /* CTZ simplifications.  */
>  (for ctz (CTZ)
> @@ -8581,22 +8584,14 @@ (define_operator_list SYNC_FETCH_AND_AND
>  		      val++;
>  		  }
>  	      }
> -	    bool zero_res = false;
> -	    HOST_WIDE_INT zero_val = 0;
>  	    tree type0 = TREE_TYPE (@0);
>  	    int prec = TYPE_PRECISION (type0);
> -	    if (ctz == CFN_CTZ
> -		&& CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -					      zero_val) == 2)
> -	      zero_res = true;
>  	  }
> -     (if (val <= 0)
> -      (if (ok && (!zero_res || zero_val >= val))
> -       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); })
> -      (if (val >= prec)
> -       (if (ok && (!zero_res || zero_val < val))
> -	{ constant_boolean_node (cmp == EQ_EXPR ? false : true, type); })
> -       (if (ok && (!zero_res || zero_val < 0 || zero_val >= prec))
> +     (if (ok && prec <= MAX_FIXED_MODE_SIZE)
> +      (if (val <= 0)
> +       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); }
> +       (if (val >= prec)
> +	{ constant_boolean_node (cmp == EQ_EXPR ? false : true, type); }
>  	(cmp (bit_and @0 { wide_int_to_tree (type0,
>  					     wi::mask (val, false, prec)); })
>  	     { build_zero_cst (type0); })))))))
> @@ -8604,19 +8599,68 @@ (define_operator_list SYNC_FETCH_AND_AND
>    (simplify
>     /* __builtin_ctz (x) == C -> (x & ((1 << (C + 1)) - 1)) == (1 << C).  */
>     (op (ctz:s @0) INTEGER_CST@1)
> -    (with { bool zero_res = false;
> -	    HOST_WIDE_INT zero_val = 0;
> -	    tree type0 = TREE_TYPE (@0);
> +    (with { tree type0 = TREE_TYPE (@0);
>  	    int prec = TYPE_PRECISION (type0);
> -	    if (ctz == CFN_CTZ
> -		&& CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -					      zero_val) == 2)
> -	      zero_res = true;
>  	  }
> +     (if (prec <= MAX_FIXED_MODE_SIZE)
> +      (if (tree_int_cst_sgn (@1) < 0 || wi::to_widest (@1) >= prec)
> +       { constant_boolean_node (op == EQ_EXPR ? false : true, type); }
> +       (op (bit_and @0 { wide_int_to_tree (type0,
> +					   wi::mask (tree_to_uhwi (@1) + 1,
> +						     false, prec)); })
> +	   { wide_int_to_tree (type0,
> +			       wi::shifted_mask (tree_to_uhwi (@1), 1,
> +						 false, prec)); })))))))
> +(for op (ge gt le lt)
> +     cmp (eq eq ne ne)
> + (simplify
> +  /* __builtin_ctz (x) >= C -> (x & ((1 << C) - 1)) == 0.  */
> +  (op (IFN_CTZ:s @0 @2) INTEGER_CST@1)
> +   (with { bool ok = true;
> +	   HOST_WIDE_INT val = 0;
> +	   if (!tree_fits_shwi_p (@1))
> +	     ok = false;
> +	   else
> +	     {
> +	       val = tree_to_shwi (@1);
> +	       /* Canonicalize to >= or <.  */
> +	       if (op == GT_EXPR || op == LE_EXPR)
> +		 {
> +		   if (val == HOST_WIDE_INT_MAX)
> +		     ok = false;
> +		   else
> +		     val++;
> +		 }
> +	     }
> +	   HOST_WIDE_INT zero_val = tree_to_shwi (@2);
> +	   tree type0 = TREE_TYPE (@0);
> +	   int prec = TYPE_PRECISION (type0);
> +	   if (prec > MAX_FIXED_MODE_SIZE)
> +	     ok = false;
> +	  }
> +     (if (val <= 0)
> +      (if (ok && zero_val >= val)
> +       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); })
> +      (if (val >= prec)
> +       (if (ok && zero_val < val)
> +	{ constant_boolean_node (cmp == EQ_EXPR ? false : true, type); })
> +       (if (ok && (zero_val < 0 || zero_val >= prec))
> +	(cmp (bit_and @0 { wide_int_to_tree (type0,
> +					     wi::mask (val, false, prec)); })
> +	     { build_zero_cst (type0); })))))))
> +(for op (eq ne)
> + (simplify
> +  /* __builtin_ctz (x) == C -> (x & ((1 << (C + 1)) - 1)) == (1 << C).  */
> +  (op (IFN_CTZ:s @0 @2) INTEGER_CST@1)
> +   (with { HOST_WIDE_INT zero_val = tree_to_shwi (@2);
> +	   tree type0 = TREE_TYPE (@0);
> +	   int prec = TYPE_PRECISION (type0);
> +	 }
> +    (if (prec <= MAX_FIXED_MODE_SIZE)
>       (if (tree_int_cst_sgn (@1) < 0 || wi::to_widest (@1) >= prec)
> -      (if (!zero_res || zero_val != wi::to_widest (@1))
> +      (if (zero_val != wi::to_widest (@1))
>         { constant_boolean_node (op == EQ_EXPR ? false : true, type); })
> -      (if (!zero_res || zero_val < 0 || zero_val >= prec)
> +      (if (zero_val < 0 || zero_val >= prec)
>         (op (bit_and @0 { wide_int_to_tree (type0,
>  					   wi::mask (tree_to_uhwi (@1) + 1,
>  						     false, prec)); })
> @@ -8753,13 +8797,38 @@ (define_operator_list SYNC_FETCH_AND_AND
>    (cond (ne @0 integer_zerop@1) (func (convert?@3 @0)) INTEGER_CST@2)
>    (with { int val;
>  	  internal_fn ifn = IFN_LAST;
> -	  if (direct_internal_fn_supported_p (IFN_CLZ, type, OPTIMIZE_FOR_BOTH)
> -	      && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
> -					    val) == 2)
> +	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +	    {
> +	      if (tree_fits_shwi_p (@2))
> +		{
> +		  HOST_WIDE_INT valw = tree_to_shwi (@2);
> +		  if ((int) valw == valw)
> +		    {
> +		      val = valw;
> +		      ifn = IFN_CLZ;
> +		    }
> +		}
> +	    }
> +	  else if (direct_internal_fn_supported_p (IFN_CLZ, TREE_TYPE (@3),
> +						   OPTIMIZE_FOR_BOTH)
> +		   && CLZ_DEFINED_VALUE_AT_ZERO
> +			(SCALAR_INT_TYPE_MODE (TREE_TYPE (@3)), val) == 2)
>  	    ifn = IFN_CLZ;
>  	}
>     (if (ifn == IFN_CLZ && wi::to_widest (@2) == val)
> -    (IFN_CLZ @3)))))
> +    (IFN_CLZ @3 @2)))))
> +(simplify
> + (cond (ne @0 integer_zerop@1) (IFN_CLZ (convert?@3 @0) INTEGER_CST@2) @2)
> +  (with { int val;
> +	  internal_fn ifn = IFN_LAST;
> +	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +	    ifn = IFN_CLZ;
> +	  else if (direct_internal_fn_supported_p (IFN_CLZ, TREE_TYPE (@3),
> +						   OPTIMIZE_FOR_BOTH))
> +	    ifn = IFN_CLZ;
> +	}
> +   (if (ifn == IFN_CLZ)
> +    (IFN_CLZ @3 @2))))
>  
>  /* a != 0 ? CTZ(a) : CST -> .CTZ(a) where CST is the result of the internal function for 0. */
>  (for func (CTZ)
> @@ -8767,13 +8836,38 @@ (define_operator_list SYNC_FETCH_AND_AND
>    (cond (ne @0 integer_zerop@1) (func (convert?@3 @0)) INTEGER_CST@2)
>    (with { int val;
>  	  internal_fn ifn = IFN_LAST;
> -	  if (direct_internal_fn_supported_p (IFN_CTZ, type, OPTIMIZE_FOR_BOTH)
> -	      && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
> -					    val) == 2)
> +	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +	    {
> +	      if (tree_fits_shwi_p (@2))
> +		{
> +		  HOST_WIDE_INT valw = tree_to_shwi (@2);
> +		  if ((int) valw == valw)
> +		    {
> +		      val = valw;
> +		      ifn = IFN_CTZ;
> +		    }
> +		}
> +	    }
> +	  else if (direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@3),
> +						   OPTIMIZE_FOR_BOTH)
> +		   && CTZ_DEFINED_VALUE_AT_ZERO
> +			(SCALAR_INT_TYPE_MODE (TREE_TYPE (@3)), val) == 2)
>  	    ifn = IFN_CTZ;
>  	}
>     (if (ifn == IFN_CTZ && wi::to_widest (@2) == val)
> -    (IFN_CTZ @3)))))
> +    (IFN_CTZ @3 @2)))))
> +(simplify
> + (cond (ne @0 integer_zerop@1) (IFN_CTZ (convert?@3 @0) INTEGER_CST@2) @2)
> +  (with { int val;
> +	  internal_fn ifn = IFN_LAST;
> +	  if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +	    ifn = IFN_CTZ;
> +	  else if (direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@3),
> +						   OPTIMIZE_FOR_BOTH))
> +	    ifn = IFN_CTZ;
> +	}
> +   (if (ifn == IFN_CTZ)
> +    (IFN_CTZ @3 @2))))
>  #endif
>  
>  /* Common POPCOUNT/PARITY simplifications.  */
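
The identities the clz/ctz patterns above rely on can be checked exhaustively for a small precision. The following is only a sanity-check sketch, not a testcase from the patch: it assumes a 16-bit operand, an unsigned int of at least 32 bits, and GCC's wrapping conversion to int16_t. The helper names are made up. It covers the nonzero case only; the value at zero is exactly what the extra IFN argument handles.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical 16-bit helpers; both are only called with X != 0.  */
static int
ctz16 (uint16_t x)
{
  return __builtin_ctz (x);
}

static int
clz16 (uint16_t x)
{
  /* Drop the leading zeros contributed by the wider unsigned int.  */
  return __builtin_clz (x) - ((int) (sizeof (unsigned int) * CHAR_BIT) - 16);
}

/* Return 1 if every identity holds for every nonzero 16-bit value.  */
static int
check_identities (void)
{
  for (unsigned int v = 1; v <= UINT16_MAX; v++)
    {
      uint16_t x = (uint16_t) v;
      /* clz(X) == 0 is (int)X < 0.  */
      if ((clz16 (x) == 0) != ((int16_t) x < 0))
	return 0;
      /* clz(X) == (prec-1) is X == 1.  */
      if ((clz16 (x) == 15) != (x == 1))
	return 0;
      for (int c = 0; c < 16; c++)
	{
	  unsigned int low = (1u << c) - 1;
	  unsigned int low1 = (1u << (c + 1)) - 1;
	  /* ctz(X) >= C is (X & ((1 << C) - 1)) == 0.  */
	  if ((ctz16 (x) >= c) != ((x & low) == 0))
	    return 0;
	  /* ctz(X) == C is (X & ((1 << (C + 1)) - 1)) == (1 << C).  */
	  if ((ctz16 (x) == c) != ((x & low1) == (1u << c)))
	    return 0;
	}
    }
  return 1;
}
```
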
> --- gcc/gimple-lower-bitint.cc.jj	2023-11-09 09:03:53.423900293 +0100
> +++ gcc/gimple-lower-bitint.cc	2023-11-09 09:17:40.242182314 +0100
> @@ -427,6 +427,7 @@ struct bitint_large_huge
>    void lower_mul_overflow (tree, gimple *);
>    void lower_cplxpart_stmt (tree, gimple *);
>    void lower_complexexpr_stmt (gimple *);
> +  void lower_bit_query (gimple *);
>    void lower_call (tree, gimple *);
>    void lower_asm (gimple *);
>    void lower_stmt (gimple *);
> @@ -4455,6 +4456,524 @@ bitint_large_huge::lower_complexexpr_stm
>    insert_before (g);
>  }
>  
> +/* Lower a .{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT} call with one large/huge _BitInt
> +   argument.  */
> +
> +void
> +bitint_large_huge::lower_bit_query (gimple *stmt)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = (gimple_call_num_args (stmt) == 2
> +	       ? gimple_call_arg (stmt, 1) : NULL_TREE);
> +  tree lhs = gimple_call_lhs (stmt);
> +  gimple *g;
> +
> +  if (!lhs)
> +    {
> +      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +      gsi_remove (&gsi, true);
> +      return;
> +    }
> +  tree type = TREE_TYPE (arg0);
> +  gcc_assert (TREE_CODE (type) == BITINT_TYPE);
> +  bitint_prec_kind kind = bitint_precision_kind (type);
> +  gcc_assert (kind >= bitint_prec_large);
> +  enum internal_fn ifn = gimple_call_internal_fn (stmt);
> +  enum built_in_function fcode = END_BUILTINS;
> +  gcc_assert (TYPE_PRECISION (unsigned_type_node) == limb_prec
> +	      || TYPE_PRECISION (long_unsigned_type_node) == limb_prec
> +	      || TYPE_PRECISION (long_long_unsigned_type_node) == limb_prec);
> +  switch (ifn)
> +    {
> +    case IFN_CLZ:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_CLZ;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_CLZL;
> +      else
> +	fcode = BUILT_IN_CLZLL;
> +      break;
> +    case IFN_FFS:
> +      /* .FFS (X) is .CTZ (X, -1) + 1, though under the hood
> +	 we don't add the addend at the end.  */
> +      arg1 = integer_zero_node;
> +      /* FALLTHRU */
> +    case IFN_CTZ:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_CTZ;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_CTZL;
> +      else
> +	fcode = BUILT_IN_CTZLL;
> +      m_upwards = true;
> +      break;
> +    case IFN_CLRSB:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_CLRSB;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_CLRSBL;
> +      else
> +	fcode = BUILT_IN_CLRSBLL;
> +      break;
> +    case IFN_PARITY:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_PARITY;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_PARITYL;
> +      else
> +	fcode = BUILT_IN_PARITYLL;
> +      m_upwards = true;
> +      break;
> +    case IFN_POPCOUNT:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_POPCOUNT;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +	fcode = BUILT_IN_POPCOUNTL;
> +      else
> +	fcode = BUILT_IN_POPCOUNTLL;
> +      m_upwards = true;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +  tree fndecl = builtin_decl_explicit (fcode), res = NULL_TREE;
> +  unsigned cnt = 0, rem = 0, end = 0, prec = TYPE_PRECISION (type);
> +  struct bq_details { edge e; tree val, addend; } *bqp = NULL;
> +  basic_block edge_bb = NULL;
> +  if (m_upwards)
> +    {
> +      tree idx = NULL_TREE, idx_first = NULL_TREE, idx_next = NULL_TREE;
> +      if (kind == bitint_prec_large)
> +	cnt = CEIL (prec, limb_prec);
> +      else
> +	{
> +	  rem = (prec % (2 * limb_prec));
> +	  end = (prec - rem) / limb_prec;
> +	  cnt = 2 + CEIL (rem, limb_prec);
> +	  idx = idx_first = create_loop (size_zero_node, &idx_next);
> +	}
> +
> +      if (ifn == IFN_CTZ || ifn == IFN_FFS)
> +	{
> +	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +	  gsi_prev (&gsi);
> +	  edge e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
> +	  edge_bb = e->src;
> +	  if (kind == bitint_prec_large)
> +	    {
> +	      m_gsi = gsi_last_bb (edge_bb);
> +	      if (!gsi_end_p (m_gsi))
> +		gsi_next (&m_gsi);
> +	    }
> +	  bqp = XALLOCAVEC (struct bq_details, cnt);
> +	}
> +      else
> +	m_after_stmt = stmt;
> +      if (kind != bitint_prec_large)
> +	m_upwards_2limb = end;
> +
> +      for (unsigned i = 0; i < cnt; i++)
> +	{
> +	  m_data_cnt = 0;
> +	  if (kind == bitint_prec_large)
> +	    idx = size_int (i);
> +	  else if (i >= 2)
> +	    idx = size_int (end + (i > 2));
> +
> +	  tree rhs1 = handle_operand (arg0, idx);
> +	  if (!useless_type_conversion_p (m_limb_type, TREE_TYPE (rhs1)))
> +	    {
> +	      if (!TYPE_UNSIGNED (TREE_TYPE (rhs1)))
> +		rhs1 = add_cast (unsigned_type_for (TREE_TYPE (rhs1)), rhs1);
> +	      rhs1 = add_cast (m_limb_type, rhs1);
> +	    }
> +
> +	  tree in, out, tem;
> +	  if (ifn == IFN_PARITY)
> +	    in = prepare_data_in_out (build_zero_cst (m_limb_type), idx, &out);
> +	  else if (ifn == IFN_FFS)
> +	    in = prepare_data_in_out (integer_one_node, idx, &out);
> +	  else
> +	    in = prepare_data_in_out (integer_zero_node, idx, &out);
> +
> +	  switch (ifn)
> +	    {
> +	    case IFN_CTZ:
> +	    case IFN_FFS:
> +	      g = gimple_build_cond (NE_EXPR, rhs1,
> +				     build_zero_cst (m_limb_type),
> +				     NULL_TREE, NULL_TREE);
> +	      insert_before (g);
> +	      edge e1, e2;
> +	      e1 = split_block (gsi_bb (m_gsi), g);
> +	      e1->flags = EDGE_FALSE_VALUE;
> +	      e2 = make_edge (e1->src, gimple_bb (stmt), EDGE_TRUE_VALUE);
> +	      e1->probability = profile_probability::unlikely ();
> +	      e2->probability = e1->probability.invert ();
> +	      if (i == 0)
> +		set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
> +	      m_gsi = gsi_after_labels (e1->dest);
> +	      bqp[i].e = e2;
> +	      bqp[i].val = rhs1;
> +	      if (tree_fits_uhwi_p (idx))
> +		bqp[i].addend
> +		  = build_int_cst (integer_type_node,
> +				   tree_to_uhwi (idx) * limb_prec
> +				   + (ifn == IFN_FFS));
> +	      else
> +		{
> +		  bqp[i].addend = in;
> +		  if (i == 1)
> +		    res = out;
> +		  else
> +		    res = make_ssa_name (integer_type_node);
> +		  g = gimple_build_assign (res, PLUS_EXPR, in,
> +					   build_int_cst (integer_type_node,
> +							  limb_prec));
> +		  insert_before (g);
> +		  m_data[m_data_cnt] = res;
> +		}
> +	      break;
> +	    case IFN_PARITY:
> +	      if (!integer_zerop (in))
> +		{
> +		  if (kind == bitint_prec_huge && i == 1)
> +		    res = out;
> +		  else
> +		    res = make_ssa_name (m_limb_type);
> +		  g = gimple_build_assign (res, BIT_XOR_EXPR, in, rhs1);
> +		  insert_before (g);
> +		}
> +	      else
> +		res = rhs1;
> +	      m_data[m_data_cnt] = res;
> +	      break;
> +	    case IFN_POPCOUNT:
> +	      g = gimple_build_call (fndecl, 1, rhs1);
> +	      tem = make_ssa_name (integer_type_node);
> +	      gimple_call_set_lhs (g, tem);
> +	      insert_before (g);
> +	      if (!integer_zerop (in))
> +		{
> +		  if (kind == bitint_prec_huge && i == 1)
> +		    res = out;
> +		  else
> +		    res = make_ssa_name (integer_type_node);
> +		  g = gimple_build_assign (res, PLUS_EXPR, in, tem);
> +		  insert_before (g);
> +		}
> +	      else
> +		res = tem;
> +	      m_data[m_data_cnt] = res;
> +	      break;
> +	    default:
> +	      gcc_unreachable ();
> +	    }
> +
> +	  m_first = false;
> +	  if (kind == bitint_prec_huge && i <= 1)
> +	    {
> +	      if (i == 0)
> +		{
> +		  idx = make_ssa_name (sizetype);
> +		  g = gimple_build_assign (idx, PLUS_EXPR, idx_first,
> +					   size_one_node);
> +		  insert_before (g);
> +		}
> +	      else
> +		{
> +		  g = gimple_build_assign (idx_next, PLUS_EXPR, idx_first,
> +					   size_int (2));
> +		  insert_before (g);
> +		  g = gimple_build_cond (NE_EXPR, idx_next, size_int (end),
> +					 NULL_TREE, NULL_TREE);
> +		  insert_before (g);
> +		  if (ifn == IFN_CTZ || ifn == IFN_FFS)
> +		    m_gsi = gsi_after_labels (edge_bb);
> +		  else
> +		    m_gsi = gsi_for_stmt (stmt);
> +		}
> +	    }
> +	}
> +    }
> +  else
> +    {
> +      tree idx = NULL_TREE, idx_next = NULL_TREE, first = NULL_TREE;
> +      int sub_one = 0;
> +      if (kind == bitint_prec_large)
> +	cnt = CEIL (prec, limb_prec);
> +      else
> +	{
> +	  rem = prec % limb_prec;
> +	  if (rem == 0 && (!TYPE_UNSIGNED (type) || ifn == IFN_CLRSB))
> +	    rem = limb_prec;
> +	  end = (prec - rem) / limb_prec;
> +	  cnt = 1 + (rem != 0);
> +	  if (ifn == IFN_CLRSB)
> +	    sub_one = 1;
> +	}
> +
> +      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +      gsi_prev (&gsi);
> +      edge e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
> +      edge_bb = e->src;
> +      m_gsi = gsi_last_bb (edge_bb);
> +      if (!gsi_end_p (m_gsi))
> +	gsi_next (&m_gsi);
> +
> +      if (ifn == IFN_CLZ)
> +	bqp = XALLOCAVEC (struct bq_details, cnt);
> +      else
> +	{
> +	  gsi = gsi_for_stmt (stmt);
> +	  gsi_prev (&gsi);
> +	  e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
> +	  edge_bb = e->src;
> +	  bqp = XALLOCAVEC (struct bq_details, 2 * cnt);
> +	}
> +
> +      for (unsigned i = 0; i < cnt; i++)
> +	{
> +	  m_data_cnt = 0;
> +	  if (kind == bitint_prec_large)
> +	    idx = size_int (cnt - i - 1);
> +	  else if (i == cnt - 1)
> +	    idx = create_loop (size_int (end - 1), &idx_next);
> +	  else
> +	    idx = size_int (end);
> +
> +	  tree rhs1 = handle_operand (arg0, idx);
> +	  if (!useless_type_conversion_p (m_limb_type, TREE_TYPE (rhs1)))
> +	    {
> +	      if (ifn == IFN_CLZ && !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
> +		rhs1 = add_cast (unsigned_type_for (TREE_TYPE (rhs1)), rhs1);
> +	      else if (ifn == IFN_CLRSB && TYPE_UNSIGNED (TREE_TYPE (rhs1)))
> +		rhs1 = add_cast (signed_type_for (TREE_TYPE (rhs1)), rhs1);
> +	      rhs1 = add_cast (m_limb_type, rhs1);
> +	    }
> +
> +	  if (ifn == IFN_CLZ)
> +	    {
> +	      g = gimple_build_cond (NE_EXPR, rhs1,
> +				     build_zero_cst (m_limb_type),
> +				     NULL_TREE, NULL_TREE);
> +	      insert_before (g);
> +	      edge e1 = split_block (gsi_bb (m_gsi), g);
> +	      e1->flags = EDGE_FALSE_VALUE;
> +	      edge e2 = make_edge (e1->src, gimple_bb (stmt), EDGE_TRUE_VALUE);
> +	      e1->probability = profile_probability::unlikely ();
> +	      e2->probability = e1->probability.invert ();
> +	      if (i == 0)
> +		set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
> +	      m_gsi = gsi_after_labels (e1->dest);
> +	      bqp[i].e = e2;
> +	      bqp[i].val = rhs1;
> +	    }
> +	  else
> +	    {
> +	      if (i == 0)
> +		{
> +		  first = rhs1;
> +		  g = gimple_build_assign (make_ssa_name (m_limb_type),
> +					   PLUS_EXPR, rhs1,
> +					   build_int_cst (m_limb_type, 1));
> +		  insert_before (g);
> +		  g = gimple_build_cond (GT_EXPR, gimple_assign_lhs (g),
> +					 build_int_cst (m_limb_type, 1),
> +					 NULL_TREE, NULL_TREE);
> +		  insert_before (g);
> +		}
> +	      else
> +		{
> +		  g = gimple_build_assign (make_ssa_name (m_limb_type),
> +					   BIT_XOR_EXPR, rhs1, first);
> +		  insert_before (g);
> +		  tree stype = signed_type_for (m_limb_type);
> +		  g = gimple_build_cond (LT_EXPR,
> +					 add_cast (stype,
> +						   gimple_assign_lhs (g)),
> +					 build_zero_cst (stype),
> +					 NULL_TREE, NULL_TREE);
> +		  insert_before (g);
> +		  edge e1 = split_block (gsi_bb (m_gsi), g);
> +		  e1->flags = EDGE_FALSE_VALUE;
> +		  edge e2 = make_edge (e1->src, gimple_bb (stmt),
> +				       EDGE_TRUE_VALUE);
> +		  e1->probability = profile_probability::unlikely ();
> +		  e2->probability = e1->probability.invert ();
> +		  if (i == 1)
> +		    set_immediate_dominator (CDI_DOMINATORS, e2->dest,
> +					     e2->src);
> +		  m_gsi = gsi_after_labels (e1->dest);
> +		  bqp[2 * i].e = e2;
> +		  g = gimple_build_cond (NE_EXPR, rhs1, first,
> +					 NULL_TREE, NULL_TREE);
> +		  insert_before (g);
> +		}
> +	      edge e1 = split_block (gsi_bb (m_gsi), g);
> +	      e1->flags = EDGE_FALSE_VALUE;
> +	      edge e2 = make_edge (e1->src, edge_bb, EDGE_TRUE_VALUE);
> +	      e1->probability = profile_probability::unlikely ();
> +	      e2->probability = e1->probability.invert ();
> +	      if (i == 0)
> +		set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
> +	      m_gsi = gsi_after_labels (e1->dest);
> +	      bqp[2 * i + 1].e = e2;
> +	      bqp[i].val = rhs1;
> +	    }
> +	  if (tree_fits_uhwi_p (idx))
> +	    bqp[i].addend
> +	      = build_int_cst (integer_type_node,
> +			       (int) prec
> +			       - (((int) tree_to_uhwi (idx) + 1)
> +				  * limb_prec) - sub_one);
> +	  else
> +	    {
> +	      tree in, out;
> +	      in = build_int_cst (integer_type_node, rem - sub_one);
> +	      m_first = true;
> +	      in = prepare_data_in_out (in, idx, &out);
> +	      out = m_data[m_data_cnt + 1];
> +	      bqp[i].addend = in;
> +	      g = gimple_build_assign (out, PLUS_EXPR, in,
> +				       build_int_cst (integer_type_node,
> +						      limb_prec));
> +	      insert_before (g);
> +	      m_data[m_data_cnt] = out;
> +	    }
> +
> +	  m_first = false;
> +	  if (kind == bitint_prec_huge && i == cnt - 1)
> +	    {
> +	      g = gimple_build_assign (idx_next, PLUS_EXPR, idx,
> +				       size_int (-1));
> +	      insert_before (g);
> +	      g = gimple_build_cond (NE_EXPR, idx, size_zero_node,
> +				     NULL_TREE, NULL_TREE);
> +	      insert_before (g);
> +	      edge true_edge, false_edge;
> +	      extract_true_false_edges_from_block (gsi_bb (m_gsi),
> +						   &true_edge, &false_edge);
> +	      m_gsi = gsi_after_labels (false_edge->dest);
> +	    }
> +	}
> +    }
> +  switch (ifn)
> +    {
> +    case IFN_CLZ:
> +    case IFN_CTZ:
> +    case IFN_FFS:
> +      gphi *phi1, *phi2, *phi3;
> +      basic_block bb;
> +      bb = gsi_bb (m_gsi);
> +      remove_edge (find_edge (bb, gimple_bb (stmt)));
> +      phi1 = create_phi_node (make_ssa_name (m_limb_type),
> +			      gimple_bb (stmt));
> +      phi2 = create_phi_node (make_ssa_name (integer_type_node),
> +			      gimple_bb (stmt));
> +      for (unsigned i = 0; i < cnt; i++)
> +	{
> +	  add_phi_arg (phi1, bqp[i].val, bqp[i].e, UNKNOWN_LOCATION);
> +	  add_phi_arg (phi2, bqp[i].addend, bqp[i].e, UNKNOWN_LOCATION);
> +	}
> +      if (arg1 == NULL_TREE)
> +	{
> +	  g = gimple_build_builtin_unreachable (m_loc);
> +	  insert_before (g);
> +	}
> +      m_gsi = gsi_for_stmt (stmt);
> +      g = gimple_build_call (fndecl, 1, gimple_phi_result (phi1));
> +      gimple_call_set_lhs (g, make_ssa_name (integer_type_node));
> +      insert_before (g);
> +      if (arg1 == NULL_TREE)
> +	g = gimple_build_assign (lhs, PLUS_EXPR,
> +				 gimple_phi_result (phi2),
> +				 gimple_call_lhs (g));
> +      else
> +	{
> +	  g = gimple_build_assign (make_ssa_name (integer_type_node),
> +				   PLUS_EXPR, gimple_phi_result (phi2),
> +				   gimple_call_lhs (g));
> +	  insert_before (g);
> +	  edge e1 = split_block (gimple_bb (stmt), g);
> +	  edge e2 = make_edge (bb, e1->dest, EDGE_FALLTHRU);
> +	  e2->probability = profile_probability::always ();
> +	  set_immediate_dominator (CDI_DOMINATORS, e1->dest,
> +				   get_immediate_dominator (CDI_DOMINATORS,
> +							    e1->src));
> +	  phi3 = create_phi_node (make_ssa_name (integer_type_node), e1->dest);
> +	  add_phi_arg (phi3, gimple_assign_lhs (g), e1, UNKNOWN_LOCATION);
> +	  add_phi_arg (phi3, arg1, e2, UNKNOWN_LOCATION);
> +	  m_gsi = gsi_for_stmt (stmt);
> +	  g = gimple_build_assign (lhs, gimple_phi_result (phi3));
> +	}
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    case IFN_CLRSB:
> +      bb = gsi_bb (m_gsi);
> +      remove_edge (find_edge (bb, edge_bb));
> +      edge e;
> +      e = make_edge (bb, gimple_bb (stmt), EDGE_FALLTHRU);
> +      e->probability = profile_probability::always ();
> +      set_immediate_dominator (CDI_DOMINATORS, gimple_bb (stmt),
> +			       get_immediate_dominator (CDI_DOMINATORS,
> +							edge_bb));
> +      phi1 = create_phi_node (make_ssa_name (m_limb_type),
> +			      edge_bb);
> +      phi2 = create_phi_node (make_ssa_name (integer_type_node),
> +			      edge_bb);
> +      phi3 = create_phi_node (make_ssa_name (integer_type_node),
> +			      gimple_bb (stmt));
> +      for (unsigned i = 0; i < cnt; i++)
> +	{
> +	  add_phi_arg (phi1, bqp[i].val, bqp[2 * i + 1].e, UNKNOWN_LOCATION);
> +	  add_phi_arg (phi2, bqp[i].addend, bqp[2 * i + 1].e,
> +		       UNKNOWN_LOCATION);
> +	  tree a = bqp[i].addend;
> +	  if (i && kind == bitint_prec_large)
> +	    a = int_const_binop (PLUS_EXPR, a, integer_minus_one_node);
> +	  if (i)
> +	    add_phi_arg (phi3, a, bqp[2 * i].e, UNKNOWN_LOCATION);
> +	}
> +      add_phi_arg (phi3, build_int_cst (integer_type_node, prec - 1), e,
> +		   UNKNOWN_LOCATION);
> +      m_gsi = gsi_after_labels (edge_bb);
> +      g = gimple_build_call (fndecl, 1,
> +			     add_cast (signed_type_for (m_limb_type),
> +				       gimple_phi_result (phi1)));
> +      gimple_call_set_lhs (g, make_ssa_name (integer_type_node));
> +      insert_before (g);
> +      g = gimple_build_assign (make_ssa_name (integer_type_node),
> +			       PLUS_EXPR, gimple_call_lhs (g),
> +			       gimple_phi_result (phi2));
> +      insert_before (g);
> +      if (kind != bitint_prec_large)
> +	{
> +	  g = gimple_build_assign (make_ssa_name (integer_type_node),
> +				   PLUS_EXPR, gimple_assign_lhs (g),
> +				   integer_one_node);
> +	  insert_before (g);
> +	}
> +      add_phi_arg (phi3, gimple_assign_lhs (g),
> +		   find_edge (edge_bb, gimple_bb (stmt)), UNKNOWN_LOCATION);
> +      m_gsi = gsi_for_stmt (stmt);
> +      g = gimple_build_assign (lhs, gimple_phi_result (phi3));
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    case IFN_PARITY:
> +      g = gimple_build_call (fndecl, 1, res);
> +      gimple_call_set_lhs (g, lhs);
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    case IFN_POPCOUNT:
> +      g = gimple_build_assign (lhs, res);
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +
>  /* Lower a call statement with one or more large/huge _BitInt
>     arguments or large/huge _BitInt return value.  */
>  
> @@ -4476,6 +4995,14 @@ bitint_large_huge::lower_call (tree obj,
>        case IFN_UBSAN_CHECK_MUL:
>  	lower_mul_overflow (obj, stmt);
>  	return;
> +      case IFN_CLZ:
> +      case IFN_CTZ:
> +      case IFN_CLRSB:
> +      case IFN_FFS:
> +      case IFN_PARITY:
> +      case IFN_POPCOUNT:
> +	lower_bit_query (stmt);
> +	return;
>        default:
>  	break;
>        }
> --- gcc/gimple-range-op.cc.jj	2023-11-09 09:03:53.443900010 +0100
> +++ gcc/gimple-range-op.cc	2023-11-09 09:17:40.233182441 +0100
> @@ -908,39 +908,34 @@ public:
>    cfn_clz (bool internal) { m_gimple_call_internal_p = internal; }
>    using range_operator::fold_range;
>    virtual bool fold_range (irange &r, tree type, const irange &lh,
> -			   const irange &, relation_trio) const;
> +			   const irange &rh, relation_trio) const;
>  private:
>    bool m_gimple_call_internal_p;
>  } op_cfn_clz (false), op_cfn_clz_internal (true);
>  
>  bool
>  cfn_clz::fold_range (irange &r, tree type, const irange &lh,
> -		     const irange &, relation_trio) const
> +		     const irange &rh, relation_trio) const
>  {
>    // __builtin_c[lt]z* return [0, prec-1], except when the
>    // argument is 0, but that is undefined behavior.
>    //
>    // For __builtin_c[lt]z* consider argument of 0 always undefined
> -  // behavior, for internal fns depending on C?Z_DEFINED_VALUE_AT_ZERO.
> +  // behavior, for internal fns likewise, unless it has 2 arguments,
> +  // then the second argument is the value at zero.
>    if (lh.undefined_p ())
>      return false;
>    int prec = TYPE_PRECISION (lh.type ());
>    int mini = 0;
>    int maxi = prec - 1;
> -  int zerov = 0;
> -  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (lh.type ());
>    if (m_gimple_call_internal_p)
>      {
> -      if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
> -	  && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
> -	{
> -	  // Only handle the single common value.
> -	  if (zerov == prec)
> -	    maxi = prec;
> -	  else
> -	    // Magic value to give up, unless we can prove arg is non-zero.
> -	    mini = -2;
> -	}
> +      // Only handle the single common value.
> +      if (rh.lower_bound () == prec)
> +	maxi = prec;
> +      else
> +	// Magic value to give up, unless we can prove arg is non-zero.
> +	mini = -2;
>      }
>  
>    // From clz of minimum we can compute result maximum.
> @@ -985,37 +980,31 @@ public:
>    cfn_ctz (bool internal) { m_gimple_call_internal_p = internal; }
>    using range_operator::fold_range;
>    virtual bool fold_range (irange &r, tree type, const irange &lh,
> -			   const irange &, relation_trio) const;
> +			   const irange &rh, relation_trio) const;
>  private:
>    bool m_gimple_call_internal_p;
>  } op_cfn_ctz (false), op_cfn_ctz_internal (true);
>  
>  bool
>  cfn_ctz::fold_range (irange &r, tree type, const irange &lh,
> -		     const irange &, relation_trio) const
> +		     const irange &rh, relation_trio) const
>  {
>    if (lh.undefined_p ())
>      return false;
>    int prec = TYPE_PRECISION (lh.type ());
>    int mini = 0;
>    int maxi = prec - 1;
> -  int zerov = 0;
> -  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (lh.type ());
>  
>    if (m_gimple_call_internal_p)
>      {
> -      if (optab_handler (ctz_optab, mode) != CODE_FOR_nothing
> -	  && CTZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
> -	{
> -	  // Handle only the two common values.
> -	  if (zerov == -1)
> -	    mini = -1;
> -	  else if (zerov == prec)
> -	    maxi = prec;
> -	  else
> -	    // Magic value to give up, unless we can prove arg is non-zero.
> -	    mini = -2;
> -	}
> +      // Handle only the two common values.
> +      if (rh.lower_bound () == -1)
> +	mini = -1;
> +      else if (rh.lower_bound () == prec)
> +	maxi = prec;
> +      else
> +	// Magic value to give up, unless we can prove arg is non-zero.
> +	mini = -2;
>      }
>    // If arg is non-zero, then use [0, prec - 1].
>    if (!range_includes_zero_p (&lh))
> @@ -1288,16 +1277,24 @@ gimple_range_op_handler::maybe_builtin_c
>  
>      CASE_CFN_CLZ:
>        m_op1 = gimple_call_arg (call, 0);
> -      if (gimple_call_internal_p (call))
> -	m_operator = &op_cfn_clz_internal;
> +      if (gimple_call_internal_p (call)
> +	  && gimple_call_num_args (call) == 2)
> +	{
> +	  m_op2 = gimple_call_arg (call, 1);
> +	  m_operator = &op_cfn_clz_internal;
> +	}
>        else
>  	m_operator = &op_cfn_clz;
>        break;
>  
>      CASE_CFN_CTZ:
>        m_op1 = gimple_call_arg (call, 0);
> -      if (gimple_call_internal_p (call))
> -	m_operator = &op_cfn_ctz_internal;
> +      if (gimple_call_internal_p (call)
> +	  && gimple_call_num_args (call) == 2)
> +	{
> +	  m_op2 = gimple_call_arg (call, 1);
> +	  m_operator = &op_cfn_ctz_internal;
> +	}
>        else
>  	m_operator = &op_cfn_ctz;
>        break;
> --- gcc/tree-vect-patterns.cc.jj	2023-11-09 09:03:53.675896723 +0100
> +++ gcc/tree-vect-patterns.cc	2023-11-09 09:17:40.232182455 +0100
> @@ -1818,7 +1818,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>    tree new_var;
>    internal_fn ifn = IFN_LAST, ifnnew = IFN_LAST;
>    bool defined_at_zero = true, defined_at_zero_new = false;
> -  int val = 0, val_new = 0;
> +  int val = 0, val_new = 0, val_cmp = 0;
>    int prec;
>    int sub = 0, add = 0;
>    location_t loc;
> @@ -1826,7 +1826,8 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>    if (!is_gimple_call (call_stmt))
>      return NULL;
>  
> -  if (gimple_call_num_args (call_stmt) != 1)
> +  if (gimple_call_num_args (call_stmt) != 1
> +      && gimple_call_num_args (call_stmt) != 2)
>      return NULL;
>  
>    rhs_oprnd = gimple_call_arg (call_stmt, 0);
> @@ -1846,9 +1847,10 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>      CASE_CFN_CTZ:
>        ifn = IFN_CTZ;
>        if (!gimple_call_internal_p (call_stmt)
> -	  || CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (rhs_type),
> -					val) != 2)
> +	  || gimple_call_num_args (call_stmt) != 2)
>  	defined_at_zero = false;
> +      else
> +	val = tree_to_shwi (gimple_call_arg (call_stmt, 1));
>        break;
>      CASE_CFN_FFS:
>        ifn = IFN_FFS;
> @@ -1907,6 +1909,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>  
>    vect_pattern_detected ("vec_recog_ctz_ffs_pattern", call_stmt);
>  
> +  val_cmp = val_new;
>    if ((ifnnew == IFN_CLZ
>         && defined_at_zero
>         && defined_at_zero_new
> @@ -1918,7 +1921,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>  	 .CTZ (X) = .POPCOUNT ((X - 1) & ~X).  */
>        if (ifnnew == IFN_CLZ)
>  	sub = prec;
> -      val_new = prec;
> +      val_cmp = prec;
>  
>        if (!TYPE_UNSIGNED (rhs_type))
>  	{
> @@ -1955,7 +1958,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>        /* .CTZ (X) = (PREC - 1) - .CLZ (X & -X)
>  	 .FFS (X) = PREC - .CLZ (X & -X).  */
>        sub = prec - (ifn == IFN_CTZ);
> -      val_new = sub - val_new;
> +      val_cmp = sub - val_new;
>  
>        tree neg = vect_recog_temp_ssa_var (rhs_type, NULL);
>        pattern_stmt = gimple_build_assign (neg, NEGATE_EXPR, rhs_oprnd);
> @@ -1974,7 +1977,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>        /* .CTZ (X) = PREC - .POPCOUNT (X | -X)
>  	 .FFS (X) = (PREC + 1) - .POPCOUNT (X | -X).  */
>        sub = prec + (ifn == IFN_FFS);
> -      val_new = sub;
> +      val_cmp = sub;
>  
>        tree neg = vect_recog_temp_ssa_var (rhs_type, NULL);
>        pattern_stmt = gimple_build_assign (neg, NEGATE_EXPR, rhs_oprnd);
> @@ -1992,12 +1995,18 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>      {
>        /* .FFS (X) = .CTZ (X) + 1.  */
>        add = 1;
> -      val_new++;
> +      val_cmp++;
>      }
>  
>    /* Create B = .IFNNEW (A).  */
>    new_var = vect_recog_temp_ssa_var (lhs_type, NULL);
> -  pattern_stmt = gimple_build_call_internal (ifnnew, 1, rhs_oprnd);
> +  if ((ifnnew == IFN_CLZ || ifnnew == IFN_CTZ) && defined_at_zero_new)
> +    pattern_stmt
> +      = gimple_build_call_internal (ifnnew, 2, rhs_oprnd,
> +				    build_int_cst (integer_type_node,
> +						   val_new));
> +  else
> +    pattern_stmt = gimple_build_call_internal (ifnnew, 1, rhs_oprnd);
>    gimple_call_set_lhs (pattern_stmt, new_var);
>    gimple_set_location (pattern_stmt, loc);
>    *type_out = vec_type;
> @@ -2023,7 +2032,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>      }
>  
>    if (defined_at_zero
> -      && (!defined_at_zero_new || val != val_new))
> +      && (!defined_at_zero_new || val != val_cmp))
>      {
>        append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt, vec_type);
>        tree ret_var = vect_recog_temp_ssa_var (lhs_type, NULL);
> @@ -2143,7 +2152,8 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>        return NULL;
>      }
>  
> -  if (gimple_call_num_args (call_stmt) != 1)
> +  if (gimple_call_num_args (call_stmt) != 1
> +      && gimple_call_num_args (call_stmt) != 2)
>      return NULL;
>  
>    rhs_oprnd = gimple_call_arg (call_stmt, 0);
> @@ -2181,17 +2191,14 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>  	  return NULL;
>  	addend = (TYPE_PRECISION (TREE_TYPE (rhs_oprnd))
>  		  - TYPE_PRECISION (lhs_type));
> -	if (gimple_call_internal_p (call_stmt))
> +	if (gimple_call_internal_p (call_stmt)
> +	    && gimple_call_num_args (call_stmt) == 2)
>  	  {
>  	    int val1, val2;
> -	    int d1
> -	      = CLZ_DEFINED_VALUE_AT_ZERO
> -		  (SCALAR_INT_TYPE_MODE (TREE_TYPE (rhs_oprnd)), val1);
> +	    val1 = tree_to_shwi (gimple_call_arg (call_stmt, 1));
>  	    int d2
>  	      = CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
>  					   val2);
> -	    if (d1 != 2)
> -	      break;
>  	    if (d2 != 2 || val1 != val2 + addend)
>  	      return NULL;
>  	  }
> @@ -2200,17 +2207,14 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>  	/* ctzll (x) == ctz (x) for unsigned or signed x != 0, so ok
>  	   if it is undefined at zero or if it matches also for the
>  	   defined value there.  */
> -	if (gimple_call_internal_p (call_stmt))
> +	if (gimple_call_internal_p (call_stmt)
> +	    && gimple_call_num_args (call_stmt) == 2)
>  	  {
>  	    int val1, val2;
> -	    int d1
> -	      = CTZ_DEFINED_VALUE_AT_ZERO
> -		  (SCALAR_INT_TYPE_MODE (TREE_TYPE (rhs_oprnd)), val1);
> +	    val1 = tree_to_shwi (gimple_call_arg (call_stmt, 1));
>  	    int d2
>  	      = CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
>  					   val2);
> -	    if (d1 != 2)
> -	      break;
>  	    if (d2 != 2 || val1 != val2)
>  	      return NULL;
>  	  }
> @@ -2260,7 +2264,20 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>  
>    /* Create B = .POPCOUNT (A).  */
>    new_var = vect_recog_temp_ssa_var (lhs_type, NULL);
> -  pattern_stmt = gimple_build_call_internal (ifn, 1, unprom_diff.op);
> +  tree arg2 = NULL_TREE;
> +  int val;
> +  if (ifn == IFN_CLZ
> +      && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
> +				    val) == 2)
> +    arg2 = build_int_cst (integer_type_node, val);
> +  else if (ifn == IFN_CTZ
> +	   && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
> +					 val) == 2)
> +    arg2 = build_int_cst (integer_type_node, val);
> +  if (arg2)
> +    pattern_stmt = gimple_build_call_internal (ifn, 2, unprom_diff.op, arg2);
> +  else
> +    pattern_stmt = gimple_build_call_internal (ifn, 1, unprom_diff.op);
>    gimple_call_set_lhs (pattern_stmt, new_var);
>    gimple_set_location (pattern_stmt, gimple_location (last_stmt));
>    *type_out = vec_type;
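
For readers following the val/val_cmp bookkeeping in vect_recog_ctz_ffs_pattern, the identities quoted in the comments above can be checked with a small standalone sketch. This is only an illustration, assuming a 32-bit unsigned operand; the helper names (ref_popcount, ref_clz, ref_ctz, check_ctz_identities) are hypothetical and are not part of the patch or of GCC.

```c
#include <assert.h>
#include <stdint.h>

enum { PREC = 32 };	/* Assumed precision of the operand type.  */

/* Portable reference popcount: number of set bits in X.  */
static int
ref_popcount (uint32_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* Portable reference clz; returns PREC for X == 0.  */
static int
ref_clz (uint32_t x)
{
  int n = 0;
  for (uint32_t bit = UINT32_C (1) << (PREC - 1);
       bit != 0 && !(x & bit); bit >>= 1)
    n++;
  return n;
}

/* Portable reference ctz; returns PREC for X == 0.  */
static int
ref_ctz (uint32_t x)
{
  int n = 0;
  for (uint32_t bit = 1; bit != 0 && !(x & bit); bit <<= 1)
    n++;
  return n;
}

/* Check the three rewrites of .CTZ (X) used by the pattern.  */
static void
check_ctz_identities (uint32_t x)
{
  uint32_t neg = (uint32_t) (0u - x);

  /* .CTZ (X) = .POPCOUNT ((X - 1) & ~X); yields PREC at zero.  */
  assert (ref_popcount ((x - 1) & ~x) == ref_ctz (x));
  /* .CTZ (X) = PREC - .POPCOUNT (X | -X); yields PREC at zero.  */
  assert (PREC - ref_popcount (x | neg) == ref_ctz (x));
  /* .CTZ (X) = (PREC - 1) - .CLZ (X & -X); only checked for X != 0,
     since the value at zero depends on what .CLZ returns there.  */
  if (x != 0)
    assert ((PREC - 1) - ref_clz (x & neg) == ref_ctz (x));
}
```

Calling check_ctz_identities on a few values, including 0, shows why the popcount-based forms compare against prec at zero, while the clz-based form compares against sub - val_new.
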
> --- gcc/tree-vect-stmts.cc.jj	2023-11-09 09:04:20.349518853 +0100
> +++ gcc/tree-vect-stmts.cc	2023-11-09 10:00:01.351992895 +0100
> @@ -3266,6 +3266,7 @@ vectorizable_call (vec_info *vinfo,
>    enum { NARROW, NONE, WIDEN } modifier;
>    size_t i, nargs;
>    tree lhs;
> +  tree clz_ctz_arg1 = NULL_TREE;
>  
>    if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
>      return false;
> @@ -3311,6 +3312,14 @@ vectorizable_call (vec_info *vinfo,
>        nargs = 0;
>        rhs_type = unsigned_type_node;
>      }
> +  /* Similarly pretend IFN_CLZ and IFN_CTZ only have one argument; the second
> +     argument just says whether the call is well-defined at zero and what
> +     value should be returned in that case.  */
> +  if ((cfn == CFN_CLZ || cfn == CFN_CTZ) && nargs == 2)
> +    {
> +      nargs = 1;
> +      clz_ctz_arg1 = gimple_call_arg (stmt, 1);
> +    }
>  
>    int mask_opno = -1;
>    if (internal_fn_p (cfn))
> @@ -3576,6 +3585,8 @@ vectorizable_call (vec_info *vinfo,
>        ifn = cond_fn;
>        vect_nargs += 2;
>      }
> +  if (clz_ctz_arg1)
> +    ++vect_nargs;
>  
>    if (modifier == NONE || ifn != IFN_LAST)
>      {
> @@ -3613,6 +3624,9 @@ vectorizable_call (vec_info *vinfo,
>  		    }
>  		  if (masked_loop_p && reduc_idx >= 0)
>  		    vargs[varg++] = vargs[reduc_idx + 1];
> +		  if (clz_ctz_arg1)
> +		    vargs[varg++] = clz_ctz_arg1;
> +
>  		  gimple *new_stmt;
>  		  if (modifier == NARROW)
>  		    {
> @@ -3699,6 +3713,8 @@ vectorizable_call (vec_info *vinfo,
>  	    }
>  	  if (masked_loop_p && reduc_idx >= 0)
>  	    vargs[varg++] = vargs[reduc_idx + 1];
> +	  if (clz_ctz_arg1)
> +	    vargs[varg++] = clz_ctz_arg1;
>  
>  	  if (len_opno >= 0 && len_loop_p)
>  	    {
> --- gcc/tree-ssa-loop-niter.cc.jj	2023-11-09 09:03:53.592897899 +0100
> +++ gcc/tree-ssa-loop-niter.cc	2023-11-09 09:17:40.234182427 +0100
> @@ -2235,14 +2235,18 @@ build_cltz_expr (tree src, bool leading,
>    tree call;
>    if (use_ifn)
>      {
> -      call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
> -					   integer_type_node, 1, src);
>        int val;
>        int optab_defined_at_zero
>  	= (leading
>  	   ? CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val)
>  	   : CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val));
> -      if (define_at_zero && !(optab_defined_at_zero == 2 && val == prec))
> +      tree arg2 = NULL_TREE;
> +      if (define_at_zero && optab_defined_at_zero == 2 && val == prec)
> +	arg2 = build_int_cst (integer_type_node, val);
> +      call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
> +					   integer_type_node, arg2 ? 2 : 1,
> +					   src, arg2);
> +      if (define_at_zero && arg2 == NULL_TREE)
>  	{
>  	  tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
>  				      build_zero_cst (TREE_TYPE (src)));
> --- gcc/tree-ssa-forwprop.cc.jj	2023-11-09 09:03:53.542898608 +0100
> +++ gcc/tree-ssa-forwprop.cc	2023-11-09 09:38:28.895393573 +0100
> @@ -2381,6 +2381,7 @@ simplify_count_trailing_zeroes (gimple_s
>        HOST_WIDE_INT type_size = tree_to_shwi (TYPE_SIZE (type));
>        bool zero_ok
>  	= CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type), ctz_val) == 2;
> +      int nargs = 2;
>  
>        /* If the input value can't be zero, don't special case ctz (0).  */
>        if (tree_expr_nonzero_p (res_ops[0]))
> @@ -2388,6 +2389,7 @@ simplify_count_trailing_zeroes (gimple_s
>  	  zero_ok = true;
>  	  zero_val = 0;
>  	  ctz_val = 0;
> +	  nargs = 1;
>  	}
>  
>        /* Skip if there is no value defined at zero, or if we can't easily
> @@ -2399,7 +2401,11 @@ simplify_count_trailing_zeroes (gimple_s
>  
>        gimple_seq seq = NULL;
>        gimple *g;
> -      gcall *call = gimple_build_call_internal (IFN_CTZ, 1, res_ops[0]);
> +      gcall *call
> +	= gimple_build_call_internal (IFN_CTZ, nargs, res_ops[0],
> +				      nargs == 1 ? NULL_TREE
> +				      : build_int_cst (integer_type_node,
> +						       ctz_val));
>        gimple_set_location (call, gimple_location (stmt));
>        gimple_set_lhs (call, make_ssa_name (integer_type_node));
>        gimple_seq_add_stmt (&seq, call);
> --- gcc/tree-ssa-phiopt.cc.jj	2023-11-09 09:03:53.616897559 +0100
> +++ gcc/tree-ssa-phiopt.cc	2023-11-09 09:17:40.241182328 +0100
> @@ -2863,18 +2863,26 @@ cond_removal_in_builtin_zero_pattern (ba
>      }
>  
>    /* Check that we have a popcount/clz/ctz builtin.  */
> -  if (!is_gimple_call (call) || gimple_call_num_args (call) != 1)
> +  if (!is_gimple_call (call))
>      return false;
>  
> -  arg = gimple_call_arg (call, 0);
>    lhs = gimple_get_lhs (call);
>  
>    if (lhs == NULL_TREE)
>      return false;
>  
>    combined_fn cfn = gimple_call_combined_fn (call);
> +  if (gimple_call_num_args (call) != 1
> +      && (gimple_call_num_args (call) != 2
> +	  || cfn == CFN_CLZ
> +	  || cfn == CFN_CTZ))
> +    return false;
> +
> +  arg = gimple_call_arg (call, 0);
> +
>    internal_fn ifn = IFN_LAST;
>    int val = 0;
> +  bool any_val = false;
>    switch (cfn)
>      {
>      case CFN_BUILT_IN_BSWAP16:
> @@ -2889,6 +2897,23 @@ cond_removal_in_builtin_zero_pattern (ba
>        if (INTEGRAL_TYPE_P (TREE_TYPE (arg)))
>  	{
>  	  tree type = TREE_TYPE (arg);
> +	  if (TREE_CODE (type) == BITINT_TYPE)
> +	    {
> +	      if (gimple_call_num_args (call) == 1)
> +		{
> +		  any_val = true;
> +		  ifn = IFN_CLZ;
> +		  break;
> +		}
> +	      if (!tree_fits_shwi_p (gimple_call_arg (call, 1)))
> +		return false;
> +	      HOST_WIDE_INT at_zero = tree_to_shwi (gimple_call_arg (call, 1));
> +	      if ((int) at_zero != at_zero)
> +		return false;
> +	      ifn = IFN_CLZ;
> +	      val = at_zero;
> +	      break;
> +	    }
>  	  if (direct_internal_fn_supported_p (IFN_CLZ, type, OPTIMIZE_FOR_BOTH)
>  	      && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
>  					    val) == 2)
> @@ -2902,6 +2927,23 @@ cond_removal_in_builtin_zero_pattern (ba
>        if (INTEGRAL_TYPE_P (TREE_TYPE (arg)))
>  	{
>  	  tree type = TREE_TYPE (arg);
> +	  if (TREE_CODE (type) == BITINT_TYPE)
> +	    {
> +	      if (gimple_call_num_args (call) == 1)
> +		{
> +		  any_val = true;
> +		  ifn = IFN_CTZ;
> +		  break;
> +		}
> +	      if (!tree_fits_shwi_p (gimple_call_arg (call, 1)))
> +		return false;
> +	      HOST_WIDE_INT at_zero = tree_to_shwi (gimple_call_arg (call, 1));
> +	      if ((int) at_zero != at_zero)
> +		return false;
> +	      ifn = IFN_CTZ;
> +	      val = at_zero;
> +	      break;
> +	    }
>  	  if (direct_internal_fn_supported_p (IFN_CTZ, type, OPTIMIZE_FOR_BOTH)
>  	      && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
>  					    val) == 2)
> @@ -2960,8 +3002,18 @@ cond_removal_in_builtin_zero_pattern (ba
>  
>    /* Check PHI arguments.  */
>    if (lhs != arg0
> -      || TREE_CODE (arg1) != INTEGER_CST
> -      || wi::to_wide (arg1) != val)
> +      || TREE_CODE (arg1) != INTEGER_CST)
> +    return false;
> +  if (any_val)
> +    {
> +      if (!tree_fits_shwi_p (arg1))
> +	return false;
> +      HOST_WIDE_INT at_zero = tree_to_shwi (arg1);
> +      if ((int) at_zero != at_zero)
> +	return false;
> +      val = at_zero;
> +    }
> +  else if (wi::to_wide (arg1) != val)
>      return false;
>  
>    /* And insert the popcount/clz/ctz builtin and cast stmt before the
> @@ -2974,13 +3026,15 @@ cond_removal_in_builtin_zero_pattern (ba
>        reset_flow_sensitive_info (gimple_get_lhs (cast));
>      }
>    gsi_from = gsi_for_stmt (call);
> -  if (ifn == IFN_LAST || gimple_call_internal_p (call))
> +  if (ifn == IFN_LAST
> +      || (gimple_call_internal_p (call) && gimple_call_num_args (call) == 2))
>      gsi_move_before (&gsi_from, &gsi);
>    else
>      {
>        /* For __builtin_c[lt]z* force .C[LT]Z ifn, because only
>  	 the latter is well defined at zero.  */
> -      call = gimple_build_call_internal (ifn, 1, gimple_call_arg (call, 0));
> +      call = gimple_build_call_internal (ifn, 2, gimple_call_arg (call, 0),
> +					 build_int_cst (integer_type_node, val));
>        gimple_call_set_lhs (call, lhs);
>        gsi_insert_before (&gsi, call, GSI_SAME_STMT);
>        gsi_remove (&gsi_from, true);
> --- gcc/doc/extend.texi.jj	2023-11-09 09:04:18.823540470 +0100
> +++ gcc/doc/extend.texi	2023-11-09 09:17:40.240182342 +0100
> @@ -14960,6 +14960,42 @@ Similar to @code{__builtin_parity}, exce
>  @code{unsigned long long}.
>  @enddefbuiltin
>  
> +@defbuiltin{int __builtin_ffsg (...)}
> +Similar to @code{__builtin_ffs}, except the argument is a type-generic
> +signed integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_clzg (...)}
> +Similar to @code{__builtin_clz}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise) and there is an
> +optional second argument of type @code{int}.  If two arguments are
> +specified and the first argument is 0, the result is the second argument.
> +If only one argument is specified and it is 0, the result is undefined.
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_ctzg (...)}
> +Similar to @code{__builtin_ctz}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise) and there is an
> +optional second argument of type @code{int}.  If two arguments are
> +specified and the first argument is 0, the result is the second argument.
> +If only one argument is specified and it is 0, the result is undefined.
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_clrsbg (...)}
> +Similar to @code{__builtin_clrsb}, except the argument is a type-generic
> +signed integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_popcountg (...)}
> +Similar to @code{__builtin_popcount}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_parityg (...)}
> +Similar to @code{__builtin_parity}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
>  @defbuiltin{double __builtin_powi (double, int)}
>  @defbuiltinx{float __builtin_powif (float, int)}
>  @defbuiltinx{{long double} __builtin_powil (long double, int)}
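
To make the documented two-argument behaviour concrete, here is a minimal portable sketch of what the text above describes for an unsigned char operand. It does not call the new builtins (so it builds with compilers that lack them); ref_clzg_uchar and ref_ctzg_uchar are hypothetical reference helpers written for this illustration only. The point to notice is that the count is taken in the precision of the argument's own type, since the first argument is not promoted.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical reference for __builtin_clzg (x, fallback) with an
   unsigned char X: leading zeros counted within CHAR_BIT bits,
   FALLBACK returned when X is 0.  */
static int
ref_clzg_uchar (unsigned char x, int fallback)
{
  if (x == 0)
    return fallback;
  int n = 0;
  for (unsigned bit = 1u << (CHAR_BIT - 1); !(x & bit); bit >>= 1)
    n++;
  return n;
}

/* Hypothetical reference for __builtin_ctzg (x, fallback) with an
   unsigned char X: trailing zeros, FALLBACK returned when X is 0.  */
static int
ref_ctzg_uchar (unsigned char x, int fallback)
{
  if (x == 0)
    return fallback;
  int n = 0;
  for (unsigned bit = 1u; !(x & bit); bit <<= 1)
    n++;
  return n;
}
```

With this reference, an argument of 1 gives CHAR_BIT - 1 leading zeros rather than the value an int-promoted operand would give, and a zero argument simply yields whatever the caller passed as the second argument.
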
> --- gcc/c-family/c-common.cc.jj	2023-11-09 09:04:18.409546335 +0100
> +++ gcc/c-family/c-common.cc	2023-11-09 09:17:40.236182399 +0100
> @@ -6475,14 +6475,14 @@ check_builtin_function_arguments (locati
>  	      }
>  	  if (TREE_CODE (TREE_TYPE (args[2])) == ENUMERAL_TYPE)
>  	    {
> -	      error_at (ARG_LOCATION (2), "argument 3 in call to function "
> -			"%qE has enumerated type", fndecl);
> +	      error_at (ARG_LOCATION (2), "argument %u in call to function "
> +			"%qE has enumerated type", 3, fndecl);
>  	      return false;
>  	    }
>  	  else if (TREE_CODE (TREE_TYPE (args[2])) == BOOLEAN_TYPE)
>  	    {
> -	      error_at (ARG_LOCATION (2), "argument 3 in call to function "
> -			"%qE has boolean type", fndecl);
> +	      error_at (ARG_LOCATION (2), "argument %u in call to function "
> +			"%qE has boolean type", 3, fndecl);
>  	      return false;
>  	    }
>  	  return true;
> @@ -6522,6 +6522,72 @@ check_builtin_function_arguments (locati
>  	}
>        return false;
>  
> +    case BUILT_IN_CLZG:
> +    case BUILT_IN_CTZG:
> +    case BUILT_IN_CLRSBG:
> +    case BUILT_IN_FFSG:
> +    case BUILT_IN_PARITYG:
> +    case BUILT_IN_POPCOUNTG:
> +      if (nargs == 2
> +	  && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CLZG
> +	      || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CTZG))
> +	{
> +	  if (!INTEGRAL_TYPE_P (TREE_TYPE (args[1])))
> +	    {
> +	      error_at (ARG_LOCATION (1), "argument %u in call to function "
> +			"%qE does not have integral type", 2, fndecl);
> +	      return false;
> +	    }
> +	  if ((TYPE_PRECISION (TREE_TYPE (args[1]))
> +	       > TYPE_PRECISION (integer_type_node))
> +	      || (TYPE_PRECISION (TREE_TYPE (args[1]))
> +		  == TYPE_PRECISION (integer_type_node)
> +		  && TYPE_UNSIGNED (TREE_TYPE (args[1]))))
> +	    {
> +	      error_at (ARG_LOCATION (1), "argument %u in call to function "
> +			"%qE does not have %<int%> type", 2, fndecl);
> +	      return false;
> +	    }
> +	}
> +      else if (!builtin_function_validate_nargs (loc, fndecl, nargs, 1))
> +	return false;
> +
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (args[0])))
> +	{
> +	  error_at (ARG_LOCATION (0), "argument %u in call to function "
> +		    "%qE does not have integral type", 1, fndecl);
> +	  return false;
> +	}
> +      if (TREE_CODE (TREE_TYPE (args[0])) == ENUMERAL_TYPE)
> +	{
> +	  error_at (ARG_LOCATION (0), "argument %u in call to function "
> +		    "%qE has enumerated type", 1, fndecl);
> +	  return false;
> +	}
> +      if (TREE_CODE (TREE_TYPE (args[0])) == BOOLEAN_TYPE)
> +	{
> +	  error_at (ARG_LOCATION (0), "argument %u in call to function "
> +		    "%qE has boolean type", 1, fndecl);
> +	  return false;
> +	}
> +      if (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_FFSG
> +	  || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CLRSBG)
> +	{
> +	  if (TYPE_UNSIGNED (TREE_TYPE (args[0])))
> +	    {
> +	      error_at (ARG_LOCATION (0), "argument 1 in call to function "
> +			"%qE has unsigned type", fndecl);
> +	      return false;
> +	    }
> +	}
> +      else if (!TYPE_UNSIGNED (TREE_TYPE (args[0])))
> +	{
> +	  error_at (ARG_LOCATION (0), "argument 1 in call to function "
> +		    "%qE has signed type", fndecl);
> +	  return false;
> +	}
> +      return true;
> +
>      default:
>        return true;
>      }
> --- gcc/c-family/c-gimplify.cc.jj	2023-11-09 09:03:53.251902730 +0100
> +++ gcc/c-family/c-gimplify.cc	2023-11-09 09:17:40.237182384 +0100
> @@ -818,6 +818,28 @@ c_gimplify_expr (tree *expr_p, gimple_se
>  	break;
>        }
>  
> +    case CALL_EXPR:
> +      {
> +	tree fndecl = get_callee_fndecl (*expr_p);
> +	if (fndecl
> +	    && fndecl_built_in_p (fndecl, BUILT_IN_CLZG, BUILT_IN_CTZG)
> +	    && call_expr_nargs (*expr_p) == 2
> +	    && TREE_CODE (CALL_EXPR_ARG (*expr_p, 1)) != INTEGER_CST)
> +	  {
> +	    tree a = save_expr (CALL_EXPR_ARG (*expr_p, 0));
> +	    tree c = build_call_expr_loc (EXPR_LOCATION (*expr_p),
> +					  fndecl, 1, a);
> +	    *expr_p = build3_loc (EXPR_LOCATION (*expr_p), COND_EXPR,
> +				  integer_type_node,
> +				  build2_loc (EXPR_LOCATION (*expr_p),
> +					      NE_EXPR, boolean_type_node, a,
> +					      build_zero_cst (TREE_TYPE (a))),
> +				  c, CALL_EXPR_ARG (*expr_p, 1));
> +	    return GS_OK;
> +	  }
> +	break;
> +      }
> +
>      default:;
>      }
>  
> --- gcc/c/c-typeck.cc.jj	2023-11-09 09:04:18.537544522 +0100
> +++ gcc/c/c-typeck.cc	2023-11-09 10:57:28.672517220 +0100
> @@ -3560,6 +3560,7 @@ convert_arguments (location_t loc, vec<l
>      && lookup_attribute ("type generic", TYPE_ATTRIBUTES (TREE_TYPE (fundecl)));
>    bool type_generic_remove_excess_precision = false;
>    bool type_generic_overflow_p = false;
> +  bool type_generic_bit_query = false;
>    tree selector;
>  
>    /* Change pointer to function to the function itself for
> @@ -3615,6 +3616,17 @@ convert_arguments (location_t loc, vec<l
>  	    type_generic_overflow_p = true;
>  	    break;
>  
> +	  case BUILT_IN_CLZG:
> +	  case BUILT_IN_CTZG:
> +	  case BUILT_IN_CLRSBG:
> +	  case BUILT_IN_FFSG:
> +	  case BUILT_IN_PARITYG:
> +	  case BUILT_IN_POPCOUNTG:
> +	    /* The first argument of these type-generic builtins
> +	       should not be promoted.  */
> +	    type_generic_bit_query = true;
> +	    break;
> +
>  	  default:
>  	    break;
>  	  }
> @@ -3750,11 +3762,13 @@ convert_arguments (location_t loc, vec<l
>  	    }
>  	}
>        else if ((excess_precision && !type_generic)
> -	       || (type_generic_overflow_p && parmnum == 2))
> +	       || (type_generic_overflow_p && parmnum == 2)
> +	       || (type_generic_bit_query && parmnum == 0))
>  	/* A "double" argument with excess precision being passed
>  	   without a prototype or in variable arguments.
>  	   The last argument of __builtin_*_overflow_p should not be
> -	   promoted.  */
> +	   promoted, similarly the first argument of
> +	   __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g.  */
>  	parmval = convert (valtype, val);
>        else if ((invalid_func_diag =
>  		targetm.calls.invalid_arg_for_unprototyped_fn (typelist, fundecl, val)))
> --- gcc/cp/call.cc.jj	2023-11-04 09:02:35.376001531 +0100
> +++ gcc/cp/call.cc	2023-11-09 11:03:06.687737428 +0100
> @@ -9290,7 +9290,9 @@ convert_for_arg_passing (tree type, tree
>     This is true for some builtins which don't act like normal functions.
>     Return 2 if just decay_conversion and removal of excess precision should
>     be done, 1 if just decay_conversion.  Return 3 for special treatment of
> -   the 3rd argument for __builtin_*_overflow_p.  */
> +   the 3rd argument for __builtin_*_overflow_p.  Return 4 for special
> +   treatment of the 1st argument for
> +   __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g.  */
>  
>  int
>  magic_varargs_p (tree fn)
> @@ -9317,6 +9319,14 @@ magic_varargs_p (tree fn)
>        case BUILT_IN_FPCLASSIFY:
>  	return 2;
>  
> +      case BUILT_IN_CLZG:
> +      case BUILT_IN_CTZG:
> +      case BUILT_IN_CLRSBG:
> +      case BUILT_IN_FFSG:
> +      case BUILT_IN_PARITYG:
> +      case BUILT_IN_POPCOUNTG:
> +	return 4;
> +
>        default:
>  	return lookup_attribute ("type generic",
>  				 TYPE_ATTRIBUTES (TREE_TYPE (fn))) != 0;
> @@ -10122,7 +10132,7 @@ build_over_call (struct z_candidate *can
>    for (; arg_index < vec_safe_length (args); ++arg_index)
>      {
>        tree a = (*args)[arg_index];
> -      if (magic == 3 && arg_index == 2)
> +      if ((magic == 3 && arg_index == 2) || (magic == 4 && arg_index == 0))
>  	{
>  	  /* Do no conversions for certain magic varargs.  */
>  	  a = mark_type_use (a);
> --- gcc/cp/cp-gimplify.cc.jj	2023-11-02 07:49:15.839882778 +0100
> +++ gcc/cp/cp-gimplify.cc	2023-11-09 12:11:59.834140462 +0100
> @@ -771,6 +771,10 @@ cp_gimplify_expr (tree *expr_p, gimple_s
>  	      default:
>  		break;
>  	      }
> +	  else if (decl
> +		   && fndecl_built_in_p (decl, BUILT_IN_CLZG, BUILT_IN_CTZG))
> +	    ret = (enum gimplify_status) c_gimplify_expr (expr_p, pre_p,
> +							  post_p);
>  	}
>        break;
>  
> --- gcc/testsuite/c-c++-common/pr111309-1.c.jj	2023-11-09 10:35:28.974541671 +0100
> +++ gcc/testsuite/c-c++-common/pr111309-1.c	2023-11-09 11:54:02.817389761 +0100
> @@ -0,0 +1,470 @@
> +/* PR c/111309 */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__((noipa)) int
> +clzc (unsigned char x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzc2 (unsigned char x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clzs (unsigned short x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzs2 (unsigned short x)
> +{
> +  return __builtin_clzg (x, -2);
> +}
> +
> +__attribute__((noipa)) int
> +clzi (unsigned int x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzi2 (unsigned int x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clzl (unsigned long x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzl2 (unsigned long x)
> +{
> +  return __builtin_clzg (x, -1);
> +}
> +
> +__attribute__((noipa)) int
> +clzL (unsigned long long x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzL2 (unsigned long long x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +clzI (unsigned __int128 x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzI2 (unsigned __int128 x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +ctzc (unsigned char x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzc2 (unsigned char x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzs (unsigned short x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzs2 (unsigned short x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzi (unsigned int x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzi2 (unsigned int x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzl (unsigned long x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzl2 (unsigned long x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzL (unsigned long long x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzL2 (unsigned long long x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +ctzI (unsigned __int128 x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzI2 (unsigned __int128 x)
> +{
> +  return __builtin_ctzg (x, __SIZEOF_INT128__ * __CHAR_BIT__);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +clrsbc (signed char x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbs (signed short x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbi (signed int x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbl (signed long x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbL (signed long long x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +clrsbI (signed __int128 x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +ffsc (signed char x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffss (signed short x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffsi (signed int x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffsl (signed long x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffsL (signed long long x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +ffsI (signed __int128 x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +parityc (unsigned char x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +paritys (unsigned short x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parityi (unsigned int x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parityl (unsigned long x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parityL (unsigned long long x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +parityI (unsigned __int128 x)
> +{
> +  return __builtin_parityg (x);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +popcountc (unsigned char x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcounts (unsigned short x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcounti (unsigned int x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcountl (unsigned long x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcountL (unsigned long long x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +popcountI (unsigned __int128 x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +  if (__builtin_clzg ((unsigned char) 1) != __CHAR_BIT__ - 1
> +      || __builtin_clzg ((unsigned short) 2, -2) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
> +      || __builtin_clzg (0U, 42) != 42
> +      || __builtin_clzg (0U, -1) != -1
> +      || __builtin_clzg (1U) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || __builtin_clzg (2UL, -1) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || __builtin_clzg (5ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_clzg ((unsigned __int128) 9) != __SIZEOF_INT128__ * __CHAR_BIT__ - 4
> +#endif
> +      || __builtin_clzg (~0U, -5) != 0
> +      || __builtin_clzg (~0ULL >> 2) != 2
> +      || __builtin_ctzg ((unsigned char) 1) != 0
> +      || __builtin_ctzg ((unsigned short) 28) != 2
> +      || __builtin_ctzg (0U, 32) != 32
> +      || __builtin_ctzg (0U, -42) != -42
> +      || __builtin_ctzg (1U) != 0
> +      || __builtin_ctzg (16UL, -1) != 4
> +      || __builtin_ctzg (5ULL << 52, 0) != 52
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_ctzg (((unsigned __int128) 9) << 72) != 72
> +#endif
> +      || __builtin_clrsbg ((signed char) 0) != __CHAR_BIT__ - 1
> +      || __builtin_clrsbg ((signed short) -1) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 1
> +      || __builtin_clrsbg (0) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || __builtin_clrsbg (-1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 1
> +      || __builtin_clrsbg (0LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_clrsbg ((__int128) -1) != __SIZEOF_INT128__ * __CHAR_BIT__ - 1
> +#endif
> +      || __builtin_clrsbg (0x1afb) != __SIZEOF_INT__ * __CHAR_BIT__ - 14
> +      || __builtin_clrsbg (-2) != __SIZEOF_INT__ * __CHAR_BIT__ - 2
> +      || __builtin_clrsbg (1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || __builtin_clrsbg (-4LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +      || __builtin_ffsg ((signed char) 0) != 0
> +      || __builtin_ffsg ((signed short) 0) != 0
> +      || __builtin_ffsg (0) != 0
> +      || __builtin_ffsg (0L) != 0
> +      || __builtin_ffsg (0LL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_ffsg ((__int128) 0) != 0
> +#endif
> +      || __builtin_ffsg ((signed char) 4) != 3
> +      || __builtin_ffsg ((signed short) 8) != 4
> +      || __builtin_ffsg (1) != 1
> +      || __builtin_ffsg (2L) != 2
> +      || __builtin_ffsg (28LL) != 3
> +      || __builtin_parityg ((unsigned char) 1) != 1
> +      || __builtin_parityg ((unsigned short) 2) != 1
> +      || __builtin_parityg (0U) != 0
> +      || __builtin_parityg (3U) != 0
> +      || __builtin_parityg (0UL) != 0
> +      || __builtin_parityg (7UL) != 1
> +      || __builtin_parityg (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_parityg ((unsigned __int128) 0) != 0
> +#endif
> +      || __builtin_parityg ((unsigned char) ~0U) != 0
> +      || __builtin_parityg ((unsigned short) ~0U) != 0
> +      || __builtin_parityg (~0U) != 0
> +      || __builtin_parityg (~0UL) != 0
> +      || __builtin_parityg (~0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_parityg (~(unsigned __int128) 0) != 0
> +#endif
> +      || __builtin_popcountg (0U) != 0
> +      || __builtin_popcountg (0UL) != 0
> +      || __builtin_popcountg (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_popcountg ((unsigned __int128) 0) != 0
> +#endif
> +      || __builtin_popcountg ((unsigned char) ~0U) != __CHAR_BIT__
> +      || __builtin_popcountg ((unsigned short) ~0U) != __SIZEOF_SHORT__ * __CHAR_BIT__
> +      || __builtin_popcountg (~0U) != __SIZEOF_INT__ * __CHAR_BIT__
> +      || __builtin_popcountg (~0UL) != __SIZEOF_LONG__ * __CHAR_BIT__
> +      || __builtin_popcountg (~0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_popcountg (~(unsigned __int128) 0) != __SIZEOF_INT128__ * __CHAR_BIT__
> +#endif
> +      || 0)
> +  __builtin_abort ();
> +  if (clzc (1) != __CHAR_BIT__ - 1
> +      || clzs2 (2) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
> +      || clzi2 (0U, 42) != 42
> +      || clzi2 (0U, -1) != -1
> +      || clzi (1U) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || clzl2 (2UL) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || clzL (5ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +#ifdef __SIZEOF_INT128__
> +      || clzI ((unsigned __int128) 9) != __SIZEOF_INT128__ * __CHAR_BIT__ - 4
> +#endif
> +      || clzi2 (~0U, -5) != 0
> +      || clzL (~0ULL >> 2) != 2
> +      || ctzc (1) != 0
> +      || ctzs (28) != 2
> +      || ctzi2 (0U, 32) != 32
> +      || ctzi2 (0U, -42) != -42
> +      || ctzi (1U) != 0
> +      || ctzl2 (16UL, -1) != 4
> +      || ctzL2 (5ULL << 52, 0) != 52
> +#ifdef __SIZEOF_INT128__
> +      || ctzI (((unsigned __int128) 9) << 72) != 72
> +#endif
> +      || clrsbc (0) != __CHAR_BIT__ - 1
> +      || clrsbs (-1) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 1
> +      || clrsbi (0) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || clrsbl (-1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 1
> +      || clrsbL (0LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1
> +#ifdef __SIZEOF_INT128__
> +      || clrsbI (-1) != __SIZEOF_INT128__ * __CHAR_BIT__ - 1
> +#endif
> +      || clrsbi (0x1afb) != __SIZEOF_INT__ * __CHAR_BIT__ - 14
> +      || clrsbi (-2) != __SIZEOF_INT__ * __CHAR_BIT__ - 2
> +      || clrsbl (1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || clrsbL (-4LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +      || ffsc (0) != 0
> +      || ffss (0) != 0
> +      || ffsi (0) != 0
> +      || ffsl (0L) != 0
> +      || ffsL (0LL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || ffsI (0) != 0
> +#endif
> +      || ffsc (4) != 3
> +      || ffss (8) != 4
> +      || ffsi (1) != 1
> +      || ffsl (2L) != 2
> +      || ffsL (28LL) != 3
> +      || parityc (1) != 1
> +      || paritys (2) != 1
> +      || parityi (0U) != 0
> +      || parityi (3U) != 0
> +      || parityl (0UL) != 0
> +      || parityl (7UL) != 1
> +      || parityL (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || parityI (0) != 0
> +#endif
> +      || parityc ((unsigned char) ~0U) != 0
> +      || paritys ((unsigned short) ~0U) != 0
> +      || parityi (~0U) != 0
> +      || parityl (~0UL) != 0
> +      || parityL (~0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || parityI (~(unsigned __int128) 0) != 0
> +#endif
> +      || popcounti (0U) != 0
> +      || popcountl (0UL) != 0
> +      || popcountL (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || popcountI (0) != 0
> +#endif
> +      || popcountc ((unsigned char) ~0U) != __CHAR_BIT__
> +      || popcounts ((unsigned short) ~0U) != __SIZEOF_SHORT__ * __CHAR_BIT__
> +      || popcounti (~0U) != __SIZEOF_INT__ * __CHAR_BIT__
> +      || popcountl (~0UL) != __SIZEOF_LONG__ * __CHAR_BIT__
> +      || popcountL (~0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
> +#ifdef __SIZEOF_INT128__
> +      || popcountI (~(unsigned __int128) 0) != __SIZEOF_INT128__ * __CHAR_BIT__
> +#endif
> +      || 0)
> +  __builtin_abort ();
> +}
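
For anyone checking the expected values in the runtime test above by hand, here is a minimal reference sketch in plain C, restricted to unsigned int / int. The ref_* names and UINT_BITS are hypothetical helpers for illustration only, not part of the patch, and the code assumes unsigned int has no padding bits. It is not how the builtins are implemented, it only mirrors the semantics the assertions rely on (fallback value for a zero argument in clz/ctz, ffs of 0 is 0, clrsb counts the redundant sign bits).

```c
#include <limits.h>

/* Hypothetical reference helpers, not part of the patch.
   Assumes unsigned int has no padding bits.  */
enum { UINT_BITS = sizeof (unsigned int) * CHAR_BIT };

/* Count leading zero bits; return FALLBACK when X is 0, like the
   two-argument form of __builtin_clzg.  */
static int
ref_clz (unsigned int x, int fallback)
{
  if (x == 0)
    return fallback;
  int n = 0;
  for (unsigned int bit = 1U << (UINT_BITS - 1); (x & bit) == 0; bit >>= 1)
    n++;
  return n;
}

/* Count trailing zero bits; return FALLBACK when X is 0.  */
static int
ref_ctz (unsigned int x, int fallback)
{
  if (x == 0)
    return fallback;
  int n = 0;
  while ((x & 1U) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* One plus the index of the least significant set bit, 0 for 0.  */
static int
ref_ffs (int x)
{
  return ref_ctz ((unsigned int) x, -1) + 1;
}

/* Number of bits below the sign bit that are copies of it.  Negative
   values are inverted first so the copies become leading zeros.  */
static int
ref_clrsb (int x)
{
  unsigned int u = (unsigned int) x;
  if (x < 0)
    u = ~u;
  return ref_clz (u, UINT_BITS) - 1;
}

/* Number of set bits, clearing the lowest set bit each iteration.  */
static int
ref_popcount (unsigned int x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* 1 if the number of set bits is odd, 0 otherwise.  */
static int
ref_parity (unsigned int x)
{
  return ref_popcount (x) & 1;
}
```

With these, ref_clrsb (0x1afb) comes out as UINT_BITS - 14 and ref_ffs (28) as 3, matching the corresponding checks in the test.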
> --- gcc/testsuite/c-c++-common/pr111309-2.c.jj	2023-11-09 11:33:42.680632470 +0100
> +++ gcc/testsuite/c-c++-common/pr111309-2.c	2023-11-09 12:03:11.062619162 +0100
> @@ -0,0 +1,85 @@
> +/* PR c/111309 */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-std=c99" { target c } } */
> +
> +#ifndef __cplusplus
> +#define bool _Bool
> +#define true ((_Bool) 1)
> +#define false ((_Bool) 0)
> +#endif
> +
> +void
> +foo (void)
> +{
> +  enum E { E0 = 0 };
> +  struct S { int s; } s;
> +  __builtin_clzg ();		/* { dg-error "too few arguments" } */
> +  __builtin_clzg (0U, 1, 2);	/* { dg-error "too many arguments" } */
> +  __builtin_clzg (0);		/* { dg-error "has signed type" } */
> +  __builtin_clzg (0.0);		/* { dg-error "does not have integral type" } */
> +  __builtin_clzg (s);		/* { dg-error "does not have integral type" } */
> +  __builtin_clzg (true);	/* { dg-error "has boolean type" } */
> +  __builtin_clzg (E0);		/* { dg-error "has signed type" "" { target c } } */
> +				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_clzg (0, 0);	/* { dg-error "has signed type" } */
> +  __builtin_clzg (0.0, 0);	/* { dg-error "does not have integral type" } */
> +  __builtin_clzg (s, 0);	/* { dg-error "does not have integral type" } */
> +  __builtin_clzg (true, 0);	/* { dg-error "has boolean type" } */
> +  __builtin_clzg (E0, 0);	/* { dg-error "has signed type" "" { target c } } */
> +				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_clzg (0U, 2.0);	/* { dg-error "does not have integral type" } */
> +  __builtin_clzg (0U, s);	/* { dg-error "does not have integral type" } */
> +  __builtin_clzg (0U, 2LL);	/* { dg-error "does not have 'int' type" } */
> +  __builtin_clzg (0U, 2U);	/* { dg-error "does not have 'int' type" } */
> +  __builtin_clzg (0U, true);
> +  __builtin_clzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target c++ } } */
> +  __builtin_ctzg ();		/* { dg-error "too few arguments" } */
> +  __builtin_ctzg (0U, 1, 2);	/* { dg-error "too many arguments" } */
> +  __builtin_ctzg (0);		/* { dg-error "has signed type" } */
> +  __builtin_ctzg (0.0);		/* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (s);		/* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (true);	/* { dg-error "has boolean type" } */
> +  __builtin_ctzg (E0);		/* { dg-error "has signed type" "" { target c } } */
> +				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_ctzg (0, 0);	/* { dg-error "has signed type" } */
> +  __builtin_ctzg (0.0, 0);	/* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (s, 0);	/* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (true, 0);	/* { dg-error "has boolean type" } */
> +  __builtin_ctzg (E0, 0);	/* { dg-error "has signed type" "" { target c } } */
> +				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_ctzg (0U, 2.0);	/* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (0U, 2LL);	/* { dg-error "does not have 'int' type" } */
> +  __builtin_ctzg (0U, 2U);	/* { dg-error "does not have 'int' type" } */
> +  __builtin_ctzg (0U, true);
> +  __builtin_ctzg (0U, E0);	/* { dg-error "does not have 'int' type" "" { target c++ } } */
> +  __builtin_clrsbg ();		/* { dg-error "too few arguments" } */
> +  __builtin_clrsbg (0, 1);	/* { dg-error "too many arguments" } */
> +  __builtin_clrsbg (0U);	/* { dg-error "has unsigned type" } */
> +  __builtin_clrsbg (0.0);	/* { dg-error "does not have integral type" } */
> +  __builtin_clrsbg (s);		/* { dg-error "does not have integral type" } */
> +  __builtin_clrsbg (true);	/* { dg-error "has boolean type" } */
> +  __builtin_clrsbg (E0);	/* { dg-error "has enumerated type" "" { target c++ } } */
> +  __builtin_ffsg ();		/* { dg-error "too few arguments" } */
> +  __builtin_ffsg (0, 1);	/* { dg-error "too many arguments" } */
> +  __builtin_ffsg (0U);		/* { dg-error "has unsigned type" } */
> +  __builtin_ffsg (0.0);		/* { dg-error "does not have integral type" } */
> +  __builtin_ffsg (s);		/* { dg-error "does not have integral type" } */
> +  __builtin_ffsg (true);	/* { dg-error "has boolean type" } */
> +  __builtin_ffsg (E0);		/* { dg-error "has enumerated type" "" { target c++ } } */
> +  __builtin_parityg ();		/* { dg-error "too few arguments" } */
> +  __builtin_parityg (0U, 1);	/* { dg-error "too many arguments" } */
> +  __builtin_parityg (0);	/* { dg-error "has signed type" } */
> +  __builtin_parityg (0.0);	/* { dg-error "does not have integral type" } */
> +  __builtin_parityg (s);	/* { dg-error "does not have integral type" } */
> +  __builtin_parityg (true);	/* { dg-error "has boolean type" } */
> +  __builtin_parityg (E0);	/* { dg-error "has signed type" "" { target c } } */
> +				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_popcountg ();	/* { dg-error "too few arguments" } */
> +  __builtin_popcountg (0U, 1);	/* { dg-error "too many arguments" } */
> +  __builtin_popcountg (0);	/* { dg-error "has signed type" } */
> +  __builtin_popcountg (0.0);	/* { dg-error "does not have integral type" } */
> +  __builtin_popcountg (s);	/* { dg-error "does not have integral type" } */
> +  __builtin_popcountg (true);	/* { dg-error "has boolean type" } */
> +  __builtin_popcountg (E0);	/* { dg-error "has signed type" "" { target c } } */
> +				/* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +}
> --- gcc/testsuite/gcc.dg/torture/bitint-43.c.jj	2023-11-09 09:17:40.233182441 +0100
> +++ gcc/testsuite/gcc.dg/torture/bitint-43.c	2023-11-09 12:16:51.757013390 +0100
> @@ -0,0 +1,306 @@
> +/* PR c/111309 */
> +/* { dg-do run { target bitint } } */
> +/* { dg-options "-std=c2x -pedantic-errors" } */
> +/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
> +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 156
> +__attribute__((noipa)) int
> +clz156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD156 (unsigned _BitInt(156) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD156 (unsigned _BitInt(156) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb156 (_BitInt(156) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs156 (_BitInt(156) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +#if __BITINT_MAXWIDTH__ >= 192
> +__attribute__((noipa)) int
> +clz192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD192 (unsigned _BitInt(192) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD192 (unsigned _BitInt(192) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb192 (_BitInt(192) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs192 (_BitInt(192) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __BITINT_MAXWIDTH__ >= 156
> +  if (clzd156 (0) != 156
> +      || clzD156 (0, -1) != -1
> +      || ctzd156 (0) != 156
> +      || ctzD156 (0, 42) != 42
> +      || clrsb156 (0) != 156 - 1
> +      || ffs156 (0) != 0
> +      || parity156 (0) != 0
> +      || popcount156 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(156)) 0, 156 + 32) != 156 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(156)) 0, 156) != 156
> +      || __builtin_clrsbg ((_BitInt(156)) 0) != 156 - 1
> +      || __builtin_ffsg ((_BitInt(156)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(156)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(156)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz156 (-1) != 0
> +      || clzd156 (-1) != 0
> +      || clzD156 (-1, 0) != 0
> +      || ctz156 (-1) != 0
> +      || ctzd156 (-1) != 0
> +      || ctzD156 (-1, 17) != 0
> +      || clrsb156 (-1) != 156 - 1
> +      || ffs156 (-1) != 1
> +      || parity156 (-1) != 0
> +      || popcount156 (-1) != 156
> +      || __builtin_clzg ((unsigned _BitInt(156)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(156)) -1, 156 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(156)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(156)) -1, 156) != 0
> +      || __builtin_clrsbg ((_BitInt(156)) -1) != 156 - 1
> +      || __builtin_ffsg ((_BitInt(156)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(156)) -1) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(156)) -1) != 156)
> +    __builtin_abort ();
> +  if (clz156 (((unsigned _BitInt(156)) -1) >> 24) != 24
> +      || clz156 (((unsigned _BitInt(156)) -1) >> 79) != 79
> +      || clz156 (1) != 156 - 1
> +      || clzd156 (((unsigned _BitInt(156)) -1) >> 139) != 139
> +      || clzd156 (2) != 156 - 2
> +      || ctz156 (((unsigned _BitInt(156)) -1) << 42) != 42
> +      || ctz156 (((unsigned _BitInt(156)) -1) << 57) != 57
> +      || ctz156 (0x4000000000000000000000uwb) != 86
> +      || ctzd156 (((unsigned _BitInt(156)) -1) << 149) != 149
> +      || ctzd156 (2) != 1
> +      || clrsb156 ((unsigned _BitInt(156 - 4)) -1) != 3
> +      || clrsb156 ((unsigned _BitInt(156 - 28)) -1) != 27
> +      || clrsb156 ((unsigned _BitInt(156 - 29)) -1) != 28
> +      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 68)) -1) != 67
> +      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 92)) -1) != 91
> +      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 93)) -1) != 92
> +      || ffs156 (((unsigned _BitInt(156)) -1) << 42) != 43
> +      || ffs156 (((unsigned _BitInt(156)) -1) << 57) != 58
> +      || ffs156 (0x4000000000000000000000uwb) != 87
> +      || ffs156 (((unsigned _BitInt(156)) -1) << 149) != 150
> +      || ffs156 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(156)) 1) != 156 - 1
> +      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 139, 156) != 139
> +      || __builtin_clzg ((unsigned _BitInt(156)) 2, 156) != 156 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(156)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 149, 156) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(156)) 2, 156) != 1
> +      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(156)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(156)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity156 (23008250258685373142923325827291949461178444434uwb) != __builtin_parityg (23008250258685373142923325827291949461178444434uwb)
> +      || parity156 (41771568792516301628132437740665810252917251244uwb) != __builtin_parityg (41771568792516301628132437740665810252917251244uwb)
> +      || parity156 (5107402473866766219120283991834936835726115452uwb) != __builtin_parityg (5107402473866766219120283991834936835726115452uwb)
> +      || popcount156 (50353291748276374580944955711958129678996395562uwb) != __builtin_popcountg (50353291748276374580944955711958129678996395562uwb)
> +      || popcount156 (29091263616891212550063067166307725491211684496uwb) != __builtin_popcountg (29091263616891212550063067166307725491211684496uwb)
> +      || popcount156 (64973284306583205619384799873110935608793072026uwb) != __builtin_popcountg (64973284306583205619384799873110935608793072026uwb))
> +    __builtin_abort ();
> +#endif
> +#if __BITINT_MAXWIDTH__ >= 192
> +  if (clzd192 (0) != 192
> +      || clzD192 (0, 42) != 42
> +      || ctzd192 (0) != 192
> +      || ctzD192 (0, -1) != -1
> +      || clrsb192 (0) != 192 - 1
> +      || ffs192 (0) != 0
> +      || parity192 (0) != 0
> +      || popcount192 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(192)) 0, 192 + 32) != 192 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(192)) 0, 192) != 192
> +      || __builtin_clrsbg ((_BitInt(192)) 0) != 192 - 1
> +      || __builtin_ffsg ((_BitInt(192)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(192)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(192)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz192 (-1) != 0
> +      || clzd192 (-1) != 0
> +      || clzD192 (-1, 15) != 0
> +      || ctz192 (-1) != 0
> +      || ctzd192 (-1) != 0
> +      || ctzD192 (-1, -57) != 0
> +      || clrsb192 (-1) != 192 - 1
> +      || ffs192 (-1) != 1
> +      || parity192 (-1) != 0
> +      || popcount192 (-1) != 192
> +      || __builtin_clzg ((unsigned _BitInt(192)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(192)) -1, 192 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(192)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(192)) -1, 192) != 0
> +      || __builtin_clrsbg ((_BitInt(192)) -1) != 192 - 1
> +      || __builtin_ffsg ((_BitInt(192)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(192)) -1) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(192)) -1) != 192)
> +    __builtin_abort ();
> +  if (clz192 (((unsigned _BitInt(192)) -1) >> 24) != 24
> +      || clz192 (((unsigned _BitInt(192)) -1) >> 79) != 79
> +      || clz192 (1) != 192 - 1
> +      || clzd192 (((unsigned _BitInt(192)) -1) >> 139) != 139
> +      || clzd192 (2) != 192 - 2
> +      || ctz192 (((unsigned _BitInt(192)) -1) << 42) != 42
> +      || ctz192 (((unsigned _BitInt(192)) -1) << 57) != 57
> +      || ctz192 (0x4000000000000000000000uwb) != 86
> +      || ctzd192 (((unsigned _BitInt(192)) -1) << 149) != 149
> +      || ctzd192 (2) != 1
> +      || clrsb192 ((unsigned _BitInt(192 - 4)) -1) != 3
> +      || clrsb192 ((unsigned _BitInt(192 - 28)) -1) != 27
> +      || clrsb192 ((unsigned _BitInt(192 - 29)) -1) != 28
> +      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 68)) -1) != 67
> +      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 92)) -1) != 91
> +      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 93)) -1) != 92
> +      || ffs192 (((unsigned _BitInt(192)) -1) << 42) != 43
> +      || ffs192 (((unsigned _BitInt(192)) -1) << 57) != 58
> +      || ffs192 (0x4000000000000000000000uwb) != 87
> +      || ffs192 (((unsigned _BitInt(192)) -1) << 149) != 150
> +      || ffs192 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(192)) 1) != 192 - 1
> +      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 139, 192) != 139
> +      || __builtin_clzg ((unsigned _BitInt(192)) 2, 192) != 192 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(192)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 149, 192) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(192)) 2, 192) != 1
> +      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(192)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(192)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity192 (4692147078159863499615754634965484598760535154638668598762uwb) != __builtin_parityg (4692147078159863499615754634965484598760535154638668598762uwb)
> +      || parity192 (1669461228546917627909935444501097256112222796898845183538uwb) != __builtin_parityg (1669461228546917627909935444501097256112222796898845183538uwb)
> +      || parity192 (5107402473866766219120283991834936835726115452uwb) != __builtin_parityg (5107402473866766219120283991834936835726115452uwb)
> +      || popcount192 (4033871057575185619108386380181511734118888391160164588976uwb) != __builtin_popcountg (4033871057575185619108386380181511734118888391160164588976uwb)
> +      || popcount192 (58124766715713711628758119849579188845074973856704521119uwb) != __builtin_popcountg (58124766715713711628758119849579188845074973856704521119uwb)
> +      || popcount192 (289948065236269174335700831610076764076947650072787325852uwb) != __builtin_popcountg (289948065236269174335700831610076764076947650072787325852uwb))
> +    __builtin_abort ();
> +#endif
> +}
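
A note on the clzd156/ctzd156 style helpers above: they pass the popcount of an all-ones value of the argument type as the second argument, which is just the bit width of that type, so a zero input yields the width (156, 192, ...). Below is a small sketch of the same idiom using the long-standing __builtin_popcountll and __builtin_clzll builtins on unsigned long long, so it can be tried without _BitInt support. The name clz_or_width is hypothetical and the code assumes a GCC-compatible compiler.

```c
#include <limits.h>

/* Hypothetical helper, not part of the patch: leading zero count for
   unsigned long long that returns the type's bit width for 0.
   Assumes a GCC-compatible compiler providing these builtins.  */
static int
clz_or_width (unsigned long long x)
{
  /* Popcount of all-ones is the number of value bits in the type.  */
  int width = __builtin_popcountll (~0ULL);
  /* __builtin_clzll is undefined for 0, so guard it explicitly.  */
  return x != 0 ? __builtin_clzll (x) : width;
}
```

The two-argument __builtin_clzg form in the tests folds the explicit zero check into the builtin itself.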
> --- gcc/testsuite/gcc.dg/torture/bitint-44.c.jj	2023-11-09 09:17:40.232182455 +0100
> +++ gcc/testsuite/gcc.dg/torture/bitint-44.c	2023-11-09 12:21:32.376046129 +0100
> @@ -0,0 +1,306 @@
> +/* PR c/111309 */
> +/* { dg-do run { target bitint } } */
> +/* { dg-options "-std=c2x -pedantic-errors" } */
> +/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
> +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 512
> +__attribute__((noipa)) int
> +clz512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD512 (unsigned _BitInt(512) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD512 (unsigned _BitInt(512) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb512 (_BitInt(512) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs512 (_BitInt(512) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +#if __BITINT_MAXWIDTH__ >= 523
> +__attribute__((noipa)) int
> +clz523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD523 (unsigned _BitInt(523) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD523 (unsigned _BitInt(523) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb523 (_BitInt(523) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs523 (_BitInt(523) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __BITINT_MAXWIDTH__ >= 512
> +  if (clzd512 (0) != 512
> +      || clzD512 (0, -1) != -1
> +      || ctzd512 (0) != 512
> +      || ctzD512 (0, 42) != 42
> +      || clrsb512 (0) != 512 - 1
> +      || ffs512 (0) != 0
> +      || parity512 (0) != 0
> +      || popcount512 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(512)) 0, 512 + 32) != 512 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(512)) 0, 512) != 512
> +      || __builtin_clrsbg ((_BitInt(512)) 0) != 512 - 1
> +      || __builtin_ffsg ((_BitInt(512)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(512)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(512)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz512 (-1) != 0
> +      || clzd512 (-1) != 0
> +      || clzD512 (-1, 0) != 0
> +      || ctz512 (-1) != 0
> +      || ctzd512 (-1) != 0
> +      || ctzD512 (-1, 17) != 0
> +      || clrsb512 (-1) != 512 - 1
> +      || ffs512 (-1) != 1
> +      || parity512 (-1) != 0
> +      || popcount512 (-1) != 512
> +      || __builtin_clzg ((unsigned _BitInt(512)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(512)) -1, 512 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(512)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(512)) -1, 512) != 0
> +      || __builtin_clrsbg ((_BitInt(512)) -1) != 512 - 1
> +      || __builtin_ffsg ((_BitInt(512)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(512)) -1) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(512)) -1) != 512)
> +    __builtin_abort ();
> +  if (clz512 (((unsigned _BitInt(512)) -1) >> 24) != 24
> +      || clz512 (((unsigned _BitInt(512)) -1) >> 79) != 79
> +      || clz512 (1) != 512 - 1
> +      || clzd512 (((unsigned _BitInt(512)) -1) >> 139) != 139
> +      || clzd512 (2) != 512 - 2
> +      || ctz512 (((unsigned _BitInt(512)) -1) << 42) != 42
> +      || ctz512 (((unsigned _BitInt(512)) -1) << 57) != 57
> +      || ctz512 (0x4000000000000000000000uwb) != 86
> +      || ctzd512 (((unsigned _BitInt(512)) -1) << 149) != 149
> +      || ctzd512 (2) != 1
> +      || clrsb512 ((unsigned _BitInt(512 - 4)) -1) != 3
> +      || clrsb512 ((unsigned _BitInt(512 - 28)) -1) != 27
> +      || clrsb512 ((unsigned _BitInt(512 - 29)) -1) != 28
> +      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 68)) -1) != 67
> +      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 92)) -1) != 91
> +      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 93)) -1) != 92
> +      || ffs512 (((unsigned _BitInt(512)) -1) << 42) != 43
> +      || ffs512 (((unsigned _BitInt(512)) -1) << 57) != 58
> +      || ffs512 (0x4000000000000000000000uwb) != 87
> +      || ffs512 (((unsigned _BitInt(512)) -1) << 149) != 150
> +      || ffs512 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(512)) 1) != 512 - 1
> +      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 139, 512) != 139
> +      || __builtin_clzg ((unsigned _BitInt(512)) 2, 512) != 512 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(512)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 149, 512) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(512)) 2, 512) != 1
> +      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(512)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(512)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity512 (8278593062772967967574644592392030907507244457324713380127157444008480135136016412791369421272159911061801023217823646324038055629840240503699995274750141uwb) != __builtin_parityg (8278593062772967967574644592392030907507244457324713380127157444008480135136016412791369421272159911061801023217823646324038055629840240503699995274750141uwb)
> +      || parity512 (663951521760319802637316646127146913163123967584512032007606686578544864655291546789196279408181546344880831465704154822174055168766759305688225967189384uwb) != __builtin_parityg (663951521760319802637316646127146913163123967584512032007606686578544864655291546789196279408181546344880831465704154822174055168766759305688225967189384uwb)
> +      || parity512 (8114152627481936575035564712656624361256533214211179387274127464949371919139038942819974113641465089580051998523156404968195970853124179018281296621919217uwb) != __builtin_parityg (8114152627481936575035564712656624361256533214211179387274127464949371919139038942819974113641465089580051998523156404968195970853124179018281296621919217uwb)
> +      || popcount512 (697171368046392901434470580443928282938585745214587494987284546386421344865289735592202298494880955572094546861862007016154025065165834164941207378563932uwb) != __builtin_popcountg (697171368046392901434470580443928282938585745214587494987284546386421344865289735592202298494880955572094546861862007016154025065165834164941207378563932uwb)
> +      || popcount512 (12625357869391866487124235043239209385173615631331705015179232007319637649427586947822360147798041278948617160703315666047585702906648747835331939389354450uwb) != __builtin_popcountg (12625357869391866487124235043239209385173615631331705015179232007319637649427586947822360147798041278948617160703315666047585702906648747835331939389354450uwb)
> +      || popcount512 (12989863959706456104163426941303698078341934896544520782734564901708926112239778316241786242633862403309192697330635825122310265805838908726925342761646021uwb) != __builtin_popcountg (12989863959706456104163426941303698078341934896544520782734564901708926112239778316241786242633862403309192697330635825122310265805838908726925342761646021uwb))
> +    __builtin_abort ();
> +#endif
> +#if __BITINT_MAXWIDTH__ >= 523
> +  if (clzd523 (0) != 523
> +      || clzD523 (0, 42) != 42
> +      || ctzd523 (0) != 523
> +      || ctzD523 (0, -1) != -1
> +      || clrsb523 (0) != 523 - 1
> +      || ffs523 (0) != 0
> +      || parity523 (0) != 0
> +      || popcount523 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(523)) 0, 523 + 32) != 523 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(523)) 0, 523) != 523
> +      || __builtin_clrsbg ((_BitInt(523)) 0) != 523 - 1
> +      || __builtin_ffsg ((_BitInt(523)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(523)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(523)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz523 (-1) != 0
> +      || clzd523 (-1) != 0
> +      || clzD523 (-1, 15) != 0
> +      || ctz523 (-1) != 0
> +      || ctzd523 (-1) != 0
> +      || ctzD523 (-1, -57) != 0
> +      || clrsb523 (-1) != 523 - 1
> +      || ffs523 (-1) != 1
> +      || parity523 (-1) != 1
> +      || popcount523 (-1) != 523
> +      || __builtin_clzg ((unsigned _BitInt(523)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(523)) -1, 523 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(523)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(523)) -1, 523) != 0
> +      || __builtin_clrsbg ((_BitInt(523)) -1) != 523 - 1
> +      || __builtin_ffsg ((_BitInt(523)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(523)) -1) != 1
> +      || __builtin_popcountg ((unsigned _BitInt(523)) -1) != 523)
> +    __builtin_abort ();
> +  if (clz523 (((unsigned _BitInt(523)) -1) >> 24) != 24
> +      || clz523 (((unsigned _BitInt(523)) -1) >> 79) != 79
> +      || clz523 (1) != 523 - 1
> +      || clzd523 (((unsigned _BitInt(523)) -1) >> 139) != 139
> +      || clzd523 (2) != 523 - 2
> +      || ctz523 (((unsigned _BitInt(523)) -1) << 42) != 42
> +      || ctz523 (((unsigned _BitInt(523)) -1) << 57) != 57
> +      || ctz523 (0x4000000000000000000000uwb) != 86
> +      || ctzd523 (((unsigned _BitInt(523)) -1) << 149) != 149
> +      || ctzd523 (2) != 1
> +      || clrsb523 ((unsigned _BitInt(523 - 4)) -1) != 3
> +      || clrsb523 ((unsigned _BitInt(523 - 28)) -1) != 27
> +      || clrsb523 ((unsigned _BitInt(523 - 29)) -1) != 28
> +      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 68)) -1) != 67
> +      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 92)) -1) != 91
> +      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 93)) -1) != 92
> +      || ffs523 (((unsigned _BitInt(523)) -1) << 42) != 43
> +      || ffs523 (((unsigned _BitInt(523)) -1) << 57) != 58
> +      || ffs523 (0x4000000000000000000000uwb) != 87
> +      || ffs523 (((unsigned _BitInt(523)) -1) << 149) != 150
> +      || ffs523 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(523)) 1) != 523 - 1
> +      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 139, 523) != 139
> +      || __builtin_clzg ((unsigned _BitInt(523)) 2, 523) != 523 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(523)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 149, 523) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(523)) 2, 523) != 1
> +      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(523)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(523)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity523 (14226628251091586975416900831427560438504550751597528218770815297642064445318137709184907300499591292677456563377096100346699421879373024906380724757049700104uwb) != __builtin_parityg (14226628251091586975416900831427560438504550751597528218770815297642064445318137709184907300499591292677456563377096100346699421879373024906380724757049700104uwb)
> +      || parity523 (20688958227123188226117538663818621034852702121556301239818743230005799574164516085541310491875153692467123662601853835357822935286851364843928714141587045255uwb) != __builtin_parityg (20688958227123188226117538663818621034852702121556301239818743230005799574164516085541310491875153692467123662601853835357822935286851364843928714141587045255uwb)
> +      || parity523 (8927708174664018648856542263215989788443763271738485875573765922613438023117960552135374015673598803453205044464280019640319125968982118836809392169156450404uwb) != __builtin_parityg (8927708174664018648856542263215989788443763271738485875573765922613438023117960552135374015673598803453205044464280019640319125968982118836809392169156450404uwb)
> +      || popcount523 (27178327344587654457581274852432957423537947348354896748701960885269035920194935311522194372418922852798513401240689173265979378157685169921449935364246334672uwb) != __builtin_popcountg (27178327344587654457581274852432957423537947348354896748701960885269035920194935311522194372418922852798513401240689173265979378157685169921449935364246334672uwb)
> +      || popcount523 (5307736750284212829931201546806718535860789684371772688568780952567669490917265125893664418036905110148872995350655890585853451175740907670080602411287166989uwb) != __builtin_popcountg (5307736750284212829931201546806718535860789684371772688568780952567669490917265125893664418036905110148872995350655890585853451175740907670080602411287166989uwb)
> +      || popcount523 (21261096432069432668470452941790780841888331284195411465624030283325239673941548816191698556934198698768393659379577567450765073013688585051560340496749593370uwb) != __builtin_popcountg (21261096432069432668470452941790780841888331284195411465624030283325239673941548816191698556934198698768393659379577567450765073013688585051560340496749593370uwb))
> +    __builtin_abort ();
> +#endif
> +}
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-10  8:09 ` Richard Biener
@ 2023-11-10  9:10   ` Jakub Jelinek
  2023-11-10  9:19     ` Richard Biener
  2023-11-13 23:45     ` Joseph Myers
  0 siblings, 2 replies; 10+ messages in thread
From: Jakub Jelinek @ 2023-11-10  9:10 UTC (permalink / raw)
  To: Richard Biener; +Cc: Joseph S. Myers, Jason Merrill, gcc-patches

On Fri, Nov 10, 2023 at 08:09:26AM +0000, Richard Biener wrote:
> > The following patch adds 6 new type-generic builtins,
> > __builtin_clzg
> > __builtin_ctzg
> > __builtin_clrsbg
> > __builtin_ffsg
> > __builtin_parityg
> > __builtin_popcountg
> > The g at the end stands for generic because the unsuffixed variant
> > of the builtins already have unsigned int or int arguments.
> > 
> > The main reason to add these is to support arbitrary unsigned (for
> > clrsb/ffs signed) bit-precise integer types and also __int128 which
> > wasn't supported by the existing builtins, so that e.g. <stdbit.h>
> > type-generic functions could then support not just bit-precise unsigned
> > integer type whose width matches a standard or extended integer type,
> > but others too.
> > 
> > None of these new builtins promote their first argument, so the argument
> > can be e.g. unsigned char or unsigned short or unsigned __int20 etc.
> 
> But is that a good idea?  Is that how type generic functions work in C?

Most current type-generic functions deal only with floating-point arguments
and do not promote float to double, as the ... promotions normally would:
__builtin_signbit
__builtin_fpclassify
__builtin_isfinite
__builtin_isinf_sign
__builtin_isinf
__builtin_isnan
__builtin_isnormal
__builtin_isgreater
__builtin_isgreaterequal
__builtin_isless
__builtin_islessequal
__builtin_islessgreater
__builtin_isunordered
__builtin_iseqsig
__builtin_issignaling

__builtin_clear_padding is uninteresting, because the argument must be a
pointer.

Then
__builtin_add_overflow
__builtin_sub_overflow
__builtin_mul_overflow
do promote the first 2 arguments, but that doesn't really matter, because
all we care about is the argument values, not their type.

And finally
__builtin_add_overflow_p
__builtin_sub_overflow_p
__builtin_mul_overflow_p
do promote the first 2 arguments, and don't promote the third one, which is
the only one where we care about the type, so that is the behavior that
I've used also for the new builtins.  I think if we added e.g.
__builtin_classify_type now and not more than 3 decades ago it would behave
like that too.
Only not promoting the argument makes the builtins directly usable in the
stdc_leading_zeros, stdc_leading_ones, stdc_trailing_zeros, stdc_trailing_ones,
stdc_first_leading_zero, ..., stdc_count_zeros, stdc_count_ones, ...
C23 stdbit.h type-generic macros; otherwise one would need to play with
_Generic and special-case unsigned char and unsigned short there (they
normally promote to int, while e.g. unsigned _BitInt(8) doesn't).
I expect Joseph will have a compatibility version of the macros for when these
builtins aren't supported, but given that the standard says {un,}signed _BitInt
with a width matching a standard/extended integer type other than bool needs to
be handled too, either it will not use _Generic at all and just go by
sizeof (argument), or it will use _Generic and go by sizeof in the default
case.  It seems clang returns -1 for __builtin_classify_type (0uwb)
rather than the 18 that GCC returns, so one can't portably distinguish bitints.
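
To make the above concrete, here is a rough sketch of what such a macro could
look like.  The names my_leading_zeros, WIDTH_OF, lz_fallback and HAVE_CLZG
are made up for illustration and are not part of the patch or of any real
stdbit.h; the sketch also assumes the unsigned types have no padding bits and
handles only the standard unsigned types, not _BitInt.

```c
#include <assert.h>
#include <limits.h>

/* Detect the new builtin when the compiler can tell us about it.  */
#ifdef __has_builtin
# if __has_builtin (__builtin_clzg)
#  define HAVE_CLZG 1
# endif
#endif

/* Hypothetical helper: width in bits, assuming no padding bits.
   sizeof does not promote its operand, so this sees the real type.  */
#define WIDTH_OF(x) ((int) (sizeof (x) * CHAR_BIT))

#ifdef HAVE_CLZG
/* The argument is not promoted, so unsigned char and unsigned short
   need no special casing; the second argument is the result for 0.  */
# define my_leading_zeros(x) __builtin_clzg ((x), WIDTH_OF (x))
#else
/* Fallback: count in unsigned long long, then subtract the extra
   leading zeros introduced by widening from WIDTH bits.  */
static inline int
lz_fallback (unsigned long long v, int width)
{
  if (v == 0)
    return width;
  return __builtin_clzll (v) - (WIDTH_OF (v) - width);
}

/* _Generic selects on the unpromoted type of the argument.  */
# define my_leading_zeros(x) \
  _Generic ((x), \
            unsigned char: lz_fallback ((x), WIDTH_OF (unsigned char)), \
            unsigned short: lz_fallback ((x), WIDTH_OF (unsigned short)), \
            unsigned int: lz_fallback ((x), WIDTH_OF (unsigned int)), \
            unsigned long: lz_fallback ((x), WIDTH_OF (unsigned long)), \
            unsigned long long: \
              lz_fallback ((x), WIDTH_OF (unsigned long long)))
#endif
```

With either branch, my_leading_zeros ((unsigned char) 1) should give
CHAR_BIT - 1 rather than the width of int minus 1.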

> I think it introduces non-obvious/unexpected behavior in user code.

If we keep the patch behavior of requiring unsigned
standard/extended/bit-precise types other than bool for the
clz/ctz/parity/popcount cases, the choice is between always erroring on
__builtin_clzg ((unsigned char) 1), where the promoted argument is signed,
and accepting it as the unsigned char case without promotion; I think users
would be more confused to see an error there.
If we switched to accepting both signed and unsigned
standard/extended/bit-precise integer types other than bool for all the
builtins, whether we promote or not wouldn't matter for ctz/parity/popcount,
but it would for clz.
The clrsb/ffs cases, on the other hand, accept a signed argument, so both
the promoted and the unpromoted argument would mean something and be accepted;
in the ffs case it again doesn't really matter for the result, but for clrsb
it is significant.
Would it help to just document that the argument isn't promoted?
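
As an illustration of why the width matters for clz and clrsb but not for
ctz, here is a small portable model.  The ref_* helpers are hypothetical
reference loops written for this mail, not anything in the patch; they take
the width explicitly, so the same value can be looked at as an 8-bit or as
an int-sized quantity.

```c
#include <assert.h>
#include <limits.h>

#ifdef __has_builtin
# if __has_builtin (__builtin_clzg)
#  define HAVE_CLZG 1
# endif
#endif

/* Leading zero bits of V viewed as a WIDTH-bit unsigned value.  */
static int
ref_clz (unsigned long long v, int width)
{
  int n = 0;
  for (int bit = width - 1; bit >= 0 && !((v >> bit) & 1); bit--)
    n++;
  return n;
}

/* Trailing zero bits of V viewed as a WIDTH-bit unsigned value.  */
static int
ref_ctz (unsigned long long v, int width)
{
  int n = 0;
  for (int bit = 0; bit < width && !((v >> bit) & 1); bit++)
    n++;
  return n;
}

/* Redundant sign bits of V viewed as a WIDTH-bit signed value:
   bits below the sign bit that are equal to it.  */
static int
ref_clrsb (unsigned long long v, int width)
{
  unsigned long long sign = (v >> (width - 1)) & 1;
  int n = 0;
  for (int bit = width - 2; bit >= 0 && ((v >> bit) & 1) == sign; bit--)
    n++;
  return n;
}

/* Where the new builtin exists, check the unpromoted result against
   the model; otherwise there is nothing to compare and we return 1.  */
static int
matches_builtin (void)
{
#ifdef HAVE_CLZG
  return __builtin_clzg ((unsigned char) 1) == ref_clz (1, CHAR_BIT);
#else
  return 1;
#endif
}
```

For the value 1, ref_clz and ref_clrsb give different answers for width
CHAR_BIT and for the width of int, while ref_ctz gives the same answer for
both widths.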

We document that for __builtin_*_overflow_p:
"The value of the third argument is ignored, just the side effects in the third argument
are evaluated, and no integral argument promotions are performed on the last
argument."
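
A small sketch of that existing behavior, for comparison.  The helper names
and the HAVE_OVERFLOW_P macro are invented for illustration; the manual
fallback is only there for compilers without the builtin and assumes long
long is wider than int.

```c
#include <assert.h>
#include <limits.h>

#ifdef __has_builtin
# if __has_builtin (__builtin_add_overflow_p)
#  define HAVE_OVERFLOW_P 1
# endif
#endif

/* Nonzero if a + b does not fit in unsigned char.  Only the type of
   the third argument matters, and it is not promoted to int.  */
static int
sum_overflows_uchar (int a, int b)
{
#ifdef HAVE_OVERFLOW_P
  return __builtin_add_overflow_p (a, b, (unsigned char) 0);
#else
  long long s = (long long) a + b;
  return s < 0 || s > UCHAR_MAX;
#endif
}

/* Same question with an int third argument.  */
static int
sum_overflows_int (int a, int b)
{
#ifdef HAVE_OVERFLOW_P
  return __builtin_add_overflow_p (a, b, 0);
#else
  long long s = (long long) a + b;
  return s < INT_MIN || s > INT_MAX;
#endif
}
```

Here 200 + 100 overflows unsigned char but not int, so the two helpers
disagree on it; had the third argument been promoted, they would agree.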

> If people do not want to "compensate" for this maybe insted also add
> __builtin_*{8,16} (like we have for the bswap variants)?

Note, clang has __builtin_clzs and __builtin_ctzs for unsigned short (but
not the other 4), but nothing for the unsigned char cases.
I was just hoping we don't need to add further variants if we have
type-generic ones.

> Otherwise this looks reasonable.  I'm not sure why we need separate
> CFN_CLZ and CFN_BUILT_IN_CLZG?  (why CFN_BUILT_IN_CLZG and not CFN_CLZG?)
> That is, I'm confused about
> 
>      CASE_CFN_CLRSB:
> +    case CFN_BUILT_IN_CLRSBG:
> 
> why does CASE_CFN_CLRSB not include CLRSBG?  It includes IFN_CLRSB, no?
> And IFN_CLRSB already has the two and one arg case and thus encompasses
> some BUILT_IN_CLRSBG cases?

gencfn-macros.cc is aware of just the normal float suffixes (F, nothing, L;
then, under different macro names, other variants of float suffixes)
and the int suffixes (nothing, L, LL, IMAX); it doesn't know anything
about the G suffix.  We could teach it to add the G case too under a
different suffix, but I didn't think it was necessary, because the *G builtins
are meant to be folded away into something else as soon as possible, worst case
during gimplification, so nothing after that ought to care about them.
It is just the fold-const-call.cc case where I think we want to fold them
into constants in constant expressions, and having new macros just to use
them once (without wanting to use them in the 2-6 other places, depending on
the builtin) seemed unnecessary.

> Besides the above question I'd say OK (I assume Josephs reply is a
> general ack from his side).

Joseph, what are your thoughts on the above?

Incremental patch to document the lack of integral argument promotion:

--- gcc/doc/extend.texi.jj	2023-11-09 09:17:40.240182342 +0100
+++ gcc/doc/extend.texi	2023-11-10 09:57:45.396215654 +0100
@@ -14962,13 +14962,15 @@ Similar to @code{__builtin_parity}, exce
 
 @defbuiltin{int __builtin_ffsg (...)}
 Similar to @code{__builtin_ffs}, except the argument is type-generic
-signed integer (standard, extended or bit-precise).
+signed integer (standard, extended or bit-precise).  No integral argument
+promotions are performed on the argument.  
 @enddefbuiltin
 
 @defbuiltin{int __builtin_clzg (...)}
 Similar to @code{__builtin_clz}, except the argument is type-generic
 unsigned integer (standard, extended or bit-precise) and there is
-optional second argument with int type.  If two arguments are specified,
+optional second argument with int type.  No integral argument promotions
+are performed on the first argument.  If two arguments are specified,
 and first argument is 0, the result is the second argument.  If only
 one argument is specified and it is 0, the result is undefined.
 @enddefbuiltin
@@ -14976,24 +14978,28 @@ one argument is specified and it is 0, t
 @defbuiltin{int __builtin_ctzg (...)}
 Similar to @code{__builtin_ctz}, except the argument is type-generic
 unsigned integer (standard, extended or bit-precise) and there is
-optional second argument with int type.  If two arguments are specified,
+optional second argument with int type.  No integral argument promotions
+are performed on the first argument.  If two arguments are specified,
 and first argument is 0, the result is the second argument.  If only
 one argument is specified and it is 0, the result is undefined.
 @enddefbuiltin
 
 @defbuiltin{int __builtin_clrsbg (...)}
 Similar to @code{__builtin_clrsb}, except the argument is type-generic
-signed integer (standard, extended or bit-precise).
+signed integer (standard, extended or bit-precise).  No integral argument
+promotions are performed on the argument.  
 @enddefbuiltin
 
 @defbuiltin{int __builtin_popcountg (...)}
 Similar to @code{__builtin_popcount}, except the argument is type-generic
-unsigned integer (standard, extended or bit-precise).
+unsigned integer (standard, extended or bit-precise).  No integral argument
+promotions are performed on the argument.  
 @enddefbuiltin
 
 @defbuiltin{int __builtin_parityg (...)}
 Similar to @code{__builtin_parity}, except the argument is type-generic
-unsigned integer (standard, extended or bit-precise).
+unsigned integer (standard, extended or bit-precise).  No integral argument
+promotions are performed on the argument.  
 @enddefbuiltin
 
 @defbuiltin{double __builtin_powi (double, int)}


	Jakub



* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-10  9:10   ` Jakub Jelinek
@ 2023-11-10  9:19     ` Richard Biener
  2023-11-10  9:44       ` Jakub Jelinek
  2023-11-13 23:45     ` Joseph Myers
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2023-11-10  9:19 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Joseph S. Myers, Jason Merrill, gcc-patches

On Fri, 10 Nov 2023, Jakub Jelinek wrote:

> On Fri, Nov 10, 2023 at 08:09:26AM +0000, Richard Biener wrote:
> > > The following patch adds 6 new type-generic builtins,
> > > __builtin_clzg
> > > __builtin_ctzg
> > > __builtin_clrsbg
> > > __builtin_ffsg
> > > __builtin_parityg
> > > __builtin_popcountg
> > > The g at the end stands for generic because the unsuffixed variant
> > > of the builtins already have unsigned int or int arguments.
> > > 
> > > The main reason to add these is to support arbitrary unsigned (for
> > > clrsb/ffs signed) bit-precise integer types and also __int128 which
> > > wasn't supported by the existing builtins, so that e.g. <stdbit.h>
> > > type-generic functions could then support not just bit-precise unsigned
> > > integer type whose width matches a standard or extended integer type,
> > > but others too.
> > > 
> > > None of these new builtins promote their first argument, so the argument
> > > can be e.g. unsigned char or unsigned short or unsigned __int20 etc.
> > 
> > But is that a good idea?  Is that how type generic functions work in C?
> 
> Most current type generic functions deal just with floating point args and
> don't promote there float to double as the ... promotions would normally do:
> __builtin_signbit
> __builtin_fpclassify
> __builtin_isfinite
> __builtin_isinf_sign
> __builtin_isinf
> __builtin_isnan
> __builtin_isnormal
> __builtin_isgreater
> __builtin_isgreaterequal
> __builtin_isless
> __builtin_islessequal
> __builtin_islessgreater
> __builtin_isunordered
> __builtin_iseqsig
> __builtin_issignaling
> 
> __builtin_clear_padding is uninteresting, because the argument must be a
> pointer.
> 
> Then
> __builtin_add_overflow
> __builtin_sub_overflow
> __builtin_mul_overflow
> do promote the first 2 arguments, but that doesn't really matter, because
> all we care about is the argument values, not their type.
> 
> And finally
> __builtin_add_overflow_p
> __builtin_sub_overflow_p
> __builtin_mul_overflow_p
> do promote the first 2 arguments, and don't promote the third one, which is
> the only one where we care about the type, so that is the behavior that
> I've used also for the new builtins.  I think if we added e.g.
> __builtin_classify_type now and not more than 3 decades ago it would behave
> like that too.
> Only not promoting the argument will make it directly usable in the
> stdc_leading_zeros, stdc_leading_ones, stdc_trailing_zeros, stdc_trailing_ones,
> stdc_first_leading_zero, ..., stdc_count_zeros, stdc_count_ones, ...
> C23 stdbit.h type-generic macros, otherwise one would need to play with
> _Generic and special-case there unsigned char and unsigned short (which
> normally promote to int), but e.g. unsigned _BitInt(8) doesn't.

Googling doesn't find me stdc_leading_zeros - are those supposed to work
for non-_BitInt types as well, and not promote the argument in that
case?

If we are specifically targeting those, I wonder why we don't name
the builtins after them?  But yes, if promotion is undesirable
for implementing them then I agree.  IIRC _BitInt(n) is not subject
to integer promotions.

> I expect Joseph will have compatibility version of the macro for when these
> builtins aren't supported, but given the standard says that {un,}signed _BitInt
> with width matching standard/extended integer types other than bool needs to
> be handled also, either it will not use _Generic at all and just go after
> sizeof (argument), or maybe will use _Generic and for the default case will
> go after sizeof.  Seems clang returns -1 for __builtin_classify_type (0uwb)
> rather than 18 GCC returns, so one can't portably distinguish bitints.
> 
> > I think it introduces non-obvious/unexpected behavior in user code.
> 
> If we keep the patch behavior of requiring unsigned
> standard/extended/bit-precise types other than bool for the
> clz/ctz/parity/popcount cases, the choice is between always erroring on
> __builtin_clzg ((unsigned char) 1) - where the promoted argument is signed,
> and accepting it as unsigned char case without promotion, so I think users
> would be more confused to see an error on that.
> If we'd switch to accepting both signed and unsigned
> standard/extended/bit-precise integer types other than bool for all the
> builtins, whether we promote or not doesn't matter for ctz/parity/popcount
> but does for clz.
> The clrsb/ffs cases accept signed argument on the other side, so both
> promoted and unpromoted argument would mean something and be accepted,
> in the ffs case it again doesn't really matter for the result, but for clrsb
> is significant.
> Would it help to just document it that the argument isn't promoted?
> 
> We document that for __builtin_*_overflow_p:
> "The value of the third argument is ignored, just the side effects in the third argument
> are evaluated, and no integral argument promotions are performed on the last
> argument."
> 
> > If people do not want to "compensate" for this maybe insted also add
> > __builtin_*{8,16} (like we have for the bswap variants)?
> 
> Note, clang has __builtin_clzs and __builtin_ctzs for unsigned short (but
> not the other 4), but nothing for the unsigned char cases.
> I was just hoping we don't need to add further variants if we have
> type-generic ones.
> 
> > Otherwise this looks reasonable.  I'm not sure why we need separate
> > CFN_CLZ and CFN_BUILT_IN_CLZG?  (why CFN_BUILT_IN_CLZG and not CFN_CLZG?)
> > That is, I'm confused about
> > 
> >      CASE_CFN_CLRSB:
> > +    case CFN_BUILT_IN_CLRSBG:
> > 
> > why does CASE_CFN_CLRSB not include CLRSBG?  It includes IFN_CLRSB, no?
> > And IFN_CLRSB already has the two and one arg case and thus encompasses
> > some BUILT_IN_CLRSBG cases?
> 
> gencfn-macros.cc is aware of just normal float suffixes (F, nothing, L;
> then under different names of the macros other variants for float
> suffixes), and int suffixes (nothing, L, LL, IMAX), it doesn't know anything
> about the G suffix.  We could teach it to under a different suffix add the
> G case too, but I didn't think it was necessary because the *G builtins are
> meant to be folded away into something else as soon as possible, worst case
> during gimplification, so nothing after that ought to care about them.
> It is just the fold-const-call.cc case where in constant expressions I think
> we want to fold them into constants and having new macros just to use them
> once (and don't want to use them in the 2-6 other places depending on the
> builtin) seemed unnecessary.
> 
> > Besides the above question I'd say OK (I assume Josephs reply is a
> > general ack from his side).
> 
> Joseph, what are your thoughts on the above?
> 
> Incremental patch to document the lack of integral argument promotion:
> 
> --- gcc/doc/extend.texi.jj	2023-11-09 09:17:40.240182342 +0100
> +++ gcc/doc/extend.texi	2023-11-10 09:57:45.396215654 +0100
> @@ -14962,13 +14962,15 @@ Similar to @code{__builtin_parity}, exce
>  
>  @defbuiltin{int __builtin_ffsg (...)}
>  Similar to @code{__builtin_ffs}, except the argument is type-generic
> -signed integer (standard, extended or bit-precise).
> +signed integer (standard, extended or bit-precise).  No integral argument
> +promotions are performed on the argument.  
>  @enddefbuiltin
>  
>  @defbuiltin{int __builtin_clzg (...)}
>  Similar to @code{__builtin_clz}, except the argument is type-generic
>  unsigned integer (standard, extended or bit-precise) and there is
> -optional second argument with int type.  If two arguments are specified,
> +optional second argument with int type.  No integral argument promotions
> +are performed on the first argument.  If two arguments are specified,
>  and first argument is 0, the result is the second argument.  If only
>  one argument is specified and it is 0, the result is undefined.
>  @enddefbuiltin
> @@ -14976,24 +14978,28 @@ one argument is specified and it is 0, t
>  @defbuiltin{int __builtin_ctzg (...)}
>  Similar to @code{__builtin_ctz}, except the argument is type-generic
>  unsigned integer (standard, extended or bit-precise) and there is
> -optional second argument with int type.  If two arguments are specified,
> +optional second argument with int type.  No integral argument promotions
> +are performed on the first argument.  If two arguments are specified,
>  and first argument is 0, the result is the second argument.  If only
>  one argument is specified and it is 0, the result is undefined.
>  @enddefbuiltin
>  
>  @defbuiltin{int __builtin_clrsbg (...)}
>  Similar to @code{__builtin_clrsb}, except the argument is type-generic
> -signed integer (standard, extended or bit-precise).
> +signed integer (standard, extended or bit-precise).  No integral argument
> +promotions are performed on the argument.  
>  @enddefbuiltin
>  
>  @defbuiltin{int __builtin_popcountg (...)}
>  Similar to @code{__builtin_popcount}, except the argument is type-generic
> -unsigned integer (standard, extended or bit-precise).
> +unsigned integer (standard, extended or bit-precise).  No integral argument
> +promotions are performed on the argument.  
>  @enddefbuiltin
>  
>  @defbuiltin{int __builtin_parityg (...)}
>  Similar to @code{__builtin_parity}, except the argument is type-generic
> -unsigned integer (standard, extended or bit-precise).
> +unsigned integer (standard, extended or bit-precise).  No integral argument
> +promotions are performed on the argument.  
>  @enddefbuiltin
>  
>  @defbuiltin{double __builtin_powi (double, int)}
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-10  9:19     ` Richard Biener
@ 2023-11-10  9:44       ` Jakub Jelinek
  2023-11-11  8:18         ` Jakub Jelinek
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Jelinek @ 2023-11-10  9:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: Joseph S. Myers, Jason Merrill, gcc-patches

On Fri, Nov 10, 2023 at 09:19:14AM +0000, Richard Biener wrote:
> > Only not promoting the argument will make it directly usable in the
> > stdc_leading_zeros, stdc_leading_ones, stdc_trailing_zeros, stdc_trailing_ones,
> > stdc_first_leading_zero, ..., stdc_count_zeros, stdc_count_ones, ...
> > C23 stdbit.h type-generic macros, otherwise one would need to play with
> > _Generic and special-case there unsigned char and unsigned short (which
> > normally promote to int), but e.g. unsigned _BitInt(8) doesn't.
> 
> googling doesn't find me stdc_leading_zeros - are those supposed to work
> for non-_BitInt types as well and don't promote the argument in that
> case?

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf
is the C23 draft the C23 wiki points at.

E.g.

#include <stdbit.h>
unsigned int stdc_leading_zeros_uc(unsigned char value);
unsigned int stdc_leading_zeros_us(unsigned short value);
unsigned int stdc_leading_zeros_ui(unsigned int value);
unsigned int stdc_leading_zeros_ul(unsigned long value);
unsigned int stdc_leading_zeros_ull(unsigned long long value);
generic_return_type stdc_leading_zeros(generic_value_type value);

Returns
Returns the number of consecutive 0 bits in value, starting from
the most significant bit.
The type-generic function (marked by its generic_value_type argument)
returns the appropriate value based on the type of the input value,
so long as it is a:
— standard unsigned integer type, excluding bool;
— extended unsigned integer type;
— or, bit-precise unsigned integer type whose width matches a standard
  or extended integer type, excluding bool.
The generic_return_type type shall be a suitable large unsigned integer
type capable of representing the computed result.

My understanding is that because unsigned char and unsigned short
are standard unsigned integer types, the macro ought to support those too
rather than diagnose them as invalid, and shall return the number of
consecutive 0 bits in them (which for the same value differs between
unsigned char and int, unless unsigned char has the same precision as int).
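
To illustrate why promotion matters here, a rough sketch using only the
existing __builtin_clz: counting in the promoted type overcounts by the
difference in width, so that difference has to be subtracted again (similar
in spirit to the addend adjustment in fold_builtin_bit_query).  The helper
name clz_uchar_via_int is made up for this example, and it assumes unsigned
int has no padding bits and that the argument is nonzero:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of the patch: leading zero bits of an
   unsigned char, computed through the int-sized builtin.  The result
   is undefined for value == 0, like __builtin_clz itself.  */
static inline int
clz_uchar_via_int (unsigned char value)
{
  /* Count in the wider type, as if the argument had been promoted.  */
  int promoted = __builtin_clz ((unsigned int) value);
  /* Bits that exist only in the wider type (assumes no padding bits).  */
  int excess = (int) (sizeof (unsigned int) - sizeof (unsigned char)) * CHAR_BIT;
  return promoted - excess;
}
```

With a builtin that does not promote its argument, no such per-type
adjustment has to be written by hand.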

> If we are specifically targeting those I wonder why we don't name
> the builtins after those?  But yes, if promotion is undesirable
> for implementing them then I agree.  IIRC _BitInt(n) is not subject
> to integer promotions.

Because the builtins are just something matching in behavior to existing
builtins which can be used for those macros, not exact implementation of
those.  E.g. while
#define stdc_leading_zeros(value) \
((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
implements (I believe) stdc_leading_zeros above, the second argument to the
builtin could be something different needed for other cases, e.g. -1 if one
wants to implement ffs-like behavior on unsigned argument, and e.g.
stdc_leading_ones would be implemented probably like:
#define stdc_leading_ones(value) \
((unsigned int) __builtin_clzg ((__typeof (value)) ~(value), __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
Or
#define stdc_first_trailing_one(value) \
((unsigned int) (__builtin_ctzg (value, -1) + 1))
vs.
#define stdc_trailing_zeros(value) \
((unsigned int) __builtin_ctzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
No need to add 14 new type-generic builtins, we can just add the building
blocks to implement those.
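
A guarded sketch of how such a macro could be tried out, assuming a compiler
that may or may not provide the new builtins.  The names my_clz_width and
my_leading_zeros are invented for this example; the fallback assumes the
type has no padding bits, so it is only meant for standard unsigned types:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical portable fallback: leading zero bits of V when viewed
   as a WIDTH-bit value; returns WIDTH for V == 0.  */
static inline unsigned int
my_clz_width (unsigned long long v, unsigned int width)
{
  unsigned int n = 0;
  while (n < width && !((v >> (width - 1 - n)) & 1))
    n++;
  return n;
}

#if defined (__has_builtin)
# if __has_builtin (__builtin_clzg) && __has_builtin (__builtin_popcountg)
/* Second argument is the type's precision, returned for a zero input.  */
#  define my_leading_zeros(value) \
  ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
# endif
#endif

#ifndef my_leading_zeros
/* Fallback for compilers without the builtins (assumes no padding bits).  */
# define my_leading_zeros(value) \
  my_clz_width ((value), (unsigned int) (sizeof (value) * CHAR_BIT))
#endif
```

Either way, my_leading_zeros ((unsigned char) 1) should yield CHAR_BIT - 1
rather than the count for int.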

	Jakub


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-10  9:44       ` Jakub Jelinek
@ 2023-11-11  8:18         ` Jakub Jelinek
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Jelinek @ 2023-11-11  8:18 UTC (permalink / raw)
  To: Richard Biener, Joseph S. Myers, Jason Merrill, gcc-patches

On Fri, Nov 10, 2023 at 10:44:12AM +0100, Jakub Jelinek wrote:
> Because the builtins are just something matching in behavior to existing
> builtins which can be used for those macros, not exact implementation of
> those.

BTW, the new builtins also allow implementing generic signed_type_for
and unsigned_type_for macros in C (together with __builtin_classify_type):
I think _Generic over the known standard and extended types, plus for _BitInt
(__builtin_classify_type (__typeof (x)) == 18) something like the
following source:

void bar (_BitInt(193) *, unsigned _BitInt(193) *);

void
foo (void)
{
  unsigned _BitInt(193) a = 0uwb;
  _BitInt(__builtin_popcountg ((__typeof (a)) -1)) b = 0wb;
  bar (&b, &a);
}

void
baz (void)
{
  _BitInt(193) a = 0wb;
  unsigned _BitInt(__builtin_clrsbg ((__typeof (a)) -1) + 1) b = 0uwb;
  bar (&a, &b);
}

One needs to use __builtin_popcountg on all ones for unsigned types and
1 + __builtin_clrsbg on all ones for signed types, but otherwise it seems to
work fine.  Of course, for signed_type_for one would need to decide what
to do with the unsigned _BitInt(1) type, which doesn't have a signed
counterpart, but that can be dealt with in _Generic.
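
The width computation itself can be sketched as a pair of macros.  The names
width_of_unsigned and width_of_signed are invented for this example, and the
fallback branch (for compilers lacking the builtins) assumes no padding bits,
so it would not be right for arbitrary _BitInt:

```c
#include <assert.h>
#include <limits.h>

#if defined (__has_builtin)
# if __has_builtin (__builtin_popcountg) && __has_builtin (__builtin_clrsbg)
/* All ones in an unsigned type has exactly "precision" bits set.  */
#  define width_of_unsigned(x) __builtin_popcountg ((__typeof (x)) -1)
/* -1 in a signed type has "precision - 1" redundant sign bits.  */
#  define width_of_signed(x) (__builtin_clrsbg ((__typeof (x)) -1) + 1)
# endif
#endif

#ifndef width_of_unsigned
/* Hypothetical fallback: storage size in bits (assumes no padding bits).  */
# define width_of_unsigned(x) ((int) (sizeof (x) * CHAR_BIT))
# define width_of_signed(x) ((int) (sizeof (x) * CHAR_BIT))
#endif
```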

	Jakub


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-10  9:10   ` Jakub Jelinek
  2023-11-10  9:19     ` Richard Biener
@ 2023-11-13 23:45     ` Joseph Myers
  1 sibling, 0 replies; 10+ messages in thread
From: Joseph Myers @ 2023-11-13 23:45 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, Jason Merrill, gcc-patches

On Fri, 10 Nov 2023, Jakub Jelinek wrote:

> > Besides the above question I'd say OK (I assume Josephs reply is a
> > general ack from his side).
> 
> Joseph, what are your thoughts on the above?

It's correct not to promote, since that matches the semantics of the 
standard type-generic macros.  (I did suggest in WG14 that the 
type-generic macros might make more sense in the cases of functions that 
are genuinely just functions of their integer argument and not of its 
type, such as population count, than for functions where the result for a 
given integer argument depends on the width of its type and not just the 
integer value, or that passing an explicit width argument might be 
appropriate for type-generic macros in cases where the width matters, but 
WG14 wanted all the type-generic macros as-is.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-11-09 15:02 [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309] Jakub Jelinek
  2023-11-09 21:43 ` Joseph Myers
  2023-11-10  8:09 ` Richard Biener
@ 2023-12-16  5:51 ` Andrew Pinski
  2023-12-16  8:36   ` Jakub Jelinek
  2 siblings, 1 reply; 10+ messages in thread
From: Andrew Pinski @ 2023-12-16  5:51 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Joseph S. Myers, Richard Biener, Jason Merrill, gcc-patches

On Thu, Nov 9, 2023 at 7:03 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> The following patch adds 6 new type-generic builtins,
> __builtin_clzg
> __builtin_ctzg
> __builtin_clrsbg
> __builtin_ffsg
> __builtin_parityg
> __builtin_popcountg
> The g at the end stands for generic because the unsuffixed variant
> of the builtins already have unsigned int or int arguments.
>
> The main reason to add these is to support arbitrary unsigned (for
> clrsb/ffs signed) bit-precise integer types and also __int128 which
> wasn't supported by the existing builtins, so that e.g. <stdbit.h>
> type-generic functions could then support not just bit-precise unsigned
> integer type whose width matches a standard or extended integer type,
> but others too.
>
> None of these new builtins promote their first argument, so the argument
> can be e.g. unsigned char or unsigned short or unsigned __int20 etc.
> The first 2 support either 1 or 2 arguments, if only 1 argument is supplied,
> the behavior is undefined for argument 0 like for other __builtin_c[lt]z*
> builtins, if 2 arguments are supplied, the second argument should be int
> that will be returned if the argument is 0.  All other builtins have
> just one argument.  For __builtin_clrsbg and __builtin_ffsg the argument
> shall be any signed standard/extended or bit-precise integer, for the others
> any unsigned standard/extended or bit-precise integer (bool not allowed).
>
> One possibility would be to also allow signed integer types for
> the clz/ctz/parity/popcount ones (and just cast the argument to
> unsigned_type_for during folding) and similarly unsigned integer types
> for the clrsb/ffs ones, dunno what is better; for stdbit.h the current
> version is sufficient and diagnoses use of the inappropriate sign,
> though on the other side I wonder if users won't be confused by
> __builtin_clzg (1) being an error and having to write __builtin_clzg (1U).
> And I think we don't have anything in C that would allow casting to
> corresponding unsigned type (or vice versa) given arbitrary integral type,
> one could use _Generic for that for standard and extended types, but not
> for arbitrary _BitInt.  What do you think?
>
> The new builtins are lowered to corresponding builtins with other suffixes
> or internal calls (plus casts and adjustments where needed) during FE
> folding or during gimplification at latest, the non-suffixed builtins
> handling precisions up to precision of int, l up to precision of long,
> ll up to precision of long long, up to __int128 precision lowered to
> double-word expansion early and the rest (which must be _BitInt) lowered
> to internal fn calls - those are then lowered during bitint lowering pass.
>
> The patch also changes representation of IFN_CLZ and IFN_CTZ calls,
> previously they were in the IL only if they are directly supported optab
> and depending on C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 they had or didn't
> have defined behavior at 0, now they are in the IL either if directly
> supported optab, or for the large/huge BITINT_TYPEs and they have either
> 1 or 2 arguments.  If one, the behavior is undefined at zero, if 2, the
> second argument is an int constant that should be returned for 0.
> As there is no extra support during expansion, for directly supported optab
> the second argument if present should still match the
> C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 value, but for BITINT_TYPE arguments
> it can be arbitrary int INTEGER_CST.
>
> The goal is e.g.
> #ifdef __has_builtin
> #if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_popcountg)
> #define stdc_leading_zeros(x) \
>   __builtin_clzg (x, __builtin_popcountg ((__typeof (x)) -1))
> #endif
> #endif
> where __builtin_popcountg ((__typeof (x)) -1) computes the bit precision
> of x's type (kind of _Bitwidthof (x) alternative).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

I was looking into improving __builtin_popcountg for __int128 on
aarch64 (when CSSC is not implemented, which right now is almost all
cores), but this patch forces __builtin_popcountg to expand into 2
__builtin_popcountll calls (and an add) before it could be optimized
into an internal function for the popcount, which would give the backend
a chance to implement something better.
This is due to the code in fold_builtin_bit_query; what might be the
best way of disabling that for this case?

Basically, right now popcount is implemented using the SIMD instruction
cnt, which can be used either 8x1 or 16x1 wide. Using the 16x1 form improves
both the code size and performance (on almost all cores I know of). So
instead of 2 cnt instructions, we would only need one.
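
For reference, the two-halves expansion described above corresponds roughly
to the following C.  This is a sketch only, assuming a target that provides
unsigned __int128 and a 64-bit long long; popcount128_split is a made-up
name, not something from the patch:

```c
#include <assert.h>

/* Hypothetical illustration of the early lowering: popcount of a
   128-bit value as the sum of the popcounts of its two 64-bit halves.  */
static inline int
popcount128_split (unsigned __int128 x)
{
  unsigned long long hi = (unsigned long long) (x >> 64);
  unsigned long long lo = (unsigned long long) x;
  return __builtin_popcountll (hi) + __builtin_popcountll (lo);
}
```

Once the operation has been split like this, the backend only ever sees two
64-bit popcounts.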

Thanks,
Andrew Pinski

>
> 2023-11-09  Jakub Jelinek  <jakub@redhat.com>
>
>         PR c/111309
> gcc/
>         * builtins.def (BUILT_IN_CLZG, BUILT_IN_CTZG, BUILT_IN_CLRSBG,
>         BUILT_IN_FFSG, BUILT_IN_PARITYG, BUILT_IN_POPCOUNTG): New
>         builtins.
>         * builtins.cc (fold_builtin_bit_query): New function.
>         (fold_builtin_1): Use it for
>         BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
>         (fold_builtin_2): Use it for BUILT_IN_{CLZ,CTZ}G.
>         * fold-const-call.cc: Fix comment typo on tm.h inclusion.
>         (fold_const_call_ss): Handle
>         CFN_BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
>         (fold_const_call_sss): New function.
>         (fold_const_call_1): Call it for 2 argument functions returning
>         scalar when passed 2 INTEGER_CSTs.
>         * genmatch.cc (cmp_operand): For function calls also compare
>         number of arguments.
>         (fns_cmp): New function.
>         (dt_node::gen_kids): Sort fns and generic_fns.
>         (dt_node::gen_kids_1): Handle fns with the same id but different
>         number of arguments.
>         * match.pd (CLZ simplifications): Drop checks for defined behavior
>         at zero.  Add variant of simplifications for IFN_CLZ with 2 arguments.
>         (CTZ simplifications): Drop checks for defined behavior at zero,
>         don't optimize precisions above MAX_FIXED_MODE_SIZE.  Add variant of
>         simplifications for IFN_CTZ with 2 arguments.
>         (a != 0 ? CLZ(a) : CST -> .CLZ(a)): Use TREE_TYPE (@3) instead of
>         type, add BITINT_TYPE handling, create 2 argument IFN_CLZ rather than
>         one argument.  Add variant for matching CLZ with 2 arguments.
>         (a != 0 ? CTZ(a) : CST -> .CTZ(a)): Similarly.
>         * gimple-lower-bitint.cc (bitint_large_huge::lower_bit_query): New
>         method.
>         (bitint_large_huge::lower_call): Use it for IFN_{CLZ,CTZ,CLRSB,FFS}
>         and IFN_{PARITY,POPCOUNT} calls.
>         * gimple-range-op.cc (cfn_clz::fold_range): Don't check
>         CLZ_DEFINED_VALUE_AT_ZERO for m_gimple_call_internal_p, instead
>         assume defined value at zero if the call has 2 arguments and use
>         second argument value for that case.
>         (cfn_ctz::fold_range): Similarly.
>         (gimple_range_op_handler::maybe_builtin_call): Use op_cfn_clz_internal
>         or op_cfn_ctz_internal only if internal fn call has 2 arguments and
>         set m_op2 in that case.
>         * tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern,
>         vect_recog_popcount_clz_ctz_ffs_pattern): For value defined at zero
>         use second argument of calls if present, otherwise assume UB at zero,
>         create 2 argument .CLZ/.CTZ calls if needed.
>         * tree-vect-stmts.cc (vectorizable_call): Handle 2 argument .CLZ/.CTZ
>         calls.
>         * tree-ssa-loop-niter.cc (build_cltz_expr): Create 2 argument
>         .CLZ/.CTZ calls if needed.
>         * tree-ssa-forwprop.cc (simplify_count_trailing_zeroes): Create 2
>         argument .CTZ calls if needed.
>         * tree-ssa-phiopt.cc (cond_removal_in_builtin_zero_pattern): Handle
>         2 argument .CLZ/.CTZ calls, handle BITINT_TYPE, create 2 argument
>         .CLZ/.CTZ calls.
>         * doc/extend.texi (__builtin_clzg, __builtin_ctzg, __builtin_clrsbg,
>         __builtin_ffsg, __builtin_parityg, __builtin_popcountg): Document.
> gcc/c-family/
>         * c-common.cc (check_builtin_function_arguments): Handle
>         BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
>         * c-gimplify.cc (c_gimplify_expr): If __builtin_c[lt]zg second
>         argument hasn't been folded into constant yet, transform it to one
>         argument call inside of a COND_EXPR which for first argument 0
>         returns the second argument.
> gcc/c/
>         * c-typeck.cc (convert_arguments): Don't promote first argument
>         of BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
> gcc/cp/
>         * call.cc (magic_varargs_p): Return 4 for
>         BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
>         (build_over_call): Don't promote first argument of
>         BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
>         * cp-gimplify.cc (cp_gimplify_expr): For BUILT_IN_C{L,T}ZG use
>         c_gimplify_expr.
> gcc/testsuite/
>         * c-c++-common/pr111309-1.c: New test.
>         * c-c++-common/pr111309-2.c: New test.
>         * gcc.dg/torture/bitint-43.c: New test.
>         * gcc.dg/torture/bitint-44.c: New test.
>
> --- gcc/builtins.def.jj 2023-11-09 09:04:18.396546519 +0100
> +++ gcc/builtins.def    2023-11-09 09:17:40.235182413 +0100
> @@ -962,15 +962,18 @@ DEF_GCC_BUILTIN        (BUILT_IN_CLZ, "c
>  DEF_GCC_BUILTIN        (BUILT_IN_CLZIMAX, "clzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLZL, "clzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLZLL, "clzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_CLZG, "clzg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_CONSTANT_P, "constant_p", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZ, "ctz", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZIMAX, "ctzimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZL, "ctzl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CTZLL, "ctzll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_CTZG, "ctzg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSB, "clrsb", BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSBIMAX, "clrsbimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSBL, "clrsbl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_CLRSBLL, "clrsbll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_CLRSBG, "clrsbg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
>  DEF_GCC_BUILTIN        (BUILT_IN_DWARF_CFA, "dwarf_cfa", BT_FN_PTR, ATTR_NULL)
> @@ -993,6 +996,7 @@ DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFS, "f
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSIMAX, "ffsimax", BT_FN_INT_INTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSL, "ffsl", BT_FN_INT_LONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSLL, "ffsll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_FFSG, "ffsg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_EXT_LIB_BUILTIN        (BUILT_IN_FORK, "fork", BT_FN_PID, ATTR_NOTHROW_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_FRAME_ADDRESS, "frame_address", BT_FN_PTR_UINT, ATTR_NULL)
>  /* [trans-mem]: Adjust BUILT_IN_TM_FREE if BUILT_IN_FREE is changed.  */
> @@ -1041,10 +1045,12 @@ DEF_GCC_BUILTIN        (BUILT_IN_PARITY,
>  DEF_GCC_BUILTIN        (BUILT_IN_PARITYIMAX, "parityimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_PARITYL, "parityl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_PARITYLL, "parityll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_PARITYG, "parityg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNT, "popcount", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTIMAX, "popcountimax", BT_FN_INT_UINTMAX, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTL, "popcountl", BT_FN_INT_ULONG, ATTR_CONST_NOTHROW_LEAF_LIST)
>  DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTLL, "popcountll", BT_FN_INT_ULONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
> +DEF_GCC_BUILTIN        (BUILT_IN_POPCOUNTG, "popcountg", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF)
>  DEF_EXT_LIB_BUILTIN    (BUILT_IN_POSIX_MEMALIGN, "posix_memalign", BT_FN_INT_PTRPTR_SIZE_SIZE, ATTR_NOTHROW_NONNULL_LEAF)
>  DEF_GCC_BUILTIN        (BUILT_IN_PREFETCH, "prefetch", BT_FN_VOID_CONST_PTR_VAR, ATTR_NOVOPS_LEAF_LIST)
>  DEF_LIB_BUILTIN        (BUILT_IN_REALLOC, "realloc", BT_FN_PTR_PTR_SIZE, ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST)
> --- gcc/builtins.cc.jj  2023-11-09 09:03:53.107904770 +0100
> +++ gcc/builtins.cc     2023-11-09 09:17:40.230182483 +0100
> @@ -9573,6 +9573,271 @@ fold_builtin_arith_overflow (location_t
>    return build2_loc (loc, COMPOUND_EXPR, boolean_type_node, store, ovfres);
>  }
>
> +/* Fold __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g into corresponding
> +   internal function.  */
> +
> +static tree
> +fold_builtin_bit_query (location_t loc, enum built_in_function fcode,
> +                       tree arg0, tree arg1)
> +{
> +  enum internal_fn ifn;
> +  enum built_in_function fcodei, fcodel, fcodell;
> +  tree arg0_type = TREE_TYPE (arg0);
> +  tree cast_type = NULL_TREE;
> +  int addend = 0;
> +
> +  switch (fcode)
> +    {
> +    case BUILT_IN_CLZG:
> +      if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
> +       return NULL_TREE;
> +      ifn = IFN_CLZ;
> +      fcodei = BUILT_IN_CLZ;
> +      fcodel = BUILT_IN_CLZL;
> +      fcodell = BUILT_IN_CLZLL;
> +      break;
> +    case BUILT_IN_CTZG:
> +      if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
> +       return NULL_TREE;
> +      ifn = IFN_CTZ;
> +      fcodei = BUILT_IN_CTZ;
> +      fcodel = BUILT_IN_CTZL;
> +      fcodell = BUILT_IN_CTZLL;
> +      break;
> +    case BUILT_IN_CLRSBG:
> +      ifn = IFN_CLRSB;
> +      fcodei = BUILT_IN_CLRSB;
> +      fcodel = BUILT_IN_CLRSBL;
> +      fcodell = BUILT_IN_CLRSBLL;
> +      break;
> +    case BUILT_IN_FFSG:
> +      ifn = IFN_FFS;
> +      fcodei = BUILT_IN_FFS;
> +      fcodel = BUILT_IN_FFSL;
> +      fcodell = BUILT_IN_FFSLL;
> +      break;
> +    case BUILT_IN_PARITYG:
> +      ifn = IFN_PARITY;
> +      fcodei = BUILT_IN_PARITY;
> +      fcodel = BUILT_IN_PARITYL;
> +      fcodell = BUILT_IN_PARITYLL;
> +      break;
> +    case BUILT_IN_POPCOUNTG:
> +      ifn = IFN_POPCOUNT;
> +      fcodei = BUILT_IN_POPCOUNT;
> +      fcodel = BUILT_IN_POPCOUNTL;
> +      fcodell = BUILT_IN_POPCOUNTLL;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  if (TYPE_PRECISION (arg0_type)
> +      <= TYPE_PRECISION (long_long_unsigned_type_node))
> +    {
> +      if (TYPE_PRECISION (arg0_type) <= TYPE_PRECISION (unsigned_type_node))
> +
> +       cast_type = (TYPE_UNSIGNED (arg0_type)
> +                    ? unsigned_type_node : integer_type_node);
> +      else if (TYPE_PRECISION (arg0_type)
> +              <= TYPE_PRECISION (long_unsigned_type_node))
> +       {
> +         cast_type = (TYPE_UNSIGNED (arg0_type)
> +                      ? long_unsigned_type_node : long_integer_type_node);
> +         fcodei = fcodel;
> +       }
> +      else
> +       {
> +         cast_type = (TYPE_UNSIGNED (arg0_type)
> +                      ? long_long_unsigned_type_node
> +                      : long_long_integer_type_node);
> +         fcodei = fcodell;
> +       }
> +    }
> +  else if (TYPE_PRECISION (arg0_type) <= MAX_FIXED_MODE_SIZE)
> +    {
> +      cast_type
> +       = build_nonstandard_integer_type (MAX_FIXED_MODE_SIZE,
> +                                         TYPE_UNSIGNED (arg0_type));
> +      gcc_assert (TYPE_PRECISION (cast_type)
> +                 == 2 * TYPE_PRECISION (long_long_unsigned_type_node));
> +      fcodei = END_BUILTINS;
> +    }
> +  else
> +    fcodei = END_BUILTINS;
> +  if (cast_type)
> +    {
> +      switch (fcode)
> +       {
> +       case BUILT_IN_CLZG:
> +       case BUILT_IN_CLRSBG:
> +         addend = TYPE_PRECISION (arg0_type) - TYPE_PRECISION (cast_type);
> +         break;
> +       default:
> +         break;
> +       }
> +      arg0 = fold_convert (cast_type, arg0);
> +      arg0_type = cast_type;
> +    }
> +
> +  if (arg1)
> +    arg1 = fold_convert (integer_type_node, arg1);
> +
> +  tree arg2 = arg1;
> +  if (fcode == BUILT_IN_CLZG && addend)
> +    {
> +      if (arg1)
> +       arg0 = save_expr (arg0);
> +      arg2 = NULL_TREE;
> +    }
> +  tree call = NULL_TREE, tem;
> +  if (TYPE_PRECISION (arg0_type) == MAX_FIXED_MODE_SIZE
> +      && (TYPE_PRECISION (arg0_type)
> +         == 2 * TYPE_PRECISION (long_long_unsigned_type_node)))
> +    {
> +      /* __int128 expansions using up to 2 long long builtins.  */
> +      arg0 = save_expr (arg0);
> +      tree type = (TYPE_UNSIGNED (arg0_type)
> +                  ? long_long_unsigned_type_node
> +                  : long_long_integer_type_node);
> +      tree hi = fold_build2 (RSHIFT_EXPR, arg0_type, arg0,
> +                            build_int_cst (integer_type_node,
> +                                           MAX_FIXED_MODE_SIZE / 2));
> +      hi = fold_convert (type, hi);
> +      tree lo = fold_convert (type, arg0);
> +      switch (fcode)
> +       {
> +       case BUILT_IN_CLZG:
> +         call = fold_builtin_bit_query (loc, fcode, lo, NULL_TREE);
> +         call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +                             build_int_cst (integer_type_node,
> +                                            MAX_FIXED_MODE_SIZE / 2));
> +         if (arg2)
> +           call = fold_build3 (COND_EXPR, integer_type_node,
> +                               fold_build2 (NE_EXPR, boolean_type_node,
> +                                            lo, build_zero_cst (type)),
> +                               call, arg2);
> +         call = fold_build3 (COND_EXPR, integer_type_node,
> +                             fold_build2 (NE_EXPR, boolean_type_node,
> +                                          hi, build_zero_cst (type)),
> +                             fold_builtin_bit_query (loc, fcode, hi,
> +                                                     NULL_TREE),
> +                             call);
> +         break;
> +       case BUILT_IN_CTZG:
> +         call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
> +         call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +                             build_int_cst (integer_type_node,
> +                                            MAX_FIXED_MODE_SIZE / 2));
> +         if (arg2)
> +           call = fold_build3 (COND_EXPR, integer_type_node,
> +                               fold_build2 (NE_EXPR, boolean_type_node,
> +                                            hi, build_zero_cst (type)),
> +                               call, arg2);
> +         call = fold_build3 (COND_EXPR, integer_type_node,
> +                             fold_build2 (NE_EXPR, boolean_type_node,
> +                                          lo, build_zero_cst (type)),
> +                             fold_builtin_bit_query (loc, fcode, lo,
> +                                                     NULL_TREE),
> +                             call);
> +         break;
> +       case BUILT_IN_CLRSBG:
> +         tem = fold_builtin_bit_query (loc, fcode, lo, NULL_TREE);
> +         tem = fold_build2 (PLUS_EXPR, integer_type_node, tem,
> +                            build_int_cst (integer_type_node,
> +                                           MAX_FIXED_MODE_SIZE / 2));
> +         tem = fold_build3 (COND_EXPR, integer_type_node,
> +                            fold_build2 (LT_EXPR, boolean_type_node,
> +                                         fold_build2 (BIT_XOR_EXPR, type,
> +                                                      lo, hi),
> +                                         build_zero_cst (type)),
> +                            build_int_cst (integer_type_node,
> +                                           MAX_FIXED_MODE_SIZE / 2 - 1),
> +                            tem);
> +         call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
> +         call = save_expr (call);
> +         call = fold_build3 (COND_EXPR, integer_type_node,
> +                             fold_build2 (NE_EXPR, boolean_type_node,
> +                                          call,
> +                                          build_int_cst (integer_type_node,
> +                                                         MAX_FIXED_MODE_SIZE
> +                                                         / 2 - 1)),
> +                             call, tem);
> +         break;
> +       case BUILT_IN_FFSG:
> +         call = fold_builtin_bit_query (loc, fcode, hi, NULL_TREE);
> +         call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +                             build_int_cst (integer_type_node,
> +                                            MAX_FIXED_MODE_SIZE / 2));
> +         call = fold_build3 (COND_EXPR, integer_type_node,
> +                             fold_build2 (NE_EXPR, boolean_type_node,
> +                                          hi, build_zero_cst (type)),
> +                             call, integer_zero_node);
> +         call = fold_build3 (COND_EXPR, integer_type_node,
> +                             fold_build2 (NE_EXPR, boolean_type_node,
> +                                          lo, build_zero_cst (type)),
> +                             fold_builtin_bit_query (loc, fcode, lo,
> +                                                     NULL_TREE),
> +                             call);
> +         break;
> +       case BUILT_IN_PARITYG:
> +         call = fold_builtin_bit_query (loc, fcode,
> +                                        fold_build2 (BIT_XOR_EXPR, type,
> +                                                     lo, hi), NULL_TREE);
> +         break;
> +       case BUILT_IN_POPCOUNTG:
> +         call = fold_build2 (PLUS_EXPR, integer_type_node,
> +                             fold_builtin_bit_query (loc, fcode, hi,
> +                                                     NULL_TREE),
> +                             fold_builtin_bit_query (loc, fcode, lo,
> +                                                     NULL_TREE));
> +         break;
> +       default:
> +         gcc_unreachable ();
> +       }
> +    }
> +  else
> +    {
> +      /* Only keep second argument to IFN_CLZ/IFN_CTZ if it is the
> +        value defined at zero during GIMPLE, or for large/huge _BitInt
> +        (which are then lowered during bitint lowering).  */
> +      if (arg2 && TREE_CODE (TREE_TYPE (arg0)) != BITINT_TYPE)
> +       {
> +         int val;
> +         if (fcode == BUILT_IN_CLZG)
> +           {
> +             if (CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (arg0_type),
> +                                            val) != 2
> +                 || wi::to_widest (arg2) != val)
> +               arg2 = NULL_TREE;
> +           }
> +         else if (CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (arg0_type),
> +                                             val) != 2
> +                  || wi::to_widest (arg2) != val)
> +           arg2 = NULL_TREE;
> +         if (!direct_internal_fn_supported_p (ifn, arg0_type,
> +                                              OPTIMIZE_FOR_BOTH))
> +           arg2 = NULL_TREE;
> +       }
> +      if (fcodei == END_BUILTINS || arg2)
> +       call = build_call_expr_internal_loc (loc, ifn, integer_type_node,
> +                                            arg2 ? 2 : 1, arg0, arg2);
> +      else
> +       call = build_call_expr_loc (loc, builtin_decl_explicit (fcodei), 1,
> +                                   arg0);
> +    }
> +  if (addend)
> +    call = fold_build2 (PLUS_EXPR, integer_type_node, call,
> +                       build_int_cst (integer_type_node, addend));
> +  if (arg1 && arg2 == NULL_TREE)
> +    call = fold_build3 (COND_EXPR, integer_type_node,
> +                       fold_build2 (NE_EXPR, boolean_type_node,
> +                                    arg0, build_zero_cst (arg0_type)),
> +                       call, arg1);
> +
> +  return call;
> +}
> +
>  /* Fold __builtin_{add,sub}c{,l,ll} into pair of internal functions
>     that return both result of arithmetics and overflowed boolean
>     flag in a complex integer result.  */
> @@ -9824,6 +10089,14 @@ fold_builtin_1 (location_t loc, tree exp
>         return build_empty_stmt (loc);
>        break;
>
> +    case BUILT_IN_CLZG:
> +    case BUILT_IN_CTZG:
> +    case BUILT_IN_CLRSBG:
> +    case BUILT_IN_FFSG:
> +    case BUILT_IN_PARITYG:
> +    case BUILT_IN_POPCOUNTG:
> +      return fold_builtin_bit_query (loc, fcode, arg0, NULL_TREE);
> +
>      default:
>        break;
>      }
> @@ -9913,6 +10186,10 @@ fold_builtin_2 (location_t loc, tree exp
>      case BUILT_IN_ATOMIC_IS_LOCK_FREE:
>        return fold_builtin_atomic_is_lock_free (arg0, arg1);
>
> +    case BUILT_IN_CLZG:
> +    case BUILT_IN_CTZG:
> +      return fold_builtin_bit_query (loc, fcode, arg0, arg1);
> +
>      default:
>        break;
>      }
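
For readers following the double-word cases in fold_builtin_bit_query quoted above, here is a minimal sketch of the same lo/hi composition, assuming 32-bit int and using 32-bit halves of a 64-bit value to stand in for the MAX_FIXED_MODE_SIZE / 2 halves. The helper names are hypothetical and not part of the patch.

```cpp
#include <cstdint>

// Hypothetical helpers, not from the patch.  Assumes int is 32 bits wide.

// ffs: use the low half if it has a set bit, otherwise the high half
// plus the half width, otherwise 0.
static int ffs_2x32 (uint32_t lo, uint32_t hi)
{
  if (lo != 0)
    return __builtin_ffs (static_cast<int> (lo));
  return hi != 0 ? __builtin_ffs (static_cast<int> (hi)) + 32 : 0;
}

// parity: the parity of the whole value is the parity of lo ^ hi.
static int parity_2x32 (uint32_t lo, uint32_t hi)
{
  return __builtin_parity (lo ^ hi);
}

// popcount: simply the sum over both halves.
static int popcount_2x32 (uint32_t lo, uint32_t hi)
{
  return __builtin_popcount (hi) + __builtin_popcount (lo);
}

// True if the split forms agree with the native 64-bit builtins for X.
static bool split_matches_native (uint64_t x)
{
  uint32_t lo = static_cast<uint32_t> (x);
  uint32_t hi = static_cast<uint32_t> (x >> 32);
  return ffs_2x32 (lo, hi) == __builtin_ffsll (static_cast<long long> (x))
	 && parity_2x32 (lo, hi) == __builtin_parityll (x)
	 && popcount_2x32 (lo, hi) == __builtin_popcountll (x);
}
```

The ffs case is the only one that needs two conditionals, which matches the two nested COND_EXPRs in the BUILT_IN_FFSG case.
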
> --- gcc/fold-const-call.cc.jj   2023-11-09 09:03:53.368901073 +0100
> +++ gcc/fold-const-call.cc      2023-11-09 09:17:40.240182342 +0100
> @@ -27,7 +27,7 @@ along with GCC; see the file COPYING3.
>  #include "fold-const.h"
>  #include "fold-const-call.h"
>  #include "case-cfn-macros.h"
> -#include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
> +#include "tm.h" /* For C[LT]Z_DEFINED_VALUE_AT_ZERO.  */
>  #include "builtins.h"
>  #include "gimple-expr.h"
>  #include "tree-vector-builder.h"
> @@ -1017,14 +1017,18 @@ fold_const_call_ss (wide_int *result, co
>    switch (fn)
>      {
>      CASE_CFN_FFS:
> +    case CFN_BUILT_IN_FFSG:
>        *result = wi::shwi (wi::ffs (arg), precision);
>        return true;
>
>      CASE_CFN_CLZ:
> +    case CFN_BUILT_IN_CLZG:
>        {
>         int tmp;
>         if (wi::ne_p (arg, 0))
>           tmp = wi::clz (arg);
> +       else if (TREE_CODE (arg_type) == BITINT_TYPE)
> +         tmp = TYPE_PRECISION (arg_type);
>         else if (!CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (arg_type),
>                                              tmp))
>           tmp = TYPE_PRECISION (arg_type);
> @@ -1033,10 +1037,13 @@ fold_const_call_ss (wide_int *result, co
>        }
>
>      CASE_CFN_CTZ:
> +    case CFN_BUILT_IN_CTZG:
>        {
>         int tmp;
>         if (wi::ne_p (arg, 0))
>           tmp = wi::ctz (arg);
> +       else if (TREE_CODE (arg_type) == BITINT_TYPE)
> +         tmp = TYPE_PRECISION (arg_type);
>         else if (!CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (arg_type),
>                                              tmp))
>           tmp = TYPE_PRECISION (arg_type);
> @@ -1045,14 +1052,17 @@ fold_const_call_ss (wide_int *result, co
>        }
>
>      CASE_CFN_CLRSB:
> +    case CFN_BUILT_IN_CLRSBG:
>        *result = wi::shwi (wi::clrsb (arg), precision);
>        return true;
>
>      CASE_CFN_POPCOUNT:
> +    case CFN_BUILT_IN_POPCOUNTG:
>        *result = wi::shwi (wi::popcount (arg), precision);
>        return true;
>
>      CASE_CFN_PARITY:
> +    case CFN_BUILT_IN_PARITYG:
>        *result = wi::shwi (wi::parity (arg), precision);
>        return true;
>
> @@ -1531,6 +1541,49 @@ fold_const_call_sss (real_value *result,
>
>  /* Try to evaluate:
>
> +      *RESULT = FN (ARG0, ARG1)
> +
> +   where ARG_TYPE is the type of ARG0 and PRECISION is the number of bits in
> +   the result.  Return true on success.  */
> +
> +static bool
> +fold_const_call_sss (wide_int *result, combined_fn fn,
> +                    const wide_int_ref &arg0, const wide_int_ref &arg1,
> +                    unsigned int precision, tree arg_type ATTRIBUTE_UNUSED)
> +{
> +  switch (fn)
> +    {
> +    case CFN_CLZ:
> +    case CFN_BUILT_IN_CLZG:
> +      {
> +       int tmp;
> +       if (wi::ne_p (arg0, 0))
> +         tmp = wi::clz (arg0);
> +       else
> +         tmp = arg1.to_shwi ();
> +       *result = wi::shwi (tmp, precision);
> +       return true;
> +      }
> +
> +    case CFN_CTZ:
> +    case CFN_BUILT_IN_CTZG:
> +      {
> +       int tmp;
> +       if (wi::ne_p (arg0, 0))
> +         tmp = wi::ctz (arg0);
> +       else
> +         tmp = arg1.to_shwi ();
> +       *result = wi::shwi (tmp, precision);
> +       return true;
> +      }
> +
> +    default:
> +      return false;
> +    }
> +}
> +
> +/* Try to evaluate:
> +
>        RESULT = fn (ARG0, ARG1)
>
>     where FORMAT is the format of the real and imaginary parts of RESULT
> @@ -1565,6 +1618,19 @@ fold_const_call_1 (combined_fn fn, tree
>    machine_mode arg0_mode = TYPE_MODE (TREE_TYPE (arg0));
>    machine_mode arg1_mode = TYPE_MODE (TREE_TYPE (arg1));
>
> +  if (integer_cst_p (arg0) && integer_cst_p (arg1))
> +    {
> +      if (SCALAR_INT_MODE_P (mode))
> +       {
> +         wide_int result;
> +         if (fold_const_call_sss (&result, fn, wi::to_wide (arg0),
> +                                  wi::to_wide (arg1), TYPE_PRECISION (type),
> +                                  TREE_TYPE (arg0)))
> +           return wide_int_to_tree (type, result);
> +       }
> +      return NULL_TREE;
> +    }
> +
>    if (mode == arg0_mode
>        && real_cst_p (arg0)
>        && real_cst_p (arg1))
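
The new integer fold_const_call_sss overload quoted above boils down to "count if ARG0 is nonzero, otherwise return ARG1". A rough model of that behaviour, assuming 32-bit unsigned int; clz_or and ctz_or are made-up names for illustration only:

```cpp
#include <cstdint>

// Hypothetical model of the two-operand fold; AT_ZERO plays the role of ARG1.
static int clz_or (uint32_t x, int at_zero)
{
  // __builtin_clz is undefined for 0, so the zero case never reaches it.
  return x != 0 ? __builtin_clz (x) : at_zero;
}

static int ctz_or (uint32_t x, int at_zero)
{
  // Same shape for trailing zeros.
  return x != 0 ? __builtin_ctz (x) : at_zero;
}
```

Note that the value at zero is whatever the caller passed, so it need not be the precision; clz_or (0, -1) yields -1 in this model.
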
> --- gcc/genmatch.cc.jj  2023-11-09 09:03:53.375900973 +0100
> +++ gcc/genmatch.cc     2023-11-09 09:17:40.234182427 +0100
> @@ -1895,8 +1895,14 @@ cmp_operand (operand *o1, operand *o2)
>      {
>        expr *e1 = static_cast<expr *>(o1);
>        expr *e2 = static_cast<expr *>(o2);
> -      return (e1->operation == e2->operation
> -             && e1->is_generic == e2->is_generic);
> +      if (e1->operation != e2->operation
> +         || e1->is_generic != e2->is_generic)
> +       return false;
> +      if (e1->operation->kind == id_base::FN
> +         /* For function calls also compare number of arguments.  */
> +         && e1->ops.length () != e2->ops.length ())
> +       return false;
> +      return true;
>      }
>    else
>      return false;
> @@ -3070,6 +3076,26 @@ dt_operand::gen_generic_expr (FILE *f, i
>    return 0;
>  }
>
> +/* Compare 2 fns or generic_fns vector entries for vector sorting.
> +   Same operation entries with different number of arguments should
> +   be adjacent.  */
> +
> +static int
> +fns_cmp (const void *p1, const void *p2)
> +{
> +  dt_operand *op1 = *(dt_operand *const *) p1;
> +  dt_operand *op2 = *(dt_operand *const *) p2;
> +  expr *e1 = as_a <expr *> (op1->op);
> +  expr *e2 = as_a <expr *> (op2->op);
> +  id_base *b1 = e1->operation;
> +  id_base *b2 = e2->operation;
> +  if (b1->hashval < b2->hashval)
> +    return -1;
> +  if (b1->hashval > b2->hashval)
> +    return 1;
> +  return strcmp (b1->id, b2->id);
> +}
> +
>  /* Generate matching code for the children of the decision tree node.  */
>
>  void
> @@ -3143,6 +3169,8 @@ dt_node::gen_kids (FILE *f, int indent,
>              Like DT_TRUE, DT_MATCH serves as a barrier as it can cause
>              dependent matches to get out-of-order.  Generate code now
>              for what we have collected sofar.  */
> +         fns.qsort (fns_cmp);
> +         generic_fns.qsort (fns_cmp);
>           gen_kids_1 (f, indent, gimple, depth, gimple_exprs, generic_exprs,
>                       fns, generic_fns, preds, others);
>           /* And output the true operand itself.  */
> @@ -3159,6 +3187,8 @@ dt_node::gen_kids (FILE *f, int indent,
>      }
>
>    /* Generate code for the remains.  */
> +  fns.qsort (fns_cmp);
> +  generic_fns.qsort (fns_cmp);
>    gen_kids_1 (f, indent, gimple, depth, gimple_exprs, generic_exprs,
>               fns, generic_fns, preds, others);
>  }
> @@ -3256,14 +3286,21 @@ dt_node::gen_kids_1 (FILE *f, int indent
>
>           indent += 4;
>           fprintf_indent (f, indent, "{\n");
> +         id_base *last_op = NULL;
>           for (unsigned i = 0; i < fns_len; ++i)
>             {
>               expr *e = as_a <expr *>(fns[i]->op);
> -             if (user_id *u = dyn_cast <user_id *> (e->operation))
> -               for (auto id : u->substitutes)
> -                 fprintf_indent (f, indent, "case %s:\n", id->id);
> -             else
> -               fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +             if (e->operation != last_op)
> +               {
> +                 if (i)
> +                   fprintf_indent (f, indent, "  break;\n");
> +                 if (user_id *u = dyn_cast <user_id *> (e->operation))
> +                   for (auto id : u->substitutes)
> +                     fprintf_indent (f, indent, "case %s:\n", id->id);
> +                 else
> +                   fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +               }
> +             last_op = e->operation;
>               /* We need to be defensive against bogus prototypes allowing
>                  calls with not enough arguments.  */
>               fprintf_indent (f, indent,
> @@ -3272,9 +3309,9 @@ dt_node::gen_kids_1 (FILE *f, int indent
>               fprintf_indent (f, indent, "    {\n");
>               fns[i]->gen (f, indent + 6, true, depth);
>               fprintf_indent (f, indent, "    }\n");
> -             fprintf_indent (f, indent, "  break;\n");
>             }
>
> +         fprintf_indent (f, indent, "  break;\n");
>           fprintf_indent (f, indent, "default:;\n");
>           fprintf_indent (f, indent, "}\n");
>           indent -= 4;
> @@ -3334,18 +3371,25 @@ dt_node::gen_kids_1 (FILE *f, int indent
>                       "    {\n");
>        indent += 4;
>
> +      id_base *last_op = NULL;
>        for (unsigned j = 0; j < generic_fns.length (); ++j)
>         {
>           expr *e = as_a <expr *>(generic_fns[j]->op);
>           gcc_assert (e->operation->kind == id_base::FN);
>
> -         fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +         if (e->operation != last_op)
> +           {
> +             if (j)
> +               fprintf_indent (f, indent, "  break;\n");
> +             fprintf_indent (f, indent, "case %s:\n", e->operation->id);
> +           }
> +         last_op = e->operation;
>           fprintf_indent (f, indent, "  if (call_expr_nargs (%s) == %d)\n"
>                                      "    {\n", kid_opname, e->ops.length ());
>           generic_fns[j]->gen (f, indent + 6, false, depth);
> -         fprintf_indent (f, indent, "    }\n"
> -                                    "  break;\n");
> +         fprintf_indent (f, indent, "    }\n");
>         }
> +      fprintf_indent (f, indent, "  break;\n");
>        fprintf_indent (f, indent, "default:;\n");
>
>        indent -= 4;
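
As I read the genmatch.cc hunks quoted above, the point is that the same function can now appear with different argument counts (e.g. IFN_CLZ with one or two operands), so the generator has to emit one case label per operation and several nargs checks under it. A simplified model of that emission loop follows; the pattern struct and emit_switch_body are hypothetical, and the sort here is by name only, whereas the patch sorts by hash value and then by id.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a decision tree entry: operation name plus
// the number of call arguments the pattern expects.
struct pattern { std::string op; unsigned nargs; };

static std::string emit_switch_body (std::vector<pattern> pats)
{
  // Keep entries for the same operation adjacent.
  std::stable_sort (pats.begin (), pats.end (),
		    [] (const pattern &a, const pattern &b)
		    { return a.op < b.op; });
  std::string out;
  const std::string *last_op = nullptr;
  for (std::size_t i = 0; i < pats.size (); ++i)
    {
      if (!last_op || pats[i].op != *last_op)
	{
	  // Close the previous case before opening a new label.
	  if (i)
	    out += "  break;\n";
	  out += "case " + pats[i].op + ":\n";
	}
      last_op = &pats[i].op;
      out += "  if (nargs == " + std::to_string (pats[i].nargs)
	     + ") { ... }\n";
    }
  // One trailing break for the last case, as in the patched loop.
  if (!pats.empty ())
    out += "  break;\n";
  out += "default:;\n";
  return out;
}
```

Feeding it CFN_CTZ/1, CFN_CLZ/1 and CFN_CLZ/2 gives a single "case CFN_CLZ:" with two nargs checks, instead of two identical case labels.
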
> --- gcc/match.pd.jj     2023-11-09 09:03:53.490899344 +0100
> +++ gcc/match.pd        2023-11-09 09:17:40.231182469 +0100
> @@ -8532,31 +8532,34 @@ (define_operator_list SYNC_FETCH_AND_AND
>     (op (clz:s@2 @0) INTEGER_CST@1)
>     (if (integer_zerop (@1) && single_use (@2))
>      /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
> -    (with { tree type0 = TREE_TYPE (@0);
> -           tree stype = signed_type_for (type0);
> -           HOST_WIDE_INT val = 0;
> -           /* Punt on hypothetical weird targets.  */
> -           if (clz == CFN_CLZ
> -               && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -                                             val) == 2
> -               && val == 0)
> -             stype = NULL_TREE;
> -         }
> -     (if (stype)
> -      (cmp (convert:stype @0) { build_zero_cst (stype); })))
> +    (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
> +     (cmp (convert:stype @0) { build_zero_cst (stype); }))
>      /* clz(X) == (prec-1) is X == 1 and clz(X) != (prec-1) is X != 1.  */
> -    (with { bool ok = true;
> -           HOST_WIDE_INT val = 0;
> -           tree type0 = TREE_TYPE (@0);
> -           /* Punt on hypothetical weird targets.  */
> -           if (clz == CFN_CLZ
> -               && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -                                             val) == 2
> -               && val == TYPE_PRECISION (type0) - 1)
> -             ok = false;
> -         }
> -     (if (ok && wi::to_wide (@1) == (TYPE_PRECISION (type0) - 1))
> -      (op @0 { build_one_cst (type0); })))))))
> +    (if (wi::to_wide (@1) == TYPE_PRECISION (TREE_TYPE (@0)) - 1)
> +     (op @0 { build_one_cst (TREE_TYPE (@0)); }))))))
> +(for op (eq ne)
> +     cmp (lt ge)
> + (simplify
> +  (op (IFN_CLZ:s@2 @0 @3) INTEGER_CST@1)
> +  (if (integer_zerop (@1) && single_use (@2))
> +   /* clz(X) == 0 is (int)X < 0 and clz(X) != 0 is (int)X >= 0.  */
> +   (with { tree type0 = TREE_TYPE (@0);
> +          tree stype = signed_type_for (TREE_TYPE (@0));
> +          /* Punt if clz(0) == 0.  */
> +          if (integer_zerop (@3))
> +            stype = NULL_TREE;
> +        }
> +    (if (stype)
> +     (cmp (convert:stype @0) { build_zero_cst (stype); })))
> +   /* clz(X) == (prec-1) is X == 1 and clz(X) != (prec-1) is X != 1.  */
> +   (with { bool ok = true;
> +          tree type0 = TREE_TYPE (@0);
> +          /* Punt if clz(0) == prec - 1.  */
> +          if (wi::to_widest (@3) == TYPE_PRECISION (type0) - 1)
> +            ok = false;
> +        }
> +    (if (ok && wi::to_wide (@1) == (TYPE_PRECISION (type0) - 1))
> +     (op @0 { build_one_cst (type0); }))))))
>
>  /* CTZ simplifications.  */
>  (for ctz (CTZ)
> @@ -8581,22 +8584,14 @@ (define_operator_list SYNC_FETCH_AND_AND
>                       val++;
>                   }
>               }
> -           bool zero_res = false;
> -           HOST_WIDE_INT zero_val = 0;
>             tree type0 = TREE_TYPE (@0);
>             int prec = TYPE_PRECISION (type0);
> -           if (ctz == CFN_CTZ
> -               && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -                                             zero_val) == 2)
> -             zero_res = true;
>           }
> -     (if (val <= 0)
> -      (if (ok && (!zero_res || zero_val >= val))
> -       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); })
> -      (if (val >= prec)
> -       (if (ok && (!zero_res || zero_val < val))
> -       { constant_boolean_node (cmp == EQ_EXPR ? false : true, type); })
> -       (if (ok && (!zero_res || zero_val < 0 || zero_val >= prec))
> +     (if (ok && prec <= MAX_FIXED_MODE_SIZE)
> +      (if (val <= 0)
> +       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); }
> +       (if (val >= prec)
> +       { constant_boolean_node (cmp == EQ_EXPR ? false : true, type); }
>         (cmp (bit_and @0 { wide_int_to_tree (type0,
>                                              wi::mask (val, false, prec)); })
>              { build_zero_cst (type0); })))))))
> @@ -8604,19 +8599,68 @@ (define_operator_list SYNC_FETCH_AND_AND
>    (simplify
>     /* __builtin_ctz (x) == C -> (x & ((1 << (C + 1)) - 1)) == (1 << C).  */
>     (op (ctz:s @0) INTEGER_CST@1)
> -    (with { bool zero_res = false;
> -           HOST_WIDE_INT zero_val = 0;
> -           tree type0 = TREE_TYPE (@0);
> +    (with { tree type0 = TREE_TYPE (@0);
>             int prec = TYPE_PRECISION (type0);
> -           if (ctz == CFN_CTZ
> -               && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_TYPE_MODE (type0),
> -                                             zero_val) == 2)
> -             zero_res = true;
>           }
> +     (if (prec <= MAX_FIXED_MODE_SIZE)
> +      (if (tree_int_cst_sgn (@1) < 0 || wi::to_widest (@1) >= prec)
> +       { constant_boolean_node (op == EQ_EXPR ? false : true, type); }
> +       (op (bit_and @0 { wide_int_to_tree (type0,
> +                                          wi::mask (tree_to_uhwi (@1) + 1,
> +                                                    false, prec)); })
> +          { wide_int_to_tree (type0,
> +                              wi::shifted_mask (tree_to_uhwi (@1), 1,
> +                                                false, prec)); })))))))
> +(for op (ge gt le lt)
> +     cmp (eq eq ne ne)
> + (simplify
> +  /* __builtin_ctz (x) >= C -> (x & ((1 << C) - 1)) == 0.  */
> +  (op (IFN_CTZ:s @0 @2) INTEGER_CST@1)
> +   (with { bool ok = true;
> +          HOST_WIDE_INT val = 0;
> +          if (!tree_fits_shwi_p (@1))
> +            ok = false;
> +          else
> +            {
> +              val = tree_to_shwi (@1);
> +              /* Canonicalize to >= or <.  */
> +              if (op == GT_EXPR || op == LE_EXPR)
> +                {
> +                  if (val == HOST_WIDE_INT_MAX)
> +                    ok = false;
> +                  else
> +                    val++;
> +                }
> +            }
> +          HOST_WIDE_INT zero_val = tree_to_shwi (@2);
> +          tree type0 = TREE_TYPE (@0);
> +          int prec = TYPE_PRECISION (type0);
> +          if (prec > MAX_FIXED_MODE_SIZE)
> +            ok = false;
> +         }
> +     (if (val <= 0)
> +      (if (ok && zero_val >= val)
> +       { constant_boolean_node (cmp == EQ_EXPR ? true : false, type); })
> +      (if (val >= prec)
> +       (if (ok && zero_val < val)
> +       { constant_boolean_node (cmp == EQ_EXPR ? false : true, type); })
> +       (if (ok && (zero_val < 0 || zero_val >= prec))
> +       (cmp (bit_and @0 { wide_int_to_tree (type0,
> +                                            wi::mask (val, false, prec)); })
> +            { build_zero_cst (type0); })))))))
> +(for op (eq ne)
> + (simplify
> +  /* __builtin_ctz (x) == C -> (x & ((1 << (C + 1)) - 1)) == (1 << C).  */
> +  (op (IFN_CTZ:s @0 @2) INTEGER_CST@1)
> +   (with { HOST_WIDE_INT zero_val = tree_to_shwi (@2);
> +          tree type0 = TREE_TYPE (@0);
> +          int prec = TYPE_PRECISION (type0);
> +        }
> +    (if (prec <= MAX_FIXED_MODE_SIZE)
>       (if (tree_int_cst_sgn (@1) < 0 || wi::to_widest (@1) >= prec)
> -      (if (!zero_res || zero_val != wi::to_widest (@1))
> +      (if (zero_val != wi::to_widest (@1))
>         { constant_boolean_node (op == EQ_EXPR ? false : true, type); })
> -      (if (!zero_res || zero_val < 0 || zero_val >= prec)
> +      (if (zero_val < 0 || zero_val >= prec)
>         (op (bit_and @0 { wide_int_to_tree (type0,
>                                            wi::mask (tree_to_uhwi (@1) + 1,
>                                                      false, prec)); })
> @@ -8753,13 +8797,38 @@ (define_operator_list SYNC_FETCH_AND_AND
>    (cond (ne @0 integer_zerop@1) (func (convert?@3 @0)) INTEGER_CST@2)
>    (with { int val;
>           internal_fn ifn = IFN_LAST;
> -         if (direct_internal_fn_supported_p (IFN_CLZ, type, OPTIMIZE_FOR_BOTH)
> -             && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
> -                                           val) == 2)
> +         if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +           {
> +             if (tree_fits_shwi_p (@2))
> +               {
> +                 HOST_WIDE_INT valw = tree_to_shwi (@2);
> +                 if ((int) valw == valw)
> +                   {
> +                     val = valw;
> +                     ifn = IFN_CLZ;
> +                   }
> +               }
> +           }
> +         else if (direct_internal_fn_supported_p (IFN_CLZ, TREE_TYPE (@3),
> +                                                  OPTIMIZE_FOR_BOTH)
> +                  && CLZ_DEFINED_VALUE_AT_ZERO
> +                       (SCALAR_INT_TYPE_MODE (TREE_TYPE (@3)), val) == 2)
>             ifn = IFN_CLZ;
>         }
>     (if (ifn == IFN_CLZ && wi::to_widest (@2) == val)
> -    (IFN_CLZ @3)))))
> +    (IFN_CLZ @3 @2)))))
> +(simplify
> + (cond (ne @0 integer_zerop@1) (IFN_CLZ (convert?@3 @0) INTEGER_CST@2) @2)
> +  (with { int val;
> +         internal_fn ifn = IFN_LAST;
> +         if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +           ifn = IFN_CLZ;
> +         else if (direct_internal_fn_supported_p (IFN_CLZ, TREE_TYPE (@3),
> +                                                  OPTIMIZE_FOR_BOTH))
> +           ifn = IFN_CLZ;
> +       }
> +   (if (ifn == IFN_CLZ)
> +    (IFN_CLZ @3 @2))))
>
>  /* a != 0 ? CTZ(a) : CST -> .CTZ(a) where CST is the result of the internal function for 0. */
>  (for func (CTZ)
> @@ -8767,13 +8836,38 @@ (define_operator_list SYNC_FETCH_AND_AND
>    (cond (ne @0 integer_zerop@1) (func (convert?@3 @0)) INTEGER_CST@2)
>    (with { int val;
>           internal_fn ifn = IFN_LAST;
> -         if (direct_internal_fn_supported_p (IFN_CTZ, type, OPTIMIZE_FOR_BOTH)
> -             && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
> -                                           val) == 2)
> +         if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +           {
> +             if (tree_fits_shwi_p (@2))
> +               {
> +                 HOST_WIDE_INT valw = tree_to_shwi (@2);
> +                 if ((int) valw == valw)
> +                   {
> +                     val = valw;
> +                     ifn = IFN_CTZ;
> +                   }
> +               }
> +           }
> +         else if (direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@3),
> +                                                  OPTIMIZE_FOR_BOTH)
> +                  && CTZ_DEFINED_VALUE_AT_ZERO
> +                       (SCALAR_INT_TYPE_MODE (TREE_TYPE (@3)), val) == 2)
>             ifn = IFN_CTZ;
>         }
>     (if (ifn == IFN_CTZ && wi::to_widest (@2) == val)
> -    (IFN_CTZ @3)))))
> +    (IFN_CTZ @3 @2)))))
> +(simplify
> + (cond (ne @0 integer_zerop@1) (IFN_CTZ (convert?@3 @0) INTEGER_CST@2) @2)
> +  (with { int val;
> +         internal_fn ifn = IFN_LAST;
> +         if (TREE_CODE (TREE_TYPE (@3)) == BITINT_TYPE)
> +           ifn = IFN_CTZ;
> +         else if (direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@3),
> +                                                  OPTIMIZE_FOR_BOTH))
> +           ifn = IFN_CTZ;
> +       }
> +   (if (ifn == IFN_CTZ)
> +    (IFN_CTZ @3 @2))))
>  #endif
>
>  /* Common POPCOUNT/PARITY simplifications.  */
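
The match.pd comments quoted above state four identities for nonzero X. A small checker, assuming 32-bit unsigned int, that can be used to convince oneself of them; the function names are hypothetical and X must be nonzero because the one-argument builtins are undefined at zero.

```cpp
#include <cstdint>

// Hypothetical checkers, not from the patch.  X must be nonzero.

// clz(X) == 0 is (int)X < 0, and clz(X) == prec-1 is X == 1.
static bool clz_identities_hold (uint32_t x)
{
  const int prec = 32;
  bool a = (__builtin_clz (x) == 0) == (static_cast<int32_t> (x) < 0);
  bool b = (__builtin_clz (x) == prec - 1) == (x == 1);
  return a && b;
}

// ctz(X) >= C is (X & ((1 << C) - 1)) == 0, and
// ctz(X) == C is (X & ((1 << (C + 1)) - 1)) == (1 << C), for 0 <= C < 32.
static bool ctz_identities_hold (uint32_t x, int c)
{
  // Build the masks with right shifts so no shift count reaches 32.
  uint32_t low_c = (c == 0) ? 0 : (~0u >> (32 - c));
  uint32_t low_c1 = ~0u >> (31 - c);
  bool ge = (__builtin_ctz (x) >= c) == ((x & low_c) == 0);
  bool eq = (__builtin_ctz (x) == c) == ((x & low_c1) == (uint32_t (1) << c));
  return ge && eq;
}
```

The zero case is where the two-operand IFN forms need the extra guards: if the value at zero were 0, clz(0) == 0 would be true while (int)0 < 0 is false, which is why the first IFN_CLZ pattern punts when @3 is zero.
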
> --- gcc/gimple-lower-bitint.cc.jj       2023-11-09 09:03:53.423900293 +0100
> +++ gcc/gimple-lower-bitint.cc  2023-11-09 09:17:40.242182314 +0100
> @@ -427,6 +427,7 @@ struct bitint_large_huge
>    void lower_mul_overflow (tree, gimple *);
>    void lower_cplxpart_stmt (tree, gimple *);
>    void lower_complexexpr_stmt (gimple *);
> +  void lower_bit_query (gimple *);
>    void lower_call (tree, gimple *);
>    void lower_asm (gimple *);
>    void lower_stmt (gimple *);
> @@ -4455,6 +4456,524 @@ bitint_large_huge::lower_complexexpr_stm
>    insert_before (g);
>  }
>
> +/* Lower a .{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT} call with one large/huge _BitInt
> +   argument.  */
> +
> +void
> +bitint_large_huge::lower_bit_query (gimple *stmt)
> +{
> +  tree arg0 = gimple_call_arg (stmt, 0);
> +  tree arg1 = (gimple_call_num_args (stmt) == 2
> +              ? gimple_call_arg (stmt, 1) : NULL_TREE);
> +  tree lhs = gimple_call_lhs (stmt);
> +  gimple *g;
> +
> +  if (!lhs)
> +    {
> +      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +      gsi_remove (&gsi, true);
> +      return;
> +    }
> +  tree type = TREE_TYPE (arg0);
> +  gcc_assert (TREE_CODE (type) == BITINT_TYPE);
> +  bitint_prec_kind kind = bitint_precision_kind (type);
> +  gcc_assert (kind >= bitint_prec_large);
> +  enum internal_fn ifn = gimple_call_internal_fn (stmt);
> +  enum built_in_function fcode = END_BUILTINS;
> +  gcc_assert (TYPE_PRECISION (unsigned_type_node) == limb_prec
> +             || TYPE_PRECISION (long_unsigned_type_node) == limb_prec
> +             || TYPE_PRECISION (long_long_unsigned_type_node) == limb_prec);
> +  switch (ifn)
> +    {
> +    case IFN_CLZ:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_CLZ;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_CLZL;
> +      else
> +       fcode = BUILT_IN_CLZLL;
> +      break;
> +    case IFN_FFS:
> +      /* .FFS (X) is .CTZ (X, -1) + 1, though under the hood
> +        we don't add the addend at the end.  */
> +      arg1 = integer_zero_node;
> +      /* FALLTHRU */
> +    case IFN_CTZ:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_CTZ;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_CTZL;
> +      else
> +       fcode = BUILT_IN_CTZLL;
> +      m_upwards = true;
> +      break;
> +    case IFN_CLRSB:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_CLRSB;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_CLRSBL;
> +      else
> +       fcode = BUILT_IN_CLRSBLL;
> +      break;
> +    case IFN_PARITY:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_PARITY;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_PARITYL;
> +      else
> +       fcode = BUILT_IN_PARITYLL;
> +      m_upwards = true;
> +      break;
> +    case IFN_POPCOUNT:
> +      if (TYPE_PRECISION (unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_POPCOUNT;
> +      else if (TYPE_PRECISION (long_unsigned_type_node) == limb_prec)
> +       fcode = BUILT_IN_POPCOUNTL;
> +      else
> +       fcode = BUILT_IN_POPCOUNTLL;
> +      m_upwards = true;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +  tree fndecl = builtin_decl_explicit (fcode), res = NULL_TREE;
> +  unsigned cnt = 0, rem = 0, end = 0, prec = TYPE_PRECISION (type);
> +  struct bq_details { edge e; tree val, addend; } *bqp = NULL;
> +  basic_block edge_bb = NULL;
> +  if (m_upwards)
> +    {
> +      tree idx = NULL_TREE, idx_first = NULL_TREE, idx_next = NULL_TREE;
> +      if (kind == bitint_prec_large)
> +       cnt = CEIL (prec, limb_prec);
> +      else
> +       {
> +         rem = (prec % (2 * limb_prec));
> +         end = (prec - rem) / limb_prec;
> +         cnt = 2 + CEIL (rem, limb_prec);
> +         idx = idx_first = create_loop (size_zero_node, &idx_next);
> +       }
> +
> +      if (ifn == IFN_CTZ || ifn == IFN_FFS)
> +       {
> +         gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +         gsi_prev (&gsi);
> +         edge e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
> +         edge_bb = e->src;
> +         if (kind == bitint_prec_large)
> +           {
> +             m_gsi = gsi_last_bb (edge_bb);
> +             if (!gsi_end_p (m_gsi))
> +               gsi_next (&m_gsi);
> +           }
> +         bqp = XALLOCAVEC (struct bq_details, cnt);
> +       }
> +      else
> +       m_after_stmt = stmt;
> +      if (kind != bitint_prec_large)
> +       m_upwards_2limb = end;
> +
> +      for (unsigned i = 0; i < cnt; i++)
> +       {
> +         m_data_cnt = 0;
> +         if (kind == bitint_prec_large)
> +           idx = size_int (i);
> +         else if (i >= 2)
> +           idx = size_int (end + (i > 2));
> +
> +         tree rhs1 = handle_operand (arg0, idx);
> +         if (!useless_type_conversion_p (m_limb_type, TREE_TYPE (rhs1)))
> +           {
> +             if (!TYPE_UNSIGNED (TREE_TYPE (rhs1)))
> +               rhs1 = add_cast (unsigned_type_for (TREE_TYPE (rhs1)), rhs1);
> +             rhs1 = add_cast (m_limb_type, rhs1);
> +           }
> +
> +         tree in, out, tem;
> +         if (ifn == IFN_PARITY)
> +           in = prepare_data_in_out (build_zero_cst (m_limb_type), idx, &out);
> +         else if (ifn == IFN_FFS)
> +           in = prepare_data_in_out (integer_one_node, idx, &out);
> +         else
> +           in = prepare_data_in_out (integer_zero_node, idx, &out);
> +
> +         switch (ifn)
> +           {
> +           case IFN_CTZ:
> +           case IFN_FFS:
> +             g = gimple_build_cond (NE_EXPR, rhs1,
> +                                    build_zero_cst (m_limb_type),
> +                                    NULL_TREE, NULL_TREE);
> +             insert_before (g);
> +             edge e1, e2;
> +             e1 = split_block (gsi_bb (m_gsi), g);
> +             e1->flags = EDGE_FALSE_VALUE;
> +             e2 = make_edge (e1->src, gimple_bb (stmt), EDGE_TRUE_VALUE);
> +             e1->probability = profile_probability::unlikely ();
> +             e2->probability = e1->probability.invert ();
> +             if (i == 0)
> +               set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
> +             m_gsi = gsi_after_labels (e1->dest);
> +             bqp[i].e = e2;
> +             bqp[i].val = rhs1;
> +             if (tree_fits_uhwi_p (idx))
> +               bqp[i].addend
> +                 = build_int_cst (integer_type_node,
> +                                  tree_to_uhwi (idx) * limb_prec
> +                                  + (ifn == IFN_FFS));
> +             else
> +               {
> +                 bqp[i].addend = in;
> +                 if (i == 1)
> +                   res = out;
> +                 else
> +                   res = make_ssa_name (integer_type_node);
> +                 g = gimple_build_assign (res, PLUS_EXPR, in,
> +                                          build_int_cst (integer_type_node,
> +                                                         limb_prec));
> +                 insert_before (g);
> +                 m_data[m_data_cnt] = res;
> +               }
> +             break;
> +           case IFN_PARITY:
> +             if (!integer_zerop (in))
> +               {
> +                 if (kind == bitint_prec_huge && i == 1)
> +                   res = out;
> +                 else
> +                   res = make_ssa_name (m_limb_type);
> +                 g = gimple_build_assign (res, BIT_XOR_EXPR, in, rhs1);
> +                 insert_before (g);
> +               }
> +             else
> +               res = rhs1;
> +             m_data[m_data_cnt] = res;
> +             break;
> +           case IFN_POPCOUNT:
> +             g = gimple_build_call (fndecl, 1, rhs1);
> +             tem = make_ssa_name (integer_type_node);
> +             gimple_call_set_lhs (g, tem);
> +             insert_before (g);
> +             if (!integer_zerop (in))
> +               {
> +                 if (kind == bitint_prec_huge && i == 1)
> +                   res = out;
> +                 else
> +                   res = make_ssa_name (integer_type_node);
> +                 g = gimple_build_assign (res, PLUS_EXPR, in, tem);
> +                 insert_before (g);
> +               }
> +             else
> +               res = tem;
> +             m_data[m_data_cnt] = res;
> +             break;
> +           default:
> +             gcc_unreachable ();
> +           }
> +
> +         m_first = false;
> +         if (kind == bitint_prec_huge && i <= 1)
> +           {
> +             if (i == 0)
> +               {
> +                 idx = make_ssa_name (sizetype);
> +                 g = gimple_build_assign (idx, PLUS_EXPR, idx_first,
> +                                          size_one_node);
> +                 insert_before (g);
> +               }
> +             else
> +               {
> +                 g = gimple_build_assign (idx_next, PLUS_EXPR, idx_first,
> +                                          size_int (2));
> +                 insert_before (g);
> +                 g = gimple_build_cond (NE_EXPR, idx_next, size_int (end),
> +                                        NULL_TREE, NULL_TREE);
> +                 insert_before (g);
> +                 if (ifn == IFN_CTZ || ifn == IFN_FFS)
> +                   m_gsi = gsi_after_labels (edge_bb);
> +                 else
> +                   m_gsi = gsi_for_stmt (stmt);
> +               }
> +           }
> +       }
> +    }
> +  else
> +    {
> +      tree idx = NULL_TREE, idx_next = NULL_TREE, first = NULL_TREE;
> +      int sub_one = 0;
> +      if (kind == bitint_prec_large)
> +       cnt = CEIL (prec, limb_prec);
> +      else
> +       {
> +         rem = prec % limb_prec;
> +         if (rem == 0 && (!TYPE_UNSIGNED (type) || ifn == IFN_CLRSB))
> +           rem = limb_prec;
> +         end = (prec - rem) / limb_prec;
> +         cnt = 1 + (rem != 0);
> +         if (ifn == IFN_CLRSB)
> +           sub_one = 1;
> +       }
> +
> +      gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +      gsi_prev (&gsi);
> +      edge e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
> +      edge_bb = e->src;
> +      m_gsi = gsi_last_bb (edge_bb);
> +      if (!gsi_end_p (m_gsi))
> +       gsi_next (&m_gsi);
> +
> +      if (ifn == IFN_CLZ)
> +       bqp = XALLOCAVEC (struct bq_details, cnt);
> +      else
> +       {
> +         gsi = gsi_for_stmt (stmt);
> +         gsi_prev (&gsi);
> +         e = split_block (gsi_bb (gsi), gsi_stmt (gsi));
> +         edge_bb = e->src;
> +         bqp = XALLOCAVEC (struct bq_details, 2 * cnt);
> +       }
> +
> +      for (unsigned i = 0; i < cnt; i++)
> +       {
> +         m_data_cnt = 0;
> +         if (kind == bitint_prec_large)
> +           idx = size_int (cnt - i - 1);
> +         else if (i == cnt - 1)
> +           idx = create_loop (size_int (end - 1), &idx_next);
> +         else
> +           idx = size_int (end);
> +
> +         tree rhs1 = handle_operand (arg0, idx);
> +         if (!useless_type_conversion_p (m_limb_type, TREE_TYPE (rhs1)))
> +           {
> +             if (ifn == IFN_CLZ && !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
> +               rhs1 = add_cast (unsigned_type_for (TREE_TYPE (rhs1)), rhs1);
> +             else if (ifn == IFN_CLRSB && TYPE_UNSIGNED (TREE_TYPE (rhs1)))
> +               rhs1 = add_cast (signed_type_for (TREE_TYPE (rhs1)), rhs1);
> +             rhs1 = add_cast (m_limb_type, rhs1);
> +           }
> +
> +         if (ifn == IFN_CLZ)
> +           {
> +             g = gimple_build_cond (NE_EXPR, rhs1,
> +                                    build_zero_cst (m_limb_type),
> +                                    NULL_TREE, NULL_TREE);
> +             insert_before (g);
> +             edge e1 = split_block (gsi_bb (m_gsi), g);
> +             e1->flags = EDGE_FALSE_VALUE;
> +             edge e2 = make_edge (e1->src, gimple_bb (stmt), EDGE_TRUE_VALUE);
> +             e1->probability = profile_probability::unlikely ();
> +             e2->probability = e1->probability.invert ();
> +             if (i == 0)
> +               set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
> +             m_gsi = gsi_after_labels (e1->dest);
> +             bqp[i].e = e2;
> +             bqp[i].val = rhs1;
> +           }
> +         else
> +           {
> +             if (i == 0)
> +               {
> +                 first = rhs1;
> +                 g = gimple_build_assign (make_ssa_name (m_limb_type),
> +                                          PLUS_EXPR, rhs1,
> +                                          build_int_cst (m_limb_type, 1));
> +                 insert_before (g);
> +                 g = gimple_build_cond (GT_EXPR, gimple_assign_lhs (g),
> +                                        build_int_cst (m_limb_type, 1),
> +                                        NULL_TREE, NULL_TREE);
> +                 insert_before (g);
> +               }
> +             else
> +               {
> +                 g = gimple_build_assign (make_ssa_name (m_limb_type),
> +                                          BIT_XOR_EXPR, rhs1, first);
> +                 insert_before (g);
> +                 tree stype = signed_type_for (m_limb_type);
> +                 g = gimple_build_cond (LT_EXPR,
> +                                        add_cast (stype,
> +                                                  gimple_assign_lhs (g)),
> +                                        build_zero_cst (stype),
> +                                        NULL_TREE, NULL_TREE);
> +                 insert_before (g);
> +                 edge e1 = split_block (gsi_bb (m_gsi), g);
> +                 e1->flags = EDGE_FALSE_VALUE;
> +                 edge e2 = make_edge (e1->src, gimple_bb (stmt),
> +                                      EDGE_TRUE_VALUE);
> +                 e1->probability = profile_probability::unlikely ();
> +                 e2->probability = e1->probability.invert ();
> +                 if (i == 1)
> +                   set_immediate_dominator (CDI_DOMINATORS, e2->dest,
> +                                            e2->src);
> +                 m_gsi = gsi_after_labels (e1->dest);
> +                 bqp[2 * i].e = e2;
> +                 g = gimple_build_cond (NE_EXPR, rhs1, first,
> +                                        NULL_TREE, NULL_TREE);
> +                 insert_before (g);
> +               }
> +             edge e1 = split_block (gsi_bb (m_gsi), g);
> +             e1->flags = EDGE_FALSE_VALUE;
> +             edge e2 = make_edge (e1->src, edge_bb, EDGE_TRUE_VALUE);
> +             e1->probability = profile_probability::unlikely ();
> +             e2->probability = e1->probability.invert ();
> +             if (i == 0)
> +               set_immediate_dominator (CDI_DOMINATORS, e2->dest, e2->src);
> +             m_gsi = gsi_after_labels (e1->dest);
> +             bqp[2 * i + 1].e = e2;
> +             bqp[i].val = rhs1;
> +           }
> +         if (tree_fits_uhwi_p (idx))
> +           bqp[i].addend
> +             = build_int_cst (integer_type_node,
> +                              (int) prec
> +                              - (((int) tree_to_uhwi (idx) + 1)
> +                                 * limb_prec) - sub_one);
> +         else
> +           {
> +             tree in, out;
> +             in = build_int_cst (integer_type_node, rem - sub_one);
> +             m_first = true;
> +             in = prepare_data_in_out (in, idx, &out);
> +             out = m_data[m_data_cnt + 1];
> +             bqp[i].addend = in;
> +             g = gimple_build_assign (out, PLUS_EXPR, in,
> +                                      build_int_cst (integer_type_node,
> +                                                     limb_prec));
> +             insert_before (g);
> +             m_data[m_data_cnt] = out;
> +           }
> +
> +         m_first = false;
> +         if (kind == bitint_prec_huge && i == cnt - 1)
> +           {
> +             g = gimple_build_assign (idx_next, PLUS_EXPR, idx,
> +                                      size_int (-1));
> +             insert_before (g);
> +             g = gimple_build_cond (NE_EXPR, idx, size_zero_node,
> +                                    NULL_TREE, NULL_TREE);
> +             insert_before (g);
> +             edge true_edge, false_edge;
> +             extract_true_false_edges_from_block (gsi_bb (m_gsi),
> +                                                  &true_edge, &false_edge);
> +             m_gsi = gsi_after_labels (false_edge->dest);
> +           }
> +       }
> +    }
> +  switch (ifn)
> +    {
> +    case IFN_CLZ:
> +    case IFN_CTZ:
> +    case IFN_FFS:
> +      gphi *phi1, *phi2, *phi3;
> +      basic_block bb;
> +      bb = gsi_bb (m_gsi);
> +      remove_edge (find_edge (bb, gimple_bb (stmt)));
> +      phi1 = create_phi_node (make_ssa_name (m_limb_type),
> +                             gimple_bb (stmt));
> +      phi2 = create_phi_node (make_ssa_name (integer_type_node),
> +                             gimple_bb (stmt));
> +      for (unsigned i = 0; i < cnt; i++)
> +       {
> +         add_phi_arg (phi1, bqp[i].val, bqp[i].e, UNKNOWN_LOCATION);
> +         add_phi_arg (phi2, bqp[i].addend, bqp[i].e, UNKNOWN_LOCATION);
> +       }
> +      if (arg1 == NULL_TREE)
> +       {
> +         g = gimple_build_builtin_unreachable (m_loc);
> +         insert_before (g);
> +       }
> +      m_gsi = gsi_for_stmt (stmt);
> +      g = gimple_build_call (fndecl, 1, gimple_phi_result (phi1));
> +      gimple_call_set_lhs (g, make_ssa_name (integer_type_node));
> +      insert_before (g);
> +      if (arg1 == NULL_TREE)
> +       g = gimple_build_assign (lhs, PLUS_EXPR,
> +                                gimple_phi_result (phi2),
> +                                gimple_call_lhs (g));
> +      else
> +       {
> +         g = gimple_build_assign (make_ssa_name (integer_type_node),
> +                                  PLUS_EXPR, gimple_phi_result (phi2),
> +                                  gimple_call_lhs (g));
> +         insert_before (g);
> +         edge e1 = split_block (gimple_bb (stmt), g);
> +         edge e2 = make_edge (bb, e1->dest, EDGE_FALLTHRU);
> +         e2->probability = profile_probability::always ();
> +         set_immediate_dominator (CDI_DOMINATORS, e1->dest,
> +                                  get_immediate_dominator (CDI_DOMINATORS,
> +                                                           e1->src));
> +         phi3 = create_phi_node (make_ssa_name (integer_type_node), e1->dest);
> +         add_phi_arg (phi3, gimple_assign_lhs (g), e1, UNKNOWN_LOCATION);
> +         add_phi_arg (phi3, arg1, e2, UNKNOWN_LOCATION);
> +         m_gsi = gsi_for_stmt (stmt);
> +         g = gimple_build_assign (lhs, gimple_phi_result (phi3));
> +       }
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    case IFN_CLRSB:
> +      bb = gsi_bb (m_gsi);
> +      remove_edge (find_edge (bb, edge_bb));
> +      edge e;
> +      e = make_edge (bb, gimple_bb (stmt), EDGE_FALLTHRU);
> +      e->probability = profile_probability::always ();
> +      set_immediate_dominator (CDI_DOMINATORS, gimple_bb (stmt),
> +                              get_immediate_dominator (CDI_DOMINATORS,
> +                                                       edge_bb));
> +      phi1 = create_phi_node (make_ssa_name (m_limb_type),
> +                             edge_bb);
> +      phi2 = create_phi_node (make_ssa_name (integer_type_node),
> +                             edge_bb);
> +      phi3 = create_phi_node (make_ssa_name (integer_type_node),
> +                             gimple_bb (stmt));
> +      for (unsigned i = 0; i < cnt; i++)
> +       {
> +         add_phi_arg (phi1, bqp[i].val, bqp[2 * i + 1].e, UNKNOWN_LOCATION);
> +         add_phi_arg (phi2, bqp[i].addend, bqp[2 * i + 1].e,
> +                      UNKNOWN_LOCATION);
> +         tree a = bqp[i].addend;
> +         if (i && kind == bitint_prec_large)
> +           a = int_const_binop (PLUS_EXPR, a, integer_minus_one_node);
> +         if (i)
> +           add_phi_arg (phi3, a, bqp[2 * i].e, UNKNOWN_LOCATION);
> +       }
> +      add_phi_arg (phi3, build_int_cst (integer_type_node, prec - 1), e,
> +                  UNKNOWN_LOCATION);
> +      m_gsi = gsi_after_labels (edge_bb);
> +      g = gimple_build_call (fndecl, 1,
> +                            add_cast (signed_type_for (m_limb_type),
> +                                      gimple_phi_result (phi1)));
> +      gimple_call_set_lhs (g, make_ssa_name (integer_type_node));
> +      insert_before (g);
> +      g = gimple_build_assign (make_ssa_name (integer_type_node),
> +                              PLUS_EXPR, gimple_call_lhs (g),
> +                              gimple_phi_result (phi2));
> +      insert_before (g);
> +      if (kind != bitint_prec_large)
> +       {
> +         g = gimple_build_assign (make_ssa_name (integer_type_node),
> +                                  PLUS_EXPR, gimple_assign_lhs (g),
> +                                  integer_one_node);
> +         insert_before (g);
> +       }
> +      add_phi_arg (phi3, gimple_assign_lhs (g),
> +                  find_edge (edge_bb, gimple_bb (stmt)), UNKNOWN_LOCATION);
> +      m_gsi = gsi_for_stmt (stmt);
> +      g = gimple_build_assign (lhs, gimple_phi_result (phi3));
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    case IFN_PARITY:
> +      g = gimple_build_call (fndecl, 1, res);
> +      gimple_call_set_lhs (g, lhs);
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    case IFN_POPCOUNT:
> +      g = gimple_build_assign (lhs, res);
> +      gsi_replace (&m_gsi, g, true);
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +
>  /* Lower a call statement with one or more large/huge _BitInt
>     arguments or large/huge _BitInt return value.  */
>
> @@ -4476,6 +4995,14 @@ bitint_large_huge::lower_call (tree obj,
>        case IFN_UBSAN_CHECK_MUL:
>         lower_mul_overflow (obj, stmt);
>         return;
> +      case IFN_CLZ:
> +      case IFN_CTZ:
> +      case IFN_CLRSB:
> +      case IFN_FFS:
> +      case IFN_PARITY:
> +      case IFN_POPCOUNT:
> +       lower_bit_query (stmt);
> +       return;
>        default:
>         break;
>        }
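
The limb-wise strategy that lower_bit_query uses above can be modelled outside the compiler. The following is only a minimal sketch under stated assumptions, not the GIMPLE the pass emits: limb_t, LIMB_PREC and the wide_* helpers are hypothetical names, limbs are stored least significant first, and any unused bits of the top limb are assumed to be zero. It shows popcount as a sum of per-limb counts, parity as one parity over the XOR of all limbs, and clz as a scan from the most significant limb with the same addend formula as the patch (prec - (idx + 1) * limb_prec).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical limb type and width; the real pass uses the target's limb type.
typedef uint64_t limb_t;
static const int LIMB_PREC = 64;

// Portable stand-in for the per-limb popcount builtin call.
static int limb_popcount (limb_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)   // clear the lowest set bit each round
    ++n;
  return n;
}

// Portable stand-in for the per-limb clz builtin call; requires x != 0.
static int limb_clz (limb_t x)
{
  int n = 0;
  for (limb_t bit = (limb_t) 1 << (LIMB_PREC - 1); (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// popcount: accumulate the per-limb counts with PLUS.
static int wide_popcount (const limb_t *limbs, size_t cnt)
{
  int res = 0;
  for (size_t i = 0; i < cnt; ++i)
    res += limb_popcount (limbs[i]);
  return res;
}

// parity: XOR all limbs together, then take a single parity at the end.
static int wide_parity (const limb_t *limbs, size_t cnt)
{
  limb_t acc = 0;
  for (size_t i = 0; i < cnt; ++i)
    acc ^= limbs[i];
  return limb_popcount (acc) & 1;
}

// clz: find the first non-zero limb from the top and add a per-limb addend.
// at_zero models the optional second argument returned for a zero operand.
static int wide_clz (const limb_t *limbs, size_t cnt, int prec, int at_zero)
{
  for (size_t i = cnt; i-- > 0; )
    if (limbs[i] != 0)
      {
        int addend = prec - (int) (i + 1) * LIMB_PREC;
        return limb_clz (limbs[i]) + addend;
      }
  return at_zero;
}
```

For a 100-bit value held in two limbs, wide_clz on the value 1 gives 63 + (100 - 64) = 99, and on a value with bit 99 set it gives 28 + (100 - 128) = 0.
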
> --- gcc/gimple-range-op.cc.jj   2023-11-09 09:03:53.443900010 +0100
> +++ gcc/gimple-range-op.cc      2023-11-09 09:17:40.233182441 +0100
> @@ -908,39 +908,34 @@ public:
>    cfn_clz (bool internal) { m_gimple_call_internal_p = internal; }
>    using range_operator::fold_range;
>    virtual bool fold_range (irange &r, tree type, const irange &lh,
> -                          const irange &, relation_trio) const;
> +                          const irange &rh, relation_trio) const;
>  private:
>    bool m_gimple_call_internal_p;
>  } op_cfn_clz (false), op_cfn_clz_internal (true);
>
>  bool
>  cfn_clz::fold_range (irange &r, tree type, const irange &lh,
> -                    const irange &, relation_trio) const
> +                    const irange &rh, relation_trio) const
>  {
>    // __builtin_c[lt]z* return [0, prec-1], except when the
>    // argument is 0, but that is undefined behavior.
>    //
>    // For __builtin_c[lt]z* consider argument of 0 always undefined
> -  // behavior, for internal fns depending on C?Z_DEFINED_VALUE_AT_ZERO.
> +  // behavior, for internal fns likewise, unless it has 2 arguments,
> +  // then the second argument is the value at zero.
>    if (lh.undefined_p ())
>      return false;
>    int prec = TYPE_PRECISION (lh.type ());
>    int mini = 0;
>    int maxi = prec - 1;
> -  int zerov = 0;
> -  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (lh.type ());
>    if (m_gimple_call_internal_p)
>      {
> -      if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
> -         && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
> -       {
> -         // Only handle the single common value.
> -         if (zerov == prec)
> -           maxi = prec;
> -         else
> -           // Magic value to give up, unless we can prove arg is non-zero.
> -           mini = -2;
> -       }
> +      // Only handle the single common value.
> +      if (rh.lower_bound () == prec)
> +       maxi = prec;
> +      else
> +       // Magic value to give up, unless we can prove arg is non-zero.
> +       mini = -2;
>      }
>
>    // From clz of minimum we can compute result maximum.
> @@ -985,37 +980,31 @@ public:
>    cfn_ctz (bool internal) { m_gimple_call_internal_p = internal; }
>    using range_operator::fold_range;
>    virtual bool fold_range (irange &r, tree type, const irange &lh,
> -                          const irange &, relation_trio) const;
> +                          const irange &rh, relation_trio) const;
>  private:
>    bool m_gimple_call_internal_p;
>  } op_cfn_ctz (false), op_cfn_ctz_internal (true);
>
>  bool
>  cfn_ctz::fold_range (irange &r, tree type, const irange &lh,
> -                    const irange &, relation_trio) const
> +                    const irange &rh, relation_trio) const
>  {
>    if (lh.undefined_p ())
>      return false;
>    int prec = TYPE_PRECISION (lh.type ());
>    int mini = 0;
>    int maxi = prec - 1;
> -  int zerov = 0;
> -  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (lh.type ());
>
>    if (m_gimple_call_internal_p)
>      {
> -      if (optab_handler (ctz_optab, mode) != CODE_FOR_nothing
> -         && CTZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
> -       {
> -         // Handle only the two common values.
> -         if (zerov == -1)
> -           mini = -1;
> -         else if (zerov == prec)
> -           maxi = prec;
> -         else
> -           // Magic value to give up, unless we can prove arg is non-zero.
> -           mini = -2;
> -       }
> +      // Handle only the two common values.
> +      if (rh.lower_bound () == -1)
> +       mini = -1;
> +      else if (rh.lower_bound () == prec)
> +       maxi = prec;
> +      else
> +       // Magic value to give up, unless we can prove arg is non-zero.
> +       mini = -2;
>      }
>    // If arg is non-zero, then use [0, prec - 1].
>    if (!range_includes_zero_p (&lh))
> @@ -1288,16 +1277,24 @@ gimple_range_op_handler::maybe_builtin_c
>
>      CASE_CFN_CLZ:
>        m_op1 = gimple_call_arg (call, 0);
> -      if (gimple_call_internal_p (call))
> -       m_operator = &op_cfn_clz_internal;
> +      if (gimple_call_internal_p (call)
> +         && gimple_call_num_args (call) == 2)
> +       {
> +         m_op2 = gimple_call_arg (call, 1);
> +         m_operator = &op_cfn_clz_internal;
> +       }
>        else
>         m_operator = &op_cfn_clz;
>        break;
>
>      CASE_CFN_CTZ:
>        m_op1 = gimple_call_arg (call, 0);
> -      if (gimple_call_internal_p (call))
> -       m_operator = &op_cfn_ctz_internal;
> +      if (gimple_call_internal_p (call)
> +         && gimple_call_num_args (call) == 2)
> +       {
> +         m_op2 = gimple_call_arg (call, 1);
> +         m_operator = &op_cfn_ctz_internal;
> +       }
>        else
>         m_operator = &op_cfn_ctz;
>        break;
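
The bound selection in the cfn_ctz::fold_range hunk above can be summarised with a small model. This is a simplified sketch, not the ranger code: int_range and ctz_result_range are hypothetical names, and the real function goes on to refine the bounds from the operand's range, which this model ignores. It only captures the part visible in the diff: with one argument a zero operand is undefined, so the result stays in [0, prec - 1]; with two arguments only the at-zero values -1 and prec are folded into the range, and any other value makes the model give up unless the operand is known to be non-zero.

```cpp
// Hypothetical closed integer interval [lo, hi].
struct int_range { int lo; int hi; };

// Returns false when this simplified model gives up.
// prec:            precision of the operand in bits
// arg_may_be_zero: whether the operand's range includes zero
// has_at_zero:     whether the call carries the second (value at zero) argument
// at_zero:         that second argument, ignored when has_at_zero is false
static bool ctz_result_range (int prec, bool arg_may_be_zero,
                              bool has_at_zero, int at_zero,
                              int_range *out)
{
  out->lo = 0;
  out->hi = prec - 1;
  // Non-zero operand, or zero is undefined behavior: plain [0, prec - 1].
  if (!arg_may_be_zero || !has_at_zero)
    return true;
  // Handle only the two common values, as the patch does.
  if (at_zero == -1)
    out->lo = -1;
  else if (at_zero == prec)
    out->hi = prec;
  else
    return false;
  return true;
}
```
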
> --- gcc/tree-vect-patterns.cc.jj        2023-11-09 09:03:53.675896723 +0100
> +++ gcc/tree-vect-patterns.cc   2023-11-09 09:17:40.232182455 +0100
> @@ -1818,7 +1818,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>    tree new_var;
>    internal_fn ifn = IFN_LAST, ifnnew = IFN_LAST;
>    bool defined_at_zero = true, defined_at_zero_new = false;
> -  int val = 0, val_new = 0;
> +  int val = 0, val_new = 0, val_cmp = 0;
>    int prec;
>    int sub = 0, add = 0;
>    location_t loc;
> @@ -1826,7 +1826,8 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>    if (!is_gimple_call (call_stmt))
>      return NULL;
>
> -  if (gimple_call_num_args (call_stmt) != 1)
> +  if (gimple_call_num_args (call_stmt) != 1
> +      && gimple_call_num_args (call_stmt) != 2)
>      return NULL;
>
>    rhs_oprnd = gimple_call_arg (call_stmt, 0);
> @@ -1846,9 +1847,10 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>      CASE_CFN_CTZ:
>        ifn = IFN_CTZ;
>        if (!gimple_call_internal_p (call_stmt)
> -         || CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (rhs_type),
> -                                       val) != 2)
> +         || gimple_call_num_args (call_stmt) != 2)
>         defined_at_zero = false;
> +      else
> +       val = tree_to_shwi (gimple_call_arg (call_stmt, 1));
>        break;
>      CASE_CFN_FFS:
>        ifn = IFN_FFS;
> @@ -1907,6 +1909,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>
>    vect_pattern_detected ("vec_recog_ctz_ffs_pattern", call_stmt);
>
> +  val_cmp = val_new;
>    if ((ifnnew == IFN_CLZ
>         && defined_at_zero
>         && defined_at_zero_new
> @@ -1918,7 +1921,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>          .CTZ (X) = .POPCOUNT ((X - 1) & ~X).  */
>        if (ifnnew == IFN_CLZ)
>         sub = prec;
> -      val_new = prec;
> +      val_cmp = prec;
>
>        if (!TYPE_UNSIGNED (rhs_type))
>         {
> @@ -1955,7 +1958,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>        /* .CTZ (X) = (PREC - 1) - .CLZ (X & -X)
>          .FFS (X) = PREC - .CLZ (X & -X).  */
>        sub = prec - (ifn == IFN_CTZ);
> -      val_new = sub - val_new;
> +      val_cmp = sub - val_new;
>
>        tree neg = vect_recog_temp_ssa_var (rhs_type, NULL);
>        pattern_stmt = gimple_build_assign (neg, NEGATE_EXPR, rhs_oprnd);
> @@ -1974,7 +1977,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>        /* .CTZ (X) = PREC - .POPCOUNT (X | -X)
>          .FFS (X) = (PREC + 1) - .POPCOUNT (X | -X).  */
>        sub = prec + (ifn == IFN_FFS);
> -      val_new = sub;
> +      val_cmp = sub;
>
>        tree neg = vect_recog_temp_ssa_var (rhs_type, NULL);
>        pattern_stmt = gimple_build_assign (neg, NEGATE_EXPR, rhs_oprnd);
> @@ -1992,12 +1995,18 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>      {
>        /* .FFS (X) = .CTZ (X) + 1.  */
>        add = 1;
> -      val_new++;
> +      val_cmp++;
>      }
>
>    /* Create B = .IFNNEW (A).  */
>    new_var = vect_recog_temp_ssa_var (lhs_type, NULL);
> -  pattern_stmt = gimple_build_call_internal (ifnnew, 1, rhs_oprnd);
> +  if ((ifnnew == IFN_CLZ || ifnnew == IFN_CTZ) && defined_at_zero_new)
> +    pattern_stmt
> +      = gimple_build_call_internal (ifnnew, 2, rhs_oprnd,
> +                                   build_int_cst (integer_type_node,
> +                                                  val_new));
> +  else
> +    pattern_stmt = gimple_build_call_internal (ifnnew, 1, rhs_oprnd);
>    gimple_call_set_lhs (pattern_stmt, new_var);
>    gimple_set_location (pattern_stmt, loc);
>    *type_out = vec_type;
> @@ -2023,7 +2032,7 @@ vect_recog_ctz_ffs_pattern (vec_info *vi
>      }
>
>    if (defined_at_zero
> -      && (!defined_at_zero_new || val != val_new))
> +      && (!defined_at_zero_new || val != val_cmp))
>      {
>        append_pattern_def_seq (vinfo, stmt_vinfo, pattern_stmt, vec_type);
>        tree ret_var = vect_recog_temp_ssa_var (lhs_type, NULL);
> @@ -2143,7 +2152,8 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>        return NULL;
>      }
>
> -  if (gimple_call_num_args (call_stmt) != 1)
> +  if (gimple_call_num_args (call_stmt) != 1
> +      && gimple_call_num_args (call_stmt) != 2)
>      return NULL;
>
>    rhs_oprnd = gimple_call_arg (call_stmt, 0);
> @@ -2181,17 +2191,14 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>           return NULL;
>         addend = (TYPE_PRECISION (TREE_TYPE (rhs_oprnd))
>                   - TYPE_PRECISION (lhs_type));
> -       if (gimple_call_internal_p (call_stmt))
> +       if (gimple_call_internal_p (call_stmt)
> +           && gimple_call_num_args (call_stmt) == 2)
>           {
>             int val1, val2;
> -           int d1
> -             = CLZ_DEFINED_VALUE_AT_ZERO
> -                 (SCALAR_INT_TYPE_MODE (TREE_TYPE (rhs_oprnd)), val1);
> +           val1 = tree_to_shwi (gimple_call_arg (call_stmt, 1));
>             int d2
>               = CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
>                                            val2);
> -           if (d1 != 2)
> -             break;
>             if (d2 != 2 || val1 != val2 + addend)
>               return NULL;
>           }
> @@ -2200,17 +2207,14 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>         /* ctzll (x) == ctz (x) for unsigned or signed x != 0, so ok
>            if it is undefined at zero or if it matches also for the
>            defined value there.  */
> -       if (gimple_call_internal_p (call_stmt))
> +       if (gimple_call_internal_p (call_stmt)
> +           && gimple_call_num_args (call_stmt) == 2)
>           {
>             int val1, val2;
> -           int d1
> -             = CTZ_DEFINED_VALUE_AT_ZERO
> -                 (SCALAR_INT_TYPE_MODE (TREE_TYPE (rhs_oprnd)), val1);
> +           val1 = tree_to_shwi (gimple_call_arg (call_stmt, 1));
>             int d2
>               = CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
>                                            val2);
> -           if (d1 != 2)
> -             break;
>             if (d2 != 2 || val1 != val2)
>               return NULL;
>           }
> @@ -2260,7 +2264,20 @@ vect_recog_popcount_clz_ctz_ffs_pattern
>
>    /* Create B = .POPCOUNT (A).  */
>    new_var = vect_recog_temp_ssa_var (lhs_type, NULL);
> -  pattern_stmt = gimple_build_call_internal (ifn, 1, unprom_diff.op);
> +  tree arg2 = NULL_TREE;
> +  int val;
> +  if (ifn == IFN_CLZ
> +      && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
> +                                   val) == 2)
> +    arg2 = build_int_cst (integer_type_node, val);
> +  else if (ifn == IFN_CTZ
> +          && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (lhs_type),
> +                                        val) == 2)
> +    arg2 = build_int_cst (integer_type_node, val);
> +  if (arg2)
> +    pattern_stmt = gimple_build_call_internal (ifn, 2, unprom_diff.op, arg2);
> +  else
> +    pattern_stmt = gimple_build_call_internal (ifn, 1, unprom_diff.op);
>    gimple_call_set_lhs (pattern_stmt, new_var);
>    gimple_set_location (pattern_stmt, gimple_location (last_stmt));
>    *type_out = vec_type;
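
The rewrite identities quoted in the vect_recog_ctz_ffs_pattern comments above are easy to check on scalars. The sketch below is an illustration for a fixed 32-bit precision, with hypothetical helper names (popcount32, clz32, ctz32 and the ctz_via_* functions); it is not vectorizer code. Each helper also shows what the replacement yields for a zero input, which is the quantity the patch tracks in val_cmp when deciding whether a fix-up for zero is still needed.

```cpp
#include <cstdint>

static const int PREC = 32;

static int popcount32 (uint32_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

// Reference clz/ctz with an explicit value for a zero operand.
static int clz32 (uint32_t x, int at_zero)
{
  if (x == 0)
    return at_zero;
  int n = 0;
  for (uint32_t bit = UINT32_C (1) << (PREC - 1); (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

static int ctz32 (uint32_t x, int at_zero)
{
  if (x == 0)
    return at_zero;
  int n = 0;
  for (; (x & 1) == 0; x >>= 1)
    ++n;
  return n;
}

// .CTZ (X) = .POPCOUNT ((X - 1) & ~X); yields PREC for X == 0.
static int ctz_via_popcount_mask (uint32_t x)
{
  return popcount32 ((x - 1) & ~x);
}

// .CTZ (X) = (PREC - 1) - .CLZ (X & -X); for X == 0 this yields
// (PREC - 1) - clz_at_zero, i.e. "sub - val_new" in the patch.
static int ctz_via_clz (uint32_t x, int clz_at_zero)
{
  return (PREC - 1) - clz32 (x & (0u - x), clz_at_zero);
}

// .CTZ (X) = PREC - .POPCOUNT (X | -X); yields PREC for X == 0.
static int ctz_via_popcount_or (uint32_t x)
{
  return PREC - popcount32 (x | (0u - x));
}
```

For x = 8 all three forms give 3: (x - 1) & ~x is 7, x & -x is 8 with 28 leading zeros, and x | -x has 29 bits set.
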
> --- gcc/tree-vect-stmts.cc.jj   2023-11-09 09:04:20.349518853 +0100
> +++ gcc/tree-vect-stmts.cc      2023-11-09 10:00:01.351992895 +0100
> @@ -3266,6 +3266,7 @@ vectorizable_call (vec_info *vinfo,
>    enum { NARROW, NONE, WIDEN } modifier;
>    size_t i, nargs;
>    tree lhs;
> +  tree clz_ctz_arg1 = NULL_TREE;
>
>    if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
>      return false;
> @@ -3311,6 +3312,14 @@ vectorizable_call (vec_info *vinfo,
>        nargs = 0;
>        rhs_type = unsigned_type_node;
>      }
> +  /* Similarly pretend IFN_CLZ and IFN_CTZ only have one argument; the second
> +     argument just says whether the call is well-defined at zero and, if so,
> +     what value should be returned for it.  */
> +  if ((cfn == CFN_CLZ || cfn == CFN_CTZ) && nargs == 2)
> +    {
> +      nargs = 1;
> +      clz_ctz_arg1 = gimple_call_arg (stmt, 1);
> +    }
>
>    int mask_opno = -1;
>    if (internal_fn_p (cfn))
> @@ -3576,6 +3585,8 @@ vectorizable_call (vec_info *vinfo,
>        ifn = cond_fn;
>        vect_nargs += 2;
>      }
> +  if (clz_ctz_arg1)
> +    ++vect_nargs;
>
>    if (modifier == NONE || ifn != IFN_LAST)
>      {
> @@ -3613,6 +3624,9 @@ vectorizable_call (vec_info *vinfo,
>                     }
>                   if (masked_loop_p && reduc_idx >= 0)
>                     vargs[varg++] = vargs[reduc_idx + 1];
> +                 if (clz_ctz_arg1)
> +                   vargs[varg++] = clz_ctz_arg1;
> +
>                   gimple *new_stmt;
>                   if (modifier == NARROW)
>                     {
> @@ -3699,6 +3713,8 @@ vectorizable_call (vec_info *vinfo,
>             }
>           if (masked_loop_p && reduc_idx >= 0)
>             vargs[varg++] = vargs[reduc_idx + 1];
> +         if (clz_ctz_arg1)
> +           vargs[varg++] = clz_ctz_arg1;
>
>           if (len_opno >= 0 && len_loop_p)
>             {
> --- gcc/tree-ssa-loop-niter.cc.jj       2023-11-09 09:03:53.592897899 +0100
> +++ gcc/tree-ssa-loop-niter.cc  2023-11-09 09:17:40.234182427 +0100
> @@ -2235,14 +2235,18 @@ build_cltz_expr (tree src, bool leading,
>    tree call;
>    if (use_ifn)
>      {
> -      call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
> -                                          integer_type_node, 1, src);
>        int val;
>        int optab_defined_at_zero
>         = (leading
>            ? CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val)
>            : CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val));
> -      if (define_at_zero && !(optab_defined_at_zero == 2 && val == prec))
> +      tree arg2 = NULL_TREE;
> +      if (define_at_zero && optab_defined_at_zero == 2 && val == prec)
> +       arg2 = build_int_cst (integer_type_node, val);
> +      call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
> +                                          integer_type_node, arg2 ? 2 : 1,
> +                                          src, arg2);
> +      if (define_at_zero && arg2 == NULL_TREE)
>         {
>           tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
>                                       build_zero_cst (TREE_TYPE (src)));
> --- gcc/tree-ssa-forwprop.cc.jj 2023-11-09 09:03:53.542898608 +0100
> +++ gcc/tree-ssa-forwprop.cc    2023-11-09 09:38:28.895393573 +0100
> @@ -2381,6 +2381,7 @@ simplify_count_trailing_zeroes (gimple_s
>        HOST_WIDE_INT type_size = tree_to_shwi (TYPE_SIZE (type));
>        bool zero_ok
>         = CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type), ctz_val) == 2;
> +      int nargs = 2;
>
>        /* If the input value can't be zero, don't special case ctz (0).  */
>        if (tree_expr_nonzero_p (res_ops[0]))
> @@ -2388,6 +2389,7 @@ simplify_count_trailing_zeroes (gimple_s
>           zero_ok = true;
>           zero_val = 0;
>           ctz_val = 0;
> +         nargs = 1;
>         }
>
>        /* Skip if there is no value defined at zero, or if we can't easily
> @@ -2399,7 +2401,11 @@ simplify_count_trailing_zeroes (gimple_s
>
>        gimple_seq seq = NULL;
>        gimple *g;
> -      gcall *call = gimple_build_call_internal (IFN_CTZ, 1, res_ops[0]);
> +      gcall *call
> +       = gimple_build_call_internal (IFN_CTZ, nargs, res_ops[0],
> +                                     nargs == 1 ? NULL_TREE
> +                                     : build_int_cst (integer_type_node,
> +                                                      ctz_val));
>        gimple_set_location (call, gimple_location (stmt));
>        gimple_set_lhs (call, make_ssa_name (integer_type_node));
>        gimple_seq_add_stmt (&seq, call);
> --- gcc/tree-ssa-phiopt.cc.jj   2023-11-09 09:03:53.616897559 +0100
> +++ gcc/tree-ssa-phiopt.cc      2023-11-09 09:17:40.241182328 +0100
> @@ -2863,18 +2863,26 @@ cond_removal_in_builtin_zero_pattern (ba
>      }
>
>    /* Check that we have a popcount/clz/ctz builtin.  */
> -  if (!is_gimple_call (call) || gimple_call_num_args (call) != 1)
> +  if (!is_gimple_call (call))
>      return false;
>
> -  arg = gimple_call_arg (call, 0);
>    lhs = gimple_get_lhs (call);
>
>    if (lhs == NULL_TREE)
>      return false;
>
>    combined_fn cfn = gimple_call_combined_fn (call);
> +  if (gimple_call_num_args (call) != 1
> +      && (gimple_call_num_args (call) != 2
> +         || cfn == CFN_CLZ
> +         || cfn == CFN_CTZ))
> +    return false;
> +
> +  arg = gimple_call_arg (call, 0);
> +
>    internal_fn ifn = IFN_LAST;
>    int val = 0;
> +  bool any_val = false;
>    switch (cfn)
>      {
>      case CFN_BUILT_IN_BSWAP16:
> @@ -2889,6 +2897,23 @@ cond_removal_in_builtin_zero_pattern (ba
>        if (INTEGRAL_TYPE_P (TREE_TYPE (arg)))
>         {
>           tree type = TREE_TYPE (arg);
> +         if (TREE_CODE (type) == BITINT_TYPE)
> +           {
> +             if (gimple_call_num_args (call) == 1)
> +               {
> +                 any_val = true;
> +                 ifn = IFN_CLZ;
> +                 break;
> +               }
> +             if (!tree_fits_shwi_p (gimple_call_arg (call, 1)))
> +               return false;
> +             HOST_WIDE_INT at_zero = tree_to_shwi (gimple_call_arg (call, 1));
> +             if ((int) at_zero != at_zero)
> +               return false;
> +             ifn = IFN_CLZ;
> +             val = at_zero;
> +             break;
> +           }
>           if (direct_internal_fn_supported_p (IFN_CLZ, type, OPTIMIZE_FOR_BOTH)
>               && CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
>                                             val) == 2)
> @@ -2902,6 +2927,23 @@ cond_removal_in_builtin_zero_pattern (ba
>        if (INTEGRAL_TYPE_P (TREE_TYPE (arg)))
>         {
>           tree type = TREE_TYPE (arg);
> +         if (TREE_CODE (type) == BITINT_TYPE)
> +           {
> +             if (gimple_call_num_args (call) == 1)
> +               {
> +                 any_val = true;
> +                 ifn = IFN_CTZ;
> +                 break;
> +               }
> +             if (!tree_fits_shwi_p (gimple_call_arg (call, 1)))
> +               return false;
> +             HOST_WIDE_INT at_zero = tree_to_shwi (gimple_call_arg (call, 1));
> +             if ((int) at_zero != at_zero)
> +               return false;
> +             ifn = IFN_CTZ;
> +             val = at_zero;
> +             break;
> +           }
>           if (direct_internal_fn_supported_p (IFN_CTZ, type, OPTIMIZE_FOR_BOTH)
>               && CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (type),
>                                             val) == 2)
> @@ -2960,8 +3002,18 @@ cond_removal_in_builtin_zero_pattern (ba
>
>    /* Check PHI arguments.  */
>    if (lhs != arg0
> -      || TREE_CODE (arg1) != INTEGER_CST
> -      || wi::to_wide (arg1) != val)
> +      || TREE_CODE (arg1) != INTEGER_CST)
> +    return false;
> +  if (any_val)
> +    {
> +      if (!tree_fits_shwi_p (arg1))
> +       return false;
> +      HOST_WIDE_INT at_zero = tree_to_shwi (arg1);
> +      if ((int) at_zero != at_zero)
> +       return false;
> +      val = at_zero;
> +    }
> +  else if (wi::to_wide (arg1) != val)
>      return false;
>
>    /* And insert the popcount/clz/ctz builtin and cast stmt before the
> @@ -2974,13 +3026,15 @@ cond_removal_in_builtin_zero_pattern (ba
>        reset_flow_sensitive_info (gimple_get_lhs (cast));
>      }
>    gsi_from = gsi_for_stmt (call);
> -  if (ifn == IFN_LAST || gimple_call_internal_p (call))
> +  if (ifn == IFN_LAST
> +      || (gimple_call_internal_p (call) && gimple_call_num_args (call) == 2))
>      gsi_move_before (&gsi_from, &gsi);
>    else
>      {
>        /* For __builtin_c[lt]z* force .C[LT]Z ifn, because only
>          the latter is well defined at zero.  */
> -      call = gimple_build_call_internal (ifn, 1, gimple_call_arg (call, 0));
> +      call = gimple_build_call_internal (ifn, 2, gimple_call_arg (call, 0),
> +                                        build_int_cst (integer_type_node, val));
>        gimple_call_set_lhs (call, lhs);
>        gsi_insert_before (&gsi, call, GSI_SAME_STMT);
>        gsi_remove (&gsi_from, true);
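
A note on the repeated "(int) at_zero != at_zero" test in the phiopt hunks above: as I read it, this is a round-trip check that a HOST_WIDE_INT value is representable as int before it is stored in val.  A minimal standalone sketch of the same idiom, with long long standing in for HOST_WIDE_INT; the name fits_in_int is made up for illustration and does not appear in the patch:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code.  Converting an out-of-range value
   to int is implementation-defined in C; GCC documents it as reduction
   modulo 2^N, so the round trip yields a different value.  */
bool
fits_in_int (long long v)
{
  return (int) v == v;
}

/* Small usage check.  */
void
fits_in_int_demo (void)
{
  assert (fits_in_int (42));
  assert (!fits_in_int ((long long) INT_MAX + 1));
}
```
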
> --- gcc/doc/extend.texi.jj      2023-11-09 09:04:18.823540470 +0100
> +++ gcc/doc/extend.texi 2023-11-09 09:17:40.240182342 +0100
> @@ -14960,6 +14960,42 @@ Similar to @code{__builtin_parity}, exce
>  @code{unsigned long long}.
>  @enddefbuiltin
>
> +@defbuiltin{int __builtin_ffsg (...)}
> +Similar to @code{__builtin_ffs}, except the argument is a type-generic
> +signed integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_clzg (...)}
> +Similar to @code{__builtin_clz}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise) and there is an
> +optional second argument of type @code{int}.  If two arguments are
> +specified and the first argument is 0, the result is the second argument.
> +If only one argument is specified and it is 0, the result is undefined.
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_ctzg (...)}
> +Similar to @code{__builtin_ctz}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise) and there is an
> +optional second argument of type @code{int}.  If two arguments are
> +specified and the first argument is 0, the result is the second argument.
> +If only one argument is specified and it is 0, the result is undefined.
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_clrsbg (...)}
> +Similar to @code{__builtin_clrsb}, except the argument is a type-generic
> +signed integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_popcountg (...)}
> +Similar to @code{__builtin_popcount}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
> +@defbuiltin{int __builtin_parityg (...)}
> +Similar to @code{__builtin_parity}, except the argument is a type-generic
> +unsigned integer (standard, extended or bit-precise).
> +@enddefbuiltin
> +
>  @defbuiltin{double __builtin_powi (double, int)}
>  @defbuiltinx{float __builtin_powif (float, int)}
>  @defbuiltinx{{long double} __builtin_powil (long double, int)}
> --- gcc/c-family/c-common.cc.jj 2023-11-09 09:04:18.409546335 +0100
> +++ gcc/c-family/c-common.cc    2023-11-09 09:17:40.236182399 +0100
> @@ -6475,14 +6475,14 @@ check_builtin_function_arguments (locati
>               }
>           if (TREE_CODE (TREE_TYPE (args[2])) == ENUMERAL_TYPE)
>             {
> -             error_at (ARG_LOCATION (2), "argument 3 in call to function "
> -                       "%qE has enumerated type", fndecl);
> +             error_at (ARG_LOCATION (2), "argument %u in call to function "
> +                       "%qE has enumerated type", 3, fndecl);
>               return false;
>             }
>           else if (TREE_CODE (TREE_TYPE (args[2])) == BOOLEAN_TYPE)
>             {
> -             error_at (ARG_LOCATION (2), "argument 3 in call to function "
> -                       "%qE has boolean type", fndecl);
> +             error_at (ARG_LOCATION (2), "argument %u in call to function "
> +                       "%qE has boolean type", 3, fndecl);
>               return false;
>             }
>           return true;
> @@ -6522,6 +6522,72 @@ check_builtin_function_arguments (locati
>         }
>        return false;
>
> +    case BUILT_IN_CLZG:
> +    case BUILT_IN_CTZG:
> +    case BUILT_IN_CLRSBG:
> +    case BUILT_IN_FFSG:
> +    case BUILT_IN_PARITYG:
> +    case BUILT_IN_POPCOUNTG:
> +      if (nargs == 2
> +         && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CLZG
> +             || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CTZG))
> +       {
> +         if (!INTEGRAL_TYPE_P (TREE_TYPE (args[1])))
> +           {
> +             error_at (ARG_LOCATION (1), "argument %u in call to function "
> +                       "%qE does not have integral type", 2, fndecl);
> +             return false;
> +           }
> +         if ((TYPE_PRECISION (TREE_TYPE (args[1]))
> +              > TYPE_PRECISION (integer_type_node))
> +             || (TYPE_PRECISION (TREE_TYPE (args[1]))
> +                 == TYPE_PRECISION (integer_type_node)
> +                 && TYPE_UNSIGNED (TREE_TYPE (args[1]))))
> +           {
> +             error_at (ARG_LOCATION (1), "argument %u in call to function "
> +                       "%qE does not have %<int%> type", 2, fndecl);
> +             return false;
> +           }
> +       }
> +      else if (!builtin_function_validate_nargs (loc, fndecl, nargs, 1))
> +       return false;
> +
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (args[0])))
> +       {
> +         error_at (ARG_LOCATION (0), "argument %u in call to function "
> +                   "%qE does not have integral type", 1, fndecl);
> +         return false;
> +       }
> +      if (TREE_CODE (TREE_TYPE (args[0])) == ENUMERAL_TYPE)
> +       {
> +         error_at (ARG_LOCATION (0), "argument %u in call to function "
> +                   "%qE has enumerated type", 1, fndecl);
> +         return false;
> +       }
> +      if (TREE_CODE (TREE_TYPE (args[0])) == BOOLEAN_TYPE)
> +       {
> +         error_at (ARG_LOCATION (0), "argument %u in call to function "
> +                   "%qE has boolean type", 1, fndecl);
> +         return false;
> +       }
> +      if (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_FFSG
> +         || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CLRSBG)
> +       {
> +         if (TYPE_UNSIGNED (TREE_TYPE (args[0])))
> +           {
> +             error_at (ARG_LOCATION (0), "argument %u in call to function "
> +                       "%qE has unsigned type", 1, fndecl);
> +             return false;
> +           }
> +       }
> +      else if (!TYPE_UNSIGNED (TREE_TYPE (args[0])))
> +       {
> +         error_at (ARG_LOCATION (0), "argument %u in call to function "
> +                   "%qE has signed type", 1, fndecl);
> +         return false;
> +       }
> +      return true;
> +
>      default:
>        return true;
>      }
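
To make the second-argument rule in the hunk above easier to see at a glance: a type is rejected if it is wider than int, or as wide as int and unsigned, so what remains are the integral types whose values all fit in int.  That matches pr111309-2.c, where 2LL and 2U are diagnosed but true is accepted.  A rough model of the predicate follows; second_arg_fits_int is a made-up name, and sizeof (int) * CHAR_BIT is only an approximation of TYPE_PRECISION that assumes no padding bits:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code: mirrors the precision and
   signedness test applied to the second argument of
   __builtin_c[lt]zg.  */
bool
second_arg_fits_int (unsigned int precision, bool is_unsigned)
{
  const unsigned int int_precision = sizeof (int) * CHAR_BIT;
  if (precision > int_precision)
    return false;	/* e.g. long long where it is wider than int */
  if (precision == int_precision && is_unsigned)
    return false;	/* unsigned int: not all values fit in int */
  return true;		/* int, short, bool, ...  */
}

/* Small usage check.  */
void
second_arg_fits_int_demo (void)
{
  assert (second_arg_fits_int (sizeof (int) * CHAR_BIT, false));
  assert (!second_arg_fits_int (sizeof (int) * CHAR_BIT, true));
}
```
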
> --- gcc/c-family/c-gimplify.cc.jj       2023-11-09 09:03:53.251902730 +0100
> +++ gcc/c-family/c-gimplify.cc  2023-11-09 09:17:40.237182384 +0100
> @@ -818,6 +818,28 @@ c_gimplify_expr (tree *expr_p, gimple_se
>         break;
>        }
>
> +    case CALL_EXPR:
> +      {
> +       tree fndecl = get_callee_fndecl (*expr_p);
> +       if (fndecl
> +           && fndecl_built_in_p (fndecl, BUILT_IN_CLZG, BUILT_IN_CTZG)
> +           && call_expr_nargs (*expr_p) == 2
> +           && TREE_CODE (CALL_EXPR_ARG (*expr_p, 1)) != INTEGER_CST)
> +         {
> +           tree a = save_expr (CALL_EXPR_ARG (*expr_p, 0));
> +           tree c = build_call_expr_loc (EXPR_LOCATION (*expr_p),
> +                                         fndecl, 1, a);
> +           *expr_p = build3_loc (EXPR_LOCATION (*expr_p), COND_EXPR,
> +                                 integer_type_node,
> +                                 build2_loc (EXPR_LOCATION (*expr_p),
> +                                             NE_EXPR, boolean_type_node, a,
> +                                             build_zero_cst (TREE_TYPE (a))),
> +                                 c, CALL_EXPR_ARG (*expr_p, 1));
> +           return GS_OK;
> +         }
> +       break;
> +      }
> +
>      default:;
>      }
>
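
For reference, my reading of the c_gimplify_expr hunk above is that a two-argument call with a non-constant second argument is rewritten into "a != 0 ? one-argument call : b", with save_expr ensuring the first argument is evaluated only once.  A source-level sketch of that shape, written with the existing __builtin_clzll so it builds without the patch; clzll_lowered is a made-up name, not something the patch adds:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical illustration, not from the patch.  */
int
clzll_lowered (unsigned long long a, int b)
{
  /* The temporary plays the role of save_expr: a is read once and
     then used both in the comparison and in the call.  */
  unsigned long long tmp = a;
  return tmp != 0 ? __builtin_clzll (tmp) : b;
}

/* Small usage check.  */
void
clzll_lowered_demo (void)
{
  assert (clzll_lowered (0ULL, 42) == 42);
  assert (clzll_lowered (~0ULL >> 2, -1) == 2);
}
```
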
> --- gcc/c/c-typeck.cc.jj        2023-11-09 09:04:18.537544522 +0100
> +++ gcc/c/c-typeck.cc   2023-11-09 10:57:28.672517220 +0100
> @@ -3560,6 +3560,7 @@ convert_arguments (location_t loc, vec<l
>      && lookup_attribute ("type generic", TYPE_ATTRIBUTES (TREE_TYPE (fundecl)));
>    bool type_generic_remove_excess_precision = false;
>    bool type_generic_overflow_p = false;
> +  bool type_generic_bit_query = false;
>    tree selector;
>
>    /* Change pointer to function to the function itself for
> @@ -3615,6 +3616,17 @@ convert_arguments (location_t loc, vec<l
>             type_generic_overflow_p = true;
>             break;
>
> +         case BUILT_IN_CLZG:
> +         case BUILT_IN_CTZG:
> +         case BUILT_IN_CLRSBG:
> +         case BUILT_IN_FFSG:
> +         case BUILT_IN_PARITYG:
> +         case BUILT_IN_POPCOUNTG:
> +           /* The first argument of these type-generic builtins
> +              should not be promoted.  */
> +           type_generic_bit_query = true;
> +           break;
> +
>           default:
>             break;
>           }
> @@ -3750,11 +3762,13 @@ convert_arguments (location_t loc, vec<l
>             }
>         }
>        else if ((excess_precision && !type_generic)
> -              || (type_generic_overflow_p && parmnum == 2))
> +              || (type_generic_overflow_p && parmnum == 2)
> +              || (type_generic_bit_query && parmnum == 0))
>         /* A "double" argument with excess precision being passed
>            without a prototype or in variable arguments.
>            The last argument of __builtin_*_overflow_p should not be
> -          promoted.  */
> +          promoted, similarly the first argument of
> +          __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g.  */
>         parmval = convert (valtype, val);
>        else if ((invalid_func_diag =
>                 targetm.calls.invalid_arg_for_unprototyped_fn (typelist, fundecl, val)))
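
The reason the first argument must not be promoted shows up with narrow types: the count is taken in the width of the argument's own type, so for unsigned char 1 the testcase expects __CHAR_BIT__ - 1 rather than the int-width result.  A sketch of what that means, emulated with the existing __builtin_clz; clz_uchar is a made-up helper used only for illustration:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical emulation, not from the patch: leading zeros counted in
   the width of unsigned char rather than of the promoted type.
   As with one-argument __builtin_clzg, x must be nonzero.  */
int
clz_uchar (unsigned char x)
{
  const int int_bits = (int) (sizeof (unsigned int) * CHAR_BIT);
  /* Count in unsigned int, then drop the bits added by widening.  */
  return __builtin_clz ((unsigned int) x) - (int_bits - CHAR_BIT);
}

/* Small usage check.  */
void
clz_uchar_demo (void)
{
  assert (clz_uchar (1) == CHAR_BIT - 1);
  assert (clz_uchar ((unsigned char) ~0U) == 0);
}
```
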
> --- gcc/cp/call.cc.jj   2023-11-04 09:02:35.376001531 +0100
> +++ gcc/cp/call.cc      2023-11-09 11:03:06.687737428 +0100
> @@ -9290,7 +9290,9 @@ convert_for_arg_passing (tree type, tree
>     This is true for some builtins which don't act like normal functions.
>     Return 2 if just decay_conversion and removal of excess precision should
>     be done, 1 if just decay_conversion.  Return 3 for special treatment of
> -   the 3rd argument for __builtin_*_overflow_p.  */
> +   the 3rd argument for __builtin_*_overflow_p.  Return 4 for special
> +   treatment of the 1st argument for
> +   __builtin_{clz,ctz,clrsb,ffs,parity,popcount}g.  */
>
>  int
>  magic_varargs_p (tree fn)
> @@ -9317,6 +9319,14 @@ magic_varargs_p (tree fn)
>        case BUILT_IN_FPCLASSIFY:
>         return 2;
>
> +      case BUILT_IN_CLZG:
> +      case BUILT_IN_CTZG:
> +      case BUILT_IN_CLRSBG:
> +      case BUILT_IN_FFSG:
> +      case BUILT_IN_PARITYG:
> +      case BUILT_IN_POPCOUNTG:
> +       return 4;
> +
>        default:
>         return lookup_attribute ("type generic",
>                                  TYPE_ATTRIBUTES (TREE_TYPE (fn))) != 0;
> @@ -10122,7 +10132,7 @@ build_over_call (struct z_candidate *can
>    for (; arg_index < vec_safe_length (args); ++arg_index)
>      {
>        tree a = (*args)[arg_index];
> -      if (magic == 3 && arg_index == 2)
> +      if ((magic == 3 && arg_index == 2) || (magic == 4 && arg_index == 0))
>         {
>           /* Do no conversions for certain magic varargs.  */
>           a = mark_type_use (a);
> --- gcc/cp/cp-gimplify.cc.jj    2023-11-02 07:49:15.839882778 +0100
> +++ gcc/cp/cp-gimplify.cc       2023-11-09 12:11:59.834140462 +0100
> @@ -771,6 +771,10 @@ cp_gimplify_expr (tree *expr_p, gimple_s
>               default:
>                 break;
>               }
> +         else if (decl
> +                  && fndecl_built_in_p (decl, BUILT_IN_CLZG, BUILT_IN_CTZG))
> +           ret = (enum gimplify_status) c_gimplify_expr (expr_p, pre_p,
> +                                                         post_p);
>         }
>        break;
>
> --- gcc/testsuite/c-c++-common/pr111309-1.c.jj  2023-11-09 10:35:28.974541671 +0100
> +++ gcc/testsuite/c-c++-common/pr111309-1.c     2023-11-09 11:54:02.817389761 +0100
> @@ -0,0 +1,470 @@
> +/* PR c/111309 */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__((noipa)) int
> +clzc (unsigned char x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzc2 (unsigned char x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clzs (unsigned short x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzs2 (unsigned short x)
> +{
> +  return __builtin_clzg (x, -2);
> +}
> +
> +__attribute__((noipa)) int
> +clzi (unsigned int x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzi2 (unsigned int x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clzl (unsigned long x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzl2 (unsigned long x)
> +{
> +  return __builtin_clzg (x, -1);
> +}
> +
> +__attribute__((noipa)) int
> +clzL (unsigned long long x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzL2 (unsigned long long x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +clzI (unsigned __int128 x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzI2 (unsigned __int128 x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +ctzc (unsigned char x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzc2 (unsigned char x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzs (unsigned short x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzs2 (unsigned short x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzi (unsigned int x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzi2 (unsigned int x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzl (unsigned long x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzl2 (unsigned long x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctzL (unsigned long long x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzL2 (unsigned long long x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +ctzI (unsigned __int128 x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzI2 (unsigned __int128 x)
> +{
> +  return __builtin_ctzg (x, __SIZEOF_INT128__ * __CHAR_BIT__);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +clrsbc (signed char x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbs (signed short x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbi (signed int x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbl (signed long x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clrsbL (signed long long x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +clrsbI (signed __int128 x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +ffsc (signed char x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffss (signed short x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffsi (signed int x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffsl (signed long x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffsL (signed long long x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +ffsI (signed __int128 x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +parityc (unsigned char x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +paritys (unsigned short x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parityi (unsigned int x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parityl (unsigned long x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parityL (unsigned long long x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +parityI (unsigned __int128 x)
> +{
> +  return __builtin_parityg (x);
> +}
> +#endif
> +
> +__attribute__((noipa)) int
> +popcountc (unsigned char x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcounts (unsigned short x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcounti (unsigned int x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcountl (unsigned long x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcountL (unsigned long long x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +__attribute__((noipa)) int
> +popcountI (unsigned __int128 x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +  if (__builtin_clzg ((unsigned char) 1) != __CHAR_BIT__ - 1
> +      || __builtin_clzg ((unsigned short) 2, -2) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
> +      || __builtin_clzg (0U, 42) != 42
> +      || __builtin_clzg (0U, -1) != -1
> +      || __builtin_clzg (1U) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || __builtin_clzg (2UL, -1) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || __builtin_clzg (5ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_clzg ((unsigned __int128) 9) != __SIZEOF_INT128__ * __CHAR_BIT__ - 4
> +#endif
> +      || __builtin_clzg (~0U, -5) != 0
> +      || __builtin_clzg (~0ULL >> 2) != 2
> +      || __builtin_ctzg ((unsigned char) 1) != 0
> +      || __builtin_ctzg ((unsigned short) 28) != 2
> +      || __builtin_ctzg (0U, 32) != 32
> +      || __builtin_ctzg (0U, -42) != -42
> +      || __builtin_ctzg (1U) != 0
> +      || __builtin_ctzg (16UL, -1) != 4
> +      || __builtin_ctzg (5ULL << 52, 0) != 52
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_ctzg (((unsigned __int128) 9) << 72) != 72
> +#endif
> +      || __builtin_clrsbg ((signed char) 0) != __CHAR_BIT__ - 1
> +      || __builtin_clrsbg ((signed short) -1) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 1
> +      || __builtin_clrsbg (0) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || __builtin_clrsbg (-1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 1
> +      || __builtin_clrsbg (0LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_clrsbg ((__int128) -1) != __SIZEOF_INT128__ * __CHAR_BIT__ - 1
> +#endif
> +      || __builtin_clrsbg (0x1afb) != __SIZEOF_INT__ * __CHAR_BIT__ - 14
> +      || __builtin_clrsbg (-2) != __SIZEOF_INT__ * __CHAR_BIT__ - 2
> +      || __builtin_clrsbg (1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || __builtin_clrsbg (-4LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +      || __builtin_ffsg ((signed char) 0) != 0
> +      || __builtin_ffsg ((signed short) 0) != 0
> +      || __builtin_ffsg (0) != 0
> +      || __builtin_ffsg (0L) != 0
> +      || __builtin_ffsg (0LL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_ffsg ((__int128) 0) != 0
> +#endif
> +      || __builtin_ffsg ((signed char) 4) != 3
> +      || __builtin_ffsg ((signed short) 8) != 4
> +      || __builtin_ffsg (1) != 1
> +      || __builtin_ffsg (2L) != 2
> +      || __builtin_ffsg (28LL) != 3
> +      || __builtin_parityg ((unsigned char) 1) != 1
> +      || __builtin_parityg ((unsigned short) 2) != 1
> +      || __builtin_parityg (0U) != 0
> +      || __builtin_parityg (3U) != 0
> +      || __builtin_parityg (0UL) != 0
> +      || __builtin_parityg (7UL) != 1
> +      || __builtin_parityg (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_parityg ((unsigned __int128) 0) != 0
> +#endif
> +      || __builtin_parityg ((unsigned char) ~0U) != 0
> +      || __builtin_parityg ((unsigned short) ~0U) != 0
> +      || __builtin_parityg (~0U) != 0
> +      || __builtin_parityg (~0UL) != 0
> +      || __builtin_parityg (~0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_parityg (~(unsigned __int128) 0) != 0
> +#endif
> +      || __builtin_popcountg (0U) != 0
> +      || __builtin_popcountg (0UL) != 0
> +      || __builtin_popcountg (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_popcountg ((unsigned __int128) 0) != 0
> +#endif
> +      || __builtin_popcountg ((unsigned char) ~0U) != __CHAR_BIT__
> +      || __builtin_popcountg ((unsigned short) ~0U) != __SIZEOF_SHORT__ * __CHAR_BIT__
> +      || __builtin_popcountg (~0U) != __SIZEOF_INT__ * __CHAR_BIT__
> +      || __builtin_popcountg (~0UL) != __SIZEOF_LONG__ * __CHAR_BIT__
> +      || __builtin_popcountg (~0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_popcountg (~(unsigned __int128) 0) != __SIZEOF_INT128__ * __CHAR_BIT__
> +#endif
> +      || 0)
> +  __builtin_abort ();
> +  if (clzc (1) != __CHAR_BIT__ - 1
> +      || clzs2 (2) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
> +      || clzi2 (0U, 42) != 42
> +      || clzi2 (0U, -1) != -1
> +      || clzi (1U) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || clzl2 (2UL) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || clzL (5ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +#ifdef __SIZEOF_INT128__
> +      || clzI ((unsigned __int128) 9) != __SIZEOF_INT128__ * __CHAR_BIT__ - 4
> +#endif
> +      || clzi2 (~0U, -5) != 0
> +      || clzL (~0ULL >> 2) != 2
> +      || ctzc (1) != 0
> +      || ctzs (28) != 2
> +      || ctzi2 (0U, 32) != 32
> +      || ctzi2 (0U, -42) != -42
> +      || ctzi (1U) != 0
> +      || ctzl2 (16UL, -1) != 4
> +      || ctzL2 (5ULL << 52, 0) != 52
> +#ifdef __SIZEOF_INT128__
> +      || ctzI (((unsigned __int128) 9) << 72) != 72
> +#endif
> +      || clrsbc (0) != __CHAR_BIT__ - 1
> +      || clrsbs (-1) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 1
> +      || clrsbi (0) != __SIZEOF_INT__ * __CHAR_BIT__ - 1
> +      || clrsbl (-1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 1
> +      || clrsbL (0LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 1
> +#ifdef __SIZEOF_INT128__
> +      || clrsbI (-1) != __SIZEOF_INT128__ * __CHAR_BIT__ - 1
> +#endif
> +      || clrsbi (0x1afb) != __SIZEOF_INT__ * __CHAR_BIT__ - 14
> +      || clrsbi (-2) != __SIZEOF_INT__ * __CHAR_BIT__ - 2
> +      || clrsbl (1L) != __SIZEOF_LONG__ * __CHAR_BIT__ - 2
> +      || clrsbL (-4LL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 3
> +      || ffsc (0) != 0
> +      || ffss (0) != 0
> +      || ffsi (0) != 0
> +      || ffsl (0L) != 0
> +      || ffsL (0LL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || ffsI (0) != 0
> +#endif
> +      || ffsc (4) != 3
> +      || ffss (8) != 4
> +      || ffsi (1) != 1
> +      || ffsl (2L) != 2
> +      || ffsL (28LL) != 3
> +      || parityc (1) != 1
> +      || paritys (2) != 1
> +      || parityi (0U) != 0
> +      || parityi (3U) != 0
> +      || parityl (0UL) != 0
> +      || parityl (7UL) != 1
> +      || parityL (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || parityI (0) != 0
> +#endif
> +      || parityc ((unsigned char) ~0U) != 0
> +      || paritys ((unsigned short) ~0U) != 0
> +      || parityi (~0U) != 0
> +      || parityl (~0UL) != 0
> +      || parityL (~0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || parityI (~(unsigned __int128) 0) != 0
> +#endif
> +      || popcounti (0U) != 0
> +      || popcountl (0UL) != 0
> +      || popcountL (0ULL) != 0
> +#ifdef __SIZEOF_INT128__
> +      || popcountI (0) != 0
> +#endif
> +      || popcountc ((unsigned char) ~0U) != __CHAR_BIT__
> +      || popcounts ((unsigned short) ~0U) != __SIZEOF_SHORT__ * __CHAR_BIT__
> +      || popcounti (~0U) != __SIZEOF_INT__ * __CHAR_BIT__
> +      || popcountl (~0UL) != __SIZEOF_LONG__ * __CHAR_BIT__
> +      || popcountL (~0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
> +#ifdef __SIZEOF_INT128__
> +      || popcountI (~(unsigned __int128) 0) != __SIZEOF_INT128__ * __CHAR_BIT__
> +#endif
> +      || 0)
> +  __builtin_abort ();
> +}
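
When checking the expected values in the test above by hand, I found it handy to compare against naive loop implementations.  The following is only a sketch for int and unsigned int; clrsb_ref and parity_ref are made-up names and are not part of the patch or the testsuite:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical reference: number of bits below the sign bit that are
   equal to the sign bit, scanning from the top.  */
int
clrsb_ref (int x)
{
  const int bits = (int) (sizeof (int) * CHAR_BIT);
  unsigned int u = (unsigned int) x;
  unsigned int sign = u >> (bits - 1);
  int n = 0;
  for (int i = bits - 2; i >= 0 && ((u >> i) & 1U) == sign; i--)
    n++;
  return n;
}

/* Hypothetical reference: 1 if an odd number of bits is set.  */
int
parity_ref (unsigned int x)
{
  int p = 0;
  for (; x != 0; x &= x - 1)	/* clear the lowest set bit */
    p ^= 1;
  return p;
}

/* Small usage check against two values used in the test.  */
void
bit_query_ref_demo (void)
{
  assert (clrsb_ref (0x1afb) == (int) (sizeof (int) * CHAR_BIT) - 14);
  assert (parity_ref (7U) == 1);
}
```
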
> --- gcc/testsuite/c-c++-common/pr111309-2.c.jj  2023-11-09 11:33:42.680632470 +0100
> +++ gcc/testsuite/c-c++-common/pr111309-2.c     2023-11-09 12:03:11.062619162 +0100
> @@ -0,0 +1,85 @@
> +/* PR c/111309 */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-std=c99" { target c } } */
> +
> +#ifndef __cplusplus
> +#define bool _Bool
> +#define true ((_Bool) 1)
> +#define false ((_Bool) 0)
> +#endif
> +
> +void
> +foo (void)
> +{
> +  enum E { E0 = 0 };
> +  struct S { int s; } s;
> +  __builtin_clzg ();           /* { dg-error "too few arguments" } */
> +  __builtin_clzg (0U, 1, 2);   /* { dg-error "too many arguments" } */
> +  __builtin_clzg (0);          /* { dg-error "has signed type" } */
> +  __builtin_clzg (0.0);                /* { dg-error "does not have integral type" } */
> +  __builtin_clzg (s);          /* { dg-error "does not have integral type" } */
> +  __builtin_clzg (true);       /* { dg-error "has boolean type" } */
> +  __builtin_clzg (E0);         /* { dg-error "has signed type" "" { target c } } */
> +                               /* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_clzg (0, 0);       /* { dg-error "has signed type" } */
> +  __builtin_clzg (0.0, 0);     /* { dg-error "does not have integral type" } */
> +  __builtin_clzg (s, 0);       /* { dg-error "does not have integral type" } */
> +  __builtin_clzg (true, 0);    /* { dg-error "has boolean type" } */
> +  __builtin_clzg (E0, 0);      /* { dg-error "has signed type" "" { target c } } */
> +                               /* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_clzg (0U, 2.0);    /* { dg-error "does not have integral type" } */
> +  __builtin_clzg (0U, s);      /* { dg-error "does not have integral type" } */
> +  __builtin_clzg (0U, 2LL);    /* { dg-error "does not have 'int' type" } */
> +  __builtin_clzg (0U, 2U);     /* { dg-error "does not have 'int' type" } */
> +  __builtin_clzg (0U, true);
> +  __builtin_clzg (0U, E0);     /* { dg-error "does not have 'int' type" "" { target c++ } } */
> +  __builtin_ctzg ();           /* { dg-error "too few arguments" } */
> +  __builtin_ctzg (0U, 1, 2);   /* { dg-error "too many arguments" } */
> +  __builtin_ctzg (0);          /* { dg-error "has signed type" } */
> +  __builtin_ctzg (0.0);                /* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (s);          /* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (true);       /* { dg-error "has boolean type" } */
> +  __builtin_ctzg (E0);         /* { dg-error "has signed type" "" { target c } } */
> +                               /* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_ctzg (0, 0);       /* { dg-error "has signed type" } */
> +  __builtin_ctzg (0.0, 0);     /* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (s, 0);       /* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (true, 0);    /* { dg-error "has boolean type" } */
> +  __builtin_ctzg (E0, 0);      /* { dg-error "has signed type" "" { target c } } */
> +                               /* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_ctzg (0U, 2.0);    /* { dg-error "does not have integral type" } */
> +  __builtin_ctzg (0U, 2LL);    /* { dg-error "does not have 'int' type" } */
> +  __builtin_ctzg (0U, 2U);     /* { dg-error "does not have 'int' type" } */
> +  __builtin_ctzg (0U, true);
> +  __builtin_ctzg (0U, E0);     /* { dg-error "does not have 'int' type" "" { target c++ } } */
> +  __builtin_clrsbg ();         /* { dg-error "too few arguments" } */
> +  __builtin_clrsbg (0, 1);     /* { dg-error "too many arguments" } */
> +  __builtin_clrsbg (0U);       /* { dg-error "has unsigned type" } */
> +  __builtin_clrsbg (0.0);      /* { dg-error "does not have integral type" } */
> +  __builtin_clrsbg (s);                /* { dg-error "does not have integral type" } */
> +  __builtin_clrsbg (true);     /* { dg-error "has boolean type" } */
> +  __builtin_clrsbg (E0);       /* { dg-error "has enumerated type" "" { target c++ } } */
> +  __builtin_ffsg ();           /* { dg-error "too few arguments" } */
> +  __builtin_ffsg (0, 1);       /* { dg-error "too many arguments" } */
> +  __builtin_ffsg (0U);         /* { dg-error "has unsigned type" } */
> +  __builtin_ffsg (0.0);                /* { dg-error "does not have integral type" } */
> +  __builtin_ffsg (s);          /* { dg-error "does not have integral type" } */
> +  __builtin_ffsg (true);       /* { dg-error "has boolean type" } */
> +  __builtin_ffsg (E0);         /* { dg-error "has enumerated type" "" { target c++ } } */
> +  __builtin_parityg ();                /* { dg-error "too few arguments" } */
> +  __builtin_parityg (0U, 1);   /* { dg-error "too many arguments" } */
> +  __builtin_parityg (0);       /* { dg-error "has signed type" } */
> +  __builtin_parityg (0.0);     /* { dg-error "does not have integral type" } */
> +  __builtin_parityg (s);       /* { dg-error "does not have integral type" } */
> +  __builtin_parityg (true);    /* { dg-error "has boolean type" } */
> +  __builtin_parityg (E0);      /* { dg-error "has signed type" "" { target c } } */
> +                               /* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +  __builtin_popcountg ();      /* { dg-error "too few arguments" } */
> +  __builtin_popcountg (0U, 1); /* { dg-error "too many arguments" } */
> +  __builtin_popcountg (0);     /* { dg-error "has signed type" } */
> +  __builtin_popcountg (0.0);   /* { dg-error "does not have integral type" } */
> +  __builtin_popcountg (s);     /* { dg-error "does not have integral type" } */
> +  __builtin_popcountg (true);  /* { dg-error "has boolean type" } */
> +  __builtin_popcountg (E0);    /* { dg-error "has signed type" "" { target c } } */
> +                               /* { dg-error "has enumerated type" "" { target c++ } .-1 } */
> +}
> --- gcc/testsuite/gcc.dg/torture/bitint-43.c.jj 2023-11-09 09:17:40.233182441 +0100
> +++ gcc/testsuite/gcc.dg/torture/bitint-43.c    2023-11-09 12:16:51.757013390 +0100
> @@ -0,0 +1,306 @@
> +/* PR c/111309 */
> +/* { dg-do run { target bitint } } */
> +/* { dg-options "-std=c2x -pedantic-errors" } */
> +/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
> +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 156
> +__attribute__((noipa)) int
> +clz156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD156 (unsigned _BitInt(156) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD156 (unsigned _BitInt(156) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb156 (_BitInt(156) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs156 (_BitInt(156) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount156 (unsigned _BitInt(156) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +#if __BITINT_MAXWIDTH__ >= 192
> +__attribute__((noipa)) int
> +clz192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD192 (unsigned _BitInt(192) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD192 (unsigned _BitInt(192) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb192 (_BitInt(192) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs192 (_BitInt(192) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount192 (unsigned _BitInt(192) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __BITINT_MAXWIDTH__ >= 156
> +  if (clzd156 (0) != 156
> +      || clzD156 (0, -1) != -1
> +      || ctzd156 (0) != 156
> +      || ctzD156 (0, 42) != 42
> +      || clrsb156 (0) != 156 - 1
> +      || ffs156 (0) != 0
> +      || parity156 (0) != 0
> +      || popcount156 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(156)) 0, 156 + 32) != 156 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(156)) 0, 156) != 156
> +      || __builtin_clrsbg ((_BitInt(156)) 0) != 156 - 1
> +      || __builtin_ffsg ((_BitInt(156)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(156)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(156)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz156 (-1) != 0
> +      || clzd156 (-1) != 0
> +      || clzD156 (-1, 0) != 0
> +      || ctz156 (-1) != 0
> +      || ctzd156 (-1) != 0
> +      || ctzD156 (-1, 17) != 0
> +      || clrsb156 (-1) != 156 - 1
> +      || ffs156 (-1) != 1
> +      || parity156 (-1) != 0
> +      || popcount156 (-1) != 156
> +      || __builtin_clzg ((unsigned _BitInt(156)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(156)) -1, 156 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(156)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(156)) -1, 156) != 0
> +      || __builtin_clrsbg ((_BitInt(156)) -1) != 156 - 1
> +      || __builtin_ffsg ((_BitInt(156)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(156)) -1) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(156)) -1) != 156)
> +    __builtin_abort ();
> +  if (clz156 (((unsigned _BitInt(156)) -1) >> 24) != 24
> +      || clz156 (((unsigned _BitInt(156)) -1) >> 79) != 79
> +      || clz156 (1) != 156 - 1
> +      || clzd156 (((unsigned _BitInt(156)) -1) >> 139) != 139
> +      || clzd156 (2) != 156 - 2
> +      || ctz156 (((unsigned _BitInt(156)) -1) << 42) != 42
> +      || ctz156 (((unsigned _BitInt(156)) -1) << 57) != 57
> +      || ctz156 (0x4000000000000000000000uwb) != 86
> +      || ctzd156 (((unsigned _BitInt(156)) -1) << 149) != 149
> +      || ctzd156 (2) != 1
> +      || clrsb156 ((unsigned _BitInt(156 - 4)) -1) != 3
> +      || clrsb156 ((unsigned _BitInt(156 - 28)) -1) != 27
> +      || clrsb156 ((unsigned _BitInt(156 - 29)) -1) != 28
> +      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 68)) -1) != 67
> +      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 92)) -1) != 91
> +      || clrsb156 (~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 93)) -1) != 92
> +      || ffs156 (((unsigned _BitInt(156)) -1) << 42) != 43
> +      || ffs156 (((unsigned _BitInt(156)) -1) << 57) != 58
> +      || ffs156 (0x4000000000000000000000uwb) != 87
> +      || ffs156 (((unsigned _BitInt(156)) -1) << 149) != 150
> +      || ffs156 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(156)) 1) != 156 - 1
> +      || __builtin_clzg (((unsigned _BitInt(156)) -1) >> 139, 156) != 139
> +      || __builtin_clzg ((unsigned _BitInt(156)) 2, 156) != 156 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(156)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(156)) -1) << 149, 156) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(156)) 2, 156) != 1
> +      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(156)) (unsigned _BitInt(156 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(156)) ~(unsigned _BitInt(156)) (unsigned _BitInt(156 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(156)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(156)) (((unsigned _BitInt(156)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(156)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity156 (23008250258685373142923325827291949461178444434uwb) != __builtin_parityg (23008250258685373142923325827291949461178444434uwb)
> +      || parity156 (41771568792516301628132437740665810252917251244uwb) != __builtin_parityg (41771568792516301628132437740665810252917251244uwb)
> +      || parity156 (5107402473866766219120283991834936835726115452uwb) != __builtin_parityg (5107402473866766219120283991834936835726115452uwb)
> +      || popcount156 (50353291748276374580944955711958129678996395562uwb) != __builtin_popcountg (50353291748276374580944955711958129678996395562uwb)
> +      || popcount156 (29091263616891212550063067166307725491211684496uwb) != __builtin_popcountg (29091263616891212550063067166307725491211684496uwb)
> +      || popcount156 (64973284306583205619384799873110935608793072026uwb) != __builtin_popcountg (64973284306583205619384799873110935608793072026uwb))
> +    __builtin_abort ();
> +#endif
> +#if __BITINT_MAXWIDTH__ >= 192
> +  if (clzd192 (0) != 192
> +      || clzD192 (0, 42) != 42
> +      || ctzd192 (0) != 192
> +      || ctzD192 (0, -1) != -1
> +      || clrsb192 (0) != 192 - 1
> +      || ffs192 (0) != 0
> +      || parity192 (0) != 0
> +      || popcount192 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(192)) 0, 192 + 32) != 192 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(192)) 0, 192) != 192
> +      || __builtin_clrsbg ((_BitInt(192)) 0) != 192 - 1
> +      || __builtin_ffsg ((_BitInt(192)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(192)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(192)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz192 (-1) != 0
> +      || clzd192 (-1) != 0
> +      || clzD192 (-1, 15) != 0
> +      || ctz192 (-1) != 0
> +      || ctzd192 (-1) != 0
> +      || ctzD192 (-1, -57) != 0
> +      || clrsb192 (-1) != 192 - 1
> +      || ffs192 (-1) != 1
> +      || parity192 (-1) != 0
> +      || popcount192 (-1) != 192
> +      || __builtin_clzg ((unsigned _BitInt(192)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(192)) -1, 192 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(192)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(192)) -1, 192) != 0
> +      || __builtin_clrsbg ((_BitInt(192)) -1) != 192 - 1
> +      || __builtin_ffsg ((_BitInt(192)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(192)) -1) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(192)) -1) != 192)
> +    __builtin_abort ();
> +  if (clz192 (((unsigned _BitInt(192)) -1) >> 24) != 24
> +      || clz192 (((unsigned _BitInt(192)) -1) >> 79) != 79
> +      || clz192 (1) != 192 - 1
> +      || clzd192 (((unsigned _BitInt(192)) -1) >> 139) != 139
> +      || clzd192 (2) != 192 - 2
> +      || ctz192 (((unsigned _BitInt(192)) -1) << 42) != 42
> +      || ctz192 (((unsigned _BitInt(192)) -1) << 57) != 57
> +      || ctz192 (0x4000000000000000000000uwb) != 86
> +      || ctzd192 (((unsigned _BitInt(192)) -1) << 149) != 149
> +      || ctzd192 (2) != 1
> +      || clrsb192 ((unsigned _BitInt(192 - 4)) -1) != 3
> +      || clrsb192 ((unsigned _BitInt(192 - 28)) -1) != 27
> +      || clrsb192 ((unsigned _BitInt(192 - 29)) -1) != 28
> +      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 68)) -1) != 67
> +      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 92)) -1) != 91
> +      || clrsb192 (~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 93)) -1) != 92
> +      || ffs192 (((unsigned _BitInt(192)) -1) << 42) != 43
> +      || ffs192 (((unsigned _BitInt(192)) -1) << 57) != 58
> +      || ffs192 (0x4000000000000000000000uwb) != 87
> +      || ffs192 (((unsigned _BitInt(192)) -1) << 149) != 150
> +      || ffs192 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(192)) 1) != 192 - 1
> +      || __builtin_clzg (((unsigned _BitInt(192)) -1) >> 139, 192) != 139
> +      || __builtin_clzg ((unsigned _BitInt(192)) 2, 192) != 192 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(192)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(192)) -1) << 149, 192) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(192)) 2, 192) != 1
> +      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(192)) (unsigned _BitInt(192 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(192)) ~(unsigned _BitInt(192)) (unsigned _BitInt(192 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(192)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(192)) (((unsigned _BitInt(192)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(192)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity192 (4692147078159863499615754634965484598760535154638668598762uwb) != __builtin_parityg (4692147078159863499615754634965484598760535154638668598762uwb)
> +      || parity192 (1669461228546917627909935444501097256112222796898845183538uwb) != __builtin_parityg (1669461228546917627909935444501097256112222796898845183538uwb)
> +      || parity192 (5107402473866766219120283991834936835726115452uwb) != __builtin_parityg (5107402473866766219120283991834936835726115452uwb)
> +      || popcount192 (4033871057575185619108386380181511734118888391160164588976uwb) != __builtin_popcountg (4033871057575185619108386380181511734118888391160164588976uwb)
> +      || popcount192 (58124766715713711628758119849579188845074973856704521119uwb) != __builtin_popcountg (58124766715713711628758119849579188845074973856704521119uwb)
> +      || popcount192 (289948065236269174335700831610076764076947650072787325852uwb) != __builtin_popcountg (289948065236269174335700831610076764076947650072787325852uwb))
> +    __builtin_abort ();
> +#endif
> +}
> --- gcc/testsuite/gcc.dg/torture/bitint-44.c.jj 2023-11-09 09:17:40.232182455 +0100
> +++ gcc/testsuite/gcc.dg/torture/bitint-44.c    2023-11-09 12:21:32.376046129 +0100
> @@ -0,0 +1,306 @@
> +/* PR c/111309 */
> +/* { dg-do run { target bitint } } */
> +/* { dg-options "-std=c2x -pedantic-errors" } */
> +/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
> +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 512
> +__attribute__((noipa)) int
> +clz512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD512 (unsigned _BitInt(512) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD512 (unsigned _BitInt(512) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb512 (_BitInt(512) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs512 (_BitInt(512) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount512 (unsigned _BitInt(512) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +#if __BITINT_MAXWIDTH__ >= 523
> +__attribute__((noipa)) int
> +clz523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_clzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +clzd523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_clzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +clzD523 (unsigned _BitInt(523) x, int y)
> +{
> +  return __builtin_clzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +ctz523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_ctzg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ctzd523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_ctzg (x, __builtin_popcountg ((typeof (x)) ~(typeof (x)) 0));
> +}
> +
> +__attribute__((noipa)) int
> +ctzD523 (unsigned _BitInt(523) x, int y)
> +{
> +  return __builtin_ctzg (x, y);
> +}
> +
> +__attribute__((noipa)) int
> +clrsb523 (_BitInt(523) x)
> +{
> +  return __builtin_clrsbg (x);
> +}
> +
> +__attribute__((noipa)) int
> +ffs523 (_BitInt(523) x)
> +{
> +  return __builtin_ffsg (x);
> +}
> +
> +__attribute__((noipa)) int
> +parity523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_parityg (x);
> +}
> +
> +__attribute__((noipa)) int
> +popcount523 (unsigned _BitInt(523) x)
> +{
> +  return __builtin_popcountg (x);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __BITINT_MAXWIDTH__ >= 512
> +  if (clzd512 (0) != 512
> +      || clzD512 (0, -1) != -1
> +      || ctzd512 (0) != 512
> +      || ctzD512 (0, 42) != 42
> +      || clrsb512 (0) != 512 - 1
> +      || ffs512 (0) != 0
> +      || parity512 (0) != 0
> +      || popcount512 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(512)) 0, 512 + 32) != 512 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(512)) 0, 512) != 512
> +      || __builtin_clrsbg ((_BitInt(512)) 0) != 512 - 1
> +      || __builtin_ffsg ((_BitInt(512)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(512)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(512)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz512 (-1) != 0
> +      || clzd512 (-1) != 0
> +      || clzD512 (-1, 0) != 0
> +      || ctz512 (-1) != 0
> +      || ctzd512 (-1) != 0
> +      || ctzD512 (-1, 17) != 0
> +      || clrsb512 (-1) != 512 - 1
> +      || ffs512 (-1) != 1
> +      || parity512 (-1) != 0
> +      || popcount512 (-1) != 512
> +      || __builtin_clzg ((unsigned _BitInt(512)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(512)) -1, 512 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(512)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(512)) -1, 512) != 0
> +      || __builtin_clrsbg ((_BitInt(512)) -1) != 512 - 1
> +      || __builtin_ffsg ((_BitInt(512)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(512)) -1) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(512)) -1) != 512)
> +    __builtin_abort ();
> +  if (clz512 (((unsigned _BitInt(512)) -1) >> 24) != 24
> +      || clz512 (((unsigned _BitInt(512)) -1) >> 79) != 79
> +      || clz512 (1) != 512 - 1
> +      || clzd512 (((unsigned _BitInt(512)) -1) >> 139) != 139
> +      || clzd512 (2) != 512 - 2
> +      || ctz512 (((unsigned _BitInt(512)) -1) << 42) != 42
> +      || ctz512 (((unsigned _BitInt(512)) -1) << 57) != 57
> +      || ctz512 (0x4000000000000000000000uwb) != 86
> +      || ctzd512 (((unsigned _BitInt(512)) -1) << 149) != 149
> +      || ctzd512 (2) != 1
> +      || clrsb512 ((unsigned _BitInt(512 - 4)) -1) != 3
> +      || clrsb512 ((unsigned _BitInt(512 - 28)) -1) != 27
> +      || clrsb512 ((unsigned _BitInt(512 - 29)) -1) != 28
> +      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 68)) -1) != 67
> +      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 92)) -1) != 91
> +      || clrsb512 (~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 93)) -1) != 92
> +      || ffs512 (((unsigned _BitInt(512)) -1) << 42) != 43
> +      || ffs512 (((unsigned _BitInt(512)) -1) << 57) != 58
> +      || ffs512 (0x4000000000000000000000uwb) != 87
> +      || ffs512 (((unsigned _BitInt(512)) -1) << 149) != 150
> +      || ffs512 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(512)) 1) != 512 - 1
> +      || __builtin_clzg (((unsigned _BitInt(512)) -1) >> 139, 512) != 139
> +      || __builtin_clzg ((unsigned _BitInt(512)) 2, 512) != 512 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(512)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(512)) -1) << 149, 512) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(512)) 2, 512) != 1
> +      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(512)) (unsigned _BitInt(512 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(512)) ~(unsigned _BitInt(512)) (unsigned _BitInt(512 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(512)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(512)) (((unsigned _BitInt(512)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(512)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity512 (8278593062772967967574644592392030907507244457324713380127157444008480135136016412791369421272159911061801023217823646324038055629840240503699995274750141uwb) != __builtin_parityg (8278593062772967967574644592392030907507244457324713380127157444008480135136016412791369421272159911061801023217823646324038055629840240503699995274750141uwb)
> +      || parity512 (663951521760319802637316646127146913163123967584512032007606686578544864655291546789196279408181546344880831465704154822174055168766759305688225967189384uwb) != __builtin_parityg (663951521760319802637316646127146913163123967584512032007606686578544864655291546789196279408181546344880831465704154822174055168766759305688225967189384uwb)
> +      || parity512 (8114152627481936575035564712656624361256533214211179387274127464949371919139038942819974113641465089580051998523156404968195970853124179018281296621919217uwb) != __builtin_parityg (8114152627481936575035564712656624361256533214211179387274127464949371919139038942819974113641465089580051998523156404968195970853124179018281296621919217uwb)
> +      || popcount512 (697171368046392901434470580443928282938585745214587494987284546386421344865289735592202298494880955572094546861862007016154025065165834164941207378563932uwb) != __builtin_popcountg (697171368046392901434470580443928282938585745214587494987284546386421344865289735592202298494880955572094546861862007016154025065165834164941207378563932uwb)
> +      || popcount512 (12625357869391866487124235043239209385173615631331705015179232007319637649427586947822360147798041278948617160703315666047585702906648747835331939389354450uwb) != __builtin_popcountg (12625357869391866487124235043239209385173615631331705015179232007319637649427586947822360147798041278948617160703315666047585702906648747835331939389354450uwb)
> +      || popcount512 (12989863959706456104163426941303698078341934896544520782734564901708926112239778316241786242633862403309192697330635825122310265805838908726925342761646021uwb) != __builtin_popcountg (12989863959706456104163426941303698078341934896544520782734564901708926112239778316241786242633862403309192697330635825122310265805838908726925342761646021uwb))
> +    __builtin_abort ();
> +#endif
> +#if __BITINT_MAXWIDTH__ >= 523
> +  if (clzd523 (0) != 523
> +      || clzD523 (0, 42) != 42
> +      || ctzd523 (0) != 523
> +      || ctzD523 (0, -1) != -1
> +      || clrsb523 (0) != 523 - 1
> +      || ffs523 (0) != 0
> +      || parity523 (0) != 0
> +      || popcount523 (0) != 0
> +      || __builtin_clzg ((unsigned _BitInt(523)) 0, 523 + 32) != 523 + 32
> +      || __builtin_ctzg ((unsigned _BitInt(523)) 0, 523) != 523
> +      || __builtin_clrsbg ((_BitInt(523)) 0) != 523 - 1
> +      || __builtin_ffsg ((_BitInt(523)) 0) != 0
> +      || __builtin_parityg ((unsigned _BitInt(523)) 0) != 0
> +      || __builtin_popcountg ((unsigned _BitInt(523)) 0) != 0)
> +    __builtin_abort ();
> +  if (clz523 (-1) != 0
> +      || clzd523 (-1) != 0
> +      || clzD523 (-1, 15) != 0
> +      || ctz523 (-1) != 0
> +      || ctzd523 (-1) != 0
> +      || ctzD523 (-1, -57) != 0
> +      || clrsb523 (-1) != 523 - 1
> +      || ffs523 (-1) != 1
> +      || parity523 (-1) != 1
> +      || popcount523 (-1) != 523
> +      || __builtin_clzg ((unsigned _BitInt(523)) -1) != 0
> +      || __builtin_clzg ((unsigned _BitInt(523)) -1, 523 + 32) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(523)) -1) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(523)) -1, 523) != 0
> +      || __builtin_clrsbg ((_BitInt(523)) -1) != 523 - 1
> +      || __builtin_ffsg ((_BitInt(523)) -1) != 1
> +      || __builtin_parityg ((unsigned _BitInt(523)) -1) != 1
> +      || __builtin_popcountg ((unsigned _BitInt(523)) -1) != 523)
> +    __builtin_abort ();
> +  if (clz523 (((unsigned _BitInt(523)) -1) >> 24) != 24
> +      || clz523 (((unsigned _BitInt(523)) -1) >> 79) != 79
> +      || clz523 (1) != 523 - 1
> +      || clzd523 (((unsigned _BitInt(523)) -1) >> 139) != 139
> +      || clzd523 (2) != 523 - 2
> +      || ctz523 (((unsigned _BitInt(523)) -1) << 42) != 42
> +      || ctz523 (((unsigned _BitInt(523)) -1) << 57) != 57
> +      || ctz523 (0x4000000000000000000000uwb) != 86
> +      || ctzd523 (((unsigned _BitInt(523)) -1) << 149) != 149
> +      || ctzd523 (2) != 1
> +      || clrsb523 ((unsigned _BitInt(523 - 4)) -1) != 3
> +      || clrsb523 ((unsigned _BitInt(523 - 28)) -1) != 27
> +      || clrsb523 ((unsigned _BitInt(523 - 29)) -1) != 28
> +      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 68)) -1) != 67
> +      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 92)) -1) != 91
> +      || clrsb523 (~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 93)) -1) != 92
> +      || ffs523 (((unsigned _BitInt(523)) -1) << 42) != 43
> +      || ffs523 (((unsigned _BitInt(523)) -1) << 57) != 58
> +      || ffs523 (0x4000000000000000000000uwb) != 87
> +      || ffs523 (((unsigned _BitInt(523)) -1) << 149) != 150
> +      || ffs523 (2) != 2
> +      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 24) != 24
> +      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 79) != 79
> +      || __builtin_clzg ((unsigned _BitInt(523)) 1) != 523 - 1
> +      || __builtin_clzg (((unsigned _BitInt(523)) -1) >> 139, 523) != 139
> +      || __builtin_clzg ((unsigned _BitInt(523)) 2, 523) != 523 - 2
> +      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 42) != 42
> +      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 57) != 57
> +      || __builtin_ctzg ((unsigned _BitInt(523)) 0x4000000000000000000000uwb) != 86
> +      || __builtin_ctzg (((unsigned _BitInt(523)) -1) << 149, 523) != 149
> +      || __builtin_ctzg ((unsigned _BitInt(523)) 2, 523) != 1
> +      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 4)) -1) != 3
> +      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 28)) -1) != 27
> +      || __builtin_clrsbg ((_BitInt(523)) (unsigned _BitInt(523 - 29)) -1) != 28
> +      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 68)) -1) != 67
> +      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 92)) -1) != 91
> +      || __builtin_clrsbg ((_BitInt(523)) ~(unsigned _BitInt(523)) (unsigned _BitInt(523 - 93)) -1) != 92
> +      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 42)) != 43
> +      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 57)) != 58
> +      || __builtin_ffsg ((_BitInt(523)) 0x4000000000000000000000uwb) != 87
> +      || __builtin_ffsg ((_BitInt(523)) (((unsigned _BitInt(523)) -1) << 149)) != 150
> +      || __builtin_ffsg ((_BitInt(523)) 2) != 2)
> +    __builtin_abort ();
> +  if (parity523 (14226628251091586975416900831427560438504550751597528218770815297642064445318137709184907300499591292677456563377096100346699421879373024906380724757049700104uwb) != __builtin_parityg (14226628251091586975416900831427560438504550751597528218770815297642064445318137709184907300499591292677456563377096100346699421879373024906380724757049700104uwb)
> +      || parity523 (20688958227123188226117538663818621034852702121556301239818743230005799574164516085541310491875153692467123662601853835357822935286851364843928714141587045255uwb) != __builtin_parityg (20688958227123188226117538663818621034852702121556301239818743230005799574164516085541310491875153692467123662601853835357822935286851364843928714141587045255uwb)
> +      || parity523 (8927708174664018648856542263215989788443763271738485875573765922613438023117960552135374015673598803453205044464280019640319125968982118836809392169156450404uwb) != __builtin_parityg (8927708174664018648856542263215989788443763271738485875573765922613438023117960552135374015673598803453205044464280019640319125968982118836809392169156450404uwb)
> +      || popcount523 (27178327344587654457581274852432957423537947348354896748701960885269035920194935311522194372418922852798513401240689173265979378157685169921449935364246334672uwb) != __builtin_popcountg (27178327344587654457581274852432957423537947348354896748701960885269035920194935311522194372418922852798513401240689173265979378157685169921449935364246334672uwb)
> +      || popcount523 (5307736750284212829931201546806718535860789684371772688568780952567669490917265125893664418036905110148872995350655890585853451175740907670080602411287166989uwb) != __builtin_popcountg (5307736750284212829931201546806718535860789684371772688568780952567669490917265125893664418036905110148872995350655890585853451175740907670080602411287166989uwb)
> +      || popcount523 (21261096432069432668470452941790780841888331284195411465624030283325239673941548816191698556934198698768393659379577567450765073013688585051560340496749593370uwb) != __builtin_popcountg (21261096432069432668470452941790780841888331284195411465624030283325239673941548816191698556934198698768393659379577567450765073013688585051560340496749593370uwb))
> +    __builtin_abort ();
> +#endif
> +}
>
>         Jakub
>


* Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]
  2023-12-16  5:51 ` Andrew Pinski
@ 2023-12-16  8:36   ` Jakub Jelinek
  0 siblings, 0 replies; 10+ messages in thread
From: Jakub Jelinek @ 2023-12-16  8:36 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Joseph S. Myers, Richard Biener, Jason Merrill, gcc-patches

On Fri, Dec 15, 2023 at 09:51:10PM -0800, Andrew Pinski wrote:
> I was looking into improving __builtin_popcountg for __int128 on
> aarch64 (when CSSC is not implemented which right now is almost all
> cores) but this patch forces __builtin_popcountg to expand into 2
> __builtin_popcountll (and add) before it could optimize into an
> internal function for the popcount and give the backend a chance
> to implement something better.
> This is due to the code in fold_builtin_bit_query; what might be the
> best way of disabling that for this case?
> 
> Basically right now popcount is implemented using the SIMD instruction
> cnt which can be used either 8x1 or 16x1 wide. Using the 16x1 improves
> both the code size and performance (on almost all cores I know of). So
> instead of 2 cnt instructions, we only would need one.

The reason for lowering those 2 * wordsize cases early is that there
is no __builtin_{clz,ctz,clrsb,ffs,parity,popcount}* for those cases (so we
can't expect expansion to fall back to, say, libgcc routines) and
IFN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT} are still direct optab ifns
(now with the extension that large/huge _BitInt operands are ok for those,
because we are guaranteed to lower them during bitint lowering).
Anything else won't make it through the direct optab checks and won't be
guaranteed to expand.
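
For illustration only, the effect of that early lowering on a 128-bit
popcount can be sketched in plain C as below.  This is not the code that
fold_builtin_bit_query emits; popcount128_split is a hypothetical helper
name, and the sketch assumes a target with unsigned __int128 and a 64-bit
unsigned long long.

```c
/* Hypothetical helper (not part of GCC): popcount of a 128-bit value
   computed as two 64-bit popcounts plus an add, mirroring the
   "2 __builtin_popcountll (and add)" shape described in the thread.
   Assumes unsigned __int128 exists and unsigned long long is 64 bits.  */
static int
popcount128_split (unsigned __int128 x)
{
  unsigned long long lo = (unsigned long long) x;          /* low 64 bits */
  unsigned long long hi = (unsigned long long) (x >> 64);  /* high 64 bits */
  return __builtin_popcountll (hi) + __builtin_popcountll (lo);
}
```

A backend that defines its own popcount optab for the wide mode could
instead handle the whole 128-bit operand in one pattern, which is what
the next paragraph suggests.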

You can always define optabs for those and handle them in md files if it
results in better code.

	Jakub




Thread overview: 10+ messages
2023-11-09 15:02 [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309] Jakub Jelinek
2023-11-09 21:43 ` Joseph Myers
2023-11-10  8:09 ` Richard Biener
2023-11-10  9:10   ` Jakub Jelinek
2023-11-10  9:19     ` Richard Biener
2023-11-10  9:44       ` Jakub Jelinek
2023-11-11  8:18         ` Jakub Jelinek
2023-11-13 23:45     ` Joseph Myers
2023-12-16  5:51 ` Andrew Pinski
2023-12-16  8:36   ` Jakub Jelinek
