public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* speeding up parts of gcc by using count_leading_zero (long)
@ 2003-01-05  6:16 Andrew Pinski
  2003-01-05  7:03 ` Zack Weinberg
  2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
  0 siblings, 2 replies; 12+ messages in thread
From: Andrew Pinski @ 2003-01-05  6:16 UTC (permalink / raw)
  To: gcc; +Cc: Andrew Pinski, apinski

There are parts of gcc which can be sped up on PPC (and other 
architectures which have something similar)  by using the instruction 
`cntlz{w,d}' (count leading zero word, double word [PPC64 only]).

Also the parts of gcc on other architectures which do not have count 
leading zero can be sped up by instead of using a for loop, a series of 
if statements:
  register int bit_num = 0;
   if (word & 0xFFFF0000)
     word >>= 16;
   else
     bit_num += 16;
   if (word & 0xFF00)
     word >>= 8;
   else
     bit_num += 8;
   if (word & 0xF0)
     word >>= 4;
   else
     bit_num += 4;
   if (word & 0xc)
     word >>= 2;
   else
     bit_num += 2;
   if (word & 0x2)
     word >>= 1;
   else
     bit_num += 1;
   if ((word & 0x1) == 0)
     bit_num += 1;
and bit_num is the result.

The functions which could benefit from this are:
function					file
_________________________________
sbitmap_first_set_bit		sbitmap.c
sbitmap_last_set_bit		sbitmap.c
bitmap_first_set_bit		bitmap.c			(this already does a series of if 
statements)
bitmap_last_set_bit		bitmap.c			(this already does a series of if 
statements)
floor_log2_wide			toplev.c
exact_log2_wide			toplev.c
compute_inverse			ggc-page.c
build_mask64_2_operands		config/rs6000/rs6000.c

This one only effects libgcc
count_leading_zero			longlong.h

also this one from libiberty
ffs						ffs.c


There might be others but these would found by doing a grep for >> and 
looking at for a counting the bits that not set until getting one that 
is set SIZEOF_INT - cntlzw (word&-word) which is the same as ffs or 
looping until the word is zero which is the same as SIZEOF_INT - cntlzw 
(word) - 1.

Where should I put the mechanism for the cntlz for HOST_WIDE_INT, int, 
HOST_WIDEST_INT (in hwint.h?) and what should I call the functions, 
count_leading_zero_wide_int, count_leading_zero_int, and 
count_leading_zero_widest_int?

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using count_leading_zero (long)
  2003-01-05  6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
@ 2003-01-05  7:03 ` Zack Weinberg
  2003-01-05  7:17   ` Peter Barada
  2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
  1 sibling, 1 reply; 12+ messages in thread
From: Zack Weinberg @ 2003-01-05  7:03 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> There are parts of gcc which can be sped up on PPC (and other
> architectures which have something similar)  by using the instruction
> `cntlz{w,d}' (count leading zero word, double word [PPC64 only]).

The right thing here is to change this code to use ffs() which is
already recognized and optimized by GCC and other compilers.  Put a
generic implementation of this primitive in libiberty, since it's not
part of C89.

Also, you missed ggc_alloc in ggc-page.c (which is a much more
important routine to optimize than compute_inverse).

Tangentially, ffs takes an int, which is 32 bits on all supported
hosts.  It would make sense to define __builtin_ffs32() and
__builtin_ffs64() to nail down the sizes.  ffs64 can be implemented
efficiently on machines with only a 32-bit ffs instruction, as

         ffs    high, r
         test   r
         bnz    0f
         ffs    low, r
0f:

so it is useful to provide both of them always.

zw

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using count_leading_zero (long)
  2003-01-05  7:03 ` Zack Weinberg
@ 2003-01-05  7:17   ` Peter Barada
  2003-01-05  8:41     ` Zack Weinberg
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Barada @ 2003-01-05  7:17 UTC (permalink / raw)
  To: zack; +Cc: pinskia, gcc, apinski


>Tangentially, ffs takes an int, which is 32 bits on all supported
>hosts.  It would make sense to define __builtin_ffs32() and
>__builtin_ffs64() to nail down the sizes.  ffs64 can be implemented
>efficiently on machines with only a 32-bit ffs instruction, as
>
>         ffs    high, r
>         test   r
>         bnz    0f
>         ffs    low, r
>0f:
>
>so it is useful to provide both of them always.

If ffs returns zero if its arg is zero, and then 1 to 32 (assuming ffs
returns 1 if the msb is set), then you forgot to add 32 if the high
word is zero(unless the lower word is zero) so the result is 0 to 64:

         ffs    high, r
         test   r
         bnz    0f
         ffs    low, r
	 test	r
	 bz	0f
	 add	32, r
0f:

-- 
Peter Barada
peter@baradas.org

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using count_leading_zero (long)
  2003-01-05  7:17   ` Peter Barada
@ 2003-01-05  8:41     ` Zack Weinberg
  0 siblings, 0 replies; 12+ messages in thread
From: Zack Weinberg @ 2003-01-05  8:41 UTC (permalink / raw)
  To: Peter Barada; +Cc: pinskia, gcc, apinski

Peter Barada <peter@baradas.org> writes:

> If ffs returns zero if its arg is zero, and then 1 to 32 (assuming ffs
> returns 1 if the msb is set), then you forgot to add 32 if the high
> word is zero(unless the lower word is zero) so the result is 0 to 64:
>
>          ffs    high, r
>          test   r
>          bnz    0f
>          ffs    low, r
> 	 test	r
> 	 bz	0f
> 	 add	32, r
> 0f:

Yes, you're right.  Sorry about that.

zw

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-05  6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
  2003-01-05  7:03 ` Zack Weinberg
@ 2003-01-06 10:51 ` Andrew Pinski
  2003-01-06 18:08   ` Zack Weinberg
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2003-01-06 10:51 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc, apinski

[-- Attachment #1: Type: text/plain, Size: 1414 bytes --]

A follow of the comments I received, I have only right now added the 
simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and 
HOST_BITS_PER_INT == HOST_BITS_PER_LONG.  I might implement the other 
part of the patch for when HOST_BITS_PER_WIDE_INT == 
HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT == 
HOST_BITS_PER_LONG but this might take some time because I have to add 
some more builtins but it will clean things up.

bootstraped on powerpc-apple-darwin6.3, also bootstrapped and run then 
testsuite on i386-unknown-openbsd3.1.
Also changed need_64bit_hwint on darwin's case back to needing it and 
bootstraped there also.

ChangeLog:
2003-01-06	Andrew Pinski <pinskia@physics.uc.edu>
	* bitmap.c: (bitmap_first_set_bit) Remove comment about ffs
	since libiberty already includes it. When
	HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT, use ffs.
	genattrtab.c: (encode_units_mask) Use ffs.
	ggc-page.c: (ggc_alloc) Use ffs when HOST_BITS_PER_INT == 
HOST_BITS_PER_LONG.
	(compute_inverse) Use ffs. (ggc_recalculate_in_use_p) Use ffs when 
HOST_BITS_PER_INT == HOST_BITS_PER_LONG
	toplev.c.c: (exact_log2_wide) Use ffs when HOST_BITS_PER_WIDE_INT == 
HOST_BITS_PER_INT.
	Hide log when HOST_BITS_PER_WIDE_INT != HOST_BITS_PER_INT.
	config/rs6000/rs6000.c: (extract_MB) Use ffs when HOST_BITS_PER_INT == 
HOST_BITS_PER_LONG.
	(extract_ME) Use ffs when HOST_BITS_PER_INT == HOST_BITS_PER_LONG.

Patch:

[-- Attachment #2: ffs.patch --]
[-- Type: application/octet-stream, Size: 4252 bytes --]

Index: bitmap.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/bitmap.c,v
retrieving revision 1.39
diff -u -d -b -w -u -r1.39 bitmap.c
--- bitmap.c	16 Dec 2002 18:18:59 -0000	1.39
+++ bitmap.c	6 Jan 2003 08:45:26 -0000
@@ -453,7 +453,10 @@
 #endif
 
   /* Binary search for the first set bit.  */
-  /* ??? It'd be nice to know if ffs or ffsl was available.  */
+
+#if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT
+  bit_num = ffs (word) - 1;
+#else
 
   bit_num = 0;
   word = word & -word;
@@ -475,6 +478,7 @@
     bit_num += 2;
   if (word & 0xaa)
     bit_num += 1;
+#endif
 
   return (ptr->indx * BITMAP_ELEMENT_ALL_BITS
 	  + word_num * HOST_BITS_PER_WIDE_INT
Index: genattrtab.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genattrtab.c,v
retrieving revision 1.121
diff -u -d -b -w -u -r1.121 genattrtab.c
--- genattrtab.c	16 Dec 2002 18:19:32 -0000	1.121
+++ genattrtab.c	6 Jan 2003 08:45:27 -0000
@@ -2270,8 +2270,7 @@
 	abort ();
       else if (i != 0 && i == (i & -i))
 	/* Only one bit is set, so yield that unit number.  */
-	for (j = 0; (i >>= 1) != 0; j++)
-	  ;
+        j = ffs (i) - 1;
       else
 	j = ~i;
       return attr_rtx (CONST_STRING, attr_printf (MAX_DIGITS, "%d", j));
Index: ggc-page.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ggc-page.c,v
retrieving revision 1.58
diff -u -d -b -w -u -r1.58 ggc-page.c
--- ggc-page.c	16 Dec 2002 18:19:36 -0000	1.58
+++ ggc-page.c	6 Jan 2003 08:45:27 -0000
@@ -928,8 +928,12 @@
 	  word = bit = 0;
 	  while (~entry->in_use_p[word] == 0)
 	    ++word;
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+	  bit = ffs (~(entry->in_use_p[word])) - 1;
+#else
 	  while ((entry->in_use_p[word] >> bit) & 1)
 	    ++bit;
+#endif
 	  hint = word * HOST_BITS_PER_LONG + bit;
 	}
 
@@ -1100,12 +1104,8 @@
     }
 
   size = OBJECT_SIZE (order);
-  e = 0;
-  while (size % 2 == 0)
-    {
-      e++;
-      size >>= 1;
-    }
+  e = ffs (size) - 1;
+  size >>= e;
 
   inv = size;
   while (inv * size != 1)
@@ -1241,9 +1241,14 @@
       /* Something is in use if it is marked, or if it was in use in a
 	 context further down the context stack.  */
       p->in_use_p[i] |= p->save_in_use_p[i];
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+      j = p->in_use_p[i] >>  (ffs (p->in_use_p[i]) - 1);
+#else
+      j = p->in_use_p[i];
+#endif
 
       /* Decrement the free object count for every object allocated.  */
-      for (j = p->in_use_p[i]; j; j >>= 1)
+      for (; j; j >>= 1)
 	p->num_free_objects -= (j & 1);
     }
 
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.693
diff -u -d -b -w -u -r1.693 toplev.c
--- toplev.c	24 Dec 2002 08:30:32 -0000	1.693
+++ toplev.c	6 Jan 2003 08:45:27 -0000
@@ -1661,13 +1661,19 @@
 exact_log2_wide (x)
      unsigned HOST_WIDE_INT x;
 {
+#if HOST_BITS_PER_WIDE_INT != HOST_BITS_PER_INT
   int log = 0;
+#endif
   /* Test for 0 or a power of 2.  */
   if (x == 0 || x != (x & -x))
     return -1;
+#if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT
+  return ffs (x) - 1;
+#else
   while ((x >>= 1) != 0)
     log++;
   return log;
+#endif
 }
 
 /* Given X, an unsigned number, return the largest int Y such that 2**Y <= X.
Index: config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.407
diff -u -d -b -w -u -r1.407 rs6000.c
--- config/rs6000/rs6000.c	3 Jan 2003 23:09:33 -0000	1.407
+++ config/rs6000/rs6000.c	6 Jan 2003 08:45:28 -0000
@@ -7331,9 +7331,13 @@
 
   /* Otherwise we have a wrap-around mask.  Look for the first 0 bit
      from the right.  */
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+  i = HOST_BITS_PER_LONG - ffs (~val) + 1;
+#else
   i = 31;
   while (((val >>= 1) & 1) != 0)
     --i;
+#endif
 
   return i;
 }
@@ -7352,9 +7356,13 @@
       if ((val & 0xffffffff) == 0)
 	abort ();
 
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+      i = HOST_BITS_PER_LONG - ffs (val);
+#else
       i = 30;
       while (((val >>= 1) & 1) == 0)
 	--i;
+#endif
 
       return i;
     }

[-- Attachment #3: Type: text/plain, Size: 147 bytes --]


	
Thanks,
Andrew Pinski
apinski@apple.com
pinskia@physics.uc.edu

PS. Sorry for the messy ChangeLog, I am still trying to get a hang of 
this. :)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
@ 2003-01-06 18:08   ` Zack Weinberg
  2003-01-06 18:28     ` Joseph S. Myers
  2003-01-08  9:51     ` Andrew Pinski
  0 siblings, 2 replies; 12+ messages in thread
From: Zack Weinberg @ 2003-01-06 18:08 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> A follow of the comments I received, I have only right now added the
> simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
> HOST_BITS_PER_INT == HOST_BITS_PER_LONG.  I might implement the other
> part of the patch for when HOST_BITS_PER_WIDE_INT ==
> HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
> HOST_BITS_PER_LONG but this might take some time because I have to add
> some more builtins but it will clean things up.

I'm not enthusiastic about all the #ifdefs.

glibc provides ffsl and ffsll which take 'long' and 'long long'
respectively; may I suggest that you follow these steps:

1) put ffsl and ffsll into libiberty.
2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
4) create __builtin_ffsl and __builtin_ffsll

zw

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-06 18:08   ` Zack Weinberg
@ 2003-01-06 18:28     ` Joseph S. Myers
  2003-01-08  9:51     ` Andrew Pinski
  1 sibling, 0 replies; 12+ messages in thread
From: Joseph S. Myers @ 2003-01-06 18:28 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Andrew Pinski, gcc, apinski

On Mon, 6 Jan 2003, Zack Weinberg wrote:

> glibc provides ffsl and ffsll which take 'long' and 'long long'
> respectively; may I suggest that you follow these steps:
> 
> 1) put ffsl and ffsll into libiberty.
> 2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
> 3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
> 4) create __builtin_ffsl and __builtin_ffsll

If any of the uses are genuinely performance-critical (as shown by
profiling), the fallback version of ffs (and ffsl, ffsll) in libiberty
could also be speeded up by replacing it with glibc's generic C
implementation.

-- 
Joseph S. Myers
jsm28@cam.ac.uk

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-06 18:08   ` Zack Weinberg
  2003-01-06 18:28     ` Joseph S. Myers
@ 2003-01-08  9:51     ` Andrew Pinski
  2003-01-08 11:20       ` Falk Hueffner
  2003-01-09  0:24       ` Richard Henderson
  1 sibling, 2 replies; 12+ messages in thread
From: Andrew Pinski @ 2003-01-08  9:51 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski, gcc, apinski

Here is the first patch for using ffs{,l,ll} in gcc, will be posting 
the rest by Monday.
This adds ffsl and ffsll to libiberty, and also adds the ffs's to the 
libiberty.a.

Thanks,
Andrew Pinski
apinski@apple.com
pinskia@physics.uc.edu



On Monday, Jan 6, 2003, at 09:49 US/Pacific, Zack Weinberg wrote:

> Andrew Pinski <pinskia@physics.uc.edu> writes:
>
>> A follow of the comments I received, I have only right now added the
>> simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
>> HOST_BITS_PER_INT == HOST_BITS_PER_LONG.  I might implement the other
>> part of the patch for when HOST_BITS_PER_WIDE_INT ==
>> HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
>> HOST_BITS_PER_LONG but this might take some time because I have to add
>> some more builtins but it will clean things up.
>
> I'm not enthusiastic about all the #ifdefs.
>
> glibc provides ffsl and ffsll which take 'long' and 'long long'
> respectively; may I suggest that you follow these steps:
>
> 1) put ffsl and ffsll into libiberty.
> 2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
> 3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
> 4) create __builtin_ffsl and __builtin_ffsll
>
> zw



ChangeLog:
2003-01-07	Andrew Pinski <pinskia@physics.uc.edu>
	* Makefile.in (ffsll.o): special rule for long long warning.
	(CFILES): add ffsl.c and ffsll.c. (CONFIGURED_OFILES): add
	ffsl.o and ffsll.o. (NEEDED): add ffs, ffsl, and ffsll.
	* aclocal.m4 include ../config/accross.m4 for AC_C_BIGENDIAN_CROSS,
   and AC_COMPILE_CHECK_SIZEOF. (libiberty_AC_C_LONG_LONG) copied from
   gcc's gcc_AC_C_LONG_LONG.
	* configure.in (AC_C_BIGENDIAN_CROSS) use it.
   (libiberty_AC_C_LONG_LONG) use it. (AC_COMPILE_CHECK_SIZEOF) use it.
   Probe the size of int, long, and long long/__int64 if we have them.
   Check for ffsl and ffsll.
	* configure regenerate.
   (LIB_AC_PROG_CC) add ac_libiberty_warn_long_long_cflags.
	* config.in regenerate.
	* ffsl.c new file.
	* ffsll.c new file.

Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/Makefile.in,v
retrieving revision 1.78
diff -u -d -b -w -b -B -u -r1.78 Makefile.in
--- Makefile.in	22 Nov 2002 20:01:07 -0000	1.78
+++ Makefile.in	8 Jan 2003 07:32:39 -0000
@@ -122,15 +122,22 @@
  	else true; fi
  	$(COMPILE.c) $< $(OUTPUT_OPTION)

+#special case ffsll.o because of extra warnings will show up other wise
+ffsll.o: ffsll.c
+	if [ x"$(PICFLAG)" != x ]; then \
+	  $(COMPILE.c) @ac_libiberty_warn_long_long_cflags@  $(PICFLAG) $< -o 
pic/$@; \
+	else true; fi
+	$(COMPILE.c) @ac_libiberty_warn_long_long_cflags@ $< $(OUTPUT_OPTION)
+
  # NOTE: If you add new files to the library, add them to this list
  # (alphabetical), and add them to REQUIRED_OFILES, or
  # CONFIGURED_OFILES and funcs in configure.in.
  CFILES = alloca.c argv.c asprintf.c atexit.c				\
  	basename.c bcmp.c bcopy.c bsearch.c bzero.c			\
-	calloc.c choose-temp.c clock.c concat.c cp-demangle.c		\
-	 cplus-dem.c							\
+	calloc.c choose-temp.c clock.c concat.c 			\
+	 cp-demangle.c cplus-dem.c					\
  	dyn-string.c							\
-	fdmatch.c ffs.c fibheap.c floatformat.c fnmatch.c		\
+	fdmatch.c ffs.c ffsl.c ffsll.c fibheap.c floatformat.c fnmatch.c\
  	getcwd.c getopt.c getopt1.c getpagesize.c getpwd.c getruntime.c	\
  	hashtab.c hex.c							\
  	index.c insque.c						\
@@ -176,7 +183,7 @@
  	basename.o bcmp.o bcopy.o bsearch.o bzero.o			\
  	calloc.o clock.o copysign.o					\
  	_doprnt.o							\
-	ffs.o								\
+	ffs.o ffsl.o ffsll.o						\
  	getcwd.o getpagesize.o						\
  	index.o insque.o						\
  	memchr.o memcmp.o memcpy.o memmove.o memset.o mkstemps.o	\
@@ -287,7 +294,7 @@
  # can't use anything encumbering.
  NEEDED = atexit calloc memchr memcmp memcpy memmove memset rename 
strchr \
  	 strerror strncmp strrchr strstr strtol strtoul tmpnam vfprintf 
vprintf \
-	 vfork waitpid bcmp bcopy bzero
+	 vfork waitpid bcmp bcopy bzero ffs ffsl ffsll
  needed-list: Makefile
  	rm -f needed-list; touch needed-list; \
  	for f in $(NEEDED); do \
Index: aclocal.m4
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/aclocal.m4,v
retrieving revision 1.5
diff -u -d -b -w -b -B -u -r1.5 aclocal.m4
--- aclocal.m4	31 Dec 2001 23:23:49 -0000	1.5
+++ aclocal.m4	8 Jan 2003 07:32:39 -0000
@@ -1,3 +1,5 @@
+sinclude(../config/accross.m4)
+
  dnl See whether strncmp reads past the end of its string parameters.
  dnl On some versions of SunOS4 at least, strncmp reads a word at a time
  dnl but erroneously reads past the end of strings.  This can cause
@@ -107,6 +109,7 @@
  if test $ac_cv_prog_gcc = yes; then
    GCC=yes
    ac_libiberty_warn_cflags='-W -Wall -Wtraditional -pedantic'
+  ac_libiberty_warn_long_long_cflags='-Wno-long-long'
  dnl Check whether -g works, even if CFLAGS is set, in case the package
  dnl plays around with CFLAGS (such as to build both debugging and
  dnl normal versions of a library), tasteless as that idea is.
@@ -124,9 +127,11 @@
  else
    GCC=
    ac_libiberty_warn_cflags=
+  ac_libiberty_warn_long_long_cflags=
    test "${CFLAGS+set}" = set || CFLAGS="-g"
  fi
  AC_SUBST(ac_libiberty_warn_cflags)
+AC_SUBST(ac_libiberty_warn_long_long_cflags)
  ])

  # Work around a bug in autoheader.  This can go away when we switch to
@@ -188,4 +193,27 @@
          STACK_DIRECTION > 0 => grows toward higher addresses
          STACK_DIRECTION < 0 => grows toward lower addresses
          STACK_DIRECTION = 0 => direction of growth unknown])
+])
+
+dnl Checking for long long.
+dnl By Caolan McNamara <caolan@skynet.ie>
+dnl Added check for __int64, Zack Weinberg <zackw@stanford.edu>
+dnl
+AC_DEFUN([libiberty_AC_C_LONG_LONG],
+[AC_CACHE_CHECK(for long long int, ac_cv_c_long_long,
+  [AC_TRY_COMPILE(,[long long int i;],
+         ac_cv_c_long_long=yes,
+         ac_cv_c_long_long=no)])
+  if test $ac_cv_c_long_long = yes; then
+    AC_DEFINE(HAVE_LONG_LONG, 1,
+      [Define if your compiler supports the \`long long' type.])
+  fi
+AC_CACHE_CHECK(for __int64, ac_cv_c___int64,
+  [AC_TRY_COMPILE(,[__int64 i;],
+	ac_cv_c___int64=yes,
+	ac_cv_c___int64=no)])
+  if test $ac_cv_c___int64 = yes; then
+    AC_DEFINE(HAVE___INT64, 1,
+      [Define if your compiler supports the \`__int64' type.])
+  fi
  ])
Index: configure.in
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/configure.in,v
retrieving revision 1.52
diff -u -d -b -w -b -B -u -r1.52 configure.in
--- configure.in	1 Jul 2002 05:38:50 -0000	1.52
+++ configure.in	8 Jan 2003 07:32:39 -0000
@@ -145,6 +145,9 @@
  AC_HEADER_SYS_WAIT
  AC_HEADER_TIME

+
+AC_C_BIGENDIAN_CROSS
+
  libiberty_AC_DECLARE_ERRNO

  AC_CHECK_TYPE(uintptr_t, unsigned long)
@@ -154,6 +158,18 @@
    AC_DEFINE(HAVE_UINTPTR_T, 1, [Define if you have the \`uintptr_t' 
type.])
  fi

+libiberty_AC_C_LONG_LONG
+
+AC_COMPILE_CHECK_SIZEOF(int)
+AC_COMPILE_CHECK_SIZEOF(long)
+
+if test $ac_cv_c_long_long = yes; then
+  AC_COMPILE_CHECK_SIZEOF(long long)
+fi
+if test $ac_cv_c___int64 = yes; then
+  AC_COMPILE_CHECK_SIZEOF(__int64)
+fi
+
  AC_TYPE_PID_T

  # This is the list of functions which libiberty will provide if they
@@ -200,6 +216,14 @@
  funcs="$funcs vprintf"
  funcs="$funcs vsprintf"
  funcs="$funcs waitpid"
+funcs="$funcs ffsl"
+
+if test $ac_cv_c_long_long = yes; then
+    funcs="$funcs ffsll"
+fi
+if test $ac_cv_c___int64 = yes; then
+    funcs="$funcs ffsll"
+fi

  # Also in the old function.def file: alloca, vfork, getopt.

@@ -216,7 +240,7 @@
    AC_CHECK_FUNCS(strcasecmp setenv strchr strdup strncasecmp strrchr 
strstr)
    AC_CHECK_FUNCS(strtod strtol strtoul tmpnam vasprintf vfprintf 
vprintf)
    AC_CHECK_FUNCS(vsprintf waitpid getrusage on_exit psignal strerror 
strsignal)
-  AC_CHECK_FUNCS(sysconf times sbrk gettimeofday ffs)
+  AC_CHECK_FUNCS(sysconf times sbrk gettimeofday ffs ffsl ffsll)
    AC_DEFINE(HAVE_SYS_ERRLIST, 1, [Define if you have the sys_errlist 
variable.])
    AC_DEFINE(HAVE_SYS_NERR,    1, [Define if you have the sys_nerr 
variable.])
    AC_DEFINE(HAVE_SYS_SIGLIST, 1, [Define if you have the sys_siglist 
variable.])
Index: ffsl.c
===================================================================
RCS file: ffsl.c
diff -N ffsl.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ ffsl.c	8 Jan 2003 07:32:39 -0000
@@ -0,0 +1,31 @@
+/* ffsl -- Find the first bit set in the parameter
+
+@deftypefn Supplemental int ffsl (long @var{valu})
+
+Find the first (least significant) bit set in @var{valu}.  Bits are
+numbered from right to left, starting with bit 1 (corresponding to the
+value 1).  If @var{valu} is zero, zero is returned.
+
+@end deftypefn
+
+*/
+
+#include "config.h"
+
+int
+ffsl (valu)
+  register long valu;
+{
+#if SIZEOF_LONG == SIZEOF_INT
+  extern int ffs();
+  return ffs(valu);
+#else
+#if SIZEOF_LONG == SIZEOF_LONG_LONG || SIZEOF_LONG == SIZEOF___INT64
+  extern int ffs();
+  return ffsll(valu);
+#else
+  #error Do not know what size long is.
+#endif
+#endif
+}
+
Index: ffsll.c
===================================================================
RCS file: ffsll.c
diff -N ffsll.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ ffsll.c	8 Jan 2003 07:32:39 -0000
@@ -0,0 +1,75 @@
+/* ffsll -- Find the first bit set in the parameter
+
+@deftypefn Supplemental int ffsll (int @var{valu})
+
+Find the first (least significant) bit set in @var{valu}.  Bits are
+numbered from right to left, starting with bit 1 (corresponding to the
+value 1).  If @var{valu} is zero, zero is returned.
+
+@end deftypefn
+
+*/
+
+#include "config.h"
+
+#if defined(HAVE_LONG_LONG)
+#define NEED_FFSLL
+typedef long long ll;
+#define SIZEOF_LL SIZEOF_LONG_LONG
+
+#else
+#if defined(HAVE___INT64)
+#define NEED_FFSLL
+typedef __int64 ll;
+#define SIZEOF_LL SIZEOF___INT64
+
+#endif
+#endif
+
+#if defined(NEED_FFSLL)
+
+union ll_ints {
+  ll longlongs;
+  struct {
+#if defined(WORDS_BIGENDIAN)
+    int hi;
+    int low;
+#else
+    int low;
+    int hi;
+#endif
+  } ints;
+};
+
+int ffs ();
+
+int
+ffsll (valu)
+  register ll valu;
+{
+#if SIZEOF_LL == SIZEOF_INT*2
+  ll x  = valu & -valu;
+  union ll_ints temp;
+  union ll_ints temp1;
+  int add = 0;
+  int word;
+  temp.longlongs = valu;
+  temp1.longlongs = x;
+  if(temp1.ints.hi!=0)
+    word = temp.ints.low;
+  else
+    word = temp.ints.hi, add = 32;
+  return add + ffs (word);
+#else
+ int bit_num = 0;
+ if(valu==0)
+   return 0;
+ while((valu&1)!=0)
+   bit_num ++, valu >>= 1;
+
+  return bit_num;
+#endif
+
+}
+
+#endif

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-08  9:51     ` Andrew Pinski
@ 2003-01-08 11:20       ` Falk Hueffner
  2003-01-08 11:21         ` Andrew Pinski
  2003-01-09  0:24       ` Richard Henderson
  1 sibling, 1 reply; 12+ messages in thread
From: Falk Hueffner @ 2003-01-08 11:20 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> +  extern int ffs();

Function declarators with empty parentheses are an obsolescent feature
in C99, so I think they're better to be avoided.

> +#else
> + int bit_num = 0;
> + if(valu==0)
> +   return 0;
> + while((valu&1)!=0)
> +   bit_num ++, valu >>= 1;
> +
> +  return bit_num;
> +#endif

This looks bogus to me.

-- 
	Falk

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-08 11:20       ` Falk Hueffner
@ 2003-01-08 11:21         ` Andrew Pinski
  2003-01-08 11:26           ` Falk Hueffner
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2003-01-08 11:21 UTC (permalink / raw)
  To: Falk Hueffner; +Cc: Andrew Pinski, gcc-patches, gcc, apinski


On Wednesday, Jan 8, 2003, at 00:22 US/Pacific, Falk Hueffner wrote:

> Andrew Pinski <pinskia@physics.uc.edu> writes:
>
>> +  extern int ffs();
>
> Function declarators with empty parentheses are an obsolescent feature
> in C99, so I think they're better to be avoided.

Well parts of gcc are still written in K&R C, this is one.

Thanks,
Andrew Pinski

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-08 11:21         ` Andrew Pinski
@ 2003-01-08 11:26           ` Falk Hueffner
  0 siblings, 0 replies; 12+ messages in thread
From: Falk Hueffner @ 2003-01-08 11:26 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> On Wednesday, Jan 8, 2003, at 00:22 US/Pacific, Falk Hueffner wrote:
> > Andrew Pinski <pinskia@physics.uc.edu> writes:
> >
> >> +  extern int ffs();
> >
> > Function declarators with empty parentheses are an obsolescent feature
> > in C99, so I think they're better to be avoided.
> 
> Well parts of gcc are still written in K&R C, this is one.

Hmm, didn't think of that. Perhaps you could uses PARAMS(()).

-- 
	Falk

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: speeding up parts of gcc by using ffs
  2003-01-08  9:51     ` Andrew Pinski
  2003-01-08 11:20       ` Falk Hueffner
@ 2003-01-09  0:24       ` Richard Henderson
  1 sibling, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2003-01-09  0:24 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski

On Tue, Jan 07, 2003 at 11:52:00PM -0800, Andrew Pinski wrote:
> +  if(temp1.ints.hi!=0)
> +    word = temp.ints.low;

This doesn't implement ffs.  You want

	if (temp1.ints.lo)
	  word = temp1.ints.lo;

And, really, I suspect that you don't want a union
at all.  Using a >> 32 should be good enough and it
won't punish 64-bit hosts.


r~

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2003-01-08 22:46 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-01-05  6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
2003-01-05  7:03 ` Zack Weinberg
2003-01-05  7:17   ` Peter Barada
2003-01-05  8:41     ` Zack Weinberg
2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
2003-01-06 18:08   ` Zack Weinberg
2003-01-06 18:28     ` Joseph S. Myers
2003-01-08  9:51     ` Andrew Pinski
2003-01-08 11:20       ` Falk Hueffner
2003-01-08 11:21         ` Andrew Pinski
2003-01-08 11:26           ` Falk Hueffner
2003-01-09  0:24       ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).