* speeding up parts of gcc by using count_leading_zero (long)
@ 2003-01-05  6:16 Andrew Pinski
  2003-01-05  7:03 ` Zack Weinberg
  2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
  0 siblings, 2 replies; 12+ messages in thread
From: Andrew Pinski @ 2003-01-05  6:16 UTC
To: gcc; +Cc: Andrew Pinski, apinski

There are parts of gcc which can be sped up on PPC (and other
architectures which have something similar) by using the instruction
`cntlz{w,d}' (count leading zero word, double word [PPC64 only]).

The parts of gcc on other architectures which do not have count
leading zero can also be sped up by using, instead of a for loop, a
series of if statements:

  register int bit_num = 0;
  if (word & 0xFFFF0000) word >>= 16; else bit_num += 16;
  if (word & 0xFF00) word >>= 8; else bit_num += 8;
  if (word & 0xF0) word >>= 4; else bit_num += 4;
  if (word & 0xc) word >>= 2; else bit_num += 2;
  if (word & 0x2) word >>= 1; else bit_num += 1;
  if ((word & 0x1) == 0) bit_num += 1;

and bit_num is the result.

The functions which could benefit from this are:

function                    file
_________________________________
sbitmap_first_set_bit       sbitmap.c
sbitmap_last_set_bit        sbitmap.c
bitmap_first_set_bit        bitmap.c (this already does a series of if statements)
bitmap_last_set_bit         bitmap.c (this already does a series of if statements)
floor_log2_wide             toplev.c
exact_log2_wide             toplev.c
compute_inverse             ggc-page.c
build_mask64_2_operands     config/rs6000/rs6000.c

This one only affects libgcc:
count_leading_zero          longlong.h

Also this one from libiberty:
ffs                         ffs.c

There might be others, but these would be found by doing a grep for >>
and looking for a loop that either counts the bits that are not set
until getting one that is set, which is
SIZEOF_INT - cntlzw (word&-word) and the same as ffs, or loops until
the word is zero, which is the same as SIZEOF_INT - cntlzw (word) - 1.

Where should I put the mechanism for the cntlz for HOST_WIDE_INT, int,
HOST_WIDEST_INT (in hwint.h?) and what should I call the functions,
count_leading_zero_wide_int, count_leading_zero_int, and
count_leading_zero_widest_int?

Thanks,
Andrew Pinski
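
[Editorial sketch, not part of the original message.] The if-statement
sequence and the two identities mentioned above can be tried out as
follows. This is a minimal sketch assuming a 32-bit word; the names
clz32_sketch, ffs32_sketch and floor_log2_32_sketch are hypothetical and
are not the names proposed for gcc.

```c
#include <stdint.h>

/* Hypothetical helper: count the leading zero bits of a 32-bit word
   with the series of if statements from the message.  Returns 32 for
   a zero word.  */
static int
clz32_sketch (uint32_t word)
{
  int bit_num = 0;
  if (word & 0xFFFF0000u) word >>= 16; else bit_num += 16;
  if (word & 0xFF00u) word >>= 8; else bit_num += 8;
  if (word & 0xF0u) word >>= 4; else bit_num += 4;
  if (word & 0xCu) word >>= 2; else bit_num += 2;
  if (word & 0x2u) word >>= 1; else bit_num += 1;
  if ((word & 0x1u) == 0) bit_num += 1;
  return bit_num;
}

/* ffs expressed through clz: isolate the lowest set bit first.
   Returns 0 for a zero word, otherwise 1..32.  */
static int
ffs32_sketch (uint32_t word)
{
  return 32 - clz32_sketch (word & (0u - word));
}

/* floor (log2 (word)), or -1 for a zero word.  */
static int
floor_log2_32_sketch (uint32_t word)
{
  return 32 - clz32_sketch (word) - 1;
}
```

On a host with a count-leading-zero instruction, only clz32_sketch
would need a machine-specific replacement; the other two helpers are
written purely in terms of it.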
* Re: speeding up parts of gcc by using count_leading_zero (long)
  2003-01-05  6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
@ 2003-01-05  7:03 ` Zack Weinberg
  2003-01-05  7:17   ` Peter Barada
  2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
  1 sibling, 1 reply; 12+ messages in thread
From: Zack Weinberg @ 2003-01-05  7:03 UTC
To: Andrew Pinski; +Cc: gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> There are parts of gcc which can be sped up on PPC (and other
> architectures which have something similar) by using the instruction
> `cntlz{w,d}' (count leading zero word, double word [PPC64 only]).

The right thing here is to change this code to use ffs() which is
already recognized and optimized by GCC and other compilers.  Put a
generic implementation of this primitive in libiberty, since it's not
part of C89.

Also, you missed ggc_alloc in ggc-page.c (which is a much more
important routine to optimize than compute_inverse).

Tangentially, ffs takes an int, which is 32 bits on all supported
hosts.  It would make sense to define __builtin_ffs32() and
__builtin_ffs64() to nail down the sizes.  ffs64 can be implemented
efficiently on machines with only a 32-bit ffs instruction, as

        ffs high, r
        test r
        bnz 0f
        ffs low, r
0f:

so it is useful to provide both of them always.

zw
* Re: speeding up parts of gcc by using count_leading_zero (long)
  2003-01-05  7:03 ` Zack Weinberg
@ 2003-01-05  7:17   ` Peter Barada
  2003-01-05  8:41     ` Zack Weinberg
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Barada @ 2003-01-05  7:17 UTC
To: zack; +Cc: pinskia, gcc, apinski

>Tangentially, ffs takes an int, which is 32 bits on all supported
>hosts.  It would make sense to define __builtin_ffs32() and
>__builtin_ffs64() to nail down the sizes.  ffs64 can be implemented
>efficiently on machines with only a 32-bit ffs instruction, as
>
>        ffs high, r
>        test r
>        bnz 0f
>        ffs low, r
>0f:
>
>so it is useful to provide both of them always.

If ffs returns zero if its arg is zero, and then 1 to 32 (assuming ffs
returns 1 if the msb is set), then you forgot to add 32 if the high
word is zero (unless the lower word is zero) so the result is 0 to 64:

        ffs high, r
        test r
        bnz 0f
        ffs low, r
        test r
        bz 0f
        add 32, r
0f:

--
Peter Barada
peter@baradas.org
* Re: speeding up parts of gcc by using count_leading_zero (long)
  2003-01-05  7:17   ` Peter Barada
@ 2003-01-05  8:41     ` Zack Weinberg
  0 siblings, 0 replies; 12+ messages in thread
From: Zack Weinberg @ 2003-01-05  8:41 UTC
To: Peter Barada; +Cc: pinskia, gcc, apinski

Peter Barada <peter@baradas.org> writes:

> If ffs returns zero if its arg is zero, and then 1 to 32 (assuming ffs
> returns 1 if the msb is set), then you forgot to add 32 if the high
> word is zero (unless the lower word is zero) so the result is 0 to 64:
>
>         ffs high, r
>         test r
>         bnz 0f
>         ffs low, r
>         test r
>         bz 0f
>         add 32, r
> 0f:

Yes, you're right.  Sorry about that.

zw
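
[Editorial sketch, not part of the original messages.] The same
composition can be written in C. Note the assumption: this sketch uses
the POSIX numbering for ffs, where bit 1 is the least significant bit,
so the low word is examined first and 32 is added when the answer
comes from the high word. That differs from the bit numbering assumed
in the assembly sequences above. The names ffs32_sketch and
ffs64_sketch are hypothetical.

```c
#include <stdint.h>

/* Hypothetical 32-bit ffs with POSIX numbering: 0 when no bit is set,
   otherwise 1 (least significant bit) through 32.  */
static int
ffs32_sketch (uint32_t word)
{
  int bit_num = 1;
  if (word == 0)
    return 0;
  while ((word & 1) == 0)
    {
      word >>= 1;
      bit_num++;
    }
  return bit_num;
}

/* Hypothetical 64-bit ffs built from two 32-bit calls.  The result
   ranges over 0..64.  */
static int
ffs64_sketch (uint64_t value)
{
  uint32_t low = (uint32_t) value;
  uint32_t high = (uint32_t) (value >> 32);

  if (low != 0)
    return ffs32_sketch (low);
  if (high != 0)
    return 32 + ffs32_sketch (high);
  return 0;
}
```

The split uses a plain shift rather than a union, so it does not
depend on the byte order of the host.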
* Re: speeding up parts of gcc by using ffs
  2003-01-05  6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
  2003-01-05  7:03 ` Zack Weinberg
@ 2003-01-06 10:51 ` Andrew Pinski
  2003-01-06 18:08   ` Zack Weinberg
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2003-01-06 10:51 UTC
To: Andrew Pinski; +Cc: gcc, apinski

[-- Attachment #1: Type: text/plain, Size: 1414 bytes --]

A follow of the comments I received, I have only right now added the
simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
HOST_BITS_PER_INT == HOST_BITS_PER_LONG.  I might implement the other
part of the patch for when HOST_BITS_PER_WIDE_INT ==
HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
HOST_BITS_PER_LONG but this might take some time because I have to add
some more builtins but it will clean things up.

Bootstrapped on powerpc-apple-darwin6.3, also bootstrapped and ran the
testsuite on i386-unknown-openbsd3.1.  Also changed need_64bit_hwint on
darwin's case back to needing it and bootstrapped there also.

ChangeLog:

2003-01-06  Andrew Pinski  <pinskia@physics.uc.edu>

        * bitmap.c: (bitmap_first_set_bit) Remove comment about ffs since
        libiberty already includes it.  When HOST_BITS_PER_WIDE_INT ==
        HOST_BITS_PER_INT, use ffs.
        genattrtab.c: (encode_units_mask) Use ffs.
        ggc-page.c: (ggc_alloc) Use ffs when HOST_BITS_PER_INT ==
        HOST_BITS_PER_LONG.
        (compute_inverse) Use ffs.
        (ggc_recalculate_in_use_p) Use ffs when HOST_BITS_PER_INT ==
        HOST_BITS_PER_LONG
        toplev.c: (exact_log2_wide) Use ffs when HOST_BITS_PER_WIDE_INT ==
        HOST_BITS_PER_INT.  Hide log when HOST_BITS_PER_WIDE_INT !=
        HOST_BITS_PER_INT.
        config/rs6000/rs6000.c: (extract_MB) Use ffs when
        HOST_BITS_PER_INT == HOST_BITS_PER_LONG.
        (extract_ME) Use ffs when HOST_BITS_PER_INT == HOST_BITS_PER_LONG.

Patch:

[-- Attachment #2: ffs.patch --]
[-- Type: application/octet-stream, Size: 4252 bytes --]

Index: bitmap.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/bitmap.c,v
retrieving revision 1.39
diff -u -d -b -w -u -r1.39 bitmap.c
--- bitmap.c    16 Dec 2002 18:18:59 -0000      1.39
+++ bitmap.c    6 Jan 2003 08:45:26 -0000
@@ -453,7 +453,10 @@
 #endif
 
   /* Binary search for the first set bit.  */
-  /* ??? It'd be nice to know if ffs or ffsl was available.  */
+
+#if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT
+  bit_num = ffs (word) - 1;
+#else
 
   bit_num = 0;
   word = word & -word;
@@ -475,6 +478,7 @@
     bit_num += 2;
   if (word & 0xaa)
     bit_num += 1;
+#endif
 
   return (ptr->indx * BITMAP_ELEMENT_ALL_BITS
           + word_num * HOST_BITS_PER_WIDE_INT
Index: genattrtab.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genattrtab.c,v
retrieving revision 1.121
diff -u -d -b -w -u -r1.121 genattrtab.c
--- genattrtab.c        16 Dec 2002 18:19:32 -0000      1.121
+++ genattrtab.c        6 Jan 2003 08:45:27 -0000
@@ -2270,8 +2270,7 @@
     abort ();
   else if (i != 0 && i == (i & -i))
     /* Only one bit is set, so yield that unit number.  */
-    for (j = 0; (i >>= 1) != 0; j++)
-      ;
+    j = ffs (i) - 1;
   else
     j = ~i;
   return attr_rtx (CONST_STRING, attr_printf (MAX_DIGITS, "%d", j));
Index: ggc-page.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ggc-page.c,v
retrieving revision 1.58
diff -u -d -b -w -u -r1.58 ggc-page.c
--- ggc-page.c  16 Dec 2002 18:19:36 -0000      1.58
+++ ggc-page.c  6 Jan 2003 08:45:27 -0000
@@ -928,8 +928,12 @@
       word = bit = 0;
       while (~entry->in_use_p[word] == 0)
         ++word;
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+      bit = ffs (~(entry->in_use_p[word])) - 1;
+#else
       while ((entry->in_use_p[word] >> bit) & 1)
         ++bit;
+#endif
       hint = word * HOST_BITS_PER_LONG + bit;
     }
 
@@ -1100,12 +1104,8 @@
         }
 
       size = OBJECT_SIZE (order);
-      e = 0;
-      while (size % 2 == 0)
-        {
-          e++;
-          size >>= 1;
-        }
+      e = ffs (size) - 1;
+      size >>= e;
 
       inv = size;
       while (inv * size != 1)
@@ -1241,9 +1241,14 @@
           /* Something is in use if it is marked, or if it was in use in a
              context further down the context stack.  */
           p->in_use_p[i] |= p->save_in_use_p[i];
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+          j = p->in_use_p[i] >> (ffs (p->in_use_p[i]) - 1);
+#else
+          j = p->in_use_p[i];
+#endif
 
           /* Decrement the free object count for every object allocated.  */
-          for (j = p->in_use_p[i]; j; j >>= 1)
+          for (; j; j >>= 1)
             p->num_free_objects -= (j & 1);
         }
 
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.693
diff -u -d -b -w -u -r1.693 toplev.c
--- toplev.c    24 Dec 2002 08:30:32 -0000      1.693
+++ toplev.c    6 Jan 2003 08:45:27 -0000
@@ -1661,13 +1661,19 @@
 exact_log2_wide (x)
      unsigned HOST_WIDE_INT x;
 {
+#if HOST_BITS_PER_WIDE_INT != HOST_BITS_PER_INT
   int log = 0;
+#endif
   /* Test for 0 or a power of 2.  */
   if (x == 0 || x != (x & -x))
     return -1;
+#if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT
+  return ffs (x) - 1;
+#else
   while ((x >>= 1) != 0)
     log++;
   return log;
+#endif
 }
 
 /* Given X, an unsigned number, return the largest int Y such that 2**Y <= X.
Index: config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.407
diff -u -d -b -w -u -r1.407 rs6000.c
--- config/rs6000/rs6000.c      3 Jan 2003 23:09:33 -0000       1.407
+++ config/rs6000/rs6000.c      6 Jan 2003 08:45:28 -0000
@@ -7331,9 +7331,13 @@
 
   /* Otherwise we have a wrap-around mask.  Look for the first 0 bit
      from the right.  */
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+  i = HOST_BITS_PER_LONG - ffs (~val) + 1;
+#else
   i = 31;
   while (((val >>= 1) & 1) != 0)
     --i;
+#endif
 
   return i;
 }
@@ -7352,9 +7356,13 @@
   if ((val & 0xffffffff) == 0)
     abort ();
 
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+  i = HOST_BITS_PER_LONG - ffs (val);
+#else
   i = 30;
   while (((val >>= 1) & 1) == 0)
     --i;
+#endif
 
   return i;
 }

[-- Attachment #3: Type: text/plain, Size: 147 bytes --]

Thanks,
Andrew Pinski
apinski@apple.com
pinskia@physics.uc.edu

PS. Sorry for the messy ChangeLog, I am still trying to get a hang of
this. :)
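
[Editorial sketch, not part of the original message.] The
exact_log2_wide change in the patch above relies on the fact that for
a value with exactly one bit set, ffs (x) - 1 is the base-2 logarithm.
A minimal standalone version, assuming a host that declares the POSIX
ffs in <strings.h> and restricted to unsigned int; the name
exact_log2_sketch is hypothetical:

```c
#include <strings.h>   /* POSIX ffs */

/* Hypothetical stand-in for exact_log2_wide, limited to unsigned int.
   Returns log2 (x) when x is a power of two, otherwise -1.  */
static int
exact_log2_sketch (unsigned int x)
{
  /* Reject 0 and any value with more than one bit set.  */
  if (x == 0 || x != (x & -x))
    return -1;

  /* Exactly one bit is set, so its 1-based position minus one is
     the logarithm.  */
  return ffs ((int) x) - 1;
}
```

The same "- 1" adjustment appears in the bitmap.c and genattrtab.c
hunks, where a 0-based bit number is wanted.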
* Re: speeding up parts of gcc by using ffs
  2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
@ 2003-01-06 18:08   ` Zack Weinberg
  2003-01-06 18:28     ` Joseph S. Myers
  2003-01-08  9:51     ` Andrew Pinski
  0 siblings, 2 replies; 12+ messages in thread
From: Zack Weinberg @ 2003-01-06 18:08 UTC
To: Andrew Pinski; +Cc: gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> A follow of the comments I received, I have only right now added the
> simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
> HOST_BITS_PER_INT == HOST_BITS_PER_LONG.  I might implement the other
> part of the patch for when HOST_BITS_PER_WIDE_INT ==
> HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
> HOST_BITS_PER_LONG but this might take some time because I have to add
> some more builtins but it will clean things up.

I'm not enthusiastic about all the #ifdefs.

glibc provides ffsl and ffsll which take 'long' and 'long long'
respectively; may I suggest that you follow these steps:

1) put ffsl and ffsll into libiberty.
2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
4) create __builtin_ffsl and __builtin_ffsll

zw
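
[Editorial sketch, not part of the original message.] Step 2 could
look roughly like the following: choose the wide type once, and define
a single macro that maps to the ffs variant of matching width, so call
sites need no #ifdefs. Everything here is an assumption made for
illustration: the names hwi_sketch, ffs_hwi_sketch and
SKETCH_NEED_64BIT_HWINT are hypothetical, the selection condition is
only loosely modelled on how a host-wide type might be chosen, and the
sketch uses the __builtin_ffsl and __builtin_ffsll builtins of current
GCC and Clang, which did not yet exist when this message was written.

```c
#include <limits.h>

/* Hypothetical selection of the wide type: use long long only when a
   64-bit type was requested and long is no wider than int.  */
#if defined (SKETCH_NEED_64BIT_HWINT) && LONG_MAX == INT_MAX
typedef long long hwi_sketch;
# define ffs_hwi_sketch(x) __builtin_ffsll (x)
#else
typedef long hwi_sketch;
# define ffs_hwi_sketch(x) __builtin_ffsl (x)
#endif

/* A call site then needs no conditional compilation.  Returns the
   0-based index of the lowest set bit, or -1 if no bit is set.  */
static int
lowest_set_bit_index_sketch (hwi_sketch word)
{
  return ffs_hwi_sketch (word) - 1;
}
```

The design point is that the width decision lives in one header, while
the rest of the code only ever names the macro.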
* Re: speeding up parts of gcc by using ffs
  2003-01-06 18:08   ` Zack Weinberg
@ 2003-01-06 18:28     ` Joseph S. Myers
  2003-01-08  9:51     ` Andrew Pinski
  1 sibling, 0 replies; 12+ messages in thread
From: Joseph S. Myers @ 2003-01-06 18:28 UTC
To: Zack Weinberg; +Cc: Andrew Pinski, gcc, apinski

On Mon, 6 Jan 2003, Zack Weinberg wrote:

> glibc provides ffsl and ffsll which take 'long' and 'long long'
> respectively; may I suggest that you follow these steps:
>
> 1) put ffsl and ffsll into libiberty.
> 2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
> 3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
> 4) create __builtin_ffsl and __builtin_ffsll

If any of the uses are genuinely performance-critical (as shown by
profiling), the fallback version of ffs (and ffsl, ffsll) in libiberty
could also be speeded up by replacing it with glibc's generic C
implementation.

--
Joseph S. Myers
jsm28@cam.ac.uk
* Re: speeding up parts of gcc by using ffs
  2003-01-06 18:08   ` Zack Weinberg
  2003-01-06 18:28     ` Joseph S. Myers
@ 2003-01-08  9:51     ` Andrew Pinski
  2003-01-08 11:20       ` Falk Hueffner
  2003-01-09  0:24       ` Richard Henderson
  1 sibling, 2 replies; 12+ messages in thread
From: Andrew Pinski @ 2003-01-08  9:51 UTC
To: gcc-patches; +Cc: Andrew Pinski, gcc, apinski

Here is the first patch for using ffs{,l,ll} in gcc, will be posting
the rest by Monday.  This adds ffsl and ffsll to libiberty, and also
adds the ffs's to the libiberty.a.

Thanks,
Andrew Pinski
apinski@apple.com
pinskia@physics.uc.edu

On Monday, Jan 6, 2003, at 09:49 US/Pacific, Zack Weinberg wrote:

> Andrew Pinski <pinskia@physics.uc.edu> writes:
>
>> A follow of the comments I received, I have only right now added the
>> simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
>> HOST_BITS_PER_INT == HOST_BITS_PER_LONG.  I might implement the other
>> part of the patch for when HOST_BITS_PER_WIDE_INT ==
>> HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
>> HOST_BITS_PER_LONG but this might take some time because I have to add
>> some more builtins but it will clean things up.
>
> I'm not enthusiastic about all the #ifdefs.
>
> glibc provides ffsl and ffsll which take 'long' and 'long long'
> respectively; may I suggest that you follow these steps:
>
> 1) put ffsl and ffsll into libiberty.
> 2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
> 3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
> 4) create __builtin_ffsl and __builtin_ffsll
>
> zw

ChangeLog:

2003-01-07  Andrew Pinski  <pinskia@physics.uc.edu>

        * Makefile.in (ffsll.o): special rule for long long warning.
        (CFILES): add ffsl.c and ffsll.c.
        (CONFIGURED_OFILES): add ffsl.o and ffsll.o.
        (NEEDED): add ffs, ffsl, and ffsll.
        * aclocal.m4 include ../config/accross.m4 for AC_C_BIGENDIAN_CROSS,
        and AC_COMPILE_CHECK_SIZEOF.
        (libiberty_AC_C_LONG_LONG) copied from gcc's gcc_AC_C_LONG_LONG.
        * configure.in (AC_C_BIGENDIAN_CROSS) use it.
        (libiberty_AC_C_LONG_LONG) use it.
        (AC_COMPILE_CHECK_SIZEOF) use it.  Probe the size of int, long, and
        long long/__int64 if we have them.
        Check for ffsl and ffsll.
        * configure regenerate.
        (LIB_AC_PROG_CC) add ac_libiberty_warn_long_long_cflags.
        * config.in regenerate.
        * ffsl.c new file.
        * ffsll.c new file.

Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/Makefile.in,v
retrieving revision 1.78
diff -u -d -b -w -b -B -u -r1.78 Makefile.in
--- Makefile.in 22 Nov 2002 20:01:07 -0000      1.78
+++ Makefile.in 8 Jan 2003 07:32:39 -0000
@@ -122,15 +122,22 @@
        else true; fi
        $(COMPILE.c) $< $(OUTPUT_OPTION)
 
+#special case ffsll.o because of extra warnings will show up other wise
+ffsll.o: ffsll.c
+       if [ x"$(PICFLAG)" != x ]; then \
+         $(COMPILE.c) @ac_libiberty_warn_long_long_cflags@ $(PICFLAG) $< -o pic/$@; \
+       else true; fi
+       $(COMPILE.c) @ac_libiberty_warn_long_long_cflags@ $< $(OUTPUT_OPTION)
+
 # NOTE: If you add new files to the library, add them to this list
 # (alphabetical), and add them to REQUIRED_OFILES, or
 # CONFIGURED_OFILES and funcs in configure.in.
 CFILES = alloca.c argv.c asprintf.c atexit.c \
        basename.c bcmp.c bcopy.c bsearch.c bzero.c \
-       calloc.c choose-temp.c clock.c concat.c cp-demangle.c \
-       cplus-dem.c \
+       calloc.c choose-temp.c clock.c concat.c \
+       cp-demangle.c cplus-dem.c \
        dyn-string.c \
-       fdmatch.c ffs.c fibheap.c floatformat.c fnmatch.c \
+       fdmatch.c ffs.c ffsl.c ffsll.c fibheap.c floatformat.c fnmatch.c\
        getcwd.c getopt.c getopt1.c getpagesize.c getpwd.c getruntime.c \
        hashtab.c hex.c \
        index.c insque.c \
@@ -176,7 +183,7 @@
        basename.o bcmp.o bcopy.o bsearch.o bzero.o \
        calloc.o clock.o copysign.o \
        _doprnt.o \
-       ffs.o \
+       ffs.o ffsl.o ffsll.o \
        getcwd.o getpagesize.o \
        index.o insque.o \
        memchr.o memcmp.o memcpy.o memmove.o memset.o mkstemps.o \
@@ -287,7 +294,7 @@
 # can't use anything encumbering.
 NEEDED = atexit calloc memchr memcmp memcpy memmove memset rename strchr \
        strerror strncmp strrchr strstr strtol strtoul tmpnam vfprintf vprintf \
-       vfork waitpid bcmp bcopy bzero
+       vfork waitpid bcmp bcopy bzero ffs ffsl ffsll
 needed-list: Makefile
        rm -f needed-list; touch needed-list; \
        for f in $(NEEDED); do \
Index: aclocal.m4
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/aclocal.m4,v
retrieving revision 1.5
diff -u -d -b -w -b -B -u -r1.5 aclocal.m4
--- aclocal.m4  31 Dec 2001 23:23:49 -0000      1.5
+++ aclocal.m4  8 Jan 2003 07:32:39 -0000
@@ -1,3 +1,5 @@
+sinclude(../config/accross.m4)
+
 dnl See whether strncmp reads past the end of its string parameters.
 dnl On some versions of SunOS4 at least, strncmp reads a word at a time
 dnl but erroneously reads past the end of strings.  This can cause
@@ -107,6 +109,7 @@
 if test $ac_cv_prog_gcc = yes; then
   GCC=yes
   ac_libiberty_warn_cflags='-W -Wall -Wtraditional -pedantic'
+  ac_libiberty_warn_long_long_cflags='-Wno-long-long'
   dnl Check whether -g works, even if CFLAGS is set, in case the package
   dnl plays around with CFLAGS (such as to build both debugging and
   dnl normal versions of a library), tasteless as that idea is.
@@ -124,9 +127,11 @@
 else
   GCC=
   ac_libiberty_warn_cflags=
+  ac_libiberty_warn_long_long_cflags=
   test "${CFLAGS+set}" = set || CFLAGS="-g"
 fi
 AC_SUBST(ac_libiberty_warn_cflags)
+AC_SUBST(ac_libiberty_warn_long_long_cflags)
 ])
 
 # Work around a bug in autoheader.  This can go away when we switch to
@@ -188,4 +193,27 @@
   STACK_DIRECTION > 0 => grows toward higher addresses
   STACK_DIRECTION < 0 => grows toward lower addresses
   STACK_DIRECTION = 0 => direction of growth unknown])
+])
+
+dnl Checking for long long.
+dnl By Caolan McNamara <caolan@skynet.ie>
+dnl Added check for __int64, Zack Weinberg <zackw@stanford.edu>
+dnl
+AC_DEFUN([libiberty_AC_C_LONG_LONG],
+[AC_CACHE_CHECK(for long long int, ac_cv_c_long_long,
+  [AC_TRY_COMPILE(,[long long int i;],
+         ac_cv_c_long_long=yes,
+         ac_cv_c_long_long=no)])
+  if test $ac_cv_c_long_long = yes; then
+    AC_DEFINE(HAVE_LONG_LONG, 1,
+      [Define if your compiler supports the \`long long' type.])
+  fi
+AC_CACHE_CHECK(for __int64, ac_cv_c___int64,
+  [AC_TRY_COMPILE(,[__int64 i;],
+       ac_cv_c___int64=yes,
+       ac_cv_c___int64=no)])
+  if test $ac_cv_c___int64 = yes; then
+    AC_DEFINE(HAVE___INT64, 1,
+      [Define if your compiler supports the \`__int64' type.])
+  fi
 ])
Index: configure.in
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/configure.in,v
retrieving revision 1.52
diff -u -d -b -w -b -B -u -r1.52 configure.in
--- configure.in        1 Jul 2002 05:38:50 -0000       1.52
+++ configure.in        8 Jan 2003 07:32:39 -0000
@@ -145,6 +145,9 @@
 AC_HEADER_SYS_WAIT
 AC_HEADER_TIME
 
+
+AC_C_BIGENDIAN_CROSS
+
 libiberty_AC_DECLARE_ERRNO
 
 AC_CHECK_TYPE(uintptr_t, unsigned long)
@@ -154,6 +158,18 @@
   AC_DEFINE(HAVE_UINTPTR_T, 1, [Define if you have the \`uintptr_t' type.])
 fi
 
+libiberty_AC_C_LONG_LONG
+
+AC_COMPILE_CHECK_SIZEOF(int)
+AC_COMPILE_CHECK_SIZEOF(long)
+
+if test $ac_cv_c_long_long = yes; then
+  AC_COMPILE_CHECK_SIZEOF(long long)
+fi
+if test $ac_cv_c___int64 = yes; then
+  AC_COMPILE_CHECK_SIZEOF(__int64)
+fi
+
 AC_TYPE_PID_T
 
 # This is the list of functions which libiberty will provide if they
@@ -200,6 +216,14 @@
 funcs="$funcs vprintf"
 funcs="$funcs vsprintf"
 funcs="$funcs waitpid"
+funcs="$funcs ffsl"
+
+if test $ac_cv_c_long_long = yes; then
+  funcs="$funcs ffsll"
+fi
+if test $ac_cv_c___int64 = yes; then
+  funcs="$funcs ffsll"
+fi
 
 # Also in the old function.def file: alloca, vfork, getopt.
 
@@ -216,7 +240,7 @@
   AC_CHECK_FUNCS(strcasecmp setenv strchr strdup strncasecmp strrchr strstr)
   AC_CHECK_FUNCS(strtod strtol strtoul tmpnam vasprintf vfprintf vprintf)
   AC_CHECK_FUNCS(vsprintf waitpid getrusage on_exit psignal strerror strsignal)
-  AC_CHECK_FUNCS(sysconf times sbrk gettimeofday ffs)
+  AC_CHECK_FUNCS(sysconf times sbrk gettimeofday ffs ffsl ffsll)
   AC_DEFINE(HAVE_SYS_ERRLIST, 1, [Define if you have the sys_errlist variable.])
   AC_DEFINE(HAVE_SYS_NERR, 1, [Define if you have the sys_nerr variable.])
   AC_DEFINE(HAVE_SYS_SIGLIST, 1, [Define if you have the sys_siglist variable.])
Index: ffsl.c
===================================================================
RCS file: ffsl.c
diff -N ffsl.c
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ ffsl.c      8 Jan 2003 07:32:39 -0000
@@ -0,0 +1,31 @@
+/* ffsl -- Find the first bit set in the parameter
+
+@deftypefn Supplemental int ffsl (long @var{valu})
+
+Find the first (least significant) bit set in @var{valu}.  Bits are
+numbered from right to left, starting with bit 1 (corresponding to the
+value 1).  If @var{valu} is zero, zero is returned.
+
+@end deftypefn
+
+*/
+
+#include "config.h"
+
+int
+ffsl (valu)
+  register long valu;
+{
+#if SIZEOF_LONG == SIZEOF_INT
+  extern int ffs();
+  return ffs(valu);
+#else
+#if SIZEOF_LONG == SIZEOF_LONG_LONG || SIZEOF_LONG == SIZEOF___INT64
+  extern int ffs();
+  return ffsll(valu);
+#else
+  #error Do not know what size long is.
+#endif
+#endif
+}
+
Index: ffsll.c
===================================================================
RCS file: ffsll.c
diff -N ffsll.c
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ ffsll.c     8 Jan 2003 07:32:39 -0000
@@ -0,0 +1,75 @@
+/* ffsll -- Find the first bit set in the parameter
+
+@deftypefn Supplemental int ffsll (int @var{valu})
+
+Find the first (least significant) bit set in @var{valu}.  Bits are
+numbered from right to left, starting with bit 1 (corresponding to the
+value 1).  If @var{valu} is zero, zero is returned.
+
+@end deftypefn
+
+*/
+
+#include "config.h"
+
+#if defined(HAVE_LONG_LONG)
+#define NEED_FFSLL
+typedef long long ll;
+#define SIZEOF_LL SIZEOF_LONG_LONG
+
+#else
+#if defined(HAVE___INT64)
+#define NEED_FFSLL
+typedef __int64 ll;
+#define SIZEOF_LL SIZEOF___INT64
+
+#endif
+#endif
+
+#if defined(NEED_FFSLL)
+
+union ll_ints {
+  ll longlongs;
+  struct {
+#if defined(WORDS_BIGENDIAN)
+    int hi;
+    int low;
+#else
+    int low;
+    int hi;
+#endif
+  } ints;
+};
+
+int ffs ();
+
+int
+ffsll (valu)
+  register ll valu;
+{
+#if SIZEOF_LL == SIZEOF_INT*2
+  ll x = valu & -valu;
+  union ll_ints temp;
+  union ll_ints temp1;
+  int add = 0;
+  int word;
+  temp.longlongs = valu;
+  temp1.longlongs = x;
+  if(temp1.ints.hi!=0)
+    word = temp.ints.low;
+  else
+    word = temp.ints.hi, add = 32;
+  return add + ffs (word);
+#else
+  int bit_num = 0;
+  if(valu==0)
+    return 0;
+  while((valu&1)!=0)
+    bit_num ++, valu >>= 1;
+
+  return bit_num;
+#endif
+
+}
+
+#endif
* Re: speeding up parts of gcc by using ffs
  2003-01-08  9:51     ` Andrew Pinski
@ 2003-01-08 11:20       ` Falk Hueffner
  2003-01-08 11:21         ` Andrew Pinski
  2003-01-09  0:24       ` Richard Henderson
  1 sibling, 1 reply; 12+ messages in thread
From: Falk Hueffner @ 2003-01-08 11:20 UTC
To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> +  extern int ffs();

Function declarators with empty parentheses are an obsolescent feature
in C99, so I think they're better to be avoided.

> +#else
> +  int bit_num = 0;
> +  if(valu==0)
> +    return 0;
> +  while((valu&1)!=0)
> +    bit_num ++, valu >>= 1;
> +
> +  return bit_num;
> +#endif

This looks bogus to me.

--
	Falk
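
[Editorial sketch, not part of the original message.] The quoted
fallback loops while the low bit is set and starts counting from 0, so
for an input of 2 it never enters the loop and returns 0 instead of 2.
A loop-based fallback that matches the documented behaviour (bit 1 is
the least significant bit, 0 for a zero argument) might look like the
following; the name ffsll_loop_sketch is hypothetical and this is not
the code that was eventually committed.

```c
/* Hypothetical portable fallback for ffsll.  Works on an unsigned copy
   so that right shifts are well defined for negative arguments.  */
static int
ffsll_loop_sketch (long long valu)
{
  unsigned long long bits = (unsigned long long) valu;
  int bit_num = 1;

  if (bits == 0)
    return 0;

  /* Skip over the clear low-order bits; bit_num ends up as the
     1-based position of the first set bit.  */
  while ((bits & 1) == 0)
    {
      bits >>= 1;
      bit_num++;
    }

  return bit_num;
}
```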
* Re: speeding up parts of gcc by using ffs
  2003-01-08 11:20       ` Falk Hueffner
@ 2003-01-08 11:21         ` Andrew Pinski
  2003-01-08 11:26           ` Falk Hueffner
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2003-01-08 11:21 UTC
To: Falk Hueffner; +Cc: Andrew Pinski, gcc-patches, gcc, apinski

On Wednesday, Jan 8, 2003, at 00:22 US/Pacific, Falk Hueffner wrote:

> Andrew Pinski <pinskia@physics.uc.edu> writes:
>
>> +  extern int ffs();
>
> Function declarators with empty parentheses are an obsolescent feature
> in C99, so I think they're better to be avoided.

Well parts of gcc are still written in K&R C, this is one.

Thanks,
Andrew Pinski
* Re: speeding up parts of gcc by using ffs
  2003-01-08 11:21         ` Andrew Pinski
@ 2003-01-08 11:26           ` Falk Hueffner
  0 siblings, 0 replies; 12+ messages in thread
From: Falk Hueffner @ 2003-01-08 11:26 UTC
To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski

Andrew Pinski <pinskia@physics.uc.edu> writes:

> On Wednesday, Jan 8, 2003, at 00:22 US/Pacific, Falk Hueffner wrote:
> > Andrew Pinski <pinskia@physics.uc.edu> writes:
> >
> >> +  extern int ffs();
> >
> > Function declarators with empty parentheses are an obsolescent feature
> > in C99, so I think they're better to be avoided.
>
> Well parts of gcc are still written in K&R C, this is one.

Hmm, didn't think of that.  Perhaps you could use PARAMS(()).

--
	Falk
* Re: speeding up parts of gcc by using ffs
  2003-01-08  9:51     ` Andrew Pinski
  2003-01-08 11:20       ` Falk Hueffner
@ 2003-01-09  0:24       ` Richard Henderson
  1 sibling, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2003-01-09  0:24 UTC
To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski

On Tue, Jan 07, 2003 at 11:52:00PM -0800, Andrew Pinski wrote:
> +  if(temp1.ints.hi!=0)
> +    word = temp.ints.low;

This doesn't implement ffs.  You want

        if (temp1.ints.lo)
          word = temp1.ints.lo;

And, really, I suspect that you don't want a union at all.
Using a >> 32 should be good enough and it won't punish 64-bit hosts.

r~