* speeding up parts of gcc by using count_leading_zero (long)
@ 2003-01-05 6:16 Andrew Pinski
2003-01-05 7:03 ` Zack Weinberg
2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
0 siblings, 2 replies; 12+ messages in thread
From: Andrew Pinski @ 2003-01-05 6:16 UTC (permalink / raw)
To: gcc; +Cc: Andrew Pinski, apinski
There are parts of gcc which can be sped up on PPC (and other
architectures which have something similar) by using the instruction
`cntlz{w,d}' (count leading zero word, double word [PPC64 only]).
Also the parts of gcc on other architectures which do not have count
leading zero can be sped up by instead of using a for loop, a series of
if statements:
register int bit_num = 0;
if (word & 0xFFFF0000)
word >>= 16;
else
bit_num += 16;
if (word & 0xFF00)
word >>= 8;
else
bit_num += 8;
if (word & 0xF0)
word >>= 4;
else
bit_num += 4;
if (word & 0xc)
word >>= 2;
else
bit_num += 2;
if (word & 0x2)
word >>= 1;
else
bit_num += 1;
if ((word & 0x1) == 0)
bit_num += 1;
and bit_num is the result.
The functions which could benefit from this are:
function file
_________________________________
sbitmap_first_set_bit sbitmap.c
sbitmap_last_set_bit sbitmap.c
bitmap_first_set_bit bitmap.c (this already does a series of if
statements)
bitmap_last_set_bit bitmap.c (this already does a series of if
statements)
floor_log2_wide toplev.c
exact_log2_wide toplev.c
compute_inverse ggc-page.c
build_mask64_2_operands config/rs6000/rs6000.c
This one only effects libgcc
count_leading_zero longlong.h
also this one from libiberty
ffs ffs.c
There might be others but these would found by doing a grep for >> and
looking at for a counting the bits that not set until getting one that
is set SIZEOF_INT - cntlzw (word&-word) which is the same as ffs or
looping until the word is zero which is the same as SIZEOF_INT - cntlzw
(word) - 1.
Where should I put the mechanism for the cntlz for HOST_WIDE_INT, int,
HOST_WIDEST_INT (in hwint.h?) and what should I call the functions,
count_leading_zero_wide_int, count_leading_zero_int, and
count_leading_zero_widest_int?
Thanks,
Andrew Pinski
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using count_leading_zero (long)
2003-01-05 6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
@ 2003-01-05 7:03 ` Zack Weinberg
2003-01-05 7:17 ` Peter Barada
2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
1 sibling, 1 reply; 12+ messages in thread
From: Zack Weinberg @ 2003-01-05 7:03 UTC (permalink / raw)
To: Andrew Pinski; +Cc: gcc, apinski
Andrew Pinski <pinskia@physics.uc.edu> writes:
> There are parts of gcc which can be sped up on PPC (and other
> architectures which have something similar) by using the instruction
> `cntlz{w,d}' (count leading zero word, double word [PPC64 only]).
The right thing here is to change this code to use ffs() which is
already recognized and optimized by GCC and other compilers. Put a
generic implementation of this primitive in libiberty, since it's not
part of C89.
Also, you missed ggc_alloc in ggc-page.c (which is a much more
important routine to optimize than compute_inverse).
Tangentially, ffs takes an int, which is 32 bits on all supported
hosts. It would make sense to define __builtin_ffs32() and
__builtin_ffs64() to nail down the sizes. ffs64 can be implemented
efficiently on machines with only a 32-bit ffs instruction, as
ffs high, r
test r
bnz 0f
ffs low, r
0f:
so it is useful to provide both of them always.
zw
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using count_leading_zero (long)
2003-01-05 7:03 ` Zack Weinberg
@ 2003-01-05 7:17 ` Peter Barada
2003-01-05 8:41 ` Zack Weinberg
0 siblings, 1 reply; 12+ messages in thread
From: Peter Barada @ 2003-01-05 7:17 UTC (permalink / raw)
To: zack; +Cc: pinskia, gcc, apinski
>Tangentially, ffs takes an int, which is 32 bits on all supported
>hosts. It would make sense to define __builtin_ffs32() and
>__builtin_ffs64() to nail down the sizes. ffs64 can be implemented
>efficiently on machines with only a 32-bit ffs instruction, as
>
> ffs high, r
> test r
> bnz 0f
> ffs low, r
>0f:
>
>so it is useful to provide both of them always.
If ffs returns zero if its arg is zero, and then 1 to 32 (assuming ffs
returns 1 if the msb is set), then you forgot to add 32 if the high
word is zero(unless the lower word is zero) so the result is 0 to 64:
ffs high, r
test r
bnz 0f
ffs low, r
test r
bz 0f
add 32, r
0f:
--
Peter Barada
peter@baradas.org
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using count_leading_zero (long)
2003-01-05 7:17 ` Peter Barada
@ 2003-01-05 8:41 ` Zack Weinberg
0 siblings, 0 replies; 12+ messages in thread
From: Zack Weinberg @ 2003-01-05 8:41 UTC (permalink / raw)
To: Peter Barada; +Cc: pinskia, gcc, apinski
Peter Barada <peter@baradas.org> writes:
> If ffs returns zero if its arg is zero, and then 1 to 32 (assuming ffs
> returns 1 if the msb is set), then you forgot to add 32 if the high
> word is zero(unless the lower word is zero) so the result is 0 to 64:
>
> ffs high, r
> test r
> bnz 0f
> ffs low, r
> test r
> bz 0f
> add 32, r
> 0f:
Yes, you're right. Sorry about that.
zw
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-05 6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
2003-01-05 7:03 ` Zack Weinberg
@ 2003-01-06 10:51 ` Andrew Pinski
2003-01-06 18:08 ` Zack Weinberg
1 sibling, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2003-01-06 10:51 UTC (permalink / raw)
To: Andrew Pinski; +Cc: gcc, apinski
[-- Attachment #1: Type: text/plain, Size: 1414 bytes --]
A follow of the comments I received, I have only right now added the
simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
HOST_BITS_PER_INT == HOST_BITS_PER_LONG. I might implement the other
part of the patch for when HOST_BITS_PER_WIDE_INT ==
HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
HOST_BITS_PER_LONG but this might take some time because I have to add
some more builtins but it will clean things up.
bootstraped on powerpc-apple-darwin6.3, also bootstrapped and run then
testsuite on i386-unknown-openbsd3.1.
Also changed need_64bit_hwint on darwin's case back to needing it and
bootstraped there also.
ChangeLog:
2003-01-06 Andrew Pinski <pinskia@physics.uc.edu>
* bitmap.c: (bitmap_first_set_bit) Remove comment about ffs
since libiberty already includes it. When
HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT, use ffs.
genattrtab.c: (encode_units_mask) Use ffs.
ggc-page.c: (ggc_alloc) Use ffs when HOST_BITS_PER_INT ==
HOST_BITS_PER_LONG.
(compute_inverse) Use ffs. (ggc_recalculate_in_use_p) Use ffs when
HOST_BITS_PER_INT == HOST_BITS_PER_LONG
toplev.c.c: (exact_log2_wide) Use ffs when HOST_BITS_PER_WIDE_INT ==
HOST_BITS_PER_INT.
Hide log when HOST_BITS_PER_WIDE_INT != HOST_BITS_PER_INT.
config/rs6000/rs6000.c: (extract_MB) Use ffs when HOST_BITS_PER_INT ==
HOST_BITS_PER_LONG.
(extract_ME) Use ffs when HOST_BITS_PER_INT == HOST_BITS_PER_LONG.
Patch:
[-- Attachment #2: ffs.patch --]
[-- Type: application/octet-stream, Size: 4252 bytes --]
Index: bitmap.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/bitmap.c,v
retrieving revision 1.39
diff -u -d -b -w -u -r1.39 bitmap.c
--- bitmap.c 16 Dec 2002 18:18:59 -0000 1.39
+++ bitmap.c 6 Jan 2003 08:45:26 -0000
@@ -453,7 +453,10 @@
#endif
/* Binary search for the first set bit. */
- /* ??? It'd be nice to know if ffs or ffsl was available. */
+
+#if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT
+ bit_num = ffs (word) - 1;
+#else
bit_num = 0;
word = word & -word;
@@ -475,6 +478,7 @@
bit_num += 2;
if (word & 0xaa)
bit_num += 1;
+#endif
return (ptr->indx * BITMAP_ELEMENT_ALL_BITS
+ word_num * HOST_BITS_PER_WIDE_INT
Index: genattrtab.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/genattrtab.c,v
retrieving revision 1.121
diff -u -d -b -w -u -r1.121 genattrtab.c
--- genattrtab.c 16 Dec 2002 18:19:32 -0000 1.121
+++ genattrtab.c 6 Jan 2003 08:45:27 -0000
@@ -2270,8 +2270,7 @@
abort ();
else if (i != 0 && i == (i & -i))
/* Only one bit is set, so yield that unit number. */
- for (j = 0; (i >>= 1) != 0; j++)
- ;
+ j = ffs (i) - 1;
else
j = ~i;
return attr_rtx (CONST_STRING, attr_printf (MAX_DIGITS, "%d", j));
Index: ggc-page.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ggc-page.c,v
retrieving revision 1.58
diff -u -d -b -w -u -r1.58 ggc-page.c
--- ggc-page.c 16 Dec 2002 18:19:36 -0000 1.58
+++ ggc-page.c 6 Jan 2003 08:45:27 -0000
@@ -928,8 +928,12 @@
word = bit = 0;
while (~entry->in_use_p[word] == 0)
++word;
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+ bit = ffs (~(entry->in_use_p[word])) - 1;
+#else
while ((entry->in_use_p[word] >> bit) & 1)
++bit;
+#endif
hint = word * HOST_BITS_PER_LONG + bit;
}
@@ -1100,12 +1104,8 @@
}
size = OBJECT_SIZE (order);
- e = 0;
- while (size % 2 == 0)
- {
- e++;
- size >>= 1;
- }
+ e = ffs (size) - 1;
+ size >>= e;
inv = size;
while (inv * size != 1)
@@ -1241,9 +1241,14 @@
/* Something is in use if it is marked, or if it was in use in a
context further down the context stack. */
p->in_use_p[i] |= p->save_in_use_p[i];
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+ j = p->in_use_p[i] >> (ffs (p->in_use_p[i]) - 1);
+#else
+ j = p->in_use_p[i];
+#endif
/* Decrement the free object count for every object allocated. */
- for (j = p->in_use_p[i]; j; j >>= 1)
+ for (; j; j >>= 1)
p->num_free_objects -= (j & 1);
}
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.693
diff -u -d -b -w -u -r1.693 toplev.c
--- toplev.c 24 Dec 2002 08:30:32 -0000 1.693
+++ toplev.c 6 Jan 2003 08:45:27 -0000
@@ -1661,13 +1661,19 @@
exact_log2_wide (x)
unsigned HOST_WIDE_INT x;
{
+#if HOST_BITS_PER_WIDE_INT != HOST_BITS_PER_INT
int log = 0;
+#endif
/* Test for 0 or a power of 2. */
if (x == 0 || x != (x & -x))
return -1;
+#if HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT
+ return ffs (x) - 1;
+#else
while ((x >>= 1) != 0)
log++;
return log;
+#endif
}
/* Given X, an unsigned number, return the largest int Y such that 2**Y <= X.
Index: config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.407
diff -u -d -b -w -u -r1.407 rs6000.c
--- config/rs6000/rs6000.c 3 Jan 2003 23:09:33 -0000 1.407
+++ config/rs6000/rs6000.c 6 Jan 2003 08:45:28 -0000
@@ -7331,9 +7331,13 @@
/* Otherwise we have a wrap-around mask. Look for the first 0 bit
from the right. */
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+ i = HOST_BITS_PER_LONG - ffs (~val) + 1;
+#else
i = 31;
while (((val >>= 1) & 1) != 0)
--i;
+#endif
return i;
}
@@ -7352,9 +7356,13 @@
if ((val & 0xffffffff) == 0)
abort ();
+#if HOST_BITS_PER_INT == HOST_BITS_PER_LONG
+ i = HOST_BITS_PER_LONG - ffs (val);
+#else
i = 30;
while (((val >>= 1) & 1) == 0)
--i;
+#endif
return i;
}
[-- Attachment #3: Type: text/plain, Size: 147 bytes --]
Thanks,
Andrew Pinski
apinski@apple.com
pinskia@physics.uc.edu
PS. Sorry for the messy ChangeLog, I am still trying to get a hang of
this. :)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
@ 2003-01-06 18:08 ` Zack Weinberg
2003-01-06 18:28 ` Joseph S. Myers
2003-01-08 9:51 ` Andrew Pinski
0 siblings, 2 replies; 12+ messages in thread
From: Zack Weinberg @ 2003-01-06 18:08 UTC (permalink / raw)
To: Andrew Pinski; +Cc: gcc, apinski
Andrew Pinski <pinskia@physics.uc.edu> writes:
> A follow of the comments I received, I have only right now added the
> simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
> HOST_BITS_PER_INT == HOST_BITS_PER_LONG. I might implement the other
> part of the patch for when HOST_BITS_PER_WIDE_INT ==
> HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
> HOST_BITS_PER_LONG but this might take some time because I have to add
> some more builtins but it will clean things up.
I'm not enthusiastic about all the #ifdefs.
glibc provides ffsl and ffsll which take 'long' and 'long long'
respectively; may I suggest that you follow these steps:
1) put ffsl and ffsll into libiberty.
2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
4) create __builtin_ffsl and __builtin_ffsll
zw
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-06 18:08 ` Zack Weinberg
@ 2003-01-06 18:28 ` Joseph S. Myers
2003-01-08 9:51 ` Andrew Pinski
1 sibling, 0 replies; 12+ messages in thread
From: Joseph S. Myers @ 2003-01-06 18:28 UTC (permalink / raw)
To: Zack Weinberg; +Cc: Andrew Pinski, gcc, apinski
On Mon, 6 Jan 2003, Zack Weinberg wrote:
> glibc provides ffsl and ffsll which take 'long' and 'long long'
> respectively; may I suggest that you follow these steps:
>
> 1) put ffsl and ffsll into libiberty.
> 2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
> 3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
> 4) create __builtin_ffsl and __builtin_ffsll
If any of the uses are genuinely performance-critical (as shown by
profiling), the fallback version of ffs (and ffsl, ffsll) in libiberty
could also be speeded up by replacing it with glibc's generic C
implementation.
--
Joseph S. Myers
jsm28@cam.ac.uk
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-06 18:08 ` Zack Weinberg
2003-01-06 18:28 ` Joseph S. Myers
@ 2003-01-08 9:51 ` Andrew Pinski
2003-01-08 11:20 ` Falk Hueffner
2003-01-09 0:24 ` Richard Henderson
1 sibling, 2 replies; 12+ messages in thread
From: Andrew Pinski @ 2003-01-08 9:51 UTC (permalink / raw)
To: gcc-patches; +Cc: Andrew Pinski, gcc, apinski
Here is the first patch for using ffs{,l,ll} in gcc, will be posting
the rest by Monday.
This adds ffsl and ffsll to libiberty, and also adds the ffs's to the
libiberty.a.
Thanks,
Andrew Pinski
apinski@apple.com
pinskia@physics.uc.edu
On Monday, Jan 6, 2003, at 09:49 US/Pacific, Zack Weinberg wrote:
> Andrew Pinski <pinskia@physics.uc.edu> writes:
>
>> A follow of the comments I received, I have only right now added the
>> simple cases of HOST_BITS_PER_WIDE_INT == HOST_BITS_PER_INT and
>> HOST_BITS_PER_INT == HOST_BITS_PER_LONG. I might implement the other
>> part of the patch for when HOST_BITS_PER_WIDE_INT ==
>> HOST_BITS_PER_LONG_LONG and HOST_BITS_PER_WIDE_INT ==
>> HOST_BITS_PER_LONG but this might take some time because I have to add
>> some more builtins but it will clean things up.
>
> I'm not enthusiastic about all the #ifdefs.
>
> glibc provides ffsl and ffsll which take 'long' and 'long long'
> respectively; may I suggest that you follow these steps:
>
> 1) put ffsl and ffsll into libiberty.
> 2) have hwint.h #define ffs_hwi and ffs_hwidesti appropriately
> 3) use ffs/ffsl/ffsll/ffs_hwi throughout the compiler
> 4) create __builtin_ffsl and __builtin_ffsll
>
> zw
ChangeLog:
2003-01-07 Andrew Pinski <pinskia@physics.uc.edu>
* Makefile.in (ffsll.o): special rule for long long warning.
(CFILES): add ffsl.c and ffsll.c. (CONFIGURED_OFILES): add
ffsl.o and ffsll.o. (NEEDED): add ffs, ffsl, and ffsll.
* aclocal.m4 include ../config/accross.m4 for AC_C_BIGENDIAN_CROSS,
and AC_COMPILE_CHECK_SIZEOF. (libiberty_AC_C_LONG_LONG) copied from
gcc's gcc_AC_C_LONG_LONG.
* configure.in (AC_C_BIGENDIAN_CROSS) use it.
(libiberty_AC_C_LONG_LONG) use it. (AC_COMPILE_CHECK_SIZEOF) use it.
Probe the size of int, long, and long long/__int64 if we have them.
Check for ffsl and ffsll.
* configure regenerate.
(LIB_AC_PROG_CC) add ac_libiberty_warn_long_long_cflags.
* config.in regenerate.
* ffsl.c new file.
* ffsll.c new file.
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/Makefile.in,v
retrieving revision 1.78
diff -u -d -b -w -b -B -u -r1.78 Makefile.in
--- Makefile.in 22 Nov 2002 20:01:07 -0000 1.78
+++ Makefile.in 8 Jan 2003 07:32:39 -0000
@@ -122,15 +122,22 @@
else true; fi
$(COMPILE.c) $< $(OUTPUT_OPTION)
+#special case ffsll.o because of extra warnings will show up other wise
+ffsll.o: ffsll.c
+ if [ x"$(PICFLAG)" != x ]; then \
+ $(COMPILE.c) @ac_libiberty_warn_long_long_cflags@ $(PICFLAG) $< -o
pic/$@; \
+ else true; fi
+ $(COMPILE.c) @ac_libiberty_warn_long_long_cflags@ $< $(OUTPUT_OPTION)
+
# NOTE: If you add new files to the library, add them to this list
# (alphabetical), and add them to REQUIRED_OFILES, or
# CONFIGURED_OFILES and funcs in configure.in.
CFILES = alloca.c argv.c asprintf.c atexit.c \
basename.c bcmp.c bcopy.c bsearch.c bzero.c \
- calloc.c choose-temp.c clock.c concat.c cp-demangle.c \
- cplus-dem.c \
+ calloc.c choose-temp.c clock.c concat.c \
+ cp-demangle.c cplus-dem.c \
dyn-string.c \
- fdmatch.c ffs.c fibheap.c floatformat.c fnmatch.c \
+ fdmatch.c ffs.c ffsl.c ffsll.c fibheap.c floatformat.c fnmatch.c\
getcwd.c getopt.c getopt1.c getpagesize.c getpwd.c getruntime.c \
hashtab.c hex.c \
index.c insque.c \
@@ -176,7 +183,7 @@
basename.o bcmp.o bcopy.o bsearch.o bzero.o \
calloc.o clock.o copysign.o \
_doprnt.o \
- ffs.o \
+ ffs.o ffsl.o ffsll.o \
getcwd.o getpagesize.o \
index.o insque.o \
memchr.o memcmp.o memcpy.o memmove.o memset.o mkstemps.o \
@@ -287,7 +294,7 @@
# can't use anything encumbering.
NEEDED = atexit calloc memchr memcmp memcpy memmove memset rename
strchr \
strerror strncmp strrchr strstr strtol strtoul tmpnam vfprintf
vprintf \
- vfork waitpid bcmp bcopy bzero
+ vfork waitpid bcmp bcopy bzero ffs ffsl ffsll
needed-list: Makefile
rm -f needed-list; touch needed-list; \
for f in $(NEEDED); do \
Index: aclocal.m4
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/aclocal.m4,v
retrieving revision 1.5
diff -u -d -b -w -b -B -u -r1.5 aclocal.m4
--- aclocal.m4 31 Dec 2001 23:23:49 -0000 1.5
+++ aclocal.m4 8 Jan 2003 07:32:39 -0000
@@ -1,3 +1,5 @@
+sinclude(../config/accross.m4)
+
dnl See whether strncmp reads past the end of its string parameters.
dnl On some versions of SunOS4 at least, strncmp reads a word at a time
dnl but erroneously reads past the end of strings. This can cause
@@ -107,6 +109,7 @@
if test $ac_cv_prog_gcc = yes; then
GCC=yes
ac_libiberty_warn_cflags='-W -Wall -Wtraditional -pedantic'
+ ac_libiberty_warn_long_long_cflags='-Wno-long-long'
dnl Check whether -g works, even if CFLAGS is set, in case the package
dnl plays around with CFLAGS (such as to build both debugging and
dnl normal versions of a library), tasteless as that idea is.
@@ -124,9 +127,11 @@
else
GCC=
ac_libiberty_warn_cflags=
+ ac_libiberty_warn_long_long_cflags=
test "${CFLAGS+set}" = set || CFLAGS="-g"
fi
AC_SUBST(ac_libiberty_warn_cflags)
+AC_SUBST(ac_libiberty_warn_long_long_cflags)
])
# Work around a bug in autoheader. This can go away when we switch to
@@ -188,4 +193,27 @@
STACK_DIRECTION > 0 => grows toward higher addresses
STACK_DIRECTION < 0 => grows toward lower addresses
STACK_DIRECTION = 0 => direction of growth unknown])
+])
+
+dnl Checking for long long.
+dnl By Caolan McNamara <caolan@skynet.ie>
+dnl Added check for __int64, Zack Weinberg <zackw@stanford.edu>
+dnl
+AC_DEFUN([libiberty_AC_C_LONG_LONG],
+[AC_CACHE_CHECK(for long long int, ac_cv_c_long_long,
+ [AC_TRY_COMPILE(,[long long int i;],
+ ac_cv_c_long_long=yes,
+ ac_cv_c_long_long=no)])
+ if test $ac_cv_c_long_long = yes; then
+ AC_DEFINE(HAVE_LONG_LONG, 1,
+ [Define if your compiler supports the \`long long' type.])
+ fi
+AC_CACHE_CHECK(for __int64, ac_cv_c___int64,
+ [AC_TRY_COMPILE(,[__int64 i;],
+ ac_cv_c___int64=yes,
+ ac_cv_c___int64=no)])
+ if test $ac_cv_c___int64 = yes; then
+ AC_DEFINE(HAVE___INT64, 1,
+ [Define if your compiler supports the \`__int64' type.])
+ fi
])
Index: configure.in
===================================================================
RCS file: /cvs/gcc/gcc/libiberty/configure.in,v
retrieving revision 1.52
diff -u -d -b -w -b -B -u -r1.52 configure.in
--- configure.in 1 Jul 2002 05:38:50 -0000 1.52
+++ configure.in 8 Jan 2003 07:32:39 -0000
@@ -145,6 +145,9 @@
AC_HEADER_SYS_WAIT
AC_HEADER_TIME
+
+AC_C_BIGENDIAN_CROSS
+
libiberty_AC_DECLARE_ERRNO
AC_CHECK_TYPE(uintptr_t, unsigned long)
@@ -154,6 +158,18 @@
AC_DEFINE(HAVE_UINTPTR_T, 1, [Define if you have the \`uintptr_t'
type.])
fi
+libiberty_AC_C_LONG_LONG
+
+AC_COMPILE_CHECK_SIZEOF(int)
+AC_COMPILE_CHECK_SIZEOF(long)
+
+if test $ac_cv_c_long_long = yes; then
+ AC_COMPILE_CHECK_SIZEOF(long long)
+fi
+if test $ac_cv_c___int64 = yes; then
+ AC_COMPILE_CHECK_SIZEOF(__int64)
+fi
+
AC_TYPE_PID_T
# This is the list of functions which libiberty will provide if they
@@ -200,6 +216,14 @@
funcs="$funcs vprintf"
funcs="$funcs vsprintf"
funcs="$funcs waitpid"
+funcs="$funcs ffsl"
+
+if test $ac_cv_c_long_long = yes; then
+ funcs="$funcs ffsll"
+fi
+if test $ac_cv_c___int64 = yes; then
+ funcs="$funcs ffsll"
+fi
# Also in the old function.def file: alloca, vfork, getopt.
@@ -216,7 +240,7 @@
AC_CHECK_FUNCS(strcasecmp setenv strchr strdup strncasecmp strrchr
strstr)
AC_CHECK_FUNCS(strtod strtol strtoul tmpnam vasprintf vfprintf
vprintf)
AC_CHECK_FUNCS(vsprintf waitpid getrusage on_exit psignal strerror
strsignal)
- AC_CHECK_FUNCS(sysconf times sbrk gettimeofday ffs)
+ AC_CHECK_FUNCS(sysconf times sbrk gettimeofday ffs ffsl ffsll)
AC_DEFINE(HAVE_SYS_ERRLIST, 1, [Define if you have the sys_errlist
variable.])
AC_DEFINE(HAVE_SYS_NERR, 1, [Define if you have the sys_nerr
variable.])
AC_DEFINE(HAVE_SYS_SIGLIST, 1, [Define if you have the sys_siglist
variable.])
Index: ffsl.c
===================================================================
RCS file: ffsl.c
diff -N ffsl.c
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ ffsl.c 8 Jan 2003 07:32:39 -0000
@@ -0,0 +1,31 @@
+/* ffsl -- Find the first bit set in the parameter
+
+@deftypefn Supplemental int ffsl (long @var{valu})
+
+Find the first (least significant) bit set in @var{valu}. Bits are
+numbered from right to left, starting with bit 1 (corresponding to the
+value 1). If @var{valu} is zero, zero is returned.
+
+@end deftypefn
+
+*/
+
+#include "config.h"
+
+int
+ffsl (valu)
+ register long valu;
+{
+#if SIZEOF_LONG == SIZEOF_INT
+ extern int ffs();
+ return ffs(valu);
+#else
+#if SIZEOF_LONG == SIZEOF_LONG_LONG || SIZEOF_LONG == SIZEOF___INT64
+ extern int ffs();
+ return ffsll(valu);
+#else
+ #error Do not know what size long is.
+#endif
+#endif
+}
+
Index: ffsll.c
===================================================================
RCS file: ffsll.c
diff -N ffsll.c
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ ffsll.c 8 Jan 2003 07:32:39 -0000
@@ -0,0 +1,75 @@
+/* ffsll -- Find the first bit set in the parameter
+
+@deftypefn Supplemental int ffsll (int @var{valu})
+
+Find the first (least significant) bit set in @var{valu}. Bits are
+numbered from right to left, starting with bit 1 (corresponding to the
+value 1). If @var{valu} is zero, zero is returned.
+
+@end deftypefn
+
+*/
+
+#include "config.h"
+
+#if defined(HAVE_LONG_LONG)
+#define NEED_FFSLL
+typedef long long ll;
+#define SIZEOF_LL SIZEOF_LONG_LONG
+
+#else
+#if defined(HAVE___INT64)
+#define NEED_FFSLL
+typedef __int64 ll;
+#define SIZEOF_LL SIZEOF___INT64
+
+#endif
+#endif
+
+#if defined(NEED_FFSLL)
+
+union ll_ints {
+ ll longlongs;
+ struct {
+#if defined(WORDS_BIGENDIAN)
+ int hi;
+ int low;
+#else
+ int low;
+ int hi;
+#endif
+ } ints;
+};
+
+int ffs ();
+
+int
+ffsll (valu)
+ register ll valu;
+{
+#if SIZEOF_LL == SIZEOF_INT*2
+ ll x = valu & -valu;
+ union ll_ints temp;
+ union ll_ints temp1;
+ int add = 0;
+ int word;
+ temp.longlongs = valu;
+ temp1.longlongs = x;
+ if(temp1.ints.hi!=0)
+ word = temp.ints.low;
+ else
+ word = temp.ints.hi, add = 32;
+ return add + ffs (word);
+#else
+ int bit_num = 0;
+ if(valu==0)
+ return 0;
+ while((valu&1)!=0)
+ bit_num ++, valu >>= 1;
+
+ return bit_num;
+#endif
+
+}
+
+#endif
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-08 9:51 ` Andrew Pinski
@ 2003-01-08 11:20 ` Falk Hueffner
2003-01-08 11:21 ` Andrew Pinski
2003-01-09 0:24 ` Richard Henderson
1 sibling, 1 reply; 12+ messages in thread
From: Falk Hueffner @ 2003-01-08 11:20 UTC (permalink / raw)
To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski
Andrew Pinski <pinskia@physics.uc.edu> writes:
> + extern int ffs();
Function declarators with empty parentheses are an obsolescent feature
in C99, so I think they're better to be avoided.
> +#else
> + int bit_num = 0;
> + if(valu==0)
> + return 0;
> + while((valu&1)!=0)
> + bit_num ++, valu >>= 1;
> +
> + return bit_num;
> +#endif
This looks bogus to me.
--
Falk
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-08 11:20 ` Falk Hueffner
@ 2003-01-08 11:21 ` Andrew Pinski
2003-01-08 11:26 ` Falk Hueffner
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Pinski @ 2003-01-08 11:21 UTC (permalink / raw)
To: Falk Hueffner; +Cc: Andrew Pinski, gcc-patches, gcc, apinski
On Wednesday, Jan 8, 2003, at 00:22 US/Pacific, Falk Hueffner wrote:
> Andrew Pinski <pinskia@physics.uc.edu> writes:
>
>> + extern int ffs();
>
> Function declarators with empty parentheses are an obsolescent feature
> in C99, so I think they're better to be avoided.
Well parts of gcc are still written in K&R C, this is one.
Thanks,
Andrew Pinski
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-08 11:21 ` Andrew Pinski
@ 2003-01-08 11:26 ` Falk Hueffner
0 siblings, 0 replies; 12+ messages in thread
From: Falk Hueffner @ 2003-01-08 11:26 UTC (permalink / raw)
To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski
Andrew Pinski <pinskia@physics.uc.edu> writes:
> On Wednesday, Jan 8, 2003, at 00:22 US/Pacific, Falk Hueffner wrote:
> > Andrew Pinski <pinskia@physics.uc.edu> writes:
> >
> >> + extern int ffs();
> >
> > Function declarators with empty parentheses are an obsolescent feature
> > in C99, so I think they're better to be avoided.
>
> Well parts of gcc are still written in K&R C, this is one.
Hmm, didn't think of that. Perhaps you could uses PARAMS(()).
--
Falk
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: speeding up parts of gcc by using ffs
2003-01-08 9:51 ` Andrew Pinski
2003-01-08 11:20 ` Falk Hueffner
@ 2003-01-09 0:24 ` Richard Henderson
1 sibling, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2003-01-09 0:24 UTC (permalink / raw)
To: Andrew Pinski; +Cc: gcc-patches, gcc, apinski
On Tue, Jan 07, 2003 at 11:52:00PM -0800, Andrew Pinski wrote:
> + if(temp1.ints.hi!=0)
> + word = temp.ints.low;
This doesn't implement ffs. You want
if (temp1.ints.lo)
word = temp1.ints.lo;
And, really, I suspect that you don't want a union
at all. Using a >> 32 should be good enough and it
won't punish 64-bit hosts.
r~
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2003-01-08 22:46 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-01-05 6:16 speeding up parts of gcc by using count_leading_zero (long) Andrew Pinski
2003-01-05 7:03 ` Zack Weinberg
2003-01-05 7:17 ` Peter Barada
2003-01-05 8:41 ` Zack Weinberg
2003-01-06 10:51 ` speeding up parts of gcc by using ffs Andrew Pinski
2003-01-06 18:08 ` Zack Weinberg
2003-01-06 18:28 ` Joseph S. Myers
2003-01-08 9:51 ` Andrew Pinski
2003-01-08 11:20 ` Falk Hueffner
2003-01-08 11:21 ` Andrew Pinski
2003-01-08 11:26 ` Falk Hueffner
2003-01-09 0:24 ` Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).