public inbox for gcc-bugs@sourceware.org
* [Bug c/31695] New: __builtin_ctzll slower than 2*__builtin_ctz
From: joerg dot richter at pdv-fs dot de @ 2007-04-25 8:17 UTC (permalink / raw)
To: gcc-bugs
int func1( unsigned long long val )
{
    return __builtin_ctzll( val );
}

int func2( unsigned long long val )
{
    unsigned lo = (unsigned)val;
    return lo ? __builtin_ctz(lo) : __builtin_ctz((unsigned)(val>>32)) + 32;
}
func1 is more than 2 times slower than func2, but it should be at least
as fast as func2.
__builtin_ctzll is not expanded inline the way __builtin_ctz is.
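A minimal sketch for checking that the two variants from the report return the same result, assuming a GCC-compatible compiler that provides both builtins. The names ctz64_direct, ctz64_split and ctz64_variants_agree are illustrative and not part of the report. Both builtins are undefined for a zero argument, so the check only uses nonzero inputs.

```c
#include <stdint.h>

/* Direct 64-bit builtin, as in func1 from the report. */
static int ctz64_direct(uint64_t val)
{
    return __builtin_ctzll(val);
}

/* Split into two 32-bit halves, as in func2 from the report. */
static int ctz64_split(uint64_t val)
{
    uint32_t lo = (uint32_t)val;
    return lo ? __builtin_ctz(lo)
              : __builtin_ctz((uint32_t)(val >> 32)) + 32;
}

/* Returns 1 if both variants give the bit index for every single-bit input. */
static int ctz64_variants_agree(void)
{
    for (int bit = 0; bit < 64; ++bit) {
        uint64_t val = (uint64_t)1 << bit;
        if (ctz64_direct(val) != bit || ctz64_split(val) != bit)
            return 0;
    }
    return 1;
}
```

This only checks correctness; measuring the speed difference would need a separate timing loop built for the i686 target.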
--
Summary: __builtin_ctzll slower than 2*__builtin_ctz
Product: gcc
Version: 4.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: joerg dot richter at pdv-fs dot de
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31695
* [Bug c/31695] __builtin_ctzll slower than 2*__builtin_ctz
From: rguenth at gcc dot gnu dot org @ 2007-04-25 14:38 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2007-04-25 15:38 -------
Because it calls into libgcc, and does so without tail-calling:
_Z5func1y:
.LFB2:
        pushl   %ebp
.LCFI2:
        movl    %esp, %ebp
.LCFI3:
        subl    $24, %esp
.LCFI4:
        movl    8(%ebp), %eax
        movl    12(%ebp), %edx
        movl    %eax, (%esp)
        movl    %edx, 4(%esp)
        call    __ctzdi2
        leave
        ret
libgcc implements it as:

int
__ctzDI2 (UDWtype x)
{
  const DWunion uu = {.ll = x};
  UWtype word;
  Wtype ret, add;

  if (uu.s.low)
    word = uu.s.low, add = 0;
  else
    word = uu.s.high, add = W_TYPE_SIZE;

  count_trailing_zeros (ret, word);
  return ret + add;
}
(count_trailing_zeros is expanded to an asm bsfl on x86, which is fine.)
The question remains why we don't tail-call. We could also expand the
long long version inline.
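The word-selection logic of the libgcc routine can be sketched in portable C without libgcc's internal types. This is only an illustration under stated assumptions: WORD_BITS stands in for W_TYPE_SIZE on a 32-bit target, shifts replace the DWunion access, and the loop in ctz_word is a hypothetical stand-in for the count_trailing_zeros macro (which uses bsfl on x86). The input must be nonzero, as with the builtins.

```c
#include <stdint.h>

#define WORD_BITS 32  /* assumption: stands in for W_TYPE_SIZE on a 32-bit target */

/* Loop-based stand-in for count_trailing_zeros; word must be nonzero. */
static int ctz_word(uint32_t word)
{
    int n = 0;
    while ((word & 1u) == 0) {
        word >>= 1;
        ++n;
    }
    return n;
}

/* Mirrors the structure of __ctzDI2: use the low word if it has a set
   bit, otherwise the high word plus an offset of one word width. */
static int ctz_dword(uint64_t x)
{
    uint32_t low = (uint32_t)x;
    uint32_t high = (uint32_t)(x >> WORD_BITS);
    uint32_t word;
    int add;

    if (low) {
        word = low;
        add = 0;
    } else {
        word = high;
        add = WORD_BITS;
    }
    return ctz_word(word) + add;
}
```

An inline expansion would amount to emitting this branch and a single-word count directly at the call site instead of calling __ctzdi2.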
--
rguenth at gcc dot gnu dot org changed:
What                  |Removed              |Added
----------------------------------------------------------------------------
Severity              |normal               |enhancement
Status                |UNCONFIRMED          |NEW
Ever Confirmed        |0                    |1
Keywords              |                     |missed-optimization
Last reconfirmed date |0000-00-00 00:00:00  |2007-04-25 15:38:21
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31695