On Mon, Jul 31, 2023 at 1:12 PM Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> wrote:

>
>
> On 31/07/23 15:35, Sunil K Pandey via Libc-alpha wrote:
> > Ffsll function size is 17 byte, this patch optimizes size to 16 byte.
> > Currently ffsll function randomly regress by ~20%, depending on how
> > code get aligned.
> >
> > This patch fixes ffsll function random performance regression.
> >
> > Changes from v1:
> > - Further reduce size ffsll function size to 12 bytes.
> > ---
> >  sysdeps/x86_64/ffsll.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/sysdeps/x86_64/ffsll.c b/sysdeps/x86_64/ffsll.c
> > index a1c13d4906..6a5803c7c1 100644
> > --- a/sysdeps/x86_64/ffsll.c
> > +++ b/sysdeps/x86_64/ffsll.c
> > @@ -26,13 +26,13 @@ int
> >  ffsll (long long int x)
> >  {
> >    long long int cnt;
> > -  long long int tmp;
> >
> > -  asm ("bsfq %2,%0\n"     /* Count low bits in X and store in %1.  */
> > -       "cmoveq %1,%0\n"   /* If number was zero, use -1 as result.  */
> > -       : "=&r" (cnt), "=r" (tmp) : "rm" (x), "1" (-1));
> > +  asm ("mov $-1,%k0\n"    /* Intialize CNT to -1.  */
> > +       "bsf %1,%0\n"      /* Count low bits in X and store in CNT.  */
> > +       "inc %k0\n"        /* Increment CNT by 1.  */
> > +       : "=&r" (cnt) : "r" (x));
> >
> > -  return cnt + 1;
> > +  return cnt;
> >  }
> >
> >  #ifndef __ILP32__
> >
>
> I still prefer if we can just remove this arch-optimized function in favor
> in compiler builtins.
>

Sure, compiler builtin should replace it in the long run. In the meantime,
can it get fixed?