From: Sunil Pandey <skpgkp2@gmail.com>
To: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: libc-alpha@sourceware.org, hjl.tools@gmail.com
Subject: Re: [PATCH v2] x86_64: Optimize ffsll function code size.
Date: Mon, 31 Jul 2023 16:44:02 -0700
Message-ID: <CAMAf5_cnjr3x2tTH9+LVmeAweyp7EAqUhgd8429Qv+UEAUWeAQ@mail.gmail.com>
In-Reply-To: <9608ac95-a7d5-7963-e4f8-5dc7b0247d82@linaro.org>
On Mon, Jul 31, 2023 at 3:57 PM Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> wrote:
>
>
> On 31/07/23 17:58, Sunil Pandey wrote:
> >
> >
> > On Mon, Jul 31, 2023 at 1:12 PM Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> wrote:
> >
> >
> >
> > On 31/07/23 15:35, Sunil K Pandey via Libc-alpha wrote:
> > > Ffsll function size is 17 byte, this patch optimizes size to 16 byte.
> > > Currently ffsll function randomly regress by ~20%, depending on how
> > > code get aligned.
> > >
> > > This patch fixes ffsll function random performance regression.
> > >
> > > Changes from v1:
> > > - Further reduce size ffsll function size to 12 bytes.
> > > ---
> > > sysdeps/x86_64/ffsll.c | 10 +++++-----
> > > 1 file changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/sysdeps/x86_64/ffsll.c b/sysdeps/x86_64/ffsll.c
> > > index a1c13d4906..6a5803c7c1 100644
> > > --- a/sysdeps/x86_64/ffsll.c
> > > +++ b/sysdeps/x86_64/ffsll.c
> > > @@ -26,13 +26,13 @@ int
> > > ffsll (long long int x)
> > > {
> > > long long int cnt;
> > > - long long int tmp;
> > >
> > > - asm ("bsfq %2,%0\n" /* Count low bits in X and store in %1. */
> > > - "cmoveq %1,%0\n" /* If number was zero, use -1 as result. */
> > > - : "=&r" (cnt), "=r" (tmp) : "rm" (x), "1" (-1));
> > > + asm ("mov $-1,%k0\n" /* Intialize CNT to -1. */
> > > + "bsf %1,%0\n" /* Count low bits in X and store in CNT. */
> > > + "inc %k0\n" /* Increment CNT by 1. */
> > > + : "=&r" (cnt) : "r" (x));
> > >
> > > - return cnt + 1;
> > > + return cnt;
> > > }
> > >
> > > #ifndef __ILP32__
> >
> >
> >
> > I still prefer if we can just remove this arch-optimized function in favor
> > in compiler builtins.
> >
> >
> > Sure, compiler builtin should replace it in the long run.
> > In the meantime, can it get fixed?
>
> This fix only works if compiler does not insert anything in the prologue, if
> you use CET or stack protector strong it might not work. And I *really*
> do not want to add another assembly optimization to a symbol that won't
> be used in most real programs.
>
v2 fixes that: the CET overhead is 4 bytes and the latest code size is only
12 bytes, so the total code size will be 16 bytes even if CET is enabled.
> And already have a fix to use compiler builtins [1].
>
Here is code generated from the builtin patch.
(gdb) b ffsll
Breakpoint 1 at 0x10a0
(gdb) run --direct
Starting program: benchtests/bench-ffsll --direct
Breakpoint 1, __ffsll (i=66900473708975203) at ffsll.c:30
30 return __builtin_ffsll (i);
(gdb) disass
Dump of assembler code for function __ffsll:
=> 0x00007ffff7c955a0 <+0>: bsf %rdi,%rdi
0x00007ffff7c955a4 <+4>: mov $0xffffffffffffffff,%rax
0x00007ffff7c955ab <+11>: cmove %rax,%rdi
0x00007ffff7c955af <+15>: lea 0x1(%rdi),%eax
0x00007ffff7c955b2 <+18>: ret
End of assembler dump.
(gdb)
This is not going to fix the problem. The builtin version is 19 bytes (the ret
is at offset +18), so the random ~20% variation in benchmarks will continue
even with the builtin patch.
I do not see anything wrong with using architecture features if it saves
people valuable debugging time. After all, overriding a generic implementation
with architecture-specific code is a valuable glibc feature.
> [1]
> https://patchwork.sourceware.org/project/glibc/patch/20230717143431.2075924-1-adhemerval.zanella@linaro.org/
>