From: Sergey Bugaev <bugaevc@gmail.com>
To: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v2] Mark more functions as __COLD
Date: Fri, 19 May 2023 13:35:29 +0300
Message-ID: <CAN9u=HdQ=o-KUG0Wsxav4b00DmgE5bnbzVCG+oKAFOiAEMGh2g@mail.gmail.com>
In-Reply-To: <98620b0e-7251-3781-2935-ce058a5953dc@linaro.org>
On Thu, May 18, 2023 at 10:43 PM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
> The rationale seems ok, some comments below.
Thanks. Any thoughts on the .text.{startup,exit} part?
> > -void
> > +void __COLD
> >  __libc_fatal (const char *message)
> >  {
> >    _dl_fatal_printf ("%s", message);
> >  }
> >  rtld_hidden_def (__libc_fatal)
> >
>
> Can't you just add on the function prototype at include/stdio.h? Same
> question for the __assert_fail and __assert_perror_fail below.
But I did just that (added __COLD to the prototypes in include/stdio.h
and include/assert.h), didn't I?
If you're saying that it's not worth repeating __COLD on the
definition, then sure, I could remove that if you prefer.
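For what it's worth, as far as I know GCC merges attributes from an earlier declaration into the later definition, so annotating only the prototype should be enough. A minimal sketch of that, where MY_COLD, fatal_error and checked_div are hypothetical stand-ins and not the actual glibc names:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for glibc's __COLD; assumed here to be
   the plain GCC 'cold' function attribute.  */
#define MY_COLD __attribute__ ((__cold__))

/* The attributes appear only on the prototype (as they would in a header).  */
void fatal_error (const char *message)
  MY_COLD __attribute__ ((__noreturn__));

/* The definition does not repeat them; it picks them up from the
   declaration above.  */
void
fatal_error (const char *message)
{
  fprintf (stderr, "%s", message);
  abort ();
}

/* A caller: the branch leading to fatal_error is treated as unlikely.  */
int
checked_div (int a, int b)
{
  if (b == 0)
    fatal_error ("division by zero\n");
  return a / b;
}
```

Repeating the attribute on the definition would be harmless, just redundant.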
> > +/* Intentionally not marked __COLD in the header, since this only causes GCC
> > +   to create a bunch of useless __foo_chk.cold symbols containing only a call
> > +   to this function; better just keep calling it directly. */
> >  extern void __chk_fail (void) __attribute__ ((__noreturn__));
> >  libc_hidden_proto (__chk_fail)
> >  rtld_hidden_proto (__chk_fail)
>
> Why exactly gcc generates the useless __foo_chk.cold for this case? Is this a
> bug or a limitation?
I don't know; your guess is as good as mine (probably better). My
guess is that GCC simply has no check that the code size saved by
moving the cold path into a separate section outweighs the cost of
the jump instruction needed to get there.
Here's what I'm getting specifically, on i686-gnu:
Dump of assembler code for function __ppoll_chk:
Address range 0x198760 to 0x19879e:
0x00198760 <+0>: 56 push %esi
0x00198761 <+1>: 53 push %ebx
0x00198762 <+2>: 83 ec 04 sub $0x4,%esp
0x00198765 <+5>: 8b 44 24 20 mov 0x20(%esp),%eax
0x00198769 <+9>: 8b 54 24 14 mov 0x14(%esp),%edx
0x0019876d <+13>: 8b 4c 24 10 mov 0x10(%esp),%ecx
0x00198771 <+17>: 8b 5c 24 18 mov 0x18(%esp),%ebx
0x00198775 <+21>: c1 e8 03 shr $0x3,%eax
0x00198778 <+24>: 8b 74 24 1c mov 0x1c(%esp),%esi
0x0019877c <+28>: 39 d0 cmp %edx,%eax
0x0019877e <+30>: 0f 82 9d bb e8 ff jb 0x24321 <__ppoll_chk.cold>
0x00198784 <+36>: 89 74 24 1c mov %esi,0x1c(%esp)
0x00198788 <+40>: 89 5c 24 18 mov %ebx,0x18(%esp)
0x0019878c <+44>: 89 54 24 14 mov %edx,0x14(%esp)
0x00198790 <+48>: 89 4c 24 10 mov %ecx,0x10(%esp)
0x00198794 <+52>: 83 c4 04 add $0x4,%esp
0x00198797 <+55>: 5b pop %ebx
0x00198798 <+56>: 5e pop %esi
0x00198799 <+57>: e9 b2 b9 fb ff jmp 0x154150 <__GI_ppoll>
Address range 0x24321 to 0x24326:
0x00024321 <-1524799>: e8 5c ff ff ff call 0x24282 <__GI___chk_fail>
End of assembler dump.
It spends 6 bytes on the 'jb __ppoll_chk.cold', only to reach a
'call __GI___chk_fail' that itself takes 5 bytes. That's negative
space savings, both overall and inside .text.
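The byte counts and branch targets can be double-checked from the raw bytes in the dump. A small sketch, assuming the usual x86 encoding where a near jb/call carries a little-endian 32-bit displacement relative to the address of the next instruction; rel32_target is a hypothetical helper name:

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the target of an x86 near branch whose last four bytes are a
   little-endian rel32 displacement.  The displacement is relative to
   the end of the instruction; the sum wraps modulo 2^32.  */
static uint32_t
rel32_target (uint32_t insn_addr, size_t insn_len, const uint8_t *insn)
{
  const uint8_t *d = insn + insn_len - 4;
  uint32_t disp = (uint32_t) d[0]
                  | (uint32_t) d[1] << 8
                  | (uint32_t) d[2] << 16
                  | (uint32_t) d[3] << 24;
  return insn_addr + (uint32_t) insn_len + disp;
}

/* Bytes copied from the disassembly of __ppoll_chk.  */
static const uint8_t jb_cold[] = { 0x0f, 0x82, 0x9d, 0xbb, 0xe8, 0xff };
static const uint8_t call_fail[] = { 0xe8, 0x5c, 0xff, 0xff, 0xff };
```

With the addresses from the dump, the jb at 0x19877e resolves to 0x24321 and the call at 0x24321 resolves to 0x24282; sizeof gives 6 and 5 bytes respectively.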
Frankly, the codegen for __ppoll_chk looks bad altogether, unless I'm
missing something. Why not:

    mov 20(%esp), %eax
    shr $3, %eax
    cmp 8(%esp), %eax
    jnb __GI_ppoll
    push %ebp
    mov %esp, %ebp
    call __GI___chk_fail

Then maybe it'd make sense to move the "push, mov, call" into
.text.unlikely, adding a jmp.
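For anyone who wants to poke at this outside of glibc, a reduced test case in the same shape as __ppoll_chk might look like the sketch below. All names here (my_pollfd, my_chk_fail, my_ppoll, my_ppoll_chk) are made up for illustration, and I have not verified which GCC versions or flags make it emit a .cold part for this exact input; compiling with optimization and inspecting the result with objdump would be the way to find out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical 8-byte element type, so the bound check divides by 8
   just like the 'shr $0x3' in the real function.  */
struct my_pollfd
{
  int fd;
  short events;
  short revents;
};

/* Stand-in for __chk_fail, deliberately marked cold to see whether the
   caller's failure path gets split out.  */
__attribute__ ((__cold__, __noreturn__, __noinline__))
void
my_chk_fail (void)
{
  abort ();
}

/* Stand-in for the real ppoll; just reports how many entries it saw.  */
__attribute__ ((__noinline__))
int
my_ppoll (struct my_pollfd *fds, unsigned long nfds)
{
  (void) fds;
  return (int) nfds;
}

/* Same structure as a fortify wrapper: one bound check, then a tail
   call into the real function.  */
int
my_ppoll_chk (struct my_pollfd *fds, unsigned long nfds, size_t fdslen)
{
  if (fdslen / sizeof (*fds) < nfds)
    my_chk_fail ();
  return my_ppoll (fds, nfds);
}
```

Dropping __cold__ from my_chk_fail and comparing the two disassemblies should show whether the attribute alone is what triggers the split.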
Sergey