From: Segher Boessenkool <segher@kernel.crashing.org>
To: Jeff Law <law@redhat.com>
Cc: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: Re: [PATCH 4/5] shrink-wrap: Shrink-wrapping for separate components
Date: Mon, 10 Oct 2016 22:23:00 -0000
Message-ID: <20161010222315.GA21712@gate.crashing.org>
In-Reply-To: <db1c033f-ed38-c165-2e3e-489383a92e49@redhat.com>
On Mon, Oct 10, 2016 at 03:21:31PM -0600, Jeff Law wrote:
> On 09/30/2016 04:34 AM, Segher Boessenkool wrote:
> >[ whoops, message too big, resending with the attachment compressed ]
> >
> >On Tue, Sep 27, 2016 at 03:14:51PM -0600, Jeff Law wrote:
> >>With transposition issue addressed, the only blocker I see are some
> >>simple testcases we can add to the suite. They don't have to be real
> >>extensive. And one motivating example for the list archives, ideally
> >>the glibc malloc case.
> >
> >And here is the malloc testcase.
> >
> >A very important (for performance) function is _int_malloc, which starts
> >with
> [ ... ]
> Thanks. What I think is important to note with this example is the bits
> that were pushed into the path with the sysmalloc/alloc_perturb calls.
> That's an unlikely path.
alloc_perturb is a no-op, and inlined as such: as nothing :-)
> We have to extrapolate a bit from the assembly provided. In the not
> separately shrink-wrapped version, we have a full prologue of stores and
> two instances of a full epilogue (though only one ever executes) provided.
>
> With separate shrink wrapping the (presumably) very cold path where we
> error has virtually no prologue/epilogue. That's probably a nop from a
> performance standpoint.
>
> More interesting is the path where we call sysmalloc/alloc_perturb, it's
> a cold path, but not as cold as the error path. We save/restore 4 regs
> in that case. Rather than a full prologue/epilogue. So there's clearly
> a savings there, though again, via the expect it's a cold path.
>
> Where we have to extrapolate is the hot path. Presumably on the hot
> path we're saving/restoring ~4 fewer registers. I haven't verified
> that, but that is kind of the whole point here.
We save/restore just four registers total on the hot path. And yes,
that is the point :-)
The hot exit is
.L683:
	ld 14,144(1)
	ld 15,152(1)
	ld 25,232(1)
	ld 30,272(1)
	addi 3,4,16
.L673:
	addi 1,1,288
	blr
so four GPR restores and no LR restore. Without separate shrink-wrapping
this was
.L641:
	addi 3,21,16
	b .L631
[ ... ]
.L631:
	addi 1,1,288
	ld 29,16(1)
	ld 14,-144(1)
	ld 15,-136(1)
	ld 16,-128(1)
	ld 17,-120(1)
	ld 18,-112(1)
	ld 19,-104(1)
	ld 20,-96(1)
	ld 21,-88(1)
	ld 22,-80(1)
	ld 23,-72(1)
	ld 24,-64(1)
	mtlr 29
	ld 25,-56(1)
	ld 26,-48(1)
	ld 27,-40(1)
	ld 28,-32(1)
	ld 29,-24(1)
	ld 30,-16(1)
	ld 31,-8(1)
	blr
(18 GPRs as well as LR).
I didn't show this path because there is a whole bunch of branches with
inline asm in the way.
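As a cross-check of the counts quoted for the two hot-exit listings, here is a small Python sketch that tallies the restores. It is only an illustration, not part of the patch: the helper name count_restores is made up, labels and the "[ ... ]" elision are left out of the listings, and it assumes that whenever an mtlr is present exactly one of the loads merely fetches the saved LR value into a scratch register.

```python
import re

# Hot exit with separate shrink-wrapping (labels omitted).
WITH_SEPARATE = """
ld 14,144(1)
ld 15,152(1)
ld 25,232(1)
ld 30,272(1)
addi 3,4,16
addi 1,1,288
blr
"""

# Hot exit without separate shrink-wrapping (labels and elision omitted).
WITHOUT_SEPARATE = """
addi 3,21,16
b .L631
addi 1,1,288
ld 29,16(1)
ld 14,-144(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
mtlr 29
ld 25,-56(1)
ld 26,-48(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
blr
"""


def count_restores(listing):
    """Return (gpr_restores, restores_lr) for one epilogue listing."""
    insns = [line.strip() for line in listing.strip().splitlines()]
    # Loads relative to the stack pointer (r1) are the register restores.
    loads = sum(1 for i in insns if re.fullmatch(r"ld \d+,-?\d+\(1\)", i))
    restores_lr = any(i.startswith("mtlr ") for i in insns)
    # Assumption: with mtlr present, one load only fetches the saved LR.
    gprs = loads - 1 if restores_lr else loads
    return gprs, restores_lr


print(count_restores(WITH_SEPARATE))     # (4, False)
print(count_restores(WITHOUT_SEPARATE))  # (18, True)
```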
The sysmalloc path was
.L635:
	li 4,0
.L761:
	addi 1,1,288
	mr 3,14
	ld 14,16(1)
	ld 15,-136(1)
	ld 16,-128(1)
	ld 17,-120(1)
	ld 18,-112(1)
	ld 19,-104(1)
	ld 20,-96(1)
	ld 21,-88(1)
	ld 22,-80(1)
	ld 23,-72(1)
	ld 24,-64(1)
	ld 25,-56(1)
	mtlr 14
	ld 26,-48(1)
	ld 14,-144(1)
	ld 27,-40(1)
	ld 28,-32(1)
	ld 29,-24(1)
	ld 30,-16(1)
	ld 31,-8(1)
	b sysmalloc
and now is
.L677:
	mr 3,14
	ld 15,152(1)
	ld 14,144(1)
	ld 25,232(1)
	ld 30,272(1)
	li 4,0
	addi 1,1,288
	b sysmalloc
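The two sysmalloc listings also suggest that the save slots themselves have not moved: the old code pops the 288-byte frame first and then loads at negative offsets, while the new code loads at positive offsets and pops the frame afterwards. A minimal Python sketch to check that reading follows. The helper name restore_slots is hypothetical, labels are left out, and the sketch assumes that every "addi 1,1,N" is a stack-pointer adjustment and that slots can be compared relative to the stack pointer after the frame is popped.

```python
import re

# sysmalloc path without separate shrink-wrapping (labels omitted).
OLD = """
li 4,0
addi 1,1,288
mr 3,14
ld 14,16(1)
ld 15,-136(1)
ld 16,-128(1)
ld 17,-120(1)
ld 18,-112(1)
ld 19,-104(1)
ld 20,-96(1)
ld 21,-88(1)
ld 22,-80(1)
ld 23,-72(1)
ld 24,-64(1)
ld 25,-56(1)
mtlr 14
ld 26,-48(1)
ld 14,-144(1)
ld 27,-40(1)
ld 28,-32(1)
ld 29,-24(1)
ld 30,-16(1)
ld 31,-8(1)
b sysmalloc
"""

# sysmalloc path with separate shrink-wrapping (label omitted).
NEW = """
mr 3,14
ld 15,152(1)
ld 14,144(1)
ld 25,232(1)
ld 30,272(1)
li 4,0
addi 1,1,288
b sysmalloc
"""


def restore_slots(listing):
    """Map register -> save slot, relative to r1 after the frame is popped."""
    insns = [line.strip() for line in listing.strip().splitlines()]
    pops = [re.fullmatch(r"addi 1,1,(\d+)", i) for i in insns]
    frame = sum(int(m.group(1)) for m in pops if m)  # total frame size
    popped = 0  # how much of the frame has been released so far
    slots = {}
    for insn in insns:
        pop = re.fullmatch(r"addi 1,1,(\d+)", insn)
        load = re.fullmatch(r"ld (\d+),(-?\d+)\(1\)", insn)
        if pop:
            popped += int(pop.group(1))
        elif load:
            reg, off = int(load.group(1)), int(load.group(2))
            # A later load of the same register overwrites the earlier one;
            # in OLD, r14 first holds the LR value and is then really restored.
            slots[reg] = off + popped - frame
    return slots


old, new = restore_slots(OLD), restore_slots(NEW)
print(all(old[reg] == slot for reg, slot in new.items()))  # True
```

So, under those assumptions, the four registers still restored on this path come from the same slots as before; only the order relative to the stack-pointer adjustment differs.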
I attach malloc.s.{no,yes}; I hope you can stomach that. Well, you
can read HP-PA, heh.
Segher
[-- Attachment #2: malloc.s.no.gz --]
[-- Type: application/x-gzip, Size: 40507 bytes --]
[-- Attachment #3: malloc.s.yes.gz --]
[-- Type: application/x-gzip, Size: 42479 bytes --]