public inbox for gcc-patches@gcc.gnu.org
From: Jakub Jelinek <jakub@redhat.com>
To: Florian Weimer <fweimer@redhat.com>
Cc: Jason Merrill <jason@redhat.com>, Michael Matz <matz@suse.de>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libgcc: Decrease size of _Unwind_FrameState and even more size of cleared area in uw_frame_state_for
Date: Mon, 19 Sep 2022 11:33:37 +0200
Message-ID: <Yyg3cRTh6eW6o228@tucnak>
In-Reply-To: <87czbrhg1y.fsf@oldenburg.str.redhat.com>

On Mon, Sep 19, 2022 at 11:25:13AM +0200, Florian Weimer wrote:
> * Jakub Jelinek:
> 
> > The disadvantage of the patch is that touching reg[x].loc and how[x]
> > now means 2 cachelines rather than one as before, and I admit beyond
> > bootstrap/regtest I haven't benchmarked it in any way.  Florian, could
> > you retry whatever you measured to get at the 40% of time spent on the
> > stack clearing to see how the numbers change?
> 
> A benchmark that unwinds through 100 frames containing a std::string
> variable goes from (0b5b8ac5cb7fe92dd17ae8bd7de84640daa59e84):
> 
> min:     24418 ns
> 25%:     24740 ns
> 50%:     24790 ns
> 75%:     24840 ns
> 95%:     24937 ns
> 99%:     26174 ns
> max:     42530 ns
> avg:   24826.1 ns
> 
> to (0b5b8ac5cb7fe92dd17ae8bd7de84640daa59e84 with this patch):
> 
> min:     22307 ns
> 25%:     22640 ns
> 50%:     22713 ns
> 75%:     22787 ns
> 95%:     22948 ns
> 99%:     24839 ns
> max:     52658 ns
> avg:   22863.4 ns
> 
> So 227 ns per frame instead of 248 ns per frame, or ~9% less.

Thanks for doing that.

> Moving cfa_how after how in struct frame_state_reg_info as an 8-bit
> bitfield should avoid zeroing another 8 bytes.  This shaves off another
> 3 ns per frame in my testing (on a Core i9-10900T, so with ERMS).

Good idea.  It won't always help: on some targets the size of the how array
could be divisible by the pointer alignment.  But cfa_how placed at the end
always increases the structure size by the alignment of a pointer, while
placed right after the how array it only does so if the size of how is a
multiple of the pointer alignment.

> The REP STOS still dominates uw_frame_state_for execution time, but this
> seems to be a profiling artifact.  Replacing it with PXOR and seven
> MOVUPS instructions makes the hotspot go away, but performance does not
> improve.  Odd.

	Jakub

