public inbox for binutils@sourceware.org
From: Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
To: Simon Sobisch <simonsobisch@gnu.org>
Cc: binutils@sourceware.org
Subject: Re: minor patch to improve gprofng performance (re: Bug 30898)
Date: Wed, 27 Sep 2023 19:47:36 -0700	[thread overview]
Message-ID: <771ac0b3-16a4-0e72-d371-fc0755448710@oracle.com> (raw)
In-Reply-To: <8a8c5ffa-f15d-c7a6-ea64-9afe3d42bdb1@gnu.org>

Hi Simon,
Thank you for your report.
See comments below.

On 9/26/23 04:25, Simon Sobisch wrote:
> Inspecting bug #30898 [1] showed that there is an issue when using the 
> disassembly option with huge (generated) functions.
>
> I gave this a test and found, via
>
>   perf record -o perf.data.gpdisplay --call-graph dwarf,38192 --aio -z \
>     --sample-cpu --mmap-pages 16M \
>   gprofng display text -name short:soname -metrics e.%totalcpu:name \
>     -disasm prog_ test.1.er > /dev/null
>
> that the problem is the disassembly handling.
> Checking the generated perf recording shows that the **burning hot** 
> place is DbeInstr::mapPCtoLine(SourceFile*), called by 
> Module::set_dis_data(Function*, int, int, int, bool, bool, int), which 
> takes more than 93.3% of all instructions.

This is a bit of a surprise to me.
It's likely that gcc inlines functions and generates DWARF that gprofng 
interprets poorly.

If you run:
   gprofng display src -dis prog_ <YOUR_EXECUTABLE_OR_LIBRARY_WHERE_FUNC_IS_LOCATED>
do you see the same performance problem?
If yes, may I get this binary?

>
> Running that took around 5 minutes. Redirecting the output to a file 
> leads to a file with 4,124,497 lines, so this _really_ is about huge 
> disassembly.


I generated a big function (~1,000,000 lines). The disassembly is 
10,000,037 lines. This took 38 sec.
But my test is trivial, and gcc generates trivial DWARF.
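For reference, a trivial generator along these lines (a sketch, not 
necessarily the exact test used here) produces a function of that size:

   // Emit a C file containing one function with ~1,000,000 generated
   // statements; compiling it with -O2 -g yields a huge disassembly.
   #include <cstdio>

   int main ()
   {
     std::printf ("int big (int x)\n{\n");
     for (int i = 0; i < 1000000; i++)
       std::printf ("  x += %d;\n", i);
     std::printf ("  return x;\n}\n");
     return 0;
   }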


>
> I've tinkered a bit with the burning hot function; the result is a 
> minor decrease when using code that is invalid under C++17. You find 
> it in the attached patch.
>
> Also attached is the recorded output for the hot function; 
> interestingly, the patched version showed quite clearly that over 60% 
> of the complete run's CPU instructions go to Hist_data.cc line 1380:
>
>       if (p->level == 0)
>
>
> For huge (GnuCOBOL-generated) functions, the attached patch drops the 
> counters reported by perf stat by 10%.
> Reported counters (median of 3 runs; code compiled with the default 
> options -O2 -g using g++ (GCC) 11.3) are as follows:
>
> Original version:
>
>         270,060.53 msec task-clock             #    0.999 CPUs utilized
>  1,023,551,049,245      cycles                 #    3.790 GHz
>  2,160,049,675,779      instructions           #    2.11  insn per cycle
>
> adjusted version using the "register" storage-class specifier for the 
> pointer, which was removed in C++17 (there is possibly a better way); 
> this decreased everything:
>
>         260,284.41 msec task-clock             #    0.999 CPUs utilized
>    986,393,903,158      cycles                 #    3.790 GHz
>  1,815,443,360,713      instructions           #    1.84  insn per cycle
>
> adjusted version that abides by C++17, where only the instruction 
> count decreased:
>
>         280,430.13 msec task-clock             #    0.999 CPUs utilized
>  1,062,479,713,621      cycles                 #    3.789 GHz
>  1,815,698,269,043      instructions           #    1.71  insn per cycle
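>
> For illustration, a standalone sketch of the language change involved 
> (not the attached patch; the function and names here are made up):
>
>   // 'register' was deprecated in C++11 and removed in C++17, so
>   // g++ -std=c++17 rejects the commented-out declaration below.
>   int sum (const int *a, int n)
>   {
>     int s = 0;
>     // register const int *p = a;   // ill-formed in C++17
>     const int *p = a;               // conforming replacement
>     for (; p != a + n; p++)
>       s += *p;
>     return s;
>   }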
>
>
>
> Alongside this change, a short-term _option_ to drop most of those 
> 60% (and, if it reduces the number of entries in there, a good portion 
> of the pointer walking) could be to keep a copy of func->inlinedSubr 
> _once_ that _only_ contains level-0 entries, as sketched below.
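>
> A hypothetical sketch of that option (the container type and names are 
> assumptions, not gprofng's actual API):
>
>   #include <vector>
>
>   struct InlinedSubr { int level; /* ... */ };
>
>   // Build a filtered copy holding only level-0 entries, so the hot
>   // loop only walks nodes that can actually match.
>   std::vector<InlinedSubr *>
>   level0_only (const std::vector<InlinedSubr *> &all)
>   {
>     std::vector<InlinedSubr *> out;
>     for (InlinedSubr *p : all)
>       if (p->level == 0)   // the check that dominated the profile
>         out.push_back (p);
>     return out;
>   }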
>
>
> But in the long term it seems more reasonable to recheck whether that 
> function should be rewritten or replaced to better support "huge 
> disassembly".
>
>
>
> Another note: the reserved memory use for gp-display-text topped 1.6 
> GB; there may be a way to improve that, too.

That is not normal.
It looks like gprofng always generates a new DbeLine in 
DbeInstr::mapPCtoLine().
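If so, caching by PC would avoid that. A minimal sketch (hypothetical 
names, not gprofng's actual code):

   #include <cstdint>
   #include <unordered_map>

   struct DbeLine;  // stands in for the gprofng type

   // Reuse one DbeLine per PC instead of creating a new one on every
   // call; 'compute' stands for the existing slow lookup path.
   DbeLine *
   map_pc_cached (std::uint64_t pc, DbeLine *(*compute) (std::uint64_t))
   {
     static std::unordered_map<std::uint64_t, DbeLine *> cache;
     auto it = cache.find (pc);
     if (it != cache.end ())
       return it->second;
     DbeLine *line = compute (pc);
     cache.emplace (pc, line);
     return line;
   }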

-Vladimir

>
>
> [1]: https://sourceware.org/bugzilla/show_bug.cgi?id=30898

