public inbox for gcc@gcc.gnu.org
From: Andi Kleen <ak@linux.intel.com>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: 172060045@hdu.edu.cn, gcc <gcc@gcc.gnu.org>,
	Eugene.Rozenfeld@microsoft.com
Subject: Re: State of AutoFDO in GCC
Date: Mon, 10 May 2021 08:36:06 -0700
Message-ID: <ecae0319-cb25-83a0-5bc6-383a16e778cc@linux.intel.com>
In-Reply-To: <20210509170121.GE25641@kam.mff.cuni.cz>


On 5/9/2021 10:01 AM, Jan Hubicka wrote:
>>> With my tests, AutoFDO could achieve almost half of the effect of
>>> instrumentation FDO on real applications such as MySQL 8.0.20 .
>> Likely this could be improved with some of the missing changes. Apparently
>> discriminator support is worth quite a bit especially on dense C++ code
>> bases. Without that, gcc autofdo only works on line numbers, which can be
>> very limiting if single lines have a lot of basic blocks.
>>
>> Sadly discriminator support is currently only on the old Google branch and
>> not in gcc mainline
>>
>> Longer term it would probably be best to replace all this with some custom
>> specialized binary annotation instead of stretching DWARF beyond its limits.
> I think it makes sense to pick the AutoFDO to the lists of things that
> would be nice to fix in GCC12.  I guess first we need to solve the
> issues with the tool producing autofdo gcda files and once that works
> setup some benchmarking so we know how things compare to FDO and they
> get tested.

It should work with my updated branch and latest perf. This is an old 
version before the great LLVMification and the regressions, so it builds 
on its own. I removed all the checks that broke with new perf versions, 
at least as far as I could test.

https://github.com/andikleen/autofdo/tree/perf-future

Longer term, I think the current autofdo tools setup is not the right 
approach anyway. The problem is that to get good results you need to 
keep autofdo running for a long time, but that makes perf write every 
sample to disk, producing gigantic perf.data files. For the gcc 
bootstrap runs we had cases where it reached GBs. This is all quite 
unnecessary: all it needs to do is keep some statistics on basic block 
edges, so keeping all the samples is not needed at all.

At a minimum we probably need to figure out how to run it online in 
perf pipe mode to avoid the temporary files. But the actual 
algorithms of create_gcov are rather simple, so maybe it could be 
converted into a simple online tool that does the profiling.
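
To illustrate the online idea, here is a minimal sketch. It assumes branch records arrive as text lines of whitespace-separated "from/to" hex address pairs, which is a simplified, hypothetical input format; real perf output carries more fields, and create_gcov does more than count edges. The point is only that one counter per edge is kept, so memory grows with the number of distinct edges rather than with the number of samples.

```python
from collections import Counter
from typing import Iterable


def count_edges(lines: Iterable[str]) -> Counter:
    """Aggregate (from, to) branch edges from a stream, one sample per line.

    Each line is assumed to hold whitespace-separated "from/to" pairs with
    hexadecimal addresses, e.g. "0x401000/0x401020 0x401030/0x401000".
    Any extra "/"-separated fields after the first two are ignored.
    """
    edges: Counter = Counter()
    for line in lines:
        for entry in line.split():
            parts = entry.split("/")
            if len(parts) < 2:
                continue  # not a branch entry
            try:
                src, dst = int(parts[0], 16), int(parts[1], 16)
            except ValueError:
                continue  # skip malformed addresses
            edges[(src, dst)] += 1
    return edges
```

Calling count_edges(sys.stdin) at the end of a pipe would consume samples as they are produced, so no intermediate file needs to be written.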

And then we need some tooling to find the profile data for a given 
binary in this combined output. Today it's rather difficult to patch 
build systems to always point to the right files. This would need some 
metadata or at least a file name convention for gcov. This would allow 
longer term profiling of full real systems with existing build systems.
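
One possible file name convention, as a sketch only: name each profile after the binary it was collected for, so the build system only has to know a profile directory rather than an exact file. The "<binary-name>.afdo" naming and the find_profile helper are assumptions made for illustration, not an existing gcc or autofdo convention.

```python
from pathlib import Path
from typing import Optional


def profile_name(binary: Path) -> str:
    """Hypothetical convention: the profile is named after the binary."""
    return f"{binary.name}.afdo"


def find_profile(profile_dir: Path, binary: Path) -> Optional[Path]:
    """Return the profile collected for this binary, or None if there is none."""
    candidate = profile_dir / profile_name(binary)
    return candidate if candidate.is_file() else None
```

A build rule could then fall back to a normal build when find_profile returns None, instead of failing on a missing file.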

> If you point me to the discriminator patches I can try to figure out how
> hard would be to mainline them.

It's difficult to find now because it was a branch in the old SVN that 
wasn't converted. Sadly the great git conversion was quite lossy.

IIRC it was separate patches in the google gcc_4_8 SVN branch (of which 
I don't seem to have a copy either), but in _4_9 they squashed 
everything in autofdo together. It could be recovered if someone has a 
backup of the old SVN repository and converts the google_4_8 branch to git.

Here in the 4_9 version you can search for get_discriminator and diff it 
against newer versions:

https://github.com/andikleen/gcc-old-svn/blob/6ff70bb2ef3cc0a5c6940030a89546bf40e70891/gcc/auto-profile.c#L393

and all the changes to other files were in the complete autofdo patch:

https://github.com/andikleen/gcc-old-svn/commit/d71978a93358a397fb80b20f3a65caad3d9addf1#diff-94f0fa7f897ccce65856dc5a98bae4bf6957a346766613d79414c976d093aa4a

You can also search for "discriminator" there.

The basic idea was quite simple. You have a unique value for the DWARF 
discriminator for each basic block, and then you include that in all the 
autofdo location comparisons.
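
A toy illustration of why this matters, using made-up sample counts and a simplified location key (the real comparisons in auto-profile.c involve more than this, and how merged values are combined is not the point here). With only line numbers, two basic blocks on the same source line end up sharing one entry; adding the discriminator to the key keeps them apart.

```python
from collections import Counter

# Hypothetical samples: (function, line offset, discriminator, count).
# Both basic blocks come from one source line, e.g. "x = c ? f() : g();".
samples = [
    ("foo", 3, 1, 900),  # hot arm
    ("foo", 3, 2, 10),   # cold arm
]

by_line = Counter()           # key without discriminator
by_line_and_disc = Counter()  # key including discriminator
for func, line, disc, count in samples:
    by_line[(func, line)] += count
    by_line_and_disc[(func, line, disc)] += count

# by_line has a single entry for ("foo", 3), so hot and cold are indistinguishable;
# by_line_and_disc keeps one entry per basic block.
```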

>   I am not too sure about custom
> annotations though - storing info about regions of code segment is
> always a pain and it is quite nice that dwarf provides support for that.
> But I guess we could go step by step.  I first need a working setup ;)

The thing about custom annotations is that it would make autofdo 
independent of the early inliner, which is one of the main reasons for 
the strange structure and placement of the autofdo passes. If we had an 
independent stable identifier for code regions, this could all be done 
in better places.

Maybe a similar thing could be done in dwarf, but it would seem a 
stretch because it's really designed around source code.

-Andi


