public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: "Martin Liška" <mliska@suse.cz>
Cc: Jakub Jelinek <jakub@redhat.com>,
	Andi Kleen <andi@firstfloor.org>, Jeff Law <law@redhat.com>,
		Nathan Sidwell <nathan@acm.org>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	"Hubicha, Jan" <jh@suse.cz>
Subject: Re: [RFC] Speed-up -fprofile-update=atomic
Date: Tue, 04 Oct 2016 09:45:00 -0000	[thread overview]
Message-ID: <CAFiYyc1vXtBaEO=NMLC+emcWb_egEemRDOA5TmHLz-gbwosEwA@mail.gmail.com> (raw)
In-Reply-To: <8932e842-2457-64a0-76bc-a81b9a9a9b31@suse.cz>

On Thu, Sep 15, 2016 at 12:00 PM, Martin Liška <mliska@suse.cz> wrote:
> On 09/07/2016 02:09 PM, Richard Biener wrote:
>> On Wed, Sep 7, 2016 at 1:37 PM, Martin Liška <mliska@suse.cz> wrote:
>>> On 08/18/2016 06:06 PM, Richard Biener wrote:
>>>> On August 18, 2016 5:54:49 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> On Thu, Aug 18, 2016 at 08:51:31AM -0700, Andi Kleen wrote:
>>>>>>> I'd prefer to make updates atomic in multi-threaded applications.
>>>>>>> The best proxy we have for that is -pthread.
>>>>>>>
>>>>>>> Is it slower, most definitely, but odds are we're giving folks
>>>>>>> garbage data otherwise, which in many ways is even worse.
>>>>>>
>>>>>> It will likely be catastrophically slower in some cases.
>>>>>>
>>>>>> Catastrophically as in too slow to be usable.
>>>>>>
>>>>>> An atomic instruction is a lot more expensive than a single
>>>>> increment. Also
>>>>>> they sometimes are really slow depending on the state of the machine.
>>>>>
>>>>> Can't we just have thread-local copies of all the counters (perhaps
>>>>> using
>>>>> __thread pointer as base) and just atomically merge at thread
>>>>> termination?
>>>>
>>>> I suggested that as well but of course it'll have its own class of issues (short lived threads, so we need to somehow re-use counters from terminated threads, large number of threads and thus using too much memory for the counters)
>>>>
>>>> Richard.
>>>
>>> Hello.
>>>
>>> I've got written the approach on my TODO list, let's see whether it would be doable in a reasonable amount of time.
>>>
>>> I've just finished some measurements to illustrate slow-down of -fprofile-update=atomic approach.
>>> All numbers are: no profile, -fprofile-generate, -fprofile-generate -fprofile-update=atomic
>>> c-ray benchmark (utilizing 8 threads, -O3): 1.7, 15.5., 38.1s
>>> unrar (utilizing 8 threads, -O3): 3.6, 11.6, 38s
>>> tramp3d (1 thread, -O3): 18.0, 46.6, 168s
>>>
>>> So the slow-down is roughly 300% compared to -fprofile-generate. I'm not having much experience with default option
>>> selection, but these numbers can probably help.
>>>
>>> Thoughts?
>>
>> Look at the generated code for an instrumented simple loop and see that for
>> the non-atomic updates we happily apply store-motion to the counter update
>> and thus we only get one counter update per loop exit rather than one per
>> loop iteration.  Now see what happens for the atomic case (I suspect you
>> get one per iteration).
>>
>> I'll bet this accounts for most of the slowdown.
>>
>> Back in time ICC which had atomic counter updates (but using function
>> calls - ugh!) had a > 1000% overhead with FDO for tramp3d (they also
>> didn't have early inlining -- removing abstraction helps reducing the number
>> of counters significantly).
>>
>> Richard.
>
> Hi.
>
> During Cauldron I discussed with Richi approaches how to speed-up ARCS
> profile counter updates. My first attempt is to utilize TLS storage, where
> every function is accumulating arcs counters. These are eventually added
> (using atomic operations) to the global one at the very end of a function.
> Currently I rely on target support of TLS, which is questionable whether
> to have such a requirement for -fprofile-update=atomic, or to add a new option value
> like -fprofile-update=atomic-tls?
>
> Running the patch on tramp3d, compared to previous numbers, it takes 88s to finish.
> Time shrinks to 50%, compared to the current implementation.
>
> Thoughts?

Hmm, I thought I suggested that you can simply use automatic storage
(which effectively
is TLS...) for regions that are not forked or abnormally left (which
means SESE regions
that have no calls that eventually terminate or throw externally).

So why did you end up with TLS?

Richard.

> Martin
>
>>
>>> Martin
>>>
>>>>
>>>>>      Jakub
>>>>
>>>>
>>>
>

  reply	other threads:[~2016-10-04  9:45 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-01  8:50 [PATCH 0/4] Various GCOV/PGO improvements marxin
2016-08-01  8:49 ` [PATCH 3/4] Fix typo in gcov.texi marxin
2016-08-01  8:50 ` [PATCH 4/4] Add tests for __gcov_dump and __gcov_reset marxin
2016-08-01  8:50 ` [PATCH 1/4] Cherry-pick fprofile-generate-atomic from google/gcc-4_9 branch marxin
2016-08-01 12:22   ` Nathan Sidwell
2016-08-01 13:29     ` Martin Liška
2016-08-04 14:48       ` Nathan Sidwell
2016-08-04 15:34         ` Martin Liška
2016-08-04 16:43           ` Nathan Sidwell
2016-08-04 17:03             ` Nathan Sidwell
2016-08-05  8:55               ` Martin Liška
2016-08-05 12:38                 ` Nathan Sidwell
2016-08-05 12:48                   ` Martin Liška
2016-08-05 13:14                     ` Nathan Sidwell
2016-08-05 13:43                       ` Martin Liška
2016-08-08 13:59                         ` [PATCH 5/N] Add new *_atomic counter update function, (-fprofile-update=atomic) Martin Liška
2016-08-08 15:24                           ` Nathan Sidwell
2016-08-08 16:51                             ` Martin Liška
2016-08-08 17:03                               ` Martin Liška
2016-08-09 12:36                                 ` Nathan Sidwell
2016-08-08 16:56                             ` [PATCH] Fix POW2 histogram Martin Liška
2016-08-09  8:41                               ` [PATCH 2/N] Fix usage of " Martin Liška
2016-08-09 12:37                                 ` Nathan Sidwell
2016-08-09 12:34                               ` [PATCH] Fix " Nathan Sidwell
2016-08-09 11:24                         ` [PATCH] Set -fprofile-update=atomic when -pthread is present Martin Liška
2016-08-09 12:40                           ` Nathan Sidwell
2016-08-09 19:04                           ` Andi Kleen
2016-08-12 13:31                             ` Martin Liška
2016-08-18  3:16                               ` Jeff Law
2016-08-18 11:02                                 ` Nathan Sidwell
2016-08-18 15:51                                 ` Andi Kleen
2016-08-18 15:53                                   ` Jeff Law
2016-10-03 12:13                                     ` Martin Liška
2016-10-03 12:26                                       ` Nathan Sidwell
2016-10-03 16:46                                         ` Jeff Law
2016-10-03 17:52                                           ` Andi Kleen
2016-10-04 12:05                                         ` Martin Liška
2016-10-05 17:54                                           ` Jeff Law
2016-10-13 15:34                                           ` [PATCH] Introduce -fprofile-update=maybe-atomic Martin Liška
2016-10-31  9:13                                             ` Martin Liška
2016-11-10 13:19                                               ` Martin Liška
2016-11-10 15:43                                                 ` Nathan Sidwell
2016-11-10 15:55                                                   ` David Edelsohn
2016-11-10 16:18                                                     ` Nathan Sidwell
2016-11-10 15:58                                                   ` Martin Liška
2016-11-10 16:17                                                     ` David Edelsohn
2016-11-10 16:24                                                       ` Martin Liška
2016-11-10 17:31                                                         ` Nathan Sidwell
2016-11-11 10:48                                                           ` Martin Liška
2016-11-11 15:11                                                             ` Nathan Sidwell
2016-11-10 16:14                                                   ` Nathan Sidwell
2016-11-10 16:16                                                     ` David Edelsohn
2016-08-18 15:54                                   ` [PATCH] Set -fprofile-update=atomic when -pthread is present Jakub Jelinek
2016-08-18 16:06                                     ` Richard Biener
2016-09-07 11:41                                       ` Martin Liška
     [not found]                                         ` <CAFiYyc0UaSzXhZmyG9QRkHGT4JFowxBfE2yb-NvXE=hR1xafdA@mail.gmail.com>
2016-09-15 10:18                                           ` [RFC] Speed-up -fprofile-update=atomic Martin Liška
2016-10-04  9:45                                             ` Richard Biener [this message]
2016-10-12 13:53                                               ` Martin Liška
2016-10-13  9:43                                                 ` Richard Biener
2016-10-17 11:47                                                   ` Martin Liška
     [not found]                                                     ` <CAFiYyc3eDT4g926PPZuktz5fEW=k-PibAcxhigx4GBcxoXNJFQ@mail.gmail.com>
2016-10-24 12:09                                                       ` Martin Liška
     [not found]                                                         ` <CAFiYyc1tSdTdqqkHcMp+dgE43+8tHL6kY8E07TCHoZBeUT-ggQ@mail.gmail.com>
2016-10-25 14:32                                                           ` Martin Liška
2016-10-26  9:29                                                             ` Richard Biener
2016-10-26  9:32                                                               ` Richard Biener
2016-08-18 16:04                                   ` [PATCH] Set -fprofile-update=atomic when -pthread is present Richard Biener
2016-08-10 12:57                         ` [PATCH 1/4] Cherry-pick fprofile-generate-atomic from google/gcc-4_9 branch Nathan Sidwell
2016-08-13 12:14                         ` [BUILDROBOT] avr broken (was: [PATCH 1/4] Cherry-pick fprofile-generate-atomic from google/gcc-4_9 branch) Jan-Benedict Glaw
     [not found]                           ` <4455937b-eba7-fe66-fe1a-3172567dd1e4@suse.cz>
2016-08-16 13:36                             ` [BUILDROBOT] avr broken Nathan Sidwell
     [not found]                               ` <617e8799-b7db-fefd-b3a3-842e9a7decfd@suse.cz>
2016-08-16 14:31                                 ` Nathan Sidwell
2016-08-16 17:05                                   ` Jan-Benedict Glaw
2016-08-16 18:26                                     ` Nathan Sidwell
2016-08-17  7:21                                       ` Denis Chertykov
2016-08-17  7:22                                         ` Martin Liška
2016-08-17  8:11                                       ` Jan-Benedict Glaw
2016-08-16 12:56                         ` [PATCH] Detect whether target can use -fprofile-update=atomic Martin Liška
2016-08-16 14:31                           ` Nathan Sidwell
2016-09-06 10:57                             ` Martin Liška
2016-09-06 11:17                               ` David Edelsohn
2016-09-06 12:15                                 ` Nathan Sidwell
2016-09-06 12:39                                   ` Jakub Jelinek
2016-09-06 12:43                                     ` David Edelsohn
2016-09-06 12:41                                   ` David Edelsohn
2016-09-06 12:51                                     ` Martin Liška
2016-09-06 13:13                                       ` Jakub Jelinek
2016-09-06 13:15                                         ` Martin Liška
2016-09-06 13:45                                           ` Jakub Jelinek
2016-09-06 13:50                                             ` Martin Liška
2016-09-06 14:06                                               ` Jakub Jelinek
2016-09-07  7:52                                               ` Christophe Lyon
2016-09-07  9:35                                                 ` Martin Liška
2016-09-07 16:06                                                   ` Christophe Lyon
2016-09-12 20:20                                                   ` Jeff Law
2016-09-29  8:31                                               ` Rainer Orth
2016-08-01  8:50 ` [PATCH 2/4] Remove __gcov_indirect_call_profiler marxin
2016-08-01 12:11 ` [PATCH 0/4] Various GCOV/PGO improvements Nathan Sidwell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFiYyc1vXtBaEO=NMLC+emcWb_egEemRDOA5TmHLz-gbwosEwA@mail.gmail.com' \
    --to=richard.guenther@gmail.com \
    --cc=andi@firstfloor.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=jh@suse.cz \
    --cc=law@redhat.com \
    --cc=mliska@suse.cz \
    --cc=nathan@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).