Re: Future atomic operation cleanup

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

From: Torvald Riegel <triegel@redhat.com>
To: Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Cc: libc-alpha@sourceware.org
Subject: Re: Future atomic operation cleanup
Date: Fri, 29 Aug 2014 09:55:00 -0000	[thread overview]
Message-ID: <1409306115.6045.71.camel@triegel.csb> (raw)
In-Reply-To: <53FF5889.5080006@linux.vnet.ibm.com>

On Thu, 2014-08-28 at 13:27 -0300, Adhemerval Zanella wrote:
> On 28-08-2014 12:37, Torvald Riegel wrote:
> > On Thu, 2014-08-28 at 10:57 -0300, Adhemerval Zanella wrote:
> >> On 22-08-2014 11:00, Richard Henderson wrote:
> >>> On 08/22/2014 06:50 AM, Adhemerval Zanella wrote:
> >>>> Hi,
> >>>>
> >>>> Following comments from my first patch to optimize single-thread internal
> >>>> glibc locking/atomics [1], I have changed the implementation to use now
> >>>> relaxed atomics instead.  Addresing the concerns raised in last discussion, 
> >>>> the primitives are still signal-safe (although not thread-safe), so if future
> >>>> malloc implementation is changed to be async-safe, it won't require to a
> >>>> adjust powerpc atomics.
> >>>>
> >>>> For catomic_and and catomic_or I follow the definition at 'include/atomic.h'
> >>>> (which powerpc is currently using) and implemented the atomics with acquire
> >>>> semantics.  The new implementation also is simpler.
> >>>>
> >>>> On synthetic benchmarks it shows an improvement of 5-10% for malloc
> >>>> calls and an performance increase of 7-8% in 483.xalancbmk from
> >>>> speccpu2006 (number from an POWER8 machine).
> >>>>
> >>>> Checked on powerpc64, powerpc32 and powerpc64le.
> >>> Wow, that's a lot of boiler-plate.
> >>>
> >>> When development opens again, can we simplify all of these atomic operations by
> >>> assuming compiler primitives?  That is, use the __atomic builtins for gcc 4.8
> >>> and later, fall back to the __sync builtins for earlier gcc, and completely
> >>> drop support for truly ancient compilers that support neither.
> >>>
> >>> As a bonus we'd get to unify the implementations across all of the targets.
> >>>
> >>>
> >>> r~
> >>>
> >> I also agree we should move to more a unified implementation (in fact, I
> >> plan to get rid of powerpc lowlevellock.h when devel opens again).  However
> >> I really don't want to either wait or reimplement all the custom atomic to push
> >> this optimization... 
> > I believe that, unless the caveat Joseph mentioned actually applies to
> > the archs you're concerned about, using the compiler builtins and doing
> > the unification would simplify your patch considerably.  It would also
> > avoid having to iterate over the changes of your patch again once we do
> > all of the unification.
> >
> >> I think such change will require a lot of iteration and testing, which is not
> >> the intend of this patch.
> > If we move to C11-like atomics, then those will certainly go in
> > incrementally, and exist in parallel with the current atomics for a
> > while (until we reviewed all existing code using the old atomics and
> > moved it over to the new atomics).
> >
> > If we use the compiler builtins, we'd also have to test less because we
> > have no additional custom atomics implementation we need to maintain.
> >
> I do agree moving to a compiler specific is the best way, although for current
> status where we support GCC 4.4 we still need hand crafted assembly for atomics.
> Also, the GCC atomics are only support for 4.7+ afaik [1] and C11 for 4.9.

Note that I'm not arguing for using C11 at this point, but using atomics
that are very much like the atomic_*_explicit operations.  The function
names can be different, but functionality, parameter ordering, etc.,
would be similar.

> We can make what Joseph suggested and move minimum compiler required for GCC 4.7.
> 
> But if we intend to actually add mininum GCC 4.7, I can work on changing all the
> atomics to GCC builtins.

Another option would be to only enable the new atomics that we don't
have right now (e.g., memory_order_relaxed) when building with 4.7 or
higher, and falling back to some of the custom assembly ones otherwise.
This would mean no performance regressions but the optimizations would
only be effective for 4.7 or higher.  memory_order_relaxed loads/stores
could be an exception, because I think we need to make them explicity,
and we don't want to slow down code that now uses plain nonatomic
loads/stores in cases where it should actually be memory_order_relaxed
accesses.

next prev parent reply	other threads:[~2014-08-29  9:55 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-22 13:50 [PATCH v2] PowerPC: libc single-thread lock optimization Adhemerval Zanella
2014-08-22 14:00 ` Future atomic operation cleanup Richard Henderson
2014-08-22 14:50   ` Joseph S. Myers
2014-08-22 17:26   ` Torvald Riegel
2014-08-28 13:57   ` Adhemerval Zanella
2014-08-28 15:37     ` Torvald Riegel
2014-08-28 16:28       ` Adhemerval Zanella
2014-08-29  9:55         ` Torvald Riegel [this message]
2014-08-29 11:10           ` Adhemerval Zanella
2014-08-29 13:28             ` Torvald Riegel
2014-08-29 13:32               ` Adhemerval Zanella
2014-08-29 16:33                 ` Chris Metcalf
2014-10-23 17:12                   ` Carlos O'Donell
2014-08-29 21:52                 ` Roland McGrath
2014-09-05 17:51                   ` Torvald Riegel
2016-03-11 18:36                 ` [PATCH v3] PowerPC: libc single-thread lock optimization Tulio Magno Quites Machado Filho
2016-03-11 18:39                   ` Florian Weimer
2016-03-11 21:12                     ` Tulio Magno Quites Machado Filho
2016-03-28 17:36                   ` [PING][PATCH " Tulio Magno Quites Machado Filho
2016-04-07 11:24                   ` [PATCH " Torvald Riegel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1409306115.6045.71.camel@triegel.csb \
    --to=triegel@redhat.com \
    --cc=azanella@linux.vnet.ibm.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).