public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction
@ 2013-12-09 21:56 ben.maurer at gmail dot com
  2013-12-09 21:59 ` [Bug libstdc++/59439] " ben.maurer at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: ben.maurer at gmail dot com @ 2013-12-09 21:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

            Bug ID: 59439
           Summary: std::locale uses atomic instructions on construction
           Product: gcc
           Version: 4.8.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ben.maurer at gmail dot com

In a large multithreaded program that uses stringstream to stringify integers,
I have seen that construction and deconstruction of std::locale can take over
10% of the runtime according to the perf tool.

In bug 40088, std::locale was optimized to avoid the usage of a mutex when
creating the default locale. In our program, this path is being exercised.
However, the act of doing ref counting still has a very large performance
penalty on SMP systems due to cache line bouncing.

If this locale is truly readonly, it would be much better if refcounting could
be avoided. Maybe a refcount of -1 could signify "this object need not be
refcounted".


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/59439] std::locale uses atomic instructions on construction
  2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
@ 2013-12-09 21:59 ` ben.maurer at gmail dot com
  2013-12-09 23:55 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: ben.maurer at gmail dot com @ 2013-12-09 21:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

--- Comment #1 from Ben Maurer <ben.maurer at gmail dot com> ---
Facebook is putting a $50 bounty on this bug via bountysource:
https://www.bountysource.com/issues/1350875


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/59439] std::locale uses atomic instructions on construction
  2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
  2013-12-09 21:59 ` [Bug libstdc++/59439] " ben.maurer at gmail dot com
@ 2013-12-09 23:55 ` pinskia at gcc dot gnu.org
  2013-12-10  0:39 ` ben.maurer at gmail dot com
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2013-12-09 23:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Is there a simple performance testcase you could provide?


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/59439] std::locale uses atomic instructions on construction
  2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
  2013-12-09 21:59 ` [Bug libstdc++/59439] " ben.maurer at gmail dot com
  2013-12-09 23:55 ` pinskia at gcc dot gnu.org
@ 2013-12-10  0:39 ` ben.maurer at gmail dot com
  2013-12-10  0:54 ` ben.maurer at gmail dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: ben.maurer at gmail dot com @ 2013-12-10  0:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

--- Comment #3 from Ben Maurer <ben.maurer at gmail dot com> ---
Created attachment 31405
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31405&action=edit
Benchmark

 _build/opt/experimental/bmaurer/benchmark
snprintf
1 threads took 20 ms
2 threads took 20 ms
3 threads took 20 ms
4 threads took 20 ms
5 threads took 22 ms
..
31 threads took 43 ms
iostream
1 threads took 108 ms
2 threads took 219 ms
3 threads took 371 ms
4 threads took 451 ms
5 threads took 559 ms
6 threads took 655 ms
7 threads took 763 ms
8 threads took 908 ms
9 threads took 1015 ms
...
31 threads took 3176 ms


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/59439] std::locale uses atomic instructions on construction
  2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
                   ` (2 preceding siblings ...)
  2013-12-10  0:39 ` ben.maurer at gmail dot com
@ 2013-12-10  0:54 ` ben.maurer at gmail dot com
  2013-12-10 23:59 ` redi at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: ben.maurer at gmail dot com @ 2013-12-10  0:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

--- Comment #4 from Ben Maurer <ben.maurer at gmail dot com> ---
Also, here's where perf says time is being spent. While only 25% is shown as
being in the locale constructor/destructor, I suspect that the time spent in
other methods is actually related to the ping-ponging of cachelines caused by
the constructor -- all the time is being spent in memory access which should be
very hot in the CPU cache.

    16.77%  benchmark  libstdc++.so.6.0.18  [.] std::locale::locale()
    10.42%  benchmark  libstdc++.so.6.0.18  [.] std::locale::~locale()
     9.93%  benchmark  libstdc++.so.6.0.18  [.] bool
std::has_facet<std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
     9.84%  benchmark  libstdc++.so.6.0.18  [.] std::num_put<char,
std::ostreambuf_iterator<char, std::char_traits<char> > > const&
std::use_facet<std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
     9.26%  benchmark  libstdc++.so.6.0.18  [.] bool
std::has_facet<std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
     9.10%  benchmark  libstdc++.so.6.0.18  [.] std::num_get<char,
std::istreambuf_iterator<char, std::char_traits<char> > > const&
std::use_facet<std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
     8.78%  benchmark  libstdc++.so.6.0.18  [.] std::ctype<char> const&
std::use_facet<std::ctype<char> >(std::locale const&)
     6.14%  benchmark  libstdc++.so.6.0.18  [.]
std::locale::operator=(std::locale const&)
     3.71%  benchmark  libstdc++.so.6.0.18  [.]
std::__use_cache<std::__numpunct_cache<char> >::operator()(std::locale const&)
const
     3.66%  benchmark  libstdc++.so.6.0.18  [.] bool
std::has_facet<std::ctype<char> >(std::locale const&)
     3.42%  benchmark  libstdc++.so.6.0.18  [.] __dynamic_cast
     1.96%  benchmark  libstdc++.so.6.0.18  [.] std::locale::id::_M_id() const
     0.89%  benchmark  benchmark            [.] doIoStream()


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/59439] std::locale uses atomic instructions on construction
  2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
                   ` (3 preceding siblings ...)
  2013-12-10  0:54 ` ben.maurer at gmail dot com
@ 2013-12-10 23:59 ` redi at gcc dot gnu.org
  2013-12-11  0:23 ` ben.maurer at gmail dot com
  2014-04-20  8:33 ` antoshkka at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: redi at gcc dot gnu.org @ 2013-12-10 23:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Don't do that then ;-)

Noone ever claimed std::stringstream is a good way to convert integers to
strings in performance critical code. Even if the std::locale construction is
optimized further, stringstream still involves far more code than necessary for
a simple conversion (to handle field widths, hex output and everything else
that iostreams support).

std::to_string() is a better alternative in C++11, re-using an existing
stringstream object is another alternative, or just use your snprintf version.

There's a good reason boost::lexical_cast is special-cased for the common
int-to-string conversions that don't need fancy formatting.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/59439] std::locale uses atomic instructions on construction
  2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
                   ` (4 preceding siblings ...)
  2013-12-10 23:59 ` redi at gcc dot gnu.org
@ 2013-12-11  0:23 ` ben.maurer at gmail dot com
  2014-04-20  8:33 ` antoshkka at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: ben.maurer at gmail dot com @ 2013-12-11  0:23 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

--- Comment #6 from Ben Maurer <ben.maurer at gmail dot com> ---
I agree this isn't necessarily the cleanest way to accomplish this task.

My discovery of this bug came through a real life application which was
deployed to real users and had a noticeable impact due to this issue. This
isn't benchmarking for the sake of benchmarking, an actual developer decided to
write code this way. In the context of their code, choosing stringstream made
slightly more sense because they were combining multiple integers and strings.
However, bug 40088 suggests that other users may have run into this problem in
the past. Given that iostream shows slower, but still reasonable performance in
the single-thread case, it's possible others will run into it in the future.
Machines will only be getting more cores in the future, so this problem may
increase in impact over time.

In any case, I've fixed the original application, and am filing this bug in the
hopes that we can save others debugging time in the future finding bottlenecks
like this.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug libstdc++/59439] std::locale uses atomic instructions on construction
  2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
                   ` (5 preceding siblings ...)
  2013-12-11  0:23 ` ben.maurer at gmail dot com
@ 2014-04-20  8:33 ` antoshkka at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: antoshkka at gmail dot com @ 2014-04-20  8:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59439

Antony Polukhin <antoshkka at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |antoshkka at gmail dot com

--- Comment #7 from Antony Polukhin <antoshkka at gmail dot com> ---
(In reply to Jonathan Wakely from comment #5)
> There's a good reason boost::lexical_cast is special-cased for the common
> int-to-string conversions that don't need fancy formatting.

I'm the current maintainer of boost::lexical_cast and I'd like to add my 2
cetns.

Construction of std::locale is a common case all around the Boost libraries.
Here are some of the cases when std::locale is constructed:
* in almost any conversion with boost::lexical_cast
* in all the string related algorithms in Boost.Algorithm (like compare(),
erase(), find() and others)
* in methods of Boost.Locale like to_upper, to_lower, normalize and others...
* in parsers of Boost.PropertyTree

In most cases I've seen only the global 'C' locale is used in user code, so
optimizing it's construction/descruction/copying will have a good impact on
code performance.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-04-20  8:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-09 21:56 [Bug libstdc++/59439] New: std::locale uses atomic instructions on construction ben.maurer at gmail dot com
2013-12-09 21:59 ` [Bug libstdc++/59439] " ben.maurer at gmail dot com
2013-12-09 23:55 ` pinskia at gcc dot gnu.org
2013-12-10  0:39 ` ben.maurer at gmail dot com
2013-12-10  0:54 ` ben.maurer at gmail dot com
2013-12-10 23:59 ` redi at gcc dot gnu.org
2013-12-11  0:23 ` ben.maurer at gmail dot com
2014-04-20  8:33 ` antoshkka at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).