public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/101923] New: std::function's move ctor is slower than the copy one for empty source objects
@ 2021-08-15 13:17 dartdart26 at gmail dot com
2021-08-15 13:38 ` [Bug libstdc++/101923] " dartdart26 at gmail dot com
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: dartdart26 at gmail dot com @ 2021-08-15 13:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Bug ID: 101923
Summary: std::function's move ctor is slower than the copy one
for empty source objects
Product: gcc
Version: 9.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: dartdart26 at gmail dot com
Target Milestone: ---
std::function's move constructor calls swap() regardless of whether the
source object is empty. In contrast, the copy constructor first checks
whether the source object is empty and, if it is, does nothing, because the
`this` object has already been constructed in an empty state by
_Function_base().
Calling swap() on an empty source requires more work, because some data still
has to be copied: for example, the POD data cannot be moved.
Could the move constructor check whether the source is empty too, as the copy
one does? Please let me know if I am missing a rule that prevents that.
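The difference described above can be modelled with a toy type. This is only a sketch under stated assumptions: `ToyFunction`, `storage`, `manager` and `invoker` are hypothetical names standing in for the real members, and the layout is not libstdc++'s. It puts a copy constructor that returns early for an empty source next to a move constructor that always swaps.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Hypothetical stand-in for the internals of std::function: a small POD
// buffer plus two pointers. Names and layout are illustrative only.
struct ToyFunction {
    unsigned char storage[16] = {};  // plays the role of the POD functor data
    void (*manager)() = nullptr;     // non-null means "holds a target"
    void (*invoker)() = nullptr;

    ToyFunction() = default;

    // Copy: test for the empty case first; an empty source costs one branch.
    ToyFunction(const ToyFunction& other) {
        if (other.manager) {
            std::memcpy(storage, other.storage, sizeof storage);
            manager = other.manager;
            invoker = other.invoker;
        }
    }

    // Move via swap: exchanges every member even when the source is empty.
    ToyFunction(ToyFunction&& other) noexcept { swap(other); }

    void swap(ToyFunction& other) noexcept {
        std::swap(storage, other.storage);  // element-wise swap of the buffer
        std::swap(manager, other.manager);
        std::swap(invoker, other.invoker);
    }

    explicit operator bool() const noexcept { return manager != nullptr; }
};
```

Both constructors leave an empty destination when the source is empty; the swap-based move just touches more memory to get there.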
I noticed this on version 9.3.0, but I see the code is the same in current
master at:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/bits/std_function.h;hb=c22bcfd2f7dc9bb5ad394720f4a612327dc898ba#l391
I have tested on a MacBook M1 and the copy ctor for empty sources is almost 2x
faster than the move ctor:
-----------------------------------------------------
Benchmark           Time             CPU   Iterations
-----------------------------------------------------
copy            0.945 ns        0.945 ns    555789159
move             1.83 ns         1.83 ns    382183169
I have made a YouTube video describing my findings and the benchmark
results:
https://www.youtube.com/watch?v=WA3mKab-tn8
* [Bug libstdc++/101923] std::function's move ctor is slower than the copy one for empty source objects
From: dartdart26 at gmail dot com @ 2021-08-15 13:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #1 from Petar Ivanov <dartdart26 at gmail dot com> ---
Benchmark code (using Google Benchmark):
#include <benchmark/benchmark.h>

#include <functional>
#include <utility>

struct Car {};

static void copy(benchmark::State& state) {
  for (auto _ : state) {
    const auto f = std::function<void(const Car&)>{};
    const auto copied = f;
    benchmark::DoNotOptimize(copied);
  }
}

static void move(benchmark::State& state) {
  for (auto _ : state) {
    auto f = std::function<void(const Car&)>{};
    const auto moved = std::move(f);
    benchmark::DoNotOptimize(moved);
  }
}

BENCHMARK(copy);
BENCHMARK(move);
BENCHMARK_MAIN();
* [Bug libstdc++/101923] std::function's move ctor is slower than the copy one for empty source objects
From: nok.raven at gmail dot com @ 2021-08-15 17:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Nikita Kniazev <nok.raven at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |nok.raven at gmail dot com
--- Comment #2 from Nikita Kniazev <nok.raven at gmail dot com> ---
There is no difference in the produced code on trunk (except move ops order)
https://godbolt.org/z/esfjhr9ae
* [Bug libstdc++/101923] std::function's move ctor is slower than the copy one for empty source objects
From: dartdart26 at gmail dot com @ 2021-08-16 7:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #3 from Petar Ivanov <dartdart26 at gmail dot com> ---
Thank you for pointing out the output on x86!
Following that, I checked O2 and O3 on ARM64 and I see differences, though I
cannot say what their actual impact is:
O2: https://godbolt.org/z/P9Garznef
O3: https://godbolt.org/z/Yb1q33YP3
In terms of x86, I ran the benchmark in Quick Bench (I assume x86, as that is
what the disassembly shows) and the results are similar to my findings on
ARM64 - move being slower:
https://quick-bench.com/q/vK9eSYngutKGo4QSPcdra9gUOI0
The benchmark code seems correct to me, but I might be missing something, might
be misusing DoNotOptimize() or there might be some side effects.
I am sure this is not a big deal. I was just wondering whether adding an if
statement is doable; if so, it seems like a quick and easy win.
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: pinskia at gcc dot gnu.org @ 2021-08-16 7:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
Component|libstdc++ |tree-optimization
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Hmm
  __tmp = MEM[(union _Any_data & {ref-all})&f];
  MEM[(union _Any_data * {ref-all})&f] = MEM[(union _Any_data & {ref-all})&moved];
  MEM[(union _Any_data * {ref-all})&moved] = __tmp;
  __tmp ={v} {CLOBBER};
  _13 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct Car &) &)&f + 24];
  _14 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct Car &) &)&moved + 24];
  MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car &) &)&f + 24] = _14;
  MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car &) &)&moved + 24] = _13;
So a missed optimization at the gimple level.
But note the arm64 compiler on godbolt is a few months old, 20210528. There
might have been some fixes which improve this already.
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: dartdart26 at gmail dot com @ 2021-08-17 6:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #5 from Petar Ivanov <dartdart26 at gmail dot com> ---
(In reply to Andrew Pinski from comment #4)
> Hmm
>
> __tmp = MEM[(union _Any_data & {ref-all})&f];
> MEM[(union _Any_data * {ref-all})&f] = MEM[(union _Any_data &
> {ref-all})&moved];
> MEM[(union _Any_data * {ref-all})&moved] = __tmp;
> __tmp ={v} {CLOBBER};
> _13 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct
> Car &) &)&f + 24];
> _14 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct
> Car &) &)&moved + 24];
> MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car
> &) &)&f + 24] = _14;
> MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car
> &) &)&moved + 24] = _13;
>
> So a missed optimization at the gimple level.
> But note the arm64 compiler on godbolt is a few months old, 20210528. There
> might have been some fixes which improve this already.
I see, thank you.
Do you think the x86 results on quick bench are something worth improving? From
a user's perspective, I assume the expectation is that moves are at least as
fast as copies.
Could you please advise on how I can proceed with this report? Can a change be
made in libstdc++ or should it be considered a compiler issue?
Thank you!
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: redi at gcc dot gnu.org @ 2021-08-17 10:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Petar Ivanov from comment #5)
> Could you please advise on how I can proceed with this report? Can a change
> be made in libstdc++ or should it be considered a compiler issue?
Both, I think.
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: redi at gcc dot gnu.org @ 2021-08-17 10:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
We can do better than just making the swap conditional:
      function(function&& __x) noexcept
      : _Function_base(), _M_invoker(__x._M_invoker)
      {
        if (static_cast<bool>(__x))
          {
            _M_functor = __x._M_functor;
            _M_manager = __x._M_manager;
            __x._M_manager = nullptr;
            __x._M_invoker = nullptr;
          }
      }
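Whatever form the constructor takes, its observable behaviour should stay the same. A minimal sketch of a sanity check, using only the public std::function interface; the helper names `move_of_empty_is_empty` and `call_after_move` are made up for this example and are not part of any test suite mentioned in this thread:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Moving from an empty std::function must produce an empty std::function.
inline bool move_of_empty_is_empty() {
    std::function<int(int)> src;                   // no target
    std::function<int(int)> dst = std::move(src);  // move-construct
    return !dst;
}

// Moving from a non-empty std::function must transfer the target intact.
inline int call_after_move(int x) {
    std::function<int(int)> src = [](int v) { return v + 1; };
    std::function<int(int)> dst = std::move(src);
    return dst(x);  // invokes the lambda that used to live in src
}
```

The state of a non-empty source after the move is left unchecked on purpose, since the standard only promises a valid but unspecified value there.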
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: cvs-commit at gcc dot gnu.org @ 2021-08-17 13:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:0808b0df9c4d31f4c362b9c85fb538b6aafcb517
commit r12-2959-g0808b0df9c4d31f4c362b9c85fb538b6aafcb517
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Tue Aug 17 11:30:56 2021 +0100
libstdc++: Optimize std::function move constructor [PR101923]
PR 101923 points out that the unconditional swap in the std::function
move constructor makes it slower than copying an empty std::function.
The copy constructor has to check for the empty case before doing
anything, and that makes it very fast for the empty case.
Adding the same check to the move constructor avoids copying the
_Any_data POD when we don't need to. We can also inline the effects of
swap, by copying each member and then zeroing the pointer members.
This makes moving an empty object at least as fast as copying an empty
object.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/101923
* include/bits/std_function.h (function(function&&)): Check for
non-empty parameter before doing any work.
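The approach described in the commit message (check for a non-empty source, then copy each member and null the source's pointers) can be sketched on a toy type. `ToyCallable`, its members, `toy_manager` and `make()` are hypothetical and only mirror the shape of the real code; this is not the libstdc++ implementation.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

inline void toy_manager() {}  // hypothetical placeholder target

// Illustrative model only: a POD buffer plus two pointers.
struct ToyCallable {
    unsigned char storage[16] = {};
    void (*manager)() = nullptr;  // non-null means "holds a target"
    void (*invoker)() = nullptr;

    ToyCallable() = default;

    // Checked move: an empty source costs one pointer copy and one branch.
    ToyCallable(ToyCallable&& other) noexcept : invoker(other.invoker) {
        if (other.manager) {
            std::memcpy(storage, other.storage, sizeof storage);  // POD copy
            manager = other.manager;
            other.manager = nullptr;  // leave the source empty
            other.invoker = nullptr;
        }
    }

    // Builds a non-empty object whose first storage byte is `tag`.
    static ToyCallable make(unsigned char tag) {
        ToyCallable f;
        f.storage[0] = tag;
        f.manager = &toy_manager;
        f.invoker = &toy_manager;
        return f;
    }

    explicit operator bool() const noexcept { return manager != nullptr; }
};
```

In this model the buffer is never read or written when the source is empty, which is the saving the commit message refers to.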
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: dartdart26 at gmail dot com @ 2021-08-18 6:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Petar Ivanov <dartdart26 at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
--- Comment #9 from Petar Ivanov <dartdart26 at gmail dot com> ---
(In reply to CVS Commits from comment #8)
> The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
>
> https://gcc.gnu.org/g:0808b0df9c4d31f4c362b9c85fb538b6aafcb517
>
> commit r12-2959-g0808b0df9c4d31f4c362b9c85fb538b6aafcb517
> Author: Jonathan Wakely <jwakely@redhat.com>
> Date: Tue Aug 17 11:30:56 2021 +0100
>
> libstdc++: Optimize std::function move constructor [PR101923]
>
Thank you!
On ARM64, it is now identical to copy:
-----------------------------------------------------
Benchmark           Time             CPU   Iterations
-----------------------------------------------------
copy            0.948 ns        0.948 ns    558822565
move            0.952 ns        0.952 ns    729210032
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: cvs-commit at gcc dot gnu.org @ 2021-10-12 10:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:73b0f810a17a5f529fc8342a2df31276d3538851
commit r11-9111-g73b0f810a17a5f529fc8342a2df31276d3538851
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Tue Aug 17 11:30:56 2021 +0100
libstdc++: Optimize std::function move constructor [PR101923]
PR 101923 points out that the unconditional swap in the std::function
move constructor makes it slower than copying an empty std::function.
The copy constructor has to check for the empty case before doing
anything, and that makes it very fast for the empty case.
Adding the same check to the move constructor avoids copying the
_Any_data POD when we don't need to. We can also inline the effects of
swap, by copying each member and then zeroing the pointer members.
This makes moving an empty object at least as fast as copying an empty
object.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/101923
* include/bits/std_function.h (function(function&&)): Check for
non-empty parameter before doing any work.
(cherry picked from commit 0808b0df9c4d31f4c362b9c85fb538b6aafcb517)
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: pinskia at gcc dot gnu.org @ 2022-12-29 22:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |11.0