* [Bug libstdc++/101923] std::function's move ctor is slower than the copy one for empty source objects
2021-08-15 13:17 [Bug libstdc++/101923] New: std::function's move ctor is slower than the copy one for empty source objects dartdart26 at gmail dot com
@ 2021-08-15 13:38 ` dartdart26 at gmail dot com
2021-08-15 17:54 ` nok.raven at gmail dot com
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: dartdart26 at gmail dot com @ 2021-08-15 13:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #1 from Petar Ivanov <dartdart26 at gmail dot com> ---
Benchmark code (using Google Benchmark):
#include <benchmark/benchmark.h>
#include <functional>
#include <utility>

struct Car {};

static void copy(benchmark::State& state) {
  for (auto _ : state) {
    const auto f = std::function<void(const Car&)>{};
    const auto copied = f;
    benchmark::DoNotOptimize(copied);
  }
}

static void move(benchmark::State& state) {
  for (auto _ : state) {
    auto f = std::function<void(const Car&)>{};
    const auto moved = std::move(f);
    benchmark::DoNotOptimize(moved);
  }
}

BENCHMARK(copy);
BENCHMARK(move);
BENCHMARK_MAIN();
* [Bug libstdc++/101923] std::function's move ctor is slower than the copy one for empty source objects
From: nok.raven at gmail dot com @ 2021-08-15 17:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Nikita Kniazev <nok.raven at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nok.raven at gmail dot com
--- Comment #2 from Nikita Kniazev <nok.raven at gmail dot com> ---
There is no difference in the produced code on trunk (except move ops order)
https://godbolt.org/z/esfjhr9ae
* [Bug libstdc++/101923] std::function's move ctor is slower than the copy one for empty source objects
From: dartdart26 at gmail dot com @ 2021-08-16 7:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #3 from Petar Ivanov <dartdart26 at gmail dot com> ---
Thank you for pointing out the output on x86!
Following that, I checked O2 and O3 on ARM64 and I see differences, though I
cannot say what their actual impact is:
O2: https://godbolt.org/z/P9Garznef
O3: https://godbolt.org/z/Yb1q33YP3
In terms of x86, I ran the benchmark in Quick Bench (I assume x86, as that is
what the disassembly shows) and the results are similar to my findings on
ARM64, with move being slower:
https://quick-bench.com/q/vK9eSYngutKGo4QSPcdra9gUOI0
The benchmark code seems correct to me, but I might be missing something, might
be misusing DoNotOptimize() or there might be some side effects.
I am sure this is not a big deal. I was just wondering if adding an if
statement is doable and, if yes, it seems like a quick and easy win.
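The "if statement" idea can be sketched outside libstdc++ with a toy wrapper. Everything named below (SwapToy, Storage, Manager and the helper functions) is hypothetical and only stands in for the real members. The sketch assumes that a null manager pointer means "empty" and shows nothing more than a guard placed in front of the swap:

```cpp
#include <cassert>
#include <utility>

// Toy model only: none of these names come from libstdc++.
struct SwapToy {
    struct Storage { void* words[2]; };  // stands in for the inline buffer
    using Manager = void (*)();

    Storage storage{};
    Manager manager = nullptr;           // assumption: null means "empty"

    SwapToy() = default;
    explicit SwapToy(Manager m) : manager(m) {}

    // Guarded swap: an empty source skips the swap entirely.
    SwapToy(SwapToy&& other) noexcept {
        if (other.manager != nullptr) {
            std::swap(storage, other.storage);
            std::swap(manager, other.manager);
        }
    }

    explicit operator bool() const noexcept { return manager != nullptr; }
};

inline void dummy_manager() {}

// Moving an empty object leaves both objects empty.
inline bool guarded_move_of_empty() {
    SwapToy src;
    SwapToy dst(std::move(src));
    return !dst && !src;
}

// Moving a non-empty object transfers the state and empties the source.
inline bool guarded_move_of_nonempty() {
    SwapToy src(&dummy_manager);
    SwapToy dst(std::move(src));
    return dst && !src;
}
```

Whether such a guard pays off in the real class depends on how the compiler handles the unguarded swap, which is what the benchmark is trying to measure.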
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: pinskia at gcc dot gnu.org @ 2021-08-16 7:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|libstdc++                   |tree-optimization
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Hmm
  __tmp = MEM[(union _Any_data & {ref-all})&f];
  MEM[(union _Any_data * {ref-all})&f] = MEM[(union _Any_data & {ref-all})&moved];
  MEM[(union _Any_data * {ref-all})&moved] = __tmp;
  __tmp ={v} {CLOBBER};
  _13 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct Car &) &)&f + 24];
  _14 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct Car &) &)&moved + 24];
  MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car &) &)&f + 24] = _14;
  MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car &) &)&moved + 24] = _13;
So a missed optimization at the gimple level.
But note the arm64 compiler on godbolt is a few months old, 20210528. There
might have been some fixes which improve this already.
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: dartdart26 at gmail dot com @ 2021-08-17 6:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #5 from Petar Ivanov <dartdart26 at gmail dot com> ---
(In reply to Andrew Pinski from comment #4)
> Hmm
>
> __tmp = MEM[(union _Any_data & {ref-all})&f];
> MEM[(union _Any_data * {ref-all})&f] = MEM[(union _Any_data &
> {ref-all})&moved];
> MEM[(union _Any_data * {ref-all})&moved] = __tmp;
> __tmp ={v} {CLOBBER};
> _13 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct
> Car &) &)&f + 24];
> _14 = MEM[(void (*type) (const union _Any_data & {ref-all}, const struct
> Car &) &)&moved + 24];
> MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car
> &) &)&f + 24] = _14;
> MEM[(void (*<Te9f8>) (const union _Any_data & {ref-all}, const struct Car
> &) &)&moved + 24] = _13;
>
> So a missed optimization at the gimple level.
> But note the arm64 compiler on godbolt is a few months old, 20210528. There
> might have been some fixes which improve this already.
I see, thank you.
Do you think the x86 results on quick bench are something worth improving? From
a user's perspective, I assume the expectation is that moves are at least as
fast as copies.
Could you please advise on how I can proceed with this report? Can a change be
made in libstdc++ or should it be considered a compiler issue?
Thank you!
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: redi at gcc dot gnu.org @ 2021-08-17 10:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Petar Ivanov from comment #5)
> Could you please advise on how I can proceed with this report? Can a change
> be made in libstdc++ or should it be considered a compiler issue?
Both, I think.
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: redi at gcc dot gnu.org @ 2021-08-17 10:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
We can do better than just making the swap conditional:
  function(function&& __x) noexcept
  : _Function_base(), _M_invoker(__x._M_invoker)
  {
    if (static_cast<bool>(__x))
      {
        _M_functor = __x._M_functor;
        _M_manager = __x._M_manager;
        __x._M_manager = nullptr;
        __x._M_invoker = nullptr;
      }
  }
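To see the effect of this shape in isolation, here is a minimal toy model that can be compiled on its own. NullingToy, Storage, Manager, Invoker and the helper functions are hypothetical names, not the libstdc++ types. As in the constructor above, the invoker is copied unconditionally and the other members are touched only when the source is non-empty:

```cpp
#include <cassert>
#include <utility>

// Toy model only: hypothetical stand-ins for the real members.
struct NullingToy {
    struct Storage { void* words[2]; };
    using Manager = void (*)();
    using Invoker = int (*)(int);

    Storage functor{};
    Manager manager = nullptr;   // assumption: null manager means "empty"
    Invoker invoker = nullptr;

    NullingToy() = default;
    NullingToy(Manager m, Invoker i) : manager(m), invoker(i) {}

    // Copy the invoker first, then copy the rest and null out the
    // source's pointers only if the source holds a target.
    NullingToy(NullingToy&& other) noexcept : invoker(other.invoker) {
        if (static_cast<bool>(other)) {
            functor = other.functor;
            manager = other.manager;
            other.manager = nullptr;
            other.invoker = nullptr;
        }
    }

    explicit operator bool() const noexcept { return manager != nullptr; }
};

inline void toy_manager() {}
inline int twice(int x) { return 2 * x; }

// An empty source is never written to and the result is empty.
inline bool move_empty_stays_empty() {
    NullingToy src;
    NullingToy dst(std::move(src));
    return !dst && !src;
}

// A non-empty source hands over its invoker and becomes empty.
inline int move_nonempty_transfers() {
    NullingToy src(&toy_manager, &twice);
    NullingToy dst(std::move(src));
    assert(!src);
    return dst.invoker(21);
}
```

Compared with a guarded swap, this version never reads the destination's freshly initialized members, so the non-empty path is plain copies followed by two stores to the source.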
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: cvs-commit at gcc dot gnu.org @ 2021-08-17 13:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:0808b0df9c4d31f4c362b9c85fb538b6aafcb517
commit r12-2959-g0808b0df9c4d31f4c362b9c85fb538b6aafcb517
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Tue Aug 17 11:30:56 2021 +0100
    libstdc++: Optimize std::function move constructor [PR101923]

    PR 101923 points out that the unconditional swap in the std::function
    move constructor makes it slower than copying an empty std::function.
    The copy constructor has to check for the empty case before doing
    anything, and that makes it very fast for the empty case.

    Adding the same check to the move constructor avoids copying the
    _Any_data POD when we don't need to. We can also inline the effects of
    swap, by copying each member and then zeroing the pointer members.

    This makes moving an empty object at least as fast as copying an empty
    object.

    Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

    libstdc++-v3/ChangeLog:

            PR libstdc++/101923
            * include/bits/std_function.h (function(function&&)): Check for
            non-empty parameter before doing any work.
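The observable behaviour that has to hold before and after this change can be checked with std::function itself. The sketch below is not part of the commit or the libstdc++ testsuite, and the helper names are made up for illustration. It tests only what the standard guarantees: moving from an empty object yields an empty object, and moving from a non-empty one transfers the target.

```cpp
#include <cassert>
#include <functional>
#include <utility>

struct Car {};

// Moving from an empty std::function must produce an empty one.
inline bool move_of_empty_is_empty() {
    std::function<void(const Car&)> f;  // empty
    std::function<void(const Car&)> moved = std::move(f);
    return !moved;
}

// Moving from a non-empty std::function must transfer the target.
inline int move_of_nonempty_calls_target() {
    int calls = 0;
    std::function<void(const Car&)> f = [&calls](const Car&) { ++calls; };
    std::function<void(const Car&)> moved = std::move(f);
    assert(static_cast<bool>(moved));
    moved(Car{});
    return calls;
}
```

The state of the moved-from object after a non-empty move is deliberately not asserted here, since the standard leaves it unspecified.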
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: dartdart26 at gmail dot com @ 2021-08-18 6:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Petar Ivanov <dartdart26 at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
--- Comment #9 from Petar Ivanov <dartdart26 at gmail dot com> ---
(In reply to CVS Commits from comment #8)
> The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
>
> https://gcc.gnu.org/g:0808b0df9c4d31f4c362b9c85fb538b6aafcb517
>
> commit r12-2959-g0808b0df9c4d31f4c362b9c85fb538b6aafcb517
> Author: Jonathan Wakely <jwakely@redhat.com>
> Date: Tue Aug 17 11:30:56 2021 +0100
>
> libstdc++: Optimize std::function move constructor [PR101923]
>
Thank you!
On ARM64, it is now identical to copy:
-----------------------------------------------------
Benchmark           Time             CPU   Iterations
-----------------------------------------------------
copy            0.948 ns        0.948 ns    558822565
move            0.952 ns        0.952 ns    729210032
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: cvs-commit at gcc dot gnu.org @ 2021-10-12 10:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:73b0f810a17a5f529fc8342a2df31276d3538851
commit r11-9111-g73b0f810a17a5f529fc8342a2df31276d3538851
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Tue Aug 17 11:30:56 2021 +0100
    libstdc++: Optimize std::function move constructor [PR101923]

    PR 101923 points out that the unconditional swap in the std::function
    move constructor makes it slower than copying an empty std::function.
    The copy constructor has to check for the empty case before doing
    anything, and that makes it very fast for the empty case.

    Adding the same check to the move constructor avoids copying the
    _Any_data POD when we don't need to. We can also inline the effects of
    swap, by copying each member and then zeroing the pointer members.

    This makes moving an empty object at least as fast as copying an empty
    object.

    Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

    libstdc++-v3/ChangeLog:

            PR libstdc++/101923
            * include/bits/std_function.h (function(function&&)): Check for
            non-empty parameter before doing any work.

    (cherry picked from commit 0808b0df9c4d31f4c362b9c85fb538b6aafcb517)
* [Bug tree-optimization/101923] std::function's move ctor is slower than the copy one for empty source objects
From: pinskia at gcc dot gnu.org @ 2022-12-29 22:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101923
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0