public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* First-time contributor -- speeding up `std::unique_ptr` compilation
@ 2022-06-24  0:19 mail
  2022-06-24  6:20 ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: mail @ 2022-06-24  0:19 UTC (permalink / raw)
  To: libstdc++

Hello everyone,
Hope you are all doing well.

I would like to begin by saying that I am terribly unfamiliar with mailing lists and I've always found them cumbersome to use -- I tend to do most of my development via GitHub PRs and Discord. Therefore, please forgive me if this is not the right place to discuss "how to contribute", and also forgive me for not being able to create a proper patch file.

My goal is to generally improve the compilation speed of C++ programs using the standard library. In personal projects and open-source projects I maintain, I've achieved good results by benchmarking compilation via ClangBuildAnalyzer, figuring out the bottlenecks, and optimizing them. While working on the SFML project, I noticed that the `std::unique_ptr` template was taking a lot of time to be instantiated using the latest version of libstdc++.

I investigated a bit, and figured out that the `<tuple>` include and `std::tuple` instantiation were the culprit. I then replaced them with a handmade workaround in my system libraries, recompiled SFML, and noticed that `std::unique_ptr`'s instantiation time became negligible as a result of my changes. As a side benefit, we also avoid including the '<tuple>' header altogether, reducing the risk of unintended transitive includes. I created a PR on the GCC GitHub mirror as a way to observe the diff in a nice interface and to discuss my changes, and to see if the maintainers generally think that this is a good idea. You can see the diff here:
https://github.com/gcc-mirror/gcc/pull/67/files?diff=split&w=1

In terms of rough measurements, the average instantiation time of `std::unique_ptr` went from 22ms to 4ms. This was measured over the entire SFML codebase using clang++ 14.x with the `-ftime-trace` flag, plus the aforementioned ClangBuildAnalyzer tool.

A few questions:

  1.  Is this improvement something that the libstdc++ maintainers would want to merge in? Of course, the changes need to be cleaned up and adapted to GCC's coding style.
  2.  If so, what is the easiest way to create a patch in the format that you are used to and test it against libstdc++'s test suite? I have never contributed to libstdc++ or GCC before.

Thanks.


Vittorio Romeo
https://vittorioromeo.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: First-time contributor -- speeding up `std::unique_ptr` compilation
  2022-06-24  0:19 First-time contributor -- speeding up `std::unique_ptr` compilation mail
@ 2022-06-24  6:20 ` Jonathan Wakely
  2022-06-24  7:22   ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Wakely @ 2022-06-24  6:20 UTC (permalink / raw)
  To: mail; +Cc: libstdc++

On Fri, 24 Jun 2022, 01:19 , <mail@vittorioromeo.com> wrote:

> Hello everyone,
> Hope you are all doing well.
>
> I would like to begin by saying that I am terribly unfamiliar with mailing
> lists and I've always found them cumbersome to use -- I tend to do most of
> my development via GitHub PRs and Discord. Therefore, please forgive me if
> this is not the right place to discuss "how to contribute", and also
> forgive me for not being able to create a proper patch file.
>
> My goal is to generally improve the compilation speed of C++ programs
> using the standard library. In personal projects and open-source projects I
> maintain, I've achieved good results by benchmarking compilation via
> ClangBuildAnalyzer, figuring out the bottlenecks, and optimizing them.
> While working on the SFML project, I noticed that the `std::unique_ptr`
> template was taking a lot of time to be instantiated using the latest
> version of libstdc++.
>
> I investigated a bit, and figured out that the `<tuple>` include and
> `std::tuple` instantiation were the culprit. I then replaced them with a
> handmade workaround in my system libraries, recompiled SFML, and noticed
> that `std::unique_ptr`'s instantiation time became negligible as a result
> of my changes. As a side benefit, we also avoid including the '<tuple>'
> header altogether, reducing the risk of unintended transitive includes. I
> created a PR on the GCC GitHub mirror as a way to observe the diff in a
> nice interface and to discuss my changes, and to see if the maintainers
> generally think that this is a good idea. You can see the diff here:
> https://github.com/gcc-mirror/gcc/pull/67/files?diff=split&w=1
>
> In terms of rough measurements, the average instantiation time of
> `std::unique_ptr` went from 22ms to 4ms. This was measured over the entire
> SFML codebase using clang++ 14.x with the `-ftime-trace` flag, plus the
> aforementioned ClangBuildAnalyzer tool.
>
> A few questions:
>
>   1.  Is this improvement something that the libstdc++ maintainers would
> want to merge in? Of course, the changes need to be cleaned up and adapted
> to GCC's coding style.
>

No, it would be a ABI break. It would change the layout for a deleter with
the 'final' specifier.

It also looks like quite a maintenance headache "just" for improved
compilation time. The need for two implementations (with and without the
attribute) that both need to be kept ABI compatible with std::tuple doesn't
seem like a good trade off and a good use of maintainer effort.


  2.  If so, what is the easiest way to create a patch in the format that
> you are used to and test it against libstdc++'s test suite? I have never
> contributed to libstdc++ or GCC before.
>

I'll reply to this part later today.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: First-time contributor -- speeding up `std::unique_ptr` compilation
  2022-06-24  6:20 ` Jonathan Wakely
@ 2022-06-24  7:22   ` Jonathan Wakely
  2022-06-25 12:45     ` Vittorio Romeo
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Wakely @ 2022-06-24  7:22 UTC (permalink / raw)
  To: mail; +Cc: libstdc++

On Fri, 24 Jun 2022, 07:20 Jonathan Wakely, <jwakely.gcc@gmail.com> wrote:

>
>
> On Fri, 24 Jun 2022, 01:19 , <mail@vittorioromeo.com> wrote:
>
>> Hello everyone,
>> Hope you are all doing well.
>>
>> I would like to begin by saying that I am terribly unfamiliar with
>> mailing lists and I've always found them cumbersome to use -- I tend to do
>> most of my development via GitHub PRs and Discord. Therefore, please
>> forgive me if this is not the right place to discuss "how to contribute",
>> and also forgive me for not being able to create a proper patch file.
>>
>> My goal is to generally improve the compilation speed of C++ programs
>> using the standard library. In personal projects and open-source projects I
>> maintain, I've achieved good results by benchmarking compilation via
>> ClangBuildAnalyzer, figuring out the bottlenecks, and optimizing them.
>> While working on the SFML project, I noticed that the `std::unique_ptr`
>> template was taking a lot of time to be instantiated using the latest
>> version of libstdc++.
>>
>> I investigated a bit, and figured out that the `<tuple>` include and
>> `std::tuple` instantiation were the culprit. I then replaced them with a
>> handmade workaround in my system libraries, recompiled SFML, and noticed
>> that `std::unique_ptr`'s instantiation time became negligible as a result
>> of my changes. As a side benefit, we also avoid including the '<tuple>'
>> header altogether, reducing the risk of unintended transitive includes. I
>> created a PR on the GCC GitHub mirror as a way to observe the diff in a
>> nice interface and to discuss my changes, and to see if the maintainers
>> generally think that this is a good idea. You can see the diff here:
>> https://github.com/gcc-mirror/gcc/pull/67/files?diff=split&w=1
>>
>> In terms of rough measurements, the average instantiation time of
>> `std::unique_ptr` went from 22ms to 4ms. This was measured over the entire
>> SFML codebase using clang++ 14.x with the `-ftime-trace` flag, plus the
>> aforementioned ClangBuildAnalyzer tool.
>>
>> A few questions:
>>
>>   1.  Is this improvement something that the libstdc++ maintainers would
>> want to merge in? Of course, the changes need to be cleaned up and adapted
>> to GCC's coding style.
>>
>
> No, it would be a ABI break. It would change the layout for a deleter with
> the 'final' specifier.
>
> It also looks like quite a maintenance headache "just" for improved
> compilation time. The need for two implementations (with and without the
> attribute) that both need to be kept ABI compatible with std::tuple doesn't
> seem like a good trade off and a good use of maintainer effort.
>

I think a more maintainable (and ABI compatible) approach would be to move
the _Head_base templates to a new header and include that in
bits/unique_ptr.h, without the rest of <tuple>. That will also allow the
gdb pretty printers to continue working, whereas your suggestion would mean
rewriting them.

I know you are concerned about build times, but correctness, ABI
compatibility, and to a lesser extent maintainability all matter more.
Build times for microbenchmarks can be worth checking, but bear in mind
that your changes actually mean *more* code needs to be compiled for a
program that already uses std::tuple anyway. You're not removing any code
from the library or making the existing tuple code faster to compile, only
adding more code.




>
>   2.  If so, what is the easiest way to create a patch in the format that
>> you are used to and test it against libstdc++'s test suite? I have never
>> contributed to libstdc++ or GCC before.
>>
>
You need to test it first, before you create a patch. Build your modified
sources as per https://gcc.gnu.org/wiki/InstallingGCC (but starting with
your git clone instead of a release tarball).

You can use --enable-languages=c++ to avoid building more of the compiler
then you need for libstdc++ testing.

In the libstdc++ build directory (e.g. x86_64-pc-linux-gnu/libstdc++-v3)
run make check to run the tests, adding -j8 or a suitable option for your
number of cores.

See https://gcc.gnu.org/onlinedocs/libstdc++/manual/test.html#test.run for
testing docs.

Once you're happy with the results see https://gcc.gnu.org/contribute.html
for patch submission.


>
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: First-time contributor -- speeding up `std::unique_ptr` compilation
  2022-06-24  7:22   ` Jonathan Wakely
@ 2022-06-25 12:45     ` Vittorio Romeo
  2022-07-25 21:34       ` Jonathan Wakely
  0 siblings, 1 reply; 5+ messages in thread
From: Vittorio Romeo @ 2022-06-25 12:45 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: libstdc++

Thank you, Jonathan, for the very thorough explanation and for pointing out that I completely forgot to consider ABI stability in this case.

I still think that it might be worthwhile to attempt some work in this area as multiple projects I work on have various TUs using unique_ptr but not tuple. I will look into the refactoring you suggested.


Vittorio Romeo
https://vittorioromeo.info
________________________________
From: Jonathan Wakely <jwakely.gcc@gmail.com>
Sent: Friday, June 24, 2022 9:22:56 AM
To: mail@vittorioromeo.com <mail@vittorioromeo.com>
Cc: libstdc++ <libstdc++@gcc.gnu.org>
Subject: Re: First-time contributor -- speeding up `std::unique_ptr` compilation



On Fri, 24 Jun 2022, 07:20 Jonathan Wakely, <jwakely.gcc@gmail.com<mailto:jwakely.gcc@gmail.com>> wrote:


On Fri, 24 Jun 2022, 01:19 , <mail@vittorioromeo.com<mailto:mail@vittorioromeo.com>> wrote:
Hello everyone,
Hope you are all doing well.

I would like to begin by saying that I am terribly unfamiliar with mailing lists and I've always found them cumbersome to use -- I tend to do most of my development via GitHub PRs and Discord. Therefore, please forgive me if this is not the right place to discuss "how to contribute", and also forgive me for not being able to create a proper patch file.

My goal is to generally improve the compilation speed of C++ programs using the standard library. In personal projects and open-source projects I maintain, I've achieved good results by benchmarking compilation via ClangBuildAnalyzer, figuring out the bottlenecks, and optimizing them. While working on the SFML project, I noticed that the `std::unique_ptr` template was taking a lot of time to be instantiated using the latest version of libstdc++.

I investigated a bit, and figured out that the `<tuple>` include and `std::tuple` instantiation were the culprit. I then replaced them with a handmade workaround in my system libraries, recompiled SFML, and noticed that `std::unique_ptr`'s instantiation time became negligible as a result of my changes. As a side benefit, we also avoid including the '<tuple>' header altogether, reducing the risk of unintended transitive includes. I created a PR on the GCC GitHub mirror as a way to observe the diff in a nice interface and to discuss my changes, and to see if the maintainers generally think that this is a good idea. You can see the diff here:
https://github.com/gcc-mirror/gcc/pull/67/files?diff=split&w=1

In terms of rough measurements, the average instantiation time of `std::unique_ptr` went from 22ms to 4ms. This was measured over the entire SFML codebase using clang++ 14.x with the `-ftime-trace` flag, plus the aforementioned ClangBuildAnalyzer tool.

A few questions:

  1.  Is this improvement something that the libstdc++ maintainers would want to merge in? Of course, the changes need to be cleaned up and adapted to GCC's coding style.

No, it would be a ABI break. It would change the layout for a deleter with the 'final' specifier.

It also looks like quite a maintenance headache "just" for improved compilation time. The need for two implementations (with and without the attribute) that both need to be kept ABI compatible with std::tuple doesn't seem like a good trade off and a good use of maintainer effort.

I think a more maintainable (and ABI compatible) approach would be to move the _Head_base templates to a new header and include that in bits/unique_ptr.h, without the rest of <tuple>. That will also allow the gdb pretty printers to continue working, whereas your suggestion would mean rewriting them.

I know you are concerned about build times, but correctness, ABI compatibility, and to a lesser extent maintainability all matter more. Build times for microbenchmarks can be worth checking, but bear in mind that your changes actually mean *more* code needs to be compiled for a program that already uses std::tuple anyway. You're not removing any code from the library or making the existing tuple code faster to compile, only adding more code.





  2.  If so, what is the easiest way to create a patch in the format that you are used to and test it against libstdc++'s test suite? I have never contributed to libstdc++ or GCC before.

You need to test it first, before you create a patch. Build your modified sources as per https://gcc.gnu.org/wiki/InstallingGCC (but starting with your git clone instead of a release tarball).

You can use --enable-languages=c++ to avoid building more of the compiler then you need for libstdc++ testing.

In the libstdc++ build directory (e.g. x86_64-pc-linux-gnu/libstdc++-v3) run make check to run the tests, adding -j8 or a suitable option for your number of cores.

See https://gcc.gnu.org/onlinedocs/libstdc++/manual/test.html#test.run for testing docs.

Once you're happy with the results see https://gcc.gnu.org/contribute.html for patch submission.




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: First-time contributor -- speeding up `std::unique_ptr` compilation
  2022-06-25 12:45     ` Vittorio Romeo
@ 2022-07-25 21:34       ` Jonathan Wakely
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Wakely @ 2022-07-25 21:34 UTC (permalink / raw)
  To: Vittorio Romeo; +Cc: libstdc++

[-- Attachment #1: Type: text/plain, Size: 565 bytes --]

On Sat, 25 Jun 2022 at 13:45, Vittorio Romeo wrote:
>
> Thank you, Jonathan, for the very thorough explanation and for pointing out that I completely forgot to consider ABI stability in this case.
>
> I still think that it might be worthwhile to attempt some work in this area as multiple projects I work on have various TUs using unique_ptr but not tuple. I will look into the refactoring you suggested.

Something like this would avoid instantiating most of std::tuple while
preserving ABI, but it doesn't seem to really make much difference to
compilation time.

[-- Attachment #2: unique_ptr.patch --]
[-- Type: text/x-patch, Size: 2388 bytes --]

diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h
index e1ad7721a59..777c7dec15a 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -172,7 +172,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
       __uniq_ptr_impl() = default;
       _GLIBCXX23_CONSTEXPR
-      __uniq_ptr_impl(pointer __p) : _M_t() { _M_ptr() = __p; }
+      __uniq_ptr_impl(pointer __p) : _M_t(__p) { }
 
       template<typename _Del>
 	_GLIBCXX23_CONSTEXPR
@@ -193,13 +193,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       }
 
       _GLIBCXX23_CONSTEXPR
-      pointer&   _M_ptr() noexcept { return std::get<0>(_M_t); }
+      pointer&   _M_ptr() noexcept { return _M_t._M_ptr(); }
       _GLIBCXX23_CONSTEXPR
-      pointer    _M_ptr() const noexcept { return std::get<0>(_M_t); }
+      pointer    _M_ptr() const noexcept { return _M_t._M_ptr(); }
       _GLIBCXX23_CONSTEXPR
-      _Dp&       _M_deleter() noexcept { return std::get<1>(_M_t); }
+      _Dp&       _M_deleter() noexcept { return _M_t._M_deleter(); }
       _GLIBCXX23_CONSTEXPR
-      const _Dp& _M_deleter() const noexcept { return std::get<1>(_M_t); }
+      const _Dp& _M_deleter() const noexcept { return _M_t._M_deleter(); }
 
       _GLIBCXX23_CONSTEXPR
       void reset(pointer __p) noexcept
@@ -226,9 +226,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	swap(this->_M_ptr(), __rhs._M_ptr());
 	swap(this->_M_deleter(), __rhs._M_deleter());
       }
-
     private:
-      tuple<pointer, _Dp> _M_t;
+      struct _Tuple : _Head_base<1, _Dp>, _Head_base<0, pointer>
+      {
+	using _Ptr_base = _Head_base<0, pointer>;
+	using _Del_base = _Head_base<1, _Dp>;
+
+	_Tuple() = default;
+	_Tuple(_Tuple&&) = default;
+
+	constexpr
+	_Tuple(pointer __p) : _Ptr_base(__p) { }
+
+	template<typename _Del>
+	  constexpr
+	  _Tuple(pointer __p, _Del&& __d)
+	  : _Del_base(std::forward<_Del>(__d)), _Ptr_base(__p)
+	  { }
+
+	constexpr pointer&
+	_M_ptr() noexcept { return _Ptr_base::_M_head(*this); }
+	constexpr pointer
+	_M_ptr() const noexcept { return _Ptr_base::_M_head(*this); }
+	constexpr _Dp&
+	_M_deleter() noexcept { return _Del_base::_M_head(*this); }
+	constexpr const _Dp&
+	_M_deleter() const noexcept { return _Del_base::_M_head(*this); }
+      };
+
+      _Tuple _M_t;
     };
 
   // Defines move construction + assignment as either defaulted or deleted.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-07-25 21:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-24  0:19 First-time contributor -- speeding up `std::unique_ptr` compilation mail
2022-06-24  6:20 ` Jonathan Wakely
2022-06-24  7:22   ` Jonathan Wakely
2022-06-25 12:45     ` Vittorio Romeo
2022-07-25 21:34       ` Jonathan Wakely

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).