public inbox for gcc-bugs@sourceware.org
* [Bug c++/99386] New: std::variant overhead much larger compared to clang
@ 2021-03-04 12:34 mail at milianw dot de
2021-03-04 13:12 ` [Bug c++/99386] " rguenth at gcc dot gnu.org
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: mail at milianw dot de @ 2021-03-04 12:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386
Bug ID: 99386
Summary: std::variant overhead much larger compared to clang
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: mail at milianw dot de
Target Milestone: ---
I've come across some code in an application I'm working on that makes use of
std::variant. The overhead imposed by std::variant compared to a raw type is
extremely high (700% and more). I created a little MWE to show this behavior:
https://github.com/milianw/cpp-variant-overhead
To reproduce, compile two versions in separate build folders, one with g++ and
one with clang++:
```
CXX=g++ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo
CXX=clang++ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo
```
Then run both versions:
```
perf stat -r 5 -d ./variant 0
perf stat -r 5 -d ./variant 1
perf stat -r 5 -d ./variant 2
```
I put the measurements on my machine into the README.md. The gist is, the
relative runtime overhead is huge when compiling with g++:
g++
  uint64_t: 100%
  std::variant<uint64_t>: 720%
  std::variant<uint64_t, uint32_t>: 840%
clang++
  uint64_t: 100%
  std::variant<uint64_t>: 114%
  std::variant<uint64_t, uint32_t>: 184%
The baseline for both g++/clang++ is roughly the same.
* [Bug c++/99386] std::variant overhead much larger compared to clang
From: rguenth at gcc dot gnu.org @ 2021-03-04 13:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Is that clang++ using libstdc++ from GCC or libc++? In the end the difference
might boil down to inlining decision differences.
* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 13:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386
--- Comment #2 from Milian Wolff <mail at milianw dot de> ---
In both cases libstdc++ is being used:
```
gcc:
linux-vdso.so.1 (0x00007ffdc9f93000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1449b2d000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007f14499e8000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f14499ce000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f1449801000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f1449d40000)
clang:
linux-vdso.so.1 (0x00007fff5854f000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fd9b190f000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007fd9b17ca000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fd9b17b0000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007fd9b15e3000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fd9b1b22000)
```
I just tried setting `-O3 -finline-limit=5000`, but the performance numbers
don't really change much. Is there anything else I should be trying out?
* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 13:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386
--- Comment #3 from Milian Wolff <mail at milianw dot de> ---
Ah, seems like `-O2 -flto` fixes the issue for me, but how come clang can pull
this off without LTO?
* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 14:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386
--- Comment #4 from Milian Wolff <mail at milianw dot de> ---
Ah, but LTO only helps with the variant that contains a single type. The
variant with two types remains very slow:
Variant with a single type:
```
Performance counter stats for './variant 1' (5 runs):
264.14 msec task-clock # 0.999 CPUs utilized ( +- 0.13% )
0 context-switches # 0.001 K/sec ( +-100.00% )
0 cpu-migrations # 0.000 K/sec
380 page-faults # 0.001 M/sec ( +- 0.13% )
1,182,582,454 cycles # 4.477 GHz ( +- 0.06% ) (62.52%)
634,015 stalled-cycles-frontend # 0.05% frontend cycles idle ( +- 3.72% ) (62.52%)
1,044,218,220 stalled-cycles-backend # 88.30% backend cycles idle ( +- 0.16% ) (62.52%)
1,187,317,899 instructions # 1.00 insn per cycle
# 0.88 stalled cycles per insn ( +- 0.11% ) (62.52%)
132,470,519 branches # 501.512 M/sec ( +- 0.09% ) (62.53%)
2,967 branch-misses # 0.00% of all branches ( +- 7.80% ) (62.47%)
788,740,131 L1-dcache-loads # 2986.044 M/sec ( +- 0.16% ) (62.47%)
16,466,669 L1-dcache-load-misses # 2.09% of all L1-dcache accesses ( +- 0.16% ) (62.46%)
<not supported> LLC-loads
<not supported> LLC-load-misses
0.264412 +- 0.000379 seconds time elapsed ( +- 0.14% )
```
The above measurements are in the same ballpark as the no-variant baseline
without LTO. But check out the following for a variant with two types:
```
Performance counter stats for './variant 2' (5 runs):
1,807.01 msec task-clock # 1.000 CPUs utilized ( +- 0.04% )
4 context-switches # 0.002 K/sec ( +- 11.59% )
0 cpu-migrations # 0.000 K/sec ( +- 61.24% )
383 page-faults # 0.212 K/sec ( +- 0.27% )
8,093,139,812 cycles # 4.479 GHz ( +- 0.01% ) (62.35%)
1,393,308 stalled-cycles-frontend # 0.02% frontend cycles idle ( +- 5.84% ) (62.52%)
7,257,955,665 stalled-cycles-backend # 89.68% backend cycles idle ( +- 0.08% ) (62.62%)
4,728,542,717 instructions # 0.58 insn per cycle
# 1.53 stalled cycles per insn ( +- 0.02% ) (62.65%)
395,189,246 branches # 218.698 M/sec ( +- 0.02% ) (62.65%)
17,570 branch-misses # 0.00% of all branches ( +- 12.38% ) (62.55%)
3,806,321,294 L1-dcache-loads # 2106.424 M/sec ( +- 0.02% ) (62.39%)
16,753,910 L1-dcache-load-misses # 0.44% of all L1-dcache accesses ( +- 0.11% ) (62.28%)
<not supported> LLC-loads
<not supported> LLC-load-misses
1.807335 +- 0.000776 seconds time elapsed ( +- 0.04% )
```
Again, performance suffers dramatically.
* [Bug c++/99386] std::variant overhead much larger compared to clang
From: redi at gcc dot gnu.org @ 2021-03-04 14:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386
--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
See PR 78113 and PR 86912