public inbox for gcc-bugs@sourceware.org
* [Bug c++/99386] New: std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 12:34 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386

          Bug ID: 99386
         Summary: std::variant overhead much larger compared to clang
         Product: gcc
         Version: 10.2.0
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: c++
        Assignee: unassigned at gcc dot gnu.org
        Reporter: mail at milianw dot de
Target Milestone: ---

I've come across some code in an application I'm working on that makes use of std::variant. The overhead imposed by std::variant compared to a raw type is extremely high (700% and more). I created a little MWE to show this behavior:

https://github.com/milianw/cpp-variant-overhead

To reproduce, build it twice in separate build folders, once with g++ and once with clang++:

```
CXX=g++ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo
CXX=clang++ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo
```

Then run all three modes of each build:

```
perf stat -r 5 -d ./variant 0
perf stat -r 5 -d ./variant 1
perf stat -r 5 -d ./variant 2
```

I put the measurements from my machine into the README.md. The gist is that the relative runtime overhead is huge when compiling with g++:

g++
    uint64_t:                          100%
    std::variant<uint64_t>:            720%
    std::variant<uint64_t, uint32_t>:  840%

clang++
    uint64_t:                          100%
    std::variant<uint64_t>:            114%
    std::variant<uint64_t, uint32_t>:  184%

The baseline for both g++/clang++ is roughly the same.

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: rguenth at gcc dot gnu.org @ 2021-03-04 13:12 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Is that clang++ using libstdc++ from GCC or libc++? In the end the difference might boil down to inlining decision differences.

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 13:48 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386

--- Comment #2 from Milian Wolff <mail at milianw dot de> ---
In both cases libstdc++ is being used:

```
gcc:
        linux-vdso.so.1 (0x00007ffdc9f93000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1449b2d000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f14499e8000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f14499ce000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f1449801000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f1449d40000)

clang:
        linux-vdso.so.1 (0x00007fff5854f000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fd9b190f000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007fd9b17ca000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fd9b17b0000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fd9b15e3000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fd9b1b22000)
```

I just tried setting `-O3 -finline-limit=5000`, but the performance numbers don't really change much. Is there anything else I should be trying out?

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 13:52 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386

--- Comment #3 from Milian Wolff <mail at milianw dot de> ---
Ah, seems like `-O2 -flto` fixes the issue for me, but how come clang can pull this off without LTO?

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 14:08 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386

--- Comment #4 from Milian Wolff <mail at milianw dot de> ---
Ah, but LTO only helps with the variant that contains a single type. The variant with two types remains very slow.

Variant with a single type:

```
 Performance counter stats for './variant 1' (5 runs):

            264.14 msec task-clock                #    0.999 CPUs utilized              ( +- 0.13% )
                 0      context-switches          #    0.001 K/sec                      ( +-100.00% )
                 0      cpu-migrations            #    0.000 K/sec
               380      page-faults               #    0.001 M/sec                      ( +- 0.13% )
     1,182,582,454      cycles                    #    4.477 GHz                        ( +- 0.06% )  (62.52%)
           634,015      stalled-cycles-frontend   #    0.05% frontend cycles idle       ( +- 3.72% )  (62.52%)
     1,044,218,220      stalled-cycles-backend    #   88.30% backend cycles idle        ( +- 0.16% )  (62.52%)
     1,187,317,899      instructions              #    1.00  insn per cycle
                                                  #    0.88  stalled cycles per insn    ( +- 0.11% )  (62.52%)
       132,470,519      branches                  #  501.512 M/sec                      ( +- 0.09% )  (62.53%)
             2,967      branch-misses             #    0.00% of all branches            ( +- 7.80% )  (62.47%)
       788,740,131      L1-dcache-loads           # 2986.044 M/sec                      ( +- 0.16% )  (62.47%)
        16,466,669      L1-dcache-load-misses     #    2.09% of all L1-dcache accesses  ( +- 0.16% )  (62.46%)
   <not supported>      LLC-loads
   <not supported>      LLC-load-misses

          0.264412 +- 0.000379 seconds time elapsed  ( +- 0.14% )
```

The above measurements are in the same ballpark as the no-variant baseline without LTO.

But check out the following for a variant with two types:

```
 Performance counter stats for './variant 2' (5 runs):

          1,807.01 msec task-clock                #    1.000 CPUs utilized              ( +- 0.04% )
                 4      context-switches          #    0.002 K/sec                      ( +- 11.59% )
                 0      cpu-migrations            #    0.000 K/sec                      ( +- 61.24% )
               383      page-faults               #    0.212 K/sec                      ( +- 0.27% )
     8,093,139,812      cycles                    #    4.479 GHz                        ( +- 0.01% )  (62.35%)
         1,393,308      stalled-cycles-frontend   #    0.02% frontend cycles idle       ( +- 5.84% )  (62.52%)
     7,257,955,665      stalled-cycles-backend    #   89.68% backend cycles idle        ( +- 0.08% )  (62.62%)
     4,728,542,717      instructions              #    0.58  insn per cycle
                                                  #    1.53  stalled cycles per insn    ( +- 0.02% )  (62.65%)
       395,189,246      branches                  #  218.698 M/sec                      ( +- 0.02% )  (62.65%)
            17,570      branch-misses             #    0.00% of all branches            ( +- 12.38% )  (62.55%)
     3,806,321,294      L1-dcache-loads           # 2106.424 M/sec                      ( +- 0.02% )  (62.39%)
        16,753,910      L1-dcache-load-misses     #    0.44% of all L1-dcache accesses  ( +- 0.11% )  (62.28%)
   <not supported>      LLC-loads
   <not supported>      LLC-load-misses

          1.807335 +- 0.000776 seconds time elapsed  ( +- 0.04% )
```

Again, performance suffers dramatically.

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: redi at gcc dot gnu.org @ 2021-03-04 14:49 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
See PR 78113 and PR 86912