public inbox for gcc-bugs@sourceware.org
* [Bug c++/99386] New: std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 12:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99386

            Bug ID: 99386
           Summary: std::variant overhead much larger compared to clang
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mail at milianw dot de
  Target Milestone: ---

I've come across some code in an application I'm working on that makes use of
std::variant. The overhead imposed by std::variant compared to a raw type is
extremely high (700% and more). I created a little MWE to show this behavior:

https://github.com/milianw/cpp-variant-overhead

To reproduce, build the project twice in separate build folders, once with g++
and once with clang++:

```
CXX=g++ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo
CXX=clang++ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo
```

Then run both versions:
```
perf stat -r 5 -d ./variant 0
perf stat -r 5 -d ./variant 1
perf stat -r 5 -d ./variant 2
```

I put the measurements from my machine into the README.md. The gist is that
the relative runtime overhead is huge when compiling with g++:

g++
uint64_t: 100%
std::variant<uint64_t>: 720%
std::variant<uint64_t, uint32_t>: 840%

clang++
uint64_t: 100%
std::variant<uint64_t>: 114%
std::variant<uint64_t, uint32_t>: 184%

The baseline for both g++/clang++ is roughly the same.

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: rguenth at gcc dot gnu.org @ 2021-03-04 13:12 UTC (permalink / raw)
  To: gcc-bugs

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Is that clang++ using libstdc++ from GCC or libc++?  In the end the difference
might boil down to inlining decision differences.

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 13:48 UTC (permalink / raw)
  To: gcc-bugs

--- Comment #2 from Milian Wolff <mail at milianw dot de> ---
In both cases, libstdc++ is being used:

```
gcc:
        linux-vdso.so.1 (0x00007ffdc9f93000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f1449b2d000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f14499e8000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f14499ce000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f1449801000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f1449d40000)

clang:
        linux-vdso.so.1 (0x00007fff5854f000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fd9b190f000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007fd9b17ca000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fd9b17b0000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fd9b15e3000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fd9b1b22000)
```

I just tried setting `-O3 -finline-limit=5000`, but the performance numbers
don't really change much. Is there anything else I should be trying out?

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 13:52 UTC (permalink / raw)
  To: gcc-bugs

--- Comment #3 from Milian Wolff <mail at milianw dot de> ---
Ah, seems like `-O2 -flto` fixes the issue for me, but how come clang can pull
this off without LTO?

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: mail at milianw dot de @ 2021-03-04 14:08 UTC (permalink / raw)
  To: gcc-bugs

--- Comment #4 from Milian Wolff <mail at milianw dot de> ---
Ah, but LTO only helps with the variant that contains a single type. The
variant with two types remains very slow:


variant with single type:
```
 Performance counter stats for './variant 1' (5 runs):

            264.14 msec task-clock                #    0.999 CPUs utilized            ( +-  0.13% )
                 0      context-switches          #    0.001 K/sec                    ( +-100.00% )
                 0      cpu-migrations            #    0.000 K/sec
               380      page-faults               #    0.001 M/sec                    ( +-  0.13% )
     1,182,582,454      cycles                    #    4.477 GHz                      ( +-  0.06% )  (62.52%)
           634,015      stalled-cycles-frontend   #    0.05% frontend cycles idle     ( +-  3.72% )  (62.52%)
     1,044,218,220      stalled-cycles-backend    #   88.30% backend cycles idle      ( +-  0.16% )  (62.52%)
     1,187,317,899      instructions              #    1.00  insn per cycle
                                                  #    0.88  stalled cycles per insn  ( +-  0.11% )  (62.52%)
       132,470,519      branches                  #  501.512 M/sec                    ( +-  0.09% )  (62.53%)
             2,967      branch-misses             #    0.00% of all branches          ( +-  7.80% )  (62.47%)
       788,740,131      L1-dcache-loads           # 2986.044 M/sec                    ( +-  0.16% )  (62.47%)
        16,466,669      L1-dcache-load-misses     #    2.09% of all L1-dcache accesses  ( +-  0.16% )  (62.46%)
   <not supported>      LLC-loads
   <not supported>      LLC-load-misses

          0.264412 +- 0.000379 seconds time elapsed  ( +-  0.14% )

```

The above measurement is in the same ballpark as the no-variant baseline
without LTO. But check out the following for a variant with two types:
```
 Performance counter stats for './variant 2' (5 runs):

          1,807.01 msec task-clock                #    1.000 CPUs utilized            ( +-  0.04% )
                 4      context-switches          #    0.002 K/sec                    ( +- 11.59% )
                 0      cpu-migrations            #    0.000 K/sec                    ( +- 61.24% )
               383      page-faults               #    0.212 K/sec                    ( +-  0.27% )
     8,093,139,812      cycles                    #    4.479 GHz                      ( +-  0.01% )  (62.35%)
         1,393,308      stalled-cycles-frontend   #    0.02% frontend cycles idle     ( +-  5.84% )  (62.52%)
     7,257,955,665      stalled-cycles-backend    #   89.68% backend cycles idle      ( +-  0.08% )  (62.62%)
     4,728,542,717      instructions              #    0.58  insn per cycle
                                                  #    1.53  stalled cycles per insn  ( +-  0.02% )  (62.65%)
       395,189,246      branches                  #  218.698 M/sec                    ( +-  0.02% )  (62.65%)
            17,570      branch-misses             #    0.00% of all branches          ( +- 12.38% )  (62.55%)
     3,806,321,294      L1-dcache-loads           # 2106.424 M/sec                    ( +-  0.02% )  (62.39%)
        16,753,910      L1-dcache-load-misses     #    0.44% of all L1-dcache accesses  ( +-  0.11% )  (62.28%)
   <not supported>      LLC-loads
   <not supported>      LLC-load-misses

          1.807335 +- 0.000776 seconds time elapsed  ( +-  0.04% )

```

Again, performance suffers dramatically: roughly 1.81 s elapsed versus 0.26 s
for the single-type variant.

* [Bug c++/99386] std::variant overhead much larger compared to clang
From: redi at gcc dot gnu.org @ 2021-03-04 14:49 UTC (permalink / raw)
  To: gcc-bugs

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
See PR 78113 and PR 86912
