public inbox for gcc-bugs@sourceware.org
From: "mkretz at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/77776] C++17 std::hypot implementation is poor
Date: Mon, 04 Mar 2024 17:14:38 +0000
Message-ID: <bug-77776-4-2qPy5glAji@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-77776-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77776

--- Comment #17 from Matthias Kretz (Vir) <mkretz at gcc dot gnu.org> ---
hypotf(a, b) is implemented using double precision, and hypot(a, b) uses 80-bit
long double on i386 and x86_64. So hypot already does what you describe, right?
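
To illustrate what "implemented using double precision" buys you, here is a
minimal sketch. It is not glibc's source, and the name hypotf_via_double is
made up for this example. The point is only that squares of finite floats
neither overflow nor underflow in double, so no rescaling of the inputs is
needed:

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper, not glibc's code: compute a float hypot by doing the
// arithmetic in double. The square of any finite float fits in double without
// overflow or underflow, so the inputs do not need to be rescaled.
float hypotf_via_double(float a, float b)
{
  // hypot returns +inf if either argument is infinite, even if the other is NaN.
  if (std::isinf(a) || std::isinf(b))
    return std::numeric_limits<float>::infinity();
  const double x = a;
  const double y = b;
  return static_cast<float>(std::sqrt(x * x + y * y));
}
```

With this, hypotf_via_double(3e38f, 1e38f) stays finite, whereas squaring
3e38f in float would overflow to infinity.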

std::experimental::simd benchmarks of hypot(a, b), where simd_abi::scalar uses
the <cmath> implementation (i.e. glibc):


-march=skylake-avx512 -ffast-math -O3 -lmvec:
              TYPE                      Latency     Speedup     Throughput     Speedup
                                  [cycles/call] [per value]  [cycles/call] [per value]
 float, simd_abi::scalar                   37.5           1           11.5           1
 float,                                    37.6       0.999           10.2        1.13
 float, simd_abi::__sse                      34        4.42           6.46        7.15
 float, simd_abi::__avx                    34.1        8.79           6.56        14.1
 float, simd_abi::_Avx512<32>              34.3        8.76           6.01        15.4
 float, simd_abi::_Avx512<64>              44.1        13.6             12        15.4
 float, [[gnu::vector_size(16)]]           58.3        2.57           47.5       0.974
 float, [[gnu::vector_size(32)]]            132        2.27            104       0.892
 float, [[gnu::vector_size(64)]]            240         2.5            222       0.832
--------------------------------------------------------------------------------------
              TYPE                      Latency     Speedup     Throughput     Speedup
                                  [cycles/call] [per value]  [cycles/call] [per value]
double, simd_abi::scalar                     81           1           21.5           1
double,                                    80.1        1.01           21.3        1.01
double, simd_abi::__sse                    39.9        4.06           6.47        6.64
double, simd_abi::__avx                    40.2        8.05             12        7.14
double, simd_abi::_Avx512<32>              40.3        8.04             12        7.14
double, simd_abi::_Avx512<64>              56.2        11.5             24        7.14
double, [[gnu::vector_size(16)]]           89.3        1.81           42.5        1.01
double, [[gnu::vector_size(32)]]            150        2.16            110       0.777
double, [[gnu::vector_size(64)]]            297        2.18            242        0.71
--------------------------------------------------------------------------------------

-march=skylake-avx512 -O3 -lmvec:
              TYPE                      Latency     Speedup     Throughput     Speedup
                                  [cycles/call] [per value]  [cycles/call] [per value]
 float, simd_abi::scalar                   37.6           1           10.4           1
 float,                                    37.7       0.998           10.2        1.02
 float, simd_abi::__sse                    37.6           4           8.83        4.71
 float, simd_abi::__avx                    37.5        8.01           9.42        8.82
 float, simd_abi::_Avx512<64>              47.8        12.6             12        13.8
 float, [[gnu::vector_size(16)]]           98.7        1.52           57.2       0.727
 float, [[gnu::vector_size(32)]]            151           2            114       0.728
 float, [[gnu::vector_size(64)]]            260        2.31            230       0.722
--------------------------------------------------------------------------------------
              TYPE                      Latency     Speedup     Throughput     Speedup
                                  [cycles/call] [per value]  [cycles/call] [per value]
double, simd_abi::scalar                   79.7           1           21.7           1
double,                                    80.1       0.995           21.6           1
double, simd_abi::__sse                    44.2         3.6           9.99        4.33
double, simd_abi::__avx                    43.6        7.32             12        7.21
double, simd_abi::_Avx512<64>              59.9        10.6             24        7.21
double, [[gnu::vector_size(16)]]           88.3         1.8           44.2        0.98
double, [[gnu::vector_size(32)]]            163        1.96            115        0.75
double, [[gnu::vector_size(64)]]            302        2.11            233       0.742
--------------------------------------------------------------------------------------

I have never ported my SIMD implementation back to scalar and benchmarked it
against glibc.
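
For reference, the kind of call being compared looks roughly like the sketch
below. It assumes libstdc++'s <experimental/simd> (GCC 11 or later, C++17);
hypot_lane0 is a hypothetical helper for this example and is not part of the
benchmark code:

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical driver: broadcast two scalars into every lane of a simd object
// with the given ABI, call the simd overload of hypot, and read back lane 0.
// With simd_abi::scalar the simd object holds exactly one value.
template <typename Abi>
float hypot_lane0(float a, float b)
{
  const stdx::simd<float, Abi> va(a);  // broadcast constructor
  const stdx::simd<float, Abi> vb(b);
  const stdx::simd<float, Abi> r = stdx::hypot(va, vb);
  return r[0];
}
```

hypot_lane0<stdx::simd_abi::scalar>(3.0f, 4.0f) exercises the scalar ABI, and
hypot_lane0<stdx::simd_abi::native<float>>(3.0f, 4.0f) the widest ABI the
target flags enable.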

