[Bug c++/109029] New: std::signbit(double) generates very inefficient code

public inbox for gcc-bugs@sourceware.org

* [Bug c++/109029] New: std::signbit(double) generates very inefficient code
@ 2023-03-05 12:08 g.peterhoff@t-online.de
  2023-03-05 12:11 ` [Bug c++/109029] " g.peterhoff@t-online.de
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: g.peterhoff@t-online.de @ 2023-03-05 12:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

            Bug ID: 109029
           Summary: std::signbit(double) generates very inefficient code
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: g.peterhoff@t-online.de
  Target Milestone: ---

Hello,
std::signbit(double) generates very inefficient code and therefore cannot be
vectorized (https://godbolt.org/z/se6Ea8bo9).

Thanks,
Gero

-std=c++20 -march=x86-64-v3 -O3 -mno-vzeroupper

#include <cmath>
#include <array>
#include <cstddef>
#include <numbers>
#include <algorithm>

static constexpr std::size_t Size = 1024;

using float80_t = long double;
using float64_t = double;
using float32_t = float;

// Scalar kernel: returns the sign bit of x.
template <typename Type>
inline constexpr bool foo(const Type x) noexcept
{
    return std::signbit(x);
}

// Scalar kernel: selects pi or 0 depending on the sign bit of x.
template <typename Type>
inline constexpr Type bar(const Type x) noexcept
{
    return std::signbit(x) ? std::numbers::pi_v<Type> : 0;
}

// In-place transform over a container.
template <typename Container, typename Function>
inline constexpr void for_all(Container& cnt, Function&& f) noexcept
{
    std::transform(cnt.begin(), cnt.end(), cnt.begin(), f);
}

// Transform from arg into res.
template <typename ContainerRes, typename ContainerArg, typename Function>
inline constexpr void for_all(ContainerRes& res, const ContainerArg& arg, Function&& f) noexcept
{
    std::transform(arg.begin(), arg.end(), res.begin(), f);
}

// Scalar entry points.
float64_t foo64(const float64_t x) noexcept { return foo(x); }
float32_t foo32(const float32_t x) noexcept { return foo(x); }

float64_t bar64(const float64_t x) noexcept { return bar(x); }
float32_t bar32(const float32_t x) noexcept { return bar(x); }

// Array entry points: these are the loops expected to vectorize.
void foos64(std::array<bool, Size>& res, const std::array<float64_t, Size>& arg) noexcept { for_all(res, arg, foo<float64_t>); }
void foos32(std::array<bool, Size>& res, const std::array<float32_t, Size>& arg) noexcept { for_all(res, arg, foo<float32_t>); }

void bars64(std::array<float64_t, Size>& cnt) noexcept { for_all(cnt, bar<float64_t>); }
void bars32(std::array<float32_t, Size>& cnt) noexcept { for_all(cnt, bar<float32_t>); }


* [Bug c++/109029] std::signbit(double) generates very inefficient code
From: g.peterhoff@t-online.de @ 2023-03-05 12:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

--- Comment #1 from g.peterhoff@t-online.de ---
OK, in English:
std::signbit(double) generates very inefficient code and therefore cannot be
vectorized (https://godbolt.org/z/se6Ea8bo9).


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: pinskia at gcc dot gnu.org @ 2023-03-05 19:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
             Blocks|                            |53947
            Summary|std::signbit(double)        |__builtin_signbit for 64bit
                   |generates very inefficient  |fp does not vectorize
                   |code                        |
          Component|target                      |tree-optimization
   Last reconfirmed|                            |2023-03-05
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
missed:   mismatched vector sizes const vector(2) double and vector(2) int


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: rguenth at gcc dot gnu.org @ 2023-03-06  8:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |13.0

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Simple testcase:

void foo (bool *res, double *arg, int n)
{
  for (int i = 0; i < n; ++i)
    res[i] = __builtin_signbit (arg[i]) != 0;
}
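
To try the reduced testcase locally, something along these lines may help. It is only a sketch: the loop is the one above (with `const` added to the input pointer), `count_mismatches` is a hypothetical helper that checks the loop against `std::signbit`, and the build line in the comment assumes the flags from the original report plus GCC's `-fopt-info-vec-missed` option to print missed-vectorization notes.

```cpp
// Build sketch (flags from the original report plus a GCC diagnostic option):
//   g++ -std=c++20 -O3 -march=x86-64-v3 -fopt-info-vec-missed -c signbit.cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <memory>
#include <vector>

// The loop from the reduced testcase (const added to the input pointer).
void foo(bool *res, const double *arg, int n)
{
  for (int i = 0; i < n; ++i)
    res[i] = __builtin_signbit(arg[i]) != 0;
}

// Hypothetical helper: number of elements where foo() and std::signbit
// disagree. It should be zero whether or not the loop was vectorized.
std::size_t count_mismatches(const std::vector<double> &arg)
{
  auto res = std::make_unique<bool[]>(arg.size());
  foo(res.get(), arg.data(), static_cast<int>(arg.size()));

  std::size_t mismatches = 0;
  for (std::size_t i = 0; i < arg.size(); ++i)
    if (res[i] != std::signbit(arg[i]))
      ++mismatches;
  return mismatches;
}
```

The helper only checks results; whether the loop was vectorized has to be read from the compiler's diagnostic output or the generated assembly.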


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31  3:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
I also found that the documentation for signbitm2 is missing from md.texi.


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31  4:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
We need to support signbit<mode>2 for vector double/_Float16. Also, as with
popcnt, there is a mismatch between the builtin and signbit_optab in their
input and output types; this could be handled in the vectorizer's pattern
matching.
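
The width mismatch can be seen from the scalar types alone: the builtin takes a 64-bit double but returns a 32-bit int, so N doubles and N ints occupy differently sized vectors, which is what the vectorizer message earlier in the thread reports. A minimal sketch, assuming GCC or Clang on a target where double is 8 bytes and int is 4 bytes (as on x86-64); `signbit_result_t` and `vector_bytes` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Result type of the builtin when called with a double argument.
using signbit_result_t = decltype(__builtin_signbit(1.0));
static_assert(std::is_same<signbit_result_t, int>::value,
              "the builtin returns int");

// Assumption: x86-64-like type sizes.
static_assert(sizeof(double) == 8 && sizeof(int) == 4,
              "sketch assumes 8-byte double and 4-byte int");

// Hypothetical helper: bytes occupied by `lanes` elements of type T.
template <typename T>
constexpr std::size_t vector_bytes(std::size_t lanes)
{
  return lanes * sizeof(T);
}

// "vector(2) double" is 16 bytes, "vector(2) int" only 8.
static_assert(vector_bytes<double>(2) == 16, "two doubles fill 16 bytes");
static_assert(vector_bytes<signbit_result_t>(2) == 8, "two ints fill 8 bytes");
```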


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31  5:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #5)
> We need to support signbit<mode>2 for vector double/_Float16. Also, as with
> popcnt, there is a mismatch between the builtin and signbit_optab in their
> input and output types; this could be handled in the vectorizer's pattern
> matching.

After adding support for signbit{v2df,v4df,v8df}2, the vectorizer still fails:
currently we only support simple integer narrowing, which does not cover
v4df->v8si.

  /* First try using an internal function.  */
  tree_code convert_code = ERROR_MARK;
  if (cfn != CFN_LAST
      && (modifier == NONE
          || (modifier == NARROW
              && simple_integer_narrowing (vectype_out, vectype_in,
                                           &convert_code))))
    ifn = vectorizable_internal_function (cfn, callee, vectype_out,
                                          vectype_in);


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31  5:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #6)
> (In reply to Hongtao.liu from comment #5)
> > We need to support signbit<mode>2 for vector double/_Float16. Also, as with
> > popcnt, there is a mismatch between the builtin and signbit_optab in their
> > input and output types; this could be handled in the vectorizer's pattern
> > matching.
>
> After adding support for signbit{v2df,v4df,v8df}2, the vectorizer still
> fails: currently we only support simple integer narrowing, which does not
> cover v4df->v8si.
>
>   /* First try using an internal function.  */
>   tree_code convert_code = ERROR_MARK;
>   if (cfn != CFN_LAST
>       && (modifier == NONE
>           || (modifier == NARROW
>               && simple_integer_narrowing (vectype_out, vectype_in,
>                                            &convert_code))))
>     ifn = vectorizable_internal_function (cfn, callee, vectype_out,
>                                           vectype_in);

One solution is to handle it in ix86_builtin_vectorized_function, just like
other math functions that have different input/output sizes.


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31  6:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---

> 
> One solution is to handle it in ix86_builtin_vectorized_function, just like
> other math functions that have different input/output sizes.

Or extend vectorizable_call to handle the NARROW/WIDEN cases when the output
of the ifn is an integer vector of the same size as vectype_in, and then do
the NARROW/WIDEN conversion between ifn_output and vectype_out.

  /* For now, we only vectorize functions if a target specific builtin
     is available.  TODO -- in some cases, it might be profitable to
     insert the calls for pieces of the vector, in order to be able
     to vectorize other operations in the loop.  */
  fndecl = NULL_TREE;
  internal_fn ifn = IFN_LAST;
  tree callee = gimple_call_fndecl (stmt);

  /* First try using an internal function.  */
  tree_code convert_code = ERROR_MARK;
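
At source level, the two-step idea might look roughly like the sketch below. It is an illustration only, not GCC code: `signbit_same_width` stands in for the internal function, producing one 64-bit integer per 64-bit double so that input and intermediate result have the same vector size, and `narrow_to_bool` stands in for the separate NARROW step to the output type. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Step 1 (stand-in for the internal function): the result has the same
// element width as the input, so lane counts match.
void signbit_same_width(std::uint64_t *tmp, const double *arg, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    tmp[i] = __builtin_signbit(arg[i]) ? 1u : 0u;
}

// Step 2 (stand-in for the NARROW step): plain integer narrowing from the
// 64-bit intermediate to the 8-bit bool output.
void narrow_to_bool(bool *res, const std::uint64_t *tmp, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    res[i] = tmp[i] != 0;
}
```

Splitting the work this way keeps the first step free of any size change, leaving only an integer-to-integer narrowing for the second.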


* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31  7:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #8)
> > 
> > One solution is to handle it in ix86_builtin_vectorized_function, just
> > like other math functions that have different input/output sizes.
>
> Or extend vectorizable_call to handle the NARROW/WIDEN cases when the output
> of the ifn is an integer vector of the same size as vectype_in, and then do
> the NARROW/WIDEN conversion between ifn_output and vectype_out.
>
>   /* For now, we only vectorize functions if a target specific builtin
>      is available.  TODO -- in some cases, it might be profitable to
>      insert the calls for pieces of the vector, in order to be able
>      to vectorize other operations in the loop.  */
>   fndecl = NULL_TREE;
>   internal_fn ifn = IFN_LAST;
>   tree callee = gimple_call_fndecl (stmt);
>
>   /* First try using an internal function.  */
>   tree_code convert_code = ERROR_MARK;

Another alternative is to recognize signbit as a shift in the vectorizer.
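
For reference, the shift form of signbit can be written by hand as in the sketch below. It assumes IEEE 754 binary64 doubles, where the sign is the top bit of the 64-bit pattern; `signbit_via_shift` and `signbits_via_shift` are hypothetical names, and `std::memcpy` is used for the bit reinterpretation so the code does not depend on C++20's `std::bit_cast`. This shows the source-level equivalence only; it makes no claim about how GCC would implement the recognition.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "sketch assumes IEEE 754 binary64 doubles");

// signbit expressed as a logical right shift of the bit pattern.
inline bool signbit_via_shift(double x) noexcept
{
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);  // well-defined reinterpretation
  return (bits >> 63) != 0;             // bit 63 holds the sign
}

// Loop form, matching the shape of the reduced testcase.
void signbits_via_shift(bool *res, const double *arg, int n) noexcept
{
  for (int i = 0; i < n; ++i)
    res[i] = signbit_via_shift(arg[i]);
}
```

Because the shift acts on an integer of the same width as the input, this form avoids the double-to-int width change of the builtin until the final conversion to bool.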


Thread overview: 10+ messages
2023-03-05 12:08 [Bug c++/109029] New: std::signbit(double) generates very inefficient code g.peterhoff@t-online.de
2023-03-05 12:11 ` [Bug c++/109029] " g.peterhoff@t-online.de
2023-03-05 19:18 ` [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize pinskia at gcc dot gnu.org
2023-03-06  8:52 ` rguenth at gcc dot gnu.org
2023-03-31  3:37 ` crazylht at gmail dot com
2023-03-31  4:16 ` crazylht at gmail dot com
2023-03-31  5:47 ` crazylht at gmail dot com
2023-03-31  5:52 ` crazylht at gmail dot com
2023-03-31  6:22 ` crazylht at gmail dot com
2023-03-31  7:19 ` crazylht at gmail dot com
