public inbox for gcc-bugs@sourceware.org
* [Bug c++/109029] New: std::signbit(double) generates very inefficient code
From: g.peterhoff@t-online.de @ 2023-03-05 12:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
Bug ID: 109029
Summary: std::signbit(double) generates very inefficient code
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: g.peterhoff@t-online.de
Target Milestone: ---
Hello,
std::signbit(double) generates very inefficient code and therefore cannot be
vectorized (https://godbolt.org/z/se6Ea8bo9).
Thanks,
Gero
Compiler options: -std=c++20 -march=x86-64-v3 -O3 -mno-vzeroupper

#include <cmath>
#include <array>
#include <cstddef>
#include <numbers>
#include <algorithm>

static constexpr std::size_t Size = 1024;

using float80_t = long double;
using float64_t = double;
using float32_t = float;

// Returns true if the sign bit of x is set.
template <typename Type>
inline constexpr bool foo(const Type x) noexcept
{
    return std::signbit(x);
}

// Selects pi or 0 depending on the sign bit of x.
template <typename Type>
inline constexpr Type bar(const Type x) noexcept
{
    return std::signbit(x) ? std::numbers::pi_v<Type> : 0;
}

// Applies f to every element of cnt in place.
template <typename Container, typename Function>
inline constexpr void for_all(Container& cnt, Function&& f) noexcept
{
    std::transform(cnt.begin(), cnt.end(), cnt.begin(), f);
}

// Applies f to every element of arg and stores the results in res.
template <typename ContainerRes, typename ContainerArg, typename Function>
inline constexpr void for_all(ContainerRes& res, const ContainerArg& arg,
                              Function&& f) noexcept
{
    std::transform(arg.begin(), arg.end(), res.begin(), f);
}

// Scalar versions.
float64_t foo64(const float64_t x) noexcept { return foo(x); }
float32_t foo32(const float32_t x) noexcept { return foo(x); }
float64_t bar64(const float64_t x) noexcept { return bar(x); }
float32_t bar32(const float32_t x) noexcept { return bar(x); }

// Array versions: these are the loops expected to be vectorized.
void foos64(std::array<bool, Size>& res,
            const std::array<float64_t, Size>& arg) noexcept
{ for_all(res, arg, foo<float64_t>); }

void foos32(std::array<bool, Size>& res,
            const std::array<float32_t, Size>& arg) noexcept
{ for_all(res, arg, foo<float32_t>); }

void bars64(std::array<float64_t, Size>& cnt) noexcept
{ for_all(cnt, bar<float64_t>); }

void bars32(std::array<float32_t, Size>& cnt) noexcept
{ for_all(cnt, bar<float32_t>); }
* [Bug c++/109029] std::signbit(double) generates very inefficient code
From: g.peterhoff@t-online.de @ 2023-03-05 12:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
--- Comment #1 from g.peterhoff@t-online.de ---
OK, in English:
std::signbit(double) generates very inefficient code and thus cannot be
vectorized (https://godbolt.org/z/se6Ea8bo9).
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: pinskia at gcc dot gnu.org @ 2023-03-05 19:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
             Blocks|                            |53947
            Summary|std::signbit(double)        |__builtin_signbit for 64bit
                   |generates very inefficient  |fp does not vectorize
                   |code                        |
          Component|target                      |tree-optimization
   Last reconfirmed|                            |2023-03-05
     Ever confirmed|0                           |1
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
missed: mismatched vector sizes const vector(2) double and vector(2) int
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: rguenth at gcc dot gnu.org @ 2023-03-06 8:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |13.0
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Simple testcase:
void foo (bool *res, double *arg, int n)
{
  for (int i = 0; i < n; ++i)
    res[i] = __builtin_signbit (arg[i]) != 0;
}
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31 3:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
Also found that the documentation for signbitm2 is missing from md.texi.
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31 4:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
We need to support signbit<mode>2 for vector double/_Float16. Also, as with
popcnt, the input and output types differ between the builtin and
signbit_optab; this could be handled in the vectorizer's pattern matching.
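The type mismatch mentioned above can be seen at the source level. This is only
a small sketch: `result_width_matches` is a hypothetical helper introduced for
illustration, and the widths assume a target such as x86-64 where int is 4
bytes and double is 8 bytes.

```cpp
#include <type_traits>

// The scalar builtin returns `int`, whatever the width of its argument.
static_assert(std::is_same_v<decltype(__builtin_signbit(1.0)), int>);
static_assert(std::is_same_v<decltype(__builtin_signbit(1.0f)), int>);

// Hypothetical helper (not part of GCC): true when the builtin's `int`
// result is as wide as the floating-point argument type Fp.
template <typename Fp>
constexpr bool result_width_matches = sizeof(Fp) == sizeof(int);
```

Under those assumptions the widths agree for float but not for double, which
matches the thread's observation that only the 64-bit case is affected.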
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31 5:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #5)
> We need to support signbit<mode>2 for vector double/_Float16. Also similar
> like popcnt, there's a mismatch of input and output between builtin and
> signbit_optab, it could be handled in vectorizer pattern match.
Even after adding support for signbit{v2df,v4df,v8df}2, the vectorizer still
fails: currently we only support simple integer narrowing, which does not cover
v4df->v8si.
  /* First try using an internal function.  */
  tree_code convert_code = ERROR_MARK;
  if (cfn != CFN_LAST
      && (modifier == NONE
          || (modifier == NARROW
              && simple_integer_narrowing (vectype_out, vectype_in,
                                           &convert_code))))
    ifn = vectorizable_internal_function (cfn, callee, vectype_out,
                                          vectype_in);
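As a rough illustration of why v4df->v8si counts as a narrowing case: for a
fixed vector size, the lane count depends on the element width. `lanes` is a
hypothetical helper, and the numbers assume 8-byte double and 4-byte int.

```cpp
#include <cstddef>

// Hypothetical helper: number of elements of type T that fit in a vector
// register of `bytes` bytes.
template <typename T>
constexpr std::size_t lanes(std::size_t bytes)
{
    return bytes / sizeof(T);
}
```

Under those assumptions a 32-byte vector holds 4 doubles (v4df) but 8 ints
(v8si), so the call's input and output vector types have different lane
counts even though the registers are the same size.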
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31 5:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #6)
> (In reply to Hongtao.liu from comment #5)
> > We need to support signbit<mode>2 for vector double/_Float16. Also similar
> > like popcnt, there's a mismatch of input and output between builtin and
> > signbit_optab, it could be handled in vectorizer pattern match.
>
> After support signbit{v2df,v4df,v8df}2, vectorizer still failed, currently,
> we only support simple integer narrowing, but not for v4df->v8si.
>
>   /* First try using an internal function.  */
>   tree_code convert_code = ERROR_MARK;
>   if (cfn != CFN_LAST
>       && (modifier == NONE
>           || (modifier == NARROW
>               && simple_integer_narrowing (vectype_out, vectype_in,
>                                            &convert_code))))
>     ifn = vectorizable_internal_function (cfn, callee, vectype_out,
>                                           vectype_in);
One solution is to handle it in ix86_builtin_vectorized_function, just like
other math functions that have different input/output sizes.
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31 6:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
>
> One solution is handling it in ix86_builtin_vectorized_function just like
> other math functions which has different input/output sizes.
Or extend vectorizable_call to handle the NARROW/WIDEN cases when the output of
the ifn is a vector integer of the same size as vectype_in, and then do the
NARROW/WIDEN between ifn_output and vectype_out.
  /* For now, we only vectorize functions if a target specific builtin
     is available.  TODO -- in some cases, it might be profitable to
     insert the calls for pieces of the vector, in order to be able
     to vectorize other operations in the loop.  */
  fndecl = NULL_TREE;
  internal_fn ifn = IFN_LAST;
  tree callee = gimple_call_fndecl (stmt);

  /* First try using an internal function.  */
  tree_code convert_code = ERROR_MARK;
* [Bug tree-optimization/109029] __builtin_signbit for 64bit fp does not vectorize
From: crazylht at gmail dot com @ 2023-03-31 7:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109029
--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #8)
> >
> > One solution is handling it in ix86_builtin_vectorized_function just like
> > other math functions which has different input/output sizes.
>
> Or extend vectorizable_call to handle NARROW/WIDEN cases when output of ifn
> is same-sized vector integer of vectype_in, and do NARROW/WIDEN between
> ifn_output and vectype_out.
>
>   /* For now, we only vectorize functions if a target specific builtin
>      is available.  TODO -- in some cases, it might be profitable to
>      insert the calls for pieces of the vector, in order to be able
>      to vectorize other operations in the loop.  */
>   fndecl = NULL_TREE;
>   internal_fn ifn = IFN_LAST;
>   tree callee = gimple_call_fndecl (stmt);
>
>   /* First try using an internal function.  */
>   tree_code convert_code = ERROR_MARK;
Another alternative is to recognize signbit as a shift in the vectorizer.