public inbox for gcc-bugs@sourceware.org
* [Bug target/98647] New: Failure to optimize out convertion from float to vector type
From: gabravier at gmail dot com @ 2021-01-13  0:57 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

            Bug ID: 98647
           Summary: Failure to optimize out convertion from float to
                    vector type
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

float f(float val)
{
    return _mm_cvtss_f32(_mm_and_ps(_mm_set_ss(val),
        _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff))));
}

This can be optimized to avoid the conversions in the emitted assembly code.
This optimization is done by LLVM, but not by GCC.

LLVM code generation:

.LCPI1_0:
        .long   0x7fffffff      # float NaN
        .long   0x7fffffff      # float NaN
        .long   0x7fffffff      # float NaN
        .long   0x7fffffff      # float NaN
f(float):                       # @f(float)
        andps   xmm0, xmmword ptr [rip + .LCPI1_0]
        ret

GCC code generation:

f(float):
        pxor    xmm1, xmm1
        movss   xmm1, xmm0
        movaps  xmm0, XMMWORD PTR .LC1[rip]
        andps   xmm0, xmm1
        ret
.LC1:
        .long   2147483647
        .long   2147483647
        .long   2147483647
        .long   2147483647
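For readers following along: the intrinsic sequence in f() clears bit 31 of the low float, which is the IEEE 754 sign bit, so the function amounts to an absolute value. Below is a minimal scalar sketch of the same operation in portable C. The name abs_by_mask is hypothetical and does not appear in the report, and the sketch assumes float is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 single, as on x86-64. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper (not from the report): scalar equivalent of
   masking the low lane with 0x7fffffff. */
float abs_by_mask(float val)
{
    uint32_t bits;
    memcpy(&bits, &val, sizeof bits);   /* view the float as raw bits */
    bits &= UINT32_C(0x7fffffff);       /* clear the sign bit (bit 31) */
    memcpy(&val, &bits, sizeof val);    /* view the bits as a float again */
    return val;
}
```

For example, abs_by_mask(-1.5f) yields 1.5f. This only illustrates what the testcase computes; it says nothing about how either compiler should generate code for it.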
* [Bug target/98647] Failure to optimize out convertion from float to vector type
From: pinskia at gcc dot gnu.org @ 2021-01-13  1:07 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ABI

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think this is an ABI issue. What does the x86_64 ABI say about the other
bits of the SSE register when passing float to a function? It might be the
case that clang/llvm is not implementing the ABI correctly.
* [Bug target/98647] Failure to optimize out convertion from float to vector type
From: gabravier at gmail dot com @ 2021-01-13  1:24 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
I have just looked at the ABI and it just says that floats/doubles are passed
in SSE registers, but does not seem to explicitly specify whether the upper
bits are cleared or not (it explicitly specifies that only the first 8 bits of
a _Bool return are specified: I would tend to deduce from that that, by
default, the upper bits of a value are specified to be cleared).
* [Bug target/98647] Failure to optimize out convertion from float to vector type
From: jakub at gcc dot gnu.org @ 2021-01-13  7:01 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |matz at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
If the psABI doesn't say those upper parts of the register have to be cleared
(explicitly), then it means their content is undefined in the register passing.
I believe even LLVM doesn't assume it has to be zeros, otherwise it wouldn't
compile:

typedef float V __attribute__((vector_size (16), may_alias));

float
foo (V x)
{
  return x[0];
}

into just retq - if the psABI mandated clearing of the upper bits, then we'd
need to clear in that case.

Anyway, as only the low float is extracted from it in the end, it might be
best to just change the mask into 0x7fffffff, 0, 0, 0 and then ignore any code
that would only ensure those upper floats are initialized properly.
Testcase with generic vectors so that intrinsics don't interfere with this:

typedef float V __attribute__((vector_size (16), may_alias));
typedef int W __attribute__((vector_size (16), may_alias));

float
foo (float x)
{
  V a = (V) { x, 0.0, 0.0, 0.0 };
  W b = *(W *)&a;
  b &= (W) { 0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff };
  V c = *(V *)&b;
  return c[0];
}

V
bar (float x)
{
  V a = (V) { x, 0.0, 0.0, 0.0 };
  W b = *(W *)&a;
  b &= (W) { 0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff };
  V c = *(V *)&b;
  return c;
}
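The suggestion in comment #3 to change the mask into 0x7fffffff, 0, 0, 0 rests on the observation that only lane 0 is read back. A rough sketch of that equivalence follows, using the GCC/Clang generic vector extension. The function names are hypothetical, and memcpy replaces the pointer casts of the testcase purely to keep the sketch independent of may_alias.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang generic vectors: four 32-bit lanes each. */
typedef float   V __attribute__((vector_size (16)));
typedef int32_t W __attribute__((vector_size (16)));

static_assert(sizeof(V) == sizeof(W), "vector types must have equal size");

/* Hypothetical helper: AND the bits of a with mask, return lane 0 only. */
float low_lane_masked(V a, W mask)
{
    W b;
    V c;
    memcpy(&b, &a, sizeof b);   /* reinterpret the float lanes as integers */
    b &= mask;                  /* lane-wise AND */
    memcpy(&c, &b, sizeof c);   /* reinterpret back as floats */
    return c[0];                /* upper three lanes are never observed */
}

/* Mask as written in the testcase: all four lanes set. */
float with_full_mask(float x)
{
    V a = { x, 0.0f, 0.0f, 0.0f };
    W m = { 0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff };
    return low_lane_masked(a, m);
}

/* Mask as proposed in comment #3: only lane 0 set. */
float with_low_mask(float x)
{
    V a = { x, 0.0f, 0.0f, 0.0f };
    W m = { 0x7fffffff, 0, 0, 0 };
    return low_lane_masked(a, m);
}
```

Both functions return the same value for any input, since the masks differ only in lanes that are discarded. The same does not hold for bar() in the testcase, which returns the whole vector, so there the upper lanes still matter.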