[Bug target/98647] New: Failure to optimize out conversion from float to vector type

public inbox for gcc-bugs@sourceware.org

From: gabravier at gmail dot com @ 2021-01-13  0:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

            Bug ID: 98647
           Summary: Failure to optimize out conversion from float to
                    vector type
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <immintrin.h>

/* Clear the sign bit of val by masking it inside an SSE register. */
float f(float val)
{
    return _mm_cvtss_f32(_mm_and_ps(_mm_set_ss(val),
                                    _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff))));
}

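The conversion of val to a vector (zeroing a register and inserting val into
its low lane) can be optimized out of the emitted assembly code, since only
the low lane is read back. LLVM performs this optimization, but GCC does not.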
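
For reference, the scalar effect of f can be sketched in portable C as below.
This is an illustrative sketch and not part of the original report: the name
abs_via_mask is hypothetical, and it assumes float is an IEEE 754 binary32
value whose sign is stored in bit 31.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: clears the sign bit of a float.
   Assumes float is IEEE 754 binary32 (sign in bit 31). */
static float abs_via_mask(float val)
{
    uint32_t bits;
    memcpy(&bits, &val, sizeof bits);   /* reinterpret the float as an integer */
    bits &= UINT32_C(0x7fffffff);       /* same mask as _mm_set1_epi32 in f */
    memcpy(&val, &bits, sizeof val);    /* reinterpret back as a float */
    return val;
}
```

Under that assumption only the low 32 bits of the register affect the value
returned by f, whatever the upper lanes contain.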

LLVM code generation:

.LCPI1_0:
  .long 0x7fffffff # float NaN
  .long 0x7fffffff # float NaN
  .long 0x7fffffff # float NaN
  .long 0x7fffffff # float NaN
f(float): # @f(float)
  andps xmm0, xmmword ptr [rip + .LCPI1_0]
  ret

GCC code generation:

f(float):
  pxor xmm1, xmm1
  movss xmm1, xmm0
  movaps xmm0, XMMWORD PTR .LC1[rip]
  andps xmm0, xmm1
  ret

.LC1:
  .long 2147483647
  .long 2147483647
  .long 2147483647
  .long 2147483647

From: pinskia at gcc dot gnu.org @ 2021-01-13  1:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ABI

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think this is an ABI issue.  What does the x86_64 ABI say about the other
bits of the SSE register when passing float to a function?
It might be the case that clang/llvm is not implementing the ABI correctly.

From: gabravier at gmail dot com @ 2021-01-13  1:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

--- Comment #2 from Gabriel Ravier <gabravier at gmail dot com> ---
I have just looked at the ABI. It only says that floats/doubles are passed
in SSE registers, and does not seem to explicitly specify whether the upper
bits are cleared or not. (It does explicitly specify that only the first 8 bits
of a _Bool return are specified; I would tend to deduce from that that, by
default, the upper bits of a value are specified to be cleared.)

From: jakub at gcc dot gnu.org @ 2021-01-13  7:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98647

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |matz at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
If the psABI doesn't say those upper parts of the register have to be cleared
(explicitly), then it means their content is undefined in the register passing.
I believe even LLVM doesn't assume it has to be zeros, otherwise it wouldn't
compile:
typedef float V __attribute__((vector_size (16), may_alias));
float foo (V x) { return x[0]; }
into just retq - if the psABI mandated clearing of the upper bits, then we'd
need to clear in that case.

Anyway, as only the low float is extracted from it in the end, it might be best
to just change the mask into 0x7fffffff, 0, 0, 0 and then ignore any code that
would only ensure those upper floats are initialized properly.

Testcase with generic vectors so that intrinsics don't interfere with this:
typedef float V __attribute__((vector_size (16), may_alias));
typedef int W __attribute__((vector_size (16), may_alias));

float
foo (float x)
{
  V a = (V) { x, 0.0, 0.0, 0.0 };
  W b = *(W *)&a;
  b &= (W) { 0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff };
  V c = *(V *)&b;
  return c[0];
}

V
bar (float x)
{
  V a = (V) { x, 0.0, 0.0, 0.0 };
  W b = *(W *)&a;
  b &= (W) { 0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff };
  V c = *(V *)&b;
  return c;
}
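
The narrowed mask suggested in comment #3 could look like the following
variant of foo. This is a sketch only: the name foo_low_mask is hypothetical,
and this thread does not confirm whether GCC then drops the initialization of
the upper lanes.

```c
typedef float V __attribute__((vector_size (16), may_alias));
typedef int W __attribute__((vector_size (16), may_alias));

/* Hypothetical variant of foo: only the low lane of the mask is set,
   because only c[0] is ever read back. */
float
foo_low_mask (float x)
{
  V a = (V) { x, 0.0f, 0.0f, 0.0f };
  W b = *(W *)&a;
  b &= (W) { 0x7fffffff, 0, 0, 0 };   /* upper lanes become zero */
  V c = *(V *)&b;
  return c[0];
}
```

The returned value is the same as for foo, since the upper three lanes never
reach the result. The same change would not be valid for bar, which returns
the whole vector.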
