[Bug target/96654] New: Failure to optimize vectorized conversion to `int` with AVX

public inbox for gcc-bugs@sourceware.org

* [Bug target/96654] New: Failure to optimize vectorized conversion to `int` with AVX
@ 2020-08-17 11:57 gabravier at gmail dot com
  2020-08-17 19:04 ` [Bug tree-optimization/96654] " ubizjak at gmail dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: gabravier at gmail dot com @ 2020-08-17 11:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96654

            Bug ID: 96654
           Summary: Failure to optimize vectorized conversion to `int`
                    with AVX
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

void f(double *src, int *dst)
{
    for (int i = 0; i < 4; i ++)
        dst[i] = (int)src[i];
}

With -O3 -mavx, LLVM outputs this:

f(double*, int*):
  vcvttpd2dq xmm0, ymmword ptr [rdi]
  vmovupd xmmword ptr [rsi], xmm0
  ret

GCC outputs this:

f(double*, int*):
  push rbp
  vmovupd xmm1, XMMWORD PTR [rdi]
  vinsertf128 ymm0, ymm1, XMMWORD PTR [rdi+16], 0x1
  mov rbp, rsp
  vcvttpd2dq xmm0, ymm0
  vmovdqu XMMWORD PTR [rsi], xmm0
  vzeroupper
  pop rbp
  ret

* [Bug tree-optimization/96654] Failure to optimize vectorized conversion to `int` with AVX
  2020-08-17 11:57 [Bug target/96654] New: Failure to optimize vectorized conversion to `int` with AVX gabravier at gmail dot com
@ 2020-08-17 19:04 ` ubizjak at gmail dot com
  2020-08-22 16:32 ` glisse at gcc dot gnu.org
  2020-08-25 10:40 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: ubizjak at gmail dot com @ 2020-08-17 19:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96654

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
          Component|target                      |tree-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-08-17

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
The relevant pattern is present in sse.md:

(define_insn "fix_truncv4dfv4si2<mask_name>"
  [(set (match_operand:V4SI 0 "register_operand" "=v")
        (fix:V4SI (match_operand:V4DF 1 "nonimmediate_operand" "vm")))]
  "TARGET_AVX || (TARGET_AVX512VL && TARGET_AVX512F)"
  "vcvttpd2dq{y}\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"

but for some reason it is not exercised by the target-independent part of the
compiler.

Confirmed as a tree optimization problem.

* [Bug tree-optimization/96654] Failure to optimize vectorized conversion to `int` with AVX
  2020-08-17 11:57 [Bug target/96654] New: Failure to optimize vectorized conversion to `int` with AVX gabravier at gmail dot com
  2020-08-17 19:04 ` [Bug tree-optimization/96654] " ubizjak at gmail dot com
@ 2020-08-22 16:32 ` glisse at gcc dot gnu.org
  2020-08-25 10:40 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: glisse at gcc dot gnu.org @ 2020-08-22 16:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96654

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
GCC doesn't seem very fond of using 2 different vector bit sizes at the same
time, so VEC_PACK_FIX_TRUNC_EXPR takes 2 vectors of 2 doubles and gives one
vector of 4 ints. At the RTL level, we have a vec_concat:V4DF of 2 V2DF values
adjacent in memory, but nothing knows to turn that into a single load. (The
conversion itself of 4 doubles to int is fine.)
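
The two shapes described above can be sketched with GCC/Clang vector
extensions. This is only an illustration, not what the vectorizer emits
internally: the type names (v2df, v4df, v4si) and function names are
hypothetical, and __builtin_convertvector is assumed to be available
(GCC 9 or later, or Clang).

```c
#include <string.h>

/* Hypothetical vector types built with the vector_size extension. */
typedef double v2df __attribute__((vector_size(16)));
typedef double v4df __attribute__((vector_size(32)));
typedef int    v4si __attribute__((vector_size(16)));

/* Shape described in the comment: two 2-double halves are loaded
   separately, concatenated into one 4-double vector, then converted. */
void convert_two_halves(const double *src, int *dst)
{
    v2df lo, hi;
    memcpy(&lo, src, sizeof lo);       /* first 16-byte load */
    memcpy(&hi, src + 2, sizeof hi);   /* second 16-byte load, adjacent */
    v4df all = { lo[0], lo[1], hi[0], hi[1] };  /* models vec_concat:V4DF */
    v4si out = __builtin_convertvector(all, v4si);
    memcpy(dst, &out, sizeof out);
}

/* Desired shape: the two adjacent halves are read as one 32-byte load. */
void convert_single_load(const double *src, int *dst)
{
    v4df all;
    memcpy(&all, src, sizeof all);
    v4si out = __builtin_convertvector(all, v4si);
    memcpy(dst, &out, sizeof out);
}
```

Both functions compute the same result; the difference is only whether the
4 doubles arrive through one load or through two loads plus a concatenation,
which is the part the comment says nothing folds back into a single load.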

* [Bug tree-optimization/96654] Failure to optimize vectorized conversion to `int` with AVX
  2020-08-17 11:57 [Bug target/96654] New: Failure to optimize vectorized conversion to `int` with AVX gabravier at gmail dot com
  2020-08-17 19:04 ` [Bug tree-optimization/96654] " ubizjak at gmail dot com
  2020-08-22 16:32 ` glisse at gcc dot gnu.org
@ 2020-08-25 10:40 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-08-25 10:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96654

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org
             Blocks|                            |53947

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The pattern is exercised directly by BB vectorization only; loop vectorization
still uses a fixed vector size.  Still, the assembly shows basically the same
code when doing BB vectorization only:

f:
.LFB0:
        .cfi_startproc
        vmovupd (%rdi), %xmm1
        vinsertf128     $0x1, 16(%rdi), %ymm1, %ymm0
        vcvttpd2dqy     %ymm0, %xmm0
        vmovdqu %xmm0, (%rsi)
        vzeroupper
        ret

This is probably because of some tuning (splitting unaligned loads, and not
using a memory operand for vcvttpd2dqy).

With -O3 -fno-tree-loop-vectorize -march=core-avx2 I get:

f:
.LFB0:
        .cfi_startproc
        vcvttpd2dqy     (%rdi), %xmm0
        vmovdqu %xmm0, (%rsi)
        ret
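
For readers unfamiliar with BB vectorization, a hand-unrolled variant of the
testcase may help. This is a sketch under the assumption that, with loop
vectorization disabled, the 4-iteration loop is completely unrolled and the
basic-block vectorizer then sees straight-line code of roughly this form;
the name f_unrolled is hypothetical and not part of the report.

```c
/* Original testcase from the report. */
void f(double *src, int *dst)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (int)src[i];
}

/* Hand-unrolled equivalent: four adjacent loads, four conversions and
   four adjacent stores in a single basic block. */
void f_unrolled(double *src, int *dst)
{
    dst[0] = (int)src[0];
    dst[1] = (int)src[1];
    dst[2] = (int)src[2];
    dst[3] = (int)src[3];
}
```

Compiling both with the options quoted above and comparing the assembly is
one way to check whether the two forms end up using the same pattern.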


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
