public inbox for gcc-bugs@sourceware.org
* [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX
@ 2021-04-02  3:49 crazylht at gmail dot com
  2021-04-02 14:29 ` [Bug target/99881] " hjl.tools at gmail dot com
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: crazylht at gmail dot com @ 2021-04-02  3:49 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881

            Bug ID: 99881
           Summary: Regression compare -O2 -ftree-vectorize with -O2 on
                    SKX/CLX
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

The testcase is extracted from 557.xz_r:

void
foo (int* __restrict a, int n, int c)
{
    a[0] = n;
    a[1] = c;
}

gcc -O2 -ftree-vectorize -fvect-cost-model=very-cheap

foo(int*, int, int):
        movd    xmm0, esi
        movd    xmm1, edx
        punpckldq       xmm0, xmm1
        movq    QWORD PTR [rdi], xmm0
        ret

without vectorization

foo(int*, int, int):
        mov     DWORD PTR [rdi], esi
        mov     DWORD PTR [rdi+4], edx
        ret

cost model:
scalar: 2 times scalar_store costs 24,
vector: 1 times unaligned_store costs 12, vec_construct 8
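
To make the comparison above concrete, here is a minimal sketch of the arithmetic, using only the three figures quoted in this report. The helper names and the strict "vector cost below scalar cost" rule are assumptions for illustration, not GCC's actual implementation.

```c
/* Cost figures quoted in the report (vectorizer cost units). */
enum {
    SCALAR_STORES_COST   = 24, /* 2 times scalar_store */
    UNALIGNED_STORE_COST = 12, /* 1 times unaligned_store */
    VEC_CONSTRUCT_COST   = 8   /* building the 2-element vector */
};

/* Hypothetical helper: total vector cost, plus an optional extra cost. */
static int vector_cost(int extra)
{
    return UNALIGNED_STORE_COST + VEC_CONSTRUCT_COST + extra;
}

/* Assumption: vectorization is chosen only when it is strictly cheaper. */
static int vectorize_profitable(int extra)
{
    return vector_cost(extra) < SCALAR_STORES_COST;
}
```

With no extra cost the vector side totals 20 against 24 for the scalar side, so under this rule the vector form is chosen even though it is slower on SKX/CLX.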

I know that the current strategy of the cost model is to enable vectorization
as much as possible, but in the case above it hurts performance, because the
throughput of punpckldq is 1 on SKX/CLX, which becomes a bottleneck (znver2 is
OK). With -march=SKX, the second vmovd and the unpck are replaced by vpinsr,
and it regresses further, since vpinsr has a throughput of 2 on CLX/SKX.

So I'm thinking of adding an extra cost for 2-element vec_construct to prevent
the above vectorization while trying not to affect other vectorization
situations.
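
A rough sketch of what such an adjustment might look like, reduced to plain arithmetic. Both functions and the penalty parameter are hypothetical, not GCC code; the sketch also assumes that vectorization is taken only when the vector cost is strictly lower than the scalar cost.

```c
/* Hypothetical: cost of constructing a vector from nelts scalars,
   with an extra penalty applied only to the 2-element case. */
static int vec_construct_cost(int nelts, int base_cost, int two_elt_penalty)
{
    return base_cost + (nelts == 2 ? two_elt_penalty : 0);
}

/* Hypothetical: smallest penalty that makes the vector cost reach the
   scalar cost, so that a strict "vector < scalar" test fails. */
static int min_penalty_to_block(int scalar_cost, int store_cost,
                                int construct_cost)
{
    int gap = scalar_cost - (store_cost + construct_cost);
    return gap > 0 ? gap : 0;
}
```

With the figures from this report (scalar 24, unaligned_store 12, vec_construct 8), the gap is 4, so under these assumptions any extra cost of 4 or more on the 2-element construct would stop this case from being vectorized, while constructs with other element counts keep their base cost.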


end of thread, other threads:[~2022-02-22  8:00 UTC | newest]

Thread overview: 14+ messages
2021-04-02  3:49 [Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX crazylht at gmail dot com
2021-04-02 14:29 ` [Bug target/99881] " hjl.tools at gmail dot com
2021-04-02 19:34 ` hjl.tools at gmail dot com
2021-04-06  7:48 ` rguenth at gcc dot gnu.org
2021-04-06 10:06 ` crazylht at gmail dot com
2021-04-06 11:44 ` rguenth at gcc dot gnu.org
2021-07-28  2:48 ` cvs-commit at gcc dot gnu.org
2021-07-28  2:49 ` crazylht at gmail dot com
2021-07-28 22:47 ` jakub at gcc dot gnu.org
2021-07-29  1:09 ` crazylht at gmail dot com
2021-07-29  2:18 ` cvs-commit at gcc dot gnu.org
2021-08-19  2:32 ` crazylht at gmail dot com
2022-02-22  7:59 ` cvs-commit at gcc dot gnu.org
2022-02-22  8:00 ` rguenth at gcc dot gnu.org
