public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/101668] New: vectorizer doesn't categorize vector construct cost right.
@ 2021-07-29  1:47 crazylht at gmail dot com
  2021-07-29  6:55 ` [Bug tree-optimization/101668] BB vectorizer doesn't handle lowpart of existing vector rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: crazylht at gmail dot com @ 2021-07-29  1:47 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101668

            Bug ID: 101668
           Summary: vectorizer doesn't categorize vector construct cost
                    right.
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

cat test.c

typedef int v16si __attribute__((vector_size (64)));
typedef long long v8di __attribute__((vector_size (64)));

void
bar_s32_s64 (v8di * dst, v16si src)
{
  long long tem[8];
  tem[0] = src[0];
  tem[1] = src[1];
  tem[2] = src[2];
  tem[3] = src[3];
  tem[4] = src[4];
  tem[5] = src[5];
  tem[6] = src[6];
  tem[7] = src[7];
  dst[0] = *(v8di *) tem;
}

With gcc -O3 -march=skylake-avx512, this case is no longer vectorized after my
r12-2549, because I increased the vec_construct cost for SKX/CLX. Here's the
dump for slp2:

  <bb 2> [local count: 1073741824]:
  _1 = BIT_FIELD_REF <src_18(D), 32, 0>;
  _2 = (long long int) _1;
  _3 = BIT_FIELD_REF <src_18(D), 32, 32>;
  _4 = (long long int) _3;
  _5 = BIT_FIELD_REF <src_18(D), 32, 64>;
  _6 = (long long int) _5;
  _7 = BIT_FIELD_REF <src_18(D), 32, 96>;
  _8 = (long long int) _7;
  _9 = BIT_FIELD_REF <src_18(D), 32, 128>;
  _10 = (long long int) _9;
  _11 = BIT_FIELD_REF <src_18(D), 32, 160>;
  _12 = (long long int) _11;
  _13 = BIT_FIELD_REF <src_18(D), 32, 192>;
  _14 = (long long int) _13;
  _15 = BIT_FIELD_REF <src_18(D), 32, 224>;
  _31 = {_1, _3, _5, _7, _9, _11, _13, _15};
  vect__2.4_32 = (vector(8) long long int) _31;
  _16 = (long long int) _15;
  MEM <vector(8) long long int> [(long long int *)&tem] = vect__2.4_32;
  _17 = MEM[(v8di *)&tem];
  *dst_28(D) = _17;
  tem ={v} {CLOBBER};
  return;

But actually, there's no need for a vec_construct from each element; it will
later be optimized to

  <bb 2> [local count: 1073741824]:
  _2 = BIT_FIELD_REF <src_18(D), 256, 0>;
  vect__2.4_32 = (vector(8) long long int) _2;
  *dst_28(D) = vect__2.4_32;
  return;

So if slp2 could recognize this optimization at that point and categorize the
vec_construct cost more accurately, we could avoid this regression.
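
For illustration only, the optimized GIMPLE above (take the low 256 bits of src
and widen them in a single conversion) can be approximated directly in source.
This is a sketch, not what the vectorizer does internally: the names
lowpart_s32_s64 and lowpart_matches_scalar and the v8si typedef are
hypothetical and not part of the report, src is passed by pointer instead of
by value, and __builtin_convertvector is assumed to be available (GCC 9 or
later).

```c
#include <string.h>

typedef int v16si __attribute__((vector_size (64)));
typedef int v8si __attribute__((vector_size (32)));   /* hypothetical helper type */
typedef long long v8di __attribute__((vector_size (64)));

/* Hypothetical helper: widen the low eight int lanes of *src to long long.
   Roughly the source-level analogue of
     _2 = BIT_FIELD_REF <src, 256, 0>;
     vect = (vector(8) long long int) _2;  */
static void
lowpart_s32_s64 (v8di *dst, const v16si *src)
{
  v8si lo;
  /* Vector elements are laid out in memory like an array, so the first
     32 bytes hold elements 0..7.  */
  memcpy (&lo, src, sizeof lo);
  *dst = __builtin_convertvector (lo, v8di);
}

/* Compare against the element-by-element form used in test.c.
   Returns 1 when all eight lanes agree, 0 otherwise.  */
static int
lowpart_matches_scalar (const v16si *src)
{
  v8di got;
  lowpart_s32_s64 (&got, src);
  for (int i = 0; i < 8; i++)
    if (got[i] != (long long) (*src)[i])
      return 0;
  return 1;
}
```

Whether a given compiler emits a single widening instruction for this form
depends on the target and options; the sketch only shows that the eight
per-element extracts are semantically one lowpart extract plus one conversion.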


Thread overview: 9+ messages
2021-07-29  1:47 [Bug tree-optimization/101668] New: vectorizer doesn't categorize vector construct cost right crazylht at gmail dot com
2021-07-29  6:55 ` [Bug tree-optimization/101668] BB vectorizer doesn't handle lowpart of existing vector rguenth at gcc dot gnu.org
2021-07-29  7:03 ` crazylht at gmail dot com
2022-05-20  9:03 ` rguenth at gcc dot gnu.org
2022-05-20  9:13 ` crazylht at gmail dot com
2022-05-20  9:25 ` rguenth at gcc dot gnu.org
2022-05-25 13:05 ` rguenth at gcc dot gnu.org
2022-06-02  6:46 ` cvs-commit at gcc dot gnu.org
2022-06-02  6:47 ` rguenth at gcc dot gnu.org
