public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/101621] New: gcc cannot optimize int8_t vector assign with subscription to shuffle
From: yumeyao at gmail dot com @ 2021-07-26  1:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

            Bug ID: 101621
           Summary: gcc cannot optimize int8_t vector assign with
                    subscription to shuffle
           Product: gcc
           Version: 11.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yumeyao at gmail dot com
  Target Milestone: ---

https://gcc.godbolt.org/z/91cqenf99

typedef char v16b __attribute__((vector_size(16)));

To sum up, regarding optimizing v = { v[n] ...} into a shuffle when targeting
Intel x86 (x86_64):
There is a missed optimization when one of the elements is zero.
There is a regression starting with gcc 9.
So this might be two issues, but I think a proper fix could resolve both.


* gcc can optimize an int8_t vector assignment built from subscripts of the
same vector into a shuffle, like this:
v16b gcc_can_shuffle(v16b b) {
    return (v16b) {b[0], b[0], b[0], b[0], b[4], b[4], b[4], b[4],
                   b[8], b[8], b[8], b[8], b[12], b[12], b[12], b[12]};
}

* However, if one of the elements is zero, gcc cannot do this. The hardware
does support it on Intel x86: a negative shuffle index (the high bit of the
control byte set) produces a zero byte.
Clang has performed this optimization since clang 5.

* Furthermore, there is a regression:
gcc <= 8 can always optimize it, but starting with gcc 9, if there is a cast,
the optimization fails:
typedef long v2si64 __attribute__((vector_size(16)));
v16b gcc_cannot_shuffle_with_cast(v2si64 x) {
    v16b b = (v16b)x;
    v16b b0 = {b[0], b[0], b[0], b[0], b[4], b[4], b[4], b[4],
               b[8], b[8], b[8], b[8], b[12], b[12], b[12], b[12]};
    return b0;
}
gcc 11 can optimize it at -O3, but not at -O1 or -O2.
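
The zero case described above can be sketched as follows. This is only an illustration: the function name shuffle_with_zero and the choice of which lanes are zero are assumptions, not the code behind the godbolt link. pshufb_model is a scalar model of a PSHUFB-style byte shuffle, written here to show that a control byte with its high bit set yields zero, so the initializer is expressible as a single shuffle.

```c
#include <assert.h>
#include <string.h>

typedef char v16b __attribute__((vector_size(16)));

/* Hypothetical "with a zero" case: lanes 3, 7, 11 and 15 are constant
   zero, every other lane is a copy of a lane of b. */
v16b shuffle_with_zero(v16b b) {
    return (v16b){b[0],  b[0],  b[0],  0, b[4],  b[4],  b[4],  0,
                  b[8],  b[8],  b[8],  0, b[12], b[12], b[12], 0};
}

/* Scalar model of a PSHUFB-style byte shuffle: a control byte with its
   high bit set (negative as a signed byte) yields 0; otherwise the low
   four bits select a source byte. */
v16b pshufb_model(v16b src, v16b ctrl) {
    v16b out;
    for (int i = 0; i < 16; i++) {
        unsigned char c = (unsigned char)ctrl[i];
        out[i] = (c & 0x80) ? 0 : src[c & 0x0F];
    }
    return out;
}

/* Byte-wise comparison of two vectors. */
int same_bytes(v16b a, v16b b) {
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Checks that the initializer and the shuffle model agree on one input. */
void check_zero_case(void) {
    v16b in   = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
    v16b ctrl = {0, 0, 0, -1, 4, 4, 4, -1, 8, 8, 8, -1, 12, 12, 12, -1};
    assert(same_bytes(shuffle_with_zero(in), pshufb_model(in, ctrl)));
}
```

Calling check_zero_case() from a test program exercises the comparison. Whether a given compiler version turns shuffle_with_zero into a single shuffle instruction has to be checked in the generated assembly.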


* [Bug tree-optimization/101621] gcc cannot optimize int8_t vector assign with subscription to shuffle
From: yumeyao at gmail dot com @ 2021-07-26  1:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

--- Comment #1 from YumeYao <yumeyao at gmail dot com> ---
https://gcc.godbolt.org/z/a47Enb9oK

16-byte (AVX) version added.


* [Bug tree-optimization/101621] gcc cannot optimize int8_t vector assign with subscription to shuffle
From: pinskia at gcc dot gnu.org @ 2021-07-26  2:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The cast issue arises because GCC 9 did not produce a PERM at the GIMPLE
level; this was fixed properly in GCC 11.

clang_shuffle_with_zero can easily be added.


* [Bug tree-optimization/101621] gcc cannot optimize int8_t vector assign with subscription to shuffle
From: yumeyao at gmail dot com @ 2021-07-26  5:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

--- Comment #3 from YumeYao <yumeyao at gmail dot com> ---
(In reply to Andrew Pinski from comment #2)
> The cast issue arises because GCC 9 did not produce a PERM at the GIMPLE
> level; this was fixed properly in GCC 11.
> 
> clang_shuffle_with_zero can easily be added.

Thanks for your insights.

Do you have any comment on the optimization-level part (gcc <= 8 only needs
-O1 to optimize the 'cast' case, but gcc 11 requires -O3)?
Is it due to a change in the optimizations enabled by default at -O1 between
gcc 8 and 11, or is it something deeper?


* [Bug tree-optimization/101621] gcc cannot optimize int8_t vector assign with subscription to shuffle
From: rguenth at gcc dot gnu.org @ 2021-07-27 11:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101621

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-07-27
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
It is optimized at -O3 because the vectorizer handles this.  There is also a
pattern-matching phase in forwprop that handles some cases, but it does not
handle this specific one (and it is already a bit messy).  forwprop eventually
sees

v16b gcc_cannot_shuffle_with_cast (v2si64 x)
{
  v16b b0;
  char _1;
  char _2;
  char _3;
  char _4;

  <bb 2> [local count: 1073741824]:
  _1 = BIT_FIELD_REF <x_5(D), 8, 0>;
  _2 = BIT_FIELD_REF <x_5(D), 8, 32>;
  _3 = BIT_FIELD_REF <x_5(D), 8, 64>;
  _4 = BIT_FIELD_REF <x_5(D), 8, 96>; 
  b0_6 = {_1, _1, _1, _1, _2, _2, _2, _2, _3, _3, _3, _3, _4, _4, _4, _4};
  return b0_6;

and early forwprop sees

  <bb 2> :
  _1 = VIEW_CONVERT_EXPR<v16b>(x_18(D));
  _2 = BIT_FIELD_REF <_1, 8, 0>;
  _3 = BIT_FIELD_REF <_1, 8, 0>;
...
  b0_20 = {_2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17};
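
To make the first dump above easier to read: BIT_FIELD_REF <x, 8, 32> extracts 8 bits at bit offset 32, which is byte lane 4 of the 16-byte vector. The sketch below only illustrates that mapping from the bit offsets of the constructor elements to byte selectors of a permute. It is not GCC's implementation, and the function name offsets_to_selectors is made up for this example.

```c
#include <assert.h>

enum { LANES = 16, LANE_BITS = 8 };

/* Converts the bit offset of each constructor element (as seen in
   BIT_FIELD_REF <x, 8, offset>) into a byte-lane selector.
   Returns 1 on success, 0 if an offset is not lane-aligned or lies
   outside the 16-byte vector. */
int offsets_to_selectors(const unsigned bit_offset[LANES],
                         unsigned char sel[LANES]) {
    for (int i = 0; i < LANES; i++) {
        if (bit_offset[i] % LANE_BITS != 0)
            return 0;
        unsigned lane = bit_offset[i] / LANE_BITS;
        if (lane >= LANES)
            return 0;
        sel[i] = (unsigned char)lane;
    }
    return 1;
}

/* Applies the mapping to the constructor from the dump:
   {_1, _1, _1, _1, _2, _2, _2, _2, _3, _3, _3, _3, _4, _4, _4, _4}
   with _1.._4 at bit offsets 0, 32, 64 and 96. */
void check_dump_offsets(void) {
    const unsigned off[LANES] = {0,  0,  0,  0,  32, 32, 32, 32,
                                 64, 64, 64, 64, 96, 96, 96, 96};
    unsigned char sel[LANES];
    assert(offsets_to_selectors(off, sel) == 1);
    for (int i = 0; i < LANES; i++)
        assert(sel[i] == (unsigned char)((i / 4) * 4));
}
```

The resulting selectors 0, 0, 0, 0, 4, 4, 4, 4, 8, 8, 8, 8, 12, 12, 12, 12 match the b[0], b[4], b[8], b[12] subscripts in the original source.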
