public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/113677] New: Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization
@ 2024-01-31  1:46 pinskia at gcc dot gnu.org
  2024-01-31  1:47 ` [Bug tree-optimization/113677] " pinskia at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-31  1:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

            Bug ID: 113677
           Summary: Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2,
                    ...}>` optimization
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
#define vect32 __attribute__((vector_size(4) ))
#define vect64 __attribute__((vector_size(8) ))

vect64 unsigned char f(vect32 unsigned char a)
{
  vect32 unsigned char zero={0,0,0,0};
  return __builtin_shufflevector (a, zero, 0, 1, 2, 3, 4, 5, 6, 7);
}

```
On x86_64 this produces:
```
f:
        movd    xmm0, edi
        pxor    xmm1, xmm1
        punpckldq       xmm0, xmm1
        ret
```

We should just produce:
```
        movd    xmm0, edi
        ret
```

In .optimized we get:
```
  _1 = {a_2(D), { 0, 0, 0, 0 }};
  _3 = VEC_PERM_EXPR <_1, { 0, 0, 0, 0, 0, 0, 0, 0 }, { 0, 1, 2, 3, 8, 9, 10, 11 }>;
  return _3;
```


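But _3 and _1 are the same: the selector takes elements 0-3 from _1 and
elements 8-11 from the all-zero second operand, and the upper half of _1 is
already zero.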
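
To make that claim concrete, here is a minimal sketch in plain C that models the permute from the dump with arrays instead of vector types. The helper names (`model_vec_perm`, `dump_perm_is_noop`) are made up for illustration and are not GCC internals; the only assumption about `VEC_PERM_EXPR` is that selector elements index the concatenation of the two operands, modulo twice the element count.

```c
#include <assert.h>
#include <string.h>

enum { NELTS = 8 };

/* Array model of a two-operand permute (hypothetical helper, not GCC code):
   each selector element, taken modulo 2*NELTS, picks from v0 when it is
   below NELTS and from v1 otherwise.  */
static void model_vec_perm(const unsigned char v0[NELTS],
                           const unsigned char v1[NELTS],
                           const unsigned char sel[NELTS],
                           unsigned char out[NELTS])
{
  for (int i = 0; i < NELTS; i++) {
    unsigned idx = sel[i] % (2 * NELTS);
    out[i] = idx < NELTS ? v0[idx] : v1[idx - NELTS];
  }
}

/* Rebuilds _1 and _3 from the dump for a given 4-byte input `a` and
   returns 1 when _3 is byte-for-byte identical to _1.  */
static int dump_perm_is_noop(const unsigned char a[4])
{
  assert(a != NULL);

  unsigned char v1[NELTS] = {0};      /* _1 = {a, {0, 0, 0, 0}} */
  const unsigned char zeros[NELTS] = {0};
  const unsigned char sel[NELTS] = {0, 1, 2, 3, 8, 9, 10, 11};
  unsigned char v3[NELTS];            /* _3 = VEC_PERM_EXPR <_1, zeros, sel> */

  memcpy(v1, a, 4);
  model_vec_perm(v1, zeros, sel, v3);
  return memcmp(v1, v3, NELTS) == 0;
}
```

Calling `dump_perm_is_noop` with any four input bytes should return 1 under this model, which is the equivalence the missing fold would exploit.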


* [Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization
From: pinskia at gcc dot gnu.org @ 2024-01-31  1:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I should note that I noticed this while working on adding V4QI support for
aarch64, but it is definitely a generic issue.


* [Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization
From: pinskia at gcc dot gnu.org @ 2024-01-31  2:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64                      |x86_64 aarch64

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is another example, using 64/128 on aarch64:
```
#define vect64 __attribute__((vector_size(8) ))
#define vect128 __attribute__((vector_size(16) ))

vect128 unsigned int f(vect64 unsigned int a)
{
  vect64 unsigned int zero={0, 0};
  return __builtin_shufflevector (a, zero, 0, 1, 2, 3);
}
```

We get:
```
f:
        movi    v31.4s, 0
        fmov    d0, d0
        zip1    v0.2d, v0.2d, v31.2d
```

This should just produce the `fmov` for little-endian and `mov/ins` for
big-endian.

Note that for this part of the issue, the aarch64 back-end represents zip using
an UNSPEC where it could use VEC_CONCAT instead, and with VEC_CONCAT it would
do the correct thing ...


* [Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization
From: rguenth at gcc dot gnu.org @ 2024-01-31  8:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-01-31
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yeah, most of the code in forwprop/match doesn't deal with the "new" permutes
where the result isn't the same length as the inputs.


* [Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization
From: pinskia at gcc dot gnu.org @ 2024-02-06 17:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note it is not just about constants either.
Take:
```
#define vect64 __attribute__((vector_size(8) ))
#define vect128 __attribute__((vector_size(16) ))

vect128 unsigned int f(vect64 unsigned int a, vect64 unsigned int b)
{
  vect64 unsigned int zero={0, 0};
  return __builtin_shufflevector (a, b, 0, 1, 2, 3);
}
```

We get:
```
  _1 = {a_3(D), { 0, 0 }};
  _2 = {b_4(D), { 0, 0 }};
  _5 = VEC_PERM_EXPR <_1, _2, { 0, 1, 4, 5 }>;
```

Which obviously could be simplified to just:
`_5 = {a_3(D), b_4(D)};`
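
As a rough check of that simplification, the following sketch models the three statements from the dump with plain arrays and shows that the permute result is just the concatenation of `a` and `b`. The function name `concat_via_perm` is hypothetical, and the model assumes selector elements index the concatenation of the two operands, modulo twice the element count.

```c
#include <assert.h>
#include <string.h>

enum { WIDE = 4, NARROW = 2 };

/* Array model of the dump (hypothetical helper, not GCC code):
     _1 = {a, {0, 0}};
     _2 = {b, {0, 0}};
     _5 = VEC_PERM_EXPR <_1, _2, {0, 1, 4, 5}>;
   The result is written to `out`.  */
static void concat_via_perm(const unsigned a[NARROW],
                            const unsigned b[NARROW],
                            unsigned out[WIDE])
{
  assert(a != NULL && b != NULL && out != NULL);

  unsigned v1[WIDE] = {0};
  unsigned v2[WIDE] = {0};
  const unsigned sel[WIDE] = {0, 1, 4, 5};

  memcpy(v1, a, sizeof(unsigned) * NARROW);
  memcpy(v2, b, sizeof(unsigned) * NARROW);

  for (int i = 0; i < WIDE; i++) {
    unsigned idx = sel[i] % (2 * WIDE);
    /* Indexes below WIDE select from _1, the rest from _2.  */
    out[i] = idx < WIDE ? v1[idx] : v2[idx - WIDE];
  }
}
```

Only the low halves of `_1` and `_2` are ever selected, so the zero padding never reaches the result and `out` ends up as `a` followed by `b`.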


* [Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization
From: pinskia at gcc dot gnu.org @ 2024-03-08  5:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> Note it is not just about constants either.

That is the same as what is mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301#c2 even :).
