public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
@ 2011-10-23 8:40 ` marc.glisse at normalesup dot org
2012-05-07 15:00 ` glisse at gcc dot gnu.org
` (11 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: marc.glisse at normalesup dot org @ 2011-10-23 8:40 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
Marc Glisse <marc.glisse at normalesup dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |marc.glisse at normalesup
| |dot org
--- Comment #4 from Marc Glisse <marc.glisse at normalesup dot org> 2011-10-23 08:40:04 UTC ---
Apart from combining 2 shuffles, I would expect the set and the shuffle to be
combined in Comment 1. I was going to report the following, but it already
appears in this bug:
__m128d f(double d){
__m128d x=_mm_setr_pd(-d,d);
return _mm_shuffle_pd(x,x,1);
}
movsd .LC0(%rip), %xmm1
xorpd %xmm0, %xmm1
movapd %xmm1, %xmm2
unpcklpd %xmm0, %xmm2
movapd %xmm2, %xmm0
shufpd $1, %xmm2, %xmm0
Some extra moves, as usual, and a shuffle that could be combined with the
unpack.
(Obviously we don't write such code; it only looks that way after inlining.)
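To make the expected combination concrete, here is a rough scalar model in plain C rather than real intrinsics. The names v2df, model_setr_pd, model_shuffle_pd, f_as_written and f_folded are made up for illustration; the lane semantics are the documented ones for _mm_setr_pd and _mm_shuffle_pd. It only shows that the set followed by the shuffle is equivalent to a single set with swapped operands.

```c
/* Two-lane double vector; lane 0 is the low element. */
typedef struct { double lane[2]; } v2df;

/* Scalar model of _mm_setr_pd: arguments are given low lane first. */
static v2df model_setr_pd(double lo, double hi) {
    v2df r = { { lo, hi } };
    return r;
}

/* Scalar model of _mm_shuffle_pd(a, b, imm):
   bit 0 of imm picks the lane of a used for the low result lane,
   bit 1 of imm picks the lane of b used for the high result lane. */
static v2df model_shuffle_pd(v2df a, v2df b, int imm) {
    v2df r = { { a.lane[imm & 1], b.lane[(imm >> 1) & 1] } };
    return r;
}

/* The function from the comment, as written. */
static v2df f_as_written(double d) {
    v2df x = model_setr_pd(-d, d);
    return model_shuffle_pd(x, x, 1);
}

/* What combining the set and the shuffle would amount to:
   one set with the operands swapped. */
static v2df f_folded(double d) {
    return model_setr_pd(d, -d);
}
```

For any d, f_as_written(d) and f_folded(d) hold the same two lanes, so the shufpd after the unpcklpd is redundant in principle.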
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
2011-10-23 8:40 ` [Bug rtl-optimization/43147] SSE shuffle merge marc.glisse at normalesup dot org
@ 2012-05-07 15:00 ` glisse at gcc dot gnu.org
2021-08-21 22:44 ` [Bug target/43147] " pinskia at gcc dot gnu.org
` (10 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: glisse at gcc dot gnu.org @ 2012-05-07 15:00 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
Marc Glisse <glisse at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |glisse at gcc dot gnu.org
--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> 2012-05-07 14:52:46 UTC ---
Actually, why isn't constant propagation happening for the example in comment
#1? simplify-rtx.c contains code to that effect, it might just need a little
tweaking...
(yes, I know the original report is probably interested in non-constant
operands)
* [Bug target/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
2011-10-23 8:40 ` [Bug rtl-optimization/43147] SSE shuffle merge marc.glisse at normalesup dot org
2012-05-07 15:00 ` glisse at gcc dot gnu.org
@ 2021-08-21 22:44 ` pinskia at gcc dot gnu.org
2021-08-21 23:45 ` hjl.tools at gmail dot com
` (9 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-21 22:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|rtl-optimization |target
--- Comment #11 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
We produce:
Trying 5, 7 -> 11:
5: r86:V4SF=[`*.LC0']
REG_EQUAL const_vector
7: r85:V4SF=vec_select(vec_concat(r86:V4SF,r86:V4SF),parallel)
REG_DEAD r86:V4SF
REG_EQUAL const_vector
11: r88:V4SF=vec_select(vec_concat(r85:V4SF,r85:V4SF),parallel)
REG_DEAD r85:V4SF
REG_EQUAL const_vector
Failed to match this instruction:
(set (reg:V4SF 88)
(const_vector:V4SF [
(const_double:SF 2.0e+0 [0x0.8p+2])
(const_double:SF 1.0e+0 [0x0.8p+1])
(const_double:SF 4.0e+0 [0x0.8p+3])
(const_double:SF 3.0e+0 [0x0.cp+2])
]))
This means the vec_selects are merging at the RTL level just fine.
Anyway, if the target expanded __builtin_ia32_shufps to VEC_PERM_EXPR, we would
have gotten this optimized at the gimple level. So this is a target issue.
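For reference, the merged constant in the dump can be reproduced by hand. The sketch below is a scalar model, not GCC code. It assumes the input constant is {4, 3, 2, 1} (low lane first, as in the REG_EQUAL note quoted later in this thread) and that the two immediates are 0xC9 and 0x2D from the test case in the closing comment; v4sf, model_shuffle_ps and fold_two_shuffles are hypothetical names.

```c
/* Four-lane float vector; lane 0 is the low element. */
typedef struct { float lane[4]; } v4sf;

/* Scalar model of _mm_shuffle_ps(a, b, imm): the two low result lanes
   come from a, the two high result lanes from b, two selector bits each. */
static v4sf model_shuffle_ps(v4sf a, v4sf b, unsigned imm) {
    v4sf r;
    r.lane[0] = a.lane[imm & 3];
    r.lane[1] = a.lane[(imm >> 2) & 3];
    r.lane[2] = b.lane[(imm >> 4) & 3];
    r.lane[3] = b.lane[(imm >> 6) & 3];
    return r;
}

/* Apply both shuffles to the assumed constant input. */
static v4sf fold_two_shuffles(void) {
    v4sf m = { { 4.0f, 3.0f, 2.0f, 1.0f } };
    m = model_shuffle_ps(m, m, 0xC9);  /* {3, 2, 4, 1} */
    m = model_shuffle_ps(m, m, 0x2D);  /* {2, 1, 4, 3} */
    return m;
}
```

Under those assumptions the result is {2, 1, 4, 3}, which is the const_vector that combine built and then failed to match.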
* [Bug target/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2021-08-21 22:44 ` [Bug target/43147] " pinskia at gcc dot gnu.org
@ 2021-08-21 23:45 ` hjl.tools at gmail dot com
2021-08-22 12:54 ` hjl.tools at gmail dot com
` (8 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: hjl.tools at gmail dot com @ 2021-08-21 23:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com
--- Comment #12 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 51345
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51345&action=edit
A patch
* [Bug target/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2021-08-21 23:45 ` hjl.tools at gmail dot com
@ 2021-08-22 12:54 ` hjl.tools at gmail dot com
2021-08-24 1:57 ` [Bug rtl-optimization/43147] " hjl.tools at gmail dot com
` (7 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: hjl.tools at gmail dot com @ 2021-08-22 12:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |patch
URL| |https://gcc.gnu.org/piperma
| |il/gcc-patches/2021-August/
| |577884.html
--- Comment #13 from H.J. Lu <hjl.tools at gmail dot com> ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577884.html
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2021-08-22 12:54 ` hjl.tools at gmail dot com
@ 2021-08-24 1:57 ` hjl.tools at gmail dot com
2021-08-25 8:42 ` crazylht at gmail dot com
` (6 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: hjl.tools at gmail dot com @ 2021-08-24 1:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|hjl.tools at gmail dot com |unassigned at gcc dot gnu.org
Component|target |rtl-optimization
Version|4.4.1 |12.0
--- Comment #14 from H.J. Lu <hjl.tools at gmail dot com> ---
From
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577991.html
Trying 5 -> 7:
5: r85:V4SF=[`*.LC0']
REG_EQUAL const_vector
7: r84:V4SF=vec_select(vec_concat(r85:V4SF,r85:V4SF),parallel)
REG_DEAD r85:V4SF
REG_EQUAL const_vector
Failed to match this instruction:
(set (reg:V4SF 84)
(const_vector:V4SF [
(const_double:SF 3.0e+0 [0x0.cp+2])
(const_double:SF 2.0e+0 [0x0.8p+2])
(const_double:SF 4.0e+0 [0x0.8p+3])
(const_double:SF 1.0e+0 [0x0.8p+1])
]))
(insn 5 2 7 2 (set (reg:V4SF 85)
(mem/u/c:V4SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S16 A128]))
"/export/users/liuhongt/install/git_trunk_master_native/lib/gcc/x86_64-pc-linux-gnu/12.0.0/include/xmmintrin.h":746:19
1600 {movv4sf_internal}
(expr_list:REG_EQUAL (const_vector:V4SF [
(const_double:SF 4.0e+0 [0x0.8p+3])
(const_double:SF 3.0e+0 [0x0.cp+2])
(const_double:SF 2.0e+0 [0x0.8p+2])
(const_double:SF 1.0e+0 [0x0.8p+1])
])
(nil)))
(insn 7 5 11 2 (set (reg:V4SF 84)
(vec_select:V4SF (vec_concat:V8SF (reg:V4SF 85)
(reg:V4SF 85))
(parallel [
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 4 [0x4])
(const_int 7 [0x7])
])))
"/export/users/liuhongt/install/git_trunk_master_native/lib/gcc/x86_64-pc-linux-gnu/12.0.0/include/xmmintrin.h":746:19
3015 {sse_shufps_v4sf}
(expr_list:REG_DEAD (reg:V4SF 85)
(expr_list:REG_EQUAL (const_vector:V4SF [
(const_double:SF 3.0e+0 [0x0.cp+2])
(const_double:SF 2.0e+0 [0x0.8p+2])
(const_double:SF 4.0e+0 [0x0.8p+3])
(const_double:SF 1.0e+0 [0x0.8p+1])
])
(nil))))
I think pass_combine should be extended to force an illegitimate constant
into the constant pool and then recog the load insn again. It looks like a
general optimization that is better not done in the backend.
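As a reading aid for the dump above, here is a small scalar model of the vec_select over vec_concat form, plus the mapping from a shufps-shaped parallel to the 8-bit immediate. This is only an illustration, not code from GCC; model_vec_select_concat and shufps_imm_from_parallel are invented names.

```c
/* Scalar model of (vec_select (vec_concat a b) (parallel [i0 i1 i2 i3])):
   indices 0..3 address a, indices 4..7 address b. */
static void model_vec_select_concat(const float a[4], const float b[4],
                                    const int sel[4], float out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = sel[i] < 4 ? a[sel[i]] : b[sel[i] - 4];
}

/* Encode a shufps-shaped selector as the 8-bit immediate.
   Lanes 0 and 1 must select from a (0..3), lanes 2 and 3 from b (4..7);
   returns -1 when the selector does not have that shape. */
static int shufps_imm_from_parallel(const int sel[4]) {
    if (sel[0] < 0 || sel[0] > 3 || sel[1] < 0 || sel[1] > 3)
        return -1;
    if (sel[2] < 4 || sel[2] > 7 || sel[3] < 4 || sel[3] > 7)
        return -1;
    return sel[0] | (sel[1] << 2) | ((sel[2] - 4) << 4) | ((sel[3] - 4) << 6);
}
```

With a = b = {4, 3, 2, 1} and the parallel [1, 2, 4, 7] from insn 7, the model gives {3, 2, 4, 1}, matching the REG_EQUAL note, and the selector encodes to 0xC9.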
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (5 preceding siblings ...)
2021-08-24 1:57 ` [Bug rtl-optimization/43147] " hjl.tools at gmail dot com
@ 2021-08-25 8:42 ` crazylht at gmail dot com
2021-08-25 8:52 ` pinskia at gcc dot gnu.org
` (5 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: crazylht at gmail dot com @ 2021-08-25 8:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
--- Comment #15 from Hongtao.liu <crazylht at gmail dot com> ---
> I think pass_combine should be extended to force an illegitimate constant
> into the constant pool and then recog the load insn again. It looks like a
> general optimization that is better not done in the backend.
The issue can also be solved by folding __builtin_ia32_shufps to gimple
VEC_PERM_EXPR, i.e. the testcase below doesn't have the problem:
typedef int v4si __attribute__((vector_size (16)));
v4si
foo ()
{
v4si a = __extension__ (v4si) {4, 3, 2, 1};
v4si b = __extension__ (v4si) {5, 6, 7, 8};
v4si c = __builtin_shufflevector (a, b, 1, 4, 2, 7);
v4si d = __builtin_shuffle (c, __extension__ (v4si) { 3, 2, 0, 1 });
return d;
}
foo():
movdqa .LC0(%rip), %xmm0
ret
.LC0:
.long 8
.long 2
.long 3
.long 5
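The .LC0 contents can be checked by merging the two permutes by hand. The following is a sketch in plain C, not how GCC implements the fold; perm2 and merge_selectors are hypothetical helpers. A two-input permute followed by a one-input permute is a single two-input permute whose selector is merged[i] = first[second[i]].

```c
/* Model of a two-input permute: sel[i] in 0..3 picks from a,
   sel[i] in 4..7 picks from b. */
static void perm2(const int a[4], const int b[4],
                  const int sel[4], int out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = sel[i] < 4 ? a[sel[i]] : b[sel[i] - 4];
}

/* Merge a two-input permute (selector first[]) followed by a one-input
   permute (selector second[], taken modulo 4) into one selector. */
static void merge_selectors(const int first[4], const int second[4],
                            int merged[4]) {
    for (int i = 0; i < 4; i++)
        merged[i] = first[second[i] & 3];
}
```

With first = {1, 4, 2, 7} and second = {3, 2, 0, 1} the merged selector is {7, 2, 1, 4}, and applying it to a = {4, 3, 2, 1}, b = {5, 6, 7, 8} gives {8, 2, 3, 5}, the same values as in .LC0.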
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (6 preceding siblings ...)
2021-08-25 8:42 ` crazylht at gmail dot com
@ 2021-08-25 8:52 ` pinskia at gcc dot gnu.org
2021-08-25 8:54 ` glisse at gcc dot gnu.org
` (4 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-25 8:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
--- Comment #16 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #15)
> > I think pass_combine should be extended to force an illegitimate constant
> > into the constant pool and then recog the load insn again. It looks like a
> > general optimization that is better not done in the backend.
>
> The issue can also be solved by folding __builtin_ia32_shufps to gimple
> VEC_PERM_EXPR, i.e. the testcase below doesn't have the problem:
>
> typedef int v4si __attribute__((vector_size (16)));
>
> v4si
> foo ()
> {
> v4si a = __extension__ (v4si) {4, 3, 2, 1};
> v4si b = __extension__ (v4si) {5, 6, 7, 8};
> v4si c = __builtin_shufflevector (a, b, 1, 4, 2, 7);
> v4si d = __builtin_shuffle (c, __extension__ (v4si) { 3, 2, 0, 1 });
> return d;
> }
But that is because we constant fold PERMs at the gimple level.
Combining VEC_PERM_EXPRs at the gimple level is PR 54346; note I found this
while looking at other issues too :).
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (7 preceding siblings ...)
2021-08-25 8:52 ` pinskia at gcc dot gnu.org
@ 2021-08-25 8:54 ` glisse at gcc dot gnu.org
2021-08-25 9:20 ` crazylht at gmail dot com
` (3 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: glisse at gcc dot gnu.org @ 2021-08-25 8:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
--- Comment #17 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #15)
> The issue can also be solved by folding __builtin_ia32_shufps to gimple
> VEC_PERM_EXPR,
Didn't you post a patch to do that last year? What happened to it?
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (8 preceding siblings ...)
2021-08-25 8:54 ` glisse at gcc dot gnu.org
@ 2021-08-25 9:20 ` crazylht at gmail dot com
2021-08-27 0:51 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
12 siblings, 0 replies; 13+ messages in thread
From: crazylht at gmail dot com @ 2021-08-25 9:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
--- Comment #18 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Marc Glisse from comment #17)
> (In reply to Hongtao.liu from comment #15)
> > The issue can also be solved by folding __builtin_ia32_shufps to gimple
> > VEC_PERM_EXPR,
>
> Didn't you post a patch to do that last year? What happened to it?
I almost forgot about it. Let me retest my patch; it's at
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562029.html
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (9 preceding siblings ...)
2021-08-25 9:20 ` crazylht at gmail dot com
@ 2021-08-27 0:51 ` cvs-commit at gcc dot gnu.org
2021-08-27 1:00 ` crazylht at gmail dot com
2023-08-22 4:32 ` pinskia at gcc dot gnu.org
12 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-08-27 0:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
--- Comment #19 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:0fa4787bf34b173ce6f198e99b6f6dd8a3f98014
commit r12-3177-g0fa4787bf34b173ce6f198e99b6f6dd8a3f98014
Author: liuhongt <hongtao.liu@intel.com>
Date: Fri Dec 11 19:02:43 2020 +0800
Fold more shuffle builtins to VEC_PERM_EXPR.
A follow-up to
https://gcc.gnu.org/pipermail/gcc-patches/2019-May/521983.html
gcc/
PR target/98167
PR target/43147
* config/i386/i386.c (ix86_gimple_fold_builtin): Fold
IX86_BUILTIN_SHUFPD512, IX86_BUILTIN_SHUFPS512,
IX86_BUILTIN_SHUFPD256, IX86_BUILTIN_SHUFPS,
IX86_BUILTIN_SHUFPS256.
(ix86_masked_all_ones): New function.
gcc/testsuite/
* gcc.target/i386/avx512f-vshufpd-1.c: Adjust testcase.
* gcc.target/i386/avx512f-vshufps-1.c: Adjust testcase.
* gcc.target/i386/pr43147.c: New test.
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (10 preceding siblings ...)
2021-08-27 0:51 ` cvs-commit at gcc dot gnu.org
@ 2021-08-27 1:00 ` crazylht at gmail dot com
2023-08-22 4:32 ` pinskia at gcc dot gnu.org
12 siblings, 0 replies; 13+ messages in thread
From: crazylht at gmail dot com @ 2021-08-27 1:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
--- Comment #20 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 12; GCC now generates optimal code:
main:
.LFB532:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
movaps .LC0(%rip), %xmm0
call printv
xorl %eax, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE532:
.size main, .-main
.section .rodata.cst16,"aM",@progbits,16
.align 16
.LC0:
.long 1073741824
.long 1065353216
.long 1082130432
.long 1077936128
.ident "GCC: (GNU) 12.0.0 20210825 (experimental)"
.section .note.GNU-stack,"",@progbits
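The .long values in .LC0 are the raw bit patterns of the four floats. A minimal way to decode them, assuming float is a 32-bit IEEE-754 single (float_from_bits is a made-up helper name):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float, the way the .long words
   in .LC0 are interpreted once loaded by movaps.
   Assumes float is a 32-bit IEEE-754 single. */
static float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Decoding 1073741824, 1065353216, 1082130432 and 1077936128 in that order gives 2.0, 1.0, 4.0 and 3.0, i.e. the same vector as the const_vector in comment #11.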
* [Bug rtl-optimization/43147] SSE shuffle merge
[not found] <bug-43147-4@http.gcc.gnu.org/bugzilla/>
` (11 preceding siblings ...)
2021-08-27 1:00 ` crazylht at gmail dot com
@ 2023-08-22 4:32 ` pinskia at gcc dot gnu.org
12 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-22 4:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Target Milestone|--- |13.0
Resolution|--- |FIXED
--- Comment #21 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Constant folding part was fixed in GCC 12 but combining shuffles was fixed in
GCC 13.
That is for:
```
__m128 m;
int main()
{
m = _mm_shuffle_ps(m, m, 0xC9); // Those two shuffles together swap pairs
m = _mm_shuffle_ps(m, m, 0x2D); // And could be optimized to 0x4E
printv(m);
return 0;
}
```
GCC 13+ produces:
```
movaps m(%rip), %xmm0
shufps $78, %xmm0, %xmm0
movaps %xmm0, m(%rip)
call _Z6printvDv4_f
```
instead of what was there in GCC 12:
```
movaps m(%rip), %xmm0
shufps $201, %xmm0, %xmm0
shufps $45, %xmm0, %xmm0
movaps %xmm0, m(%rip)
```
So closing as fixed in GCC 13.
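For anyone wanting to double-check the merged immediate, here is a small sketch of how two back-to-back shufps immediates compose when every source operand is the same register, as in the test case. It is only an illustration, not the code GCC uses; compose_shufps_imm is a hypothetical name.

```c
/* Merge two consecutive shufps immediates, assuming both shuffles use
   the same register for both source operands (as in m = shuffle(m, m, imm)).
   Result lane i of the second shuffle reads intermediate lane 'mid',
   which the first shuffle filled from original lane 'src'. */
static unsigned compose_shufps_imm(unsigned first, unsigned second) {
    unsigned merged = 0;
    for (int i = 0; i < 4; i++) {
        unsigned mid = (second >> (2 * i)) & 3;
        unsigned src = (first >> (2 * mid)) & 3;
        merged |= src << (2 * i);
    }
    return merged;
}
```

compose_shufps_imm(0xC9, 0x2D) evaluates to 0x4E, which is the $78 in the GCC 13 output above.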