* [Bug tree-optimization/54346] combine permutations
2012-08-21 14:02 [Bug tree-optimization/54346] New: combine permutations glisse at gcc dot gnu.org
@ 2021-08-25 7:26 ` pinskia at gcc dot gnu.org
2021-08-25 9:02 ` pinskia at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-25 7:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |pinskia at gcc dot gnu.org
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 102056 has been marked as a duplicate of this bug. ***
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/54346] combine permutations
From: pinskia at gcc dot gnu.org @ 2021-08-25 9:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Here is example code which should produce the same code:
typedef int v4si __attribute__((vector_size (16)));
v4si
foo (v4si a, v4si b)
{
  v4si c = __builtin_shuffle (a, b, __extension__ (v4si) { 1, 4, 2, 7 });
  v4si d = __builtin_shuffle (c, __extension__ (v4si) { 3, 2, 0, 1 });
  return d;
}
typedef int v4si __attribute__((vector_size (16)));
v4si
foo1 (v4si a, v4si b)
{
  v4si c = __builtin_shuffle (a, b, __extension__ (v4si) { 7, 2, 1, 4 });
  return c;
}
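To see why foo and foo1 above should be equivalent, here is a minimal
scalar model in portable C. It is only a sketch: shuffle2, foo_model and
foo1_model are hypothetical names, not GCC functions, and the model
assumes non-negative mask elements. It relies on the documented
behaviour that mask elements are taken modulo N for a one-operand
__builtin_shuffle and modulo 2*N for the two-operand form.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 4 };

/* Scalar model of the two-operand shuffle: index k < N picks a[k],
   index k >= N picks b[k - N].  OUT must not alias A or B.  */
void
shuffle2 (const int *a, const int *b, const int *mask, int *out)
{
  for (size_t i = 0; i < N; i++)
    {
      assert (mask[i] >= 0);   /* assumption of this sketch */
      unsigned k = (unsigned) mask[i] % (2u * N);
      out[i] = k < N ? a[k] : b[k - N];
    }
}

/* Model of foo: two successive shuffles.  */
void
foo_model (const int *a, const int *b, int *d)
{
  static const int m0[N] = { 1, 4, 2, 7 };
  static const int m1[N] = { 3, 2, 0, 1 };
  int c[N];
  shuffle2 (a, b, m0, c);
  /* A one-operand shuffle of c behaves like a two-operand shuffle of
     (c, c) as long as the mask indices are below N, as they are here.  */
  shuffle2 (c, c, m1, d);
}

/* Model of foo1: the single merged shuffle.  */
void
foo1_model (const int *a, const int *b, int *d)
{
  static const int m[N] = { 7, 2, 1, 4 };
  shuffle2 (a, b, m, d);
}
```

With a = {10, 11, 12, 13} and b = {20, 21, 22, 23}, both models produce
{23, 12, 11, 20}, i.e. elements b[3], a[2], a[1], b[0].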
* [Bug tree-optimization/54346] combine permutations
From: crazylht at gmail dot com @ 2022-06-07 9:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
Now that more x86 shuffle intrinsics are folded into GIMPLE VEC_PERM_EXPR, I
guess we need a GIMPLE-level pattern match to simplify successive VEC_PERM_EXPRs.
* [Bug tree-optimization/54346] combine permutations
From: cvs-commit at gcc dot gnu.org @ 2022-10-11 6:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:b88adba751da635c6f0c353c5bc51bbe2ecf4c89
commit r13-3212-gb88adba751da635c6f0c353c5bc51bbe2ecf4c89
Author: Liwei Xu <liwei.xu@intel.com>
Date: Fri Sep 23 13:46:02 2022 +0800
Optimize nested permutation to single VEC_PERM_EXPR [PR54346]
This patch implements the optimization requested in PR 54346, which merges
  c = VEC_PERM_EXPR <a, b, VCST0>;
  d = VEC_PERM_EXPR <c, c, VCST1>;
to
  d = VEC_PERM_EXPR <a, b, NEW_VCST>;
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
tree-ssa/forwprop-19.c fails, but I'm not sure whether it
is OK to remove it.
gcc/ChangeLog:
PR tree-optimization/54346
* match.pd: Merge the indices of the two VCSTs, then generate the
new vec_perm.
gcc/testsuite/ChangeLog:
* gcc.dg/pr54346.c: New test.
Co-authored-by: liuhongt <hongtao.liu@intel.com>
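One way to picture how NEW_VCST in the commit message above could be
derived is sketched below. This is an illustration in plain C, not the
match.pd code from the commit; merge_selectors is a hypothetical name.
It assumes constant, non-negative selectors of length n. Because both
operands of the second VEC_PERM_EXPR are c, an index k and an index
k + n in VCST1 name the same element of c, so the index is reduced
modulo n before looking it up in VCST0.

```c
#include <assert.h>
#include <stddef.h>

/* Compute NEW_VCST for
     c = VEC_PERM_EXPR <a, b, VCST0>;
     d = VEC_PERM_EXPR <c, c, VCST1>;
   so that d = VEC_PERM_EXPR <a, b, NEW_VCST>.
   Each entry of NEW_VCST indexes the concatenation of a and b.  */
void
merge_selectors (const int *vcst0, const int *vcst1, int *new_vcst, size_t n)
{
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    {
      /* Element i of d is element (vcst1[i] mod n) of c ...  */
      size_t k = (size_t) vcst1[i] % n;
      /* ... and element k of c came from index vcst0[k] of a:b.  */
      new_vcst[i] = vcst0[k] % (int) (2 * n);
    }
}
```

For VCST0 = {1, 4, 2, 7} and VCST1 = {3, 2, 0, 1} this yields
{7, 2, 1, 4}, matching the example in comment #3. The same result is
obtained for VCST1 = {3, 6, 0, 5}, since 6 and 5 wrap to 2 and 1.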
* [Bug tree-optimization/54346] combine permutations
From: glisse at gcc dot gnu.org @ 2022-10-11 6:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
--- Comment #6 from Marc Glisse <glisse at gcc dot gnu.org> ---
The log says that this breaks tree-ssa/forwprop-19.c, but I don't see any xfail
or anything. Does it only fail because gimple-simplify leaves some dead code
around, so you could update the test to scan the next DCE pass dump instead of
forwprop1? Or are we missing a transformation that just detects a VEC_PERM_EXPR
with an identity permutation?
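For reference, the identity check asked about above can be sketched on a
constant selector as follows. This is a hedged illustration in plain C,
not GCC's implementation; identity_operand is a hypothetical name and
the selector is assumed to hold non-negative constants. A selector with
sel[i] == i returns the first operand unchanged, and one with
sel[i] == i + n returns the second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return 0 if SEL selects the first operand unchanged, 1 if it selects
   the second operand unchanged, and -1 if it is not an identity.  */
int
identity_operand (const int *sel, size_t n)
{
  assert (n > 0);
  bool first = true, second = true;
  for (size_t i = 0; i < n; i++)
    {
      /* Two-operand selectors are interpreted modulo 2 * n.  */
      size_t k = (size_t) sel[i] % (2 * n);
      first = first && k == i;
      second = second && k == i + n;
    }
  return first ? 0 : second ? 1 : -1;
}
```

A merged selector such as {0, 1, 2, 3} would let the whole
VEC_PERM_EXPR be replaced by its first operand, while {7, 2, 1, 4} is
not an identity and has to stay a permutation.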
* [Bug tree-optimization/54346] combine permutations
From: crazylht at gmail dot com @ 2022-10-11 6:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Marc Glisse from comment #6)
> The log says that this breaks tree-ssa/forwprop-19.c, but I don't see any
> xfail or anything. Does it only fail because gimple-simplify leaves some
> dead code around, so you could update the test to scan the next DCE pass
> dump instead of forwprop1? Or are we missing a transformation that just
> detects a VEC_PERM_EXPR with an identity permutation?
Oops, I didn't notice this. I will add an incremental patch to handle the
identical-index (forwprop-19.c) scenario, sorry.
* [Bug tree-optimization/54346] combine permutations
From: cvs-commit at gcc dot gnu.org @ 2022-10-21 7:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:fa553ff26d96f6fecaa8f1b00649cfdc6cda5f5a
commit r13-3430-gfa553ff26d96f6fecaa8f1b00649cfdc6cda5f5a
Author: Jakub Jelinek <jakub@redhat.com>
Date: Fri Oct 21 09:16:44 2022 +0200
match.pd: Fix up gcc.dg/pr54346.c on i686-linux [PR54346]
The pr54346.c testcase FAILs on i686-linux (without -msse*) for multiple
reasons. One is the trivial missing -Wno-psabi which the following patch
adds, but that isn't enough. The thing is that without native vector
support, we have VEC_PERM_EXPRs in the IL and are actually considering
the nested VEC_PERM_EXPRs into one VEC_PERM_EXPR optimization, but punt
because can_vec_perm_const_p (result_mode, op_mode, sel2, false) is false.
Such a test makes sense to prevent "optimizing" two VEC_PERM_EXPRs
that can be handled by the backend natively into one VEC_PERM_EXPR
that can't be handled. But if both of the original VEC_PERM_EXPRs
can't be handled natively either, having just one VEC_PERM_EXPR that will be
lowered by generic vec lowering is IMHO still better than 2.
Or even if we trade just one VEC_PERM_EXPR that can't be handled plus
one that can to one that can't be handled.
Also, removing the testcase's executable permissions...
2022-10-21  Jakub Jelinek  <jakub@redhat.com>
PR tree-optimization/54346
* match.pd ((vec_perm (vec_perm@0 @1 @2 VECTOR_CST) @0 VECTOR_CST)):
Optimize nested VEC_PERM_EXPRs even if target can't handle the
new one provided we don't increase number of VEC_PERM_EXPRs the
target can't handle.
* gcc.dg/pr54346.c: Add -Wno-psabi to dg-options.
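The rule described in the commit message above can be summarized as a
small truth table. The sketch below is only an illustration in plain C
under the assumption that target support for each permutation is already
known as a boolean. should_merge_perms is a hypothetical helper, not the
match.pd condition itself, which queries can_vec_perm_const_p.

```c
#include <stdbool.h>

/* Decide whether two nested VEC_PERM_EXPRs should be merged into one.
   FIRST_OK, SECOND_OK and MERGED_OK say whether the target can handle
   the inner, the outer and the merged permutation natively.  */
bool
should_merge_perms (bool first_ok, bool second_ok, bool merged_ok)
{
  /* A natively supported merged permutation is always a win.  */
  if (merged_ok)
    return true;

  /* Otherwise the merged form contributes exactly one permutation that
     must be lowered.  Merge only if that does not increase the count of
     unsupported permutations compared to the original pair.  */
  int unsupported_before = (first_ok ? 0 : 1) + (second_ok ? 0 : 1);
  int unsupported_after = 1;
  return unsupported_after <= unsupported_before;
}
```

Under this model the only rejected case is the one the original check
was meant to prevent: two natively supported permutations that would
merge into one the target cannot handle.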
* [Bug tree-optimization/54346] combine permutations
From: crazylht at gmail dot com @ 2022-10-25 5:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54346
--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #7)
> (In reply to Marc Glisse from comment #6)
> > The log says that this breaks tree-ssa/forwprop-19.c, but I don't see any
> > xfail or anything. Does it only fail because gimple-simplify leaves some
> > dead code around, so you could update the test to scan the next DCE pass
> > dump instead of forwprop1? Or are we missing a transformation that just
> > detects a VEC_PERM_EXPR with an identity permutation?
>
> Oops, I didn't notice this. I will add an incremental patch to handle the
> identical-index (forwprop-19.c) scenario, sorry.
It's fixed.