* [Bug tree-optimization/90579] [8/9/10/11 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2021-03-11 14:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so push_partial_def could simply record the first non-constant def and,
upon the second, if that completes all bits, handle some easy cases
(VEC_PERM_EXPR and COMPLEX_EXPR come to mind, possibly also CONSTRUCTOR) and
otherwise fail.
I will look into this tomorrow; it shouldn't be too awkward, and avoiding the
STLF (store-to-load forwarding) stall when a read is detected to overlap two
previous writes is definitely worth the trouble.
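To illustrate the idea, here is a minimal sketch in Python of how a load could be matched against at most two earlier stores that together cover all of its bytes. It is a toy model, not GCC's push_partial_def: the function name compose_load, the tuple layouts and the max_defs limit are all hypothetical. The offsets in the usage lines are the ones from this bug's testcase (a 32-byte store at r, a 16-byte store at r + 32 and a 32-byte load at r + 16).

```python
from typing import List, Optional, Tuple

# Hypothetical layouts, for illustration only.
Store = Tuple[str, int, int]   # (value name, byte offset of the store, size in bytes)
Piece = Tuple[str, int, int]   # (value name, byte offset inside that value, length)


def compose_load(load_off: int, load_size: int,
                 stores_newest_first: List[Store],
                 max_defs: int = 2) -> Optional[List[Piece]]:
    """Return the pieces making up the load, or None if we give up."""
    # For each byte of the load: which def supplies it, and from which offset.
    owner: List[Optional[Tuple[str, int]]] = [None] * load_size
    used = 0
    for name, off, size in stores_newest_first:
        lo = max(off, load_off)
        hi = min(off + size, load_off + load_size)
        contributed = False
        for byte in range(lo, hi):
            if owner[byte - load_off] is None:   # not shadowed by a newer store
                owner[byte - load_off] = (name, byte - off)
                contributed = True
        if contributed:
            used += 1
            if used > max_defs:                  # more defs than we handle: fail
                return None
        if all(o is not None for o in owner):    # all bits are complete
            break
    if any(o is None for o in owner):
        return None
    # Merge adjacent bytes coming from the same def into contiguous pieces.
    pieces: List[Piece] = []
    for name, src in owner:
        if pieces and pieces[-1][0] == name and pieces[-1][1] + pieces[-1][2] == src:
            pieces[-1] = (name, pieces[-1][1], pieces[-1][2] + 1)
        else:
            pieces.append((name, src, 1))
    return pieces


# Stores are walked from the most recent one backwards.
stores = [("vect__62.26_88", 32, 16), ("vect__3.20_74", 0, 32)]
pieces = compose_load(16, 32, stores)
```

In this model the load decomposes into the upper 16 bytes of vect__3.20_74 followed by all 16 bytes of vect__62.26_88. A load that is only partly covered yields None, which corresponds to the "else fail" case.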
* [Bug tree-optimization/90579] [8/9/10/11 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2021-03-15 9:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> So we now have a "real" FRE after the vectorizer but we fail to CSE
>
> MEM <vector(4) double> [(double *)&r] = vect__3.20_74;
> ...
> MEM <vector(2) double> [(double *)&r + 32B] = vect__62.26_88;
> ...
> vect__5.7_34 = MEM <vector(4) double> [(double *)&r + 16B];
>
> mine for GCC 11 to look at. The code to CSE that load for _74 and _88
> is going to be a bit awkward though but it will nicely combine with the
> following stmts
>
> vect__5.8_35 = VEC_PERM_EXPR <vect__5.7_34, vect__5.7_34, { 3, 2, 1, 0 }>;
> stmp_t_12.9_36 = BIT_FIELD_REF <vect__5.8_35, 64, 0>;
> stmp_t_12.9_37 = stmp_t_12.9_36 + 0.0;
> stmp_t_12.9_38 = BIT_FIELD_REF <vect__5.8_35, 64, 64>;
> stmp_t_12.9_39 = stmp_t_12.9_37 + stmp_t_12.9_38;
> stmp_t_12.9_40 = BIT_FIELD_REF <vect__5.8_35, 64, 128>;
> stmp_t_12.9_41 = stmp_t_12.9_39 + stmp_t_12.9_40;
> stmp_t_12.9_42 = BIT_FIELD_REF <vect__5.8_35, 64, 192>;
> t_12 = stmp_t_12.9_41 + stmp_t_12.9_42;
>
> and hopefully elide 'r' completely.
So the difficult thing is that we need to compose the upper v2df half of
vect__3.20_74 and the v2df vect__62.26_88. Assembly for that would be something
like

  vextractf128 $0x1, %ymm0, %xmm0
  vinsertf128 $0x1, %xmm1, %ymm0, %ymm0

and on GIMPLE

  tem_42 = BIT_FIELD_REF <vect__3.20_74, 128, 128>;
  vect__5.7_34 = { tem_42, vect__62.26_88 };

That is two stmts, which VN simplification insertion doesn't support at the
moment. It would be "nicer" to enhance, for example, VEC_PERM to allow

  vect__5.7_34 = VEC_PERM <vect__3.20_74, vect__62.26_88, { 2, 3, 4, 5 }>

"implicitly" extending _88 to v4df (aka a paradoxical v4df subreg of
the v2df SSE reg). That would turn VEC_PERM into a concat + select operation
that does not require the intermediate to have a vector mode (in this case,
without introducing subregs, it would have v6df, a mode that is not possible).

On RTL, unfortunately, (vec_select:V4DF (vec_concat (reg:V4DF ..) (reg:V2DF ..))
..) is not possible because of that restriction. OTOH RTL lacks such a
concat-and-select operation, which would allow the cited form and vec_merge to
be "merged" (vec_merge doesn't require such an intermediate mode either).

I'll see how difficult it is to teach VN multi-stmt insertions.
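As a rough illustration of why the two forms are equivalent, the following Python sketch models vectors as lists of lanes. The helper names bit_field_ref, constructor and vec_perm_concat_select are hypothetical, and the relaxed VEC_PERM with operands of different lengths is the extension proposed above, not something the sketch claims GIMPLE supports. The lane values are made up.

```python
def bit_field_ref(vec, size_bits, pos_bits, elem_bits=64):
    """Model of BIT_FIELD_REF <vec, size, pos> on a vector of 64-bit lanes."""
    first = pos_bits // elem_bits
    return vec[first:first + size_bits // elem_bits]


def constructor(*parts):
    """Model of a vector CONSTRUCTOR { a, b } built from sub-vectors."""
    out = []
    for part in parts:
        out.extend(part)
    return out


def vec_perm_concat_select(a, b, sel):
    """Proposed relaxed VEC_PERM: concatenate a and b, then select lanes.

    The concatenation (6 lanes here) never has to exist as a vector mode.
    """
    both = list(a) + list(b)
    return [both[i] for i in sel]


v74 = [10.0, 11.0, 12.0, 13.0]   # stands in for vect__3.20_74 (v4df)
v88 = [20.0, 21.0]               # stands in for vect__62.26_88 (v2df)

# Two-stmt form: extract the upper half of _74, then build the v4df.
two_stmt = constructor(bit_field_ref(v74, 128, 128), v88)

# One-stmt form: concat + select with the selector { 2, 3, 4, 5 }.
one_stmt = vec_perm_concat_select(v74, v88, [2, 3, 4, 5])
```

Both forms produce the same four lanes: lanes 2 and 3 of the first operand followed by the two lanes of the second.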
* [Bug tree-optimization/90579] [9/10/11/12 Regression] Huge store forward stall due to vectorizer, missed CSE
From: jakub at gcc dot gnu.org @ 2021-05-14 9:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.5                         |9.4
--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 8 branch is being closed.
* [Bug tree-optimization/90579] [9/10/11/12 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2021-06-01 8:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.4                         |9.5
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
* [Bug tree-optimization/90579] [10/11/12/13 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2022-05-27 9:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.5                         |10.4
--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9 branch is being closed.
* [Bug tree-optimization/90579] [10/11/12/13 Regression] Huge store forward stall due to vectorizer, missed CSE
From: jakub at gcc dot gnu.org @ 2022-06-28 10:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5
--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
* [Bug tree-optimization/90579] [11/12/13/14 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |11.5
--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.