public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/90579] [8/9/10/11 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2021-03-11 14:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so push_partial_def could simply record the first non-constant def and,
upon the second, if it completes all bits, handle some easy cases
(VEC_PERM_EXPR and COMPLEX_EXPR come to mind, eventually also CONSTRUCTOR)
and otherwise fail.

Will look into this tomorrow; it shouldn't be too awkward, and removing
STLF (store-to-load forwarding) when a read is detected to overlap two
previous writes is definitely worth the trouble.


* [Bug tree-optimization/90579] [8/9/10/11 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2021-03-15  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> So we now have a "real" FRE after the vectorizer but we fail to CSE
> 
>   MEM <vector(4) double> [(double *)&r] = vect__3.20_74;
> ...
>   MEM <vector(2) double> [(double *)&r + 32B] = vect__62.26_88;
> ...
>   vect__5.7_34 = MEM <vector(4) double> [(double *)&r + 16B];
> 
> mine for GCC 11 to look at.  The code to CSE that load for _74 and _88
> is going to be a bit awkward though but it will nicely combine with the
> following stmts
> 
>   vect__5.8_35 = VEC_PERM_EXPR <vect__5.7_34, vect__5.7_34, { 3, 2, 1, 0 }>;
>   stmp_t_12.9_36 = BIT_FIELD_REF <vect__5.8_35, 64, 0>;
>   stmp_t_12.9_37 = stmp_t_12.9_36 + 0.0;
>   stmp_t_12.9_38 = BIT_FIELD_REF <vect__5.8_35, 64, 64>;
>   stmp_t_12.9_39 = stmp_t_12.9_37 + stmp_t_12.9_38;
>   stmp_t_12.9_40 = BIT_FIELD_REF <vect__5.8_35, 64, 128>;
>   stmp_t_12.9_41 = stmp_t_12.9_39 + stmp_t_12.9_40;
>   stmp_t_12.9_42 = BIT_FIELD_REF <vect__5.8_35, 64, 192>;
>   t_12 = stmp_t_12.9_41 + stmp_t_12.9_42;
> 
> and hopefully elide 'r' completely.

So the difficult part is that we need to compose the upper v2df half of
vect__3.20_74 with the v2df vect__62.26_88.  Assembly for that would be
something like

        vextractf128    $0x1, %ymm0, %xmm0
        vinsertf128     $0x1, %xmm1, %ymm0, %ymm0

and on GIMPLE

    tem_42 = BIT_FIELD_REF <vect__3.20_74, 128, 128>;
    vect__5.7_34 = { tem_42, vect__62.26_88 };

That is two stmts, which VN simplification insertion doesn't support at the
moment.  It would be "nicer" to enhance, for example, VEC_PERM to allow

    vect__5.7_34 = VEC_PERM <vect__3.20_74, vect__62.26_88, { 2, 3, 4, 5 }>

"implicitly" extending _88 to v4df (aka a paradoxical v4df subreg of
the v2df SSE reg).  That would turn VEC_PERM into a concat + select operation
that does not require the intermediate to have a vector mode (in this case
it would be v6df without introducing subregs, a mode that is not possible).
On RTL, unfortunately,
(vec_select:V4DF (vec_concat (reg:V4DF ..) (reg:V2DF ..)) ..)
is not possible because of that restriction.  OTOH RTL lacks such a
concat-and-select operation; having one would allow the cited form and
vec_merge to be "merged" (vec_merge doesn't require such an intermediate
mode either).

I'll see how difficult it is to teach VN multi-stmt insertions.


* [Bug tree-optimization/90579] [9/10/11/12 Regression] Huge store forward stall due to vectorizer, missed CSE
From: jakub at gcc dot gnu.org @ 2021-05-14  9:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.5                         |9.4

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 8 branch is being closed.


* [Bug tree-optimization/90579] [9/10/11/12 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2021-06-01  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.4                         |9.5

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9.4 is being released, retargeting bugs to GCC 9.5.


* [Bug tree-optimization/90579] [10/11/12/13 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2022-05-27  9:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.5                         |10.4

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9 branch is being closed.


* [Bug tree-optimization/90579] [10/11/12/13 Regression] Huge store forward stall due to vectorizer, missed CSE
From: jakub at gcc dot gnu.org @ 2022-06-28 10:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #16 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.


* [Bug tree-optimization/90579] [11/12/13/14 Regression] Huge store forward stall due to vectorizer, missed CSE
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |11.5

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.

