public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs
@ 2023-06-12 12:45 rguenth at gcc dot gnu.org
  2023-06-12 13:20 ` [Bug tree-optimization/110221] " rguenth at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-12 12:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

            Bug ID: 110221
           Summary: With AVX512 fully masking gfortran.dg/pr68146.f ICEs
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The testcase ICEs with -march=znver4 --param vect-partial-vector-usage=2
because .COND_* functions whose conditional masks end up being invariant
get scheduled ahead of the loop by SLP.

This is similar to PR108979 but complicated by the conditional mask being
originally computed inside of the loop (so not vect_external_def).
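
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not GCC code: `DefKind`, `Stmt`, `looks_invariant`, `needs_loop_mask` and `safe_to_hoist` are hypothetical names. It shows a statement whose explicit operands are all invariant, so it looks hoistable, while an implicit loop-mask dependence makes hoisting unsafe.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the vectorizer's def classification.
enum class DefKind { Constant, External, Internal, LoopVarying };

struct Stmt {
  DefKind kind;
  std::vector<const Stmt *> operands;  // explicit operands only
  bool uses_loop_mask;                 // implicit dependence, not an operand
};

// True when every explicit operand is (transitively) invariant.
// An Internal statement without operands counts as invariant in this toy.
bool looks_invariant(const Stmt &s) {
  switch (s.kind) {
    case DefKind::Constant:
    case DefKind::External:
      return true;
    case DefKind::LoopVarying:
      return false;
    case DefKind::Internal:
      break;
  }
  for (const Stmt *op : s.operands) {
    assert(op != nullptr);
    if (!looks_invariant(*op)) return false;
  }
  return true;
}

// True when the statement or anything it is computed from needs the loop mask.
bool needs_loop_mask(const Stmt &s) {
  if (s.uses_loop_mask) return true;
  for (const Stmt *op : s.operands)
    if (needs_loop_mask(*op)) return true;
  return false;
}

// Hoisting out of the loop is only safe when both conditions hold.
bool safe_to_hoist(const Stmt &s) {
  return looks_invariant(s) && !needs_loop_mask(s);
}
```

In this model a conditional operation fed by an external value and by a mask computed in the loop from that external value satisfies `looks_invariant`, yet `safe_to_hoist` rejects it once `uses_loop_mask` is set on it.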


* [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
  2023-06-12 12:45 [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs rguenth at gcc dot gnu.org
@ 2023-06-12 13:20 ` rguenth at gcc dot gnu.org
  2023-11-10 11:34 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-12 13:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
So something along the lines of the PR108979 patch doesn't help:

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 61e508fcb6c..be963aea16f 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3365,6 +3365,7 @@ vectorizable_call (vec_info *vinfo,
   if (internal_fn_p (cfn))
     mask_opno = internal_fn_mask_index (as_internal_fn (cfn));

+  bool is_invariant = true;
   for (i = 0; i < nargs; i++)
     {
       if ((int) i == mask_opno)
@@ -3383,6 +3384,8 @@ vectorizable_call (vec_info *vinfo,
                             "use not simple.\n");
          return false;
        }
+      if (dt[i] != vect_external_def && dt[i] != vect_constant_def)
+       is_invariant = false;

       /* We can only handle calls with arguments of the same type.  */
       if (rhs_type
@@ -3607,7 +3610,8 @@ vectorizable_call (vec_info *vinfo,
   scalar_dest = gimple_call_lhs (stmt);
   vec_dest = vect_create_destination_var (scalar_dest, vectype_out);

-  bool masked_loop_p = loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
+  bool masked_loop_p
+    = !is_invariant && loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
   unsigned int vect_nargs = nargs;
   if (masked_loop_p && reduc_idx >= 0)
     {


* [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
  2023-06-12 12:45 [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs rguenth at gcc dot gnu.org
  2023-06-12 13:20 ` [Bug tree-optimization/110221] " rguenth at gcc dot gnu.org
@ 2023-11-10 11:34 ` rguenth at gcc dot gnu.org
  2023-11-10 13:17 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-10 11:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |ice-on-valid-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-11-10
                 CC|                            |rsandifo at gcc dot gnu.org
             Blocks|                            |53947

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So in this case the stmt requiring the loop mask is only "indirectly" invariant
as the mask itself is inside of the loop but with invariant operands.

What works is to avoid scheduling vectorized internal-def stmts outside of the
loop.  That leaves any possible invariant motion to the LIM pass, at
least when no loop masking/len is required.  So I'm testing the following.

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3e5814c3a31..80e279d8f50 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -9081,6 +9081,16 @@ vect_schedule_slp_node (vec_info *vinfo,
       /* Emit other stmts after the children vectorized defs which is
         earliest possible.  */
       gimple *last_stmt = NULL;
+      if (auto loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
+       if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
+           || LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
+         {
+           /* But avoid scheduling internal defs outside of the loop when
+              we might have only implicitly tracked loop mask/len defs.  */
+           gimple_stmt_iterator si
+             = gsi_after_labels (LOOP_VINFO_LOOP (loop_vinfo)->header);
+           last_stmt = *si;
+         }
       bool seen_vector_def = false;
       FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
        if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
  2023-06-12 12:45 [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs rguenth at gcc dot gnu.org
  2023-06-12 13:20 ` [Bug tree-optimization/110221] " rguenth at gcc dot gnu.org
  2023-11-10 11:34 ` rguenth at gcc dot gnu.org
@ 2023-11-10 13:17 ` cvs-commit at gcc dot gnu.org
  2023-11-10 13:17 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-10 13:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:e5f1956498251a4973d52c8aad3faf34d0443169

commit r14-5320-ge5f1956498251a4973d52c8aad3faf34d0443169
Author: Richard Biener <rguenther@suse.de>
Date:   Fri Nov 10 12:39:11 2023 +0100

    tree-optimization/110221 - SLP and loop mask/len

    The following fixes the issue that when SLP stmts are internal defs
    but appear invariant because they end up only using invariant defs
    then they get scheduled outside of the loop.  This nice optimization
    breaks down when loop masks or lens are applied since those are not
    explicitly tracked as dependences.  The following makes sure to never
    schedule internal defs outside of the vectorized loop when the
    loop uses masks/lens.

            PR tree-optimization/110221
            * tree-vect-slp.cc (vect_schedule_slp_node): When loop
            masking / len is applied make sure to not schedule
            internal defs outside of the loop.

            * gfortran.dg/pr110221.f: New testcase.


* [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
  2023-06-12 12:45 [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-11-10 13:17 ` cvs-commit at gcc dot gnu.org
@ 2023-11-10 13:17 ` rguenth at gcc dot gnu.org
  2024-01-17 11:46 ` saurabh.jha at arm dot com
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-10 13:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.


* [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
  2023-06-12 12:45 [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-11-10 13:17 ` rguenth at gcc dot gnu.org
@ 2024-01-17 11:46 ` saurabh.jha at arm dot com
  2024-02-06 13:20 ` cvs-commit at gcc dot gnu.org
  2024-03-01 13:40 ` cvs-commit at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: saurabh.jha at arm dot com @ 2024-01-17 11:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

Saurabh Jha <saurabh.jha at arm dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |saurabh.jha at arm dot com

--- Comment #5 from Saurabh Jha <saurabh.jha at arm dot com> ---
Hi Richard,

Just to let you know, this fix also seems to have fixed this ICE:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111478.

This ICE happens in GCC 12 but not in 13. I bisected to find where the fix
happened, and it converged on your commit
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e5f1956498251a4973d52c8aad3faf34d0443169

Regards,
Saurabh


* [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
  2023-06-12 12:45 [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-01-17 11:46 ` saurabh.jha at arm dot com
@ 2024-02-06 13:20 ` cvs-commit at gcc dot gnu.org
  2024-03-01 13:40 ` cvs-commit at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-06 13:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:7c67939ec384425a3d7383dfb4fb39aa7e9ad20a

commit r13-8288-g7c67939ec384425a3d7383dfb4fb39aa7e9ad20a
Author: Richard Biener <rguenther@suse.de>
Date:   Fri Nov 10 12:39:11 2023 +0100

    tree-optimization/110221 - SLP and loop mask/len

    The following fixes the issue that when SLP stmts are internal defs
    but appear invariant because they end up only using invariant defs
    then they get scheduled outside of the loop.  This nice optimization
    breaks down when loop masks or lens are applied since those are not
    explicitly tracked as dependences.  The following makes sure to never
    schedule internal defs outside of the vectorized loop when the
    loop uses masks/lens.

            PR tree-optimization/110221
            * tree-vect-slp.cc (vect_schedule_slp_node): When loop
            masking / len is applied make sure to not schedule
            internal defs outside of the loop.

            * gfortran.dg/pr110221.f: New testcase.

    (cherry picked from commit e5f1956498251a4973d52c8aad3faf34d0443169)


* [Bug tree-optimization/110221] With AVX512 fully masking gfortran.dg/pr68146.f ICEs
  2023-06-12 12:45 [Bug tree-optimization/110221] New: With AVX512 fully masking gfortran.dg/pr68146.f ICEs rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-02-06 13:20 ` cvs-commit at gcc dot gnu.org
@ 2024-03-01 13:40 ` cvs-commit at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-03-01 13:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110221

--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Andre Simoes Dias Vieira
<avieira@gcc.gnu.org>:

https://gcc.gnu.org/g:9d033155254ac6df5f47ab32896dbf336f991589

commit r12-10186-g9d033155254ac6df5f47ab32896dbf336f991589
Author: Richard Biener <rguenther@suse.de>
Date:   Fri Nov 10 12:39:11 2023 +0100

    tree-optimization/110221 - SLP and loop mask/len

    The following fixes the issue that when SLP stmts are internal defs
    but appear invariant because they end up only using invariant defs
    then they get scheduled outside of the loop.  This nice optimization
    breaks down when loop masks or lens are applied since those are not
    explicitly tracked as dependences.  The following makes sure to never
    schedule internal defs outside of the vectorized loop when the
    loop uses masks/lens.

            PR tree-optimization/110221
            * tree-vect-slp.cc (vect_schedule_slp_node): When loop
            masking / len is applied make sure to not schedule
            internal defs outside of the loop.

            * gfortran.dg/pr110221.f: New testcase.

    (cherry picked from commit 7c67939ec384425a3d7383dfb4fb39aa7e9ad20a)

