public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com>
Cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
Subject: Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops
Date: Tue, 29 Oct 2019 12:12:00 -0000	[thread overview]
Message-ID: <nycvar.YFH.7.76.1910291234590.5566@zhemvz.fhfr.qr> (raw)
In-Reply-To: <148647ef-e409-0a52-724f-85a38bea43da@arm.com>

[-- Attachment #1: Type: text/plain, Size: 11338 bytes --]

On Mon, 28 Oct 2019, Andre Vieira (lists) wrote:

> Hi,
> 
> Reworked according to your comments, see inline for clarification.
> 
> Is this OK for trunk?

+             gimple_seq seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo);
+             while (seq)
+               {
+                 stmt_worklist.safe_push (seq);
+                 seq = seq->next;
+               }

you're supposed to do to the following, not access the ->next
implementation detail:

    for (gimple_stmt_iterator gsi = gsi_start (seq); !gsi_end_p (gsi); 
gsi_next (&gsi))
      stmt_worklist.safe_push (gsi_stmt (gsi));


+      /* Data references for gather loads and scatter stores do not use 
the
+        updated offset we set using ADVANCE.  Instead we have to make 
sure
the
+        reference in the data references point to the corresponding copy 
of
+        the original in the epilogue.  */
+      if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo))
+       {
+         DR_REF (dr)
+           = simplify_replace_tree (DR_REF (dr), NULL_TREE, NULL_TREE,
+                                    &find_in_mapping, &mapping);
+         DR_BASE_ADDRESS (dr)
+           = simplify_replace_tree (DR_BASE_ADDRESS (dr), NULL_TREE,
NULL_TREE,
+                                    &find_in_mapping, &mapping);
+       }

Hmm.  So for other DRs we account for the previous vector loop
by adjusting DR_OFFSET?  But STMT_VINFO_GATHER_SCATTER_P ends up
using (unconditionally) DR_REF here?  In that case it seems
best to adjust DR_REF only but NULL out DR_BASE_ADDRESS and
DR_OFFSET?  I wonder how prologue peeling deals with
STMT_VINFO_GATHER_SCATTER_P ... I see the caller of
vect_update_init_of_dr there does nothing for STMT_VINFO_GATHER_SCATTER_P.

I wonder if (as followup to not delay this further) we can
"offload" all the DR adjustment by storing ADVANCE in dr_vec_info
and accounting for that when we create the dataref pointers in
vectorizable_load/store?  That way we could avoid saving/restoring
DR_OFFSET as well. 

So, the patch is OK with the sequence iteration fixed.  I think
sorting out the above can be done as followup.

Thanks,
Richard.

> gcc/ChangeLog:
> 2019-10-28  Andre Vieira  <andre.simoesdiasvieira@arm.com>
> 
>     PR 88915
>     * tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration.
>     * tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter
>     and make the valueize function pointer also take a void pointer.
>     * gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap
>     around vn_valueize, to call it without a context.
>     (process_bb): Use vn_valueize_wrapper instead of vn_valueize.
>     * tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos.
>     (~_loop_vec_info): Release epilogue_vinfos.
>     (vect_analyze_loop_costing): Use knowledge of main VF to estimate
>     number of iterations of epilogue.
>     (vect_analyze_loop_2): Adapt to analyse main loop for all supported
>     vector sizes when vect-epilogues-nomask=1.  Also keep track of lowest
>     versioning threshold needed for main loop.
>     (vect_analyze_loop): Likewise.
>     (find_in_mapping): New helper function.
>     (update_epilogue_loop_vinfo): New function.
>     (vect_transform_loop): When vectorizing epilogues re-use analysis done
>     on main loop and call update_epilogue_loop_vinfo to update it.
>     * tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
>     stmts on loop preheader edge.
>     (vect_do_peeling): Enable skip-vectors when doing loop versioning if
>     we decided to vectorize epilogues.  Update epilogues NITERS and
>     construct ADVANCE to update epilogues data references where needed.
>     * tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos.
>     (vect_do_peeling, vect_update_inits_of_drs,
>      determine_peel_for_niter, vect_analyze_loop): Add or update declarations.
>     * tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
>     created loop_vec_info's for epilogues when available.  Otherwise 
> analyse
>     epilogue separately.
> 
> 
> 
> Cheers,
> Andre
> 
> On 28/10/2019 14:16, Richard Biener wrote:
> > On Fri, 25 Oct 2019, Andre Vieira (lists) wrote:
> > 
> >> Hi,
> >>
> >> This is the reworked patch after your comments.
> >>
> >> I have moved the epilogue check into the analysis form disguised under
> >> '!epilogue_vinfos.is_empty ()'.  This because I realized that I am doing
> >> the
> >> "lowest threshold" check there.
> >>
> >> The only place where we may reject an epilogue_vinfo is when we know the
> >> number of scalar iterations and we realize the number of iterations left
> >> after
> >> the main loop are not enough to enter the vectorized epilogue so we
> >> optimize
> >> away that code-gen.  The only way we know this to be true is if the number
> >> of
> >> scalar iterations are known and the peeling for alignment is known. So we
> >> know
> >> we will enter the main loop regardless, so whether the threshold we use is
> >> for
> >> a lower VF or not it shouldn't matter as much, I would even like to think
> >> that
> >> check isn't done, but I am not sure... Might be worth checking as an
> >> optimization.
> >>
> >>
> >> Is this OK for trunk?
> > 
> > +      for (epilogue_phi_gsi = gsi_start_phis (epilogue_bbs[i]);
> > +          !gsi_end_p (epilogue_phi_gsi); gsi_next (&epilogue_phi_gsi))
> > +       {
> > ..
> > +         if (STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo))
> > +           pattern_worklist.safe_push (stmt_vinfo);
> > +
> > +         related_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
> > +         while (related_vinfo && related_vinfo != stmt_vinfo)
> > +           {
> > 
> > I think PHIs cannot have patterns.  You can assert
> > that STMT_VINFO_RELATED_STMT is NULL I think.
> 
> Done.
> > 
> > +         related_vinfo = STMT_VINFO_RELATED_STMT (stmt_vinfo);
> > +         while (related_vinfo && related_vinfo != stmt_vinfo)
> > +           {
> > +             related_worklist.safe_push (related_vinfo);
> > +             /* Set BB such that the assert in
> > +               'get_initial_def_for_reduction' is able to determine that
> > +               the BB of the related stmt is inside this loop.  */
> > +             gimple_set_bb (STMT_VINFO_STMT (related_vinfo),
> > +                            gimple_bb (new_stmt));
> > +             related_vinfo = STMT_VINFO_RELATED_STMT (related_vinfo);
> > +           }
> > 
> > do we really keep references to "nested" patterns?  Thus, do you
> > need this loop?
> 
> Changed and added asserts.  They didn't trigger so I suppose you are right, I
> didn't know at the time whether it was possible, so I just operated on the
> side of caution.  Can remove the asserts and so on if you want.
> > 
> > +  /* The PATTERN_DEF_SEQs in the epilogue were constructed using the
> > +     original main loop and thus need to be updated to refer to the
> > cloned
> > +     variables used in the epilogue.  */
> > +  for (unsigned i = 0; i < pattern_worklist.length (); ++i)
> > +    {
> > ...
> > +                 op = simplify_replace_tree (op, NULL_TREE, NULL_TREE,
> > +                                        &find_in_mapping, &mapping);
> > +                 gimple_set_op (seq, j, op);
> > 
> > you do this for the pattern-def seq but not for the related one.
> > I guess you ran into this for COND_EXPR conditions.  I wondered
> > to use a shared worklist for both the def-seq and the main pattern
> > stmt or at least to split out the replacement so you can share it.
> 
> I think that was it yeah, reworked it now to use the same list. Less code,
> thanks!
> > 
> > +      /* Data references for gather loads and scatter stores do not use
> > the
> > +        updated offset we set using ADVANCE.  Instead we have to make
> > sure the
> > +        reference in the data references point to the corresponding copy
> > of
> > +        the original in the epilogue.  */
> > +      if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo))
> > +       {
> > +         int j;
> > +         if (TREE_CODE (DR_REF (dr)) == MEM_REF)
> > +           j = 0;
> > +         else if (TREE_CODE (DR_REF (dr)) == ARRAY_REF)
> > +           j = 1;
> > +         else
> > +           gcc_unreachable ();
> > +
> > +         if (tree *new_op = mapping.get (TREE_OPERAND (DR_REF (dr), j)))
> > +           {
> > +             DR_REF (dr) = unshare_expr (DR_REF (dr));
> > +             TREE_OPERAND (DR_REF (dr), j) = *new_op;
> > +           }
> > 
> > huh, do you really only ever see MEM_REF or ARRAY_REF here?
> > I would guess using simplify_replace_tree is safer.
> > There's also DR_BASE_ADDRESS - we seem to leave the DRs partially
> > updated, is that correct?
> 
> Yeah can use simplify_replace_tree indeed.  And I have changed it so it
> updates DR_BASE_ADDRESS.  I think DR_BASE_ADDRESS never actually changed in
> the way we use data_references... Either way, replacing them if they do change
> is cleaner and more future proof.
> > 
> > Otherwise looks OK to me.
> > 
> > Thanks,
> > Richard.
> > 
> > 
> >> gcc/ChangeLog:
> >> 2019-10-25  Andre Vieira  <andre.simoesdiasvieira@arm.com>
> >>
> >>      PR 88915
> >>      * tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration.
> >>      * tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter
> >>      and make the valueize function pointer also take a void pointer.
> >>      * gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap
> >>      around vn_valueize, to call it without a context.
> >>      (process_bb): Use vn_valueize_wrapper instead of vn_valueize.
> >>      * tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos.
> >>      (~_loop_vec_info): Release epilogue_vinfos.
> >>      (vect_analyze_loop_costing): Use knowledge of main VF to estimate
> >>      number of iterations of epilogue.
> >>      (vect_analyze_loop_2): Adapt to analyse main loop for all supported
> >>      vector sizes when vect-epilogues-nomask=1.  Also keep track of lowest
> >>      versioning threshold needed for main loop.
> >>      (vect_analyze_loop): Likewise.
> >>      (find_in_mapping): New helper function.
> >>      (update_epilogue_loop_vinfo): New function.
> >>      (vect_transform_loop): When vectorizing epilogues re-use analysis done
> >>      on main loop and call update_epilogue_loop_vinfo to update it.
> >>      * tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
> >>      stmts on loop preheader edge.
> >>      (vect_do_peeling): Enable skip-vectors when doing loop versioning if
> >>      we decided to vectorize epilogues.  Update epilogues NITERS and
> >>      construct ADVANCE to update epilogues data references where needed.
> >>      * tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos.
> >>      (vect_do_peeling, vect_update_inits_of_drs,
> >>      determine_peel_for_niter, vect_analyze_loop): Add or update
> >>      declarations.
> >>      * tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
> >>      created loop_vec_info's for epilogues when available.  Otherwise
> >> analyse
> >>      epilogue separately.
> >>
> >>
> >>
> >> Cheers,
> >> Andre
> >>
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

  reply	other threads:[~2019-10-29 11:49 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-23 17:17 Andre Vieira (lists)
2019-08-26 13:17 ` Richard Biener
2019-09-18 11:11   ` Andre Vieira (lists)
2019-10-08 13:16   ` Andre Vieira (lists)
2019-10-09  9:06     ` Richard Biener
2019-10-10 14:01       ` Andre Vieira (lists)
2019-10-11 13:01         ` Richard Biener
2019-10-22 12:53           ` Andre Vieira (lists)
2019-10-22 14:05             ` Richard Biener
2019-10-25 16:18               ` Andre Vieira (lists)
2019-10-28 13:16                 ` Richard Biener
2019-10-22 18:07             ` Richard Sandiford
2019-10-25 16:20               ` Andre Vieira (lists)
2019-10-25 16:27                 ` Andre Vieira (lists)
2019-10-28 14:18                   ` Richard Biener
2019-10-28 18:37                     ` Andre Vieira (lists)
2019-10-29 12:12                       ` Richard Biener [this message]
2019-10-29 13:17                         ` [committed][vect]PR 88915: Vectorize epilogues when versioning loops (was Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops) Andre Vieira (lists)
2019-10-29 13:45                           ` Richard Biener
2019-09-04 15:35 ` [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.76.1910291234590.5566@zhemvz.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=andre.simoesdiasvieira@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.sandiford@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).