From: Richard Biener <rguenther@suse.de>
To: Tamar Christina <tamar.christina@arm.com>
Cc: gcc-patches@gcc.gnu.org, nd@arm.com, jlaw@ventanamicro.com
Subject: Re: [PATCH]middle-end: check memory accesses in the destination block [PR113588].
Date: Tue, 30 Jan 2024 10:51:16 +0100 (CET) [thread overview]
Message-ID: <1096n2p8-s430-5p01-orn1-ro09s5162o8n@fhfr.qr> (raw)
In-Reply-To: <patch-18216-tamar@arm.com>
On Mon, 29 Jan 2024, Tamar Christina wrote:
> Hi All,
>
> When analyzing loads for early break it was always the intention that
> for the exit where things get moved to we only check the loads that can
> be reached from the condition.
Looking at the code I'm a bit confused that we always move to
single_pred (loop->latch) - IIRC that was different at some point?
Shouldn't we move stores after the last early exit condition instead?
In particular for the peeled case single_pred (loop->latch) is the
block with the actual early exit condition? So for that case we'd
need to move to the latch itself instead? For non-peeled we move
to the block with the IV condition which looks OK.
> However the main loop checks all loads and we skip the destination BB.
> As such we never actually check the loads reachable from the COND in the
> last BB unless this BB was also the exit chosen by the vectorizer.
>
> This leads us to incorrectly vectorize the loop in the PR and in doing so access
> out of bounds.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
The patch ends up with a worklist and another confusing comment
+ /* For the destination BB we need to only analyze loads reachable from
the early
+ break statement itself. */
But I think it's a downstream issue from the issue above. That said,
even for the non-peeled case we need to check ref_within_array_bound,
no?
So what about re-doing that initial loop like the following instead
(and also fix dest_bb, but I'd like clarification here). Basically
walk all blocks, do the ref_within_array_bound first and only
after we've seen 'dest_bb' do the checks required for moving
stores for all upstream BBs.
And dest_bb should be
/* Move side-effects to the in-loop destination of the last early
exit. */
if (LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
dest_bb = loop->latch;
else
dest_bb = single_pred (loop->latch);
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index f592aeb8028..d6c8910dd6c 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -668,7 +668,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
auto_vec<data_reference *> bases;
basic_block dest_bb = NULL;
- hash_set <gimple *> visited;
class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
class loop *loop_nest = loop_outer (loop);
@@ -681,15 +680,11 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
side-effects to is always the latch connected exit. When we support
general control flow we can do better but for now this is fine. */
dest_bb = single_pred (loop->latch);
- basic_block bb = dest_bb;
+ basic_block bb = loop->latch;
+ bool check_deps = false;
do
{
- /* If the destination block is also the header then we have nothing to do. */
- if (!single_pred_p (bb))
- continue;
-
- bb = single_pred (bb);
gimple_stmt_iterator gsi = gsi_last_bb (bb);
/* Now analyze all the remaining statements and try to determine which
@@ -707,6 +702,25 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
if (!dr_ref)
continue;
+ /* Check if vector accesses to the object will be within bounds.
+ must be a constant or assume loop will be versioned or niters
+ bounded by VF so accesses are within range. */
+ if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
+ {
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+ "early breaks not supported: vectorization "
+ "would %s beyond size of obj.",
+ DR_IS_READ (dr_ref) ? "read" : "write");
+ return opt_result::failure_at (stmt,
+ "can't safely apply code motion to "
+ "dependencies of %G to vectorize "
+ "the early exit.\n", stmt);
+ }
+
+ if (!check_deps)
+ continue;
+
/* We currently only support statically allocated objects due to
not having first-faulting loads support or peeling for
alignment support. Compute the size of the referenced object
@@ -739,22 +753,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
"the early exit.\n", stmt);
}
- /* Check if vector accesses to the object will be within bounds.
- must be a constant or assume loop will be versioned or niters
- bounded by VF so accesses are within range. */
- if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
- {
- if (dump_enabled_p ())
- dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
- "early breaks not supported: vectorization "
- "would %s beyond size of obj.",
- DR_IS_READ (dr_ref) ? "read" : "write");
- return opt_result::failure_at (stmt,
- "can't safely apply code motion to "
- "dependencies of %G to vectorize "
- "the early exit.\n", stmt);
- }
-
if (DR_IS_READ (dr_ref))
bases.safe_push (dr_ref);
else if (DR_IS_WRITE (dr_ref))
@@ -814,8 +812,16 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
"marked statement for vUSE update: %G", stmt);
}
}
+ if (!single_pred_p (bb))
+ {
+ gcc_assert (bb == loop->header);
+ break;
+ }
+ if (bb == dest_bb)
+ check_deps = true;
+ bb = single_pred (bb);
}
- while (bb != loop->header);
+ while (1);
/* We don't allow outer -> inner loop transitions which should have been
trapped already during loop form analysis. */
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> PR tree-optimization/113588
> * tree-vect-data-refs.cc (vect_analyze_early_break_dependences_1): New.
> (vect_analyze_data_ref_dependence): Use it.
> (vect_analyze_early_break_dependences): Update comments.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/113588
> * gcc.dg/vect/vect-early-break_108-pr113588.c: New test.
> * gcc.dg/vect/vect-early-break_109-pr113588.c: New test.
>
> --- inline copy of patch --
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..e488619c9aac41fafbcf479818392a6bb7c6924f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> +
> +int foo (const char *s, unsigned long n)
> +{
> + unsigned long len = 0;
> + while (*s++ && n--)
> + ++len;
> + return len;
> +}
> +
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..488c19d3ede809631d1a7ede0e7f7bcdc7a1ae43
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c
> @@ -0,0 +1,44 @@
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target mmap } */
> +
> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> +
> +#include <sys/mman.h>
> +#include <unistd.h>
> +
> +#include "tree-vect.h"
> +
> +__attribute__((noipa))
> +int foo (const char *s, unsigned long n)
> +{
> + unsigned long len = 0;
> + while (*s++ && n--)
> + ++len;
> + return len;
> +}
> +
> +int main()
> +{
> +
> + check_vect ();
> +
> + long pgsz = sysconf (_SC_PAGESIZE);
> + void *p = mmap (NULL, pgsz * 3, PROT_READ|PROT_WRITE,
> + MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
> + if (p == MAP_FAILED)
> + return 0;
> + mprotect (p, pgsz, PROT_NONE);
> + mprotect (p+2*pgsz, pgsz, PROT_NONE);
> + char *p1 = p + pgsz;
> + p1[0] = 1;
> + p1[1] = 0;
> + foo (p1, 1000);
> + p1 = p + 2*pgsz - 2;
> + p1[0] = 1;
> + p1[1] = 0;
> + foo (p1, 1000);
> + return 0;
> +}
> +
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index f592aeb8028afd4fd70e2175104efab2a2c0d82e..52cef242a7ce5d0e525bff639fa1dc2f0a6f30b9 100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -619,10 +619,69 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
> return opt_result::success ();
> }
>
> -/* Funcion vect_analyze_early_break_dependences.
> +/* Function vect_analyze_early_break_dependences_1
>
> - Examime all the data references in the loop and make sure that if we have
> - mulitple exits that we are able to safely move stores such that they become
> + Helper function of vect_analyze_early_break_dependences which performs safety
> + analysis for load operations in an early break. */
> +
> +static opt_result
> +vect_analyze_early_break_dependences_1 (data_reference *dr_ref, gimple *stmt)
> +{
> + /* We currently only support statically allocated objects due to
> + not having first-faulting loads support or peeling for
> + alignment support. Compute the size of the referenced object
> + (it could be dynamically allocated). */
> + tree obj = DR_BASE_ADDRESS (dr_ref);
> + if (!obj || TREE_CODE (obj) != ADDR_EXPR)
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "early breaks only supported on statically"
> + " allocated objects.\n");
> + return opt_result::failure_at (stmt,
> + "can't safely apply code motion to "
> + "dependencies of %G to vectorize "
> + "the early exit.\n", stmt);
> + }
> +
> + tree refop = TREE_OPERAND (obj, 0);
> + tree refbase = get_base_address (refop);
> + if (!refbase || !DECL_P (refbase) || !DECL_SIZE (refbase)
> + || TREE_CODE (DECL_SIZE (refbase)) != INTEGER_CST)
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "early breaks only supported on"
> + " statically allocated objects.\n");
> + return opt_result::failure_at (stmt,
> + "can't safely apply code motion to "
> + "dependencies of %G to vectorize "
> + "the early exit.\n", stmt);
> + }
> +
> + /* Check if vector accesses to the object will be within bounds.
> + must be a constant or assume loop will be versioned or niters
> + bounded by VF so accesses are within range. */
> + if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "early breaks not supported: vectorization "
> + "would %s beyond size of obj.",
> + DR_IS_READ (dr_ref) ? "read" : "write");
> + return opt_result::failure_at (stmt,
> + "can't safely apply code motion to "
> + "dependencies of %G to vectorize "
> + "the early exit.\n", stmt);
> + }
> +
> + return opt_result::success ();
> +}
> +
> +/* Function vect_analyze_early_break_dependences.
> +
> + Examine all the data references in the loop and make sure that if we have
> + multiple exits that we are able to safely move stores such that they become
> safe for vectorization. The function also calculates the place where to move
> the instructions to and computes what the new vUSE chain should be.
>
> @@ -639,7 +698,7 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
> - Multiple loads are allowed as long as they don't alias.
>
> NOTE:
> - This implemementation is very conservative. Any overlappig loads/stores
> + This implementation is very conservative. Any overlapping loads/stores
> that take place before the early break statement gets rejected aside from
> WAR dependencies.
>
> @@ -668,7 +727,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> auto_vec<data_reference *> bases;
> basic_block dest_bb = NULL;
>
> - hash_set <gimple *> visited;
> class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> class loop *loop_nest = loop_outer (loop);
>
> @@ -683,6 +741,7 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> dest_bb = single_pred (loop->latch);
> basic_block bb = dest_bb;
>
> + /* First analyse all blocks leading to dest_bb excluding dest_bb itself. */
> do
> {
> /* If the destination block is also the header then we have nothing to do. */
> @@ -707,53 +766,11 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> if (!dr_ref)
> continue;
>
> - /* We currently only support statically allocated objects due to
> - not having first-faulting loads support or peeling for
> - alignment support. Compute the size of the referenced object
> - (it could be dynamically allocated). */
> - tree obj = DR_BASE_ADDRESS (dr_ref);
> - if (!obj || TREE_CODE (obj) != ADDR_EXPR)
> - {
> - if (dump_enabled_p ())
> - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> - "early breaks only supported on statically"
> - " allocated objects.\n");
> - return opt_result::failure_at (stmt,
> - "can't safely apply code motion to "
> - "dependencies of %G to vectorize "
> - "the early exit.\n", stmt);
> - }
> -
> - tree refop = TREE_OPERAND (obj, 0);
> - tree refbase = get_base_address (refop);
> - if (!refbase || !DECL_P (refbase) || !DECL_SIZE (refbase)
> - || TREE_CODE (DECL_SIZE (refbase)) != INTEGER_CST)
> - {
> - if (dump_enabled_p ())
> - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> - "early breaks only supported on"
> - " statically allocated objects.\n");
> - return opt_result::failure_at (stmt,
> - "can't safely apply code motion to "
> - "dependencies of %G to vectorize "
> - "the early exit.\n", stmt);
> - }
> -
> - /* Check if vector accesses to the object will be within bounds.
> - must be a constant or assume loop will be versioned or niters
> - bounded by VF so accesses are within range. */
> - if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> - {
> - if (dump_enabled_p ())
> - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> - "early breaks not supported: vectorization "
> - "would %s beyond size of obj.",
> - DR_IS_READ (dr_ref) ? "read" : "write");
> - return opt_result::failure_at (stmt,
> - "can't safely apply code motion to "
> - "dependencies of %G to vectorize "
> - "the early exit.\n", stmt);
> - }
> + /* Check if the operation is one we can safely do. */
> + opt_result res
> + = vect_analyze_early_break_dependences_1 (dr_ref, stmt);
> + if (!res)
> + return res;
>
> if (DR_IS_READ (dr_ref))
> bases.safe_push (dr_ref);
> @@ -817,6 +834,51 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> }
> while (bb != loop->header);
>
> + /* For the destination BB we need to only analyze loads reachable from the early
> + break statement itself. */
> + auto_vec <tree> workset;
> + hash_set <tree> visited;
> + gimple *last_stmt = gsi_stmt (gsi_last_bb (dest_bb));
> + gcond *last_cond = dyn_cast <gcond *> (last_stmt);
> + /* If the cast fails we have a different control flow statement in the latch. Most
> + commonly this is a switch. */
> + if (!last_cond)
> + return opt_result::failure_at (last_stmt,
> + "can't safely apply code motion to dependencies"
> + " to vectorize the early exit, unknown control fow"
> + " in stmt %G", last_stmt);
> + workset.safe_push (gimple_cond_lhs (last_cond));
> + workset.safe_push (gimple_cond_rhs (last_cond));
> +
> + imm_use_iterator imm_iter;
> + use_operand_p use_p;
> + tree lhs;
> + do
> + {
> + tree op = workset.pop ();
> + if (visited.add (op))
> + continue;
> + stmt_vec_info stmt_vinfo = loop_vinfo->lookup_def (op);
> +
> + /* Not defined in loop, don't care. */
> + if (!stmt_vinfo)
> + continue;
> + gimple *stmt = STMT_VINFO_STMT (stmt_vinfo);
> + auto dr_ref = STMT_VINFO_DATA_REF (stmt_vinfo);
> + if (dr_ref)
> + {
> + opt_result res
> + = vect_analyze_early_break_dependences_1 (dr_ref, stmt);
> + if (!res)
> + return res;
> + }
> + else
> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, op)
> + if ((lhs = gimple_get_lhs (USE_STMT (use_p))))
> + workset.safe_push (lhs);
> + }
> + while (!workset.is_empty ());
> +
> /* We don't allow outer -> inner loop transitions which should have been
> trapped already during loop form analysis. */
> gcc_assert (dest_bb->loop_father == loop);
>
>
>
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
next prev parent reply other threads:[~2024-01-30 9:56 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-29 15:04 Tamar Christina
2024-01-30 9:51 ` Richard Biener [this message]
2024-01-30 12:57 ` Tamar Christina
2024-01-30 13:04 ` Richard Biener
2024-02-01 21:33 ` Tamar Christina
2024-02-02 9:37 ` Toon Moene
2024-02-03 22:00 ` Sam James
2024-02-02 10:21 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1096n2p8-s430-5p01-orn1-ro09s5162o8n@fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=jlaw@ventanamicro.com \
--cc=nd@arm.com \
--cc=tamar.christina@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).