* [Bug tree-optimization/115382] Wrong code with in-order conditional reduction and masked loops
2024-06-07 8:17 [Bug tree-optimization/115382] New: Wrong code with in-order conditional reduction and masked loops rguenth at gcc dot gnu.org
@ 2024-06-07 19:02 ` rdapp at gcc dot gnu.org
2024-06-10 6:18 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: rdapp at gcc dot gnu.org @ 2024-06-07 19:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382
--- Comment #1 from Robin Dapp <rdapp at gcc dot gnu.org> ---
Would something like this work? With that change the testcase ran successfully
under Intel's SDE (and under aarch64 qemu with SVE).
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 028692614bb..f9bf6a45611 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7215,7 +7215,21 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
       tree len = NULL_TREE;
       tree bias = NULL_TREE;
       if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
-        mask = vect_get_loop_mask (loop_vinfo, gsi, masks, vec_num, vectype_in, i);
+        {
+          tree mask_loop = vect_get_loop_mask (loop_vinfo, gsi, masks,
+                                               vec_num, vectype_in, i);
+          if (is_cond_op)
+            {
+              /* Merge the loop mask and the cond_op mask. */
+              mask = make_ssa_name (TREE_TYPE (mask_loop));
+              gassign *and_stmt = gimple_build_assign (mask, BIT_AND_EXPR,
+                                                       mask_loop,
+                                                       vec_opmask[i]);
+              gsi_insert_before (gsi, and_stmt, GSI_SAME_STMT);
+            }
+          else
+            mask = mask_loop;
+        }
       else if (is_cond_op)
         mask = vec_opmask[i];
       if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
--
2.45.1
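
To illustrate why the two masks need to be combined, here is a minimal scalar
model in C of one masked fold-left step. It is only a sketch under stated
assumptions: the lane count LANES and the helpers merge_masks and
fold_left_masked are hypothetical names made up for this illustration, not GCC
functions. They only mimic, lane by lane, what a masked in-order add does.

```c
#include <stdbool.h>

#define LANES 4 /* hypothetical vector length, chosen for illustration */

/* Hypothetical helper: lane-wise AND of the loop mask and the cond_op mask,
   i.e. the scalar analogue of the BIT_AND_EXPR built in the patch.  */
static void
merge_masks (bool *out, const bool *mask_loop, const bool *mask_cond)
{
  for (int lane = 0; lane < LANES; lane++)
    out[lane] = mask_loop[lane] && mask_cond[lane];
}

/* Hypothetical helper: scalar model of a masked in-order (fold-left) add.
   Lanes are visited in index order and only active lanes contribute.  */
static double
fold_left_masked (double acc, const double *vec, const bool *mask)
{
  for (int lane = 0; lane < LANES; lane++)
    if (mask[lane])
      acc += vec[lane];
  return acc;
}
```

As an example, take the lanes { 1.0, 2.0, 4.0, 8.0 }, a loop mask of
{ 1, 1, 1, 0 } (last lane out of bounds) and a condition mask of { 1, 0, 1, 1 }.
Reducing with only the loop mask also adds lane 1, whose condition is false,
and yields 7.0. Reducing with the merged mask adds lanes 0 and 2 only and
yields 5.0.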
From: rguenth at gcc dot gnu.org @ 2024-06-10 6:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think it should work. There is also prepare_vec_mask, which uses a cache,
but I have no idea whether it is applicable to non-load/store statements or
whether extra work is needed to make it usable here.
Richard?
From: rdapp at gcc dot gnu.org @ 2024-06-10 6:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382
--- Comment #3 from Robin Dapp <rdapp at gcc dot gnu.org> ---
For the record: the earlier hunk bootstrapped and regtested on the cfarm
machines and tested successfully on aarch64 qemu with SVE. I still need to set
up a regtest environment with SDE.
From: rguenth at gcc dot gnu.org @ 2024-06-10 7:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Robin Dapp from comment #3)
> For the record: the earlier hunk bootstrapped and regtested on the cfarm
> machines and tested successfully on aarch64 qemu with SVE. I still need to
> set up a regtest environment with SDE.
I think the patch is OK, so I suggest posting it and CC'ing Richard S. so he
can chime in.
From: cvs-commit at gcc dot gnu.org @ 2024-06-11 18:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382
--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Robin Dapp <rdapp@gcc.gnu.org>:
https://gcc.gnu.org/g:2b438a0d2aa80f051a09b245a58f643540d4004b
commit r15-1187-g2b438a0d2aa80f051a09b245a58f643540d4004b
Author: Robin Dapp <rdapp@ventanamicro.com>
Date: Fri Jun 7 14:36:41 2024 +0200
vect: Merge loop mask and cond_op mask in fold-left reduction [PR115382].
Currently we discard the cond-op mask when the loop is fully masked
which causes wrong code in
gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
when compiled with
-O3 -march=cascadelake --param vect-partial-vector-usage=2.
This patch ANDs both masks.
gcc/ChangeLog:
PR tree-optimization/115382
* tree-vect-loop.cc (vectorize_fold_left_reduction): Use
prepare_vec_mask.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
Remove static of prepare_vec_mask.
* tree-vectorizer.h (prepare_vec_mask): Export.
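
For orientation, a loop of the general shape this fix concerns might look like
the sketch below. This is a hypothetical example, not the contents of the
testcase named above: the function name cond_sum_in_order and its parameters
are assumptions made for illustration. Without flags that permit
reassociation, the floating-point additions have to happen in index order, and
only elements whose condition holds may contribute.

```c
#include <stddef.h>

/* Hypothetical conditional in-order reduction: adds a[i] to the result in
   strict index order, but only for elements where cond[i] is nonzero.  */
double
cond_sum_in_order (const double *a, const int *cond, size_t n)
{
  double res = 0.0;
  for (size_t i = 0; i < n; i++)
    if (cond[i])
      res += a[i];
  return res;
}
```

When such a loop is vectorized with partial vectors, each vector iteration has
both a loop mask (which lanes are in bounds) and a condition mask (which lanes
satisfy cond[i]), which is why the reduction needs the AND of the two.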
From: rguenth at gcc dot gnu.org @ 2024-06-12 7:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Thanks. It seems to be latent on the branch as well, right?
From: rdapp at gcc dot gnu.org @ 2024-06-12 7:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382
--- Comment #7 from Robin Dapp <rdapp at gcc dot gnu.org> ---
Ah yes, I'm still going to push the patch to the GCC 14 branch.