public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/98137] New: Could use SLP to vectorize if split_constant_offset were smarter
@ 2020-12-04 9:48 rguenth at gcc dot gnu.org
From: rguenth at gcc dot gnu.org @ 2020-12-04 9:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98137
Bug ID: 98137
Summary: Could use SLP to vectorize if split_constant_offset
were smarter
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
void
gemm_m10_n9_k17_ldA20_ldB20_ldC10_beta0_alignedA1_alignedC1_pfsigonly(
    const double* __restrict__ A, const double* __restrict__ B,
    double* __restrict__ C, const double* A_prefetch,
    const double* B_prefetch, const double* C_prefetch) {
  unsigned int l_m = 0;
  unsigned int l_n = 0;
  unsigned int l_k = 0;
  for ( l_n = 0; l_n < 9; l_n++ ) {
    for ( l_m = 0; l_m < 10; l_m++ ) { C[(l_n*10)+l_m] = 0.0; }
    for ( l_k = 0; l_k < 17; l_k++ ) {
      for ( l_m = 0; l_m < 10; l_m++ ) {
        C[(l_n*10)+l_m] += A[(l_k*20)+l_m] * B[(l_n*20)+l_k];
      }
    }
  }
}
This is nicely vectorized with BB SLP when l_{m,n,k} are signed, but when
they are unsigned, as above, split_constant_offset gives up on seeing
  C + ((unsigned long)(_286 + 1) * 8)
even though we have good range info:
# RANGE [0, 80] NONZERO 126
_286 = l_n_189 * 10;
# RANGE [0, 80] NONZERO 126
_288 = (long unsigned int) _286;
# RANGE [0, 640] NONZERO 1008
_289 = _288 * 8;
# PT = null { D.2428 } (nonlocal, restrict)
_290 = C_37(D) + _289;
^^ C + ((unsigned long)(_286) * 8)
# RANGE [1, 81] NONZERO 127
_296 = _286 + 1;
# RANGE [1, 81] NONZERO 127
_297 = (long unsigned int) _296;
# RANGE [8, 648] NONZERO 1016
_298 = _297 * 8;
# PT = { D.2428 } (nonlocal, restrict)
_299 = C_37(D) + _298;
^^ C + ((unsigned long)(_286 + 1) * 8)
Giving up means DR group analysis doesn't relate the two accesses, so we do
not consider SLP vectorization.
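To illustrate why the conversion gets in the way: pulling the constant out of (unsigned long)(x + 1) * 8 is only valid when the 32-bit addition cannot wrap. The following is a minimal sketch in plain C, not GCC code; the function names are made up for illustration, and fixed-width types stand in for a 32-bit unsigned int and a 64-bit unsigned long.

```c
#include <stdint.h>

/* Byte offset as it appears in the IL: add in 32 bits, widen, then scale. */
uint64_t offset_unsplit(uint32_t x)
{
    return (uint64_t)(uint32_t)(x + 1u) * 8u;
}

/* Byte offset after moving the constant outside the conversion. */
uint64_t offset_split(uint32_t x)
{
    return (uint64_t)x * 8u + 8u;
}
```

For every x in the recorded range [0, 80] the two forms agree, which is what the range info would let the compiler prove. For x equal to UINT32_MAX the 32-bit addition wraps to zero and the two forms differ, which is why the split cannot be done without a range.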
* [Bug tree-optimization/98137] Could use SLP to vectorize if split_constant_offset were smarter
From: rguenth at gcc dot gnu.org @ 2020-12-04 10:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98137
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
   Last reconfirmed|                             |2020-12-04
             Status|UNCONFIRMED                  |ASSIGNED
     Ever confirmed|0                            |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, so the issue is that
  /* Split the unconverted operand and try to prove that
     wrapping isn't a problem.  */
  tree tmp_var, tmp_off;
  split_constant_offset (op0, &tmp_var, &tmp_off, cache, limit);

  /* See whether we have an SSA_NAME whose range is known
     to be [A, B].  */
  if (TREE_CODE (tmp_var) != SSA_NAME)
    return false;
we end up with tmp_var as MULT_EXPR (l_n_189 * 10). We can fix that by
retaining the original code when splitting the constant offset returns zero.
While this makes us match up (_286 + 1) * 8 and (_286 + 2) * 8, it fails
to catch (_286 + 0) * 8 because that is now no longer expanded.
But we can use determine_value_range instead of get_range_info to also handle
expression trees. That fixes the testcase.
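The idea of deriving a range for an expression rather than requiring an SSA name with a recorded range can be sketched as follows. This is a standalone illustration in C, not the implementation or signature of determine_value_range; struct urange and both helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical closed interval [lo, hi] of 32-bit unsigned values. */
struct urange { uint32_t lo, hi; };

/* Compute the range of (r * c) in 32-bit unsigned arithmetic.
   Returns false when the product may wrap, in which case *out is untouched. */
bool range_mul_const(struct urange r, uint32_t c, struct urange *out)
{
    uint64_t hi = (uint64_t)r.hi * c;
    if (hi > UINT32_MAX)
        return false;
    out->lo = r.lo * c;   /* cannot wrap because r.lo <= r.hi */
    out->hi = (uint32_t)hi;
    return true;
}

/* True when adding the constant off to any value in r cannot wrap. */
bool add_const_cannot_wrap(struct urange r, uint32_t off)
{
    return off <= UINT32_MAX - r.hi;
}
```

With l_n in [0, 8], the expression l_n * 10 gets the range [0, 80], and adding 1 provably does not wrap, so the constant can be moved outside the conversion. With an unknown operand, i.e. the full range [0, UINT32_MAX], both checks fail and the split has to be rejected.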
* [Bug tree-optimization/98137] Could use SLP to vectorize if split_constant_offset were smarter
From: rguenth at gcc dot gnu.org @ 2020-12-04 10:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98137
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 49677
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49677&action=edit
patch
I am testing the attached simple patch.
* [Bug tree-optimization/98137] Could use SLP to vectorize if split_constant_offset were smarter
From: cvs-commit at gcc dot gnu.org @ 2020-12-07 7:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98137
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:7b4ea2827d2003c8ffc76cd478f8974360cbd78f
commit r11-5809-g7b4ea2827d2003c8ffc76cd478f8974360cbd78f
Author: Richard Biener <rguenther@suse.de>
Date: Fri Dec 4 11:13:48 2020 +0100
tree-optimization/98137 - enhance split_constant_offset range handling
split_constant_offset currently gives up looking at ranges when
dealing with possibly wrapping operations for looking through
conversions when the downstream analysis does not yield a SSA name.
That's overly conservative and we have a nice helper that can
deal with arbitrary expressions. Use that. This helps data
reference group analysis so the testcase is fully SLP vectorized,
making use of the whole-function "BB" vectorization capabilities
we now have.
2020-12-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/98137
* tree-data-ref.c (split_constant_offset_1): Use
determine_value_range instead of get_range_info to handle
arbitrary expressions.
* gcc.dg/vect/bb-slp-pr98137.c: New testcase.
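Why a successful split matters for data reference group analysis can be sketched like this. It is a simplified model in C, not the representation used in tree-data-ref.c; struct addr_parts, its integer ids, and known_distance are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decomposed address: base + variable part + constant offset. */
struct addr_parts {
    int base_id;        /* identifies the base pointer, e.g. C */
    int var_id;         /* identifies the variable part, e.g. (unsigned long)_286 * 8 */
    int64_t const_off;  /* constant byte offset split off the address */
};

/* Two accesses can be related only when base and variable part match;
   their distance is then a compile-time constant stored in *dist. */
bool known_distance(struct addr_parts a, struct addr_parts b, int64_t *dist)
{
    if (a.base_id != b.base_id || a.var_id != b.var_id)
        return false;
    *dist = b.const_off - a.const_off;
    return true;
}
```

When the split succeeds, the two stores to C share a variable part and differ only by a constant 8 bytes, so they can form a group. When it fails, the second address keeps (_286 + 1) inside its variable part, the variable parts no longer match, and the accesses stay unrelated.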
* [Bug tree-optimization/98137] Could use SLP to vectorize if split_constant_offset were smarter
From: rguenth at gcc dot gnu.org @ 2020-12-07 7:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98137
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.
* [Bug tree-optimization/98137] Could use SLP to vectorize if split_constant_offset were smarter
From: rguenth at gcc dot gnu.org @ 2020-12-16 9:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98137
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |helijia at gcc dot gnu.org
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
*** Bug 88767 has been marked as a duplicate of this bug. ***