public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion
@ 2021-09-21 20:09 law at gcc dot gnu.org
2021-09-22 7:18 ` [Bug tree-optimization/102436] " rguenth at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: law at gcc dot gnu.org @ 2021-09-21 20:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
Bug ID: 102436
Summary: [11/12 Regression] Lost Load/Store Motion
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: law at gcc dot gnu.org
Target Milestone: ---
Created attachment 51492
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51492&action=edit
Testcase
So consider this loop (-O2, lim2 dump, trunk, x86_64):
;; basic block 3, loop depth 1
;; pred: 2
;; 10
# target_8 = PHI <target_13(D)(2), target_17(10)>
_4 = board[target_8];
if (_4 == 13)
goto <bb 4>; [94.50%]
else
goto <bb 7>; [5.50%]
;; succ: 4
;; 7
;; basic block 4, loop depth 1
;; pred: 3
if (captures.32_5 == 0)
goto <bb 5>; [33.00%]
else
goto <bb 6>; [67.00%]
;; succ: 5
;; 6
;; basic block 5, loop depth 1
;; pred: 4
numb_moves.1_21 = numb_moves;
_22 = (long unsigned int) numb_moves.1_21;
_23 = _22 * 24;
_24 = (struct move_s *) _23;
_24->from = gfrom.30_1;
_24->target = target_8;
_24->captured = 13;
_24->castled = 0;
_24->promoted = 0;
_24->ep = 0;
_26 = numb_moves.1_21 + 1;
numb_moves = _26;
;; succ: 6
;; basic block 6, loop depth 1
;; pred: 4
;; 5
target_17 = target_8 + offset_14;
_7 = board[target_17];
if (_7 != 0)
goto <bb 10>; [94.50%]
else
goto <bb 9>; [5.50%]
;; succ: 10
;; 9
;; basic block 10, loop depth 1
;; pred: 6
goto <bb 3>; [100.00%]
;; succ: 3
In particular note the load from and store to numb_moves in block #5 within the
loop. I don't immediately see an aliasing issue that would prevent LSM. The
bigger problem is control flow, obviously the load/store may not be executed,
but I thought our LIM/LSM code handled that correctly.
If we look at gcc-10 we get something like this:
;; basic block 3, loop depth 1
;; pred: 2
;; 10
# target_9 = PHI <target_14(D)(2), target_19(10)>
# numb_moves_lsm.43_6 = PHI <numb_moves_lsm.43_34(2),
numb_moves_lsm.43_2(10)>
# numb_moves_lsm_flag.44_20 = PHI <numb_moves_lsm_flag.44_35(2),
numb_moves_lsm_flag.44_18(10)>
_4 = board[target_9];
if (_4 == 13)
goto <bb 4>; [94.50%]
else
goto <bb 7>; [5.50%]
;; succ: 4
;; 7
;; basic block 4, loop depth 1
;; pred: 3
if (captures.32_5 == 0)
goto <bb 5>; [33.00%]
else
goto <bb 6>; [67.00%]
;; succ: 5
;; 6
;; basic block 5, loop depth 1
;; pred: 4
numb_moves.1_21 = numb_moves_lsm.43_6;
_22 = (long unsigned int) numb_moves.1_21;
_23 = _22 * 24;
_24 = (struct move_s *) _23;
_24->from = gfrom.30_1;
_24->target = target_9;
_24->captured = 13;
_24->castled = 0;
_24->promoted = 0;
_24->ep = 0;
_26 = numb_moves.1_21 + 1;
numb_moves_lsm.43_37 = _26;
numb_moves_lsm_flag.44_38 = 1;
;; succ: 6
;; basic block 6, loop depth 1
;; pred: 4
;; 5
# numb_moves_lsm.43_2 = PHI <numb_moves_lsm.43_6(4), numb_moves_lsm.43_37(5)>
# numb_moves_lsm_flag.44_18 = PHI <numb_moves_lsm_flag.44_20(4),
numb_moves_lsm_flag.44_38(5)>
target_19 = target_9 + offset_15;
_8 = board[target_19];
if (_8 != 0)
goto <bb 10>; [94.50%]
else
goto <bb 11>; [5.50%]
;; succ: 10
;; 11
[ ... ]
;; basic block 10, loop depth 1
;; pred: 6
goto <bb 3>; [100.00%]
;; succ: 3
Obviously with the load before the loop and the store after.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102436] [11/12 Regression] Lost Load/Store Motion
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
@ 2021-09-22 7:18 ` rguenth at gcc dot gnu.org
2021-11-16 14:57 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-09-22 7:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Target Milestone|--- |11.3
Priority|P3 |P2
Keywords| |missed-optimization
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Last reconfirmed| |2021-09-22
Status|UNCONFIRMED |ASSIGNED
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Memory reference 3: numb_moves
Memory reference 4: _24->from
...
Querying dependency of refs 3 and 4: dependent.
Querying SM WAW dependencies of ref 3 in loop 1: dependent
the issue is that we require conditional executed stores to be independent
on all other stores as we cannot re-issue other stores on exit in the proper
order.
Now, in this case the dependent stores are executed under the same condition
and in fact ordered in a way that we don't have to re-issue any dependent
store.
We're failing to handle this special case after the store-motion re-write that
fixed the TBAA issues.
Smaller testcase where we can just issue the conditional store to 'p':
unsigned p;
void foo (float *q)
{
for (int i = 0; i < 256; ++i)
{
if (p)
{
unsigned a = p;
*(q++) = 1.;
p = a + 1;
}
}
}
the following are what's very much more difficult to handle
(we have to issue a conditional sequence of two stores, and remember the
location the non-invariant store stored to _and_ verify we can re-emit that
out-of-order, and we have to remember the value stored):
unsigned p;
void foo (float *q)
{
for (int i = 0; i < 256; ++i)
{
if (p)
{
unsigned a = p;
p = a + 1;
*(q++) = 1.;
}
}
}
a bit easier (the store we have to re-issue is always executed after the
last conditional store):
unsigned p;
void foo (float *q)
{
for (int i = 0; i < 256; ++i)
{
if (p)
{
unsigned a = p;
p = a + 1;
}
*(q++) = 1.;
}
}
impossible / invalid:
unsigned p;
void foo (float *q)
{
for (int i = 0; i < 256; ++i)
{
*(q++) = 1.;
if (p)
{
unsigned a = p;
p = a + 1;
}
}
}
I will see how difficult it is to teach the already interwinded code the
"trivial" case and whether the bit easier case falls out naturally.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102436] [11/12 Regression] Lost Load/Store Motion
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
2021-09-22 7:18 ` [Bug tree-optimization/102436] " rguenth at gcc dot gnu.org
@ 2021-11-16 14:57 ` rguenth at gcc dot gnu.org
2021-11-19 8:35 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-11-16 14:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So what can probably be special-cased is when all stores in the loop happen in
the same basic-block. I do have a start for this but need to massage it a bit
more. Once there's more than one block controlled by a different conditional
the order of stores is lost and so they have to be independent (but in
principle only across BBs, not all-to-all, but only in case the order is the
same in all BBs ...).
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102436] [11/12 Regression] Lost Load/Store Motion
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
2021-09-22 7:18 ` [Bug tree-optimization/102436] " rguenth at gcc dot gnu.org
2021-11-16 14:57 ` rguenth at gcc dot gnu.org
@ 2021-11-19 8:35 ` cvs-commit at gcc dot gnu.org
2021-11-19 8:37 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-11-19 8:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:0fc859f5efcb4624a8b4ffdbf34d63972af179a8
commit r12-5394-g0fc859f5efcb4624a8b4ffdbf34d63972af179a8
Author: Richard Biener <rguenther@suse.de>
Date: Thu Nov 18 13:40:32 2021 +0100
tree-optimization/102436 - restore loop store motion
This restores a case of conditional store motion we fail to handle
after the rewrite. We can recognize the special case of all
stores in a loop happening in a single conditionally executed
block which ensures stores are not re-ordered by executing them
in different loop iterations. Separating out the case avoids
complicating the already complex main path.
2021-11-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/102436
* tree-ssa-loop-im.c (execute_sm_if_changed): Add mode
to just create the if structure and return the then block.
(execute_sm): Add flag to indicate the var will re-use
another flag var.
(hoist_memory_references): Support a single conditional
block with all stores as special case.
* gcc.dg/torture/20211118-1.c: New testcase.
* gcc.dg/tree-ssa/ssa-lim-18.c: Likewise.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102436] [11/12 Regression] Lost Load/Store Motion
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
` (2 preceding siblings ...)
2021-11-19 8:35 ` cvs-commit at gcc dot gnu.org
@ 2021-11-19 8:37 ` rguenth at gcc dot gnu.org
2021-11-19 15:28 ` [Bug tree-optimization/102436] [11 " law at gcc dot gnu.org
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-11-19 8:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |10.3.1, 12.0
Known to fail| |11.1.0, 11.2.0
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk, I will likely not backport it but will keep the bug open for
now.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102436] [11 Regression] Lost Load/Store Motion
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
` (3 preceding siblings ...)
2021-11-19 8:37 ` rguenth at gcc dot gnu.org
@ 2021-11-19 15:28 ` law at gcc dot gnu.org
2022-04-21 7:50 ` rguenth at gcc dot gnu.org
2023-05-29 10:05 ` jakub at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: law at gcc dot gnu.org @ 2021-11-19 15:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
--- Comment #5 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Sounds reasonable (not backporting, but holding bug open for now). I'll
probably do some testing with it internally, so if you end up wanting to
revisit the backporting question, reach out I may have useful data.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102436] [11 Regression] Lost Load/Store Motion
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
` (4 preceding siblings ...)
2021-11-19 15:28 ` [Bug tree-optimization/102436] [11 " law at gcc dot gnu.org
@ 2022-04-21 7:50 ` rguenth at gcc dot gnu.org
2023-05-29 10:05 ` jakub at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-04-21 7:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|11.3 |11.4
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.3 is being released, retargeting bugs to GCC 11.4.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug tree-optimization/102436] [11 Regression] Lost Load/Store Motion
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
` (5 preceding siblings ...)
2022-04-21 7:50 ` rguenth at gcc dot gnu.org
@ 2023-05-29 10:05 ` jakub at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-05-29 10:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|11.4 |11.5
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-05-29 10:05 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-21 20:09 [Bug tree-optimization/102436] New: [11/12 Regression] Lost Load/Store Motion law at gcc dot gnu.org
2021-09-22 7:18 ` [Bug tree-optimization/102436] " rguenth at gcc dot gnu.org
2021-11-16 14:57 ` rguenth at gcc dot gnu.org
2021-11-19 8:35 ` cvs-commit at gcc dot gnu.org
2021-11-19 8:37 ` rguenth at gcc dot gnu.org
2021-11-19 15:28 ` [Bug tree-optimization/102436] [11 " law at gcc dot gnu.org
2022-04-21 7:50 ` rguenth at gcc dot gnu.org
2023-05-29 10:05 ` jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).