public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/114589] New: missed optimization: losing bool range information
@ 2024-04-04 17:28 barry.revzin at gmail dot com
2024-04-04 17:47 ` [Bug tree-optimization/114589] " pinskia at gcc dot gnu.org
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: barry.revzin at gmail dot com @ 2024-04-04 17:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
Bug ID: 114589
Summary: missed optimization: losing bool range information
Product: gcc
Version: 13.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: barry.revzin at gmail dot com
Target Milestone: ---
Consider the following example:
template <class T>
struct simple_optional {
bool has_val;
T val;
auto begin() const -> T const* { return &val; }
#ifdef SIMPLE
auto end() const -> T const* { return &val + (has_val ? 1 : 0); }
#else
auto end() const -> T const* { return has_val ? &val + 1 : &val; }
#endif
};
void f(int);
void call_f(simple_optional<int> const& o) {
for (int i : o) {
f(i);
}
}
This function should call f at most one time. With the SIMPLE implementation
that adds (has_val ? 1 : 0), or simply has_val, or static_cast<int>(has_val),
or any version thereof - gcc trunk -O3 still emits a loop:
call_f(simple_optional<int> const&):
push rbp
push rbx
lea rbx, [rdi+4]
sub rsp, 8
movzx edx, BYTE PTR [rdi]
lea rbp, [rbx+rdx*4]
test dl, dl
je .L1
.L3:
mov edi, DWORD PTR [rbx]
add rbx, 4
call f(int)
cmp rbp, rbx
jne .L3
.L1:
add rsp, 8
pop rbx
pop rbp
ret
With the other approach, where end() explicitly returns either &val + 1 or
&val, gcc does not emit a loop:
call_f(simple_optional<int> const&):
cmp BYTE PTR [rdi], 0
jne .L4
ret
.L4:
mov edi, DWORD PTR [rdi+4]
jmp f(int)
On compiler explorer: https://godbolt.org/z/jaMxqsj5q
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
@ 2024-04-04 17:47 ` pinskia at gcc dot gnu.org
2024-04-04 17:58 ` pinskia at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-04 17:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Component|middle-end |tree-optimization
Status|UNCONFIRMED |NEW
Last reconfirmed| |2024-04-04
Ever confirmed|0 |1
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
Why didn't sink1 push _10, _13, _12, and _11 past the conditional here ...
If it did that I think it might have optimized correctly.
<bb 2> [local count: 118111600]:
_11 = &o_3(D)->val;
_5 = o_3(D)->has_val;
_12 = (sizetype) _5;
_13 = _12 << 2;
_10 = _11 + _13;
if (_5 != 0)
goto <bb 5>; [89.00%]
else
goto <bb 9>; [11.00%]
<bb 5> [local count: 105119324]:
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
2024-04-04 17:47 ` [Bug tree-optimization/114589] " pinskia at gcc dot gnu.org
@ 2024-04-04 17:58 ` pinskia at gcc dot gnu.org
2024-04-05 7:52 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-04 17:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> Confirmed.
>
> Why didn't sink1 push _10, _13, _12, and _11 past the conditional here ...
> If it did that I think it might have optimized correctly.
Because then _12 would be 1, _13 would be 4 and _10 would be come `_11 + 4` and
the loop induction variable would be going from _11 to `_11 + 4` then:
```
# __for_begin_15 = PHI <__for_begin_8(6), _11(5)>
# .MEM_16 = PHI <.MEM_7(6), .MEM_4(D)(5)>
# VUSE <.MEM_16>
i_6 = *__for_begin_15;
# .MEM_7 = VDEF <.MEM_16>
# USE = nonlocal escaped
# CLB = nonlocal escaped
_Z1fiD.2782 (i_6);
# PT = nonlocal
__for_begin_8 = __for_begin_15 + 4;
if (__for_begin_8 != _10)
goto <bb 6>; [89.00%]
else
goto <bb 10>; [11.00%]
```
And we could figure out this loop is only gone through once only.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
2024-04-04 17:47 ` [Bug tree-optimization/114589] " pinskia at gcc dot gnu.org
2024-04-04 17:58 ` pinskia at gcc dot gnu.org
@ 2024-04-05 7:52 ` rguenth at gcc dot gnu.org
2024-04-05 9:39 ` [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking rguenth at gcc dot gnu.org
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-04-05 7:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu.org
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is the "old" issue with the bogus
/* If BEST_BB is at the same nesting level, then require it to have
significantly lower execution frequency to avoid gratuitous movement. */
if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
/* If result of comparsion is unknown, prefer EARLY_BB.
Thus use !(...>=..) rather than (...<...) */
&& !(best_bb->count * 100 >= early_bb->count * threshold))
return best_bb;
check. We have a BB2 count of 118111600 and BB5 we'd sink to has 105119324
which is higher than 75% of BB2.
This was supposed to prevent sinking across if (x) unlikely (); blocks
but it has odd side-effects like this one.
IIRC there's duplicates for this.
I'll note that we probably want to avoid sinking to a post-dominator
unless it sinks out of a loop (that's cross-BB scheduling). But we no
longer compute post-dominators. Without re-introducing those and
limiting sinking this way removing the odd code will cause regressions.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
` (2 preceding siblings ...)
2024-04-05 7:52 ` rguenth at gcc dot gnu.org
@ 2024-04-05 9:39 ` rguenth at gcc dot gnu.org
2024-05-15 9:37 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-04-05 9:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Summary|missed optimization: losing |missed optimization: losing
|bool range information |bool range information,
| |missed sinking
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Let me work on this in stage1.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
` (3 preceding siblings ...)
2024-04-05 9:39 ` [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking rguenth at gcc dot gnu.org
@ 2024-05-15 9:37 ` rguenth at gcc dot gnu.org
2024-05-15 11:32 ` rguenth at gcc dot gnu.org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-05-15 9:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Blocks| |85316
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
So when doing the sinking it's too late since there's no VRP between early
sinking and late unrolling and also threadfull doesn't figure out
<bb 3> [local count: 105119324]:
_11 = &o_3(D)->val;
_12 = 1;
_13 = 4;
_10 = _11 + 4;
ivtmp.11_14 = (unsigned long) _11;
<bb 4> [local count: 955630224]:
# ivtmp.11_17 = PHI <ivtmp.11_2(6), ivtmp.11_14(3)>
_18 = (void *) ivtmp.11_17;
i_6 = MEM[(const int *)_18];
f (i_6);
ivtmp.11_2 = ivtmp.11_17 + 4;
__for_begin_19 = (const int *) ivtmp.11_2;
if (__for_begin_19 != _10)
goto <bb 6>; [89.00%]
else
goto <bb 5>; [11.00%]
<bb 6> [local count: 850510900]:
goto <bb 4>; [100.00%]
but IIRC it doesn't consider the loop entry "fallthru" as "threading" source.
Nevertheless I have a fix for the sinking issue.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316
[Bug 85316] [meta-bug] VRP range propagation missed cases
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
` (4 preceding siblings ...)
2024-05-15 9:37 ` rguenth at gcc dot gnu.org
@ 2024-05-15 11:32 ` rguenth at gcc dot gnu.org
2024-05-15 16:13 ` cvs-commit at gcc dot gnu.org
2024-05-15 16:14 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-05-15 11:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So actually it also needs -fno-ivopts since otherwise VRP is confused. With
-fno-ivopts it's late DOM that figures out the loop doesn't iterate.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
` (5 preceding siblings ...)
2024-05-15 11:32 ` rguenth at gcc dot gnu.org
@ 2024-05-15 16:13 ` cvs-commit at gcc dot gnu.org
2024-05-15 16:14 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-05-15 16:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:99b1daae18c095d6c94d32efb77442838e11cbfb
commit r15-518-g99b1daae18c095d6c94d32efb77442838e11cbfb
Author: Richard Biener <rguenther@suse.de>
Date: Fri May 3 14:04:41 2024 +0200
tree-optimization/114589 - remove profile based sink heuristics
The following removes the profile based heuristic limiting sinking
and instead uses post-dominators to avoid sinking to places that
are executed under the same conditions as the earlier location which
the profile based heuristic should have guaranteed as well.
To avoid regressing this moves the empty-latch check to cover all
sink cases.
It also stream-lines the resulting select_best_block a bit but avoids
adjusting heuristics more with this change. gfortran.dg/streamio_9.f90
starts execute failing with this on x86_64 with -m32 because the
(float)i * 9.9999...e-7 compute is sunk across a STOP causing it
to be no longer spilled and thus the compare failing due to excess
precision. The patch adds -ffloat-store to avoid this, following
other similar testcases.
This change fixes the testcase in the PR only when using -fno-ivopts
as otherwise VRP is confused.
PR tree-optimization/114589
* tree-ssa-sink.cc (select_best_block): Remove profile-based
heuristics. Instead reject sink locations that sink
to post-dominators. Move empty latch check here from
statement_sink_location. Also consider early_bb for the
loop depth check.
(statement_sink_location): Remove superfluous check. Remove
empty latch check.
(pass_sink_code::execute): Compute/release post-dominators.
* gfortran.dg/streamio_9.f90: Use -ffloat-store to avoid
excess precision when not spilling.
* g++.dg/tree-ssa/pr114589.C: New testcase.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
` (6 preceding siblings ...)
2024-05-15 16:13 ` cvs-commit at gcc dot gnu.org
@ 2024-05-15 16:14 ` rguenth at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-05-15 16:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org
Status|ASSIGNED |NEW
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
The missed sinking is now fixed for GCC 15, VRP is still confused by what
IVOPTs does so without -fno-ivopts the loop remains.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-05-15 16:14 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-04 17:28 [Bug c++/114589] New: missed optimization: losing bool range information barry.revzin at gmail dot com
2024-04-04 17:47 ` [Bug tree-optimization/114589] " pinskia at gcc dot gnu.org
2024-04-04 17:58 ` pinskia at gcc dot gnu.org
2024-04-05 7:52 ` rguenth at gcc dot gnu.org
2024-04-05 9:39 ` [Bug tree-optimization/114589] missed optimization: losing bool range information, missed sinking rguenth at gcc dot gnu.org
2024-05-15 9:37 ` rguenth at gcc dot gnu.org
2024-05-15 11:32 ` rguenth at gcc dot gnu.org
2024-05-15 16:13 ` cvs-commit at gcc dot gnu.org
2024-05-15 16:14 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).