public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/96272] New: Failure to optimize overflow check
@ 2020-07-21 17:36 gabravier at gmail dot com
2020-07-22 8:34 ` [Bug tree-optimization/96272] " ubizjak at gmail dot com
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: gabravier at gmail dot com @ 2020-07-21 17:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
Bug ID: 96272
Summary: Failure to optimize overflow check
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
static inline unsigned f(unsigned a, unsigned b)
{
if (b > UINT_MAX - a)
return UINT_MAX;
return a + b;
}
With -O3, LLVM outputs this:
f(unsigned int, unsigned int):
add edi, esi
mov eax, -1
cmovae eax, edi
ret
GCC outputs this:
f(unsigned int, unsigned int):
mov eax, edi
not eax
add edi, esi
cmp eax, esi
mov eax, -1
cmovnb eax, edi
ret
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
@ 2020-07-22 8:34 ` ubizjak at gmail dot com
2020-07-22 8:40 ` jakub at gcc dot gnu.org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: ubizjak at gmail dot com @ 2020-07-22 8:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
Ever confirmed|0 |1
Last reconfirmed| |2020-07-22
Status|UNCONFIRMED |NEW
--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
Confirmed, pattern to convert:
a > UINT_MAX - b;
to
__builtin_uadd_overflow
should be added to match.pd.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
2020-07-22 8:34 ` [Bug tree-optimization/96272] " ubizjak at gmail dot com
@ 2020-07-22 8:40 ` jakub at gcc dot gnu.org
2020-07-22 10:05 ` ubizjak at gmail dot com
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-07-22 8:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Well, it needs the addition too, so I think this can't be done in match.pd, but
would need to be done in some other pass (not sure which, perhaps phiopt?).
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
2020-07-22 8:34 ` [Bug tree-optimization/96272] " ubizjak at gmail dot com
2020-07-22 8:40 ` jakub at gcc dot gnu.org
@ 2020-07-22 10:05 ` ubizjak at gmail dot com
2020-11-23 8:07 ` ubizjak at gmail dot com
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: ubizjak at gmail dot com @ 2020-07-22 10:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #2)
> Well, it needs the addition too, so I think this can't be done in match.pd,
> but would need to be done in some other pass (not sure which, perhaps
> phiopt?).
No, I was referring to the first step of the optimization. The converted source
would read something like:
unsigned
bar (unsigned a, unsigned b)
{
int dummy;
if (__builtin_uadd_overflow (a, b, &dummy))
return UINT_MAX;
return a + b;
}
The RTL CSE pass is able to eliminate one addition, resulting in:
bar:
addl %esi, %edi
jc .L5
movl %edi, %eax
ret
.L5:
orl $-1, %eax
ret
Eventually, some tree pass could convert the above source to:
unsigned
bar (unsigned a, unsigned b)
{
unsigned res;
if (__builtin_uadd_overflow (a, b, &res))
return UINT_MAX;
return res;
}
which results in:
bar:
addl %esi, %edi
movl $-1, %eax
cmovnc %edi, %eax
ret
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
` (2 preceding siblings ...)
2020-07-22 10:05 ` ubizjak at gmail dot com
@ 2020-11-23 8:07 ` ubizjak at gmail dot com
2020-11-23 16:44 ` jakub at gcc dot gnu.org
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: ubizjak at gmail dot com @ 2020-11-23 8:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #2)
> Well, it needs the addition too, so I think this can't be done in match.pd,
> but would need to be done in some other pass (not sure which, perhaps
> phiopt?).
Maybe I was not too clear. Please consider this testcase:
--cut here--
#include <limits.h>
#include <stdio.h>
int
foo (unsigned a, unsigned b)
{
return a > UINT_MAX - b;
}
int
bar (unsigned a, unsigned b)
{
int dummy;
return __builtin_uadd_overflow (a, b, &dummy);
}
--cut here--
This results in (-O2):
foo:
notl %esi
xorl %eax, %eax
cmpl %edi, %esi
setb %al
ret
bar:
xorl %eax, %eax
addl %esi, %edi
setc %al
ret
So, if it is possible to transform the comparison via the following
equivalence:
a > UINT_MAX - b == (a + b) > UINT_MAX
to a __builtin_uadd_overflow, then at least on x86 it is possible to produce
much better code.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
` (3 preceding siblings ...)
2020-11-23 8:07 ` ubizjak at gmail dot com
@ 2020-11-23 16:44 ` jakub at gcc dot gnu.org
2020-12-11 15:37 ` jakub at gcc dot gnu.org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-11-23 16:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
It shouldn't be added to match.pd, the check as written in the source is
certainly better for other optimization passes.
For PR95853 I've added recently a widening_mul (i.e. very late pass, almost
before expansion) pattern matching of another form (very lame in the source
code) of the overflow check but in that case I've put a constraint that the
overflow check and addition must be in the same bb (and not too far from each
other).
In this case it isn't in the same bb and again would be very undesirable if it
is really too far (would increase register pressure).
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
` (4 preceding siblings ...)
2020-11-23 16:44 ` jakub at gcc dot gnu.org
@ 2020-12-11 15:37 ` jakub at gcc dot gnu.org
2020-12-12 13:49 ` cvs-commit at gcc dot gnu.org
2020-12-14 10:29 ` jakub at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-12-11 15:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49744
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49744&action=edit
gcc11-pr96272.patch
Untested fix. The addition is then CSEd by RTL PRE, and the reason why we
don't emit a conditional move is something in the CE. Though it is a question
if we want to use a conditional move, generally predicting overflows don't
happen is a good idea and so the overflow case should be pretty cold.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
` (5 preceding siblings ...)
2020-12-11 15:37 ` jakub at gcc dot gnu.org
@ 2020-12-12 13:49 ` cvs-commit at gcc dot gnu.org
2020-12-14 10:29 ` jakub at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-12-12 13:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:fe78528c05fdd562f21e12675781473b0fbe892e
commit r11-5957-gfe78528c05fdd562f21e12675781473b0fbe892e
Author: Jakub Jelinek <jakub@redhat.com>
Date: Sat Dec 12 14:48:47 2020 +0100
widening_mul: Recognize another form of ADD_OVERFLOW [PR96272]
The following patch recognizes another form of hand written
__builtin_add_overflow (this time _p), in particular when
the code does unsigned
if (x > ~0U - y)
or
if (x <= ~0U - y)
it can be optimized (if the subtraction turned into ~y is single use)
into
if (__builtin_add_overflow_p (x, y, 0U))
or
if (!__builtin_add_overflow_p (x, y, 0U))
and generate better code, e.g. for the first function in the testcase:
- movl %esi, %eax
addl %edi, %esi
- notl %eax
- cmpl %edi, %eax
- movl $-1, %eax
- cmovnb %esi, %eax
+ jc .L3
+ movl %esi, %eax
+ ret
+.L3:
+ orl $-1, %eax
ret
on x86_64. As for the jumps vs. conditional move case, that is some CE
issue with complex branch patterns we should fix up no matter what, but
in this case I'm actually not sure if branchy code isn't better, overflow
is something that isn't that common.
2020-12-12 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/96272
* tree-ssa-math-opts.c (uaddsub_overflow_check_p): Add OTHER
argument.
Handle BIT_NOT_EXPR.
(match_uaddsub_overflow): Optimize unsigned a > ~b into
__imag__ .ADD_OVERFLOW (a, b).
(math_opts_dom_walker::after_dom_children): Call
match_uaddsub_overflow
even for BIT_NOT_EXPR.
* gcc.dg/tree-ssa/pr96272.c: New test.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug tree-optimization/96272] Failure to optimize overflow check
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
` (6 preceding siblings ...)
2020-12-12 13:49 ` cvs-commit at gcc dot gnu.org
@ 2020-12-14 10:29 ` jakub at gcc dot gnu.org
7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-12-14 10:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed now.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2020-12-14 10:29 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-21 17:36 [Bug tree-optimization/96272] New: Failure to optimize overflow check gabravier at gmail dot com
2020-07-22 8:34 ` [Bug tree-optimization/96272] " ubizjak at gmail dot com
2020-07-22 8:40 ` jakub at gcc dot gnu.org
2020-07-22 10:05 ` ubizjak at gmail dot com
2020-11-23 8:07 ` ubizjak at gmail dot com
2020-11-23 16:44 ` jakub at gcc dot gnu.org
2020-12-11 15:37 ` jakub at gcc dot gnu.org
2020-12-12 13:49 ` cvs-commit at gcc dot gnu.org
2020-12-14 10:29 ` jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).