public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/113703] New: ivopts miscompiles loop
From: kristerw at gcc dot gnu.org @ 2024-02-01 11:17 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703
Bug ID: 113703
Summary: ivopts miscompiles loop
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kristerw at gcc dot gnu.org
Target Milestone: ---
The following function (gcc.dg/tree-ssa/ivopts-lt.c) is miscompiled when
compiled with -O1 for x86_64:
#include "stdint.h"
void
f1 (char *p, uintptr_t i, uintptr_t n)
{
  p += i;
  do
    {
      *p = '\0';
      p += 1;
      i++;
    }
  while (i < n);
}
The IR after cunroll looks like:
void f1 (char * p, uintptr_t i, uintptr_t n)
{
  <bb 2>:
  p_6 = p_4(D) + i_5(D);
  <bb 3>:
  # p_1 = PHI <p_6(2), p_9(5)>
  # i_2 = PHI <i_5(D)(2), i_10(5)>
  *p_1 = 0;
  p_9 = p_1 + 1;
  i_10 = i_2 + 1;
  if (i_10 < n_11(D))
    goto <bb 5>;
  else
    goto <bb 4>;
  <bb 5>:
  goto <bb 3>;
  <bb 4>:
  return;
}
This is then changed by ivopts to
void f1 (char * p, uintptr_t i, uintptr_t n)
{
  sizetype _13;
  char * _14;
  <bb 2>:
  p_6 = p_4(D) + i_5(D);
  _13 = n_11(D) - i_5(D);
  _14 = p_6 + _13;
  <bb 3>:
  # p_1 = PHI <p_6(2), p_9(5)>
  MEM[(char *)p_1] = 0;
  p_9 = p_1 + 1;
  if (p_9 < _14)
    goto <bb 5>;
  else
    goto <bb 4>;
  <bb 5>:
  goto <bb 3>;
  <bb 4>:
  return;
}
Suppose the function gets called with the values:
p = 0x0002ffffffffffff
i = 0xffff000000000001
n = 0xdffd7fffffffffff
The original function writes 0 to address 0x0002000000000000 and then exits.
In the optimized function, n - i wraps around to 0xdffe7ffffffffffe when
calculating _14, so the function does the equivalent of
memset(0x0002000000000000, 0, 0xdffe7ffffffffffe);
From: rguenth at gcc dot gnu.org @ 2024-02-01 14:12 UTC
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |wrong-code
CC| |rguenth at gcc dot gnu.org
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think the point is we fail to represent
Analyzing # of iterations of loop 1
exit condition [i_5(D) + 1, + , 1] < n_11(D)
bounds on difference of bases: -18446744073709551615 ... 18446744073709551615
result:
zero if i_5(D) + 1 > n_11(D)
# of iterations (n_11(D) - i_5(D)) + 18446744073709551615, bounded by
18446744073709551615
number of iterations (n_11(D) - i_5(D)) + 18446744073709551615; zero if
i_5(D) + 1 > n_11(D)
specifically the 'zero if i_5(D) + 1 > n_11(D)'
I think may_eliminate_iv is wrong here, maybe not considering overflow
of the niter expression?
I wonder if it is possible to write a runtime testcase that FAILs with
reasonable memory requirement/layout.
From: kristerw at gcc dot gnu.org @ 2024-02-01 15:01 UTC
--- Comment #2 from Krister Walfridsson <kristerw at gcc dot gnu.org> ---
Here is a runtime testcase:
#include <sys/mman.h>
#include <unistd.h>
#include <stdint.h>
__attribute__((noipa))
void f1 (char *p, uintptr_t i, uintptr_t n)
{
  p += i;
  do
    {
      *p = '\0';
      p += 1;
      i++;
    }
  while (i < n);
}
int main()
{
  long pgsz = sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, pgsz * 2, PROT_READ|PROT_WRITE,
                  MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  if (p == MAP_FAILED)
    return 0;
  mprotect (p+pgsz, pgsz, PROT_NONE);
  uintptr_t n = -3 - (uintptr_t)p;
  f1 (p+2, -2, n);
  return 0;
}
From: kristerw at gcc dot gnu.org @ 2024-02-01 15:30 UTC
--- Comment #3 from Krister Walfridsson <kristerw at gcc dot gnu.org> ---
Oops. I messed up the test case... It "works", but the actual values do not
make sense...
The following is better:
int main()
{
  long pgsz = sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, pgsz * 2, PROT_READ|PROT_WRITE,
                  MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  if (p == MAP_FAILED)
    return 0;
  mprotect (p+pgsz, pgsz, PROT_NONE);
  uintptr_t n = -2 - (uintptr_t)(p+pgsz);
  f1 (p+pgsz, -2, n);
  return 0;
}
From: rguenth at gcc dot gnu.org @ 2024-02-05 10:04 UTC
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Keywords| |needs-bisection
Status|UNCONFIRMED |NEW
Last reconfirmed| |2024-02-05
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.
From: rguenth at gcc dot gnu.org @ 2024-02-06 11:33 UTC
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
It's going wrong in iv_elimination_compare_lt, which is meant to handle exactly
this kind of loop:
We aim to handle the following situation:
  sometype *base, *p;
  int a, b, i;
  i = a;
  p = p_0 = base + a;
  do
    {
      bla (*p);
      p++;
      i++;
    }
  while (i < b);
Here, the number of iterations of the loop is (a + 1 > b) ? 0 : b - a - 1.
We aim to optimize this to
  p = p_0 = base + a;
  do
    {
      bla (*p);
      p++;
    }
  while (p < p_0 - a + b);
This preserves the correctness, since the pointer arithmetics does not
overflow. More precisely:
1) if a + 1 <= b, then p_0 - a + b is the final value of p, hence there is no
overflow in computing it or the values of p.
2) if a + 1 > b, then we need to verify that the expression p_0 - a does not
overflow. To prove this, we use the fact that p_0 = base + a.
So there is either a hole in that logic or the implementation is off:
  /* Finally, check that CAND->IV->BASE - CAND->IV->STEP * A does not
     overflow.  */
  offset = fold_build2 (MULT_EXPR, TREE_TYPE (cand->iv->step),
                        cand->iv->step,
                        fold_convert (TREE_TYPE (cand->iv->step), a));
  if (!difference_cannot_overflow_p (data, cand->iv->base, offset))
    return false;
where 'A' is 'i', CAND->IV->BASE is 'p + i' and CAND->IV->STEP is 1
as 'sizetype'.
That just checks that (p + i) - i doesn't overflow.
What is missing is a proof that p + b doesn't overflow, since we end up with
p' < (p + i) + (n - i) aka p' < p + n.