* [Bug tree-optimization/113703] ivopts miscompiles loop
2024-02-01 11:17 [Bug tree-optimization/113703] New: ivopts miscompiles loop kristerw at gcc dot gnu.org
@ 2024-02-01 14:12 ` rguenth at gcc dot gnu.org
2024-02-01 15:01 ` kristerw at gcc dot gnu.org
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-02-01 14:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
                 CC|                            |rguenth at gcc dot gnu.org
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think the point is that we fail to represent part of this niter result:

Analyzing # of iterations of loop 1
  exit condition [i_5(D) + 1, + , 1] < n_11(D)
  bounds on difference of bases: -18446744073709551615 ... 18446744073709551615
  result:
    zero if i_5(D) + 1 > n_11(D)
    # of iterations (n_11(D) - i_5(D)) + 18446744073709551615, bounded by 18446744073709551615
  number of iterations (n_11(D) - i_5(D)) + 18446744073709551615; zero if i_5(D) + 1 > n_11(D)

specifically the 'zero if i_5(D) + 1 > n_11(D)' part.
I think may_eliminate_iv is wrong here, maybe not considering overflow
of the niter expression?
I wonder if it is possible to write a runtime testcase that FAILs with
reasonable memory requirement/layout.
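To make the guarded result in the dump concrete, here is a minimal sketch in C that evaluates it with 64-bit unsigned arithmetic. The helper names niter_guarded and niter_unguarded are made up for illustration and are not GCC functions. The sketch assumes a 64-bit uintptr_t, so 18446744073709551615 is UINT64_MAX, i.e. -1 modulo 2^64.

```c
#include <stdint.h>

/* Hypothetical helper (not a GCC function): the latch count from the dump,
   including the "zero if i + 1 > n" guard.  */
static uint64_t
niter_guarded (uint64_t i, uint64_t n)
{
  if (i + 1 > n)
    return 0;
  /* Adding UINT64_MAX is the same as subtracting 1 modulo 2^64.  */
  return (n - i) + UINT64_MAX;
}

/* Hypothetical helper: the same expression with the guard dropped.  */
static uint64_t
niter_unguarded (uint64_t i, uint64_t n)
{
  return (n - i) + UINT64_MAX;
}
```

With i = (uint64_t) -2 and n = 100, i + 1 is UINT64_MAX, so the guard fires and niter_guarded returns 0 (the loop body runs exactly once). Without the guard the expression wraps around to 101.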
From: kristerw at gcc dot gnu.org @ 2024-02-01 15:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703
--- Comment #2 from Krister Walfridsson <kristerw at gcc dot gnu.org> ---
Here is a runtime testcase:
#include <sys/mman.h>
#include <unistd.h>
#include <stdint.h>

__attribute__((noipa))
void f1 (char *p, uintptr_t i, uintptr_t n)
{
  p += i;
  do
    {
      *p = '\0';
      p += 1;
      i++;
    }
  while (i < n);
}

int main()
{
  long pgsz = sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, pgsz * 2, PROT_READ|PROT_WRITE,
                  MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  if (p == MAP_FAILED)
    return 0;
  mprotect (p+pgsz, pgsz, PROT_NONE);
  uintptr_t n = -3 - (uintptr_t)p;
  f1 (p+2, -2, n);
  return 0;
}
From: kristerw at gcc dot gnu.org @ 2024-02-01 15:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703
--- Comment #3 from Krister Walfridsson <kristerw at gcc dot gnu.org> ---
Oops. I messed up the test case... It "works", but the actual values do not
make sense...
The following is better:
int main()
{
  long pgsz = sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, pgsz * 2, PROT_READ|PROT_WRITE,
                  MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
  if (p == MAP_FAILED)
    return 0;
  mprotect (p+pgsz, pgsz, PROT_NONE);
  uintptr_t n = -2 - (uintptr_t)(p+pgsz);
  f1 (p+pgsz, -2, n);
  return 0;
}
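To see what these corrected values are meant to exercise, the following sketch models f1 on plain integers instead of real memory. It assumes a 64-bit address space; model_f1, its cap parameter, and the example address used below are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical integer model of f1: returns how many bytes the loop would
   write, stopping at CAP so the model always terminates.  *LAST receives
   the address of the last write.  */
static uint64_t
model_f1 (uint64_t p, uint64_t i, uint64_t n, uint64_t cap, uint64_t *last)
{
  uint64_t writes = 0;
  p += i;
  do
    {
      *last = p;  /* stands in for *p = '\0' */
      writes++;
      p += 1;
      i++;
    }
  while (i < n && writes < cap);
  return writes;
}
```

Assuming the second page starts at the made-up address 0x20000, the call model_f1 (0x20000, (uint64_t) -2, (uint64_t) -2 - 0x20000, 16, &last) returns 1 with last == 0x1fffe. A correct compilation therefore writes a single byte, two bytes before the PROT_NONE page. A loop that keeps running would reach 0x20000 on its third write and fault.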
From: rguenth at gcc dot gnu.org @ 2024-02-05 10:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |needs-bisection
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-05
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.
From: rguenth at gcc dot gnu.org @ 2024-02-06 11:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
It's going wrong in iv_elimination_compare_lt, which tries to handle exactly
this kind of loop:
We aim to handle the following situation:
     sometype *base, *p;
     int a, b, i;

     i = a;
     p = p_0 = base + a;

     do
       {
         bla (*p);
         p++;
         i++;
       }
     while (i < b);
Here, the number of iterations of the loop is (a + 1 > b) ? 0 : b - a - 1.
We aim to optimize this to
     p = p_0 = base + a;
     do
       {
         bla (*p);
         p++;
       }
     while (p < p_0 - a + b);
This preserves the correctness, since the pointer arithmetics does not
overflow. More precisely:
   1) if a + 1 <= b, then p_0 - a + b is the final value of p, hence there is no
      overflow in computing it or the values of p.
   2) if a + 1 > b, then we need to verify that the expression p_0 - a does not
      overflow.  To prove this, we use the fact that p_0 = base + a.
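The transformation described in this comment can be tried on a small buffer. The sketch below is only an illustration with made-up names (zero_by_index, zero_by_pointer), not code from GCC. It assumes that base + a and base + b both stay inside the buffer, so none of the pointer arithmetic wraps.

```c
#include <stddef.h>

/* Index-controlled form: the exit test uses i.  Returns the write count.  */
static size_t
zero_by_index (char *base, int a, int b)
{
  size_t writes = 0;
  char *p = base + a;
  int i = a;
  do
    {
      *p = '\0';
      writes++;
      p++;
      i++;
    }
  while (i < b);
  return writes;
}

/* Pointer-controlled form: i is gone, and the exit test compares p
   against p_0 - a + b.  */
static size_t
zero_by_pointer (char *base, int a, int b)
{
  size_t writes = 0;
  char *p_0 = base + a;
  char *p = p_0;
  char *bound = p_0 - a + b;
  do
    {
      *p = '\0';
      writes++;
      p++;
    }
  while (p < bound);
  return writes;
}
```

On a 16-byte buffer both functions write 5 bytes for a = 2, b = 7 and a single byte for a = 5, b = 3. The two forms agree here because every intermediate pointer is in range; the bug discussed in this thread concerns inputs where that assumption does not hold.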
There's either a hole in that logic or the implementation is off:
  /* Finally, check that CAND->IV->BASE - CAND->IV->STEP * A does not
     overflow.  */
  offset = fold_build2 (MULT_EXPR, TREE_TYPE (cand->iv->step),
                        cand->iv->step,
                        fold_convert (TREE_TYPE (cand->iv->step), a));
  if (!difference_cannot_overflow_p (data, cand->iv->base, offset))
    return false;
where 'A' is 'i', CAND->IV->BASE is 'p + i' and CAND->IV->STEP is 1
as 'sizetype'.
That just checks that (p + i) - i doesn't overflow.
Somehow it fails to prove that p + b doesn't overflow, since we end up with
p' < (p + i) + (n - i), i.e. p' < p + n.
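The arithmetic behind that last step can be checked with the values from the test case in comment #3. This is a sketch on 64-bit integers, not the code ivopts generates. The helper names and the example address are made up, and the rewritten exit test is assumed to be the p' < (p + i) + (n - i) comparison described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original exit test after the first iteration: i has been incremented.  */
static bool
index_test_continues (uint64_t i, uint64_t n)
{
  return i + 1 < n;
}

/* Assumed rewritten exit test after the first iteration:
   p' < (p + i) + (n - i), evaluated modulo 2^64.  */
static bool
pointer_test_continues (uint64_t p, uint64_t i, uint64_t n)
{
  uint64_t p_0 = p + i;
  uint64_t bound = p_0 + (n - i);
  return p_0 + 1 < bound;
}
```

Taking p = 0x20000 (a made-up address), i = (uint64_t) -2 and n = (uint64_t) -2 - p, index_test_continues is false, so the loop should stop after one iteration. pointer_test_continues is true, because the bound evaluates to 0xfffffffffffffffe while p' is only 0x1ffff.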