public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size)
@ 2021-09-22 22:56 michelemartone at users dot sourceforge.net
2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 22:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461
Bug ID: 102461
Summary: overflow in omp parallel for schedule
(static,chunk_size)
Product: gcc
Version: 11.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: michelemartone at users dot sourceforge.net
Target Milestone: ---
Created attachment 51499
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51499&action=edit
`git diff` output of gcc/omp-expand.c against revision 7ca388565af aka tag:
releases/gcc-11.2.0
Hi,
I am the author of the LIBRSB library [http://librsb.sourceforge.net/] for
Sparse BLAS (sparse linear algebra_ computations.
I've received a bug report [https://savannah.gnu.org/bugs/?60042#comment4]
where an OpenMP-parallelized loop between i=0..<near to INT_MAX> ended up
executing the loop body with i=-1.
The original loop in question had static scheduling and chunk_size=10000, and
even one thread is sufficient to cause the problem.
I reduced the problem and can reproduce it with gcc-11.2.0 (built from sources,
GIT tag releases/gcc-11.2.0) and older ones; arch x86_64.
I could not reproduce the problem with clang-11.0.1 or icc-19.0.5.281.
// # gcc -Wall -Wextra -pedantic -fopenmp -std=c11 -O0 overflow.c
#include <stdlib.h> // abort
#include <limits.h> // INT_MAX
#include <omp.h> // compile with -fopenmp
int main ()
{
const int chunk_size = 1000;
const int n = INT_MAX - 100; // 2147483547
int l = 0;
#pragma omp parallel for schedule (static,chunk_size) num_threads (1)
for (int i = 0; i < n; ++i)
{
l = i;
if(i < 0)
abort ();
}
if ( l != n-1 )
abort ();
return 0;
}
I made some experiments in gcc/omp-expand.c and came up with a fix to insert
overflow detection code when computing the lower and upper boundaries of the
current chunk, thus avoiding the loop body from being executed with i=-1.
Invoking patched gcc like:
FIX1=1 FIX2=1 FIX3=1 gcc -fopenmp -Wall -pedantic -O0 overflow_mini.c -o
overflow -fdump-tree-all
and looking at the *.ompexp file it dumps, one can get an idea of what was
going wrong with the original flow.
Unfortunately I was not able to make my patch generate correct code for -O1 or
more.
I am attaching `git diff` output of gcc/omp-expand.c against revision
7ca388565af aka tag: releases/gcc-11.2.0 -- so one may use this as a base to
fix the bug fully.
I hope this information is enough for you GCC/OpenMP folks to fix this problem!
Michele
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug c/102461] overflow in omp parallel for schedule (static,chunk_size)
2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
@ 2021-09-22 22:59 ` michelemartone at users dot sourceforge.net
2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 22:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461
Michele Martone <michelemartone at users dot sourceforge.net> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |michelemartone at users dot source
| |forge.net
--- Comment #1 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Created attachment 51500
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51500&action=edit
a slightly longer reproducer listing
I add a slightly longer reproducer listing.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug c/102461] overflow in omp parallel for schedule (static,chunk_size)
2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
@ 2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 23:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461
--- Comment #2 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Created attachment 51501
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51501&action=edit
original *ompexp file with the OMP region tree
I am attaching original *ompexp file with the OMP region tree.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug c/102461] overflow in omp parallel for schedule (static,chunk_size)
2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
@ 2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
2021-09-23 7:54 ` [Bug middle-end/102461] " jakub at gcc dot gnu.org
2022-01-24 10:28 ` michelemartone at users dot sourceforge.net
4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 23:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461
--- Comment #3 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Created attachment 51502
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51502&action=edit
patched gcc *ompexp file with the OMP region tree.
I am attaching patched gcc *ompexp file with the OMP region tree. Notice the
negative values checks.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/102461] overflow in omp parallel for schedule (static,chunk_size)
2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
` (2 preceding siblings ...)
2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
@ 2021-09-23 7:54 ` jakub at gcc dot gnu.org
2022-01-24 10:28 ` michelemartone at users dot sourceforge.net
4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-23 7:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, the general OpenMP restrictions say roughly that if there is any overflow
during the computation of the number of loops, then the behavior is
unspecified.
So e.g. a loop
#pragma omp for
for (int i = -100; i < INT_MAX - 50; i++)
is UB, because the number of iterations overflows in int.
Though, in your case it is just the chunk_size computation that is done later
and nothing talks about that being undefined.
In the library for static,chunk_size the code handling it e.g. for long
iterators (or whatever fits into it) is:
unsigned long n, s0, e0, i, c;
long s, e;
/* Otherwise, each thread gets exactly chunk_size iterations
(if available) each time through the loop. */
s = ws->incr + (ws->incr > 0 ? -1 : 1);
n = (ws->end - ws->next + s) / ws->incr;
i = thr->ts.team_id;
c = ws->chunk_size;
/* Initial guess is a C sized chunk positioned nthreads iterations
in, offset by our thread number. */
s0 = (thr->ts.static_trip * nthreads + i) * c;
e0 = s0 + c;
/* Detect overflow. */
if (s0 >= n)
return 1;
if (e0 > n)
e0 = n;
/* Transform these to the actual start and end numbers. */
s = (long)s0 * ws->incr + ws->next;
e = (long)e0 * ws->incr + ws->next;
*pstart = s;
*pend = e;
if (e0 == n)
thr->ts.static_trip = -1;
else
thr->ts.static_trip++;
return 0;
Any overflow in the s and n's computation are user's fault.
thr->ts.static_trip says how many times the current thread got a chunk assigned
before (with -1 means no more work for the current thread), so for e.g. very
large values of chunk_size it will be at most 1 for some threads and thread
numbers need to fit into int, so we can also assume there won't be more than
INT_MAX threads.
I guess for * c and + c we should use __builtin_mul_overflow and
__builtin_add_overflow.
And of course another thing is the inline expansion in omp-expand.c which could
do the same thing.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug middle-end/102461] overflow in omp parallel for schedule (static,chunk_size)
2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
` (3 preceding siblings ...)
2021-09-23 7:54 ` [Bug middle-end/102461] " jakub at gcc dot gnu.org
@ 2022-01-24 10:28 ` michelemartone at users dot sourceforge.net
4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2022-01-24 10:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461
--- Comment #5 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Hi Jakub.
Thanks for confirming the details of the algorithm determining the boundaries.
If we take Table 2.5 of the OpenMP spec:
https://www.openmp.org/spec-html/5.1/openmpsu48.html#x73-73045
it says about "static":
"iterations are divided into chunks of size chunk_size".
So already sticking to that (so, performing that division) would avoid the
problem here.
To me, this nails down the issue as a bug.
If you can use a non-division based arithmetic to get around it would be fine
as well.
Indeed, as of now some more resilience for these loops would be nice..
>From the user perspective, the idea of getting an overflow with one thread
(case here) is odd: one may be tempted to think that in this special case no
computation at all happens.
Moreover, nowadays is pretty easy for the loop range to get near to INT_MAX,
and then an overflow situation here may arise even if e.g. n = 2000000000,
chunk_size = 1080000000 (so something more than n/2) and 1 thread.
I hope the overflow-avoiding version makes it in soon here (just as with
dynamic or guided)!
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-01-24 10:28 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
2021-09-23 7:54 ` [Bug middle-end/102461] " jakub at gcc dot gnu.org
2022-01-24 10:28 ` michelemartone at users dot sourceforge.net
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).