[Bug c/102461] New: overflow in omp parallel for schedule (static,chunk

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size)
@ 2021-09-22 22:56 michelemartone at users dot sourceforge.net
  2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 22:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461

            Bug ID: 102461
           Summary: overflow in omp parallel for schedule
                    (static,chunk_size)
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: michelemartone at users dot sourceforge.net
  Target Milestone: ---

Created attachment 51499
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51499&action=edit
`git diff` output of gcc/omp-expand.c against revision 7ca388565af aka tag:
releases/gcc-11.2.0

Hi,

I am the author of the LIBRSB library [http://librsb.sourceforge.net/] for
Sparse BLAS (sparse linear algebra_ computations.
I've received a bug report [https://savannah.gnu.org/bugs/?60042#comment4]
where an OpenMP-parallelized loop between i=0..<near to INT_MAX> ended up
executing the loop body with i=-1.

The original loop in question had static scheduling and chunk_size=10000, and
even one thread is sufficient to cause the problem.

I reduced the problem and can reproduce it with gcc-11.2.0 (built from sources,
GIT tag releases/gcc-11.2.0) and older ones; arch x86_64.
I could not reproduce the problem with clang-11.0.1 or icc-19.0.5.281.  

// # gcc -Wall -Wextra -pedantic -fopenmp -std=c11 -O0 overflow.c               
#include <stdlib.h> // abort
#include <limits.h> // INT_MAX
#include <omp.h> // compile with -fopenmp
int main ()
{ 
  const int chunk_size = 1000; 
  const int n = INT_MAX - 100; // 2147483547
  int l = 0;
  #pragma omp parallel for schedule (static,chunk_size) num_threads (1)
  for (int i = 0; i < n; ++i)
    { 
      l = i; 
      if(i < 0)
        abort ();
    }
  if ( l != n-1 )
    abort ();
  return 0;
}

I made some experiments in gcc/omp-expand.c and came up with a fix to insert
overflow detection code when computing the lower and upper boundaries of the
current chunk, thus avoiding the loop body from being executed with i=-1.
Invoking patched gcc like:

FIX1=1 FIX2=1 FIX3=1 gcc -fopenmp -Wall -pedantic -O0 overflow_mini.c -o
overflow -fdump-tree-all

and looking at the *.ompexp file it dumps, one can get an idea of what was
going wrong with the original flow.

Unfortunately I was not able to make my patch generate correct code for -O1 or
more.

I am attaching `git diff` output of gcc/omp-expand.c against revision
7ca388565af aka tag: releases/gcc-11.2.0 -- so one may use this as a base to
fix the bug fully.

I hope this information is enough for you GCC/OpenMP folks to fix this problem!

Michele

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug c/102461] overflow in omp parallel for schedule (static,chunk_size)
  2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
@ 2021-09-22 22:59 ` michelemartone at users dot sourceforge.net
  2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 22:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461

Michele Martone <michelemartone at users dot sourceforge.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |michelemartone at users dot source
                   |                            |forge.net

--- Comment #1 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Created attachment 51500
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51500&action=edit
a slightly longer reproducer listing

I add a slightly longer reproducer listing.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug c/102461] overflow in omp parallel for schedule (static,chunk_size)
  2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
  2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
@ 2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
  2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 23:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461

--- Comment #2 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Created attachment 51501
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51501&action=edit
original *ompexp file with the OMP region tree

I am attaching original *ompexp file with the OMP region tree.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug c/102461] overflow in omp parallel for schedule (static,chunk_size)
  2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
  2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
  2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
@ 2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
  2021-09-23  7:54 ` [Bug middle-end/102461] " jakub at gcc dot gnu.org
  2022-01-24 10:28 ` michelemartone at users dot sourceforge.net
  4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2021-09-22 23:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461

--- Comment #3 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Created attachment 51502
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51502&action=edit
patched gcc *ompexp file with the OMP region tree.

I am attaching patched gcc *ompexp file with the OMP region tree. Notice the
negative values checks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug middle-end/102461] overflow in omp parallel for schedule (static,chunk_size)
  2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
                   ` (2 preceding siblings ...)
  2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
@ 2021-09-23  7:54 ` jakub at gcc dot gnu.org
  2022-01-24 10:28 ` michelemartone at users dot sourceforge.net
  4 siblings, 0 replies; 6+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-09-23  7:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, the general OpenMP restrictions say roughly that if there is any overflow
during the computation of the number of loops, then the behavior is
unspecified.
So e.g. a loop
#pragma omp for
  for (int i = -100; i < INT_MAX - 50; i++)
is UB, because the number of iterations overflows in int.
Though, in your case it is just the chunk_size computation that is done later
and  nothing talks about that being undefined.
In the library for static,chunk_size the code handling it e.g. for long
iterators (or whatever fits into it) is:
      unsigned long n, s0, e0, i, c;
      long s, e;

      /* Otherwise, each thread gets exactly chunk_size iterations
         (if available) each time through the loop.  */

      s = ws->incr + (ws->incr > 0 ? -1 : 1);
      n = (ws->end - ws->next + s) / ws->incr;
      i = thr->ts.team_id;
      c = ws->chunk_size;

      /* Initial guess is a C sized chunk positioned nthreads iterations
         in, offset by our thread number.  */
      s0 = (thr->ts.static_trip * nthreads + i) * c;
      e0 = s0 + c;

      /* Detect overflow.  */
      if (s0 >= n)
        return 1;
      if (e0 > n)
        e0 = n;

      /* Transform these to the actual start and end numbers.  */
      s = (long)s0 * ws->incr + ws->next;
      e = (long)e0 * ws->incr + ws->next;

      *pstart = s;
      *pend = e;

      if (e0 == n)
        thr->ts.static_trip = -1;
      else
        thr->ts.static_trip++;
      return 0;

Any overflow in the s and n's computation are user's fault.
thr->ts.static_trip says how many times the current thread got a chunk assigned
before (with -1 means no more work for the current thread), so for e.g. very
large values of chunk_size it will be at most 1 for some threads and thread
numbers need to fit into int, so we can also assume there won't be more than
INT_MAX threads.
I guess for * c and + c we should use __builtin_mul_overflow and
__builtin_add_overflow.
And of course another thing is the inline expansion in omp-expand.c which could
do the same thing.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug middle-end/102461] overflow in omp parallel for schedule (static,chunk_size)
  2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
                   ` (3 preceding siblings ...)
  2021-09-23  7:54 ` [Bug middle-end/102461] " jakub at gcc dot gnu.org
@ 2022-01-24 10:28 ` michelemartone at users dot sourceforge.net
  4 siblings, 0 replies; 6+ messages in thread
From: michelemartone at users dot sourceforge.net @ 2022-01-24 10:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102461

--- Comment #5 from Michele Martone <michelemartone at users dot sourceforge.net> ---
Hi Jakub.

Thanks for confirming the details of the algorithm determining the boundaries.

If we take Table 2.5 of the OpenMP spec:
https://www.openmp.org/spec-html/5.1/openmpsu48.html#x73-73045
it says about "static":
  "iterations are divided into chunks of size chunk_size".
So already sticking to that (so, performing that division) would avoid the
problem here.

To me, this nails down the issue as a bug.

If you can use a non-division based arithmetic to get around it would be fine
as well.

Indeed, as of now some more resilience for these loops would be nice..
>From the user perspective, the idea of getting an overflow with one thread
(case here) is odd: one may be tempted to think that in this special case no
computation at all happens.
Moreover, nowadays is pretty easy for the loop range to get near to INT_MAX,
and then an overflow situation here may arise even if e.g. n = 2000000000,
chunk_size = 1080000000 (so something more than n/2) and 1 thread.

I hope the overflow-avoiding version makes it in soon here (just as with
dynamic or guided)!

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-01-24 10:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-22 22:56 [Bug c/102461] New: overflow in omp parallel for schedule (static,chunk_size) michelemartone at users dot sourceforge.net
2021-09-22 22:59 ` [Bug c/102461] " michelemartone at users dot sourceforge.net
2021-09-22 23:00 ` michelemartone at users dot sourceforge.net
2021-09-22 23:02 ` michelemartone at users dot sourceforge.net
2021-09-23  7:54 ` [Bug middle-end/102461] " jakub at gcc dot gnu.org
2022-01-24 10:28 ` michelemartone at users dot sourceforge.net

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).