public inbox for gcc-bugs@sourceware.org
From: "mika dot fischer at kit dot edu" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libgomp/43706] scheduling two threads on one core leads to starvation
Date: Tue, 20 Apr 2010 12:23:00 -0000
Message-ID: <20100420122312.7751.qmail@sourceware.org>
In-Reply-To: <bug-43706-19017@http.gcc.gnu.org/bugzilla/>



------- Comment #7 from mika dot fischer at kit dot edu  2010-04-20 12:23 -------
> For performance reasons libgomp uses some busy waiting, which of course works
> well when there are available CPUs and cycles to burn (decreases latency a
> lot), but if you have more threads than CPUs it can make things worse.
> You can tweak this through OMP_WAIT_POLICY and GOMP_SPINCOUNT env vars.

This is definitely the reason for the behavior we're seeing. When we set
OMP_WAIT_POLICY=passive, the test program runs through normally. Without it,
it takes a very long time.

Here are some measurements when "while (true)" is replaced by
"for (int j=0; j<1000; ++j)":


All cores idle:
===============
$ /usr/bin/time ./openmp-bug
3.21user 0.00system 0:00.81elapsed 391%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+331minor)pagefaults 0swaps

$ OMP_WAIT_POLICY=passive /usr/bin/time ./openmp-bug
2.75user 0.05system 0:01.42elapsed 196%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+335minor)pagefaults 0swaps


1 (out of 4) cores occupied:
============================
$ /usr/bin/time ./openmp-bug
133.65user 0.02system 0:45.30elapsed 295%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+330minor)pagefaults 0swaps

$ OMP_WAIT_POLICY=passive /usr/bin/time ./openmp-bug
2.67user 0.00system 0:02.35elapsed 113%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+335minor)pagefaults 0swaps

$ GOMP_SPINCOUNT=10 /usr/bin/time ./openmp-bug
2.91user 0.03system 0:01.73elapsed 169%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+336minor)pagefaults 0swaps

$ GOMP_SPINCOUNT=100 /usr/bin/time ./openmp-bug
2.77user 0.03system 0:01.90elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+336minor)pagefaults 0swaps

$ GOMP_SPINCOUNT=1000 /usr/bin/time ./openmp-bug
2.87user 0.00system 0:01.70elapsed 168%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+336minor)pagefaults 0swaps

$ GOMP_SPINCOUNT=10000 /usr/bin/time ./openmp-bug
3.05user 0.06system 0:01.85elapsed 167%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+337minor)pagefaults 0swaps

$ GOMP_SPINCOUNT=100000 /usr/bin/time ./openmp-bug
5.25user 0.03system 0:03.10elapsed 170%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+335minor)pagefaults 0swaps

$ GOMP_SPINCOUNT=1000000 /usr/bin/time ./openmp-bug
28.84user 0.00system 0:14.13elapsed 203%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+336minor)pagefaults 0swaps

[I ran all of these several times and took the runtime around the average]
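
To put the elapsed times above in proportion, here is a tiny sketch of the
arithmetic. The helper name slowdown is made up for illustration; the inputs
are simply the elapsed times reported by /usr/bin/time above.

```cpp
// Hypothetical helper, for illustration only: ratio of a measured elapsed
// time to a baseline elapsed time, both in seconds.
inline double slowdown(double baseline_s, double measured_s)
{
    return measured_s / baseline_s;
}
```

With one core occupied, slowdown(2.35, 45.30) is roughly 19, i.e. the default
settings take about 19 times as long as OMP_WAIT_POLICY=passive.
slowdown(1.70, 45.30) is roughly 27 (default vs. GOMP_SPINCOUNT=1000), and
slowdown(0.81, 45.30) is roughly 56 (default with one core occupied vs. default
with all cores idle).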

> Although the implementation recognizes two kinds of spin counts (normal and
> throttled, the latter in use when number of threads is bigger than number of
> available CPUs), in some cases even that default might be too large (the
> default for throttled spin count is 1000 spins for OMP_WAIT_POLICY=active and
> 100 spins for no OMP_WAIT_POLICY in environment).

As the numbers show, a default spin count of 1000 would be fine. The problem,
however, is that the OpenMP runtime assumes that it has all the cores of the
CPU to itself. The throttled spin count is only used if the number of OpenMP
threads is larger than the number of cores in the system (AFAICT). This will
almost never happen (AFAICT only if you set OMP_NUM_THREADS to something larger
than the number of cores).

Since it seems clear that the spin count should be smaller when the CPU cores
are busy, the throttled spin count should be used whenever the cores are
actually in use at the moment the thread starts waiting. That the number of
OpenMP threads running at that time is smaller than the number of cores is not
a sufficient condition for using the non-throttled spin count. If it's not
possible to determine this, or if it's too time-consuming, then maybe the
non-throttled default spin count can be reduced to 1000 or so.
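
To make the suggestion concrete, here is a minimal sketch of the selection
rule I have in mind. It is not libgomp code: SpinCounts, choose_spin_count and
the busy_cores input are hypothetical names, and how a runtime would cheaply
obtain busy_cores (for example from the system load) is exactly the open
question mentioned above.

```cpp
// Hypothetical illustration only; none of these names exist in libgomp.
struct SpinCounts {
    unsigned long long normal;     // used when enough cores are assumed free
    unsigned long long throttled;  // used when threads compete for cores
};

// Pick the spin count based on the cores that are actually free, rather
// than on the total number of cores in the system.
//   omp_threads: OpenMP threads in the current team
//   total_cores: cores available to the process
//   busy_cores:  cores currently occupied by other work (assumed known)
inline unsigned long long choose_spin_count(const SpinCounts& counts,
                                            unsigned omp_threads,
                                            unsigned total_cores,
                                            unsigned busy_cores)
{
    // Guard against busy_cores exceeding total_cores.
    unsigned free_cores =
        total_cores > busy_cores ? total_cores - busy_cores : 0;

    // Throttle as soon as not every OpenMP thread can have its own core.
    return omp_threads > free_cores ? counts.throttled : counts.normal;
}
```

With 4 threads on 4 cores and busy_cores == 0 this returns the normal count,
which is what happens today. With one core occupied (busy_cores == 1) it
returns the throttled count, which is the case measured above. The values
stored in SpinCounts are up to the caller; the sketch does not assume any
particular defaults.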

So thanks for the workaround! But I still think the default behavior can easily
cause very significant slowdowns and thus should be reconsidered.

Finally, I still don't get why the spinlocking has these effects on the
runtime. I would expect even 2000000 spin lock cycles to be over very quickly
and not to cause a 20-fold increase in the total runtime of the program. Just
out of curiosity, maybe you can explain why this happens?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706

