public inbox for gcc-bugs@sourceware.org
* [Bug libgomp/49490] New: suboptimal load balancing in loops
From: dennis.jespersen at nasa dot gov @ 2011-06-21 16:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49490
Summary: suboptimal load balancing in loops
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: libgomp
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: dennis.jespersen@nasa.gov
Created attachment 24573
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24573
test code to show how a compiler/runtime splits an OpenMP loop
The OpenMP runtime library produces a correct but suboptimal load balance
in parallel loops.
For example, a loop of length 33 with 8 OpenMP threads will give the
threads work of lengths 5, 5, 5, 5, 5, 5, 3, 0 respectively. This is logically
correct, but imagine a dual-socket 4 core + 4 core configuration; then
the "left" socket has 20 units of work while the "right" socket has 13
units of work. This could put undue pressure on the left cache(s) and/or
memory connection. It would be better to spread the work as evenly as
possible, so that in this example the threads would get work of
lengths 5, 4, 4, 4, 4, 4, 4, 4.
It should be fairly easy to modify libgomp/iter.c to produce the better
load balancing (at least I think that's where the modification would go).
The attached Fortran code will show the load balance; the Portland Group and
Intel products give the desired even balance.
* [Bug libgomp/49490] suboptimal load balancing in loops
From: jakub at gcc dot gnu.org @ 2011-06-22 14:42 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49490
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2011.06.22 14:41:35
         AssignedTo|unassigned at gcc dot       |jakub at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1
--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-22 14:41:35 UTC ---
Created attachment 24580
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24580
gcc47-pr49490.patch
Untested fix.
* [Bug libgomp/49490] suboptimal load balancing in loops
From: jakub at gcc dot gnu.org @ 2011-06-22 20:39 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49490
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-22 20:39:27 UTC ---
Author: jakub
Date: Wed Jun 22 20:39:25 2011
New Revision: 175315
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175315
Log:
PR libgomp/49490
* omp-low.c (expand_omp_for_static_nochunk): Only
use n ceil/ nthreads size for the first
n % nthreads threads in the team instead of
all threads except for the last few ones which
get less work or none at all.
* iter.c (gomp_iter_static_next): For chunk size 0
only use n ceil/ nthreads size for the first
n % nthreads threads in the team instead of
all threads except for the last few ones which
get less work or none at all.
* iter_ull.c (gomp_iter_ull_static_next): Likewise.
* env.c (parse_schedule): If OMP_SCHEDULE doesn't have
chunk argument, set run_sched_modifier to 0 for static
resp. 1 for other kinds. If chunk argument is 0
and not static, set value to 1.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/omp-low.c
trunk/libgomp/ChangeLog
trunk/libgomp/env.c
trunk/libgomp/iter.c
trunk/libgomp/iter_ull.c