public inbox for gcc-bugs@sourceware.org
* [Bug libgomp/49490] New: suboptimal load balancing in loops
From: dennis.jespersen at nasa dot gov
Date: 2011-06-21 16:48 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49490

           Summary: suboptimal load balancing in loops
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: libgomp
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: dennis.jespersen@nasa.gov

Created attachment 24573
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24573
test code to show how a compiler/runtime splits an OpenMP loop

The OpenMP runtime library produces a correct but suboptimal load balance in
parallel loops. For example, a loop of length 33 with 8 OpenMP threads will
give the threads work of lengths 5, 5, 5, 5, 5, 5, 3, 0 respectively.

This is logically correct, but imagine a dual-socket 4 core + 4 core
configuration; then the "left" socket has 20 units of work while the "right"
socket has 13 units of work. This could put undue pressure on the left
cache(s) and/or memory connection.

It would be better to spread out the work as much as possible, so in the
example in question the threads would get work of lengths
5, 4, 4, 4, 4, 4, 4, 4. It should be fairly easy to modify libgomp/iter.c to
produce the better load balancing (at least I think that's where the
modification would go).

The attached Fortran code will show the load balance; the Portland Group and
Intel products give the desired even balance.
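The two splits described in the report can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the libgomp code or the attached Fortran test: the function names `old_static_sizes` and `balanced_static_sizes` are hypothetical, and the "old" behaviour is modelled only from the numbers quoted above (every thread takes ceil(n / nthreads) iterations until the loop runs out).

```python
def old_static_sizes(n, nthreads):
    """Model of the reported split: each thread takes ceil(n / nthreads)
    iterations until none are left (assumption based on the report's numbers)."""
    chunk = -(-n // nthreads)  # integer ceiling of n / nthreads
    sizes = []
    remaining = n
    for _ in range(nthreads):
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


def balanced_static_sizes(n, nthreads):
    """Requested split: the first n % nthreads threads get one extra iteration."""
    q, r = divmod(n, nthreads)
    return [q + 1 if i < r else q for i in range(nthreads)]


if __name__ == "__main__":
    for name, split in (("old", old_static_sizes), ("balanced", balanced_static_sizes)):
        sizes = split(33, 8)
        # Hypothetical dual-socket layout: threads 0-3 on one socket, 4-7 on the other.
        print(name, sizes, "left:", sum(sizes[:4]), "right:", sum(sizes[4:]))
```

For the 33-iteration, 8-thread case, the per-socket totals go from 20 and 13 under the old split to 17 and 16 under the balanced one, assuming threads 0-3 and 4-7 map to the two sockets as in the report's example.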
* [Bug libgomp/49490] suboptimal load balancing in loops
From: jakub at gcc dot gnu.org
Date: 2011-06-22 14:42 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49490

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2011.06.22 14:41:35
         AssignedTo|unassigned at gcc dot       |jakub at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-22 14:41:35 UTC ---
Created attachment 24580
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24580
gcc47-pr49490.patch

Untested fix.
* [Bug libgomp/49490] suboptimal load balancing in loops
From: jakub at gcc dot gnu.org
Date: 2011-06-22 20:39 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49490

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-22 20:39:27 UTC ---
Author: jakub
Date: Wed Jun 22 20:39:25 2011
New Revision: 175315

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175315
Log:
    PR libgomp/49490
    * omp-low.c (expand_omp_for_static_nochunk): Only
    use n ceil/ nthreads size for the first
    n % nthreads threads in the team instead of
    all threads except for the last few ones which
    get less work or none at all.

    * iter.c (gomp_iter_static_next): For chunk size 0
    only use n ceil/ nthreads size for the first
    n % nthreads threads in the team instead of
    all threads except for the last few ones which
    get less work or none at all.
    * iter_ull.c (gomp_iter_ull_static_next): Likewise.
    * env.c (parse_schedule): If OMP_SCHEDULE doesn't have
    chunk argument, set run_sched_modifier to 0 for static
    resp. 1 for other kinds.  If chunk argument is 0
    and not static, set value to 1.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/omp-low.c
    trunk/libgomp/ChangeLog
    trunk/libgomp/env.c
    trunk/libgomp/iter.c
    trunk/libgomp/iter_ull.c
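The rule in the log entry (the larger size only for the first n % nthreads threads) also fixes where each thread's block of iterations starts and ends. The sketch below illustrates that arithmetic under stated assumptions; it is not the committed C code, and the helper name `static_range` and the half-open `[start, end)` convention are choices made here for illustration.

```python
def static_range(n, nthreads, i):
    """Half-open iteration range [start, end) for thread i when the first
    n % nthreads threads each get one extra iteration (illustrative only)."""
    q, r = divmod(n, nthreads)
    if i < r:
        # Threads before i all have size q + 1.
        start = i * (q + 1)
        return start, start + q + 1
    # The r larger threads come first, so shift the start by r.
    start = i * q + r
    return start, start + q


if __name__ == "__main__":
    for tid in range(8):
        print(tid, static_range(33, 8, tid))
```

The ranges are contiguous and cover the whole loop, and when n is smaller than the team size the trailing threads receive empty ranges rather than negative ones.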