From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 24296 invoked by alias); 2 Feb 2014 22:51:36 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 24261 invoked by uid 48); 2 Feb 2014 22:51:31 -0000
From: "njs at pobox dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libgomp/60035] New: [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)
Date: Sun, 02 Feb 2014 22:51:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: libgomp
X-Bugzilla-Version: unknown
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: njs at pobox dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter cc attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-02/txt/msg00094.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

            Bug ID: 60035
           Summary: [PATCH] make it possible to use OMP on both sides of a
                    fork (without violating standard)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: njs at pobox dot com
                CC: jakub at gcc dot gnu.org

Created attachment 32019
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32019&action=edit
patch to make openmp -> quiesce -> fork -> openmp work

This is a re-open of #52303 and #58378, with more arguments, and a proposed patch that fixes the problem without violating the openmp standard.

Background: Almost all scientific/numerical code delegates linear algebra operations to some optimized BLAS library. Currently, the main contenders for this library are:

1) ATLAS: free software, but uses extensive build-time configuration, which means it must be re-compiled from source by every user to achieve competitive performance.
2) MKL: proprietary, but technically excellent.
3) OpenBLAS: free software, but uses OpenMP for threading, which means that any program which does linear algebra and also expects fork() to work is screwed [1], at least when using GCC.

This means that for projects like numpy, which are used in a very large range of downstream products, we are pretty much screwed too. Many of our users use fork(), for various good reasons that I can elaborate if desired, so we can't just recommend OpenBLAS in general -- ATLAS or MKL are superior for those users. But recompiling ATLAS is difficult, so we can't recommend that as a general solution, or provide it in pre-compiled downloads.

So what we end up doing is shipping slow, unoptimized BLAS, while all the major "scientific python" distros ship MKL. We also get constantly pressured by users to ship binaries either with MKL or with OpenBLAS built with icc, and we field a new bug report every week or two from people who use OpenBLAS without realizing it and are experiencing mysterious hangs. (Or sometimes other projects get caught in the crossfire, e.g. [2], which is someone trying to figure out why their web-app can't generate plot graphics when using the celery job queue manager.) Meanwhile people are waiting with bated breath for clang to get an OpenMP implementation so that they can shift their whole stack over there, solely because of this one bug.

Basically the current situation is causing ongoing pain for a wide range of people and makes free software uncompetitive with proprietary software for scientific code using Python in general. But it doesn't have to be this way!

In actual practice on real implementations -- regardless of what POSIX says -- it's perfectly safe to use arbitrary POSIX APIs after fork, so long as all threads are in a known, quiescent state when the fork occurs.

The attached patch has essentially no impact on compliant OpenMP-using programs. In particular, and unlike the patch in #58378, it has no effect on the behavior of the parent process, and in the child process it does nothing that violates POSIX unless the user has violated POSIX first. But it makes it safe in practice to use OpenMP encapsulated within a serial library API, without mysterious breakage depending on far-away parts of the program, and in particular it should fix the OpenBLAS issue.

Test case included in patch is by Olivier Grisel, from #58378.

Patch is against current gcc svn trunk (r206297).

[1] https://github.com/xianyi/OpenBLAS/issues/294
[2] https://github.com/celery/celery/issues/1842
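
Illustrative sketch (not the attached patch, which is not reproduced in this message): one plausible way for a threading runtime to leave the parent untouched while giving the child a clean start is a child-side handler registered with pthread_atfork(). Everything named pool_* below is hypothetical and merely stands in for a runtime's cached worker pool; no real worker threads are created, and whether libgomp or the patch works this way is an assumption.

```c
/* Sketch only: the pool_* names are hypothetical, not libgomp internals. */
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a runtime's record of how many idle workers it has cached. */
static int pool_nthreads = 0;
static int pool_handler_registered = 0;

/* Runs in the child just before fork() returns there. Worker threads are
 * not copied by fork(), so the child must forget the stale pool. */
static void pool_reset_in_child(void)
{
    pool_nthreads = 0;
}

/* Pretend to start n workers; a real runtime would call pthread_create(). */
static void pool_start(int n)
{
    assert(n > 0);
    if (!pool_handler_registered) {
        /* No prepare or parent handler: the parent's behavior is unchanged. */
        pthread_atfork(NULL, NULL, pool_reset_in_child);
        pool_handler_registered = 1;
    }
    pool_nthreads = n;
}

/* Fork, and report (via the exit status) the pool size the child saw.
 * Returns -1 on error. */
static int pool_size_seen_by_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(pool_nthreads);   /* child: exit status carries the value */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

After calling pool_start(4) in a quiescent parent, pool_size_seen_by_child() reports what the child observed, while pool_nthreads in the parent keeps its value. A real runtime would additionally have to reinitialize any locks and per-thread state the child inherits.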