Date: Wed, 10 May 2006 22:46:00 -0000
From: "tobias dot burnus at physik dot fu-berlin dot de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug other/27541] Support Cluster OpenMP (distributed-memory OpenMP)
Message-ID: <20060510224628.8872.qmail@sourceware.org>

------- Comment #2 from tobias dot burnus at physik dot fu-berlin dot de  2006-05-10 22:46 -------
> Is there really a standard for this or just an extension of OpenMP?

I'm not sure whether I understand the question. Directive-wise it is
OpenMP augmented by a single directive, "sharable". This seems to be a
single-vendor extension by Intel: I think they simply took the OpenMP
standard and looked at how to implement it for distributed memory.

The problem in implementing it is, as far as I understand it, that
OpenMP, plainly speaking, assumes everything is global, whereas
distributed programs want to minimize 'global' (better: shared)
variables, since they have to be synchronized and are thus expensive.
Intel added "sharable" to mark such a variable explicitly; some
variables, however, are marked sharable automatically.

Thus: syntax-wise it is a rather small change that could be implemented
quickly in GOMP (a small example follows at the end of this mail). But
the library system behind it is a bigger task:
- a wrapper main() which initializes the helper library, reads the
  settings, and starts the program on m computers (incl. forking of n
  threads) via rsh/ssh
- synchronization (barriers, data exchange, collection of data, etc.)
  via TCP or DAPL, especially for the global variables
A rough sketch of the launcher part is also at the end of this mail.

> Though this is useful a little bit for the Cell where you don't
> really have a distributed machine but the memory will be distributed
> though.

Well, I'm primarily interested in running a number-crunching program
(exciting.sf.net) on more nodes. As it is parallelized only with OpenMP
(and not, e.g., with MPI), I'm currently limited to 2 CPUs (or one
dual-core) on our cluster. Using Cluster OpenMP with InfiniBand I could
use 2*120 CPUs.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27541
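
For illustration, a minimal sketch of what the directive looks like in
C, going by my reading of Intel's documentation (the "#pragma intel omp
sharable" spelling is Intel's extension, not standard OpenMP; GOMP as
it is today would just warn about an unknown pragma):

#include <stdio.h>

/* In the Cluster OpenMP model this variable would live in the
   distributed shared memory, so every node sees its updates. */
static double sum;
#pragma intel omp sharable(sum)   /* Intel extension, not standard OpenMP */

int main(void)
{
    sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < 1000000; i++)
        sum += 1.0 / (1.0 + i);
    printf("sum = %f\n", sum);
    return 0;
}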
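
And a rough, purely hypothetical sketch of the launcher side of such a
helper library (the node names and the worker path are made up; the
real runtime would additionally have to set up the sharable-memory
segments and the TCP/DAPL connections before the workers start):

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Made-up node list; a real launcher would read it from a config file. */
static const char *nodes[] = { "node01", "node02", "node03" };

int main(void)
{
    size_t n = sizeof nodes / sizeof nodes[0];
    for (size_t i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: start the worker binary on the remote node. */
            execlp("ssh", "ssh", nodes[i], "/path/to/worker", (char *)NULL);
            perror("execlp");   /* only reached if ssh could not be run */
            _exit(127);
        }
    }
    /* Wait until every remote worker has exited. */
    while (wait(NULL) > 0)
        ;
    return 0;
}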