From: Rhys Ulerich
Date: Thu, 13 Dec 2012 16:44:00 -0000
Subject: Re: Adding OpenMP support for some of the GSL functions
To: Maxime Boissonneault
Cc: gsl-discuss@sourceware.org

> This feels like you're getting a small
> memory/cache bandwidth increase for the rkf45_apply level-1-BLAS-like
> operations by using multiple cores but the cores are otherwise not
> being used effectively. I say this because a state vector 1e6 doubles
> long will not generally fit in cache. Adding more cores increases the
> amount of cache available.

Hmm...
I tentatively take this back after re-thinking how you've added the #pragma omp lines to the rkf45.c file you attached elsewhere in this thread. Try opening a single #pragma omp parallel region and then using an individual #pragma omp for at each for loop inside it. Using #pragma omp parallel for repeatedly, as you've done, opens and closes a separate parallel region at every loop, which, depending on your compiler, can add unnecessary overhead each time.

- Rhys
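
A minimal sketch of the single-region pattern suggested above. The function two_stage_update and its arguments are hypothetical stand-ins for a pair of level-1-BLAS-like loops; this is not the actual rkf45.c code, and whether it is measurably faster will depend on the compiler and runtime.

```c
#include <stddef.h>

/* Hypothetical example, not taken from GSL's rkf45.c: two vector
 * updates that share one parallel region instead of each loop
 * opening its own with "#pragma omp parallel for". */
void two_stage_update(size_t n, double h,
                      const double *k1, const double *k2,
                      const double *y, double *ytmp, double *yerr)
{
    /* The parallel region is entered once for both loops. */
    #pragma omp parallel
    {
        /* Work-sharing only: the iterations are divided among the
         * threads already in the region. */
        #pragma omp for
        for (long i = 0; i < (long) n; i++)
            ytmp[i] = y[i] + h * k1[i];

        /* The "omp for" above ends with an implicit barrier, so every
         * thread has finished the first loop before starting this one. */
        #pragma omp for
        for (long i = 0; i < (long) n; i++)
            yerr[i] = h * (k1[i] - k2[i]);
    }
}
```

With GCC this would be built with -fopenmp. Without that flag the pragmas are ignored and the loops simply run serially, which makes it easy to compare the two builds.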