On 05/08/2015 10:40 AM, Thomas Schwinge wrote:
> Hi!
>
> On Thu, 7 May 2015 13:39:40 +0200, Jakub Jelinek wrote:
>> On Thu, May 07, 2015 at 01:26:57PM +0200, Rainer Orth wrote:
>>> As reported in the PR, with the addition of all those OpenACC tests,
>>> libgomp make check times have skyrocketed since the testsuite is still
>>> run sequentially.
>
> ACK.  And, thanks for looking into that!
>
>>> Fixing this proved trivial: I managed to almost literally copy the
>>> solution from libstdc++-v3/testsuite/Makefile.am, with a minimal change
>>> to libgomp.exp so the generated libgomp-test-support.exp file is found
>>> in both the sequential and parallel cases.  This isn't an issue in
>>> libstdc++ since all necessary variables are stored in a single
>>> site.exp.
>>
>> It is far from trivial though.
>> The point is that most of the OpenMP tests are parallelized with the
>> default OMP_NUM_THREADS, so running the tests in parallel oversubscribes the
>> machine a lot, the higher number of hw threads the more.
>
> Do you agree that we have two classes of test cases in libgomp: 1) test
> cases that don't place a considerably higher load on the machine compared
> to "normal" (single-threaded) execution tests, because they're just
> testing some functionality that is not expected to actively depend
> on/interfere with parallelism.  If needed, and/or if not already done,
> such test cases can be parameterized (OMP_NUM_THREADS, OpenACC num_gangs,
> num_workers, vector_length clauses, and so on) for low parallelism
> levels.  And, 2) test cases that place a considerably higher load on the
> machine compared to "normal" (single-threaded) execution tests, because
> they're testing some functionality that actively depends on/interferes
> with some kind of parallelism.  What about marking such tests specially,
> such that DejaGnu will only ever schedule one of them for execution at
> the same time?
> For example, a new dg-* directive to run them wrapped
> through »flock [libgomp/testsuite/serial.lock] [a.out]« or some such?

Looks like the thread got stuck.  Anyway, I've just noticed how slow the
libgomp.exp tests are on a recent Intel machine with 160 HT cores.  I'm
attaching a graph of CPU utilization and a 'ps ax | grep expect' log file
that shows which tests are running.  Roughly 10 minutes in, I see a drop in
utilization, and from then on libgomp.exp runs mainly serially.

So I believe splitting the tests in libgomp.exp into serial and parallel
ones would make sense.  Another idea would be to override OMP_NUM_THREADS
with a reasonable number, which would also allow the parallel tests to be
executed in parallel?

Thanks,
Martin

>
>> If we go forward with some parallelization of the tests, we at least should
>> try to export something like OMP_WAIT_POLICY=passive so that the
>> oversubscribed machine would at least not spend too much time in spinning.
>
> (Will again have the problem that DejaGnu doesn't provide infrastructure
> to communicate environment variables to boards in remote testing.)
>
>> And perhaps reconsider running all OpenACC threads 3 times, just allow
>> user to select which offloading target they want to test (host fallback,
>> the host nonshm hack, PTX, XeonPHI in the future?), and test just that
>> (that is pretty much how OpenMP offloading testing works).
>
> My rationale is: if you configure GCC to support a set of offloading
> devices (more than one), you'll also want to get the test coverage that
> indeed all these work as expected.  (It currently doesn't matter, but...)
> that's something I'd like to see improved in the libgomp OpenMP
> offloading testing (once it supports more than one architecture for
> offloading).
>
>> For tests that
>> always want to test host fallback, I hope OpenACC offers clauses to force
>> the host fallback.
>
> Yes.
>
>
> Grüße,
> Thomas
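
P.S. To make the two ideas discussed in this thread a bit more concrete
(serializing the "heavy" tests behind a lock, and capping
OMP_NUM_THREADS / setting OMP_WAIT_POLICY=passive for everything else),
here is a rough Python sketch.  It is only an illustration, not DejaGnu
code and not a proposed patch: the helper names build_env and run_test,
the lock path argument, and the default of 4 threads are all made up
for this example.  It relies on fcntl.flock, so it assumes a POSIX host.

```python
import fcntl
import os
import subprocess


def build_env(num_threads, base=None):
    """Copy the environment and cap the OpenMP settings (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(num_threads)  # assumed cap, not a libgomp default
    env["OMP_WAIT_POLICY"] = "passive"         # avoid spinning when oversubscribed
    return env


def run_test(cmd, lock_path=None, num_threads=4):
    """Run cmd and return its exit status.

    If lock_path is given, hold an exclusive lock on that file while the
    test runs, so that at most one such test executes at a time.
    """
    env = build_env(num_threads)
    if lock_path is None:
        return subprocess.run(cmd, env=env).returncode
    # Append mode creates the lock file if needed without truncating it.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            return subprocess.run(cmd, env=env).returncode
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A "class 2" test would then be started with something like
run_test(["./a.out"], lock_path="serial.lock"), while "class 1" tests
would omit lock_path and could be scheduled in parallel.  Whether a
capped thread count still exercises what the heavy tests are meant to
exercise is a separate question that this sketch does not answer.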