Subject: Need some Unix and /bin/sh expertise for GCC testsuite
From: Laurent GUERBY
To: gcc, Paolo Bonzini
Cc: Arnaud Charlet, Eric Botcazou
Date: Fri, 14 Aug 2009 21:17:00 -0000
Message-Id: <1250283155.20287.116.camel@localhost>

Hi,

Even after the last patch I can still see random ACATS failures on a
stock Debian etch x86_64 machine (gcc13).

I've added many traces to the ACATS script and I can now see a common
pattern, and it's not related to Ada multithreading or wrong code
generation.

First, the ACATS script itself is relatively straightforward: loop over
the tests, copy some files, call gnatmake, then run the compiled
test and check the output.

The issue comes from a /bin/sh behaviour that is surprising to me. If a
/bin/sh expert could help me figure out the following:

=> gcc/testsuite/ada/acats/run_all.sh <<
#!/bin/sh
# Run ACATS with the GNU Ada compiler
...
target_gnatmake () {
  echo gnatmake --GCC=\"$GCC\" $gnatflags $gccflags $* -largs $EXTERNAL_OBJECTS --GCC=\"$GCC\"
  gnatmake --GCC="$GCC" $gnatflags $gccflags $* -largs $EXTERNAL_OBJECTS --GCC="$GCC"
}
...
while ...
...
   target_gnatmake $extraflags -I$dir/support $main >> $dir/acats.log 2>&1
   if [ $? -ne 0 ]; then
      display "FAIL: $i"
      failed="${failed}${i} "
      clean_dir
      continue
   fi

   echo "RUN $binmain" >> $dir/acats.log
   cd $dir/run
   ZSTAMP=none
   if [ ! -x $dir/tests/$chapter/$i/$binmain ]; then
      sync
      ZSTAMP=$(date '+%Y%m%dT%H%M%S')
      ls -l $dir/tests/$chapter/$i/ > /home/guerby/tmp/acats/postsync-${i}-${ZSTAMP} 2>&1
      ps fauxwwwww > /home/guerby/tmp/acats/psfauxw1-${i}-${ZSTAMP} 2>&1
   fi
   target_run $dir/tests/$chapter/$i/$binmain > $dir/tests/$chapter/$i/${i}.log 2>&1
...
>>

Now the common failure pattern is as follows:

1/ target_gnatmake succeeds, that is, we don't enter the first "if".

2/ However, even though gnatmake has succeeded, we enter the second "if"
because there's no executable in the directory, as shown by the "ls -l"
output:

=> postsync-c48005b-20090813T202815 <<
total 44
-rw-r--r-- 1 guerby guerby 10345 2009-08-13 20:28 b~c48005b.adb
-rw-r--r-- 1 guerby guerby 12375 2009-08-13 20:28 b~c48005b.ads
-rw-r--r-- 1 guerby guerby  2786 2009-08-13 20:28 c48005b.adb
-rw-r--r-- 1 guerby guerby   784 2009-08-13 20:28 c48005b.ali
-rw-r--r-- 1 guerby guerby    12 2009-08-13 20:28 c48005b.lst
-rw-r--r-- 1 guerby guerby  3208 2009-08-13 20:28 c48005b.o
>>

3/ Here is the point I find surprising: the "ps fauxww" run in the
second "if" shows that, even though the script is fully sequential, at
least one gnatmake subprocess (collect-ld) is still marked as running
*in parallel* with the ps command in the subsequent "if" of the script!

=> psfauxw1-c48005b-20090813T202815 <<
...
guerby    7715  1.3  0.0  12176  1936 ?  SN  20:20  0:06  \_ /bin/sh /home/guerby/trunk/gcc/testsuite/ada/acats/run_all.sh
guerby    7794  0.0  0.0  10796  2476 ?  SN  20:28  0:00      \_ gnatmake --GCC=/home/guerby/build/gcc/xgcc -B/home/guerby/build/gcc/ -gnatws -O2 -I/home/guerby/build/gcc/testsuite/ada/acats/support c48005b.adb -largs --GCC=/home/guerby/build/gcc/xgcc -B/home/guerby/build/gcc/
guerby    7803  0.0  0.0   4048  1228 ?  SN  20:28  0:00      |   \_ /home/guerby/build/gcc/gnatlink c48005b.ali --GCC=/home/guerby/build/gcc/xgcc -B/home/guerby/build/gcc/
guerby    7809  0.0  0.0   2880   584 ?  SN  20:28  0:00      |       \_ /home/guerby/build/gcc/xgcc b~c48005b.o ... -o c48005b ...
guerby    7810  0.0  0.0   2756   444 ?  SN  20:28  0:00      |           \_ /home/guerby/build/gcc/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o c48005b ...
guerby    7811  0.0  0.0  11548  1500 ?  RN  20:28  0:00      |               \_ /bin/sh /home/guerby/build/gcc/collect-ld --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o c48005b ....
guerby    7808  0.0  0.0  14328  1156 ?  RN  20:28  0:00      \_ ps fauxwwwww
...
>>

4/ We run the executable, but since it's not there we get:

=> c48005b.log <<
/home/guerby/trunk/gcc/testsuite/ada/acats/run_all.sh: line 16: /home/guerby/build/gcc/testsuite/ada/acats/tests/c4/c48005b/c48005b: Permission denied
>>

5/ After the run, an empty file appears in another "ls -l" (not shown in
the script above):

-rw-r--r-- 1 guerby guerby 0 2009-08-13 20:28 c48005b

6/ After waiting one more second ("sleep 1", not shown above), the full
file appears at last in "ls -l":

-rwxr-xr-x 1 guerby guerby 1164960 2009-08-13 20:28 c48005b

Any idea why /bin/sh is running things in parallel instead of
sequentially? Could some code in gnatmake/gnatlink/xgcc/collect2/collect-ld
cause it?

guerby@gcc13:~$ /bin/sh --version
GNU bash, version 3.1.17(1)-release (x86_64-pc-linux-gnu)

Thanks in advance,

Laurent
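
Until the root cause is understood, steps 5/ and 6/ suggest a possible stopgap: poll for the executable for a bounded time before running it. The sketch below is only an illustration. The helper name wait_for_executable and its retry count are hypothetical and not part of run_all.sh, and it assumes that the execute bit is set only once the linker has finished writing the file, which the "ls -l" listings above hint at but do not prove. It hides the symptom and does not explain why gnatmake is still running.

```shell
#!/bin/sh
# Hypothetical helper, not part of run_all.sh.
# Poll until file "$1" exists and is executable, sleeping one second
# between checks, and give up after "$2" checks (default: 5).
# Returns 0 if the file became executable, non-zero otherwise.
wait_for_executable () {
  wfe_file=$1
  wfe_tries=${2:-5}
  while [ "$wfe_tries" -gt 0 ]; do
    if [ -x "$wfe_file" ]; then
      return 0
    fi
    sleep 1
    wfe_tries=$((wfe_tries - 1))
  done
  # One last check after the final sleep.
  [ -x "$wfe_file" ]
}
```

In the loop of run_all.sh it could be called just before target_run, for example as `wait_for_executable $dir/tests/$chapter/$i/$binmain 5`, with a non-zero status logged so that the underlying race stays visible in acats.log.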