From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <frysk-return-2355-listarch-frysk=sources.redhat.com@sourceware.org>
Received: (qmail 22909 invoked by alias); 7 Sep 2007 14:30:08 -0000
Received: (qmail 22901 invoked by uid 22791); 7 Sep 2007 14:30:08 -0000
X-Spam-Status: No, hits=-0.1 required=5.0 	tests=AWL,BAYES_00,DK_POLICY_SIGNSOME,UNPARSEABLE_RELAY
X-Spam-Check-By: sourceware.org
Received: from rgminet01.oracle.com (HELO rgminet01.oracle.com) (148.87.113.118)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Fri, 07 Sep 2007 14:30:01 +0000
Received: from agmgw2.us.oracle.com (agmgw2.us.oracle.com [152.68.180.213]) 	by rgminet01.oracle.com (Switch-3.2.4/Switch-3.1.6) with ESMTP id l87ETfar023587; 	Fri, 7 Sep 2007 08:29:43 -0600
Received: from acsmt351.oracle.com (acsmt351.oracle.com [141.146.40.151]) 	by agmgw2.us.oracle.com (Switch-3.2.0/Switch-3.2.0) with ESMTP id l87ETden026301; 	Fri, 7 Sep 2007 08:29:40 -0600
Received: from alchar.org by acsmt352.oracle.com 	with ESMTP id 3193449631189175308; Fri, 07 Sep 2007 07:28:28 -0700
Date: Fri, 07 Sep 2007 14:30:00 -0000
From: Kris Van Hees <kris.van.hees@oracle.com>
To: Andrew Cagney <cagney@redhat.com>
Cc: Kris Van Hees <kris.van.hees@oracle.com>, frysk@sourceware.org
Subject: Re: Patch for TearDownProcess
Message-ID: <20070907142827.GL22263@oracle.com>
References: <20070907022100.GK22263@oracle.com> <46E15274.5020400@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <46E15274.5020400@redhat.com>
User-Agent: Mutt/1.5.16 (2007-06-09)
X-Brightmail-Tracker: AAAAAQAAAAI=
X-Whitelist: TRUE
X-IsSubscribed: yes
Mailing-List: contact frysk-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <frysk.sourceware.org>
List-Subscribe: <mailto:frysk-subscribe@sourceware.org>
List-Post: <mailto:frysk@sourceware.org>
List-Help: <mailto:frysk-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: frysk-owner@sourceware.org
X-SW-Source: 2007-q3/txt/msg00390.txt.bz2

On Fri, Sep 07, 2007 at 09:30:28AM -0400, Andrew Cagney wrote:
> Kris Van Hees wrote:
>> The attached patch *does* open the possibility that a stray process may
>> remain after tearDown() has been executed if e.g. a process termination
>> causes another process to be created, etc...  However, that will not cause
>> a testsuite hang.  It is also not expected to pose a problem with the
>> further execution of the testsuite (in fact, it is very likely to get
>> cleaned up during the tearDown() of the next test).
>>   
> Unfortunately this isn't theoretical.  On a slower f5 machine, this happens
> consistently, invalidating test results.

The information you added to ticket 4996 doesn't quite show anything that
indicates that this is a problem with my patch.  In the future, could you
at least add output after a -c FINEST run or something, to ensure there is
relevant information there to help work out a fix?

> Given the choices between a potential test-suite hang, and tear-down
> leaving waitpid events around, the decision was made in favor of the
> latter.  That decision hasn't changed.  Having the test run hang is a
> lesser evil than the test-suite continuing but producing bogus results.
> I restored the old behavior, and then added a timeout.  It currently logs a
> message; I suspect it should abandon the test run, since the problem state
> hasn't gone away.

Obviously, there can be a difference of opinion on this matter, and I believe
that reverting this patch without any consideration for the problem it aims to
solve is useless.  You deliberately restored a problem behaviour.  Why?

It would have been more constructive to open a bug about the problem you have
encountered, and have that problem resolved, so that in the end we have a
fully working testsuite that doesn't need a tradeoff between hanging and
stray waitpid events.

Finally, given that you use FC5 as your reference platform here, perhaps you
could add it to the automated test system (i.e. have it submit results there)
so that test coverage can be extended to that release as well (especially since
it seems to uncover some problems that other systems do not - assuming of
course that no kernel issues are involved).

	Cheers,
	Kris