From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21018 invoked by alias); 17 Aug 2015 13:28:30 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 20357 invoked by uid 89); 17 Aug 2015 13:28:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.2 required=5.0 tests=AWL,BAYES_50,KAM_LAZY_DOMAIN_SECURITY,RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 X-HELO: mail-ob0-f182.google.com Received: from mail-ob0-f182.google.com (HELO mail-ob0-f182.google.com) (209.85.214.182) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Mon, 17 Aug 2015 13:28:27 +0000 Received: by obnw1 with SMTP id w1so111521367obn.3 for ; Mon, 17 Aug 2015 06:28:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=UyYkJJIqMP+j6J7V6wJxeF2Wx6nIy9wvjOn0w0SOxG0=; b=f6RnymErDc4T+8DzqKRss/UF665h7XB+XtU7uYL3X5iBGvXfgE+gIcx4Kw0qke3QqQ qD2pUAKb9yeDb3pVLq0dLZbTO9mOxKZaekRZ3a9ohy7IfPX8lqJtQpR6M39gQ1kUlBgL IHdpIrbaHqawrsCFqV95ZQ0h8Yx6OGGbucYCrftaGy/FgsEfgga6DnkJ88jZL2AQZ2iq 89z5vnHu/ySLYBAlHshO2Nl8em3LfUkaDfTZ3W1tI45kxtPxQFmwHiirSDfyH4Ju7ZFH WnkXt+vNlKIm5rRQrI2MDAz6GdCWI1dgH/RmOX4US6YRysGJIqUncvizNnd+xYmWyprm 9Ytw== X-Gm-Message-State: ALoCoQluTIP5QxoWYe2fnrCRS8mwsQ/O2Wnl4fp+Y3Mt5B0fvsvv5OtIE1Cw9iGoGm3YmHJI6owS X-Received: by 10.60.133.50 with SMTP id oz18mr1169689oeb.64.1439818105695; Mon, 17 Aug 2015 06:28:25 -0700 (PDT) MIME-Version: 1.0 Received: by 10.182.56.202 with HTTP; Mon, 17 Aug 2015 06:28:06 -0700 (PDT) In-Reply-To: References: <1433878062-23560-1-git-send-email-patrick@parcs.ath.cx> <1434466413-28892-1-git-send-email-patrick@parcs.ath.cx> <87mvym67i5.fsf_-_@redhat.com> <87zj1un7ro.fsf@redhat.com> <55CD27AE.2090002@redhat.com> From: Patrick Palka Date: Mon, 17 Aug 2015 13:28:00 -0000 Message-ID: Subject: Re: Racy failures on gdb.base/gdbinit-history.exp (native-extended-gdbserver/-m64) To: Pedro Alves Cc: Sergio Durigan Junior , "gdb-patches@sourceware.org" Content-Type: text/plain; charset=UTF-8 X-SW-Source: 2015-08/txt/msg00420.txt.bz2 On Mon, Aug 17, 2015 at 9:15 AM, Patrick Palka wrote: > On Thu, Aug 13, 2015 at 7:26 PM, Pedro Alves wrote: >> On 08/13/2015 11:27 PM, Sergio Durigan Junior wrote: >>> On Thursday, August 13 2015, Patrick Palka wrote: >>> >>>> On Thu, Jul 23, 2015 at 2:42 PM, Sergio Durigan Junior >>>> wrote: >>>>> On Tuesday, June 16 2015, Patrick Palka wrote: >>>>> >>>>>> We still do not handle "set history size unlimited" correctly. In >>>>>> particular, after writing to the history file, we truncate the history >>>>>> even if it is unlimited. >>>>>> >>>>>> This patch makes sure that we do not call history_truncate_file() if the >>>>>> history is not stifled (i.e. if it's unlimited). This bug causes the >>>>>> history file to be truncated to zero on exit when one has "set history >>>>>> size unlimited" in their gdbinit file. Although this code exists in GDB >>>>>> 7.8, the bug is masked by a pre-existing bug that's been only fixed in >>>>>> GDB 7.9 (PR gdb/17820). >>>>> >>>>> Hey Patrick, >>>>> >>>>> Looking at the BuildBot logs today, I found that this new test is >>>>> failing occasionally on native-extended-gdbserver testing. Take a look >>>>> at the following build: >>>>> >>>>> >>>>> >>>>> You can see that gdb.base/gdbinit-history.exp failed: >>>>> >>>>> PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands >>>>> PASS -> FAIL: gdb.base/gdbinit-history.exp: truncation: creating: server show commands >>>>> >>>>> The gdb.log is here: >>>>> >>>>> >>>>> >>>>> I haven't really investigated to determine what's going on here, but let >>>>> me know if you need any help with this. >>>> >>>> Hi Sergio, >>>> >>>> Have you seen these spurious FAILs pop up recently? I wonder if the >>>> fixes to SIGTERM handling I made a few weeks ago may have fixed this >>>> as well. >>> >>> Hey Patrick, >>> >>> Yeah, I still see those FAIL's from time to time: >>> >>> >>> >>> >>> I may be wrong, but I noticed that the FAIL's happen only on the PPC >>> buildslaves. >>> >> >> Here's a theory: >> >> (gdb) server show commands >> 1 set height 0 >> 2 set width 0 >> 3 target extended-remote localhost:2348 >> 4 print 1 >> 5 monitor exit >> 6 set height 0 >> 7 set width 0 >> 8 target extended-remote localhost:2349 >> (gdb) PASS: gdb.base/gdbinit-history.exp: truncation: creating: server show commands >> Remote debugging from host 127.0.0.1 >> monitor exit >> (gdb) spawn /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/gdb-buildbot/fedora >> -ppc64le-1/fedora-ppc64le-native-extended-gdbserver/build/gdb/testsuite/../data-directory -x /home/gdb-buildbot/fedora-ppc64le-1/fedora-ppc64le-native-extended-gdbserver/bu >> ild/gdb/testsuite/outputs/gdb.base/gdbinit-history/gdbinit-history.gdbinit -ex set auto-connect-native-target off >> GNU gdb (GDB) 7.10.50.20150812-cvs >> ... >> (gdb) set height 0 >> (gdb) set width 0 >> (gdb) spawn ../gdbserver/gdbserver --once --multi :2350 >> Listening on port 2350 >> target extended-remote localhost:2350 >> Remote debugging using localhost:2350 >> (gdb) server show commands >> 1 set height 0 >> 2 set width 0 >> 3 target extended-remote localhost:2350 >> (gdb) FAIL: gdb.base/gdbinit-history.exp: truncation: appending: server show commands >> >> Observations: >> >> #1 - In the FAILing case, the history only shows commands invoked in that >> gdb session. That means that either that GDB invocation failed to >> load the history file, or the previous invocation failed to write it. >> >> #2 - dejagnu's standard_close, what is ultimately used to exit gdb, >> does this: >> >> 2.1 - send gdb a SIGINT. This might help by interrupting any command >> that gdb might be stuck in, getting it back to >> the event loop processing input. >> >> 2.2 - close gdb's ptty, in effect closing gdb's stdin/stdout. >> >> 2.3 - if gdb doesn't exit after 5 seconds, kills it >> with SIGTERM. If it still doesn't exit after another 5 >> seconds, dejagnu kill it with SIGKILL. >> >> Unless you're running dejagnu git master, 2.1 and 2.2 happen >> in parallel. >> >> I'm assuming that 2.3 is not happening. It's 2.2 that normally >> causes gdb to exit. See event-top.c:stdin_event_handler. >> >> - Because, due to 2.2, stdin/stdout are already closed when gdb is >> running the teardown code (quit_force, what saves history), any >> warning/output that gdb might output goes nowhere, it won't >> appear in gdb.log. >> >> #3 - If you'll recall, not long ago we made the code that appends >> the commands to the history file roughly: >> >> - mv old-history-file old-history-file.tmp >> - append to old-history-file.tmp >> - mv old-history-file.tmp old-history-file >> >> That is top.c:gdb_safe_append_history. >> >> AFAICS, SIGINT is not set to SA_RESTART. Now imagine what happens >> if the SIGINT lands exactly just when gdb is about to call >> rename here, in gdb_safe_append_history: >> >> ret = rename (local_history_filename, history_filename); >> saved_errno = errno; >> if (ret < 0 && saved_errno != EEXIST) >> warning (_("Could not rename %s to %s: %s"), >> local_history_filename, history_filename, >> safe_strerror (saved_errno)); >> } >> >> and rename fails with EINTR. Then the following gdb invocation >> would not find any previous history file to load, which would >> perfectly explain that FAIL. >> >> If we still have the build/test directory of that test run, >> if the theory is right, I think we'd find a >> leftover gdbinit-history.gdb_history-gdb-$PID~ file. >> >> What doesn't look right in the theory is that the rename man >> does not document rename as failing with EINTR... But >> maybe it does? > > Interesting theory. I had assumed that the problem lies in the > testsuite driver, not in the GDB source code, because the problem only > manifests itself under native-extended-gdbserver. If I understand > correctly, your explanation is also plausible under native or > native-gdbserver, yet the problem was not observed to ever manifest > itself under these targets. > > Also, if rename fails with EINTR, wouldn't a "Could not rename %s to > %s: %s" warning be emitted in the gdb output log? Ah, you already addressed this: a warning is not emitted because stdout is closed.. But because the problem only occurs under extended-gdbserver, I'm inclined to think the issue is with the testsuite driver, in particular with the gdb_exit implementation in lib/gdbserver-support.exp. One potential issue I notice in this proc is that when we send "monitor exit" to GDB, we don't necessarily wait for the command to finish (i.e. for the gdb prompt to get printed). As soon as the server is observed to get killed, we continue with exiting. Dunno if that's substantial..