From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <55BBC636.40705@redhat.com>
Date: Fri, 31 Jul 2015 19:02:00 -0000
From: Pedro Alves
To: Don Breazeal, gdb-patches@sourceware.org
Subject: Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues
References: <1438362229-27653-1-git-send-email-palves@redhat.com> <1438362229-27653-3-git-send-email-palves@redhat.com> <55BBB89B.8020101@codesourcery.com>
In-Reply-To: <55BBB89B.8020101@codesourcery.com>
X-SW-Source: 2015-07/txt/msg00954.txt.bz2

On 07/31/2015 07:04 PM, Don Breazeal wrote:
> On 7/31/2015 10:03 AM, Pedro Alves wrote:
>> Ref:
>> https://sourceware.org/ml/gdb-patches/2015-07/msg00868.html
>>
>> This adds a test that has a multithreaded program have several threads
>> continuously fork, while another thread continuously steps over a
>> breakpoint.
>
> Wow.
> If gdb survives these stress tests, it can hold up to anything. :-)

>> - The test runs with both "set detach-on-fork" on and off.  When off,
>>   it exercises the case of GDB detaching the fork child explicitly.
>>   When on, it exercises the case of gdb resuming the child
>>   explicitly.  In the "off" case, gdb seems to become exponentially
>>   slower as new inferiors are created.  This is _very_ noticeable, as
>>   with only 100 inferiors gdb is already crawling, which makes the
>>   test take quite a while to run.  For that reason, I've disabled the
>>   "off" variant for now.
>
> Bummer.  I was going to ask whether this use-case justifies disabling
> the feature completely,

Note that, this being a stress test, it may not be representative of a
real workload.  I'm assuming most real use cases won't be so demanding.

> but since the whole follow-fork mechanism is of
> limited usefulness without exec events, the question is likely moot
> anyway.

Yeah.  There are use cases for fork alone, but combined with exec it is
much more useful.  I'll take a look at your exec patches soon; I'm very
much looking forward to having that in.

> Do you have any thoughts about whether this slowdown is caused by the
> fork event machinery or by some more general gdbserver multiple
> inferior problem?

Not sure.  The number of forks live at any given time in the test is
constant -- each thread forks and waits for the child to exit before it
forks again.  But if you run the test, you see that the first few
inferiors are created quickly, and then, as the inferior number grows,
new inferiors are added at a slower and slower rate.

I'd suspect the problem to be on the gdb side.  But the test fails on
native, so it's not easy to get gdbserver out of the picture for a
quick check.
It feels like some data structures are leaking, but are still reachable,
and then a bunch of linear walks end up costing more and more.  I once
added the prune_inferiors call at the end of normal_stop to handle a
slowdown like this.  It feels like something similar to that.

With detach "on" alone, it takes under 2 seconds against gdbserver for
me.  If I remove the breakpoint from the test and reenable both detach
on/off, it ends in around 10-20 seconds.  That's still a lot slower than
"detach on" alone, but gdb has to insert/remove breakpoints in the child
and load its symbols (well, it could avoid that, given the child is a
clone of the parent, but we're not there yet), so that's not entirely
unexpected.  But pristine, with both detach on/off, it takes almost 2
minutes here.  (And each thread only spawns 10 forks; my first attempt
was shooting for 100. :-) )

I also suspected all the thread stopping/restarting gdbserver does, both
to step over breakpoints and to insert/remove breakpoints.  But then
again, with detach on there are 12 threads, and with detach off at most
22, so that'd be odd.  Unless the data structure leaks are on
gdbserver's side.  But then I'd think that tests like
attach-many-short-lived-threads.exp or non-stop-fair-events.exp would
have already exposed something like that.

> Are you planning to look at the slowdown?

Nope, at least not in the immediate future.

> Can I help out?  I have an
> interest in having detach-on-fork 'off' enabled. :-S

That'd be much appreciated. :-)  At least identifying the culprit would
be very nice.  I too would love for our multi-process support to be
rock solid.

Thanks,
Pedro Alves