From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <55BBC636.40705@redhat.com>
Date: Fri, 31 Jul 2015 19:02:00 -0000
From: Pedro Alves
To: Don Breazeal, gdb-patches@sourceware.org
Subject: Re: [PATCH/7.10 2/2] gdbserver: Fix non-stop / fork / step-over issues
References: <1438362229-27653-1-git-send-email-palves@redhat.com> <1438362229-27653-3-git-send-email-palves@redhat.com> <55BBB89B.8020101@codesourcery.com>
In-Reply-To: <55BBB89B.8020101@codesourcery.com>
X-SW-Source: 2015-07/txt/msg00954.txt.bz2

On 07/31/2015 07:04 PM, Don Breazeal wrote:
> On 7/31/2015 10:03 AM, Pedro Alves wrote:
>> Ref:
>> https://sourceware.org/ml/gdb-patches/2015-07/msg00868.html
>>
>> This adds a test that has a multithreaded program have several threads
>> continuously fork, while another thread continuously steps over a
>> breakpoint.
>
> Wow.
> If gdb survives these stress tests, it can hold up to anything. :-)

>> - The test runs with both "set detach-on-fork" on and off.  When off,
>>   it exercises the case of GDB detaching the fork child explicitly.
>>   When on, it exercises the case of gdb resuming the child
>>   explicitly.  In the "off" case, gdb seems to become exponentially
>>   slower as new inferiors are created.  This is _very_ noticeable, as
>>   with only 100 inferiors gdb is already crawling, which makes the
>>   test take quite a while to run.  For that reason, I've disabled the
>>   "off" variant for now.
>
> Bummer.  I was going to ask whether this use-case justifies disabling
> the feature completely,

Note that, this being a stress test, it may not be representative of a
real workload.  I'm assuming most real use cases won't be so demanding.

> but since the whole follow-fork mechanism is of
> limited usefulness without exec events, the question is likely moot
> anyway.

Yeah.  There are use cases for fork alone, but combined with exec it is
much more useful.  I'll take a look at your exec patches soon; I'm very
much looking forward to having that in.

> Do you have any thoughts about whether this slowdown is caused by the
> fork event machinery or by some more general gdbserver multiple
> inferior problem?

Not sure.  The number of forks live at any given time in the test is
constant -- each thread forks and waits for the child to exit before it
forks again.  But if you run the test, you see that the first few
inferiors are created quickly, and then, as the inferior number grows,
new inferiors are added at a slower and slower rate.

I'd suspect the problem to be on the gdb side.  But the test fails on
native, so it's not easy to get gdbserver out of the picture for a
quick check.
It feels like some data structures are leaking, but are still reachable,
and then a bunch of linear walks end up costing more and more.  I once
added the prune_inferiors call at the end of normal_stop to handle a
slowdown like this.  It feels like something similar to that.

With detach "on" alone, it takes under 2 seconds against gdbserver for
me.  If I remove the breakpoint from the test and reenable both detach
on/off, it ends in around 10-20 seconds.  That's still a lot slower than
"detach on" alone, but gdb has to insert/remove breakpoints in the child
and load its symbols (well, it could avoid that, given the child is a
clone of the parent, but we're not there yet), so that's not entirely
unexpected.  But pristine, with both detach on/off, it takes almost 2
minutes here.  (And each thread only spawns 10 forks; my first attempt
was shooting for 100. :-) )

I also suspected all the thread stopping/restarting gdbserver does, both
to step over breakpoints and to insert/remove breakpoints.  But then
again, with detach on there are 12 threads, and with detach off at most
22, so that'd be odd.  Unless the data structure leaks are on
gdbserver's side.  But then I'd think that tests like
attach-many-short-lived-threads.exp or non-stop-fair-events.exp would
have already exposed something like that.

> Are you planning to look at the slowdown?

Nope, at least not in the immediate future.

> Can I help out?  I have an
> interest in having detach-on-fork 'off' enabled. :-S

That'd be much appreciated. :-)  At least identifying the culprit would
be very nice.  I too would love for our multi-process support to be
rock solid.

Thanks,
Pedro Alves