From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 8E8353858D32 for ; Mon, 16 Jan 2023 17:22:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8E8353858D32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673889728; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=LTyPVPORu483IKaeLGoOj39XeasO1DL1L6jVqKDKJk8=; b=NzIojAfS6SXna1PUfKInZ1TpjHLq71SHpw3S2vWxNZeCOJLWFCmWrh3ghBKtV/Yvtzrg/z g/V5xa3nx4L7rhoQU97hp5Un97iX2uDzyKb2QjM7Ffhfn1w4Osr+oaHyBll90vsLMeWVK4 EtDb7dvZrd2hCP4Yah7N2Bs3QO3/eIE= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-551-Gl8uqv4-PrKkUO1vtxT41w-1; Mon, 16 Jan 2023 12:22:06 -0500 X-MC-Unique: Gl8uqv4-PrKkUO1vtxT41w-1 Received: by mail-wm1-f69.google.com with SMTP id z22-20020a05600c0a1600b003db00dc4b69so910066wmp.5 for ; Mon, 16 Jan 2023 09:22:06 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=mime-version:message-id:date:references:in-reply-to:subject:cc:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=LTyPVPORu483IKaeLGoOj39XeasO1DL1L6jVqKDKJk8=; b=sPIh2X/uifF5DRaUBRfOx3+YUgh35VoCK2lS7I+UGY6NHVHuu2RUf3H22g578pLWhQ hDtZ1RDqEut33NzPDCe0DiF8RtIEk6IRvtFdpBndIh3XxVuFN22DtPvcdhwFQyXPKtx5 0mUhjASAu7HSnJ0NDkiiL0vzy1epdhy2HYyySLB4UiyDGqqcMYCMdY4W3JR20DYx0B5H uQf9DEXJJ0ZeiKvTMBoTpr9hWBknMyMCfOwPyG61n/rQvLEaqAbxG95NNGkI3sVTVjTf t1m9irSGY6oCVm78KJVMvtVjmCzZA+x10e01X8gLDO9e4o7fXvsVAenaRMZoO2Lqt8Ol MLxA== X-Gm-Message-State: AFqh2kqCgv2xofVn1FjOaAiOiX3i2PXarAT2gRxMxpO9Qm8imkBqCtbw Sxe2yun2i5W+cfVsuJPuNRrPOs/DvKuMf/mZ1vd/Bm0oAkt2agHLhfd5Ov2brdXnAS9WTIVMc8+ VvUNgvVdrUGCGac4las1i8g== X-Received: by 2002:a05:600c:181b:b0:3da:ff1f:e8d3 with SMTP id n27-20020a05600c181b00b003daff1fe8d3mr260354wmp.15.1673889725613; Mon, 16 Jan 2023 09:22:05 -0800 (PST) X-Google-Smtp-Source: AMrXdXt0afcfs/y2i1LLEXr1Ql3LiFZYpkzKAmsx6xGoAg/QOP7OanXo3rwBedO+k0g375MnaU/ikw== X-Received: by 2002:a05:600c:181b:b0:3da:ff1f:e8d3 with SMTP id n27-20020a05600c181b00b003daff1fe8d3mr260340wmp.15.1673889725295; Mon, 16 Jan 2023 09:22:05 -0800 (PST) Received: from localhost (95.72.115.87.dyn.plus.net. [87.115.72.95]) by smtp.gmail.com with ESMTPSA id m18-20020a05600c4f5200b003c71358a42dsm51943350wmq.18.2023.01.16.09.22.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Jan 2023 09:22:04 -0800 (PST) From: Andrew Burgess To: Eli Zaretskii Cc: gdb-patches@sourceware.org Subject: Re: [PATCH 11/12] gdb: add timeouts for inferior function calls In-Reply-To: <83wn5p72w2.fsf@gnu.org> References: <83a65pwi1h.fsf@gnu.org> <87ilh9fkj3.fsf@redhat.com> <83wn5p72w2.fsf@gnu.org> Date: Mon, 16 Jan 2023 17:22:04 +0000 Message-ID: <87cz7efl8j.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-5.7 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP,WEIRD_PORT autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Eli Zaretskii writes: >> From: Andrew Burgess >> Cc: gdb-patches@sourceware.org >> Date: Sat, 14 Jan 2023 11:00:32 +0000 >> >> >> +@table @code >> >> +@item set direct-call-timeout @var{seconds} >> >> +@kindex set direct-call-timeout >> >> +@cindex timeout for called functions >> >> +Set the timeout used when calling functions in the program to >> >> +@var{seconds}, which should be an integer greater than zero, or the >> >> +special value @code{unlimited}, which indicates no timeout should be >> >> +used. The default for this setting is @code{unlimited}. >> > >> > Why integer in seconds? don't we want to be able to support shorter >> > timeouts, like 100 msec? Most inferior calls should take much less >> > than a second, so a second resolution is not the best idea, IMO. It >> > could, for example, make running a program with such a breakpoint >> > unbearably slow. >> >> Remember, this timeout is a safety net intended to catch situations >> where either due to a bug in the inferior, or due to user error, the >> breakpoint condition is never going to complete. >> >> As such I don't think we should be trying to trim this value down as low >> as possible, I just don't see any additional value, in fact, I see doing >> so adding more risk that the user will hit invalid timeouts. >> >> If you wanted to be super aggressive with the timeout, and you set it to >> 1 second, but you still expect your inferior calls to complete in >> 100msec, then the timeout being 1 second will not slow you down in any >> way. But, if for some reason your inferior call deadlocks, you'll end >> up waiting that complete second before GDB gives up. >> >> I can't imagine a use case where a user will be debugging by having an >> inferior function call repeatedly timeout, and that's the only way that >> having such a large timeout would be unbearable > > Suppose we have an inferior call as part of the breakpoint command > conditions, and suppose that breakpoint is hit very frequently. What > do we want to happen in this case? Do we want: > > . GDB to silently keep calling the inferior and waiting for it to time out? > . GDB to announce the timeout and stop the first time it happens? > . something else? > > I thought we wanted the 1st alternative, in which case waiting for a > second could make the run very slow, since an inferior call should > take, like, microseconds in almost all cases? > > But if we intend GDB to stop and not continue in this case, then yes, > 1 sec could be appropriate. I guess my attempt at documenting this is not great :/ We are getting option #2, GDB will stop and report the timeout, here's how it will look to the user: (gdb) set indirect-call-timeout 1 (gdb) break breakpt() if (deadlock() == 0) Breakpoint 1 at 0x401169: file test.cc, line 35. (gdb) r Starting program: /tmp/inf-call-timeout/test Program stopped. 0x00007ffff7b3d1e7 in nanosleep () from /lib64/libc.so.6 Error in testing condition for breakpoint 1: timeout waiting for inferior function to complete An error occurred while in a function called from GDB. Evaluation of the expression containing the function (deadlock()) will be abandoned. When the function is done executing, GDB will silently stop. (gdb) bt #0 0x00007ffff7b3d1e7 in nanosleep () from /lib64/libc.so.6 #1 0x00007ffff7b3d11e in sleep () from /lib64/libc.so.6 #2 0x000000000040113f in deadlock_inner () at test.cc:13 #3 0x000000000040114e in deadlock () at test.cc:20 #4 #5 breakpt () at test.cc:35 #6 0x0000000000401175 in main () at test.cc:40 (gdb) Here's why I think this is the right behaviour: If we automatically unwound the stack once the timeout was detected, then surely the only correct choice would be for the breakpoint condition to trigger. We can't say for sure that the condition didn't trigger, so we should stop. This is similar to how if I do say: (gdb) break foo if (*some_null_pointer == 1234) Then GDB will immediately stop at this breakpoint with a message saying that it was unable to access "*some_null_pointer"; the condition caused a stop not because the condition was true, but because GDB failed to evaluate the condition. I think the same logic applies to a timeout in this case. However, for the timeout case I've left the context of the timeout on the stack, in my example above frames #0 to #4 are all related to the inferior function call. I think this is useful, if the inferior is somehow deadlocked during the inferior function call then it might be useful that the user can investigate the current state and try to figure out what's gone wrong. Hopefully this explains why I think a minimum 1 second timeout is acceptable, I wonder, with the explanation above, how you feel about this now? I'll take another pass at the documentation, and try to make it clearer, obviously I didn't do a good job of explaining what I'm trying to do. Thanks, Andrew