From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <simon.marchi@polymtl.ca>
Received: from smtp.polymtl.ca (smtp.polymtl.ca [132.207.4.11])
 by sourceware.org (Postfix) with ESMTPS id C28DD383F411
 for <gdb-patches@sourceware.org>; Thu,  6 May 2021 01:59:38 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C28DD383F411
Received: from simark.ca (simark.ca [158.69.221.121]) (authenticated bits=0)
 by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id 1461xW4Y008153
 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); 
 Wed, 5 May 2021 21:59:37 -0400
DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca 1461xW4Y008153
Received: from [10.0.0.11] (192-222-157-6.qc.cable.ebox.net [192.222.157.6])
 (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by simark.ca (Postfix) with ESMTPSA id A2C071E813;
 Wed,  5 May 2021 21:59:32 -0400 (EDT)
Subject: Re: [RFC PATCH] gdb: fix eval.c assert during inferior exit event
To: Magne Hov <mhov@undo.io>, gdb-patches@sourceware.org
References: <20210505155627.3850386-1-mhov@undo.io>
From: Simon Marchi <simon.marchi@polymtl.ca>
Message-ID: <2dcd95bb-4262-4a4a-e851-472f977f15c9@polymtl.ca>
Date: Wed, 5 May 2021 21:59:32 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.7.1
MIME-Version: 1.0
In-Reply-To: <20210505155627.3850386-1-mhov@undo.io>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Poly-FromMTA: (simark.ca [158.69.221.121]) at Thu, 6 May 2021 01:59:33 +0000
X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A,
 RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Thu, 06 May 2021 01:59:41 -0000

y
On 2021-05-05 11:56 a.m., Magne Hov via Gdb-patches wrote:
> Evaluating expressions from within an inferior exit event handler can
> cause a crash:
> 
>     echo "int main() { return 0; }" > repro.cc
>     g++ -g repro.cc -o repro
>     ./gdb -q --ex "start" --ex "python gdb.events.exited.connect(lambda _: gdb.execute('set \$_a=0'))" --ex "continue" repro
> 
>     Reading symbols from repro...
>     Temporary breakpoint 1 at 0x1131: file repro.cc, line 1.
>     Starting program: /home/mhov/repos/binutils-gdb-master/install/bin/repro
> 
>     Temporary breakpoint 1, main () at repro.cc:1
>     1	int main() { return 0; }
>     Continuing.
>     [Inferior 1 (process 75524) exited normally]
>     ../../gdb/thread.c:72: internal-error: thread_info* inferior_thread(): Assertion `current_thread_ != nullptr' failed.
>     A problem internal to GDB has been detected,
>     further debugging may prove unreliable.
>     Quit this debugging session? (y or n) [answered Y; input not from terminal]
> 
>     This is a bug, please report it.  For instructions, see:
>     <https://www.gnu.org/software/gdb/bugs/>.
> 
> Backtrace
>     0  in internal_error of ../../gdbsupport/errors.cc:51
>     1  in inferior_thread of ../../gdb/thread.c:72
>     2  in expression::evaluate of ../../gdb/eval.c:98
>     3  in evaluate_expression of ../../gdb/eval.c:115
>     4  in set_command of ../../gdb/printcmd.c:1502
>     5  in do_const_cfunc of ../../gdb/cli/cli-decode.c:101
>     6  in cmd_func of ../../gdb/cli/cli-decode.c:2181
>     7  in execute_command of ../../gdb/top.c:670
>     ...
>     22 in python_inferior_exit of ../../gdb/python/py-inferior.c:182
> 
> This is the same assertion site as in
> <https://sourceware.org/bugzilla/show_bug.cgi?id=26761>, but I'm not
> sure if it is the same problem.

I don't think so, this assertion in inferior_thread (trying to get the
current thread while there is not current thread) is kind of a generic
symptom, but the root cause can be any thing.

> 
> In `expression::evaluate (...)' there is a call to `inferior_thread
> ()' that is guarded by `target_has_execution ()':
> 
>     struct value *
>     expression::evaluate (struct type *expect_type, enum noside noside)
>     {
>       gdb::optional<enable_thread_stack_temporaries> stack_temporaries;
>       if (target_has_execution ()
>           && language_defn->la_language == language_cplus
>           && !thread_stack_temporaries_enabled_p (inferior_thread ()))
>         stack_temporaries.emplace (inferior_thread ());
> 
> The `target_has_execution ()' guard maps onto `inf->pid' and the
> `inferior_thread ()' call assumes that `current_thread_' is set to
> something meaningful:
> 
>     struct thread_info*
>     inferior_thread (void)
>     {
>       gdb_assert (current_thread_ != nullptr);
>       return current_thread_;
>     }
> 
> In other words, it is assumed that if `inf->pid' is set then
> `current_thread_' must also be set. This does not hold at the point
> where inferior exit observers are notified:
> - `generic_mourn_inferior (...)'
>   - `switch_to_no_thread ()'
>     - `current_thread_ = nullptr;'
>   - `exit_inferior (...)'
>     - `gdb::observers::inferior_exit.notify (...)'
>     - `inf->pid = 0'
> 
> The inferior exit notification means that a Python handler can get a
> chance to run while `current_thread' has been cleared and the
> `inf->pid' has not been cleared. Since the Python handler can call any
> GDB command with `gdb.execute(...)' (in my case `gdb.execute("set
> $_a=0")' we can end evaluating expressions and asserting in
> `evaluate_subexp (...)'.
> 
> This patch adds a test in `evaluate_subexp (...)' to check the global
> `inferior_ptid' which is reset at the same time as `current_thread_'.
> Checking `inferior_ptid' at the same time as `target_has_execution ()'
> seems to be a common pattern:
> 
>     $ git grep -n -e inferior_ptid --and -e target_has_execution
>     gdb/breakpoint.c:2998:    && (inferior_ptid == null_ptid || !target_has_execution ()))
>     gdb/breakpoint.c:3054:    && (inferior_ptid == null_ptid || !target_has_execution ()))
>     gdb/breakpoint.c:4587:  if (inferior_ptid == null_ptid || !target_has_execution ())
>     gdb/infcmd.c:360:  if (inferior_ptid != null_ptid && target_has_execution ())
>     gdb/infcmd.c:2380:  /* FIXME:  This should not really be inferior_ptid (or target_has_execution).
>     gdb/infrun.c:3438:  if (!target_has_execution () || inferior_ptid == null_ptid)
>     gdb/remote.c:11961:  if (!target_has_execution () || inferior_ptid == null_ptid)
>     gdb/solib.c:725:  if (target_has_execution () && inferior_ptid != null_ptid)
> 
> The testsuite has been run on 5.4.0-59-generic x86_64 GNU/Linux:
> - Ubuntu 20.04.1 LTS
> - gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> - DejaGnu version 1.6.2
>   - Expect version 5.45.4
>   - Tcl version 8.6
> - Native configuration: x86_64-pc-linux-gnu
> - Target: unix
> 
> Diffing results before and after the patch shows a few XFAIL in
> gdb.threads/attach-many-short-lived-threads.exp, and one unexpected
> failure in gdb.base/run-attach-while-running.exp which seems to be
> flaky on master as well.

First, let me say that your commit message is fantastic.

I do agree with your analysis and the fix.  inferior_ptid is always
supposed to be in sync with current_thread_.

> diff --git a/gdb/testsuite/gdb.python/py-events.exp b/gdb/testsuite/gdb.python/py-events.exp
> index e89cd8b021b..bec3c9f2cab 100644
> --- a/gdb/testsuite/gdb.python/py-events.exp
> +++ b/gdb/testsuite/gdb.python/py-events.exp
> @@ -197,18 +197,33 @@ gdb_test_multiple "continue" $test {
>  gdb_test_no_output "delete $second_breakpoint"
>  
>  #test exited event.
> +proc get_process_id {test} {
> +    global gdb_prompt
> +    global process_id
> +    gdb_test_multiple "info proc" $test {
> +	-re "process (\\d+).*$gdb_prompt $" {
> +	    set process_id $expect_out(1,string)
> +	    pass $test
> +	}
> +    }
> +}

Insted of a proc setting a global variable (process_id), I would prefer
if the proc returned the value.  And then you could do:

set process_id [get_process_id]

> +
> +get_process_id "get inferior 1 process id"
>  gdb_test "continue" ".*event type: continue.*
>  .*clear_objfiles\[\r\n\]*progspace: .*py-events.*
>  .*event type: exit.*
>  .*exit code: 12.*
>  .*exit inf: 1.*
> +.*exit pid: $process_id.*
>  dir ok: True.*" "Inferior 1 terminated."
>  
>  gdb_test "inferior 2" ".*Switching to inferior 2.*"
> +get_process_id "get inferior 2 process id"
>  gdb_test "continue" ".*event type: continue.*

I get:

    DUPLICATE: gdb.python/py-events.exp: continue

We try to avoid duplicate test names, to make it easier to track
down failures.  A handy way to make test names unique is to use
with_test_prefix:

with_test_prefix "inferior 1" {
    ...
}

with_test_prefix "inferior 2" {
    .
}

That will add "inferior 1: " and "inferior 2: " to the test names of
whatever is within these { }.

>  .*event type: exit.*
>  .*exit code: 12.*
>  .*exit inf: 2.*
> +.*exit pid: $process_id.*
>  dir ok: True.*" "Inferior 2 terminated."
>  
>  
> @@ -235,3 +250,27 @@ gdb_test "python print(count)" 2 "check for before_prompt event"
>  
>  gdb_test_no_output "xxz" "run a canned sequence"
>  gdb_test "python print(count)" 4 "check for before_prompt event again"
> +
> +# Test evaluating expressions from within an inferior exit event handler. This used to cause a crash
> +# when C++ expression were evaluated.

Wrap the comment to 80 columns, and use two spaces after the period
(this is a common rule in this codebase).

> +standard_testfile .cc

Is C++ really required?  I was able to reproduce the bug with your
reproducer, but compiled as C.  If we can re-use the existing test
program for this test (just let it run to completion), it would be
simpler and preferable.

> +
> +if {[prepare_for_testing "failed to prepare" $testfile $srcfile]} {
> +    return -1
> +}
> +
> +gdb_test_multiline "add exited listener" \
> +    "python" "" \
> +    "def increment_foo(_):" "" \
> +    "  gdb.execute('set \$_foo=\$_foo+1')" "" \
> +    "gdb.events.exited.connect(increment_foo)" "" \
> +    "end" ""
> +gdb_test "set \$_foo=0" "" "initialize foo variable"
> +gdb_test "print \$_foo" "= 0" "check foo initialized"
> +gdb_test "run" "exited normally.*"
> +gdb_test "print \$_foo" "= 1" "check foo after run"
> +gdb_test "start" "Temporary breakpoint .* main .*" "stop on a C++ frame"
> +gdb_test "continue" "exited normally.*"

We want the tests to also work when using the native-gdbserver
(gdbserver using the remote protocol) and native-extended-gdbserver
(gdbserver using the extended-remote protocol) boards.  For this reason,
it's generally not good to use "run" and "start" literally in the tests.

Instead, try to use the procedures runto_main, gdb_run_cmd,
gdb_start_cmd, etc.

Unfortunately, I can see that this test is completely skipped when
running with the native-gdbserver board.  I don't understand why, Python
events are still relevant then.  And it's broken with the
native-extended-gdbserver board.  But let's at least try to do the right
thing and use the right procedures.  It will help for when someone goes
and tries to fix that test.

> +gdb_test "print \$_foo" "= 2" "check foo after start continue"
> +
> +>>>>>>> 2d04478f2ca... gdb: check inferior_ptid before calling inferior_thread in eval.c

I guess this last line is not meant to be included.

Simon