public inbox for gdb-prs@sourceware.org
* [Bug remote/30618] New: warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
@ 2023-07-05 18:52 jonah at kichwacoders dot com
2023-07-06 22:22 ` [Bug remote/30618] " tromey at sourceware dot org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: jonah at kichwacoders dot com @ 2023-07-05 18:52 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
Bug ID: 30618
Summary: warning: while parsing threads: not well-formed
(invalid token) - in non-stop + remote mode
Product: gdb
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: remote
Assignee: unassigned at sourceware dot org
Reporter: jonah at kichwacoders dot com
Target Milestone: ---
Create an empty main function in a file whose name contains Unicode characters
and compile it with gcc. Start gdbserver and connect to it with gdb in non-stop
mode; the connection sequence fails (full log below):
(gdb) set non-stop on
(gdb) target remote :3333
Remote debugging using :3333
warning: while parsing threads: not well-formed (invalid token)
The target is not running (try extended-remote?)
With remote debugging enabled, this is the output (run in MI mode because the
characters are escaped better):
&" [remote] Sending packet: $QNonStop:1#8d\n"
&" [remote] Packet received: OK\n"
&" [remote] Sending packet: $qXfer:threads:read::0,1000#92\n"
&" [remote] Packet received: l<threads>\\n<thread id=\"p10883.10883\" core=\"8\" name=\"issue-275-\\346\\265\\213\\350\\257\"/>\\n</threads>\\n\n"
&"warning: while parsing threads: not well-formed (invalid token)\n"
&" [remote] Sending packet: $qTStatus#49\n"
&" [remote] Packet received: T0;tnotrun:0;tframes:0;tcreated:0;tfree:500000;tsize:500000;circular:0;disconn:0;starttime:0;stoptime:0;username:;notes::\n"
&" [remote] packet_ok: Packet qTStatus (trace-status) is supported\n"
&" [remote] Sending packet: $qTfV#81\n"
&" [remote] Packet received: 1:0:1:74726163655f74696d657374616d70\n"
&" [remote] Sending packet: $qTsV#8e\n"
&" [remote] Packet received: l\n"
=tsv-created,name="trace_timestamp",initial="0"
&" [remote] Sending packet: $?#3f\n"
&" [remote] Packet received: T0506:0000000000000000;07:90daffffff7f0000;10:b032fef7ff7f0000;thread:p10883.10883;core:8;\n"
&" [remote] Sending packet: $vStopped#55\n"
&" [remote] Packet received: OK\n"
&"[remote] start_remote_1: exit\n"
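The warning text in the log is the error string of the expat XML parser, which GDB uses for XML replies. As a minimal sketch (not GDB's code), the same message can be reproduced by feeding expat a payload modeled on the `qXfer:threads:read` reply above. The leading `l` of the packet is left out here, and `parse_error` is a hypothetical helper name.

```python
import xml.parsers.expat

# Payload modeled on the reply in the log above. The name attribute ends with
# only the first two bytes (0xE8 0xAF) of a three-byte UTF-8 sequence.
payload = (
    b'<threads>\n'
    b'<thread id="p10883.10883" core="8" '
    b'name="issue-275-\xe6\xb5\x8b\xe8\xaf"/>\n'
    b'</threads>\n'
)


def parse_error(data: bytes):
    """Return expat's error message for data, or None if it parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        return xml.parsers.expat.ErrorString(exc.code)
    return None


print(parse_error(payload))  # not well-formed (invalid token)
```

With the name cut at a character boundary instead (for example, ending in `\xe6\xb5\x8b`), the same helper returns `None`.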
Here is the source and versions I am using:
$ cat src/integration-tests/test-programs/issue-275-测试.c
int main(int argc, char *argv[])
{
    return 0;
}
$ gcc -o src/integration-tests/test-programs/issue-275-测试 -g src/integration-tests/test-programs/issue-275-测试.c
$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gdb --version
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
In case Bugzilla corrupts the encoding: 测试 means "test"
(https://translate.google.ca/?sl=auto&tl=en&text=%E6%B5%8B%E8%AF%95&op=translate)
and is encoded in UTF-8 as \xe6\xb5\x8b\xe8\xaf\x95 or \346\265\213\350\257\225
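For reference, a short sketch that derives both escaped forms from the string itself. The variable names are illustrative only.

```python
name = "测试"
raw = name.encode("utf-8")  # six bytes: two characters, three bytes each

# Render every byte as a hex escape and as a three-digit octal escape.
hex_form = "".join(f"\\x{b:02x}" for b in raw)
octal_form = "".join(f"\\{b:03o}" for b in raw)

print(hex_form)    # \xe6\xb5\x8b\xe8\xaf\x95
print(octal_form)  # \346\265\213\350\257\225
```

Compared with the octal form printed here, the name in the remote log stops at `\257`: the final `\225` byte is missing.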
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug remote/30618] warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
From: tromey at sourceware dot org @ 2023-07-06 22:22 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
Tom Tromey <tromey at sourceware dot org> changed:
What             |Removed      |Added
----------------------------------------------------------------------------
Status           |UNCONFIRMED  |NEW
Ever confirmed   |0            |1
Last reconfirmed |             |2023-07-06
CC               |             |tromey at sourceware dot org
--- Comment #1 from Tom Tromey <tromey at sourceware dot org> ---
I debugged this a little, and the issue is that the Linux kernel
truncates the thread name in the 'comm' file to 16 bytes, counting
the trailing NUL. This truncates the final
character in the name -- yielding an invalid UTF-8 sequence, which
gdbserver dutifully passes back to gdb.
I am not sure how to handle this.
One idea is to convert all non-ASCII characters to hex.
Or just drop them.
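A minimal sketch of the truncation and of the two ideas above, assuming the kernel cuts the name byte-wise at 15 bytes plus the trailing NUL. `TASK_COMM_LEN` mirrors the kernel constant; the function names are hypothetical and this is not gdbserver code.

```python
TASK_COMM_LEN = 16  # Linux limit: 15 bytes of name plus the trailing NUL


def truncate_comm(name: str) -> bytes:
    """Model a byte-wise truncation that ignores character boundaries."""
    return name.encode("utf-8")[: TASK_COMM_LEN - 1]


def hex_escape(raw: bytes) -> str:
    """Idea 1: keep ASCII and write every other byte as a hex escape."""
    return "".join(chr(b) if b < 0x80 else f"\\x{b:02x}" for b in raw)


def drop_non_ascii(raw: bytes) -> str:
    """Idea 2: keep ASCII and drop every other byte."""
    return "".join(chr(b) for b in raw if b < 0x80)


comm = truncate_comm("issue-275-测试")
print(comm)  # b'issue-275-\xe6\xb5\x8b\xe8\xaf'

try:
    comm.decode("utf-8")
except UnicodeDecodeError as exc:
    print("invalid UTF-8:", exc.reason)

print(hex_escape(comm))      # issue-275-\xe6\xb5\x8b\xe8\xaf
print(drop_non_ascii(comm))  # issue-275-
```

The name is 16 bytes long (10 ASCII bytes plus two 3-byte characters), so the cut removes only the last byte of the second character.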
* [Bug remote/30618] warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
From: tromey at sourceware dot org @ 2023-07-13 16:20 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
--- Comment #2 from Tom Tromey <tromey at sourceware dot org> ---
Since this is Linux-specific we could probably just rely
directly on iconv here -- iconv the 'comm' contents to
UTF-8 and drop / substitute anything that gives an error.
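The drop / substitute behaviour can be sketched as follows. Python's codec error handlers stand in for iconv here, which is an assumption made for illustration; `to_utf8` and its parameters are hypothetical names.

```python
def to_utf8(raw: bytes, charset: str = "utf-8", substitute: bool = False) -> bytes:
    """Convert raw bytes in charset to valid UTF-8.

    Undecodable sequences are dropped by default, or replaced with
    U+FFFD when substitute is true.
    """
    policy = "replace" if substitute else "ignore"
    return raw.decode(charset, errors=policy).encode("utf-8")


# Thread name as it appears in the remote log: the last character is cut short.
comm = b"issue-275-\xe6\xb5\x8b\xe8\xaf"

print(to_utf8(comm).decode("utf-8"))  # issue-275-测
print(to_utf8(comm, substitute=True).decode("utf-8"))
```

Either policy yields output that decodes as UTF-8 again; they differ only in whether the damaged tail leaves a visible marker.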
* [Bug remote/30618] warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
From: tromey at sourceware dot org @ 2023-07-13 21:26 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
--- Comment #3 from Tom Tromey <tromey at sourceware dot org> ---
One other issue here is knowing the correct encoding to use.
gdb itself can pass in target_charset().
I guess gdbserver could use the prevailing encoding from the locale.
I wonder if we even care about non-ASCII characters here.
What if we substitute '?' for those instead?
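A sketch of the ASCII-only variant. Because it works on bytes, it needs no knowledge of the source encoding; `ascii_only` is a hypothetical name.

```python
def ascii_only(raw: bytes) -> str:
    """Replace every non-ASCII byte with '?', whatever the source encoding."""
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)


# Thread name as it appears in the remote log.
print(ascii_only(b"issue-275-\xe6\xb5\x8b\xe8\xaf"))  # issue-275-?????
```

The substitution is per byte, so one multi-byte character produces several question marks.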
* [Bug remote/30618] warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
From: tromey at sourceware dot org @ 2023-07-17 20:48 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
--- Comment #4 from Tom Tromey <tromey at sourceware dot org> ---
https://sourceware.org/pipermail/gdb-patches/2023-July/200971.html
* [Bug remote/30618] warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
From: jonah at kichwacoders dot com @ 2023-07-19 17:40 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
--- Comment #5 from Jonah Graham <jonah at kichwacoders dot com> ---
> This truncates the final
> character in the name -- yielding an invalid UTF-8 sequence, which
> gdbserver dutifully passes back to gdb.
Thanks Tom - with this explanation I was able to craft my test in
cdt-gdb-adapter, where I am trying to improve Unicode support, so that it
avoids this bug: https://github.com/eclipse-cdt-cloud/cdt-gdb-adapter/pull/276.
* [Bug remote/30618] warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
From: cvs-commit at gcc dot gnu.org @ 2023-11-14 16:14 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
--- Comment #6 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom Tromey <tromey@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=07b3255c3bae7126a0d679f957788560351eb236
commit 07b3255c3bae7126a0d679f957788560351eb236
Author: Tom Tromey <tom@tromey.com>
Date: Thu Jul 13 17:28:48 2023 -0600
Filter invalid encodings from Linux thread names
On Linux, a thread name can only be 16 bytes (including the trailing \0).
A user sent in a test case where this causes a truncated UTF-8
sequence, causing gdbserver to create invalid XML.
I went back and forth about different ways to solve this, and in the
end decided to fix it in gdbserver, with the reason being that it
seems important to generate correct XML for the <thread> response.
I am not totally sure whether the call to setlocale could have
unplanned consequences. This is needed, though, for nl_langinfo to
return the correct result.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30618
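As a rough illustration of fixing this on the producing side (a sketch, not the code from the commit): filter the name before it is written into the `<thread>` element, then confirm that the result parses. The `charset` parameter stands in for whatever the locale reports, and all function names are hypothetical.

```python
import xml.parsers.expat
from xml.sax.saxutils import quoteattr


def thread_element(ptid: str, core: int, raw_name: bytes, charset: str = "utf-8") -> bytes:
    """Build one <thread> element, dropping bytes that do not decode."""
    name = raw_name.decode(charset, errors="ignore")
    text = f'<thread id={quoteattr(ptid)} core="{core}" name={quoteattr(name)}/>'
    return text.encode("utf-8")


def is_well_formed(doc: bytes) -> bool:
    """Return True if expat accepts doc as a complete document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(doc, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Truncated thread name, as seen in the remote log.
comm = b"issue-275-\xe6\xb5\x8b\xe8\xaf"
doc = b"<threads>\n" + thread_element("p10883.10883", 8, comm) + b"\n</threads>\n"

print(is_well_formed(doc))  # True
```

`quoteattr` also escapes characters such as `&` and `<`, so a name containing XML metacharacters stays well-formed too.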
* [Bug remote/30618] warning: while parsing threads: not well-formed (invalid token) - in non-stop + remote mode
From: tromey at sourceware dot org @ 2023-11-15 13:53 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=30618
Tom Tromey <tromey at sourceware dot org> changed:
What             |Removed |Added
----------------------------------------------------------------------------
Status           |NEW     |RESOLVED
Target Milestone |---     |15.1
Resolution       |---     |FIXED
--- Comment #7 from Tom Tromey <tromey at sourceware dot org> ---
Fixed.