public inbox for gdb-prs@sourceware.org
* [Bug rust/20164] fix string printing in rust
       [not found] <bug-20164-4717@http.sourceware.org/bugzilla/>
@ 2022-02-13 18:03 ` tromey at sourceware dot org
  2022-02-17 23:15 ` tromey at sourceware dot org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 8+ messages in thread
From: tromey at sourceware dot org @ 2022-02-13 18:03 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

Tom Tromey <tromey at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at sourceware dot org   |tromey at sourceware dot org

--- Comment #2 from Tom Tromey <tromey at sourceware dot org> ---
Working on the escapes now.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug rust/20164] fix string printing in rust
From: tromey at sourceware dot org @ 2022-02-17 23:15 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

--- Comment #3 from Tom Tromey <tromey at sourceware dot org> ---
I wonder if char printing should use unicode-ish syntax like:

(gdb) print 'x'
$1 = U+0078 'x'

Not sure.


* [Bug rust/20164] fix string printing in rust
From: mark at klomp dot org @ 2022-02-17 23:45 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mark at klomp dot org

--- Comment #4 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Tom Tromey from comment #3)
> I wonder if char printing should use unicode-ish syntax like:
> 
> (gdb) print 'x'
> $1 = U+0078 'x'
> 
> Not sure.

I like it.

Note that the convention is to pad with leading zeros when there are fewer
than 4 digits, but not when there are 5 or 6 digits.
https://www.unicode.org/versions/Unicode14.0.0/appA.pdf
"Leading zeros are omitted, unless the code point would have fewer than four
hexadecimal digits — for example, U+0001, U+0012, U+0123, U+1234, U+12345,
U+102345."
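
The padding rule quoted above corresponds to a minimum-width hexadecimal
format. A minimal Rust sketch, assuming a hypothetical helper name
(format_code_point is illustrative only, not gdb code):

```rust
/// Hypothetical helper, not part of gdb: render a code point in "U+"
/// notation. The `{:04X}` format pads with zeros up to four uppercase hex
/// digits and leaves longer values unpadded.
fn format_code_point(value: u32) -> String {
    format!("U+{:04X}", value)
}
```

With this sketch, 0x78 renders as U+0078 and 0x12345 as U+12345, matching
the examples in the quoted text.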

I am not sure how to flag the "illegal" char values, U+D800 to U+DFFF and
anything larger than U+10FFFF. But it would be good to have some indicator you
are dealing with an invalid value.


* [Bug rust/20164] fix string printing in rust
From: tromey at sourceware dot org @ 2022-02-17 23:53 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

--- Comment #5 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Mark Wielaard from comment #4)

> I am not sure how to flag the "illegal" char values, U+D800 to U+DFFF and
> anything larger than U+10FFFF. But it would be good to have some indicator
> you are dealing with an invalid value.

You'll already get a hex escape in this situation:

(gdb) p '\u{d800}'
$2 = 55296 '\u{00d800}'

I guess the escapes sort of make the "U+..." idea a bit weird.
It would just be printing the same info twice.


* [Bug rust/20164] fix string printing in rust
From: mark at klomp dot org @ 2022-02-18  0:10 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

--- Comment #6 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Tom Tromey from comment #5)
> (In reply to Mark Wielaard from comment #4)
> 
> > I am not sure how to flag the "illegal" char values, U+D800 to U+DFFF and
> > anything larger than U+10FFFF. But it would be good to have some indicator
> > you are dealing with an invalid value.
> 
> You'll already get a hex escape in this situation:
> 
> (gdb) p '\u{d800}'
> $2 = 55296 '\u{00d800}'

But isn't that escape "value" itself invalid?
Shouldn't it be something like

$2 = 55296 '???'

So: print U+xxxxxx 'unicode-char' for valid char values, and the raw number
followed by '???' otherwise.


* [Bug rust/20164] fix string printing in rust
From: tromey at sourceware dot org @ 2022-02-18  0:39 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

--- Comment #7 from Tom Tromey <tromey at sourceware dot org> ---
Yeah, I see that rustc rejects a program using that.
We could print something like

$2 = U+D800 <invalid Unicode character>

... at least if the invalid ranges aren't a pain to figure out.
Is it really just surrogates and values > 0x10ffff?  I didn't look yet.

There are still characters that print as escapes, for example U+200C.
But maybe double-printing these oddities isn't so bad.


* [Bug rust/20164] fix string printing in rust
From: mark at klomp dot org @ 2022-02-18  1:04 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

--- Comment #8 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Tom Tromey from comment #7)
> Yeah, I see that rustc rejects a program using that.
> We could print something like
> 
> $2 = U+D800 <invalid Unicode character>

I would like that.

> ... at least if the invalid ranges aren't a pain to figure out.
> Is it really just surrogates and values > 0x10ffff?  I didn't look yet.

It is, see https://doc.rust-lang.org/reference/types/textual.html
"A value of type char is a Unicode scalar value (i.e. a code point that is not
a surrogate), represented as a 32-bit unsigned word in the 0x0000 to 0xD7FF or
0xE000 to 0x10FFFF range. It is immediate Undefined Behavior to create a char
that falls outside this range."
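
The two ranges in that definition can be checked directly. A minimal Rust
sketch of the check, together with the output style floated in this thread;
the function names are hypothetical, and this is not how gdb implements it:

```rust
/// Hypothetical helper: true if `value` is a Unicode scalar value, i.e. in
/// 0x0000..=0xD7FF or 0xE000..=0x10FFFF (no surrogates, nothing above
/// 0x10FFFF).
fn is_unicode_scalar(value: u32) -> bool {
    matches!(value, 0..=0xD7FF | 0xE000..=0x10FFFF)
}

/// Hypothetical sketch of the proposed display. `char::from_u32` returns
/// `None` for exactly the values rejected by `is_unicode_scalar`.
/// Simplification: valid characters are printed as-is, without escaping.
fn describe(value: u32) -> String {
    match char::from_u32(value) {
        Some(c) => format!("U+{:04X} '{}'", value, c),
        None => format!("U+{:04X} <invalid Unicode character>", value),
    }
}
```

Under these assumptions, describe(0x78) gives U+0078 'x' and describe(0xD800)
gives U+D800 <invalid Unicode character>.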

> There are still characters that print as escapes, for example U+200C.
> But maybe double-printing these oddities isn't so bad.

I don't think it is that odd. You could also use ascii escapes
(https://doc.rust-lang.org/reference/tokens.html#ascii-escapes) when possible.


* [Bug rust/20164] fix string printing in rust
From: tromey at sourceware dot org @ 2022-02-18  1:16 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=20164

--- Comment #9 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Mark Wielaard from comment #8)
> I don't think it is that odd. You could also use ascii escapes
> (https://doc.rust-lang.org/reference/tokens.html#ascii-escapes) when
> possible.

That's already done, though I see on that page that \x escapes only go up
to 0x7f for char types, so there's a little bug in the output here:
gdb checks "if (value <= 255)".
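
A rough Rust sketch of the distinction, under the assumption that values
above 0x7f should fall back to a \u{...} escape. The function name is
hypothetical, and the six-digit padding only mirrors the '\u{00d800}' output
shown earlier in this thread; it is not taken from gdb's source:

```rust
/// Hypothetical helper, not gdb code: choose an escape form for a value.
/// A Rust `\xNN` escape in a char literal is limited to 0x7f, so the bound
/// here is 0x7F rather than 255.
fn escape_for_char_literal(value: u32) -> String {
    if value <= 0x7F {
        // Two-digit ASCII hex escape, e.g. \x7f.
        format!("\\x{:02x}", value)
    } else {
        // Unicode escape, zero-padded to six hex digits.
        format!("\\u{{{:06x}}}", value)
    }
}
```

With a bound of 255 instead, a value such as 0xff would be printed as \xff,
which rustc does not accept in a char literal.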

