public inbox for gdb-prs@sourceware.org
* [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma
@ 2024-06-06 15:10 ciaranwoodward at xmos dot com
  2024-06-06 15:10 ` [Bug gdb/31853] " ciaranwoodward at xmos dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: ciaranwoodward at xmos dot com @ 2024-06-06 15:10 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

            Bug ID: 31853
           Summary: GDB's use of iconv does not work with macOS Sonoma
           Product: gdb
           Version: HEAD
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: gdb
          Assignee: unassigned at sourceware dot org
          Reporter: ciaranwoodward at xmos dot com
  Target Milestone: ---

Created attachment 15569
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15569&action=edit
Minimal reproduction test

I noticed when using a cross-debugger on the latest macOS (Sonoma) that
printing strings would only print the first character.

For instance if there was a 'const char* s = "foobar"' then 'print s' would
print '$1 = "f"' rather than the expected '$1 = "foobar"'.

As far as I can tell, Apple silently replaced GNU libiconv with its own
implementation in Sonoma, and that implementation is the only one present on
the new system. I'm not sure what the best fix is here, since even previously
built versions of GDB (for instance, built for Ventura) hit this issue when
they are run on Sonoma (since libiconv is dynamically linked).

I have attached a test program which exhibits the issue with the iconv function
in macOS. It replicates GDB's use of iconv with the wchar_iterator. I have
also attached a patch which I applied locally to fix the issue, though it may
be a bit rough for general release.

The expected test program output (and the output on older macOS or with GNU
libiconv) is:

```
Input ASCII string: 'libiconv on Sonoma is acting strangely'
Converted UTF-32LE string (in 4 byte chunks) is 38 chars: 'libiconv on Sonoma
is acting strangely'
Converted UTF-32LE string (in 100 byte chunks) is 38 chars: 'libiconv on Sonoma
is acting strangely'
Converted wchar_t string (in 4 byte chunks) is 38 chars: 'libiconv on Sonoma is
acting strangely'
Converted wchar_t string (in 100 byte chunks) is 38 chars: 'libiconv on Sonoma
is acting strangely'
```

The actual test program output on Sonoma is:

```
Input ASCII string: 'libiconv on Sonoma is acting strangely'
Converted UTF-32LE string (in 4 byte chunks) is 38 chars: 'libiconv on Sonoma
is acting strangely'
Converted UTF-32LE string (in 100 byte chunks) is 38 chars: 'libiconv on Sonoma
is acting strangely'
Converted wchar_t string (in 4 byte chunks) is 1 chars: 'l'
Converted wchar_t string (in 100 byte chunks) is 25 chars: 'libiconv on Sonoma
is act'

```

The attached test program can be compiled with 'gcc -o test test.c -liconv'.
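
For readers without the attachment, the chunked-output pattern described above
can be sketched roughly as follows. This is not the attached test.c: the
function name count_converted_chars and the 256-byte buffer are assumptions
made for illustration, and it assumes 4-byte output characters (as with
UTF-32LE, or a 4-byte wchar_t).

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GDB code: convert the ASCII string IN to the
   encoding TO, offering iconv only CHUNK bytes of output space per call.
   Returns the number of 4-byte characters produced, or -1 on error.  */
static long
count_converted_chars (const char *to, const char *in, size_t chunk)
{
  char buf[256];
  assert (chunk >= 4 && chunk <= sizeof buf);

  iconv_t cd = iconv_open (to, "ASCII");
  if (cd == (iconv_t) -1)
    return -1;

  char *inp = (char *) in;      /* iconv does not modify the input bytes.  */
  size_t inleft = strlen (in);
  long nchars = 0;

  while (inleft > 0)
    {
      char *outp = buf;
      size_t outleft = chunk;
      size_t r = iconv (cd, &inp, &inleft, &outp, &outleft);
      size_t produced = chunk - outleft;

      nchars += (long) (produced / 4);

      /* E2BIG only means this chunk is full; any other error, or a call
         that made no progress, is treated as a failure.  */
      if (r == (size_t) -1 && (errno != E2BIG || produced == 0))
        {
          nchars = -1;
          break;
        }
    }

  iconv_close (cd);
  return nchars;
}
```

With a conforming iconv, count_converted_chars ("UTF-32LE", s, 4) returns 38
for the 38-character input above; going by the expected output listed earlier,
passing "wchar_t" as the target should give the same count.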

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
@ 2024-06-06 15:10 ` ciaranwoodward at xmos dot com
  2024-06-06 15:25 ` ciaranwoodward at xmos dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ciaranwoodward at xmos dot com @ 2024-06-06 15:10 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

--- Comment #1 from Ciaran Woodward <ciaranwoodward at xmos dot com> ---
Created attachment 15570
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15570&action=edit
Rough patch which fixes the issue


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
  2024-06-06 15:10 ` [Bug gdb/31853] " ciaranwoodward at xmos dot com
@ 2024-06-06 15:25 ` ciaranwoodward at xmos dot com
  2024-06-07 15:30 ` tromey at sourceware dot org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ciaranwoodward at xmos dot com @ 2024-06-06 15:25 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

--- Comment #2 from Ciaran Woodward <ciaranwoodward at xmos dot com> ---
I forgot to mention the specific issue. The macOS version of libiconv seems to
incorrectly handle the 'outbytesleft' parameter when the target encoding is
'wchar_t'.


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
  2024-06-06 15:10 ` [Bug gdb/31853] " ciaranwoodward at xmos dot com
  2024-06-06 15:25 ` ciaranwoodward at xmos dot com
@ 2024-06-07 15:30 ` tromey at sourceware dot org
  2024-06-07 16:18 ` ciaranwoodward at xmos dot com
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: tromey at sourceware dot org @ 2024-06-07 15:30 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

Tom Tromey <tromey at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tromey at sourceware dot org

--- Comment #3 from Tom Tromey <tromey at sourceware dot org> ---
IIUC, Sonoma:

* Does not define __STDC_ISO_10646__
* ... but does use UTF-32 as its wchar_t encoding
* ... and defines _LIBICONV_VERSION
* ... but for its own libiconv, not the GNU libiconv

Is that all correct?

I think host hacks like this are completely fine & appropriate.
However I want to make sure I am understanding correctly first.

Also, will the approach taken in your patch also work on
older versions of the OS?

For inclusion in gdb I guess all I'd really require here
is some commentary explaining the situation and conformity
to gdb styling (like, gdb doesn't use "//" comments).

Thanks for looking into this.


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
                   ` (2 preceding siblings ...)
  2024-06-07 15:30 ` tromey at sourceware dot org
@ 2024-06-07 16:18 ` ciaranwoodward at xmos dot com
  2024-06-07 21:32 ` tromey at sourceware dot org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ciaranwoodward at xmos dot com @ 2024-06-07 16:18 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

--- Comment #4 from Ciaran Woodward <ciaranwoodward at xmos dot com> ---
I wrote a quick preprocessor check to double-confirm:

> * Does not define __STDC_ISO_10646__
> * ... and defines _LIBICONV_VERSION

And both of these are correct. __STDC_ISO_10646__ is not
defined, and _LIBICONV_VERSION is defined as 0x010B (with
a comment which claims "compatible").
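
A check along these lines can be sketched as below. It is an illustration
rather than the exact check I ran; the helper names have_stdc_iso_10646 and
libiconv_version are made up for this example.

```c
#include <iconv.h>
#include <wchar.h>

/* Returns 1 if the host headers define __STDC_ISO_10646__, else 0.  */
static int
have_stdc_iso_10646 (void)
{
#ifdef __STDC_ISO_10646__
  return 1;
#else
  return 0;
#endif
}

/* Returns the value of _LIBICONV_VERSION from <iconv.h>, or -1 if the
   header does not define the macro.  */
static long
libiconv_version (void)
{
#ifdef _LIBICONV_VERSION
  return (long) _LIBICONV_VERSION;
#else
  return -1;
#endif
}
```

On the Sonoma host described here, these would be expected to report 0 and
0x010B respectively.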

> * ... but for its own libiconv, not the GNU libiconv

I can't find any official documentation stating that libiconv
was replaced, but have found references to it on online forums,
and the header is substantially different (but claims to be
compatible). Apple apparently intends this iconv to be a
'drop in' replacement, so an application built against an old
macOS SDK (for compatibility with older macOS versions) will
still have the new non-GNU libiconv linked when run on the
newer system.

> * ... but does use UTF-32 as its wchar_t encoding

I am not sure about this, nor how to check it really.
I can see the wchar_t type is 4 bytes in size, but maybe
that is different for very old versions of macOS.

The intermediate_encoding() function (in charset.c) that ends up getting
used as INTERMEDIATE_ENCODING will adjust the bit width to match the
width of wchar_t. It will use either UTF-16, UTF-32, UCS-2 or UCS-4,
depending on what iconv_open supports (determined by runtime testing).
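
The runtime probing can be sketched as below. This is a simplified
illustration, not GDB's intermediate_encoding(): the helper name
first_supported_encoding is hypothetical, and probing with "ASCII" as the
source encoding is an assumption made to keep the example short.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: return the first entry of NAMES that iconv_open
   accepts as a conversion target, or NULL if none is supported.  */
static const char *
first_supported_encoding (const char *const *names, size_t count)
{
  for (size_t i = 0; i < count; i++)
    {
      iconv_t cd = iconv_open (names[i], "ASCII");
      if (cd != (iconv_t) -1)
        {
          /* The descriptor was only needed for the probe.  */
          iconv_close (cd);
          return names[i];
        }
    }
  return NULL;
}
```

A caller would pass candidates whose width matches sizeof (wchar_t), for
example a UTF-32 name followed by a UCS-4 name on a host with a 4-byte
wchar_t.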

From my understanding, this is all fine because GDB is only using
this as an intermediate encoding, not expecting the host or target
to necessarily understand it natively.

> Also, will the approach taken in your patch also work on
> older versions of the OS?

I have tested on all of the 'currently supported' versions
of macOS and it seems to work, but I really don't understand
the platform well enough to guarantee it will work for every
possible past version.

I also reported this issue to Apple, but no guarantees they
actually acknowledge/fix it in their iconv implementation :)


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
                   ` (3 preceding siblings ...)
  2024-06-07 16:18 ` ciaranwoodward at xmos dot com
@ 2024-06-07 21:32 ` tromey at sourceware dot org
  2024-06-10 16:16 ` ciaranwoodward at xmos dot com
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: tromey at sourceware dot org @ 2024-06-07 21:32 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

--- Comment #5 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Ciaran Woodward from comment #4)

> > * ... but does use UTF-32 as its wchar_t encoding
> 
> I am not sure about this, nor how to check it really.
> I can see the wchar_t type is 4-bytes in size, but maybe
> that is different for very-old versions of macOS.

Yeah, it's maybe not easy to check definitively.
Some searching seems to indicate that macOS does use this though.

> From my understanding, this is all fine because GDB is only using
> this as an intermediate encoding, not expecting the host or target
> to necessarily understand it natively.

There are spots that use the wchar_t, for example in valprint.c.
This is done essentially to access iswprint, I think.

> > Also, will the approach taken in your patch also work on
> > older versions of the OS?
> 
> I have tested on all of the 'currently supported' versions
> of macOS and it seems to work, but I really don't understand
> the platform well enough to guarantee it will work for every
> possible past version.

I was thinking maybe we could have this check the version,
but I suppose it's better not to, so that gdb could be
built on an older macOS and still work on a newer one... ?

If that's so then I tend to think your patch is fine.


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
                   ` (4 preceding siblings ...)
  2024-06-07 21:32 ` tromey at sourceware dot org
@ 2024-06-10 16:16 ` ciaranwoodward at xmos dot com
  2024-06-11 10:30 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ciaranwoodward at xmos dot com @ 2024-06-10 16:16 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

--- Comment #6 from Ciaran Woodward <ciaranwoodward at xmos dot com> ---
> There are spots that use the wchar_t, for example in valprint.c.
> This is done essentially to access iswprint, I think.

It's true that wchar_t is used in places, but the iconv output
that was requested in INTERMEDIATE_ENCODING specifically does not
seem to be used for anything except a further conversion later on.
So I don't think that changing INTERMEDIATE_ENCODING will cause
any issues, even if it isn't what the host thinks wchar_t
should be.

(This is based on a quick search for INTERMEDIATE_ENCODING through
the codebase.)

> I was thinking maybe we could have this check the version,
> but I suppose it's better not to, so that gdb could be
> built on an older macOS and still work on a newer one... ?

This was my thought. Usually on Windows/macOS you build for the earliest
version you want to support and then expect that to still run on later
versions of the system.
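
To illustrate the trade-off: a guard keyed on the platform macro alone, rather
than on an OS version, keeps a binary built on an older macOS on the safe path
when it later runs on Sonoma. A minimal sketch follows. The helper name
trust_iconv_wchar_t is hypothetical, the non-Apple conditions are an assumption
based on the macros discussed in comment #3, and this is not the attached
patch.

```c
#include <iconv.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if this build would hand "wchar_t" to
   iconv directly, 0 if it would fall back to an explicit intermediate
   encoding such as UTF-32.  */
static int
trust_iconv_wchar_t (void)
{
#if defined (__APPLE__)
  /* Decided by platform, not by OS version: the libiconv that actually
     gets loaded depends on where the binary runs, not where it was
     built.  */
  return 0;
#elif defined (__STDC_ISO_10646__) || defined (_LIBICONV_VERSION)
  return 1;
#else
  return 0;
#endif
}
```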

I'll send a polished-up patch to the list.


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
                   ` (5 preceding siblings ...)
  2024-06-10 16:16 ` ciaranwoodward at xmos dot com
@ 2024-06-11 10:30 ` cvs-commit at gcc dot gnu.org
  2024-06-11 15:08 ` tromey at sourceware dot org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-11 10:30 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

--- Comment #7 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Ciaran Woodward <ciaran@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=bb2981798f54e6eb30e46fb11cda2ca49561ffd3

commit bb2981798f54e6eb30e46fb11cda2ca49561ffd3
Author: Ciaran Woodward <ciaranwoodward@xmos.com>
Date:   Mon Jun 10 16:52:37 2024 +0100

    Fix printing strings on macOS Sonoma

    On macOS Sonoma, printing a string would only print the first
    character. For instance, if there was a 'const char *s = "foobar"',
    then the 'print s' command would print '$1 = "f"' rather than the
    expected '$1 = "foobar"'.

    It seems that this is due to Apple silently replacing the version
    of libiconv they ship with the OS to one which silently fails to
    handle the 'outbytesleft' parameter correctly when using 'wchar_t'
    as a target encoding.

    This specifically causes issues when iterating through a
    string as wchar_iterator does.

    This bug is visible even if you build for an old version of macOS,
    but then run on Sonoma. Therefore this fix in the code applies
    generally to macOS, and is not specific to building on Sonoma. Building
    for an older version and expecting forwards compatibility is a
    common situation on macOS.

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31853


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
                   ` (6 preceding siblings ...)
  2024-06-11 10:30 ` cvs-commit at gcc dot gnu.org
@ 2024-06-11 15:08 ` tromey at sourceware dot org
  2024-06-12 10:21 ` cvs-commit at gcc dot gnu.org
  2024-06-12 12:39 ` tromey at sourceware dot org
  9 siblings, 0 replies; 11+ messages in thread
From: tromey at sourceware dot org @ 2024-06-11 15:08 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

Tom Tromey <tromey at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |16.1
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #8 from Tom Tromey <tromey at sourceware dot org> ---
Fixed.


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
                   ` (7 preceding siblings ...)
  2024-06-11 15:08 ` tromey at sourceware dot org
@ 2024-06-12 10:21 ` cvs-commit at gcc dot gnu.org
  2024-06-12 12:39 ` tromey at sourceware dot org
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-06-12 10:21 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

--- Comment #9 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The gdb-15-branch branch has been updated by Ciaran Woodward
<ciaran@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f41400ee71f8cfa40b4fb784cbba231394b9c698

commit f41400ee71f8cfa40b4fb784cbba231394b9c698
Author: Ciaran Woodward <ciaranwoodward@xmos.com>
Date:   Mon Jun 10 16:52:37 2024 +0100

    Fix printing strings on macOS Sonoma

    On macOS Sonoma, printing a string would only print the first
    character. For instance, if there was a 'const char *s = "foobar"',
    then the 'print s' command would print '$1 = "f"' rather than the
    expected '$1 = "foobar"'.

    It seems that this is due to Apple silently replacing the version
    of libiconv they ship with the OS to one which silently fails to
    handle the 'outbytesleft' parameter correctly when using 'wchar_t'
    as a target encoding.

    This specifically causes issues when iterating through a
    string as wchar_iterator does.

    This bug is visible even if you build for an old version of macOS,
    but then run on Sonoma. Therefore this fix in the code applies
    generally to macOS, and is not specific to building on Sonoma. Building
    for an older version and expecting forwards compatibility is a
    common situation on macOS.

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31853
    Approved-By: Tom Tromey <tom@tromey.com>

    (cherry picked from commit bb2981798f54e6eb30e46fb11cda2ca49561ffd3)


* [Bug gdb/31853] GDB's use of iconv does not work with macOS Sonoma
  2024-06-06 15:10 [Bug gdb/31853] New: GDB's use of iconv does not work with macOS Sonoma ciaranwoodward at xmos dot com
                   ` (8 preceding siblings ...)
  2024-06-12 10:21 ` cvs-commit at gcc dot gnu.org
@ 2024-06-12 12:39 ` tromey at sourceware dot org
  9 siblings, 0 replies; 11+ messages in thread
From: tromey at sourceware dot org @ 2024-06-12 12:39 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31853

Tom Tromey <tromey at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|16.1                        |15.1
