public inbox for cygwin@cygwin.com
* an mbrtoc32 bug
From: Bruno Haible @ 2024-05-23 20:16 UTC
  To: cygwin


In Cygwin 3.5.3, the attached program fails the assertion in line 24
(the one after the third byte, \220, has been fed to mbrtoc32):
bytes is not (size_t)-2.

How to reproduce:
$ gcc -Wall foo.c
$ ./a

I think this is a bug, because
  - ISO C 23 § 7.30.1.5 talks about "completing" a character, not
    "representing" an (entire) character.
  - The test passes on glibc, musl libc, FreeBSD 14.0, Solaris 11.4.

Bruno


[-- Attachment #2: foo.c --]
[-- Type: text/x-csrc, Size: 766 bytes --]

#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

int main ()
{
  assert (setlocale (LC_ALL, "en_US.UTF-8") != NULL);
  mbstate_t state;
  memset (&state, 0, sizeof (state));

  char32_t uc = 0xDEADBEEF;
  size_t bytes;

  /* \360\237\220\203 = U+0001F403 */
  bytes = mbrtoc32 (&uc, "\360", 1, &state);
  assert (bytes == (size_t)-2);
  bytes = mbrtoc32 (&uc, "\237", 1, &state);
  assert (bytes == (size_t)-2);
  bytes = mbrtoc32 (&uc, "\220", 1, &state);
  assert (bytes == (size_t)-2);
  bytes = mbrtoc32 (&uc, "\203", 1, &state);
  assert (bytes == 1);
  assert (uc == 0x0001F403);
}

/* Works in: glibc, musl libc, FreeBSD 14.0, Solaris 11.4
   Fails in: Cygwin 3.5.3
 */
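
The assertions in foo.c depend on how the return value of mbrtoc32 is read. As a minimal sketch of that convention as ISO C describes it (the enum and the function name classify_mbr_result are hypothetical, introduced only for illustration, and are not part of any libc):

```c
#include <stddef.h>

/* Hypothetical helper, not a libc function: names the kinds of result
   that ISO C specifies for mbrtoc32. */
enum mbr_result
{
  MBR_NUL,            /* 0: the null character was completed */
  MBR_COMPLETED,      /* 1..n: that many bytes completed a character */
  MBR_STORED_EARLIER, /* (size_t)-3: output from an earlier call, no bytes consumed */
  MBR_INCOMPLETE,     /* (size_t)-2: bytes consumed, character not yet complete */
  MBR_INVALID         /* (size_t)-1: encoding error */
};

static enum mbr_result classify_mbr_result (size_t ret)
{
  if (ret == (size_t)-1)
    return MBR_INVALID;
  if (ret == (size_t)-2)
    return MBR_INCOMPLETE;
  if (ret == (size_t)-3)
    return MBR_STORED_EARLIER;
  if (ret == 0)
    return MBR_NUL;
  return MBR_COMPLETED;
}
```

Under this reading, feeding a four-byte character one byte at a time should yield MBR_INCOMPLETE three times and then MBR_COMPLETED with a return value of 1, which is what the test program asserts.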


* Re: an mbrtoc32 bug
From: Bruno Haible @ 2024-05-23 21:31 UTC
  To: cygwin


The test case that I sent uses UTF-8 encoding.

Here's another test case that uses GB18030 (supposedly supported
since Cygwin 3.5.0). It fails as well, in line 26, the assertion
that bytes == 1 after the fourth byte.

On glibc systems, this test works.


[-- Attachment #2: foo.c --]
[-- Type: text/x-csrc, Size: 729 bytes --]

#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

int main ()
{
  assert (setlocale (LC_ALL, "zh_CN.GB18030") != NULL);
  mbstate_t state;
  memset (&state, 0, sizeof (state));

  char32_t uc = 0xDEADBEEF;
  size_t bytes;

  /* \224\071\311\067 = U+0001F403 */
  bytes = mbrtoc32 (&uc, "\224", 1, &state);
  assert (bytes == (size_t)-2);
  bytes = mbrtoc32 (&uc, "\071", 1, &state);
  assert (bytes == (size_t)-2);
  bytes = mbrtoc32 (&uc, "\311", 1, &state);
  assert (bytes == (size_t)-2);
  bytes = mbrtoc32 (&uc, "\067", 1, &state);
  assert (bytes == 1);
  assert (uc == 0x0001F403);
}

/* Works in: glibc
   Fails in: Cygwin 3.5.3
 */
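
Both test cases abort at the first deviation, so they do not show what value was actually returned. A non-aborting variant could look like the sketch below; the function name trace_bytewise and its return convention are hypothetical, chosen here for illustration. It feeds the bytes one at a time, prints each result, and counts the (size_t)-2 results.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical diagnostic helper: feeds the n bytes to mbrtoc32 one at a
   time in the named locale and prints each return value.  Returns the
   number of calls that returned (size_t)-2, or -1 if the locale is not
   available.  Stops early on an encoding error.  */
static int trace_bytewise (const char *locale_name, const char *bytes,
                           size_t n, char32_t *result)
{
  assert (locale_name != NULL && bytes != NULL && result != NULL);
  if (setlocale (LC_ALL, locale_name) == NULL)
    {
      printf ("locale %s is not available\n", locale_name);
      return -1;
    }

  mbstate_t state;
  memset (&state, 0, sizeof (state));
  int incomplete = 0;

  for (size_t i = 0; i < n; i++)
    {
      size_t ret = mbrtoc32 (result, bytes + i, 1, &state);
      if (ret == (size_t)-1 || ret == (size_t)-2 || ret == (size_t)-3)
        /* Print the special values in the notation used by the standard. */
        printf ("byte %zu: (size_t)-%zu\n", i + 1, (size_t)0 - ret);
      else
        printf ("byte %zu: %zu\n", i + 1, ret);

      if (ret == (size_t)-2)
        incomplete++;
      else if (ret == (size_t)-1)
        break; /* the conversion state is unspecified after an error */
    }
  return incomplete;
}
```

For example, trace_bytewise ("zh_CN.GB18030", "\224\071\311\067", 4, &uc) exercises the same input as the test case above, and the printed lines show at which byte an implementation departs from the expected sequence.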
