public inbox for glibc-bugs@sourceware.org
From: "mbuilov at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug string/28898] New: mbrtoc16 segfaults when counting wide characters
Date: Wed, 16 Feb 2022 06:41:00 +0000
Message-ID: <bug-28898-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=28898

            Bug ID: 28898
           Summary: mbrtoc16 segfaults when counting wide characters
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: string
          Assignee: unassigned at sourceware dot org
          Reporter: mbuilov at gmail dot com
  Target Milestone: ---

The following test case demonstrates a segmentation fault.
The program works correctly if the first argument to mbrtoc16() is not NULL;
otherwise it crashes.

$ cat bug.c
#include <stdio.h>
#include <string.h>
#include <locale.h>
#include <uchar.h>

static const unsigned char utf8[] = {
    0xf0, 0x9d, 0x94, 0xa0, 0xd5, 0xae
};

int main(int argc, char *argv[])
{
    mbstate_t state;
    unsigned i = 0, n = 0;
    const char *const locale = setlocale(LC_ALL, "C.UTF-8");
    if (!locale) {
        fprintf(stderr, "failed to set locale\n");
        return -2;
    }
    memset(&state, 0, sizeof(state));
    while (i < sizeof(utf8)) {
        const size_t sz = mbrtoc16(NULL, (const char*)&utf8[i], 1, &state);
        if (sz == (size_t)-2) {
            i += 1;
            continue;
        }
        if (sz != (size_t)-3)
            i += 1;
        n++;
    }
    fprintf(stdout, "number of utf16 characters: %u\n", n);
    (void)argc, (void)argv;
    return 0;
}

-------------

The problem is in the following piece of code of mbrtoc16(), in
./wcsmbs/mbrtoc16.c:

  /* The standard text does not say that S being NULL means the state
     is reset even if the second half of a surrogate still have to be
     returned.  In fact, the error code description indicates
     otherwise.  Therefore always first try to return a second
     half.  */
  if (ps->__count & 0x80000000)
    {
      /* We have to return the second word for a surrogate.  */
      ps->__count &= 0x7fffffff;
      *pc16 = ps->__value.__wch;
      ps->__value.__wch = L'\0';
      return (size_t) -3;
    }

The pc16 pointer is not checked for NULL before being dereferenced, so a call
with a NULL first argument crashes whenever the second half of a surrogate
pair is pending in the state.
Thread overview: 4+ messages
2022-02-16  6:41 mbuilov at gmail dot com [this message]
2023-06-26 13:23 ` [Bug string/28898] " bruno at clisp dot org
2023-06-26 13:27 ` bruno at clisp dot org
2024-05-04  4:28 ` luigighiron at gmail dot com