public inbox for glibc-bugs@sourceware.org
From: "mbuilov at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug string/28898] New: mbrtoc16 segfaults when counting wide characters
Date: Wed, 16 Feb 2022 06:41:00 +0000
Message-ID: <bug-28898-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=28898

            Bug ID: 28898
           Summary: mbrtoc16 segfaults when counting wide characters
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: string
          Assignee: unassigned at sourceware dot org
          Reporter: mbuilov at gmail dot com
  Target Milestone: ---

The following test case triggers a segmentation fault.

The program works correctly if the first argument to mbrtoc16() is not NULL,
otherwise it crashes.


$ cat bug.c
#include <stdio.h>
#include <string.h>
#include <locale.h>
#include <uchar.h>

static const unsigned char utf8[] = {
  0xf0, 0x9d, 0x94, 0xa0, 0xd5, 0xae
};

int main(int argc, char *argv[])
{
        mbstate_t state;
        unsigned i = 0, n = 0;
        const char *const locale = setlocale(LC_ALL, "C.UTF-8");
        if (!locale) {
                fprintf(stderr, "failed to set locale\n");
                return -2;
        }
        memset(&state, 0, sizeof(state));
        while (i < sizeof(utf8)) {
                const size_t sz = mbrtoc16(NULL, (const char*)&utf8[i], 1,
                                           &state);
                if (sz == (size_t)-2) {
                        i += 1;
                        continue;
                }
                if (sz != (size_t)-3)
                        i += 1;
                n++;
        }
        fprintf(stdout, "number of utf16 characters: %u\n", n);
        (void)argc, (void)argv;
        return 0;
}
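
As noted above, the crash does not occur when the first argument is non-NULL.
A minimal sketch of that workaround follows: the same counting loop, but with a
scratch char16_t passed instead of NULL. The helper name count_c16_units is
hypothetical, and the sketch assumes the caller has already selected a UTF-8
locale with setlocale().

```c
#include <locale.h>   /* for the caller's setlocale(); not used by the helper */
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: counts the UTF-16 code units produced by decoding
   buf one byte at a time, as the test case does.  Returns (size_t)-1 if
   mbrtoc16() reports an encoding error.  A second surrogate half that is
   still pending when the input ends is not counted (same as the loop in
   the test case).  */
static size_t count_c16_units(const char *buf, size_t len)
{
        mbstate_t state;
        char16_t scratch;       /* dummy output, so pc16 is never NULL */
        size_t i = 0, n = 0;

        memset(&state, 0, sizeof(state));
        while (i < len) {
                const size_t sz = mbrtoc16(&scratch, buf + i, 1, &state);
                if (sz == (size_t)-1)
                        return (size_t)-1;      /* invalid sequence */
                if (sz == (size_t)-2) {
                        i += 1;                 /* incomplete, feed next byte */
                        continue;
                }
                if (sz != (size_t)-3)
                        i += 1;                 /* a byte was consumed */
                n++;                            /* one code unit was produced */
        }
        return n;
}
```

Calling count_c16_units((const char*)utf8, sizeof(utf8)) after a successful
setlocale(LC_ALL, "C.UTF-8") avoids the NULL dereference, at the cost of one
unused char16_t on the stack.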

-------------

The problem is in the following piece of code in mbrtoc16(), in
./wcsmbs/mbrtoc16.c:


  /* The standard text does not say that S being NULL means the state
     is reset even if the second half of a surrogate still have to be
     returned.  In fact, the error code description indicates
     otherwise.  Therefore always first try to return a second
     half.  */
  if (ps->__count & 0x80000000)
    {
      /* We have to return the second word for a surrogate.  */
      ps->__count &= 0x7fffffff;
      *pc16 = ps->__value.__wch;
      ps->__value.__wch = L'\0';
      return (size_t) -3;
    }


- the pc16 pointer is not checked for NULL before being dereferenced.
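
One possible shape for a fix is to guard the store with a NULL check. The sketch
below is not the glibc source and not a proposed patch: struct sketch_state and
sketch_take_pending() are hypothetical stand-ins that model only the two state
fields used in the fragment above, so that the guarded logic can be compiled and
exercised on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <uchar.h>

/* Hypothetical stand-in for the parts of mbstate_t used above. */
struct sketch_state {
        uint32_t count; /* models ps->__count; top bit = second half pending */
        uint32_t wch;   /* models ps->__value.__wch */
};

/* Models the quoted fragment with a NULL guard added.  Returns (size_t)-3
   when a pending second surrogate half was consumed.  Returning 0 when
   nothing was pending is a convention of this sketch only.  */
static size_t sketch_take_pending(char16_t *pc16, struct sketch_state *ps)
{
        if (ps->count & 0x80000000u) {
                /* We have to return the second word for a surrogate.  */
                ps->count &= 0x7fffffffu;
                if (pc16 != NULL)       /* the check missing in the original */
                        *pc16 = (char16_t)ps->wch;
                ps->wch = 0;
                return (size_t)-3;
        }
        return 0;
}
```

With the guard in place, a caller that passes NULL still sees the (size_t)-3
return value and the state is still cleared; only the store is skipped.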


