public inbox for glibc-bugs-regex@sourceware.org
From: "steve98 at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs-regex@sourceware.org
Subject: [Bug regex/25934] re_token_t.mb_partial used before initialization
Date: Thu, 07 May 2020 06:07:36 +0000
Message-ID: <bug-25934-132-XBsFRdbN4U@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-25934-132@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=25934

--- Comment #1 from Steven Li <steve98 at gmail dot com> ---
OK, I managed to write a simple program that reproduces this problem (on Ubuntu
18.04, using glibc 2.27). The code is very short:

$ cat a.c
#include <stdio.h>
#include <regex.h>
#include <locale.h>

int main() {
  char * pattern = "^[ab]*(c)$"; // any simpler, the problem goes away
  int flags = REG_ICASE; // has to be there for problem to appear

  setlocale(LC_CTYPE, ""); // without this, there is no problem

  regex_t regex;
  regcomp(&regex, pattern, flags);
}

Interestingly, each of the first three statements in main() (the pattern, the
REG_ICASE flag, and the setlocale() call) is a necessary condition for the
problem. Compiling the code is easy enough:

$ rm a.out; gcc a.c

Running the program under Valgrind gives a rather disturbing result
(with my home directory name masked out in the messages):

$ valgrind --track-origins=yes --leak-check=yes ./a.out
==12322== Memcheck, a memory error detector
==12322== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12322== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==12322== Command: ./a.out
==12322==
==12322== Conditional jump or move depends on uninitialised value(s)
==12322==    at 0x4F2D13D: re_compile_fastmap_iter.isra.26 (regcomp.c:328)
==12322==    by 0x4F3D3D0: __re_compile_fastmap (regcomp.c:280)
==12322==    by 0x4F3D3D0: regcomp (regcomp.c:509)
==12322==    by 0x108749: main (in [...]/a.out)
==12322==  Uninitialised value was created by a heap allocation
==12322==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12322==    by 0x4F2DD6A: create_token_tree.isra.14.constprop.39 (regcomp.c:3749)
==12322==    by 0x4F35885: parse_expression (regcomp.c:2356)
==12322==    by 0x4F364CB: parse_branch (regcomp.c:2183)
==12322==    by 0x4F3668B: parse_reg_exp (regcomp.c:2138)
==12322==    by 0x4F36D7C: parse (regcomp.c:2107)
==12322==    by 0x4F36D7C: re_compile_internal (regcomp.c:788)
==12322==    by 0x4F3D331: regcomp (regcomp.c:498)
==12322==    by 0x108749: main (in [...]/a.out)
==12322==
==12322== Conditional jump or move depends on uninitialised value(s)
==12322==    at 0x4F2D13D: re_compile_fastmap_iter.isra.26 (regcomp.c:328)
==12322==    by 0x4F3D3F0: __re_compile_fastmap (regcomp.c:282)
==12322==    by 0x4F3D3F0: regcomp (regcomp.c:509)
==12322==    by 0x108749: main (in [...]/a.out)
==12322==  Uninitialised value was created by a heap allocation
==12322==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12322==    by 0x4F2DD6A: create_token_tree.isra.14.constprop.39 (regcomp.c:3749)
==12322==    by 0x4F35885: parse_expression (regcomp.c:2356)
==12322==    by 0x4F364CB: parse_branch (regcomp.c:2183)
==12322==    by 0x4F3668B: parse_reg_exp (regcomp.c:2138)
==12322==    by 0x4F36D7C: parse (regcomp.c:2107)
==12322==    by 0x4F36D7C: re_compile_internal (regcomp.c:788)
==12322==    by 0x4F3D331: regcomp (regcomp.c:498)
==12322==    by 0x108749: main (in [...]/a.out)
==12322==
==12322==
==12322== HEAP SUMMARY:
==12322==     in use at exit: 2,680 bytes in 48 blocks
==12322==   total heap usage: 82 allocs, 34 frees, 9,003 bytes allocated
==12322==
==12322== 256 bytes in 1 blocks are definitely lost in loss record 35 of 39
==12322==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12322==    by 0x4F3D2C9: regcomp (regcomp.c:479)
==12322==    by 0x108749: main (in [...]/a.out)
==12322==
==12322== 2,424 (224 direct, 2,200 indirect) bytes in 1 blocks are definitely lost in loss record 39 of 39
==12322==    at 0x4C2FA3F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12322==    by 0x4C31D84: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12322==    by 0x4F37CDB: re_compile_internal (regcomp.c:749)
==12322==    by 0x4F3D331: regcomp (regcomp.c:498)
==12322==    by 0x108749: main (in [...]/a.out)
==12322==
==12322== LEAK SUMMARY:
==12322==    definitely lost: 480 bytes in 2 blocks
==12322==    indirectly lost: 2,200 bytes in 46 blocks
==12322==      possibly lost: 0 bytes in 0 blocks
==12322==    still reachable: 0 bytes in 0 blocks
==12322==         suppressed: 0 bytes in 0 blocks
==12322==
==12322== For counts of detected and suppressed errors, rerun with: -v
==12322== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
