public inbox for glibc-bugs@sourceware.org
From: "gilles.duvert@univ-grenoble-alpes.fr" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug libc/30024] New: regcomp does not honour the documented behaviour.
Date: Wed, 18 Jan 2023 23:02:37 +0000
Message-ID: <bug-30024-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=30024

            Bug ID: 30024
           Summary: regcomp does not honour the documented behaviour.
           Product: glibc
           Version: 2.36
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: gilles.duvert@univ-grenoble-alpes.fr
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Description of problem:
regcomp() should correctly find the occurrences of '{ ' in a string, since
it is said (man 7 regex):

  A '{' followed by a character other than a digit is an ordinary
  character, not the beginning of a bound(!).

Version-Release number of selected component (if applicable):

How reproducible:
Always.

Steps to Reproduce:
1. Compile and run the small C code below (a slightly edited copy of the
   man example).
2. The result is OK on, e.g., OSX. It is not OK on Mageia 8 (glibc 2.32-30)
   or on Mageia Cauldron (glibc 2.36-30 at the time of writing), where an
   error is issued. Instead, the program using regcomp() should find the
   positions of '{ ' in the string "1234 G!t!rk{ ss { zz...\n".
   Please note that the problem also exists with '{' and not only with
   '{ ', as I demonstrate in the program below. I believe the sentence
   " '{' followed by a character..." holds even if there is no character
   at all (and indeed the OSX version behaves the same with '{'), so the
   exact extent of this glibc bug is to be determined.
3. The code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

#define ARRAY_SIZE(arr) (sizeof((arr)) / sizeof((arr)[0]))

static const char *const str = "1234 G!t!rk{ ss { zz...\n";
static const char *const re = "{ ";

int main(void)
{
    static const char *s = str;
    regex_t regex;
    regmatch_t pmatch[1];
    regoff_t off, len;
    int cflags = REG_EXTENDED;

    int res = regcomp(&regex, re, cflags);
    if (res) {
        printf("regcomp error:");
        if (res == REG_BADBR   ) printf(" REG_BADBR   ");
        if (res == REG_BADPAT  ) printf(" REG_BADPAT  ");
        if (res == REG_BADRPT  ) printf(" REG_BADRPT  ");
        if (res == REG_EBRACE  ) printf(" REG_EBRACE  ");
        if (res == REG_EBRACK  ) printf(" REG_EBRACK  ");
        if (res == REG_ECOLLATE) printf(" REG_ECOLLATE");
        if (res == REG_ECTYPE  ) printf(" REG_ECTYPE  ");
        /* if (res == REG_EEND ) printf(" REG_EEND "); */
        if (res == REG_EESCAPE ) printf(" REG_EESCAPE ");
        if (res == REG_EPAREN  ) printf(" REG_EPAREN  ");
        if (res == REG_ERANGE  ) printf(" REG_ERANGE  ");
        /* if (res == REG_ESIZE ) printf(" REG_ESIZE "); */
        if (res == REG_ESPACE  ) printf(" REG_ESPACE  ");
        if (res == REG_ESUBREG ) printf(" REG_ESUBREG ");
        printf("\n");
        exit(EXIT_FAILURE);
    }

    printf("String = \"%s\"\n", str);
    printf("Matches:\n");

    for (int i = 0; ; i++) {
        if (regexec(&regex, s, ARRAY_SIZE(pmatch), pmatch, 0))
            break;

        off = pmatch[0].rm_so + (s - str);
        len = pmatch[0].rm_eo - pmatch[0].rm_so;
        printf("#%d:\n", i);
        printf("offset = %jd; length = %jd\n", (intmax_t) off, (intmax_t) len);
        printf("substring = \"%.*s\"\n", (int) len, s + pmatch[0].rm_so);

        s += pmatch[0].rm_eo;
    }

    exit(EXIT_SUCCESS);
}

-- 
You are receiving this mail because:
You are on the CC list for the bug.
Thread overview: 5+ messages
2023-01-18 23:02 gilles.duvert@univ-grenoble-alpes.fr [this message]
2023-01-18 23:26 ` [Bug libc/30024] " schwab@linux-m68k.org
2023-01-19 11:18 ` gilles.duvert@univ-grenoble-alpes.fr
2023-01-19 15:36 ` schwab@linux-m68k.org
2023-01-19 22:21 ` gilles.duvert@univ-grenoble-alpes.fr