From: "gilles.duvert@univ-grenoble-alpes.fr"
To: glibc-bugs@sourceware.org
Subject: [Bug libc/30024] New: regcomp does not honour the documented behaviour.
Date: Wed, 18 Jan 2023 23:02:37 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=30024

Bug ID: 30024
Summary: regcomp does not honour the documented behaviour.
Product: glibc
Version: 2.36
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: libc
Assignee: unassigned at sourceware dot org
Reporter: gilles.duvert@univ-grenoble-alpes.fr
CC: drepper.fsp at gmail dot com
Target Milestone: ---

Description of problem:
regcomp() should accept the pattern '{ ' so that its occurrences can be found
in a string, since the documentation says (man 7 regex):

  A '{' followed by a character other than a digit is an ordinary character,
  not the beginning of a bound(!).

Version-Release number of selected component (if applicable):

How reproducible:
Always.

Steps to Reproduce:
1. Compile and run the small C program below (a slightly edited copy of the
man page example).
2. The result is OK on, e.g., OSX. It is not OK on Mageia 8 (glibc 2.32-30)
or on Mageia Cauldron (glibc 2.36-30 at the time of writing), where an error
is issued.
Instead, the program using regcomp() should find the positions of '{ ' in the
string "1234 G!t!rk{ ss { zz...\n".
Please note that the problem also exists with '{' alone, not only with '{ ',
as I demonstrate in the program below. I believe the sentence " '{' followed
by a character..." holds even if there is no following character at all (and
indeed the OSX version behaves the same with '{'), so the exact extent of
this glibc bug is to be determined.
3.
the code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

#define ARRAY_SIZE(arr) (sizeof((arr)) / sizeof((arr)[0]))

static const char *const str = "1234 G!t!rk{ ss { zz...\n";
static const char *const re = "{ ";

int main(void)
{
    const char *s = str;    /* current search position within str */
    regex_t regex;
    regmatch_t pmatch[1];
    regoff_t off, len;
    int cflags = REG_EXTENDED;

    int res = regcomp(&regex, re, cflags);
    if (res) {
        /* Print the symbolic name of the error returned by regcomp(). */
        printf("regcomp error:");
        if (res == REG_BADBR   ) printf(" REG_BADBR   ");
        if (res == REG_BADPAT  ) printf(" REG_BADPAT  ");
        if (res == REG_BADRPT  ) printf(" REG_BADRPT  ");
        if (res == REG_EBRACE  ) printf(" REG_EBRACE  ");
        if (res == REG_EBRACK  ) printf(" REG_EBRACK  ");
        if (res == REG_ECOLLATE) printf(" REG_ECOLLATE");
        if (res == REG_ECTYPE  ) printf(" REG_ECTYPE  ");
        /* if (res == REG_EEND ) printf(" REG_EEND    "); */
        if (res == REG_EESCAPE ) printf(" REG_EESCAPE ");
        if (res == REG_EPAREN  ) printf(" REG_EPAREN  ");
        if (res == REG_ERANGE  ) printf(" REG_ERANGE  ");
        /* if (res == REG_ESIZE ) printf(" REG_ESIZE   "); */
        if (res == REG_ESPACE  ) printf(" REG_ESPACE  ");
        if (res == REG_ESUBREG ) printf(" REG_ESUBREG ");
        printf("\n");
        exit(EXIT_FAILURE);
    }

    printf("String = \"%s\"\n", str);
    printf("Matches:\n");

    /* Search repeatedly, resuming after the end of each match. */
    for (int i = 0; ; i++) {
        if (regexec(&regex, s, ARRAY_SIZE(pmatch), pmatch, 0))
            break;

        off = pmatch[0].rm_so + (s - str);
        len = pmatch[0].rm_eo - pmatch[0].rm_so;
        printf("#%d:\n", i);
        printf("offset = %jd; length = %jd\n",
               (intmax_t) off, (intmax_t) len);
        /* "%.*s" expects an int precision, hence the cast. */
        printf("substring = \"%.*s\"\n", (int) len, s + pmatch[0].rm_so);

        s += pmatch[0].rm_eo;
    }

    exit(EXIT_SUCCESS);
}