public inbox for glibc-bugs@sourceware.org
* [Bug libc/30024] New: regcomp does not honour the documented behaviour.
From: gilles.duvert@univ-grenoble-alpes.fr @ 2023-01-18 23:02 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30024
Bug ID: 30024
Summary: regcomp does not honour the documented behaviour.
Product: glibc
Version: 2.36
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: libc
Assignee: unassigned at sourceware dot org
Reporter: gilles.duvert@univ-grenoble-alpes.fr
CC: drepper.fsp at gmail dot com
Target Milestone: ---
Description of problem:
regcomp() should correctly find the occurrences of '{ ' in a string, since
man 7 regex says:
    A '{' followed by a character other than a digit is an ordinary character,
    not the beginning of a bound(!).
Version-Release number of selected component (if applicable):
How reproducible:
Always.
Steps to Reproduce:
1. compile and run this small C code below (slightly edited copy of the man
example).
2. the result is OK on, e.g., OSX. Not on Mageia 8 (glibc 2.32-30) and not on
Mageia Cauldron (glibc 2.36-30 at the time of writing) where an error is
issued.
Instead, the program using regcomp() should find the positions of '{ ' in the
string "1234 G!t!rk{ ss { zz...\n".
Please note that the problem also exists with '{' alone, and not only with
'{ ' as used in the program below. I believe the sentence " '{' followed by a
character..." holds even if there is no character at all (and indeed the OSX
version behaves the same with '{'), so the exact extent of this glibc bug is
to be determined.
3. the code:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>

#define ARRAY_SIZE(arr) (sizeof((arr)) / sizeof((arr)[0]))

static const char *const str = "1234 G!t!rk{ ss { zz...\n";
static const char *const re = "{ ";

int main(void)
{
    const char *s = str;   /* current scan position within str */
    regex_t regex;
    regmatch_t pmatch[1];
    regoff_t off, len;
    int cflags = REG_EXTENDED;

    int res = regcomp(&regex, re, cflags);
    if (res) {
        /* Report which error code regcomp() returned. */
        printf("regcomp error:");
        if (res == REG_BADBR   ) printf(" REG_BADBR   ");
        if (res == REG_BADPAT  ) printf(" REG_BADPAT  ");
        if (res == REG_BADRPT  ) printf(" REG_BADRPT  ");
        if (res == REG_EBRACE  ) printf(" REG_EBRACE  ");
        if (res == REG_EBRACK  ) printf(" REG_EBRACK  ");
        if (res == REG_ECOLLATE) printf(" REG_ECOLLATE");
        if (res == REG_ECTYPE  ) printf(" REG_ECTYPE  ");
        /* if (res == REG_EEND    ) printf(" REG_EEND    "); */
        if (res == REG_EESCAPE ) printf(" REG_EESCAPE ");
        if (res == REG_EPAREN  ) printf(" REG_EPAREN  ");
        if (res == REG_ERANGE  ) printf(" REG_ERANGE  ");
        /* if (res == REG_ESIZE   ) printf(" REG_ESIZE   "); */
        if (res == REG_ESPACE  ) printf(" REG_ESPACE  ");
        if (res == REG_ESUBREG ) printf(" REG_ESUBREG ");
        printf("\n");
        exit(EXIT_FAILURE);
    }

    printf("String = \"%s\"\n", str);
    printf("Matches:\n");

    for (int i = 0; ; i++) {
        if (regexec(&regex, s, ARRAY_SIZE(pmatch), pmatch, 0))
            break;

        /* Offsets are relative to s; convert to an offset within str. */
        off = pmatch[0].rm_so + (s - str);
        len = pmatch[0].rm_eo - pmatch[0].rm_so;
        printf("#%d:\n", i);
        printf("offset = %jd; length = %jd\n", (intmax_t) off,
               (intmax_t) len);
        /* The "%.*s" precision argument must be an int. */
        printf("substring = \"%.*s\"\n", (int) len, s + pmatch[0].rm_so);

        s += pmatch[0].rm_eo;
    }

    regfree(&regex);
    exit(EXIT_SUCCESS);
}
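
As an aside, the chain of comparisons against the REG_* codes in the program above could probably be replaced by regerror(), which returns the library's own description of a regcomp() failure. The following is only a sketch: the helper name compile_or_explain and its signature are made up for illustration and are not part of the report or of glibc.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the report): compile `pattern` and, if
   regcomp() fails, copy the library's error text into `buf`.
   Returns the regcomp() result, i.e. 0 on success. */
int compile_or_explain(const char *pattern, int cflags,
                       char *buf, size_t buflen)
{
    regex_t regex;
    int res;

    assert(pattern != NULL && buf != NULL && buflen > 0);
    buf[0] = '\0';

    res = regcomp(&regex, pattern, cflags);
    if (res != 0) {
        /* regerror() truncates the message to fit buflen bytes. */
        regerror(res, &regex, buf, buflen);
        return res;
    }

    /* Only a successfully compiled pattern needs to be freed. */
    regfree(&regex);
    return 0;
}
```

Calling compile_or_explain(re, REG_EXTENDED, buf, sizeof buf) with the pattern from the report should leave a human-readable message in buf on the systems where regcomp() rejects it; the exact wording is implementation-specific.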
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug libc/30024] regcomp does not honour the documented behaviour.
From: schwab@linux-m68k.org @ 2023-01-18 23:26 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30024
Andreas Schwab <schwab@linux-m68k.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |NOTABUG
--- Comment #1 from Andreas Schwab <schwab@linux-m68k.org> ---
POSIX says:
Any of the following uses produce undefined results: ...
* If a <left-brace> is not part of a valid interval expression (see EREs
Matching Multiple Characters)
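
For illustration only (this is not part of the original comment): a literal left brace can be matched without relying on the undefined case quoted above by escaping it as `\{` or by placing it in a bracket expression, `[{]`. The sketch below assumes REG_EXTENDED; the helper name count_matches is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count non-overlapping matches of an ERE in
   `subject`. Returns -1 if the pattern does not compile. */
int count_matches(const char *pattern, const char *subject)
{
    regex_t regex;
    regmatch_t m[1];
    int count = 0;

    assert(pattern != NULL && subject != NULL);

    if (regcomp(&regex, pattern, REG_EXTENDED) != 0)
        return -1;

    while (*subject != '\0' && regexec(&regex, subject, 1, m, 0) == 0) {
        count++;
        /* Step past the match; always advance by at least one
           character so an empty match cannot loop forever. */
        subject += (m[0].rm_eo > 0) ? m[0].rm_eo : 1;
    }

    regfree(&regex);
    return count;
}
```

With the string from the report, count_matches("\\{ ", "1234 G!t!rk{ ss { zz...\n") and count_matches("[{] ", "1234 G!t!rk{ ss { zz...\n") both return 2, since the brace is unambiguously an ordinary character in either spelling.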
* [Bug libc/30024] regcomp does not honour the documented behaviour.
From: gilles.duvert@univ-grenoble-alpes.fr @ 2023-01-19 11:18 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30024
--- Comment #2 from Gilles Duvert <gilles.duvert@univ-grenoble-alpes.fr> ---
man 7 regex excerpt:

REGEX(7)               Linux Programmer's Manual              REGEX(7)

NAME
       regex - POSIX.2 regular expressions

and man 3 regcomp says:

REGEX(3)               Linux Programmer's Manual              REGEX(3)

NAME
       regcomp, regexec, regerror, regfree - POSIX regex functions

So which is it: POSIX or POSIX.2?
* [Bug libc/30024] regcomp does not honour the documented behaviour.
From: schwab@linux-m68k.org @ 2023-01-19 15:36 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30024
--- Comment #3 from Andreas Schwab <schwab@linux-m68k.org> ---
Third-party manpages are not authoritative.
* [Bug libc/30024] regcomp does not honour the documented behaviour.
From: gilles.duvert@univ-grenoble-alpes.fr @ 2023-01-19 22:21 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30024
--- Comment #4 from Gilles Duvert <gilles.duvert@univ-grenoble-alpes.fr> ---
Would you recommend filing a report against the man-pages project?