From: Dirk Gouders <dirk@gouders.net>
To: libc-help@sourceware.org
Subject: Help: match '\0' with regexec(3)
Date: Sat, 03 Feb 2024 21:33:42 +0100
Message-ID: <ghsf29uw2h.fsf@gouders.net>
[-- Attachment #1: Type: text/plain, Size: 1055 bytes --]
Hi,
I would like to ask for an explanation of, or a hint about, the error in
my attempt to use regexec(3) to match null characters ('\0').
To illustrate it, I wrote the attached test program. What I do not
understand is why I get wrong match positions when testing with a string
that contains '\0' (I am not absolutely sure whether '.' is supposed to match '\0').
Here is some "normal" output:
$ printf ".\nab\n" | ./test_regex
Compiling regex "."
Testing string "ab"...
regexec match: pos 0 length 1
"ab"
Testing string "b"...
regexec match: pos 1 length 1
"b"
Testing string ""...
But when I insert a '\0' into that string, the result is confusing to
me:
$ printf ".\na\0b\n" | ./test_regex
Compiling regex "."
Testing string "a"...
regexec match: pos 0 length 1
"a"
Testing string ""...
regexec match: pos 2 length 1
"b"
Testing string "b"...
regexec match: pos 2 length 1
"b"
Testing string ""...
My apologies in advance if I could easily have answered this question
myself by searching for it properly.
Regards,
Dirk
[-- Attachment #2: regexec(3) test-program --]
[-- Type: text/plain, Size: 1334 bytes --]
#include <stdlib.h>
#include <stdio.h>
#include <regex.h>
int main(void)
{
    int ret;
    char *line = NULL;
    char *reg_expr = NULL;
    size_t expr_cap = 0;    /* buffer sizes, managed by getline() */
    size_t line_cap = 0;
    size_t line_len;
    size_t l = 0;
    static regex_t preg;
    regmatch_t pmatch[1];

    /* The first input line is the regular expression. */
    ret = getline(&reg_expr, &expr_cap, stdin);
    if (ret < 1)
        exit(1);
    reg_expr[ret - 1] = '\0'; /* remove newline */
    printf("Compiling regex \"%s\"\n", reg_expr);
    if ((ret = regcomp(&preg, reg_expr, REG_EXTENDED | REG_NEWLINE)) != 0) {
        fprintf(stderr, "regcomp() failed: %d\n", ret);
        exit(1);
    }
    /* Every following line is a test string and may contain '\0'. */
    while (1) {
        ret = getline(&line, &line_cap, stdin);
        if (ret < 1)
            break;
        line[ret - 1] = '\0'; /* remove newline */
        line_len = ret - 1;
        for (size_t i = 0; i < line_len; i += l ? l : 1) {
            /* With REG_STARTEND, pmatch[0] delimits the range to search. */
            pmatch[0].rm_so = 0;
            pmatch[0].rm_eo = line_len - i;
            printf("Testing string \"");
            for (size_t j = i; j < line_len; j++)
                printf("%c", line[j]);
            printf("\"...\n");
            ret = regexec(&preg, line + i, 1, pmatch, REG_NOTEOL | REG_STARTEND);
            if (ret != 0) {
                printf("No match.\n");
                break;
            }
            l = pmatch[0].rm_eo - pmatch[0].rm_so;
            printf("regexec match: pos %zu length %zu\n\t\"%s\"\n",
                   i + pmatch[0].rm_so, l,
                   line + i + pmatch[0].rm_so);
        }
    }
    regfree(&preg);
    free(line);
    free(reg_expr);
    return 0;
}
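
One possible reading of the second output, offered as a sketch rather
than a confirmed diagnosis: the reported positions are consistent with
'.' not matching '\0' in this implementation, so for the remainder
"\0b" the match begins one byte in, at position 2. The loop then
advances by the match length (1) instead of moving to the end of the
match, which would explain why "b" is reported twice. The helper below,
find_all(), is a hypothetical name that does not appear in the attached
program. It assumes a regexec() that supports the non-POSIX
REG_STARTEND flag and a regex compiled without REG_NOSUB. It always
passes the start of the buffer, moves only the rm_so/rm_eo window, and
resumes from rm_eo after each match.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the original test program):
 * store the start offsets of all non-overlapping matches found in
 * buf[0..len), which may contain '\0' bytes, and return how many
 * offsets were written to starts[] (at most max).
 */
size_t find_all(const regex_t *preg, const char *buf, size_t len,
                regoff_t *starts, size_t max)
{
    size_t n = 0;
    regoff_t pos = 0;

    while ((size_t)pos < len && n < max) {
        regmatch_t m[1];

        m[0].rm_so = pos;           /* start of the search window */
        m[0].rm_eo = (regoff_t)len; /* end of the search window */
        if (regexec(preg, buf, 1, m, REG_STARTEND) != 0)
            break;                  /* no further match */

        /* Offsets are relative to buf, not to buf + pos. */
        starts[n++] = m[0].rm_so;

        /* Resume after the match; step one byte past an empty match. */
        pos = (m[0].rm_eo > m[0].rm_so) ? m[0].rm_eo : m[0].rm_eo + 1;
    }
    return n;
}
```

Called as find_all(&preg, line, line_len, starts, 16) with a
caller-provided array regoff_t starts[16], it would fill starts[] with
offsets relative to the beginning of line, so no "+ i" correction is
needed when printing positions.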