public inbox for glibc-bugs-regex@sourceware.org
From: "andrew dot mackey at baesystems dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs-regex@sources.redhat.com
Subject: [Bug regex/3957] New: regcomp with REG_NEWLINE flag does not operate as POSIX specification for a non-matching list
Date: Fri, 02 Feb 2007 13:23:00 -0000
Message-ID: <20070202132302.3957.andrew.mackey@baesystems.com>

Given the string ‘foo\nbar’ (where \n is a linefeed) the regular expression 
‘foo[^ ]+’ matches the complete string. The regex is compiled with REG_EXTENDED 
and REG_NEWLINE flags.

The POSIX specification at
http://www.opengroup.org/onlinepubs/009695399/functions/regcomp.html
says of the REG_NEWLINE flag:

“A <newline> in string shall not be matched by a period outside a bracket
expression or by any form of a non-matching list”
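
As a rough illustration of the rule quoted above, the sketch below wraps
regcomp/regexec in a small helper so that the same pattern can be tried with
and without REG_NEWLINE. The helper name regex_matches is hypothetical and is
not part of this report; the behaviour described in the note after the code is
what the POSIX text appears to require, not what every C library does.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not from the report):
   returns 1 if `pattern` matches anywhere in `text` when compiled with
   `cflags`, 0 if it does not match, and -1 if the pattern fails to compile. */
int regex_matches(const char *pattern, const char *text, int cflags)
{
  regex_t preg;
  int rc;

  /* REG_NOSUB: only a yes/no answer is needed, not the match offsets. */
  if (regcomp(&preg, pattern, cflags | REG_NOSUB) != 0)
    return -1;

  rc = regexec(&preg, text, 0, NULL, 0);
  regfree(&preg);

  return rc == 0;
}
```

On an implementation that follows the quoted text, regex_matches("foo.bar",
"foo\nbar", REG_EXTENDED) and regex_matches("foo[^ ]bar", "foo\nbar",
REG_EXTENDED) should both return 1, while adding REG_NEWLINE to the flags
should make both return 0, because neither the period nor the non-matching
list may then match the linefeed.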

For older versions of glibc (glibc-2.1.3), regcomp behaves as the POSIX
specification requires. From at least glibc-2.3.2 up to and including
glibc-2.5, it does not.

The following code demonstrates the issue:

#include <stdio.h>
#include <sys/types.h>
#include <regex.h>

int main(int argc, char **argv)
{
  char regex[] = "foo[^ ]+";
  char text[] = "foo\nbar";
  regex_t preg;
  regmatch_t pmatch[1];
  int flags = REG_EXTENDED | REG_NEWLINE;
  int i;

  printf("About to compile regexp '%s', with flags %d\n", regex, flags);

  if(!regcomp(&preg, regex, flags))
  {  
    printf("About to search string '%s'\n", text);

    if(!regexec(&preg, text, 1, pmatch, 0))
    {
      printf("Regex matched, match text is '");
      for(i = pmatch[0].rm_so; i < pmatch[0].rm_eo; i++)
      {
        printf("%c", text[i]); 
      }
      printf("'\n");
    }
    else
    {
      printf("Regex did not match\n");
    }
    
    regfree(&preg);
  }
  else
  {
    printf("Failed to compile regex\n");
  }

  return 0;  
}

On glibc-2.3.2, glibc-2.3.6 or glibc-2.5 the program gives:

About to compile regexp 'foo[^ ]+', with flags 5
About to search string 'foo
bar'
Regex matched, match text is 'foo
bar'

On glibc-2.1.3, and on other C libraries such as the one found on Solaris 9,
the output is:

About to compile regexp 'foo[^ ]+', with flags 9
About to search string 'foo
bar'
Regex did not match

which I believe is the expected POSIX behaviour.
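
One possible way to avoid depending on this part of REG_NEWLINE is to list the
linefeed explicitly inside the non-matching list, i.e. "foo[^ \n]+" as a C
string literal. The sketch below is an assumption rather than something
verified against the affected glibc releases, and the helper name first_match
is hypothetical.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): compiles `pattern`
   with REG_EXTENDED | REG_NEWLINE, copies the first match in `text` into
   `out` (truncated to fit `outsz` bytes, always NUL-terminated) and returns
   the number of bytes copied. Returns -1 on no match or compile failure. */
int first_match(const char *pattern, const char *text, char *out, size_t outsz)
{
  regex_t preg;
  regmatch_t pmatch[1];
  int len = -1;

  if (outsz == 0 || regcomp(&preg, pattern, REG_EXTENDED | REG_NEWLINE) != 0)
    return -1;

  if (regexec(&preg, text, 1, pmatch, 0) == 0)
  {
    len = (int)(pmatch[0].rm_eo - pmatch[0].rm_so);
    if ((size_t)len >= outsz)
      len = (int)outsz - 1;            /* truncate to fit the buffer */
    memcpy(out, text + pmatch[0].rm_so, (size_t)len);
    out[len] = '\0';
  }

  regfree(&preg);
  return len;
}
```

With the pattern "foo[^ \n]+", the text "foo\nbar" should not match, since the
character after "foo" is explicitly excluded, while "foobar baz" should yield
the match "foobar".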

-- 
           Summary: regcomp with REG_NEWLINE flag does not operate as POSIX
                    specification for a non-matching list
           Product: glibc
           Version: 2.4
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
        AssignedTo: drepper at redhat dot com
        ReportedBy: andrew dot mackey at baesystems dot com
                CC: andrew dot mackey at baesystems dot com,glibc-bugs-regex
                    at sources dot redhat dot com,glibc-bugs at sources dot
                    redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=3957

Thread overview: 4+ messages
2007-02-02 13:23 andrew dot mackey at baesystems dot com [this message]
2007-02-05 13:43 ` [Bug regex/3957] " jakub at redhat dot com
2007-02-05 15:24 ` drepper at redhat dot com
2007-07-12 14:50 ` cvs-commit at gcc dot gnu dot org