public inbox for glibc-bugs-regex@sourceware.org
From: "vapier at gentoo dot org" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs-regex@sources.redhat.com
Subject: [Bug regex/12567] New: regexec leaks mem when used multiple times
Date: Fri, 11 Mar 2011 05:13:00 -0000
Message-ID: <bug-12567-132@http.sourceware.org/bugzilla/>

http://sourceware.org/bugzilla/show_bug.cgi?id=12567

           Summary: regexec leaks mem when used multiple times
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: vapier@gentoo.org


Created attachment 5291
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5291
problem.c

from Luis Fernando Schultz Xavier da Silveira:

The function regexec leaks memory when the same regex_t structure is matched
against multiple inputs. The leak is dramatic: memory use appears to grow at
least linearly in the amount of text fed.

Given a regular expression of size m and a text of size n, regcomp is supposed
to run in O(m), regexec in O(mn) and regfree in O(m). Even if the
implementation chooses to adopt another strategy with different time
complexities, this is still a bug because the calls to regexec should be
independent. Accumulation of memory between calls is surely a bug.

I will attach an example program. The regex accepts any string of at least
499 characters whose 498th character from the end (counting the last character
as the 0th) is 'a'. The regex is compiled once and run against successive
random strings of length 1024. The cflags value is REG_NOSUB and the eflags
value is 0. The regular expression is prefixed with '^' and suffixed with '$'.
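
The attachment itself is not reproduced in this message, so the following is
only a sketch reconstructed from the description above, not the original
problem.c. The pattern string, the helper names fill_random and run_matches,
and the use of rand() for the random text are assumptions. Because cflags is
REG_NOSUB alone, the pattern is assumed to use POSIX basic syntax, where the
interval is written \{498\}.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

#define TEXT_LEN 1024

/* Fill buf with TEXT_LEN random lowercase letters and NUL-terminate it. */
static void fill_random(char *buf)
{
    for (int i = 0; i < TEXT_LEN; i++)
        buf[i] = (char)('a' + rand() % 26);
    buf[TEXT_LEN] = '\0';
}

/*
 * Compile the pattern once, then match it against `iterations` random
 * strings using the same regex_t. Returns the number of strings that
 * matched, or -1 if regcomp fails.
 */
static long run_matches(long iterations)
{
    regex_t re;
    char text[TEXT_LEN + 1];
    long matched = 0;

    assert(iterations >= 0);

    /* Assumed pattern: an 'a' followed by exactly 498 more characters,
     * anchored with '^' and '$'. Basic syntax, so no REG_EXTENDED. */
    if (regcomp(&re, "^.*a.\\{498\\}$", REG_NOSUB) != 0)
        return -1;

    for (long i = 0; i < iterations; i++) {
        fill_random(text);
        /* eflags is 0 and no submatches are requested (REG_NOSUB). */
        if (regexec(&re, text, 0, NULL, 0) == 0)
            matched++;
    }

    regfree(&re);
    return matched;
}
```

Calling run_matches with a large iteration count from main and watching the
process's resident memory (for example with top) should show whether memory
grows with the number of regexec calls rather than staying bounded.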

