public inbox for glibc-bugs-regex@sourceware.org
* [Bug regex/12567] New: regexec leaks mem when used multiple times
@ 2011-03-11  5:13 vapier at gentoo dot org
  2011-03-11  5:13 ` [Bug regex/12567] " vapier at gentoo dot org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: vapier at gentoo dot org @ 2011-03-11  5:13 UTC (permalink / raw)
  To: glibc-bugs-regex

http://sourceware.org/bugzilla/show_bug.cgi?id=12567

           Summary: regexec leaks mem when used multiple times
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: vapier@gentoo.org


Created attachment 5291
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5291
problem.c

from Luis Fernando Schultz Xavier da Silveira:

The function regexec leaks memory if the same regex_t structure is run against
multiple inputs. The leak is so dramatic it appears to be at least linear in
the amount of text fed.

Given a regular expression of size m and a text of size n, regcomp is supposed
to run in O(m), regexec in O(mn) and regfree in O(m). Even if the
implementation chooses to adopt another strategy with different time
complexities, this is still a bug because the calls to regexec should be
independent. Accumulation of memory between calls is surely a bug.

I will attach an example program. The regex accepts any string with at least
499 characters such that the 498-th last one (the 0-th being the last one) is
'a'. This regex is compiled and is run against successive random strings of
length 1024. The cflags is REG_NOSUB and the eflags is 0. The regular
expression is prefixed with '^' and suffixed with '$'.
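
The attachment itself is not reproduced in this thread. A minimal sketch of
what the description implies might look like the following. It assumes the
pattern is written as "^.*a" followed by 498 dots and "$", and that the random
strings consist of lowercase letters; both are assumptions, and the helper
names build_pattern, random_text and run_rounds are hypothetical.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

enum { TAIL = 498, TEXT_LEN = 1024 };

/* Write "^.*a", then TAIL dots, then "$" into buf (needs TAIL + 6 bytes). */
static void build_pattern(char *buf)
{
    char *p = buf;
    memcpy(p, "^.*a", 4);
    p += 4;
    memset(p, '.', TAIL);
    p += TAIL;
    *p++ = '$';
    *p = '\0';
}

/* Fill buf with TEXT_LEN pseudo-random lowercase letters (assumed alphabet).
 * rand() is left unseeded, so the sequence is the same on every run. */
static void random_text(char *buf)
{
    for (int i = 0; i < TEXT_LEN; i++)
        buf[i] = (char)('a' + rand() % 26);
    buf[TEXT_LEN] = '\0';
}

/* Compile once with REG_NOSUB, run regexec (eflags 0) on `rounds` different
 * strings, then release the pattern. Returns the number of matches, or -1
 * if regcomp fails. */
static int run_rounds(int rounds)
{
    char pattern[TAIL + 6];
    char text[TEXT_LEN + 1];
    regex_t re;
    int matches = 0;

    build_pattern(pattern);
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        random_text(text);
        if (regexec(&re, text, 0, NULL, 0) == 0)
            matches++;
    }
    regfree(&re);   /* memory accumulated by regexec is released here */
    return matches;
}
```

Calling run_rounds with a large count from main and watching the resident
size of the process is one way to observe the growth reported here; whatever
regexec accumulates stays attached to the regex_t until regfree runs.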

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug regex/12567] regexec leaks mem when used multiple times
  2011-03-11  5:13 [Bug regex/12567] New: regexec leaks mem when used multiple times vapier at gentoo dot org
@ 2011-03-11  5:13 ` vapier at gentoo dot org
  2011-03-11  6:55 ` jakub at redhat dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vapier at gentoo dot org @ 2011-03-11  5:13 UTC (permalink / raw)
  To: glibc-bugs-regex

http://sourceware.org/bugzilla/show_bug.cgi?id=12567

Mike Frysinger <vapier at gentoo dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |toolchain at gentoo dot org
           See Also|                            |http://bugs.gentoo.org/show
                   |                            |_bug.cgi?id=347235


* [Bug regex/12567] regexec leaks mem when used multiple times
  2011-03-11  5:13 [Bug regex/12567] New: regexec leaks mem when used multiple times vapier at gentoo dot org
  2011-03-11  5:13 ` [Bug regex/12567] " vapier at gentoo dot org
@ 2011-03-11  6:55 ` jakub at redhat dot com
  2011-03-11  7:55 ` [Bug regex/12567] regexec sucks up mem when used multiple times with different strings vapier at gentoo dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at redhat dot com @ 2011-03-11  6:55 UTC (permalink / raw)
  To: glibc-bugs-regex

http://sourceware.org/bugzilla/show_bug.cgi?id=12567

Jakub Jelinek <jakub at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |jakub at redhat dot com
         Resolution|                            |INVALID

--- Comment #1 from Jakub Jelinek <jakub at redhat dot com> 2011-03-11 06:54:54 UTC ---
You clearly don't understand what a memory leak is.  Just call regfree at the
end of the testcase and you'll see that no memory has been leaked.
The glibc regex implementation is a DFA, which creates needed nodes on the fly.
If you always search the same string, no new nodes need to be created after the
first regexec, but if you always search different strings, new nodes may be
needed.  All the allocated memory is tracked and freed upon regfree
though.
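
To illustrate the on-the-fly construction described in this comment, here is
a toy model. It is not glibc's code: it matches "the K-th character from the
end is 'a'" with K = 8 instead of 498 so that a set of NFA states fits in a
32-bit mask, and it only records which state sets have been seen (a real
matcher would also cache transitions). The type lazy_dfa and the functions
dfa_init, dfa_free, dfa_intern and dfa_exec are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy pattern: the K-th character from the end (0-th = last) is 'a'. */
enum { K = 8 };

typedef struct {
    uint32_t *states;   /* distinct state sets seen so far, as bitmasks */
    size_t count, cap;
} lazy_dfa;

static void dfa_init(lazy_dfa *d)
{
    d->states = NULL;
    d->count = d->cap = 0;
}

/* Releases every cached state in one place, analogous to regfree. */
static void dfa_free(lazy_dfa *d)
{
    free(d->states);
    dfa_init(d);
}

/* Record a state set if it is new. Returns 0 on success, -1 if out of memory. */
static int dfa_intern(lazy_dfa *d, uint32_t set)
{
    for (size_t i = 0; i < d->count; i++)
        if (d->states[i] == set)
            return 0;               /* already known: nothing allocated */
    if (d->count == d->cap) {
        size_t ncap = d->cap ? d->cap * 2 : 16;
        uint32_t *p = realloc(d->states, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        d->states = p;
        d->cap = ncap;
    }
    d->states[d->count++] = set;
    return 0;
}

/* Returns 1 on match, 0 on no match, -1 on allocation failure. */
static int dfa_exec(lazy_dfa *d, const char *text)
{
    /* Bit 0: still scanning. Bit i (1..K+1): an 'a' was read i-1
     * characters before the current end of input. */
    uint32_t set = 1u;
    for (; *text != '\0'; text++) {
        uint32_t next = 1u | ((set & ~1u) << 1);
        if (*text == 'a')
            next |= 2u;
        next &= (1u << (K + 2)) - 1;    /* drop positions older than K */
        set = next;
        if (dfa_intern(d, set) != 0)
            return -1;
    }
    return (int)((set >> (K + 1)) & 1u);
}
```

In this model, running dfa_exec twice on the same text leaves count
unchanged, a different text can add new entries, and dfa_free drops all of
them at once. Up to 2^(K+1) distinct sets are possible, which suggests why
random inputs keep adding states when the tail is as long as 498.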


* [Bug regex/12567] regexec sucks up mem when used multiple times with different strings
  2011-03-11  5:13 [Bug regex/12567] New: regexec leaks mem when used multiple times vapier at gentoo dot org
  2011-03-11  5:13 ` [Bug regex/12567] " vapier at gentoo dot org
  2011-03-11  6:55 ` jakub at redhat dot com
@ 2011-03-11  7:55 ` vapier at gentoo dot org
  2011-03-11  8:04 ` jakub at redhat dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vapier at gentoo dot org @ 2011-03-11  7:55 UTC (permalink / raw)
  To: glibc-bugs-regex

http://sourceware.org/bugzilla/show_bug.cgi?id=12567

Mike Frysinger <vapier at gentoo dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |
            Summary|regexec leaks mem when used |regexec sucks up mem when
                   |multiple times              |used multiple times with
                   |                            |different strings

--- Comment #2 from Mike Frysinger <vapier at gentoo dot org> 2011-03-11 07:55:05 UTC ---
being pedantic doesn't make the poor behavior go away.  clearly the behavior is
highly undesirable.


* [Bug regex/12567] regexec sucks up mem when used multiple times with different strings
  2011-03-11  5:13 [Bug regex/12567] New: regexec leaks mem when used multiple times vapier at gentoo dot org
                   ` (2 preceding siblings ...)
  2011-03-11  7:55 ` [Bug regex/12567] regexec sucks up mem when used multiple times with different strings vapier at gentoo dot org
@ 2011-03-11  8:04 ` jakub at redhat dot com
  2011-03-11  8:10 ` bonzini at gnu dot org
  2014-06-27 13:45 ` fweimer at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at redhat dot com @ 2011-03-11  8:04 UTC (permalink / raw)
  To: glibc-bugs-regex

http://sourceware.org/bugzilla/show_bug.cgi?id=12567

Jakub Jelinek <jakub at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |INVALID

--- Comment #3 from Jakub Jelinek <jakub at redhat dot com> 2011-03-11 08:04:15 UTC ---
Except that it is not poor behavior, but a conscious implementation decision
made for performance reasons.


* [Bug regex/12567] regexec sucks up mem when used multiple times with different strings
  2011-03-11  5:13 [Bug regex/12567] New: regexec leaks mem when used multiple times vapier at gentoo dot org
                   ` (3 preceding siblings ...)
  2011-03-11  8:04 ` jakub at redhat dot com
@ 2011-03-11  8:10 ` bonzini at gnu dot org
  2014-06-27 13:45 ` fweimer at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: bonzini at gnu dot org @ 2011-03-11  8:10 UTC (permalink / raw)
  To: glibc-bugs-regex

http://sourceware.org/bugzilla/show_bug.cgi?id=12567

Paolo Bonzini <bonzini at gnu dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org

--- Comment #4 from Paolo Bonzini <bonzini at gnu dot org> 2011-03-11 08:09:58 UTC ---
Especially considering that this behavior occurs because you are using _random_
inputs.  grep would show the same "problem" as glibc regexec.  Please do not
reopen.


* [Bug regex/12567] regexec sucks up mem when used multiple times with different strings
  2011-03-11  5:13 [Bug regex/12567] New: regexec leaks mem when used multiple times vapier at gentoo dot org
                   ` (4 preceding siblings ...)
  2011-03-11  8:10 ` bonzini at gnu dot org
@ 2014-06-27 13:45 ` fweimer at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2014-06-27 13:45 UTC (permalink / raw)
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=12567

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-

