public inbox for glibc-bugs-regex@sourceware.org
* [Bug regex/11159] New: lock contention within regexec() when used from multiple threads
From: extproxy at gmail dot com @ 2010-01-11 9:40 UTC (permalink / raw)
To: glibc-bugs-regex
I have a program that uses multiple threads. Each thread makes heavy use of
regular expression matches by calling the glibc regexec() function.
Unfortunately, this function seems to acquire a global lock - which causes poor
performance in a multi-threaded environment.
I'm not even sure what regexec() needs to lock - it really doesn't need access
to any global state. Maybe it accesses some global locale object or something.
Anyway, it should not need to acquire a write lock; a read lock should
suffice. Alternatively, a thread-local data structure could be considered.
Hope future releases of glibc can address this performance bug.
--
Summary: lock contention within regexec() when used from multiple threads
Product: glibc
Version: 2.10
Status: NEW
Severity: normal
Priority: P2
Component: regex
AssignedTo: drepper at redhat dot com
ReportedBy: extproxy at gmail dot com
CC: glibc-bugs-regex at sources dot redhat dot com, glibc-bugs at sources dot redhat dot com
http://sourceware.org/bugzilla/show_bug.cgi?id=11159
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
* [Bug regex/11159] lock contention within regexec() when used from multiple threads
From: schwab at linux-m68k dot org @ 2010-01-11 9:58 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From schwab at linux-m68k dot org 2010-01-11 09:58 -------
Use a separate regex_t object in each thread.
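For illustration, the suggestion above might look like the following minimal sketch. It assumes POSIX threads; `struct job`, `worker`, `run_jobs`, and `MAX_THREADS` are hypothetical names chosen for this example and do not come from glibc or from the report.

```c
#include <pthread.h>
#include <regex.h>
#include <stddef.h>

enum { MAX_THREADS = 16 };        /* arbitrary cap for this sketch */

/* Hypothetical per-thread job description. */
struct job {
    const char *pattern;          /* same pattern text for every thread */
    const char *const *lines;     /* inputs this thread scans */
    size_t nlines;
    size_t matches;               /* out: number of matching lines */
    int failed;                   /* out: nonzero if regcomp() failed */
};

static void *worker(void *arg)
{
    struct job *job = arg;
    regex_t re;                   /* private to this thread, never shared */

    job->matches = 0;
    job->failed = 0;
    if (regcomp(&re, job->pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        job->failed = 1;
        return NULL;
    }
    for (size_t i = 0; i < job->nlines; i++)
        if (regexec(&re, job->lines[i], 0, NULL, 0) == 0)
            job->matches++;
    regfree(&re);
    return NULL;
}

/* Starts one thread per job and waits for all of them; returns 0 on success. */
int run_jobs(struct job *jobs, size_t njobs)
{
    pthread_t tids[MAX_THREADS];
    size_t started = 0;
    int rc = 0;

    if (njobs > MAX_THREADS)
        return -1;
    for (; started < njobs; started++) {
        if (pthread_create(&tids[started], NULL, worker, &jobs[started]) != 0) {
            rc = -1;
            break;
        }
    }
    for (size_t i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    for (size_t i = 0; rc == 0 && i < njobs; i++)
        if (jobs[i].failed)
            rc = -1;
    return rc;
}
```

Each thread pays for its own regcomp() call and its own compiled state, which is the cost of this workaround.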
--
      What|Removed                     |Added
----------------------------------------------------------------------------
    Status|NEW                         |RESOLVED
Resolution|                            |INVALID
* [Bug regex/11159] lock contention within regexec() when used from multiple threads
From: extproxy at gmail dot com @ 2010-01-11 17:46 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From extproxy at gmail dot com 2010-01-11 17:45 -------
> Use a separate regex_t object in each thread.
Why is that? The regexec() interface takes a 'const regex_t *' object. This
implies multiple threads can use the same object.
In my program, all threads work with the same regular expression. So why use a
different regex_t object?
At the very least, the regexec() documentation needs to clarify this performance
limitation. I'm re-opening this bug; if you still don't agree that this should be
fixed, please retitle it as a documentation bug.
--
      What|Removed                     |Added
----------------------------------------------------------------------------
    Status|RESOLVED                    |REOPENED
Resolution|INVALID                     |
* [Bug regex/11159] lock contention within regexec() when used from multiple threads
From: bonzini at gnu dot org @ 2010-01-11 17:55 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2010-01-11 17:55 -------
The fact that it is "const" does not mean that no internal data structures are
modified (and this needs locking). C++ even has a "mutable" keyword for this.
glibc does the locking per-regex_t.
Could be a doc bug, leaving this decision to the glibc maintainers.
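To make the point about "const" concrete, here is a minimal sketch of a C function that takes a const pointer and still updates lock-protected internal state. `struct matcher`, `struct matcher_state`, `matcher_init`, and `matcher_exec` are hypothetical names for illustration; this is not glibc's regex_t layout or its implementation.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical internal state; it stands in for lazily built matcher data. */
struct matcher_state {
    pthread_mutex_t lock;
    unsigned long calls;
};

/* Hypothetical public object, analogous in spirit to a compiled pattern. */
struct matcher {
    const char *needle;
    struct matcher_state *state;  /* pointee stays writable via a const matcher */
};

int matcher_init(struct matcher *m, const char *needle,
                 struct matcher_state *st)
{
    m->needle = needle;
    m->state = st;
    st->calls = 0;
    return pthread_mutex_init(&st->lock, NULL);
}

/* The parameter is const, yet the call mutates shared state under a lock. */
int matcher_exec(const struct matcher *m, const char *haystack)
{
    int found;

    pthread_mutex_lock(&m->state->lock);
    m->state->calls++;            /* allowed: only the pointer itself is const */
    found = strstr(haystack, m->needle) != NULL;
    pthread_mutex_unlock(&m->state->lock);
    return found;
}
```

Because the lock lives in the per-object state, two threads calling `matcher_exec` on the same object serialize, while threads using different objects do not.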
* [Bug regex/11159] lock contention within regexec() when used from multiple threads
From: extproxy at gmail dot com @ 2010-01-11 18:01 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From extproxy at gmail dot com 2010-01-11 18:01 -------
Out of curiosity, what exactly is regex_t locking?
* [Bug regex/11159] lock contention within regexec() when used from multiple threads
From: bonzini at gnu dot org @ 2010-01-11 18:03 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2010-01-11 18:03 -------
regexec converts the NFA to DFA on demand, so the DFA representation is locked
(and some more stuff too, but TLS could indeed be used for that because it is
per-match data; DFA states are persistent).
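As a rough illustration of on-demand NFA-to-DFA conversion with a persistent, locked state cache, here is a toy sketch. It handles only unanchored search for a literal of at most 31 bytes, and every name in it (`struct lazy_dfa`, `lazy_dfa_init`, `lazy_dfa_search`, `build_transition`) is hypothetical; glibc's actual data structures are different and far more general.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_DFA_STATES = 64, UNBUILT = -1 };

struct lazy_dfa {
    const char *needle;
    size_t len;                     /* needle length, at most 31 */
    pthread_mutex_t lock;           /* guards the persistent state cache */
    int nstates;
    uint32_t sets[MAX_DFA_STATES];  /* NFA state set behind each DFA state */
    int next[MAX_DFA_STATES][256];  /* cached transitions, UNBUILT if unknown */
};

int lazy_dfa_init(struct lazy_dfa *d, const char *needle)
{
    d->len = strlen(needle);
    if (d->len > 31)
        return -1;
    d->needle = needle;
    d->nstates = 1;
    d->sets[0] = 1u;                /* only NFA state 0 is active at the start */
    for (int c = 0; c < 256; c++)
        d->next[0][c] = UNBUILT;
    return pthread_mutex_init(&d->lock, NULL);
}

/* Computes the DFA state reached from `from` on byte `c`; caller holds lock. */
static int build_transition(struct lazy_dfa *d, int from, unsigned char c)
{
    uint32_t set = 1u;              /* NFA state 0 loops on every byte */
    int to = 0;

    for (size_t i = 0; i < d->len; i++)
        if (((d->sets[from] >> i) & 1u) && (unsigned char)d->needle[i] == c)
            set |= 1u << (i + 1);
    while (to < d->nstates && d->sets[to] != set)
        to++;
    if (to == d->nstates) {         /* first visit: materialize a new state */
        if (d->nstates == MAX_DFA_STATES)
            return -1;
        d->sets[to] = set;
        for (int b = 0; b < 256; b++)
            d->next[to][b] = UNBUILT;
        d->nstates++;
    }
    d->next[from][c] = to;          /* persists for later searches */
    return to;
}

/* Returns 1 if the needle occurs in text, 0 if not, -1 if the cache is full. */
int lazy_dfa_search(struct lazy_dfa *d, const char *text)
{
    int state = 0;
    int found;

    pthread_mutex_lock(&d->lock);
    found = (int)((d->sets[0] >> d->len) & 1u);
    for (const unsigned char *p = (const unsigned char *)text;
         found == 0 && *p != '\0'; p++) {
        int to = d->next[state][*p];
        if (to == UNBUILT)
            to = build_transition(d, state, *p);
        if (to < 0) {
            found = -1;
            break;
        }
        state = to;
        found = (int)((d->sets[state] >> d->len) & 1u);
    }
    pthread_mutex_unlock(&d->lock);
    return found;
}
```

The first search over a given input builds DFA states as it goes; later searches on the same object reuse them, which is why the cache is shared state that needs the lock.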
* [Bug regex/11159] lock contention within regexec() when used from multiple threads
From: drepper at redhat dot com @ 2010-01-15 7:51 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From drepper at redhat dot com 2010-01-15 07:50 -------
No, you cannot use TLS. The semantics are that when a regex_t is used in one
thread after another, the side effects are carried over.
If you know this isn't needed, use separate regex_t objects.
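On the application side, one way to get a separate regex_t per thread without passing it around is a thread-local object compiled on first use. This is only a sketch: it assumes C11 `_Thread_local` support, and `match_error_line`, `tls_re`, `tls_ready`, and the pattern are hypothetical names picked for the example.

```c
#include <regex.h>
#include <stddef.h>

/* One compiled pattern per thread, built lazily on that thread's first call. */
static _Thread_local regex_t tls_re;
static _Thread_local int tls_ready;

/* Returns 1 on match, 0 on no match, -1 if the pattern failed to compile. */
int match_error_line(const char *line)
{
    if (!tls_ready) {
        if (regcomp(&tls_re, "^error:", REG_EXTENDED | REG_NOSUB) != 0)
            return -1;
        tls_ready = 1;
    }
    return regexec(&tls_re, line, 0, NULL, 0) == 0;
}
```

The sketch never calls regfree(), so each thread's compiled pattern is released only when the process exits; a real program would want a cleanup hook for threads that terminate early.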
--
      What|Removed                     |Added
----------------------------------------------------------------------------
    Status|REOPENED                    |RESOLVED
Resolution|                            |WONTFIX
* [Bug regex/11159] lock contention within regexec() when used from multiple threads
From: fweimer at redhat dot com @ 2014-06-30 20:24 UTC (permalink / raw)
To: glibc-bugs-regex
https://sourceware.org/bugzilla/show_bug.cgi?id=11159
Florian Weimer <fweimer at redhat dot com> changed:
 What|Removed                     |Added
----------------------------------------------------------------------------
Flags|                            |security-