public inbox for glibc-bugs-regex@sourceware.org
* [Bug regex/934] New: segfault in regexec
@ 2005-05-06  6:06 zachmann at schlund dot de
  2005-05-06  7:37 ` [Bug regex/934] " paolo dot bonzini at lu dot unisi dot ch
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: zachmann at schlund dot de @ 2005-05-06  6:06 UTC (permalink / raw)
  To: glibc-bugs-regex

During the development of a multithreaded application on a multiprocessor
machine I found a segfault in regexec. I could not reproduce this crash on a
single-processor machine. Attached is a small test program that crashes
in about 30 of 100 runs.
Here is a backtrace of a crash: 
 
Program terminated with signal 11, Segmentation fault. 
#0  0x0805eb1e in re_acquire_state_context () 
(gdb) bt 
#0  0x0805eb1e in re_acquire_state_context () 
#1  0x08061e75 in build_trtable () 
#2  0x0806387e in re_search_internal () 
#3  0x08063c51 in regexec () 
#4  0x080482a2 in run () 
#5  0x08048c21 in pthread_start_thread () 
 
glibc 2.3.4 
gcc 3.3.5 
 
The program is run on a dual Intel(R) Xeon(TM) CPU 2.40GHz with hyper-threading
enabled.

If you need more information, please let me know.
 
regextest.c 
-------------------------------------- 
#include <sys/types.h>
#include <pthread.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
 
regex_t * regex; 
 
/* Thread body: every thread matches against the same shared regex_t. */
void *run( void * param )
{
  int i = 0;
  (void)param;
  for ( ; i < 1000; ++i )
  {
    size_t nmatch = 1;
    regmatch_t pmatch[nmatch];
    regexec( regex, "this can cause a segfault on multi processor machines",
             nmatch, pmatch, 0 );
  }
  return NULL;
}
 
int main() 
{ 
  int not = 4; 
  int i = 0; 
  int ret = 0; 
  char *exp = "a(aaaaaa|bbb(bbbb|ccc)?cccc(cccccccc)?\\.dd)d|" 
              "eeeeeee|f(gggggggggggggg|hhhhhhhh([0-9](\\.[0-9])?))|" 
              "i(jjjjj(/[0-9](\\.[0-9])?)?|kkkkkkkk)|" 
              "l(mmmmmmmmmmmmmmmmmmmmm|nnnnnn)|oooooooooooo\\.ooo|" 
              "ppppppppppp|qqqq[/ ]?1\\.[0-9]|rrrrrrrrr/[0-9](\\.[0-9])?|" 
              "ssss|M(tttt|uuu)|N(uuuuuuuu?/[1-9](\\.[0-9])?|aaaa)|" 
              "bbbbb[ /]?[0-9](\\.[0-9])?|P(aaaaaaaa|b(c/[3-4]|ddddd))|" 
              "S(a(bbbbb|ccc)|dd|eee|fffff|gggggg|hhhhhhhhhhh)|" 
              "wwwwwwwwwwwwwww|x(aaaaaaaa|bbb)"; 
  pthread_t pthread_[not]; 
 
  regex = (regex_t *)malloc( sizeof( regex_t ) ); 
 
  ret = regcomp( regex, exp, REG_ICASE|REG_EXTENDED ); 
 
  if ( ret != 0 )
  {
    printf( "regcomp failed: %d\n", ret );
    exit( 1 );  /* the regex is unusable if compilation failed */
  }
 
  for ( i = 0; i < not; ++i ) 
  { 
    int error = pthread_create( &pthread_[ i ], NULL, &run, 0 ); 
    if ( error != 0 )
    {
      printf( "unable to create thread: %d\n", error );
      exit( 1 );
    }
  } 
 
  for ( i = 0; i < not; ++i )
  {
    pthread_join( pthread_[i], NULL );
  }

  regfree( regex );
  free( regex );
  return 0;
}

-- 
           Summary: segfault in regexec
           Product: glibc
           Version: 2.3.4
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
        AssignedTo: gotom at debian dot or dot jp
        ReportedBy: zachmann at schlund dot de
                CC: glibc-bugs-regex at sources dot redhat dot com,
                    glibc-bugs at sources dot redhat dot com


http://sources.redhat.com/bugzilla/show_bug.cgi?id=934

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.



* [Bug regex/934] segfault in regexec
  2005-05-06  6:06 [Bug regex/934] New: segfault in regexec zachmann at schlund dot de
@ 2005-05-06  7:37 ` paolo dot bonzini at lu dot unisi dot ch
  2005-05-06  8:13 ` zachmann at schlund dot de
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: paolo dot bonzini at lu dot unisi dot ch @ 2005-05-06  7:37 UTC (permalink / raw)
  To: glibc-bugs-regex


------- Additional Comments From paolo dot bonzini at lu dot unisi dot ch  2005-05-06 07:37 -------
Subject: Re:  New: segfault in regexec

zachmann at schlund dot de wrote:

>During the development of a multi threaded application on a multi processor 
>machine I found a segfault in regexec. I could not reproduce this crash on a 
>single processor machine.
>
Probably, trying with a long running regex would make the crash almost 
100% reproducible on both single and multi-processor machines.  I'd try 
with ^(.)?(.?)(.?)(.?)(.?)\5\4\3\2\1$ for example.

>Program terminated with signal 11, Segmentation fault. 
>#0  0x0805eb1e in re_acquire_state_context () 
>
regexec is not reentrant, because it includes a cache of DFA states in 
regex_t.
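
Until the library itself guards that cache, a caller can serialize access
to a shared regex_t. What follows is only a minimal sketch of that
workaround, not the glibc fix: struct locked_regex and the locked_*
functions are hypothetical names made up for illustration, and only
regcomp, regexec, regfree and the pthread_mutex_* calls are real APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: one compiled regex plus the mutex that guards it. */
struct locked_regex {
    regex_t re;
    pthread_mutex_t lock;
};

/* Compile `pattern` and set up the mutex. Returns the regcomp() result,
   or REG_ESPACE if the mutex could not be initialised. */
int locked_regcomp(struct locked_regex *lr, const char *pattern, int cflags)
{
    assert(lr != NULL && pattern != NULL);
    int ret = regcomp(&lr->re, pattern, cflags);
    if (ret == 0 && pthread_mutex_init(&lr->lock, NULL) != 0) {
        regfree(&lr->re);
        ret = REG_ESPACE;
    }
    return ret;
}

/* Serialize every match so only one thread at a time touches the regex_t. */
int locked_regexec(struct locked_regex *lr, const char *text,
                   size_t nmatch, regmatch_t pmatch[], int eflags)
{
    pthread_mutex_lock(&lr->lock);
    int ret = regexec(&lr->re, text, nmatch, pmatch, eflags);
    pthread_mutex_unlock(&lr->lock);
    return ret;
}

/* Release the mutex and the compiled regex. */
void locked_regfree(struct locked_regex *lr)
{
    pthread_mutex_destroy(&lr->lock);
    regfree(&lr->re);
}
```

In the test program, run() would call locked_regexec instead of regexec.
Giving each thread its own regcomp'd regex_t is another option that avoids
the shared cache entirely, at the cost of compiling the pattern once per
thread.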

Paolo



* [Bug regex/934] segfault in regexec
  2005-05-06  6:06 [Bug regex/934] New: segfault in regexec zachmann at schlund dot de
  2005-05-06  7:37 ` [Bug regex/934] " paolo dot bonzini at lu dot unisi dot ch
@ 2005-05-06  8:13 ` zachmann at schlund dot de
  2005-05-06  8:19 ` paolo dot bonzini at lu dot unisi dot ch
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: zachmann at schlund dot de @ 2005-05-06  8:13 UTC (permalink / raw)
  To: glibc-bugs-regex


------- Additional Comments From zachmann at schlund dot de  2005-05-06 08:11 -------
But if I'm not mistaken the IEEE Std 1003.1, 2004 Edition states:  
  
"The interface is defined so that the matched substrings rm_sp and rm_ep are  
in a separate regmatch_t structure instead of in regex_t. This allows a single  
compiled RE to be used simultaneously in several contexts; in main() and a  
signal handler, perhaps, or in multiple threads of lightweight processes. (The  
preg argument to regexec() is declared with type const, so the implementation  
is not permitted to use the structure to store intermediate results.)"  
  
from:  
http://www.opengroup.org/onlinepubs/000095399/functions/regcomp.html  
  
So there should be no internal state in regex_t, or am I wrong here?
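
The usage the quoted text describes can be sketched as below: the compiled
expression is passed through a pointer to const, and the match offsets are
written to storage owned by the caller. The helper name first_match is
hypothetical. Note that the regmatch_t members declared in <regex.h> are
named rm_so and rm_eo.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: find the first match of `re` in `text`.
   The regex is only passed on as a const pointer; the offsets go into
   storage supplied by the caller, so nothing here is shared by design. */
int first_match(const regex_t *re, const char *text,
                regoff_t *start, regoff_t *end)
{
    assert(re != NULL && text != NULL && start != NULL && end != NULL);
    regmatch_t pmatch[1];   /* per-call result storage */
    int ret = regexec(re, text, 1, pmatch, 0);
    if (ret == 0) {
        *start = pmatch[0].rm_so;   /* offset of first matched byte */
        *end = pmatch[0].rm_eo;     /* offset just past the match */
    }
    return ret;
}
```

Whether concurrent calls on the same regex_t are actually safe still
depends on the implementation, which is the subject of this report.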
 
 (In reply to comment #1) 
> Probably, trying with a long running regex would make the crash almost  
> 100% reproducible on both single and multi-processor machines.  I'd try  
> with ^(.)?(.?)(.?)(.?)(.?)\5\4\3\2\1$ for example. 
  
With this regex it crashes in only 5 out of 100 runs.
  


* [Bug regex/934] segfault in regexec
  2005-05-06  6:06 [Bug regex/934] New: segfault in regexec zachmann at schlund dot de
  2005-05-06  7:37 ` [Bug regex/934] " paolo dot bonzini at lu dot unisi dot ch
  2005-05-06  8:13 ` zachmann at schlund dot de
@ 2005-05-06  8:19 ` paolo dot bonzini at lu dot unisi dot ch
  2005-05-06  8:26 ` jakub at redhat dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: paolo dot bonzini at lu dot unisi dot ch @ 2005-05-06  8:19 UTC (permalink / raw)
  To: glibc-bugs-regex


------- Additional Comments From paolo dot bonzini at lu dot unisi dot ch  2005-05-06 08:19 -------
Subject: Re:  segfault in regexec


>So there should be no internal states in regex_t or I'm wrong here?
>  
>
Hm, re_acquire_state and other functions that *create* these states 
should be guarded.

>>Probably, trying with a long running regex would make the crash almost  
>>100% reproducible on both single and multi-processor machines.  I'd try  
>>with ^(.)?(.?)(.?)(.?)(.?)\5\4\3\2\1$ for example. 
>>    
>>
>with this regex it only crashes only in 5 out of 100 runs.
>  
>
Oh right, because this one runs longer but not in re_acquire_state_context.

Paolo



* [Bug regex/934] segfault in regexec
  2005-05-06  6:06 [Bug regex/934] New: segfault in regexec zachmann at schlund dot de
                   ` (2 preceding siblings ...)
  2005-05-06  8:19 ` paolo dot bonzini at lu dot unisi dot ch
@ 2005-05-06  8:26 ` jakub at redhat dot com
  2005-07-18  2:51 ` cvs-commit at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at redhat dot com @ 2005-05-06  8:26 UTC (permalink / raw)
  To: glibc-bugs-regex


------- Additional Comments From jakub at redhat dot com  2005-05-06 08:26 -------
Or the whole of re_search_internal.  It depends on how common it is to use
the same regex_t simultaneously.  I think it is uncommon, and therefore
locking and unlocking just once per search should be faster than finding
every place that touches anything reachable from regex_t and adding locks
to all of those places.
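
The trade-off can be illustrated with a toy model. This is a sketch under
stated assumptions, not glibc code: struct toy_pattern and the toy_*
functions are hypothetical, and the lazily filled table merely stands in
for the DFA state cache. The public entry point takes the lock once for
the whole search, so the internal helper that mutates the cache needs no
locking of its own.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define CACHE_SLOTS 256

/* Toy stand-in for a compiled pattern with a lazily filled cache.
   All names are hypothetical; this is not glibc's re_dfa_t. */
struct toy_pattern {
    pthread_mutex_t lock;               /* one lock for the whole object */
    int cached[CACHE_SLOTS];            /* lazily computed per-byte values */
    unsigned char filled[CACHE_SLOTS];  /* which slots are valid */
    size_t fills;                       /* how many slots were computed */
};

void toy_init(struct toy_pattern *p)
{
    assert(p != NULL);
    pthread_mutex_init(&p->lock, NULL);
    for (size_t i = 0; i < CACHE_SLOTS; ++i)
        p->filled[i] = 0;
    p->fills = 0;
}

/* Internal helper: mutates the cache, so the caller must hold p->lock. */
static int toy_lookup(struct toy_pattern *p, unsigned char c)
{
    if (!p->filled[c]) {
        p->cached[c] = (c >= '0' && c <= '9');  /* stands in for real work */
        p->filled[c] = 1;
        p->fills++;
    }
    return p->cached[c];
}

/* Public entry point: lock once, run the whole search, unlock once. */
size_t toy_count_digits(struct toy_pattern *p, const char *text)
{
    size_t n = 0;
    pthread_mutex_lock(&p->lock);
    for (; *text != '\0'; ++text)
        n += (size_t)toy_lookup(p, (unsigned char)*text);
    pthread_mutex_unlock(&p->lock);
    return n;
}
```

The cost of this design is that two threads searching with the same object
run one after the other; that is acceptable if sharing is as uncommon as
assumed above.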


* [Bug regex/934] segfault in regexec
  2005-05-06  6:06 [Bug regex/934] New: segfault in regexec zachmann at schlund dot de
                   ` (3 preceding siblings ...)
  2005-05-06  8:26 ` jakub at redhat dot com
@ 2005-07-18  2:51 ` cvs-commit at gcc dot gnu dot org
  2005-07-18  2:52 ` roland at gnu dot org
  2005-07-19  3:31 ` roland at gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2005-07-18  2:51 UTC (permalink / raw)
  To: glibc-bugs-regex


------- Additional Comments From cvs-commit at gcc dot gnu dot org  2005-07-18 02:51 -------
Subject: Bug 934

CVSROOT:	/cvs/glibc
Module name:	libc
Branch: 	glibc-2_3-branch
Changes by:	roland@sources.redhat.com	2005-07-18 02:51:43

Modified files:
	posix          : regexec.c regcomp.c regex_internal.h 

Log message:
	2005-05-06  Jakub Jelinek  <jakub@redhat.com>
	
	[BZ #934]
	* posix/regex_internal.h: Include bits/libc-lock.h or define dummy
	__libc_lock_* macros if not _LIBC.
	(struct re_dfa_t): Add lock.
	* posix/regcomp.c (re_compile_internal): Add __libc_lock_init.
	* posix/regexec.c (regexec, re_search_stub): Add locking.

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/posix/regexec.c.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.74&r2=1.74.2.1
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/posix/regcomp.c.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.87.2.1&r2=1.87.2.2
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/posix/regex_internal.h.diff?cvsroot=glibc&only_with_tag=glibc-2_3-branch&r1=1.56.2.1&r2=1.56.2.2
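
The shape of the change described in the log message (a lock member in the
DFA structure, initialised at compile time, taken in the exec entry points,
with dummy macros when not building inside libc) might look roughly like
the sketch below. It is an illustration only: the TOY_* macros, the
TOY_USE_PTHREADS switch and struct toy_dfa are hypothetical stand-ins, and
glibc's internal __libc_lock_* macros are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build switch and macro names, for illustration only. */
#ifdef TOY_USE_PTHREADS
# include <pthread.h>
# define TOY_LOCK_DEFINE(name)  pthread_mutex_t name;
# define TOY_LOCK_INIT(name)    pthread_mutex_init(&(name), NULL)
# define TOY_LOCK_LOCK(name)    pthread_mutex_lock(&(name))
# define TOY_LOCK_UNLOCK(name)  pthread_mutex_unlock(&(name))
#else
/* Build without threads: the same call sites compile to nothing. */
# define TOY_LOCK_DEFINE(name)
# define TOY_LOCK_INIT(name)    ((void) 0)
# define TOY_LOCK_LOCK(name)    ((void) 0)
# define TOY_LOCK_UNLOCK(name)  ((void) 0)
#endif

struct toy_dfa {
    int searches;               /* shared mutable state */
    TOY_LOCK_DEFINE(lock)       /* expands to a member or to nothing */
};

/* "Compile" step: set up the state and initialise the lock once. */
void toy_compile(struct toy_dfa *dfa)
{
    assert(dfa != NULL);
    dfa->searches = 0;
    TOY_LOCK_INIT(dfa->lock);
}

/* "Exec" step: hold the lock for the duration of one search. */
int toy_exec(struct toy_dfa *dfa)
{
    TOY_LOCK_LOCK(dfa->lock);
    int n = ++dfa->searches;
    TOY_LOCK_UNLOCK(dfa->lock);
    return n;
}
```

Keeping the lock behind macros lets the same source build both inside a
threaded library and as a standalone copy where no locking is available.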




* [Bug regex/934] segfault in regexec
  2005-05-06  6:06 [Bug regex/934] New: segfault in regexec zachmann at schlund dot de
                   ` (4 preceding siblings ...)
  2005-07-18  2:51 ` cvs-commit at gcc dot gnu dot org
@ 2005-07-18  2:52 ` roland at gnu dot org
  2005-07-19  3:31 ` roland at gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: roland at gnu dot org @ 2005-07-18  2:52 UTC (permalink / raw)
  To: glibc-bugs-regex



-- 
                    What|Removed                     |Added
----------------------------------------------------------------------------
OtherBugsDependingOnThis|                            |852



* [Bug regex/934] segfault in regexec
  2005-05-06  6:06 [Bug regex/934] New: segfault in regexec zachmann at schlund dot de
                   ` (5 preceding siblings ...)
  2005-07-18  2:52 ` roland at gnu dot org
@ 2005-07-19  3:31 ` roland at gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: roland at gnu dot org @ 2005-07-19  3:31 UTC (permalink / raw)
  To: glibc-bugs-regex


------- Additional Comments From roland at gnu dot org  2005-07-19 03:30 -------
This fix is now in the 2.3 branch as well as the trunk, and the problem should
be resolved as of the 2.3.6 release.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED



