[Bug libc/4737] fork is not async-signal-safe

public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
@ 2011-10-13 17:53 ` llucax at gmail dot com
  2011-10-13 17:54 ` llucax at gmail dot com
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: llucax at gmail dot com @ 2011-10-13 17:53 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

--- Comment #20 from Leandro Lucarella <llucax at gmail dot com> 2011-10-13 17:52:35 UTC ---
Created attachment 5987
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5987
Test case for what could be a related problem

I'm having a problem with a rare interaction between fork() and malloc() that
looks like it could be related to this issue.

The test case attached is the most reduced test case I could came up with. This
is based on some trickery done to implement a concurrent garbage collector.

The program eventually hangs and this is the backtrace I get:

Thread 3 (Thread 0x7ffff7028700 (LWP 10504)):
#0  __lll_lock_wait_private () at
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97
#1  0x00007ffff78a837e in _L_lock_36 () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff78a067c in ptmalloc_lock_all () at arena.c:288
#3  0x00007ffff78d7635 in __libc_fork () at
../nptl/sysdeps/unix/sysv/linux/x86_64/../fork.c:95
#4  0x0000000000400c3b in do_fork ()
#5  0x00007ffff7bc4d8c in start_thread (arg=0x7ffff7028700) at
pthread_create.c:304
#6  0x00007ffff791004d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#7  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7ffff7829700 (LWP 10503)):
#0  0x00007ffff785e084 in do_sigsuspend (set=<value optimized out>) at
../sysdeps/unix/sysv/linux/sigsuspend.c:63
#1  __sigsuspend (set=<value optimized out>) at
../sysdeps/unix/sysv/linux/sigsuspend.c:78
#2  0x0000000000400bc1 in signal_handler ()
#3  <signal handler called>
#4  0x00007ffff78a222f in _int_free (av=0x7ffff7bb91c0, p=0x603240) at
malloc.c:4780
#5  0x00007ffff78a68e3 in __libc_free (mem=<value optimized out>) at
malloc.c:3738
#6  0x0000000000400d3e in do_malloc ()
#7  0x00007ffff7bc4d8c in start_thread (arg=0x7ffff7829700) at
pthread_create.c:304
#8  0x00007ffff791004d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#9  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7ffff7fd7720 (LWP 10500)):
#0  0x00007ffff7bc606d in pthread_join (threadid=140737345918720,
thread_return=0x0) at pthread_join.c:89
#1  0x0000000000400f82 in main ()


Is this related to the same lock that prevents fork() from being async-safe? If
this is the case, I think it won't be considered a bug, right?

TIA

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
  2011-10-13 17:53 ` [Bug libc/4737] fork is not async-signal-safe llucax at gmail dot com
@ 2011-10-13 17:54 ` llucax at gmail dot com
  2011-10-13 17:56 ` llucax at gmail dot com
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: llucax at gmail dot com @ 2011-10-13 17:54 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

Leandro Lucarella <llucax at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |llucax at gmail dot com

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
  2011-10-13 17:53 ` [Bug libc/4737] fork is not async-signal-safe llucax at gmail dot com
  2011-10-13 17:54 ` llucax at gmail dot com
@ 2011-10-13 17:56 ` llucax at gmail dot com
  2011-10-14 14:06 ` bugdal at aerifal dot cx
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: llucax at gmail dot com @ 2011-10-13 17:56 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

Leandro Lucarella <llucax at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #5987|text/x-csrc                 |text/plain
          mime type|                            |

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2011-10-13 17:56 ` llucax at gmail dot com
@ 2011-10-14 14:06 ` bugdal at aerifal dot cx
  2012-03-27 13:43 ` krebbel1 at de dot ibm.com
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: bugdal at aerifal dot cx @ 2011-10-14 14:06 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #21 from Rich Felker <bugdal at aerifal dot cx> 2011-10-14 14:05:00 UTC ---
It should be noted that this bug, even if irrelevant to future POSIX revisions,
can easily be fixed. Either recursive or error-checking mutexes will work, but
they must also be reentrant, i.e. the mutex must be in a consistent state with
the owner and lock count correct immediately after the atomic operation to take
the lock completes successfully. This is most easily achieved by putting the
owner id in the atomic lock field and using a zero-based lock count (where the
first lock operation does not increment the count and the last unlock does not
decrement it). The only additional cost is looking up the caller's thread id,
if this is not already performed in the existing locking code... but for some
archs that may be prohibitively costly...

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2011-10-14 14:06 ` bugdal at aerifal dot cx
@ 2012-03-27 13:43 ` krebbel1 at de dot ibm.com
  2012-04-11  7:58 ` aj at suse dot de
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: krebbel1 at de dot ibm.com @ 2012-03-27 13:43 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

Andreas Krebbel <krebbel1 at de dot ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |krebbel1 at de dot ibm.com

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2012-03-27 13:43 ` krebbel1 at de dot ibm.com
@ 2012-04-11  7:58 ` aj at suse dot de
  2014-02-16 19:42 ` jackie.rosen at hushmail dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: aj at suse dot de @ 2012-04-11  7:58 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

Andreas Jaeger <aj at suse dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |aj at suse dot de
         Resolution|                            |WONTFIX

--- Comment #22 from Andreas Jaeger <aj at suse dot de> 2012-04-11 07:57:10 UTC ---
I agree with comment #19 - let's resolve as WONTFIX.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2012-04-11  7:58 ` aj at suse dot de
@ 2014-02-16 19:42 ` jackie.rosen at hushmail dot com
  2014-05-28 19:41 ` schwab at sourceware dot org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: jackie.rosen at hushmail dot com @ 2014-02-16 19:42 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=4737

Jackie Rosen <jackie.rosen at hushmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jackie.rosen at hushmail dot com

--- Comment #23 from Jackie Rosen <jackie.rosen at hushmail dot com> ---
*** Bug 260998 has been marked as a duplicate of this bug. ***
Seen from the domain http://volichat.com
Page where seen: http://volichat.com/adult-chat-rooms
Marked for reference. Resolved as fixed @bugzilla.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2014-02-16 19:42 ` jackie.rosen at hushmail dot com
@ 2014-05-28 19:41 ` schwab at sourceware dot org
  2014-08-25  2:25 ` naesten at gmail dot com
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: schwab at sourceware dot org @ 2014-05-28 19:41 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=4737

Andreas Schwab <schwab at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|jackie.rosen at hushmail dot com   |

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2014-05-28 19:41 ` schwab at sourceware dot org
@ 2014-08-25  2:25 ` naesten at gmail dot com
  2015-05-07 15:27 ` gbenson at redhat dot com
  2021-06-28 19:00 ` adhemerval.zanella at linaro dot org
  10 siblings, 0 replies; 31+ messages in thread
From: naesten at gmail dot com @ 2014-08-25  2:25 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=4737

Samuel Bronson <naesten at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |std-posix
             Status|RESOLVED                    |REOPENED
                 CC|                            |naesten at gmail dot com
           See Also|                            |https://bugzilla.redhat.com
                   |                            |/show_bug.cgi?id=906468
         Resolution|WONTFIX                     |---
           Assignee|drepper.fsp at gmail dot com       |unassigned at sourceware dot org

--- Comment #24 from Samuel Bronson <naesten at gmail dot com> ---
Hmm, what we have now says:

> When the application calls fork() from a signal handler and any of the
> fork handlers registered by pthread_atfork() calls a function that is
> not async-signal-safe, the behavior is undefined.

I'm pretty sure this does *not* include whatever functions libc itself may
decide to install using the same mechanism, especially considering that fork is
*still* on the list of functions that must be async-signal-safe.

(Of course, listing it there is a bit of a lie, because fork() is clearly *not*
without restriction here.)

So it looks like the current behavior is not only buggy, but sub-standard.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2014-08-25  2:25 ` naesten at gmail dot com
@ 2015-05-07 15:27 ` gbenson at redhat dot com
  2021-06-28 19:00 ` adhemerval.zanella at linaro dot org
  10 siblings, 0 replies; 31+ messages in thread
From: gbenson at redhat dot com @ 2015-05-07 15:27 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=4737

Gary Benson <gbenson at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gbenson at redhat dot com

-- 
You are receiving this mail because:
You are on the CC list for the bug.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
       [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2015-05-07 15:27 ` gbenson at redhat dot com
@ 2021-06-28 19:00 ` adhemerval.zanella at linaro dot org
  10 siblings, 0 replies; 31+ messages in thread
From: adhemerval.zanella at linaro dot org @ 2021-06-28 19:00 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=4737

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |adhemerval.zanella at linaro dot o
                   |                            |rg
           Assignee|unassigned at sourceware dot org   |adhemerval.zanella at linaro dot o
                   |                            |rg
   Target Milestone|---                         |2.34
         Resolution|---                         |FIXED

--- Comment #27 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
On 2.34 with now have _Fork (c32c868ab8b2b) which is a async-signal-safe fork()
replacement added by Austin Group issue 62 [1].  The Austin defect also dropped
the async-signal-safe requirement for fork, so currently there is no plan to
try make fork() async-signal-safe.

[1] https://austingroupbugs.net/view.php?id=62

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (18 preceding siblings ...)
  2009-01-14  9:44 ` tom dot honermann at oracle dot com
@ 2009-01-16 17:19 ` tom dot honermann at oracle dot com
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2009-01-16 17:19 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From tom dot honermann at oracle dot com  2009-01-16 17:19 -------
The Austin group met yesterday and retained the decision to interpret fork as
async-signal-unsafe with future specifications mandating that posix_spawn be
made async-signal-safe to fill the functionality gap.  Minutes of the meeting
are available at https://www.opengroup.org/austin/docs/austin_446.txt.

I think this bug can now be closed as "WONTFIX"

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (17 preceding siblings ...)
  2009-01-14  8:46 ` jakub at redhat dot com
@ 2009-01-14  9:44 ` tom dot honermann at oracle dot com
  2009-01-16 17:19 ` tom dot honermann at oracle dot com
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2009-01-14  9:44 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From tom dot honermann at oracle dot com  2009-01-14 09:44 -------
My understanding is that the Jan 8th discussion was preliminary or not yet
agreed on or something like that.  I wasn't present, so the information I have
is limited.  I do know this is scheduled for further discussion on the 15th.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (16 preceding siblings ...)
  2009-01-14  1:22 ` tom dot honermann at oracle dot com
@ 2009-01-14  8:46 ` jakub at redhat dot com
  2009-01-14  9:44 ` tom dot honermann at oracle dot com
  2009-01-16 17:19 ` tom dot honermann at oracle dot com
  19 siblings, 0 replies; 31+ messages in thread
From: jakub at redhat dot com @ 2009-01-14  8:46 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From jakub at redhat dot com  2009-01-14 08:46 -------
That has already happened, see
http://www.opengroup.org/austin/docs/austin_445.txt

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (15 preceding siblings ...)
  2008-11-19 16:23 ` rsa at us dot ibm dot com
@ 2009-01-14  1:22 ` tom dot honermann at oracle dot com
  2009-01-14  8:46 ` jakub at redhat dot com
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2009-01-14  1:22 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From tom dot honermann at oracle dot com  2009-01-14 01:21 -------
The IEEE Austin group will be meeting on Thursday 5-Jan-2009 at 4:00PM GMT to
discuss (amongst other things) removing fork() from the list of
async-signal-safe functions as defined by the IEEE 1003.1 POSIX/SUS specs. 
Participation in the meeting is open to anyone - phone numbers are available at
the Austin group website in the calendar entry for this meeting
(https://www.opengroup.org/austin/calendar).

Anyone wishing to have a voice in this discussion is urged to dial in and share
your thoughts.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (14 preceding siblings ...)
  2008-11-19  1:39 ` tom dot honermann at oracle dot com
@ 2008-11-19 16:23 ` rsa at us dot ibm dot com
  2009-01-14  1:22 ` tom dot honermann at oracle dot com
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: rsa at us dot ibm dot com @ 2008-11-19 16:23 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rsa at us dot ibm dot com  2008-11-19 16:22 -------
(In reply to comment #14)
> (In reply to comment #13)
> > Tom's suggestion: "Another possible solution for this problem would be to just
> > not call atfork registered handlers at all if fork is called from within the
> > context of a signal handler."
> > 
> > ... seems to be what the IEEE interpretation recommends and I've thought as well
> > that it might be a potential solution.  
> > 
> > Ulrich will have to decide whether this is an approach he'd accept.
> 
> Thanks for the links Ryan!
> 
> Does anyone have any suggests for how a thread can determine that it is running
> in a signal handler?  Does glibc maintain a thread specific indicator that a
> thread is running a signal handler?  I've been looking at the glibc code (struct
> pthread in particular) and I don't see an obvious indicator there.

Tom, perhaps you can post this question to the libc-help mailing list and
reference this bugzilla?

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (13 preceding siblings ...)
  2008-11-18 23:57 ` rsa at us dot ibm dot com
@ 2008-11-19  1:39 ` tom dot honermann at oracle dot com
  2008-11-19 16:23 ` rsa at us dot ibm dot com
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2008-11-19  1:39 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From tom dot honermann at oracle dot com  2008-11-19 01:37 -------
(In reply to comment #13)
> Tom's suggestion: "Another possible solution for this problem would be to just
> not call atfork registered handlers at all if fork is called from within the
> context of a signal handler."
> 
> ... seems to be what the IEEE interpretation recommends and I've thought as well
> that it might be a potential solution.  
> 
> Ulrich will have to decide whether this is an approach he'd accept.

Thanks for the links Ryan!

Does anyone have any suggests for how a thread can determine that it is running
in a signal handler?  Does glibc maintain a thread specific indicator that a
thread is running a signal handler?  I've been looking at the glibc code (struct
pthread in particular) and I don't see an obvious indicator there.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (12 preceding siblings ...)
  2008-11-18 23:45 ` rsa at us dot ibm dot com
@ 2008-11-18 23:57 ` rsa at us dot ibm dot com
  2008-11-19  1:39 ` tom dot honermann at oracle dot com
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: rsa at us dot ibm dot com @ 2008-11-18 23:57 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From rsa at us dot ibm dot com  2008-11-18 23:56 -------
Tom's suggestion: "Another possible solution for this problem would be to just
not call atfork registered handlers at all if fork is called from within the
context of a signal handler."

... seems to be what the IEEE interpretation recommends and I've thought as well
that it might be a potential solution.  

Ulrich will have to decide whether this is an approach he'd accept.

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (11 preceding siblings ...)
  2008-11-18 22:30 ` tom dot honermann at oracle dot com
@ 2008-11-18 23:45 ` rsa at us dot ibm dot com
  2008-11-18 23:57 ` rsa at us dot ibm dot com
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: rsa at us dot ibm dot com @ 2008-11-18 23:45 UTC (permalink / raw)
  To: glibc-bugs

------- Additional Comments From rsa at us dot ibm dot com  2008-11-18 23:44 -------
A hole was introduced in the concept of fork being async signal safe with the
introduction of at_fork handlers.

The following IEEE interpretation should be very enlightening:

http://standards.ieee.org/reading/ieee/interp/1003-1c-95_int/pasc-1003.1c-37.html

Per the ieee interpretation:
   "The implementation must also be required to protect itself
   across a fork, for this to occur. But it should NOT be required to
   protect itself against a fork from a signal handler, because this
   may be impossible to accomplish."

Support for at_fork handlers introduces an window of vulnerability.

"The rationale is very specific that the
handlers should be designed so they **can't** be used when fork is
called in a signal handler which under normal programming practices
meant that they should 'acquire' the mutexes, but the standard does not
require that mutex operations be async safe.  This is being referred to
the sponsor for consideration."

There doesn't seem to have been a follow up to the recommendation.

The following discussion held on libc-hacker is pertinent to the implementation
in GLIBC.

http://www.sourceware.org/ml/libc-hacker/2007-02/msg00009.html

The problem is the lock that may be held by both the at_fork handlers and malloc.

I'm not sure if there is _any_ usage of recursive locks in GLIBC.

-- 

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (10 preceding siblings ...)
  2008-11-11 22:04 ` tom dot honermann at oracle dot com
@ 2008-11-18 22:30 ` tom dot honermann at oracle dot com
  2008-11-18 23:45 ` rsa at us dot ibm dot com
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2008-11-18 22:30 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From tom dot honermann at oracle dot com  2008-11-18 22:28 -------
Another possible solution for this problem would be to just not call atfork
registered handlers at all if fork is called from within the context of a signal
handler.  This exact solution was implemented for Solaris 10.  See
http://src.opensolaris.org/source/diff/onnv/onnv-gate/usr/src/lib/libc/port/threads/scalls.c?r1=/onnv/onnv-gate/usr/src/lib/libc/port/threads/scalls.c@6933&r2=/onnv/onnv-gate/usr/src/lib/libc/port/threads/scalls.c@7901&format=s&full=0

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (9 preceding siblings ...)
  2008-11-11 21:41 ` tom dot honermann at oracle dot com
@ 2008-11-11 22:04 ` tom dot honermann at oracle dot com
  2008-11-18 22:30 ` tom dot honermann at oracle dot com
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2008-11-11 22:04 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From tom dot honermann at oracle dot com  2008-11-11 22:03 -------
Another solution for this issue would be to use error checked mutexes (ie,
PTHREAD_MUTEX_ERRORCHECK).  However, glibc 2.3.4 doesn't seem to have a generic
infrastructure for supporting error checked mutexes as it does for recursive
mutexes (ie, __libc_lock_init vs __libc_lock_init_recursive - there is no
__libc_lock_init_error_checked or similar that I can find), so this approach
might require a more invasive change.

If error checked mutexes were used, then ptmalloc_lock_all could be modified
similarly to the code below.  (This assumes mutex_lock would return the same
values as pthread_mutex_lock)

static void
ptmalloc_lock_all (void)
{
  mstate ar_ptr;
  int rc;

  if(__malloc_initialized < 1)
    return;
  rc = mutex_lock(&list_lock);
  if(rc != 0 && rc != EDEADLK) {
    /* Error handling */
  }
  for(ar_ptr = &main_arena;;) {
    rc = mutex_lock(&ar_ptr->mutex);
    if(rc != 0 && rc != EDEADLK) {
      /* Error handling */
    }
    ar_ptr = ar_ptr->next;
    if(ar_ptr == &main_arena) break;
  }
  save_malloc_hook = __malloc_hook;
  save_free_hook = __free_hook;
  __malloc_hook = malloc_atfork;
  __free_hook = free_atfork;
  /* Only the current thread may perform malloc/free calls now. */
  tsd_getspecific(arena_key, save_arena);
  tsd_setspecific(arena_key, ATFORK_ARENA_PTR);
}

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (8 preceding siblings ...)
  2008-11-11 21:35 ` tom dot honermann at oracle dot com
@ 2008-11-11 21:41 ` tom dot honermann at oracle dot com
  2008-11-11 22:04 ` tom dot honermann at oracle dot com
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2008-11-11 21:41 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From tom dot honermann at oracle dot com  2008-11-11 21:39 -------
(In reply to comment #7)
> It is, as far as I can see, not a problem of the fork() implementation
> allocating memory. I believe that it's:
> 
> 1) a problem of glibc's malloc() implementation registering ptmalloc_lock_all()
> for execution at fork (via thread_atfork()) while not making sure that this
> function can actually be called from a signal handler without deadlocking.
> Fork() just calls each registered function. It can not, in any way, know if
> these functions will lock up the process - that is the responsibility of whoever
> registers a the function for execution. This must be a bug in malloc/arena.c.
> This is the bug you (Nicholas) hit.

Absolutely correct.  ptmalloc_lock_all, ptmalloc_unlock_all, and
ptmalloc_unlock_all2 are not async-signal-safe as is required to be called
within the context of a signal handler.  

> 
> 2) a problem of fork() calling _IO_list_lock() which might be locked by the main
> execution flow. This must be a bug in nptl/sysdeps/unix/sysv/linux/fork.c. This
> is the bug I hit.
> 
> I think it looks like two separate bugs, which are triggered by similar
> conditions. The fix might be identical; I don't know.

These look like two separate bugs to me as well.  I haven't hit the
_IO_list_lock related one myself yet, but I suspect it is only a matter of time...



-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (7 preceding siblings ...)
  2008-11-07  1:10 ` morten+sources dot redhat dot com at afdelingp dot dk
@ 2008-11-11 21:35 ` tom dot honermann at oracle dot com
  2008-11-11 21:41 ` tom dot honermann at oracle dot com
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2008-11-11 21:35 UTC (permalink / raw)
  To: glibc-bugs

------- Additional Comments From tom dot honermann at oracle dot com  2008-11-11 21:34 -------
(In reply to comment #6)
Thank you for your comments Nicholas.  

> The purpose of aborting on detection of heap corruption is to prevent deliberate
> heap corruption attacks. As such, allowing further use of the allocator after
> detection of corruption has the potential to open up security holes that the
> fail fast behavior designed to prevent.

I agree that aborting the process when heap corruption is detected is a very
good thing.  I also agree with preventing further use of the allocator after
heap corruption has been detected.  However, those issues are different concerns
than what has been reported in this bug report.

> 
> Furthermore, being async-signal-safe requires that fork be callable from any
> signal handler at any time, which means that in addition to being able to fork
> from a SIGABRT resulting from heap corruption, we must also be able to fork from
> e.g. a SIGALRM handler that can interrupt the allocator at any time.

POSIX requires fork to be callable from any signal handler at any time - which
includes the contexts which you described.  Note that it would be a violation of
POSIX for the signal handler (either before or after calling fork) to call heap
routines.  None of the heap routines (malloc, free, etc...) are required to be
async-signal-safe.  All functions called by a signal handler must be
async-signal-safe - which means that any attack of the heap through manipulation
of a signal handler that calls heap routines would already constitute a defect
in the signal handler (ie, that it called heap routines at all).  The security
issue in that case would be the defect in the signal handler, not that fork was
async-signal-safe.

> 
> As such, making the mutexes recursive would allow
> malloc-from-fork-from-signal-handler to potentially see inconsistent allocator
> state and lead to heap corruption or other errors.

This is true - but would constitute a POSIX violation since heap routines are
not async-signal-safe.

> 
> fork probably needs to stop allocating memory at all.

fork doesn't allocate memory - it isn't permitted to (at least, not via the
heap) due to the async-signal-safe requirement.

The problem reported here is that calls to fork will hang indefinitely if fork
is called from a signal handler while the executing thread holds one of the heap
mutexes.  This hang happens even if the signal handler never calls a heap
routine.  The fact that fork locks all heap related mutexes is a very important
feature for when fork is called from outside of a signal handler (since in that
case, the forked child may go on to use the heap).  However, it breaks fork from
a signal handler by attempting to acquire mutexes that are not needed for the
parent/child processes running in the context of the signal handler.

-- 

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (6 preceding siblings ...)
  2008-11-06 23:10 ` nmiell at comcast dot net
@ 2008-11-07  1:10 ` morten+sources dot redhat dot com at afdelingp dot dk
  2008-11-11 21:35 ` tom dot honermann at oracle dot com
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: morten+sources dot redhat dot com at afdelingp dot dk @ 2008-11-07  1:10 UTC (permalink / raw)
  To: glibc-bugs

------- Additional Comments From morten+sources dot redhat dot com at afdelingp dot dk  2008-11-07 01:08 -------
It is, as far as I can see, not a problem of the fork() implementation
allocating memory. I believe that it's:

1) a problem of glibc's malloc() implementation registering ptmalloc_lock_all()
for execution at fork (via thread_atfork()) while not making sure that this
function can actually be called from a signal handler without deadlocking.
Fork() just calls each registered function. It can not, in any way, know if
these functions will lock up the process - that is the responsibility of whoever
registers a the function for execution. This must be a bug in malloc/arena.c.
This is the bug you (Nicholas) hit.

2) a problem of fork() calling _IO_list_lock() which might be locked by the main
execution flow. This must be a bug in nptl/sysdeps/unix/sysv/linux/fork.c. This
is the bug I hit.

I think it looks like two separate bugs, which are triggered by similar
conditions. The fix might be identical; I don't know.

-- 

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (5 preceding siblings ...)
  2008-11-05  9:57 ` tom dot honermann at oracle dot com
@ 2008-11-06 23:10 ` nmiell at comcast dot net
  2008-11-07  1:10 ` morten+sources dot redhat dot com at afdelingp dot dk
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: nmiell at comcast dot net @ 2008-11-06 23:10 UTC (permalink / raw)
  To: glibc-bugs

------- Additional Comments From nmiell at comcast dot net  2008-11-06 23:08 -------
The purpose of aborting on detection of heap corruption is to prevent deliberate
heap corruption attacks. As such, allowing further use of the allocator after
detection of corruption has the potential to open up security holes that the
fail fast behavior designed to prevent.

Furthermore, being async-signal-safe requires that fork be callable from any
signal handler at any time, which means that in addition to being able to fork
from a SIGABRT resulting from heap corruption, we must also be able to fork from
e.g. a SIGALRM handler that can interrupt the allocator at any time.

As such, making the mutexes recursive would allow
malloc-from-fork-from-signal-handler to potentially see inconsistent allocator
state and lead to heap corruption or other errors.

fork probably needs to stop allocating memory at all.

-- 

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (4 preceding siblings ...)
  2008-11-05  9:00 ` tom dot honermann at oracle dot com
@ 2008-11-05  9:57 ` tom dot honermann at oracle dot com
  2008-11-06 23:10 ` nmiell at comcast dot net
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2008-11-05  9:57 UTC (permalink / raw)
  To: glibc-bugs

------- Additional Comments From tom dot honermann at oracle dot com  2008-11-05 09:55 -------
Oracle/PeopleSoft is also running into this bug.  Oracle engineers (though not
myself) are currently assigned to resolving this issue.  I am hoping to
facilitate discussion regarding what an acceptable solution to this problem
should look like.

I've been studying the glibc-2.3.4 source code (I know, old, but this is the
version that we will ultimately have to create patches for).  I suspect (but
have not verified) that the underlying issue is still present in the latest CVS
source code (implied by the fact that this bug report is still open).  My
priority is to get this corrected for Linux/x86_64.

The stack trace for the hang I've been seeing looks like:
#0  0x00000034cc0d9128 in __lll_mutex_lock_wait () from /lib64/libc.so.6
#1  0x00000034cc07262c in _L_lock_57 () from /lib64/libc.so.6
#2  0x00000034cc06bfa3 in ptmalloc_lock_all () from /lib64/libc.so.6
#3  0x00000034cc09461a in fork () from /lib64/libc.so.6 
<Oracle/PeopleSoft signal handler stack frames>
#9  <signal handler called> 
#10 0x00000034cc030015 in raise () from /lib64/libc.so.6
#11 0x00000034cc031980 in abort () from /lib64/libc.so.6
#12 0x00000034cc0674db in __libc_message () from /lib64/libc.so.6
#13 0x00000034cc06e8a0 in _int_free () from /lib64/libc.so.6
#14 0x00000034cc071fbc in free () from /lib64/libc.so.6 

The root cause for the signal generated in this case was heap corruption (glibc
detected the corruption and aborted the process).  The invoked signal handler is
simply trying to fork/exec a program to gather diagnostics we need to help us
find the cause of the heap corruption.

The Linux/x86_64 glibc build is currently using "normal" mutexes for locking the
heap arenas (see 'ptmalloc_init' in malloc/arena.c).  These mutexes are
initialized by calling 'mutex_init' in 'ptmalloc_init' and these "normal"
mutexes will deadlock if a thread owning the mutex attempts to re-acquire it.

The simplest solution seems to me to convert these to recursive mutexes.  The
reason for using a recursive mutex is to allow a thread that already holds one
of the arena mutexes to handle a signal, call fork from within that signal
handler, call ptmalloc_lock_all, and still obtain a lock to all arena mutexes. 
This would allow the thread to continue while the data structures for the
previously locked arena are not in a stable state (since the original heap
function invocation was interrupted by the signal), but this should be ok since
heap functions are not async-signal safe and therefore may not be (reliably)
called from within a signal handler anyway.  Since the relevant thread in both
the parent and child processes is still executing within the context of a signal
handler, the arena data structures may not be touched by either thread.

The downsides to this approach are performance overhead and the potential for
defects to go unnoticed during development (since unintentional attempts to
recursively lock a mutex would no longer lead to deadlocks).

This approach would also require changes to 'ptmalloc_unlock_all2' (which
currently re-initializes arena mutexes in the child processes rather than
unlocking them) since a return from the signal handler in the child process will
attempt to unlock the previously held arena mutex lock.  If the mutex is
re-initialized, the unlock call could result in undesirable behavior.

Eagerly awaiting comments and criticisms...

-- 

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (3 preceding siblings ...)
  2008-10-30 14:55 ` morten+sources dot redhat dot com at afdelingp dot dk
@ 2008-11-05  9:00 ` tom dot honermann at oracle dot com
  2008-11-05  9:57 ` tom dot honermann at oracle dot com
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: tom dot honermann at oracle dot com @ 2008-11-05  9:00 UTC (permalink / raw)
  To: glibc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tom dot honermann at oracle
                   |                            |dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
                   ` (2 preceding siblings ...)
  2008-10-21  5:13 ` nmiell at comcast dot net
@ 2008-10-30 14:55 ` morten+sources dot redhat dot com at afdelingp dot dk
  2008-11-05  9:00 ` tom dot honermann at oracle dot com
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: morten+sources dot redhat dot com at afdelingp dot dk @ 2008-10-30 14:55 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From morten+sources dot redhat dot com at afdelingp dot dk  2008-10-30 14:53 -------
Here is a bit more information.

It appears that this is more than one bug (or at least more than one lock). The
bug I hit is caused by __GI__IO_list_lock(). The bug which Nicholas Miell hit is
caused by ptmalloc_lock_all(), which is called via thread_atfork().

The following stack traces are from Ubuntu's libc6-2.7-10ubuntu3 glibc package.

When the deadlock is provoked by a fork() in the main loop:
#0  0xb7f3d410 in __kernel_vsyscall ()
#1  0xb7ebc033 in __lll_lock_wait_private () from /lib/tls/i686/cmov/libc.so.6
#2  0xb7e40a2d in _L_lock_2616 () from /lib/tls/i686/cmov/libc.so.6
#3  0xb7e3fd7d in __GI__IO_list_lock () from /lib/tls/i686/cmov/libc.so.6
#4  0xb7e6dd45 in fork () from /lib/tls/i686/cmov/libc.so.6
#5  0x080484ae in sighandler ()
#6  <signal handler called>
#7  0xb7e3fd77 in __GI__IO_list_lock () from /lib/tls/i686/cmov/libc.so.6
#8  0xb7e6dd45 in fork () from /lib/tls/i686/cmov/libc.so.6
#9  0x08048591 in main ()

When the deadlock is provoked by malloc()/free() in the main loop:
#0  0xb7f93410 in __kernel_vsyscall ()
#1  0xb7f12033 in __lll_lock_wait_private () from /lib/tls/i686/cmov/libc.so.6
#2  0xb7e9db4f in _L_lock_48 () from /lib/tls/i686/cmov/libc.so.6
#3  0xb7e9728b in ptmalloc_lock_all () from /lib/tls/i686/cmov/libc.so.6
#4  0xb7ec3f7f in fork () from /lib/tls/i686/cmov/libc.so.6
#5  0x0804850e in sighandler ()
#6  <signal handler called>
#7  0xb7e9971a in _int_free () from /lib/tls/i686/cmov/libc.so.6
#8  0xb7e9d4f0 in free () from /lib/tls/i686/cmov/libc.so.6
#9  0x08048600 in main ()

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |morten+sources dot redhat
                   |                            |dot com at afdelingp dot dk


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
  2008-10-11 18:47 ` [Bug libc/4737] " morten+sources dot redhat dot com at afdelingp dot dk
  2008-10-20 11:47 ` morten+sources dot redhat dot com at afdelingp dot dk
@ 2008-10-21  5:13 ` nmiell at comcast dot net
  2008-10-30 14:55 ` morten+sources dot redhat dot com at afdelingp dot dk
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: nmiell at comcast dot net @ 2008-10-21  5:13 UTC (permalink / raw)
  To: glibc-bugs

------- Additional Comments From nmiell at comcast dot net  2008-10-21 05:12 -------
(In reply to comment #2)
> I am no glibc hacker, and I am not sure how to fix this bug. If the locking is
> needed, perhaps signals should be blocked while the lock is held?

Unfortunately, that's only a partial solution -- applications generally expect
to get their signals, even if they're in the bowels of the memory allocator.

The original thing that inspired this bug entry was a SIGSEGV in malloc()
leading to an attempt to fork and exec GNOME's bug-buddy leading to a deadlock.
Masking signals would prevent this valid usage scenario and annoy users.

And this is completely ignoring the performance impact of two system calls for
every malloc operation.

-- 

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
  2008-10-11 18:47 ` [Bug libc/4737] " morten+sources dot redhat dot com at afdelingp dot dk
@ 2008-10-20 11:47 ` morten+sources dot redhat dot com at afdelingp dot dk
  2008-10-21  5:13 ` nmiell at comcast dot net
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: morten+sources dot redhat dot com at afdelingp dot dk @ 2008-10-20 11:47 UTC (permalink / raw)
  To: glibc-bugs

------- Additional Comments From morten+sources dot redhat dot com at afdelingp dot dk  2008-10-20 11:46 -------
This bug in glibc is causing deadlocks in a server we are running.

The software follows the POSIX guidelines, and only calls "safe" functions
inside signal handlers. We've had no problems on BSD. On Linux (with glibc) the
software deadlocks once every few weeks, which is - of course - a total
show-stopper. We can't use Linux in the production environment if the software
keeps deadlocking.

I am no glibc hacker, and I am not sure how to fix this bug. If the locking is
needed, perhaps signals should be blocked while the lock is held?

Best regards,
Morten K. Poulsen

-- 

http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug libc/4737] fork is not async-signal-safe
  2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
@ 2008-10-11 18:47 ` morten+sources dot redhat dot com at afdelingp dot dk
  2008-10-20 11:47 ` morten+sources dot redhat dot com at afdelingp dot dk
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 31+ messages in thread
From: morten+sources dot redhat dot com at afdelingp dot dk @ 2008-10-11 18:47 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From morten+sources dot redhat dot com at afdelingp dot dk  2008-10-11 18:46 -------
Created an attachment (id=2996)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=2996&action=view)
fork-fail.c

Here is another short example which triggers the bug, but without using
malloc().

>From signal(7):
POSIX.1-2003 requires an implementation to guarantee that the following
functions can be safely called inside a signal handler:
...
fork()
...

Are there any plans to fix this issue in glibc?

Best regards,
Morten K. Poulsen

-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=4737

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-06-28 19:00 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-4737-131@http.sourceware.org/bugzilla/>
2011-10-13 17:53 ` [Bug libc/4737] fork is not async-signal-safe llucax at gmail dot com
2011-10-13 17:54 ` llucax at gmail dot com
2011-10-13 17:56 ` llucax at gmail dot com
2011-10-14 14:06 ` bugdal at aerifal dot cx
2012-03-27 13:43 ` krebbel1 at de dot ibm.com
2012-04-11  7:58 ` aj at suse dot de
2014-02-16 19:42 ` jackie.rosen at hushmail dot com
2014-05-28 19:41 ` schwab at sourceware dot org
2014-08-25  2:25 ` naesten at gmail dot com
2015-05-07 15:27 ` gbenson at redhat dot com
2021-06-28 19:00 ` adhemerval.zanella at linaro dot org
2007-07-04  1:35 [Bug libc/4737] New: " nmiell at comcast dot net
2008-10-11 18:47 ` [Bug libc/4737] " morten+sources dot redhat dot com at afdelingp dot dk
2008-10-20 11:47 ` morten+sources dot redhat dot com at afdelingp dot dk
2008-10-21  5:13 ` nmiell at comcast dot net
2008-10-30 14:55 ` morten+sources dot redhat dot com at afdelingp dot dk
2008-11-05  9:00 ` tom dot honermann at oracle dot com
2008-11-05  9:57 ` tom dot honermann at oracle dot com
2008-11-06 23:10 ` nmiell at comcast dot net
2008-11-07  1:10 ` morten+sources dot redhat dot com at afdelingp dot dk
2008-11-11 21:35 ` tom dot honermann at oracle dot com
2008-11-11 21:41 ` tom dot honermann at oracle dot com
2008-11-11 22:04 ` tom dot honermann at oracle dot com
2008-11-18 22:30 ` tom dot honermann at oracle dot com
2008-11-18 23:45 ` rsa at us dot ibm dot com
2008-11-18 23:57 ` rsa at us dot ibm dot com
2008-11-19  1:39 ` tom dot honermann at oracle dot com
2008-11-19 16:23 ` rsa at us dot ibm dot com
2009-01-14  1:22 ` tom dot honermann at oracle dot com
2009-01-14  8:46 ` jakub at redhat dot com
2009-01-14  9:44 ` tom dot honermann at oracle dot com
2009-01-16 17:19 ` tom dot honermann at oracle dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).