public inbox for glibc-bugs@sourceware.org
* [Bug libc/13845] New: Infrequent random stop in futex_wait using printf inside alarm signal handler
@ 2012-03-14 13:08 jon at jonshouse dot co.uk
  2012-03-14 13:31 ` [Bug libc/13845] " schwab@linux-m68k.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: jon at jonshouse dot co.uk @ 2012-03-14 13:08 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13845

             Bug #: 13845
           Summary: Infrequent random stop in futex_wait using printf
                    inside alarm signal handler
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: unassigned@sourceware.org
        ReportedBy: jon@jonshouse.co.uk
                CC: drepper.fsp@gmail.com
    Classification: Unclassified


Excuse the background, but I feel it may be useful.

Recently I've been developing an application. A single (non-threaded) process
receives UDP broadcast packets from multiple sources, mixes them into a single
audio stream and plays it via ALSA or PulseAudio.

I build three versions of the rx process: one driving ALSA, one driving
PulseAudio and one "dummy" version that discards the audio but is in every
other respect identical (conditional compiles from the same source).  I build
these three versions for two architectures, Intel and ARM.

The "dummy" version runs forever on both Intel and ARM machines.
The PulseAudio version also runs forever (as best as I can tell).
The ALSA version randomly freezes.

I not unreasonably assumed it to be an ALSA bug.

Once I had found the time to track the lockups down, they seemed to be caused
by printf stalling in a handler for a once-a-second alarm signal.

I checked the documentation for "alarm" and several other places but can't see
any warning that suggests that printf cannot be used in that context.

I fixed my application by adding a simple flag.
I have seen the freezing behaviour on all Linux/glibc versions I've tried so
far.
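
A minimal sketch of the flag approach, assuming the handler only needs to
trigger a once-a-second status print. Only once_a_second_alarm_handler appears
in the backtrace; tick_pending, start_status_timer and print_status_if_due are
hypothetical names for illustration, not the application's actual code. The
handler just records that the alarm fired, and the printf happens later in the
main loop, outside signal context.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler, cleared by the main loop. */
static volatile sig_atomic_t tick_pending = 0;

/* The handler touches no stdio at all: it sets the flag and re-arms the
   timer. alarm() is on the POSIX async-signal-safe list. */
static void once_a_second_alarm_handler(int signo)
{
    (void)signo;
    tick_pending = 1;
    alarm(1);
}

/* Hypothetical helper: install the handler and start the one-second timer.
   Returns 0 on success, -1 on failure. */
int start_status_timer(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = once_a_second_alarm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;
    alarm(1);
    return 0;
}

/* Hypothetical helper: call once per main-loop iteration. Prints the status
   line if an alarm fired since the last call. Returns 1 if it printed. */
int print_status_if_due(unsigned long packets)
{
    if (!tick_pending)
        return 0;
    tick_pending = 0;
    printf("Total packets received %lu\n", packets);
    fflush(stdout);
    return 1;
}
```

The main loop would call start_status_timer() once at startup and
print_status_if_due() on every iteration, so printf and fflush only ever run
in normal context.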

The faster the machine, the less frequently the stop happens.  On my 3.8 GHz
AMD64 it would typically occur only every few days of runtime; on a 200 MHz ARM
it would typically occur within 3 hours.

Attached is a stack backtrace of my process running on a 200 MHz ARM board.

The process is making heavy use of recvfrom (UDP) and lighter but frequent use
of ALSA snd_pcm_writen.

Process stalled after 2 hours, running under gdb:
ARM udp-many-way-audio-rx:  My IP Address [10.10.10.111]  Runtime:[   0D  2H  5M 57S]
Decoding my own audio packets
Playback ALSA lib:1.0.25 writen underruns: 0  overruns (handled):0
Main loops per second = 51      EAGAIN per second = 52
Total packets received 2546534 , errors 0, Per Second 341
Last error ()

 1  [      10.10.10.6  ]  PLS 00000  PC 0000064638 PPS 43  BWD  2  OLP 0
 2  [    10.10.10.111  ]  PLS 00000  PC 0000325200 PPS 42  BWD  2  OLP 2
 3  [     10.10.10.66  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 4  [     10.10.10.65  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 5  [     10.10.10.64  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 6  [     10.10.10.63  ]  PLS 00000  PC 0000318855 PPS 43  BWD  2  OLP 1
 7  [     10.10.10.62  ]  PLS 00000  PC 0000318855 PPS 42  BWD  2  OLP 1
 8  [     10.10.10.61  ]  PLS 00000  PC 0000318855 PPS 42  BWD  2  OLP 0

PLS=Packet Last Seen    PC=Packet Counter       PPS=Packets per second
BWD=Buffers with data  OLP=oldest packet buffer
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA^C
Program received signal SIGINT, Interrupt.
0x000ddbc0 in __lll_lock_wait_private ()
(gdb) backtrace
#0  0x000ddbc0 in __lll_lock_wait_private ()
#1  0x000b7a4c in vfprintf ()
#2  0x000bb9f8 in printf ()
#3  0x0000a930 in once_a_second_alarm_handler () at udp-many-way-audio-rx.c:359
#4  <signal handler called>
#5  0x000c24a0 in fflush ()
#6  0x0000ac20 in slot_buffers_firstfree (slot=3) at udp-many-way-audio-rx.c:433
#7  0x0000c0e4 in main (argc=1, argv=0xbefffe14) at udp-many-way-audio-rx.c:692
(gdb)

An strace of the process stalling on Intel is here:
http://www.jonshouse.co.uk/download/a_stop.txt.gz

Thanks,
Jon

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug libc/13845] Infrequent random stop in futex_wait using printf inside alarm signal handler
From: schwab@linux-m68k.org @ 2012-03-14 13:31 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13845

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Andreas Schwab <schwab@linux-m68k.org> 2012-03-14 13:30:46 UTC ---
printf and fflush are not async-signal-safe.
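
For reference, a sketch of a handler that restricts itself to
async-signal-safe calls, assuming a fixed message is all that needs to be
emitted. write() is on the POSIX async-signal-safe list; tick_handler,
install_tick_handler and tick_fd are hypothetical names used only for this
illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Descriptor the handler writes to (hypothetical; set before installing). */
static volatile sig_atomic_t tick_fd = STDOUT_FILENO;

/* Uses write() directly instead of stdio, so it takes no stdio lock and
   cannot block on one held by the interrupted code. */
static void tick_handler(int signo)
{
    int saved_errno = errno;   /* write() may change errno */
    ssize_t r;

    (void)signo;
    r = write(tick_fd, "tick\n", 5);
    (void)r;                   /* nothing safe to do about a failure here */
    errno = saved_errno;
}

/* Hypothetical helper: route SIGALRM to tick_handler, writing to fd.
   Returns 0 on success, -1 on failure. */
int install_tick_handler(int fd)
{
    struct sigaction sa;

    tick_fd = fd;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = tick_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGALRM, &sa, NULL);
}
```

Calling install_tick_handler(STDOUT_FILENO) followed by alarm(1) would print
"tick" once. Formatting numbers inside the handler would need hand-written
code, since the printf family is not on the safe list.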


* [Bug libc/13845] Infrequent random stop in futex_wait using printf inside alarm signal handler
From: jon at jonshouse dot co.uk @ 2012-03-14 13:38 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13845

Jonathan Andrews <jon at jonshouse dot co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |
     Ever Confirmed|1                           |0

--- Comment #2 from Jonathan Andrews <jon at jonshouse dot co.uk> 2012-03-14 13:38:18 UTC ---
(In reply to comment #1)
> printf and fflush are not async-signal-safe.

That's nice!

Would it not have been an idea to tell someone, in the documentation for alarm
for example?  That way time would be saved and I would be swearing less.


* [Bug libc/13845] Infrequent random stop in futex_wait using printf inside alarm signal handler
From: schwab@linux-m68k.org @ 2012-03-14 14:18 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13845

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #3 from Andreas Schwab <schwab@linux-m68k.org> 2012-03-14 14:18:31 UTC ---
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04


* [Bug libc/13845] Infrequent random stop in futex_wait using printf inside alarm signal handler
From: jon at jonshouse dot co.uk @ 2012-03-14 14:55 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13845

--- Comment #4 from Jonathan Andrews <jon at jonshouse dot co.uk> 2012-03-14 14:55:11 UTC ---
(In reply to comment #3)
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04

OK, I found the reference to the issue in an up-to-date man page for "signal"
on a Debian box. My older development machine does not mention it, nor does the
man page for alarm I referenced.

"unspecified" is also a pretty vague term.

Lots of nice unsafe examples via Google - thanks web ....
http://www.ccplusplus.com/2011/10/alarm-function-example-in-c.html

"Interactions between alarm() and setitimer() are unspecified"
No "Only thread safe functions should be used in the alarm signal handler"

It's documented with the assumption that somebody has read it in the correct
order; the documentation isn't thread safe itself :-(

Thanks,
Jon


* [Bug libc/13845] Infrequent random stop in futex_wait using printf inside alarm signal handler
From: ppluzhnikov at google dot com @ 2012-03-15 15:03 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13845

Paul Pluzhnikov <ppluzhnikov at google dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ppluzhnikov at google dot
                   |                            |com

--- Comment #5 from Paul Pluzhnikov <ppluzhnikov at google dot com> 2012-03-15 15:01:35 UTC ---
(In reply to comment #4)

> No "Only thread safe functions should be used in the alarm signal handler"

That statement would be incorrect.

The correct statement reads: "only async-signal-safe functions can be used
inside an asynchronous signal handler".

http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html

This fact is the "ABC" of signal handling, surely *everybody* knows that?
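
One way to sidestep the restriction entirely, sketched here under the
assumption of a single-threaded Linux process with a polling main loop: keep
SIGALRM blocked and collect it synchronously with sigtimedwait(), so no
asynchronous handler ever runs. block_alarm_signal and poll_alarm_signal are
hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigtimedwait() under strict -std= modes */
#include <signal.h>
#include <time.h>

/* Block SIGALRM so that it stays pending instead of interrupting the
   program. Returns 0 on success, -1 on failure. */
int block_alarm_signal(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Non-blocking check: consume a pending SIGALRM if there is one.
   Returns 1 if a SIGALRM was pending, 0 otherwise. */
int poll_alarm_signal(void)
{
    sigset_t set;
    struct timespec zero = { 0, 0 };  /* zero timeout: return immediately */

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    return sigtimedwait(&set, NULL, &zero) == SIGALRM;
}
```

When poll_alarm_signal() returns 1, the main loop can call printf freely,
because it is running in normal context rather than inside a signal handler.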


* [Bug libc/13845] Infrequent random stop in futex_wait using printf inside alarm signal handler
From: fweimer at redhat dot com @ 2014-06-26 13:59 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13845

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-
