public inbox for cygwin-developers@cygwin.com
* malloc crash
From: Ken Brown @ 2021-10-24 21:46 UTC
  To: cygwin-devel

I'm trying to debug the fifo problem reported here:

   https://cygwin.com/pipermail/cygwin/2021-October/249635.html

To keep my email self-contained, here are the reproduction instructions.  Run 
the attached script with argument 1000.  The output is supposed to look like this:

$ ./fifo_test.sh 1000
Creating 1000 fifo readers...
Created PID=6503  reading from /tmp/catfifo_0
FIFO 0
Created PID=6506  reading from /tmp/catfifo_1
FIFO 1
[...]
Created PID=9506  reading from /tmp/catfifo_998
FIFO 998
Created PID=9509  reading from /tmp/catfifo_999
FIFO 999

But invariably one of the exec'd cat processes will appear to hang.  (Actually 
it goes into an infinite loop.)  If you attach gdb to that process and catch it 
at the right time, you see something like this:

[...]
Reading symbols from /usr/bin/cat.exe...
Reading symbols from /usr/lib/debug//usr/bin/cat.exe.dbg...
(gdb) thr 1
[Switching to thread 1 (Thread 9692.0x8658)]
#0  0x00007ffe950ed674 in ntdll!ZwCreateEvent ()
    from /c/WINDOWS/SYSTEM32/ntdll.dll
(gdb) bt
#0  0x00007ffe950ed674 in ntdll!ZwCreateEvent ()
    from /c/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00000001800e56c8 in CreateEventW (
     lpEventAttributes=0x18030ac90 <sec_none_nih>, bManualReset=0,
     bInitialState=0, lpName=0x0)
     at ../../../../temp/winsup/cygwin/kernel32.cc:46
#2  0x00000001800e57c1 in CreateEventA (
     lpEventAttributes=0x18030ac90 <sec_none_nih>, bManualReset=0,
     bInitialState=0, lpName=0x0)
     at ../../../../temp/winsup/cygwin/kernel32.cc:71
#3  0x00000001801493f1 in sig_send (p=0x180010000, si=..., tls=0xffffce00)
     at ../../../../temp/winsup/cygwin/sigproc.cc:698
#4  0x00000001800676c9 in exception::handle (e=0xffffc5b0, frame=0xffffcd80,
     in=0xffffc0c0, dispatch=0xffffbf40)
     at ../../../../temp/winsup/cygwin/exceptions.cc:834
#5  0x00007ffe950f20cf in ntdll!.chkstk () from /c/WINDOWS/SYSTEM32/ntdll.dll
#6  0x00007ffe950a1454 in ntdll!RtlRaiseException ()
    from /c/WINDOWS/SYSTEM32/ntdll.dll
#7  0x00007ffe950f0bfe in ntdll!KiUserExceptionDispatcher ()
    from /c/WINDOWS/SYSTEM32/ntdll.dll
#8  0x0000000180191a5c in init_top (m=0x18036f860 <_gm_>, p=0x800010000,
     psize=65456) at ../../../../temp/winsup/cygwin/malloc.cc:3903
#9  0x0000000180193249 in sys_alloc (m=0x18036f860 <_gm_>, nb=256)
     at ../../../../temp/winsup/cygwin/malloc.cc:4186
#10 0x0000000180196b96 in dlmalloc (bytes=248)
     at ../../../../temp/winsup/cygwin/malloc.cc:4669
#11 0x0000000180197f5d in dlcalloc (n_elements=1, elem_size=248)
     at ../../../../temp/winsup/cygwin/malloc.cc:4799
#12 0x00000001800e9030 in calloc (nmemb=1, size=248)
     at ../../../../temp/winsup/cygwin/malloc_wrapper.cc:101
#13 0x0000000180044a2a in operator new (s=248)
     at ../../../../temp/winsup/cygwin/cxx.cc:21
#14 0x000000018016a75d in pthread::init_mainthread ()
     at ../../../../temp/winsup/cygwin/thread.cc:371
#15 0x000000018004a310 in dll_crt0_1 ()
     at ../../../../temp/winsup/cygwin/dcrt0.cc:887
#16 0x000000018004771c in _cygtls::call2 (this=0xffffce00,
     func=0x18004a218 <dll_crt0_1(void*)>, arg=0x0, buf=0xffffcdb0)
     at ../../../../temp/winsup/cygwin/cygtls.cc:40
#17 0x00000001800476c1 in _cygtls::call (func=0x18004a218 <dll_crt0_1(void*)>,
     arg=0x0) at ../../../../temp/winsup/cygwin/cygtls.cc:27
#18 0x000000018004aac9 in _dll_crt0 ()
     at ../../../../temp/winsup/cygwin/dcrt0.cc:1099
#19 0x0000000000000000 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Typing 'finish' repeatedly until it no longer returns shows an infinite loop 
that starts with an access violation here:

(gdb) f 8
#8  0x0000000180191a5c in init_top (m=0x18036f860 <_gm_>, p=0x800010000,
     psize=65456) at ../../../../temp/winsup/cygwin/malloc.cc:3903
3903      p->head = psize | PINUSE_BIT;

I guess there's an infinite loop rather than a crash because the exec'd cat 
process isn't fully initialized yet, and the exception handler just keeps 
continuing execution at the site of the access violation.

If I'm reading the backtrace correctly, the access violation occurs while Cygwin 
is trying to allocate storage for the main thread object of the exec'd process.
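
As a rough cross-check of frame #8, the shell arithmetic below works only from the two values gdb printed there (p and psize). The 8-byte offset of head within the chunk is an assumption taken from the layout of struct malloc_chunk in upstream dlmalloc (prev_foot, then head) on a 64-bit build; it has not been checked against Cygwin's malloc.cc. The helper name chunk_report is hypothetical.

```bash
#!/bin/bash
# Values copied from frame #8 of the backtrace.
p=0x800010000
psize=65456

# ASSUMPTION: head is the second size_t field of the chunk header, as in
# upstream dlmalloc, and size_t is 8 bytes. Not verified against Cygwin.
head_offset=8

# chunk_report is a hypothetical helper, for illustration only.
chunk_report() {
	# How far psize falls short of a full 64 KiB (0x10000) region.
	printf 'psize short of 64 KiB by: %d bytes\n' $(( 0x10000 - psize ))
	# Where the store "p->head = ..." would land under the assumption above.
	printf 'address of p->head:       0x%x\n' $(( p + head_offset ))
}

chunk_report
```

Under these assumptions, the faulting store targets the very beginning of the region at p, and psize is slightly less than 64 KiB. The backtrace alone does not say why that address is not writable.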

I'm not familiar enough with the relevant Cygwin internals to take the analysis 
any further, but my guess is that the problem is somehow triggered by the 
creation of a new thread at the end of fhandler_fifo::fixup_after_exec:

       new cygthread (fifo_reader_thread, this, "fifo_reader", thr_sync_evt);

Is this a bug in the fifo code?  Is there some reason I shouldn't be creating a 
new thread in fixup_after_exec?  If so, I'm not sure what to do.  The fifo 
reader code depends crucially on having that thread running.

By the way, every once in a while the hang seems to occur in the forked bash 
process, before it execs cat.  This could also be due to the creation of a new 
thread, this time in fixup_after_fork.

Ken

P.S. The gdb session was based on a build from current git HEAD, but the problem 
also occurs in Cygwin 3.2.0.  So I don't think it's related to the new pipe code.

[-- Attachment #2: fifo_test.sh --]
[-- Type: text/plain, Size: 782 bytes --]

#!/bin/bash

# take arg as number of iterations (default=100)
STEPS="${1-100}"

FIFO_PFX="/tmp/catfifo_"
FIFO_WAIT=0
STEP_WAIT=0

# sleep only when given a non-empty, non-zero argument
function mysleep() { if [ -n "$1" ] && [ "$1" != "0" ]; then sleep "$1"; fi }

function cleanup(){
	rm -f "$FIFO_PFX"*
}
trap cleanup EXIT

printf 'Creating %s fifo readers...\n' "$STEPS"
for ((i=0; i<STEPS; i++ )); do
	fifo="$FIFO_PFX$i"

	# create fifo
	mkfifo "$fifo"
	mysleep $FIFO_WAIT

	# fork a process reading from fifo and writing it to stdout
	cat < "$fifo" &
	pid=$!
	printf 'Created PID=%s  reading from %s\n' "$pid" "$fifo"

	# redirect FD3 to the fifo and print a message to it
	exec 3>"$fifo"
	printf "FIFO %d\n" "$i" >&3

	# close the file descriptor, wait for process to exit and clean up
	exec 3>&-
	wait $pid
	rm -f "$fifo"

	mysleep $STEP_WAIT
done
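
Because the script blocks forever in "wait $pid" once a cat process hangs, it can be handy to replace that wait with a bounded one, so the loop stops and reports the PID to inspect. The following is a minimal sketch, not part of the attached script; the function name wait_with_timeout and the timeout value are hypothetical choices.

```bash
#!/bin/bash
# Wait up to $2 seconds for the background process with PID $1 to exit.
# Returns 0 if the process is gone, 1 if it is still running at the timeout.
# (hypothetical helper, for illustration only)
wait_with_timeout() {
	local pid=$1 limit=$2 waited=0
	# kill -0 sends no signal; it only tests whether the PID still exists.
	while kill -0 "$pid" 2>/dev/null; do
		if [ "$waited" -ge "$limit" ]; then
			return 1
		fi
		sleep 1
		waited=$((waited + 1))
	done
	return 0
}

# Small demonstration with a stand-in process instead of the fifo reader.
sleep 2 &
demo_pid=$!
if ! wait_with_timeout "$demo_pid" 1; then
	printf 'PID %s still running after timeout\n' "$demo_pid"
fi
```

In the loop above, "wait $pid" could be replaced by a call such as wait_with_timeout "$pid" 10, followed by a break when it returns non-zero, which leaves the hung process alive for a debugger to attach to. Polling once per second slows nothing down in the normal case, since the first kill -0 check usually fails immediately after cat exits.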
