From: Mark Geisert
To: cygwin-developers@cygwin.com
Subject: Re: malloc crash
Date: Mon, 25 Oct 2021 17:54:07 -0700

Takashi Yano wrote:
> On Mon, 25 Oct 2021 16:36:50 -0700
> Mark Geisert wrote:
>> Ken Brown wrote:
>>> On 10/25/2021 5:29 PM, Mark Geisert wrote:
>>>> Corinna Vinschen wrote:
>>>>> On Oct 25 08:35, Ken Brown wrote:
>>>>>> On 10/25/2021 4:59 AM, Corinna Vinschen wrote:
>>>>>>> Has the thread already been started at this point?
>>>>>>
>>>>>> Yes, here's the backtrace of that thread:
>>>>>>
>>>>>> Thread 5 (Thread 9692.0x7c4c):
>>>>>> #0  0x00000001801934f9 in sys_alloc (m=0x18036f860 <_gm_>, nb=1040) at
>>>>>> ../../../../temp/winsup/cygwin/malloc.cc:4232
>>>>>> #1  0x0000000180196b96 in dlmalloc (bytes=1024) at
>>>>>> ../../../../temp/winsup/cygwin/malloc.cc:4669
>>>>>> #2  0x00000001801993e1 in dlrealloc (oldmem=0x0, bytes=1024) at
>>>>>> ../../../../temp/winsup/cygwin/malloc.cc:5187
>>>>>> #3  0x00000001800e8eed in realloc (p=0x0, size=1024) at
>>>>>> ../../../../temp/winsup/cygwin/malloc_wrapper.cc:73
>>>>>
>>>>> Er... huh?  So both threads are in a malloc function?  This shouldn't
>>>>> have happened, given the clunky muto guarding malloc calls.  This is
>>>>> really strange.  Why's the muto not working here?
>>>>
>>>> Is it possible both threads have executed malloc_init()?
>>>> If so, the second one would reinit the muto.
>>>
>>> Or does the fifo_reader thread call a malloc function before the main thread has
>>> called malloc_init()?  This would presumably cause __malloc_lock() to fail, but
>>> there's no error check.
>>
>> If there's a global constructor involved, that is known to happen. Constructors
>> are run from dll_crt0_0(), before malloc_init() is called from dll_crt0_1(). See
>> dcrt0.cc for the details.
>
> So how about moving malloc_init() call from dll_crt0_1() to dll_crt0_0()
> so that malloc() can be called in fixup_after_fork/exec()?

It appears simple, but this is a touchy area of code. The _0 and _1 are two
separate phases of process startup. I'd want to hear Corinna's thoughts on this.
I'd also like to verify somehow that this is the scenario Ken is hitting.

When I was researching different mallocs for Cygwin I hit the constructor snag
repeatedly. I did try delaying the constructor run until after malloc_init(),
but that led to more problems. I did not try moving malloc_init() to before the
constructor run.

..mark
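
To make the ordering question concrete, here is a minimal toy model of the two orderings discussed above. It is only a sketch, not Cygwin code: ToyAllocator, InitOrder and simulate_startup() are hypothetical names invented for illustration, a std::mutex stands in for the muto, and the real work done in dll_crt0_0() and dll_crt0_1() is reduced to a couple of calls. It assumes, as suggested earlier in the thread, that an allocation made before the lock exists simply proceeds unguarded.

```cpp
// Illustrative model only. All names here are hypothetical, not Cygwin's.
#include <cassert>
#include <memory>
#include <mutex>

// Stand-in for an allocator whose lock is created by a malloc_init()-style call.
struct ToyAllocator {
    std::unique_ptr<std::mutex> lock;  // null until init() has run
    int unguarded_calls = 0;           // allocations made with no lock available
    int guarded_calls = 0;             // allocations made under the lock

    // Idempotent on purpose: a second call must not replace a lock in use.
    void init() {
        if (!lock) lock = std::make_unique<std::mutex>();
    }

    void allocate() {
        if (!lock) {            // assumed behaviour: no lock yet, no error check
            ++unguarded_calls;
            return;
        }
        std::lock_guard<std::mutex> guard(*lock);
        ++guarded_calls;
    }
};

enum class InitOrder { ConstructorsFirst, AllocatorFirst };

// Models a two-phase startup: phase 0 runs "global constructors" (which
// allocate), phase 1 initializes the allocator if that has not happened yet.
ToyAllocator simulate_startup(InitOrder order) {
    ToyAllocator a;
    if (order == InitOrder::AllocatorFirst) a.init();  // proposed ordering
    a.allocate();              // phase 0: a constructor calls the allocator
    a.init();                  // phase 1: no-op if already initialized
    assert(a.lock != nullptr);
    a.allocate();              // ordinary allocation after startup
    return a;
}
```

Calling simulate_startup(InitOrder::ConstructorsFirst) leaves one allocation counted as unguarded, while InitOrder::AllocatorFirst leaves none. The model says nothing about what else in the real startup code depends on the current order, which is the touchy part.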