Subject: Re: Deadlock from --enable-thread-safety
From: John Mellor-Crummey
Date: Fri, 4 Aug 2023 08:46:01 -0600
To: Heather McIntyre
Cc: elfutils-devel@sourceware.org
Message-Id: <9FA052F5-00A6-4256-B1A4-C23BBC6E0A77@rice.edu>

Mark,

A third option that Heather and I discussed for the routines that needed
replication (because sometimes they are called holding a write lock and
sometimes not) was to put each routine in an include file where the write
lock status is expected to be defined as a macro. Then we can define the
macro one way (write lock held), include the file, then define the macro
the other way (write lock not held) and include the file again. The write
lock status would be incorporated into the function name.

This avoids maintaining two copies of the code, and each variant calls the
correct helper routines, which either
- use the lock already possessed or
- acquire their own lock if the caller doesn’t have it already.

Would that strategy be acceptable to you?
None of the code does anything like that as far as I know.

John Mellor-Crummey (sent from my phone)

> On Aug 4, 2023, at 8:21 AM, Heather McIntyre via Elfutils-devel wrote:
>
> I've been making --enable-thread-safety (USE_LOCKS) more viable by fixing
> deadlocks throughout the libelf library. This has required minimal code
> changes so far. However, I've hit a snag where "__elf64_updatenull_wrlock"
> calls "elf64_getchdr," which leads to "elf64_getshdr." This sequence
> attempts to acquire a read lock under a write lock and fails. Similarly,
> "elf64_getchdr" calls "elf_getdata," which also tries and fails to
> implement a read lock.
>
> Unfortunately, fixing this particular deadlock requires more than minimal
> code changes. From what I understand, I may have a few options on how to
> fix this:
>
> 1) Create copies of the getchdr and elf_getdata functions, renaming them
> getchdr_wrlock and elf_getdata_wrlock. However, multiple copies of these
> functions could bloat the code, which may not be ideal.
> 2) Some type of write lock flag to indicate when a write lock is active.
> Either within the locking macro in eu-config.h or passed as an argument in
> the functions.
>
> Ultimately, I thought others may want to look into this to see if there
> might be options for a better solution. Here is the backtrace:
>
> Program received signal SIGABRT, Aborted.
> 0x00007ffff7837aff in raise () from /lib64/libc.so.6
> #0 0x00007ffff7837aff in raise () from /lib64/libc.so.6
> #1 0x00007ffff780aea5 in abort () from /lib64/libc.so.6
> #2 0x00007ffff780ad79 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
> #3 0x00007ffff78304c9 in __assert_perror_fail () from /lib64/libc.so.6
> #4 0x00007ffff7fda554 in elf64_getshdr (scn=0x40fc20) at
> ../../libelf/elf32_getshdr.c:292
> #5 0x00007ffff7fe9590 in elf64_getchdr (scn=0x40fc20) at
> ../../libelf/elf32_getchdr.c:45
> #6 0x00007ffff7fdf690 in __elf64_updatenull_wrlock (elf=0x40d880,
> change_bop=0x7fffffffac60, shnum=30) at ../../libelf/elf32_updatenull.c:407
> #7 0x00007ffff7fdd83d in elf_update (elf=0x40d880, cmd=ELF_C_WRITE) at
> ../../libelf/elf_update.c:211
> #8 0x0000000000405d9e in process_file (fname=0x7fffffffbb96
> "testfile-s390x-hash-both") at ../../src/elfcompress.c:1336
> #9 0x0000000000406232 in main (argc=7, argv=0x7fffffffb768) at
> ../../src/elfcompress.c:1458
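
For illustration, the include-file strategy proposed at the top of this
message might look roughly like the sketch below. It is not libelf code and
rests on several assumptions: every name in it (toy_rwlock, toy_scn,
toy_getshdr_*, toy_getchdr_*, toy_update, DEFINE_ACCESSORS, NO_LOCK) is
hypothetical, the toy lock only counts acquisitions instead of using the
real rwlock macros, and a function-generating macro stands in for the
include file so that the example fits in one source file. The mechanism is
meant to be the same: one body, instantiated twice, with the write lock
status folded into the function name and each variant calling helpers of
the same variant.

```c
#include <assert.h>

/* Toy stand-in for a read-write lock (hypothetical, not libelf's rwlock). */
struct toy_rwlock { int readers; int writer; int rd_acquired; };

static void toy_rdlock(struct toy_rwlock *l)
{
    assert(!l->writer);   /* read lock under a write lock: the reported failure */
    l->readers++;
    l->rd_acquired++;     /* count read acquisitions so callers can inspect them */
}

static void toy_wrlock(struct toy_rwlock *l)
{
    assert(!l->writer && l->readers == 0);
    l->writer = 1;
}

static void toy_unlock(struct toy_rwlock *l)
{
    if (l->writer)
        l->writer = 0;
    else {
        assert(l->readers > 0);
        l->readers--;
    }
}

/* Hypothetical section object: a lock plus one protected field. */
struct toy_scn { struct toy_rwlock lock; int shdr; };

#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)
#define NO_LOCK(l) ((void)(l))   /* caller already holds the write lock */

/* Single copy of the routine bodies. In the include-file version this text
   would live in the included file, and SUFFIX, RDLOCK and UNLOCK would be
   macros defined by the includer before each #include. */
#define DEFINE_ACCESSORS(SUFFIX, RDLOCK, UNLOCK)                  \
    int PASTE(toy_getshdr, SUFFIX)(struct toy_scn *scn)           \
    {                                                             \
        RDLOCK(&scn->lock);                                       \
        int v = scn->shdr;                                        \
        UNLOCK(&scn->lock);                                       \
        return v;                                                 \
    }                                                             \
    int PASTE(toy_getchdr, SUFFIX)(struct toy_scn *scn)           \
    {                                                             \
        return PASTE(toy_getshdr, SUFFIX)(scn) + 1;               \
    }

/* Variant for callers holding no lock: takes its own read lock. */
DEFINE_ACCESSORS(_nolock, toy_rdlock, toy_unlock)
/* Variant for callers holding the write lock: takes no lock at all. */
DEFINE_ACCESSORS(_wrlock, NO_LOCK, NO_LOCK)

/* Writer path, analogous in shape to an update routine that holds the
   write lock while calling accessors. */
int toy_update(struct toy_scn *scn, int new_shdr)
{
    toy_wrlock(&scn->lock);
    scn->shdr = new_shdr;
    int v = toy_getchdr_wrlock(scn);   /* the _nolock variant would trip the assert */
    toy_unlock(&scn->lock);
    return v;
}
```

A caller that holds no lock would use toy_getchdr_nolock(&scn); code that is
already inside a write-locked region, like toy_update, would use the _wrlock
variant, so the nested read-lock attempt never happens.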