public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* Deadlock from --enable-thread-safety
@ 2023-08-04 14:21 Heather McIntyre
  2023-08-04 14:46 ` John Mellor-Crummey
  0 siblings, 1 reply; 2+ messages in thread
From: Heather McIntyre @ 2023-08-04 14:21 UTC (permalink / raw)
  To: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 2112 bytes --]

I've been making --enable-thread-safety (USE_LOCKS) more viable by fixing
deadlocks throughout the libelf library. This has required minimal code
changes so far. However, I've hit a snag where "__elf64_updatenull_wrlock"
calls "elf64_getchdr," which leads to "elf64_getshdr." This sequence
attempts to acquire a read lock under a write lock and fails. Similarly,
"elf64_getchdr" calls "elf_getdata," which also tries and fails to
implement a read lock.

Unfortunately, fixing this particular deadlock requires more than minimal
code changes. From what I understand, I may have a few options on how to
fix this:

1) Create copies of the getchdr and elf_getdata functions, renaming them
getchdr_wrlock and elf_getdata_wrlock. However, multiple copies of these
functions could bloat the code, which may not be ideal.
2) Some type of write lock flag to indicate when a write lock is active.
Either within the locking macro in eu-config.h or passed as an argument in
the functions.

Ultimately, I thought others may want to look into this to see if there
might be options for a better solution. Here is the backtrace:

Program received signal SIGABRT, Aborted.
0x00007ffff7837aff in raise () from /lib64/libc.so.6
#0  0x00007ffff7837aff in raise () from /lib64/libc.so.6
#1  0x00007ffff780aea5 in abort () from /lib64/libc.so.6
#2  0x00007ffff780ad79 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
#3  0x00007ffff78304c9 in __assert_perror_fail () from /lib64/libc.so.6
#4  0x00007ffff7fda554 in elf64_getshdr (scn=0x40fc20) at
../../libelf/elf32_getshdr.c:292
#5  0x00007ffff7fe9590 in elf64_getchdr (scn=0x40fc20) at
../../libelf/elf32_getchdr.c:45
#6  0x00007ffff7fdf690 in __elf64_updatenull_wrlock (elf=0x40d880,
change_bop=0x7fffffffac60, shnum=30) at ../../libelf/elf32_updatenull.c:407
#7  0x00007ffff7fdd83d in elf_update (elf=0x40d880, cmd=ELF_C_WRITE) at
../../libelf/elf_update.c:211
#8  0x0000000000405d9e in process_file (fname=0x7fffffffbb96
"testfile-s390x-hash-both") at ../../src/elfcompress.c:1336
#9  0x0000000000406232 in main (argc=7, argv=0x7fffffffb768) at
../../src/elfcompress.c:1458

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Deadlock from --enable-thread-safety
  2023-08-04 14:21 Deadlock from --enable-thread-safety Heather McIntyre
@ 2023-08-04 14:46 ` John Mellor-Crummey
  0 siblings, 0 replies; 2+ messages in thread
From: John Mellor-Crummey @ 2023-08-04 14:46 UTC (permalink / raw)
  To: Heather McIntyre; +Cc: elfutils-devel

Mark,

A third option that Heather and I discussed for the routines that needed replication (because sometimes they are called holding a write lock and sometimes not) was to put each routine in an include file where the write lock status is expected to be defined as a macro. Then, we can define the macro one way (write lock held), include the file, then define the macro the other way (write lock not held) and include the function again. The write lock status would be incorporated into the function name. 

This avoids two copies of the code to maintain and calls the correct helper routines that either 
- use the lock already possessed or 
- acquire their own lock if the caller doesn’t have it already.

Would that strategy be acceptable to you. None of the code does anything like that as far as I know. 

John Mellor-Crummey

(sent from my phone)

> On Aug 4, 2023, at 8:21 AM, Heather McIntyre via Elfutils-devel <elfutils-devel@sourceware.org> wrote:
> 
> I've been making --enable-thread-safety (USE_LOCKS) more viable by fixing
> deadlocks throughout the libelf library. This has required minimal code
> changes so far. However, I've hit a snag where "__elf64_updatenull_wrlock"
> calls "elf64_getchdr," which leads to "elf64_getshdr." This sequence
> attempts to acquire a read lock under a write lock and fails. Similarly,
> "elf64_getchdr" calls "elf_getdata," which also tries and fails to
> implement a read lock.
> 
> Unfortunately, fixing this particular deadlock requires more than minimal
> code changes. From what I understand, I may have a few options on how to
> fix this:
> 
> 1) Create copies of the getchdr and elf_getdata functions, renaming them
> getchdr_wrlock and elf_getdata_wrlock. However, multiple copies of these
> functions could bloat the code, which may not be ideal.
> 2) Some type of write lock flag to indicate when a write lock is active.
> Either within the locking macro in eu-config.h or passed as an argument in
> the functions.
> 
> Ultimately, I thought others may want to look into this to see if there
> might be options for a better solution. Here is the backtrace:
> 
> Program received signal SIGABRT, Aborted.
> 0x00007ffff7837aff in raise () from /lib64/libc.so.6
> #0  0x00007ffff7837aff in raise () from /lib64/libc.so.6
> #1  0x00007ffff780aea5 in abort () from /lib64/libc.so.6
> #2  0x00007ffff780ad79 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
> #3  0x00007ffff78304c9 in __assert_perror_fail () from /lib64/libc.so.6
> #4  0x00007ffff7fda554 in elf64_getshdr (scn=0x40fc20) at
> ../../libelf/elf32_getshdr.c:292
> #5  0x00007ffff7fe9590 in elf64_getchdr (scn=0x40fc20) at
> ../../libelf/elf32_getchdr.c:45
> #6  0x00007ffff7fdf690 in __elf64_updatenull_wrlock (elf=0x40d880,
> change_bop=0x7fffffffac60, shnum=30) at ../../libelf/elf32_updatenull.c:407
> #7  0x00007ffff7fdd83d in elf_update (elf=0x40d880, cmd=ELF_C_WRITE) at
> ../../libelf/elf_update.c:211
> #8  0x0000000000405d9e in process_file (fname=0x7fffffffbb96
> "testfile-s390x-hash-both") at ../../src/elfcompress.c:1336
> #9  0x0000000000406232 in main (argc=7, argv=0x7fffffffb768) at
> ../../src/elfcompress.c:1458

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-08-04 14:46 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-04 14:21 Deadlock from --enable-thread-safety Heather McIntyre
2023-08-04 14:46 ` John Mellor-Crummey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).