public inbox for glibc-bugs@sourceware.org
From: "mail at roychan dot org" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug malloc/30945] New: Core affinity setting incurs lock contentions between threads
Date: Fri, 06 Oct 2023 00:24:43 +0000
Message-ID: <bug-30945-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=30945

            Bug ID: 30945
           Summary: Core affinity setting incurs lock contentions between
                    threads
           Product: glibc
           Version: 2.38
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: malloc
          Assignee: unassigned at sourceware dot org
          Reporter: mail at roychan dot org
  Target Milestone: ---

Created attachment 15156
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15156&action=edit
the example program to reproduce the issue

Hi,

I recently encountered poor malloc/free performance while building a data-intensive application: the deserialization library we use ran 10x slower than expected. Investigation showed that this is because arena_get2 uses __get_nprocs_sched instead of __get_nprocs. Without any core affinity settings, that call returns the real number of cores, so the upper limit on the total number of arenas is set correctly. However, if a thread is pinned to a single core, subsequent malloc calls see n = 1 because the function counts only schedulable cores, so the maximum number of arenas is capped at 8 on 64-bit platforms. This leads to arena lock contention between threads when:

- The program spans multiple cores (say, more than 8 cores).
- Threads are pinned to cores before making any malloc calls, so they have not yet attached to any arena.
- Later memory allocations are served from the arenas.
- The MALLOC_ARENA_MAX tunable is not set to raise the limit manually.

A thread on libc-alpha briefly discussed this issue last year:
https://sourceware.org/pipermail/libc-alpha/2022-June/140123.html
However, it did not include a program that easily reproduces the (un)expected behavior.
Here I would like to provide a minimal example that will expose the problem and, if possible, start further discussion about whether the core counting in arena_get can be implemented better.

The program accepts three arguments: the number of cores, whether each thread is pinned to a core right after its creation, and whether to apply a small "fix". The fix adds a free(malloc(8)) right before the affinity is set in each thread; at that point each thread can still see all the cores, so it can create and attach to a "local" arena that is not shared. The output is the average time each thread takes to finish a batch of malloc/free calls.

The following results were collected on my PC with a 16-core Ryzen 9 5950X, running Linux kernel 6.5.5 and glibc 2.38. The program was compiled with gcc 13.2.1 without optimization flags.

./a.out 32 false false
---
nr_cpu: 32
pin: no
fix: no
thread average (ms): 16.233663

./a.out 32 true false
---
nr_cpu: 32
pin: yes
fix: no
thread average (ms): 1360.919047

./a.out 32 true true
---
nr_cpu: 32
pin: yes
fix: yes
thread average (ms): 15.505453

env GLIBC_TUNABLES='glibc.malloc.arena_max=32' ./a.out 32 true false
---
nr_cpu: 32
pin: yes
fix: no
thread average (ms): 16.036667

I also recorded a few runs with perf. They showed massive overhead in __lll_lock_wait_private and __lll_lock_wake_private.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
Thread overview: 10+ messages

2023-10-06  0:24 mail at roychan dot org [this message]
2023-10-06  0:27 ` [Bug malloc/30945] " mail at roychan dot org
2023-10-11 16:11 ` adhemerval.zanella at linaro dot org
2023-10-12  2:32 ` sam at gentoo dot org
2023-11-22 14:19 ` adhemerval.zanella at linaro dot org
2024-01-11  9:41 ` fweimer at redhat dot com
2024-02-11 22:04 ` kuganv at gmail dot com
2024-02-12 10:14 ` sam at gentoo dot org
2024-02-12 13:24 ` adhemerval.zanella at linaro dot org
2024-02-12 21:49 ` kuganv at gmail dot com