From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 1791) id 7FD7A3857803; Tue, 7 Nov 2023 16:41:44 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7FD7A3857803 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1699375304; bh=PvZMMw/RqSB2fOPumJMGvBbELJv0KhSpTvlpsl451YM=; h=From:To:Subject:Date:From; b=SqZW2QPSoHAl6LOwMeNgQ9KpG//jeCWHENYIHOzvQ6WkCzV88HN0i8w6jH5KfhV24 EddlDI2OyNTRy6plLbDijF+K50BmDEFOh5Kmn8eIt3jILz9vqOWKKh1hnrtyarHvG7 CG6z4GhngrOW4exfZ7UfmvROm0PRUxEobARAyMGU= Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: Adhemerval Zanella To: glibc-cvs@sourceware.org Subject: [glibc] elf: Add glibc.mem.decorate_maps tunable X-Act-Checkin: glibc X-Git-Author: Adhemerval Zanella X-Git-Refname: refs/heads/master X-Git-Oldrev: f10ba2ab250b04e47868cfb888df22058436173d X-Git-Newrev: bf033c0072554366fe9617c283c982594059ad9d Message-Id: <20231107164144.7FD7A3857803@sourceware.org> Date: Tue, 7 Nov 2023 16:41:44 +0000 (GMT) List-Id: https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=bf033c0072554366fe9617c283c982594059ad9d commit bf033c0072554366fe9617c283c982594059ad9d Author: Adhemerval Zanella Date: Wed Nov 1 09:56:11 2023 -0300 elf: Add glibc.mem.decorate_maps tunable The PR_SET_VMA_ANON_NAME support is only enabled through a configurable kernel switch, mainly because assigning a name to a anonymous virtual memory area might prevent that area from being merged with adjacent virtual memory areas. For instance, with the following code: void *p1 = mmap (NULL, 1024 * 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); void *p2 = mmap (p1 + (1024 * 4096), 1024 * 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); The kernel will potentially merge both mappings resulting in only one segment of size 0x800000. If the segment is names with PR_SET_VMA_ANON_NAME with different names, it results in two mappings. Although this will unlikely be an issue for pthread stacks and malloc arenas (since for pthread stacks the guard page will result in a PROT_NONE segment, similar to the alignment requirement for the arena block), it still might prevent the mmap memory allocated for detail malloc. There is also another potential scalability issue, where the prctl requires to take the mmap global lock which is still not fully fixed in Linux [1] (for pthread stacks and arenas, it is mitigated by the stack cached and the arena reuse). So this patch disables anonymous mapping annotations as default and add a new tunable, glibc.mem.decorate_maps, can be used to enable it. [1] https://lwn.net/Articles/906852/ Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie Diff: --- NEWS | 5 +++++ elf/Makefile | 2 +- elf/dl-tunables.list | 5 +++++ manual/tunables.texi | 17 +++++++++++++++++ sysdeps/unix/sysv/linux/setvmaname.c | 15 ++++++++++----- 5 files changed, 38 insertions(+), 6 deletions(-) diff --git a/NEWS b/NEWS index 4580fe381d..139cfef1b0 100644 --- a/NEWS +++ b/NEWS @@ -38,6 +38,11 @@ Major new features: and the wfN format length modifiers for arguments pointing to types int_fastN_t or uint_fastN_t, as specified in draft ISO C2X. +* A new tunable, glibc.mem.decorate_maps, can be used to add additional + information on underlying memory allocated by the glibc (for instance, + on thread stack created by pthread_create or memory allocated by + malloc). + Deprecated and removed features, and other changes affecting compatibility: * The ldconfig program now skips file names containing ';' or ending in diff --git a/elf/Makefile b/elf/Makefile index 328dbe82de..f9bd86a05a 100644 --- a/elf/Makefile +++ b/elf/Makefile @@ -2985,5 +2985,5 @@ $(objpfx)tst-dlclose-lazy.out: \ $(objpfx)tst-decorate-maps: $(shared-thread-library) tst-decorate-maps-ENV = \ - GLIBC_TUNABLES=glibc.malloc.arena_max=8:glibc.malloc.mmap_threshold=1024 + GLIBC_TUNABLES=glibc.malloc.arena_max=8:glibc.malloc.mmap_threshold=1024:glibc.mem.decorate_maps=1 tst-decorate-maps-ARGS = 8 diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list index 695ba7192e..888d2ede04 100644 --- a/elf/dl-tunables.list +++ b/elf/dl-tunables.list @@ -160,6 +160,11 @@ glibc { maxval: 255 security_level: SXID_IGNORE } + decorate_maps { + type: INT_32 + minval: 0 + maxval: 1 + } } rtld { diff --git a/manual/tunables.texi b/manual/tunables.texi index 776fd93fd9..c28360adcd 100644 --- a/manual/tunables.texi +++ b/manual/tunables.texi @@ -653,6 +653,23 @@ support in the kernel if this tunable has any non-zero value. The default value is @samp{0}, which disables all memory tagging. @end deftp +@deftp Tunable glibc.mem.decorate_maps +If the kernel supports naming anonymous virtual memory areas (since +Linux version 5.17, although not always enabled by some kernel +configurations), this tunable can be used to control whether +@theglibc{} decorates the underlying memory obtained from operating +system with a string describing its usage (for instance, on the thread +stack created by @code{ptthread_create} or memory allocated by +@code{malloc}). + +The process mappings can be obtained by reading the @code{/proc/maps} +(with @code{pid} being either the @dfn{process ID} or @code{self} for the +process own mapping). + +This tunable takes a value of 0 and 1, where 1 enables the feature. +The default value is @samp{0}, which disables the decoration. +@end deftp + @node gmon Tunables @section gmon Tunables @cindex gmon tunables diff --git a/sysdeps/unix/sysv/linux/setvmaname.c b/sysdeps/unix/sysv/linux/setvmaname.c index 9960ab5917..cd6d571772 100644 --- a/sysdeps/unix/sysv/linux/setvmaname.c +++ b/sysdeps/unix/sysv/linux/setvmaname.c @@ -20,6 +20,7 @@ #include #include #include +#include /* If PR_SET_VMA_ANON_NAME is not supported by the kernel, prctl returns EINVAL. However, it also returns the same error for invalid argument. @@ -34,11 +35,15 @@ __set_vma_name (void *start, size_t len, const char *name) if (atomic_load_relaxed (&prctl_supported) == 0) return; - int r = INTERNAL_SYSCALL_CALL (prctl, PR_SET_VMA, PR_SET_VMA_ANON_NAME, - start, len, name); - if (r == 0 || r != -EINVAL) - return; - + /* Set the prctl as not supported to avoid checking the tunable on every + call. */ + if (TUNABLE_GET (glibc, mem, decorate_maps, int32_t, NULL) != 0) + { + int r = INTERNAL_SYSCALL_CALL (prctl, PR_SET_VMA, PR_SET_VMA_ANON_NAME, + start, len, name); + if (r == 0 || r != -EINVAL) + return; + } atomic_store_relaxed (&prctl_supported, 0); return; }