public inbox for glibc-bugs@sourceware.org
From: "nicolas at freedelity dot be" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug malloc/30579] New: trim_threshold in realloc lead to high memory usage
Date: Thu, 22 Jun 2023 13:54:08 +0000
Message-ID: <bug-30579-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=30579

            Bug ID: 30579
           Summary: trim_threshold in realloc lead to high memory usage
           Product: glibc
           Version: 2.37
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: malloc
          Assignee: unassigned at sourceware dot org
          Reporter: nicolas at freedelity dot be
  Target Milestone: ---

The recent use of trim_threshold in realloc (commit
f4f2ca1509288f6f780af50659693a89949e7e46:
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=f4f2ca1509288f6f780af50659693a89949e7e46)
prevents unused memory from being reclaimed in some scenarios.

The scenario affecting us is the following: we need to normalize a lot of
strings (around 20M), and their final size after normalization is unknown (we
only know an upper bound of 1K bytes). To avoid many reallocations during
string construction, the memory for each string is preallocated to the maximum
size (1024 bytes) and then shrunk to the final size with realloc.

The default value of trim_threshold is 128K, so the difference between the
preallocated size and the final size is always below the threshold and this
memory is never reclaimed. The process ends up using much more memory than
necessary.
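To make the reasoning above concrete, here is a minimal sketch of the decision as this report describes it. It is a simplified model written for illustration, not code taken from glibc: the function name `realloc_keeps_chunk` and the macro `DEFAULT_TRIM_THRESHOLD` are hypothetical, and the exact condition used by the commit may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Default trim_threshold quoted in the report: 128K. */
#define DEFAULT_TRIM_THRESHOLD ((size_t) 128 * 1024)

/* Hypothetical model, NOT glibc source: returns true when a realloc() down
   to `request` bytes would leave the existing chunk untouched, i.e. the
   request fits and the slack left over does not exceed the threshold. */
static bool realloc_keeps_chunk(size_t usable, size_t request,
                                size_t trim_threshold)
{
    return request <= usable && (usable - request) <= trim_threshold;
}
```

Under this model, shrinking a 1024-byte buffer to 6 bytes leaves about 1018 bytes of slack, far below 128K, so `realloc_keeps_chunk(1024, 6, DEFAULT_TRIM_THRESHOLD)` is true and nothing is given back. Only a threshold smaller than the slack (512, say) would make the model release the tail.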
The following test program reproduces the issue:

```
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
  int i;
  int count = 1000000;
  char ** strings = malloc(sizeof(char*) * count);

  for(i=0; i<count; i++){
    /* Preallocate the upper bound, since the final size is unknown. */
    strings[i] = malloc(sizeof(char) * 1000);
    strcpy(strings[i], "hello");
    /* Shrink to the final size ("hello" plus the terminating NUL). */
    strings[i] = realloc(strings[i], sizeof(char) * 6);
  }

  /* Keep the process alive so that its memory usage can be inspected. */
  while(1) { sleep(100); }

  return 0;
}
```

It constructs 1M strings set to "hello", preallocating a 1000-byte array for
each string to emulate the fact that we do not know its size in advance. The
correct size is then set with a call to realloc.

Up to glibc 2.36, the memory was correctly reclaimed when calling realloc and
the program ended up consuming around 40 MB. Starting with glibc 2.37 and the
mentioned commit, memory usage goes up to 1 GB.

I'm not an expert, but reusing trim_threshold as a heuristic to decide whether
or not to reclaim memory does not seem a good fit for heap memory. From
reading the comments, I would be tempted to set glibc.malloc.trim_threshold
(via GLIBC_TUNABLES) to a high value for performance reasons, but now we are
more or less forced to set it to a very low value to avoid consuming too much
memory. In my opinion, the heuristic should probably be based on an
independent value.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
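For readers who want to check a given glibc without watching the memory of a long-running process, the following sketch inspects a single chunk instead. It assumes glibc's `malloc_usable_size` extension from `<malloc.h>`; the helper name `usable_after_shrink` is hypothetical and is not part of the original report.

```c
#include <malloc.h>   /* malloc_usable_size (glibc extension) */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `prealloc` bytes, shrink the block to
   `final` bytes (must be > 0) with realloc, and return the usable size of
   the resulting chunk. Returns 0 if an allocation fails. */
static size_t usable_after_shrink(size_t prealloc, size_t final)
{
    char *p = malloc(prealloc);
    if (p == NULL)
        return 0;
    memset(p, 0, prealloc);          /* touch the memory, as strcpy would */

    char *q = realloc(p, final);
    if (q == NULL) {
        free(p);                     /* realloc failed: p is still valid */
        return 0;
    }

    size_t usable = malloc_usable_size(q);
    free(q);
    return usable;
}
```

Calling `usable_after_shrink(1000, 6)` and printing the result should, if the behaviour described in this report holds, give a value close to 1000 on an affected glibc and only a few tens of bytes where realloc still splits the chunk. The exact numbers depend on the allocator's alignment and minimum chunk size, so treat them as indicative only.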
Thread overview: 9+ messages
2023-06-22 13:54 nicolas at freedelity dot be [this message]
2023-06-22 14:20 ` [Bug malloc/30579] " siddhesh at sourceware dot org
2023-06-22 15:10 ` nicolas at freedelity dot be
2023-06-28  7:56 ` nicolas at freedelity dot be
2023-06-28 16:31 ` siddhesh at sourceware dot org
2023-07-06 15:38 ` cvs-commit at gcc dot gnu.org
2023-07-06 15:40 ` cvs-commit at gcc dot gnu.org
2023-07-06 15:42 ` siddhesh at sourceware dot org
2023-07-06 16:09 ` nicolas at freedelity dot be