* status of dj/malloc branch?
From: Eric Wong @ 2024-04-01 19:19 UTC
To: libc-alpha
I'm interested in the tracing features described at
https://sourceware.org/glibc/wiki/MallocTracing
to test and validate memory fragmentation avoidance in a long-lived
single-threaded Perl C10K HTTP/IMAP/NNTP/POP3 daemon.
The branch appears to have stalled years ago, however, and the current
glibc malloc doesn't have the trace + replay features.
I'm currently dogfooding the patch below on an old glibc (Debian
oldstable :x) on my "production" home server.  My theory is that the
jemalloc idea of having fewer possible allocation sizes is good for
avoiding fragmentation in long-lived processes.
This is because sizes for string processing are highly
variable and lifetimes are mixed in event-driven C10K servers,
where some clients live only for a single request and others for
many.  Clients end up sharing allocations due to caching
and deduplication, so a short-lived client can end up allocating
something that lives a long time.  Perl does lazy loading
and internal caching+memoization all over the place, too.
The downside is 0-20% waste on the initial fit, but I expect it to
achieve better fits over time...
Not a serious patch against Debian glibc 2.31-13+deb11u8:
diff --git a/malloc/malloc.c b/malloc/malloc.c
index f7cd29bc..6e0b066d 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -3018,6 +3018,31 @@ tcache_thread_shutdown (void)
 
 #endif /* !USE_TCACHE */
 
+static inline size_t
+size_class_pad (size_t bytes)
+{
+  if (bytes <= MAX_FAST_SIZE || bytes >= DEFAULT_MMAP_THRESHOLD_MAX)
+    return bytes;
+  /*
+   * Use jemalloc-inspired size classes for mid-size allocations to
+   * minimize fragmentation.  This means we pay a 0-20% overhead on
+   * the initial allocations to improve the likelihood of reuse.
+   */
+  size_t max = sizeof (void *) << 4; /* 128 on LP64 */
+  size_t nxt;
+
+  do {
+    if (bytes <= max) {
+      size_t sc_bytes = ALIGN_UP (bytes, max >> 3);
+
+      return sc_bytes <= DEFAULT_MMAP_THRESHOLD_MAX ? sc_bytes : bytes;
+    }
+    nxt = max << 1;
+  } while (nxt > max && nxt < DEFAULT_MMAP_THRESHOLD_MAX && (max = nxt));
+
+  return bytes;
+}
+
 void *
 __libc_malloc (size_t bytes)
 {
@@ -3031,6 +3056,7 @@ __libc_malloc (size_t bytes)
     = atomic_forced_read (__malloc_hook);
   if (__builtin_expect (hook != NULL, 0))
     return (*hook)(bytes, RETURN_ADDRESS (0));
+  bytes = size_class_pad (bytes);
 #if USE_TCACHE
   /* int_free also calls request2size, be careful to not pad twice.  */
   size_t tbytes;
@@ -3150,6 +3176,8 @@ __libc_realloc (void *oldmem, size_t bytes)
   if (oldmem == 0)
     return __libc_malloc (bytes);
 
+  bytes = size_class_pad (bytes);
+
   /* chunk corresponding to oldmem */
   const mchunkptr oldp = mem2chunk (oldmem);
   /* its size */
@@ -3391,6 +3419,7 @@ __libc_calloc (size_t n, size_t elem_size)
       return memset (mem, 0, sz);
     }
 
+  sz = size_class_pad (sz);
   MAYBE_INIT_TCACHE ();
 
   if (SINGLE_THREAD_P)
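
To make the padding concrete, here is a standalone sketch of the same
rounding logic.  MAX_FAST_SIZE and DEFAULT_MMAP_THRESHOLD_MAX below are
stand-ins approximating the usual LP64 glibc values, not taken from a
real build:

/* size_class_demo.c: print the padded size for sample requests */
#include <stdio.h>
#include <stddef.h>

#define MAX_FAST_SIZE (80 * sizeof (size_t) / 4)  /* 160 on LP64 */
#define DEFAULT_MMAP_THRESHOLD_MAX (4 * 1024 * 1024 * sizeof (long))
#define ALIGN_UP(n, a) (((n) + (a) - 1) & ~((size_t)(a) - 1))

static size_t
size_class_pad (size_t bytes)
{
  if (bytes <= MAX_FAST_SIZE || bytes >= DEFAULT_MMAP_THRESHOLD_MAX)
    return bytes;
  size_t max = sizeof (void *) << 4;  /* 128: top of the first group */
  size_t nxt;

  do {
    if (bytes <= max) {
      /* eight evenly spaced classes per power-of-two group */
      size_t sc_bytes = ALIGN_UP (bytes, max >> 3);

      return sc_bytes <= DEFAULT_MMAP_THRESHOLD_MAX ? sc_bytes : bytes;
    }
    nxt = max << 1;
  } while (nxt > max && nxt < DEFAULT_MMAP_THRESHOLD_MAX && (max = nxt));

  return bytes;
}

int
main (void)
{
  static const size_t v[] = { 161, 200, 257, 1000, 5000, 100000 };

  for (size_t i = 0; i < sizeof v / sizeof v[0]; i++)
    printf ("%6zu -> %6zu\n", v[i], size_class_pad (v[i]));
  return 0;
}

On LP64 this prints 161 -> 192, 200 -> 224, 257 -> 320, 1000 -> 1024,
5000 -> 5120, and 100000 -> 114688: each power-of-two range is split
into eight classes, which is where the initial padding overhead
mentioned above comes from.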
* Re: status of dj/malloc branch?
From: DJ Delorie @ 2024-04-01 20:59 UTC
To: Eric Wong; +Cc: libc-alpha
If you're wondering about the branch itself, the harsh answer is that
I'm not maintaining it.
If you're wondering about the goals therein, IIRC we discovered we had
no good way to visualize and analyze the heap itself in order to
understand what causes the problem we're trying to solve. While these
are solvable problems, they're big projects that never quite made it to
the top of our priority lists.
* Re: status of dj/malloc branch?
From: Eric Wong @ 2024-04-06 22:02 UTC
To: DJ Delorie; +Cc: libc-alpha
DJ Delorie <dj@redhat.com> wrote:
>
> If you're wondering about the branch itself, the harsh answer is that
> I'm not maintaining it.
>
> If you're wondering about the goals therein, IIRC we discovered we had
> no good way to visualize and analyze the heap itself in order to
> understand what causes the problem we're trying to solve. While these
> are solvable problems, they're big projects that never quite made it to
> the top of our priority lists.
Thanks for the response.
I started doing my own tracing[1] in mwrap-perl[2], and it's
crazy expensive (I/O and storage) to trace all the allocations
done by a busy Perl process.  I had to add compression (using
zstd) to slow the growth of disk usage; I hope I can get useful,
reproducible data before I run out of space.
[1] https://80x24.org/mwrap-perl/20240406214954.159627-1-e@80x24.org/
[2] https://80x24.org/mwrap-perl.git
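
A minimal sketch of the streaming-compression idea with libzstd follows;
trace_init, trace_write, and the 4 KiB buffer here are illustrative
choices, not mwrap-perl's actual internals, and error checking is
elided:

#include <stdio.h>
#include <zstd.h>

static ZSTD_CCtx *cctx;
static FILE *trace_fp;

static int
trace_init (const char *path)
{
  trace_fp = fopen (path, "wb");
  cctx = ZSTD_createCCtx ();
  return (trace_fp && cctx) ? 0 : -1;
}

/* compress one trace record and append it to the output file */
static void
trace_write (const void *rec, size_t len)
{
  ZSTD_inBuffer in = { rec, len, 0 };
  char buf[4096];

  while (in.pos < in.size)
    {
      ZSTD_outBuffer out = { buf, sizeof buf, 0 };

      ZSTD_compressStream2 (cctx, &out, &in, ZSTD_e_continue);
      fwrite (buf, 1, out.pos, trace_fp);
    }
}

At shutdown, a final ZSTD_compressStream2 call with ZSTD_e_end flushes
the remaining buffered data before closing trace_fp.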
* Re: status of dj/malloc branch?
From: DJ Delorie @ 2024-04-08 18:37 UTC
To: Eric Wong; +Cc: libc-alpha
Eric Wong <e@80x24.org> writes:
> I started doing my own tracing[1] in mwrap-perl[2] and it's
> crazy expensive (I/O and storage) to trace all the allocations
Yeah, we ran into that problem too.  64-bit pointers plus many-argument
functions like realloc result in a HUGE log file.  Like, terabytes in
one case.
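
As a rough back-of-envelope (illustrative numbers, not from that case):
a realloc record of one opcode byte plus three 8-byte words for the old
pointer, new pointer, and requested size is 25 bytes; at a million
calls per second that is ~25 MB/s of raw trace, or roughly 2 TB per
day.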