Date: Fri, 12 Apr 2024 17:02:43 +0000
From: Eric Wong
To: Wilco Dijkstra
Cc: libc-alpha@sourceware.org
Subject: Re: [RFT] malloc: reduce largebin granularity to reduce fragmentation
Message-ID: <20240412170243.M270055@dcvr>

Wilco Dijkstra wrote:
> Hi Eric,
>
> I agree that limiting size classes will help with fragmentation. However
> you need more than this: when it finds a block that is larger than requested
> it will always split off the end even if it is a much smaller block. As a result
> large blocks get smaller over time. Additionally allocations that can't be
> satisfied by the free lists are taken from the large remainder of the arena,
> thus interleaving allocations of all sizes rather than grouping similar sizes
> together in separate regions like modern allocators.

Thanks for taking a look.

I think relying on a smaller (non-sliding) mmap threshold will still be
necessary for my use case at the moment. Right now that's
MALLOC_MMAP_THRESHOLD_=8388608, which is analogous to the
opt.oversize_threshold default in jemalloc. jemalloc has a dedicated
arena for those allocations instead of doing an mmap each time; that may
be another idea we can steal :>
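For reference, a minimal standalone sketch (only the standard glibc
mallopt() interface, nothing from this patch) of pinning the same
threshold from code instead of the environment; setting
M_MMAP_THRESHOLD explicitly also disables the sliding threshold:

/* hypothetical demo: pin the mmap threshold at 8 MiB, mirroring
 * MALLOC_MMAP_THRESHOLD_=8388608 from the environment */
#include <malloc.h>
#include <stdlib.h>

int main(void)
{
	if (mallopt(M_MMAP_THRESHOLD, 8 * 1024 * 1024) != 1)
		abort();	/* mallopt returns 1 on success, 0 on error */

	void *big = malloc(8 * 1024 * 1024);	/* served directly by mmap */
	free(big);				/* munmapped back to the kernel */
	return 0;
}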
> The tcache and smallbin freelist also keeps freed blocks listed as 'allocated',
> thus blocking merging of free blocks into larger ones - this is only done
> every now and again during a consolidation pass. Threads that free blocks
> but never allocate a large block again may never really free their freed
> blocks.

Yeah, I'm only focusing on single-threaded behavior at the moment. My
wfcqueue idea from 2018 should make things much nicer for reducing
fragmentation with threads. I'm mainly working on single-threaded
codebases nowadays, so I'm not yet motivated to work on it. (A tiny
standalone illustration of the consolidation point is appended at the
end of this mail.)

> So to reduce fragmentation you also have to address all these issues too.

Agreed; I'm trying to take small incremental steps in the right
direction. Perfect is the enemy of good, of course.

> My general feeling is that this allocator is way too ancient and hacked up to
> compete with modern allocators like jemalloc or mimalloc.

I remember there were compatibility reasons to stick with the current
one. That's way above my pay grade, though. That said, I do believe our
old design can be improved to be competitive with newer mallocs.
Getting users to run LD_PRELOAD or recompile Perl against a different
malloc is a huge ask, and I don't like having redundant software on my
systems for auditability and space reasons.

Thanks.
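Here's that illustration (a hypothetical demo under current glibc
defaults, not part of the patch): it frees a pile of fastbin-sized
chunks, which stay parked in tcache/fastbins, then calls malloc_trim(0)
to force a consolidation pass so the coalesced space at the top of the
heap can be handed back; compare the two malloc_stats() reports on
stderr.

/* hypothetical demo: freed small chunks linger in tcache/fastbins until
 * a consolidation pass; malloc_trim(0) forces one and then tries to
 * return the coalesced top of the heap to the kernel */
#include <malloc.h>
#include <stdlib.h>

int main(void)
{
	enum { N = 1 << 16 };
	static void *p[N];
	size_t i;

	for (i = 0; i < N; i++)
		p[i] = malloc(64);	/* fastbin-sized requests */
	for (i = 0; i < N; i++)
		free(p[i]);		/* freed, but parked in bins */

	malloc_stats();			/* before consolidation */
	malloc_trim(0);			/* forces a consolidation pass */
	malloc_stats();			/* after: "system bytes" should drop */
	return 0;
}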