Date: Fri, 12 Apr 2024 17:02:43 +0000
From: Eric Wong
To: Wilco Dijkstra
Cc: libc-alpha@sourceware.org
Subject: Re: [RFT] malloc: reduce largebin granularity to reduce fragmentation
Message-ID: <20240412170243.M270055@dcvr>

Wilco Dijkstra wrote:
> Hi Eric,
>
> I agree that limiting size classes will help with fragmentation. However
> you need more than this: when it finds a block that is larger than requested
> it will always split off the end even if it is a much smaller block. As a result
> large blocks get smaller over time. Additionally allocations that can't be
> satisfied by the free lists are taken from the large remainder of the arena,
> thus interleaving allocations of all sizes rather than grouping similar sizes
> together in separate regions like modern allocators.

Thanks for taking a look.

I think relying on a smaller (non-sliding) mmap threshold will still be
necessary for my use case at the moment. Right now that's
MALLOC_MMAP_THRESHOLD_=8388608, which is analogous to the
opt.oversize_threshold default in jemalloc. jemalloc has a dedicated
arena for those allocations instead of doing an mmap each time; that may
be another idea we can steal :>
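For reference, a minimal standalone sketch (only the standard glibc
mallopt() interface, nothing from this patch) of pinning the same
threshold from code instead of the environment; setting
M_MMAP_THRESHOLD explicitly also disables the sliding threshold:

/* hypothetical demo: pin the mmap threshold at 8 MiB, mirroring
 * MALLOC_MMAP_THRESHOLD_=8388608 from the environment */
#include <malloc.h>
#include <stdlib.h>

int main(void)
{
	if (mallopt(M_MMAP_THRESHOLD, 8 * 1024 * 1024) != 1)
		abort();	/* mallopt returns 1 on success, 0 on error */

	void *big = malloc(8 * 1024 * 1024);	/* served directly by mmap */
	free(big);				/* munmapped back to the kernel */
	return 0;
}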
> The tcache and smallbin freelist also keeps freed blocks listed as 'allocated',
> thus blocking merging of free blocks into larger ones - this is only done
> every now and again during a consolidation pass. Threads that free blocks
> but never allocate a large block again may never really free their freed
> blocks.

Yeah, I'm only focusing on single-threaded behavior at the moment. My
wfcqueue idea from 2018 should make things much nicer for reducing
fragmentation with threads. I'm mainly working on single-threaded
codebases nowadays, so I'm not yet motivated to work on it. (A tiny
standalone illustration of the consolidation point is appended at the
end of this mail.)

> So to reduce fragmentation you also have to address all these issues too.

Agreed; I'm trying to take small incremental steps in the right
direction. Perfect is the enemy of good, of course.

> My general feeling is that this allocator is way too ancient and hacked up to
> compete with modern allocators like jemalloc or mimalloc.

I remember there were compatibility reasons to stick with the current
one. That's way above my pay grade, though. That said, I do believe our
old design can be improved to be competitive with newer mallocs.
Getting users to run LD_PRELOAD or recompile Perl against a different
malloc is a huge ask, and I don't like having redundant software on my
systems for auditability and space reasons.

Thanks.
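Here's that illustration (a hypothetical demo under current glibc
defaults, not part of the patch): it frees a pile of fastbin-sized
chunks, which stay parked in tcache/fastbins, then calls malloc_trim(0)
to force a consolidation pass so the coalesced space at the top of the
heap can be handed back; compare the two malloc_stats() reports on
stderr.

/* hypothetical demo: freed small chunks linger in tcache/fastbins until
 * a consolidation pass; malloc_trim(0) forces one and then tries to
 * return the coalesced top of the heap to the kernel */
#include <malloc.h>
#include <stdlib.h>

int main(void)
{
	enum { N = 1 << 16 };
	static void *p[N];
	size_t i;

	for (i = 0; i < N; i++)
		p[i] = malloc(64);	/* fastbin-sized requests */
	for (i = 0; i < N; i++)
		free(p[i]);		/* freed, but parked in bins */

	malloc_stats();			/* before consolidation */
	malloc_trim(0);			/* forces a consolidation pass */
	malloc_stats();			/* after: "system bytes" should drop */
	return 0;
}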