Subject: Re: posix_memalign performance regression in 2.38?
From: Xi Ruoyao
To: DJ Delorie, Sam James
Cc: adhemerval.zanella@linaro.org, libc-alpha@sourceware.org, dilfridge@gentoo.org, timo@rothenpieler.org
Date: Tue, 08 Aug 2023 16:08:32 +0800

On Mon, 2023-08-07 at 23:38 -0400, DJ Delorie wrote:
>
> Reproduced.
>
> In the case where I reproduced it, the most common problematic case was
> an allocation of 64-byte aligned chunks of 472 bytes, where 30 smallbin
> chunks were tested without finding a match.
>
> The most common non-problematic case was a 64-byte-aligned request for
> 24 bytes.
>
> There were a LOT of other size requests.  The smallest I saw was TWO
> bytes.  WHY?  I'm tempted to not fix this, to teach developers to not
> use posix_memalign() unless they REALLY need it ;-)

Have you tested this?

$ cat t.c
#include <stdlib.h>

int
main()
{
	void *buf;
	for (int i = 0; i < (1 << 16); i++)
		posix_memalign(&buf, 64, 64);
}

To me this is quite reasonable (if we just want many blocks that each fit
into a cache line), but it takes 17.7 seconds on my system.  Do you think
people should just avoid this?  If so, we at least need to document the
issue in the manual.

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University