From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id D71763858D28 for ; Tue, 18 Apr 2023 01:58:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D71763858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1681783098; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to; bh=3Pnt3Ccx0atRcdRezGnTxXIEEKofULQRFNJBiEtTkas=; b=T8cRUyQ5i6KskZzhUH82GGcWnI/+kcfnzk6Xaew+KpUslII9uJjMicONWYbu5KFpROnkSX Ewbm3DWA3i7WMTfwCic8y0qixaDszcXuzVmin6wfm2Z+1BMYXw/SgphGKFr/GPBoFFn0UB 4WZWTYR0466ZYhoB5ADGXCXezJMAGRI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-639-8_0umoP2OP-8zVItRVZcgQ-1; Mon, 17 Apr 2023 21:58:16 -0400 X-MC-Unique: 8_0umoP2OP-8zVItRVZcgQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8D3DF8996E0 for ; Tue, 18 Apr 2023 01:58:16 +0000 (UTC) Received: from greed.delorie.com (unknown [10.22.8.27]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 6230940C83AC; Tue, 18 Apr 2023 01:58:16 +0000 (UTC) Received: from greed.delorie.com.redhat.com (localhost [127.0.0.1]) by greed.delorie.com (8.15.2/8.15.2) with ESMTP id 33I1P50u1107999; Mon, 17 Apr 2023 21:25:35 -0400 From: DJ Delorie To: "Carlos O'Donell" Cc: libc-alpha@sourceware.org Subject: [PATCH v7] memalign: Support scanning for aligned chunks. In-Reply-To: <01b97455-b4d9-e12d-f072-c76f6e58bcd4@redhat.com> Date: Mon, 17 Apr 2023 21:25:05 -0400 Message-ID: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: "Carlos O'Donell" writes: > DJ and I sat down to review v6 in detail. > > We need a v7 for three reasons: > > (a) chunk_ok_for_memalign() is commented as taking *user* bytes but you pass > chunksize in this change and the comment needs adjusting. You might as > well change `size_t bytes` to `size_t nb` to be consistent. > > (b) If we are now passing chunksize then the second line of the function which > computes size, should use, not memsize(p), but chunksize(p), to correctly > account for the header bytes (otherwise we're conservative). > > (c) We need to exclude the test from mcheck runs via tests-exclude-mcheck since > the expected chunk results won't work when mcheck is in effect. > > FAIL: malloc/tst-memalign-2-mcheck > FAIL: malloc/tst-memalign-3-mcheck Adjusted as requested. As noted previously, a large chunk of the malloc.c diff is merely indentation change (with the exception of the calls to chunk_ok_for_memalign). >From fee79d5ab6d385817ea88e5097254a9559c35878 Mon Sep 17 00:00:00 2001 From: DJ Delorie Date: Mon, 3 Apr 2023 17:33:03 -0400 Subject: malloc: set NON_MAIN_ARENA flag for reclaimed memalign chunk (BZ #30101) Based on these comments in malloc.c: size field is or'ed with NON_MAIN_ARENA if the chunk was obtained from a non-main arena. This is only set immediately before handing the chunk to the user, if necessary. The NON_MAIN_ARENA flag is never set for unsorted chunks, so it does not have to be taken into account in size comparisons. When we pull a chunk off the unsorted list (or any list) we need to make sure that flag is set properly before returning the chunk. Use the rounded-up size for chunk_ok_for_memalign() Do not scan the arena for reusable chunks if there's no arena. Account for chunk overhead when determining if a chunk is a reuse candidate. mcheck interferes with memalign, so skip mcheck variants of memalign tests. diff --git a/malloc/Makefile b/malloc/Makefile index f49675845e..071dfdb9d8 100644 --- a/malloc/Makefile +++ b/malloc/Makefile @@ -43,7 +43,8 @@ tests := mallocbug tst-malloc tst-valloc tst-calloc tst-obstack \ tst-tcfree1 tst-tcfree2 tst-tcfree3 \ tst-safe-linking \ tst-mallocalign1 \ - tst-memalign-2 + tst-memalign-2 \ + tst-memalign-3 tests-static := \ tst-interpose-static-nothread \ @@ -71,7 +72,7 @@ test-srcs = tst-mtrace # with MALLOC_CHECK_=3 because they expect a specific failure. tests-exclude-malloc-check = tst-malloc-check tst-malloc-usable \ tst-mxfast tst-safe-linking \ - tst-compathooks-off tst-compathooks-on tst-memalign-2 + tst-compathooks-off tst-compathooks-on tst-memalign-2 tst-memalign-3 # Run all tests with MALLOC_CHECK_=3 tests-malloc-check = $(filter-out $(tests-exclude-malloc-check) \ @@ -116,6 +117,8 @@ tests-exclude-mcheck = tst-mallocstate \ tst-malloc-usable-tunables \ tst-malloc_info \ tst-compathooks-off tst-compathooks-on \ + tst-memalign-2 \ + tst-memalign-3 \ tst-mxfast tests-mcheck = $(filter-out $(tests-exclude-mcheck) $(tests-static), $(tests)) diff --git a/malloc/malloc.c b/malloc/malloc.c index 0315ac5d16..e33ed665db 100644 --- a/malloc/malloc.c +++ b/malloc/malloc.c @@ -4974,13 +4974,13 @@ _int_realloc (mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize, /* Returns 0 if the chunk is not and does not contain the requested aligned sub-chunk, else returns the amount of "waste" from - trimming. BYTES is the *user* byte size, not the chunk byte + trimming. NB is the *chunk* byte size, not the user byte size. */ static size_t -chunk_ok_for_memalign (mchunkptr p, size_t alignment, size_t bytes) +chunk_ok_for_memalign (mchunkptr p, size_t alignment, size_t nb) { void *m = chunk2mem (p); - INTERNAL_SIZE_T size = memsize (p); + INTERNAL_SIZE_T size = chunksize (p); void *aligned_m = m; if (__glibc_unlikely (misaligned_chunk (p))) @@ -4997,12 +4997,12 @@ chunk_ok_for_memalign (mchunkptr p, size_t alignment, size_t bytes) /* If it's a perfect fit, it's an exception to the return value rule (we would return zero waste, which looks like "not usable"), so handle it here by returning a small non-zero value instead. */ - if (size == bytes && front_extra == 0) + if (size == nb && front_extra == 0) return 1; /* If the block we need fits in the chunk, calculate total waste. */ - if (size > bytes + front_extra) - return size - bytes; + if (size > nb + front_extra) + return size - nb; /* Can't use this chunk. */ return 0; @@ -5048,95 +5048,98 @@ _int_memalign (mstate av, size_t alignment, size_t bytes) and unlikely to meet our alignment requirements. We have not done any experimentation with searching for aligned fastbins. */ - int first_bin_index; - int first_largebin_index; - int last_bin_index; + if (av != NULL) + { + int first_bin_index; + int first_largebin_index; + int last_bin_index; - if (in_smallbin_range (nb)) - first_bin_index = smallbin_index (nb); - else - first_bin_index = largebin_index (nb); + if (in_smallbin_range (nb)) + first_bin_index = smallbin_index (nb); + else + first_bin_index = largebin_index (nb); - if (in_smallbin_range (nb * 2)) - last_bin_index = smallbin_index (nb * 2); - else - last_bin_index = largebin_index (nb * 2); + if (in_smallbin_range (nb * 2)) + last_bin_index = smallbin_index (nb * 2); + else + last_bin_index = largebin_index (nb * 2); - first_largebin_index = largebin_index (MIN_LARGE_SIZE); + first_largebin_index = largebin_index (MIN_LARGE_SIZE); - int victim_index; /* its bin index */ + int victim_index; /* its bin index */ - for (victim_index = first_bin_index; - victim_index < last_bin_index; - victim_index ++) - { - victim = NULL; + for (victim_index = first_bin_index; + victim_index < last_bin_index; + victim_index ++) + { + victim = NULL; - if (victim_index < first_largebin_index) - { - /* Check small bins. Small bin chunks are doubly-linked despite - being the same size. */ + if (victim_index < first_largebin_index) + { + /* Check small bins. Small bin chunks are doubly-linked despite + being the same size. */ - mchunkptr fwd; /* misc temp for linking */ - mchunkptr bck; /* misc temp for linking */ + mchunkptr fwd; /* misc temp for linking */ + mchunkptr bck; /* misc temp for linking */ - bck = bin_at (av, victim_index); - fwd = bck->fd; - while (fwd != bck) - { - if (chunk_ok_for_memalign (fwd, alignment, bytes) > 0) - { - victim = fwd; + bck = bin_at (av, victim_index); + fwd = bck->fd; + while (fwd != bck) + { + if (chunk_ok_for_memalign (fwd, alignment, nb) > 0) + { + victim = fwd; - /* Unlink it */ - victim->fd->bk = victim->bk; - victim->bk->fd = victim->fd; - break; + /* Unlink it */ + victim->fd->bk = victim->bk; + victim->bk->fd = victim->fd; + break; + } + + fwd = fwd->fd; + } } + else + { + /* Check large bins. */ + mchunkptr fwd; /* misc temp for linking */ + mchunkptr bck; /* misc temp for linking */ + mchunkptr best = NULL; + size_t best_size = 0; - fwd = fwd->fd; - } - } - else - { - /* Check large bins. */ - mchunkptr fwd; /* misc temp for linking */ - mchunkptr bck; /* misc temp for linking */ - mchunkptr best = NULL; - size_t best_size = 0; + bck = bin_at (av, victim_index); + fwd = bck->fd; - bck = bin_at (av, victim_index); - fwd = bck->fd; + while (fwd != bck) + { + int extra; - while (fwd != bck) - { - int extra; + if (chunksize (fwd) < nb) + break; + extra = chunk_ok_for_memalign (fwd, alignment, nb); + if (extra > 0 + && (extra <= best_size || best == NULL)) + { + best = fwd; + best_size = extra; + } - if (chunksize (fwd) < nb) - break; - extra = chunk_ok_for_memalign (fwd, alignment, bytes); - if (extra > 0 - && (extra <= best_size || best == NULL)) - { - best = fwd; - best_size = extra; - } + fwd = fwd->fd; + } + victim = best; - fwd = fwd->fd; - } - victim = best; + if (victim != NULL) + { + unlink_chunk (av, victim); + break; + } + } - if (victim != NULL) - { - unlink_chunk (av, victim); - break; + if (victim != NULL) + break; } } - if (victim != NULL) - break; - } - /* Strategy: find a spot within that chunk that meets the alignment request, and then possibly free the leading and trailing space. This strategy is incredibly costly and can lead to external @@ -5147,6 +5150,8 @@ _int_memalign (mstate av, size_t alignment, size_t bytes) p = victim; m = chunk2mem (p); set_inuse (p); + if (av != &main_arena) + set_non_main_arena (p); } else { diff --git a/malloc/tst-memalign-2.c b/malloc/tst-memalign-2.c index 4996578e9f..f229283dbf 100644 --- a/malloc/tst-memalign-2.c +++ b/malloc/tst-memalign-2.c @@ -33,9 +33,10 @@ typedef struct TestCase { } TestCase; static TestCase tcache_allocs[] = { - { 24, 8, NULL, NULL }, - { 24, 16, NULL, NULL }, - { 128, 32, NULL, NULL } + { 24, 32, NULL, NULL }, + { 24, 64, NULL, NULL }, + { 128, 128, NULL, NULL }, + { 500, 128, NULL, NULL } }; #define TN array_length (tcache_allocs) @@ -70,11 +71,15 @@ do_test (void) for (i = 0; i < TN; ++ i) { + size_t sz2; + tcache_allocs[i].ptr1 = memalign (tcache_allocs[i].alignment, tcache_allocs[i].size); CHECK (tcache_allocs[i].ptr1, tcache_allocs[i].alignment); + sz2 = malloc_usable_size (tcache_allocs[i].ptr1); free (tcache_allocs[i].ptr1); + /* This should return the same chunk as was just free'd. */ - tcache_allocs[i].ptr2 = memalign (tcache_allocs[i].alignment, tcache_allocs[i].size); + tcache_allocs[i].ptr2 = memalign (tcache_allocs[i].alignment, sz2); CHECK (tcache_allocs[i].ptr2, tcache_allocs[i].alignment); free (tcache_allocs[i].ptr2); diff --git a/malloc/tst-memalign-3.c b/malloc/tst-memalign-3.c new file mode 100644 index 0000000000..ab90d6ca9b --- /dev/null +++ b/malloc/tst-memalign-3.c @@ -0,0 +1,173 @@ +/* Test for memalign chunk reuse. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +typedef struct TestCase { + size_t size; + size_t alignment; + void *ptr1; + void *ptr2; +} TestCase; + +static TestCase tcache_allocs[] = { + { 24, 32, NULL, NULL }, + { 24, 64, NULL, NULL }, + { 128, 128, NULL, NULL }, + { 500, 128, NULL, NULL } +}; +#define TN array_length (tcache_allocs) + +static TestCase large_allocs[] = { + { 23450, 64, NULL, NULL }, + { 23450, 64, NULL, NULL }, + { 23550, 64, NULL, NULL }, + { 23550, 64, NULL, NULL }, + { 23650, 64, NULL, NULL }, + { 23650, 64, NULL, NULL }, + { 33650, 64, NULL, NULL }, + { 33650, 64, NULL, NULL } +}; +#define LN array_length (large_allocs) + +void *p; + +/* Sanity checks, ancillary to the actual test. */ +#define CHECK(p,a) \ + if (p == NULL || !PTR_IS_ALIGNED (p, a)) \ + FAIL_EXIT1 ("NULL or misaligned memory detected.\n"); + +static void * +mem_test (void *closure) +{ + int i; + int j; + int count; + void *ptr[10]; + void *p; + + /* TCache test. */ + for (i = 0; i < TN; ++ i) + { + size_t sz2; + + tcache_allocs[i].ptr1 = memalign (tcache_allocs[i].alignment, tcache_allocs[i].size); + CHECK (tcache_allocs[i].ptr1, tcache_allocs[i].alignment); + sz2 = malloc_usable_size (tcache_allocs[i].ptr1); + free (tcache_allocs[i].ptr1); + + /* This should return the same chunk as was just free'd. */ + tcache_allocs[i].ptr2 = memalign (tcache_allocs[i].alignment, sz2); + CHECK (tcache_allocs[i].ptr2, tcache_allocs[i].alignment); + free (tcache_allocs[i].ptr2); + + TEST_VERIFY (tcache_allocs[i].ptr1 == tcache_allocs[i].ptr2); + } + + /* Test for non-head tcache hits. */ + for (i = 0; i < array_length (ptr); ++ i) + { + if (i == 4) + { + ptr[i] = memalign (64, 256); + CHECK (ptr[i], 64); + } + else + { + ptr[i] = malloc (256); + CHECK (ptr[i], 4); + } + } + for (i = 0; i < array_length (ptr); ++ i) + free (ptr[i]); + + p = memalign (64, 256); + CHECK (p, 64); + + count = 0; + for (i = 0; i < 10; ++ i) + if (ptr[i] == p) + ++ count; + free (p); + TEST_VERIFY (count > 0); + + /* Large bins test. */ + + for (i = 0; i < LN; ++ i) + { + large_allocs[i].ptr1 = memalign (large_allocs[i].alignment, large_allocs[i].size); + CHECK (large_allocs[i].ptr1, large_allocs[i].alignment); + /* Keep chunks from combining by fragmenting the heap. */ + p = malloc (512); + CHECK (p, 4); + } + + for (i = 0; i < LN; ++ i) + free (large_allocs[i].ptr1); + + /* Force the unsorted bins to be scanned and moved to small/large + bins. */ + p = malloc (60000); + + for (i = 0; i < LN; ++ i) + { + large_allocs[i].ptr2 = memalign (large_allocs[i].alignment, large_allocs[i].size); + CHECK (large_allocs[i].ptr2, large_allocs[i].alignment); + } + + count = 0; + for (i = 0; i < LN; ++ i) + { + int ok = 0; + for (j = 0; j < LN; ++ j) + if (large_allocs[i].ptr1 == large_allocs[j].ptr2) + ok = 1; + if (ok == 1) + count ++; + } + + /* The allocation algorithm is complicated outside of the memalign + logic, so just make sure it's working for most of the + allocations. This avoids possible boundary conditions with + empty/full heaps. */ + TEST_VERIFY (count > LN / 2); + + return 0; +} + +static int +do_test (void) +{ + pthread_t p; + + p = xpthread_create (NULL, mem_test, NULL); + xpthread_join (p); + return 0; +} + +#include