From: Carlos O'Donell
Organization: Red Hat
To: DJ Delorie
Cc: libc-alpha@sourceware.org
Date: Tue, 18 Apr 2023 09:58:52 -0400
Message-ID: <7c352e30-2b1b-b0e7-5cc0-c6d11cc522e0@redhat.com>
Subject: Re: [PATCH v7] memalign: Support scanning for aligned chunks.

On 4/17/23 21:25, DJ Delorie wrote:
> "Carlos O'Donell" writes:
>> DJ and I sat down to review v6 in detail.
>>
>> We need a v7 for three reasons:
>>
>> (a) chunk_ok_for_memalign() is commented as taking *user* bytes, but you pass
>>     chunksize in this change, so the comment needs adjusting.  You might as
>>     well change `size_t bytes` to `size_t nb` to be consistent.
>>
>> (b) If we are now passing chunksize, then the second line of the function,
>>     which computes size, should use not memsize (p) but chunksize (p), to
>>     correctly account for the header bytes (otherwise we're conservative).
>>
>> (c) We need to exclude the test from mcheck runs via tests-exclude-mcheck,
>>     since the expected chunk results won't hold when mcheck is in effect:
>>
>>     FAIL: malloc/tst-memalign-2-mcheck
>>     FAIL: malloc/tst-memalign-3-mcheck
>
> Adjusted as requested.  As noted previously, a large chunk of the
> malloc.c diff is merely an indentation change (with the exception of the
> calls to chunk_ok_for_memalign).
>
> From fee79d5ab6d385817ea88e5097254a9559c35878 Mon Sep 17 00:00:00 2001
> From: DJ Delorie
> Date: Mon, 3 Apr 2023 17:33:03 -0400
> Subject: malloc: set NON_MAIN_ARENA flag for reclaimed memalign chunk (BZ #30101)
>
> Based on these comments in malloc.c:
>
>   size field is or'ed with NON_MAIN_ARENA if the chunk was obtained
>   from a non-main arena.  This is only set immediately before handing
>   the chunk to the user, if necessary.
>
>   The NON_MAIN_ARENA flag is never set for unsorted chunks, so it
>   does not have to be taken into account in size comparisons.
>
> When we pull a chunk off the unsorted list (or any list) we need to
> make sure that flag is set properly before returning the chunk.
>
> Use the rounded-up size for chunk_ok_for_memalign().
>
> Do not scan the arena for reusable chunks if there's no arena.
>
> Account for chunk overhead when determining if a chunk is a reuse
> candidate.
>
> mcheck interferes with memalign, so skip mcheck variants of the
> memalign tests.

v7 LGTM and addresses all previous issues.

Reviewed-by: Carlos O'Donell
Tested-by: Carlos O'Donell

> diff --git a/malloc/Makefile b/malloc/Makefile
> index f49675845e..071dfdb9d8 100644
> --- a/malloc/Makefile
> +++ b/malloc/Makefile
> @@ -43,7 +43,8 @@ tests := mallocbug tst-malloc tst-valloc tst-calloc tst-obstack \
>  	 tst-tcfree1 tst-tcfree2 tst-tcfree3 \
>  	 tst-safe-linking \
>  	 tst-mallocalign1 \
> -	 tst-memalign-2
> +	 tst-memalign-2 \
> +	 tst-memalign-3

OK.

>
>  tests-static := \
>  	 tst-interpose-static-nothread \
> @@ -71,7 +72,7 @@ test-srcs = tst-mtrace
>  # with MALLOC_CHECK_=3 because they expect a specific failure.
>  tests-exclude-malloc-check = tst-malloc-check tst-malloc-usable \
>  	tst-mxfast tst-safe-linking \
> -	tst-compathooks-off tst-compathooks-on tst-memalign-2
> +	tst-compathooks-off tst-compathooks-on tst-memalign-2 tst-memalign-3

OK.

>
>  # Run all tests with MALLOC_CHECK_=3
>  tests-malloc-check = $(filter-out $(tests-exclude-malloc-check) \
> @@ -116,6 +117,8 @@ tests-exclude-mcheck = tst-mallocstate \
>  	tst-malloc-usable-tunables \
>  	tst-malloc_info \
>  	tst-compathooks-off tst-compathooks-on \
> +	tst-memalign-2 \
> +	tst-memalign-3 \

OK.

>  	tst-mxfast
>
>  tests-mcheck = $(filter-out $(tests-exclude-mcheck) $(tests-static), $(tests))
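For the archives, here is a minimal standalone sketch of the failure mode
the commit title targets.  This is my own illustration, not part of the
patch; it assumes the thread is served from a non-main arena and that the
second memalign reuses the chunk just freed, neither of which is
guaranteed on every run:

#include <malloc.h>
#include <pthread.h>
#include <stdlib.h>

/* In a thread, allocations normally come from a non-main arena.
   Before this fix, a chunk reclaimed by memalign's bin scanning
   could be handed out without NON_MAIN_ARENA set in its size
   field, so a later free () would consult the wrong arena.  */
static void *
worker (void *arg)
{
  void *p = memalign (64, 256);
  free (p);
  p = memalign (64, 256);	/* May reuse the chunk just freed.  */
  free (p);
  return NULL;
}

int
main (void)
{
  pthread_t t;
  if (pthread_create (&t, NULL, worker, NULL) != 0)
    return 1;
  pthread_join (t, NULL);
  return 0;
}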
> diff --git a/malloc/malloc.c b/malloc/malloc.c
> index 0315ac5d16..e33ed665db 100644
> --- a/malloc/malloc.c
> +++ b/malloc/malloc.c
> @@ -4974,13 +4974,13 @@ _int_realloc (mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
>
>  /* Returns 0 if the chunk is not and does not contain the requested
>     aligned sub-chunk, else returns the amount of "waste" from
> -   trimming.  BYTES is the *user* byte size, not the chunk byte
> +   trimming.  NB is the *chunk* byte size, not the user byte

OK. Corrects the comment to use chunksize, as required now that the
chunk byte size is passed.

>     size.  */
>  static size_t
> -chunk_ok_for_memalign (mchunkptr p, size_t alignment, size_t bytes)
> +chunk_ok_for_memalign (mchunkptr p, size_t alignment, size_t nb)
>  {
>    void *m = chunk2mem (p);
> -  INTERNAL_SIZE_T size = memsize (p);
> +  INTERNAL_SIZE_T size = chunksize (p);

OK. Use the correct tight boundary, i.e. chunksize.

>    void *aligned_m = m;
>
>    if (__glibc_unlikely (misaligned_chunk (p)))
> @@ -4997,12 +4997,12 @@ chunk_ok_for_memalign (mchunkptr p, size_t alignment, size_t bytes)
>    /* If it's a perfect fit, it's an exception to the return value rule
>       (we would return zero waste, which looks like "not usable"), so
>       handle it here by returning a small non-zero value instead.  */
> -  if (size == bytes && front_extra == 0)
> +  if (size == nb && front_extra == 0)

OK.

>      return 1;
>
>    /* If the block we need fits in the chunk, calculate total waste.  */
> -  if (size > bytes + front_extra)
> -    return size - bytes;
> +  if (size > nb + front_extra)
> +    return size - nb;

OK.

>
>    /* Can't use this chunk.  */
>    return 0;
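To make the arithmetic above concrete, a small standalone model of the
function, now entirely in chunk-size units.  The function name, the
example address, and the numbers are mine, for illustration only:

#include <stdint.h>
#include <stdio.h>

/* Mirrors chunk_ok_for_memalign's logic: 0 = unusable, 1 = perfect
   fit, otherwise the total waste from trimming.  */
static size_t
waste_for (uintptr_t mem, size_t alignment, size_t chunk_size, size_t nb)
{
  uintptr_t aligned = (mem + alignment - 1) & ~(uintptr_t) (alignment - 1);
  size_t front_extra = aligned - mem;

  if (chunk_size == nb && front_extra == 0)
    return 1;			/* Perfect fit: non-zero sentinel.  */
  if (chunk_size > nb + front_extra)
    return chunk_size - nb;	/* Usable: report the waste.  */
  return 0;			/* Can't use this chunk.  */
}

int
main (void)
{
  /* A 160-byte chunk whose user area starts 16 bytes past a 64-byte
     boundary: front_extra is 48, so for nb = 96 the chunk is usable
     and the waste is 160 - 96 = 64.  */
  printf ("%zu\n", waste_for (0x1010, 64, 160, 96));
  return 0;
}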
> @@ -5048,95 +5048,98 @@ _int_memalign (mstate av, size_t alignment, size_t bytes)
>       and unlikely to meet our alignment requirements.  We have not done
>       any experimentation with searching for aligned fastbins.  */
>
> -  int first_bin_index;
> -  int first_largebin_index;
> -  int last_bin_index;
> +  if (av != NULL)
> +    {
> +      int first_bin_index;
> +      int first_largebin_index;
> +      int last_bin_index;
>
> -  if (in_smallbin_range (nb))
> -    first_bin_index = smallbin_index (nb);
> -  else
> -    first_bin_index = largebin_index (nb);
> +      if (in_smallbin_range (nb))
> +        first_bin_index = smallbin_index (nb);
> +      else
> +        first_bin_index = largebin_index (nb);
>
> -  if (in_smallbin_range (nb * 2))
> -    last_bin_index = smallbin_index (nb * 2);
> -  else
> -    last_bin_index = largebin_index (nb * 2);
> +      if (in_smallbin_range (nb * 2))
> +        last_bin_index = smallbin_index (nb * 2);
> +      else
> +        last_bin_index = largebin_index (nb * 2);
>
> -  first_largebin_index = largebin_index (MIN_LARGE_SIZE);
> +      first_largebin_index = largebin_index (MIN_LARGE_SIZE);
>
> -  int victim_index;                 /* its bin index */
> +      int victim_index;                 /* its bin index */
>
> -  for (victim_index = first_bin_index;
> -       victim_index < last_bin_index;
> -       victim_index ++)
> -    {
> -      victim = NULL;
> +      for (victim_index = first_bin_index;
> +           victim_index < last_bin_index;
> +           victim_index ++)
> +        {
> +          victim = NULL;
>
> -      if (victim_index < first_largebin_index)
> -        {
> -          /* Check small bins.  Small bin chunks are doubly-linked despite
> -             being the same size.  */
> +          if (victim_index < first_largebin_index)
> +            {
> +              /* Check small bins.  Small bin chunks are doubly-linked despite
> +                 being the same size.  */
>
> -          mchunkptr fwd;            /* misc temp for linking */
> -          mchunkptr bck;            /* misc temp for linking */
> +              mchunkptr fwd;            /* misc temp for linking */
> +              mchunkptr bck;            /* misc temp for linking */
>
> -          bck = bin_at (av, victim_index);
> -          fwd = bck->fd;
> -          while (fwd != bck)
> -            {
> -              if (chunk_ok_for_memalign (fwd, alignment, bytes) > 0)
> -                {
> -                  victim = fwd;
> +              bck = bin_at (av, victim_index);
> +              fwd = bck->fd;
> +              while (fwd != bck)
> +                {
> +                  if (chunk_ok_for_memalign (fwd, alignment, nb) > 0)

OK. Use nb.

> +                    {
> +                      victim = fwd;
>
> -                  /* Unlink it */
> -                  victim->fd->bk = victim->bk;
> -                  victim->bk->fd = victim->fd;
> -                  break;
> +                      /* Unlink it */
> +                      victim->fd->bk = victim->bk;
> +                      victim->bk->fd = victim->fd;
> +                      break;
> +                    }
> +
> +                  fwd = fwd->fd;
> +                }
>              }
> +          else
> +            {
> +              /* Check large bins.  */
> +              mchunkptr fwd;            /* misc temp for linking */
> +              mchunkptr bck;            /* misc temp for linking */
> +              mchunkptr best = NULL;
> +              size_t best_size = 0;
>
> -              fwd = fwd->fd;
> -            }
> -        }
> -      else
> -        {
> -          /* Check large bins.  */
> -          mchunkptr fwd;            /* misc temp for linking */
> -          mchunkptr bck;            /* misc temp for linking */
> -          mchunkptr best = NULL;
> -          size_t best_size = 0;
> +              bck = bin_at (av, victim_index);
> +              fwd = bck->fd;
>
> -          bck = bin_at (av, victim_index);
> -          fwd = bck->fd;
> +              while (fwd != bck)
> +                {
> +                  int extra;
>
> -          while (fwd != bck)
> -            {
> -              int extra;
> +                  if (chunksize (fwd) < nb)
> +                    break;
> +                  extra = chunk_ok_for_memalign (fwd, alignment, nb);

OK. Use nb.

> +                  if (extra > 0
> +                      && (extra <= best_size || best == NULL))
> +                    {
> +                      best = fwd;
> +                      best_size = extra;
> +                    }
>
> -              if (chunksize (fwd) < nb)
> -                break;
> -              extra = chunk_ok_for_memalign (fwd, alignment, bytes);
> -              if (extra > 0
> -                  && (extra <= best_size || best == NULL))
> -                {
> -                  best = fwd;
> -                  best_size = extra;
> -                }
> +                  fwd = fwd->fd;
> +                }
> +              victim = best;
>
> -              fwd = fwd->fd;
> -            }
> -          victim = best;
> +              if (victim != NULL)
> +                {
> +                  unlink_chunk (av, victim);
> +                  break;
> +                }
> +            }
>
> -          if (victim != NULL)
> -            {
> -              unlink_chunk (av, victim);
> -              break;
> +          if (victim != NULL)
> +            break;
>          }
>      }
>
> -  if (victim != NULL)
> -    break;
> -  }
> -
>    /* Strategy: find a spot within that chunk that meets the alignment
>       request, and then possibly free the leading and trailing space.
>       This strategy is incredibly costly and can lead to external
> @@ -5147,6 +5150,8 @@ _int_memalign (mstate av, size_t alignment, size_t bytes)
>        p = victim;
>        m = chunk2mem (p);
>        set_inuse (p);
> +      if (av != &main_arena)
> +        set_non_main_arena (p);

OK. Sets the A (NON_MAIN_ARENA) bit.

>      }
>    else
>      {
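As an aside for readers following along, the low bits of the size field
work like this.  A self-contained model, not glibc code; the flag
constants match malloc.c, everything else is illustration:

#include <stdio.h>
#include <stddef.h>

#define PREV_INUSE	0x1	/* P bit */
#define IS_MMAPPED	0x2	/* M bit */
#define NON_MAIN_ARENA	0x4	/* A bit */
#define SIZE_BITS	(PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

int
main (void)
{
  /* A binned chunk never carries the A bit, so reclaiming it in a
     non-main arena must or the bit back in, as the hunk above does.  */
  size_t size_field = 96 | PREV_INUSE;
  size_field |= NON_MAIN_ARENA;
  printf ("size=%zu flags=%#zx\n",
	  size_field & ~(size_t) SIZE_BITS, size_field & SIZE_BITS);
  return 0;
}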
> diff --git a/malloc/tst-memalign-2.c b/malloc/tst-memalign-2.c
> index 4996578e9f..f229283dbf 100644
> --- a/malloc/tst-memalign-2.c
> +++ b/malloc/tst-memalign-2.c
> @@ -33,9 +33,10 @@ typedef struct TestCase {
>  } TestCase;
>
>  static TestCase tcache_allocs[] = {
> -  { 24, 8, NULL, NULL },
> -  { 24, 16, NULL, NULL },
> -  { 128, 32, NULL, NULL }
> +  { 24, 32, NULL, NULL },
> +  { 24, 64, NULL, NULL },
> +  { 128, 128, NULL, NULL },
> +  { 500, 128, NULL, NULL }

OK. Use higher alignments.

>  };
>  #define TN array_length (tcache_allocs)
>
> @@ -70,11 +71,15 @@ do_test (void)
>
>    for (i = 0; i < TN; ++ i)
>      {
> +      size_t sz2;
> +
>        tcache_allocs[i].ptr1 = memalign (tcache_allocs[i].alignment, tcache_allocs[i].size);
>        CHECK (tcache_allocs[i].ptr1, tcache_allocs[i].alignment);
> +      sz2 = malloc_usable_size (tcache_allocs[i].ptr1);
>        free (tcache_allocs[i].ptr1);
> +
>        /* This should return the same chunk as was just free'd.  */
> -      tcache_allocs[i].ptr2 = memalign (tcache_allocs[i].alignment, tcache_allocs[i].size);
> +      tcache_allocs[i].ptr2 = memalign (tcache_allocs[i].alignment, sz2);
>        CHECK (tcache_allocs[i].ptr2, tcache_allocs[i].alignment);
>        free (tcache_allocs[i].ptr2);
>
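Worth spelling out why the test now re-requests with malloc_usable_size
rather than the original size: with the larger alignments, memalign may
over-allocate, and only a request for the full usable size reliably maps
back to the bin holding the freed chunk.  In isolation, and with the
caveat that reuse is expected rather than guaranteed:

#include <malloc.h>
#include <stdlib.h>
#include <stdio.h>

int
main (void)
{
  void *p1 = memalign (64, 24);
  size_t sz = malloc_usable_size (p1);	/* Often larger than 24.  */
  free (p1);				/* Chunk lands in a tcache bin.  */

  /* Asking for the usable size targets that same bin, so p2 is
     expected (not guaranteed) to equal p1.  */
  void *p2 = memalign (64, sz);
  printf ("%s\n", p1 == p2 ? "reused" : "not reused");
  free (p2);
  return 0;
}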
> diff --git a/malloc/tst-memalign-3.c b/malloc/tst-memalign-3.c
> new file mode 100644
> index 0000000000..ab90d6ca9b
> --- /dev/null
> +++ b/malloc/tst-memalign-3.c
> @@ -0,0 +1,173 @@
> +/* Test for memalign chunk reuse.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <errno.h>
> +#include <malloc.h>
> +#include <pthread.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include <array_length.h>
> +#include <libc-pointer-arith.h>
> +#include <support/check.h>
> +#include <support/xthread.h>
> +
> +
> +typedef struct TestCase {
> +  size_t size;
> +  size_t alignment;
> +  void *ptr1;
> +  void *ptr2;
> +} TestCase;
> +
> +static TestCase tcache_allocs[] = {
> +  { 24, 32, NULL, NULL },
> +  { 24, 64, NULL, NULL },
> +  { 128, 128, NULL, NULL },
> +  { 500, 128, NULL, NULL }
> +};
> +#define TN array_length (tcache_allocs)
> +
> +static TestCase large_allocs[] = {
> +  { 23450, 64, NULL, NULL },
> +  { 23450, 64, NULL, NULL },
> +  { 23550, 64, NULL, NULL },
> +  { 23550, 64, NULL, NULL },
> +  { 23650, 64, NULL, NULL },
> +  { 23650, 64, NULL, NULL },
> +  { 33650, 64, NULL, NULL },
> +  { 33650, 64, NULL, NULL }
> +};
> +#define LN array_length (large_allocs)
> +
> +void *p;
> +
> +/* Sanity checks, ancillary to the actual test.  */
> +#define CHECK(p,a) \
> +  if (p == NULL || !PTR_IS_ALIGNED (p, a)) \
> +    FAIL_EXIT1 ("NULL or misaligned memory detected.\n");
> +
> +static void *
> +mem_test (void *closure)
> +{
> +  int i;
> +  int j;
> +  int count;
> +  void *ptr[10];
> +  void *p;
> +
> +  /* TCache test.  */
> +  for (i = 0; i < TN; ++ i)
> +    {
> +      size_t sz2;
> +
> +      tcache_allocs[i].ptr1 = memalign (tcache_allocs[i].alignment, tcache_allocs[i].size);
> +      CHECK (tcache_allocs[i].ptr1, tcache_allocs[i].alignment);
> +      sz2 = malloc_usable_size (tcache_allocs[i].ptr1);
> +      free (tcache_allocs[i].ptr1);
> +
> +      /* This should return the same chunk as was just free'd.  */
> +      tcache_allocs[i].ptr2 = memalign (tcache_allocs[i].alignment, sz2);
> +      CHECK (tcache_allocs[i].ptr2, tcache_allocs[i].alignment);
> +      free (tcache_allocs[i].ptr2);
> +
> +      TEST_VERIFY (tcache_allocs[i].ptr1 == tcache_allocs[i].ptr2);
> +    }
> +
> +  /* Test for non-head tcache hits.  */
> +  for (i = 0; i < array_length (ptr); ++ i)
> +    {
> +      if (i == 4)
> +	{
> +	  ptr[i] = memalign (64, 256);
> +	  CHECK (ptr[i], 64);
> +	}
> +      else
> +	{
> +	  ptr[i] = malloc (256);
> +	  CHECK (ptr[i], 4);
> +	}
> +    }
> +  for (i = 0; i < array_length (ptr); ++ i)
> +    free (ptr[i]);
> +
> +  p = memalign (64, 256);
> +  CHECK (p, 64);
> +
> +  count = 0;
> +  for (i = 0; i < 10; ++ i)
> +    if (ptr[i] == p)
> +      ++ count;
> +  free (p);
> +  TEST_VERIFY (count > 0);
> +
> +  /* Large bins test.  */
> +
> +  for (i = 0; i < LN; ++ i)
> +    {
> +      large_allocs[i].ptr1 = memalign (large_allocs[i].alignment, large_allocs[i].size);
> +      CHECK (large_allocs[i].ptr1, large_allocs[i].alignment);
> +      /* Keep chunks from combining by fragmenting the heap.  */
> +      p = malloc (512);
> +      CHECK (p, 4);
> +    }
> +
> +  for (i = 0; i < LN; ++ i)
> +    free (large_allocs[i].ptr1);
> +
> +  /* Force the unsorted bins to be scanned and moved to small/large
> +     bins.  */
> +  p = malloc (60000);
> +
> +  for (i = 0; i < LN; ++ i)
> +    {
> +      large_allocs[i].ptr2 = memalign (large_allocs[i].alignment, large_allocs[i].size);
> +      CHECK (large_allocs[i].ptr2, large_allocs[i].alignment);
> +    }
> +
> +  count = 0;
> +  for (i = 0; i < LN; ++ i)
> +    {
> +      int ok = 0;
> +      for (j = 0; j < LN; ++ j)
> +	if (large_allocs[i].ptr1 == large_allocs[j].ptr2)
> +	  ok = 1;
> +      if (ok == 1)
> +	count ++;
> +    }
> +
> +  /* The allocation algorithm is complicated outside of the memalign
> +     logic, so just make sure it's working for most of the
> +     allocations.  This avoids possible boundary conditions with
> +     empty/full heaps.  */
> +  TEST_VERIFY (count > LN / 2);
> +
> +  return 0;
> +}
> +
> +static int
> +do_test (void)
> +{
> +  pthread_t p;
> +
> +  p = xpthread_create (NULL, mem_test, NULL);
> +  xpthread_join (p);
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
> -- 

Cheers,
Carlos.