From: "H.J. Lu"
Date: Sat, 6 Nov 2021 12:12:00 -0700
Subject: Re: [PATCH v4 2/5] benchtests: Add additional cases to bench-memcpy.c and bench-memmove.c
To: Noah Goldstein
Cc: GNU C Library, "Carlos O'Donell"
References: <20211101054952.2349590-1-goldstein.w.n@gmail.com> <20211106183322.3129442-1-goldstein.w.n@gmail.com> <20211106183322.3129442-2-goldstein.w.n@gmail.com>
In-Reply-To: <20211106183322.3129442-2-goldstein.w.n@gmail.com>
List-Id: Libc-alpha mailing list

On Sat, Nov 6, 2021 at 11:33 AM Noah Goldstein wrote:
>
> This commit adds more benchmarks for the common memcpy/memmove
> benchmarks. The most significant cases are the half page offsets. The
> current versions leave dst and src near page aligned which leads to
> false 4k aliasing on x86_64. This can add noise due to false
> dependencies from one run to the next. As well, this seems like more
> of an edge case than the common case so it shouldn't be the only thing
> Reviewed-by: H.J. Lu
> ---
>  benchtests/bench-memcpy.c  | 49 +++++++++++++++++++++++++++++++++-----
>  benchtests/bench-memmove.c | 26 +++++++++++++++++---
>  2 files changed, 66 insertions(+), 9 deletions(-)
>
> diff --git a/benchtests/bench-memcpy.c b/benchtests/bench-memcpy.c
> index d9236a2282..744bea26d3 100644
> --- a/benchtests/bench-memcpy.c
> +++ b/benchtests/bench-memcpy.c
> @@ -40,7 +40,10 @@ do_one_test (json_ctx_t *json_ctx, impl_t *impl, char *dst, const char *src,
>  {
>    size_t i, iters = INNER_LOOP_ITERS;
>    timing_t start, stop, cur;
> -
> +  for (i = 0; i < iters / 64; ++i)
> +    {
> +      CALL (impl, dst, src, len);
> +    }
>    TIMING_NOW (start);
>    for (i = 0; i < iters; ++i)
>      {
> @@ -60,11 +63,11 @@ do_test (json_ctx_t *json_ctx, size_t align1, size_t align2, size_t len,
>    size_t i, j;
>    char *s1, *s2;
>    size_t repeats;
> -  align1 &= 63;
> +  align1 &= (getpagesize () - 1);
>    if (align1 + len >= page_size)
>      return;
>
> -  align2 &= 63;
> +  align2 &= (getpagesize () - 1);
>    if (align2 + len >= page_size)
>      return;
>
> @@ -99,7 +102,7 @@ test_main (void)
>  {
>    json_ctx_t json_ctx;
>    size_t i;
> -
> +  size_t half_page = getpagesize () / 2;
>    test_init ();
>
>    json_init (&json_ctx, 0, stdout);
> @@ -121,8 +124,15 @@ test_main (void)
>      {
>        do_test (&json_ctx, 0, 0, 1 << i, 1);
>        do_test (&json_ctx, i, 0, 1 << i, 1);
> +      do_test (&json_ctx, i + 32, 0, 1 << i, 1);
>        do_test (&json_ctx, 0, i, 1 << i, 1);
> +      do_test (&json_ctx, 0, i + 32, 1 << i, 1);
>        do_test (&json_ctx, i, i, 1 << i, 1);
> +      do_test (&json_ctx, i + 32, i + 32, 1 << i, 1);
> +      do_test (&json_ctx, half_page, 0, 1 << i, 1);
> +      do_test (&json_ctx, half_page + i, 0, 1 << i, 1);
> +      do_test (&json_ctx, half_page, i, 1 << i, 1);
> +      do_test (&json_ctx, half_page + i, i, 1 << i, 1);
>      }
>
>    for (i = 0; i < 32; ++i)
> @@ -131,16 +141,26 @@ test_main (void)
>        do_test (&json_ctx, i, 0, i, 0);
>        do_test (&json_ctx, 0, i, i, 0);
>        do_test (&json_ctx, i, i, i, 0);
> +      do_test (&json_ctx, half_page, 0, i, 0);
> +      do_test (&json_ctx, half_page + i, 0, i, 0);
> +      do_test (&json_ctx, half_page, i, i, 0);
> +      do_test (&json_ctx, half_page + i, i, i, 0);
> +      do_test (&json_ctx, getpagesize () - 1, 0, i, 0);
> +      do_test (&json_ctx, 0, getpagesize () - 1, i, 0);
>      }
>
>    for (i = 3; i < 32; ++i)
>      {
>        if ((i & (i - 1)) == 0)
> -        continue;
> +        continue;
>        do_test (&json_ctx, 0, 0, 16 * i, 1);
>        do_test (&json_ctx, i, 0, 16 * i, 1);
>        do_test (&json_ctx, 0, i, 16 * i, 1);
>        do_test (&json_ctx, i, i, 16 * i, 1);
> +      do_test (&json_ctx, half_page, 0, 16 * i, 1);
> +      do_test (&json_ctx, half_page + i, 0, 16 * i, 1);
> +      do_test (&json_ctx, half_page, i, 16 * i, 1);
> +      do_test (&json_ctx, half_page + i, i, 16 * i, 1);
>      }
>
>    for (i = 32; i < 64; ++i)
> @@ -149,16 +169,33 @@ test_main (void)
>        do_test (&json_ctx, i, 0, 32 * i, 1);
>        do_test (&json_ctx, 0, i, 32 * i, 1);
>        do_test (&json_ctx, i, i, 32 * i, 1);
> +      do_test (&json_ctx, half_page, 0, 32 * i, 1);
> +      do_test (&json_ctx, half_page + i, 0, 32 * i, 1);
> +      do_test (&json_ctx, half_page, i, 32 * i, 1);
> +      do_test (&json_ctx, half_page + i, i, 32 * i, 1);
>      }
>
>    do_test (&json_ctx, 0, 0, getpagesize (), 1);
>
> -  for (i = 0; i <= 32; ++i)
> +  for (i = 0; i <= 48; ++i)
>      {
>        do_test (&json_ctx, 0, 0, 2048 + 64 * i, 1);
>        do_test (&json_ctx, i, 0, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, i + 32, 0, 2048 + 64 * i, 1);
>        do_test (&json_ctx, 0, i, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, 0, i + 32, 2048 + 64 * i, 1);
>        do_test (&json_ctx, i, i, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, i + 32, i + 32, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, half_page, 0, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, half_page + i, 0, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, half_page, i, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, half_page + i, i, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, i, 1, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, 1, i, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, i + 32, 1, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, 1, i + 32, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, half_page + i, 1, 2048 + 64 * i, 1);
> +      do_test (&json_ctx, half_page + 1, i, 2048 + 64 * i, 1);
>      }
>
>    json_array_end (&json_ctx);
> diff --git a/benchtests/bench-memmove.c b/benchtests/bench-memmove.c
> index 6becbf4782..855f4d0649 100644
> --- a/benchtests/bench-memmove.c
> +++ b/benchtests/bench-memmove.c
> @@ -34,7 +34,10 @@ do_one_test (json_ctx_t *json_ctx, impl_t *impl, char *dst, char *src,
>  {
>    size_t i, iters = INNER_LOOP_ITERS;
>    timing_t start, stop, cur;
> -
> +  for (i = 0; i < iters / 64; ++i)
> +    {
> +      CALL (impl, dst, src, len);
> +    }
>    TIMING_NOW (start);
>    for (i = 0; i < iters; ++i)
>      {
> @@ -53,11 +56,11 @@ do_test (json_ctx_t *json_ctx, size_t align1, size_t align2, size_t len)
>    size_t i, j;
>    char *s1, *s2;
>
> -  align1 &= 63;
> +  align1 &= (getpagesize () - 1);
>    if (align1 + len >= page_size)
>      return;
>
> -  align2 &= 63;
> +  align2 &= (getpagesize () - 1);
>    if (align2 + len >= page_size)
>      return;
>
> @@ -85,6 +88,7 @@ test_main (void)
>  {
>    json_ctx_t json_ctx;
>    size_t i;
> +  size_t half_page = getpagesize () / 2;
>
>    test_init ();
>
> @@ -138,6 +142,22 @@ test_main (void)
>        do_test (&json_ctx, i, i, 32 * i);
>      }
>
> +  for (i = 0; i <= 48; ++i)
> +    {
> +      do_test (&json_ctx, 0, 0, 2048 + 64 * i);
> +      do_test (&json_ctx, i, 0, 2048 + 64 * i);
> +      do_test (&json_ctx, 0, i, 2048 + 64 * i);
> +      do_test (&json_ctx, i, i, 2048 + 64 * i);
> +      do_test (&json_ctx, half_page, 0, 2048 + 64 * i);
> +      do_test (&json_ctx, 0, half_page, 2048 + 64 * i);
> +      do_test (&json_ctx, half_page + i, 0, 2048 + 64 * i);
> +      do_test (&json_ctx, i, half_page, 2048 + 64 * i);
> +      do_test (&json_ctx, half_page, i, 2048 + 64 * i);
> +      do_test (&json_ctx, 0, half_page + i, 2048 + 64 * i);
> +      do_test (&json_ctx, half_page + i, i, 2048 + 64 * i);
> +      do_test (&json_ctx, i, half_page + i, 2048 + 64 * i);
> +    }
> +
>    json_array_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
>    json_attr_object_end (&json_ctx);
> --
> 2.25.1
>

LGTM.

Reviewed-by: H.J. Lu

Thanks.

--
H.J.
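
An illustrative aside, not part of the patch: a minimal sketch of why the mask change in do_test matters for the half page cases. The helper names below are hypothetical and the 4096-byte page size is an assumption (the benchmark itself calls getpagesize ()). The low-12-bit comparison is only a rough model of when two addresses may falsely alias on x86_64, not a statement about any specific microarchitecture.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the mask do_test applied before the patch.
   Any offset is reduced to 0..63, so a half page offset such as
   2048 + 3 collapses to 3.  */
size_t
old_offset_mask (size_t align)
{
  return align & 63;
}

/* Hypothetical helper: the mask after the patch.  The page size is
   passed in explicitly here; a power-of-two page size is assumed.  */
size_t
new_offset_mask (size_t align, size_t page_size)
{
  return align & (page_size - 1);
}

/* Rough model (assumption): two addresses are candidates for false
   4k aliasing when their low 12 bits are equal, i.e. they differ by
   a multiple of 4096.  Returns 1 on a match, 0 otherwise.  */
int
low12_match (uintptr_t a, uintptr_t b)
{
  return ((a ^ b) & 0xfff) == 0;
}
```

With these helpers, old_offset_mask (2048 + 3) gives 3 while new_offset_mask (2048 + 3, 4096) keeps 2051, which is why the half_page arguments only take effect once the mask is widened. Likewise, low12_match reports a match for two page-aligned buffers and no match once one of them is shifted by half a page.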