From: "naohirot@fujitsu.com"
To: Noah Goldstein
CC: Wilco Dijkstra, "Lucas A. M. Magalhaes", GNU C Library
Subject: RE: [PATCH v2 2/5] benchtests: Add memset zero fill benchtest
Date: Wed, 21 Jul 2021 12:56:12 +0000
References: <20210713082214.307529-1-naohirot@fujitsu.com> <20210720063500.362313-1-naohirot@fujitsu.com>

Hi Noah,

Thank you for the review.

> > +#define TEST_MAIN
> > +#define TEST_NAME "memset"
> > +#define START_SIZE (16 * 1024)
> > +#define MIN_PAGE_SIZE (getpagesize () + 64 * 1024 * 1024)
> > +#define TIMEOUT (20 * 60)
> > +#include "bench-string.h"
> > +
> > +#include "json-lib.h"
> > +
> > +void *generic_memset (void *, int, size_t);
> > +typedef void *(*proto_t) (void *, int, size_t);
> > +
> > +IMPL (MEMSET, 1)
> > +IMPL (generic_memset, 0)
> > +
> > +static void
> Do we want __attribute__((noinline, noclone))?

Yes, I'll add it.

> > +do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
> > +             int c1 __attribute ((unused)), int c2 __attribute ((unused)),
> > +             size_t n)
> > +{
> > +  size_t i, iters = 16;
>
> I think 16 is probably too few iterations for reliable benchmarking.
> Maybe `INNER_LOOP_ITERS` which is 8192

I tried it. When it is changed to 8192, it hits the TIMEOUT (20 * 60) on a64fx.
Please check the code below.

>
> > +  timing_t start, stop, cur;
> > +
> > +  TIMING_NOW (start);
> > +  for (i = 0; i < iters; i += 2)
> > +    {
> > +      CALL (impl, s, c1, n);
> I am a bit worried that the overhead from the first call with `c1` will distort the results.
> Is it possible to implement it with a nested loop where you fill `s` with `c1` for
> `n * inner_loop_iterations` in the outer loop and in the inner loop fill `c2` on `s + n * i`?
> In that case maybe 16 for inner loop iterations and 512 for outer loop iterations.

Assuming the implementation below is correct, it seems we have to use
smaller numbers. Extrapolating from the "iters = 32" case, which took
23.3 seconds, the full run would take 99.4 minutes.
(8192/32*23.3/60=99.4)


#define START_SIZE (16 * 1024)
...
static void
__attribute__((noinline, noclone))
do_one_test (json_ctx_t *json_ctx, impl_t *impl, CHAR *s,
             int c1 __attribute ((unused)), int c2 __attribute ((unused)),
             size_t n)
{
  size_t i, j, iters = INNER_LOOP_ITERS; // 32;
  timing_t start, stop, cur, latency = 0;

  for (i = 0; i < 512; i++) // for (i = 0; i < 2; i++)
    {
      CALL (impl, s, c1, n * 16);
      TIMING_NOW (start);
      for (j = 0; j < 16; j++)
        CALL (impl, s + n * j, c2, n);
      TIMING_NOW (stop);
      TIMING_DIFF (cur, start, stop);
      TIMING_ACCUM (latency, cur);
    }

  json_element_double (json_ctx, (double) latency / (double) iters);
}

> > +      CALL (impl, s, c2, n);
> > +    }
> > +  TIMING_NOW (stop);
> > +
> > +  TIMING_DIFF (cur, start, stop);
> > +
> > +  json_element_double (json_ctx, (double) cur / (double) iters);
> > +}
> > +
> > +static void
> > +do_test (json_ctx_t *json_ctx, size_t align, int c1, int c2, size_t len)
> > +{
> > +  align &= 63;
> Can you make this `align &= getpagesize () - 1;`?

I'll change it.

Thanks.
Naohiro