From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Sender: libc-alpha-owner@sourceware.org
MIME-Version: 1.0
References: <9c563a4b-424b-242f-b82f-4650ab2637f7@redhat.com> <28e34264-e8c5-5570-c48c-9125893808b2@redhat.com>
From: "H.J. Lu"
Date: Mon, 22 May 2017 20:22:00 -0000
Subject: Re: memcpy performance regressions 2.19 -> 2.24(5)
To: Erich Elsen
Cc: "Carlos O'Donell", GNU C Library
Content-Type: multipart/mixed; boundary="001a1146f198abe0b0055022a2f1"
X-SW-Source: 2017-05/txt/msg00672.txt.bz2

--001a1146f198abe0b0055022a2f1
Content-Type: text/plain; charset="UTF-8"

On Mon, May 22, 2017 at 12:17 PM, H.J. Lu wrote:
> On Thu, May 18, 2017 at 1:59 PM, Erich Elsen wrote:
>> Hi H.J.,
>>
>> I was on vacation, sorry for the slow reply.  The updated benchmark
>> still shows the same behavior, thanks.
>>
>> I'll try my hand at creating a patch that makes that variable
>> __x86_shared_non_temporal_threshold a tunable.  It will be necessary
>> to do internal experiments anyway.
>>
>
> __x86_shared_non_temporal_threshold was set to 6 times the per-core
> shared cache size, based on the large memcpy micro benchmark in glibc
> on an 8-core processor.  For a processor with more than 8 cores, the
> threshold is too low.  Set __x86_shared_non_temporal_threshold to
> 3/4 of the total shared cache size so that it is unchanged on 8-core
> processors.  On processors with fewer than 8 cores, the threshold is
> lower.
>
> Any comments?
>

Here is a patch to add support for
"glibc.x86_cache.non_temporal_threshold=number" to GLIBC_TUNABLES.

--
H.J.
--001a1146f198abe0b0055022a2f1 Content-Type: text/x-patch; charset="US-ASCII"; name="0001-Add-x86_cache.non_temporal_threshold-to-GLIBC_TUNABL.patch" Content-Disposition: attachment; filename="0001-Add-x86_cache.non_temporal_threshold-to-GLIBC_TUNABL.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_j30kvfma1 Content-length: 3640 RnJvbSAzZTMxYmM0YTkzMGU3YjMyOTI0YmVmZTc2MjAxNGY4NWQ1NDA4Njky IE1vbiBTZXAgMTcgMDA6MDA6MDAgMjAwMQpGcm9tOiAiSC5KLiBMdSIgPGhq bC50b29sc0BnbWFpbC5jb20+CkRhdGU6IE1vbiwgMjIgTWF5IDIwMTcgMTI6 MDA6NDMgLTA3MDAKU3ViamVjdDogW1BBVENIXSBBZGQgeDg2X2NhY2hlLm5v bl90ZW1wb3JhbF90aHJlc2hvbGQgdG8gR0xJQkNfVFVOQUJMRVMKCkFkZCBz dXBwb3J0IGZvciAiZ2xpYmMueDg2X2NhY2hlLm5vbl90ZW1wb3JhbF90aHJl c2hvbGQ9bnVtYmVyIiB0bwpHTElCQ19UVU5BQkxFUy4KCgkqIGVsZi9kbC10 dW5hYmxlcy5saXN0ICh4ODZfY2FjaGUpOiBOZXcgbmFtZSBzcGFjZS4KCSog c3lzZGVwcy94ODYvY2FjaGVpbmZvLmMgW0hBVkVfVFVOQUJMRVNdIChUVU5B QkxFX05BTUVTUEFDRSk6CglOZXcuCglbSEFWRV9UVU5BQkxFU106IEluY2x1 ZGUgPGVsZi9kbC10dW5hYmxlcy5oPi4KCVtIQVZFX1RVTkFCTEVTXSAoRExf VFVOQUJMRV9DQUxMQkFDSyAoc2V0X25vbl90ZW1wb3JhbF90aHJlc2hvbGQp KToKCU5ldy4KCVtIQVZFX1RVTkFCTEVTXSAoaW5pdF9jYWNoZWluZm8pOiBD YWxsIFRVTkFCTEVfU0VUX1ZBTF9XSVRIX0NBTExCQUNLCgl3aXRoIHNldF9u b25fdGVtcG9yYWxfdGhyZXNob2xkLgotLS0KIGVsZi9kbC10dW5hYmxlcy5s aXN0ICAgIHwgIDYgKysrKysrCiBzeXNkZXBzL3g4Ni9jYWNoZWluZm8uYyB8 IDIyICsrKysrKysrKysrKysrKysrKystLS0KIDIgZmlsZXMgY2hhbmdlZCwg MjUgaW5zZXJ0aW9ucygrKSwgMyBkZWxldGlvbnMoLSkKCmRpZmYgLS1naXQg YS9lbGYvZGwtdHVuYWJsZXMubGlzdCBiL2VsZi9kbC10dW5hYmxlcy5saXN0 CmluZGV4IGI5ZjE0ODguLjJjODk5ZmUgMTAwNjQ0Ci0tLSBhL2VsZi9kbC10 dW5hYmxlcy5saXN0CisrKyBiL2VsZi9kbC10dW5hYmxlcy5saXN0CkBAIC03 Nyw0ICs3NywxMCBAQCBnbGliYyB7CiAgICAgICBzZWN1cml0eV9sZXZlbDog U1hJRF9JR05PUkUKICAgICB9CiAgIH0KKyAgeDg2X2NhY2hlIHsKKyAgICBu b25fdGVtcG9yYWxfdGhyZXNob2xkIHsKKyAgICAgIHR5cGU6IFNJWkVfVAor ICAgICAgc2VjdXJpdHlfbGV2ZWw6IFNYSURfSUdOT1JFCisgICAgfQorICB9 CiB9CmRpZmYgLS1naXQgYS9zeXNkZXBzL3g4Ni9jYWNoZWluZm8uYyBiL3N5 c2RlcHMveDg2L2NhY2hlaW5mby5jCmluZGV4IDM0MzRkOTcuLjFiMTk1ZWIg 
MTAwNjQ0Ci0tLSBhL3N5c2RlcHMveDg2L2NhY2hlaW5mby5jCisrKyBiL3N5 c2RlcHMveDg2L2NhY2hlaW5mby5jCkBAIC0yMyw2ICsyMywyMCBAQAogI2lu Y2x1ZGUgPGNwdWlkLmg+CiAjaW5jbHVkZSA8aW5pdC1hcmNoLmg+CiAKKy8q IFRocmVzaG9sZCB0byB1c2Ugbm9uIHRlbXBvcmFsIHN0b3JlLiAgKi8KK2xv bmcgaW50IF9feDg2X3NoYXJlZF9ub25fdGVtcG9yYWxfdGhyZXNob2xkIGF0 dHJpYnV0ZV9oaWRkZW47CisKKyNpZiBIQVZFX1RVTkFCTEVTCisjIGRlZmlu ZSBUVU5BQkxFX05BTUVTUEFDRSB4ODZfY2FjaGUKKyMgaW5jbHVkZSA8ZWxm L2RsLXR1bmFibGVzLmg+CisKK3ZvaWQKK0RMX1RVTkFCTEVfQ0FMTEJBQ0sg KHNldF9ub25fdGVtcG9yYWxfdGhyZXNob2xkKSAodHVuYWJsZV92YWxfdCAq dmFscCkKK3sKKyAgX194ODZfc2hhcmVkX25vbl90ZW1wb3JhbF90aHJlc2hv bGQgPSAobG9uZyBpbnQpIHZhbHAtPm51bXZhbDsKK30KKyNlbmRpZgorCiAj ZGVmaW5lIGlzX2ludGVsIEdMUk8oZGxfeDg2X2NwdV9mZWF0dXJlcykua2lu ZCA9PSBhcmNoX2tpbmRfaW50ZWwKICNkZWZpbmUgaXNfYW1kIEdMUk8oZGxf eDg2X2NwdV9mZWF0dXJlcykua2luZCA9PSBhcmNoX2tpbmRfYW1kCiAjZGVm aW5lIG1heF9jcHVpZCBHTFJPKGRsX3g4Nl9jcHVfZmVhdHVyZXMpLm1heF9j cHVpZApAQCAtNDY2LDkgKzQ4MCw2IEBAIGxvbmcgaW50IF9feDg2X3Jhd19z aGFyZWRfY2FjaGVfc2l6ZV9oYWxmIGF0dHJpYnV0ZV9oaWRkZW4gPSAxMDI0 ICogMTAyNCAvIDI7CiAvKiBTaW1pbGFyIHRvIF9feDg2X3NoYXJlZF9jYWNo ZV9zaXplLCBidXQgbm90IHJvdW5kZWQuICAqLwogbG9uZyBpbnQgX194ODZf cmF3X3NoYXJlZF9jYWNoZV9zaXplIGF0dHJpYnV0ZV9oaWRkZW4gPSAxMDI0 ICogMTAyNDsKIAotLyogVGhyZXNob2xkIHRvIHVzZSBub24gdGVtcG9yYWwg c3RvcmUuICAqLwotbG9uZyBpbnQgX194ODZfc2hhcmVkX25vbl90ZW1wb3Jh bF90aHJlc2hvbGQgYXR0cmlidXRlX2hpZGRlbjsKLQogI2lmbmRlZiBESVNB QkxFX1BSRUZFVENIVwogLyogUFJFRkVUQ0hXIHN1cHBvcnQgZmxhZyBmb3Ig dXNlIGluIG1lbW9yeSBhbmQgc3RyaW5nIHJvdXRpbmVzLiAgKi8KIGludCBf X3g4Nl9wcmVmZXRjaHcgYXR0cmlidXRlX2hpZGRlbjsKQEAgLTc3MCw0ICs3 ODEsOSBAQCBpbnRlbF9idWdfbm9fY2FjaGVfaW5mbzoKICAgICAgdG90YWwg c2hhcmVkIGNhY2hlIHNpemUuICAqLwogICBfX3g4Nl9zaGFyZWRfbm9uX3Rl bXBvcmFsX3RocmVzaG9sZAogICAgID0gX194ODZfc2hhcmVkX2NhY2hlX3Np emUgKiB0aHJlYWRzICogMyAvIDQ7CisKKyNpZiBIQVZFX1RVTkFCTEVTCisg IFRVTkFCTEVfU0VUX1ZBTF9XSVRIX0NBTExCQUNLIChub25fdGVtcG9yYWxf dGhyZXNob2xkLCBOVUxMLAorCQkJCSBzZXRfbm9uX3RlbXBvcmFsX3RocmVz aG9sZCk7CisjZW5kaWYKIH0KLS0gCjIuOS40Cgo= 
--001a1146f198abe0b0055022a2f1--
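
The threshold arithmetic described in the quoted message can be checked with a small standalone sketch. It is not glibc code: the function names old_threshold and new_threshold are hypothetical, and the 20 MiB cache used in the note below is an illustrative assumption. The "new" expression follows the context lines of the attached patch (__x86_shared_cache_size * threads * 3 / 4), where the cache-size variable holds the per-thread share of the shared cache.

```c
#include <assert.h>

/* Sketch only: these helpers are hypothetical and are not part of glibc.
   per_thread_share is one thread's share of the shared cache, in bytes.  */

/* Old rule from the message: 6 times the per-core shared cache size.  */
long
old_threshold (long per_thread_share)
{
  return per_thread_share * 6;
}

/* New rule from the message: 3/4 of the total shared cache size, written
   the same way as in the patch context (share * threads * 3 / 4).  */
long
new_threshold (long per_thread_share, long threads)
{
  assert (threads > 0);
  return per_thread_share * threads * 3 / 4;
}
```

Assuming a 20 MiB shared cache, 8 threads give a share of 2621440 bytes and both rules yield 15728640 bytes (15 MiB), so the threshold is unchanged.  With 16 threads the share is 1310720 bytes: the old rule gives 7864320 bytes (7.5 MiB), the new rule still gives 15 MiB.  With 4 threads the old rule gives 30 MiB, so the new value is lower, as the message says.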