From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
MIME-Version: 1.0
From: Andrew Pinski
Date: Mon, 08 Aug 2016 17:48:00 -0000
Subject: Re: 
[PATCH/AARCH64] Improve ThunderX code generation slightly with load/store pair
To: GCC Patches
Content-Type: multipart/mixed; boundary=001a1141886e1f15e90539930611
X-IsSubscribed: yes
X-SW-Source: 2016-08/txt/msg00619.txt.bz2

--001a1141886e1f15e90539930611
Content-Type: text/plain; charset=UTF-8
Content-length: 1855

On Fri, Aug 5, 2016 at 12:18 AM, Andrew Pinski wrote:
> Hi,
> On ThunderX, load (and store) pair that does a pair of two word
> (32bits) load/stores is slower in some cases than doing two
> load/stores. For some internal benchmarks, it provides a 2-5%
> improvement.
>
> This patch disables the forming of the load/store pairs for SImode if
> we are tuning for ThunderX. I used the tuning flags route so it can
> be overridden if needed later on or if someone else wants to use the
> same method for their core.
>
> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Here is a new version based on feedback both on and off the list.
I added an alignment check: the load/store pair is now rejected only
when the access is aligned to less than 8 bytes, since alignment < 8
bytes is what causes the slowdown. I also added two new testcases to
make sure the load pair optimization is still done when it is
profitable.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
* config/aarch64/aarch64-tuning-flags.def (slow_ldpw): New tuning option.
* config/aarch64/aarch64.c (thunderx_tunings): Enable
AARCH64_EXTRA_TUNE_SLOW_LDPW.
(aarch64_operands_ok_for_ldpstp): Return false if
AARCH64_EXTRA_TUNE_SLOW_LDPW is set, the mode is SImode and the
alignment is less than 8 bytes.
(aarch64_operands_adjust_ok_for_ldpstp): Likewise.

testsuite/ChangeLog:
* gcc.target/aarch64/thunderxloadpair.c: New testcase.
* gcc.target/aarch64/thunderxnoloadpair.c: New testcase.

>
> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * config/aarch64/aarch64-tuning-flags.def (slow_ldpw): New tuning option.
> * config/aarch64/aarch64.c (thunderx_tunings): Enable
> AARCH64_EXTRA_TUNE_SLOW_LDPW.
> (aarch64_operands_ok_for_ldpstp): Return false if > AARCH64_EXTRA_TUNE_SLOW_LDPW and the mode was SImode. > (aarch64_operands_adjust_ok_for_ldpstp): Likewise. --001a1141886e1f15e90539930611 Content-Type: text/plain; charset=US-ASCII; name="stldpw.diff.txt" Content-Disposition: attachment; filename="stldpw.diff.txt" Content-Transfer-Encoding: base64 X-Attachment-Id: f_irmbx02q1 Content-length: 4303 SW5kZXg6IGNvbmZpZy9hYXJjaDY0L2FhcmNoNjQtdHVuaW5nLWZsYWdzLmRl Zgo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBjb25maWcvYWFyY2g2NC9h YXJjaDY0LXR1bmluZy1mbGFncy5kZWYJKHJldmlzaW9uIDIzOTIyOCkKKysr IGNvbmZpZy9hYXJjaDY0L2FhcmNoNjQtdHVuaW5nLWZsYWdzLmRlZgkod29y a2luZyBjb3B5KQpAQCAtMjksMyArMjksNCBAQAogICAgICBBQVJDSDY0X1RV TkVfIHRvIGdpdmUgYW4gZW51bSBuYW1lLiAqLwogCiBBQVJDSDY0X0VYVFJB X1RVTklOR19PUFRJT04gKCJyZW5hbWVfZm1hX3JlZ3MiLCBSRU5BTUVfRk1B X1JFR1MpCitBQVJDSDY0X0VYVFJBX1RVTklOR19PUFRJT04gKCJzbG93X2xk cHciLCBTTE9XX0xEUFcpCkluZGV4OiBjb25maWcvYWFyY2g2NC9hYXJjaDY0 LmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gY29uZmlnL2FhcmNoNjQv YWFyY2g2NC5jCShyZXZpc2lvbiAyMzkyMjgpCisrKyBjb25maWcvYWFyY2g2 NC9hYXJjaDY0LmMJKHdvcmtpbmcgY29weSkKQEAgLTcxMiw3ICs3MTIsNyBA QAogICAwLAkvKiBtYXhfY2FzZV92YWx1ZXMuICAqLwogICAwLAkvKiBjYWNo ZV9saW5lX3NpemUuICAqLwogICB0dW5lX3BhcmFtczo6QVVUT1BSRUZFVENI RVJfT0ZGLAkvKiBhdXRvcHJlZmV0Y2hlcl9tb2RlbC4gICovCi0gIChBQVJD SDY0X0VYVFJBX1RVTkVfTk9ORSkJLyogdHVuZV9mbGFncy4gICovCisgIChB QVJDSDY0X0VYVFJBX1RVTkVfU0xPV19MRFBXKQkvKiB0dW5lX2ZsYWdzLiAg Ki8KIH07CiAKIHN0YXRpYyBjb25zdCBzdHJ1Y3QgdHVuZV9wYXJhbXMgeGdl bmUxX3R1bmluZ3MgPQpAQCAtMTM1OTMsNiArMTM1OTMsMTUgQEAKICAgaWYg KE1FTV9WT0xBVElMRV9QIChtZW1fMSkgfHwgTUVNX1ZPTEFUSUxFX1AgKG1l bV8yKSkKICAgICByZXR1cm4gZmFsc2U7CiAKKyAgLyogSWYgd2UgaGF2ZSBT SW1vZGUgYW5kIHNsb3cgbGRwLCBjaGVjayB0aGUgYWxpZ25tZW50IHRvIGJl IGdyZWF0ZXIKKyAgICAgdGhhbiA4IGJ5dGUuICovCisgIGlmIChtb2RlID09 IFNJbW9kZQorICAgICAgJiYgKGFhcmNoNjRfdHVuZV9wYXJhbXMuZXh0cmFf 
dHVuaW5nX2ZsYWdzCisgICAgICAgICAgJiBBQVJDSDY0X0VYVFJBX1RVTkVf U0xPV19MRFBXKQorICAgICAgJiYgIW9wdGltaXplX3NpemUKKyAgICAgICYm IE1FTV9BTElHTiAobWVtXzEpIDwgOCAqIEJJVFNfUEVSX1VOSVQpCisgICAg cmV0dXJuIGZhbHNlOworCiAgIC8qIENoZWNrIGlmIHRoZSBhZGRyZXNzZXMg YXJlIGluIHRoZSBmb3JtIG9mIFtiYXNlK29mZnNldF0uICAqLwogICBleHRy YWN0X2Jhc2Vfb2Zmc2V0X2luX2FkZHIgKG1lbV8xLCAmYmFzZV8xLCAmb2Zm c2V0XzEpOwogICBpZiAoYmFzZV8xID09IE5VTExfUlRYIHx8IG9mZnNldF8x ID09IE5VTExfUlRYKQpAQCAtMTM3NTIsNiArMTM3NjEsMTUgQEAKIAlyZXR1 cm4gZmFsc2U7CiAgICAgfQogCisgIC8qIElmIHdlIGhhdmUgU0ltb2RlIGFu ZCBzbG93IGxkcCwgY2hlY2sgdGhlIGFsaWdubWVudCB0byBiZSBncmVhdGVy CisgICAgIHRoYW4gOCBieXRlLiAqLworICBpZiAobW9kZSA9PSBTSW1vZGUK KyAgICAgICYmIChhYXJjaDY0X3R1bmVfcGFyYW1zLmV4dHJhX3R1bmluZ19m bGFncworICAgICAgICAgICYgQUFSQ0g2NF9FWFRSQV9UVU5FX1NMT1dfTERQ VykKKyAgICAgICYmICFvcHRpbWl6ZV9zaXplCisgICAgICAmJiBNRU1fQUxJ R04gKG1lbV8xKSA8IDggKiBCSVRTX1BFUl9VTklUKQorICAgIHJldHVybiBm YWxzZTsKKwogICBpZiAoUkVHX1AgKHJlZ18xKSAmJiBGUF9SRUdOVU1fUCAo UkVHTk8gKHJlZ18xKSkpCiAgICAgcmNsYXNzXzEgPSBGUF9SRUdTOwogICBl bHNlCkluZGV4OiB0ZXN0c3VpdGUvZ2NjLnRhcmdldC9hYXJjaDY0L3RodW5k ZXJ4bG9hZHBhaXIuYwo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSB0ZXN0 c3VpdGUvZ2NjLnRhcmdldC9hYXJjaDY0L3RodW5kZXJ4bG9hZHBhaXIuYwko bm9uZXhpc3RlbnQpCisrKyB0ZXN0c3VpdGUvZ2NjLnRhcmdldC9hYXJjaDY0 L3RodW5kZXJ4bG9hZHBhaXIuYwkod29ya2luZyBjb3B5KQpAQCAtMCwwICsx LDIwIEBACisvKiB7IGRnLWRvIGNvbXBpbGUgfSAqLworLyogeyBkZy1vcHRp b25zICItTzIgLW1jcHU9dGh1bmRlcngiIH0gKi8KKworc3RydWN0IGxkcAor eworICBsb25nIGxvbmcgYzsKKyAgaW50IGEsIGI7Cit9OworCisKK2ludCBm KHN0cnVjdCBsZHAgKmEpCit7CisgIHJldHVybiBhLT5hICsgYS0+YjsKK30K KworCisvKiBXZSBrbm93IHRoZSBhbGlnbmVtZW50IG9mIGEtPmEgdG8gYmUg OCBieXRlIGFsaWduZWQgc28gaXQgaXMgcHJvZml0YWJsZQorICAgdG8gZG8g bGRwLiAqLworLyogeyBkZy1maW5hbCB7IHNjYW4tYXNzZW1ibGVyLXRpbWVz ICJsZHBcdHdcWzAtOVxdKywgd1xbMC05XF0iIDEgfSB9ICovCisKSW5kZXg6 IHRlc3RzdWl0ZS9nY2MudGFyZ2V0L2FhcmNoNjQvdGh1bmRlcnhub2xvYWRw 
YWlyLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gdGVzdHN1aXRlL2dj Yy50YXJnZXQvYWFyY2g2NC90aHVuZGVyeG5vbG9hZHBhaXIuYwkobm9uZXhp c3RlbnQpCisrKyB0ZXN0c3VpdGUvZ2NjLnRhcmdldC9hYXJjaDY0L3RodW5k ZXJ4bm9sb2FkcGFpci5jCSh3b3JraW5nIGNvcHkpCkBAIC0wLDAgKzEsMTcg QEAKKy8qIHsgZGctZG8gY29tcGlsZSB9ICovCisvKiB7IGRnLW9wdGlvbnMg Ii1PMiAtbWNwdT10aHVuZGVyeCIgfSAqLworCitzdHJ1Y3Qgbm9sZHAKK3sK KyAgaW50IGEsIGI7Cit9OworCisKK2ludCBmKHN0cnVjdCBub2xkcCAqYSkK K3sKKyAgcmV0dXJuIGEtPmEgKyBhLT5iOworfQorCisvKiBXZSBrbm93IHRo ZSBhbGlnbmVtZW50IG9mIGEtPmEgdG8gYmUgNCBieXRlIGFsaWduZWQgc28g aXQgaXMgbm90IHByb2ZpdGFibGUKKyAgIHRvIGRvIGxkcC4gKi8KKy8qIHsg ZGctZmluYWwgeyBzY2FuLWFzc2VtYmxlci10aW1lICJsZHBcdHdcWzAtOVxd Kywgd1xbMC05XF0iIDEgfSB9ICovCg== --001a1141886e1f15e90539930611--
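
For readers who do not want to decode the attachment, the rule described in the message can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not the GCC code itself: `pair_query`, `pair_allowed`, `member_align_bytes`, `MODE_SI`, `TUNE_SLOW_LDPW` and `BITS_PER_BYTE` are hypothetical names that stand in for GCC's machine mode, `AARCH64_EXTRA_TUNE_SLOW_LDPW`, `BITS_PER_UNIT` and `MEM_ALIGN`. The `optimize_size` field mirrors the `!optimize_size` term that appears in the attached diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC internals; not GCC's real types.  */
enum access_mode { MODE_SI, MODE_DI };  /* 32-bit vs. 64-bit access */
#define TUNE_SLOW_LDPW 0x1u             /* models AARCH64_EXTRA_TUNE_SLOW_LDPW */
#define BITS_PER_BYTE 8u                /* models BITS_PER_UNIT */

struct pair_query {
  enum access_mode mode;   /* mode of each access in the candidate pair */
  unsigned tune_flags;     /* extra tuning flags of the selected core */
  bool optimize_size;      /* true when optimizing for size */
  unsigned align_bits;     /* known alignment of the first access, in bits */
};

/* Return false only when all conditions hold: 32-bit accesses, the
   slow-ldpw tuning flag is set, we are not optimizing for size, and
   the first access is aligned to less than 8 bytes.  */
static bool pair_allowed(const struct pair_query *q)
{
  if (q->mode == MODE_SI
      && (q->tune_flags & TUNE_SLOW_LDPW)
      && !q->optimize_size
      && q->align_bits < 8 * BITS_PER_BYTE)
    return false;
  return true;
}

/* Largest power of two (in bytes) that is guaranteed to divide the
   address of a member, given the struct's alignment and the member's
   offset.  STRUCT_ALIGN must be a nonzero power of two.  */
static unsigned member_align_bytes(size_t struct_align, size_t offset)
{
  assert(struct_align != 0 && (struct_align & (struct_align - 1)) == 0);
  size_t a = struct_align;
  while (offset % a != 0)
    a /= 2;
  return (unsigned) a;
}
```

Applied to the two testcases: in `struct ldp { long long c; int a, b; }` the member `a` sits at offset 8 of a struct that is 8-byte aligned on LP64 targets, so `member_align_bytes(8, 8)` gives 8 and the pair is still allowed. In `struct noldp { int a, b; }` the struct is only 4-byte aligned, so `member_align_bytes(4, 0)` gives 4 and the pair is rejected when the tuning flag is set.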