From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail-pj1-x1033.google.com (mail-pj1-x1033.google.com [IPv6:2607:f8b0:4864:20::1033]) by sourceware.org (Postfix) with ESMTPS id 778D53861841 for ; Sun, 15 Aug 2021 04:18:04 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 778D53861841
Received: by mail-pj1-x1033.google.com with SMTP id fa24-20020a17090af0d8b0290178bfa69d97so22186487pjb.0 for ; Sat, 14 Aug 2021 21:18:04 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:message-id:date:user-agent :mime-version:content-language; bh=3+J44OLcUkNDo1gXtXYHA2tWTVvoo3YFPJ5Z0xckgEU=; b=UbLyvA/SkXyHlQCmRd62rbF4rvux8MexMnU/qRuYasfEL5mxkpk1oTjhLSPLCmxShD a7LTi1M5ppjQabImgJvMVfkJIBXiypEDBC/DlpEK5jWjI720QCPSD49QK5NNC4VTPhNS H9OWx509hS9FqjZ4as2Bo+aGfbG5OznUb3C21sZSnbnu6yEdClp7AU/3O1u2P3p0Tebo mkxgdcPmX4xkgydVyWmkKf3zeXVeZ5d4K+fEbrGfX/zEQGl57thqt20QDxmp9XlxlQUb e/acpIQA4F2JW+/sTEZCtn6608cV4KlYRZkWGSrU2jaUAblWPf6OIKvL9H8WIDh5YfcV VsnA==
X-Gm-Message-State: AOAM532Sot4f1nNGmGVjXOoSheSdJJ641osUMwucaeTQSU8s26DCPOKK DEDsNG+qRr928MLt4AnomgG3uoRAOjl30w==
X-Google-Smtp-Source: ABdhPJz2nOfq/n6Iem8Bc2LtmRpNxb+kUCTlModc+7ijn2CyPkwBMjJeEoCl+qy2JgYveZV8KzWACw==
X-Received: by 2002:a17:90a:de0b:: with SMTP id m11mr10390671pjv.39.1629001083015; Sat, 14 Aug 2021 21:18:03 -0700 (PDT)
Received: from [172.31.0.175] (c-98-202-48-222.hsd1.ut.comcast.net. [98.202.48.222]) by smtp.gmail.com with ESMTPSA id ne3sm3174558pjb.51.2021.08.14.21.18.01 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 14 Aug 2021 21:18:02 -0700 (PDT)
From: Jeff Law
To: GCC Patches
Subject: [committed] Improve many SImode shifts on the H8/300H.
Message-ID: <66685fb7-ea8d-e2db-ff11-e44f22168d3b@gmail.com>
Date: Sat, 14 Aug 2021 22:18:01 -0600
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.12.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------187C892BE8BD82B41BA1A10C"
Content-Language: en-US
X-Spam-Status: No, score=-7.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
X-List-Received-Date: Sun, 15 Aug 2021 04:18:06 -0000

This is a multi-part message in MIME format.
--------------187C892BE8BD82B41BA1A10C
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

As I've mentioned before, the H8/300H can only shift a single bit position at a time. Naturally this means many shifts are implemented as loops. There are a variety of special cases that we can handle without loops by using rotates, sub-word moves, etc. The general guidance for the port has been to only use inline or special sequences if they're shorter or just one instruction longer than the loop.

This was pretty reasonable guidance for QI/HI mode. It was relaxed a bit about 10 years ago for HImode in particular, where the KPIT team realized they could save 50-100 cycles for some shifts by allowing 2 instructions of code growth over the loop implementation.

But they only re-tuned HImode shifts. There are even bigger benefits from re-tuning SImode shifts. There are cases where we can save close to 200 cycles by allowing 2 additional instructions.
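
For readers less familiar with the port, the difference between a loop and a special sequence with an inline residual can be sketched in plain C. This is only a model under stated assumptions, not the port's code or H8 assembly: the function names are made up for illustration, and the 16-bit shift merely stands in for the sub-word move a special sequence would use.

```c
#include <assert.h>
#include <stdint.h>

/* Loop strategy: one single-bit shift per iteration, mirroring a CPU
   that can only shift by one position per instruction.  */
uint32_t shl_loop(uint32_t x, unsigned count)
{
    assert(count < 32);
    while (count-- > 0)
        x <<= 1;
    return x;
}

/* Special + inline strategy for counts 16..23 (hypothetical helper):
   model a sub-word move (low half becomes high half, low half cleared),
   then finish the residual with single-bit shifts that a compiler
   could emit inline instead of looping.  */
uint32_t shl_word_move(uint32_t x, unsigned count)
{
    assert(count >= 16 && count <= 23);
    uint32_t r = (x & 0xFFFFu) << 16;   /* stands in for the word move */
    unsigned remainder = count - 16;    /* at most 7 single-bit shifts */
    while (remainder-- > 0)
        r <<= 1;
    return r;
}
```

For a count of 20, the first form performs twenty single-bit shifts, while the second performs one move and four single-bit shifts; both produce the same value.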
This patch re-tunes SImode shifts on the H8/300H primarily by inlining more often or using a special sequence + inlining for residuals. Both cases were already supported and this just uses those existing capabilities more often, so it was trivial to implement. I think there are some cases where entirely new special sequences could be used, but I haven't tried those yet.

There'll be a similar follow-up for the H8/S. The gains won't be as spectacular, since the H8/S gained shift-by-2 instructions, but they should still be significant.

Committed to the trunk after the usual testing with no regressions.

Jeff
--------------187C892BE8BD82B41BA1A10C
Content-Type: text/plain; charset=UTF-8; name="h8.patch"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="h8.patch"

Y29tbWl0IDg4MmYxZDU4YmZhNTY3MzdmZjJkZTg0YzNjZDFlMGFjZmMzMThiODYKQXV0aG9y OiBKZWZmIExhdyA8amxhd0Bsb2NhbGhvc3QubG9jYWxkb21haW4+CkRhdGU6ICAgU3VuIEF1 ZyAxNSAwMDoxMzoyMyAyMDIxIC0wNDAwCgogICAgSW1wcm92ZSBtYW55IFNJbW9kZSBzaGlm dHMgb24gdGhlIEg4LzMwMEgKICAgIAogICAgQXMgSSd2ZSBtZW50aW9uZWQgYmVmb3JlLCB0 aGUgSDgvMzAwSCBjYW4gb25seSBzaGlmdCBhIHNpbmdsZSBiaXQgcG9zaXRpb24gYXQgYSB0 aW1lLiAgTmF0dXJhbGx5IHRoaXMgbWVhbnMgbWFueSBzaGlmdHMgYXJlIGltcGxlbWVudGVk IGFzIGxvb3BzLiAgVGhlcmUncyBhIHZhcmlldHkgb2Ygc3BlY2lhbCBjYXNlcyB0aGF0IHdl IGNhbiBkbyB3aXRob3V0IGxvb3BzIGJ5IHVzaW5nIHJvdGF0ZXMsIHN1Yi13b3JkIG1vdmVz LCBldGMuICBUaGUgZ2VuZXJhbCBndWlkYW5jZSBmb3IgdGhlIHBvcnQgaGFzIGJlZW4gdG8g b25seSB1c2UgaW5saW5lIG9yIHNwZWNpYWwgc2VxdWVuY2VzIGlmIHRoZXkncmUgc2hvcnRl ciBvciBqdXN0IG9uZSBpbnN0cnVjdGlvbiBsb25nZXIgdGhhbiB0aGUgbG9vcC4KICAgIAog ICAgVGhpcyB3YXMgcHJldHR5IHJlYXNvbmFibGUgZ3VpZGFuY2UgZm9yIFFJL0hJIG1vZGUu ICBJdCB3YXMgcmVsYXhlZCBhIGJpdCBhYm91dCAxMCB5ZWFycyBhZ28gZm9yIEhJbW9kZSBp biBwYXJ0aWN1bGFyIHdoZXJlIHRoZSBrcGl0IHRlYW0gcmVhbGl6ZWQgdGhleSBjb3VsZCBz YXZlIDUwLTEwMCBjeWNsZXMgZm9yIHNvbWUgc2hpZnRzIGJ5IGFsbG93aW5nIDIgaW5zdHJ1 Y3Rpb25zIG9mIGNvZGUgZ3Jvd3RoIG92ZXIgdGhlIGxvb3AgaW1wbGVtZW50YXRpb24uCiAg
ICAKICAgIEJ1dCB0aGV5IG9ubHkgcmUtdHVuZWQgSEltb2RlIHNoaWZ0cy4gIFRoZXJlJ3Mg ZXZlbiBiaWdnZXIgYmVuZWZpdHMgZm9yIHJlLXR1bmluZyBTSW1vZGUgc2hpZnRzLiAgVGhl cmUncyBjYXNlcyB3aGVyZSB3ZSBjYW4gc2F2ZSBjbG9zZSB0byAyMDAgY3ljbGVzIGJ5IGFs bG93aW5nIDIgYWRkaXRpb25hbCBpbnN0cnVjdGlvbnMuCiAgICAKICAgIFRoaXMgcGF0Y2gg cmUtdHVuZXMgU0ltb2RlIHNoaWZ0cyBvbiB0aGUgSDgvMzAwSCBwcmltYXJpbHkgYnkgaW5s aW5pbmcgbW9yZSBvZnRlbiBvciB1c2luZyBhIHNwZWNpYWwgc2VxdWVuY2UgKyBpbmxpbmlu ZyBmb3IgcmVzaWR1YWxzLiAgQm90aCBjYXNlcyB3ZXJlIGFscmVhZHkgc3VwcG9ydGVkIGFu ZCB0aGlzIGp1c3QgdXNlcyB0aG9zZSBleGlzdGluZyBjYXBhYmlsaXRpZXMgbW9yZSBvZnRl biwgc28gaXQgd2FzIHRyaXZpYWwgdG8gaW1wbGVtZW50LiAgSSB0aGluayB0aGVyZSdzIHNv bWUgY2FzZXMgd2VyZSBlbnRpcmVseSBuZXcgc3BlY2lhbCBzZXF1ZW5jZXMgY291bGQgYmUg dXNlZCwgYnV0IEkgaGF2ZW4ndCB0cmllZCB0aG9zZSB5ZXQuCiAgICAKICAgIGdjYy8KICAg IAogICAgICAgICAgICAqIGNvbmZpZy9oODMwMC9oODMwMC5jIChzaGlmdF9hbGdfc2kpOiBS ZXR1bmUgSDgvMzAwSCBzaGlmdHMKICAgICAgICAgICAgdG8gYWxsb3cgYSBiaXQgbW9yZSBj b2RlIGdyb3d0aCwgc2F2aW5nIG1hbnkgZG96ZW5zIG9mIGN5Y2xlcy4KICAgICAgICAgICAg KGg4MzAwX29wdGlvbl9vdmVycmlkZSk6IEFkanVzIHNoaWZ0X2FsZ19zaSBpZiBvcHRpbWl6 aW5nIGZvcgogICAgICAgICAgICBjb2RlIHNpemUuCiAgICAgICAgICAgIChnZXRfc2hpZnRf YWxnKTogVXNlIHNwZWNpYWwgKyBpbmxpbmUgc2hpZnRzIGZvciByZXNpZHVhbHMKICAgICAg ICAgICAgaW4gbW9yZSBjYXNlcy4KCmRpZmYgLS1naXQgYS9nY2MvY29uZmlnL2g4MzAwL2g4 MzAwLmMgYi9nY2MvY29uZmlnL2g4MzAwL2g4MzAwLmMKaW5kZXggZDJmNjU0OGEyNjUuLjc5 NTlhZDFlMjc2IDEwMDY0NAotLS0gYS9nY2MvY29uZmlnL2g4MzAwL2g4MzAwLmMKKysrIGIv Z2NjL2NvbmZpZy9oODMwMC9oODMwMC5jCkBAIC0yMjgsMTggKzIyOCwxOCBAQCBzdGF0aWMg ZW51bSBzaGlmdF9hbGcgc2hpZnRfYWxnX3NpWzJdWzNdWzMyXSA9IHsKICAgICAvKiAgOCAg ICA5ICAgMTAgICAxMSAgIDEyICAgMTMgICAxNCAgIDE1ICAqLwogICAgIC8qIDE2ICAgMTcg ICAxOCAgIDE5ICAgMjAgICAyMSAgIDIyICAgMjMgICovCiAgICAgLyogMjQgICAyNSAgIDI2 ICAgMjcgICAyOCAgIDI5ICAgMzAgICAzMSAgKi8KLSAgICB7IElOTCwgSU5MLCBJTkwsIElO TCwgSU5MLCBMT1AsIExPUCwgTE9QLAorICAgIHsgSU5MLCBJTkwsIElOTCwgSU5MLCBJTkws IElOTCwgSU5MLCBMT1AsCiAgICAgICBTUEMsIExPUCwgTE9QLCBMT1AsIExPUCwgTE9QLCBM 
T1AsIFNQQywKLSAgICAgIFNQQywgU1BDLCBTUEMsIFNQQywgTE9QLCBMT1AsIExPUCwgTE9Q LAotICAgICAgU1BDLCBMT1AsIExPUCwgTE9QLCBTUEMsIFNQQywgU1BDLCBTUEMgfSwgLyog U0hJRlRfQVNISUZUICAgKi8KLSAgICB7IElOTCwgSU5MLCBJTkwsIElOTCwgSU5MLCBMT1As IExPUCwgTE9QLAorICAgICAgU1BDLCBTUEMsIFNQQywgU1BDLCBTUEMsIFNQQywgU1BDLCBT UEMsCisgICAgICBTUEMsIFNQQywgU1BDLCBTUEMsIFNQQywgU1BDLCBTUEMsIFNQQyB9LCAv KiBTSElGVF9BU0hJRlQgICAqLworICAgIHsgSU5MLCBJTkwsIElOTCwgSU5MLCBJTkwsIElO TCwgSU5MLCBMT1AsCiAgICAgICBTUEMsIExPUCwgTE9QLCBMT1AsIExPUCwgTE9QLCBMT1As IFNQQywKLSAgICAgIFNQQywgU1BDLCBTUEMsIFNQQywgTE9QLCBMT1AsIExPUCwgTE9QLAot ICAgICAgU1BDLCBMT1AsIExPUCwgTE9QLCBTUEMsIFNQQywgU1BDLCBTUEMgfSwgLyogU0hJ RlRfTFNISUZUUlQgKi8KLSAgICB7IElOTCwgSU5MLCBJTkwsIElOTCwgSU5MLCBMT1AsIExP UCwgTE9QLAorICAgICAgU1BDLCBTUEMsIFNQQywgU1BDLCBTUEMsIFNQQywgU1BDLCBTUEMs CisgICAgICBTUEMsIFNQQywgU1BDLCBTUEMsIFNQQywgU1BDLCBTUEMsIFNQQyB9LCAvKiBT SElGVF9MU0hJRlRSVCAqLworICAgIHsgSU5MLCBJTkwsIElOTCwgSU5MLCBJTkwsIElOTCwg SU5MLCBMT1AsCiAgICAgICBTUEMsIExPUCwgTE9QLCBMT1AsIExPUCwgTE9QLCBMT1AsIExP UCwKLSAgICAgIFNQQywgU1BDLCBTUEMsIFNQQywgTE9QLCBMT1AsIExPUCwgTE9QLAotICAg ICAgU1BDLCBMT1AsIExPUCwgTE9QLCBMT1AsIExPUCwgTE9QLCBTUEMgfSwgLyogU0hJRlRf QVNISUZUUlQgKi8KKyAgICAgIFNQQywgU1BDLCBTUEMsIFNQQywgU1BDLCBTUEMsIFNQQywg U1BDLAorICAgICAgU1BDLCBTUEMsIFNQQywgU1BDLCBMT1AsIExPUCwgTE9QLCBTUEMgfSwg LyogU0hJRlRfQVNISUZUUlQgKi8KICAgfSwKICAgewogICAgIC8qIFRBUkdFVF9IODMwMFMg ICovCkBAIC0zNDMsNiArMzQzLDM2IEBAIGg4MzAwX29wdGlvbl9vdmVycmlkZSAodm9pZCkK ICAgICAgIHNoaWZ0X2FsZ19oaVtIOF8zMDBIXVtTSElGVF9BU0hJRlRSVF1bMTNdID0gU0hJ RlRfTE9PUDsKICAgICAgIHNoaWZ0X2FsZ19oaVtIOF8zMDBIXVtTSElGVF9BU0hJRlRSVF1b MTRdID0gU0hJRlRfTE9PUDsKIAorICAgICAgc2hpZnRfYWxnX3NpW0g4XzMwMEhdW1NISUZU X0FTSElGVF1bNV0gPSBTSElGVF9MT09QOworICAgICAgc2hpZnRfYWxnX3NpW0g4XzMwMEhd W1NISUZUX0FTSElGVF1bNl0gPSBTSElGVF9MT09QOworICAgICAgc2hpZnRfYWxnX3NpW0g4 XzMwMEhdW1NISUZUX0FTSElGVF1bMjBdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2Fs Z19zaVtIOF8zMDBIXVtTSElGVF9BU0hJRlRdWzIxXSA9IFNISUZUX0xPT1A7CisgICAgICBz 
aGlmdF9hbGdfc2lbSDhfMzAwSF1bU0hJRlRfQVNISUZUXVsyMl0gPSBTSElGVF9MT09QOwor ICAgICAgc2hpZnRfYWxnX3NpW0g4XzMwMEhdW1NISUZUX0FTSElGVF1bMjNdID0gU0hJRlRf TE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9BU0hJRlRdWzI1XSA9 IFNISUZUX0xPT1A7CisgICAgICBzaGlmdF9hbGdfc2lbSDhfMzAwSF1bU0hJRlRfQVNISUZU XVsyNl0gPSBTSElGVF9MT09QOworICAgICAgc2hpZnRfYWxnX3NpW0g4XzMwMEhdW1NISUZU X0FTSElGVF1bMjddID0gU0hJRlRfTE9PUDsKKworICAgICAgc2hpZnRfYWxnX3NpW0g4XzMw MEhdW1NISUZUX0xTSElGVFJUXVs1XSA9IFNISUZUX0xPT1A7CisgICAgICBzaGlmdF9hbGdf c2lbSDhfMzAwSF1bU0hJRlRfTFNISUZUUlRdWzZdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNo aWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9MU0hJRlRSVF1bMjBdID0gU0hJRlRfTE9PUDsK KyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9MU0hJRlRSVF1bMjFdID0gU0hJ RlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9MU0hJRlRSVF1b MjJdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9M U0hJRlRSVF1bMjNdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBI XVtTSElGVF9MU0hJRlRSVF1bMjVdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19z aVtIOF8zMDBIXVtTSElGVF9MU0hJRlRSVF1bMjZdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNo aWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9MU0hJRlRSVF1bMjddID0gU0hJRlRfTE9PUDsK KworICAgICAgc2hpZnRfYWxnX3NpW0g4XzMwMEhdW1NISUZUX0FTSElGVFJUXVs1XSA9IFNI SUZUX0xPT1A7CisgICAgICBzaGlmdF9hbGdfc2lbSDhfMzAwSF1bU0hJRlRfQVNISUZUUlRd WzZdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9B U0hJRlRSVF1bMjBdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBI XVtTSElGVF9BU0hJRlRSVF1bMjFdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19z aVtIOF8zMDBIXVtTSElGVF9BU0hJRlRSVF1bMjJdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNo aWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9BU0hJRlRSVF1bMjNdID0gU0hJRlRfTE9PUDsK KyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9BU0hJRlRSVF1bMjVdID0gU0hJ RlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9BU0hJRlRSVF1b MjZdID0gU0hJRlRfTE9PUDsKKyAgICAgIHNoaWZ0X2FsZ19zaVtIOF8zMDBIXVtTSElGVF9B U0hJRlRSVF1bMjddID0gU0hJRlRfTE9PUDsKKwogICAgICAgLyogSDhTICovCiAgICAgICBz 
aGlmdF9hbGdfaGlbSDhfU11bU0hJRlRfQVNISUZUUlRdWzE0XSA9IFNISUZUX0xPT1A7CiAg ICAgfQpAQCAtMzc4NCw3ICszODE0LDcgQEAgZ2V0X3NoaWZ0X2FsZyAoZW51bSBzaGlmdF90 eXBlIHNoaWZ0X3R5cGUsIGVudW0gc2hpZnRfbW9kZSBzaGlmdF9tb2RlLAogCSAgICAgIGdj Y191bnJlYWNoYWJsZSAoKTsKIAkgICAgfQogCX0KLSAgICAgIGVsc2UgaWYgKChUQVJHRVRf SDgzMDBIICYmIGNvdW50ID49IDE2ICYmIGNvdW50IDw9IDE5KQorICAgICAgZWxzZSBpZiAo KFRBUkdFVF9IODMwMEggJiYgY291bnQgPj0gMTYgJiYgY291bnQgPD0gMjMpCiAJICAgICAg IHx8IChUQVJHRVRfSDgzMDBTICYmIGNvdW50ID49IDE2ICYmIGNvdW50IDw9IDIxKSkKIAl7 CiAJICBpbmZvLT5yZW1haW5kZXIgPSBjb3VudCAtIDE2OwpAQCAtMzgwNCw3ICszODM0LDcg QEAgZ2V0X3NoaWZ0X2FsZyAoZW51bSBzaGlmdF90eXBlIHNoaWZ0X3R5cGUsIGVudW0gc2hp ZnRfbW9kZSBzaGlmdF9tb2RlLAogCSAgICAgIGdvdG8gZW5kOwogCSAgICB9CiAJfQotICAg ICAgZWxzZSBpZiAoKFRBUkdFVF9IODMwMEggJiYgY291bnQgPT0gMjQpCisgICAgICBlbHNl IGlmICgoVEFSR0VUX0g4MzAwSCAmJiBjb3VudCA+PSAyNCAmJiBjb3VudCA8PSAyNykKIAkg ICAgICAgfHwgKFRBUkdFVF9IODMwMFMgJiYgY291bnQgPj0gMjQgJiYgY291bnQgPD0gMjUp KQogCXsKIAkgIGluZm8tPnJlbWFpbmRlciA9IGNvdW50IC0gMjQ7Cg== --------------187C892BE8BD82B41BA1A10C--
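
Read together, the shift_alg_si and h8300_option_override hunks in the attached patch suggest that builds optimizing for code size fall back to the pre-patch choices. A minimal standalone check of that reading for the H8/300H SImode SHIFT_ASHIFT row is sketched below. The two rows and the list of overridden counts are transcribed from the decoded patch; the enum, array, and function names are hypothetical stand-ins, not the declarations in h8300.c.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the three strategies: inline, loop, special sequence.  */
enum alg { INL, LOP, SPC };

/* H8/300H SImode SHIFT_ASHIFT row as it reads after the patch,
   indexed by shift count 0..31.  */
enum alg ashift_si[32] = {
    INL, INL, INL, INL, INL, INL, INL, LOP,
    SPC, LOP, LOP, LOP, LOP, LOP, LOP, SPC,
    SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC,
    SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC,
};

/* The same row as it read before the patch.  */
const enum alg ashift_si_old[32] = {
    INL, INL, INL, INL, INL, LOP, LOP, LOP,
    SPC, LOP, LOP, LOP, LOP, LOP, LOP, SPC,
    SPC, SPC, SPC, SPC, LOP, LOP, LOP, LOP,
    SPC, LOP, LOP, LOP, SPC, SPC, SPC, SPC,
};

/* Reset the counts that the patch switches back to a loop when
   optimizing for code size.  */
void override_for_size(enum alg row[32])
{
    static const int counts[] = { 5, 6, 20, 21, 22, 23, 25, 26, 27 };
    for (size_t i = 0; i < sizeof counts / sizeof counts[0]; i++)
        row[counts[i]] = LOP;
}

/* Return nonzero when two rows select the same strategy for every count.  */
int same_row(const enum alg a[32], const enum alg b[32])
{
    return memcmp(a, b, 32 * sizeof a[0]) == 0;
}
```

Calling override_for_size on the post-patch row and comparing it with the pre-patch row via same_row gives a match, which is consistent with the size-optimized path being unchanged for this row.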