From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
    by sourceware.org (Postfix) with ESMTP id F32263858D1E
    for ; Tue, 24 Jan 2023 13:54:28 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F32263858D1E
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
    by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6C5BFC14;
    Tue, 24 Jan 2023 05:55:10 -0800 (PST)
Received: from [10.57.75.149] (unknown [10.57.75.149])
    by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6C7C03F71E;
    Tue, 24 Jan 2023 05:54:27 -0800 (PST)
Content-Type: multipart/mixed; boundary="------------5P094bRjrS3WBZuuwidBOaBL"
Message-ID: <22ba05fb-774e-62b8-64a2-90c5d73fcaba@arm.com>
Date: Tue, 24 Jan 2023 13:54:20 +0000
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1
Subject: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]
Content-Language: en-US
To: "gcc-patches@gcc.gnu.org"
References: <13d03aef-f5d1-03fe-5281-31921d24dce0@arm.com>
Cc: Richard Sandiford, Richard Earnshaw, Richard Biener, Kyrylo Tkachov
From: "Andre Vieira (lists)"
In-Reply-To: <13d03aef-f5d1-03fe-5281-31921d24dce0@arm.com>
X-Spam-Status: No, score=-16.5 required=5.0 tests=BAYES_00,GIT_PATCH_0,
    KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_LOTSOFHASH,
    KAM_SHORT,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no
    version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

This is a multi-part message in MIME format.
--------------5P094bRjrS3WBZuuwidBOaBL
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

This patch teaches GCC that zero-extending an MVE predicate from 16 bits
to 32 bits and then using only 16 bits is a no-op.
It does so in two steps:
- it lets GCC know, via the TARGET_MODES_TIEABLE_P hook, that it can
access any MVE predicate mode using any other MVE predicate mode without
needing to copy it,
- it teaches simplify_subreg to optimize a subreg with a vector
outermode by replacing this outermode with a same-sized integer mode and
trying the available optimizations; if that succeeds, it wraps the
result in a subreg that casts it back to the original vector outermode.

This removes the unnecessary zero-extend shown in PR 107674 (though it
is a sign-extend there), which was introduced in GCC 11.

Bootstrapped on aarch64-none-linux-gnu and regression tested on
arm-none-eabi and armeb-none-eabi for armv8.1-m.main+mve.fp.

OK for trunk?

gcc/ChangeLog:

        PR target/107674
        * config/arm/arm.cc (arm_hard_regno_mode_ok): Use new MACRO.
        (arm_modes_tieable_p): Make MVE predicate modes tieable.
        * config/arm/arm.h (VALID_MVE_PRED_MODE): New define.
        * simplify-rtx.cc (simplify_context::simplify_subreg): Teach
        simplify_subreg to simplify subregs where the outermode is not
        scalar.

gcc/testsuite/ChangeLog:

        * gcc.target/arm/mve/mve_vpt.c: Change to remove unnecessary
        zero-extend.
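
For illustration only, and not part of the patch: the first step and the
no-op claim can be modelled in a small standalone sketch. The Mode enum
and the function names below are hypothetical stand-ins for GCC's
machine_mode, VALID_MVE_PRED_MODE and the new check in
arm_modes_tieable_p; the real code operates on GCC's internal types.

```cpp
// Illustrative model only: this is not GCC source.  Mode and the three
// functions are hypothetical stand-ins chosen for this sketch.
#include <cassert>
#include <cstdint>

// A handful of modes: four predicate-capable ones plus two others.
enum class Mode { HI, SI, V16BI, V8BI, V4BI, V4SI };

// Models the set of modes the patch treats as valid MVE predicate modes.
constexpr bool valid_mve_pred_mode (Mode m)
{
  return m == Mode::HI || m == Mode::V16BI
         || m == Mode::V8BI || m == Mode::V4BI;
}

// Models the added tieable rule: with MVE available, any two predicate
// modes can be accessed through one another without a copy.
constexpr bool mve_pred_modes_tieable (bool have_mve, Mode m1, Mode m2)
{
  return have_mve && valid_mve_pred_mode (m1) && valid_mve_pred_mode (m2);
}

// Zero-extending 16 bits to 32 bits and then reading back 16 bits
// returns the original value, so the extension is redundant.
constexpr std::uint16_t extend_then_truncate (std::uint16_t pred)
{
  return static_cast<std::uint16_t> (static_cast<std::uint32_t> (pred));
}

static_assert (mve_pred_modes_tieable (true, Mode::HI, Mode::V16BI), "");
static_assert (!mve_pred_modes_tieable (true, Mode::HI, Mode::V4SI), "");
static_assert (!mve_pred_modes_tieable (false, Mode::HI, Mode::V8BI), "");
static_assert (extend_then_truncate (0xBEEF) == 0xBEEF, "");
```

The static_asserts double as usage examples: an HImode/V16BImode pair is
tieable under this model, while a predicate mode paired with a data
vector mode is not.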
--------------5P094bRjrS3WBZuuwidBOaBL Content-Type: text/plain; charset=UTF-8; name="pr107674-2.patch" Content-Disposition: attachment; filename="pr107674-2.patch" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL2djYy9jb25maWcvYXJtL2FybS5oIGIvZ2NjL2NvbmZpZy9hcm0vYXJt LmgKaW5kZXggNmY3ZWNmOTEyODA0NzY0N2ZjNDE2NzdlNjM0Y2Q5NjEyYTEzMjQyYi4uNDM1 MmM4MzBjYjZkMmU2MzJhMjI1ZWRlYTg2MWI1Y2ViMzVkZDAzNSAxMDA2NDQKLS0tIGEvZ2Nj L2NvbmZpZy9hcm0vYXJtLmgKKysrIGIvZ2NjL2NvbmZpZy9hcm0vYXJtLmgKQEAgLTEwOTEs NiArMTA5MSwxMCBAQCBleHRlcm4gY29uc3QgaW50IGFybV9hcmNoX2NkZV9jb3Byb2NfYml0 c1tdOwogICAgfHwgKE1PREUpID09IFYxNlFJbW9kZSB8fCAoTU9ERSkgPT0gVjhIRm1vZGUg fHwgKE1PREUpID09IFY0U0Ztb2RlIFwKICAgIHx8IChNT0RFKSA9PSBWMkRGbW9kZSkKIAor I2RlZmluZSBWQUxJRF9NVkVfUFJFRF9NT0RFKE1PREUpIFwKKyAgKChNT0RFKSA9PSBISW1v ZGUJCQkJCQkJXAorICAgfHwgKE1PREUpID09IFYxNkJJbW9kZSB8fCAoTU9ERSkgPT0gVjhC SW1vZGUgfHwgKE1PREUpID09IFY0Qkltb2RlKQorCiAjZGVmaW5lIFZBTElEX01WRV9TSV9N T0RFKE1PREUpIFwKICAgKChNT0RFKSA9PSBWMkRJbW9kZSB8fChNT0RFKSA9PSBWNFNJbW9k ZSB8fCAoTU9ERSkgPT0gVjhISW1vZGUgXAogICAgfHwgKE1PREUpID09IFYxNlFJbW9kZSkK ZGlmZiAtLWdpdCBhL2djYy9jb25maWcvYXJtL2FybS5jYyBiL2djYy9jb25maWcvYXJtL2Fy bS5jYwppbmRleCAzZjE3MTE4OGRlNTEzZTI1ODM2OTM5N2U0NzI2YWZlMjdiZDlmZGJmLi4x ODQ2MGVmNTI4MGJlOGMxZGY4NWVmZjQyNGExYmY2NmQ2MDE5YzBhIDEwMDY0NAotLS0gYS9n Y2MvY29uZmlnL2FybS9hcm0uY2MKKysrIGIvZ2NjL2NvbmZpZy9hcm0vYXJtLmNjCkBAIC0y NTU2NCwxMCArMjU1NjQsNyBAQCBhcm1faGFyZF9yZWdub19tb2RlX29rICh1bnNpZ25lZCBp bnQgcmVnbm8sIG1hY2hpbmVfbW9kZSBtb2RlKQogICAgIHJldHVybiBmYWxzZTsKIAogICBp ZiAoSVNfVlBSX1JFR05VTSAocmVnbm8pKQotICAgIHJldHVybiBtb2RlID09IEhJbW9kZQot ICAgICAgfHwgbW9kZSA9PSBWMTZCSW1vZGUKLSAgICAgIHx8IG1vZGUgPT0gVjhCSW1vZGUK LSAgICAgIHx8IG1vZGUgPT0gVjRCSW1vZGU7CisgICAgcmV0dXJuIFZBTElEX01WRV9QUkVE X01PREUgKG1vZGUpOwogCiAgIGlmIChUQVJHRVRfVEhVTUIxKQogICAgIC8qIEZvciB0aGUg VGh1bWIgd2Ugb25seSBhbGxvdyB2YWx1ZXMgYmlnZ2VyIHRoYW4gU0ltb2RlIGluCkBAIC0y NTY0Niw2ICsyNTY0MywxMCBAQCBhcm1fbW9kZXNfdGllYWJsZV9wIChtYWNoaW5lX21vZGUg 
bW9kZTEsIG1hY2hpbmVfbW9kZSBtb2RlMikKICAgaWYgKEdFVF9NT0RFX0NMQVNTIChtb2Rl MSkgPT0gR0VUX01PREVfQ0xBU1MgKG1vZGUyKSkKICAgICByZXR1cm4gdHJ1ZTsKIAorICBp ZiAoVEFSR0VUX0hBVkVfTVZFCisgICAgICAmJiAoVkFMSURfTVZFX1BSRURfTU9ERSAobW9k ZTEpICYmIFZBTElEX01WRV9QUkVEX01PREUgKG1vZGUyKSkpCisgICAgcmV0dXJuIHRydWU7 CisKICAgLyogV2Ugc3BlY2lmaWNhbGx5IHdhbnQgdG8gYWxsb3cgZWxlbWVudHMgb2YgInN0 cnVjdHVyZSIgbW9kZXMgdG8KICAgICAgYmUgdGllYWJsZSB0byB0aGUgc3RydWN0dXJlLiAg VGhpcyBtb3JlIGdlbmVyYWwgY29uZGl0aW9uIGFsbG93cwogICAgICBvdGhlciByYXJlciBz aXR1YXRpb25zIHRvby4gICovCmRpZmYgLS1naXQgYS9nY2Mvc2ltcGxpZnktcnR4LmNjIGIv Z2NjL3NpbXBsaWZ5LXJ0eC5jYwppbmRleCA3ZmIxZTk3ZmJlYTRlN2I4YjA5MWY1NzI0ZWJl MGNiNjFlZWU3ZWMzLi5hOTUxMjcyMTg2NTg1YzBhNWNjM2UwMTU1Mjg1ZTdhNjM1ODY1ZjQy IDEwMDY0NAotLS0gYS9nY2Mvc2ltcGxpZnktcnR4LmNjCisrKyBiL2djYy9zaW1wbGlmeS1y dHguY2MKQEAgLTc2NTIsNiArNzY1MiwyMiBAQCBzaW1wbGlmeV9jb250ZXh0OjpzaW1wbGlm eV9zdWJyZWcgKG1hY2hpbmVfbW9kZSBvdXRlcm1vZGUsIHJ0eCBvcCwKIAl9CiAgICAgfQog CisgIC8qIFRyeSBzaW1wbGlmeWluZyBhIFNVQlJFRyBleHByZXNzaW9uIG9mIGEgbm9uLWlu dGVnZXIgT1VURVJNT0RFIGJ5IHVzaW5nIGEKKyAgICAgTkVXX09VVEVSTU9ERSBvZiB0aGUg c2FtZSBzaXplIGluc3RlYWQsIG90aGVyIHNpbXBsaWZpY2F0aW9ucyByZWx5IG9uCisgICAg IGludGVnZXIgdG8gaW50ZWdlciBzdWJyZWdzIGFuZCB3ZSdkIHBvdGVudGlhbGx5IG1pc3Mg b3V0IG9uIG9wdGltaXphdGlvbnMKKyAgICAgb3RoZXJ3aXNlLiAgKi8KKyAgaWYgKGtub3du X2d0IChHRVRfTU9ERV9TSVpFIChpbm5lcm1vZGUpLAorCQlHRVRfTU9ERV9TSVpFIChvdXRl cm1vZGUpKQorICAgICAgJiYgU0NBTEFSX0lOVF9NT0RFX1AgKGlubmVybW9kZSkKKyAgICAg ICYmICFTQ0FMQVJfSU5UX01PREVfUCAob3V0ZXJtb2RlKQorICAgICAgJiYgaW50X21vZGVf Zm9yX3NpemUgKEdFVF9NT0RFX0JJVFNJWkUgKG91dGVybW9kZSksCisJCQkgICAgMCkuZXhp c3RzICgmaW50X291dGVybW9kZSkpCisgICAgeworICAgICAgcnR4IHRlbSA9IHNpbXBsaWZ5 X3N1YnJlZyAoaW50X291dGVybW9kZSwgb3AsIGlubmVybW9kZSwgYnl0ZSk7CisgICAgICBp ZiAodGVtKQorCXJldHVybiBzaW1wbGlmeV9nZW5fc3VicmVnIChvdXRlcm1vZGUsIHRlbSwg R0VUX01PREUgKHRlbSksIGJ5dGUpOworICAgIH0KKwogICAvKiBJZiBPUCBpcyBhIHZlY3Rv ciBjb21wYXJpc29uIGFuZCB0aGUgc3VicmVnIGlzIG5vdCBjaGFuZ2luZyB0aGUKICAgICAg 
bnVtYmVyIG9mIGVsZW1lbnRzIG9yIHRoZSBzaXplIG9mIHRoZSBlbGVtZW50cywgY2hhbmdl IHRoZSByZXN1bHQKICAgICAgb2YgdGhlIGNvbXBhcmlzb24gdG8gdGhlIG5ldyBtb2RlLiAg Ki8KZGlmZiAtLWdpdCBhL2djYy90ZXN0c3VpdGUvZ2NjLnRhcmdldC9hcm0vbXZlL212ZV92 cHQuYyBiL2djYy90ZXN0c3VpdGUvZ2NjLnRhcmdldC9hcm0vbXZlL212ZV92cHQuYwppbmRl eCAyNmE1NjViNzlkZDEzNDhlMzYxYjNhYTIzYTFkNmU2ZDEzYmZmY2U4Li44ZTU2MmE5ZjA2 NWVmZjE1N2Y2M2ViZDVhY2Y5YWYwYTIxNTViNWM1IDEwMDY0NAotLS0gYS9nY2MvdGVzdHN1 aXRlL2djYy50YXJnZXQvYXJtL212ZS9tdmVfdnB0LmMKKysrIGIvZ2NjL3Rlc3RzdWl0ZS9n Y2MudGFyZ2V0L2FybS9tdmUvbXZlX3ZwdC5jCkBAIC0xNiw5ICsxNiw2IEBAIHZvaWQgdGVz dDAgKHVpbnQ4X3QgKmEsIHVpbnQ4X3QgKmIsIHVpbnQ4X3QgKmMpCiAqKgl2bGRyYi44CXEy LCBcW3IwXF0KICoqCXZsZHJiLjgJcTEsIFxbcjFcXQogKioJdmNtcC5pOAllcSwgcTIsIHEx Ci0qKgl2bXJzCXIzLCBwMAlAIG1vdmhpCi0qKgl1eHRoCXIzLCByMwotKioJdm1zcglwMCwg cjMJQCBtb3ZoaQogKioJdnBzdAogKioJdmFkZHQuaTgJcTMsIHEyLCBxMQogKioJdnBzdAo= --------------5P094bRjrS3WBZuuwidBOaBL--