From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <952c73e5-ba66-0a5a-e33e-1feb6396743e@codesourcery.com>
Date: Tue, 8 Nov 2022 14:35:28 +0000
MIME-Version: 1.0
From: Kwok Cheung Yeung
Subject: [PATCH] amdgcn: Add builtins for vectorized native versions of abs, floorf and floor
To: gcc-patches, Andrew Stubbs
Content-Type: multipart/mixed; boundary="------------bNqG6abR9ptpEOYSx0OLQYEY"

--------------bNqG6abR9ptpEOYSx0OLQYEY
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 7bit

Hello

This patch adds three extra builtins for the vectorized forms of the abs, floorf and floor math functions, which are implemented by native GCN instructions. I have also added a test to check that they generate the expected assembler instructions.

Okay for trunk?

Thanks

Kwok

--------------bNqG6abR9ptpEOYSx0OLQYEY
Content-Type: text/plain; charset="UTF-8"; name="0001-amdgcn-Add-builtins-for-vectorized-native-versions-o.patch"
Content-Disposition: attachment; filename*0="0001-amdgcn-Add-builtins-for-vectorized-native-versions-o.pa"; filename*1="tch"
Content-Transfer-Encoding: 7bit

From 37f49b204d501327d0867b3e8a3f01b9445fb9bd Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung <kcy@codesourcery.com>
Date: Tue, 8 Nov 2022 11:59:58 +0000
Subject: [PATCH] amdgcn: Add builtins for vectorized native versions of abs,
 floorf and floor

2022-11-08  Kwok Cheung Yeung  <kcy@codesourcery.com>

	gcc/
	* config/gcn/gcn-builtins.def (FABSV, FLOORVF, FLOORV): New builtins.
	* config/gcn/gcn.cc (gcn_expand_builtin_1): Expand GCN_BUILTIN_FABSV,
	GCN_BUILTIN_FLOORVF and GCN_BUILTIN_FLOORV.

	gcc/testsuite/
	* gcc.target/gcn/math-builtins-1.c: New test.
---
 gcc/config/gcn/gcn-builtins.def               | 15 +++++++++
 gcc/config/gcn/gcn.cc                         | 33 +++++++++++++++++++
 .../gcc.target/gcn/math-builtins-1.c          | 33 +++++++++++++++++++
 3 files changed, 81 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/gcn/math-builtins-1.c

diff --git a/gcc/config/gcn/gcn-builtins.def b/gcc/config/gcn/gcn-builtins.def
index 27691909925..c50777bd3b0 100644
--- a/gcc/config/gcn/gcn-builtins.def
+++ b/gcc/config/gcn/gcn-builtins.def
@@ -64,6 +64,21 @@ DEF_BUILTIN (FABSVF, 3 /*CODE_FOR_fabsvf */,
 	     _A2 (GCN_BTI_V64SF, GCN_BTI_V64SF),
 	     gcn_expand_builtin_1)
 
+DEF_BUILTIN (FABSV, 3 /*CODE_FOR_fabsv */,
+	     "fabsv", B_INSN,
+	     _A2 (GCN_BTI_V64DF, GCN_BTI_V64DF),
+	     gcn_expand_builtin_1)
+
+DEF_BUILTIN (FLOORVF, 3 /*CODE_FOR_floorvf */,
+	     "floorvf", B_INSN,
+	     _A2 (GCN_BTI_V64SF, GCN_BTI_V64SF),
+	     gcn_expand_builtin_1)
+
+DEF_BUILTIN (FLOORV, 3 /*CODE_FOR_floorv */,
+	     "floorv", B_INSN,
+	     _A2 (GCN_BTI_V64DF, GCN_BTI_V64DF),
+	     gcn_expand_builtin_1)
+
 DEF_BUILTIN (LDEXPVF, 3 /*CODE_FOR_ldexpvf */,
 	     "ldexpvf", B_INSN,
 	     _A3 (GCN_BTI_V64SF, GCN_BTI_V64SF, GCN_BTI_V64SI),
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 1996115a686..9c5e3419748 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -4329,6 +4329,39 @@ gcn_expand_builtin_1 (tree exp, rtx target, rtx /*subtarget */ ,
 	emit_insn (gen_absv64sf2 (target, arg));
 	return target;
       }
+    case GCN_BUILTIN_FABSV:
+      {
+	if (ignore)
+	  return target;
+	rtx arg = force_reg (V64DFmode,
+			     expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+					  V64DFmode,
+					  EXPAND_NORMAL));
+	emit_insn (gen_absv64df2 (target, arg));
+	return target;
+      }
+    case GCN_BUILTIN_FLOORVF:
+      {
+	if (ignore)
+	  return target;
+	rtx arg = force_reg (V64SFmode,
+			     expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+					  V64SFmode,
+					  EXPAND_NORMAL));
+	emit_insn (gen_floorv64sf2 (target, arg));
+	return target;
+      }
+    case GCN_BUILTIN_FLOORV:
+      {
+	if (ignore)
+	  return target;
+	rtx arg = force_reg (V64DFmode,
+			     expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+					  V64DFmode,
+					  EXPAND_NORMAL));
+	emit_insn (gen_floorv64df2 (target, arg));
+	return target;
+      }
     case GCN_BUILTIN_LDEXPVF:
       {
 	if (ignore)
diff --git a/gcc/testsuite/gcc.target/gcn/math-builtins-1.c b/gcc/testsuite/gcc.target/gcn/math-builtins-1.c
new file mode 100644
index 00000000000..e1aadfb40d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/gcn/math-builtins-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+typedef float v64sf __attribute__ ((vector_size (256)));
+typedef double v64df __attribute__ ((vector_size (512)));
+typedef int v64si __attribute__ ((vector_size (256)));
+typedef long v64di __attribute__ ((vector_size (512)));
+
+v64sf f (v64sf _x, v64si _y)
+{
+  v64sf x = _x;
+  v64si y = _y;
+  x = __builtin_gcn_fabsvf (x); /* { dg-final { scan-assembler "v_add_f32\\s+v\[0-9\]+, 0, |v\[0-9\]+|" } } */
+  x = __builtin_gcn_floorvf (x); /* { dg-final { scan-assembler "v_floor_f32\\s+v\[0-9\]+, v\[0-9\]+" } }*/
+  x = __builtin_gcn_frexpvf_mant (x); /* { dg-final { scan-assembler "v_frexp_mant_f32\\s+v\[0-9\]+, v\[0-9\]+" } }*/
+  y = __builtin_gcn_frexpvf_exp (x); /* { dg-final { scan-assembler "v_frexp_exp_i32_f32\\s+v\[0-9\]+, v\[0-9\]+" } }*/
+  x = __builtin_gcn_ldexpvf (x, y); /* { dg-final { scan-assembler "v_ldexp_f32\\s+v\[0-9\]+, v\[0-9\]+, v\[0-9\]+" } }*/
+
+  return x;
+}
+
+v64df g (v64df _x, v64si _y)
+{
+  v64df x = _x;
+  v64si y = _y;
+  x = __builtin_gcn_fabsv (x); /* { dg-final { scan-assembler "v_add_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], 0, |v\\\[\[0-9\]+:\[0-9\]+\\\]|" } } */
+  x = __builtin_gcn_floorv (x); /* { dg-final { scan-assembler "v_floor_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], v\\\[\[0-9\]+:\[0-9]+\\\]" } }*/
+  x = __builtin_gcn_frexpv_mant (x); /* { dg-final { scan-assembler "v_frexp_mant_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], v\\\[\[0-9\]+:\[0-9]+\\\]" } }*/
+  y = __builtin_gcn_frexpv_exp (x); /* { dg-final { scan-assembler "v_frexp_exp_i32_f64\\s+v\[0-9\]+, v\\\[\[0-9\]+:\[0-9]+\\\]" } }*/
+  x = __builtin_gcn_ldexpv (x, y); /* { dg-final { scan-assembler "v_ldexp_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], v\\\[\[0-9\]+:\[0-9]+\\\], v\[0-9\]+" } }*/
+
+  return x;
+}
-- 
2.25.1

--------------bNqG6abR9ptpEOYSx0OLQYEY--