From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
From: James Greenhalgh
Subject: [Patch AArch64] Add floatdihf2 and floatunsdihf2 patterns
Date: Tue, 06 Sep 2016 09:23:00 -0000
Message-ID: <1473153590-29944-1-git-send-email-james.greenhalgh@arm.com>
MIME-Version: 1.0
X-SW-Source: 2016-09/txt/msg00269.txt.bz2
Content-Type: multipart/mixed; boundary="------------2.6.4.2.gae996d8"

--------------2.6.4.2.gae996d8
Content-Type: text/plain; charset=UTF-8; format=fixed

Hi,

This patch adds patterns for converting 64-bit integers to 16-bit
floating-point values on AArch64 targets which don't have support for
the ARMv8.2-A 16-bit floating-point extensions.

We implement these by first saturating to SImode (any value large enough
to be changed by the saturation is far beyond 65504, the largest finite
HFmode value, so it will round to infinity after conversion to HFmode),
then converting to DFmode (unsigned conversions could go to SFmode, but
there is no performance benefit to this), and finally converting to
HFmode.

Having added these patterns, the expansion path in "expand_float" will
now try to use them for conversions from SImode to HFmode, as there is no
floatsihf2 pattern. expand_float first tries widening the integer size
and looking for a match, so it will try SImode -> DImode. But our DImode
pattern would then saturate back to SImode, which is wasteful. It would
be better for us to provide float(uns)sihf2 patterns directly, so that's
what this patch does.

The testcase added in this patch would fail on trunk for AArch64. There
is no libgcc routine to make the conversion, and we don't provide
appropriate patterns in the back end, so we get a link-time error.

Bootstrapped and tested on aarch64-none-linux-gnu.

OK for trunk?

James

---
2016-09-06  James Greenhalgh

	* config/aarch64/aarch64.md (<optab>sihf2): Convert to expand.
	(<optab>dihf2): Likewise.
	(aarch64_fp16_<optab><mode>hf2): New.

2016-09-06  James Greenhalgh

	* gcc.target/aarch64/floatdihf2_1.c: New.

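The saturation argument in the description can be checked with a small,
portable C sketch. It is not part of the patch: the helper names
(sat_s64_to_s32, sat_u64_to_u32, hf_overflow_class) are hypothetical, and
the threshold of 65520 is an assumption based on IEEE 754 binary16 with
round-to-nearest-even, where 65504 is the largest finite value and 65520
is the first magnitude that rounds up to infinity. The saturating narrow
is modelled in software here rather than through the qmovndi pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: IEEE 754 binary16, round-to-nearest-even.  65504 is the
   largest finite value; 65520 is halfway to 2^16 and rounds to infinity.  */
#define HF_OVERFLOW_THRESHOLD 65520.0

/* Hypothetical software model of a signed saturating narrow.  */
static int32_t
sat_s64_to_s32 (int64_t x)
{
  if (x > INT32_MAX)
    return INT32_MAX;
  if (x < INT32_MIN)
    return INT32_MIN;
  return (int32_t) x;
}

/* Hypothetical software model of an unsigned saturating narrow.  */
static uint32_t
sat_u64_to_u32 (uint64_t x)
{
  return x > UINT32_MAX ? UINT32_MAX : (uint32_t) x;
}

/* Returns +1 or -1 if D would overflow to +/-infinity in binary16
   (under the assumption above), or 0 if the result stays finite.  */
static int
hf_overflow_class (double d)
{
  if (d >= HF_OVERFLOW_THRESHOLD)
    return 1;
  if (d <= -HF_OVERFLOW_THRESHOLD)
    return -1;
  return 0;
}

/* Returns 1 if saturating X first cannot change the binary16 result:
   either X is unchanged, or both values overflow to the same infinity.  */
static int
signed_saturation_is_safe (int64_t x)
{
  int32_t s = sat_s64_to_s32 (x);
  if ((int64_t) s == x)
    return 1;
  int cx = hf_overflow_class ((double) x);
  int cs = hf_overflow_class ((double) s);
  return cx != 0 && cx == cs;
}

/* Unsigned counterpart of the check above.  */
static int
unsigned_saturation_is_safe (uint64_t x)
{
  uint32_t s = sat_u64_to_u32 (x);
  if ((uint64_t) s == x)
    return 1;
  return hf_overflow_class ((double) x) == 1
	 && hf_overflow_class ((double) s) == 1;
}

/* A few spot checks at the edges of the 32-bit and 64-bit ranges.  */
static void
check_saturation_examples (void)
{
  assert (signed_saturation_is_safe (INT64_MAX));
  assert (signed_saturation_is_safe (INT64_MIN));
  assert (signed_saturation_is_safe ((int64_t) INT32_MAX + 1));
  assert (signed_saturation_is_safe (65504));
  assert (unsigned_saturation_is_safe (UINT64_MAX));
  assert (unsigned_saturation_is_safe ((uint64_t) UINT32_MAX + 1));
}
```

Any input that the saturation changes lies outside the 32-bit range, so
both it and its saturated value are far past the threshold with the same
sign; under the stated assumption, the HFmode result would be the same
infinity in either case.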
--------------2.6.4.2.gae996d8
Content-Type: text/x-patch; name=0001-Patch-AArch64-Add-floatdihf2-and-floatunsdihf2-patte.patch
Content-Disposition: attachment; filename="0001-Patch-AArch64-Add-floatdihf2-and-floatunsdihf2-patte.patch"

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 6afaf90..1882a72 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4630,7 +4630,14 @@
   [(set_attr "type" "f_cvti2f")]
 )
 
-(define_insn "<optab><mode>hf2"
+;; If we do not have ARMv8.2-A 16-bit floating point extensions, the
+;; midend will arrange for an SImode conversion to HFmode to first go
+;; through DFmode, then to HFmode.  But first it will try converting
+;; to DImode then down, which would match our DImode pattern below and
+;; give very poor code-generation.  So, we must provide our own emulation
+;; of the mid-end logic.
+
+(define_insn "aarch64_fp16_<optab><mode>hf2"
   [(set (match_operand:HF 0 "register_operand" "=w")
         (FLOATUORS:HF (match_operand:GPI 1 "register_operand" "r")))]
   "TARGET_FP_F16INST"
@@ -4638,6 +4645,53 @@
   [(set_attr "type" "f_cvti2f")]
 )
 
+(define_expand "<optab>sihf2"
+  [(set (match_operand:HF 0 "register_operand")
+        (FLOATUORS:HF (match_operand:SI 1 "register_operand")))]
+  "TARGET_FLOAT"
+{
+  if (TARGET_FP_F16INST)
+    emit_insn (gen_aarch64_fp16_<optab>sihf2 (operands[0], operands[1]));
+  else
+    {
+      rtx convert_target = gen_reg_rtx (DFmode);
+      emit_insn (gen_<optab>sidf2 (convert_target, operands[1]));
+      emit_insn (gen_truncdfhf2 (operands[0], convert_target));
+    }
+  DONE;
+}
+)
+
+;; For DImode there is no wide enough floating-point mode that we
+;; can convert through natively (TFmode would work, but requires a library
+;; call).  However, we know that any value >= 65504 will be rounded
+;; to infinity on conversion.  This is well within the range of SImode, so
+;; we can:
+;;   Saturate to SImode.
+;;   Convert from that to DFmode
+;;   Convert from that to HFmode (phew!).
+;; Note that the saturation to SImode requires the SIMD extensions.  If
+;; we ever need to provide this pattern where the SIMD extensions are not
+;; available, we would need a different approach.
+
+(define_expand "<optab>dihf2"
+  [(set (match_operand:HF 0 "register_operand")
+        (FLOATUORS:HF (match_operand:DI 1 "register_operand")))]
+  "TARGET_FLOAT && (TARGET_FP_F16INST || TARGET_SIMD)"
+{
+  if (TARGET_FP_F16INST)
+    emit_insn (gen_aarch64_fp16_<optab>dihf2 (operands[0], operands[1]));
+  else
+    {
+      rtx sat_target = gen_reg_rtx (SImode);
+      emit_insn (gen_aarch64_<su_optab>qmovndi (sat_target, operands[1]));
+      emit_insn (gen_<optab>sihf2 (operands[0], sat_target));
+    }
+
+  DONE;
+}
+)
+
 ;; Convert between fixed-point and floating-point (scalar modes)
 
 (define_insn "3"
diff --git a/gcc/testsuite/gcc.target/aarch64/floatdihf2_1.c b/gcc/testsuite/gcc.target/aarch64/floatdihf2_1.c
new file mode 100644
index 0000000..9eaa4ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/floatdihf2_1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Test that conversion from 32-bit and 64-bit integers can be done
+   without a call to the support library.  */
+
+#pragma GCC target ("arch=armv8.2-a+nofp16")
+
+__fp16
+foo (int x)
+{
+  return x;
+}
+
+__fp16
+bar (unsigned int x)
+{
+  return x;
+}
+
+__fp16
+fool (long long x)
+{
+  return x;
+}
+
+__fp16
+barl (unsigned long long x)
+{
+  return x;
+}
+
+
+/* { dg-final { scan-assembler-not "__float\\\[ds\\\]ihf2" } } */
+/* { dg-final { scan-assembler-not "__floatun\\\[ds\\\]ihf2" } } */
--------------2.6.4.2.gae996d8--
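
As a rough illustration of the SImode fallback in the patch above (SImode
to DFmode, then truncdfhf2), the following C sketch checks that 32-bit
integers convert to double without loss, which would make the final
truncation to HFmode the only rounding step. It is not part of the patch:
the function names are hypothetical, and the sketch assumes that double is
IEEE 754 binary64. It does not perform the DFmode to HFmode step itself,
since __fp16 is not available on every host.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the signed 32-bit value X survives a round trip through
   double unchanged.  The volatile forces the value to be stored as a
   real double.  */
static int
s32_exact_in_double (int32_t x)
{
  volatile double d = (double) x;
  return (int64_t) d == (int64_t) x;
}

/* Unsigned counterpart of the check above.  */
static int
u32_exact_in_double (uint32_t x)
{
  volatile double d = (double) x;
  return (uint64_t) d == (uint64_t) x;
}

/* Spot checks at the extremes of the 32-bit ranges, where any loss of
   precision would show up first.  */
static void
check_si_to_df_is_exact (void)
{
  assert (s32_exact_in_double (INT32_MAX));
  assert (s32_exact_in_double (INT32_MIN));
  assert (s32_exact_in_double (65504));
  assert (u32_exact_in_double (UINT32_MAX));
  assert (u32_exact_in_double (0));
}
```

Because binary64 carries a 53-bit significand, every 32-bit integer is
representable exactly, so the intermediate DFmode value in this path holds
the same number as the SImode input.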