From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tamar Christina
To: James Greenhalgh
CC: GCC Patches, nd, Marcus Shawcroft, Richard Earnshaw
Subject: RE: [PATCH][GCC][AArch64] optimize float immediate moves (2/4) - HF/DF/SF mode.
Date: Wed, 21 Jun 2017 10:48:00 -0000
References: <20170614084233.GA15599@arm.com>
In-Reply-To: <20170614084233.GA15599@arm.com>

> >     movi\\t%0.4h, #0
> > -   mov\\t%0.h[0], %w1
> > +   fmov\\t%s0, %w1
>
> Should this not be %h0?

The problem is that H registers are only available in ARMv8.2+. I'm not sure
what to do about ARMv8.1, given your other feedback pointing out that the bit
patterns differ between how the value is stored in S and H registers.

>
> >     umov\\t%w0, %1.h[0]
> >     mov\\t%0.h[0], %1.h[0]
> > +   fmov\\t%s0, %1
>
> Likewise, and much more important for correctness as it changes the way the
> bit pattern ends up in the register (see table C2-1 in release B.a of the ARM
> Architecture Reference Manual for ARMv8-A), here.
>
> > +   * return aarch64_output_scalar_simd_mov_immediate (operands[1],
> > +     SImode);
> >     ldr\\t%h0, %1
> >     str\\t%h1, %0
> >     ldrh\\t%w0, %1
> >     strh\\t%w1, %0
> >     mov\\t%w0, %w1"
> > -  [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
> > -                     f_loads,f_stores,load1,store1,mov_reg")
> > -   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")]
> > +  "&& can_create_pseudo_p ()
> > +   && !aarch64_can_const_movi_rtx_p (operands[1], HFmode)
> > +   && !aarch64_float_const_representable_p (operands[1])
> > +   && aarch64_float_const_rtx_p (operands[1])"
> > +  [(const_int 0)]
> > +  "{
> > +    unsigned HOST_WIDE_INT ival;
> > +    if (!aarch64_reinterpret_float_as_int (operands[1], &ival))
> > +      FAIL;
> > +
> > +    rtx tmp = gen_reg_rtx (SImode);
> > +    aarch64_expand_mov_immediate (tmp, GEN_INT (ival));
> > +    tmp = simplify_gen_subreg (HImode, tmp, SImode, 0);
> > +    emit_move_insn (operands[0], gen_lowpart (HFmode, tmp));
> > +    DONE;
> > +  }"
> > +  [(set_attr "type" "neon_move,f_mcr,neon_to_gp,neon_move,fconsts,\
> > +                     neon_move,f_loads,f_stores,load1,store1,mov_reg")
> > +   (set_attr "simd" "yes,*,yes,yes,*,yes,*,*,*,*,*")]
> >  )
>
> Thanks,
> James
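
For reference, here is a small standalone sketch of why the integer image of
a half-precision constant is not interchangeable with the single-precision
one. It is not code from the patch: float_bits and half_bits are hypothetical
helpers written only for this illustration, and half_bits assumes the value
lies in the normal binary16 range and simply truncates the mantissa. It says
nothing about which fmov form is right, only that the two encodings of the
same value differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the IEEE binary32 bit pattern of F.  */
static uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Hypothetical helper: return the IEEE binary16 bit pattern for F.
   Assumption: F is within the normal half-precision range; extra
   mantissa bits are truncated, and zero, subnormals, infinities and
   NaNs are not handled.  */
static uint16_t
half_bits (float f)
{
  uint32_t bits = float_bits (f);
  uint32_t sign = bits >> 31;
  /* Rebias the exponent from binary32 (127) to binary16 (15).  */
  int32_t exp = (int32_t) ((bits >> 23) & 0xff) - 127 + 15;
  /* Keep the top 10 of the 23 mantissa bits.  */
  uint32_t mant = (bits >> 13) & 0x3ff;

  assert (exp > 0 && exp < 31);
  return (uint16_t) ((sign << 15) | ((uint32_t) exp << 10) | mant);
}
```

With these helpers, float_bits (1.0f) is 0x3F800000 while half_bits (1.0f) is
0x3C00, and the low 16 bits of the single-precision pattern are 0x0000, so
the HFmode constant has to be materialised from its own 16-bit image rather
than from a slice of the SFmode one.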