Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Date: Fri, 29 Sep 2017 23:01:00 -0000
From: James Greenhalgh
To: Michael Collison
CC: Richard Sandiford, Richard Biener, Richard Kenner, GCC Patches, nd, Andrew Pinski
Subject: Re: [PATCH] [Aarch64] Optimize subtract in shift counts
Message-ID: <20170929230123.GB22958@arm.com>
References: <87r2w4cb6w.fsf@linaro.org> <87mv6sc98y.fsf@linaro.org> <87bmmot8wu.fsf@linaro.org> <8760cvu5to.fsf@linaro.org> <87h8wfspy2.fsf@linaro.org>
X-SW-Source: 2017-09/txt/msg02016.txt.bz2

On Fri, Sep 15,
2017 at 09:15:17AM +0100, Michael Collison wrote:
> Patch updated with comments to simplify pattern from Richard Sandiford. Okay for trunk?

OK.

Thanks,
James

>
> 2017-09-14 Michael Collison
>
>     * config/aarch64/aarch64.md (*aarch64_reg__minus3):
>     New pattern.
>
> 2017-09-14 Michael Collison
>
>     * gcc.target/aarch64/var_shift_mask_2.c: New test.
>
> Richard Sandiford writes:
> > Richard Sandiford writes:
> >> Michael Collison writes:
> >>> Richard,
> >>>
> >>> The problem with this approach for Aarch64 is that
> >>> TARGET_SHIFT_TRUNCATION_MASK is based on SHIFT_COUNT_TRUNCATED which
> >>> is normally 0 as it is based on the TARGET_SIMD flag.
> >>
> >> Maybe I'm wrong, but that seems like a missed optimisation in itself.
> >
> > Sorry to follow up on myself yet again, but I'd forgotten this was because we allow the SIMD unit to do scalar shifts. So I guess we have no choice, even though it seems unfortunate.
>
> > +(define_insn_and_split "*aarch64_reg__minus3"
> > +  [(set (match_operand:GPI 0 "register_operand" "=&r")
> > +        (ASHIFT:GPI
> > +          (match_operand:GPI 1 "register_operand" "r")
> > +          (minus:QI (match_operand 2 "const_int_operand" "n")
> > +                    (match_operand:QI 3 "register_operand" "r"))))]
> > +  "INTVAL (operands[2]) == GET_MODE_BITSIZE (mode)"
> > +  "#"
> > +  "&& true"
> > +  [(const_int 0)]
> > +  {
> > +    /* Handle cases where operand 3 is a plain QI register, or
> > +       a subreg with either a SImode or DImode register.  */
> > +
> > +    rtx subreg_tmp = (REG_P (operands[3])
> > +                      ? gen_lowpart_SUBREG (SImode, operands[3])
> > +                      : SUBREG_REG (operands[3]));
> > +
> > +    if (REG_P (subreg_tmp) && GET_MODE (subreg_tmp) == DImode)
> > +      subreg_tmp = gen_lowpart_SUBREG (SImode, subreg_tmp);
>
> I think this all simplifies to:
>
>   rtx subreg_tmp = gen_lowpart (SImode, operands[3]);
>
> (or it would be worth having a comment that explains why not).
> As well as being shorter, it will properly simplify hard REGs to new hard REGs.
>
> > +    rtx tmp = (can_create_pseudo_p () ? gen_reg_rtx (SImode)
> > +               : operands[0]);
> > +
> > +    if (mode == DImode && !can_create_pseudo_p ())
> > +      tmp = gen_lowpart_SUBREG (SImode, operands[0]);
>
> I think this too would be simpler with gen_lowpart:
>
>   rtx tmp = (can_create_pseudo_p () ? gen_reg_rtx (SImode)
>              : gen_lowpart (SImode, operands[0]));
>
> > +
> > +    emit_insn (gen_negsi2 (tmp, subreg_tmp));
> > +
> > +    rtx and_op = gen_rtx_AND (SImode, tmp,
> > +                              GEN_INT (GET_MODE_BITSIZE (mode) - 1));
> > +
> > +    rtx subreg_tmp2 = gen_lowpart_SUBREG (QImode, and_op);
> > +
> > +    emit_insn (gen_3 (operands[0], operands[1], subreg_tmp2));
> > +    DONE;
> > +  }
> > +)
>
> The pattern should probably set the "length" attribute to 8.
>
> Looks good to me with those changes FWIW.
>
> Thanks,
> Richard