From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 23 Sep 2022 12:43:18 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com
Subject: [PATCH 2/2]AArch64 Perform more late folding of reg moves and shifts which arrive after expand
Message-ID:
Content-Type: multipart/mixed; boundary="F7rQHXCg37B8tZx1"
Content-Disposition: inline
In-Reply-To:
MIME-Version: 1.0

--F7rQHXCg37B8tZx1
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Hi All,

Similar to patch 1/2, but this one adds back-end specific folding for the
case where the register move and shift sequence is only created as a result
of RTL optimizations, i.e. after expand.

Concretely:

#include <arm_neon.h>

unsigned int foor (uint32x4_t x)
{
    return x[1] >> 16;
}

generates:

foor:
	umov	w0, v0.h[3]
	ret

instead of

foor:
	umov	w0, v0.s[1]
	lsr	w0, w0, 16
	ret

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*<optab>si3_insn_uxtw): Split SHIFT into
	left and right ones.
	* config/aarch64/constraints.md (Usl): New.
	* config/aarch64/iterators.md (SHIFT_arith, LSHIFTRT): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/shift-read.c: New test.

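For reviewers who want to check the lane arithmetic, here is a minimal
sketch in portable C (no NEON needed). It models the vector as four 32-bit
words with little-endian lane numbering and compares the shift form against
a single element read. The helper names (lane_for_shift, read_lane,
shift_read, lane_read) are hypothetical and not part of the patch. The
`shift / size` step mirrors the `val / size` computation in alternative 0 of
the new insn; the `word * (32 / size)` offset is my own assumption, added
only to reproduce the `v0.h[3]` index expected for `x[1] >> 16`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: for a logical right shift of a 32-bit value by SHIFT,
   the surviving element is 32 - SHIFT bits wide and sits at element index
   SHIFT / (32 - SHIFT) within that 32-bit value.  Only the shifts accepted
   by the Usl constraint (16 and 24) are valid here.  */
static unsigned
lane_for_shift (unsigned shift)
{
  unsigned size = 32 - shift;
  assert (size == 8 || size == 16);
  return shift / size;
}

/* Hypothetical helper: read element LANE of width SIZE bits from a 128-bit
   vector modelled as four 32-bit words, using little-endian lane numbering.
   This is done arithmetically, so it does not depend on host byte order.  */
static uint32_t
read_lane (const uint32_t v[4], unsigned size, unsigned lane)
{
  unsigned per_word = 32 / size;
  uint32_t word = v[lane / per_word];
  unsigned bit = (lane % per_word) * size;
  uint32_t mask = (size == 32) ? 0xffffffffu : ((1u << size) - 1);
  return (word >> bit) & mask;
}

/* The form written in the C source: v[word] >> shift.  */
static uint32_t
shift_read (const uint32_t v[4], unsigned word, unsigned shift)
{
  return v[word] >> shift;
}

/* The folded form: one narrower element read, no shift afterwards.  */
static uint32_t
lane_read (const uint32_t v[4], unsigned word, unsigned shift)
{
  unsigned size = 32 - shift;
  unsigned lane = word * (32 / size) + lane_for_shift (shift);
  return read_lane (v, size, lane);
}
```

With this model, word 1 shifted by 16 maps to halfword lane 3 and word 3
shifted by 16 maps to halfword lane 7, which are the indices the foor and
foor2 tests expect. A shift by 24 maps to the top byte lane of the word.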
--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c333fb1f72725992bb304c560f1245a242d5192d..6aa1fb4be003f2027d63ac69fd314c2bbc876258 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5493,7 +5493,7 @@ (define_insn "*rol<mode>3_insn"
 ;; zero_extend version of shifts
 (define_insn "*<optab>si3_insn_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r,r")
-	(zero_extend:DI (SHIFT_no_rotate:SI
+	(zero_extend:DI (SHIFT_arith:SI
 	 (match_operand:SI 1 "register_operand" "r,r")
 	 (match_operand:QI 2 "aarch64_reg_or_shift_imm_si" "Uss,r"))))]
   ""
@@ -5528,6 +5528,60 @@ (define_insn "*rolsi3_insn_uxtw"
   [(set_attr "type" "rotate_imm")]
 )
 
+(define_insn "*<optab>si3_insn2_uxtw"
+  [(set (match_operand:DI 0 "register_operand" "=r,?r,r")
+	(zero_extend:DI (LSHIFTRT:SI
+	 (match_operand:SI 1 "register_operand" "w,r,r")
+	 (match_operand:QI 2 "aarch64_reg_or_shift_imm_si" "Usl,Uss,r"))))]
+  ""
+  {
+    switch (which_alternative)
+    {
+      case 0:
+	{
+	  machine_mode dest, vec_mode;
+	  int val = INTVAL (operands[2]);
+	  int size = 32 - val;
+	  if (size == 16)
+	    dest = HImode;
+	  else if (size == 8)
+	    dest = QImode;
+	  else
+	    gcc_unreachable ();
+
+	  /* Get nearest 64-bit vector mode.  */
+	  int nunits = 64 / size;
+	  auto vector_mode
+	    = mode_for_vector (as_a <scalar_mode> (dest), nunits);
+	  if (!vector_mode.exists (&vec_mode))
+	    gcc_unreachable ();
+	  operands[1] = gen_rtx_REG (vec_mode, REGNO (operands[1]));
+	  operands[2] = gen_int_mode (val / size, SImode);
+
+	  /* Ideally we just call aarch64_get_lane_zero_extend but reload gets
+	     into a weird loop due to a mov of w -> r being present most time
+	     this instruction applies.  */
+	  switch (dest)
+	    {
+	      case QImode:
+		return "umov\\t%w0, %1.b[%2]";
+	      case HImode:
+		return "umov\\t%w0, %1.h[%2]";
+	      default:
+		gcc_unreachable ();
+	    }
+	}
+      case 1:
+	return "<shift>\\t%w0, %w1, %2";
+      case 2:
+	return "<shift>\\t%w0, %w1, %w2";
+      default:
+	gcc_unreachable ();
+    }
+  }
+  [(set_attr "type" "neon_to_gp,bfx,shift_reg")]
+)
+
 (define_insn "*<ASHIFT:optab><SHORT:mode>3_insn"
   [(set (match_operand:SHORT 0 "register_operand" "=r")
 	(ASHIFT:SHORT (match_operand:SHORT 1 "register_operand" "r")
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index ee7587cca1673208e2bfd6b503a21d0c8b69bf75..470510d691ee8589aec9b0a71034677534641bea 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -166,6 +166,14 @@ (define_constraint "Uss"
   (and (match_code "const_int")
        (match_test "(unsigned HOST_WIDE_INT) ival < 32")))
 
+(define_constraint "Usl"
+  "@internal
+  A constraint that matches an immediate shift constant in SImode that has an
+  exact mode available to use."
+  (and (match_code "const_int")
+       (and (match_test "satisfies_constraint_Uss (op)")
+	    (match_test "(32 - ival == 8) || (32 - ival == 16)"))))
+
 (define_constraint "Usn"
  "A constant that can be used with a CCMN operation (once negated)."
  (and (match_code "const_int")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index e904407b2169e589b7007ff966b2d9347a6d0fd2..bf16207225e3a4f1f20ed6f54321bccbbf15d73f 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -2149,8 +2149,11 @@ (define_mode_attr sve_lane_pair_con [(VNx8HF "y") (VNx4SF "x")])
 ;; This code iterator allows the various shifts supported on the core
 (define_code_iterator SHIFT [ashift ashiftrt lshiftrt rotatert rotate])
 
-;; This code iterator allows all shifts except for rotates.
-(define_code_iterator SHIFT_no_rotate [ashift ashiftrt lshiftrt])
+;; This code iterator allows arithmetic shifts
+(define_code_iterator SHIFT_arith [ashift ashiftrt])
+
+;; Singleton code iterator for only logical right shift.
+(define_code_iterator LSHIFTRT [lshiftrt])
 
 ;; This code iterator allows the shifts supported in arithmetic instructions
 (define_code_iterator ASHIFT [ashift ashiftrt lshiftrt])
diff --git a/gcc/testsuite/gcc.target/aarch64/shift-read.c b/gcc/testsuite/gcc.target/aarch64/shift-read.c
new file mode 100644
index 0000000000000000000000000000000000000000..e6e355224c96344fe1cdabd6b0d3d5d609cd95bd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/shift-read.c
@@ -0,0 +1,85 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+#include <arm_neon.h>
+
+/*
+** foor:
+**	umov	w0, v0.h\[3\]
+**	ret
+*/
+unsigned int foor (uint32x4_t x)
+{
+  return x[1] >> 16;
+}
+
+/*
+** fool:
+**	umov	w0, v0.s\[1\]
+**	lsl	w0, w0, 16
+**	ret
+*/
+unsigned int fool (uint32x4_t x)
+{
+  return x[1] << 16;
+}
+
+/*
+** foor2:
+**	umov	w0, v0.h\[7\]
+**	ret
+*/
+unsigned short foor2 (uint32x4_t x)
+{
+  return x[3] >> 16;
+}
+
+/*
+** fool2:
+**	fmov	w0, s0
+**	lsl	w0, w0, 16
+**	ret
+*/
+unsigned int fool2 (uint32x4_t x)
+{
+  return x[0] << 16;
+}
+
+typedef int v4si __attribute__ ((vector_size (16)));
+
+/*
+** bar:
+**	addv	s0, v0.4s
+**	fmov	w0, s0
+**	lsr	w1, w0, 16
+**	add	w0, w1, w0, uxth
+**	ret
+*/
+int bar (v4si x)
+{
+  unsigned int sum = vaddvq_s32 (x);
+  return (((uint16_t)(sum & 0xffff)) + ((uint32_t)sum >> 16));
+}
+
+/*
+** foo:
+**	lsr	w0, w0, 16
+**	ret
+*/
+unsigned short foo (unsigned x)
+{
+  return x >> 16;
+}
+
+/*
+** foo2:
+**	...
+**	umov	w0, v[0-8]+.h\[1\]
+**	ret
+*/
+unsigned short foo2 (v4si x)
+{
+  int y = x[0] + x[1];
+  return y >> 16;
+}

-- 

--F7rQHXCg37B8tZx1--