From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 2 Aug 2023 11:25:12 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com
Subject: [PATCH]AArch64 Undo vec_widen_shiftl optabs [PR106346]
Content-Type: multipart/mixed; boundary="erTk7pc3PMhJhcBE"
Content-Disposition: inline
MIME-Version: 1.0

--erTk7pc3PMhJhcBE
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Hi All,

In GCC 11 we implemented the vectorizer optab for widening left shifts;
however, this optab is only supported for uniform shift constants.

At the moment GCC still has two loop vectorization strategies (classical loop
and SLP-based loop vectorization) and the optab is implemented as a scalar
pattern.

This means that when we apply it to a non-uniform constant inside a loop we
only find out during SLP build that the constants aren't uniform.  At this
point it's too late and we lose SLP entirely.

Over the years I've tried various options, but none of them works well:

1. Dissolving patterns during SLP build (problematic, also dissolves them for
   non-SLP).
2. Optionally ignoring patterns for SLP build (problematic, ends up
   interfering with relevancy detection).
3. Relaxing the constraint on SLP build to allow non-constant values and
   dissolving them after SLP build using an SLP pattern (problematic, ends up
   breaking shift reassociation).

As a result we've concluded that for now this pattern should just be removed
and formed during RTL.

The plan is to move this to an SLP-only pattern once we remove classical loop
vectorization support from GCC, at which time we can also properly support
SVE's Top and Bottom variants.

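To make the uniform versus non-uniform distinction above concrete, here is a
minimal scalar sketch in C.  It is only an illustration and not part of the
patch: shifts_are_uniform and widen_shl_lane are hypothetical helper names,
not GCC functions, and the 16-to-32-bit widening is an assumption chosen to
match the int16_t/int32_t loops in the new test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): returns true when every lane
   of a group shifts by the same constant, i.e. the shift is "uniform".  */
static bool
shifts_are_uniform (const unsigned *shifts, size_t n)
{
  for (size_t i = 1; i < n; i++)
    if (shifts[i] != shifts[0])
      return false;
  return true;
}

/* Scalar model of one lane of a widening left shift: sign-extend a 16-bit
   element to 32 bits, then shift left.  The shift is performed on an
   unsigned value so that negative inputs do not trigger undefined
   behaviour; the result is the raw 32-bit lane pattern.  */
static uint32_t
widen_shl_lane (int16_t x, unsigned shift)
{
  assert (shift < 32);
  return (uint32_t) (int32_t) x << shift;
}
```

With these helpers, a group of shift amounts such as {16, 16, 16, 16} counts
as uniform, while {16, 15, 14, 17} does not; the latter is the kind of group
that is only discovered to be non-uniform during SLP build.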
This removes the optab and reworks the RTL to recognize both the vector
variant and the intrinsics variant.  It also simplifies all of these patterns.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR target/106346
	* config/aarch64/aarch64-simd.md (vec_widen_shiftl_lo_,
	vec_widen_shiftl_hi_): Remove.
	(aarch64_shll_internal): Renamed to...
	(aarch64_shll): .. This.
	(aarch64_shll2_internal): Renamed to...
	(aarch64_shll2): .. This.
	(aarch64_shll_n, aarch64_shll2_n): Re-use new optabs.
	* config/aarch64/constraints.md (D2, D3): New.
	* config/aarch64/predicates.md (aarch64_simd_shift_imm_vec): New.

gcc/testsuite/ChangeLog:

	PR target/106346
	* gcc.target/aarch64/pr98772.c: Adjust assembly.
	* gcc.target/aarch64/vect-widen-shift.c: New test.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index d95394101470446e55f25a2397dd112239b6a54d..afd5b8632afbcddf8dad14495c3446c560eb085d 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -6387,105 +6387,66 @@ (define_insn "aarch64_qshl"
   [(set_attr "type" "neon_sat_shift_reg")]
 )
 
-(define_expand "vec_widen_shiftl_lo_"
-  [(set (match_operand: 0 "register_operand" "=w")
-        (unspec: [(match_operand:VQW 1 "register_operand" "w")
-                  (match_operand:SI 2
-                    "aarch64_simd_shift_imm_bitsize_" "i")]
-                 VSHLL))]
-  "TARGET_SIMD"
-  {
-    rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
-    emit_insn (gen_aarch64_shll_internal (operands[0], operands[1],
-                                          p, operands[2]));
-    DONE;
-  }
-)
-
-(define_expand "vec_widen_shiftl_hi_"
-  [(set (match_operand: 0 "register_operand")
-        (unspec: [(match_operand:VQW 1 "register_operand" "w")
-                  (match_operand:SI 2
-                    "immediate_operand" "i")]
-                 VSHLL))]
-  "TARGET_SIMD"
-  {
-    rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
-    emit_insn (gen_aarch64_shll2_internal (operands[0], operands[1],
-                                           p, operands[2]));
-    DONE;
-  }
-)
-
 ;; vshll_n
 
-(define_insn "aarch64_shll_internal"
-  [(set (match_operand: 0 "register_operand" "=w")
-        (unspec: [(vec_select:
-                    (match_operand:VQW 1 "register_operand" "w")
-                    (match_operand:VQW 2 "vect_par_cnst_lo_half" ""))
-                  (match_operand:SI 3
-                    "aarch64_simd_shift_imm_bitsize_" "i")]
-                 VSHLL))]
+(define_insn "aarch64_shll"
+  [(set (match_operand: 0 "register_operand")
+        (ashift: (ANY_EXTEND:
+                   (match_operand:VD_BHSI 1 "register_operand"))
+                 (match_operand: 2
+                   "aarch64_simd_shift_imm_vec")))]
   "TARGET_SIMD"
-  {
-    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
-      return "shll\\t%0., %1., %3";
-    else
-      return "shll\\t%0., %1., %3";
+  {@ [cons: =0, 1, 2]
+     [w, w, D2] shll\t%0., %1., %I2
+     [w, w, D3] shll\t%0., %1., %I2
   }
   [(set_attr "type" "neon_shift_imm_long")]
 )
 
-(define_insn "aarch64_shll2_internal"
-  [(set (match_operand: 0 "register_operand" "=w")
-        (unspec: [(vec_select:
-                    (match_operand:VQW 1 "register_operand" "w")
-                    (match_operand:VQW 2 "vect_par_cnst_hi_half" ""))
-                  (match_operand:SI 3
-                    "aarch64_simd_shift_imm_bitsize_" "i")]
+(define_expand "aarch64_shll_n"
+  [(set (match_operand: 0 "register_operand")
+        (unspec: [(match_operand:VD_BHSI 1 "register_operand")
+                  (match_operand:SI 2
+                    "aarch64_simd_shift_imm_bitsize_")]
                  VSHLL))]
   "TARGET_SIMD"
   {
-    if (INTVAL (operands[3]) == GET_MODE_UNIT_BITSIZE (mode))
-      return "shll2\\t%0., %1., %3";
-    else
-      return "shll2\\t%0., %1., %3";
+    rtx shft = gen_const_vec_duplicate (mode, operands[2]);
+    emit_insn (gen_aarch64_shll (operands[0], operands[1], shft));
+    DONE;
   }
-  [(set_attr "type" "neon_shift_imm_long")]
 )
 
-(define_insn "aarch64_shll_n"
-  [(set (match_operand: 0 "register_operand" "=w")
-        (unspec: [(match_operand:VD_BHSI 1 "register_operand" "w")
-                  (match_operand:SI 2
-                    "aarch64_simd_shift_imm_bitsize_" "i")]
-                 VSHLL))]
+;; vshll_high_n
+
+(define_insn "aarch64_shll2"
+  [(set (match_operand: 0 "register_operand")
+        (ashift: (ANY_EXTEND:
+                   (vec_select:
+                     (match_operand:VQW 1 "register_operand")
+                     (match_operand:VQW 2 "vect_par_cnst_hi_half")))
+                 (match_operand: 3
+                   "aarch64_simd_shift_imm_vec")))]
   "TARGET_SIMD"
-  {
-    if (INTVAL (operands[2]) == GET_MODE_UNIT_BITSIZE (mode))
-      return "shll\\t%0., %1., %2";
-    else
-      return "shll\\t%0., %1., %2";
+  {@ [cons: =0, 1, 2, 3]
+     [w, w, , D2] shll2\t%0., %1., %I3
+     [w, w, , D3] shll2\t%0., %1., %I3
   }
   [(set_attr "type" "neon_shift_imm_long")]
 )
 
-;; vshll_high_n
-
-(define_insn "aarch64_shll2_n"
+(define_expand "aarch64_shll2_n"
   [(set (match_operand: 0 "register_operand" "=w")
         (unspec: [(match_operand:VQW 1 "register_operand" "w")
                   (match_operand:SI 2 "immediate_operand" "i")]
-                         VSHLL))]
+                 VSHLL))]
   "TARGET_SIMD"
   {
-    if (INTVAL (operands[2]) == GET_MODE_UNIT_BITSIZE (mode))
-      return "shll2\\t%0., %1., %2";
-    else
-      return "shll2\\t%0., %1., %2";
+    rtx shft = gen_const_vec_duplicate (mode, operands[2]);
+    rtx p = aarch64_simd_vect_par_cnst_half (mode, , true);
+    emit_insn (gen_aarch64_shll2 (operands[0], operands[1], p, shft));
+    DONE;
   }
-  [(set_attr "type" "neon_shift_imm_long")]
 )
 
 ;; vrshr_n
diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md
index 6df1dbec2a8097abe9783ed1670c77a8fad4ca57..07613013622deb8797255cbf10c265080066565b 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -468,6 +468,22 @@ (define_constraint "D1"
                  GET_MODE_UNIT_BITSIZE (mode) - 1,
                  GET_MODE_UNIT_BITSIZE (mode) - 1)")))
 
+(define_constraint "D2"
+  "@internal
+ A constraint that matches vector of immediates that is bits(mode)/2."
+ (and (match_code "const,const_vector")
+      (match_test "aarch64_const_vec_all_same_in_range_p (op,
+                        GET_MODE_UNIT_BITSIZE (mode) / 2,
+                        GET_MODE_UNIT_BITSIZE (mode) / 2)")))
+
+(define_constraint "D3"
+  "@internal
+ A constraint that matches vector of immediates that is with 0 to
+ (bits(mode)/2)-1."
+ (and (match_code "const,const_vector")
+      (match_test "aarch64_const_vec_all_same_in_range_p (op, 0,
+                        (GET_MODE_UNIT_BITSIZE (mode) / 2) - 1)")))
+
 (define_constraint "Dr"
   "@internal
  A constraint that matches vector of immediates for right shifts."
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index d5a4a1cd9bf8cde8e779de6e0afa531f04892a7b..2b50e39aa8651b415443b5fd90f187c83d5591d5 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -638,6 +638,11 @@ (define_predicate "aarch64_simd_raddsubhn_imm_vec"
                                                  HOST_WIDE_INT_1U
                                                  << (GET_MODE_UNIT_BITSIZE (mode) / 2 - 1))")))
 
+(define_predicate "aarch64_simd_shift_imm_vec"
+  (and (match_code "const_vector")
+       (match_test "aarch64_const_vec_all_same_in_range_p (op, 0,
+                        GET_MODE_UNIT_BITSIZE (mode) / 2)")))
+
 (define_predicate "aarch64_simd_shift_imm_bitsize_qi"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 8)")))
diff --git a/gcc/testsuite/gcc.target/aarch64/pr98772.c b/gcc/testsuite/gcc.target/aarch64/pr98772.c
index 8259251a7c0b64ae8362ea29ec3cf1d2a9d63547..52ad012dcfe72721b8c987bb826c0ffb8ba3f31e 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr98772.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr98772.c
@@ -155,4 +155,4 @@ int main ()
 /* { dg-final { scan-assembler-times "uaddl\\tv" 2 } } */
 /* { dg-final { scan-assembler-times "usubl\\tv" 2 } } */
 /* { dg-final { scan-assembler-times "umull\\tv" 2 } } */
-/* { dg-final { scan-assembler-times "shl\\tv" 2 } } */
+/* { dg-final { scan-assembler-times "shll\\tv" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-shift.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-shift.c
new file mode 100644
index 0000000000000000000000000000000000000000..6ee41f63ef8a145c0eb7f213950e7501e058b2fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-shift.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -save-temps" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+#include
+#include
+
+#pragma GCC target "+nosve"
+
+#define ARR_SIZE 1024
+
+/* Should produce an shll,shll2 pair*/
+/*
+** sshll_opt1:
+** ...
+** shll v[0-9]+.4s, v[0-9]+.4h, 16
+** shll2 v[0-9]+.4s, v[0-9]+.8h, 16
+** ...
+*/
+void sshll_opt1 (int32_t *foo, int16_t *a, int16_t *b)
+{
+  for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+      foo[i] = a[i] << 16;
+      foo[i+1] = a[i+1] << 16;
+      foo[i+2] = a[i+2] << 16;
+      foo[i+3] = a[i+3] << 16;
+    }
+}
+
+/*
+** sshll_opt2:
+** ...
+** sxtl v[0-9]+.4s, v[0-9]+.4h
+** sxtl2 v[0-9]+.4s, v[0-9]+.8h
+** sshl v[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s
+** sshl v[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s
+** ...
+*/
+void sshll_opt2 (int32_t *foo, int16_t *a, int16_t *b)
+{
+  for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+      foo[i] = a[i] << 16;
+      foo[i+1] = a[i+1] << 15;
+      foo[i+2] = a[i+2] << 14;
+      foo[i+3] = a[i+3] << 17;
+    }
+}
+
+
-- 

--erTk7pc3PMhJhcBE--