From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR04-VI1-obe.outbound.protection.outlook.com (mail-vi1eur04on2087.outbound.protection.outlook.com [40.107.8.87]) by sourceware.org (Postfix) with ESMTPS id 89A443858D20 for ; Wed, 15 Nov 2023 14:42:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 89A443858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 89A443858D20 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.8.87 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700059323; cv=pass; b=oW0V9CS8pq6phoe+dDe+QoLgUojjTctaWYHN9zQiJ+UwYL05T38EiUXsnERu+NPdZH5TFymHeiyrq+RgNzgUICFrQkaRQnxyyQ3LhzQXBYSrYVWhQcnEGDltOPoNLqVg1FY3ug8/IlzoJp3sNGZ/axOFTAR68IYF7BLHAUp9Cwo= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700059323; c=relaxed/simple; bh=Z7t4h8bqNOY/9daLhfGv+Jdr7FxWKarULSSi3pZmaBQ=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=PRhKTUvtPgJpZ4aUEwkH+iF6tQcMkYu3rrQ0xmW/zWatTdjOXa0P904KuJ2zUOMKNc1Q9pR1LSCkEvqaFDnWrmDmWboaSgU88iFArdGbsHKVjUKAhYsuISob2gYsYNzqJptV7dgP8sr3sPaLnDF6QF7Rmqhwr1hJv+B3k1R8MqU= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=nihQVryXg95KMXhYf1V+lIBjcSjPcpwWty8D1R1PydpLj/wbdgU4wF/9tUxP/qPjFKY9SnuLoJM/sFzEgcHKNXKu4xFYn14xaSQ3etGKptEFlmp3qaKG7Gk9LXylq0/Fcwo7CzUdxOOvXF+NY8shRK68NyiJ8XzdDM/zE+2Wef7+yaweCIOjXsnNao7fbUX54x3zQgvFzofEAeoa75kIPH4Cw4WyR5XzJluDAEiyw5Y91HgAXEnd6iPMvIat4h1IAP441Jgc0lnyrnkHnFm7gBHDTgCypa7OiwR9nivowXgO3/6ibhyFLCV3Oq89W/J2h2IdNzb3jxNK65KK87khRA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=cK+trrXulGGrGdcheXqE4n98SFh+JxcGgL3kCpxP21E=; b=cFvaDh3b3FbSdYojhN2+dikqoGQNdIN/TjYsns4UbPdpXUeSkFbTtlCMlchoVYSG8utiDezowH/9PTkxBMzeVTcxcopcio+Uaya3iVPKfDN0ONa+vP7Zmp7Dy3RyGZr0y3ZnhiTZ7ZaqZy5I2ItxEY2eID5bJvOMxzGLclvCLyvIXcz0aoET4ykt2eQBWcoLDKy2z2a5TZKi+QGs276Az4H00hDlLKfOCttoyHj+vJ/KmAexJU0lN9YbhoB6Vl+qbTwD04Ij3uGc8ZJkU+/G29UVTeTIlMsMilUmLbyrlW9r4n9ofihD0A+0D1Piz+3waCDbR+udzZzvcFYG07ZgWg== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cK+trrXulGGrGdcheXqE4n98SFh+JxcGgL3kCpxP21E=; b=M6ph/2rpeUoHX+51EdTyjZgZapXxksykLV9pk/SdOzesbXqt2Rg50UjegpoHy3+EhiBAWHG3jVQlIWSRfYI4FWSk+PVjz7mvtQDH4QQMlC4uNUxZ4868nXSKcbJvpu01guYNFuaRPbubSa04JLGIyzbuwuYMjSEAkYYaYMIaW54= Received: from DB9PR02CA0004.eurprd02.prod.outlook.com (2603:10a6:10:1d9::9) by AM0PR08MB5297.eurprd08.prod.outlook.com (2603:10a6:208:18a::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.18; Wed, 15 Nov 2023 14:41:55 +0000 Received: from DB5PEPF00014B8F.eurprd02.prod.outlook.com (2603:10a6:10:1d9:cafe::11) by DB9PR02CA0004.outlook.office365.com (2603:10a6:10:1d9::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6977.31 via Frontend Transport; Wed, 15 Nov 2023 14:41:55 +0000 
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5PEPF00014B8F.mail.protection.outlook.com (10.167.8.203) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.20 via Frontend Transport; Wed, 15 Nov 2023 14:41:55 +0000 Received: ("Tessian outbound e243565b0037:v228"); Wed, 15 Nov 2023 14:41:54 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: e4a5e4436bad21be X-CR-MTA-TID: 64aa7808 Received: from 371ab3990aef.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 955DCCBF-AB87-46E6-9EFE-A793A8EB9B3A.1; Wed, 15 Nov 2023 14:41:48 +0000 Received: from EUR02-AM0-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 371ab3990aef.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Wed, 15 Nov 2023 14:41:48 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=QCoKRZoyVo7jc4gXoovRvvGicv5EuXzhuB4NgLUEnYzAnreAkvqFD/CkXnbSo3JyX/bk/ftIYDLUOztt5oRJs7je3Z8P+zLAeW9sJIr0hH9xYmVKHWnIyywmN7RfXYmvMCk1ELqcFfgu3prNbLA++gyeF9yvfrSfQHs+Ve8rXfcxXrrZjJkxBm8nnA0DlKwM7foJn94Ak36R0p+gHilXJnXuaKamhhihYslKOfT+ra6qKw8tJEtA0jrXMIYOsf9I+gcAFo1i1d0dpUZnwQPp4sV38f4gPGOoi9/fUGWte3cFyYj/Ssm9N5WX6D9ir7h+mmeuSUcpssKjeBpnmqDrqQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=cK+trrXulGGrGdcheXqE4n98SFh+JxcGgL3kCpxP21E=; b=SWb3uM7amO7L4uXWRzOSzi3LARCi350GDq7arGCm6DmvvXUieFeNvlbZtaELXgWkDpLFNVhAzy390L6C29aX6m1/nhw04JJuu93fN3OlxoR33PnlZ4b0jpJAE2ZHFkAEIrP4DKVYVseCmbpFbynLht7ZIrbbF31f3dD8Q0Ls0za5uS3PE0q0TCmre6YPswRxa7BBLZWaOc4/mMey4RhjUq3wLdzL71Fi7JqPCbXZlU9PNs0a2fRQUEXx92X8T46mFsGKmV3ftpknueS65uDMunyO0ywYlEkTe12c49Cp+PcTIiprCwCX2b3yFifkyoCO2QnVvY5tGEVU8T2HqvSZwQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cK+trrXulGGrGdcheXqE4n98SFh+JxcGgL3kCpxP21E=; b=M6ph/2rpeUoHX+51EdTyjZgZapXxksykLV9pk/SdOzesbXqt2Rg50UjegpoHy3+EhiBAWHG3jVQlIWSRfYI4FWSk+PVjz7mvtQDH4QQMlC4uNUxZ4868nXSKcbJvpu01guYNFuaRPbubSa04JLGIyzbuwuYMjSEAkYYaYMIaW54= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by DU0PR08MB9130.eurprd08.prod.outlook.com (2603:10a6:10:474::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6977.31; Wed, 15 Nov 2023 14:41:45 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::9679:2ab0:99c6:54a3]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::9679:2ab0:99c6:54a3%6]) with mapi id 15.20.6977.029; Wed, 15 Nov 2023 14:41:45 +0000 Date: Wed, 15 Nov 2023 14:41:42 +0000 From: Tamar Christina To: gcc-patches@gcc.gnu.org Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com Subject: [PATCH]AArch64 Add pattern for unsigned widenings (uxtl) to zip{1,2} Message-ID: Content-Type: multipart/mixed; boundary="VsyUTVqOrULknLmi" Content-Disposition: inline 
X-ClientProxiedBy: LO4P265CA0055.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:2af::11) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|DU0PR08MB9130:EE_|DB5PEPF00014B8F:EE_|AM0PR08MB5297:EE_ X-MS-Office365-Filtering-Correlation-Id: 59cd16b3-d88d-4914-5146-08dbe5e9036e x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: krAvuxG3nJ+ItKDUWZ16SkfJIGfMD2PBHz+62t0kk2EuzSJEuj1pY2+RW6oFA2v9ukqhrjtAHELHDXDIwyPkzJDrIre5uUoZzLvpO6W19EAe1YDHhADNFMDUnzO8TjUm/9v5tbsPY7tiqpUXDxqcRzYyTm5IcFtFgxvcTPLv/Bfso8+4GRkt7Fp+6StjiNqaxiW4emOJO2ygnqk5t99FDralXuqYMo5QD2H3IuRXgifzrQukd5C83XB2fEkY4Add6bvT8ilSEcMHO2fBQhqvm8WTH7BdG3dYxgh4ELiIZqN10uSBAgCXhcf3i2dHsL5EwpGUQQqrSCwzFXcMpOv9FyfnlEU3uYXS8LV0+8PSjAhxUn2An0dszGZYiIH4JymWO5Yh2/CQSsdiCfj7j6Cifa0NNRqpBGmc1y2Io00WHY0ysNoaKT3rdH8i/UNEIUaVgamdkNP5pr3hdzUE9liIgr4uePyffBahgkMJgmREQMhKGyYaK1YQnJzkVy9+DJ7cULlHYx1Eb3BmdEIUbxZA5Niy5Auunq1PQa4k/e1Z9TaF/3SjttqgoHZu9wc9106ia5br2J24ns0XSWCvaO39lzQTNaDNZ6iIIak7swqrFYFDwTjRhF2z72qR1qjGirdUKX0nZJkweJd+NYsq5AE91b/6GWvxHiFd2ZePoQc3qM0= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:VI1PR08MB5325.eurprd08.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230031)(366004)(136003)(39860400002)(396003)(346002)(376002)(230922051799003)(186009)(451199024)(64100799003)(1800799009)(478600001)(84970400001)(2906002)(41300700001)(30864003)(235185007)(44832011)(316002)(5660300002)(8936002)(8676002)(4326008)(36756003)(6916009)(66476007)(6512007)(86362001)(66556008)(66946007)(6486002)(2616005)(38100700002)(83380400001)(33964004)(26005)(6506007)(44144004)(6666004)(4216001)(2700100001);DIR:OUT;SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DU0PR08MB9130 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none 
header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5PEPF00014B8F.eurprd02.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 546bbad0-0470-4b70-9096-08dbe5e8fd6e X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: R/lvstGjFQ0tRZHp76BNEHQJ4DazUt8Ct31wDoeZPiahY8FkdjHN3dQ/uF8TVtv72v4+eVCoZhbhwM/IFJMurkByC++E6DbGXAHJ/LH06ytBiDAdFLRopjwq3zqHC1YPkYBxANo4tSCxDBzqpR5toMqiwKVBS1mpa0TF9FTrbzvs7D5kGolhMZa2+/Hr/Y9DBBnsaEtu6ECJpkir2d+AC2t6D5LM6p+YBQet7Q9y7vfsvFZNsngPMqkZJDprliWf1DpCSZ1PUqFCrUjAmOX0U2ssHDeFigxb0ZypCJppXHpTHIbUMuhi5ANv0l2i0Yi3YxlPyitqs5mV475lLEdi7Ir9ytZxy6y11Tg+taFbaDyShuZLcEQNXGipv/ysf8s31vRNrTzd+Mz4P4fwWZYvui9y9aftV9qG44TZgnX0/zx3eNAIIIXCk6/tmXjsOJWs/6xpyfIMjoyoMBPLRE6z21oyL9i3li/mSqhdvo/e+2oY6dQhbc7Ws2u4etTxGYQvVntPBoj9K++rBb65GvRn/sXUI8bVUCSIerXXixcJlySo0zOTuuxvyn6HsskUrrzPPEBNN/KCaVpuXmpP3bDLcqH8YPak46A/JI3OvnXoGfxmlfC62tiX8AJR3EzXpXeoIkxpc3M2y/awHH3lGA8oXMuQG/V61J55Zhkxwz1Ae9ELdBGWu29Kbfk+aR7/SAZFBH3Ai7rc+sL1N31oxflvksN/to7bXY4dc/QP3HIRNTHGYTX3DM43XoSalGmEj3IY/L/u2/1sw18k8+nD+MbOCHGVQGd+jFPoyxYMP6iqyWvqoP3JHMy8jXNxWUiBiqnB X-Forefront-Antispam-Report: CIP:63.35.35.123;CTRY:IE;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:64aa7808-outbound-1.mta.getcheckrecipient.com;PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com;CAT:NONE;SFS:(13230031)(4636009)(396003)(376002)(39860400002)(346002)(136003)(230922051799003)(451199024)(64100799003)(1800799009)(186009)(82310400011)(40470700004)(46966006)(36840700001)(83380400001)(40460700003)(82740400003)(2906002)(30864003)(26005)(336012)(356005)(81166007)(47076005)(41300700001)(36860700001)(478600001)(6486002)(316002)(70586007)(70206006)(44144004)(33964004)(6916009)(6506007)(6666004)(4326008)(8936002)(8676002)(84970400001)(86362001)(36756003)(235185007)(5660300002)(2616005)(40480700001)(44832011)(6512007)(4216001)(2700100001);DIR:OUT;SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Nov 2023 
14:41:55.0759 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 59cd16b3-d88d-4914-5146-08dbe5e9036e X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d;Ip=[63.35.35.123];Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B8F.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM0PR08MB5297 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,FORGED_SPF_HELO,GIT_PATCH_0,KAM_DMARC_NONE,KAM_LOTSOFHASH,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id:

--VsyUTVqOrULknLmi
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Hi All,

This changes the unsigned unpack patterns to use zip{1,2} when doing a
zero-extending widening operation.  Permutes generally have a higher
throughput than the widening operations.  The zip interleaves zeros into the
top half of each widened element.

The testcase

void d2 (unsigned * restrict a, unsigned short *b, int n)
{
    for (int i = 0; i < (n & -8); i++)
      a[i] = b[i];
}

now generates:

        movi    v1.4s, 0
.L3:
        ldr     q0, [x1], 16
        zip1    v2.8h, v0.8h, v1.8h
        zip2    v0.8h, v0.8h, v1.8h
        stp     q2, q0, [x0]
        add     x0, x0, 32
        cmp     x1, x2
        bne     .L3

instead of:

.L3:
        ldr     q0, [x1], 16
        uxtl    v1.4s, v0.4h
        uxtl2   v0.4s, v0.8h
        stp     q1, q0, [x0]
        add     x0, x0, 32
        cmp     x1, x2
        bne     .L3

Since this needs an extra register holding zero, we only do it for the
vectorizer's lo/hi pairs, where we know the zero will be hoisted out of the
loop.

This gives an 8% speed-up in Imagick in SPEC CPU 2017 on Neoverse V2.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?
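For anyone wanting to sanity-check the equivalence the description relies on, here is a small scalar sketch in C. It is not part of the patch: the helper names (zip1_8h, zip2_8h, lane_4s, zip_matches_uxtl) are made up for illustration, and it models only the little-endian lane layout (for big-endian the expanders in the patch swap the zip operands instead). It shows that zipping a vector of 16-bit lanes with an all-zero vector, then viewing the result as 32-bit lanes, gives the same values as zero-extending the low (uxtl) or high (uxtl2) half.

```c
#include <stdint.h>

enum { LANES = 8 };  /* models a 128-bit register holding 8 x 16-bit lanes */

/* Scalar model of ZIP1 Vd.8H, Vn.8H, Vm.8H: interleave the low halves.  */
static void zip1_8h(const uint16_t n[LANES], const uint16_t m[LANES],
                    uint16_t d[LANES])
{
    for (int i = 0; i < LANES / 2; i++) {
        d[2 * i] = n[i];
        d[2 * i + 1] = m[i];
    }
}

/* Scalar model of ZIP2 Vd.8H, Vn.8H, Vm.8H: interleave the high halves.  */
static void zip2_8h(const uint16_t n[LANES], const uint16_t m[LANES],
                    uint16_t d[LANES])
{
    for (int i = 0; i < LANES / 2; i++) {
        d[2 * i] = n[LANES / 2 + i];
        d[2 * i + 1] = m[LANES / 2 + i];
    }
}

/* Read 32-bit lane I of a register modelled as 8 x 16-bit lanes, assuming
   little-endian lane order (the even 16-bit lane is the low half).  */
static uint32_t lane_4s(const uint16_t v[LANES], int i)
{
    return (uint32_t)v[2 * i] | ((uint32_t)v[2 * i + 1] << 16);
}

/* Return 1 if zip{1,2} against zero matches uxtl/uxtl2 for IN, else 0.  */
int zip_matches_uxtl(const uint16_t in[LANES])
{
    const uint16_t zero[LANES] = {0};
    uint16_t lo[LANES], hi[LANES];

    zip1_8h(in, zero, lo);
    zip2_8h(in, zero, hi);

    for (int i = 0; i < LANES / 2; i++) {
        uint32_t uxtl = in[i];               /* zero-extend low half  */
        uint32_t uxtl2 = in[LANES / 2 + i];  /* zero-extend high half */
        if (lane_4s(lo, i) != uxtl || lane_4s(hi, i) != uxtl2)
            return 0;
    }
    return 1;
}
```

Calling zip_matches_uxtl on any 8-element uint16_t array should return 1, since the odd 16-bit lanes of each zip result come from the zero vector and so form the zero upper half of every 32-bit lane.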
Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-simd.md (vec_unpack_lo__lo___zip): New. (aarch64_uaddw__zip): New. * config/aarch64/iterators.md (PERM_EXTEND, perm_index): New. (perm_hilo): Add UNSPEC_ZIP1, UNSPEC_ZIP2. gcc/testsuite/ChangeLog: * gcc.target/aarch64/simd/vmovl_high_1.c: Update codegen. * gcc.target/aarch64/uxtl-combine-1.c: New test. * gcc.target/aarch64/uxtl-combine-2.c: New test. * gcc.target/aarch64/uxtl-combine-3.c: New test. * gcc.target/aarch64/uxtl-combine-4.c: New test. * gcc.target/aarch64/uxtl-combine-5.c: New test. * gcc.target/aarch64/uxtl-combine-6.c: New test. --- inline copy of patch -- diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 81ff5bad03d598fa0d48df93d172a28bc0d1d92e..3d811007dd94dcd9176d6021a41a196c12fe9c3f 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1988,26 +1988,60 @@ (define_insn "aarch64_simd_vec_unpack_hi_" [(set_attr "type" "neon_shift_imm_long")] ) -(define_expand "vec_unpack_hi_" +(define_expand "vec_unpacku_hi_" [(match_operand: 0 "register_operand") - (ANY_EXTEND: (match_operand:VQW 1 "register_operand"))] + (match_operand:VQW 1 "register_operand")] + "TARGET_SIMD" + { + rtx res = gen_reg_rtx (mode); + rtx tmp = aarch64_gen_shareable_zero (mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_zip2 (res, tmp, operands[1])); + else + emit_insn (gen_aarch64_zip2 (res, operands[1], tmp)); + emit_move_insn (operands[0], + simplify_gen_subreg (mode, res, mode, 0)); + DONE; + } +) + +(define_expand "vec_unpacks_hi_" + [(match_operand: 0 "register_operand") + (match_operand:VQW 1 "register_operand")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, , true); - emit_insn (gen_aarch64_simd_vec_unpack_hi_ (operands[0], - operands[1], p)); + emit_insn (gen_aarch64_simd_vec_unpacks_hi_ (operands[0], + operands[1], p)); + DONE; + } +) + +(define_expand "vec_unpacku_lo_" + [(match_operand: 0 "register_operand") + 
(match_operand:VQW 1 "register_operand")] + "TARGET_SIMD" + { + rtx res = gen_reg_rtx (mode); + rtx tmp = aarch64_gen_shareable_zero (mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_zip1 (res, tmp, operands[1])); + else + emit_insn (gen_aarch64_zip1 (res, operands[1], tmp)); + emit_move_insn (operands[0], + simplify_gen_subreg (mode, res, mode, 0)); DONE; } ) -(define_expand "vec_unpack_lo_" +(define_expand "vec_unpacks_lo_" [(match_operand: 0 "register_operand") - (ANY_EXTEND: (match_operand:VQW 1 "register_operand"))] + (match_operand:VQW 1 "register_operand")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, , false); - emit_insn (gen_aarch64_simd_vec_unpack_lo_ (operands[0], - operands[1], p)); + emit_insn (gen_aarch64_simd_vec_unpacks_lo_ (operands[0], + operands[1], p)); DONE; } ) @@ -4735,6 +4769,34 @@ (define_insn "aarch64_subw2_internal" [(set_attr "type" "neon_sub_widen")] ) +(define_insn "aarch64_usubw__zip" + [(set (match_operand: 0 "register_operand" "=w") + (minus: + (match_operand: 1 "register_operand" "w") + (subreg: + (unspec: [ + (match_operand:VQW 2 "register_operand" "w") + (match_operand:VQW 3 "aarch64_simd_imm_zero" "Dz") + ] PERM_EXTEND) 0)))] + "TARGET_SIMD" + "usubw\\t%0., %1., %2." + [(set_attr "type" "neon_sub_widen")] +) + +(define_insn "aarch64_uaddw__zip" + [(set (match_operand: 0 "register_operand" "=w") + (plus: + (subreg: + (unspec: [ + (match_operand:VQW 2 "register_operand" "w") + (match_operand:VQW 3 "aarch64_simd_imm_zero" "Dz") + ] PERM_EXTEND) 0) + (match_operand: 1 "register_operand" "w")))] + "TARGET_SIMD" + "uaddw\\t%0., %1., %2." 
+ [(set_attr "type" "neon_add_widen")] +) + (define_insn "aarch64_addw" [(set (match_operand: 0 "register_operand" "=w") (plus: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index f9e2210095ea9d6d9c96971222a7757a2f418c2d..de281671aedc5141c69063f14cf0fbec5adecb04 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -2674,6 +2674,9 @@ (define_int_iterator PERMUTEQ [UNSPEC_ZIP1Q UNSPEC_ZIP2Q (define_int_iterator OPTAB_PERMUTE [UNSPEC_ZIP1 UNSPEC_ZIP2 UNSPEC_UZP1 UNSPEC_UZP2]) +;; Permutes for zero extends +(define_int_iterator PERM_EXTEND [UNSPEC_ZIP1 UNSPEC_ZIP2]) + (define_int_iterator REVERSE [UNSPEC_REV64 UNSPEC_REV32 UNSPEC_REV16]) (define_int_iterator FRINT [UNSPEC_FRINTZ UNSPEC_FRINTP UNSPEC_FRINTM @@ -3496,7 +3499,10 @@ (define_int_attr rev_op [(UNSPEC_REV64 "64") (UNSPEC_REV32 "32") (UNSPEC_REV16 "16")]) (define_int_attr perm_hilo [(UNSPEC_UNPACKSHI "hi") (UNSPEC_UNPACKUHI "hi") - (UNSPEC_UNPACKSLO "lo") (UNSPEC_UNPACKULO "lo")]) + (UNSPEC_UNPACKSLO "lo") (UNSPEC_UNPACKULO "lo") + (UNSPEC_ZIP2 "hi") (UNSPEC_ZIP1 "lo")]) + +(define_int_attr perm_index [(UNSPEC_ZIP2 "2") (UNSPEC_ZIP1 "")]) ;; Return true if the associated optab refers to the high-numbered lanes, ;; false if it refers to the low-numbered lanes. 
The convention is for diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c b/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c index d45bb83e3503d512c443f37a446d30d188719a96..a2d09eaee0de5a3d3409330c5c26a3b5315e84eb 100644 --- a/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c +++ b/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c @@ -22,11 +22,11 @@ FUNC (int32x4_t, int64x2_t, s32) /* { dg-final { scan-assembler-times {sxtl2\tv0\.2d, v0\.4s} 1} } */ FUNC (uint8x16_t, uint16x8_t, u8) -/* { dg-final { scan-assembler-times {uxtl2\tv0\.8h, v0\.16b} 1} } */ +/* { dg-final { scan-assembler-times {zip2\tv0\.16b, v0\.16b} 1} } */ FUNC (uint16x8_t, uint32x4_t, u16) -/* { dg-final { scan-assembler-times {uxtl2\tv0\.4s, v0\.8h} 1} } */ +/* { dg-final { scan-assembler-times {zip2\tv0\.8h, v0\.8h} 1} } */ FUNC (uint32x4_t, uint64x2_t, u32) -/* { dg-final { scan-assembler-times {uxtl2\tv0\.2d, v0\.4s} 1} } */ +/* { dg-final { scan-assembler-times {zip2\tv0\.4s, v0\.4s} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-1.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-1.c new file mode 100755 index 0000000000000000000000000000000000000000..68fa9a09fe55f5a72355e23c90e781a898c5975e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-1.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN unsigned +#define TYPE1 char +#define TYPE2 short + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-times {\tzip1\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tzip2\t} 1 } } */ +/* { dg-final { scan-assembler-not {\tuxtl\t} } } */ +/* { dg-final { scan-assembler-not {\tuxtl2\t} } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-2.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-2.c new file mode 100755 index 
0000000000000000000000000000000000000000..af8a89085cfca800b41970a8410bc91b84a31d07 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-2.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN unsigned +#define TYPE1 short +#define TYPE2 int + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-times {\tzip1\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tzip2\t} 1 } } */ +/* { dg-final { scan-assembler-not {\tuxtl\t} } } */ +/* { dg-final { scan-assembler-not {\tuxtl2\t} } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-3.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-3.c new file mode 100755 index 0000000000000000000000000000000000000000..cdae6d09529b743857a092f53a07111df64775d7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-3.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN unsigned +#define TYPE1 int +#define TYPE2 long long + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-times {\tzip1\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tzip2\t} 1 } } */ +/* { dg-final { scan-assembler-not {\tuxtl\t} } } */ +/* { dg-final { scan-assembler-not {\tuxtl2\t} } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c new file mode 100755 index 0000000000000000000000000000000000000000..e1a9c4f5661a36ec7b2c5dc6f0fd85c42fcaac39 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC 
target "+nosve" + +#define SIGN signed +#define TYPE1 char +#define TYPE2 short + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-not {\tzip1\t} } } */ +/* { dg-final { scan-assembler-not {\tzip2\t} } } */ +/* { dg-final { scan-assembler-times {\tsxtl\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tsxtl2\t} 1 } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-5.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-5.c new file mode 100755 index 0000000000000000000000000000000000000000..92b09ba4abba80f240ac175be2ef880968534975 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-5.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN signed +#define TYPE1 short +#define TYPE2 int + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-not {\tzip1\t} } } */ +/* { dg-final { scan-assembler-not {\tzip2\t} } } */ +/* { dg-final { scan-assembler-times {\tsxtl\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tsxtl2\t} 1 } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-6.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-6.c new file mode 100755 index 0000000000000000000000000000000000000000..5c6e635f29d1e52f51f5b75a477f7d8744f32ca3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-6.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN signed +#define TYPE1 int +#define TYPE2 long long + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-not {\tzip1\t} } } */ +/* { dg-final { scan-assembler-not 
{\tzip2\t} } } */ +/* { dg-final { scan-assembler-times {\tsxtl\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tsxtl2\t} 1 } } */ + -- --VsyUTVqOrULknLmi Content-Type: text/plain; charset=utf-8 Content-Disposition: attachment; filename="rb15000.patch" diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 81ff5bad03d598fa0d48df93d172a28bc0d1d92e..3d811007dd94dcd9176d6021a41a196c12fe9c3f 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1988,26 +1988,60 @@ (define_insn "aarch64_simd_vec_unpack_hi_" [(set_attr "type" "neon_shift_imm_long")] ) -(define_expand "vec_unpack_hi_" +(define_expand "vec_unpacku_hi_" [(match_operand: 0 "register_operand") - (ANY_EXTEND: (match_operand:VQW 1 "register_operand"))] + (match_operand:VQW 1 "register_operand")] + "TARGET_SIMD" + { + rtx res = gen_reg_rtx (mode); + rtx tmp = aarch64_gen_shareable_zero (mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_zip2 (res, tmp, operands[1])); + else + emit_insn (gen_aarch64_zip2 (res, operands[1], tmp)); + emit_move_insn (operands[0], + simplify_gen_subreg (mode, res, mode, 0)); + DONE; + } +) + +(define_expand "vec_unpacks_hi_" + [(match_operand: 0 "register_operand") + (match_operand:VQW 1 "register_operand")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, , true); - emit_insn (gen_aarch64_simd_vec_unpack_hi_ (operands[0], - operands[1], p)); + emit_insn (gen_aarch64_simd_vec_unpacks_hi_ (operands[0], + operands[1], p)); + DONE; + } +) + +(define_expand "vec_unpacku_lo_" + [(match_operand: 0 "register_operand") + (match_operand:VQW 1 "register_operand")] + "TARGET_SIMD" + { + rtx res = gen_reg_rtx (mode); + rtx tmp = aarch64_gen_shareable_zero (mode); + if (BYTES_BIG_ENDIAN) + emit_insn (gen_aarch64_zip1 (res, tmp, operands[1])); + else + emit_insn (gen_aarch64_zip1 (res, operands[1], tmp)); + emit_move_insn (operands[0], + simplify_gen_subreg (mode, res, mode, 0)); DONE; } ) 
-(define_expand "vec_unpack_lo_" +(define_expand "vec_unpacks_lo_" [(match_operand: 0 "register_operand") - (ANY_EXTEND: (match_operand:VQW 1 "register_operand"))] + (match_operand:VQW 1 "register_operand")] "TARGET_SIMD" { rtx p = aarch64_simd_vect_par_cnst_half (mode, , false); - emit_insn (gen_aarch64_simd_vec_unpack_lo_ (operands[0], - operands[1], p)); + emit_insn (gen_aarch64_simd_vec_unpacks_lo_ (operands[0], + operands[1], p)); DONE; } ) @@ -4735,6 +4769,34 @@ (define_insn "aarch64_subw2_internal" [(set_attr "type" "neon_sub_widen")] ) +(define_insn "aarch64_usubw__zip" + [(set (match_operand: 0 "register_operand" "=w") + (minus: + (match_operand: 1 "register_operand" "w") + (subreg: + (unspec: [ + (match_operand:VQW 2 "register_operand" "w") + (match_operand:VQW 3 "aarch64_simd_imm_zero" "Dz") + ] PERM_EXTEND) 0)))] + "TARGET_SIMD" + "usubw\\t%0., %1., %2." + [(set_attr "type" "neon_sub_widen")] +) + +(define_insn "aarch64_uaddw__zip" + [(set (match_operand: 0 "register_operand" "=w") + (plus: + (subreg: + (unspec: [ + (match_operand:VQW 2 "register_operand" "w") + (match_operand:VQW 3 "aarch64_simd_imm_zero" "Dz") + ] PERM_EXTEND) 0) + (match_operand: 1 "register_operand" "w")))] + "TARGET_SIMD" + "uaddw\\t%0., %1., %2." 
+ [(set_attr "type" "neon_add_widen")] +) + (define_insn "aarch64_addw" [(set (match_operand: 0 "register_operand" "=w") (plus: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index f9e2210095ea9d6d9c96971222a7757a2f418c2d..de281671aedc5141c69063f14cf0fbec5adecb04 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -2674,6 +2674,9 @@ (define_int_iterator PERMUTEQ [UNSPEC_ZIP1Q UNSPEC_ZIP2Q (define_int_iterator OPTAB_PERMUTE [UNSPEC_ZIP1 UNSPEC_ZIP2 UNSPEC_UZP1 UNSPEC_UZP2]) +;; Permutes for zero extends +(define_int_iterator PERM_EXTEND [UNSPEC_ZIP1 UNSPEC_ZIP2]) + (define_int_iterator REVERSE [UNSPEC_REV64 UNSPEC_REV32 UNSPEC_REV16]) (define_int_iterator FRINT [UNSPEC_FRINTZ UNSPEC_FRINTP UNSPEC_FRINTM @@ -3496,7 +3499,10 @@ (define_int_attr rev_op [(UNSPEC_REV64 "64") (UNSPEC_REV32 "32") (UNSPEC_REV16 "16")]) (define_int_attr perm_hilo [(UNSPEC_UNPACKSHI "hi") (UNSPEC_UNPACKUHI "hi") - (UNSPEC_UNPACKSLO "lo") (UNSPEC_UNPACKULO "lo")]) + (UNSPEC_UNPACKSLO "lo") (UNSPEC_UNPACKULO "lo") + (UNSPEC_ZIP2 "hi") (UNSPEC_ZIP1 "lo")]) + +(define_int_attr perm_index [(UNSPEC_ZIP2 "2") (UNSPEC_ZIP1 "")]) ;; Return true if the associated optab refers to the high-numbered lanes, ;; false if it refers to the low-numbered lanes. 
The convention is for diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c b/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c index d45bb83e3503d512c443f37a446d30d188719a96..a2d09eaee0de5a3d3409330c5c26a3b5315e84eb 100644 --- a/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c +++ b/gcc/testsuite/gcc.target/aarch64/simd/vmovl_high_1.c @@ -22,11 +22,11 @@ FUNC (int32x4_t, int64x2_t, s32) /* { dg-final { scan-assembler-times {sxtl2\tv0\.2d, v0\.4s} 1} } */ FUNC (uint8x16_t, uint16x8_t, u8) -/* { dg-final { scan-assembler-times {uxtl2\tv0\.8h, v0\.16b} 1} } */ +/* { dg-final { scan-assembler-times {zip2\tv0\.16b, v0\.16b} 1} } */ FUNC (uint16x8_t, uint32x4_t, u16) -/* { dg-final { scan-assembler-times {uxtl2\tv0\.4s, v0\.8h} 1} } */ +/* { dg-final { scan-assembler-times {zip2\tv0\.8h, v0\.8h} 1} } */ FUNC (uint32x4_t, uint64x2_t, u32) -/* { dg-final { scan-assembler-times {uxtl2\tv0\.2d, v0\.4s} 1} } */ +/* { dg-final { scan-assembler-times {zip2\tv0\.4s, v0\.4s} 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-1.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-1.c new file mode 100755 index 0000000000000000000000000000000000000000..68fa9a09fe55f5a72355e23c90e781a898c5975e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-1.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN unsigned +#define TYPE1 char +#define TYPE2 short + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-times {\tzip1\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tzip2\t} 1 } } */ +/* { dg-final { scan-assembler-not {\tuxtl\t} } } */ +/* { dg-final { scan-assembler-not {\tuxtl2\t} } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-2.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-2.c new file mode 100755 index 
0000000000000000000000000000000000000000..af8a89085cfca800b41970a8410bc91b84a31d07 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-2.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN unsigned +#define TYPE1 short +#define TYPE2 int + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-times {\tzip1\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tzip2\t} 1 } } */ +/* { dg-final { scan-assembler-not {\tuxtl\t} } } */ +/* { dg-final { scan-assembler-not {\tuxtl2\t} } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-3.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-3.c new file mode 100755 index 0000000000000000000000000000000000000000..cdae6d09529b743857a092f53a07111df64775d7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-3.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN unsigned +#define TYPE1 int +#define TYPE2 long long + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-times {\tzip1\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tzip2\t} 1 } } */ +/* { dg-final { scan-assembler-not {\tuxtl\t} } } */ +/* { dg-final { scan-assembler-not {\tuxtl2\t} } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c new file mode 100755 index 0000000000000000000000000000000000000000..e1a9c4f5661a36ec7b2c5dc6f0fd85c42fcaac39 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-4.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC 
target "+nosve" + +#define SIGN signed +#define TYPE1 char +#define TYPE2 short + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-not {\tzip1\t} } } */ +/* { dg-final { scan-assembler-not {\tzip2\t} } } */ +/* { dg-final { scan-assembler-times {\tsxtl\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tsxtl2\t} 1 } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-5.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-5.c new file mode 100755 index 0000000000000000000000000000000000000000..92b09ba4abba80f240ac175be2ef880968534975 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-5.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN signed +#define TYPE1 short +#define TYPE2 int + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-not {\tzip1\t} } } */ +/* { dg-final { scan-assembler-not {\tzip2\t} } } */ +/* { dg-final { scan-assembler-times {\tsxtl\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tsxtl2\t} 1 } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/uxtl-combine-6.c b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-6.c new file mode 100755 index 0000000000000000000000000000000000000000..5c6e635f29d1e52f51f5b75a477f7d8744f32ca3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/uxtl-combine-6.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-O3 --save-temps --param=vect-epilogues-nomask=0" } */ + +#pragma GCC target "+nosve" + +#define SIGN signed +#define TYPE1 int +#define TYPE2 long long + +void d2 (SIGN TYPE2 * restrict a, SIGN TYPE1 *b, int n) +{ + for (int i = 0; i < (n & -8); i++) + a[i] = b[i]; +} + +/* { dg-final { scan-assembler-not {\tzip1\t} } } */ +/* { dg-final { scan-assembler-not 
{\tzip2\t} } } */ +/* { dg-final { scan-assembler-times {\tsxtl\t} 1 } } */ +/* { dg-final { scan-assembler-times {\tsxtl2\t} 1 } } */ + --VsyUTVqOrULknLmi--