From mboxrd@z Thu Jan 1 00:00:00 1970
From: Victor Do Nascimento
To: "gcc-patches@gcc.gnu.org"
CC: Kyrylo Tkachov , Richard Earnshaw , nd
Subject: [PATCH][AArch64] Leveraging the use of STP instruction for vec_duplicate
Date: Thu, 11 Feb 2021 17:46:47 +0000
List-Id: Gcc-patches mailing list

Dear GCC community,

The AArch64 backend was missing a pattern for storing a pair of identical
values in 32-bit and 64-bit modes with the STP machine instruction. Because
the RTL pattern match failed in the combine pass, multiple instructions were
needed to reproduce this behavior.

For the test case:

typedef long long v2di __attribute__((vector_size (16)));
typedef int v2si __attribute__((vector_size (8)));

void
foo (v2di *x, long long a)
{
  v2di tmp = {a, a};
  *x = tmp;
}

void
foo2 (v2si *x, int a)
{
  v2si tmp = {a, a};
  *x = tmp;
}

GCC at -O2 on aarch64 now gives:

foo:
        stp     x1, x1, [x0]
        ret
foo2:
        stp     w1, w1, [x0]
        ret

instead of:

foo:
        dup     v0.2d, x1
        str     q0, [x0]
        ret
foo2:
        dup     v0.2s, w1
        str     d0, [x0]
        ret

Added a new RTL template and a unit test, and checked for regressions on a
bootstrapped aarch64-none-linux-gnu.

gcc/ChangeLog

2021-02-04  Victor Do Nascimento

	* config/aarch64/aarch64-simd.md: Implement RTX pattern for
	mapping 'vec_duplicate' RTX onto 'STP' ASM insn.
	* config/aarch64/iterators.md: Implement ldpstp_vel_sz mode
	attribute to map STP/LDP vector element mode to correct suffix in
	attribute type definition of aarch64_simd_stp pattern.

gcc/testsuite/ChangeLog

2021-02-04  Victor Do Nascimento

	* gcc.target/aarch64/stp_vec_dup_32_64-1.c: New.

Regards,
Victor

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 68baf416045..4623cbb95f4 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -205,6 +205,16 @@
   [(set_attr "type" "neon_stp")]
 )
 
+(define_insn "aarch64_simd_stp"
+  [(set (match_operand:VP_2E 0 "aarch64_mem_pair_operand" "=Ump,Ump")
+        (vec_duplicate:VP_2E (match_operand: 1 "register_operand" "w,r")))]
+  "TARGET_SIMD"
+  "@
+   stp\\t%1, %1, %z0
+   stp\\t%1, %1, %z0"
+  [(set_attr "type" "neon_stp, store_")]
+)
+
 (define_insn "load_pair"
   [(set (match_operand:VQ 0 "register_operand" "=w")
         (match_operand:VQ 1 "aarch64_mem_pair_operand" "Ump"))
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index fb1426b7752..aac6e0b5bd9 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -880,6 +880,9 @@
 ;; Likewise for load/store pair.
 (define_mode_attr ldpstp_sz [(SI "8") (DI "16")])
 
+;; Size of element access for STP/LDP-generated vectors.
+(define_mode_attr ldpstp_vel_sz [(V2SI "8") (V2SF "8") (V2DI "16") (V2DF "16")])
+
 ;; For inequal width int to float conversion
 (define_mode_attr w1 [(HF "w") (SF "w") (DF "x")])
 (define_mode_attr w2 [(HF "x") (SF "x") (DF "w")])
diff --git a/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
new file mode 100644
index 00000000000..a37c903dfd4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (8)));
+
+void
+foo (v2di *x, long long a)
+{
+  v2di tmp = {a, a};
+  *x = tmp;
+}
+
+void
+foo2 (v2si *x, int a)
+{
+  v2si tmp = {a, a};
+  *x = tmp;
+}
+
+/* { dg-final { scan-assembler-times "stp\t" 2 } } */
+/* { dg-final { scan-assembler-not "dup\t" } } */
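
P.S. The ldpstp_vel_sz attribute in this patch also lists V2SF and V2DF, which
the new test does not exercise. Below is a minimal sketch of floating-point
counterparts to foo and foo2, assuming those modes are meant to go through the
same pattern (the patch does not confirm this). The names v2df, v2sf, foo3 and
foo4 are hypothetical, and no expected assembly is claimed for them.

```c
/* Hypothetical floating-point variants of the test case in the patch.
   The type and function names are illustrative and not part of the patch.  */

typedef double v2df __attribute__((vector_size (16)));
typedef float v2sf __attribute__((vector_size (8)));

/* Store two copies of the 64-bit value A through X.  */
void
foo3 (v2df *x, double a)
{
  v2df tmp = {a, a};
  *x = tmp;
}

/* Store two copies of the 32-bit value A through X.  */
void
foo4 (v2sf *x, float a)
{
  v2sf tmp = {a, a};
  *x = tmp;
}
```

If these were added to the testsuite, the scan-assembler counts in the test
file would have to be adjusted after checking the code actually generated.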