From: Tamar Christina
To: Richard Sandiford
CC: "gcc-patches@gcc.gnu.org", nd, Richard Earnshaw, Marcus Shawcroft, Kyrylo Tkachov
Subject: RE: [PATCH 8/8]AArch64: Have reload not choose to do add on the scalar side if both values exist on the SIMD side.
Date: Tue, 1 Nov 2022 15:20:08 +0000

> -----Original Message-----
> From: Richard Sandiford
> Sent: Tuesday, November 1, 2022 3:05 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft;
> Kyrylo Tkachov
> Subject: Re: [PATCH 8/8]AArch64: Have reload not choose to do add on the
> scalar side if both values exist on the SIMD side.
>
> Tamar Christina writes:
> > Hi All,
> >
> > Currently we often generate an r -> r add even if it means we
> > need two reloads to perform it, i.e. in the case that the values are
> > on the SIMD side.
> >
> > The pairwise operations expose these more now and so we get
> > suboptimal codegen.
> >
> > Normally I would have liked to use ^ or $ here, but while this works
> > for the simple examples, reload inexplicably falls apart on examples
> > that should have been trivial. It forces a move to r -> w to use the w
> > ADD, which is counter to what ^ and $ should do.
> >
> > However ! seems to fix all the regressions and still maintains the
> > good codegen.
> >
> > I have tried looking into whether it's our costings that are off, but
> > I can't see anything logical here. So I'd like to push this change
> > instead along with tests that augment the other testcases that guard
> > the r -> r variants.
>
> This feels like a hack though. r<-r+r is one of the simplest things the
> processor can do, so I don't think it makes logical sense to mark it with
> !, which means "prohibitively expensive". It's likely to push operations
> that require reloads onto the SIMD side.

I agree. Though at the moment, reload isn't behaving as it should. It's
almost as if the register transfer costs are not taken into account when
deciding on an alternative.

It seems to think that r->r and w->w are equally cheap even when the value
has been assigned to w before. For instance, some of the testcases below
don't work correctly because of this.

I don't think I can influence this costing, and as I mentioned, ^ works for
the simple example but then somehow makes w->w cheaper even though the value
was assigned to r.

I'm not really sure where to look here, but the current version is also
equally broken. It basically always forces to r.

Thanks,
Tamar

>
> Thanks,
> Richard
>
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* config/aarch64/aarch64.md (*add<mode>3_aarch64): Add ! to the r -> r
> > 	alternative.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	* gcc.target/aarch64/simd/scalar_addp.c: New test.
> > 	* gcc.target/aarch64/simd/scalar_faddp.c: New test.
> > 	* gcc.target/aarch64/simd/scalar_faddp2.c: New test.
> > 	* gcc.target/aarch64/simd/scalar_fmaxp.c: New test.
> > 	* gcc.target/aarch64/simd/scalar_fminp.c: New test.
> > 	* gcc.target/aarch64/simd/scalar_maxp.c: New test.
> > 	* gcc.target/aarch64/simd/scalar_minp.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > index 09ae1118371f82ca63146fceb953eb9e820d05a4..c333fb1f72725992bb304c560f1245a242d5192d 100644
> > --- a/gcc/config/aarch64/aarch64.md
> > +++ b/gcc/config/aarch64/aarch64.md
> > @@ -2043,7 +2043,7 @@ (define_expand "add<mode>3"
> >
> >  (define_insn "*add<mode>3_aarch64"
> >    [(set
> > -    (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,r,rk")
> > +    (match_operand:GPI 0 "register_operand" "=rk,!rk,w,rk,r,r,rk")
> >      (plus:GPI
> >       (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,0,rk")
> >       (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uai,Uav")))]
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..5b8d40f19884fc7b4e7decd80758bc36fa76d058
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
> > @@ -0,0 +1,70 @@
> > +/* { dg-do assemble } */
> > +/* { dg-additional-options "-save-temps -O1 -std=c99" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +typedef long long v2di __attribute__((vector_size (16)));
> > +typedef unsigned long long v2udi __attribute__((vector_size (16)));
> > +typedef int v2si __attribute__((vector_size (16)));
> > +typedef unsigned int v2usi __attribute__((vector_size (16)));
> > +
> > +/*
> > +** foo:
> > +**	addp	d0, v0.2d
> > +**	fmov	x0, d0
> > +**	ret
> > +*/
> > +long long
> > +foo (v2di x)
> > +{
> > +  return x[1] + x[0];
> > +}
> > +
> > +/*
> > +** foo1:
> > +**	saddlp	v0.1d, v0.2s
> > +**	fmov	x0, d0
> > +**	ret
> > +*/
> > +long long
> > +foo1 (v2si x)
> > +{
> > +  return x[1] + x[0];
> > +}
> > +
> > +/*
> > +** foo2:
> > +**	uaddlp	v0.1d, v0.2s
> > +**	fmov	x0, d0
> > +**	ret
> > +*/
> > +unsigned long long
> > +foo2 (v2usi x)
> > +{
> > +  return x[1] + x[0];
> > +}
> > +
> > +/*
> > +** foo3:
> > +**	uaddlp	v0.1d, v0.2s
> > +**	add	d0, d0, d1
> > +**	fmov	x0, d0
> > +**	ret
> > +*/
> > +unsigned long long
> > +foo3 (v2usi x, v2udi y)
> > +{
> > +  return (x[1] + x[0]) + y[0];
> > +}
> > +
> > +/*
> > +** foo4:
> > +**	saddlp	v0.1d, v0.2s
> > +**	add	d0, d0, d1
> > +**	fmov	x0, d0
> > +**	ret
> > +*/
> > +long long
> > +foo4 (v2si x, v2di y)
> > +{
> > +  return (x[1] + x[0]) + y[0];
> > +}
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..ff455e060fc833b2f63e89c467b91a76fbe31aff
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
> > @@ -0,0 +1,66 @@
> > +/* { dg-do assemble } */
> > +/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
> > +/* { dg-add-options arm_v8_2a_fp16_scalar } */
> > +/* { dg-additional-options "-save-temps -O1" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +typedef double v2df __attribute__((vector_size (16)));
> > +typedef float v4sf __attribute__((vector_size (16)));
> > +typedef __fp16 v8hf __attribute__((vector_size (16)));
> > +
> > +/*
> > +** foo:
> > +**	faddp	d0, v0.2d
> > +**	ret
> > +*/
> > +double
> > +foo (v2df x)
> > +{
> > +  return x[1] + x[0];
> > +}
> > +
> > +/*
> > +** foo1:
> > +**	faddp	s0, v0.2s
> > +**	ret
> > +*/
> > +float
> > +foo1 (v4sf x)
> > +{
> > +  return x[0] + x[1];
> > +}
> > +
> > +/*
> > +** foo2:
> > +**	faddp	h0, v0.2h
> > +**	ret
> > +*/
> > +__fp16
> > +foo2 (v8hf x)
> > +{
> > +  return x[0] + x[1];
> > +}
> > +
> > +/*
> > +** foo3:
> > +**	ext	v0.16b, v0.16b, v0.16b, #4
> > +**	faddp	s0, v0.2s
> > +**	ret
> > +*/
> > +float
> > +foo3 (v4sf x)
> > +{
> > +  return x[1] + x[2];
> > +}
> > +
> > +/*
> > +** foo4:
> > +**	dup	s0, v0.s\[3\]
> > +**	faddp	h0, v0.2h
> > +**	ret
> > +*/
> > +__fp16
> > +foo4 (v8hf x)
> > +{
> > +  return x[6] + x[7];
> > +}
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..04412c3b45c51648e46ff20f730b1213e940391a
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
> > @@ -0,0 +1,14 @@
> > +/* { dg-do assemble } */
> > +/* { dg-additional-options "-save-temps -O1 -w" } */
> > +
> > +typedef __m128i __attribute__((__vector_size__(2 * sizeof(long))));
> > +double a[];
> > +*b;
> > +fn1() {
> > +  __m128i c;
> > +  *(__m128i *)a = c;
> > +  *b = a[0] + a[1];
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {faddp\td0, v0\.2d} 1 } } */
> > +
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..aa1d2bf17cd707b74d8f7c574506610ab4fd7299
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
> > @@ -0,0 +1,56 @@
> > +/* { dg-do assemble } */
> > +/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
> > +/* { dg-add-options arm_v8_2a_fp16_scalar } */
> > +/* { dg-additional-options "-save-temps -O1" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +typedef double v2df __attribute__((vector_size (16)));
> > +typedef float v4sf __attribute__((vector_size (16)));
> > +typedef __fp16 v8hf __attribute__((vector_size (16)));
> > +
> > +/*
> > +** foo:
> > +**	fmaxnmp	d0, v0.2d
> > +**	ret
> > +*/
> > +double
> > +foo (v2df x)
> > +{
> > +  return x[0] > x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo1:
> > +**	fmaxnmp	s0, v0.2s
> > +**	ret
> > +*/
> > +float
> > +foo1 (v4sf x)
> > +{
> > +  return x[0] > x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo2:
> > +**	fmaxnmp	h0, v0.2h
> > +**	ret
> > +*/
> > +__fp16
> > +foo2 (v8hf x)
> > +{
> > +  return x[0] > x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo3:
> > +**	fmaxnmp	s0, v0.2s
> > +**	fcvt	d0, s0
> > +**	fadd	d0, d0, d1
> > +**	ret
> > +*/
> > +double
> > +foo3 (v4sf x, v2df y)
> > +{
> > +  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
> > +}
> > +
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..6136c5272069c4d86f09951cdff25f1494e839f0
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
> > @@ -0,0 +1,55 @@
> > +/* { dg-do assemble } */
> > +/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
> > +/* { dg-add-options arm_v8_2a_fp16_scalar } */
> > +/* { dg-additional-options "-save-temps -O1" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +typedef double v2df __attribute__((vector_size (16)));
> > +typedef float v4sf __attribute__((vector_size (16)));
> > +typedef __fp16 v8hf __attribute__((vector_size (16)));
> > +
> > +/*
> > +** foo:
> > +**	fminnmp	d0, v0.2d
> > +**	ret
> > +*/
> > +double
> > +foo (v2df x)
> > +{
> > +  return x[0] < x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo1:
> > +**	fminnmp	s0, v0.2s
> > +**	ret
> > +*/
> > +float
> > +foo1 (v4sf x)
> > +{
> > +  return x[0] < x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo2:
> > +**	fminnmp	h0, v0.2h
> > +**	ret
> > +*/
> > +__fp16
> > +foo2 (v8hf x)
> > +{
> > +  return x[0] < x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo3:
> > +**	fminnmp	s0, v0.2s
> > +**	fcvt	d0, s0
> > +**	fadd	d0, d0, d1
> > +**	ret
> > +*/
> > +double
> > +foo3 (v4sf x, v2df y)
> > +{
> > +  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
> > +}
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..e219a13abc745b83dca58633fd2d812e276d6b2d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c
> > @@ -0,0 +1,74 @@
> > +/* { dg-do assemble } */
> > +/* { dg-additional-options "-save-temps -O1 -std=c99" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +typedef long long v2di __attribute__((vector_size (16)));
> > +typedef unsigned long long v2udi __attribute__((vector_size (16)));
> > +typedef int v2si __attribute__((vector_size (16)));
> > +typedef unsigned int v2usi __attribute__((vector_size (16)));
> > +
> > +/*
> > +** foo:
> > +**	umov	x0, v0.d\[1\]
> > +**	fmov	x1, d0
> > +**	cmp	x0, x1
> > +**	csel	x0, x0, x1, ge
> > +**	ret
> > +*/
> > +long long
> > +foo (v2di x)
> > +{
> > +  return x[0] > x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo1:
> > +**	smaxp	v0.2s, v0.2s, v0.2s
> > +**	smov	x0, v0.s\[0\]
> > +**	ret
> > +*/
> > +long long
> > +foo1 (v2si x)
> > +{
> > +  return x[0] > x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo2:
> > +**	umaxp	v0.2s, v0.2s, v0.2s
> > +**	fmov	w0, s0
> > +**	ret
> > +*/
> > +unsigned long long
> > +foo2 (v2usi x)
> > +{
> > +  return x[0] > x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo3:
> > +**	umaxp	v0.2s, v0.2s, v0.2s
> > +**	fmov	w0, s0
> > +**	fmov	x1, d1
> > +**	add	x0, x1, w0, uxtw
> > +**	ret
> > +*/
> > +unsigned long long
> > +foo3 (v2usi x, v2udi y)
> > +{
> > +  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
> > +}
> > +
> > +/*
> > +** foo4:
> > +**	smaxp	v0.2s, v0.2s, v0.2s
> > +**	fmov	w0, s0
> > +**	fmov	x1, d1
> > +**	add	x0, x1, w0, sxtw
> > +**	ret
> > +*/
> > +long long
> > +foo4 (v2si x, v2di y)
> > +{
> > +  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
> > +}
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..2a32fb4ea3edaa4c547a7a481c3ddca6b477430e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c
> > @@ -0,0 +1,74 @@
> > +/* { dg-do assemble } */
> > +/* { dg-additional-options "-save-temps -O1 -std=c99" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> > +
> > +typedef long long v2di __attribute__((vector_size (16)));
> > +typedef unsigned long long v2udi __attribute__((vector_size (16)));
> > +typedef int v2si __attribute__((vector_size (16)));
> > +typedef unsigned int v2usi __attribute__((vector_size (16)));
> > +
> > +/*
> > +** foo:
> > +**	umov	x0, v0.d\[1\]
> > +**	fmov	x1, d0
> > +**	cmp	x0, x1
> > +**	csel	x0, x0, x1, le
> > +**	ret
> > +*/
> > +long long
> > +foo (v2di x)
> > +{
> > +  return x[0] < x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo1:
> > +**	sminp	v0.2s, v0.2s, v0.2s
> > +**	smov	x0, v0.s\[0\]
> > +**	ret
> > +*/
> > +long long
> > +foo1 (v2si x)
> > +{
> > +  return x[0] < x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo2:
> > +**	uminp	v0.2s, v0.2s, v0.2s
> > +**	fmov	w0, s0
> > +**	ret
> > +*/
> > +unsigned long long
> > +foo2 (v2usi x)
> > +{
> > +  return x[0] < x[1] ? x[0] : x[1];
> > +}
> > +
> > +/*
> > +** foo3:
> > +**	uminp	v0.2s, v0.2s, v0.2s
> > +**	fmov	w0, s0
> > +**	fmov	x1, d1
> > +**	add	x0, x1, w0, uxtw
> > +**	ret
> > +*/
> > +unsigned long long
> > +foo3 (v2usi x, v2udi y)
> > +{
> > +  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
> > +}
> > +
> > +/*
> > +** foo4:
> > +**	sminp	v0.2s, v0.2s, v0.2s
> > +**	fmov	w0, s0
> > +**	fmov	x1, d1
> > +**	add	x0, x1, w0, sxtw
> > +**	ret
> > +*/
> > +long long
> > +foo4 (v2si x, v2di y)
> > +{
> > +  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
> > +}
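
For reference, the disagreement over ! versus ^/$ can be illustrated with a toy cost model in C. This is only a sketch of the documented intent of the constraint modifiers, not of what reload actually does: the names `alt_cost` and `pick_add`, the penalty values, and the assumption that every input held in the wrong register bank costs exactly one reload are all hypothetical, and the cost of moving the result is ignored.

```c
/* Toy model, for illustration only.  The numbers below are assumptions,
   not the values GCC uses internally.  */
enum modifier { MOD_NONE, MOD_BANG, MOD_DOLLAR };

enum { RELOAD_COST = 10, SEVERE_PENALTY = 600 };

/* Cost of one alternative: the reloads it needs plus any disparagement.
   "!" disparages the alternative unconditionally; "$" disparages it only
   when a reload is needed (simplified here to "any operand").  */
int alt_cost(enum modifier mod, int reloads)
{
    int cost = reloads * RELOAD_COST;
    if (mod == MOD_BANG)
        cost += SEVERE_PENALTY;
    else if (mod == MOD_DOLLAR && reloads > 0)
        cost += SEVERE_PENALTY;
    return cost;
}

/* Choose between the r,r,r and w,w,w alternatives of a two-input add.
   inputs_in_w is how many of the two inputs currently live in SIMD (w)
   registers; r_mod is the modifier placed on the r alternative.
   Returns 'r' or 'w'.  */
char pick_add(enum modifier r_mod, int inputs_in_w)
{
    int r_reloads = inputs_in_w;       /* w inputs must be moved to r */
    int w_reloads = 2 - inputs_in_w;   /* r inputs must be moved to w */
    int r_cost = alt_cost(r_mod, r_reloads);
    int w_cost = alt_cost(MOD_NONE, w_reloads);
    return r_cost <= w_cost ? 'r' : 'w';
}
```

In this model `pick_add(MOD_BANG, 0)` returns 'w' even though both inputs already live in general registers, which is the effect Richard is worried about. `pick_add(MOD_DOLLAR, 0)` returns 'r' and `pick_add(MOD_DOLLAR, 2)` returns 'w', which is what ^ and $ are documented to achieve. As discussed above, reload was observed not to behave this way in practice.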