From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 21 Nov 2022 10:45:39 +0000
From: Stam Markianos-Wright
To: Kyrylo Tkachov, Andrea Corallo, "gcc-patches@gcc.gnu.org"
Cc: Richard Earnshaw
Subject: Re: [PATCH 13/35] arm: further fix overloading of MVE vaddq[_m]_n intrinsic
References: <20221117163809.1009526-1-andrea.corallo@arm.com> <20221117163809.1009526-14-andrea.corallo@arm.com>

On 11/18/22 16:49, Kyrylo Tkachov wrote:
>
>> -----Original Message-----
>> From: Andrea Corallo
>> Sent: Thursday, November 17, 2022 4:38 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov; Richard Earnshaw; Stam Markianos-Wright
>> Subject: [PATCH 13/35] arm: further fix overloading of MVE vaddq[_m]_n
>> intrinsic
>>
>> From: Stam Markianos-Wright
>>
>> It was observed that in tests `vaddq_m_n_[s/u][8/16/32].c`, the _Generic
>> resolution would fall back to the `__ARM_undef` failure state.
>>
>> This is a regression since `dc39db873670bea8d8e655444387ceaa53a01a79`
>> and
>> `6bd4ce64eb48a72eca300cb52773e6101d646004`, but it previously wasn't
>> identified, because the tests were not checking for this kind of failure.
>>
>> The above commits changed the definitions of the intrinsics from using
>> `[u]int[8/16/32]_t` types for the scalar argument to using `int`. This
>> allowed `int` to be supported in user code through the overloaded
>> `#defines`, but seems to have broken the `[u]int[8/16/32]_t` types
>>
>> The solution implemented by this patch is to explicitly use a new
>> _Generic mapping from all the `[u]int[8/16/32]_t` types for int.
>> With this change, both `int` and `[u]int[8/16/32]_t` parameters are
>> supported from user code and are handled by the overloading mechanism
>> correctly.
>>
>> gcc/ChangeLog:
>>
>> 	* config/arm/arm_mve.h (__arm_vaddq_m_n_s8): Change types.
>> 	(__arm_vaddq_m_n_s32): Likewise.
>> 	(__arm_vaddq_m_n_s16): Likewise.
>> 	(__arm_vaddq_m_n_u8): Likewise.
>> 	(__arm_vaddq_m_n_u32): Likewise.
>> 	(__arm_vaddq_m_n_u16): Likewise.
>> 	(__arm_vaddq_m): Fix Overloading.
>> 	(__ARM_mve_coerce3): New.
>
> Ok. Wasn't there a PR in Bugzilla about this that we can cite in the commit message?
> Thanks,
> Kyrill

Thanks for the review!
Ah yes, there was this one:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96795
which was closed last time around. It does make sense to add it, though,
so we'll do that. Thanks!

>
>> ---
>>  gcc/config/arm/arm_mve.h | 78 ++++++++++++++++++++--------------------
>>  1 file changed, 40 insertions(+), 38 deletions(-)
>>
>> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
>> index 684f997520f..951dc25374b 100644
>> --- a/gcc/config/arm/arm_mve.h
>> +++ b/gcc/config/arm/arm_mve.h
>> @@ -9675,42 +9675,42 @@ __arm_vabdq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pr
>>
>>  __extension__ extern __inline int8x16_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p)
>>  {
>>    return __builtin_mve_vaddq_m_n_sv16qi (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline int32x4_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p)
>>  {
>>    return __builtin_mve_vaddq_m_n_sv4si (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline int16x8_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p)
>>  {
>>    return __builtin_mve_vaddq_m_n_sv8hi (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline uint8x16_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8_t __b, mve_pred16_t __p)
>>  {
>>    return __builtin_mve_vaddq_m_n_uv16qi (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline uint32x4_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32_t __b, mve_pred16_t __p)
>>  {
>>    return __builtin_mve_vaddq_m_n_uv4si (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline uint16x8_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16_t __b, mve_pred16_t __p)
>>  {
>>    return __builtin_mve_vaddq_m_n_uv8hi (__inactive, __a, __b, __p);
>>  }
>> @@ -26417,42 +26417,42 @@ __arm_vabdq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16
>>
>>  __extension__ extern __inline int8x16_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p)
>>  {
>>    return __arm_vaddq_m_n_s8 (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline int32x4_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p)
>>  {
>>    return __arm_vaddq_m_n_s32 (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline int16x8_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p)
>>  {
>>    return __arm_vaddq_m_n_s16 (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline uint8x16_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m (uint8x16_t __inactive, uint8x16_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m (uint8x16_t __inactive, uint8x16_t __a, uint8_t __b, mve_pred16_t __p)
>>  {
>>    return __arm_vaddq_m_n_u8 (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline uint32x4_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m (uint32x4_t __inactive, uint32x4_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m (uint32x4_t __inactive, uint32x4_t __a, uint32_t __b, mve_pred16_t __p)
>>  {
>>    return __arm_vaddq_m_n_u32 (__inactive, __a, __b, __p);
>>  }
>>
>>  __extension__ extern __inline uint16x8_t
>>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>> -__arm_vaddq_m (uint16x8_t __inactive, uint16x8_t __a, int __b, mve_pred16_t __p)
>> +__arm_vaddq_m (uint16x8_t __inactive, uint16x8_t __a, uint16_t __b, mve_pred16_t __p)
>>  {
>>    return __arm_vaddq_m_n_u16 (__inactive, __a, __b, __p);
>>  }
>> @@ -35657,6 +35657,8 @@ extern void *__ARM_undef;
>>      _Generic(param, type: param, const type: param, default: *(type *)__ARM_undef)
>>  #define __ARM_mve_coerce2(param, type) \
>>      _Generic(param, type: param, float16_t: param, float32_t: param, default: *(type *)__ARM_undef)
>> +#define __ARM_mve_coerce3(param, type) \
>> +    _Generic(param, type: param, int8_t: param, int16_t: param, int32_t: param, int64_t: param, uint8_t: param, uint16_t: param, uint32_t: param, uint64_t: param, default: *(type *)__ARM_undef)
>>
>>  #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */
>>
>> @@ -35871,14 +35873,14 @@ extern void *__ARM_undef;
>>    int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vaddq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \
>>    int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vaddq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \
>>    int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vaddq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \
>> -  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vaddq_f16 (__ARM_mve_coerce(p0, float16x8_t), __ARM_mve_coerce(p1, float16x8_t)), \
>> -  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vaddq_f32 (__ARM_mve_coerce(p0, float32x4_t), __ARM_mve_coerce(p1, float32x4_t)), \
>> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int)), \
>> +  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vaddq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \
>> +  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vaddq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)), \
>> +  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce3(p1, int)), \
>>    int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_fp_n]: __arm_vaddq_n_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce2(__p1, double)), \
>>    int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_fp_n]: __arm_vaddq_n_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce2(__p1, double)));})
>>
>> @@ -37316,12 +37318,12 @@ extern void *__ARM_undef;
>>    int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vaddq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \
>>    int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vaddq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
>>    int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vaddq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3), \
>> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, int), p3), \
>> +  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce3(p2, int), p3), \
>>    int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_fp_n]: __arm_vaddq_m_n_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce2(__p2, double), p3), \
>>    int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_fp_n]: __arm_vaddq_m_n_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce2(__p2, double), p3));})
>>
>> @@ -38820,12 +38822,12 @@ extern void *__ARM_undef;
>>    int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vaddq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \
>>    int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vaddq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \
>>    int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vaddq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \
>> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int)), \
>> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int)));})
>> +  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce3(p1, int)), \
>> +  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce3(p1, int)));})
>>
>>  #define __arm_vandq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
>>    __typeof(p1) __p1 = (p1); \
>> @@ -39641,12 +39643,12 @@ extern void *__ARM_undef;
>>    __typeof(p1) __p1 = (p1); \
>>    __typeof(p2) __p2 = (p2); \
>>    _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
>> -  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, int), p3), \
>> -  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, int), p3), \
>> +  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce3(p2, int), p3), \
>> +  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vaddq_m_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce3(p2, int), p3), \
>>    int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vaddq_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \
>>    int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vaddq_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \
>>    int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vaddq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \
>> --
>> 2.25.1
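
For readers following the _Generic discussion in the commit message, here is a
minimal host-side sketch of the failure mode and of the fix. The macro and
function names (ACCEPTS_INT_ONLY, ACCEPTS_WITH_NARROW_TYPES,
check_coercion_examples) are hypothetical and are not part of arm_mve.h; the
fallback is modelled as the value 0, whereas the header's default branch
evaluates `*(type *)__ARM_undef`. The sketch lists plain C types instead of
the <stdint.h> names because on some hosts int32_t is a typedef for int, and
a _Generic association list may not name two compatible types.

```c
#include <assert.h>

/* Hypothetical stand-ins for the header's coercion macros. Each yields 1
   when the argument's type has its own association and 0 when selection
   falls through to `default`. */
#define ACCEPTS_INT_ONLY(param) \
  _Generic((param), int: 1, default: 0)

#define ACCEPTS_WITH_NARROW_TYPES(param) \
  _Generic((param), int: 1, signed char: 1, short: 1, \
           unsigned char: 1, unsigned short: 1, default: 0)

/* _Generic matches on the type of its operand without applying integer
   promotions, so a `signed char` variable does not match `int`. */
static void check_coercion_examples(void)
{
  int wide = 5;
  signed char narrow = 5;

  assert(ACCEPTS_INT_ONLY(wide) == 1);
  assert(ACCEPTS_INT_ONLY(narrow) == 0);          /* falls to default */
  assert(ACCEPTS_WITH_NARROW_TYPES(wide) == 1);
  assert(ACCEPTS_WITH_NARROW_TYPES(narrow) == 1); /* explicitly listed */
}
```

Calling check_coercion_examples() from a test driver exercises all four
cases. The second assertion corresponds to the regression described above,
and the fourth to the behaviour the new mapping is meant to restore.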