From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 5 Jul 2023 17:46:53 +0100
From: Szabolcs Nagy
To: Joe Ramsay
Subject: Re: [PATCH v4 2/4] aarch64: Add vector implementations of sin routines
References: <20230628111939.48140-1-Joe.Ramsay@arm.com> <20230628111939.48140-2-Joe.Ramsay@arm.com>
In-Reply-To: <20230628111939.48140-2-Joe.Ramsay@arm.com>
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
MIME-Version: 1.0

The 06/28/2023 12:19, Joe Ramsay via Libc-alpha wrote:
> +++ b/sysdeps/aarch64/fpu/sin_advsimd.c
> @@ -0,0 +1,106 @@
> +/* Double-precision vector (Advanced SIMD) sin function.
...
> +float64x2_t VPCS_ATTR V_NAME_D1 (sin) (float64x2_t x)
> +{
> +  const struct data *d = ptr_barrier (&data);
> +  float64x2_t n, r, r2, r3, r4, y, t1, t2, t3;
> +  uint64x2_t odd, cmp, eqz;
> +
> +#if WANT_SIMD_EXCEPT
> +  /* Detect |x| <= TinyBound or |x| >= RangeVal. If fenv exceptions are to be
> +     triggered correctly, set any special lanes to 1 (which is neutral w.r.t.
> +     fenv). These lanes will be fixed by special-case handler later. */
> +  uint64x2_t ir = vreinterpretq_u64_f64 (vabsq_f64 (x));
> +  cmp = vcgeq_u64 (vsubq_u64 (ir, TinyBound), Thresh);
> +  r = vbslq_f64 (cmp, vreinterpretq_f64_u64 (cmp), x);
> +#else
> +  r = x;
> +  cmp = vcageq_f64 (d->range_val, x);
> +  cmp = vceqzq_u64 (cmp); /* cmp = ~cmp. */
> +#endif
> +  eqz = vceqzq_f64 (x);
> +
> +  /* n = rint(|x|/pi). */
> +  n = vfmaq_f64 (d->shift, d->inv_pi, r);
> +  odd = vshlq_n_u64 (vreinterpretq_u64_f64 (n), 63);
> +  n = vsubq_f64 (n, d->shift);
> +
> +  /* r = |x| - n*pi (range reduction into -pi/2 .. pi/2). */
> +  r = vfmsq_f64 (r, d->pi_1, n);
> +  r = vfmsq_f64 (r, d->pi_2, n);
> +  r = vfmsq_f64 (r, d->pi_3, n);
> +
> +  /* sin(r) poly approx. */
> +  r2 = vmulq_f64 (r, r);
> +  r3 = vmulq_f64 (r2, r);
> +  r4 = vmulq_f64 (r2, r2);
> +
> +  t1 = vfmaq_f64 (C (4), C (5), r2);
> +  t2 = vfmaq_f64 (C (2), C (3), r2);
> +  t3 = vfmaq_f64 (C (0), C (1), r2);
> +
> +  y = vfmaq_f64 (t1, C (6), r4);
> +  y = vfmaq_f64 (t2, y, r4);
> +  y = vfmaq_f64 (t3, y, r4);
> +  y = vfmaq_f64 (r, y, r3);
> +
> +  /* Sign of 0 is discarded by polynomial, so copy it back here. */
> +  if (__glibc_unlikely (v_any_u64 (eqz)))
> +    y = vbslq_f64 (eqz, x, y);

This check handles the sin(-0.0) case. But -ffast-math can already
break the sign of 0, and libmvec symbols are supposed to be used with
-ffast-math (at least via the gcc auto-vectorizer).

So do we need to provide quality guarantees beyond -ffast-math in
libmvec? Or can we ignore the math tests that check the sign of 0 and
change the code accordingly?

The wiki does not seem to cover this under "C99 compliance":
https://sourceware.org/glibc/wiki/libmvec

> +
> +  if (__glibc_unlikely (v_any_u64 (cmp)))
> +    return special_case (x, y, odd, cmp);
> +  return vreinterpretq_f64_u64 (veorq_u64 (vreinterpretq_u64_f64 (y), odd));
> +}
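
For reference, the sign-of-zero behaviour discussed above can be reproduced with a scalar model of the last polynomial step. This is only a sketch: `c0` is a hypothetical stand-in for the patch's C (0) coefficient (the real table is not quoted in this mail), the helper names are made up for illustration, and it assumes the default round-to-nearest mode and a build without -ffast-math.

```c
#include <math.h>

/* Hypothetical stand-in for the leading coefficient C (0); the real value
   lives in the patch's data table, which is not quoted here.  */
static const double c0 = -1.0 / 6.0;

/* Scalar model of the final step y = r + y * r^3, truncated to a single
   coefficient.  For r = -0.0: r2 = +0.0, r3 = -0.0, c0 * r3 = +0.0, and
   -0.0 + +0.0 rounds to +0.0, so the sign of the input is lost.  */
static double
sin_poly_tail (double r)
{
  double r2 = r * r;
  double r3 = r2 * r;
  return r + c0 * r3;
}

/* Same computation followed by the scalar analogue of
   vbslq_f64 (eqz, x, y): zero inputs are passed through unchanged.  */
static double
sin_poly_tail_fixed (double x)
{
  double y = sin_poly_tail (x);
  return x == 0.0 ? x : y;
}
```

Under those assumptions, `signbit (sin_poly_tail (-0.0))` is zero while `signbit (sin_poly_tail_fixed (-0.0))` is nonzero, which is the difference the extra select in the patch pays for.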