From: Wilco Dijkstra
To: Adhemerval Zanella Netto, "H.J. Lu"
CC: "libc-alpha@sourceware.org", kirill
Subject: Re: [PATCH v2 3/5] math: Improve fmod
Date: Thu, 16 Mar 2023 16:13:47 +0000
References: <20230315205910.4120377-1-adhemerval.zanella@linaro.org> <20230315205910.4120377-4-adhemerval.zanella@linaro.org>

Hi,

It's these cases where x87 is still faster than the generic version:

> E5-2640          | close-exponents | 39.298   | 22.2742
>
> i7-4510U         | close-exponents | 29.463   | 22.8572

Are these mostly x < y or cases where the exponent difference is just over 11
and thus we do not use the fast path?

> I am also checking a algorithm change to use simple loop for the normal inputs,
> where integer modulo operation is used instead of inverse multiplication.

Adding another fast path for a wider range of exponent difference could be
faster than the generic modulo loop. This could do 2 modulo steps and maybe
handle tail zeroes (which I think is what HJ's testcase will benefit from).

For really large exponent differences, the generic modulo code could process
30 or 60 bits per iteration instead of just 11. It's more complex (so would be
a separate patch) but it should help CPUs with relatively high latency
multipliers.

Cheers,
Wilco
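
For readers following the thread, here is a minimal sketch of a modulo loop that consumes 11 exponent bits per iteration. It is not the code from the patch: `mod_shift`, `MANT_BITS` and `STEP_BITS` are hypothetical names, and the function works on plain integer mantissas rather than on unpacked doubles. The assumption behind the figure 11 is that a double's significand fits in 53 bits, so a remainder below 2^53 can be shifted left by 64 - 53 = 11 bits without overflowing a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, not taken from glibc.  */
#define MANT_BITS 53u                 /* significand width of a double */
#define STEP_BITS (64u - MANT_BITS)   /* 11 bits fit per 64-bit step */

/* Hypothetical helper: returns (mx * 2^n) mod my, where my is an integer
   mantissa below 2^53 and n is the exponent difference.  */
static uint64_t
mod_shift (uint64_t mx, uint64_t my, unsigned n)
{
  assert (my != 0 && my < (UINT64_C (1) << MANT_BITS));

  uint64_t r = mx % my;               /* r < my < 2^53 from here on */
  while (n > 0)
    {
      /* Take up to STEP_BITS of the remaining exponent difference.  */
      unsigned s = n < STEP_BITS ? n : STEP_BITS;
      r = (r << s) % my;              /* r << s < 2^64, so no overflow */
      n -= s;
    }
  return r;
}
```

For example, `mod_shift (7, 5, 25)` takes three iterations (11, 11 and 3 bits) and returns 4, the same as `(7 << 25) % 5`. Processing 30 or 60 bits per iteration, as suggested above, would need a wider intermediate or a reciprocal-based reduction, neither of which is shown here.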