From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wilco Dijkstra
To: Adhemerval Zanella, "libc-alpha@sourceware.org", "H . J . Lu"
CC: kirill
Subject: Re: [PATCH 4/4] math: Improve fmodf
Date: Tue, 14 Mar 2023 16:42:37 +0000
References: <20230310175900.2388957-1-adhemerval.zanella@linaro.org> <20230310175900.2388957-5-adhemerval.zanella@linaro.org>
In-Reply-To: <20230310175900.2388957-5-adhemerval.zanella@linaro.org>

Hi Adhemerval,

This looks good overall.
I guess we still need error handling and the wrapper
disabled since it spends 1/3 of the time in the useless wrapper but that can be
a separate patch.

A few comments below. There are a few things that look wrong, and I think most
of the helper functions just add extra complexity and branches for no gain:


+  uint32_t sx = hx & SIGN_MASK;
+  /* Get |x| and |y|.  */
+  hx ^= sx;
+  hy &= ~SIGN_MASK;
+
+  /* Special cases:
+     - If x or y is a Nan, NaN is returned.
+     - If x is an inifinity, a NaN is returned.
+     - If y is zero, Nan is returned.
+     - If x is +0/-0, and y is not zero, +0/-0 is returned.  */
+  if (__glibc_unlikely (hy == 0        || hx >= EXPONENT_MASK || hy > EXPONENT_MASK))
+    return (x * y) / (x * y);
+
+  if (__glibc_unlikely (hx <= hy))
+    {
+      if (hx < hy)
+       return x;
+      return sx ? -0.0 : 0.0;

This should be return asfloat (sx);

+    }
+
+  int ex = get_unbiased_exponent (hx);
+  int ey = get_unbiased_exponent (hy);

Should be hx >> MANTISSA_WIDTH since we cleared the sign bits (now we don't
need to add get_unbiased_exponent).

+  /* Common case where exponents are close: ey >= -103 and |x/y| < 2^8,  */
+  if (__glibc_likely (ey > MANTISSA_WIDTH && ex - ey <= EXPONENT_WIDTH))
+    {
+      uint32_t mx = get_explicit_mantissa (hx);
+      uint32_t my = get_explicit_mantissa (hy);

Note this is equivalent to:

mx = (hx & MANTISSA_MASK) | (MANTISSA_MASK + 1);

So we don't need get_explicit_mantissa (or we could change it to do the above).
If we do this before the if statement, we don't need to repeat it below.

+      uint32_t d = (ex == ey) ? (mx - my) : (mx << (ex - ey)) % my;
+      if (d == 0)
+       return 0.0;

Looks like a bug, should be asfloat (sx)?

+      return make_float (d, ey - 1, sx);
+    }
+
+  /* Special case, both x and y are subnormal.  */
+  if (__glibc_unlikely (ex == 0 && ey == 0))
+    return asfloat (hx % hy);

Similarly, shouldn't this be asfloat (sx | (hx % hy))?

+  /* Convert |x| and |y| to 'mx + 2^ex' and 'my + 2^ey'.  Assume that hx is
+     not subnormal by conditions above.  */
+  uint32_t mx = get_explicit_mantissa (hx);
+  ex--;
+
+  uint32_t my = get_explicit_mantissa (hy);

If we set mx/my above then this isn't needed.

+  int lead_zeros_my = EXPONENT_WIDTH;
+  if (__glibc_likely (ey > 0))
+    ey--;
+  else
+    {
+      my = get_mantissa (hy);

This is really my = hy; since we know hy is positive denormal.

+      lead_zeros_my = __builtin_clz (my);
+    }
+
+  int tail_zeros_my = __builtin_ctz (my);
+  int sides_zeroes = lead_zeros_my + tail_zeros_my;
+  int exp_diff = ex - ey;
+
+  int right_shift = exp_diff < tail_zeros_my ? exp_diff : tail_zeros_my;
+  my >>= right_shift;
+  exp_diff -= right_shift;
+  ey += right_shift;
+
+  int left_shift = exp_diff < EXPONENT_WIDTH ? exp_diff : EXPONENT_WIDTH;
+  mx <<= left_shift;
+  exp_diff -= left_shift;
+
+  mx %= my;
+
+  if (__glibc_unlikely (mx == 0))
+    return sx ? -0.0 : 0.0;

Should be asfloat (sx);

+  if (exp_diff == 0)
+    return make_float (my, ey, sx);
+
+  /* Assume modulo/divide operation is slow, so use multiplication with invert
+     values.  */
+  uint32_t inv_hy = UINT32_MAX / my;
+  while (exp_diff > sides_zeroes) {
+    exp_diff -= sides_zeroes;
+    uint32_t hd = (mx * inv_hy) >> (BIT_WIDTH - sides_zeroes);
+    mx <<= sides_zeroes;
+    mx -= hd * my;
+    while (__glibc_unlikely (mx > my))
+      mx -= my;
+  }
+  uint32_t hd = (mx * inv_hy) >> (BIT_WIDTH - exp_diff);
+  mx <<= exp_diff;
+  mx -= hd * my;
+  while (__glibc_unlikely (mx > my))
+    mx -= my;
+
+  return make_float (mx, ey, sx);
 }
 libm_alias_finite (__ieee754_fmodf, __fmodf)
diff --git a/sysdeps/ieee754/flt-32/math_config.h b/sysdeps/ieee754/flt-32/math_config.h
index 23045f59d6..cdab3a36ef 100644
--- a/sysdeps/ieee754/flt-32/math_config.h
+++ b/sysdeps/ieee754/flt-32/math_config.h
@@ -110,6 +110,95 @@ issignalingf_inline (float x)
   return 2 * (ix ^ 0x00400000) > 2 * 0x7fc00000UL;
 }
 
+#define BIT_WIDTH       32
+#define MANTISSA_WIDTH  23
+#define EXPONENT_WIDTH  8
+#define MANTISSA_MASK   0x007fffff
+#define EXPONENT_MASK   0x7f800000
+#define EXP_MANT_MASK   0x7fffffff
+#define QUIET_NAN_MASK  0x00400000
+#define SIGN_MASK       0x80000000
+
+static inline bool
+is_nan (uint32_t x)
+{
+  return (x & EXP_MANT_MASK) > EXPONENT_MASK;
+}
+
+static inline bool
+is_quiet_nan (uint32_t x)
+{
+   return (x & EXP_MANT_MASK) == (EXPONENT_MASK | QUIET_NAN_MASK);
+}
+
+static inline bool
+is_inf_or_nan (uint32_t x)
+{
+  return (x & EXPONENT_MASK) == EXPONENT_MASK;
+}
+
+static inline uint16_t
+get_unbiased_exponent (uint32_t x)
+{
+  return (x & EXPONENT_MASK) >> MANTISSA_WIDTH;
+}
+
+/* Return mantissa with the implicit bit set iff X is a normal number.  */
+static inline uint32_t
+get_explicit_mantissa (uint32_t x)
+{
+  uint32_t p1 = (get_unbiased_exponent (x) > 0 && !is_inf_or_nan (x)
+    ? (MANTISSA_MASK + 1) : 0);
+  uint32_t p2 = (x & MANTISSA_MASK);
+  return p1 | p2;
+}

I don't think we need this (or anything it calls).

+static inline uint32_t
+set_mantissa (uint32_t x, uint32_t m)
+{
+  m &= MANTISSA_MASK;
+  x &= ~(MANTISSA_MASK);
+  return x |= m;
+}
+
+static inline uint32_t
+get_mantissa (uint32_t x)
+{
+  return x & MANTISSA_MASK;
+}
+
+static inline uint32_t
+set_unbiased_exponent (uint32_t x, uint32_t e)
+{
+  e = (e << MANTISSA_WIDTH) & EXPONENT_MASK;
+  x &= ~(EXPONENT_MASK);
+  return x |= e;
+}
+
+/* Convert integer number X, unbiased exponent EP, and sign S to double:
+
+   result = X * 2^(EP+1 - exponent_bias)
+
+   NB: zero is not supported.  */
+static inline double
+make_float (uint32_t x, int ep, uint32_t s)
+{
+  int lz = __builtin_clz (x) - EXPONENT_WIDTH;
+  x <<= lz;
+  ep -= lz;
+
+  uint32_t r = 0;
+  if (__glibc_likely (ep >= 0))
+    {
+      r = set_mantissa (r, x);
+      r = set_unbiased_exponent (r, ep + 1);
+    }
+  else
+    r = set_mantissa (r, x >> -ep);
+
+  return asfloat (r | s);
+}

This is overly complex; make_float is as trivial as:

static inline double
make_float (uint32_t x, int ep, uint32_t s)
{
  int lz = __builtin_clz (x) - EXPONENT_WIDTH;
  x <<= lz;
  ep -= lz;

  if (__glibc_unlikely (ep < 0))
    {
      x >>= -ep;
      ep = 0;
    }
  return asfloat (s + x + (ep << MANTISSA_WIDTH));
}

And now you don't need set_unbiased_exponent and set_mantissa either.

Cheers,
Wilco
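
For anyone who wants to try the simplified make_float outside the glibc tree, here is a minimal standalone sketch. It is not the committed code: asfloat and asuint are local stand-ins for the math_config.h helpers, the return type is float rather than double, and the assert encodes assumptions of this sketch only (0 < x < 2^24 and ep >= 0 on entry, so that every shift count stays in range). It also assumes a GCC or Clang compiler for __builtin_clz and IEEE binary32 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MANTISSA_WIDTH  23
#define EXPONENT_WIDTH  8
#define SIGN_MASK       0x80000000u

/* Local stand-ins for asfloat/asuint from math_config.h.  */
static inline float
asfloat (uint32_t i)
{
  float f;
  memcpy (&f, &i, sizeof f);
  return f;
}

static inline uint32_t
asuint (float f)
{
  uint32_t i;
  memcpy (&i, &f, sizeof i);
  return i;
}

/* Build a float from the integer significand X, exponent EP and sign
   bit S.  Assumes 0 < x < 2^24 and ep >= 0 on entry (assumptions of
   this sketch, checked by the assert).  */
static inline float
make_float (uint32_t x, int ep, uint32_t s)
{
  assert (x != 0 && x < (1u << (MANTISSA_WIDTH + 1)) && ep >= 0);

  /* Normalise so that bit 23 (the implicit bit) is set.  */
  int lz = __builtin_clz (x) - EXPONENT_WIDTH;
  x <<= lz;
  ep -= lz;

  if (ep < 0)
    {
      /* Subnormal result: shift the significand down and use a zero
         exponent field.  */
      x >>= -ep;
      ep = 0;
    }

  /* For a normalised x, bit 23 carries into the exponent field, so the
     stored exponent becomes ep + 1.  */
  return asfloat (s + x + ((uint32_t) ep << MANTISSA_WIDTH));
}
```

With these definitions make_float (0x800000, 126, 0) yields 1.0f and make_float (1, 0, 0) yields the smallest subnormal (bit pattern 0x00000001). If the shift in the subnormal branch does not store its result, the second call returns 0x00800000 instead. Likewise asfloat (SIGN_MASK) is -0.0f, which is why returning asfloat (sx) preserves the sign of x where a plain 0.0 would not.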