From: Kirill Okhotnikov
To: libc-help@sourceware.org
Subject: New libm fmod function for 64 bit targets.
Message-ID: <5928e610-e5fc-d1df-44d3-0d3a36470917@gmail.com>
Date: Tue, 17 Nov 2020 22:35:43 +0100

Dear Developers,

Thank you very much for the project. I would like to contribute a new
fmod implementation for 64-bit targets. The relevant TODO item is:

https://sourceware.org/glibc/wiki/Development_Todo/Master#libm_itself

> The C implementations of fmod functions appear suboptimal: they could
> use __builtin_clz operations for subnormals instead of a loop, where
> the processor has efficient clz support, and the repeated-subtraction
> loop may be less efficient than approaches using integer mod (where
> there is hardware support for that) and repeated squaring, where the
> exponents are widely different. (Before doing much on this, check what
> processors would actually benefit.)

During my "research" I found that the 32-bit fmod algorithm written by
Sun is efficient (*), but its 64-bit adaptation is not. For 64-bit
systems I propose a new algorithm that uses integer mod. See the
description here:

https://github.com/orex/test_fmod/blob/master/libm-file/e_fmod.c

My tests on x86_64 (Intel, AMD) and ARM64 show that the new algorithm is
up to 20 times faster for "extreme" cases.
And up to two times faster in regular use of the function:

https://github.com/orex/test_fmod/blob/master/README.md

I also did some unit testing, which shows that the old and new
algorithms give bit-identical results for each of billions of different
pairs (x, y) covering a wide range of values, including normal,
subnormal, and special ones (NaN, INF, 0). The libm tests also pass.
The source code for the tests can be found here:

https://github.com/orex/test_fmod/

The test calculates a hash over all input values and the outputs of both
the libm fmod and the new fmod, so you can check that it produces the
same result on different architectures.

The patch is straightforward: replace the file
glibc/sysdeps/ieee754/dbl-64/wordsize-64/e_fmod.c with
https://raw.githubusercontent.com/orex/test_fmod/master/libm-file/e_fmod.c

My question is: what else do I need to do to get the patch applied to
the library?

Best,
Kirill.

P.S. I developed and wrote the algorithm myself, so the copyright is
fully mine. It is currently under the MIT license, but I can of course
change it to any license needed.

(*) The subnormal loops could of course be replaced with __builtin_clz,
but such cases look quite rare in real calculations.