From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <kirill.okhotnikov@gmail.com>
Received: from mail-lf1-x134.google.com (mail-lf1-x134.google.com
 [IPv6:2a00:1450:4864:20::134])
 by sourceware.org (Postfix) with ESMTPS id 6F5433985461
 for <libc-help@sourceware.org>; Tue, 17 Nov 2020 21:35:46 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 6F5433985461
Received: by mail-lf1-x134.google.com with SMTP id s30so32218334lfc.4
 for <libc-help@sourceware.org>; Tue, 17 Nov 2020 13:35:46 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:to:from:subject:message-id:date:user-agent
 :mime-version:content-language;
 bh=tzlt7Jn0dR1uXlkAIhYJd3eCwgMcwxqIvOutJ6vKxrA=;
 b=ohaN0K5FHtAc3kH1pIL16R2BwcDhnRAUafeB1Sew5+cALPHNx+zNN1rHnnwpojFN8/
 +j4+qXtcbx/gsKG6s5Aqo8e9NWmRwjtUOaJc7PJr+xBGr69OCEby/TAg/IyWFZu2VXb7
 r0WIQ73kYVq3EOvrL1mJ+S6O7cnHPcF9QLCzeB12HFNTXI3Dit9JKiF4h8oLXxYDJHQd
 x5MM6XK6/sOkpjPFWJkch4oWSB34NgL0MYsH0xXhxzHWSqrNsxzZQ9KwXJCUK/W+jekz
 yCG7IRvQxJXdEZrwnSVpMKHIjqx8nurBNgiHREIhA5kF8QBkdwm9jRLtxIMISVFlB45L
 DLbQ==
X-Gm-Message-State: AOAM533Sgk1lWalwXxMbEMi/AdFbOAM9a1zfJnDt52Tedpp8NatuyuWh
 XpWphW/s8XQN1IgaD/oCQ5L9tiliH53GqQ==
X-Google-Smtp-Source: ABdhPJz2VNHVWTZvYX269PDy1Dx8ORrwqHEqsmR6T96Z5gNdwAp94Y8J2CY9TX6DeLzG1ErluUMoGw==
X-Received: by 2002:a19:587:: with SMTP id 129mr2458976lff.189.1605648944830; 
 Tue, 17 Nov 2020 13:35:44 -0800 (PST)
Received: from [192.168.5.169] ([91.224.181.33])
 by smtp.googlemail.com with ESMTPSA id r25sm2968122ljj.42.2020.11.17.13.35.43
 for <libc-help@sourceware.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Tue, 17 Nov 2020 13:35:44 -0800 (PST)
To: libc-help@sourceware.org
From: Kirill Okhotnikov <kirill.okhotnikov@gmail.com>
Subject: New libm fmod function for 64 bit targets.
Message-ID: <5928e610-e5fc-d1df-44d3-0d3a36470917@gmail.com>
Date: Tue, 17 Nov 2020 22:35:43 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.10.0
MIME-Version: 1.0
Content-Language: en-GB
X-Spam-Status: No, score=0.3 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, HTML_MESSAGE,
 RCVD_IN_DNSWL_NONE, RCVD_IN_SBL_CSS, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=no autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
X-BeenThere: libc-help@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-help mailing list <libc-help.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/libc-help>,
 <mailto:libc-help-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/libc-help/>
List-Help: <mailto:libc-help-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-help>,
 <mailto:libc-help-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Tue, 17 Nov 2020 21:35:48 -0000

Dear Developers,

Thank you very much for the project. I would like to contribute a new
fmod implementation for 64-bit targets.

https://sourceware.org/glibc/wiki/Development_Todo/Master#libm_itself

> The C implementations of fmod functions appear suboptimal: they could 
> use __builtin_clz operations for subnormals instead of a loop, where 
> the processor has efficient clz support, and the repeated-subtraction 
> loop may be less efficient than approaches using integer mod (where 
> there is hardware support for that) and repeated squaring, where the 
> exponents are widely different. (Before doing much on this, check what 
> processors would actually benefit.)

During my "research" I found that the 32-bit fmod algorithm written by
Sun is efficient (*), but its 64-bit adaptation is not. For 64-bit
systems I propose a new algorithm which uses integer mod. See the
description here:

https://github.com/orex/test_fmod/blob/master/libm-file/e_fmod.c
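
To illustrate the general idea of computing fmod with integer mod, here
is a minimal sketch. It is not the code from the linked e_fmod.c: the
name fmod_sketch is hypothetical, it handles only positive finite
inputs, and it uses frexp/ldexp instead of direct bit manipulation to
keep the example short. It writes x = mx * 2^(ex-53) and
y = my * 2^(ey-53) with integer mx and my, then reduces mx * 2^(ex-ey)
modulo my in steps of at most 11 bits so the intermediate value stays
within 64 bits.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Illustrative only: assumes x > 0, y > 0, both finite. */
static double fmod_sketch(double x, double y)
{
    assert(x > 0.0 && y > 0.0 && isfinite(x) && isfinite(y));

    if (x < y)
        return x;               /* nothing to reduce */

    int ex, ey;
    /* frexp gives a fraction in [0.5, 1); scaling by 2^53 makes it an
       exact integer below 2^53. */
    uint64_t mx = (uint64_t) ldexp(frexp(x, &ex), 53);
    uint64_t my = (uint64_t) ldexp(frexp(y, &ey), 53);

    int shift = ex - ey;        /* >= 0 because x >= y */

    mx %= my;
    while (shift > 0 && mx != 0) {
        /* mx < my < 2^53, so shifting by up to 11 bits cannot overflow. */
        int step = shift < 11 ? shift : 11;
        mx = (mx << step) % my;
        shift -= step;
    }

    /* The remainder is a multiple of y's last bit, so this is exact. */
    return ldexp((double) mx, ey - 53);
}
```

Each loop iteration removes up to 11 bits of exponent difference with a
single integer division, where a shift-and-subtract loop needs one
iteration per bit.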

My tests on x86_64 (Intel, AMD) and ARM64 show that the new algorithm
is up to 20 times faster for "extreme" cases and up to two times faster
for regular use of the function.

https://github.com/orex/test_fmod/blob/master/README.md

Also, I did some unit testing which shows that the old and new
algorithms give bit-identical results for each of billions of different
pairs (x, y) covering a wide range of numbers, including normal,
subnormal, and special values (NaN, Inf, 0). The libm tests also pass.

Source code for the tests can be found here.

https://github.com/orex/test_fmod/

The tests calculate a hash over all input values and the outputs of
the libm fmod and the new fmod. Therefore you can check that both
produce the same results on different architectures.
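
As a rough sketch of this kind of check (not the code from the test
repository): the helper names below are hypothetical and FNV-1a is an
assumed choice of hash. The bit patterns of x, y and the result are
hashed, so two implementations agree on every pair exactly when they
yield the same digest, short of a hash collision.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef double (*fmod_fn)(double, double);

/* Mix one 64-bit value into an FNV-1a hash, low byte first, so the
   digest does not depend on host endianness. */
static uint64_t fnv1a_u64(uint64_t h, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 0x100000001b3ULL;      /* 64-bit FNV prime */
    }
    return h;
}

/* Raw bit pattern of a double, without aliasing problems. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Hash inputs and outputs of f over n pairs (xs[i], ys[i]). */
static uint64_t hash_results(fmod_fn f, const double *xs,
                             const double *ys, size_t n)
{
    uint64_t h = 0xcbf29ce484222325ULL;  /* 64-bit FNV offset basis */
    for (size_t i = 0; i < n; i++) {
        h = fnv1a_u64(h, bits_of(xs[i]));
        h = fnv1a_u64(h, bits_of(ys[i]));
        h = fnv1a_u64(h, bits_of(f(xs[i], ys[i])));
    }
    return h;
}
```

Calling hash_results once with the libm fmod and once with the new
function over the same arrays, then comparing the two digests, gives a
compact equality check that can also be compared across machines.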

The patch is straightforward. You need to replace the file
glibc/sysdeps/ieee754/dbl-64/wordsize-64/e_fmod.c with
https://raw.githubusercontent.com/orex/test_fmod/master/libm-file/e_fmod.c.

My question is: what else do I need to do to get the patch applied to
the library?

Best,
Kirill.

P.S. I developed and wrote the algorithm myself, so the copyright is
fully mine. It is currently under the MIT license, but I can change it
to any license needed, of course.

(*) The loops that normalize subnormals can, of course, be replaced
with __builtin_clz. But such cases look quite rare in real calculations.
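
For reference, a small sketch of what that replacement could look like.
The function names are hypothetical and this is not taken from the
existing glibc sources. It assumes a GCC/Clang-style __builtin_clzll
and a nonzero mantissa below 2^53. Both versions shift the mantissa
left until bit 52 (the implicit leading bit of a double) is set and
return the shift count.

```c
#include <stdint.h>

/* Loop version: one iteration per leading zero bit. */
static int normalize_loop(uint64_t *m)
{
    int shift = 0;
    while ((*m & (1ULL << 52)) == 0) {
        *m <<= 1;
        shift++;
    }
    return shift;
}

/* CLZ version: a 64-bit value with bit 52 as its top set bit has
   exactly 11 leading zeros, so the shift is clz - 11.
   Requires *m != 0 (the builtin is undefined for zero). */
static int normalize_clz(uint64_t *m)
{
    int shift = __builtin_clzll(*m) - 11;
    *m <<= shift;
    return shift;
}
```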