From mboxrd@z Thu Jan 1 00:00:00 1970
From: Brian Inglis
Subject: Re: incorrectly rounded square root
Reply-To: newlib@sourceware.org
To: newlib@sourceware.org
References: <2f8796f4-f164-5734-16ca-9a392e788beb@gmail.com>
Organization: Systematic Software
Message-ID: <0a785f1d-4a6f-f880-a60a-05c68948f932@SystematicSw.ab.ca>
Date: Sat, 5 Jun 2021 07:25:27 -0600
List-Id: Newlib mailing list

Great catch, analysis, and fix!
Now sqrtf rounds correctly on Cygwin!

$ ./test-sqrtf-round
Direction CW MX     Input Hex          Input Decimal                            Sqrt Hex          Sqrt Decimal
RNDN 0 0 37f 1f80:  0x1.ff07fe00p+127  339638501828070541185766401939693633536  0x1.ff83f000p+63  18429283829060468736
RNDD 1 1 77f 3f80:  0x1.ff07fe00p+127  339638501828070541185766401939693633536  0x1.ff83ee00p+63  18429282729548840960
RNDU 2 2 b7f 5f80:  0x1.ff07fe00p+127  339638501828070541185766401939693633536  0x1.ff83f000p+63  18429283829060468736
RNDZ 3 3 f7f 7f80:  0x1.ff07fe00p+127  339638501828070541185766401939693633536  0x1.ff83ee00p+63  18429282729548840960

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.

On 2021-06-04 12:44, Jeff Johnston wrote:
> Ok, I now know exactly what is happening.
>
> The compiler is optimizing out the rounding check in ef_sqrt.c, probably
> due to the operation using two constants.
>
> 86 ix += (m <<23);
> (gdb) list
> 81 else
> 82 q += (q&1);
>
> When I debug, it always does the else at line 81 without performing the
> one-tiny operation. The difference in the mxcsr
> register is the PE bit which I believe gets set when you do the one-tiny
> operation. Since we aren't doing it, it never gets
> set on and the difference of 0x20 in the mxcsr register is explained.
>
> By making the constants volatile, I am able to get the code working
> as it should. I have pushed a patch for this.
>
> On Fri, Jun 4, 2021 at 3:14 AM Paul Zimmermann wrote:
>>> I figured the values were off when I had to hard-code them in my own
>>> test_sqrt.c but forgot to include that info in my note.
>>>
>>> Now, that said, using the code I attached earlier, I am seeing
>>> the exact values you are quoting above for glibc for the mxcsr
>>> register and the round is working. Have you tried running that
>>> code?
>>
>> yes it works as expected, but it doesn't work with Newlib's fenv.h and
>> libm.a (see below).
>>
>>> The mxcsr values you are seeing that are different are not due to
>>> the fesetround code. The code is shifting the round value 13
>>> bits and for 3, that ends up being 0x6000. It is masking mxcsr
>>> with 0xffff9fff first so when you start with 0x1fxx and end up
>>> with 0x7fxx, the code is doing what it is supposed to do.
>>> The difference in values above is 0x20 (e.g. 0x7fa0 vs 0x7f80)
>>> which is a bit in the last 2 hex digits which isn't touched by
>>> the code logic.
>>
>> here is how to reproduce the issue:
>>
>> tar xf newlib-4.1.0.tar.gz
>> cd newlib-4.1.0
>> mkdir build
>> cd build
>> ../configure --prefix=/tmp --disable-multilib --target=x86_64
>> make -j4
>> make install
>>
>> $ cat test_sqrt_2.c
>> #include <stdio.h>
>> #include <math.h>
>> #include <fenv.h>
>>
>> #ifdef NEWLIB
>> /* RedHat's libm claims:
>>    undefined reference to `__errno' in j1f/y1f */
>> int errno;
>> int* __errno () { return &errno; }
>> #endif
>>
>> int main()
>> {
>>   int rnd[4] = { FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD };
>>   char Rnd[4] = "NZUD";
>>   float x = 0x1.ff07fep+127f;
>>   float y;
>>   for (int i = 0; i < 4; i++)
>>     {
>>       unsigned short cw;
>>       unsigned int mxcsr = 0;
>>       fesetround (rnd[i]);
>>       __asm__ volatile ("fnstcw %0" : "=m" (cw) : );
>>       __asm__ volatile ("stmxcsr %0" : "=m" (mxcsr) : );
>>       y = sqrtf (x);
>>       printf ("RND%c: %a cw=%u mxcsr=%u\n", Rnd[i], y, cw, mxcsr);
>>     }
>> }
>>
>> With GNU libc:
>>
>> $ gcc -fno-builtin test_sqrt_2.c -lm
>> $ ./a.out
>> RNDN: 0x1.ff83fp+63 cw=895 mxcsr=8064
>> RNDZ: 0x1.ff83eep+63 cw=3967 mxcsr=32672
>> RNDU: 0x1.ff83fp+63 cw=2943 mxcsr=24480
>> RNDD: 0x1.ff83eep+63 cw=1919 mxcsr=16288
>>
>> With Newlib:
>>
>> $ gcc -I/tmp/x86_64/include -DNEWLIB -fno-builtin test_sqrt_2.c /tmp/libm.a
>> $ ./a.out
>> RNDN: 0x1.ff83fp+63 cw=895 mxcsr=8064
>> RNDZ: 0x1.ff83fp+63 cw=3967 mxcsr=32640
>> RNDU: 0x1.ff83fp+63 cw=2943 mxcsr=24448
>> RNDD: 0x1.ff83fp+63 cw=1919 mxcsr=16256
>>
>> Can you reproduce that on x86_64 Linux?