From: Sebastian Huber
To: newlib@sourceware.org
Cc: Richard.Earnshaw@arm.com
Subject: Re: [PATCH] ARM: Optimize IEEE-754 sqrt implementation
Date: Tue, 21 Mar 2017 07:54:00 -0000
Message-ID: <58D0DC49.10404@embedded-brains.de>
In-Reply-To: <1490082540-22841-1-git-send-email-sebastian.huber@embedded-brains.de>

I built this using the ARM RTEMS multilibs:

https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/config/arm/t-rtems?view=markup

I used this test program to check the implementation:

https://git.rtems.org/rtems/tree/testsuites/samples/paranoia/paranoia.c

Output of test run on a Cortex-R5:

paranoia version 1.1 [cygnus]
Program is now RUNNING tests on small integers:
TEST: 0+0 != 0, 1-1 != 0, 1 <= 0, or 1+1 != 2
PASS: 0+0 != 0, 1-1 != 0, 1 <= 0, or 1+1 != 2
TEST: 3 != 2+1, 4 != 3+1, 4+2*(-2) != 0, or 4-3-1 != 0
PASS: 3 != 2+1, 4 != 3+1, 4+2*(-2) != 0, or 4-3-1 != 0
TEST: -1+1 != 0, (-1)+abs(1) != 0, or -1+(-1)*(-1) != 0
PASS: -1+1 != 0, (-1)+abs(1) != 0, or -1+(-1)*(-1) != 0
TEST: 1/2 + (-1) + 1/2 != 0
PASS: 1/2 + (-1) + 1/2 != 0
TEST: 9 != 3*3, 27 != 9*3, 32 != 8*4, or 32-27-4-1 != 0
PASS: 9 != 3*3, 27 != 9*3, 32 != 8*4, or 32-27-4-1 != 0
TEST: 5 != 4+1, 240/3 != 80, 240/4 != 60, or 240/5 != 48
PASS: 5 != 4+1, 240/3 != 80, 240/4 != 60, or 240/5 != 48
-1, 0, 1/2, 1, 2, 3, 4, 5, 9, 27, 32 & 240 are O.K.

Searching for Radix and Precision.
Radix = 2.000000 .
Closest relative separation found is U1 = 1.1102230e-16 .

Recalculating radix and precision confirms closest relative separation U1 .
Radix confirmed.
TEST: Radix is too big: roundoff problems
PASS: Radix is too big: roundoff problems
TEST: Radix is not as good as 2 or 10
PASS: Radix is not as good as 2 or 10
TEST: (1-U1)-1/2 < 1/2 is FALSE, prog. fails?
PASS: (1-U1)-1/2 < 1/2 is FALSE, prog. fails?
TEST: Comparison is fuzzy,X=1 but X-1/2-1/2 != 0
PASS: Comparison is fuzzy,X=1 but X-1/2-1/2 != 0
The number of significant digits of the Radix is 53.000000 .
TEST: Precision worse than 5 decimal figures
PASS: Precision worse than 5 decimal figures
TEST: Subtraction is not normalized X=Y,X+Z != Y+Z!
PASS: Subtraction is not normalized X=Y,X+Z != Y+Z!
Subtraction appears to be normalized, as it should be.
Checking for guard digit in *, /, and -.
TEST: * gets too many final digits wrong.
PASS: * gets too many final digits wrong.
TEST: Division lacks a Guard Digit, so error can exceed 1 ulp or 1/3 and 3/9 and 9/27 may disagree
PASS: Division lacks a Guard Digit, so error can exceed 1 ulp or 1/3 and 3/9 and 9/27 may disagree
TEST: Computed value of 1/1.000..1 >= 1
PASS: Computed value of 1/1.000..1 >= 1
TEST: * and/or / gets too many last digits wrong
PASS: * and/or / gets too many last digits wrong
*, /, and - appear to have guard digits, as they should.
Checking rounding on multiply, divide and add/subtract.
TEST: X * (1/X) differs from 1
PASS: X * (1/X) differs from 1
Multiplication appears to round correctly.
Division appears to round correctly.
TEST: Radix * ( 1 / Radix ) differs from 1
PASS: Radix * ( 1 / Radix ) differs from 1
TEST: Incomplete carry-propagation in Addition
PASS: Incomplete carry-propagation in Addition
Addition/Subtraction appears to round correctly.
Checking for sticky bit.
Sticky bit apparently used correctly.
TEST: lack(s) of guard digits or failure(s) to correctly round or chop (noted above) count as one flaw in the final tally below
PASS: lack(s) of guard digits or failure(s) to correctly round or chop (noted above) count as one flaw in the final tally below

Does Multiplication commute? Testing on 20 random pairs.
No failures found in 20 integer pairs.

Running test of square root(x).
TEST: Square root of 0.0, -0.0 or 1.0 wrong
PASS: Square root of 0.0, -0.0 or 1.0 wrong
Testing if sqrt(X * X) == X for 20 Integers X.
Test for sqrt monotonicity.
sqrt has passed a test for Monotonicity.
Testing whether sqrt is rounded or chopped.
Square root appears to be correctly rounded.
Testing powers Z^i for small Integers Z and i.
... no discrepancies found.

Seeking Underflow thresholds UfThold and E0.
Smallest strictly positive number found is E0 = 4.94066e-324 .
Since comparison denies Z = 0, evaluating (Z + Z) / Z should be safe.
What the machine gets for (Z + Z) / Z is 2.00000000000000000e+00 .
This is O.K., provided Over/Underflow has NOT just been signaled.
Underflow is gradual; it incurs Absolute Error =
(roundoff in UfThold) < E0.
The Underflow threshold is 2.22507385850720188e-308, below which
calculation may suffer larger Relative error than merely roundoff.
Since underflow occurs below the threshold
UfThold = (2.00000000000000000e+00) ^ (-1.02200000000000000e+03)
only underflow should afflict the expression
(2.00000000000000000e+00) ^ (-2.04400000000000000e+03);
actually calculating yields: 0.00000000000000000e+00 .
This computed value is O.K.

Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065218e+00 as X -> 1.
Accuracy seems adequate.
Testing powers Z^Q at four nearly extreme values.
... no discrepancies found.

Searching for Overflow threshold:
This may generate an error.
Can `Z = -Y' overflow?
Trying it on Y = -inf .
Seems O.K.
Overflow threshold is V = 1.79769313486231571e+308 .
Overflow saturates at V0 = inf .
No Overflow should be signaled for V * 1 = 1.79769313486231571e+308
nor for V / 1 = 1.79769313486231571e+308 .
Any overflow signal separating this * from the one
above is a DEFECT.

What message and/or values does Division by Zero produce?
Trying to compute 1 / 0 produces ... inf .
Trying to compute 0 / 0 produces ... nan .

No failures, defects nor flaws have been discovered.
Rounding appears to conform to the proposed IEEE standard P754.
The arithmetic diagnosed appears to be Excellent!

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber@embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.