Date: Wed, 24 Jun 2020 08:22:22 +0200
From: Paul Zimmermann
To: Paul E Murphy
CC: libc-alpha@sourceware.org
In-reply-to: (message from Paul E Murphy on Mon, 22 Jun 2020 08:59:08 -0500)
Subject: Re: faster expf128

Dear Paul,

thank you for your feedback.

> From: Paul E Murphy
> Date: Mon, 22 Jun 2020 08:59:08 -0500
>
> On 6/22/20 6:02 AM, Paul Zimmermann wrote:
> > I have written some expf128 for x86_64 that is more than 10 times faster than
> > the current glibc/libquadmath code [1] (see slide 21 of [2]).
>
> I would highly recommend running the benchmarks against ppc64le or s390x
> before replacing the existing implementation.
> I think it would improve
> the code to have more explicit separation between implementations
> optimized for soft and hardfp if performance cannot be rectified. I
> think much of the float128 support assumes the underlying machine does
> not natively support binary128.

I forgot to say that my code is intended mainly for machines that do not
provide hardware float128 support. However, I did compare it with the glibc
expf128 on gcc135.fsffrance.org (ppc64le GNU/Linux), and the results are below.
You can reproduce them with the code from [1]. We see that my implementation
is about 27% faster (0.143s instead of 0.195s), but slightly less accurate
(999585 instead of 999999 correct roundings out of 1000000). One caveat
though: I did not find how to efficiently set the inexact flag, thus it is
not set in my code.

glibc function (with hardware float128):

[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DUSE_GLIBC -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ ./a.out
GNU libc version: 2.28
GNU libc release: stable
correct roundings: 999999/1000000
max err=1 ulp(s)
maximal error for x=-4.2166924211009987727735597908208042e+00
y=1.47473419221889191873789731438093288e-02
z=1.47473419221889191873789731438093303e-02
[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DTIMINGS -DUSE_GLIBC -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ time ./a.out
GNU libc version: 2.28
GNU libc release: stable
s=1.09651217175878924483994909720534935e+09

real 0m0.195s
user 0m0.194s
sys 0m0.000s

my implementation:

[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ ./a.out
correct roundings: 999585/1000000
max err=1 ulp(s)
maximal error for x=-9.88703896394271837099996910948152675e+00
y=5.08292305698879224291515174794000669e-05
z=5.08292305698879224291515174794000728e-05
[zimmerma@gcc135 ~]$ /opt/at12.0/bin/gcc -DTIMINGS -DNO_WARN_X86_INTRINSICS -O3 main.c expf128.c -lm -lmpfr -lgmp
[zimmerma@gcc135 ~]$ time ./a.out
s=1.09651217175878924483994909720534935e+09

real 0m0.143s
user 0m0.142s
sys 0m0.000s

> > Before making a proper patch for glibc, I'd like to make sure it fits the
> > glibc requirements. In particular, the table size is 16kb. Is that ok?
> > If too large, what table size would be ok?
>
> I think that is acceptable. The current tables for expf128 probably
> aren't much smaller, if I recall correctly.

OK, then I will prepare a patch once glibc 2.32 is out.

Best regards,
Paul

[1] https://homepages.loria.fr/PZimmermann/glibc-contrib/
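
As a check on the figures quoted in this message, the following C sketch recomputes the relative time saving and the share of correctly rounded results from the numbers in the transcripts above. The helper names time_reduction_percent and correct_rounding_percent are hypothetical; they are not part of main.c or expf128.c from [1].

```c
#include <assert.h>

/* Hypothetical helper (not from [1]): percentage of run time saved
   when the run time drops from t_ref seconds to t_new seconds. */
static double time_reduction_percent(double t_ref, double t_new)
{
    assert(t_ref > 0.0);
    return 100.0 * (t_ref - t_new) / t_ref;
}

/* Hypothetical helper (not from [1]): percentage of inputs for which
   the result was correctly rounded. */
static double correct_rounding_percent(long correct, long total)
{
    assert(total > 0);
    return 100.0 * (double) correct / (double) total;
}
```

With the "real" times above, time_reduction_percent(0.195, 0.143) is roughly 26.7, which matches the "about 27%" figure, and correct_rounding_percent(999585, 1000000) is 99.9585, against 99.9999 for the glibc function.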