From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <stefan.kanthak@nexgo.de>
Received: from smtpout2.vodafonemail.de (smtpout2.vodafonemail.de
 [145.253.239.133])
 by sourceware.org (Postfix) with ESMTPS id CD621385503C
 for <libc-help@sourceware.org>; Sat, 21 Aug 2021 13:46:46 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CD621385503C
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=nexgo.de
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=nexgo.de
Received: from smtp.vodafone.de (smtpa06.fra-mediabeam.com [10.2.0.37])
 by smtpout2.vodafonemail.de (Postfix) with ESMTP id D2A54122183
 for <libc-help@sourceware.org>; Sat, 21 Aug 2021 15:46:45 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nexgo.de;
 s=vfde-smtpout-mb-15sep; t=1629553605;
 bh=RyY06nc2yNmytldfHrrdEj0bW1PnSKKDR8PwDmbsufA=;
 h=From:To:Subject:Date;
 b=YlnRVcNZnaU7onAgq5a6eG+8T3+hs6XgPtt2frkKnitH2WNX83/Vc5bAJ5957MdS5
 LD6BGwiFK5R6hu8/rj/TzF1fIIji+dgOaPUG8gy+eUGH5Im4lHgjez13ifP7DPOybb
 emD14QkVJfg/9vMh5WsmAmLp1g3siajCqzIhQyOc=
Received: from H270 (p5b38f1bc.dip0.t-ipconnect.de [91.56.241.188])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.vodafone.de (Postfix) with ESMTPSA id 75D3B140263
 for <libc-help@sourceware.org>; Sat, 21 Aug 2021 13:46:45 +0000 (UTC)
Message-ID: <4DD65B114A174A35AC6960DD2104BDE7@H270>
From: "Stefan Kanthak" <stefan.kanthak@nexgo.de>
To: <libc-help@sourceware.org>
Subject: Twiddling with 64-bit values as 2 ints;
Date: Sat, 21 Aug 2021 15:34:50 +0200
Organization: Me, myself & IT
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Windows Mail 6.0.6002.18197
X-MimeOLE: Produced By Microsoft MimeOLE V6.1.7601.24158
X-purgate-type: clean
X-purgate-Ad: Categorized by eleven eXpurgate (R) http://www.eleven.de
X-purgate: This mail is considered clean (visit http://www.eleven.de for
 further information)
X-purgate: clean
X-purgate-size: 3643
X-purgate-ID: 155817::1629553605-00003C24-494557C0/0/0
X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: libc-help@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-help mailing list <libc-help.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/libc-help>,
 <mailto:libc-help-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/libc-help/>
List-Help: <mailto:libc-help-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-help>,
 <mailto:libc-help-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Sat, 21 Aug 2021 13:46:57 -0000

Hi,

22 years ago, C99 introduced 64-bit integers: [un]signed long long.
IEEE 754 defined the 64-bit double-precision floating-point format,
now called binary64, in 1985.

Yet SunSoft's [fd]libm, which (to my knowledge) started around that
time, IBM's APMathLib/libultim, which followed a little later, and
quite a few ACM TOMS routines use (pairs of) 32-bit integers for
bit-twiddling on the representation of double/binary64:
additions/subtractions/shifts on the 52-bit mantissa/fraction, and
operations on the full 64-bit double, involve both ints and need to
take care of the carry/borrow -- explicitly, and in quite an ugly way!
It's also generally unknown whether a compiler will recognize this
sort of carry/borrow/overflow handling and generate proper machine
code using "add with carry"/"subtract with borrow" instructions.

JFTR: while sticking with 32-bit integers MAY give better performance
      on 32-bit processors, especially when an operation involves only
      the low or the high part, the explicit carry/borrow handling can
      have a negative performance impact.
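
To make the carry handling concrete, here is a minimal sketch of my own;
it is not taken from any of the libraries named above. The helper names
add_split and add_whole are hypothetical. The first one propagates the
carry between two 32-bit halves by hand, the second leaves it to the
compiler:

```c
#include <stdint.h>

/* Hypothetical helper: add inc to a 64-bit value that is kept as two
   32-bit halves, propagating the carry explicitly. */
static void add_split(uint32_t *hi, uint32_t *lo, uint32_t inc)
{
    uint32_t old = *lo;

    *lo += inc;
    if (*lo < old)      /* unsigned wrap-around: carry out of low half */
        *hi += 1;
}

/* Hypothetical helper: the same addition on a single 64-bit integer;
   the carry between the halves is implicit. */
static uint64_t add_whole(uint64_t x, uint32_t inc)
{
    return x + inc;
}
```

Both variants yield the same bit pattern; whether the first one is
compiled to an "add with carry" sequence depends on the compiler.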

See for example <http://www.netlib.no/netlib/toms/722>, written by
William J. Cody (known from Cody/Waite range reduction):

|    W. J. Cody, J. T. Coonen, March 30, 1992
...
|       /* Otherwise, use integer arithmetic to increment or      */
|       /* decrement least significant half of z, being careful   */
|       /* with carries and borrows involving most significant    */
|       /* half.                                                  */
|          else if (((argx < Zero) && (argx < argy)) ||
|                   ((argx > Zero) && (argx > argy))) {
|                   --lowpart(z);
|                   if (lowpart(z) == -1)
|                      --highpart(z);
|                   }
|                else {
|                   ++lowpart(z);
|                   if (lowpart(z) == 0)
|                      ++highpart(z);
|                   }
|

Compare this with the REALLY UGLY
<https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=math/s_nextafter.c;hb=HEAD>

|  * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
...
|        if(((ix>=0x7ff00000)&&((ix-0x7ff00000)|lx)!=0) ||   /* x is nan */
|           ((iy>=0x7ff00000)&&((iy-0x7ff00000)|ly)!=0))     /* y is nan */
|           return x+y;
...
|        if(hx>=0) {                               /* x > 0 */
|            if(hx>hy||((hx==hy)&&(lx>ly))) {      /* x > y, x -= ulp */
|                if(lx==0) hx -= 1;
|                lx -= 1;
|            } else {                              /* x < y, x += ulp */
|                lx += 1;
|                if(lx==0) hx += 1;
|            }
|        } else {                                  /* x < 0 */
|            if(hy>=0||hx>hy||((hx==hy)&&(lx>ly))){/* x < y, x -= ulp */
|                if(lx==0) hx -= 1;
|                lx -= 1;
|            } else {                              /* x > y, x += ulp */
|                lx += 1;
|                if(lx==0) hx += 1;
|            }
|        }
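
For comparison, a minimal sketch of the same "step by one ulp" on a
single 64-bit integer. This is my own illustration, not glibc's code;
the name step_towards is hypothetical. It assumes that double is
binary64, that floating-point and integer byte order agree, and that
the caller has already dealt with NaN, x == y, x == 0 and overflow out
of the finite range, like the special cases in the code quoted above:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move x by one ulp towards y.
   Assumes x and y are not NaN, x != y and x != 0. */
static double step_towards(double x, double y)
{
    uint64_t ux;

    memcpy(&ux, &x, sizeof ux);     /* bit pattern of x */

    /* The magnitude grows when moving away from zero, i.e. for
       x > 0 with y > x, or for x < 0 with y < x; otherwise it
       shrinks.  The sign bit is never touched. */
    if ((x > 0.0) == (y > x))
        ux += 1;
    else
        ux -= 1;

    memcpy(&x, &ux, sizeof x);
    return x;
}
```

No carry or borrow between a low and a high part appears anywhere; the
64-bit increment/decrement handles it.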

(Heretical) questions:
- Why does glibc still employ such ugly code?
- Why doesn't glibc take advantage of 64-bit integers in such code?

JFTR: on 64-bit processors, when the compiler does not recognize
      that hx:lx and hy:ly are each in fact a single 64-bit integer
      which it can hold in a SINGLE register, but smears each over
      2 registers, such cruft kills performance.

For 32-bit processors, the JFTR from above still holds: using 64-bit
integers with a C99 compiler should give better machine code.

Stefan