From mboxrd@z Thu Jan 1 00:00:00 1970
From: FX
Content-Type: multipart/mixed; boundary="Apple-Mail=_33A289B3-4971-4492-A307-B0CB2B823043"
Mime-Version: 1.0 (Mac OS X Mail 15.0 \(3693.40.0.1.81\))
Subject: [PATCH] Make integer output faster in libgfortran
Date: Sat, 25 Dec 2021 14:03:23 +0100
Cc: gcc-patches@gcc.gnu.org
To: fortran@gcc.gnu.org
List-Id: Fortran mailing list

--Apple-Mail=_33A289B3-4971-4492-A307-B0CB2B823043
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=utf-8

Hi,

Integer output in libgfortran is done by passing values as the largest
integer type available. This is what our gfc_itoa() function for
conversion to decimal form uses, as well, performing a series of
divisions by 10. On targets with a 128-bit integer type (which is most
targets, really, nowadays), division is slow, because it is implemented
in software and requires a call to a libgcc function.

We can speed this up in two easy ways:
- If the value fits into 64 bits, use a simple 64-bit itoa() function,
  which does the series of divisions by 10 in hardware. Most I/O will
  actually fall into that case, in real life, unless you’re printing
  very big 128-bit integers.
- If the value does not fit into 64 bits, perform only one slow
  division, by 10^19, and use two calls to the 64-bit function to
  output each part (the low part needing zero-padding).

What is the speed-up? It really depends on the exact nature of the I/O
done. For the most common case, list-directed I/O with no special
format, the patch does not speed up (or slow down!) things for values
up to HUGE(KIND=4), but speeds things up for larger values. For very
large 128-bit values, it can cut the I/O time in half.

I attach my own timing code to this email.

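For illustration only, the two-step conversion described in the list above can be sketched as the standalone C fragment below. It is a sketch under stated assumptions, not the patch itself: it assumes a compiler that provides `unsigned __int128` (a GCC/Clang extension), and the names `u64_digits`, `u64_digits_pad19`, `u128_to_decimal`, `u128_prints_as` and `U128_BUF_SIZE` are made up for this example rather than taken from libgfortran.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 10^19 is the largest power of ten that fits in uint64_t.  */
#define TEN19 UINT64_C(10000000000000000000)

/* Hypothetical buffer size: up to 39 digits plus the terminating NUL.  */
#define U128_BUF_SIZE 40

/* Write the decimal digits of n backwards, ending just before p.
   Emits nothing when n is zero.  All divisions here are 64-bit.  */
static char *
u64_digits (uint64_t n, char *p)
{
  while (n != 0)
    {
      *--p = (char) ('0' + n % 10);
      n /= 10;
    }
  return p;
}

/* Same, but always emit exactly 19 digits, zero-padded on the left.  */
static char *
u64_digits_pad19 (uint64_t n, char *p)
{
  for (int k = 0; k < 19; k++)
    {
      *--p = (char) ('0' + n % 10);
      n /= 10;
    }
  return p;
}

/* Convert n to decimal inside buf (at least U128_BUF_SIZE bytes) and
   return a pointer to the first digit.  Assumption of this sketch:
   n / 10^19 fits in 64 bits, which holds for all values up to 2^127.  */
static const char *
u128_to_decimal (unsigned __int128 n, char *buf)
{
  char *p = buf + U128_BUF_SIZE - 1;
  *p = '\0';

  if (n == 0)
    {
      *--p = '0';
      return p;
    }

  /* Fast path: the value fits in 64 bits.  */
  if (n <= UINT64_MAX)
    return u64_digits ((uint64_t) n, p);

  /* Slow path: a single 128-bit division by 10^19; the remainder is
     recovered with a multiplication and a subtraction.  */
  unsigned __int128 high = n / TEN19;
  uint64_t low = (uint64_t) (n - high * TEN19);
  assert (high <= UINT64_MAX);

  p = u64_digits_pad19 (low, p);          /* low 19 digits, zero-padded */
  return u64_digits ((uint64_t) high, p); /* remaining leading digits */
}

/* Small helper for trying the sketch: does n print as expected?  */
static int
u128_prints_as (unsigned __int128 n, const char *expected)
{
  char buf[U128_BUF_SIZE];
  return strcmp (u128_to_decimal (n, buf), expected) == 0;
}
```

With this sketch, the value 2 * 10^19 + 7 no longer fits in 64 bits: the high part prints as "2" and the low part as 19 zero-padded digits, giving "20000000000000000007". The zero-padding of the low part is what keeps the two halves aligned.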
Results before the patch (with previous itoa-patch applied, though):

 Timing for INTEGER(KIND=1)
 Value 0, time:             0.191409990
 Value HUGE(KIND=1), time:  0.173687011
 Timing for INTEGER(KIND=4)
 Value 0, time:             0.171809018
 Value 1049, time:          0.177439988
 Value HUGE(KIND=4), time:  0.217984974
 Timing for INTEGER(KIND=8)
 Value 0, time:             0.178072989
 Value HUGE(KIND=4), time:  0.214841008
 Value HUGE(KIND=8), time:  0.276726007
 Timing for INTEGER(KIND=16)
 Value 0, time:             0.175235987
 Value HUGE(KIND=4), time:  0.217689037
 Value HUGE(KIND=8), time:  0.280257106
 Value HUGE(KIND=16), time: 0.420036077

Results after the patch:

 Timing for INTEGER(KIND=1)
 Value 0, time:             0.194633007
 Value HUGE(KIND=1), time:  0.172436997
 Timing for INTEGER(KIND=4)
 Value 0, time:             0.167517006
 Value 1049, time:          0.176503003
 Value HUGE(KIND=4), time:  0.172892988
 Timing for INTEGER(KIND=8)
 Value 0, time:             0.171101034
 Value HUGE(KIND=4), time:  0.174461007
 Value HUGE(KIND=8), time:  0.180289030
 Timing for INTEGER(KIND=16)
 Value 0, time:             0.175765991
 Value HUGE(KIND=4), time:  0.181162953
 Value HUGE(KIND=8), time:  0.186082959
 Value HUGE(KIND=16), time: 0.207401991

Times are CPU times in seconds, for one million integer writes into a
buffer string. With the patch, we see that integer decimal output is
almost independent of the value written, meaning the I/O library
overhead is dominant, not the decimal conversion. For this reason, I
don’t think we really need a faster implementation of the 64-bit itoa,
and can keep the current series-of-divisions-by-10 approach.

---------------

This patch applies on top of my previous itoa-related patch at
https://gcc.gnu.org/pipermail/fortran/2021-December/057218.html

The patch has been bootstrapped and regtested on two 64-bit targets:
aarch64-apple-darwin21 (development branch) and x86_64-pc-linux-gnu.
I would like it to be tested on a 32-bit target without 128-bit integer
type. Does someone have access to that?

Once tested on a 32-bit target, OK to commit?

FX


--Apple-Mail=_33A289B3-4971-4492-A307-B0CB2B823043
Content-Disposition: attachment; filename=itoa-faster.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="itoa-faster.patch"
Content-Transfer-Encoding: 7bit

commit 4526dd52ebc76de63a8386767eda2f02d8b0a27b
Author: Francois-Xavier Coudert
Date:   2021-12-25 13:42:25 +0100

    Fortran: speed up decimal output of integers

    libgfortran/ChangeLog:

            PR libfortran/98076
            * runtime/string.c (itoa64, itoa64_pad19): New helper functions.
            (gfc_itoa): On targets with 128-bit integers, call fast
            64-bit functions to avoid many slow divisions.

    gcc/testsuite/ChangeLog:

            PR libfortran/98076
            * gfortran.dg/pr98076.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/pr98076.f90 b/gcc/testsuite/gfortran.dg/pr98076.f90
new file mode 100644
index 00000000000..d1288a41fef
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr98076.f90
@@ -0,0 +1,293 @@
+! { dg-do run }
+! { dg-require-effective-target fortran_large_int }
+!
+! Check that we can print large integer values
+
+program test
+  implicit none
+  ! 128-bit integer kind
+  integer, parameter :: k = selected_int_kind(38)
+
+  character(len=39) :: s
+  character(len=100) :: buffer
+  integer(kind=k) :: n
+  integer :: i
+
+  ! Random checks
+  do i = 1, 1000
+     call random_digits(s)
+     read(s,*) n
+     write(buffer,'(I0.38)') n
+     print *, s
+     print *, buffer
+     if (adjustl(buffer) /= adjustl(s)) stop 2
+  end do
+
+  !
Systematic check + call check(0_k, "0") + call check(1_k, "1") + call check(9_k, "9") + call check(10_k, "10") + call check(11_k, "11") + call check(99_k, "99") + call check(100_k, "100") + call check(101_k, "101") + call check(999_k, "999") + call check(1000_k, "1000") + call check(1001_k, "1001") + call check(9999_k, "9999") + call check(10000_k, "10000") + call check(10001_k, "10001") + call check(99999_k, "99999") + call check(100000_k, "100000") + call check(100001_k, "100001") + call check(999999_k, "999999") + call check(1000000_k, "1000000") + call check(1000001_k, "1000001") + call check(9999999_k, "9999999") + call check(10000000_k, "10000000") + call check(10000001_k, "10000001") + call check(99999999_k, "99999999") + call check(100000000_k, "100000000") + call check(100000001_k, "100000001") + call check(999999999_k, "999999999") + call check(1000000000_k, "1000000000") + call check(1000000001_k, "1000000001") + call check(9999999999_k, "9999999999") + call check(10000000000_k, "10000000000") + call check(10000000001_k, "10000000001") + call check(99999999999_k, "99999999999") + call check(100000000000_k, "100000000000") + call check(100000000001_k, "100000000001") + call check(999999999999_k, "999999999999") + call check(1000000000000_k, "1000000000000") + call check(1000000000001_k, "1000000000001") + call check(9999999999999_k, "9999999999999") + call check(10000000000000_k, "10000000000000") + call check(10000000000001_k, "10000000000001") + call check(99999999999999_k, "99999999999999") + call check(100000000000000_k, "100000000000000") + call check(100000000000001_k, "100000000000001") + call check(999999999999999_k, "999999999999999") + call check(1000000000000000_k, "1000000000000000") + call check(1000000000000001_k, "1000000000000001") + call check(9999999999999999_k, "9999999999999999") + call check(10000000000000000_k, "10000000000000000") + call check(10000000000000001_k, "10000000000000001") + call check(99999999999999999_k, 
"99999999999999999") + call check(100000000000000000_k, "100000000000000000") + call check(100000000000000001_k, "100000000000000001") + call check(999999999999999999_k, "999999999999999999") + call check(1000000000000000000_k, "1000000000000000000") + call check(1000000000000000001_k, "1000000000000000001") + call check(9999999999999999999_k, "9999999999999999999") + call check(10000000000000000000_k, "10000000000000000000") + call check(10000000000000000001_k, "10000000000000000001") + call check(99999999999999999999_k, "99999999999999999999") + call check(100000000000000000000_k, "100000000000000000000") + call check(100000000000000000001_k, "100000000000000000001") + call check(999999999999999999999_k, "999999999999999999999") + call check(1000000000000000000000_k, "1000000000000000000000") + call check(1000000000000000000001_k, "1000000000000000000001") + call check(9999999999999999999999_k, "9999999999999999999999") + call check(10000000000000000000000_k, "10000000000000000000000") + call check(10000000000000000000001_k, "10000000000000000000001") + call check(99999999999999999999999_k, "99999999999999999999999") + call check(100000000000000000000000_k, "100000000000000000000000") + call check(100000000000000000000001_k, "100000000000000000000001") + call check(999999999999999999999999_k, "999999999999999999999999") + call check(1000000000000000000000000_k, "1000000000000000000000000") + call check(1000000000000000000000001_k, "1000000000000000000000001") + call check(9999999999999999999999999_k, "9999999999999999999999999") + call check(10000000000000000000000000_k, "10000000000000000000000000") + call check(10000000000000000000000001_k, "10000000000000000000000001") + call check(99999999999999999999999999_k, "99999999999999999999999999") + call check(100000000000000000000000000_k, "100000000000000000000000000") + call check(100000000000000000000000001_k, "100000000000000000000000001") + call check(999999999999999999999999999_k, 
"999999999999999999999999999") + call check(1000000000000000000000000000_k, "1000000000000000000000000000") + call check(1000000000000000000000000001_k, "1000000000000000000000000001") + call check(9999999999999999999999999999_k, "9999999999999999999999999999") + call check(10000000000000000000000000000_k, "10000000000000000000000000000") + call check(10000000000000000000000000001_k, "10000000000000000000000000001") + call check(99999999999999999999999999999_k, "99999999999999999999999999999") + call check(100000000000000000000000000000_k, "100000000000000000000000000000") + call check(100000000000000000000000000001_k, "100000000000000000000000000001") + call check(999999999999999999999999999999_k, "999999999999999999999999999999") + call check(1000000000000000000000000000000_k, "1000000000000000000000000000000") + call check(1000000000000000000000000000001_k, "1000000000000000000000000000001") + call check(9999999999999999999999999999999_k, "9999999999999999999999999999999") + call check(10000000000000000000000000000000_k, "10000000000000000000000000000000") + call check(10000000000000000000000000000001_k, "10000000000000000000000000000001") + call check(99999999999999999999999999999999_k, "99999999999999999999999999999999") + call check(100000000000000000000000000000000_k, "100000000000000000000000000000000") + call check(100000000000000000000000000000001_k, "100000000000000000000000000000001") + call check(999999999999999999999999999999999_k, "999999999999999999999999999999999") + call check(1000000000000000000000000000000000_k, "1000000000000000000000000000000000") + call check(1000000000000000000000000000000001_k, "1000000000000000000000000000000001") + call check(9999999999999999999999999999999999_k, "9999999999999999999999999999999999") + call check(10000000000000000000000000000000000_k, "10000000000000000000000000000000000") + call check(10000000000000000000000000000000001_k, "10000000000000000000000000000000001") + call 
check(99999999999999999999999999999999999_k, "99999999999999999999999999999999999") + call check(100000000000000000000000000000000000_k, "100000000000000000000000000000000000") + call check(100000000000000000000000000000000001_k, "100000000000000000000000000000000001") + call check(999999999999999999999999999999999999_k, "999999999999999999999999999999999999") + call check(1000000000000000000000000000000000000_k, "1000000000000000000000000000000000000") + call check(1000000000000000000000000000000000001_k, "1000000000000000000000000000000000001") + call check(9999999999999999999999999999999999999_k, "9999999999999999999999999999999999999") + call check(10000000000000000000000000000000000000_k, "10000000000000000000000000000000000000") + call check(10000000000000000000000000000000000001_k, "10000000000000000000000000000000000001") + call check(99999999999999999999999999999999999999_k, "99999999999999999999999999999999999999") + call check(100000000000000000000000000000000000000_k, "100000000000000000000000000000000000000") + call check(100000000000000000000000000000000000001_k, "100000000000000000000000000000000000001") + call check(109999999999999999999999999999999999999_k, "109999999999999999999999999999999999999") + + call check(-1_k, "-1") + call check(-9_k, "-9") + call check(-10_k, "-10") + call check(-11_k, "-11") + call check(-99_k, "-99") + call check(-100_k, "-100") + call check(-101_k, "-101") + call check(-999_k, "-999") + call check(-1000_k, "-1000") + call check(-1001_k, "-1001") + call check(-9999_k, "-9999") + call check(-10000_k, "-10000") + call check(-10001_k, "-10001") + call check(-99999_k, "-99999") + call check(-100000_k, "-100000") + call check(-100001_k, "-100001") + call check(-999999_k, "-999999") + call check(-1000000_k, "-1000000") + call check(-1000001_k, "-1000001") + call check(-9999999_k, "-9999999") + call check(-10000000_k, "-10000000") + call check(-10000001_k, "-10000001") + call check(-99999999_k, "-99999999") + call 
check(-100000000_k, "-100000000") + call check(-100000001_k, "-100000001") + call check(-999999999_k, "-999999999") + call check(-1000000000_k, "-1000000000") + call check(-1000000001_k, "-1000000001") + call check(-9999999999_k, "-9999999999") + call check(-10000000000_k, "-10000000000") + call check(-10000000001_k, "-10000000001") + call check(-99999999999_k, "-99999999999") + call check(-100000000000_k, "-100000000000") + call check(-100000000001_k, "-100000000001") + call check(-999999999999_k, "-999999999999") + call check(-1000000000000_k, "-1000000000000") + call check(-1000000000001_k, "-1000000000001") + call check(-9999999999999_k, "-9999999999999") + call check(-10000000000000_k, "-10000000000000") + call check(-10000000000001_k, "-10000000000001") + call check(-99999999999999_k, "-99999999999999") + call check(-100000000000000_k, "-100000000000000") + call check(-100000000000001_k, "-100000000000001") + call check(-999999999999999_k, "-999999999999999") + call check(-1000000000000000_k, "-1000000000000000") + call check(-1000000000000001_k, "-1000000000000001") + call check(-9999999999999999_k, "-9999999999999999") + call check(-10000000000000000_k, "-10000000000000000") + call check(-10000000000000001_k, "-10000000000000001") + call check(-99999999999999999_k, "-99999999999999999") + call check(-100000000000000000_k, "-100000000000000000") + call check(-100000000000000001_k, "-100000000000000001") + call check(-999999999999999999_k, "-999999999999999999") + call check(-1000000000000000000_k, "-1000000000000000000") + call check(-1000000000000000001_k, "-1000000000000000001") + call check(-9999999999999999999_k, "-9999999999999999999") + call check(-10000000000000000000_k, "-10000000000000000000") + call check(-10000000000000000001_k, "-10000000000000000001") + call check(-99999999999999999999_k, "-99999999999999999999") + call check(-100000000000000000000_k, "-100000000000000000000") + call check(-100000000000000000001_k, "-100000000000000000001") + 
call check(-999999999999999999999_k, "-999999999999999999999") + call check(-1000000000000000000000_k, "-1000000000000000000000") + call check(-1000000000000000000001_k, "-1000000000000000000001") + call check(-9999999999999999999999_k, "-9999999999999999999999") + call check(-10000000000000000000000_k, "-10000000000000000000000") + call check(-10000000000000000000001_k, "-10000000000000000000001") + call check(-99999999999999999999999_k, "-99999999999999999999999") + call check(-100000000000000000000000_k, "-100000000000000000000000") + call check(-100000000000000000000001_k, "-100000000000000000000001") + call check(-999999999999999999999999_k, "-999999999999999999999999") + call check(-1000000000000000000000000_k, "-1000000000000000000000000") + call check(-1000000000000000000000001_k, "-1000000000000000000000001") + call check(-9999999999999999999999999_k, "-9999999999999999999999999") + call check(-10000000000000000000000000_k, "-10000000000000000000000000") + call check(-10000000000000000000000001_k, "-10000000000000000000000001") + call check(-99999999999999999999999999_k, "-99999999999999999999999999") + call check(-100000000000000000000000000_k, "-100000000000000000000000000") + call check(-100000000000000000000000001_k, "-100000000000000000000000001") + call check(-999999999999999999999999999_k, "-999999999999999999999999999") + call check(-1000000000000000000000000000_k, "-1000000000000000000000000000") + call check(-1000000000000000000000000001_k, "-1000000000000000000000000001") + call check(-9999999999999999999999999999_k, "-9999999999999999999999999999") + call check(-10000000000000000000000000000_k, "-10000000000000000000000000000") + call check(-10000000000000000000000000001_k, "-10000000000000000000000000001") + call check(-99999999999999999999999999999_k, "-99999999999999999999999999999") + call check(-100000000000000000000000000000_k, "-100000000000000000000000000000") + call check(-100000000000000000000000000001_k, 
"-100000000000000000000000000001") + call check(-999999999999999999999999999999_k, "-999999999999999999999999999999") + call check(-1000000000000000000000000000000_k, "-1000000000000000000000000000000") + call check(-1000000000000000000000000000001_k, "-1000000000000000000000000000001") + call check(-9999999999999999999999999999999_k, "-9999999999999999999999999999999") + call check(-10000000000000000000000000000000_k, "-10000000000000000000000000000000") + call check(-10000000000000000000000000000001_k, "-10000000000000000000000000000001") + call check(-99999999999999999999999999999999_k, "-99999999999999999999999999999999") + call check(-100000000000000000000000000000000_k, "-100000000000000000000000000000000") + call check(-100000000000000000000000000000001_k, "-100000000000000000000000000000001") + call check(-999999999999999999999999999999999_k, "-999999999999999999999999999999999") + call check(-1000000000000000000000000000000000_k, "-1000000000000000000000000000000000") + call check(-1000000000000000000000000000000001_k, "-1000000000000000000000000000000001") + call check(-9999999999999999999999999999999999_k, "-9999999999999999999999999999999999") + call check(-10000000000000000000000000000000000_k, "-10000000000000000000000000000000000") + call check(-10000000000000000000000000000000001_k, "-10000000000000000000000000000000001") + call check(-99999999999999999999999999999999999_k, "-99999999999999999999999999999999999") + call check(-100000000000000000000000000000000000_k, "-100000000000000000000000000000000000") + call check(-100000000000000000000000000000000001_k, "-100000000000000000000000000000000001") + call check(-999999999999999999999999999999999999_k, "-999999999999999999999999999999999999") + call check(-1000000000000000000000000000000000000_k, "-1000000000000000000000000000000000000") + call check(-1000000000000000000000000000000000001_k, "-1000000000000000000000000000000000001") + call check(-9999999999999999999999999999999999999_k, 
"-9999999999999999999999999999999999999") + call check(-10000000000000000000000000000000000000_k, "-10000000000000000000000000000000000000") + call check(-10000000000000000000000000000000000001_k, "-10000000000000000000000000000000000001") + call check(-99999999999999999999999999999999999999_k, "-99999999999999999999999999999999999999") + call check(-100000000000000000000000000000000000000_k, "-100000000000000000000000000000000000000") + call check(-100000000000000000000000000000000000001_k, "-100000000000000000000000000000000000001") + call check(-109999999999999999999999999999999999999_k, "-109999999999999999999999999999999999999") + +contains + + subroutine check (i, str) + implicit none + integer(kind=k), intent(in), value :: i + character(len=*), intent(in) :: str + + character(len=100) :: buffer + write(buffer,*) i + if (adjustl(buffer) /= adjustl(str)) stop 1 + end subroutine + + subroutine random_digits (str) + implicit none + integer, parameter :: l = 38 + character(len=l+1) :: str + real :: r + integer :: i, d + + str = "" + do i = 2, l+1 + call random_number(r) + d = floor(r * 10) + str(i:i) = achar(48 + d) + end do + + call random_number(r) + if (r > 0.5) then + str(1:1) = '-' + end if + end subroutine +end diff --git a/libgfortran/runtime/string.c b/libgfortran/runtime/string.c index 835027a7cd6..0ccd731852a 100644 --- a/libgfortran/runtime/string.c +++ b/libgfortran/runtime/string.c @@ -23,6 +23,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see . */ #include "libgfortran.h" +#include #include #include @@ -169,6 +170,38 @@ find_option (st_parameter_common *cmp, const char *s1, gfc_charlen_type s1_len, } +/* Fast helper function for a positive value that fits in uint64_t. 
*/
+
+static inline char *
+itoa64 (uint64_t n, char *p)
+{
+  while (n != 0)
+    {
+      *--p = '0' + (n % 10);
+      n /= 10;
+    }
+  return p;
+}
+
+
+#if defined(HAVE_GFC_INTEGER_16)
+# define TEN19 ((GFC_UINTEGER_LARGEST) 1000000 * (GFC_UINTEGER_LARGEST) 1000000 * (GFC_UINTEGER_LARGEST) 10000000)
+
+/* Same as itoa64(), with zero padding of 19 digits.  */
+
+static inline char *
+itoa64_pad19 (uint64_t n, char *p)
+{
+  for (int k = 0; k < 19; k++)
+    {
+      *--p = '0' + (n % 10);
+      n /= 10;
+    }
+  return p;
+}
+#endif
+
+
 /* Integer to decimal conversion.
 
    This function is much more restricted than the widespread (but
@@ -195,11 +228,33 @@ gfc_itoa (GFC_UINTEGER_LARGEST n, char *buffer, size_t len)
   p = buffer + GFC_ITOA_BUF_SIZE - 1;
   *p = '\0';
 
-  while (n != 0)
+#if defined(HAVE_GFC_INTEGER_16)
+  /* On targets that have a 128-bit integer type, division in that type
+     is slow, because it occurs through a function call. We avoid that.  */
+
+  if (n <= UINT64_MAX)
+    /* If the value fits in uint64_t, use the fast function. */
+    return itoa64 (n, p);
+  else
     {
-      *--p = '0' + (n % 10);
-      n /= 10;
+      /* Otherwise, break down into smaller bits by division. Two calls to
+         the uint64_t function are not sufficient for all 128-bit unsigned
+         integers (we would need three calls), but they do suffice for all
+         values up to 2^127, which is the largest that Fortran can produce
+         (-HUGE(0_16)-1) with its signed integer types.  */
+      static_assert(sizeof(GFC_UINTEGER_LARGEST) <= 2 * sizeof(uint64_t));
+
+      GFC_UINTEGER_LARGEST r;
+      r = n % TEN19;
+      n = n / TEN19;
+      assert (r <= UINT64_MAX);
+      p = itoa64_pad19 (r, p);
+
+      assert(n <= UINT64_MAX);
+      return itoa64 (n, p);
     }
-
-  return p;
+#else
+  /* On targets where the largest integer is 64-bit, just use that.  */
+  return itoa64 (n, p);
+#endif
 }

--Apple-Mail=_33A289B3-4971-4492-A307-B0CB2B823043
Content-Disposition: attachment; filename=timing.f90
Content-Type: application/octet-stream; x-unix-mode=0644; name="timing.f90"
Content-Transfer-Encoding: 7bit

program test
  implicit none
  integer, parameter :: n = 100000
  real :: t1, t2
  character(len=100) :: s
  integer :: i
  integer(kind=1) :: x1(n)
  integer(kind=4) :: x4(n)
  integer(kind=8) :: x8(n)
  integer(kind=16) :: x16(n)

  print *, "Timing for INTEGER(KIND=1)"

  x1(:) = 0
  call cpu_time(t1)
  do i = 1, 10
    call output1(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value 0, time:", t2 - t1

  x1(:) = huge(x1)
  call cpu_time(t1)
  do i = 1, 10
    call output1(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value HUGE(KIND=1), time:", t2 - t1

  print *, "Timing for INTEGER(KIND=4)"

  x4(:) = 0
  call cpu_time(t1)
  do i = 1, 10
    call output4(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value 0, time:", t2 - t1

  x4(:) = 1049
  call cpu_time(t1)
  do i = 1, 10
    call output4(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value 1049, time:", t2 - t1

  x4(:) = huge(x4)
  call cpu_time(t1)
  do i = 1, 10
    call output4(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value HUGE(KIND=4), time:", t2 - t1

  print *, "Timing for INTEGER(KIND=8)"

  x8(:) = 0
  call cpu_time(t1)
  do i = 1, 10
    call output8(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value 0, time:", t2 - t1

  x8(:) = huge(x4)
  call cpu_time(t1)
  do i = 1, 10
    call output8(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value HUGE(KIND=4), time:", t2 - t1

  x8(:) = huge(x8)
  call cpu_time(t1)
  do i = 1, 10
    call output8(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value HUGE(KIND=8), time:", t2 - t1

  print *, "Timing for INTEGER(KIND=16)"

  x16(:) = 0
  call cpu_time(t1)
  do i = 1, 10
    call output16(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value 0, time:", t2 - t1

  x16(:) = huge(x4)
  call cpu_time(t1)
  do i = 1, 10
    call output16(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value HUGE(KIND=4), time:", t2 - t1

  x16(:) = huge(x8)
  call cpu_time(t1)
  do i = 1, 10
    call output16(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value HUGE(KIND=8), time:", t2 - t1

  x16(:) = huge(x16)
  call cpu_time(t1)
  do i = 1, 10
    call output16(s)
  end do
  call cpu_time(t2)
  write(*,*) "Value HUGE(KIND=16), time:", t2 - t1

contains

  subroutine output1(s)
    implicit none
    character(len=100) :: s
    integer :: i
    do i = 1, n
      write(s,*) x1(i)
    end do
  end subroutine

  subroutine output4(s)
    implicit none
    character(len=100) :: s
    integer :: i
    do i = 1, n
      write(s,*) x4(i)
    end do
  end subroutine

  subroutine output8(s)
    implicit none
    character(len=100) :: s
    integer :: i
    do i = 1, n
      write(s,*) x8(i)
    end do
  end subroutine

  subroutine output16(s)
    implicit none
    character(len=100) :: s
    integer :: i
    do i = 1, n
      write(s,*) x16(i)
    end do
  end subroutine

end

--Apple-Mail=_33A289B3-4971-4492-A307-B0CB2B823043--