From: Jonathan Wakely
To: Florian Weimer
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Date: Tue, 21 Jul 2020 12:04:18 +0100
Subject: Re: [committed] libstdc++: Add std::from_chars for floating-point types
Message-ID: <20200721110418.GH3215@redhat.com>
In-Reply-To: <87y2ndtnr5.fsf@oldenburg2.str.redhat.com>
References: <20200720225233.GA174400@redhat.com> <87y2ndtnr5.fsf@oldenburg2.str.redhat.com>

On 21/07/20 07:56 +0200, Florian Weimer wrote:
>* Jonathan Wakely via Libstdc:
>
>> By replacing the use of strtod we could avoid allocation, avoid changing
>> locale, and use optimised code paths specific to each std::chars_format
>> case. We would also get more portable behaviour, rather than depending
>> on the presence of uselocale, and on any bugs or quirks of the target
>> libc's strtod. Replacing strtod is a project for a later date.
>
>glibc already has strtod_l (since glibc 2.1, undocumented, but declared
>in ).

Yes, I noticed that in the glibc sources. I decided not to bother
using it because we still need the newlocale and freelocale calls,
which can still potentially allocate memory (although in practice
maybe they don't for the "C" locale?) and because what I committed
should work for any POSIX target.

>What seems to be missing is a function that takes an explicit buffer
>length. A static reference to the C locale object would be helpful as
>well, I assume.

How expensive is it to do newlocale("C", nullptr), uselocale, and
freelocale?

>Maybe this is sufficiently clean that we can export this for libstdc++'s
>use? Without repeating the libio mess?

I think we could beat strtod's performance with a handwritten
implementation, so I don't know if it's worth adding glibc extensions
if we would stop using them eventually anyway.

std::from_chars takes an enum that says whether the input is in hex,
scientific or fixed format (or 'general' which is fixed|scientific).
Because strtod determines the format itself, we need to do some
preprocessing before calling strtod, to stop it being too general.

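Because this approach still goes through strtod, each call has to be
wrapped in the newlocale, uselocale and freelocale sequence mentioned
above. A rough sketch of that sequence, assuming a POSIX.1-2008 target,
follows. The helper name parse_in_c_locale and its signature are made up
for illustration and are not the committed libstdc++ code. Note that the
real newlocale takes a category mask, a locale name and a base locale,
so newlocale("C", nullptr) above is shorthand.

```cpp
#include <cassert>
#include <cstdlib>
#include <locale.h>   // newlocale, uselocale, freelocale (POSIX.1-2008)
#include <string>

// Hypothetical helper, not the libstdc++ implementation: parse the range
// [first, last) with strtod while the "C" locale is installed for the
// calling thread only. *end_out receives the position where parsing stopped.
double parse_in_c_locale(const char* first, const char* last,
                         const char** end_out)
{
    assert(first <= last);

    // strtod needs a null-terminated string, so the range is copied.
    // This copy is one of the places where memory may be allocated.
    std::string buf(first, last);

    // Creating the locale object may also allocate.
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0) {
        *end_out = first;   // report "nothing parsed" on failure
        return 0.0;
    }
    locale_t old_loc = uselocale(c_loc);   // switch this thread to "C"

    char* end = nullptr;
    const double value = std::strtod(buf.c_str(), &end);

    uselocale(old_loc);   // restore the previous thread locale
    freelocale(c_loc);

    // Translate the end position in the copy back to the original range.
    *end_out = first + (end - buf.c_str());
    return value;
}
```

A function taking an explicit buffer length and a static "C" locale
object, as suggested in the quoted text, would remove both the copy and
the newlocale/freelocale pair from this sketch.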
Some examples where strtod does the wrong thing unless we do extra
work before calling it:

"0x1p01" should always produce the result 0, for any format (because
you pass the hex flag to std::from_chars, it doesn't need a "0x"
prefix, and if one is present it's interpreted as simply "0"). If we
don't truncate the string, strtod produces 2.

"0.8p1" should produce 0.8 for fixed and general formats, produce an
error for scientific format, and produce 1 for hex format (which means
we need to create the string "0x0.8p1" to pass to strtod). strtod
always produces 0.8 for this input.

I also noticed some strings give an underflow error with glibc's
strtod, but are valid for the Microsoft implementation. For example,
this one:
https://github.com/microsoft/STL/blob/master/tests/std/tests/P0067R5_charconv/double_from_chars_test_cases.hpp#L265

Without the final '1' digit glibc returns DBL_MIN, but with the final
'1' digit (so a number larger than DBL_MIN) it underflows. Is that
expected?
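
The "0x1p01" and "0.8p1" cases above can be reproduced with plain
strtod. The sketch below assumes a C99-conforming strtod that accepts
hexadecimal floats. The names run_strtod and strtod_with_format are
hypothetical, and the preprocessing covers only those two inputs: it is
not the committed implementation, and it ignores signs, the
scientific-format error and the consumed-length bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: the value strtod returns for s and the number of
// characters it consumed.
struct StrtodResult { double value; std::size_t consumed; };

StrtodResult run_strtod(const char* s)
{
    assert(s != nullptr);
    char* end = nullptr;
    const double value = std::strtod(s, &end);
    return { value, static_cast<std::size_t>(end - s) };
}

// Hypothetical, simplified preprocessing for the two cases in the message.
// hex == true stands for std::chars_format::hex; false stands for the
// fixed and general formats.
double strtod_with_format(const std::string& input, bool hex)
{
    // A leading "0x" is never a prefix for from_chars: only the "0" is
    // parsed, so the string is truncated before strtod sees the 'x'.
    if (input.size() >= 2 && input[0] == '0'
        && (input[1] == 'x' || input[1] == 'X'))
        return run_strtod("0").value;

    // For hex format, add the prefix that strtod needs to see.
    const std::string prepared = hex ? "0x" + input : input;
    return run_strtod(prepared.c_str()).value;
}
```

With these definitions, run_strtod("0x1p01") yields 2 and consumes all
six characters, while run_strtod("0.8p1") yields 0.8 and stops at the
'p'. After preprocessing, strtod_with_format("0x1p01", true) gives 0 and
strtod_with_format("0.8p1", true) gives 1, matching the results
described above.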