Subject: Re: [PATCH] Optimize to_chars
To: Antony Polukhin, libstdc++, gcc-patches List
From: Martin Sebor
Message-ID: <878a0df2-3786-6cc7-d31a-9421093192d1@gmail.com>
Date: Fri, 30 Aug 2019 19:41:00 -0000

On 8/30/19 8:27 AM, Antony Polukhin wrote:
> Bunch of micro optimizations for std::to_chars:
> * For base == 8 replacing the lookup in __digits table with arithmetic
> computations leads to a same CPU cycles for a loop (exchanges two
> movzx with 3 bit ops https://godbolt.org/z/RTui7m ). However this
> saves 129 bytes of data and totally avoids a chance of cache misses on
> __digits.
> * For base == 16 replacing the lookup in __digits table with
> arithmetic computations leads to a few additional instructions, but
> totally avoids a chance of cache misses on __digits (- ~9 cache misses
> for worst case) and saves 513 bytes of const data.
> * Replacing __first[pos] and __first[pos - 1] with __first[1] and
> __first[0] on final iterations saves ~2% of code size.
> * Removing trailing '\0' from arrays of digits allows the linker to
> merge the symbols (so that "0123456789abcdefghijklmnopqrstuvwxyz" and
> "0123456789abcdef" could share the same address). This improves data
> locality and reduces binary sizes.
> * Using __detail::__to_chars_len_2 instead of a generic
> __detail::__to_chars_len makes the operation O(1) instead of O(N). It
> also makes the code two times shorter ( https://godbolt.org/z/Peq_PG)
> .
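
For illustration only, the base-8 and base-16 items above could look roughly like the sketch below, which derives each digit with arithmetic instead of indexing a precomputed table. The names to_octal_arith and to_hex_arith are hypothetical and do not come from the patch, and whether the patch computes hex digits exactly this way is not stated in this message.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the patch): format val in base 8.
// Each digit is computed as '0' + (val & 7), so no digit table is read.
std::string to_octal_arith(unsigned val) {
    // ceil(bits / 3) digits are enough for any unsigned value.
    char buf[(sizeof(unsigned) * CHAR_BIT + 2) / 3];
    std::size_t pos = sizeof(buf);
    do {
        buf[--pos] = static_cast<char>('0' + (val & 7u));
        val >>= 3;
    } while (val != 0);
    return std::string(buf + pos, sizeof(buf) - pos);
}

// Hypothetical helper (not from the patch): format val in base 16.
// A digit below 10 maps to '0'..'9', otherwise to 'a'..'f'.
std::string to_hex_arith(unsigned val) {
    char buf[(sizeof(unsigned) * CHAR_BIT + 3) / 4];
    std::size_t pos = sizeof(buf);
    do {
        unsigned d = val & 0xFu;
        buf[--pos] = static_cast<char>(d < 10 ? '0' + d : 'a' + (d - 10));
        val >>= 4;
    } while (val != 0);
    return std::string(buf + pos, sizeof(buf) - pos);
}
```

The sketch writes digits from the end of a local buffer so that the length does not have to be known up front; the real code in include/std/charconv writes into the caller's range instead.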
>
> In sum: this significantly reduces the size of a binary (for about
> 4KBs only for base-8 conversion https://godbolt.org/z/WPKijS ), deals
> with latency (CPU cache misses) without changing the iterations count
> and without adding costly instructions into the loops.

Would it make sense to move some of this code into GCC as
a built-in so that it could also be used by GCC to expand
some strtol and sprintf calls?

Martin

>
> Changelog:
> * include/std/charconv (__detail::__to_chars_8,
> __detail::__to_chars_16): Replace array of precomputed digits
> with arithmetic operations to avoid CPU cache misses. Remove
> zero termination from array of digits to allow symbol merge with
> generic implementation of __detail::__to_chars. Replace final
> offsets with constants. Use __detail::__to_chars_len_2 instead
> of a generic __detail::__to_chars_len.
> * include/std/charconv (__detail::__to_chars): Remove
> zero termination from array of digits.
> * include/std/charconv (__detail::__to_chars_2): Leading digit
> is always '1'.
>
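
As a rough illustration of the changelog item about using __detail::__to_chars_len_2 instead of the generic length computation: for a power-of-two base, the digit count follows directly from the bit length of the value, so no division loop is needed. The sketch below is an assumption-laden example, not the libstdc++ code; bit_len, octal_len, hex_len and generic_len are hypothetical names, and it assumes GCC or Clang because it relies on __builtin_clz.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: number of significant bits in val (0 for val == 0).
// Assumes GCC or Clang, which provide __builtin_clz; the builtin is
// undefined for 0, so that case is handled separately.
unsigned bit_len(unsigned val) {
    if (val == 0)
        return 0;
    return static_cast<unsigned>(sizeof(unsigned) * CHAR_BIT)
           - static_cast<unsigned>(__builtin_clz(val));
}

// Digit count in base 8: ceil(bit_len / 3), with "0" taking one digit.
unsigned octal_len(unsigned val) {
    return val == 0 ? 1u : (bit_len(val) + 2) / 3;
}

// Digit count in base 16: ceil(bit_len / 4), with "0" taking one digit.
unsigned hex_len(unsigned val) {
    return val == 0 ? 1u : (bit_len(val) + 3) / 4;
}

// Generic loop for comparison: one division per digit, O(N) in the
// number of digits.
unsigned generic_len(unsigned val, unsigned base) {
    unsigned n = 1;
    while (val >= base) {
        val /= base;
        ++n;
    }
    return n;
}
```

Both variants return the same counts; the bit-length version simply replaces the per-digit division loop with one count-leading-zeros operation and a small constant amount of arithmetic.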