public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jonathan Wakely <jwakely@redhat.com>
To: Antony Polukhin <antoshkka@gmail.com>
Cc: libstdc++ <libstdc++@gcc.gnu.org>,
	gcc-patches List <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Optimize to_chars
Date: Fri, 30 Aug 2019 16:09:00 -0000	[thread overview]
Message-ID: <20190830160857.GP9487@redhat.com> (raw)
In-Reply-To: <20190830160126.GO9487@redhat.com>

On 30/08/19 17:01 +0100, Jonathan Wakely wrote:
>On 30/08/19 17:27 +0300, Antony Polukhin wrote:
>>Bunch of micro optimizations for std::to_chars:
>>* For base == 8 replacing the lookup in __digits table with arithmetic
>>computations leads to a same CPU cycles for a loop (exchanges two
>>movzx with 3 bit ops https://godbolt.org/z/RTui7m ). However this
>>saves 129 bytes of data and totally avoids a chance of cache misses on
>>__digits.
>>* For base == 16 replacing the lookup in __digits table with
>>arithmetic computations leads to a few additional instructions, but
>>totally avoids a chance of cache misses on __digits (- ~9 cache misses
>>for worst case) and saves 513 bytes of const data.
>>* Replacing __first[pos] and __first[pos - 1] with __first[1] and
>>__first[0] on final iterations saves ~2% of code size.
>>* Removing trailing '\0' from arrays of digits allows the linker to
>>merge the symbols (so that "0123456789abcdefghijklmnopqrstuvwxyz" and
>>"0123456789abcdef" could share the same address). This improves data
>>locality and reduces binary sizes.
>>* Using __detail::__to_chars_len_2 instead of a generic
>>__detail::__to_chars_len makes the operation O(1) instead of O(N). It
>>also makes the code two times shorter ( https://godbolt.org/z/Peq_PG)
>>.
>>
>>In sum: this significantly reduces the size of a binary (for about
>>4KBs only for base-8 conversion https://godbolt.org/z/WPKijS ), deals
>>with latency (CPU cache misses) without changing the iterations count
>>and without adding costly instructions into the loops.
>
>This is great, thanks.
>
>Have you tried comparing the improved code to libc++'s implementation?
>I believe they use precomputed arrays of digits, but they use larger
>arrays that allow 4 bytes to be written at once, which is considerably
>faster (and those precomputed arrays libe in libc++.so not in the
>header). Would we be better off keeping the precomputed arrays and
>expanding them to do 4-byte writes?
>
>Since we don't have a patch to do that, I think I'll commit yours. We
>can always go back to precomputed arrays later if somebody does that
>work.
>
>My only comments are on the changelog:
>
>>Changelog:
>>   * include/std/charconv (__detail::__to_chars_8,
>>   __detail::__to_chars_16): Replace array of precomputed digits
>
>When the list of changed functions is split across lines it should be
>like this:
>
>   * include/std/charconv (__detail::__to_chars_8)
>   (__detail::__to_chars_16): Replace array of precomputed digits
>
>i.e close the parentheses before the line break, and reopen on the
>next line.
>
>>   with arithmetic operations to avoid CPU cache misses. Remove
>>   zero termination from array of digits to allow symbol merge with
>>   generic implementation of __detail::__to_chars. Replace final
>>   offsets with constants. Use __detail::__to_chars_len_2 instead
>>   of a generic __detail::__to_chars_len.
>>   * include/std/charconv (__detail::__to_chars): Remove
>
>Don't repeat the asterisk and filename for later changes in the same
>file, i.e.
>
>   (__detail::__to_chars): Remove zero termination from array of digits.
>   (__detail::__to_chars_2): Leading digit is always '1'.
>
>There's no changelog entry for the changes to __to_chars_len_8 and
>__to_chars_len_16.

Oh, there's no separate __to_chars_len_16 function anyway, and you did
mention it as "Use __detail::__to_chars_len_2 instead ..." - sorry!

I think we might as well inline __to_chars_len_8 into __to_chars_8,
there's not much benefit to having a separate function. I'll do that
as a follow up patch.



  reply	other threads:[~2019-08-30 16:09 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-30 14:27 Antony Polukhin
2019-08-30 16:01 ` Jonathan Wakely
2019-08-30 16:09   ` Jonathan Wakely [this message]
2019-09-02 11:32     ` Jonathan Wakely
2019-09-08 13:45       ` Antony Polukhin
2019-09-09 11:13         ` Jonathan Wakely
2019-08-30 19:46   ` Antony Polukhin
2019-08-30 20:05     ` Jonathan Wakely
2019-08-30 17:03 ` Martin Sebor
2019-08-30 19:24   ` Jonathan Wakely

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190830160857.GP9487@redhat.com \
    --to=jwakely@redhat.com \
    --cc=antoshkka@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).