public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Mariam Arutunian <mariamarutunian@gmail.com>
To: NightStrike <nightstrike@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [RFC/RFA][PATCH 00/12] CRC optimization
Date: Mon, 27 May 2024 13:56:20 +0400	[thread overview]
Message-ID: <CAE65F3PSMxchshDhtsxV=vwxJREWnjdpWeWyPq5P_ZKx5vSwbw@mail.gmail.com> (raw)
In-Reply-To: <CAF1jjLssFwmVg51NwKY63cTQpwyb9+-Jc3bv0NZAon_qTGZaqQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4633 bytes --]

Yes, this pass can detect only bitwise (naive) implementations of CRC32C
and replace them with hardware crc32* instructions (if supported), or
alternatively
generate CRC using CLMUL-based or table-based methods.
Here is an example of such a CRC32C implementation, where the entire loop
will be replaced with the crc32b instruction:
uint32_t crc32c(const uint8_t *data, size_t length) { uint32_t crc =
0xFFFFFFFF; for (size_t i = 0; i < length; ++i) { crc ^= data[i]; for (int
j = 0; j < 8; ++j) { crc = (crc >> 1) ^ ((crc & 1) ? 0x82F63B78 : 0); } }
return ~crc; }
I have added support for this optimization, as detailed in this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652615.html . The patch
description mentions a polynomial of 0x1EDC6F41, but CRC32C actually uses
the reflected version of this polynomial: 0x82F63B78. It might be clearer
to use 0x82F63B78 in the description.
Unfortunately, the example from the link you provided uses a table-based
version of CRC32C, which we don't detect. Currently, we only optimize
bitwise implementations, which are slower than all other implementations. If
necessary, I can add support for detecting table-based CRC implementations.
This would require a completely different algorithm than the one used for
bitwise implementations.
Thanks,
Mariam

On Sun, May 26, 2024 at 10:23 PM NightStrike <nightstrike@gmail.com> wrote:

>
>
> On Fri, May 24, 2024, 04:42 Mariam Arutunian <mariamarutunian@gmail.com>
> wrote:
>
>> Hello!
>> This patch set detects bitwise CRC implementation loops (with branches)
>> in the GIMPLE optimizers and replaces them with more optimal CRC
>> implementations in RTL. These patches introduce new internal functions,
>> built-in functions, and expanders for CRC generation, leveraging hardware
>> instructions where available. Additionally, various tests are included
>> to check CRC detection and generation. Main Features:
>>
>>    1.
>>
>>    CRC Loop Detection and Replacement:
>>    - Detection of CRC loops involves two stages: fast checks to identify
>>       potential candidates and verification using symbolic execution. The
>>       algorithm detects only CRCs (8, 16, 32, and 64 bits, both bit-forward and
>>       bit-reversed) with constant polynomials used without the leading 1. This
>>       part can be improved to detect more implementation types.
>>       - Once identified, the CRC loops are replaced with calls to newly
>>       added internal functions. These internal functions use target-specific
>>       expanders if available, otherwise generating table-based CRCs.
>>    2.
>>
>>    Architecture-Specific Expanders:
>>    - Expanders are added for RISC-V, aarch64, and i386 architectures.
>>       - These expanders generate CRCs using either carry-less
>>       multiplication instructions or direct CRC instructions, based on the target
>>       architecture's capabilities.
>>    3.
>>
>>    New Internal and Built-In Functions:
>>    - Introduces internal functions and built-in functions for CRC
>>       generation, supporting various CRC and data sizes (8, 16, 32, and 64 bits).
>>
>> I presented this work during the GNU Tools Cauldron 2023. You can view
>> the presentation here: GCC CRC optimization presentation
>> <https://gcc.gnu.org/wiki/cauldron2023talks?action=AttachFile&do=view&target=GCC+CRC+optimization.pdf>
>> .
>>
>> Previously, I submitted a patch to GCC upstream that included built-in
>> parts and expanders for RISC-V. However, the main component of the
>> previously sent patch has been changed. You can find the patch here:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626279.html
>>
>>
>> Best regards,
>> Mariam
>>
>
> Could this detect a specific CRC32C calculation as well (vs any urge CRC)
> and replace it with optimized calls to the dedicated crc32c hardware
> instructions via __builtin_ia32_crc32*i? I'm asking because ironically I
> was just a few days ago trying to see how current GCC optimizes simpler but
> slower approaches compared to hand tuning. For example,
> https://github.com/htot/crc32c. Almost every implementation I found was
> ultimately based on Mark Adler's (of zlib and that fun Mars mission fame)
> implementation posted to stack overflow here:
> https://stackoverflow.com/a/17646775
>
> Because so many current libraries are based on that, it would be nice to
> see if GCC can improve upon it with your patch. Or if they are no longer
> necessary, because your patch optimizes the simpler software approach to
> basically yield Mark's solution.
>
>>

      reply	other threads:[~2024-05-27  9:56 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-24  8:41 Mariam Arutunian
2024-05-25  1:53 ` Jeff Law
2024-05-26 18:23 ` NightStrike
2024-05-27  9:56   ` Mariam Arutunian [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAE65F3PSMxchshDhtsxV=vwxJREWnjdpWeWyPq5P_ZKx5vSwbw@mail.gmail.com' \
    --to=mariamarutunian@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=nightstrike@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).