From: NightStrike
Date: Sun, 26 May 2024 14:23:28 -0400
Subject: Re: [RFC/RFA][PATCH 00/12] CRC optimization
To: Mariam Arutunian
Cc: GCC Patches

On Fri, May 24, 2024, 04:42 Mariam
Arutunian wrote:

> Hello!
>
> This patch set detects bitwise CRC implementation loops (with branches) in
> the GIMPLE optimizers and replaces them with more optimal CRC
> implementations in RTL. These patches introduce new internal functions,
> built-in functions, and expanders for CRC generation, leveraging hardware
> instructions where available. Additionally, various tests are included to
> check CRC detection and generation.
>
> Main Features:
>
> 1. CRC Loop Detection and Replacement:
>    - Detection of CRC loops involves two stages: fast checks to identify
>      potential candidates and verification using symbolic execution. The
>      algorithm detects only CRCs (8, 16, 32, and 64 bits, both bit-forward and
>      bit-reversed) with constant polynomials used without the leading 1. This
>      part can be improved to detect more implementation types.
>    - Once identified, the CRC loops are replaced with calls to newly
>      added internal functions. These internal functions use target-specific
>      expanders if available, otherwise generating table-based CRCs.
> 2. Architecture-Specific Expanders:
>    - Expanders are added for RISC-V, aarch64, and i386 architectures.
>    - These expanders generate CRCs using either carry-less
>      multiplication instructions or direct CRC instructions, based on the target
>      architecture's capabilities.
> 3. New Internal and Built-In Functions:
>    - Introduces internal functions and built-in functions for CRC
>      generation, supporting various CRC and data sizes (8, 16, 32, and 64 bits).
>
> I presented this work during the GNU Tools Cauldron 2023. You can view the
> presentation here: GCC CRC optimization presentation.
>
> Previously, I submitted a patch to GCC upstream that included built-in
> parts and expanders for RISC-V. However, the main component of the
> previously sent patch has been changed.
> You can find the patch here:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626279.html
>
> Best regards,
> Mariam

Could this detect a specific CRC32C calculation as well (vs. any other CRC)
and replace it with optimized calls to the dedicated crc32c hardware
instructions via __builtin_ia32_crc32*i?

I'm asking because, ironically, just a few days ago I was trying to see how
current GCC optimizes simpler but slower approaches compared to hand tuning.
For example, https://github.com/htot/crc32c. Almost every implementation I
found was ultimately based on Mark Adler's (of zlib and that fun Mars mission
fame) implementation posted to Stack Overflow here:
https://stackoverflow.com/a/17646775

Because so many current libraries are based on that, it would be nice to see
whether GCC can improve upon it with your patch, or whether those hand-tuned
versions are no longer necessary because your patch optimizes the simpler
software approach to basically yield Mark's solution.
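To make the question concrete, here is a minimal sketch of the kind of "simpler but slower" loop I have in mind: a bit-reversed CRC32C with a constant polynomial written without the leading 1, one branch per bit. The function names are made up for illustration, and I have not checked whether the patch set actually recognizes this exact shape; that is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reversed CRC32C (Castagnoli) polynomial, leading 1 omitted. */
#define CRC32C_POLY_REVERSED 0x82F63B78u

/* Hypothetical helper: plain bitwise CRC32C, one conditional per bit.
   Uses the conventional initial value and final XOR of 0xFFFFFFFF. */
uint32_t crc32c_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Branchy inner step: shift, then fold in the polynomial
               if the bit shifted out was set. */
            if (crc & 1u)
                crc = (crc >> 1) ^ CRC32C_POLY_REVERSED;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical self-check against the standard CRC32C check value
   for the ASCII string "123456789". */
void crc32c_self_check(void)
{
    static const uint8_t msg[] = "123456789";
    assert(crc32c_bitwise(msg, 9) == 0xE3069283u);
}
```

On a target with SSE4.2, the ideal outcome would be for a loop like this to end up as the dedicated crc32 instruction, which is what the hand-tuned versions reach for explicitly.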