public inbox for gcc-bugs@sourceware.org
* [Bug target/106453] New: Redundant zero extension after crc32q
@ 2022-07-27  9:55 amonakov at gcc dot gnu.org
  2022-07-28 15:45 ` [Bug target/106453] " amonakov at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: amonakov at gcc dot gnu.org @ 2022-07-27  9:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453

            Bug ID: 106453
           Summary: Redundant zero extension after crc32q
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

On 64-bit x86, a straightforward use of the SSE 4.2 crc32 instruction looks like this:

#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

uint32_t f(uint32_t c, uint64_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c = _mm_crc32_u64(c, p[i]);
    return c;
}

At the ISA level, the crc32q instruction takes 64-bit operands, and the
resulting assembly (gcc -O2 -msse4.2, Intel syntax) is:

f:
        mov     eax, edi
        test    rdx, rdx
        je      .L1
        lea     rdx, [rsi+rdx*8]
.L3:
        mov     eax, eax
        add     rsi, 8
        crc32   rax, QWORD PTR [rsi-8]
        cmp     rdx, rsi
        jne     .L3
.L1:
        ret

Note the zero-extension of 'eax' inside the loop (mov eax, eax), which is
usually not move-eliminated because the destination is the same as the source.

The crc32q instruction zero-extends its 32-bit result into rax (and ignores the
high 32 bits when reading the destination operand), so I think it should be
possible to model the zero extension in the .md pattern, which would allow
eliminating the explicit extension.
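
The operand behaviour described above can be sketched in portable C. This is
only an illustrative software model, not code from GCC or from this report: the
names crc32c_byte, crc32q_model and crc32q_model_check are hypothetical, and the
model assumes the instruction performs a plain CRC-32C update (reflected
polynomial 0x82F63B78) over the eight source bytes, lowest byte first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: CRC-32C update for a single byte, bit by bit,
   using the reflected Castagnoli polynomial 0x82F63B78. */
static uint32_t crc32c_byte(uint32_t crc, uint8_t b)
{
    crc ^= b;
    for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Hypothetical model of crc32q: 64-bit destination and source operands,
   but only the low 32 bits of the destination are read, and the 32-bit
   result is zero-extended to 64 bits. */
static uint64_t crc32q_model(uint64_t dest, uint64_t src)
{
    uint32_t crc = (uint32_t)dest;      /* high 32 bits of dest are ignored */
    for (int i = 0; i < 8; i++)
        crc = crc32c_byte(crc, (uint8_t)(src >> (8 * i)));
    return (uint64_t)crc;               /* result is zero-extended */
}

/* Usage check: garbage in the high half of the destination does not
   change the result, and the result always fits in 32 bits. */
static void crc32q_model_check(void)
{
    uint64_t lo = 0x12345678u;
    uint64_t data = 0x0123456789abcdefull;
    assert(crc32q_model(lo, data) ==
           crc32q_model(lo | 0xdeadbeef00000000ull, data));
    assert((crc32q_model(lo, data) >> 32) == 0);
}
```

Because the result never depends on the high half of the destination and never
sets the high half of the output, an explicit extension before each step adds
nothing.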

A source-level workaround is to use a 64-bit variable in the loop, so that the
extension happens just once, before the loop.
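
A minimal sketch of that workaround follows. The names f_wide, crc32c_u64 and
crc32c_u64_soft are hypothetical and not part of the report. The software
fallback is an assumption added only so the example builds without -msse4.2; it
assumes the same CRC-32C update as the instruction. When SSE 4.2 is enabled on
x86-64, the loop uses _mm_crc32_u64 directly.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && defined(__SSE4_2__)
#include <immintrin.h>
#define HAVE_CRC32Q 1
#else
#define HAVE_CRC32Q 0
#endif

/* Portable reference for one 64-bit CRC-32C step (reflected polynomial
   0x82F63B78). Always compiled so results can be cross-checked. */
static uint64_t crc32c_u64_soft(uint64_t crc, uint64_t data)
{
    uint32_t c = (uint32_t)crc;
    for (int i = 0; i < 8; i++) {
        c ^= (uint8_t)(data >> (8 * i));
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0x82F63B78u & (0u - (c & 1u)));
    }
    return c;
}

/* One CRC step: the hardware instruction when available, else the fallback. */
static inline uint64_t crc32c_u64(uint64_t crc, uint64_t data)
{
#if HAVE_CRC32Q
    return _mm_crc32_u64(crc, data);
#else
    return crc32c_u64_soft(crc, data);
#endif
}

/* Workaround variant of f(): the accumulator is 64 bits wide, so the
   32-bit seed is zero-extended once, outside the loop. */
uint32_t f_wide(uint32_t c, const uint64_t *p, size_t n)
{
    uint64_t acc = c;                   /* single zero-extension */
    for (size_t i = 0; i < n; i++)
        acc = crc32c_u64(acc, p[i]);
    return (uint32_t)acc;               /* keep only the low 32 bits */
}
```

The interface is unchanged for callers: f_wide still takes and returns a 32-bit
value, and only the loop-carried variable is widened.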

* [Bug target/106453] Redundant zero extension after crc32q
From: amonakov at gcc dot gnu.org @ 2022-07-28 15:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453

--- Comment #1 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Any idea if the following is reasonable? It compiles and achieves the desired
result.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index bdde577dd..d82656678 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -23598,10 +23598,10 @@

 (define_insn "sse4_2_crc32di"
   [(set (match_operand:DI 0 "register_operand" "=r")
-       (unspec:DI
-         [(match_operand:DI 1 "register_operand" "0")
+       (zero_extend:DI (unspec:SI
+         [(match_operand:SI 1 "register_operand" "0")
           (match_operand:DI 2 "nonimmediate_operand" "rm")]
-         UNSPEC_CRC32))]
+         UNSPEC_CRC32)))]
   "TARGET_64BIT && TARGET_CRC32"
   "crc32{q}\t{%2, %0|%0, %2}"
   [(set_attr "type" "sselog1")

* [Bug target/106453] Redundant zero extension after crc32q
From: hjl.tools at gmail dot com @ 2022-07-28 17:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-07-28
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Alexander Monakov from comment #1)
> Any idea if the following is reasonable? It compiles and achieves the
> desired result.

It looks good.

* [Bug target/106453] Redundant zero extension after crc32q
From: pinskia at gcc dot gnu.org @ 2022-08-08 17:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

* [Bug target/106453] Redundant zero extension after crc32q
From: cvs-commit at gcc dot gnu.org @ 2022-09-05 18:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Alexander Monakov <amonakov@gcc.gnu.org>:

https://gcc.gnu.org/g:810d9815249451f477d4cbc67b8e4a0819c37faa

commit r13-2448-g810d9815249451f477d4cbc67b8e4a0819c37faa
Author: Alexander Monakov <amonakov@ispras.ru>
Date:   Tue Aug 23 18:42:24 2022 +0300

    i386: avoid zero extension for crc32q

    The crc32q instruction takes 64-bit operands, but ignores high 32 bits
    of the destination operand, and zero-extends the result from 32 bits.

    Let's model this in the RTL pattern to avoid zero-extension when the
    _mm_crc32_u64 intrinsic is used with a 32-bit type.

            PR target/106453

    gcc/ChangeLog:

            * config/i386/i386.md (sse4_2_crc32di): Model that only low 32
            bits of operand 0 are consumed, and the result is zero-extended
            to 64 bits.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr106453.c: New test.

* [Bug target/106453] Redundant zero extension after crc32q
From: amonakov at gcc dot gnu.org @ 2022-09-05 18:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Fixed for gcc-13.
