* [PATCH] i386: avoid zero extension for crc32q
@ 2022-08-23 16:09 Alexander Monakov
2022-08-24 11:35 ` Alexander Monakov
2022-09-04 19:36 ` Uros Bizjak
From: Alexander Monakov @ 2022-08-23 16:09 UTC
To: gcc-patches; +Cc: Alexander Monakov
The crc32q instruction takes 64-bit operands, but ignores high 32 bits
of the destination operand, and zero-extends the result from 32 bits.
Let's model this in the RTL pattern to avoid zero-extension when the
_mm_crc32_u64 intrinsic is used with a 32-bit type.
PR target/106453
gcc/ChangeLog:
* config/i386/i386.md (sse4_2_crc32di): Model that only low 32
bits of operand 0 are consumed, and the result is zero-extended
to 64 bits.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr106453.c: New test.
---
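A note for readers unfamiliar with the instruction: the behaviour described
in the commit message can be sketched as a portable C model. This is only an
illustration under stated assumptions, not code from GCC or from this patch.
The name crc32q_model is hypothetical, and the loop is the textbook bitwise
CRC-32C update (reflected polynomial 0x82F63B78), which is the polynomial the
crc32 instruction family uses.

```c
#include <stdint.h>

/* Hypothetical software model of crc32q (not part of GCC or this patch).
   DEST is the 64-bit destination register on entry, SRC the 64-bit source.  */
uint64_t
crc32q_model (uint64_t dest, uint64_t src)
{
  /* Only the low 32 bits of the destination are consumed.  */
  uint32_t crc = (uint32_t) dest;

  /* Fold in the eight source bytes, least significant byte first.  */
  for (int i = 0; i < 8; i++)
    {
      crc ^= (uint8_t) (src >> (8 * i));
      for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
    }

  /* The 32-bit result is zero-extended into the 64-bit destination.  */
  return crc;
}
```

In this model a uint32_t accumulator passed as DEST needs no separate
zero-extension before or after the call, which is what wrapping an SI-mode
unspec in zero_extend:DI is meant to express in the pattern.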
gcc/config/i386/i386.md | 6 +++---
gcc/testsuite/gcc.target/i386/pr106453.c | 13 +++++++++++++
2 files changed, 16 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr106453.c
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 58fcc382f..b5760bb23 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -23823,10 +23823,10 @@
(define_insn "sse4_2_crc32di"
[(set (match_operand:DI 0 "register_operand" "=r")
- (unspec:DI
- [(match_operand:DI 1 "register_operand" "0")
+ (zero_extend:DI (unspec:SI
+ [(match_operand:SI 1 "register_operand" "0")
(match_operand:DI 2 "nonimmediate_operand" "rm")]
- UNSPEC_CRC32))]
+ UNSPEC_CRC32)))]
"TARGET_64BIT && TARGET_CRC32"
"crc32{q}\t{%2, %0|%0, %2}"
[(set_attr "type" "sselog1")
diff --git a/gcc/testsuite/gcc.target/i386/pr106453.c b/gcc/testsuite/gcc.target/i386/pr106453.c
new file mode 100644
index 000000000..bab5b1cb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106453.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-msse4.2 -O2 -fdump-rtl-final" } */
+/* { dg-final { scan-rtl-dump-not "zero_extendsidi" "final" } } */
+
+#include <immintrin.h>
+#include <stdint.h>
+
+uint32_t f(uint32_t c, uint64_t *p, size_t n)
+{
+ for (size_t i = 0; i < n; i++)
+ c = _mm_crc32_u64(c, p[i]);
+ return c;
+}
--
2.35.1
* Re: [PATCH] i386: avoid zero extension for crc32q
From: Alexander Monakov @ 2022-08-24 11:35 UTC
To: gcc-patches
On Tue, 23 Aug 2022, Alexander Monakov via Gcc-patches wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr106453.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse4.2 -O2 -fdump-rtl-final" } */
> +/* { dg-final { scan-rtl-dump-not "zero_extendsidi" "final" } } */
I noticed that the test is 64-bit only and added the following fixup in my
tree:
--- a/gcc/testsuite/gcc.target/i386/pr106453.c
+++ b/gcc/testsuite/gcc.target/i386/pr106453.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! ia32 } } } */
/* { dg-options "-msse4.2 -O2 -fdump-rtl-final" } */
/* { dg-final { scan-rtl-dump-not "zero_extendsidi" "final" } } */
* Re: [PATCH] i386: avoid zero extension for crc32q
From: Uros Bizjak @ 2022-09-04 19:36 UTC
To: gcc-patches; +Cc: Alexander Monakov
[-- Attachment #1: Type: text/plain, Size: 2462 bytes --]
On Tue, Aug 23, 2022 at 6:10 PM Alexander Monakov via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> The crc32q instruction takes 64-bit operands, but ignores high 32 bits
> of the destination operand, and zero-extends the result from 32 bits.
>
> Let's model this in the RTL pattern to avoid zero-extension when the
> _mm_crc32_u64 intrinsic is used with a 32-bit type.
>
> PR target/106453
>
> gcc/ChangeLog:
>
> * config/i386/i386.md (sse4_2_crc32di): Model that only low 32
> bits of operand 0 are consumed, and the result is zero-extended
> to 64 bits.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr106453.c: New test.
OK with a nit and a couple of changes to the testcase dg-directives.
> ---
> gcc/config/i386/i386.md | 6 +++---
> gcc/testsuite/gcc.target/i386/pr106453.c | 13 +++++++++++++
> 2 files changed, 16 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr106453.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 58fcc382f..b5760bb23 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -23823,10 +23823,10 @@
>
> (define_insn "sse4_2_crc32di"
> [(set (match_operand:DI 0 "register_operand" "=r")
> - (unspec:DI
> - [(match_operand:DI 1 "register_operand" "0")
> + (zero_extend:DI (unspec:SI
> + [(match_operand:SI 1 "register_operand" "0")
> (match_operand:DI 2 "nonimmediate_operand" "rm")]
> - UNSPEC_CRC32))]
> + UNSPEC_CRC32)))]
Usually the (unspec) part goes on the next line.
> "TARGET_64BIT && TARGET_CRC32"
> "crc32{q}\t{%2, %0|%0, %2}"
> [(set_attr "type" "sselog1")
> diff --git a/gcc/testsuite/gcc.target/i386/pr106453.c b/gcc/testsuite/gcc.target/i386/pr106453.c
> new file mode 100644
> index 000000000..bab5b1cb2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr106453.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-msse4.2 -O2 -fdump-rtl-final" } */
> +/* { dg-final { scan-rtl-dump-not "zero_extendsidi" "final" } } */
This part can use the scan-assembler-not directive together with the -dp compiler option:
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mcrc32 -dp" } */
+/* { dg-final { scan-assembler-not "zero_extendsidi" } } */
Also, the mainline compiler supports -mcrc32, so the test can use it instead of -msse4.2.
Please find all these suggestions implemented in the attached patch.
Thanks,
Uros.
[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 1271 bytes --]
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 1aef1af594d..57771ed84f5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -23823,10 +23823,11 @@ (define_insn "sse4_2_crc32<mode>"
(define_insn "sse4_2_crc32di"
[(set (match_operand:DI 0 "register_operand" "=r")
- (unspec:DI
- [(match_operand:DI 1 "register_operand" "0")
- (match_operand:DI 2 "nonimmediate_operand" "rm")]
- UNSPEC_CRC32))]
+ (zero_extend:DI
+ (unspec:SI
+ [(match_operand:SI 1 "register_operand" "0")
+ (match_operand:DI 2 "nonimmediate_operand" "rm")]
+ UNSPEC_CRC32)))]
"TARGET_64BIT && TARGET_CRC32"
"crc32{q}\t{%2, %0|%0, %2}"
[(set_attr "type" "sselog1")
diff --git a/gcc/testsuite/gcc.target/i386/pr106453.c b/gcc/testsuite/gcc.target/i386/pr106453.c
new file mode 100644
index 00000000000..bd2e7282cf6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106453.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mcrc32 -dp" } */
+/* { dg-final { scan-assembler-not "zero_extendsidi" } } */
+
+#include <immintrin.h>
+#include <stdint.h>
+
+uint32_t f(uint32_t c, uint64_t *p, size_t n)
+{
+ for (size_t i = 0; i < n; i++)
+ c = _mm_crc32_u64(c, p[i]);
+ return c;
+}