* [PATCH] x86: Optimize load of const all 1s float vectors
From: H.J. Lu @ 2021-08-07 14:41 UTC
To: gcc-patches; +Cc: Uros Bizjak, liuhongt
Update vector_all_ones_operand to return true for const all 1s float
vectors.
gcc/
PR target/101804
* config/i386/predicates.md (vector_all_ones_operand): Return
true for const all 1s float vectors.
gcc/testsuite/
PR target/101804
* gcc.target/i386/avx2-gather-2.c: Pass -march=skylake instead
of "-mavx2 -mtune=skylake". Scan vpcmpeqd.
---
gcc/config/i386/predicates.md | 7 ++++---
gcc/testsuite/gcc.target/i386/avx2-gather-2.c | 3 ++-
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 6aa1ea32627..9637e64ea58 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1126,9 +1126,10 @@ (define_predicate "float_vector_all_ones_operand"
/* Return true if operand is a vector constant that is all ones. */
(define_predicate "vector_all_ones_operand"
- (and (match_code "const_vector")
- (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
- (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
+ (ior (and (match_code "const_vector")
+ (match_test "INTEGRAL_MODE_P (GET_MODE (op))")
+ (match_test "op == CONSTM1_RTX (GET_MODE (op))"))
+ (match_operand 0 "float_vector_all_ones_operand")))
; Return true when OP is operand acceptable for vector memory operand.
; Only AVX can have misaligned memory operand.
diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
index 1a704afd834..ad5ef73107c 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -mavx2 -fdump-tree-vect-details -mtune=skylake" } */
+/* { dg-options "-O3 -fdump-tree-vect-details -march=skylake" } */
#include "avx2-gather-1.c"
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 16 "vect" } } */
+/* { dg-final { scan-assembler "vpcmpeqd" } } */
--
2.31.1
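For readers following the thread: the constant in question is a floating-point vector in which every bit of every lane is set. The test scans for vpcmpeqd because such a value can be produced by comparing a register with itself rather than loading it from memory. The sketch below imitates that idea with GCC's generic vector extensions. The type and function names are illustrative assumptions and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative 128-bit vector types (GCC/Clang vector extensions).  */
typedef int32_t v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Build a float vector with every bit set by comparing an integer
   vector with itself.  A true lane compares to -1, i.e. all bits set,
   which is the effect of a self-compare with vpcmpeqd.  */
v4sf
all_bits_set_v4sf (void)
{
  v4si zero = { 0, 0, 0, 0 };
  v4si mask = (zero == zero);
  v4sf result;

  assert (sizeof result == sizeof mask);
  memcpy (&result, &mask, sizeof result);
  return result;
}

/* Return the raw bit pattern of lane I of V.  */
uint32_t
lane_bits (v4sf v, int i)
{
  uint32_t bits[4];

  assert (i >= 0 && i < 4);
  memcpy (bits, &v, sizeof bits);
  return bits[i];
}
```

Whether a compiler turns this particular source into vpcmpeqd depends on the target options and the GCC version, so the sketch only shows the value being built, not guaranteed code generation.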
* Re: [PATCH] x86: Optimize load of const all 1s float vectors
From: Uros Bizjak @ 2021-08-08 20:23 UTC
To: H.J. Lu; +Cc: gcc-patches, liuhongt
On Sat, Aug 7, 2021 at 4:41 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Update vector_all_ones_operand to return true for const all 1s float
> vectors.
>
> gcc/
>
> PR target/101804
> * config/i386/predicates.md (vector_all_ones_operand): Return
> true for const all 1s float vectors.
>
> gcc/testsuite/
>
> PR target/101804
> * gcc.target/i386/avx2-gather-2.c: Pass -march=skylake instead
> of "-mavx2 -mtune=skylake". Scan vpcmpeqd.
No, vector_all_ones_operand is intended to be integer minus-one. Use
float_vector_all_ones_operand in a specific place, where it is needed.
Uros.
* [PATCH v2] x86: Optimize load of const all 1s float vectors
From: H.J. Lu @ 2021-08-09 15:23 UTC
To: Uros Bizjak; +Cc: gcc-patches, liuhongt
On Sun, Aug 8, 2021 at 1:23 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> No, vector_all_ones_operand is intended to be integer minus-one. Use
> float_vector_all_ones_operand in a specific place, where it is needed.
>
Like this?
--
H.J.
[-- Attachment #2: v2-0001-x86-Optimize-load-of-const-all-1s-float-vectors.patch --]
[-- Type: text/x-patch, Size: 3343 bytes --]
From 017dee0c9ee946e16fbb1b938c1dd62ac0f95b09 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 6 Aug 2021 12:32:01 -0700
Subject: [PATCH v2] x86: Optimize load of const all 1s float vectors
Check float_vector_all_ones_operand for vector floating-point modes to
optimize load of const all 1s float vectors.
gcc/
PR target/101804
* config/i386/constraints.md (BC): For vector floating-point
modes, also check float_vector_all_ones_operand.
* config/i386/i386.c (standard_sse_constant_p): Likewise.
(standard_sse_constant_opcode): Likewise.
gcc/testsuite/
PR target/101804
* gcc.target/i386/avx2-gather-2.c: Pass -march=skylake instead
of "-mavx2 -mtune=skylake". Scan vpcmpeqd.
---
gcc/config/i386/constraints.md | 4 +++-
gcc/config/i386/i386.c | 11 +++++++++--
gcc/testsuite/gcc.target/i386/avx2-gather-2.c | 3 ++-
3 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 4aa28a5621c..cb1a803ab87 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -219,7 +219,9 @@ (define_constraint "BC"
"@internal SSE constant -1 operand."
(and (match_test "TARGET_SSE")
(ior (match_test "op == constm1_rtx")
- (match_operand 0 "vector_all_ones_operand"))))
+ (match_operand 0 "vector_all_ones_operand")
+ (and (match_test "GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT")
+ (match_operand 0 "float_vector_all_ones_operand")))))
;; Integer constant constraints.
(define_constraint "Wb"
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index aea224ab235..4d4ab6a03d6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5073,7 +5073,11 @@ standard_sse_constant_p (rtx x, machine_mode pred_mode)
if (x == const0_rtx || const0_operand (x, mode))
return 1;
- if (x == constm1_rtx || vector_all_ones_operand (x, mode))
+ if (x == constm1_rtx
+ || vector_all_ones_operand (x, mode)
+ || ((GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+ || GET_MODE_CLASS (pred_mode) == MODE_VECTOR_FLOAT)
+ && float_vector_all_ones_operand (x, mode)))
{
/* VOIDmode integer constant, get mode from the predicate. */
if (mode == VOIDmode)
@@ -5171,7 +5175,10 @@ standard_sse_constant_opcode (rtx_insn *insn, rtx *operands)
gcc_unreachable ();
}
}
- else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
+ else if (x == constm1_rtx
+ || vector_all_ones_operand (x, mode)
+ || (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+ && float_vector_all_ones_operand (x, mode)))
{
enum attr_mode insn_mode = get_attr_mode (insn);
diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
index 1a704afd834..ad5ef73107c 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -mavx2 -fdump-tree-vect-details -mtune=skylake" } */
+/* { dg-options "-O3 -fdump-tree-vect-details -march=skylake" } */
#include "avx2-gather-1.c"
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 16 "vect" } } */
+/* { dg-final { scan-assembler "vpcmpeqd" } } */
--
2.31.1
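The v2 approach keeps the integer predicate integer-only and has each caller add a separate floating-point check guarded by the mode class. The toy model below sketches that shape in plain C. It is not GCC code: the struct, enum, and function names are hypothetical stand-ins for RTL constants and predicates, and lanes are assumed to be 32 bits wide for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector mode class.  */
enum toy_mode_class { TOY_VECTOR_INT, TOY_VECTOR_FLOAT };

/* Hypothetical stand-in for a constant vector: a mode class plus the
   raw 32-bit pattern of each lane.  */
struct toy_const_vector
{
  enum toy_mode_class mclass;
  size_t nunits;
  const uint32_t *lane_bits;
};

/* True if every lane has all 32 bits set.  */
static bool
toy_every_bit_set_p (const struct toy_const_vector *v)
{
  for (size_t i = 0; i < v->nunits; i++)
    if (v->lane_bits[i] != UINT32_MAX)
      return false;
  return true;
}

/* Models the integer-only predicate: float vectors never match.  */
bool
toy_int_all_ones_p (const struct toy_const_vector *v)
{
  assert (v != NULL);
  return v->mclass == TOY_VECTOR_INT && toy_every_bit_set_p (v);
}

/* Models the floating-point predicate: integer vectors never match.  */
bool
toy_float_all_ones_p (const struct toy_const_vector *v)
{
  assert (v != NULL);
  return v->mclass == TOY_VECTOR_FLOAT && toy_every_bit_set_p (v);
}

/* Models a caller that accepts either kind, as the patched
   conditions in i386.c do.  */
bool
toy_all_ones_constant_p (const struct toy_const_vector *v)
{
  return toy_int_all_ones_p (v) || toy_float_all_ones_p (v);
}
```

Keeping the two predicates disjoint means existing users of the integer predicate see no change in behaviour, and only the callers that opt in accept floating-point vectors.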
* Re: [PATCH v2] x86: Optimize load of const all 1s float vectors
From: Uros Bizjak @ 2021-08-09 15:27 UTC
To: H.J. Lu; +Cc: gcc-patches, liuhongt
On Mon, Aug 9, 2021 at 5:24 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Like this?
Please also add a new constraint, BC is intended for integer values.
Uros.
* [PATCH v3] x86: Optimize load of const all 1s FP vectors
From: H.J. Lu @ 2021-08-09 17:46 UTC
To: Uros Bizjak; +Cc: gcc-patches, liuhongt
On Mon, Aug 9, 2021 at 8:27 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> Please also add a new constraint, BC is intended for integer values.
>
Here is the v3 patch with the new BF constraint. OK for master?
Thanks.
--
H.J.
[-- Attachment #2: v3-0001-x86-Optimize-load-of-const-all-1s-FP-vectors.patch --]
[-- Type: text/x-patch, Size: 5244 bytes --]
From 6d4f8d82ad2c6d284c2c7afc199af27749da6418 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 6 Aug 2021 12:32:01 -0700
Subject: [PATCH v3] x86: Optimize load of const all 1s FP vectors
Check float_vector_all_ones_operand for vector floating-point modes to
optimize load of const all 1s floating-point vectors.
gcc/
PR target/101804
* config/i386/constraints.md (BC): Document for integer SSE
constant -1 operand.
(BF): New constraint for const all 1s floating-point vectors.
* config/i386/i386.c (standard_sse_constant_p): For vector
floating-point modes, also check float_vector_all_ones_operand.
(standard_sse_constant_opcode): Likewise.
* config/i386/sse.md (sseconstm1): New mode attribute.
(mov<mode>_internal): Replace BC with <sseconstm1>.
gcc/testsuite/
PR target/101804
* gcc.target/i386/avx2-gather-2.c: Pass -march=skylake instead
of "-mavx2 -mtune=skylake". Scan vpcmpeqd.
---
gcc/config/i386/constraints.md | 10 ++++++++--
gcc/config/i386/i386.c | 11 +++++++++--
gcc/config/i386/sse.md | 11 ++++++++++-
gcc/testsuite/gcc.target/i386/avx2-gather-2.c | 3 ++-
4 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 4aa28a5621c..5a8c52b52e0 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -166,7 +166,8 @@ (define_register_constraint "YW"
;; s Sibcall memory operand, not valid for TARGET_X32
;; w Call memory operand, not valid for TARGET_X32
;; z Constant call address operand.
-;; C SSE constant operand.
+;; C Integer SSE constant -1 operand.
+;; F Floating-point SSE constant -1 operand.
(define_constraint "Bf"
"@internal Flags register operand."
@@ -216,11 +217,16 @@ (define_constraint "Bz"
(match_operand 0 "constant_call_address_operand"))
(define_constraint "BC"
- "@internal SSE constant -1 operand."
+ "@internal integer SSE constant -1 operand."
(and (match_test "TARGET_SSE")
(ior (match_test "op == constm1_rtx")
(match_operand 0 "vector_all_ones_operand"))))
+(define_constraint "BF"
+ "@internal floating-point SSE constant -1 operand."
+ (and (match_test "TARGET_SSE")
+ (match_operand 0 "float_vector_all_ones_operand")))
+
;; Integer constant constraints.
(define_constraint "Wb"
"Integer constant in the range 0 @dots{} 7, for 8-bit shifts."
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index aea224ab235..4d4ab6a03d6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5073,7 +5073,11 @@ standard_sse_constant_p (rtx x, machine_mode pred_mode)
if (x == const0_rtx || const0_operand (x, mode))
return 1;
- if (x == constm1_rtx || vector_all_ones_operand (x, mode))
+ if (x == constm1_rtx
+ || vector_all_ones_operand (x, mode)
+ || ((GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+ || GET_MODE_CLASS (pred_mode) == MODE_VECTOR_FLOAT)
+ && float_vector_all_ones_operand (x, mode)))
{
/* VOIDmode integer constant, get mode from the predicate. */
if (mode == VOIDmode)
@@ -5171,7 +5175,10 @@ standard_sse_constant_opcode (rtx_insn *insn, rtx *operands)
gcc_unreachable ();
}
}
- else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
+ else if (x == constm1_rtx
+ || vector_all_ones_operand (x, mode)
+ || (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+ && float_vector_all_ones_operand (x, mode)))
{
enum attr_mode insn_mode = get_attr_mode (insn);
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a46a2373547..5255d42900e 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -777,6 +777,15 @@ (define_mode_attr sseinsnmode
(V4SF "V4SF") (V2DF "V2DF")
(TI "TI")])
+;; SSE constant -1 constraint
+(define_mode_attr sseconstm1
+ [(V64QI "BC") (V32HI "BC") (V16SI "BC") (V8DI "BC") (V4TI "BC")
+ (V32QI "BC") (V16HI "BC") (V8SI "BC") (V4DI "BC") (V2TI "BC")
+ (V16QI "BC") (V8HI "BC") (V4SI "BC") (V2DI "BC") (V1TI "BC")
+ (V16SF "BF") (V8DF "BF")
+ (V8SF "BF") (V4DF "BF")
+ (V4SF "BF") (V2DF "BF")])
+
;; Mapping of vector modes to corresponding mask size
(define_mode_attr avx512fmaskmode
[(V64QI "DI") (V32QI "SI") (V16QI "HI")
@@ -1056,7 +1065,7 @@ (define_insn "mov<mode>_internal"
[(set (match_operand:VMOVE 0 "nonimmediate_operand"
"=v,v ,v ,m")
(match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
- " C,BC,vm,v"))]
+ " C,<sseconstm1>,vm,v"))]
"TARGET_SSE
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
index 1a704afd834..ad5ef73107c 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -mavx2 -fdump-tree-vect-details -mtune=skylake" } */
+/* { dg-options "-O3 -fdump-tree-vect-details -march=skylake" } */
#include "avx2-gather-1.c"
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 16 "vect" } } */
+/* { dg-final { scan-assembler "vpcmpeqd" } } */
--
2.31.1
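The sseconstm1 mode attribute in this version selects a constraint per vector mode, so mov<mode>_internal uses BC for integer vector modes and BF for floating-point vector modes. As a rough illustration only, the lookup can be modelled as a table in plain C. The struct and function names below are hypothetical, and the table lists just a subset of the modes in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pairing of a vector mode name with its constraint.  */
struct toy_mode_constraint
{
  const char *mode;
  const char *constraint;
};

/* Subset of the sseconstm1 mapping: integer vector modes use "BC",
   floating-point vector modes use "BF".  */
static const struct toy_mode_constraint toy_sseconstm1[] = {
  { "V16SI", "BC" }, { "V8SI", "BC" }, { "V4SI", "BC" },
  { "V16SF", "BF" }, { "V8SF", "BF" }, { "V4SF", "BF" },
  { "V8DF", "BF" }, { "V4DF", "BF" }, { "V2DF", "BF" },
};

/* Return the constraint string for MODE, or NULL if MODE is not in
   the table.  */
const char *
toy_constm1_constraint (const char *mode)
{
  size_t n = sizeof toy_sseconstm1 / sizeof toy_sseconstm1[0];

  assert (mode != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (toy_sseconstm1[i].mode, mode) == 0)
      return toy_sseconstm1[i].constraint;
  return NULL;
}
```

In the machine description the substitution happens when the .md files are processed, so the real pattern carries no run-time lookup. The table form is used here only to make the mode-to-constraint pairing visible.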
* Re: [PATCH v3] x86: Optimize load of const all 1s FP vectors
From: Uros Bizjak @ 2021-08-09 18:53 UTC
To: H.J. Lu; +Cc: gcc-patches, liuhongt
On Mon, Aug 9, 2021 at 7:47 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Here is the v3 patch with the new BF constraint. OK for master?
OK with some comment fixes.
+;; C Integer SSE constant -1 operand.
+;; F Floating-point SSE constant -1 operand.
Maybe we should simply say "... SSE constant with all bits set" here.
"... SSE constant -1" is ambiguous, someone can interpret this as a
constant -1.0.
- "@internal SSE constant -1 operand."
+ "@internal integer SSE constant -1 operand."
Also here.
+(define_constraint "BF"
+ "@internal floating-point SSE constant -1 operand."
And here.
Thanks,
Uros.
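The ambiguity raised in this review can be checked directly: the floating-point value -1.0 and a lane with all bits set are different bit patterns, and the latter decodes to a NaN. A minimal sketch, assuming IEEE 754 single precision for float (the helper names are illustrative):

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw single-precision bit pattern of F.  */
uint32_t
float_bits (float f)
{
  uint32_t bits;

  assert (sizeof bits == sizeof f);
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Reinterpret BITS as a single-precision value.  */
float
bits_to_float (uint32_t bits)
{
  float f;

  assert (sizeof f == sizeof bits);
  memcpy (&f, &bits, sizeof f);
  return f;
}
```

Here float_bits (-1.0f) is 0xBF800000, while bits_to_float (0xFFFFFFFF) is a NaN, which is why "all bits set" is the less ambiguous wording for the BF constraint.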
* [PATCH v4] x86: Optimize load of const FP all bits set vectors
From: H.J. Lu @ 2021-08-09 19:14 UTC
To: Uros Bizjak; +Cc: gcc-patches, liuhongt
On Mon, Aug 9, 2021 at 11:53 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> OK with some comment fixes.
>
> +;; C Integer SSE constant -1 operand.
> +;; F Floating-point SSE constant -1 operand.
>
> Maybe we should simply say "... SSE constant with all bits set" here.
> "... SSE constant -1" is ambiguous, someone can interpret this as a
> constant -1.0.
>
> - "@internal SSE constant -1 operand."
> + "@internal integer SSE constant -1 operand."
>
> Also here.
>
> +(define_constraint "BF"
> + "@internal floating-point SSE constant -1 operand."
>
> And here.
>
> Thanks,
> Uros.
This is the patch I am going to check in.
Thanks.
--
H.J.
[-- Attachment #2: v4-0001-x86-Optimize-load-of-const-FP-all-bits-set-vector.patch --]
[-- Type: text/x-patch, Size: 5338 bytes --]
From 93499102a52d29974b47e1d32274f6a08a4d6580 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 6 Aug 2021 12:32:01 -0700
Subject: [PATCH v4] x86: Optimize load of const FP all bits set vectors
Check float_vector_all_ones_operand for vector floating-point modes to
optimize load of const floating-point all bits set vectors.
gcc/
PR target/101804
* config/i386/constraints.md (BC): Document for integer SSE
constant all bits set operand.
(BF): New constraint for const floating-point all bits set
vectors.
* config/i386/i386.c (standard_sse_constant_p): For vector
floating-point modes, also check float_vector_all_ones_operand.
(standard_sse_constant_opcode): Likewise.
* config/i386/sse.md (sseconstm1): New mode attribute.
(mov<mode>_internal): Replace BC with <sseconstm1>.
gcc/testsuite/
PR target/101804
* gcc.target/i386/avx2-gather-2.c: Pass -march=skylake instead
of "-mavx2 -mtune=skylake". Scan vpcmpeqd.
Fix
---
gcc/config/i386/constraints.md | 10 ++++++++--
gcc/config/i386/i386.c | 11 +++++++++--
gcc/config/i386/sse.md | 11 ++++++++++-
gcc/testsuite/gcc.target/i386/avx2-gather-2.c | 3 ++-
4 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 4aa28a5621c..87cceac4cfb 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -166,7 +166,8 @@ (define_register_constraint "YW"
;; s Sibcall memory operand, not valid for TARGET_X32
;; w Call memory operand, not valid for TARGET_X32
;; z Constant call address operand.
-;; C SSE constant operand.
+;; C Integer SSE constant with all bits set operand.
+;; F Floating-point SSE constant with all bits set operand.
(define_constraint "Bf"
"@internal Flags register operand."
@@ -216,11 +217,16 @@ (define_constraint "Bz"
(match_operand 0 "constant_call_address_operand"))
(define_constraint "BC"
- "@internal SSE constant -1 operand."
+ "@internal integer SSE constant with all bits set operand."
(and (match_test "TARGET_SSE")
(ior (match_test "op == constm1_rtx")
(match_operand 0 "vector_all_ones_operand"))))
+(define_constraint "BF"
+ "@internal floating-point SSE constant with all bits set operand."
+ (and (match_test "TARGET_SSE")
+ (match_operand 0 "float_vector_all_ones_operand")))
+
;; Integer constant constraints.
(define_constraint "Wb"
"Integer constant in the range 0 @dots{} 7, for 8-bit shifts."
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index aea224ab235..4d4ab6a03d6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5073,7 +5073,11 @@ standard_sse_constant_p (rtx x, machine_mode pred_mode)
if (x == const0_rtx || const0_operand (x, mode))
return 1;
- if (x == constm1_rtx || vector_all_ones_operand (x, mode))
+ if (x == constm1_rtx
+ || vector_all_ones_operand (x, mode)
+ || ((GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+ || GET_MODE_CLASS (pred_mode) == MODE_VECTOR_FLOAT)
+ && float_vector_all_ones_operand (x, mode)))
{
/* VOIDmode integer constant, get mode from the predicate. */
if (mode == VOIDmode)
@@ -5171,7 +5175,10 @@ standard_sse_constant_opcode (rtx_insn *insn, rtx *operands)
gcc_unreachable ();
}
}
- else if (x == constm1_rtx || vector_all_ones_operand (x, mode))
+ else if (x == constm1_rtx
+ || vector_all_ones_operand (x, mode)
+ || (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+ && float_vector_all_ones_operand (x, mode)))
{
enum attr_mode insn_mode = get_attr_mode (insn);
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a46a2373547..5255d42900e 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -777,6 +777,15 @@ (define_mode_attr sseinsnmode
(V4SF "V4SF") (V2DF "V2DF")
(TI "TI")])
+;; SSE constant -1 constraint
+(define_mode_attr sseconstm1
+ [(V64QI "BC") (V32HI "BC") (V16SI "BC") (V8DI "BC") (V4TI "BC")
+ (V32QI "BC") (V16HI "BC") (V8SI "BC") (V4DI "BC") (V2TI "BC")
+ (V16QI "BC") (V8HI "BC") (V4SI "BC") (V2DI "BC") (V1TI "BC")
+ (V16SF "BF") (V8DF "BF")
+ (V8SF "BF") (V4DF "BF")
+ (V4SF "BF") (V2DF "BF")])
+
;; Mapping of vector modes to corresponding mask size
(define_mode_attr avx512fmaskmode
[(V64QI "DI") (V32QI "SI") (V16QI "HI")
@@ -1056,7 +1065,7 @@ (define_insn "mov<mode>_internal"
[(set (match_operand:VMOVE 0 "nonimmediate_operand"
"=v,v ,v ,m")
(match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
- " C,BC,vm,v"))]
+ " C,<sseconstm1>,vm,v"))]
"TARGET_SSE
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
index 1a704afd834..ad5ef73107c 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-gather-2.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -mavx2 -fdump-tree-vect-details -mtune=skylake" } */
+/* { dg-options "-O3 -fdump-tree-vect-details -march=skylake" } */
#include "avx2-gather-1.c"
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 16 "vect" } } */
+/* { dg-final { scan-assembler "vpcmpeqd" } } */
--
2.31.1