* [PATCH, ARM] Support vcond/vcondu patterns for NEON
@ 2010-08-25 13:06 Julian Brown
2010-08-25 14:03 ` Richard Guenther
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Julian Brown @ 2010-08-25 13:06 UTC (permalink / raw)
To: gcc-patches; +Cc: paul, rearnsha
[-- Attachment #1: Type: text/plain, Size: 1479 bytes --]
Hi,
This patch implements vcond<mode> and vcondu<mode> for NEON, fixing the
testsuite failure gcc.dg/vect/pr43430-1.c. These are RTX "standard
names", but are unfortunately undocumented at present (see PR29269), so
the intended semantics have been cargo-culted from other backends which
implement the pattern. These vector comparisons provide a rather nice
extension to the capabilities of the vectorizer.
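
For readers unfamiliar with these standard names, here is a minimal scalar sketch of the element-wise semantics the patch assumes (the same assignment given in the neon.md comment, Vop0 = (Vop4 <op3> Vop5) ? Vop1 : Vop2). It is an illustrative reference model, not GCC code: the names `cmp_op`, `compare` and `vcond_ref` are hypothetical, and 32-bit signed lanes are assumed purely for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical comparison codes standing in for op3. */
enum cmp_op { CMP_LT, CMP_LE, CMP_EQ, CMP_NE, CMP_GE, CMP_GT };

/* Evaluate one signed lane comparison; returns nonzero when it holds. */
static int compare(enum cmp_op op, int32_t a, int32_t b)
{
    switch (op) {
    case CMP_LT: return a < b;
    case CMP_LE: return a <= b;
    case CMP_EQ: return a == b;
    case CMP_NE: return a != b;
    case CMP_GE: return a >= b;
    case CMP_GT: return a > b;
    default:
        assert(!"unsupported comparison code");
        return 0;
    }
}

/* Scalar model of the vector conditional:
   op0[i] = (op4[i] <op3> op5[i]) ? op1[i] : op2[i], for each lane i. */
void vcond_ref(int32_t *op0, const int32_t *op1, const int32_t *op2,
               enum cmp_op op3, const int32_t *op4, const int32_t *op5,
               size_t lanes)
{
    for (size_t i = 0; i < lanes; i++)
        op0[i] = compare(op3, op4[i], op5[i]) ? op1[i] : op2[i];
}
```

A loop of this shape, with the comparison and selection applied independently per element, is the kind of code the vectorizer can map onto vcond<mode>; the unsigned variant vcondu<mode> differs only in treating the compared lanes as unsigned.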
Also, the patterns for vcge, vcgt and vceq instructions have been
extended to support the immediate variants (only comparisons with zero
are supported), and vcle and vclt immediate patterns have been added. I
haven't attempted to hook these up to the intrinsic-expansion
mechanism, so those won't support the immediate-zero mode yet.
(Note the gap in the numbering for the unspecs is intended to be filled
by the NEON misalignment-support patch, when that is approved).
Tested with cross to ARM Linux, using the options "-mfpu=neon
-march=armv7-a -mfloat-abi=softfp" (gcc, g++ and libstdc++). The only
change in test results is the transition of the test named above from
FAIL to PASS.
OK to apply?
Thanks,
Julian
ChangeLog
gcc/
* config/arm/neon.md (UNSPEC_VCLE, UNSPEC_VCLT): New constants for
unspecs.
(vcond<mode>, vcondu<mode>): New expanders.
(neon_vceq<mode>, neon_vcge<mode>, neon_vcgt<mode>): Support
comparisons with zero.
(neon_vcle<mode>, neon_vclt<mode>): New patterns.
* config/arm/constraints.md (Dz): New constraint.
[-- Attachment #2: neon-vcond-support-fsf-1.diff --]
[-- Type: text/x-patch, Size: 11283 bytes --]
Index: gcc/config/arm/neon.md
===================================================================
--- gcc/config/arm/neon.md (revision 163338)
+++ gcc/config/arm/neon.md (working copy)
@@ -140,7 +140,9 @@
(UNSPEC_VUZP1 201)
(UNSPEC_VUZP2 202)
(UNSPEC_VZIP1 203)
- (UNSPEC_VZIP2 204)])
+ (UNSPEC_VZIP2 204)
+ (UNSPEC_VCLE 206)
+ (UNSPEC_VCLT 207)])
;; Attribute used to permit string comparisons against <VQH_mnem> in
@@ -1452,6 +1454,169 @@
[(set_attr "neon_type" "neon_int_5")]
)
+;; Conditional instructions. These are comparisons with conditional moves for
+;; vectors. They perform the assignment:
+;;
+;; Vop0 = (Vop4 <op3> Vop5) ? Vop1 : Vop2;
+;;
+;; where op3 is <, <=, ==, !=, >= or >. Operations are performed
+;; element-wise.
+
+(define_expand "vcond<mode>"
+ [(set (match_operand:VDQW 0 "s_register_operand" "")
+ (if_then_else:VDQW
+ (match_operator 3 "arm_comparison_operator"
+ [(match_operand:VDQW 4 "s_register_operand" "")
+ (match_operand:VDQW 5 "nonmemory_operand" "")])
+ (match_operand:VDQW 1 "s_register_operand" "")
+ (match_operand:VDQW 2 "s_register_operand" "")))]
+ "TARGET_NEON && (!<Is_float_mode> || flag_unsafe_math_optimizations)"
+{
+ rtx mask;
+ int inverse = 0, immediate_zero = 0;
+ /* See the description of "magic" bits in the 'T' case of
+ arm_print_operand. */
+ HOST_WIDE_INT magic_word = (<MODE>mode == V2SFmode || <MODE>mode == V4SFmode)
+ ? 3 : 1;
+ rtx magic_rtx = GEN_INT (magic_word);
+
+ mask = gen_reg_rtx (<V_cmp_result>mode);
+
+ if (operands[5] == CONST0_RTX (<MODE>mode))
+ immediate_zero = 1;
+ else if (!REG_P (operands[5]))
+ operands[5] = force_reg (<MODE>mode, operands[5]);
+
+ switch (GET_CODE (operands[3]))
+ {
+ case GE:
+ emit_insn (gen_neon_vcge<mode> (mask, operands[4], operands[5],
+ magic_rtx));
+ break;
+
+ case GT:
+ emit_insn (gen_neon_vcgt<mode> (mask, operands[4], operands[5],
+ magic_rtx));
+ break;
+
+ case EQ:
+ emit_insn (gen_neon_vceq<mode> (mask, operands[4], operands[5],
+ magic_rtx));
+ break;
+
+ case LE:
+ if (immediate_zero)
+ emit_insn (gen_neon_vcle<mode> (mask, operands[4], operands[5],
+ magic_rtx));
+ else
+ emit_insn (gen_neon_vcge<mode> (mask, operands[5], operands[4],
+ magic_rtx));
+ break;
+
+ case LT:
+ if (immediate_zero)
+ emit_insn (gen_neon_vclt<mode> (mask, operands[4], operands[5],
+ magic_rtx));
+ else
+ emit_insn (gen_neon_vcgt<mode> (mask, operands[5], operands[4],
+ magic_rtx));
+ break;
+
+ case NE:
+ emit_insn (gen_neon_vceq<mode> (mask, operands[4], operands[5],
+ magic_rtx));
+ inverse = 1;
+ break;
+
+ default:
+ gcc_unreachable ();
+ }
+
+ if (inverse)
+ emit_insn (gen_neon_vbsl<mode> (operands[0], mask, operands[2],
+ operands[1]));
+ else
+ emit_insn (gen_neon_vbsl<mode> (operands[0], mask, operands[1],
+ operands[2]));
+
+ DONE;
+})
+
+(define_expand "vcondu<mode>"
+ [(set (match_operand:VDQIW 0 "s_register_operand" "")
+ (if_then_else:VDQIW
+ (match_operator 3 "arm_comparison_operator"
+ [(match_operand:VDQIW 4 "s_register_operand" "")
+ (match_operand:VDQIW 5 "s_register_operand" "")])
+ (match_operand:VDQIW 1 "s_register_operand" "")
+ (match_operand:VDQIW 2 "s_register_operand" "")))]
+ "TARGET_NEON"
+{
+ rtx mask;
+ int inverse = 0, immediate_zero = 0;
+
+ mask = gen_reg_rtx (<V_cmp_result>mode);
+
+ if (operands[5] == CONST0_RTX (<MODE>mode))
+ immediate_zero = 1;
+ else if (!REG_P (operands[5]))
+ operands[5] = force_reg (<MODE>mode, operands[5]);
+
+ switch (GET_CODE (operands[3]))
+ {
+ case GEU:
+ emit_insn (gen_neon_vcge<mode> (mask, operands[4], operands[5],
+ const0_rtx));
+ break;
+
+ case GTU:
+ emit_insn (gen_neon_vcgt<mode> (mask, operands[4], operands[5],
+ const0_rtx));
+ break;
+
+ case EQ:
+ emit_insn (gen_neon_vceq<mode> (mask, operands[4], operands[5],
+ const0_rtx));
+ break;
+
+ case LEU:
+ if (immediate_zero)
+ emit_insn (gen_neon_vcle<mode> (mask, operands[4], operands[5],
+ const0_rtx));
+ else
+ emit_insn (gen_neon_vcge<mode> (mask, operands[5], operands[4],
+ const0_rtx));
+ break;
+
+ case LTU:
+ if (immediate_zero)
+ emit_insn (gen_neon_vclt<mode> (mask, operands[4], operands[5],
+ const0_rtx));
+ else
+ emit_insn (gen_neon_vcgt<mode> (mask, operands[5], operands[4],
+ const0_rtx));
+ break;
+
+ case NE:
+ emit_insn (gen_neon_vceq<mode> (mask, operands[4], operands[5],
+ const0_rtx));
+ inverse = 1;
+ break;
+
+ default:
+ gcc_unreachable ();
+ }
+
+ if (inverse)
+ emit_insn (gen_neon_vbsl<mode> (operands[0], mask, operands[2],
+ operands[1]));
+ else
+ emit_insn (gen_neon_vbsl<mode> (operands[0], mask, operands[1],
+ operands[2]));
+
+ DONE;
+})
+
;; Patterns for builtins.
; good for plain vadd, vaddq.
@@ -1863,13 +2028,16 @@
)
(define_insn "neon_vceq<mode>"
- [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
- (unspec:<V_cmp_result> [(match_operand:VDQW 1 "s_register_operand" "w")
- (match_operand:VDQW 2 "s_register_operand" "w")
- (match_operand:SI 3 "immediate_operand" "i")]
- UNSPEC_VCEQ))]
+ [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
+ (unspec:<V_cmp_result>
+ [(match_operand:VDQW 1 "s_register_operand" "w,w")
+ (match_operand:VDQW 2 "nonmemory_operand" "w,Dz")
+ (match_operand:SI 3 "immediate_operand" "i,i")]
+ UNSPEC_VCEQ))]
"TARGET_NEON"
- "vceq.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+ "@
+ vceq.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
+ vceq.<V_if_elem>\t%<V_reg>0, %<V_reg>1, #0"
[(set (attr "neon_type")
(if_then_else (ne (symbol_ref "<Is_float_mode>") (const_int 0))
(if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
@@ -1879,13 +2047,16 @@
)
(define_insn "neon_vcge<mode>"
- [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
- (unspec:<V_cmp_result> [(match_operand:VDQW 1 "s_register_operand" "w")
- (match_operand:VDQW 2 "s_register_operand" "w")
- (match_operand:SI 3 "immediate_operand" "i")]
- UNSPEC_VCGE))]
+ [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
+ (unspec:<V_cmp_result>
+ [(match_operand:VDQW 1 "s_register_operand" "w,w")
+ (match_operand:VDQW 2 "nonmemory_operand" "w,Dz")
+ (match_operand:SI 3 "immediate_operand" "i,i")]
+ UNSPEC_VCGE))]
"TARGET_NEON"
- "vcge.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+ "@
+ vcge.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
+ vcge.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, #0"
[(set (attr "neon_type")
(if_then_else (ne (symbol_ref "<Is_float_mode>") (const_int 0))
(if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
@@ -1895,13 +2066,16 @@
)
(define_insn "neon_vcgt<mode>"
- [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
- (unspec:<V_cmp_result> [(match_operand:VDQW 1 "s_register_operand" "w")
- (match_operand:VDQW 2 "s_register_operand" "w")
- (match_operand:SI 3 "immediate_operand" "i")]
- UNSPEC_VCGT))]
+ [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
+ (unspec:<V_cmp_result>
+ [(match_operand:VDQW 1 "s_register_operand" "w,w")
+ (match_operand:VDQW 2 "nonmemory_operand" "w,Dz")
+ (match_operand:SI 3 "immediate_operand" "i,i")]
+ UNSPEC_VCGT))]
"TARGET_NEON"
- "vcgt.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+ "@
+ vcgt.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
+ vcgt.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, #0"
[(set (attr "neon_type")
(if_then_else (ne (symbol_ref "<Is_float_mode>") (const_int 0))
(if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
@@ -1910,6 +2084,43 @@
(const_string "neon_int_5")))]
)
+;; VCLE and VCLT only support comparisons with immediate zero (register
+;; variants are VCGE and VCGT with operands reversed).
+
+(define_insn "neon_vcle<mode>"
+ [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
+ (unspec:<V_cmp_result>
+ [(match_operand:VDQW 1 "s_register_operand" "w")
+ (match_operand:VDQW 2 "nonmemory_operand" "Dz")
+ (match_operand:SI 3 "immediate_operand" "i")]
+ UNSPEC_VCLE))]
+ "TARGET_NEON"
+ "vcle.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, #0"
+ [(set (attr "neon_type")
+ (if_then_else (ne (symbol_ref "<Is_float_mode>") (const_int 0))
+ (if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
+ (const_string "neon_fp_vadd_ddd_vabs_dd")
+ (const_string "neon_fp_vadd_qqq_vabs_qq"))
+ (const_string "neon_int_5")))]
+)
+
+(define_insn "neon_vclt<mode>"
+ [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
+ (unspec:<V_cmp_result>
+ [(match_operand:VDQW 1 "s_register_operand" "w")
+ (match_operand:VDQW 2 "nonmemory_operand" "Dz")
+ (match_operand:SI 3 "immediate_operand" "i")]
+ UNSPEC_VCLT))]
+ "TARGET_NEON"
+ "vclt.%T3%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, #0"
+ [(set (attr "neon_type")
+ (if_then_else (ne (symbol_ref "<Is_float_mode>") (const_int 0))
+ (if_then_else (ne (symbol_ref "<Is_d_reg>") (const_int 0))
+ (const_string "neon_fp_vadd_ddd_vabs_dd")
+ (const_string "neon_fp_vadd_qqq_vabs_qq"))
+ (const_string "neon_int_5")))]
+)
+
(define_insn "neon_vcage<mode>"
[(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
(unspec:<V_cmp_result> [(match_operand:VCVTF 1 "s_register_operand" "w")
Index: gcc/config/arm/constraints.md
===================================================================
--- gcc/config/arm/constraints.md (revision 163338)
+++ gcc/config/arm/constraints.md (working copy)
@@ -29,7 +29,7 @@
;; in Thumb-1 state: I, J, K, L, M, N, O
;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv, Dy, Di
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv, Dy, Di, Dz
;; in Thumb-1 state: Pa, Pb, Pc, Pd
;; in Thumb-2 state: Ps, Pt, Pu, Pv, Pw, Px
@@ -199,6 +199,12 @@
(and (match_code "const_double")
(match_test "TARGET_32BIT && neg_const_double_rtx_ok_for_fpa (op)")))
+(define_constraint "Dz"
+ "@internal
+ In ARM/Thumb-2 state a vector of constant zeros."
+ (and (match_code "const_vector")
+ (match_test "TARGET_NEON && op == CONST0_RTX (mode)")))
+
(define_constraint "Da"
"@internal
In ARM/Thumb-2 state a const_int, const_double or const_vector that can
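
The lowering strategy used by the vcond<mode> expander in the patch above can be sketched in plain C for a single lane. This is a hedged model rather than the actual implementation: it assumes 32-bit signed lanes, models the compare instructions as producing an all-ones or all-zeros mask, and models VBSL as a bitwise select. The names `cond_code`, `mask_ge`, `mask_gt`, `mask_eq`, `bsl` and `vcond_lane` are hypothetical. It covers only the register-register path; the immediate-zero VCLE/VCLT forms are omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the RTL comparison codes handled by the expander. */
enum cond_code { COND_GE, COND_GT, COND_EQ, COND_LE, COND_LT, COND_NE };

/* Model of the compare instructions: all-ones when true, zero when false. */
static uint32_t mask_ge(int32_t a, int32_t b) { return a >= b ? UINT32_MAX : 0; }
static uint32_t mask_gt(int32_t a, int32_t b) { return a > b ? UINT32_MAX : 0; }
static uint32_t mask_eq(int32_t a, int32_t b) { return a == b ? UINT32_MAX : 0; }

/* Model of a bitwise select: bits of t where mask is set, bits of f elsewhere. */
static uint32_t bsl(uint32_t mask, uint32_t t, uint32_t f)
{
    return (mask & t) | (~mask & f);
}

/* One lane of the lowering: only GE, GT and EQ compares are used.
   LE and LT swap the compared operands; NE reuses EQ and swaps the
   selected values instead of inverting the mask. */
uint32_t vcond_lane(enum cond_code op, int32_t a, int32_t b,
                    uint32_t t, uint32_t f)
{
    uint32_t mask;
    int inverse = 0;

    switch (op) {
    case COND_GE: mask = mask_ge(a, b); break;
    case COND_GT: mask = mask_gt(a, b); break;
    case COND_EQ: mask = mask_eq(a, b); break;
    case COND_LE: mask = mask_ge(b, a); break;  /* a <= b  <=>  b >= a */
    case COND_LT: mask = mask_gt(b, a); break;  /* a <  b  <=>  b >  a */
    case COND_NE: mask = mask_eq(a, b); inverse = 1; break;
    default:
        assert(!"unsupported comparison code");
        return 0;
    }

    return inverse ? bsl(mask, f, t) : bsl(mask, t, f);
}
```

Swapping the select arms for NE avoids a separate mask-inversion step, which is why the expander needs only three register-register compare patterns to cover all six comparison codes.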
* Re: [PATCH, ARM] Support vcond/vcondu patterns for NEON
2010-08-25 13:06 [PATCH, ARM] Support vcond/vcondu patterns for NEON Julian Brown
@ 2010-08-25 14:03 ` Richard Guenther
2010-08-25 16:45 ` Joseph S. Myers
2010-09-01 15:18 ` Richard Earnshaw
2 siblings, 0 replies; 4+ messages in thread
From: Richard Guenther @ 2010-08-25 14:03 UTC (permalink / raw)
To: Julian Brown; +Cc: gcc-patches, paul, rearnsha
On Wed, Aug 25, 2010 at 2:22 PM, Julian Brown <julian@codesourcery.com> wrote:
> Hi,
>
> This patch implements vcond<mode> and vcondu<mode> for NEON, fixing the
> testsuite failure gcc.dg/vect/pr43430-1.c. These are RTX "standard
> names", but are unfortunately undocumented at present (see PR29269), so
> the intended semantics have been cargo-culted from other backends which
> implement the pattern. These vector comparisons provide a rather nice
> extension to the capabilities of the vectorizer.
If you figured out how they work can you add documentation on
your way?
Thanks,
Richard.
> Also, the patterns for vcge, vcgt and vceq instructions have been
> extended to support the immediate variants (only comparisons with zero
> are supported), and vcle and vclt immediate patterns have been added. I
> haven't attempted to hook these up to the intrinsic-expansion
> mechanism, so those won't support the immediate-zero mode yet.
> (Note the gap in the numbering for the unspecs is intended to be filled
> by the NEON misalignment-support patch, when that is approved).
>
> Tested with cross to ARM Linux, using the options "-mfpu=neon
> -march=armv7-a -mfloat-abi=softfp" (gcc, g++ and libstdc++). The only
> change in test results is the transition of the test named above from
> FAIL to PASS.
>
> OK to apply?
>
> Thanks,
>
> Julian
>
> ChangeLog
>
> gcc/
> * config/arm/neon.md (UNSPEC_VCLE, UNSPEC_VCLT): New constants for
> unspecs.
> (vcond<mode>, vcondu<mode>): New expanders.
> (neon_vceq<mode>, neon_vcge<mode>, neon_vcgt<mode>): Support
> comparisons with zero.
> (neon_vcle<mode>, neon_vclt<mode>): New patterns.
> * config/arm/constraints.md (Dz): New constraint.
>
* Re: [PATCH, ARM] Support vcond/vcondu patterns for NEON
2010-08-25 13:06 [PATCH, ARM] Support vcond/vcondu patterns for NEON Julian Brown
2010-08-25 14:03 ` Richard Guenther
@ 2010-08-25 16:45 ` Joseph S. Myers
2010-09-01 15:18 ` Richard Earnshaw
2 siblings, 0 replies; 4+ messages in thread
From: Joseph S. Myers @ 2010-08-25 16:45 UTC (permalink / raw)
To: Julian Brown; +Cc: gcc-patches, paul, rearnsha
On Wed, 25 Aug 2010, Julian Brown wrote:
> (Note the gap in the numbering for the unspecs is intended to be filled
> by the NEON misalignment-support patch, when that is approved).
Perhaps it would make sense to move ARM unspecs to define_c_enum (as used
by MIPS, for example) as a separate cleanup patch so you don't need to
maintain numbering manually or have gaps that may or may not be justified.
--
Joseph S. Myers
joseph@codesourcery.com
* Re: [PATCH, ARM] Support vcond/vcondu patterns for NEON
2010-08-25 13:06 [PATCH, ARM] Support vcond/vcondu patterns for NEON Julian Brown
2010-08-25 14:03 ` Richard Guenther
2010-08-25 16:45 ` Joseph S. Myers
@ 2010-09-01 15:18 ` Richard Earnshaw
2 siblings, 0 replies; 4+ messages in thread
From: Richard Earnshaw @ 2010-09-01 15:18 UTC (permalink / raw)
To: Julian Brown; +Cc: gcc-patches, Paul Brook
On Wed, 2010-08-25 at 13:22 +0100, Julian Brown wrote:
> Hi,
>
> This patch implements vcond<mode> and vcondu<mode> for NEON, fixing the
> testsuite failure gcc.dg/vect/pr43430-1.c. These are RTX "standard
> names", but are unfortunately undocumented at present (see PR29269), so
> the intended semantics have been cargo-culted from other backends which
> implement the pattern. These vector comparisons provide a rather nice
> extension to the capabilities of the vectorizer.
>
> Also, the patterns for vcge, vcgt and vceq instructions have been
> extended to support the immediate variants (only comparisons with zero
> are supported), and vcle and vclt immediate patterns have been added. I
> haven't attempted to hook these up to the intrinsic-expansion
> mechanism, so those won't support the immediate-zero mode yet.
> (Note the gap in the numbering for the unspecs is intended to be filled
> by the NEON misalignment-support patch, when that is approved).
>
> Tested with cross to ARM Linux, using the options "-mfpu=neon
> -march=armv7-a -mfloat-abi=softfp" (gcc, g++ and libstdc++). The only
> change in test results is the transition of the test named above from
> FAIL to PASS.
>
> OK to apply?
>
> Thanks,
>
> Julian
>
> ChangeLog
>
> gcc/
> * config/arm/neon.md (UNSPEC_VCLE, UNSPEC_VCLT): New constants for
> unspecs.
> (vcond<mode>, vcondu<mode>): New expanders.
> (neon_vceq<mode>, neon_vcge<mode>, neon_vcgt<mode>): Support
> comparisons with zero.
> (neon_vcle<mode>, neon_vclt<mode>): New patterns.
> * config/arm/constraints.md (Dz): New constraint.
This is Ok.
As richi says, it would be nice to see a patch to the docs to clarify
the meaning of these expansion targets.
R.