* [PATCH 2/3] Allow divmod on UNITS_PER_WORD vectors
2012-07-19 18:53 [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko Richard Henderson
2012-07-19 18:53 ` [PATCH 3/3] ia64: Update vector multiply pattern names; delete dot_prod Richard Henderson
@ 2012-07-19 18:53 ` Richard Henderson
2012-07-19 20:28 ` Jakub Jelinek
2012-07-19 20:35 ` [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko Jakub Jelinek
2 siblings, 1 reply; 6+ messages in thread
From: Richard Henderson @ 2012-07-19 18:53 UTC
To: gcc-patches; +Cc: jakub
IA64 vector modes are, of course, all UNITS_PER_WORD in size. I see no reason
to prevent the transformation of a V4HImode divmod, or even of V2SImode.
Jakub, was there another reason you added this condition?
r~
* tree-vect-patterns.c (vect_recog_divmod_pattern): Do not require
vectors larger than UNITS_PER_WORD.
---
gcc/ChangeLog | 3 +++
gcc/tree-vect-patterns.c | 3 +--
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 05202a8..d990c4f 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -1679,8 +1679,7 @@ vect_recog_divmod_pattern (VEC (gimple, heap) **stmts,
{
enum machine_mode vec_mode = TYPE_MODE (vectype);
int icode = (int) optab_handler (optab, vec_mode);
- if (icode != CODE_FOR_nothing
- || GET_MODE_SIZE (vec_mode) == UNITS_PER_WORD)
+ if (icode != CODE_FOR_nothing)
return NULL;
}
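For illustration, the effect of dropping the size test can be modelled outside GCC. The sketch below is not GCC code: the function names `skip_divmod_pattern_old` and `skip_divmod_pattern_new` are hypothetical, `UNITS_PER_WORD` is assumed to be 8 (64-bit words, as on ia64), and the icode is reduced to a plain int. Returning true corresponds to the `return NULL` in the hunk above, i.e. the pattern is not applied.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; in GCC these come from the target description.  */
enum { CODE_FOR_nothing = 0 };
#define UNITS_PER_WORD 8   /* assumption: 64-bit words, as on ia64 */

/* Old test: skip the pattern if the target has a native vector
   division insn, or if the vector is exactly one word wide.  */
bool
skip_divmod_pattern_old (int icode, int mode_size)
{
  return icode != CODE_FOR_nothing || mode_size == UNITS_PER_WORD;
}

/* Patched test: only a native division insn makes the pattern
   unnecessary; the vector size no longer matters.  */
bool
skip_divmod_pattern_new (int icode, int mode_size)
{
  (void) mode_size;
  return icode != CODE_FOR_nothing;
}
```

With an 8-byte mode such as V4HImode and no native handler, the old test skips the pattern and the new one lets it through.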
* [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko
@ 2012-07-19 18:53 Richard Henderson
2012-07-19 18:53 ` [PATCH 3/3] ia64: Update vector multiply pattern names; delete dot_prod Richard Henderson
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Richard Henderson @ 2012-07-19 18:53 UTC
To: gcc-patches
I discovered the vectorizable_operation problem while trying to figure out why
ia64 wasn't generating the mul_highpart pattern as I expected: we always
force CODE_FOR_nothing to 0. Oops.
The change to expr.c was just to satisfy the grep I did to make sure
there were no other occurrences of a similar problem.
r~
* expr.c (store_constructor): Initialize icode with CODE_FOR_nothing.
* tree-vect-stmts.c (vectorizable_operation): Use LAST_INSN_CODE for
dummy != CODE_FOR_nothing value.
---
gcc/ChangeLog | 6 ++++++
gcc/expr.c | 2 +-
gcc/tree-vect-stmts.c | 2 +-
3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/gcc/expr.c b/gcc/expr.c
index 5aec53e..9650863 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -6130,7 +6130,7 @@ store_constructor (tree exp, rtx target, int cleared, HOST_WIDE_INT size)
constructor_elt *ce;
int i;
int need_to_clear;
- int icode = 0;
+ int icode = CODE_FOR_nothing;
tree elttype = TREE_TYPE (type);
int elt_size = tree_low_cst (TYPE_SIZE (elttype), 1);
enum machine_mode eltmode = TYPE_MODE (elttype);
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index fb05063..bb42cbc 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -3531,7 +3531,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
if (code == MULT_HIGHPART_EXPR)
{
if (can_mult_highpart_p (vec_mode, TYPE_UNSIGNED (vectype)))
- icode = 0;
+ icode = LAST_INSN_CODE;
else
icode = CODE_FOR_nothing;
}
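The collision can be reproduced in a few lines. This is a reduced sketch, not the GCC sources: the enum is a hypothetical three-entry stand-in for the generated `insn_code` enum (`CODE_FOR_addv4hi3` is a made-up entry), and `op_supported` models only the later `icode != CODE_FOR_nothing` test in the caller.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for GCC's generated insn_code
   enum; the real one lists every named pattern of the target.  */
enum insn_code
{
  CODE_FOR_nothing = 0,   /* "no such pattern"; value 0 per this thread */
  CODE_FOR_addv4hi3,      /* made-up real pattern */
  LAST_INSN_CODE          /* one past the last real pattern */
};

/* Old behaviour: 0 was meant as "supported, but not one specific insn".  */
int
highpart_icode_old (bool can_mult_highpart)
{
  return can_mult_highpart ? 0 : CODE_FOR_nothing;
}

/* Patched behaviour: LAST_INSN_CODE cannot equal CODE_FOR_nothing.  */
int
highpart_icode_new (bool can_mult_highpart)
{
  return can_mult_highpart ? LAST_INSN_CODE : CODE_FOR_nothing;
}

/* The later check in the caller, reduced to its essence.  */
bool
op_supported (int icode)
{
  return icode != CODE_FOR_nothing;
}
```

Once CODE_FOR_nothing is 0, the old dummy value is indistinguishable from "not supported", which is why the highpart multiply was never used.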
* [PATCH 3/3] ia64: Update vector multiply pattern names; delete dot_prod
2012-07-19 18:53 [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko Richard Henderson
@ 2012-07-19 18:53 ` Richard Henderson
2012-07-19 18:53 ` [PATCH 2/3] Allow divmod on UNITS_PER_WORD vectors Richard Henderson
2012-07-19 20:35 ` [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko Jakub Jelinek
2 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2012-07-19 18:53 UTC
To: gcc-patches
With the even/odd v4hi multiply pattern names, we'll automatically
generate the same code for dot_prodv4hi. The dot_prodv8qi pattern
was actively incorrect wrt output mode.
r~
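As a rough scalar model of why the dedicated V4HI dot-product expander is redundant: a signed dot product step is the accumulator plus the even-lane widening products plus the odd-lane widening products, which is the sequence the removed sdot_prodv4hi expander emitted. The types `v4hi` and `v2si` and all function names below are hypothetical plain-C stand-ins, and the lane numbering is an assumption that ignores endianness.

```c
#include <stdint.h>

/* Plain-C models of the two machine modes; lane order is an assumption.  */
typedef struct { int16_t e[4]; } v4hi;   /* models V4HImode */
typedef struct { int32_t e[2]; } v2si;   /* models V2SImode */

/* Widening signed multiply of the even-numbered lanes (0 and 2).  */
v2si
widen_smult_even (v4hi a, v4hi b)
{
  v2si r = { { (int32_t) a.e[0] * b.e[0], (int32_t) a.e[2] * b.e[2] } };
  return r;
}

/* Widening signed multiply of the odd-numbered lanes (1 and 3).  */
v2si
widen_smult_odd (v4hi a, v4hi b)
{
  v2si r = { { (int32_t) a.e[1] * b.e[1], (int32_t) a.e[3] * b.e[3] } };
  return r;
}

/* Lane-wise V2SI addition; unsigned arithmetic gives wrap-around.  */
v2si
add_v2si (v2si a, v2si b)
{
  v2si r;
  for (int i = 0; i < 2; i++)
    r.e[i] = (int32_t) ((uint32_t) a.e[i] + (uint32_t) b.e[i]);
  return r;
}

/* acc + even products + odd products, in the order the removed
   expander used: t = e + acc; result = t + o.  */
v2si
sdot_prod_v4hi (v4hi a, v4hi b, v2si acc)
{
  v2si e = widen_smult_even (a, b);
  v2si o = widen_smult_odd (a, b);
  return add_v2si (add_v2si (e, acc), o);
}
```

Summing the two result lanes gives the sum of the accumulator lanes plus the full four-element dot product.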
* config/ia64/vect.md (smulv4hi3_highpart): New.
(umulv4hi3_highpart): New.
(vec_widen_smult_even_v4hi): Rename from pmpy2_even.
(vec_widen_smult_odd_v4hi): Rename from pmpy2_odd.
(udot_prodv8qi, sdot_prodv8qi): Remove.
(sdot_prodv4hi, udot_prodv4hi): Remove.
* config/ia64/ia64.c (ia64_expand_dot_prod_v8qi): Remove.
* config/ia64/ia64-protos.h: Update.
---
gcc/ChangeLog | 9 ++++
gcc/config/ia64/ia64-protos.h | 1 -
gcc/config/ia64/ia64.c | 40 ------------------
gcc/config/ia64/vect.md | 88 +++++++++++-----------------------------
4 files changed, 33 insertions(+), 105 deletions(-)
diff --git a/gcc/config/ia64/ia64-protos.h b/gcc/config/ia64/ia64-protos.h
index 458b120..39d34d4 100644
--- a/gcc/config/ia64/ia64-protos.h
+++ b/gcc/config/ia64/ia64-protos.h
@@ -43,7 +43,6 @@ extern bool ia64_expand_vecint_minmax (enum rtx_code, enum machine_mode, rtx[]);
extern void ia64_unpack_assemble (rtx, rtx, rtx, bool);
extern void ia64_expand_unpack (rtx [], bool, bool);
extern void ia64_expand_widen_sum (rtx[], bool);
-extern void ia64_expand_dot_prod_v8qi (rtx[], bool);
extern void ia64_expand_call (rtx, rtx, rtx, int);
extern void ia64_split_call (rtx, rtx, rtx, rtx, rtx, int, int);
extern void ia64_reload_gp (void);
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index a138ab4..c7fb559 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -2096,46 +2096,6 @@ ia64_expand_widen_sum (rtx operands[3], bool unsignedp)
emit_move_insn (operands[0], t);
}
-/* Emit a signed or unsigned V8QI dot product operation. */
-
-void
-ia64_expand_dot_prod_v8qi (rtx operands[4], bool unsignedp)
-{
- rtx op1, op2, sn1, sn2, l1, l2, h1, h2;
- rtx p1, p2, p3, p4, s1, s2, s3;
-
- op1 = operands[1];
- op2 = operands[2];
- sn1 = ia64_unpack_sign (op1, unsignedp);
- sn2 = ia64_unpack_sign (op2, unsignedp);
-
- l1 = gen_reg_rtx (V4HImode);
- l2 = gen_reg_rtx (V4HImode);
- h1 = gen_reg_rtx (V4HImode);
- h2 = gen_reg_rtx (V4HImode);
- ia64_unpack_assemble (l1, op1, sn1, false);
- ia64_unpack_assemble (l2, op2, sn2, false);
- ia64_unpack_assemble (h1, op1, sn1, true);
- ia64_unpack_assemble (h2, op2, sn2, true);
-
- p1 = gen_reg_rtx (V2SImode);
- p2 = gen_reg_rtx (V2SImode);
- p3 = gen_reg_rtx (V2SImode);
- p4 = gen_reg_rtx (V2SImode);
- emit_insn (gen_pmpy2_even (p1, l1, l2));
- emit_insn (gen_pmpy2_even (p2, h1, h2));
- emit_insn (gen_pmpy2_odd (p3, l1, l2));
- emit_insn (gen_pmpy2_odd (p4, h1, h2));
-
- s1 = gen_reg_rtx (V2SImode);
- s2 = gen_reg_rtx (V2SImode);
- s3 = gen_reg_rtx (V2SImode);
- emit_insn (gen_addv2si3 (s1, p1, p2));
- emit_insn (gen_addv2si3 (s2, p3, p4));
- emit_insn (gen_addv2si3 (s3, s1, operands[3]));
- emit_insn (gen_addv2si3 (operands[0], s2, s3));
-}
-
/* Emit the appropriate sequence for a call. */
void
diff --git a/gcc/config/ia64/vect.md b/gcc/config/ia64/vect.md
index aa77b01..4d85aa0 100644
--- a/gcc/config/ia64/vect.md
+++ b/gcc/config/ia64/vect.md
@@ -278,7 +278,29 @@
"pmpyshr2.u %0 = %1, %2, %3"
[(set_attr "itanium_class" "mmmul")])
-(define_insn "pmpy2_even"
+(define_expand "smulv4hi3_highpart"
+ [(match_operand:V4HI 0 "gr_register_operand")
+ (match_operand:V4HI 1 "gr_register_operand")
+ (match_operand:V4HI 2 "gr_register_operand")]
+ ""
+{
+ emit_insn (gen_pmpyshr2 (operands[0], operands[1],
+ operands[2], GEN_INT (16)));
+ DONE;
+})
+
+(define_expand "umulv4hi3_highpart"
+ [(match_operand:V4HI 0 "gr_register_operand")
+ (match_operand:V4HI 1 "gr_register_operand")
+ (match_operand:V4HI 2 "gr_register_operand")]
+ ""
+{
+ emit_insn (gen_pmpyshr2_u (operands[0], operands[1],
+ operands[2], GEN_INT (16)));
+ DONE;
+})
+
+(define_insn "vec_widen_smult_even_v4hi"
[(set (match_operand:V2SI 0 "gr_register_operand" "=r")
(mult:V2SI
(vec_select:V2SI
@@ -299,7 +321,7 @@
}
[(set_attr "itanium_class" "mmshf")])
-(define_insn "pmpy2_odd"
+(define_insn "vec_widen_smult_odd_v4hi"
[(set (match_operand:V2SI 0 "gr_register_operand" "=r")
(mult:V2SI
(vec_select:V2SI
@@ -602,68 +624,6 @@
DONE;
})
-(define_expand "udot_prodv8qi"
- [(match_operand:V2SI 0 "gr_register_operand" "")
- (match_operand:V8QI 1 "gr_register_operand" "")
- (match_operand:V8QI 2 "gr_register_operand" "")
- (match_operand:V2SI 3 "gr_register_operand" "")]
- ""
-{
- ia64_expand_dot_prod_v8qi (operands, true);
- DONE;
-})
-
-(define_expand "sdot_prodv8qi"
- [(match_operand:V2SI 0 "gr_register_operand" "")
- (match_operand:V8QI 1 "gr_register_operand" "")
- (match_operand:V8QI 2 "gr_register_operand" "")
- (match_operand:V2SI 3 "gr_register_operand" "")]
- ""
-{
- ia64_expand_dot_prod_v8qi (operands, false);
- DONE;
-})
-
-(define_expand "sdot_prodv4hi"
- [(match_operand:V2SI 0 "gr_register_operand" "")
- (match_operand:V4HI 1 "gr_register_operand" "")
- (match_operand:V4HI 2 "gr_register_operand" "")
- (match_operand:V2SI 3 "gr_register_operand" "")]
- ""
-{
- rtx e, o, t;
-
- e = gen_reg_rtx (V2SImode);
- o = gen_reg_rtx (V2SImode);
- t = gen_reg_rtx (V2SImode);
-
- emit_insn (gen_pmpy2_even (e, operands[1], operands[2]));
- emit_insn (gen_pmpy2_odd (o, operands[1], operands[2]));
- emit_insn (gen_addv2si3 (t, e, operands[3]));
- emit_insn (gen_addv2si3 (operands[0], t, o));
- DONE;
-})
-
-(define_expand "udot_prodv4hi"
- [(match_operand:V2SI 0 "gr_register_operand" "")
- (match_operand:V4HI 1 "gr_register_operand" "")
- (match_operand:V4HI 2 "gr_register_operand" "")
- (match_operand:V2SI 3 "gr_register_operand" "")]
- ""
-{
- rtx l, h, t;
-
- l = gen_reg_rtx (V2SImode);
- h = gen_reg_rtx (V2SImode);
- t = gen_reg_rtx (V2SImode);
-
- emit_insn (gen_vec_widen_umult_lo_v4hi (l, operands[1], operands[2]));
- emit_insn (gen_vec_widen_umult_hi_v4hi (h, operands[1], operands[2]));
- emit_insn (gen_addv2si3 (t, l, operands[3]));
- emit_insn (gen_addv2si3 (operands[0], t, h));
- DONE;
-})
-
(define_expand "vcond<mode><mode>"
[(set (match_operand:VECINT 0 "gr_register_operand" "")
(if_then_else:VECINT
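The two new highpart expanders pass a shift count of 16 to the existing pmpyshr2 patterns. Assuming that pattern multiplies each 16-bit lane and shifts the 32-bit product right by the given count, the per-lane result is the upper half of the product. The sketch below models that in scalar C; the function names are hypothetical and the code is not derived from the ia64 backend.

```c
#include <stdint.h>

/* Signed high part: upper 16 bits of each 32-bit lane product.
   Right-shifting a negative value is implementation-defined in C;
   an arithmetic shift (as GCC provides) is assumed here.  */
void
smul_highpart_v4hi (int16_t out[4], const int16_t a[4], const int16_t b[4])
{
  for (int i = 0; i < 4; i++)
    {
      int32_t p = (int32_t) a[i] * b[i];
      out[i] = (int16_t) (p >> 16);
    }
}

/* Unsigned high part; the cast avoids signed overflow that would
   otherwise occur when uint16_t operands are promoted to int.  */
void
umul_highpart_v4hi (uint16_t out[4], const uint16_t a[4], const uint16_t b[4])
{
  for (int i = 0; i < 4; i++)
    {
      uint32_t p = (uint32_t) a[i] * b[i];
      out[i] = (uint16_t) (p >> 16);
    }
}
```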
* Re: [PATCH 2/3] Allow divmod on UNITS_PER_WORD vectors
2012-07-19 18:53 ` [PATCH 2/3] Allow divmod on UNITS_PER_WORD vectors Richard Henderson
@ 2012-07-19 20:28 ` Jakub Jelinek
0 siblings, 0 replies; 6+ messages in thread
From: Jakub Jelinek @ 2012-07-19 20:28 UTC
To: Richard Henderson; +Cc: gcc-patches
On Thu, Jul 19, 2012 at 11:52:57AM -0700, Richard Henderson wrote:
> IA64 vector modes are, of course, all UNITS_PER_WORD in size. I see no reason
> to prevent the transformation of a V4HImode divmod, or even of V2SImode.
>
> Jakub, was there another reason you added this condition?
I guess I wanted to play on the safe side given that vectorizable_operation
already does something with such mode sizes:
      if (icode == CODE_FOR_nothing)
        {
          if (vect_print_dump_info (REPORT_DETAILS))
            fprintf (vect_dump, "op not supported by target.");
          /* Check only during analysis.  */
          if (GET_MODE_SIZE (vec_mode) != UNITS_PER_WORD
              || (!vec_stmt && vf < vect_min_worthwhile_factor (code)))
            return false;
          if (vect_print_dump_info (REPORT_DETAILS))
            fprintf (vect_dump, "proceeding using word mode.");
        }
and didn't want to convert something that possibly could be already
vectorized somehow, at least without being able to test what it actually
does.
Jakub
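The quoted fragment can be read as a small decision function. The following is only a model of that fragment, not of the whole of vectorizable_operation: `operation_may_proceed` is a hypothetical name, `analysis_only` stands for `!vec_stmt`, `min_vf` stands for the result of `vect_min_worthwhile_factor (code)`, and `UNITS_PER_WORD` is assumed to be 8.

```c
#include <stdbool.h>

enum { CODE_FOR_nothing = 0 };
#define UNITS_PER_WORD 8   /* assumption: 64-bit words */

/* Returns false where the quoted fragment returns false; returns
   true where control would fall through to the rest of the function.  */
bool
operation_may_proceed (int icode, int mode_size, bool analysis_only,
                       int vf, int min_vf)
{
  if (icode == CODE_FOR_nothing)
    {
      /* No native insn: only word-sized vectors can be emulated in
         word mode, and only if the vectorization factor is worthwhile
         (a check made during analysis only).  */
      if (mode_size != UNITS_PER_WORD
          || (analysis_only && vf < min_vf))
        return false;
      /* Otherwise: "proceeding using word mode".  */
    }
  return true;
}
```

This is the existing word-sized special case that the extra condition in vect_recog_divmod_pattern was meant to avoid interfering with.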
* Re: [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko
2012-07-19 18:53 [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko Richard Henderson
2012-07-19 18:53 ` [PATCH 3/3] ia64: Update vector multiply pattern names; delete dot_prod Richard Henderson
2012-07-19 18:53 ` [PATCH 2/3] Allow divmod on UNITS_PER_WORD vectors Richard Henderson
@ 2012-07-19 20:35 ` Jakub Jelinek
2012-07-19 20:48 ` Richard Henderson
2 siblings, 1 reply; 6+ messages in thread
From: Jakub Jelinek @ 2012-07-19 20:35 UTC
To: Richard Henderson; +Cc: gcc-patches
On Thu, Jul 19, 2012 at 11:52:56AM -0700, Richard Henderson wrote:
> I discovered the vectorizable_operation problem while trying to figure out why
> ia64 wasn't generating the mul_highpart pattern as I expected: we always
> force CODE_FOR_nothing to 0. Oops.
icode = 0; was chosen as something that wouldn't match a real insn, yet was
different from CODE_FOR_nothing. CODE_FOR_nothing changed to 0 only
10 days ago... Thanks for spotting this.
Jakub
* Re: [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko
2012-07-19 20:35 ` [PATCH 1/3] Fix 0 != CODE_FOR_nothing thinko Jakub Jelinek
@ 2012-07-19 20:48 ` Richard Henderson
0 siblings, 0 replies; 6+ messages in thread
From: Richard Henderson @ 2012-07-19 20:48 UTC
To: Jakub Jelinek; +Cc: gcc-patches
On 07/19/2012 01:35 PM, Jakub Jelinek wrote:
> On Thu, Jul 19, 2012 at 11:52:56AM -0700, Richard Henderson wrote:
>> I discovered the vectorizable_operation problem while trying to figure out why
>> ia64 wasn't generating the mul_highpart pattern as I expected: we always
>> force CODE_FOR_nothing to 0. Oops.
>
> icode = 0; was chosen as something that wouldn't match a real insn, yet was
> different from CODE_FOR_nothing. CODE_FOR_nothing changed to 0 only
> 10 days ago... Thanks for spotting this.
Ah, good. I thought I was just going crazy remembering C_F_N != 0.
I haven't gotten back as far as Steven's patch in my post-vacation mail queue yet...
r~