Re: [PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: James Greenhalgh <james.greenhalgh@arm.com>
To: Bin Cheng <Bin.Cheng@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>
Subject: Re: [PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.
Date: Tue, 31 May 2016 17:07:00 -0000	[thread overview]
Message-ID: <20160531152454.GA37947@arm.com> (raw)
In-Reply-To: <DB5PR08MB1144FD75076B139152FF6394E7480@DB5PR08MB1144.eurprd08.prod.outlook.com>

On Tue, May 17, 2016 at 09:02:22AM +0000, Bin Cheng wrote:
> Hi,
> Alan and Renlin noticed that some vcond patterns are not supported in
> AArch64(or AArch32?) backend, and they both had some patches fixing this.
> After investigation, I agree with them that vcond/vcondu in AArch64's backend
> should be re-implemented using vec_cmp/vcond_mask patterns, so here comes
> this patch which is based on Alan's.  This patch supports all vcond/vcondu
> patterns by implementing/using vec_cmp and vcond_mask patterns.  Different to
> the original patch, it doesn't change GCC's expanding process, and it keeps
> vcond patterns.  The patch also introduces vec_cmp*_internal to support
> special case optimization for vcond/vcondu which current implementation does.
> Apart from Alan's patch, I also learned ideas from Renlin's, and it is my
> change that shall be blamed if any potential bug is introduced.
> 
> With this patch, GCC's test condition "vect_cond_mixed" can be enabled on
> AArch64 (in a following patch).  Bootstrap and test on AArch64.  Is it OK?
> BTW, this patch is necessary for gcc.dg/vect/PR56541.c (on AArch64) which was
> added before in tree if-conversion patch.

Splitting this patch would have been very helpful. One patch each for the
new standard pattern names, and one patch for the refactor of vcond. As
it is, this patch is rather difficult to read.

> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index bd73bce..f51473a 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1053,7 +1053,7 @@
>      }
>  
>    cmp_fmt = gen_rtx_fmt_ee (cmp_operator, V2DImode, operands[1], operands[2]);
> -  emit_insn (gen_aarch64_vcond_internalv2div2di (operands[0], operands[1],
> +  emit_insn (gen_vcondv2div2di (operands[0], operands[1],
>                operands[2], cmp_fmt, operands[1], operands[2]));
>    DONE;
>  })
> @@ -2225,204 +2225,215 @@
>    DONE;
>  })
>  
> -(define_expand "aarch64_vcond_internal<mode><mode>"
> +(define_expand "vcond_mask_<mode><v_cmp_result>"
> +  [(match_operand:VALLDI 0 "register_operand")
> +   (match_operand:VALLDI 1 "nonmemory_operand")
> +   (match_operand:VALLDI 2 "nonmemory_operand")
> +   (match_operand:<V_cmp_result> 3 "register_operand")]
> +  "TARGET_SIMD"
> +{
> +  /* If we have (a = (P) ? -1 : 0);
> +     Then we can simply move the generated mask (result must be int).  */
> +  if (operands[1] == CONSTM1_RTX (<MODE>mode)
> +      && operands[2] == CONST0_RTX (<MODE>mode))
> +    emit_move_insn (operands[0], operands[3]);
> +  /* Similarly, (a = (P) ? 0 : -1) is just inverting the generated mask.  */
> +  else if (operands[1] == CONST0_RTX (<MODE>mode)
> +	   && operands[2] == CONSTM1_RTX (<MODE>mode))
> +    emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[3]));
> +  else
> +    {
> +      if (!REG_P (operands[1]))
> +	operands[1] = force_reg (<MODE>mode, operands[1]);
> +      if (!REG_P (operands[2]))
> +	operands[2] = force_reg (<MODE>mode, operands[2]);
> +      emit_insn (gen_aarch64_simd_bsl<mode> (operands[0], operands[3],
> +					     operands[1], operands[2]));
> +    }
> +
> +  DONE;
> +})
> +

This pattern is fine.

> +;; Patterns comparing two vectors to produce a mask.

This comment is insufficient. The logic in vec_cmp<mode><mode>_internal
does not always return the expected mask (in particular for NE), but this
is not made clear in the comment.

> +
> +(define_expand "vec_cmp<mode><mode>_internal"
>    [(set (match_operand:VSDQ_I_DI 0 "register_operand")
> -	(if_then_else:VSDQ_I_DI
> -	  (match_operator 3 "comparison_operator"
> -	    [(match_operand:VSDQ_I_DI 4 "register_operand")
> -	     (match_operand:VSDQ_I_DI 5 "nonmemory_operand")])
> -	  (match_operand:VSDQ_I_DI 1 "nonmemory_operand")
> -	  (match_operand:VSDQ_I_DI 2 "nonmemory_operand")))]
> +	  (match_operator 1 "comparison_operator"
> +	    [(match_operand:VSDQ_I_DI 2 "register_operand")
> +	     (match_operand:VSDQ_I_DI 3 "nonmemory_operand")]))]
>    "TARGET_SIMD"

<snip>

> +(define_expand "vec_cmp<mode><mode>"
> +  [(set (match_operand:VSDQ_I_DI 0 "register_operand")
> +	  (match_operator 1 "comparison_operator"
> +	    [(match_operand:VSDQ_I_DI 2 "register_operand")
> +	     (match_operand:VSDQ_I_DI 3 "nonmemory_operand")]))]
> +  "TARGET_SIMD"
> +{
> +  enum rtx_code code = GET_CODE (operands[1]);
> +
> +  emit_insn (gen_vec_cmp<mode><mode>_internal (operands[0],
> +					 operands[1], operands[2],
> +					 operands[3]));
> +  /* See comments in vec_cmp<mode><v_cmp_result>_internal, below
> +     operators are computed using other operators, and we need to
> +     revert the result.  */

s/revert/invert

There are no comments in vec_cmp<mode><v_cmp_result>_internal for this
to refer to. But, there should be - more comments would definitely help
this patch!

These comments apply equally to the other copies of this comment in the
patch.

> +  if (code == NE)
> +    emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
>    DONE;
>  })
>  
> -(define_expand "aarch64_vcond_internal<VDQF_COND:mode><VDQF:mode>"
> -  [(set (match_operand:VDQF_COND 0 "register_operand")
> -	(if_then_else:VDQF
> -	  (match_operator 3 "comparison_operator"
> -	    [(match_operand:VDQF 4 "register_operand")
> -	     (match_operand:VDQF 5 "nonmemory_operand")])
> -	  (match_operand:VDQF_COND 1 "nonmemory_operand")
> -	  (match_operand:VDQF_COND 2 "nonmemory_operand")))]
> +(define_expand "vec_cmp<mode><v_cmp_result>_internal"
> +  [(set (match_operand:<V_cmp_result> 0 "register_operand")
> +	(match_operator 1 "comparison_operator"
> +	    [(match_operand:VDQF 2 "register_operand")
> +	     (match_operand:VDQF 3 "nonmemory_operand")]))]

This needs much more thorough commenting. The logic is now extremely
difficult to verify as to which pattern is responible for inverting the
mask. This makes it hard to spot whether the inversions applied to the
unordered comparisons are correct. I can't really review the patch with
the logic split across uncommented patterns in this way - is there
anything you can do to clean it up?

> diff --git a/gcc/testsuite/gcc.target/aarch64/vect-vcond.c b/gcc/testsuite/gcc.target/aarch64/vect-vcond.c
> new file mode 100644
> index 0000000..b0ca6d1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vect-vcond.c

None of this is AArch64 specific?

> @@ -0,0 +1,59 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> +
> +float fa[1024] = {0.0}, fb[1024] = {0.0}, fc[1024];
> +double da[1024] = {0.0}, db[1024] = {0.0}, dc[1024];
> +
> +int vcond_f_i (int *x)
> +{
> +  int i = 0;
> +  for (i = 0; i < 1024; i++)
> +    {
> +      fa[i] -= 1.0;
> +      fb[i] += 1.0;
> +      fc[i] = (x[i] < i) ? fa[i] : fb[i];
> +    }
> +
> +  return 0;
> +}
> +
> +int vcond_f_u (unsigned int *x)
> +{
> +  unsigned int i = 0;
> +  for (i = 0; i < 1024; i++)
> +    {
> +      fa[i] -= 1.0;
> +      fb[i] += 1.0;
> +      fc[i] = (x[i] < i) ? fa[i] : fb[i];
> +    }
> +
> +  return 0;
> +}
> +
> +int vcond_d_i (long long *x)
> +{
> +  long long i = 0;
> +  for (i = 0; i < 1024; i++)
> +    {
> +      da[i] -= 1.0;
> +      db[i] += 1.0;
> +      dc[i] = (x[i] < i) ? da[i] : db[i];
> +    }
> +
> +  return 0;
> +}
> +
> +int vcond_d_u (unsigned long long *x)
> +{
> +  unsigned long long i = 0;
> +  for (i = 0; i < 1024; i++)
> +    {
> +      da[i] -= 1.0;
> +      db[i] += 1.0;
> +      dc[i] = (x[i] < i) ? da[i] : db[i];
> +    }
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */

Thanks,
James

next prev parent reply	other threads:[~2016-05-31 15:25 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-17  9:02 Bin Cheng
2016-05-24 10:17 ` Bin.Cheng
2016-05-31 17:07 ` James Greenhalgh [this message]
2016-06-08 14:58   ` Bin Cheng
2016-06-14 14:51     ` James Greenhalgh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160531152454.GA37947@arm.com \
    --to=james.greenhalgh@arm.com \
    --cc=Bin.Cheng@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).