Re: [Patch 1/4][Aarch64] v2: Implement Aarch64 SIMD ABI

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Sandiford <richard.sandiford@arm.com>
To: Steve Ellcey <sellcey@cavium.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch 1/4][Aarch64] v2: Implement Aarch64 SIMD ABI
Date: Fri, 07 Dec 2018 17:34:00 -0000	[thread overview]
Message-ID: <87in05utpk.fsf@arm.com> (raw)
In-Reply-To: <1541699539.12016.6.camel@cavium.com> (Steve Ellcey's message of	"Thu, 8 Nov 2018 17:52:20 +0000")

Sorry for the slow review.

Steve Ellcey <sellcey@cavium.com> writes:
> @@ -1470,6 +1479,45 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
>    return false;
>  }
>  
> +/* Return true if this is a definition of a vectorized simd function.  */
> +
> +static bool
> +aarch64_simd_decl_p (tree fndecl)
> +{
> +  tree fntype;
> +
> +  if (fndecl == NULL)
> +    return false;
> +  fntype = TREE_TYPE (fndecl);
> +  if (fntype == NULL)
> +    return false;
> +
> +  /* All functions with the aarch64_vector_pcs attribute use the simd ABI.  */
> +  if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)) != NULL)
> +    return true;
> +  /* Functions without the aarch64_vector_pcs or simd attribute never use the
> +     simd ABI.  */
> +  if (lookup_attribute ("simd", DECL_ATTRIBUTES (fndecl)) == NULL)
> +    return false;
> +  /* Functions with the simd attribute can generate three versions of a
> +     function, a masked vector function, an unmasked vector function,
> +     and a scalar version.  Only the vector versions use the simd ABI.  */
> +  return (VECTOR_TYPE_P (TREE_TYPE (fntype)));

Is this enough?  E.g.:

    void __attribute__ ((simd)) f (int *x) { *x = 1; }

generates SIMD clones but doesn't have a vector return type.

I'm not an expert on this stuff, but it looks like:

  struct cgraph_node *node = cgraph_node::get (fndecl);
  return node && node->simdclone;

might work.  But in some ways it would be cleaner to add the
aarch64_vector_pcs attribute for SIMD clones, e.g. via a new hook,
so that the function type is "correct".

It might be more efficient to save aarch64_simd_decl_p in cfun->machine.

> @@ -4863,6 +4949,7 @@ aarch64_process_components (sbitmap components, bool prologue_p)
>  	 mergeable with the current one into a pair.  */
>        if (!satisfies_constraint_Ump (mem)
>  	  || GP_REGNUM_P (regno) != GP_REGNUM_P (regno2)
> +	  || (aarch64_simd_decl_p (cfun->decl) && (FP_REGNUM_P (regno)))

Formatting nit: redundant brackets around FP_REGNUM_P (regno).

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 82af4d4..44261ee 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -724,7 +724,13 @@
>    ""
>  )
>  
> -(define_insn "simple_return"
> +(define_expand "simple_return"
> +  [(simple_return)]
> +  "aarch64_use_simple_return_insn_p ()"
> +  ""
> +)
> +
> +(define_insn "*simple_return"
>    [(simple_return)]
>    ""
>    "ret"

Can't you just change the condition on the existing define_insn,
without turning it in a define_expand?  Worth a comment explaining
why if not.

> @@ -1487,6 +1538,23 @@
>    [(set_attr "type" "neon_store1_2reg<q>")]
>  )
>  
> +(define_insn "storewb_pair<TX:mode>_<P:mode>"
> +  [(parallel
> +    [(set (match_operand:P 0 "register_operand" "=&k")
> +          (plus:P (match_operand:P 1 "register_operand" "0")
> +                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
> +     (set (mem:TX (plus:P (match_dup 0)
> +                  (match_dup 4)))

Should be indented under the (match_dup 0).

> +          (match_operand:TX 2 "register_operand" "w"))
> +     (set (mem:TX (plus:P (match_dup 0)
> +                  (match_operand:P 5 "const_int_operand" "n")))
> +          (match_operand:TX 3 "register_operand" "w"))])]

Think this last part should be:

     (set (mem:TX (plus:P (plus:P (match_dup 0)
				  (match_dup 4))
			  (match_operand:P 5 "const_int_operand" "n")))
	  (match_operand:TX 3 "register_operand" "w"))])]

> +  "TARGET_SIMD &&
> +   INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (<TX:MODE>mode)"

&& should be on the second line (which makes the line long enough to
need breaking).

> diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
> index e69de29..bf6e64a 100644
> --- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +
> +void
> +f (void)
> +{
> +  /* Clobber all fp/simd regs and verify that the correct ones are saved
> +     and restored in the prologue and epilogue of a SIMD function. */
> +  __asm__ __volatile__ ("" :::  "q0",  "q1",  "q2",  "q3");
> +  __asm__ __volatile__ ("" :::  "q4",  "q5",  "q6",  "q7");
> +  __asm__ __volatile__ ("" :::  "q8",  "q9", "q10", "q11");
> +  __asm__ __volatile__ ("" ::: "q12", "q13", "q14", "q15");
> +  __asm__ __volatile__ ("" ::: "q16", "q17", "q18", "q19");
> +  __asm__ __volatile__ ("" ::: "q20", "q21", "q22", "q23");
> +  __asm__ __volatile__ ("" ::: "q24", "q25", "q26", "q27");
> +  __asm__ __volatile__ ("" ::: "q28", "q29", "q30", "q31");
> +}
> +
> +/* { dg-final { scan-assembler {\sstp\td8, d9} } } */
> +/* { dg-final { scan-assembler {\sstp\td10, d11} } } */
> +/* { dg-final { scan-assembler {\sstp\td12, d13} } } */
> +/* { dg-final { scan-assembler {\sstp\td14, d15} } } */
> +/* { dg-final { scan-assembler {\sldp\td8, d9} } } */
> +/* { dg-final { scan-assembler {\sldp\td10, d11} } } */
> +/* { dg-final { scan-assembler {\sldp\td12, d13} } } */
> +/* { dg-final { scan-assembler {\sldp\td14, d15} } } */
> +/* { dg-final { scan-assembler-not {\sstp\tq} } } */
> +/* { dg-final { scan-assembler-not {\sldp\tq} } } */
> +/* { dg-final { scan-assembler-not {\sstp\tq[01234567]} } } */
> +/* { dg-final { scan-assembler-not {\sldp\tq[01234567]} } } */
> +/* { dg-final { scan-assembler-not {\sstp\tq1[6789]} } } */
> +/* { dg-final { scan-assembler-not {\sldp\tq1[6789]} } } */
> +/* { dg-final { scan-assembler-not {\sstr\t} } } */
> +/* { dg-final { scan-assembler-not {\sldr\t} } } */

Comment doesn't match code: this is testing a normal PCS function.

> diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c
> index e69de29..7d639a5e 100644
> --- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c
> +++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +
> +void __attribute__ ((aarch64_vector_pcs))
> +f (void)
> +{
> +  /* Clobber some fp/simd regs and verify that only those are saved
> +     and restored in the prologue and epilogue of a SIMD function. */
> +  __asm__ __volatile__ ("" :::  "q8",  "q9", "q10", "q11");
> +}
> +
> +/* { dg-final { scan-assembler {\sstp\tq8, q9} } } */
> +/* { dg-final { scan-assembler {\sstp\tq10, q11} } } */
> +/* { dg-final { scan-assembler-not {\sstp\tq[034567]} } } */
> +/* { dg-final { scan-assembler-not {\sldp\tq[034567]} } } */
> +/* { dg-final { scan-assembler-not {\sstp\tq1[23456789]} } } */
> +/* { dg-final { scan-assembler-not {\sldp\tq1[23456789]} } } */
> +/* { dg-final { scan-assembler-not {\sstp\tq2[456789]} } } */
> +/* { dg-final { scan-assembler-not {\sldp\tq2[456789]} } } */
> +/* { dg-final { scan-assembler-not {\sstp\td} } } */
> +/* { dg-final { scan-assembler-not {\sldp\td} } } */
> +/* { dg-final { scan-assembler-not {\sstr\t} } } */
> +/* { dg-final { scan-assembler-not {\sldr\t} } } */

This is a nice test, but I think it would also be good to have versions
that don't clobber full register pairs.  E.g. one without q9 and another
without q10 would test individual STR Qs.

LGTM otherwise.

Thanks,
Richard

next prev parent reply	other threads:[~2018-12-07 17:34 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-08 17:52 Steve Ellcey
2018-12-07 17:34 ` Richard Sandiford [this message]
2018-12-11 23:01   ` Steve Ellcey
2018-12-12 11:39     ` Richard Sandiford
2018-12-12 18:26       ` [EXT] " Steve Ellcey
2018-12-12 18:49         ` Richard Sandiford
2019-02-13 16:54 ` Szabolcs Nagy
2019-02-13 17:24   ` Steve Ellcey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87in05utpk.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=sellcey@cavium.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).