Re: [2/3][vect] Add widening add, subtract vect patterns

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Sandiford <richard.sandiford@arm.com>
To: Joel Hutton via Gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Joel Hutton <Joel.Hutton@arm.com>,  Richard Biener <rguenther@suse.de>
Subject: Re: [2/3][vect] Add widening add, subtract vect patterns
Date: Fri, 13 Nov 2020 12:16:57 +0000	[thread overview]
Message-ID: <mpta6vl30zq.fsf@arm.com> (raw)
In-Reply-To: <VI1PR0802MB2205B412A197A26D635E6199F5E70@VI1PR0802MB2205.eurprd08.prod.outlook.com> (Joel Hutton via Gcc-patches's message of "Thu, 12 Nov 2020 19:34:54 +0000")

[ There was a discussion on irc about how easy it would be to support
  internal functions and tree codes at the same time, so the agreement
  was to go for tree codes for now with a promise to convert the
  widening-related code to use internal functions for GCC 12. ]

Like Richard said, the new patterns need to be documented in md.texi
and the new tree codes need to be documented in generic.texi.

While we're using tree codes, I think we need to make the naming
consistent with other tree codes: WIDEN_PLUS_EXPR instead of
WIDEN_ADD_EXPR and WIDEN_MINUS_EXPR instead of WIDEN_SUB_EXPR.
Same idea for the VEC_* codes.

(In constrast, the internal functions do try to follow the optab names,
since there's usually a 1:1 correspondence.)

Joel Hutton via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi all,
>
> This patch adds widening add and widening subtract patterns to tree-vect-patterns.
>
> All 3 patches together bootstrapped and regression tested on aarch64.
>
> gcc/ChangeLog:
>
> 2020-11-12  Joel Hutton  <joel.hutton@arm.com>
>
>         * expr.c (expand_expr_real_2): add widen_add,widen_subtract cases

Not that I personally care about this stuff (would love to see changelogs
go away :-)) but some nits:

Each description is supposed to start with a capital letter and end with
a full stop (even if it's not a complete sentence).  Same for the rest
of the log.

>         * optabs-tree.c (optab_for_tree_code): optabs for widening adds,subtracts

The line limit for changelogs is 80 characters.  The entry should say
what changed, so “Handle …” or “Add case for …” or something.

>         * optabs.def (OPTAB_D): define vectorized widen add, subtracts
>         * tree-cfg.c (verify_gimple_assign_binary): Add case for widening adds, subtracts
>         * tree-inline.c (estimate_operator_cost): Add case for widening adds, subtracts
>         * tree-vect-generic.c (expand_vector_operations_1): Add case for widening adds, subtracts
>         * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog ptatern

typo: pattern

>         (vect_recog_widen_sub_pattern): New recog pattern
>         (vect_recog_average_pattern): Update widened add code
>         (vect_recog_average_pattern): Update widened add code
>         * tree-vect-stmts.c (vectorizable_conversion): Add case for widened add, subtract
>         (supportable_widening_operation): Add case for widened add, subtract
>         * tree.def (WIDEN_ADD_EXPR): New tree code
>         (WIDEN_SUB_EXPR): New tree code
>         (VEC_WIDEN_ADD_HI_EXPR): New tree code
>         (VEC_WIDEN_ADD_LO_EXPR): New tree code
>         (VEC_WIDEN_SUB_HI_EXPR): New tree code
>         (VEC_WIDEN_SUB_LO_EXPR): New tree code
>
> gcc/testsuite/ChangeLog:
>
> 2020-11-12  Joel Hutton  <joel.hutton@arm.com>
>
>         * gcc.target/aarch64/vect-widen-add.c: New test.
>         * gcc.target/aarch64/vect-widen-sub.c: New test.
>
>
> Ok for trunk?
>
> From e0c10ca554729b9e6d58dbd3f18ba72b2c3ee8bc Mon Sep 17 00:00:00 2001
> From: Joel Hutton <joel.hutton@arm.com>
> Date: Mon, 9 Nov 2020 15:44:18 +0000
> Subject: [PATCH 2/3] [vect] Add widening add, subtract patterns
>
> Add widening add, subtract patterns to tree-vect-patterns.
> Add aarch64 tests for patterns.
>
> fix sad

Would be good to expand on this for the final commit message.

> […]
> diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
> index 4dfda756932de1693667c39c6fabed043b20b63b..009dccfa3bd298bca7b3b45401a4cc2acc90ff21 100644
> --- a/gcc/optabs-tree.c
> +++ b/gcc/optabs-tree.c
> @@ -170,6 +170,23 @@ optab_for_tree_code (enum tree_code code, const_tree type,
>        return (TYPE_UNSIGNED (type)
>  	      ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
>  
> +    case VEC_WIDEN_ADD_LO_EXPR:
> +      return (TYPE_UNSIGNED (type)
> +	      ? vec_widen_uaddl_lo_optab  : vec_widen_saddl_lo_optab);
> +
> +    case VEC_WIDEN_ADD_HI_EXPR:
> +      return (TYPE_UNSIGNED (type)
> +	      ? vec_widen_uaddl_hi_optab  : vec_widen_saddl_hi_optab);
> +
> +    case VEC_WIDEN_SUB_LO_EXPR:
> +      return (TYPE_UNSIGNED (type)
> +	      ? vec_widen_usubl_lo_optab  : vec_widen_ssubl_lo_optab);
> +
> +    case VEC_WIDEN_SUB_HI_EXPR:
> +      return (TYPE_UNSIGNED (type)
> +	      ? vec_widen_usubl_hi_optab  : vec_widen_ssubl_hi_optab);
> +
> +

Nits: excess blank line at the end and excess space before the “:”s.

>      case VEC_UNPACK_HI_EXPR:
>        return (TYPE_UNSIGNED (type)
>  	      ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 78409aa14537d259bf90277751aac00d452a0d3f..a97cdb360781ca9c743e2991422c600626c75aa5 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -383,6 +383,14 @@ OPTAB_D (vec_widen_smult_even_optab, "vec_widen_smult_even_$a")
>  OPTAB_D (vec_widen_smult_hi_optab, "vec_widen_smult_hi_$a")
>  OPTAB_D (vec_widen_smult_lo_optab, "vec_widen_smult_lo_$a")
>  OPTAB_D (vec_widen_smult_odd_optab, "vec_widen_smult_odd_$a")
> +OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
> +OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
> +OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
> +OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
> +OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
> +OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
> +OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
> +OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
>  OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
>  OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
>  OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")

Looks like the current code groups signed stuff together and
unsigned stuff together, so would be good to follow that.

> diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..fc6966fd9f257170501247411d50428aaaabbdae
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
> @@ -0,0 +1,90 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -save-temps" } */
> +#include <stdint.h>
> +#include <string.h>
> +
> +#define ARR_SIZE 1024
> +
> +/* Should produce an uaddl */
> +void uadd_opt (uint32_t *foo, uint16_t *a, uint16_t *b)
> +{
> +    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> +    {
> +        foo[i]   = a[i]   + b[i];
> +        foo[i+1] = a[i+1] + b[i+1];
> +        foo[i+2] = a[i+2] + b[i+2];
> +        foo[i+3] = a[i+3] + b[i+3];
> +    }
> +}
> +
> +__attribute__((optimize (0)))
> +void uadd_nonopt (uint32_t *foo, uint16_t *a, uint16_t *b)
> +{
> +    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> +    {
> +        foo[i]   = a[i]   + b[i];
> +        foo[i+1] = a[i+1] + b[i+1];
> +        foo[i+2] = a[i+2] + b[i+2];
> +        foo[i+3] = a[i+3] + b[i+3];
> +    }
> +}
> +
> +/* Should produce an saddl */
> +void sadd_opt (int32_t *foo, int16_t *a, int16_t *b)
> +{
> +    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> +    {
> +        foo[i]   = a[i]   + b[i];
> +        foo[i+1] = a[i+1] + b[i+1];
> +        foo[i+2] = a[i+2] + b[i+2];
> +        foo[i+3] = a[i+3] + b[i+3];
> +    }
> +}
> +
> +__attribute__((optimize (0)))
> +void sadd_nonopt (int32_t *foo, int16_t *a, int16_t *b)
> +{
> +    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
> +    {
> +        foo[i]   = a[i]   + b[i];
> +        foo[i+1] = a[i+1] + b[i+1];
> +        foo[i+2] = a[i+2] + b[i+2];
> +        foo[i+3] = a[i+3] + b[i+3];
> +    }
> +}
> +
> +
> +void __attribute__((optimize (0)))
> +init(uint16_t *a, uint16_t *b)
> +{
> +    for( int i = 0; i < ARR_SIZE;i++)
> +    {
> +      a[i] = i;
> +      b[i] = 2*i;
> +    }
> +}
> +
> +int __attribute__((optimize (0)))
> +main()
> +{
> +    uint32_t foo_arr[ARR_SIZE];
> +    uint32_t bar_arr[ARR_SIZE];
> +    uint16_t a[ARR_SIZE];
> +    uint16_t b[ARR_SIZE];
> +
> +    init(a, b);
> +    uadd_opt(foo_arr, a, b);
> +    uadd_nonopt(bar_arr, a, b);
> +    if (memcmp(foo_arr, bar_arr, ARR_SIZE) != 0)
> +      return 1;
> +    sadd_opt((int32_t*) foo_arr, (int16_t*) a, (int16_t*) b);
> +    sadd_nonopt((int32_t*) bar_arr, (int16_t*) a, (int16_t*) b);
> +    if (memcmp(foo_arr, bar_arr, ARR_SIZE) != 0)
> +      return 1;
> +    return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "uaddl\t" 1} } */
> +/* { dg-final { scan-assembler-times "uaddl2\t" 1} } */
> +/* { dg-final { scan-assembler-times "saddl\t" 1} } */
> +/* { dg-final { scan-assembler-times "saddl2\t" 1} } */

Same comments as the previous patch about having a "+nosve" pragma
and about the scan-assembler-times lines.  Same for the sub test.

> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> index 5139f111fecc7ec6e0902145b808308a5e47450b..532692a5da03d6d9653e2d47a1218982b27c4539 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -3864,6 +3864,12 @@ verify_gimple_assign_binary (gassign *stmt)
>          return false;
>        }
>  
> +    case VEC_WIDEN_SUB_HI_EXPR:
> +    case VEC_WIDEN_SUB_LO_EXPR:
> +    case VEC_WIDEN_ADD_HI_EXPR:
> +    case VEC_WIDEN_ADD_LO_EXPR:
> +      return false;
> +

I think these should get the same validity checking as
VEC_WIDEN_MULT_HI_EXPR etc.

>      case VEC_WIDEN_LSHIFT_HI_EXPR:
>      case VEC_WIDEN_LSHIFT_LO_EXPR:
>        {
> […]
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index f68a87e05ed54145a25ccff598eeef9e57f9a759..331027444a6d95343eb110f7f9c7db19b40ee5ee 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -1086,8 +1086,10 @@ vect_recog_sad_pattern (vec_info *vinfo,
>       of the above pattern.  */
>  
>    tree plus_oprnd0, plus_oprnd1;
> -  if (!vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
> -				       &plus_oprnd0, &plus_oprnd1))
> +  if (!(vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
> +				       &plus_oprnd0, &plus_oprnd1)
> +	|| vect_reassociating_reduction_p (vinfo, stmt_vinfo, WIDEN_ADD_EXPR,
> +				       &plus_oprnd0, &plus_oprnd1)))
>      return NULL;
>  
>    tree sum_type = gimple_expr_type (last_stmt);

I think we should make:

  /* Any non-truncating sequence of conversions is OK here, since
     with a successful match, the result of the ABS(U) is known to fit
     within the nonnegative range of the result type.  (It cannot be the
     negative of the minimum signed value due to the range of the widening
     MINUS_EXPR.)  */
  vect_unpromoted_value unprom_abs;
  plus_oprnd0 = vect_look_through_possible_promotion (vinfo, plus_oprnd0,
						      &unprom_abs);

specific to the PLUS_EXPR case.  If we look through promotions on
the operands of a WIDEN_ADD_EXPR, we could potentially have a mixture
of signednesses involved, one on the operands of the WIDEN_ADD_EXPR
and one on its inputs.

E.g. if we had:

    uint32_t x;
    int32_t y = (int32_t) x;
    int64_t z = WIDEN_ADD_EXPR <y, …>;

I think the code above would look through the signedness change and
give the SAD_EXPR an unsigned extension instead of a signed one.

(Alternatively, we could try to handle WIDEN_ADD_EXPR with promoted
operands, but I think in practice it would be wasted work.)

> @@ -1148,7 +1150,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
>    /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
>       inside the loop (in case we are analyzing an outer-loop).  */
>    vect_unpromoted_value unprom[2];
> -  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, MINUS_EXPR,
> +  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_SUB_EXPR,
>  			     false, 2, unprom, &half_type))
>      return NULL;
>  
> @@ -1262,6 +1264,24 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
>  				      "vect_recog_widen_mult_pattern");
>  }
>  
> +static gimple *
> +vect_recog_widen_add_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
> +			       tree *type_out)
> +{
> +  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
> +				      PLUS_EXPR, WIDEN_ADD_EXPR, false,
> +				      "vect_recog_widen_add_pattern");
> +}
> +
> +static gimple *
> +vect_recog_widen_sub_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
> +			       tree *type_out)
> +{
> +  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
> +				      MINUS_EXPR, WIDEN_SUB_EXPR, false,
> +				      "vect_recog_widen_sub_pattern");
> +}
> +

The new functions should have comments before them.  Can probably
just use the vect_recog_widen_mult_pattern comment as a template.

Thanks,
Richard

next prev parent reply	other threads:[~2020-11-13 12:17 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-12 19:34 Joel Hutton
2020-11-13  7:58 ` Richard Biener
2020-11-13 12:16 ` Richard Sandiford [this message]
2020-11-13 16:48   ` Joel Hutton
2020-11-16 14:04     ` Richard Biener
2020-11-17 13:40       ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mpta6vl30zq.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=Joel.Hutton@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).