From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=wKJs=BL=gmail.com=richard.guenther@sourceware.org>
Received: from mail-lj1-x231.google.com (mail-lj1-x231.google.com [IPv6:2a00:1450:4864:20::231])
	by sourceware.org (Postfix) with ESMTPS id D36513858D35
	for <gcc-patches@gcc.gnu.org>; Mon, 22 May 2023 13:32:14 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D36513858D35
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-lj1-x231.google.com with SMTP id 38308e7fff4ca-2af2696fd1cso39043821fa.2
        for <gcc-patches@gcc.gnu.org>; Mon, 22 May 2023 06:32:14 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1684762333; x=1687354333;
        h=content-transfer-encoding:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:from:to:cc:subject:date
         :message-id:reply-to;
        bh=9mUbOCJA4yIJ0ZdWdic+Z7kWRroJuskg6xc7l5qgHis=;
        b=ry75q7aV4ucOjfqRAbF1wwnJgcvIhtvrjN2uQzMjwTRgTDhU2BfRFSq/cdGM2ySSDJ
         ajK1GA1h1rAf8xHnialigU35xcskEgBLVMA4dsMojtwDTH21tGfhotcrIowq2b+XWhnD
         vK9Moqnp2tDmtgC44e5xOsdqlbhHT4Y7WHOOMNhl2yAf3tT4efhR6o2T7qdvjo0jnMtH
         udzsTaAo5us8Eag69+1MUew5OODdTFa1EFE8jnI5orPZVVpMdBTr5IBD2PISWn0LBKvt
         /JDfJLkrfBu8tdlXbGI5vbcQDBch1awWHTF7BVafh8qFjRtpWZP5TfSccSR5QoP1Ye9H
         9j6w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1684762333; x=1687354333;
        h=content-transfer-encoding:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=9mUbOCJA4yIJ0ZdWdic+Z7kWRroJuskg6xc7l5qgHis=;
        b=CfLTFkJr8zmPOTB5GpyEs79SRU0ZefGbPZXf9xsgrjKrkWkks80kkD4iLJQbrAuIfG
         dMs97uWOmLND4gAufV0gJyD2m20Y+qMkhFq3wIhfcdUCpPql7sq+0WwNv19DplivT/Fg
         4Ql1+64LiYA/rc0Wiaig4AHoeHT8aPL7fv7SSg+910Gza1QXXZUTTp/v+THTM0l/inox
         hrK283cgwqHscApRFY/IdKSTm/DPKDFFKmT31WuOUEqUVQjVRZ/fRBwqhui+ulzfmKLt
         tjYPygRgoF1ok+5EvyZEc3euTXf2CHAFEJahHoaWN/YtpqN8TSWczKHGq9oaDGupCD7b
         r8XA==
X-Gm-Message-State: AC+VfDz7Ck++HX0JS+d1kEptBHTXs9soXD+LVlRHQ0PTAQeFUPvMixuT
	IEEb4u0h3qlFw+7RlWajPPUw62XG92EUzJwffg0=
X-Google-Smtp-Source: ACHHUZ7dwPq8rPpuWtBYU5HZrQgeFAQ+/PeC8rA4mpwlG4o3jSVZuauFbDbhyHykrLDquxoL1TaPSSzuV3u9Od+2/EI=
X-Received: by 2002:a2e:834b:0:b0:2ad:ba85:9f7 with SMTP id
 l11-20020a2e834b000000b002adba8509f7mr4099237ljh.22.1684762333018; Mon, 22
 May 2023 06:32:13 -0700 (PDT)
MIME-Version: 1.0
References: <20230517122155.531-1-oluwatamilore.adebayo@arm.com>
 <20230518083909.15739-1-Oluwatamilore.Adebayo@arm.com> <mptedndv7tz.fsf@arm.com>
In-Reply-To: <mptedndv7tz.fsf@arm.com>
From: Richard Biener <richard.guenther@gmail.com>
Date: Mon, 22 May 2023 15:32:00 +0200
Message-ID: <CAFiYyc39tohRrUQzUovz0P7-jntB8GOas1VzvPFYaxBe3W1DTw@mail.gmail.com>
Subject: Re: [PATCH 1/4] Missed opportunity to use [SU]ABD
To: Oluwatamilore.Adebayo@arm.com, gcc-patches@gcc.gnu.org, 
	richard.guenther@gmail.com, richard.sandiford@arm.com
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_LOTSOFHASH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Thu, May 18, 2023 at 7:59=E2=80=AFPM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Thanks for the update.  Some of these comments would have applied
> to the first version, so sorry for not catching them first time.
>
> <Oluwatamilore.Adebayo@arm.com> writes:
> > From: oluade01 <oluwatamilore.adebayo@arm.com>
> >
> > This adds a recognition pattern for the non-widening
> > absolute difference (ABD).
> >
> > gcc/ChangeLog:
> >
> >       * doc/md.texi (sabd, uabd): Document them.
> >       * internal-fn.def (ABD): Use new optab.
> >       * optabs.def (sabd_optab, uabd_optab): New optabs,
> >       * tree-vect-patterns.cc (vect_recog_absolute_difference):
> >       Recognize the following idiom abs (a - b).
> >       (vect_recog_sad_pattern): Refactor to use
> >       vect_recog_absolute_difference.
> >       (vect_recog_abd_pattern): Use patterns found by
> >       vect_recog_absolute_difference to build a new ABD
> >       internal call.
> > ---
> >  gcc/doc/md.texi           |  10 ++
> >  gcc/internal-fn.def       |   3 +
> >  gcc/optabs.def            |   2 +
> >  gcc/tree-vect-patterns.cc | 255 +++++++++++++++++++++++++++++++++-----
> >  4 files changed, 239 insertions(+), 31 deletions(-)
> >
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index 07bf8bdebffb2e523f25a41f2b57e43c0276b745..3e65584d7efcd301f2c96a4=
0edd82d30b84462b8 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -5778,6 +5778,16 @@ Other shift and rotate instructions, analogous t=
o the
> >  Vector shift and rotate instructions that take vectors as operand 2
> >  instead of a scalar type.
> >
> > +@cindex @code{uabd@var{m}} instruction pattern
> > +@cindex @code{sabd@var{m}} instruction pattern
> > +@item @samp{uabd@var{m}}, @samp{sabd@var{m}}
> > +Signed and unsigned absolute difference instructions.  These
> > +instructions find the difference between operands 1 and 2
> > +then return the absolute value.  A C code equivalent would be:
> > +@smallexample
> > +op0 =3D op0 > op1 ? op0 - op1 : op1 - op0;
>
> Should be:
>
>   op0 =3D op1 > op2 ? op1 - op2 : op2 - op1;
>
> since op0 is the output.
>
> > +@end smallexample
> > +
> >  @cindex @code{avg@var{m}3_floor} instruction pattern
> >  @cindex @code{uavg@var{m}3_floor} instruction pattern
> >  @item @samp{avg@var{m}3_floor}
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..0f1724ecf37a31c231572ed=
f90b5577e2d82f468 100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -167,6 +167,9 @@ DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary=
)
> >  DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
> >  DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
> >
> > +DEF_INTERNAL_SIGNED_OPTAB_FN (ABD, ECF_CONST | ECF_NOTHROW, first,
> > +                           sabd, uabd, binary)
> > +
> >  DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, firs=
t,
> >                             savg_floor, uavg_floor, binary)
> >  DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first=
,
> > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > index 695f5911b300c9ca5737de9be809fa01aabe5e01..29bc92281a2175f898634cb=
e6af63c18021e5268 100644
> > --- a/gcc/optabs.def
> > +++ b/gcc/optabs.def
> > @@ -359,6 +359,8 @@ OPTAB_D (mask_fold_left_plus_optab, "mask_fold_left=
_plus_$a")
> >  OPTAB_D (extract_last_optab, "extract_last_$a")
> >  OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
> >
> > +OPTAB_D (uabd_optab, "uabd$a3")
> > +OPTAB_D (sabd_optab, "sabd$a3")
> >  OPTAB_D (savg_floor_optab, "avg$a3_floor")
> >  OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
> >  OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
> > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> > index a49b09539776c0056e77f99b10365d0a8747fbc5..50f1822f220c023027f4b0f=
777965f3757842fa2 100644
> > --- a/gcc/tree-vect-patterns.cc
> > +++ b/gcc/tree-vect-patterns.cc
> > @@ -770,6 +770,93 @@ vect_split_statement (vec_info *vinfo, stmt_vec_in=
fo stmt2_info, tree new_rhs,
> >      }
> >  }
> >
> > +/* Look for the following pattern
> > +     X =3D x[i]
> > +     Y =3D y[i]
> > +     DIFF =3D X - Y
> > +     DAD =3D ABS_EXPR<DIFF>
> > +
> > +   ABS_STMT should point to a statement of code ABS_EXPR or ABSU_EXPR.
> > +   If REJECT_UNSIGNED is true it aborts if the type of ABS_STMT is uns=
igned.
> > +   HALF_TYPE and UNPROM will be set should the statement be found to
> > +   be a widened operation.
> > +   DIFF_OPRNDS will be set to the two inputs of the MINUS_EXPR precedi=
ng
> > +   ABS_STMT, otherwise it will be set the operations found by
> > +   vect_widened_op_tree.
> > + */
> > +static bool
> > +vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
> > +                             tree *half_type, bool reject_unsigned,
> > +                             vect_unpromoted_value unprom[2],
> > +                             tree diff_oprnds[2])
> > +{
> > +  if (!abs_stmt)
> > +    return false;
> > +
> > +  /* FORNOW.  Can continue analyzing the def-use chain when this stmt =
in a phi
> > +     inside the loop (in case we are analyzing an outer-loop).  */
> > +  enum tree_code code =3D gimple_assign_rhs_code (abs_stmt);
> > +  if (code !=3D ABS_EXPR && code !=3D ABSU_EXPR)
> > +    return false;
> > +
> > +  tree abs_oprnd =3D gimple_assign_rhs1 (abs_stmt);
> > +  tree abs_type =3D TREE_TYPE (abs_oprnd);
> > +  if (!abs_oprnd)
> > +    return false;
> > +  if (reject_unsigned && TYPE_UNSIGNED (abs_type))
> > +    return false;
> > +  if (!ANY_INTEGRAL_TYPE_P (abs_type) || TYPE_OVERFLOW_WRAPS (abs_type=
))
> > +    return false;
>
> Could you explain the reject_unsigned behaviour?  I'd have expected
> TYPE_OVERFLOW_WRAPS (abs_type) to reject the unsigned case anyway.
>
> > +
> > +  /* Peel off conversions from the ABS input.  This can involve sign
> > +     changes (e.g.  from an unsigned subtraction to a signed ABS input=
)
> > +     or signed promotion, but it can't include unsigned promotion.
> > +     (Note that ABS of an unsigned promotion should have been folded
> > +     away before now anyway.)  */
>
> I'm not sure we can do the "unsigned subtraction to a signed ABS input"
> case.  The code:
>
> int
> f (unsigned int x, int y)
> {
>   unsigned int diff =3D x - y;
>   return __builtin_abs (diff);
> }
>
> is well-defined C (no undefined behaviour).  But it doesn't do the
> same thing as an absolute difference.
>
> So...
>
> > +  vect_unpromoted_value unprom_diff;
> > +  abs_oprnd =3D vect_look_through_possible_promotion (vinfo, abs_oprnd=
,
> > +                                                 &unprom_diff);
> > +  if (!abs_oprnd)
> > +    return false;
> > +  if (TYPE_PRECISION (unprom_diff.type) !=3D TYPE_PRECISION (abs_type)
> > +      && TYPE_UNSIGNED (unprom_diff.type)
> > +      && TYPE_UNSIGNED (abs_type))
> > +    return false;
>
> ...I think there are four valid cases:
>
> (1) !TYPE_UNSIGNED (abs_type)
>     && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd))
>     && TYPE_PRECISION (unprom_diff.type) =3D=3D TYPE_PRECISION (abs_type)
>     && regular unwidened MINUS_EXPR
>
>     Here the subtraction and ABS input are signed.  We can assume that
>     there is no signed overflow in the subtraction.
>
> (2) !TYPE_UNSIGNED (abs_type)
>     && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd))
>     && TYPE_PRECISION (unprom_diff.type) < TYPE_PRECISION (abs_type)
>     && !TYPE_UNSIGNED (unprom_diff.type)
>     && regular unwidened MINUS_EXPR
>
>     This is like (1) except that the MINUS_EXPR result is sign-extended
>     before the ABS.  Again we rely on having no signed overflow, hence th=
e
>     TYPE_OVERFLOW_UNDEFINED check on the subtraction type.  We can then
>     push the sign-extension of the subtraction result =E2=80=9Cdown=E2=80=
=9D to the inputs
>     of the subtraction.
>
> (3) !TYPE_UNSIGNED (abs_type)
>     && TYPE_PRECISION (unprom_diff.type) =3D=3D TYPE_PRECISION (abs_type)
>     && MINUS_EXPR of widened inputs
>
>     In other words;
>
>     - the ABS input is signed
>     - the MINUS_EXPR has the same precision as the ABS
>     - the MINUS_EXPR inputs are narrower than the result
>
>     In this case the MINUS_EXPR is known to give correct results for
>     all inputs, due to the extra range.  There's no need to rely on
>     TYPE_OVERFLOW_UNDEFINED.
>
> (4) !TYPE_UNSIGNED (abs_type)
>     && TYPE_PRECISION (unprom_diff.type) < TYPE_PRECISION (abs_type)
>     && !TYPE_UNSIGNED (unprom_diff.type)
>     && MINUS_EXPR of widened inputs
>
>     Like (3), but the subtraction result is further sign-extended
>     (somewhat pointlessly).
>
> Which is a very convoluated way of classifying it, sorry.

There's also the possibility that you see ABSU_EXPR instead of ABS_EXPR.

> I think this means that:
>
> (a) We should reject all cases in TYPE_UNSIGNED (abs_type) is true.
>     Absolute values with unsigned inputs are a niche game, if they're
>     valid at all.
>
> (b) We should require TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd))
>     whenever we see a MINUS_EXPR with unextended inputs (i.e. whenever
>     vect_widened_op_tree fails).
>
> I guess some of this is really a comment about the existing SAD_EXPR code=
.
>
> > +
> > +  /* We then detect if the operand of abs_expr is defined by a minus_e=
xpr.  */
> > +  stmt_vec_info diff_stmt_vinfo =3D vect_get_internal_def (vinfo, abs_=
oprnd);
> > +  if (!diff_stmt_vinfo)
> > +    return false;
> > +
> > +  /* FORNOW.  Can continue analyzing the def-use chain when this stmt =
in a phi
> > +     inside the loop (in case we are analyzing an outer-loop).  */
> > +  if (vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
> > +                          WIDEN_MINUS_EXPR,
> > +                          false, 2, unprom, half_type))
> > +    {
> > +      if (diff_oprnds)
> > +     {
> > +       diff_oprnds[0] =3D unprom[0].op;
> > +       diff_oprnds[1] =3D unprom[1].op;
> > +     }
> > +      return true;
> > +    }
> > +
> > +  // Failed to find a widen operation so we check for a regular MINUS_=
EXPR
> > +  gassign *diff =3D dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_v=
info));
> > +  if (diff_oprnds && diff && gimple_assign_rhs_code (diff) =3D=3D MINU=
S_EXPR)
> > +    {
> > +      diff_oprnds[0] =3D gimple_assign_rhs1 (diff);
> > +      diff_oprnds[1] =3D gimple_assign_rhs2 (diff);
> > +    }
> > +  else
> > +    return false;
> > +
> > +  *half_type =3D NULL_TREE;
> > +
> > +  return true;
>
> Very minor, but I think it'd be clearer to put the *half_type assignment
> and return true in the "if" body, and unconditionally return false
> at the end.  That makes it more consistent with the widening case.
>
> > +}
> > +
> >  /* Convert UNPROM to TYPE and return the result, adding new statements
> >     to STMT_INFO's pattern definition statements if no better way is
> >     available.  VECTYPE is the vector form of TYPE.
> > @@ -1308,40 +1395,13 @@ vect_recog_sad_pattern (vec_info *vinfo,
> >    /* FORNOW.  Can continue analyzing the def-use chain when this stmt =
in a phi
> >       inside the loop (in case we are analyzing an outer-loop).  */
> >    gassign *abs_stmt =3D dyn_cast <gassign *> (abs_stmt_vinfo->stmt);
> > -  if (!abs_stmt
> > -      || (gimple_assign_rhs_code (abs_stmt) !=3D ABS_EXPR
> > -       && gimple_assign_rhs_code (abs_stmt) !=3D ABSU_EXPR))
> > -    return NULL;
> >
> > -  tree abs_oprnd =3D gimple_assign_rhs1 (abs_stmt);
> > -  tree abs_type =3D TREE_TYPE (abs_oprnd);
> > -  if (TYPE_UNSIGNED (abs_type))
> > -    return NULL;
> > -
> > -  /* Peel off conversions from the ABS input.  This can involve sign
> > -     changes (e.g. from an unsigned subtraction to a signed ABS input)
> > -     or signed promotion, but it can't include unsigned promotion.
> > -     (Note that ABS of an unsigned promotion should have been folded
> > -     away before now anyway.)  */
> > -  vect_unpromoted_value unprom_diff;
> > -  abs_oprnd =3D vect_look_through_possible_promotion (vinfo, abs_oprnd=
,
> > -                                                 &unprom_diff);
> > -  if (!abs_oprnd)
> > -    return NULL;
> > -  if (TYPE_PRECISION (unprom_diff.type) !=3D TYPE_PRECISION (abs_type)
> > -      && TYPE_UNSIGNED (unprom_diff.type))
> > -    return NULL;
> > -
> > -  /* We then detect if the operand of abs_expr is defined by a minus_e=
xpr.  */
> > -  stmt_vec_info diff_stmt_vinfo =3D vect_get_internal_def (vinfo, abs_=
oprnd);
> > -  if (!diff_stmt_vinfo)
> > +  vect_unpromoted_value unprom[2];
> > +  if (!vect_recog_absolute_difference (vinfo, abs_stmt, &half_type,
> > +                                    true, unprom, NULL))
> >      return NULL;
> >
> > -  /* FORNOW.  Can continue analyzing the def-use chain when this stmt =
in a phi
> > -     inside the loop (in case we are analyzing an outer-loop).  */
> > -  vect_unpromoted_value unprom[2];
> > -  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN=
_MINUS_EXPR,
> > -                          false, 2, unprom, &half_type))
> > +  if (!half_type)
> >      return NULL;
> >
> >    vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
> > @@ -1363,6 +1423,138 @@ vect_recog_sad_pattern (vec_info *vinfo,
> >    return pattern_stmt;
> >  }
> >
> > +/* Function vect_recog_abd_pattern
> > +
> > +   Try to find the following ABsolute Difference (ABD) pattern:
> > +
> > +     VTYPE x, y, out;
> > +     type diff;
> > +   loop i in range:
> > +     S1 diff =3D x[i] - y[i]
> > +     S2 out[i] =3D ABS_EXPR <diff>;
> > +
> > +   where 'type' is a integer and 'VTYPE' is a vector of integers
> > +   the same size as 'type'
> > +
> > +   Input:
> > +
> > +   * STMT_VINFO: The stmt from which the pattern search begins
> > +
> > +   Output:
> > +
> > +   * TYPE_out: The type of the output of this pattern
> > +
> > +   * Return value: A new stmt that will be used to replace the sequenc=
e of
> > +     stmts that constitute the pattern; either SABD or UABD:
> > +     SABD_EXPR<x, y, out>
> > +     UABD_EXPR<x, y, out>
> > +
> > +      UABD expressions are used when the input types are
> > +      narrower than the output types or the output type is narrower
> > +      than 32 bits
> > + */
> > +
> > +static gimple *
> > +vect_recog_abd_pattern (vec_info *vinfo,
> > +             stmt_vec_info stmt_vinfo, tree *type_out)
> > +{
> > +  /* Look for the following patterns
> > +     X =3D x[i]
> > +     Y =3D y[i]
> > +     DIFF =3D X - Y
> > +     DAD =3D ABS_EXPR<DIFF>
> > +     out[i] =3D DAD
> > +
> > +     In which
> > +      - X, Y, DIFF, DAD all have the same type
> > +      - x, y, out are all vectors of the same type
> > +  */
> > +  gassign *last_stmt =3D dyn_cast <gassign *> (STMT_VINFO_STMT (stmt_v=
info));
> > +  if (!last_stmt)
> > +    return NULL;
> > +
> > +  tree out_type =3D TREE_TYPE (gimple_assign_lhs (last_stmt));
> > +
> > +  gassign *abs_stmt =3D last_stmt;
> > +  if (gimple_assign_cast_p (last_stmt))
> > +    {
> > +      tree last_rhs =3D gimple_assign_rhs1 (last_stmt);
> > +      if (!SSA_VAR_P (last_rhs))
> > +     return NULL;
> > +
> > +      abs_stmt =3D dyn_cast <gassign *> (SSA_NAME_DEF_STMT (last_rhs))=
;
> > +      if (!abs_stmt)
> > +     return NULL;
> > +    }
>
> Could you explain why this is needed?  I'd expect the function to be
> called on the cast rhs too, so I wouldn't expect that we'd need to handle
> casts explicitly.
>
> > +  vect_unpromoted_value unprom[2];
> > +  tree diff_oprnds[2];
> > +  tree half_type;
> > +  if (!vect_recog_absolute_difference (vinfo, abs_stmt, &half_type,
> > +                                    false, unprom, diff_oprnds))
> > +    return NULL;
> > +
> > +#define SAME_TYPE(A, B) (TYPE_PRECISION (A) =3D=3D TYPE_PRECISION (B))
> > +
> > +  tree abd_oprnds[2];
> > +  if (half_type)
> > +    {
> > +      if (!SAME_TYPE (unprom[0].type, unprom[1].type))
> > +     return NULL;
>
> I wouldn't have expected this to be unecessary.  half_type is supposed
> to be a common type that can hold all values of unprom[0].type and
> unprom[1].type.  We should just be able to do:
>
> > +      tree diff_type =3D TREE_TYPE (diff_oprnds[0]);
> > +      if (TYPE_PRECISION (out_type) !=3D TYPE_PRECISION (diff_type))
> > +     {
> > +       tree vectype =3D get_vectype_for_scalar_type (vinfo, half_type)=
;
> > +       vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
> > +                            half_type, unprom, vectype);
>
> ...this vect_convert_inputs unconditionally.  We need to check that
> the get_vectype_for_scalar_type call succeeds though.
>
> So does it work as:
>
>   if (half_type)
>     {
>       tree vectype =3D get_vectype_for_scalar_type (vinfo, half_type);
>       if (!vectype)
>         return false;
>       vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
>                            half_type, unprom, vectype);
>     }
>
> ?
>
> > +     }
> > +      else
> > +     {
> > +       abd_oprnds[0] =3D diff_oprnds[0];
> > +       abd_oprnds[1] =3D diff_oprnds[1];
> > +     }
> > +    }
> > +  else
> > +    {
> > +      if (unprom[0].op && unprom[1].op
> > +       && (!SAME_TYPE (unprom[0].type, unprom[1].type)
> > +       || !SAME_TYPE (unprom[0].type, out_type)))
> > +     return NULL;
>
> AIUI, the !half_type case shouldn't look at unprom, since it's handling
> simple MINUS_EXPRs.  I think we can just delete this "if" statement.
>
> > +
> > +      unprom[0].op =3D diff_oprnds[0];
> > +      unprom[1].op =3D diff_oprnds[1];
> > +      tree signed_out =3D signed_type_for (out_type);
> > +      tree signed_out_vectype =3D get_vectype_for_scalar_type (vinfo, =
signed_out);
>
> We need to check for success here too.
>
> > +      vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
> > +                        signed_out, unprom, signed_out_vectype);
> > +
> > +      if (!SAME_TYPE (TREE_TYPE (diff_oprnds[0]), TREE_TYPE (abd_oprnd=
s[0])))
> > +     return NULL;
>
> I don't think this is needed.
>
> > +    }
> > +
> > +  if (!SAME_TYPE (TREE_TYPE (abd_oprnds[0]), TREE_TYPE (abd_oprnds[1])=
)
> > +      || !SAME_TYPE (TREE_TYPE (abd_oprnds[0]), out_type))
> > +    return NULL;
>
> I also don't think this is needed.  AIUI, the previous code has done
> all the necessary correctness checks.
>
> > +
> > +  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
> > +
> > +  tree vectype =3D get_vectype_for_scalar_type (vinfo, out_type);
>
> I think instead we want the vector types computed above.  That is:
>
> - The ABD should be done on the vector version of half_type
>   if the subtraction was on promoted inputs.  The result of
>   the ABD should then be zero-extended (using vect_convert_output)
>   to out_type.
>
>   In particular, it's the sign of HALF_TYPE that decides whether
>   it's signed or unsigned ABD.
>
> - The ABD should be done on the vector version of signed_outtype
>   if the subtraction was on unpromoted inputs.  We then might need
>   to sign-cast it to outtype, if outtype is unsigned.  We can
>   use vect_convert_output for that too.
>
>   In other words, this case must use signed ABD.
>
> Hope I've got that right...
>
> Thanks,
> Richard
>
> > +  if (!vectype
> > +      || !direct_internal_fn_supported_p (IFN_ABD, vectype,
> > +                                       OPTIMIZE_FOR_SPEED))
> > +    return NULL;
> > +
> > +  *type_out =3D STMT_VINFO_VECTYPE (stmt_vinfo);
> > +
> > +  tree var =3D vect_recog_temp_ssa_var (out_type, NULL);
> > +  gcall *abd_stmt =3D gimple_build_call_internal (IFN_ABD, 2,
> > +                                             abd_oprnds[0], abd_oprnds=
[1]);
> > +  gimple_call_set_lhs (abd_stmt, var);
> > +  gimple_set_location (abd_stmt, gimple_location (last_stmt));
> > +  return abd_stmt;
> > +}
> > +
> >  /* Recognize an operation that performs ORIG_CODE on widened inputs,
> >     so that it can be treated as though it had the form:
> >
> > @@ -6439,6 +6631,7 @@ struct vect_recog_func
> >  static vect_recog_func vect_vect_recog_func_ptrs[] =3D {
> >    { vect_recog_bitfield_ref_pattern, "bitfield_ref" },
> >    { vect_recog_bit_insert_pattern, "bit_insert" },
> > +  { vect_recog_abd_pattern, "abd" },
> >    { vect_recog_over_widening_pattern, "over_widening" },
> >    /* Must come after over_widening, which narrows the shift as much as
> >       possible beforehand.  */