public inbox for gcc@gcc.gnu.org
From: Richard Biener <richard.guenther@gmail.com>
To: Bingfeng Mei <bmei@broadcom.com>
Cc: "gcc@gcc.gnu.org" <gcc@gcc.gnu.org>
Subject: Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
Date: Tue, 28 Jan 2014 11:09:00 -0000	[thread overview]
Message-ID: <CAFiYyc1soyzwGiPvzfPuK5f87FnpZLcWb5JRhW_hzfiyT7BbnA@mail.gmail.com> (raw)
In-Reply-To: <B71DF1153024A14EABB94E39368E44A60426734F@SJEXCHMB13.corp.ad.broadcom.com>

On Wed, Jan 22, 2014 at 1:20 PM, Bingfeng Mei <bmei@broadcom.com> wrote:
> Hi,
> I noticed there is a regression in 4.8 relative to the ancient 4.5 in vectorization on our port. After a bit of investigation, I found the following code, which prefers the even|odd version over the lo|hi one. That is obviously the right choice for AltiVec and perhaps some other targets, but the even|odd versions (which expand to a series of instructions) are less efficient on our target than the lo|hi ones. Shouldn't there be a target-specific hook to make this choice instead of hard-coding it here, or some cost estimate comparing the two alternatives?

Hmm, what's the reason for a target to support both?  I think the idea
was that a target would only support one of them (the more efficient one).

Richard.

>      /* The result of a vectorized widening operation usually requires
>          two vectors (because the widened results do not fit into one vector).
>          The generated vector results would normally be expected to be
>          generated in the same order as in the original scalar computation,
>          i.e. if 8 results are generated in each vector iteration, they are
>          to be organized as follows:
>                 vect1: [res1,res2,res3,res4],
>                 vect2: [res5,res6,res7,res8].
>
>          However, in the special case that the result of the widening
>          operation is used in a reduction computation only, the order doesn't
>          matter (because when vectorizing a reduction we change the order of
>          the computation).  Some targets can take advantage of this and
>          generate more efficient code.  For example, targets like Altivec,
>          that support widen_mult using a sequence of {mult_even,mult_odd}
>          generate the following vectors:
>                 vect1: [res1,res3,res5,res7],
>                 vect2: [res2,res4,res6,res8].
>
>          When vectorizing outer-loops, we execute the inner-loop sequentially
>          (each vectorized inner-loop iteration contributes to VF outer-loop
>          iterations in parallel).  We therefore don't allow to change the
>          order of the computation in the inner-loop during outer-loop
>          vectorization.  */
>       /* TODO: Another case in which order doesn't *really* matter is when we
>          widen and then contract again, e.g. (short)((int)x * y >> 8).
>          Normally, pack_trunc performs an even/odd permute, whereas the
>          repack from an even/odd expansion would be an interleave, which
>          would be significantly simpler for e.g. AVX2.  */
>       /* In any case, in order to avoid duplicating the code below, recurse
>          on VEC_WIDEN_MULT_EVEN_EXPR.  If it succeeds, all the return values
>          are properly set up for the caller.  If we fail, we'll continue with
>          a VEC_WIDEN_MULT_LO/HI_EXPR check.  */
>       if (vect_loop
>           && STMT_VINFO_RELEVANT (stmt_info) == vect_used_by_reduction
>           && !nested_in_vect_loop_p (vect_loop, stmt)
>           && supportable_widening_operation (VEC_WIDEN_MULT_EVEN_EXPR,
>                                              stmt, vectype_out, vectype_in,
>                                              code1, code2, multi_step_cvt,
>                                              interm_types))
>         return true;
>
>
> Thanks,
> Bingfeng Mei


Thread overview: 6+ messages
2014-01-22 14:33 Bingfeng Mei
2014-01-28 11:09 ` Richard Biener [this message]
2014-01-28 11:56   ` Bingfeng Mei
2014-01-28 15:17     ` Richard Biener
2014-01-28 17:28       ` Bingfeng Mei
2014-01-29  9:36         ` Richard Biener
