public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [aarch64] Code-gen for vector initialization involving constants
Date: Tue, 25 Apr 2023 11:59:21 +0100	[thread overview]
Message-ID: <mpt354o448m.fsf@arm.com> (raw)
In-Reply-To: <CAAgBjMnwGk4fOc3PTM_agTXXFvt=767a3-AWOfSr23Xja6K81w@mail.gmail.com> (Prathamesh Kulkarni's message of "Fri, 3 Feb 2023 12:46:33 +0530")

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> Hi Richard,
> While digging thru aarch64_expand_vector_init, I noticed it gives
> priority to loading a constant first:
>  /* Initialise a vector which is part-variable.  We want to first try
>      to build those lanes which are constant in the most efficient way we
>      can.  */
>
> which results in suboptimal code-gen for following case:
> int16x8_t f_s16(int16_t x)
> {
>   return (int16x8_t) { x, x, x, x, x, x, x, 1 };
> }
>
> code-gen trunk:
> f_s16:
>         movi    v0.8h, 0x1
>         ins     v0.h[0], w0
>         ins     v0.h[1], w0
>         ins     v0.h[2], w0
>         ins     v0.h[3], w0
>         ins     v0.h[4], w0
>         ins     v0.h[5], w0
>         ins     v0.h[6], w0
>         ret
>
> The attached patch tweaks the following condition:
> if (n_var == n_elts && n_elts <= 16)
>   {
>     ...
>   }
>
> to pass if maxv >= 80% of n_elts, with 80% being an
> arbitrary "high enough" threshold. The intent is to dup
> the most repeating variable if it it's repetition
> is "high enough" and insert constants which should be "better" than
> loading constant first and inserting variables like in the above case.

I'm not too keen on the 80%.  Like you say, it seems a bit arbitrary.

The case above can also be handled by relaxing n_var == n_elts to
n_var >= n_elts - 1, so that if there's just one constant element,
we look for duplicated variable elements.  If there are none
(maxv == 1), but there is a constant element, we can duplicate
the constant element into a register.

The case when there's more than one constant element needs more thought
(and testcases :-)).  E.g. after a certain point, it would probably be
better to load the variable and constant parts separately and blend them
using TBL.  It also matters whether the constants are equal or not.

There are also cases that could be handled using EXT.

Plus, if we're inserting many variable elements that are already
in GPRs, we can probably do better by coalescing them into bigger
GPR values and inserting them as wider elements.

Because of things like that, I think we should stick to the
single-constant case for now.

Thanks,
Richard

  parent reply	other threads:[~2023-04-25 10:59 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-03  7:16 Prathamesh Kulkarni
2023-02-13  6:28 ` Prathamesh Kulkarni
2023-04-03 18:12   ` Prathamesh Kulkarni
2023-04-25 10:59 ` Richard Sandiford [this message]
2023-05-02  5:41   ` Prathamesh Kulkarni
2023-05-02  9:25     ` Richard Sandiford
2023-05-02 10:22       ` Prathamesh Kulkarni
2023-05-02 12:02         ` Richard Sandiford
2023-05-02 12:38           ` Prathamesh Kulkarni
2023-05-02 12:52             ` Richard Sandiford
2023-05-03 11:28               ` Prathamesh Kulkarni
2023-05-11 19:15                 ` Richard Sandiford
2023-05-15 14:09                   ` Prathamesh Kulkarni
2023-05-15 18:59                     ` Richard Sandiford
2023-05-17 15:23                       ` Prathamesh Kulkarni
2023-05-18  8:07                         ` Richard Sandiford
2023-05-18 14:41                           ` Prathamesh Kulkarni
2023-05-18 16:34                             ` Richard Sandiford
2023-05-19 10:56                               ` Prathamesh Kulkarni
2023-05-22  8:48                                 ` Richard Sandiford
2023-05-24  9:29                                   ` Prathamesh Kulkarni
2023-05-24 10:10                                     ` Richard Sandiford
2023-05-24 19:13                                       ` Prathamesh Kulkarni
2023-05-24 19:58                                         ` Richard Sandiford
2023-05-25  6:47                                           ` Prathamesh Kulkarni
2023-05-25  7:34                                             ` Richard Sandiford
2023-05-25  9:56                                               ` Prathamesh Kulkarni
2023-05-26  3:04                                                 ` Prathamesh Kulkarni
2023-05-30 18:53                                                   ` Richard Sandiford
2023-06-12 17:52                                                     ` Prathamesh Kulkarni
2023-05-24 19:50                                       ` Prathamesh Kulkarni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mpt354o448m.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=prathamesh.kulkarni@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).