public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
Cc: gcc Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [aarch64] Code-gen for vector initialization involving constants
Date: Wed, 24 May 2023 20:58:06 +0100	[thread overview]
Message-ID: <mptv8ghcxip.fsf@arm.com> (raw)
In-Reply-To: <CAAgBjMmB9WqP85xw3u7jFrk6Geq5E2LDgKorAJn0G+gkknTZMw@mail.gmail.com> (Prathamesh Kulkarni's message of "Thu, 25 May 2023 00:43:27 +0530")

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> On Wed, 24 May 2023 at 15:40, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
>> > On Mon, 22 May 2023 at 14:18, Richard Sandiford
>> > <richard.sandiford@arm.com> wrote:
>> >>
>> >> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
>> >> > Hi Richard,
>> >> > Thanks for the suggestions. Does the attached patch look OK ?
>> >> > Boostrap+test in progress on aarch64-linux-gnu.
>> >>
>> >> Like I say, please wait for the tests to complete before sending an RFA.
>> >> It saves a review cycle if the tests don't in fact pass.
>> > Right, sorry, will post patches after completion of testing henceforth.
>> >>
>> >> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> >> > index 29dbacfa917..e611a7cca25 100644
>> >> > --- a/gcc/config/aarch64/aarch64.cc
>> >> > +++ b/gcc/config/aarch64/aarch64.cc
>> >> > @@ -22332,6 +22332,43 @@ aarch64_unzip_vector_init (machine_mode mode, rtx vals, bool even_p)
>> >> >    return gen_rtx_PARALLEL (new_mode, vec);
>> >> >  }
>> >> >
>> >> > +/* Return true if INSN is a scalar move.  */
>> >> > +
>> >> > +static bool
>> >> > +scalar_move_insn_p (const rtx_insn *insn)
>> >> > +{
>> >> > +  rtx set = single_set (insn);
>> >> > +  if (!set)
>> >> > +    return false;
>> >> > +  rtx src = SET_SRC (set);
>> >> > +  rtx dest = SET_DEST (set);
>> >> > +  return is_a<scalar_mode>(GET_MODE (dest))
>> >> > +      && aarch64_mov_operand_p (src, GET_MODE (src));
>> >>
>> >> Formatting:
>> >>
>> >>   return (is_a<scalar_mode>(GET_MODE (dest))
>> >>           && aarch64_mov_operand_p (src, GET_MODE (src)));
>> >>
>> >> OK with that change if the tests pass, thanks.
>> > Unfortunately, the patch regressed vec-init-21.c:
>> >
>> > int8x16_t f_s8(int8_t x, int8_t y)
>> > {
>> >   return (int8x16_t) { x, y, 1, 2, 3, 4, 5, 6,
>> >                        7, 8, 9, 10, 11, 12, 13, 14 };
>> > }
>> >
>> > -O3 code-gen trunk:
>> > f_s8:
>> >         adrp    x2, .LC0
>> >         ldr     q0, [x2, #:lo12:.LC0]
>> >         ins     v0.b[0], w0
>> >         ins     v0.b[1], w1
>> >         ret
>> >
>> > -O3 code-gen patch:
>> > f_s8:
>> >         adrp    x2, .LC0
>> >         ldr     d31, [x2, #:lo12:.LC0]
>> >         adrp    x2, .LC1
>> >         ldr     d0, [x2, #:lo12:.LC1]
>> >         ins     v31.b[0], w0
>> >         ins     v0.b[0], w1
>> >         zip1    v0.16b, v31.16b, v0.16b
>> >         ret
>> >
>> > With trunk, it chooses the fallback sequence because both fallback
>> > and zip1 sequence had cost = 20, however with patch applied,
>> > we end up with zip1 sequence cost = 24 and fallback sequence
>> > cost = 28.
>> >
>> > This happens because of using insn_cost instead of
>> > set_rtx_cost for the following expression:
>> > (set (reg:QI 100)
>> >     (subreg/s/u:QI (reg/v:SI 94 [ y ]) 0))
>> > set_rtx_cost returns 0 for above expression but insn_cost returns 4.
>>
>> Yeah, was wondering why you'd dropped the set_rtx_cost thing,
>> but decided not to question it since using insn_cost seemed
>> reasonable if it worked.
> The attached patch uses set_rtx_cost for single_set and insn_cost
> otherwise for non debug insns similar to seq_cost.

FWIW, I think with the aarch64_mov_operand fix, the old way of using
insn_cost for everything would have worked too.  But either way is fine.

>> > This expression template appears twice in fallback sequence, which raises
>> > the cost to 28 from 20, while it appears once in each half of zip1 sequence,
>> > which raises the cost to 24 from 20, and so it now prefers zip1 sequence
>> > instead.
>> >
>> > I assumed this expression would be ignored because it looks like a scalar move,
>> > but that doesn't seem to be the case ?
>> > aarch64_classify_symbolic_expression returns
>> > SYMBOL_FORCE_TO_MEM for (subreg/s/u:QI (reg/v:SI 94 [ y ]) 0)
>> > and thus aarch64_mov_operand_p returns false.
>>
>> Ah, I guess it should be aarch64_mov_operand instead.  Confusing that
>> they're so different...
> Thanks, using aarch64_mov_operand worked.
>>
>> > Another issue with the zip1 sequence above is using same register x2
>> > for loading another half of constant in:
>> > adrp    x2, .LC1
>> >
>> > I guess this will create an output dependency from adrp x2, .LC0 ->
>> > adrp x2, .LC1
>> > and anti-dependency from  ldr d31, [x2, #:lo12:.LC0] -> adrp x2, .LC1
>> > essentially forcing almost the entire sequence (except ins
>> > instructions) to execute sequentially ?
>>
>> I'd expect modern cores to handle that via renaming.
> Ah right, thanks for the clarification.
>
> For some reason, it seems git diff is not formatting the patch correctly :/
> Or perhaps I am doing something wrongly.

No, I think it's fine.  It's just tabs vs. spaces.  A leading
"+" followed by a tab is still only indented 8 columns, whereas
"+" followed by 6 spaces is indented 7 columns.  So indentation
can look a bit weird in the diff.

I was accounting for that though. :)

> For eg, it shows:
> +  return is_a<scalar_mode>(GET_MODE (dest))
> +        && aarch64_mov_operand (src, GET_MODE (src));
> but after applying the patch, it's formatted correctly with "&&
> aarch64..."  right below is_a<scalar_mode>, both on column 10.

Yeah, the indentation itself was OK.  But there's an “emacs rule”
that says that parens should be used when splitting an expression
over multiple lines like this.  So:

-------
Formatting:

  return (is_a<scalar_mode>(GET_MODE (dest))
	  && aarch64_mov_operand_p (src, GET_MODE (src)));
-------

was about adding the parens.

> +  for (; seq; seq = NEXT_INSN (seq))
> +    if (NONDEBUG_INSN_P (seq)
> +	&& !scalar_move_insn_p (seq))
> +      {
> +	if (rtx set = single_set (seq))
> +	  cost += set_rtx_cost (set, speed);
> +	else
> +	  {
> +	    int this_cost = insn_cost (CONST_CAST_RTX_INSN (seq), speed);
> +	    if (this_cost > 0)
> +	      cost += this_cost;
> +	    else
> +	      cost++;
> +	  }
> +      }

I think it'd be better to do the single_set first, and pass the set
to scalar_move_insn_p.  I.e.

  for (; seq; seq = NEXT_INSN (seq))
    if (NONDEBUG_INSN_P (seq))
      {
	if (rtx set = single_set (seq))
	  {
	    if (!scalar_move_insn_p (set))
	      cost += set_rtx_cost (set, speed);
	  }
	else
	  {
	    int this_cost = insn_cost (CONST_CAST_RTX_INSN (seq), speed);
	    if (this_cost > 0)
	      cost += this_cost;
	    else
	      cost++;
	  }
      }

Then scalar_move_insn_p can just be the last three statements,
adjusted as follows:

  rtx src = SET_SRC (set);
  rtx dest = SET_DEST (set);
  return (is_a<scalar_mode> (GET_MODE (dest))
	  && aarch64_mov_operand (src, GET_MODE (dest)));

Note the space after ">", and that the mode passed to aarch64_mov_operand
is the destination mode (since the source mode might be VOIDmode).

OK with that change, thanks.

Thanks,
Richard

  reply	other threads:[~2023-05-24 19:58 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-03  7:16 Prathamesh Kulkarni
2023-02-13  6:28 ` Prathamesh Kulkarni
2023-04-03 18:12   ` Prathamesh Kulkarni
2023-04-25 10:59 ` Richard Sandiford
2023-05-02  5:41   ` Prathamesh Kulkarni
2023-05-02  9:25     ` Richard Sandiford
2023-05-02 10:22       ` Prathamesh Kulkarni
2023-05-02 12:02         ` Richard Sandiford
2023-05-02 12:38           ` Prathamesh Kulkarni
2023-05-02 12:52             ` Richard Sandiford
2023-05-03 11:28               ` Prathamesh Kulkarni
2023-05-11 19:15                 ` Richard Sandiford
2023-05-15 14:09                   ` Prathamesh Kulkarni
2023-05-15 18:59                     ` Richard Sandiford
2023-05-17 15:23                       ` Prathamesh Kulkarni
2023-05-18  8:07                         ` Richard Sandiford
2023-05-18 14:41                           ` Prathamesh Kulkarni
2023-05-18 16:34                             ` Richard Sandiford
2023-05-19 10:56                               ` Prathamesh Kulkarni
2023-05-22  8:48                                 ` Richard Sandiford
2023-05-24  9:29                                   ` Prathamesh Kulkarni
2023-05-24 10:10                                     ` Richard Sandiford
2023-05-24 19:13                                       ` Prathamesh Kulkarni
2023-05-24 19:58                                         ` Richard Sandiford [this message]
2023-05-25  6:47                                           ` Prathamesh Kulkarni
2023-05-25  7:34                                             ` Richard Sandiford
2023-05-25  9:56                                               ` Prathamesh Kulkarni
2023-05-26  3:04                                                 ` Prathamesh Kulkarni
2023-05-30 18:53                                                   ` Richard Sandiford
2023-06-12 17:52                                                     ` Prathamesh Kulkarni
2023-05-24 19:50                                       ` Prathamesh Kulkarni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptv8ghcxip.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=prathamesh.kulkarni@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).