public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Richard Guenther <rguenther@suse.de>,
	 GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: Add a loop versioning pass
Date: Wed, 12 Dec 2018 18:43:00 -0000	[thread overview]
Message-ID: <87o99qegch.fsf@arm.com> (raw)
In-Reply-To: <CAFiYyc3R01O+RUQm2MX35G7_jUWMf=ERfqmuOdt0NKBWTJii3w@mail.gmail.com>	(Richard Biener's message of "Wed, 12 Dec 2018 13:05:40 +0100")

Richard Biener <richard.guenther@gmail.com> writes:
> On Thu, Dec 6, 2018 at 2:19 PM Richard Sandiford
>> Tested on x86_64-linux-gnu, aarch64-linux-gnu and aarch64_be-elf.
>> Also repeated the performance testing (but haven't yet tried an
>> LTO variant; will do that over the weekend).
>
> Any results?

Sorry, I should've remembered that finding time to run tests is easy,
finding time to analyse them is hard.

Speed-wise, the impact of the patch for LTO is similar to without,
with 554.roms_r being the main beneficiary for both AArch64 and x86_64.
I get a 6.8% improvement on Cortex-A57 with -Ofast -mcpu=native
-flto=jobserver.

Size-wise, there are three tests that grow by >=2% on x86_64:

549.fotonik3d_r: 5.5%
548.exchange2_r: 29.5%
554.roms_r: 39.6%

The roms increase seems fair enough given the benefit, since the
whole program uses a similar style for handling arrays.

fotonik3d is a mixed bag.  Some versioning conditions are from
things like:

  subroutine foo(a)
    real :: a(:,:,:)
    a(1,:,:) = ...
  end subroutine

where we version for the middle dimension having a stride of 1.
This could be eliminated if we had more semantic information.

Other conditions are for things like:

  subroutine foo(a)
    real :: a(:,:,:)
    a(:,1,:) = ...
  end subroutine

where the pass really does help, but unfortunately those loops
aren't hot and might not even be used at all.

Some opportunities are cases in which we avoid gather loads, such as
in the "mp" loads in the hot __material_mod_MOD_mat_updatee function.
But mp feeds a gather load itself and these assignments aren't a
critical part of the loop.

(I'm testing on an AVX-only system, so these are synthesised gathers
rather than real gathers.)

The growth in 548.exchange2_r comes from reasonable changes to cold code.
The test spends almost all its time in __brute_force_MOD_digits_2, which
contains the same code before and after the patch, but which accounts
for less than 1% of .text before the patch.

> I've skimmed over the updated patch and it looks
> a lot better now.
>
> +bool
> +loop_versioning
> +::find_per_loop_multiplication (address_info &address, address_term_info &term)
> +{
>
> is that what coding convention allows?

Not sure we've settled on something, so I improvised.

> For grepping I'd then say we should do
>
> bool loop_versioning::
> find_per_loop_multiplication (...)
>
> ;)

Sounds good to me.

> Anywhere else we you use
>
> loop_versioning::foo
>
> so please stick to that.

Yeah, I used that when I could avoid an argument spilling over 80 chars,
but I think I missed that the above function now fits into that category,
Will double-check the others.

> Otherwise OK.

Thanks :-)

> I think I don't see a testcase where we could version both loops in a nest
> so I'm not sure whether the transform works fine when you are only
> updating SSA form at the very end of the pass.

You mean something like:

  real :: foo(:,:), bar(:)

  do i=1,n
    do j=1,n
      foo(i,j) = ...
    end do
    bar(i) = ..
  end do

?  I can add a test if so.

At the moment the pass only looks for direct versioning opportunities
in inner loops, so the assignment to bar wouldn't be treated as a
versioning opportunity.  We should still hoist the version check
for foo outside both loops, which is tested by loop_versioning_4.f90
for cases in which the outer loop doesn't have its own array
arithmetic, but isn't tested for cases like the above.

> There may also be some subtle issues with substitute_and_fold being
> applied to non-up-to-date SSA form given it folds stmts looking at
> (single-use!) SSA edges.  The single-use guard might be what saves you
> here (SSA uses in the copies are not yet updated to point to the
> copied DEFs).

OK.  I was hoping that because we only apply substitute_and_fold
to new code, there would be no problem with uses elsewhere.

Would it be safer to:

  - version all loops we want to version
  - update SSA explicitly
  - apply substitute and fold to all "new" loops

?  Could we then get away with returning a 0 TODO at the end?

Thanks,
Richard

  reply	other threads:[~2018-12-12 18:43 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-30 10:55 Richard Biener
2018-11-09 10:46 ` Kyrill Tkachov
2018-11-28 16:48 ` Richard Sandiford
2018-11-29 11:31   ` Martin Jambor
2018-12-03 13:16   ` Richard Biener
2018-12-06 13:19     ` Richard Sandiford
2018-12-12 12:06       ` Richard Biener
2018-12-12 18:43         ` Richard Sandiford [this message]
2018-12-13 16:08           ` Richard Biener
2018-12-14 16:18             ` Richard Sandiford
2018-12-15 10:27               ` Richard Biener
  -- strict thread matches above, loose matches on Subject: below --
2018-10-24 13:41 Richard Sandiford
2018-10-25 14:16 ` Richard Biener
2018-10-25 16:03   ` Richard Sandiford
2018-10-26 14:49   ` Richard Biener
2018-10-25 16:16 ` David Malcolm
2018-11-28 14:16   ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o99qegch.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rguenther@suse.de \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).