public inbox for gcc-patches@gcc.gnu.org
From: Andrew Pinski <pinskia@gmail.com>
To: Richard Biener <rguenther@suse.de>
Cc: Filip Kastl <fkastl@suse.cz>,
	gcc-patches@gcc.gnu.org, hubicka@ucw.cz,  jakub@redhat.com
Subject: Re: [RFC] gimple ssa: SCCP - A new PHI optimization pass
Date: Thu, 31 Aug 2023 14:44:02 -0700
Message-ID: <CA+=Sn1=hj7_=WXCbxPxB+Qam9xLE0FrQunVP_SaF0brg4db70g@mail.gmail.com>
In-Reply-To: <nycvar.YFH.7.77.849.2308311209240.22006@jbgna.fhfr.qr>

On Thu, Aug 31, 2023 at 5:15 AM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, 31 Aug 2023, Filip Kastl wrote:
>
> > > The most obvious places would be right after SSA construction and before RTL expansion.
> > > Can you provide measurements for those positions?
> >
> > The algorithm should only remove PHIs that break SSA form minimality. Since
> > GCC's SSA construction already produces minimal SSA form, the algorithm isn't
> > expected to remove any PHIs if run right after the construction. I even
> > measured it and indeed -- no PHIs got removed (except for 502.gcc_r, where the
> > algorithm managed to remove exactly 1 PHI, which is weird).
> >
> > I tried putting the pass before pass_expand. There aren't a lot of PHIs to
> > remove at that point, but there still are some.
>
> That's interesting.  Your placement at
>
>           NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
>           NEXT_PASS (pass_phiopt, true /* early_p */);
> +         NEXT_PASS (pass_sccp);
>
> and
>
>        NEXT_PASS (pass_tsan);
>        NEXT_PASS (pass_dse, true /* use DR analysis */);
>        NEXT_PASS (pass_dce);
> +      NEXT_PASS (pass_sccp);
>
> isn't immediately after the "best" existing pass we have to
> remove dead PHIs, which is pass_cd_dce.  phiopt might leave
> dead PHIs around, and the second instance runs long after the
> last CD-DCE.

Actually, the last phiopt runs before the last pass_cd_dce:
      NEXT_PASS (pass_dce, true /* update_address_taken_p */);
      /* After late DCE we rewrite no longer addressed locals into SSA
         form if possible.  */
      NEXT_PASS (pass_forwprop);
      NEXT_PASS (pass_sink_code, true /* unsplit edges */);
      NEXT_PASS (pass_phiopt, false /* early_p */);
      NEXT_PASS (pass_fold_builtins);
      NEXT_PASS (pass_optimize_widening_mul);
      NEXT_PASS (pass_store_merging);
      /* If DCE is not run before checking for uninitialized uses,
         we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
         However, this also causes us to misdiagnose cases that should be
         real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
      NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
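
The ordering claim can be checked mechanically against the excerpt above. This is only a rough sketch: it parses the NEXT_PASS lines quoted in this message (not the full passes.def), and the helper names pass_order and runs_before are made up for illustration.

```python
import re

# NEXT_PASS lines copied from the excerpt quoted in this message.
# The real passes.def is much longer; comments are omitted here.
EXCERPT = """
      NEXT_PASS (pass_dce, true /* update_address_taken_p */);
      NEXT_PASS (pass_forwprop);
      NEXT_PASS (pass_sink_code, true /* unsplit edges */);
      NEXT_PASS (pass_phiopt, false /* early_p */);
      NEXT_PASS (pass_fold_builtins);
      NEXT_PASS (pass_optimize_widening_mul);
      NEXT_PASS (pass_store_merging);
      NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
"""

NEXT_PASS_RE = re.compile(r"NEXT_PASS\s*\(\s*(\w+)")


def pass_order(text):
    """Return the pass names in the order they are scheduled."""
    return NEXT_PASS_RE.findall(text)


def runs_before(order, first, second):
    """True if the last `first` entry precedes the last `second` entry."""
    def last_index(name):
        return len(order) - 1 - order[::-1].index(name)
    return last_index(first) < last_index(second)


order = pass_order(EXCERPT)
print(runs_before(order, "pass_phiopt", "pass_cd_dce"))
```

Running the same check over a complete passes.def would need care with nested pass lists, which this sketch ignores.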

Thanks,
Andrew Pinski


>
> So I wonder if your pass just detects unnecessary PHIs we'd have
> removed by other means and what survives until RTL expansion is
> what we should count?



>
> Can you adjust your original early placement to right after
> the cd-dce pass and for the late placement turn the dce pass
> before it into cd-dce and re-do your measurements?
>
> > 500.perlbench_r
> > Started with 43111
> > Ended with 42942
> > Removed PHI % .39201131961680313700
> >
> > 502.gcc_r
> > Started with 141392
> > Ended with 140455
> > Removed PHI % .66269661649881181400
> >
> > 505.mcf_r
> > Started with 482
> > Ended with 478
> > Removed PHI % .82987551867219917100
> >
> > 523.xalancbmk_r
> > Started with 136040
> > Ended with 135629
> > Removed PHI % .30211702440458688700
> >
> > 531.deepsjeng_r
> > Started with 2150
> > Ended with 2148
> > Removed PHI % .09302325581395348900
> >
> > 541.leela_r
> > Started with 4664
> > Ended with 4650
> > Removed PHI % .30017152658662092700
> >
> > 557.xz_r
> > Started with 43
> > Ended with 43
> > Removed PHI % 0
> >
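The "Removed PHI %" figures quoted above are consistent with
(started - ended) / started * 100. A minimal Python sketch of that
arithmetic, using three of the quoted counts; the function name
removed_phi_percent is hypothetical and not part of the patch.

```python
def removed_phi_percent(started, ended):
    """Share of PHIs removed, as a percentage of the starting count."""
    if started == 0:
        return 0.0
    return (started - ended) / started * 100


# (started, ended) PHI counts copied from the quoted measurements.
counts = {
    "500.perlbench_r": (43111, 42942),
    "505.mcf_r": (482, 478),
    "557.xz_r": (43, 43),
}

for name, (started, ended) in counts.items():
    print(f"{name}: {removed_phi_percent(started, ended):.4f}%")
```
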
> > > Can the pass somehow be used as part of propagations like during value numbering?
> >
> > I don't think that the pass could be used as a part of different optimizations
> > since it works on the whole CFG (except for copy propagation as I noted in the
> > RFC). I'm adding Honza into Cc. He'll have more insight into this.
> >
> > > Could the new file be called gimple-ssa-sccp.cc or something similar?
> >
> > Certainly. Though I'm not sure: wouldn't tree-ssa-sccp.cc be more
> > appropriate?
> >
> > I'm thinking about naming the pass 'scc-copy' and the file
> > 'tree-ssa-scc-copy.cc'.
> >
> > > Removing some PHIs is nice, but it would be also interesting to know what
> > > are the effects on generated code size and/or performance.
> > > And also if it has any effects on debug information coverage.
> >
> > Regarding performance: I ran some benchmarks on a Zen3 machine with -O3 with
> > and without the new pass. *I got ~2% speedup for 505.mcf_r and 541.leela_r.
> > Here are the full results. What do you think? Should I run more benchmarks? Or
> > benchmark multiple times? Or run the benchmarks on different machines?*
> >
> > 500.perlbench_r
> > Without SCCP: 244.151807s
> > With SCCP: 242.448438s
> > -0.7025695913124297%
> >
> > 502.gcc_r
> > Without SCCP: 211.029606s
> > With SCCP: 211.614523s
> > +0.27640683243653763%
> >
> > 505.mcf_r
> > Without SCCP: 298.782621s
> > With SCCP: 291.671468s
> > -2.438069465197046%
> >
> > 523.xalancbmk_r
> > Without SCCP: 189.940639s
> > With SCCP: 189.876261s
> > -0.03390523894928332%
> >
> > 531.deepsjeng_r
> > Without SCCP: 250.63648s
> > With SCCP: 250.988624s
> > +0.1403027732444051%
> >
> > 541.leela_r
> > Without SCCP: 346.066278s
> > With SCCP: 339.692987s
> > -1.8761915152519792%
> >
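The runtime percentages quoted above appear to be computed relative to
the "With SCCP" time rather than the baseline, i.e.
(with - without) / with * 100; that reading reproduces the -2.438%
quoted for 505.mcf_r. A small Python sketch under that assumption, which
also prints the baseline-relative figure for comparison (function names
are illustrative only):

```python
def delta_vs_with(without_s, with_s):
    """Change in percent, relative to the time measured with the pass."""
    return (with_s - without_s) / with_s * 100


def delta_vs_baseline(without_s, with_s):
    """Change in percent, relative to the time measured without the pass."""
    return (with_s - without_s) / without_s * 100


# (without SCCP, with SCCP) wall times in seconds, from the quoted results.
timings = {
    "505.mcf_r": (298.782621, 291.671468),
    "541.leela_r": (346.066278, 339.692987),
}

for name, (without_s, with_s) in timings.items():
    print(f"{name}: {delta_vs_with(without_s, with_s):+.3f}% vs. with, "
          f"{delta_vs_baseline(without_s, with_s):+.3f}% vs. baseline")
```

The two conventions differ only in the second decimal place here, so the
"~2% speedup" summary holds either way.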
> > Regarding size: The pass doesn't seem to significantly reduce or increase the
> > size of the resulting binary. The differences were at most ~0.1%.
> >
> > Regarding debug info coverage: I didn't notice any additional guality testcases
> > failing after I applied the patch. *Is there any other way I should check
> > debug info coverage?*
> >
> >
> > Filip K
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
