public inbox for gcc-regression@sourceware.org
From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
To: Andrew Pinski <apinski@marvell.com>,
	Tamar Christina <tamar.christina@arm.com>
Cc: "gcc-regression@gcc.gnu.org" <gcc-regression@gcc.gnu.org>
Subject: Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
Date: Mon, 22 Nov 2021 18:21:09 +0300
Message-ID: <2F6AEB7D-0E0B-496C-84A9-ED387338854B@linaro.org>
In-Reply-To: <MWHPR18MB1213710571CA876BFC1D6045BF9C9@MWHPR18MB1213.namprd18.prod.outlook.com>

Hi Andrew,

It appears to be secret option #3: your patch triggers weirdness in other parts of the toolchain.  Specifically, after your patch, the workaround for erratum E843419 is triggered in BFD, and that seems to cause the code-size increase.  This is observable only by analyzing the final linked binary; the assembly files look almost identical.

The bit that I don’t understand is that the workaround should’ve increased the code-size by 4K, not by 8K.

Hi Tamar,

You were the last to touch the E843419 workaround (in 2019).  Is it expected that a single use of the workaround can cause an 8K increase?

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org

> On 20 Nov 2021, at 01:18, Andrew Pinski via Gcc-regression <gcc-regression@gcc.gnu.org> wrote:
> 
> I looked at this and all I saw was 2 additional instructions being added, both mov instructions due to some IV-OPTs differences (IV-OPTs is adding a cast inside the loop for some reason ...).
> So either I tested this incorrectly or the test method here is incorrect.
> 
> Thanks,
> Andrew Pinski
> 
> ________________________________________
> From: ci_notify@linaro.org <ci_notify@linaro.org>
> Sent: Friday, November 19, 2021 1:04 PM
> To: Andrew Pinski
> Cc: gcc-regression@gcc.gnu.org
> Subject: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
> 
> After gcc commit 32221357007666124409ec3ee0d3a1cf263ebc9e
> Author: Andrew Pinski <apinski@marvell.com>
> 
>    Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
> 
> the following benchmarks grew in size by more than 1%:
> - 458.sjeng grew in size by 7% from 114269 to 122477 bytes
> 
> The reproducer instructions below can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection.  Naturally, the scripts will fail when triggering benchmarking jobs if you don't have access to Linaro TCWG CI.
> 
> For your convenience, we have uploaded tarballs with pre-processed source and assembly files at:
> - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/save-temps/
> - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/save-temps/
> - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/save-temps/
> 
> Configuration:
> - Benchmark: SPEC CPU2006
> - Toolchain: GCC + Glibc + GNU Linker
> - Version: all components were built from their tip of trunk
> - Target: aarch64-linux-gnu
> - Compiler flags: -Os
> - Hardware: APM Mustang 8x X-Gene1
> 
> This benchmarking CI is a work in progress, and we welcome feedback and suggestions at linaro-toolchain@lists.linaro.org .  Our improvement plans include adding support for SPEC CPU2017 benchmarks and providing "perf report/annotate" data behind these reports.
> 
> THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
> 
> This commit has regressed these CI configurations:
> - tcwg_bmk_gnu_apm/gnu-master-aarch64-spec2k6-Os
> 
> First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/
> Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/
> Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/
> Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/
> 
> Reproduce builds:
> <cut>
> mkdir investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
> cd investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
> 
> # Fetch scripts
> git clone https://git.linaro.org/toolchain/jenkins-scripts
> 
> # Fetch manifests and test.sh script
> mkdir -p artifacts/manifests
> curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-baseline.sh --fail
> curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-parameters.sh --fail
> curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/test.sh --fail
> chmod +x artifacts/test.sh
> 
> # Reproduce the baseline build (build all pre-requisites)
> ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
> 
> # Save baseline build state (which is then restored in artifacts/test.sh)
> mkdir -p ./bisect
> rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
> 
> cd gcc
> 
> # Reproduce first_bad build
> git checkout --detach 32221357007666124409ec3ee0d3a1cf263ebc9e
> ../artifacts/test.sh
> 
> # Reproduce last_good build
> git checkout --detach 0e4a8656e818b669129a670057cbc21e5b723c18
> ../artifacts/test.sh
> 
> cd ..
> </cut>
> 
> Full commit (up to 1000 lines):
> <cut>
> commit 32221357007666124409ec3ee0d3a1cf263ebc9e
> Author: Andrew Pinski <apinski@marvell.com>
> Date:   Mon Nov 15 09:31:20 2021 +0000
> 
>    Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
> 
>    Currently we fold (type) X op CST into (type) (X op ((type-x) CST)) when the conversion widens
>    but not when the conversion is a nop. For the same reason why we move the widening conversion
>    (the possibility of removing an extra conversion), we should do the same if the conversion is a
>    nop.
> 
>    Committed as approved with the comment change.
> 
>            PR tree-optimization/103228
>            PR tree-optimization/55177
> 
>    gcc/ChangeLog:
> 
>            * match.pd ((type) X bitop CST): Also do this
>            transformation for nop conversions.
> 
>    gcc/testsuite/ChangeLog:
> 
>            * gcc.dg/tree-ssa/pr103228-1.c: New test.
>            * gcc.dg/tree-ssa/pr55177-1.c: New test.
> ---
> gcc/match.pd                               |  6 ++++--
> gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c | 11 +++++++++++
> gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c  | 14 ++++++++++++++
> 3 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 89df7b2a174..77d848d631e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1616,8 +1616,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>          Restrict it to GIMPLE to avoid endless recursions.  */
>        && (bitop != BIT_AND_EXPR || GIMPLE)
>        && (/* That's a good idea if the conversion widens the operand, thus
> -             after hoisting the conversion the operation will be narrower.  */
> -          TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
> +             after hoisting the conversion the operation will be narrower.
> +             It is also a good if the conversion is a nop as moves the
> +             conversion to one side; allowing for combining of the conversions.  */
> +          TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
>           /* It's also a good idea if the conversion is to a non-integer
>              mode.  */
>           || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
> new file mode 100644
> index 00000000000..a7539819cf2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +int f(int a, int b)
> +{
> +  b|=1u;
> +  b|=2;
> +  return b;
> +}
> +/* { dg-final { scan-tree-dump-times "\\\| 3" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "\\\| 1" 0 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "\\\| 2" 0 "optimized"} } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
> new file mode 100644
> index 00000000000..de1a264345c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +extern int x;
> +
> +void foo(void)
> +{
> +  int a = __builtin_bswap32(x);
> +  a &= 0x5a5b5c5d;
> +  x = __builtin_bswap32(a);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 0 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "& 1566333786" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "& 1515936861" 0 "optimized"} } */
> </cut>




Thread overview: 4+ messages
2021-11-19 21:04 ci_notify
     [not found] ` <MWHPR18MB1213710571CA876BFC1D6045BF9C9@MWHPR18MB1213.namprd18.prod.outlook.com>
2021-11-22 15:21   ` Maxim Kuvyrkov [this message]
2021-11-23 10:57     ` [EXT] " Tamar Christina
2021-11-24 13:10       ` Maxim Kuvyrkov
