public inbox for gcc-regression@sourceware.org
From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: Andrew Pinski <apinski@marvell.com>,
	"gcc-regression@gcc.gnu.org" <gcc-regression@gcc.gnu.org>
Subject: Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
Date: Wed, 24 Nov 2021 16:10:54 +0300
Message-ID: <C8340625-F126-47E1-9F72-F2A7BB6FCC84@linaro.org>
In-Reply-To: <VI1PR08MB5325CD76B09D23308803CCF8FF609@VI1PR08MB5325.eurprd08.prod.outlook.com>

Thanks, Tamar.

--
Maxim Kuvyrkov
https://www.linaro.org

> On 23 Nov 2021, at 13:57, Tamar Christina via Gcc-regression <gcc-regression@gcc.gnu.org> wrote:
> 
> Hi Maxim,
> 
>> -----Original Message-----
>> From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
>> Sent: Monday, November 22, 2021 3:21 PM
>> To: Andrew Pinski <apinski@marvell.com>; Tamar Christina
>> <Tamar.Christina@arm.com>
>> Cc: gcc-regression@gcc.gnu.org
>> Subject: Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR
>> tree-optimization/103228 and 103228: folding of (type) X op CST where type
>> is a nop convert
>> 
>> Hi Andrew,
>> 
>> It appears to be a secret option #3: your patch triggers weirdness in other
>> parts of the toolchain.  Specifically, after your patch, the workaround for
>> E843419 is triggered in BFD, and that seems to cause the code-size increase.
>> This is observable only in analysis of the actual final binary; the assembly
>> files look almost identical.
>> 
>> The bit that I don’t understand is that the workaround should’ve increased
>> the code-size by 4K, not by 8K.
>> 
>> Hi Tamar,
>> 
>> You’ve touched the E843419 workaround last (in 2019). Is it expected that a
>> single use of the workaround can cause an 8K increase?
> 
> Yes, the .text section is aligned to 4K to prevent us from re-introducing the
> issue while we're modifying code, and the veneer section itself is sized to a
> multiple of 4K so that it does not change the alignment of the user code that
> was aligned to 4K before.
> 
> Since binutils is single-pass, by the time we figure out how much space we
> actually need it's too late: we could already have done things like resolving
> absolute relocations, so we can't change it anymore.
> 
> So you end up consuming at most 8k for a single workaround.
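
The "at most 8K" bound above can be checked with a toy model. This is only a sketch of the arithmetic as described in this message, not of how BFD implements the workaround: the names PAGE, round_up and workaround_overhead are hypothetical, and the model assumes the cost is (a) padding to reach the next 4K boundary plus (b) a veneer section rounded up to a whole number of 4K pages.

```python
PAGE = 4096  # 4K alignment unit mentioned in the thread


def round_up(n, align=PAGE):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def workaround_overhead(offset, veneer_bytes):
    """Toy model (assumption, not BFD code): bytes spent on alignment
    padding plus a veneer section padded to a multiple of 4K."""
    align_padding = round_up(offset) - offset
    veneer_section = round_up(veneer_bytes) if veneer_bytes > 0 else 0
    return align_padding + veneer_section


# A small veneer always costs one full page; padding adds up to 4095 more.
best = workaround_overhead(offset=0x400000, veneer_bytes=8)
worst = max(workaround_overhead(0x400000 + k, 8) for k in range(PAGE))
```

In this model a single small veneer costs between 4096 and 8191 bytes, which is consistent with a single use of the workaround costing close to 8K rather than 4K.
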
> 
> Regards,
> Tamar
>> 
>> Regards,
>> 
>> --
>> Maxim Kuvyrkov
>> https://www.linaro.org
>> 
>>> On 20 Nov 2021, at 01:18, Andrew Pinski via Gcc-regression <gcc-regression@gcc.gnu.org> wrote:
>>> 
>>> I looked at this and all I saw was 2 additional instructions being added, both
>>> mov instructions due to some IV-OPTs differences (IV-OPTs is adding a cast
>>> inside the loop for some reason ...).
>>> So either I tested this incorrectly or the test method here is incorrect.
>>> 
>>> Thanks,
>>> Andrew Pinski
>>> 
>>> ________________________________________
>>> From: ci_notify@linaro.org <ci_notify@linaro.org>
>>> Sent: Friday, November 19, 2021 1:04 PM
>>> To: Andrew Pinski
>>> Cc: gcc-regression@gcc.gnu.org
>>> Subject: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix
>>> PR tree-optimization/103228 and 103228: folding of (type) X op CST
>>> where type is a nop convert
>>> 
>>> ----------------------------------------------------------------------
>>> After gcc commit 32221357007666124409ec3ee0d3a1cf263ebc9e
>>> Author: Andrew Pinski <apinski@marvell.com>
>>> 
>>>   Fix PR tree-optimization/103228 and 103228: folding of (type) X op
>>> CST where type is a nop convert
>>> 
>>> the following benchmarks grew in size by more than 1%:
>>> - 458.sjeng grew in size by 7% from 114269 to 122477 bytes
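
The reported numbers can be sanity-checked with a few lines of arithmetic. This is only a sketch using the two sizes quoted in this report; the variable names are illustrative.

```python
old_size = 114269  # bytes, last_good build (from the report)
new_size = 122477  # bytes, first_bad build (from the report)

delta = new_size - old_size            # absolute growth in bytes
growth_pct = 100.0 * delta / old_size  # relative growth in percent

# Split the growth into whole 4K pages plus a remainder.
pages, remainder = divmod(delta, 4096)

print(f"grew by {delta} bytes ({growth_pct:.2f}%): "
      f"{pages} x 4096 + {remainder}")
```

The growth is two full 4K pages plus a small remainder, which points at padding or alignment rather than at a large amount of newly generated code.
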
>>> 
>>> The reproducer instructions below can be used to re-build both "first_bad" and
>>> "last_good" cross-toolchains used in this bisection.  Naturally, the scripts will
>>> fail when triggering benchmarking jobs if you don't have access to Linaro
>>> TCWG CI.
>>> 
>>> For your convenience, we have uploaded tarballs with pre-processed
>>> source and assembly files at:
>>> - First_bad save-temps:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/save-temps/
>>> - Last_good save-temps:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/save-temps/
>>> - Baseline save-temps:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/save-temps/
>>> 
>>> Configuration:
>>> - Benchmark: SPEC CPU2006
>>> - Toolchain: GCC + Glibc + GNU Linker
>>> - Version: all components were built from their tip of trunk
>>> - Target: aarch64-linux-gnu
>>> - Compiler flags: -Os
>>> - Hardware: APM Mustang 8x X-Gene1
>>> 
>>> This benchmarking CI is work-in-progress, and we welcome feedback and
>>> suggestions at linaro-toolchain@lists.linaro.org .  We plan to add support
>>> for SPEC CPU2017 benchmarks and to provide "perf report/annotate" data
>>> behind these reports.
>>> 
>>> THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS,
>>> REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
>>> 
>>> This commit has regressed these CI configurations:
>>> - tcwg_bmk_gnu_apm/gnu-master-aarch64-spec2k6-Os
>>> 
>>> First_bad build:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/
>>> Last_good build:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/
>>> Baseline build:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/
>>> Even more details:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/
>>> 
>>> Reproduce builds:
>>> <cut>
>>> mkdir investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
>>> cd investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
>>> 
>>> # Fetch scripts
>>> git clone https://git.linaro.org/toolchain/jenkins-scripts
>>> 
>>> # Fetch manifests and test.sh script
>>> mkdir -p artifacts/manifests
>>> curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-baseline.sh --fail
>>> curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-parameters.sh --fail
>>> curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/test.sh --fail
>>> chmod +x artifacts/test.sh
>>> 
>>> # Reproduce the baseline build (build all pre-requisites)
>>> ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
>>> 
>>> # Save baseline build state (which is then restored in artifacts/test.sh)
>>> mkdir -p ./bisect
>>> rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
>>> 
>>> cd gcc
>>> 
>>> # Reproduce first_bad build
>>> git checkout --detach 32221357007666124409ec3ee0d3a1cf263ebc9e
>>> ../artifacts/test.sh
>>> 
>>> # Reproduce last_good build
>>> git checkout --detach 0e4a8656e818b669129a670057cbc21e5b723c18
>>> ../artifacts/test.sh
>>> 
>>> cd ..
>>> </cut>
>>> 
>>> Full commit (up to 1000 lines):
>>> <cut>
>>> commit 32221357007666124409ec3ee0d3a1cf263ebc9e
>>> Author: Andrew Pinski <apinski@marvell.com>
>>> Date:   Mon Nov 15 09:31:20 2021 +0000
>>> 
>>>   Fix PR tree-optimization/103228 and 103228: folding of (type) X op
>>> CST where type is a nop convert
>>> 
>>>   Currently we fold (type) X op CST into (type) (X op ((type-x) CST)) when the conversion widens
>>>   but not when the conversion is a nop. For the same reason why we move the widening conversion
>>>   (the possibility of removing an extra conversion), we should do the same if the conversion is a
>>>   nop.
>>> 
>>>   Committed as approved with the comment change.
>>> 
>>>           PR tree-optimization/103228
>>>           PR tree-optimization/55177
>>> 
>>>   gcc/ChangeLog:
>>> 
>>>           * match.pd ((type) X bitop CST): Also do this
>>>           transformation for nop conversions.
>>> 
>>>   gcc/testsuite/ChangeLog:
>>> 
>>>           * gcc.dg/tree-ssa/pr103228-1.c: New test.
>>>           * gcc.dg/tree-ssa/pr55177-1.c: New test.
>>> ---
>>> gcc/match.pd                               |  6 ++++--
>>> gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c | 11 +++++++++++
>>> gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c  | 14 ++++++++++++++
>>> 3 files changed, 29 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>> index 89df7b2a174..77d848d631e 100644
>>> --- a/gcc/match.pd
>>> +++ b/gcc/match.pd
>>> @@ -1616,8 +1616,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>>         Restrict it to GIMPLE to avoid endless recursions.  */
>>>       && (bitop != BIT_AND_EXPR || GIMPLE)
>>>       && (/* That's a good idea if the conversion widens the operand, thus
>>> -             after hoisting the conversion the operation will be narrower.  */
>>> -          TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
>>> +             after hoisting the conversion the operation will be narrower.
>>> +             It is also a good if the conversion is a nop as moves the
>>> +             conversion to one side; allowing for combining of the conversions.  */
>>> +          TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
>>>          /* It's also a good idea if the conversion is to a non-integer
>>>             mode.  */
>>>          || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
>>> new file mode 100644
>>> index 00000000000..a7539819cf2
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>> +int f(int a, int b)
>>> +{
>>> +  b|=1u;
>>> +  b|=2;
>>> +  return b;
>>> +}
>>> +/* { dg-final { scan-tree-dump-times "\\\| 3" 1 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "\\\| 1" 0 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "\\\| 2" 0 "optimized"} } */
>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
>>> new file mode 100644
>>> index 00000000000..de1a264345c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
>>> @@ -0,0 +1,14 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>> +extern int x;
>>> +
>>> +void foo(void)
>>> +{
>>> +  int a = __builtin_bswap32(x);
>>> +  a &= 0x5a5b5c5d;
>>> +  x = __builtin_bswap32(a);
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 0 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "& 1566333786" 1 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "& 1515936861" 0 "optimized"} } */
>>> </cut>
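
The identities that the two new tests rely on can be checked outside the compiler. The sketch below models 32-bit C integers with Python ints. The helper names (to_u32, to_i32, bswap32 and the *_unfolded / *_folded functions) are hypothetical and only mirror the test sources; the code says nothing about how match.pd performs the fold.

```python
MASK32 = 0xFFFFFFFF


def to_u32(x):
    """Model a conversion to a 32-bit unsigned value."""
    return x & MASK32


def to_i32(x):
    """Model a conversion to a 32-bit signed value (two's complement)."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def bswap32(x):
    """Byte-swap a 32-bit value, like __builtin_bswap32."""
    return int.from_bytes(to_u32(x).to_bytes(4, "little"), "big")


def f_unfolded(b):
    b = to_i32(to_u32(b) | 1)  # b |= 1u; goes through a nop conversion
    return to_i32(b | 2)       # b |= 2;


def f_folded(b):
    return to_i32(b | 3)       # what pr103228-1.c expects: a single "| 3"


def foo_unfolded(x):
    a = to_i32(bswap32(x))
    a &= 0x5A5B5C5D
    return to_i32(bswap32(a))


def foo_folded(x):
    # what pr55177-1.c expects: no bswap, mask is the byte-swapped constant
    return to_i32(x & 1566333786)


samples = [0, 1, 2, -1, 0x12345678, -0x80000000, 0x7FFFFFFF, -123456789]
all_equal = all(f_unfolded(v) == f_folded(v) and
                foo_unfolded(v) == foo_folded(v) for v in samples)
```

The constant 1566333786 in the second test is 0x5D5C5B5A, the byte-swapped form of 0x5A5B5C5D (1515936861), which is why the test expects the former and not the latter in the optimized dump.
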
> 




Thread overview: 4+ messages
2021-11-19 21:04 ci_notify
     [not found] ` <MWHPR18MB1213710571CA876BFC1D6045BF9C9@MWHPR18MB1213.namprd18.prod.outlook.com>
2021-11-22 15:21   ` [EXT] " Maxim Kuvyrkov
2021-11-23 10:57     ` Tamar Christina
2021-11-24 13:10       ` Maxim Kuvyrkov [this message]
