public inbox for gcc-regression@sourceware.org
* [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
@ 2021-11-19 21:04 ci_notify
       [not found] ` <MWHPR18MB1213710571CA876BFC1D6045BF9C9@MWHPR18MB1213.namprd18.prod.outlook.com>
  0 siblings, 1 reply; 4+ messages in thread
From: ci_notify @ 2021-11-19 21:04 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-regression

After gcc commit 32221357007666124409ec3ee0d3a1cf263ebc9e
Author: Andrew Pinski <apinski@marvell.com>

    Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert

the following benchmarks grew in size by more than 1%:
- 458.sjeng grew in size by 7% from 114269 to 122477 bytes

The reproducer instructions below can be used to rebuild both the "first_bad" and "last_good" cross-toolchains used in this bisection.  Naturally, the scripts will fail when triggering benchmarking jobs if you don't have access to Linaro TCWG CI.

For your convenience, we have uploaded tarballs with pre-processed source and assembly files at:
- First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/save-temps/
- Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/save-temps/
- Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/save-temps/

Configuration:
- Benchmark: SPEC CPU2006
- Toolchain: GCC + Glibc + GNU Linker
- Version: all components were built from their tip of trunk
- Target: aarch64-linux-gnu
- Compiler flags: -Os
- Hardware: APM Mustang 8x X-Gene1

This benchmarking CI is a work in progress, and we welcome feedback and suggestions at linaro-toolchain@lists.linaro.org.  Our improvement plans include adding support for SPEC CPU2017 benchmarks and providing "perf report/annotate" data behind these reports.

THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.

This commit has regressed these CI configurations:
 - tcwg_bmk_gnu_apm/gnu-master-aarch64-spec2k6-Os

First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/
Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/
Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/
Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/

Reproduce builds:
<cut>
mkdir investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
cd investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e

# Fetch scripts
git clone https://git.linaro.org/toolchain/jenkins-scripts

# Fetch manifests and test.sh script
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-baseline.sh --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-parameters.sh --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/test.sh --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/

cd gcc

# Reproduce first_bad build
git checkout --detach 32221357007666124409ec3ee0d3a1cf263ebc9e
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 0e4a8656e818b669129a670057cbc21e5b723c18
../artifacts/test.sh

cd ..
</cut>

Full commit (up to 1000 lines):
<cut>
commit 32221357007666124409ec3ee0d3a1cf263ebc9e
Author: Andrew Pinski <apinski@marvell.com>
Date:   Mon Nov 15 09:31:20 2021 +0000

    Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
    
    Currently we fold (type) X op CST into (type) (X op ((type-x) CST)) when the conversion widens
    but not when the conversion is a nop. For the same reason why we move the widening conversion
    (the possibility of removing an extra conversion), we should do the same if the conversion is a
    nop.
    
    Committed as approved with the comment change.
    
            PR tree-optimization/103228
            PR tree-optimization/55177
    
    gcc/ChangeLog:
    
            * match.pd ((type) X bitop CST): Also do this
            transformation for nop conversions.
    
    gcc/testsuite/ChangeLog:
    
            * gcc.dg/tree-ssa/pr103228-1.c: New test.
            * gcc.dg/tree-ssa/pr55177-1.c: New test.
---
 gcc/match.pd                               |  6 ++++--
 gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c | 11 +++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c  | 14 ++++++++++++++
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 89df7b2a174..77d848d631e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1616,8 +1616,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	  Restrict it to GIMPLE to avoid endless recursions.  */
        && (bitop != BIT_AND_EXPR || GIMPLE)
        && (/* That's a good idea if the conversion widens the operand, thus
-	      after hoisting the conversion the operation will be narrower.  */
-	   TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
+	      after hoisting the conversion the operation will be narrower.
+	      It is also a good if the conversion is a nop as moves the
+	      conversion to one side; allowing for combining of the conversions.  */
+	   TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
 	   /* It's also a good idea if the conversion is to a non-integer
 	      mode.  */
 	   || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
new file mode 100644
index 00000000000..a7539819cf2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+int f(int a, int b)
+{
+  b|=1u;
+  b|=2;
+  return b;
+}
+/* { dg-final { scan-tree-dump-times "\\\| 3" 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "\\\| 1" 0 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "\\\| 2" 0 "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
new file mode 100644
index 00000000000..de1a264345c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+extern int x;
+
+void foo(void)
+{
+  int a = __builtin_bswap32(x);
+  a &= 0x5a5b5c5d;
+  x = __builtin_bswap32(a);
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 0 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "& 1566333786" 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times "& 1515936861" 0 "optimized"} } */
</cut>
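
The two new tests in the commit above can be read as algebraic identities. What follows is a minimal sketch in C, not part of the commit. The function names (or_unfolded, or_folded, and_unfolded, and_folded, check_fold_identities) are made up for illustration, uint32_t stands in for the int used in pr55177-1.c so that the byte swap stays free of signedness concerns, and the unsigned-to-int conversion relies on GCC's documented modulo behaviour. __builtin_bswap32 is a GCC builtin (Clang accepts it as well).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* pr103228-1.c, written out step by step: "b |= 1u" is evaluated in
   unsigned int, converted back to int (a nop conversion), then "|= 2".  */
static int or_unfolded(int b)
{
  unsigned int t = (unsigned int) b | 1u;
  int r = (int) t;  /* implementation-defined in ISO C; modulo 2^N in GCC */
  return r | 2;
}

/* The form the test expects in the optimized dump: a single "| 3".  */
static int or_folded(int b)
{
  return b | 3;
}

/* pr55177-1.c, written out step by step: swap, mask, swap back.  */
static uint32_t and_unfolded(uint32_t x)
{
  uint32_t a = __builtin_bswap32(x);
  a &= 0x5a5b5c5du;
  return __builtin_bswap32(a);
}

/* The form the test expects: no bswap left, mask is the swapped constant.  */
static uint32_t and_folded(uint32_t x)
{
  return x & 1566333786u;
}

/* Compare both forms on a handful of sample values.  */
static void check_fold_identities(void)
{
  const int ints[] = { 0, 1, 2, 3, -1, 42, INT_MAX, INT_MIN };
  const uint32_t words[] = { 0u, 1u, 0x12345678u, 0x5a5b5c5du, 0xffffffffu };
  size_t i;

  /* 0x5a5b5c5d is 1515936861; byte-swapped it is 1566333786.  */
  assert(0x5a5b5c5du == 1515936861u);
  assert(__builtin_bswap32(0x5a5b5c5du) == 1566333786u);

  for (i = 0; i < sizeof ints / sizeof ints[0]; i++)
    assert(or_unfolded(ints[i]) == or_folded(ints[i]));

  for (i = 0; i < sizeof words / sizeof words[0]; i++)
    assert(and_unfolded(words[i]) == and_folded(words[i]));
}
```

Calling check_fold_identities() from a main() and building with GCC should finish silently if the identities hold for the sampled values. It only illustrates what the fold computes; it says nothing about the code-size change reported above.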
From: Andrew Pinski <apinski@marvell.com>
To: "ci_notify@linaro.org" <ci_notify@linaro.org>
CC: "gcc-regression@gcc.gnu.org" <gcc-regression@gcc.gnu.org>
Subject: Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR
 tree-optimization/103228 and 103228: folding of (type) X op CST where type is
 a nop convert
Date: Fri, 19 Nov 2021 22:18:18 +0000
Message-ID: <MWHPR18MB1213710571CA876BFC1D6045BF9C9@MWHPR18MB1213.namprd18.prod.outlook.com>
References: <856652192.9831.1637355872284@jenkins.jenkins>
In-Reply-To: <856652192.9831.1637355872284@jenkins.jenkins>

I looked at this and all I saw was 2 additional instructions being added, both mov instructions due to some IV-OPTs differences (IV-OPTs is adding a cast inside the loop for some reason ...).
So either I tested this incorrectly or the test method here is incorrect.

Thanks,
Andrew Pinski




* Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
       [not found] ` <MWHPR18MB1213710571CA876BFC1D6045BF9C9@MWHPR18MB1213.namprd18.prod.outlook.com>
@ 2021-11-22 15:21   ` Maxim Kuvyrkov
  2021-11-23 10:57     ` Tamar Christina
  0 siblings, 1 reply; 4+ messages in thread
From: Maxim Kuvyrkov @ 2021-11-22 15:21 UTC (permalink / raw)
  To: Andrew Pinski, Tamar Christina; +Cc: gcc-regression

Hi Andrew,

It appears to be secret option #3: your patch triggers weirdness in other parts of the toolchain.  Specifically, after your patch the workaround for E843419 is triggered in BFD, and that seems to cause the code-size increase.  This is observable only by analyzing the actual final binary; the assembly files look almost identical.

The bit that I don’t understand is that the workaround should’ve increased the code-size by 4K, not by 8K.
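
The 4K-versus-8K question can be checked against the sizes quoted in the report. This is a minimal sketch in C that uses only the two byte counts from the CI message. The helper names are hypothetical, and the 4096-byte granule is an assumption taken from the "4K" above, not a statement about how BFD lays out the workaround.

```c
#include <assert.h>

/* Sizes of 458.sjeng quoted in the CI report, in bytes.  */
enum { SIZE_LAST_GOOD = 114269, SIZE_FIRST_BAD = 122477 };

/* Assumed granule for illustration: the "4K" mentioned in the message.  */
enum { GRANULE_4K = 4096 };

/* Absolute growth between the two builds, in bytes.  */
static long size_delta(void)
{
  return (long) SIZE_FIRST_BAD - (long) SIZE_LAST_GOOD;
}

/* Number of whole 4K granules contained in a byte count.  */
static long whole_granules(long bytes)
{
  return bytes / GRANULE_4K;
}

/* Bytes left over once the whole granules are taken out.  */
static long leftover_bytes(long bytes)
{
  return bytes % GRANULE_4K;
}

/* Growth relative to the last_good size, truncated to a whole percent.  */
static long growth_percent(void)
{
  return size_delta() * 100 / SIZE_LAST_GOOD;
}
```

With these numbers the delta is 8208 bytes, which is two 4096-byte granules plus 16 bytes, and the truncated growth is 7%, matching the figure in the report.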

Hi Tamar,

You were the last to touch the E843419 workaround (in 2019). Is it expected that a single use of the workaround can cause an 8K increase?

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org

> On 20 Nov 2021, at 01:18, Andrew Pinski via Gcc-regression <gcc-regression@gcc.gnu.org> wrote:
> 
> I looked at this and all I saw was 2 additional instructions being added, both mov instructions due to some IV-OPTs differences (IV-OPTs is adding a cast inside the loop for some reason ...).
> So either I tested this incorrectly or the test method here is incorrect.
> 
> Thanks,
> Andrew Pinski
> 
> ________________________________________
> From: ci_notify@linaro.org <ci_notify@linaro.org>
> Sent: Friday, November 19, 2021 1:04 PM
> To: Andrew Pinski
> Cc: gcc-regression@gcc.gnu.org
> Subject: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
> 
> External Email
> 
> ----------------------------------------------------------------------
> After gcc commit 32221357007666124409ec3ee0d3a1cf263ebc9e
> Author: Andrew Pinski <apinski@marvell.com>
> 
>    Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
> 
> the following benchmarks grew in size by more than 1%:
> - 458.sjeng grew in size by 7% from 114269 to 122477 bytes
> 
> Below reproducer instructions can be used to re-build both "first_bad" and "last_good" cross-toolchains used in this bisection.  Naturally, the scripts will fail when triggerring benchmarking jobs if you don't have access to Linaro TCWG CI.
> 
> For your convenience, we have uploaded tarballs with pre-processed source and assembly files at:
> - First_bad save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/save-temps/
> - Last_good save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/save-temps/
> - Baseline save-temps: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/save-temps/
> 
> Configuration:
> - Benchmark: SPEC CPU2006
> - Toolchain: GCC + Glibc + GNU Linker
> - Version: all components were built from their tip of trunk
> - Target: aarch64-linux-gnu
> - Compiler flags: -Os
> - Hardware: APM Mustang 8x X-Gene1
> 
> This benchmarking CI is work-in-progress, and we welcome feedback and suggestions at linaro-toolchain@lists.linaro.org.  Our improvement plans include adding support for SPEC CPU2017 benchmarks and providing "perf report/annotate" data behind these reports.
> 
> THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS, REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
> 
> This commit has regressed these CI configurations:
> - tcwg_bmk_gnu_apm/gnu-master-aarch64-spec2k6-Os
> 
> First_bad build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/
> Last_good build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/
> Baseline build: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/
> Even more details: https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/
> 
> Reproduce builds:
> <cut>
> mkdir investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
> cd investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
> 
> # Fetch scripts
> git clone https://git.linaro.org/toolchain/jenkins-scripts
> 
> # Fetch manifests and test.sh script
> mkdir -p artifacts/manifests
> curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-baseline.sh --fail
> curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-parameters.sh --fail
> curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/test.sh --fail
> chmod +x artifacts/test.sh
> 
> # Reproduce the baseline build (build all pre-requisites)
> ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
> 
> # Save baseline build state (which is then restored in artifacts/test.sh)
> mkdir -p ./bisect
> rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
> 
> cd gcc
> 
> # Reproduce first_bad build
> git checkout --detach 32221357007666124409ec3ee0d3a1cf263ebc9e
> ../artifacts/test.sh
> 
> # Reproduce last_good build
> git checkout --detach 0e4a8656e818b669129a670057cbc21e5b723c18
> ../artifacts/test.sh
> 
> cd ..
> </cut>
> 
> Full commit (up to 1000 lines):
> <cut>
> commit 32221357007666124409ec3ee0d3a1cf263ebc9e
> Author: Andrew Pinski <apinski@marvell.com>
> Date:   Mon Nov 15 09:31:20 2021 +0000
> 
>    Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
> 
>    Currently we fold (type) X op CST into (type) (X op ((type-x) CST)) when the conversion widens
>    but not when the conversion is a nop. For the same reason why we move the widening conversion
>    (the possibility of removing an extra conversion), we should do the same if the conversion is a
>    nop.
> 
>    Committed as approved with the comment change.
> 
>            PR tree-optimization/103228
>            PR tree-optimization/55177
> 
>    gcc/ChangeLog:
> 
>            * match.pd ((type) X bitop CST): Also do this
>            transformation for nop conversions.
> 
>    gcc/testsuite/ChangeLog:
> 
>            * gcc.dg/tree-ssa/pr103228-1.c: New test.
>            * gcc.dg/tree-ssa/pr55177-1.c: New test.
> ---
> gcc/match.pd                               |  6 ++++--
> gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c | 11 +++++++++++
> gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c  | 14 ++++++++++++++
> 3 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 89df7b2a174..77d848d631e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1616,8 +1616,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>          Restrict it to GIMPLE to avoid endless recursions.  */
>        && (bitop != BIT_AND_EXPR || GIMPLE)
>        && (/* That's a good idea if the conversion widens the operand, thus
> -             after hoisting the conversion the operation will be narrower.  */
> -          TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
> +             after hoisting the conversion the operation will be narrower.
> +             It is also a good if the conversion is a nop as moves the
> +             conversion to one side; allowing for combining of the conversions.  */
> +          TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
>           /* It's also a good idea if the conversion is to a non-integer
>              mode.  */
>           || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
> new file mode 100644
> index 00000000000..a7539819cf2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +int f(int a, int b)
> +{
> +  b|=1u;
> +  b|=2;
> +  return b;
> +}
> +/* { dg-final { scan-tree-dump-times "\\\| 3" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "\\\| 1" 0 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "\\\| 2" 0 "optimized"} } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
> new file mode 100644
> index 00000000000..de1a264345c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +extern int x;
> +
> +void foo(void)
> +{
> +  int a = __builtin_bswap32(x);
> +  a &= 0x5a5b5c5d;
> +  x = __builtin_bswap32(a);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 0 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "& 1566333786" 1 "optimized"} } */
> +/* { dg-final { scan-tree-dump-times "& 1515936861" 0 "optimized"} } */
> </cut>




* RE: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
  2021-11-22 15:21   ` [EXT] " Maxim Kuvyrkov
@ 2021-11-23 10:57     ` Tamar Christina
  2021-11-24 13:10       ` Maxim Kuvyrkov
  0 siblings, 1 reply; 4+ messages in thread
From: Tamar Christina @ 2021-11-23 10:57 UTC (permalink / raw)
  To: Maxim Kuvyrkov, Andrew Pinski; +Cc: gcc-regression

Hi Maxim,

> -----Original Message-----
> From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
> Sent: Monday, November 22, 2021 3:21 PM
> To: Andrew Pinski <apinski@marvell.com>; Tamar Christina
> <Tamar.Christina@arm.com>
> Cc: gcc-regression@gcc.gnu.org
> Subject: Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR
> tree-optimization/103228 and 103228: folding of (type) X op CST where type
> is a nop convert
>
> Hi Andrew,
>
> It appears to be a secret option #3: your patch triggers weirdness in other
> parts of the toolchain.  Specifically, after your patch the workaround for
> E843419 is triggered in BFD, and that seems to cause the code-size increase.
> This is observable only when analysing the actual final binary; the assembly
> files look almost identical.
>
> The bit that I don’t understand is that the workaround should’ve increased
> the code-size by 4K, not by 8K.
>
> Hi Tamar,
>
> You’ve touched E843419 workaround last (in 2019) — is it expected that a
> single use of the workaround can cause 8K increase?

Yes. The .text section is aligned to 4K to prevent us from re-introducing the issue
while we're modifying code, and the veneer section itself is sized to a multiple of
4K so that it does not change the alignment of user code that was aligned to 4K
before.

Since binutils is single pass, by the time we figure out how much space we actually
need it's too late: we could already have done things like resolving absolute
relocations, so we can't change it anymore.

So you end up consuming at most 8K for a single workaround.
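
As a rough illustration of that arithmetic, here is a small sketch. It is a
simplified model of the explanation above, not the actual BFD logic; the
function names and the 16-byte veneer size are made up for the example. It
checks the reported growth of 458.sjeng against the two 4K pages:

```python
PAGE = 4096  # the 4K granule mentioned above


def round_up(size: int, align: int = PAGE) -> int:
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def growth_upper_bound(veneer_bytes: int) -> int:
    """Upper bound on growth in this simplified model: up to one page of
    alignment padding, plus the veneer section rounded up to whole pages."""
    return PAGE + round_up(veneer_bytes)


# Sizes reported by the CI for 458.sjeng.
last_good, first_bad = 114269, 122477
delta = first_bad - last_good
pages, remainder = divmod(delta, PAGE)

print(delta, pages, remainder)   # 8208 2 16
print(growth_upper_bound(16))    # 8192; 16 is a hypothetical veneer size
```

The 16 bytes left over after the two pages are outside this model. The thread
mentions two extra mov instructions, which would account for 8 of them at
4 bytes per AArch64 instruction.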

Regards,
Tamar
>
> Regards,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>


* Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
  2021-11-23 10:57     ` Tamar Christina
@ 2021-11-24 13:10       ` Maxim Kuvyrkov
  0 siblings, 0 replies; 4+ messages in thread
From: Maxim Kuvyrkov @ 2021-11-24 13:10 UTC (permalink / raw)
  To: Tamar Christina; +Cc: Andrew Pinski, gcc-regression

Thanks, Tamar.

--
Maxim Kuvyrkov
https://www.linaro.org

> On 23 Nov 2021, at 13:57, Tamar Christina via Gcc-regression <gcc-regression@gcc.gnu.org> wrote:
> 
> Hi Maxim,
> 
>> -----Original Message-----
>> From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
>> Sent: Monday, November 22, 2021 3:21 PM
>> To: Andrew Pinski <apinski@marvell.com>; Tamar Christina
>> <Tamar.Christina@arm.com>
>> Cc: gcc-regression@gcc.gnu.org
>> Subject: Re: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR
>> tree-optimization/103228 and 103228: folding of (type) X op CST where type
>> is a nop convert
>> 
>> Hi Andrew,
>> 
>> It appears to be a secret option #3: your patch triggers weirdness in other
>> parts of the toolchain.  Specifically, after your patch the workaround for
>> E843419 is triggered in BFD, and that seems to cause the code-size increase.
>> This is observable only when analysing the actual final binary; the assembly
>> files look almost identical.
>> 
>> The bit that I don’t understand is that the workaround should’ve increased
>> the code-size by 4K, not by 8K.
>> 
>> Hi Tamar,
>> 
>> You’ve touched E843419 workaround last (in 2019) — is it expected that a
>> single use of the workaround can cause 8K increase?
> 
> Yes. The .text section is aligned to 4K to prevent us from re-introducing the issue
> while we're modifying code, and the veneer section itself is sized to a multiple of
> 4K so that it does not change the alignment of user code that was aligned to 4K
> before.
> 
> Since binutils is single pass, by the time we figure out how much space we actually
> need it's too late: we could already have done things like resolving absolute
> relocations, so we can't change it anymore.
> 
> So you end up consuming at most 8K for a single workaround.
> 
> Regards,
> Tamar
>> 
>> Regards,
>> 
>> --
>> Maxim Kuvyrkov
>> https://www.linaro.org
>> 
>>> On 20 Nov 2021, at 01:18, Andrew Pinski via Gcc-regression <gcc-
>> regression@gcc.gnu.org> wrote:
>>> 
>>> I looked at this and all I saw was 2 additional instructions being added, both
>> mov instructions due to some IV-OPTs differences (IV-OPTs is adding an cast
>> inside the loop for some reason ...).
>>> So either I tested this incorrectly or the test method here is incorrect.
>>> 
>>> Thanks,
>>> Andrew Pinski
>>> 
>>> ________________________________________
>>> From: ci_notify@linaro.org <ci_notify@linaro.org>
>>> Sent: Friday, November 19, 2021 1:04 PM
>>> To: Andrew Pinski
>>> Cc: gcc-regression@gcc.gnu.org
>>> Subject: [EXT] [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix
>>> PR tree-optimization/103228 and 103228: folding of (type) X op CST
>>> where type is a nop convert
>>> 
>>> External Email
>>> 
>>> ----------------------------------------------------------------------
>>> After gcc commit 32221357007666124409ec3ee0d3a1cf263ebc9e
>>> Author: Andrew Pinski <apinski@marvell.com>
>>> 
>>>   Fix PR tree-optimization/103228 and 103228: folding of (type) X op
>>> CST where type is a nop convert
>>> 
>>> the following benchmarks grew in size by more than 1%:
>>> - 458.sjeng grew in size by 7% from 114269 to 122477 bytes
>>> 
>>> Below reproducer instructions can be used to re-build both "first_bad" and
>> "last_good" cross-toolchains used in this bisection.  Naturally, the scripts will
>> fail when triggerring benchmarking jobs if you don't have access to Linaro
>> TCWG CI.
>>> 
>>> For your convenience, we have uploaded tarballs with pre-processed
>> source and assembly files at:
>>> - First_bad save-temps:
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.linaro.org_job
>>> _tcwg-5Fbmk-5Fci-5Fgnu-2Dbisect-2Dtcwg-5Fbmk-5Fapm-2Dgnu-
>> 2Dmaster-2Daa
>>> rch64-2Dspec2k6-2DOs_10_artifact_artifacts_build-
>> 2D3222135700766612440
>>> 9ec3ee0d3a1cf263ebc9e_save-
>> 2Dtemps_&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&
>>> r=L_uAQMgirzaBwiEk05NHY-
>> AMcNfJzugOS_xTjrtS94k&m=zOZlNQh1duLJy_yHjH4z6m
>>> 
>> DjgTG1tYq4zcF9MyDALRezQ83CbtRitztE32GaVjZM&s=64LsUbhKbnp6JoZ4tp
>> OxOqKIB
>>> NUPey5b5Wxar2H1ftE&e=
>>> - Last_good save-temps:
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.linaro.org_job
>>> _tcwg-5Fbmk-5Fci-5Fgnu-2Dbisect-2Dtcwg-5Fbmk-5Fapm-2Dgnu-
>> 2Dmaster-2Daa
>>> rch64-2Dspec2k6-2DOs_10_artifact_artifacts_build-
>> 2D0e4a8656e818b669129
>>> a670057cbc21e5b723c18_save-
>> 2Dtemps_&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&
>>> r=L_uAQMgirzaBwiEk05NHY-
>> AMcNfJzugOS_xTjrtS94k&m=zOZlNQh1duLJy_yHjH4z6m
>>> 
>> DjgTG1tYq4zcF9MyDALRezQ83CbtRitztE32GaVjZM&s=PbUbNCmYCCoLgsnpN
>> qfi23eCF
>>> wAAcw3TttTpEQSeLbo&e=
>>> - Baseline save-temps:
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.linaro.org_job
>>> _tcwg-5Fbmk-5Fci-5Fgnu-2Dbisect-2Dtcwg-5Fbmk-5Fapm-2Dgnu-
>> 2Dmaster-2Daa
>>> rch64-2Dspec2k6-2DOs_10_artifact_artifacts_build-2Dbaseline_save-
>> 2Dtem
>>> 
>> ps_&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=L_uAQMgirzaBwiEk05NHY
>> -AMcNfJzu
>>> 
>> gOS_xTjrtS94k&m=zOZlNQh1duLJy_yHjH4z6mDjgTG1tYq4zcF9MyDALRezQ83
>> CbtRitz
>>> tE32GaVjZM&s=k7DWlg8__Jn5Zctgq52zAGytSDc2PyFloO5PCGDz704&e=
>>> 
>>> Configuration:
>>> - Benchmark: SPEC CPU2006
>>> - Toolchain: GCC + Glibc + GNU Linker
>>> - Version: all components were built from their tip of trunk
>>> - Target: aarch64-linux-gnu
>>> - Compiler flags: -Os
>>> - Hardware: APM Mustang 8x X-Gene1
>>> 
>>> This benchmarking CI is work-in-progress, and we welcome feedback and
>> suggestions at linaro-toolchain@lists.linaro.org .  In our improvement plans is
>> to add support for SPEC CPU2017 benchmarks and provide "perf
>> report/annotate" data behind these reports.
>>> 
>>> THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS,
>> REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
>>> 
>>> This commit has regressed these CI configurations:
>>> - tcwg_bmk_gnu_apm/gnu-master-aarch64-spec2k6-Os
>>> 
>>> First_bad build:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-32221357007666124409ec3ee0d3a1cf263ebc9e/
>>> Last_good build:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-0e4a8656e818b669129a670057cbc21e5b723c18/
>>> Baseline build:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/build-baseline/
>>> Even more details:
>>> https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/
>>> 
>>> Reproduce builds:
>>> <cut>
>>> mkdir investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
>>> cd investigate-gcc-32221357007666124409ec3ee0d3a1cf263ebc9e
>>> 
>>> # Fetch scripts
>>> git clone https://git.linaro.org/toolchain/jenkins-scripts
>>> 
>>> # Fetch manifests and test.sh script
>>> mkdir -p artifacts/manifests
>>> curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-baseline.sh --fail
>>> curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/manifests/build-parameters.sh --fail
>>> curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_gnu-bisect-tcwg_bmk_apm-gnu-master-aarch64-spec2k6-Os/10/artifact/artifacts/test.sh --fail
>>> chmod +x artifacts/test.sh
>>> 
>>> # Reproduce the baseline build (build all pre-requisites)
>>> ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
>>> 
>>> # Save baseline build state (which is then restored in artifacts/test.sh)
>>> mkdir -p ./bisect
>>> rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /gcc/ ./ ./bisect/baseline/
>>> 
>>> cd gcc
>>> 
>>> # Reproduce first_bad build
>>> git checkout --detach 32221357007666124409ec3ee0d3a1cf263ebc9e
>>> ../artifacts/test.sh
>>> 
>>> # Reproduce last_good build
>>> git checkout --detach 0e4a8656e818b669129a670057cbc21e5b723c18
>>> ../artifacts/test.sh
>>> 
>>> cd ..
>>> </cut>
>>> 
>>> Full commit (up to 1000 lines):
>>> <cut>
>>> commit 32221357007666124409ec3ee0d3a1cf263ebc9e
>>> Author: Andrew Pinski <apinski@marvell.com>
>>> Date:   Mon Nov 15 09:31:20 2021 +0000
>>> 
>>>   Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert
>>> 
>>>   Currently we fold (type) X op CST into (type) (X op ((type-x) CST)) when the conversion widens
>>>   but not when the conversion is a nop. For the same reason why we move the widening conversion
>>>   (the possibility of removing an extra conversion), we should do the same if the conversion is a
>>>   nop.
>>> 
>>>   Committed as approved with the comment change.
>>> 
>>>           PR tree-optimization/103228
>>>           PR tree-optimization/55177
>>> 
>>>   gcc/ChangeLog:
>>> 
>>>           * match.pd ((type) X bitop CST): Also do this
>>>           transformation for nop conversions.
>>> 
>>>   gcc/testsuite/ChangeLog:
>>> 
>>>           * gcc.dg/tree-ssa/pr103228-1.c: New test.
>>>           * gcc.dg/tree-ssa/pr55177-1.c: New test.
>>> ---
>>> gcc/match.pd                               |  6 ++++--
>>> gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c | 11 +++++++++++
>>> gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c  | 14 ++++++++++++++
>>> 3 files changed, 29 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>> index 89df7b2a174..77d848d631e 100644
>>> --- a/gcc/match.pd
>>> +++ b/gcc/match.pd
>>> @@ -1616,8 +1616,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>>         Restrict it to GIMPLE to avoid endless recursions.  */
>>>       && (bitop != BIT_AND_EXPR || GIMPLE)
>>>       && (/* That's a good idea if the conversion widens the operand, thus
>>> -             after hoisting the conversion the operation will be narrower.  */
>>> -          TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
>>> +             after hoisting the conversion the operation will be narrower.
>>> +             It is also a good if the conversion is a nop as moves the
>>> +             conversion to one side; allowing for combining of the conversions.  */
>>> +          TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
>>>          /* It's also a good idea if the conversion is to a non-integer
>>>             mode.  */
>>>          || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
>>> new file mode 100644
>>> index 00000000000..a7539819cf2
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103228-1.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>> +int f(int a, int b)
>>> +{
>>> +  b|=1u;
>>> +  b|=2;
>>> +  return b;
>>> +}
>>> +/* { dg-final { scan-tree-dump-times "\\\| 3" 1 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "\\\| 1" 0 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "\\\| 2" 0 "optimized"} } */
>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
>>> new file mode 100644
>>> index 00000000000..de1a264345c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr55177-1.c
>>> @@ -0,0 +1,14 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>> +extern int x;
>>> +
>>> +void foo(void)
>>> +{
>>> +  int a = __builtin_bswap32(x);
>>> +  a &= 0x5a5b5c5d;
>>> +  x = __builtin_bswap32(a);
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 0 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "& 1566333786" 1 "optimized"} } */
>>> +/* { dg-final { scan-tree-dump-times "& 1515936861" 0 "optimized"} } */
>>> </cut>
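
For readers who want to convince themselves of the identities behind the two new tests quoted above, here is a minimal sketch in C. It is not part of the commit. The function names are hypothetical, the bswap case uses uint32_t rather than the int in pr55177-1.c, and the int/unsigned round trip relies on GCC's documented modulo behaviour for out-of-range conversions (implementation-defined in ISO C). It only checks that the folded and unfolded forms compute the same values; it says nothing about the code-size effect reported for 458.sjeng.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* pr103228-1.c shape: `b |= 1u` converts b to unsigned (a nop conversion,
   same precision), ORs, and converts back.  Assumes GCC's modulo semantics
   for the unsigned-to-int conversion.  */
int or_in_two_steps(int b)
{
  b = (int) ((unsigned) b | 1u);
  b |= 2;
  return b;
}

/* The single `| 3` that the new test scans for in the optimized dump.  */
int or_folded(int b)
{
  return b | 3;
}

/* pr55177-1.c shape: a mask applied between two byte swaps.  */
uint32_t mask_between_bswaps(uint32_t x)
{
  uint32_t a = __builtin_bswap32(x);
  a &= 0x5a5b5c5du;              /* 1515936861 */
  return __builtin_bswap32(a);
}

/* Equivalent form: mask with the byte-swapped constant, no bswap calls.  */
uint32_t mask_folded(uint32_t x)
{
  return x & 0x5d5c5b5au;        /* 1566333786 */
}

/* Hypothetical self-check over a handful of sample inputs.  */
void check_fold_equivalence(void)
{
  static const int samples[] = { 0, 1, 2, 3, -1, 42, INT_MAX, INT_MIN };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert(or_in_two_steps(samples[i]) == or_folded(samples[i]));
      assert(mask_between_bswaps((uint32_t) samples[i])
             == mask_folded((uint32_t) samples[i]));
    }
}
```

Calling check_fold_equivalence() from a small main() and building with gcc is enough to run it; the asserts abort if any pair of forms disagrees.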
> 




end of thread, other threads:[~2021-11-24 13:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-19 21:04 [TCWG CI] 458.sjeng grew in size by 7% after gcc: Fix PR tree-optimization/103228 and 103228: folding of (type) X op CST where type is a nop convert ci_notify
     [not found] ` <MWHPR18MB1213710571CA876BFC1D6045BF9C9@MWHPR18MB1213.namprd18.prod.outlook.com>
2021-11-22 15:21   ` [EXT] " Maxim Kuvyrkov
2021-11-23 10:57     ` Tamar Christina
2021-11-24 13:10       ` Maxim Kuvyrkov
