From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-417422-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 35828 invoked by alias); 16 Dec 2015 13:05:32 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 35804 invoked by uid 89); 16 Dec 2015 13:05:30 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.5 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2
X-HELO: eu-smtp-delivery-143.mimecast.com
Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 16 Dec 2015 13:05:27 +0000
Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-db5eur01lp0184.outbound.protection.outlook.com [213.199.154.184]) (Using TLS) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-27-lbb1RzhIS8Ok-p6JgUz07Q-1; Wed, 16 Dec 2015 13:05:23 +0000
Received: from AM3PR08MB0088.eurprd08.prod.outlook.com (2a01:111:e400:8847::18) by AM3PR08MB0595.eurprd08.prod.outlook.com (2a01:111:e400:c408::17) with Microsoft SMTP Server (TLS) id 15.1.361.13; Wed, 16 Dec 2015 13:05:21 +0000
Received: from AM3PR08MB0088.eurprd08.prod.outlook.com ([fe80::5c7b:335e:6049:3647]) by AM3PR08MB0088.eurprd08.prod.outlook.com ([fe80::5c7b:335e:6049:3647%14]) with mapi id 15.01.0355.012; Wed, 16 Dec 2015 13:05:22 +0000
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: James Greenhalgh <James.Greenhalgh@arm.com>
CC: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>
Subject: RE: [PATCH][AArch64] Add TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS
Date: Wed, 16 Dec 2015 13:05:00 -0000
Message-ID: <AM3PR08MB008857D4F6905FC03A92D23883EF0@AM3PR08MB0088.eurprd08.prod.outlook.com>
References: <AM3PR08MB008850D05209827E0246E28683EE0@AM3PR08MB0088.eurprd08.prod.outlook.com> <20151216095418.GA39374@arm.com>
In-Reply-To: <20151216095418.GA39374@arm.com>
x-microsoft-exchange-diagnostics: 1;AM3PR08MB0595;5:Y4lAsiitaUurcsaYTnJd9TBXw1ARh7jfjf8q19joDi0zd6Pqtv0Iil4xWXmQU1B/LnZJmDSES8peN9DYlQyzN+5svrvjX6bXsiIMG6+f707jukzHsPP7skwjiBFyINtiQCFPxB+qXPz0S8PEIoV/eg==;24:I+juG2smbmK85mIKf9PvQhz1+YjSZZ65ybw19H3f9fFk4sRDXKUAANq2OhHA/9ybF6UlSrIgJb7xubrw+8Dwlak+YHyfxkRfJ2pY9AAiQ9g=;20:p1ANU/zH+EFreUDLKqKWZXeoi1OSujl2msMG6LuHYfgo8nrOl8UEP7fHwE3vVr68M7qpcBhY1G0IMcE5jFN6+fbHlcZhFyhfUFk1m/gWlwzPcOeFgPd7A4RhTQRs1fbvBYaN6Q6Cm2TJJF+uOpKGU9eRWbPEk6bI0KxNyirnrWw=
x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:AM3PR08MB0595;
nodisclaimer: True
x-microsoft-antispam-prvs: <AM3PR08MB05953195110C36735B43081B83EF0@AM3PR08MB0595.eurprd08.prod.outlook.com>
x-exchange-antispam-report-test: UriScan:(180628864354917);
x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(601004)(2401047)(520078)(5005006)(8121501046)(10201501046)(3002001);SRVR:AM3PR08MB0595;BCL:0;PCL:0;RULEID:;SRVR:AM3PR08MB0595;
x-forefront-prvs: 0792DBEAD0
x-forefront-antispam-report: SFV:NSPM;SFS:(10009020)(6009001)(377424004)(189002)(199003)(24454002)(13464003)(54534003)(87936001)(5001960100002)(105586002)(54356999)(189998001)(110136002)(106356001)(101416001)(86362001)(50986999)(74316001)(97736004)(66066001)(76176999)(5002640100001)(81156007)(19580395003)(5003600100002)(19580405001)(33656002)(450100001)(1220700001)(2900100001)(2950100001)(76576001)(5250100002)(1096002)(5008740100001)(92566002)(5004730100002)(3846002)(102836003)(586003)(6116002)(40100003);DIR:OUT;SFP:1101;SCL:1;SRVR:AM3PR08MB0595;H:AM3PR08MB0088.eurprd08.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1;LANG:en;
spamdiagnosticoutput: 1:23
spamdiagnosticmetadata: NSPM
MIME-Version: 1.0
X-OriginatorOrg: arm.com
X-MS-Exchange-CrossTenant-originalarrivaltime: 16 Dec 2015 13:05:21.9795 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM3PR08MB0595
X-MC-Unique: lbb1RzhIS8Ok-p6JgUz07Q-1
Content-Type: text/plain; charset=WINDOWS-1252
Content-Transfer-Encoding: quoted-printable
X-SW-Source: 2015-12/txt/msg01598.txt.bz2

James Greenhalgh wrote:
> On Tue, Dec 15, 2015 at 10:54:49AM +0000, Wilco Dijkstra wrote:
> > ping
> >
> > > -----Original Message-----
> > > From: Wilco Dijkstra [mailto:Wilco.Dijkstra@arm.com]
> > > Sent: 06 November 2015 20:06
> > > To: 'gcc-patches@gcc.gnu.org'
> > > Subject: [PATCH][AArch64] Add TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS
> > >
> > > This patch adds support for the TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS
> > > hook. When the cost of GENERAL_REGS and FP_REGS is identical, the reg=
ister
> > > allocator always uses ALL_REGS even when it has a much higher cost. T=
he
> > > hook changes the class to either FP_REGS or GENERAL_REGS depending on=
 the
> > > mode of the register. This results in better register allocation over=
all,
> > > fewer spills and reduced codesize - particularly in SPEC2006 gamess.
> > >
> > > GCC regression passes with several minor fixes.
> > >
> > > OK for commit?
> > >
> > > ChangeLog:
> > > 2015-11-06  Wilco Dijkstra  <wdijkstr@arm.com>
> > >
> > > 	* gcc/config/aarch64/aarch64.c
> > > 	(TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS): New define.
> > > 	(aarch64_ira_change_pseudo_allocno_class): New function.
> > > 	* gcc/testsuite/gcc.target/aarch64/cvtf_1.c: Build with -O2.
> > > 	* gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c
> > > 	(test_corners_sisd_di): Improve force to SIMD register.
> > > 	(test_corners_sisd_si): Likewise.
> > > 	* gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c: Build with -O2.
> > > 	* gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c:
> > > 	Remove scan-assembler check for ldr.
>=20
> Drop the gcc/ from the ChangeLog.
>=20
> > > --
> > >  gcc/config/aarch64/aarch64.c                       | 22 ++++++++++++=
++++++++++
> > >  gcc/testsuite/gcc.target/aarch64/cvtf_1.c          |  2 +-
> > >  gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c  |  4 ++--
> > >  gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c     |  2 +-
> > >  .../gcc.target/aarch64/vect-ld1r-compile-fp.c      |  1 -
>=20
> These testsuite changes concern me a bit, and you don't mention them beyo=
nd
> saying they are minor fixes...

Well any changes to register allocator preferencing would cause fallout in =
tests that
are assuming which register is allocated, especially if they use nasty inli=
ne assembler
hacks to do so...

> > > diff --git a/gcc/testsuite/gcc.target/aarch64/cvtf_1.c b/gcc/testsuit=
e/gcc.target/aarch64/cvtf_1.c
> > > index 5f2ff81..96501db 100644
> > > --- a/gcc/testsuite/gcc.target/aarch64/cvtf_1.c
> > > +++ b/gcc/testsuite/gcc.target/aarch64/cvtf_1.c
> > > @@ -1,5 +1,5 @@
> > >  /* { dg-do run } */
> > > -/* { dg-options "-save-temps -fno-inline -O1" } */
> > > +/* { dg-options "-save-temps -fno-inline -O2" } */
>=20
> This one says we have a code-gen regression at -O1 ?

It avoids a regalloc bug - see below.

> > >  #define FCVTDEF(ftype,itype) \
> > >  void \
> > > diff --git a/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c b/gcc/=
testsuite/gcc.target/aarch64/scalar_shift_1.c
> > > index 363f554..8465c89 100644
> > > --- a/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c
> > > +++ b/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c
> > > @@ -186,9 +186,9 @@ test_corners_sisd_di (Int64x1 b)
> > >  {
> > >    force_simd_di (b);
> > >    b =3D b >> 63;
> > > +  force_simd_di (b);
> > >    b =3D b >> 0;
> > >    b +=3D b >> 65; /* { dg-warning "right shift count >=3D width of t=
ype" } */
> > > -  force_simd_di (b);
>=20
> This one I don't understand, but seems to say that we've decided to move
> b out of FP_REGS after getting it in there for b =3D b << 63; ? So this is
> another register allocator regression?

No, basically the register allocator is now making better decisions as to w=
here to
allocate integer variables. It will only allocate them to FP registers if t=
hey are primarily
used by other FP operations. The force_simd_di inline assembler tries to mi=
mic FP uses,
and if there are enough of them at the right places then everything works a=
s expected.
If however you do 3 consecutive integer operations then the allocator will =
now correctly
prefer to allocate them to the integer registers (while previously it would=
n't, which is
inefficient).

> > > diff --git a/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c b/gcc/tes=
tsuite/gcc.target/aarch64/vdup_lane_2.c
> > > index a49db3e..c5a9c52 100644
> > > --- a/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c
> > > +++ b/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c
> > > @@ -1,6 +1,6 @@
> > >  /* Test vdup_lane intrinsics work correctly.  */
> > >  /* { dg-do run } */
> > > -/* { dg-options "-O1 --save-temps" } */
> > > +/* { dg-options "-O2 --save-temps" } */
>=20
> Another -O1 regression ?

No, it's triggering a bug in the -O1 register preferencing that causes inco=
rrect preferences to be
selected despite the costs being right. The cost calculation with -O1 for e=
g.=20
wrap_vdupb_lane_s8_0() in vdup_lane_2.c:

Pass 0 for finding pseudo/allocno costs

    r79: preferred FP_REGS, alternative GENERAL_REGS, allocno GENERAL_REGS
    a1 (r79,l0) best GENERAL_REGS, allocno GENERAL_REGS
    r78: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
    a0 (r78,l0) best GENERAL_REGS, allocno GENERAL_REGS

  a0(r78,l0) costs: CALLER_SAVE_REGS:5000,5000 GENERAL_REGS:5000,5000 FP_LO=
_REGS:5000,5000 FP_REGS:5000,5000 ALL_REGS:10000,10000 MEM:9000,9000
  a1(r79,l0) costs: CALLER_SAVE_REGS:5000,5000 GENERAL_REGS:5000,5000 FP_LO=
_REGS:0,0 FP_REGS:0,0 ALL_REGS:10000,10000 MEM:9000,9000

So it correctly prefers FP_REGS for r79 as it has the lowest cost, but then=
 forces the allocno and
best register to GENERAL_REGS... We could work around it by not having the =
"r" variant first in
the aarch64_get_lane patterns and further discouraging its use via "?r", bu=
t that's a different patch.

Wilco