From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 35828 invoked by alias); 16 Dec 2015 13:05:32 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 35804 invoked by uid 89); 16 Dec 2015 13:05:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.5 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 16 Dec 2015 13:05:27 +0000 Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-db5eur01lp0184.outbound.protection.outlook.com [213.199.154.184]) (Using TLS) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-27-lbb1RzhIS8Ok-p6JgUz07Q-1; Wed, 16 Dec 2015 13:05:23 +0000 Received: from AM3PR08MB0088.eurprd08.prod.outlook.com (2a01:111:e400:8847::18) by AM3PR08MB0595.eurprd08.prod.outlook.com (2a01:111:e400:c408::17) with Microsoft SMTP Server (TLS) id 15.1.361.13; Wed, 16 Dec 2015 13:05:21 +0000 Received: from AM3PR08MB0088.eurprd08.prod.outlook.com ([fe80::5c7b:335e:6049:3647]) by AM3PR08MB0088.eurprd08.prod.outlook.com ([fe80::5c7b:335e:6049:3647%14]) with mapi id 15.01.0355.012; Wed, 16 Dec 2015 13:05:22 +0000 From: Wilco Dijkstra To: James Greenhalgh CC: "gcc-patches@gcc.gnu.org" , nd Subject: RE: [PATCH][AArch64] Add TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS Date: Wed, 16 Dec 2015 13:05:00 -0000 Message-ID: References: <20151216095418.GA39374@arm.com> In-Reply-To: <20151216095418.GA39374@arm.com> x-microsoft-exchange-diagnostics: 1;AM3PR08MB0595;5:Y4lAsiitaUurcsaYTnJd9TBXw1ARh7jfjf8q19joDi0zd6Pqtv0Iil4xWXmQU1B/LnZJmDSES8peN9DYlQyzN+5svrvjX6bXsiIMG6+f707jukzHsPP7skwjiBFyINtiQCFPxB+qXPz0S8PEIoV/eg==;24:I+juG2smbmK85mIKf9PvQhz1+YjSZZ65ybw19H3f9fFk4sRDXKUAANq2OhHA/9ybF6UlSrIgJb7xubrw+8Dwlak+YHyfxkRfJ2pY9AAiQ9g=;20:p1ANU/zH+EFreUDLKqKWZXeoi1OSujl2msMG6LuHYfgo8nrOl8UEP7fHwE3vVr68M7qpcBhY1G0IMcE5jFN6+fbHlcZhFyhfUFk1m/gWlwzPcOeFgPd7A4RhTQRs1fbvBYaN6Q6Cm2TJJF+uOpKGU9eRWbPEk6bI0KxNyirnrWw= x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:AM3PR08MB0595; nodisclaimer: True x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(180628864354917); x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(601004)(2401047)(520078)(5005006)(8121501046)(10201501046)(3002001);SRVR:AM3PR08MB0595;BCL:0;PCL:0;RULEID:;SRVR:AM3PR08MB0595; x-forefront-prvs: 0792DBEAD0 x-forefront-antispam-report: SFV:NSPM;SFS:(10009020)(6009001)(377424004)(189002)(199003)(24454002)(13464003)(54534003)(87936001)(5001960100002)(105586002)(54356999)(189998001)(110136002)(106356001)(101416001)(86362001)(50986999)(74316001)(97736004)(66066001)(76176999)(5002640100001)(81156007)(19580395003)(5003600100002)(19580405001)(33656002)(450100001)(1220700001)(2900100001)(2950100001)(76576001)(5250100002)(1096002)(5008740100001)(92566002)(5004730100002)(3846002)(102836003)(586003)(6116002)(40100003);DIR:OUT;SFP:1101;SCL:1;SRVR:AM3PR08MB0595;H:AM3PR08MB0088.eurprd08.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1;LANG:en; spamdiagnosticoutput: 1:23 spamdiagnosticmetadata: NSPM MIME-Version: 1.0 X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-originalarrivaltime: 16 Dec 2015 13:05:21.9795 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM3PR08MB0595 X-MC-Unique: lbb1RzhIS8Ok-p6JgUz07Q-1 Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable X-SW-Source: 2015-12/txt/msg01598.txt.bz2 James Greenhalgh wrote: > On Tue, Dec 15, 2015 at 10:54:49AM +0000, Wilco Dijkstra wrote: > > ping > > > > > -----Original Message----- > > > From: Wilco Dijkstra [mailto:Wilco.Dijkstra@arm.com] > > > Sent: 06 November 2015 20:06 > > > To: 'gcc-patches@gcc.gnu.org' > > > Subject: [PATCH][AArch64] Add TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS > > > > > > This patch adds support for the TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS > > > hook. When the cost of GENERAL_REGS and FP_REGS is identical, the reg= ister > > > allocator always uses ALL_REGS even when it has a much higher cost. T= he > > > hook changes the class to either FP_REGS or GENERAL_REGS depending on= the > > > mode of the register. This results in better register allocation over= all, > > > fewer spills and reduced codesize - particularly in SPEC2006 gamess. > > > > > > GCC regression passes with several minor fixes. > > > > > > OK for commit? > > > > > > ChangeLog: > > > 2015-11-06 Wilco Dijkstra > > > > > > * gcc/config/aarch64/aarch64.c > > > (TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS): New define. > > > (aarch64_ira_change_pseudo_allocno_class): New function. > > > * gcc/testsuite/gcc.target/aarch64/cvtf_1.c: Build with -O2. > > > * gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c > > > (test_corners_sisd_di): Improve force to SIMD register. > > > (test_corners_sisd_si): Likewise. > > > * gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c: Build with -O2. > > > * gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c: > > > Remove scan-assembler check for ldr. >=20 > Drop the gcc/ from the ChangeLog. >=20 > > > -- > > > gcc/config/aarch64/aarch64.c | 22 ++++++++++++= ++++++++++ > > > gcc/testsuite/gcc.target/aarch64/cvtf_1.c | 2 +- > > > gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c | 4 ++-- > > > gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c | 2 +- > > > .../gcc.target/aarch64/vect-ld1r-compile-fp.c | 1 - >=20 > These testsuite changes concern me a bit, and you don't mention them beyo= nd > saying they are minor fixes... Well any changes to register allocator preferencing would cause fallout in = tests that are assuming which register is allocated, especially if they use nasty inli= ne assembler hacks to do so... > > > diff --git a/gcc/testsuite/gcc.target/aarch64/cvtf_1.c b/gcc/testsuit= e/gcc.target/aarch64/cvtf_1.c > > > index 5f2ff81..96501db 100644 > > > --- a/gcc/testsuite/gcc.target/aarch64/cvtf_1.c > > > +++ b/gcc/testsuite/gcc.target/aarch64/cvtf_1.c > > > @@ -1,5 +1,5 @@ > > > /* { dg-do run } */ > > > -/* { dg-options "-save-temps -fno-inline -O1" } */ > > > +/* { dg-options "-save-temps -fno-inline -O2" } */ >=20 > This one says we have a code-gen regression at -O1 ? It avoids a regalloc bug - see below. > > > #define FCVTDEF(ftype,itype) \ > > > void \ > > > diff --git a/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c b/gcc/= testsuite/gcc.target/aarch64/scalar_shift_1.c > > > index 363f554..8465c89 100644 > > > --- a/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c > > > +++ b/gcc/testsuite/gcc.target/aarch64/scalar_shift_1.c > > > @@ -186,9 +186,9 @@ test_corners_sisd_di (Int64x1 b) > > > { > > > force_simd_di (b); > > > b =3D b >> 63; > > > + force_simd_di (b); > > > b =3D b >> 0; > > > b +=3D b >> 65; /* { dg-warning "right shift count >=3D width of t= ype" } */ > > > - force_simd_di (b); >=20 > This one I don't understand, but seems to say that we've decided to move > b out of FP_REGS after getting it in there for b =3D b << 63; ? So this is > another register allocator regression? No, basically the register allocator is now making better decisions as to w= here to allocate integer variables. It will only allocate them to FP registers if t= hey are primarily used by other FP operations. The force_simd_di inline assembler tries to mi= mic FP uses, and if there are enough of them at the right places then everything works a= s expected. If however you do 3 consecutive integer operations then the allocator will = now correctly prefer to allocate them to the integer registers (while previously it would= n't, which is inefficient). > > > diff --git a/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c b/gcc/tes= tsuite/gcc.target/aarch64/vdup_lane_2.c > > > index a49db3e..c5a9c52 100644 > > > --- a/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c > > > +++ b/gcc/testsuite/gcc.target/aarch64/vdup_lane_2.c > > > @@ -1,6 +1,6 @@ > > > /* Test vdup_lane intrinsics work correctly. */ > > > /* { dg-do run } */ > > > -/* { dg-options "-O1 --save-temps" } */ > > > +/* { dg-options "-O2 --save-temps" } */ >=20 > Another -O1 regression ? No, it's triggering a bug in the -O1 register preferencing that causes inco= rrect preferences to be selected despite the costs being right. The cost calculation with -O1 for e= g.=20 wrap_vdupb_lane_s8_0() in vdup_lane_2.c: Pass 0 for finding pseudo/allocno costs r79: preferred FP_REGS, alternative GENERAL_REGS, allocno GENERAL_REGS a1 (r79,l0) best GENERAL_REGS, allocno GENERAL_REGS r78: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS a0 (r78,l0) best GENERAL_REGS, allocno GENERAL_REGS a0(r78,l0) costs: CALLER_SAVE_REGS:5000,5000 GENERAL_REGS:5000,5000 FP_LO= _REGS:5000,5000 FP_REGS:5000,5000 ALL_REGS:10000,10000 MEM:9000,9000 a1(r79,l0) costs: CALLER_SAVE_REGS:5000,5000 GENERAL_REGS:5000,5000 FP_LO= _REGS:0,0 FP_REGS:0,0 ALL_REGS:10000,10000 MEM:9000,9000 So it correctly prefers FP_REGS for r79 as it has the lowest cost, but then= forces the allocno and best register to GENERAL_REGS... We could work around it by not having the = "r" variant first in the aarch64_get_lane patterns and further discouraging its use via "?r", bu= t that's a different patch. Wilco