Date: Wed, 25 May 2016 12:29:00 -0000
From: James Greenhalgh
To: Evandro Menezes
CC: GCC Patches, Wilco Dijkstra, Andrew Pinski, "philipp.tomsich@theobroma-systems.com", Benedikt Huber
Subject: Re: [PATCH 1/3][AArch64] Add more choices for the reciprocal square root approximation
Message-ID: <20160525101553.GB9511@arm.com>
References: <57212B7D.9000807@samsung.com>
In-Reply-To: <57212B7D.9000807@samsung.com>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm

On Wed, Apr 27, 2016 at 04:13:33PM -0500, Evandro Menezes wrote:
> gcc/
>     * config/aarch64/aarch64-protos.h
>     (AARCH64_APPROX_MODE): New macro.
>     (AARCH64_APPROX_{NONE,SP,DP,DFORM,QFORM,SCALAR,VECTOR,ALL}):
>     Likewise.
>     (tune_params): New member "approx_rsqrt_modes".
>     * config/aarch64/aarch64-tuning-flags.def
>     (AARCH64_EXTRA_TUNE_APPROX_RSQRT): Remove macro.
>     * config/aarch64/aarch64.c
>     (generic_tunings): New member "approx_rsqrt_modes".
>     (cortexa35_tunings): Likewise.
>     (cortexa53_tunings): Likewise.
>     (cortexa57_tunings): Likewise.
>     (cortexa72_tunings): Likewise.
>     (exynosm1_tunings): Likewise.
>     (thunderx_tunings): Likewise.
>     (xgene1_tunings): Likewise.
>     (use_rsqrt_p): New argument for the mode and use new member from
>     "tune_params".
>     (aarch64_builtin_reciprocal): Devise mode from builtin.
>     (aarch64_optab_supported_p): New argument for the mode.
>     * doc/invoke.texi (-mlow-precision-recip-sqrt): Reword description.
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index f22a31c..50f1d24 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -178,6 +178,32 @@ struct cpu_branch_cost
>    const int unpredictable;  /* Unpredictable branch or optimizing for speed.  */
>  };
>
> +/* Control approximate alternatives to certain FP operators.  */
> +#define AARCH64_APPROX_MODE(MODE) \
> +  ((MIN_MODE_FLOAT <= (MODE) && (MODE) <= MAX_MODE_FLOAT) \
> +   ? (1 << ((MODE) - MIN_MODE_FLOAT)) \
> +   : (MIN_MODE_VECTOR_FLOAT <= (MODE) && (MODE) <= MAX_MODE_VECTOR_FLOAT) \
> +     ? (1 << ((MODE) - MIN_MODE_VECTOR_FLOAT \
> +              + MAX_MODE_FLOAT - MIN_MODE_FLOAT + 1)) \
> +     : (0))
> +#define AARCH64_APPROX_NONE (0)
> +#define AARCH64_APPROX_SP (AARCH64_APPROX_MODE (SFmode) \
> +                           | AARCH64_APPROX_MODE (V2SFmode) \
> +                           | AARCH64_APPROX_MODE (V4SFmode))
> +#define AARCH64_APPROX_DP (AARCH64_APPROX_MODE (DFmode) \
> +                           | AARCH64_APPROX_MODE (V2DFmode))
> +#define AARCH64_APPROX_DFORM (AARCH64_APPROX_MODE (SFmode) \
> +                              | AARCH64_APPROX_MODE (DFmode) \
> +                              | AARCH64_APPROX_MODE (V2SFmode))
> +#define AARCH64_APPROX_QFORM (AARCH64_APPROX_MODE (V4SFmode) \
> +                              | AARCH64_APPROX_MODE (V2DFmode))
> +#define AARCH64_APPROX_SCALAR (AARCH64_APPROX_MODE (SFmode) \
> +                               | AARCH64_APPROX_MODE (DFmode))
> +#define AARCH64_APPROX_VECTOR (AARCH64_APPROX_MODE (V2SFmode) \
> +                               | AARCH64_APPROX_MODE (V4SFmode) \
> +                               | AARCH64_APPROX_MODE (V2DFmode))
> +#define AARCH64_APPROX_ALL (-1)
> +

Thanks for providing these various subsets, but I think they are
unnecessary for the final submission. From what I can see, only
AARCH64_APPROX_ALL and AARCH64_APPROX_NONE are used. Please remove the
rest; they are easy enough to add back if a subtarget wants them.

>  struct tune_params
>  {
>    const struct cpu_cost_table *insn_extra_cost;
> @@ -218,6 +244,7 @@ struct tune_params
>    } autoprefetcher_model;
>
>    unsigned int extra_tuning_flags;
> +  unsigned int approx_rsqrt_modes;

As we're going to add a few of these, let's follow the approach for some
of the other costs (e.g. branch costs, vector costs) and bury them in a
structure of their own.

>  };
>
>  #define AARCH64_FUSION_PAIR(x, name) \
> diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
> index 7e45a0c..048c2a3 100644
> --- a/gcc/config/aarch64/aarch64-tuning-flags.def
> +++ b/gcc/config/aarch64/aarch64-tuning-flags.def
> @@ -29,5 +29,3 @@
>       AARCH64_TUNE_ to give an enum name.  */
>
>  AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
> -AARCH64_EXTRA_TUNING_OPTION ("approx_rsqrt", APPROX_RSQRT)
> -

Did you want to add another way to tune these by command line (not
necessary now, but as a follow-up)? See how instruction fusion is
handled by the -moverride code for an example.

Thanks,
James
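
For illustration only, here is a minimal sketch in C of what the
"structure of their own" suggestion above might look like. Every name in
it (sketch_approx_modes, sketch_tune_params, the SKETCH_* modes and
macros) is a hypothetical stand-in, not GCC's actual definition; the only
parts taken from the review are the idea of grouping the per-operation
masks in a separate structure and keeping just the NONE and ALL presets.
It assumes the structure is referenced by pointer, by analogy with
insn_extra_cost in the quoted tune_params.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a few machine modes; not GCC's real enum.  */
enum sketch_mode { SKETCH_SF, SKETCH_DF, SKETCH_V2SF, SKETCH_V4SF, SKETCH_V2DF };

/* One bit per mode, plus the two presets the review asks to keep.  */
#define SKETCH_APPROX_MODE(M) (1u << (M))
#define SKETCH_APPROX_NONE (0u)
#define SKETCH_APPROX_ALL (~0u)

/* Hypothetical grouping: one mask per approximated operation.  Further
   masks could be added here without touching the tuning structure.  */
struct sketch_approx_modes
{
  unsigned int recip_sqrt;  /* Modes allowed to use the approximate rsqrt.  */
};

/* Hypothetical, much reduced tuning structure.  */
struct sketch_tune_params
{
  unsigned int extra_tuning_flags;
  const struct sketch_approx_modes *approx_modes;
};

/* Shared presets that individual tunings can point at.  */
static const struct sketch_approx_modes sketch_no_approx = { SKETCH_APPROX_NONE };
static const struct sketch_approx_modes sketch_all_approx = { SKETCH_APPROX_ALL };

/* Return true if TUNE allows the approximate rsqrt for MODE.  */
static bool
sketch_use_rsqrt_p (const struct sketch_tune_params *tune,
                    enum sketch_mode mode)
{
  return (tune->approx_modes->recip_sqrt & SKETCH_APPROX_MODE (mode)) != 0;
}
```

Under these assumptions, a tuning that wants the approximation for every
mode would point at sketch_all_approx, and all others would point at
sketch_no_approx.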