From: Wilco Dijkstra
To: Jakub Jelinek, Tamar Christina
CC: GCC Patches, "rguenther@suse.de", Jeff Law, nd
Subject: Re: [PATCH] Optimise the fpclassify builtin to perform integer operations when possible
Date: Tue, 13 Sep 2016 12:16:00 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm

Jakub wrote:
> On Mon, Sep 12, 2016 at 04:19:32PM +0000, Tamar Christina wrote:
> > This patch adds an optimized route to the fpclassify builtin
> > for floating point numbers which are similar to IEEE-754 in format.
> >
> > The goal is to make it faster by:
> > 1. Trying to determine the most common case first
> >    (e.g. the float is a Normal number) and then the
> >    rest. The amount of code generated at -O2 are
> >    about the same +/- 1 instruction, but the code
> >    is much better.
> > 2. Using integer operation in the optimized path.
>
> Is it generally preferable to use integer operations for this instead
> of floating point operations?  I mean various targets have quite high costs
> of moving data in between the general purpose and floating point register
> file, often it has to go through memory etc.

It is generally preferable indeed - there was a *very* long discussion
about integer vs FP on the GLIBC mailing list when I updated math.h to use
the GCC builtins a while back (the GLIBC implementation used a non-inlined
unoptimized integer implementation, so an inlined FP implementation seemed
a good intermediate solution).

Integer operations are generally lower latency and enable bit manipulation
tricks like the fast early exit. The FP version requires execution of 5
branches for a "normal" FP value and loads several floating point
immediates. There are also many targets with emulated floating point types,
where 5 calls to the comparison lib function would be seriously slow.

Note that using so many FP comparisons is not just slow: they also aren't
correct for signalling NaNs, so this patch also fixes bug 66462 for
fpclassify.

I would suggest that someone with access to a machine with slow FP moves
(POWER?) benchmarks this using the fpclassify test
(glibc/benchtests/bench-math-inlines.c) so we know for sure.

Wilco
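
To make the "fast early exit" point concrete, here is a minimal sketch of
an integer-only classification. It is not the code from the patch:
fpclassify_int is a hypothetical name, and the sketch assumes that double
is IEEE-754 binary64. The single unsigned compare handles the common
"normal" case before anything else is checked, and because the value is
only ever inspected as an integer, no FP comparison is executed, so a
signalling NaN cannot raise an exception here.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is IEEE-754 binary64 (1 sign, 11 exponent, 52 fraction bits). */
static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");

/* Hypothetical helper for illustration only; not the code from the patch. */
int fpclassify_int(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);              /* the FP -> integer move */

    uint64_t exp  = (bits >> 52) & 0x7ff;        /* biased exponent */
    uint64_t frac = bits & 0x000fffffffffffffULL; /* fraction field */

    /* Fast early exit: a biased exponent in 1..0x7fe means "normal".
       Unsigned wraparound folds both range checks into one compare. */
    if (exp - 1 < 0x7fe)
        return FP_NORMAL;

    /* Exponent all zeros: zero or subnormal, decided by the fraction. */
    if (exp == 0)
        return frac == 0 ? FP_ZERO : FP_SUBNORMAL;

    /* Exponent all ones: infinity or NaN (quiet or signalling). */
    return frac == 0 ? FP_INFINITE : FP_NAN;
}
```

A normal value takes one branch here, compared with the 5 branches quoted
above for the FP version. Whether the move out of the FP register file
outweighs that is exactly what the suggested benchmark would show.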