From mboxrd@z Thu Jan  1 00:00:00 1970
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Date: Tue, 20 Sep 2016 21:27:00 -0000
From: Michael Meissner
To: Tamar Christina
Cc: Michael Meissner, GCC Patches, "jakub@redhat.com", "rguenther@suse.de", "law@redhat.com", nd
Subject: Re: [PATCH] Optimise the fpclassify builtin to perform integer operations when possible
References: <20160919221606.GA7906@ibm-tiger.the-meissners.org> <41217f33-3861-dbb8-2f11-950ab30a7021@arm.com>
In-Reply-To: <41217f33-3861-dbb8-2f11-950ab30a7021@arm.com>
Message-Id: <20160920210324.GA8111@ibm-tiger.the-meissners.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-12-10)
X-SW-Source: 2016-09/txt/msg01363.txt.bz2

On Tue, Sep 20, 2016 at 01:19:07PM +0100, Tamar Christina wrote:
> On 19/09/16 23:16, Michael Meissner wrote:
> >On Mon, Sep 12, 2016 at 04:19:32PM +0000, Tamar Christina wrote:
> >>Hi All,
> >>
> >>This patch adds an optimized route to the fpclassify builtin
> >>for floating point numbers which are similar to IEEE-754 in format.
> >>
> >>The goal is to make it faster by:
> >>1. Trying to determine the most common case first
> >>    (e.g. the float is a Normal number) and then the
> >>    rest. The amount of code generated at -O2 are
> >>    about the same +/- 1 instruction, but the code
> >>    is much better.
> >>2. Using integer operation in the optimized path.
> >>
> >>At a high level, the optimized path uses integer operations
> >>to perform the following:
> >>
> >>   if (exponent bits aren't all set or unset)
> >>      return Normal;
> >>   else if (no bits are set on the number after masking out
> >>            sign bits then)
> >>      return Zero;
> >>   else if (exponent has no bits set)
> >>      return Subnormal;
> >>   else if (mantissa has no bits set)
> >>      return Infinite;
> >>   else
> >>      return NaN;
> >I haven't looked at fpclassify.  I assume we can define a backend insn to do
> >the right thing?  One of the things we've noticed over the years with the
> >PowerPC is that it can be rather expensive to move things from the floating
> >point/vector unit to the integer registers and vice versa.  This is
> >particularly true if you have to do the transfer via the memory unit via
> >stores and loads of different sizes.
> >
> Hmm, what do you mean with the right thing? Do you mean never to use the
> integer version?

On the forthcoming PowerPC with ISA 3.0 (power9), we have different ways to do
classification within the floating point unit.  For example, there is the
XSTSTDCDP instruction, which can set a condition code register according to
whether the value is 0, NaN, Infinity, or Denormal.  We might come up with a
clever set of tests that uses 4 of these instructions to return the
appropriate FP_ constant.

Even if we want to do it by looking at the exponent, ISA 3.0 defines
instructions like XSXEXPDP, which extracts the exponent from a double
precision value and returns it in a GPR.

> If so then no, it currently determines it based on the format.
> I could potentially add a hook to allow backends to opt-in/out if
> there's a concern this might be slower.

It would be better to have a fpclassify2 pattern, and if it isn't defined,
then do the machine independent processing.  That is the way it is done
elsewhere.

> Though is the move that much slower that it negates the benefits we
> should get from not having to do
> 4 branches in the normal case?

It depends.  We have a lot of other stuff for ISA 3.0 on our plates, and
truthfully, we won't be able to answer the question about performance until we
get real hardware, but I would prefer not to be locked into an existing
implementation.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
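
For reference, the integer-only sequence quoted in this thread can be sketched
in C roughly as follows.  This is only an illustration under stated
assumptions, not the code from the patch: it assumes double is IEEE-754
binary64, the name fpclassify_int_sketch is hypothetical, and the memcpy
stands in for the FP-to-integer register move whose cost is discussed above.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): classify a double using only
   integer operations, assuming the IEEE-754 binary64 layout
   (1 sign bit, 11 exponent bits, 52 mantissa bits).  */
static int
fpclassify_int_sketch (double x)
{
  const uint64_t exp_mask = UINT64_C (0x7FF0000000000000);
  const uint64_t man_mask = UINT64_C (0x000FFFFFFFFFFFFF);
  uint64_t bits, exp, man;

  /* The bit copy below only makes sense if both types are 64 bits wide.  */
  assert (sizeof (double) == sizeof (uint64_t));

  /* Move the value to the integer side; on some targets this is the
     potentially expensive FP-to-GPR transfer.  */
  memcpy (&bits, &x, sizeof bits);
  exp = bits & exp_mask;
  man = bits & man_mask;

  /* Most common case first: exponent bits neither all set nor all clear.  */
  if (exp != 0 && exp != exp_mask)
    return FP_NORMAL;
  /* Nothing set once the sign bit is masked out.  */
  if ((exp | man) == 0)
    return FP_ZERO;
  /* Exponent all clear, mantissa non-zero.  */
  if (exp == 0)
    return FP_SUBNORMAL;
  /* Exponent all set, mantissa clear.  */
  if (man == 0)
    return FP_INFINITE;
  /* Exponent all set, mantissa non-zero.  */
  return FP_NAN;
}
```

A normal input leaves after the first comparison, which is the case the
quoted description tries to favour; a target with a native classification
instruction would presumably skip this path entirely.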