From mboxrd@z Thu Jan  1 00:00:00 1970
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
Date: Tue, 20 Sep 2016 21:27:00 -0000
From: Michael Meissner <meissner@linux.vnet.ibm.com>
To: Tamar Christina <tamar.christina@arm.com>
Cc: Michael Meissner <meissner@linux.vnet.ibm.com>,        GCC Patches <gcc-patches@gcc.gnu.org>,        "jakub@redhat.com" <jakub@redhat.com>,        "rguenther@suse.de" <rguenther@suse.de>,        "law@redhat.com" <law@redhat.com>, nd <nd@arm.com>
Subject: Re: [PATCH] Optimise the fpclassify builtin to perform integer operations when possible
Mail-Followup-To: Michael Meissner <meissner@linux.vnet.ibm.com>,	Tamar Christina <tamar.christina@arm.com>,	GCC Patches <gcc-patches@gcc.gnu.org>,	"jakub@redhat.com" <jakub@redhat.com>,	"rguenther@suse.de" <rguenther@suse.de>,	"law@redhat.com" <law@redhat.com>, nd <nd@arm.com>
References: <VI1PR0801MB2031BC0C70CCAD966A9B2933FFFF0@VI1PR0801MB2031.eurprd08.prod.outlook.com> <20160919221606.GA7906@ibm-tiger.the-meissners.org> <41217f33-3861-dbb8-2f11-950ab30a7021@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <41217f33-3861-dbb8-2f11-950ab30a7021@arm.com>
User-Agent: Mutt/1.5.20 (2009-12-10)
Message-Id: <20160920210324.GA8111@ibm-tiger.the-meissners.org>

On Tue, Sep 20, 2016 at 01:19:07PM +0100, Tamar Christina wrote:
> On 19/09/16 23:16, Michael Meissner wrote:
> >On Mon, Sep 12, 2016 at 04:19:32PM +0000, Tamar Christina wrote:
> >>Hi All,
> >>
> >>This patch adds an optimized route to the fpclassify builtin
> >>for floating point numbers which are similar to IEEE-754 in format.
> >>
> >>The goal is to make it faster by:
> >>1. Trying to determine the most common case first
> >>    (e.g. the float is a Normal number) and then the
> >>    rest. The amount of code generated at -O2 are
> >>    about the same +/- 1 instruction, but the code
> >>    is much better.
> >>2. Using integer operation in the optimized path.
> >>
> >>At a high level, the optimized path uses integer operations
> >>to perform the following:
> >>
> >>   if (exponent bits aren't all set or unset)
> >>      return Normal;
> >>   else if (no bits are set on the number after masking out
> >>        sign bits then)
> >>      return Zero;
> >>   else if (exponent has no bits set)
> >>      return Subnormal;
> >>   else if (mantissa has no bits set)
> >>      return Infinite;
> >>   else
> >>      return NaN;
> >I haven't looked at fpclassify.  I assume we can define a backend insn to do
> >the right thing?  One of the things we've noticed over the years with the
> >PowerPC is that it can be rather expensive to move things from the floating
> >point/vector unit to the integer registers and vice versa.  This is
> >particularly true if you having to do the transfer via the memory unit via
> >stores and loads of different sizes.
> >
> Hmm, what do you mean with the right thing? Do you mean never to use the
> integer version?

With the forthcoming PowerPC ISA 3.0 (power9), we have different ways to do
classification within the floating point unit.

For example, there is the XSTSTDCDP instruction that can set a condition code
register according to whether the value is 0, NaN, Infinity, or Denormal.  We
might come up with a clever set of tests to use 4 of these instructions to
return the appropriate FP_<xxx>.

Even if we want to do it by looking at the exponent, ISA 3.0 defines
instructions like XSXEXPDP that extract the exponent from a double precision
value and return it in a GPR.
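
As an illustration only, the exponent-based classification in the quoted
pseudocode might look like the following C sketch.  The function name
fpclassify_int and the mask constants are assumptions made for this example,
not code from the patch, and the sketch assumes double is a 64-bit IEEE-754
binary64 value:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): classify a binary64 value
   using only integer operations on its bit pattern.  */
static int
fpclassify_int (double x)
{
  uint64_t bits;

  /* Copy the bit pattern into an integer; memcpy avoids aliasing issues.  */
  memcpy (&bits, &x, sizeof bits);

  const uint64_t sign_mask = UINT64_C (0x8000000000000000);
  const uint64_t exp_mask = UINT64_C (0x7ff0000000000000);
  const uint64_t mant_mask = UINT64_C (0x000fffffffffffff);

  uint64_t exp = bits & exp_mask;
  uint64_t mant = bits & mant_mask;

  /* Most common case first: exponent bits neither all clear nor all set.  */
  if (exp != 0 && exp != exp_mask)
    return FP_NORMAL;

  /* Nothing set once the sign bit is masked out.  */
  if ((bits & ~sign_mask) == 0)
    return FP_ZERO;

  /* Exponent all clear with a nonzero mantissa.  */
  if (exp == 0)
    return FP_SUBNORMAL;

  /* Exponent all set: the mantissa separates infinity from NaN.  */
  if (mant == 0)
    return FP_INFINITE;

  return FP_NAN;
}
```

On a machine where the value starts in a floating point register, the memcpy
here is exactly the FPR-to-GPR transfer whose cost is in question.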

> If so then no, it currently determines it based on the format.
> I could potentially add a hook to allow backends to opt-in/out if
> there's a concern this might be slower.

It would be better to have an fpclassify<mode>2 pattern, and if it isn't
defined, then do the machine independent processing.  That is the way it is
done elsewhere.

> Though is the move that much slower that it negates the benefits we
> should get from not having to do
> 4 branches in the normal case?

It depends.  We have a lot of other stuff for ISA 3.0 on our plates, and
truthfully, we won't be able to answer the question about performance until we
get real hardware, but I would prefer not to be locked into an existing
implementation.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797