From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-401264-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 117459 invoked by alias); 25 Jun 2015 13:27:44 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 117431 invoked by uid 89); 25 Jun 2015 13:27:43 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=ham version=3.3.2
X-HELO: mx2.suse.de
Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Thu, 25 Jun 2015 13:27:42 +0000
Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254])	by mx2.suse.de (Postfix) with ESMTP id AF617AC6B;	Thu, 25 Jun 2015 13:27:39 +0000 (UTC)
Date: Thu, 25 Jun 2015 13:27:00 -0000
From: Michael Matz <matz@suse.de>
To: Benedikt Huber <benedikt.huber@theobroma-systems.com>
cc: pinskia@gmail.com, "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,     "philipp.tomsich@theobroma-systems.com" <philipp.tomsich@theobroma-systems.com>
Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math
In-Reply-To: <F2FF9755-1DF9-4000-8602-77AB12077240@theobroma-systems.com>
Message-ID: <alpine.LSU.2.20.1506251521550.18829@wotan.suse.de>
References: <1434629045-24650-1-git-send-email-benedikt.huber@theobroma-systems.com> <8B73CF78-11D4-4963-A60A-E1C2A3B219E2@gmail.com> <F2FF9755-1DF9-4000-8602-77AB12077240@theobroma-systems.com>
User-Agent: Alpine 2.20 (LSU 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-IsSubscribed: yes
X-SW-Source: 2015-06/txt/msg01799.txt.bz2

Hi,

On Thu, 25 Jun 2015, Benedikt Huber wrote:

> > This is NOT a win on thunderX at least for single precision because 
> > you have to do the divide and sqrt in the same time as it takes 5 
> > multiplies (estimate and step are multiplies in the thunderX pipeline).  
> > Double precision is 10 multiplies which is just the same as what the 
> > patch does (but it is really slightly less than 10, I rounded up). So 
> > in the end this is NOT a win at all for thunderX unless we do one less 
> > step for both single and double.
> 
> Yes, the expected benefit from rsqrt estimation is 
> implementation-specific. If one has a better initial rsqrte or an 
> application that can trade precision for execution time, we could offer 
> a command line option to do only 2 steps for double and 1 step for 
> float; similar to -mrecip-precision for PowerPC. What are your thoughts 
> on that?

On x86-64, under -ffast-math we only do one Newton-Raphson (NR) step.  
The general rule of thumb for fast-math is that common benchmarks should 
still validate with that option in effect.

(And yes, I also never found a speedup from approximated reciprocals 
under which benchmarks would still generally validate: you always had to 
do two NR steps, and then it became as slow as a general divide.)  See 
also http://gcc.gnu.org/ml/gcc-patches/2009-11/msg00099.html and the 
follow-up thread.
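
To make the step-count trade-off concrete, here is a minimal sketch in C 
of rsqrt estimation followed by a configurable number of NR steps.  It is 
not what GCC emits for any target.  The initial estimate is the 
well-known integer bit trick, used here only as a portable stand-in for a 
hardware estimate instruction, and the names rsqrt_estimate, 
rsqrt_nr_step, rsqrt_approx and rsqrt_residual are hypothetical.  Each 
step roughly squares the relative error, so whether one step is enough 
depends on how accurate the initial estimate is.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: portable stand-in for a hardware rsqrt estimate.  Uses the
   classic integer bit trick; only valid for positive, normal x. */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);
    bits = UINT32_C(0x5f3759df) - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton-Raphson step for f(y) = 1/(y*y) - x:
   y' = y * (1.5 - 0.5 * x * y * y). */
static float rsqrt_nr_step(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* Estimate followed by `steps` refinement steps. */
static float rsqrt_approx(float x, int steps)
{
    float y = rsqrt_estimate(x);

    for (int i = 0; i < steps; i++)
        y = rsqrt_nr_step(x, y);
    return y;
}

/* |x*y*y - 1|, which is about twice the relative error of y.
   Computed in double so the check itself adds no float rounding. */
static double rsqrt_residual(float x, int steps)
{
    double y = rsqrt_approx(x, steps);
    double r = (double)x * y * y - 1.0;

    return r < 0.0 ? -r : r;
}
```

With this particular stand-in estimate, rsqrt_residual(1.0f, 0) is 
roughly 0.07 and rsqrt_residual(1.0f, 1) is roughly 0.003; a more 
accurate starting estimate moves every later step down accordingly, 
which is why the useful number of steps is implementation-specific.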