Subject: Re: [PATCH 01/17] S390: Use load-fp-integer instruction for nearbyint functions.
To: libc-alpha@sourceware.org
References: <1572881244-6781-1-git-send-email-stli@linux.ibm.com>
From: Adhemerval Zanella
Message-ID: <4cdb552e-5b56-4e43-a33b-44ec9892cc3f@linaro.org>
Date: Tue, 05 Nov 2019 18:55:00 -0000

On 05/11/2019 12:49, Stefan Liebler wrote:
> On 11/4/19 7:22 PM, Adhemerval Zanella wrote:
>>
>>
>> On 04/11/2019 12:27, Stefan Liebler wrote:
>>> If compiled with z196 zarch support, the load-fp-integer instruction
>>> is used to implement nearbyint, nearbyintf, nearbyintl.
>>> Otherwise the common-code implementation is used.
>>
>>> +
>>> +double
>>> +__nearbyint (double x)
>>> +{
>>> +  double y;
>>> +  /* The z196 zarch "load fp integer" (fidbra) instruction is rounding
>>> +     x to the nearest integer according to current rounding mode (M3-field: 0)
>>> +     where inexact exceptions are suppressed (M4-field: 4).  */
>>> +  __asm__ ("fidbra %0,0,%1,4" : "=f" (y) : "f" (x));
>>> +  return y;
>>> +}
>>> +libm_alias_double (__nearbyint, nearbyint)
>>
>> At least with recent gcc __builtin_nearbyint generates the expected fidbra
>> instruction for -march=z196.  I wonder if we could start to simplify some
>> math symbols implementation where new architectures/extensions provide
>> direct implementation by a direct mapping implemented by compiler builtins.
>>
>> I would expect to:
>>
>>    1. Move all sysdeps/ieee754/dbl-64/wordsize-64 to sysdeps/ieee754/dbl-64/
>>       since I hardly doubt these micro-optimizations really pay off with
>>       recent architectures and compiler version.
>>
>>    2. Add internal macros __USE__BUILTIN and use as:
>>
>>       * sysdeps/ieee754/dbl-64/s_nearbyint.c
>>             [...]
>>       double
>>       __nearbyint (double x)
>>       {
>>       #if __USE_NEARBYINT_BUILTIN
>>         return __builtin_nearbyint (x);
>>       #else
>>         /* Use generic implementation.  */
>>       #endif
>>       }
>>
>>    3. Define the __USE__BUILTIN for each architecture.
>>
>> It would allow to simplify some architectures, aarch64 for instance.
>>
>
> Currently the long double builtins are generating an extra not needed stack frame compared to the inline assembly. But this needs to be fixed in gcc.
>
> E.g. if build for s390 (31bit), where the fidbra & co instructions are not available, the builtins generate a call to libc which would end in an infinite loop.  I will make some tests on s390 starting with the current minimum gcc 6.2 to be sure that the instructions are used.  I have never build glibc with other compilers like clang.  Is there a special need to check this behavior?

I think Google maintains some branches with clang support (google/grte/*), but there is no known effort to sync these with master. So I see no need to focus on non-gcc compilers for now.

>
> In general I can start with those functions where the builtins can be used on s390, but I won't move all wordsize-64 functions and adjust them to use the builtins with this patch series.
> This means for now, I start with using builtins for nearbyint, rint, floor, ceil, trunc, round and copysign.
>
> Afterwards the same can be done for the remaining functions.
>
> I will create an own header file, e.g. sysdeps/generic/math-use-builtins.h in the same way as fix-fp-int-compare-invalid.h.
> The generic version contains all USE_XYZ_BUILTIN macros defined to 0
> and each architecture can provide its own file with other settings.
> For each functions XYZ there will be three macros, e.g. USE_NEARBYINT_BUILTIN, USE_NEARBYINTF_BUILTIN, USE_NEARBYINTL_BUILTIN.
> How about this?
>

I think it is a fair start, with the adjustments pointed out by Joseph.
I will check out the wordsize-64 refactor to avoid duplicating the implementations.
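
For reference, a rough sketch of how the USE_XYZ_BUILTIN scheme discussed above might look for the double variant. This is only an illustration under assumptions, not the patch itself: the function name sketch_nearbyint is hypothetical, the macro block stands in for what sysdeps/generic/math-use-builtins.h would provide, and the #else branch is a simplified 2^52 add/subtract fallback that assumes IEEE binary64 arithmetic evaluated in double precision. Unlike a real nearbyint, that fallback does not suppress the inexact exception; it is not glibc's generic code.

```c
/* Hypothetical stand-in for the proposed sysdeps/generic/math-use-builtins.h:
   the generic default is 0, and an architecture-specific copy of the header
   would define the macro to 1 instead.  */
#ifndef USE_NEARBYINT_BUILTIN
# define USE_NEARBYINT_BUILTIN 0
#endif

/* Hypothetical name; in glibc this would be __nearbyint plus its aliases.  */
double
sketch_nearbyint (double x)
{
#if USE_NEARBYINT_BUILTIN
  /* Only safe if the compiler expands the builtin inline (e.g. to fidbra
     with -march=z196).  If it emits a call to nearbyint instead, the
     function would end up calling itself.  */
  return __builtin_nearbyint (x);
#else
  /* Simplified fallback (an assumption, not the generic glibc code):
     adding and subtracting 2^52 rounds to an integer in the current
     rounding mode.  It may raise inexact, which nearbyint must not.  */
  static const double two52 = 4503599627370496.0; /* 2^52 */
  volatile double t; /* volatile keeps the add/subtract pair from being folded */

  /* Zeros, NaN, infinities and values with magnitude >= 2^52 are returned
     unchanged; the latter are already integral.  */
  if (x == 0.0 || !(x > -two52 && x < two52))
    return x;

  if (x < 0.0)
    {
      t = x - two52;
      t = t + two52;
      return t == 0.0 ? -0.0 : t; /* keep the sign of a zero result */
    }
  t = x + two52;
  t = t - two52;
  return t == 0.0 ? 0.0 : t;
#endif
}
```

With the macro left at 0 and the default round-to-nearest mode, sketch_nearbyint (2.5) gives 2.0 and sketch_nearbyint (3.5) gives 4.0. Building with -DUSE_NEARBYINT_BUILTIN=1 only makes sense on a target where the compiler is known to expand the builtin inline, which is the s390 (31bit) concern raised above.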