public inbox for libc-alpha@sourceware.org
From: "Paul A. Clarke" <pc@us.ibm.com>
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: libc-alpha@sourceware.org,
	Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Subject: Re: [PATCH v4 06/12] math: Remove powerpc e_hypot
Date: Mon, 6 Dec 2021 15:29:32 -0600
Message-ID: <20211206212932.GB48332@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com>
In-Reply-To: <9ab1d04d-e6f7-0ca6-0541-374a8a55ab09@linaro.org>

On Mon, Dec 06, 2021 at 02:12:27PM -0300, Adhemerval Zanella wrote:
> On 02/12/2021 21:00, Adhemerval Zanella wrote:
> > The generic implementation shows only slightly worse performance:
> > 
> > POWER9     reciprocal-throughput    latency
> > master                   13.4024    14.0967
> > new hypot                14.8479    15.8061
> > 
> > POWER8     reciprocal-throughput    latency
> > master                   15.5767    16.8885
> > new hypot                16.5371    18.4057
> > 
> > One way to improve this might be to make gcc generate xsmaxdp/xsmindp
> > for fmax/fmin (it only does so with -ffast-math; clang does so with
> > default options).
> > 
> > Checked on powerpc64-linux-gnu (power8) and powerpc64le-linux-gnu
> > (power9).
> 
> Hi Tulio/Paul,
> 
> This is the only missing patch in this set, and I would like to check
> with you, the powerpc maintainers, whether it would be ok to push it.
> The resulting performance, including the latest patch that removes the
> wrappers, is slightly better:
>
> POWER9           reciprocal-throughput     latency
> master                          13.4024    14.0967
> new hypot                       11.9206    13.9871
> 
> POWER8            reciprocal-throughput    latency
> master                          15.5767    16.8885
> new hypot                       15.3541    18.0856

For Power10 / master:

  "hypot": {
   "workload-random": {
    "duration": 5.28242e+08,
    "iterations": 4.8e+07,
    "reciprocal-throughput": 8.28478,
    "latency": 13.7253,
    "max-throughput": 1.20703e+08,
    "min-throughput": 7.2858e+07
   }
  }

For Power10 / new hypot:

  "hypot": {
   "workload-random": {
    "duration": 5.30731e+08,
    "iterations": 5.2e+07,
    "reciprocal-throughput": 7.21945,
    "latency": 13.1933,
    "max-throughput": 1.38515e+08,
    "min-throughput": 7.57963e+07
   }
  }
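
To put the two runs side by side, the relative change of each metric can
be computed as (new - master) / master. The following is only a sketch:
relative_change is a hypothetical helper, not part of the glibc
benchtests, and it assumes that lower values are better for both
reciprocal-throughput and latency.

```c
#include <assert.h>

/* Hypothetical helper: relative change of a benchmark metric.
   A negative result means the candidate value is lower than the
   baseline value.  */
static double
relative_change (double baseline, double candidate)
{
  assert (baseline != 0.0);   /* avoid dividing by zero */
  return (candidate - baseline) / baseline;
}
```

With the Power10 figures above, relative_change (8.28478, 7.21945) comes
to roughly -0.13 and relative_change (13.7253, 13.1933) to roughly -0.04,
i.e. about 13% and 4% lower than master.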

> The POWER8 latency difference seems to be due to branch misprediction
> in the max/min selection.  In fact, if I use xsmaxdp/xsmindp for
> USE_FMAX_BUILTIN/USE_FMIN_BUILTIN, I see much better results on POWER8:
> 
> POWER8            reciprocal-throughput    latency
> xsmaxdp/xsmindp                 12.8959    16.2082
> 
> POWER9 is not affected (I don't see any performance difference by
> using xsmaxdp/xsmindp).
> 
> Unfortunately, gcc only emits xsmaxdp/xsmindp with -ffast-math for
> some reason; clang uses them at the default -O2 option.

I believe clang may be wrong here, in that the instructions do not
properly handle NaN for the fmin/fmax semantics without -ffast-math.

PC

Thread overview (36+ messages):
2021-12-03  0:00 [PATCH v4 00/12] Improve hypot Adhemerval Zanella
2021-12-03  0:00 ` [PATCH v4 01/12] math: Simplify hypotf implementation Adhemerval Zanella
2021-12-03 13:23   ` Wilco Dijkstra
2021-12-03 19:44     ` Adhemerval Zanella
2021-12-03  0:00 ` [PATCH v4 02/12] math: Use an improved algorithm for hypot (dbl-64) Adhemerval Zanella
2021-12-03 13:41   ` Wilco Dijkstra
2021-12-03 19:44     ` Adhemerval Zanella
2021-12-03  0:00 ` [PATCH v4 03/12] math: Improve hypot performance with FMA Adhemerval Zanella
2021-12-03 13:44   ` Wilco Dijkstra
2021-12-03  0:00 ` [PATCH v4 04/12] math: Use an improved algorithm for hypotl (ldbl-96) Adhemerval Zanella
2021-12-06 12:00   ` Wilco Dijkstra
2021-12-06 12:21     ` Adhemerval Zanella
2021-12-03  0:00 ` [PATCH v4 05/12] math: Use an improved algorithm for hypotl (ldbl-128) Adhemerval Zanella
2021-12-06 11:58   ` Wilco Dijkstra
2021-12-06 12:22     ` Adhemerval Zanella
2021-12-03  0:00 ` [PATCH v4 06/12] math: Remove powerpc e_hypot Adhemerval Zanella
2021-12-06 17:12   ` Adhemerval Zanella
2021-12-06 21:29     ` Paul A. Clarke [this message]
2021-12-07 13:19       ` Adhemerval Zanella
2021-12-03  0:00 ` [PATCH v4 07/12] i386: Move hypot implementation to C Adhemerval Zanella
2021-12-03 14:51   ` Wilco Dijkstra
2021-12-06 12:26     ` Adhemerval Zanella
2021-12-03  0:00 ` [PATCH v4 08/12] math: Add math-use-builtinds-fmax.h Adhemerval Zanella
2021-12-06 11:52   ` Wilco Dijkstra
2021-12-06 21:11   ` Joseph Myers
2021-12-07 13:21     ` Adhemerval Zanella
2021-12-03  0:01 ` [PATCH v4 09/12] math: Add math-use-builtinds-fmin.h Adhemerval Zanella
2021-12-06 11:50   ` Wilco Dijkstra
2021-12-03  0:01 ` [PATCH v4 10/12] aarch64: Add math-use-builtins-f{max,min}.h Adhemerval Zanella
2021-12-06 11:46   ` Wilco Dijkstra
2021-12-06 12:35     ` Adhemerval Zanella
2021-12-03  0:01 ` [PATCH v4 11/12] math: Use fmin/fmax on hypot Adhemerval Zanella
2021-12-06 11:44   ` Wilco Dijkstra
2021-12-03  0:01 ` [PATCH v4 12/12] math: Remove the error handling wrapper from hypot and hypotf Adhemerval Zanella
2021-12-03  8:51 ` [PATCH v4 00/12] Improve hypot Paul Zimmermann
2021-12-06 12:36   ` Adhemerval Zanella
