From: Jerry DeLisle <>
To: Janne Blomqvist <>,
	Thomas Koenig <>
Cc: "" <>,
	gcc-patches <>
Subject: Re: [patch, libfortran] AMD-specific versions of library matmul
Date: Thu, 25 May 2017 20:31:00 -0000
Message-ID: <>
In-Reply-To: <>

On 05/25/2017 10:20 AM, Janne Blomqvist wrote:
> On Thu, May 25, 2017 at 1:45 PM, Thomas Koenig <> wrote:
>> Hello world,
>> the attached patch speeds up the library version of matmul for AMD chips
>> by selecting AVX128 instructions and, depending on which instructions
>> are supported, either FMA3 (aka FMA) or FMA4.
>> Jerry tested this on his AMD systems, and found a speedup vs. the
>> current code of around 10%.
>> I have been unable to test this on a Ryzen system (the new compile farm
>> machines won't accept my login yet).  From the benchmarks I have read,
>> this method should also work fairly well on a Ryzen.
>> So, OK for trunk?
> In some comments, you have -mprefer=avx128 whereas the option that gcc
> understands is -mprefer-avx128. Also, have you verified that e.g.
> contemporary Intel processors still use the avx256 codepath and don't
> accidentally end up with avx128?
> As for FMA4, are there sufficient numbers of processors supporting
> FMA4 but not FMA3 around to justify bloating the library to support
> them? I understood that this is only a single AMD CPU generation
> ("bulldozer" in 2011), the next one ("piledriver" in 2012) added FMA3
> in addition to FMA4. And in the new Zen core (Ryzen, Epyc, etc.) AMD
> has dropped support for FMA4 although there are reports that it will
> still execute FMA4 for backward compatibility although it's no longer
> advertised in CPUID, but in any case AMD seems to consider it a legacy
> instruction that should not be used anymore (Intel never supported
> it).

Good questions. I am testing this on Ryzen now. It does work as advertised. The 
CPU flags advertise only FMA.

Next I will test the older AMD machine, which advertises both FMA4 and FMA, 
with just the FMA flag, and likewise the Ryzen with FMA4 and FMA.

I want to see if there is any breakage between the two generations of AMD I can 
test.

Also Ryzen with and without -mprefer-avx128 will be tested.

I do not have an Intel box to test.
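
For readers following the thread, the flag checks discussed above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the enum labels and the functions `select_amd_kernel` and `detect_and_select` are hypothetical names, the preference for FMA3 over FMA4 when both are advertised is assumed, and the vendor check that would restrict this path to AMD chips is omitted. The only real interfaces used are the GCC builtins `__builtin_cpu_init` and `__builtin_cpu_supports`.

```c
#include <stdio.h>

/* Hypothetical labels for the matmul variants; the names used in
   libgfortran may differ.  */
typedef enum
{
  KERNEL_VANILLA,
  KERNEL_AVX128_FMA3,
  KERNEL_AVX128_FMA4
} kernel_t;

/* Pure selection logic.  Assumption: FMA3 is preferred when both FMA3
   and FMA4 are advertised, and FMA4 is used only as a fallback.  */
kernel_t
select_amd_kernel (int has_avx, int has_fma3, int has_fma4)
{
  if (!has_avx)
    return KERNEL_VANILLA;
  if (has_fma3)
    return KERNEL_AVX128_FMA3;
  if (has_fma4)
    return KERNEL_AVX128_FMA4;
  return KERNEL_VANILLA;
}

/* Query the running CPU and pick a variant.  On non-x86 targets, or
   with compilers lacking the builtins, fall back to the plain version.  */
kernel_t
detect_and_select (void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init ();
  int has_avx = __builtin_cpu_supports ("avx") != 0;
  int has_fma3 = __builtin_cpu_supports ("fma") != 0;
  int has_fma4 = __builtin_cpu_supports ("fma4") != 0;
  return select_amd_kernel (has_avx, has_fma3, has_fma4);
#else
  return KERNEL_VANILLA;
#endif
}

/* Human-readable name, e.g. for printing which path a test run took.  */
const char *
kernel_name (kernel_t k)
{
  switch (k)
    {
    case KERNEL_AVX128_FMA3:
      return "avx128+fma3";
    case KERNEL_AVX128_FMA4:
      return "avx128+fma4";
    default:
      return "vanilla";
    }
}
```

Printing `kernel_name (detect_and_select ())` on each test machine would show which path that CPU's advertised flags lead to. A chip that executes FMA4 without advertising it in CPUID, as reported for Zen, would never reach the FMA4 branch in a scheme like this.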
