From: Thomas Koenig <>
To: gcc-patches <>
Subject: Fwd: [patch, libfortran] AMD-specific versions of library matmul
Date: Thu, 25 May 2017 11:19:00 -0000
Message-ID: <>
In-Reply-To: <>


patch is at
(didn't go through to gcc-patches due to size limitations).



-------- Forwarded Message --------
Subject: [patch, libfortran] AMD-specific versions of library matmul
Date: Thu, 25 May 2017 12:45:46 +0200
From: Thomas Koenig <>
To: <>, gcc-patches 

Hello world,

the attached patch speeds up the library version of matmul for AMD chips
by selecting AVX128 instructions and, depending on which instructions
are supported, either FMA3 (aka FMA) or FMA4.
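
As an illustration only (this is not code from the patch), the selection
described above could be sketched in C roughly as follows.  The enum and
function names are hypothetical, and preferring FMA3 over FMA4 when both
are available is an assumption of the sketch.  The GCC builtins
__builtin_cpu_init, __builtin_cpu_is and __builtin_cpu_supports are used
for detection; the patch itself may query the CPU differently.

```c
/* Hypothetical kernel identifiers; these names are illustrative and
   are not the symbols used in the patch.  */
enum matmul_kernel
{
  KERNEL_VANILLA,      /* generic code, no special instruction set */
  KERNEL_AVX128_FMA3,  /* 128-bit AVX with FMA3 (aka FMA) */
  KERNEL_AVX128_FMA4   /* 128-bit AVX with FMA4 */
};

/* Pure selection logic, kept separate from CPU detection so it can be
   checked on any host.  Assumption: FMA3 wins when both FMA3 and FMA4
   are available.  */
enum matmul_kernel
select_amd_kernel (int is_amd, int has_avx, int has_fma3, int has_fma4)
{
  if (!is_amd || !has_avx)
    return KERNEL_VANILLA;
  if (has_fma3)
    return KERNEL_AVX128_FMA3;
  if (has_fma4)
    return KERNEL_AVX128_FMA4;
  return KERNEL_VANILLA;
}

/* Query the running CPU.  Falls back to the generic kernel on
   compilers or targets without the x86 CPU builtins.  */
enum matmul_kernel
detect_kernel (void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init ();
  return select_amd_kernel (__builtin_cpu_is ("amd"),
                            __builtin_cpu_supports ("avx"),
                            __builtin_cpu_supports ("fma"),
                            __builtin_cpu_supports ("fma4"));
#else
  return KERNEL_VANILLA;
#endif
}
```

A caller would run detect_kernel () once, cache the result, and dispatch
every later matmul call through it, so detection is paid only on first use.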

Jerry tested this on his AMD systems, and found a speedup vs. the
current code of around 10%.

I have been unable to test this on a Ryzen system (the new compile farm
machines won't accept my login yet).  From the benchmarks I have read,
this method should also work fairly well on a Ryzen.

So, OK for trunk?



2017-05-25  Thomas Koenig  <>

	PR libfortran/78379
	* Add generated/matmulavx128_*.c files.
	Handle them for compiling and setting the right flags.
	* acinclude.m4: Add tests for FMA3, FMA4 and AVX128.
	* Call them.
	* Regenerated.
	* Regenerated.
	* configure: Regenerated.
	* m4/matmul.m4: Handle AMD chips by calling 128-bit AVX
	versions which use FMA3 or FMA4.
	* m4/matmulavx128.m4: New file.
	* generated/matmul_c10.c: Regenerated.
	* generated/matmul_c16.c: Regenerated.
	* generated/matmul_c4.c: Regenerated.
	* generated/matmul_c8.c: Regenerated.
	* generated/matmul_i1.c: Regenerated.
	* generated/matmul_i16.c: Regenerated.
	* generated/matmul_i2.c: Regenerated.
	* generated/matmul_i4.c: Regenerated.
	* generated/matmul_i8.c: Regenerated.
	* generated/matmul_r10.c: Regenerated.
	* generated/matmul_r16.c: Regenerated.
	* generated/matmul_r4.c: Regenerated.
	* generated/matmul_r8.c: Regenerated.
	* generated/matmulavx128_c10.c: New file.
	* generated/matmulavx128_c16.c: New file.
	* generated/matmulavx128_c4.c: New file.
	* generated/matmulavx128_c8.c: New file.
	* generated/matmulavx128_i1.c: New file.
	* generated/matmulavx128_i16.c: New file.
	* generated/matmulavx128_i2.c: New file.
	* generated/matmulavx128_i4.c: New file.
	* generated/matmulavx128_i8.c: New file.
	* generated/matmulavx128_r10.c: New file.
	* generated/matmulavx128_r16.c: New file.
	* generated/matmulavx128_r4.c: New file.
	* generated/matmulavx128_r8.c: New file.
