public inbox for gcc@gcc.gnu.org
* Advice on creating an AMD-specific matmul
@ 2017-05-20 13:46 Thomas Koenig
From: Thomas Koenig @ 2017-05-20 13:46 UTC
  To: fortran, gcc mailing list

Hello world,

I am wondering how best to implement an AMD-specific version
of MATMUL for libgfortran.

What we currently have in there is restricted to Intel chips,
with a run-time selection of versions depending on availability
of AVX, AVX2+FMA and AVX512F.
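
The selection logic is roughly of this shape (a sketch only; the
variant and pointer names are illustrative):

    static void (*matmul_p) (gfc_array_r4 * const restrict,
                             gfc_array_r4 * const restrict,
                             gfc_array_r4 * const restrict,
                             int, int, blas_call) = NULL;

    if (matmul_p == NULL)
      {
        /* Fall back to the plain C version, then upgrade according
           to what the CPU actually supports.  */
        matmul_p = matmul_r4_vanilla;
        if (__builtin_cpu_supports ("avx512f"))
          matmul_p = matmul_r4_avx512f;
        else if (__builtin_cpu_supports ("avx2")
                 && __builtin_cpu_supports ("fma"))
          matmul_p = matmul_r4_avx2;
        else if (__builtin_cpu_supports ("avx"))
          matmul_p = matmul_r4_avx;
      }
    (*matmul_p) (retarray, a, b, try_blas, blas_limit, gemm);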

The specific function is then declared using (as an example)

matmul_r4_avx (gfc_array_r4 * const restrict retarray,
               gfc_array_r4 * const restrict a,
               gfc_array_r4 * const restrict b,
               int try_blas, int blas_limit, blas_call gemm)
        __attribute__((__target__("avx")));

This doesn't work well with AMD chips because, according to everything
I have read, their performance with 256-bit AVX is worse than not
using AVX at all. FMA would certainly come in handy for matrix
multiplication. Compiling a separate file with -mprefer-avx128 might
work, but would certainly create headaches with mixing
m4, CPP and conditional compilation via autoconf.
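
For reference, the separate-file route would presumably need something
along these lines in Makefile.am (just a sketch; the file and library
names are made up):

    # Build the AMD variants in their own convenience library so that
    # only those objects are compiled with -mprefer-avx128.
    noinst_LTLIBRARIES += libmatmul_amd.la
    libmatmul_amd_la_SOURCES = generated/matmul_r4_amd.c
    libmatmul_amd_la_CFLAGS = $(AM_CFLAGS) -mprefer-avx128
    libgfortran_la_LIBADD += libmatmul_amd.la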

Alternatively, a target attribute would avoid the separate file, but
I don't think something like -mprefer-avx128 can be specified as a
target attribute (though I'd like to be proven wrong here).
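
If the separate-file approach does work out, the run-time dispatch
could then check the vendor as well, something like this (again only a
sketch; matmul_r4_amd would be the hypothetical variant compiled with
-mprefer-avx128):

    /* Prefer a 128-bit FMA variant on AMD hardware, on the assumption
       that 256-bit AVX is not a win there.  */
    if (__builtin_cpu_is ("amd")
        && __builtin_cpu_supports ("fma"))
      matmul_p = matmul_r4_amd;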

Any advice on how best to proceed?

Regards

	Thomas
