Re: Disable FMADD in chains for Zen4 and generic

public inbox for gcc-patches@gcc.gnu.org

From: Jan Hubicka <hubicka@ucw.cz>
To: Hongtao Liu <crazylht@gmail.com>
Cc: gcc-patches@gcc.gnu.org, hongtao.liu@intel.com,
	hongjiu.lu@intel.com, "Zhang, Annita" <annita.zhang@intel.com>
Subject: Re: Disable FMADD in chains for Zen4 and generic
Date: Wed, 13 Dec 2023 17:03:18 +0100
Message-ID: <ZXnVxlYYngUJ96hZ@kam.mff.cuni.cz>
In-Reply-To: <CAMZc-bxchHtp8NwkJ2H1jO7G8n_jE5GK1tqAGb0Z3MtZGPrpDg@mail.gmail.com>

> > The difference is that Cores understand the fact that fmadd does not need
> > all three parameters to start computation, while Zen cores don't.
> >
> > Since this seems a noticeable win on Zen and no loss on Core, it seems like
> > a good default for generic.
> >
> > I plan to commit the patch next week if there are no complaints.
> The generic part LGTM. (It's exactly what we proposed in [1].)
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637721.html

Thanks.  I wonder if we can think of other generic changes that would
make sense to do?
Concerning zen4 and FMA, it is not really a win with AVX512 enabled
(which is what I was benchmarking for znver4 tuning), but indeed it is
a win with AVX256, where the extra latency is not hidden by the
parallelism exposed by doing everything twice.

I re-benchmarked zen4 and it behaves similarly to zen3 with AVX256, so
for x86-64-v3 this makes sense.
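
To make the latency argument concrete, here is a minimal sketch (not part
of the patch) of the critical path of a reduction such as the inner loop
of mult() in the quoted testcase.  The function names are hypothetical and
the latencies are parameters, not measurements from this thread; the model
ignores throughput, vector width and memory effects.

```c
#include <assert.h>

/* Critical-path length, in cycles, of n dependent accumulations
   acc = fma (a[k], b[k], acc) on a core that cannot start an FMA
   before its addend is ready: every FMA waits for the previous one.
   fma_latency is an assumed parameter.  */
long fused_chain_cycles (long n, long fma_latency)
{
  assert (n >= 0);
  return n * fma_latency;
}

/* The same reduction done as a separate multiply and add.  The
   multiplies do not depend on acc, so only the first multiply plus
   the chain of n adds sits on the critical path.  mul_latency and
   add_latency are assumed parameters.  */
long split_chain_cycles (long n, long mul_latency, long add_latency)
{
  assert (n >= 0);
  return n > 0 ? mul_latency + n * add_latency : 0;
}
```

With assumed latencies of 4 cycles for FMA and 3 cycles each for multiply
and add, fused_chain_cycles (1000, 4) is 4000 while
split_chain_cycles (1000, 3, 3) is 3003.  That is the kind of gap the
avoid_fma*_chains tunings aim at when the chain is not hidden by other
parallelism.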

Honza
> >
> > Honza
> >
> > #include <stdio.h>
> > #include <time.h>
> >
> > #define SIZE 1000
> >
> > float a[SIZE][SIZE];
> > float b[SIZE][SIZE];
> > float c[SIZE][SIZE];
> >
> > void init(void)
> > {
> >    int i, j;
> >    for(i=0; i<SIZE; ++i)
> >    {
> >       for(j=0; j<SIZE; ++j)
> >       {
> >          a[i][j] = (float)i + j;
> >          b[i][j] = (float)i - j;
> >          c[i][j] = 0.0f;
> >       }
> >    }
> > }
> >
> > void mult(void)
> > {
> >    int i, j, k;
> >
> >    for(i=0; i<SIZE; ++i)
> >    {
> >       for(j=0; j<SIZE; ++j)
> >       {
> >          for(k=0; k<SIZE; ++k)
> >          {
> >             c[i][j] += a[i][k] * b[k][j];
> >          }
> >       }
> >    }
> > }
> >
> > int main(void)
> > {
> >    clock_t s, e;
> >
> >    init();
> >    s=clock();
> >    mult();
> >    e=clock();
> >    printf("        mult took %10d clocks\n", (int)(e-s));
> >
> >    return 0;
> >
> > }
> >
> >         * config/i386/x86-tune.def (X86_TUNE_AVOID_128FMA_CHAINS,
> >         X86_TUNE_AVOID_256FMA_CHAINS): Enable for znver4 and generic.
> >
> > diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
> > index 43fa9e8fd6d..74b03cbcc60 100644
> > --- a/gcc/config/i386/x86-tune.def
> > +++ b/gcc/config/i386/x86-tune.def
> > @@ -515,13 +515,13 @@ DEF_TUNE (X86_TUNE_USE_SCATTER_8PARTS, "use_scatter_8parts",
> >
> >  /* X86_TUNE_AVOID_128FMA_CHAINS: Avoid creating loops with tight 128bit or
> >     smaller FMA chain.  */
> > -DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER1 | m_ZNVER2 | m_ZNVER3
> > -          | m_YONGFENG)
> > +DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4
> > +          | m_YONGFENG | m_GENERIC)
> >
> >  /* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
> >     smaller FMA chain.  */
> > -DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | m_ZNVER3
> > -         | m_CORE_HYBRID | m_SAPPHIRERAPIDS | m_CORE_ATOM)
> > +DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | m_ZNVER3 | m_ZNVER4
> > +         | m_CORE_HYBRID | m_SAPPHIRERAPIDS | m_CORE_ATOM | m_GENERIC)
> >
> >  /* X86_TUNE_AVOID_512FMA_CHAINS: Avoid creating loops with tight 512bit or
> >     smaller FMA chain.  */
> 
> 
> 
> -- 
> BR,
> Hongtao

Thread overview: 8+ messages
2023-12-12 14:37 Jan Hubicka
2023-12-12 15:01 ` Richard Biener
2023-12-12 16:48   ` Jan Hubicka
2023-12-12 17:08   ` Alexander Monakov
2023-12-12 23:56 ` Hongtao Liu
2023-12-13 16:03   ` Jan Hubicka [this message]
2024-01-08  3:16     ` Hongtao Liu
2024-01-17 17:29       ` Jan Hubicka
