public inbox for gcc-patches@gcc.gnu.org
From: "Cui, Lili" <lili.cui@intel.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: "Hongtao Liu" <crazylht@gmail.com>,
	"Martin Liška" <mliska@suse.cz>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	"Liu, Hongtao" <hongtao.liu@intel.com>
Subject: RE: [PATCH] ix86: Suggest unroll factor for loop vectorization
Date: Wed, 2 Nov 2022 09:37:01 +0000	[thread overview]
Message-ID: <SJ0PR11MB560094054DC374241D31EE419E399@SJ0PR11MB5600.namprd11.prod.outlook.com> (raw)
In-Reply-To: <CAFiYyc1eybPeEsn8HcU682n7rcbwWdoBCNbN_h33g=VfVeeNeA@mail.gmail.com>

> > > +@item x86-vect-unroll-min-ldst-threshold
> > > +The vectorizer will check with target information to determine
> > > +whether to unroll it. This parameter is used to limit the minimum of
> > > +loads and stores in the main loop.
> > >
> > > It's odd to "limit" the minimum number of something.  I think this
> > > warrants clarification that for some (unknown to me ;)) reason we
> > > think that when we have many loads and (or?) stores it is beneficial
> > > to unroll to get even more loads and stores in a single iteration.
> > > Btw, does the parameter limit the number of loads and stores _after_
> unrolling or before?
> > >
> > When the number of loads/stores exceeds the threshold, the loads/stores
> are more likely to conflict with the loop itself in the L1 cache (assuming
> the load addresses are scattered).
> > Unrolling plus software scheduling brings 2 or 4 contiguous-address
> loads/stores closer together, which reduces the cache miss rate.
> 
> Ah, nice.  Can we express the default as a function of L1 data cache size, L1
> cache line size and more importantly, the size of the vector memory access?
> 
> Btw, I was looking into making a more meaningful cost modeling for loop
> distribution.  Similar reasoning might apply there - try to _reduce_ the
> number of memory streams so L1 cache utilization allows re-use of a cache
> line in the next [next N] iteration[s]?  OTOH given L1D is quite large I'd expect
> the loops affected to be either quite huge or bottlenecked by load/store
> bandwidth (there are 1024 L1D cache lines in zen2 for
> example) - what's the effective L1D load you are keying off?
> Btw, how does L1D allocation on stores play a role here?
> 
Hi Richard,
To answer your question, I rechecked 549 and found that its improvement comes from load reduction. It has a 3-level loop nest, and 8 scalar loads in the inner loop are loop invariants (due to high register pressure, these invariants all spill to the stack).
After unrolling the inner loop, those scalar parts are not duplicated, so unrolling reduces load instructions and L1/L2/L3 accesses. The inner loop accesses 8 different three-dimensional arrays, each sized like "a[128][480][128]". Although the arrays are very large, this does not support the theory I described before; sorry for that. I need to hold this patch to see if we can do something about this scenario.

Thanks,
Lili.



Thread overview: 6+ messages
2022-10-24  2:46 Cui,Lili
2022-10-25  5:49 ` Hongtao Liu
2022-10-25 11:53   ` Richard Biener
2022-10-26 11:38     ` Cui, Lili
2022-10-28  7:12       ` Richard Biener
2022-11-02  9:37         ` Cui, Lili [this message]
