public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jan Hubicka <hubicka@ucw.cz>
To: Xinliang David Li <davidxl@google.com>
Cc: Jan Hubicka <hubicka@ucw.cz>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	Teresa Johnson <tejohnson@google.com>
Subject: Re: [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs
Date: Thu, 13 Dec 2012 01:19:00 -0000	[thread overview]
Message-ID: <20121213011933.GB21037@atrey.karlin.mff.cuni.cz> (raw)
In-Reply-To: <CAAkRFZLAe0CuO+-sBps9pCBDVi5k2ti8cBgL9Ukw4fBmrnpUeg@mail.gmail.com>

> > libcall is not faster up to 8KB to rep sequence that is better for regalloc/code
> > cache than fully blowin function call.
> 
> Be careful with this. My recollection is that REP sequence is good for
> any size -- for smaller size, the REP initial set up cost is too high
> (10s of cycles), while for large size copy, it is less efficient
> compared with library version.

Well this is based on the data from the memtest script.  
Core has good REP implementation - it is a win from rather small blocks (16
bytes if I recall) and it does not need alignment.
Library version starts to be interesting with caching hints, but I think till 80KB
it is still not a win for my setup (glibc-2.15)
> >> >
> >> >    /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall
> >> >     * on 16-bit immediate moves into memory on Core2 and Corei7.  */
> >> > @@ -1822,7 +1822,7 @@ static unsigned int initial_ix86_tune_fe
> >> >    m_K6,
> >> >
> >> >    /* X86_TUNE_USE_CLTD */
> >> > -  ~(m_PENT | m_ATOM | m_K6),
> >> > +  ~(m_PENT | m_ATOM | m_K6 | m_GENERIC),
> 
> My change was to enable CLTD for generic. Is your change intended to
> revert that?

No, it is merge conflict, sorry.  I will update it in my tree.
> > Skipping inc/dec is to avoid partial flag stall happening on P4 only.
> >> >
> 
> 
> K8 and K10 partitions the flags into groups. References to flags to
> the same group can still cause the stall -- not sure how that can be
> handled.

I  belive the stalls happends only in quite special cases where compare instruction
combines flags from multiple instructions.  GCC don't generate this type of code, so
we should be safe.

Honza

  parent reply	other threads:[~2012-12-13  1:19 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-08 18:13 Xinliang David Li
2012-12-12 16:37 ` Jan Hubicka
2012-12-12 17:25   ` Xinliang David Li
2012-12-12 17:34   ` Xinliang David Li
2012-12-12 18:30     ` Jan Hubicka
2012-12-12 18:37       ` Andi Kleen
2012-12-12 18:43         ` Andi Kleen
2012-12-12 18:43         ` Jan Hubicka
2012-12-12 20:56         ` x86-64 medium memory model Leif Ekblad
2012-12-12 20:59           ` H.J. Lu
2012-12-12 21:33             ` Leif Ekblad
2012-12-13  0:16       ` [PATCH i386]: Enable push/pop in pro/epilogue for modern CPUs Xinliang David Li
2012-12-13  0:16         ` Xinliang David Li
2012-12-13  1:19         ` Jan Hubicka [this message]
2012-12-13  6:09           ` Xinliang David Li
2012-12-13  6:21             ` Jakub Jelinek
2012-12-13  7:05               ` Xinliang David Li
2012-12-13 19:28                 ` Jan Hubicka
2012-12-13 10:22               ` Richard Biener
2012-12-13 19:43               ` H.J. Lu
2012-12-13 20:26                 ` Jan Hubicka
2012-12-13 20:28                   ` H.J. Lu
2012-12-13 20:40                     ` Jan Hubicka
2012-12-13 21:02                       ` H.J. Lu
2012-12-13 21:35                         ` Jan Hubicka
2012-12-20 12:13                         ` Melik-adamyan, Areg
2012-12-20 14:08                           ` H.J. Lu
2012-12-20 15:05                             ` Jan Hubicka
2012-12-20 15:07                               ` Jan Hubicka
2012-12-20 15:22                                 ` H.J. Lu
2012-12-21  8:28   ` Zamyatin, Igor
2012-12-09 13:50 Uros Bizjak
2012-12-09 17:09 ` Дмитрий Дьяченко
2012-12-10  9:23 ` Richard Biener
2012-12-10 20:42   ` Xinliang David Li
2012-12-10 21:07     ` Mike Stump
2012-12-11  9:49       ` Richard Biener
2012-12-11 17:15         ` Xinliang David Li
2012-12-11 22:53 Xinliang David Li
2012-12-11 23:39 ` Xinliang David Li
2012-12-12 11:00 ` Richard Biener
2012-12-21  7:26 Xinliang David Li
2012-12-21  8:20 ` Zamyatin, Igor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121213011933.GB21037@atrey.karlin.mff.cuni.cz \
    --to=hubicka@ucw.cz \
    --cc=davidxl@google.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=tejohnson@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).