public inbox for gcc@gcc.gnu.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Jan Hubicka <jh@suse.cz>
Cc: Jan Hubicka <hubicka@ucw.cz>,
	discuss@x86-64.org, GCC <gcc@gcc.gnu.org>,
		"Girkar, Milind" <milind.girkar@intel.com>,
		"Dmitriev, Serguei N" <serguei.n.dmitriev@intel.com>,
		"Kreitzer, David L" <david.l.kreitzer@intel.com>
Subject: Re: RFC: Extend x86-64 psABI for 256bit AVX register
Date: Fri, 06 Jun 2008 14:28:00 -0000
Message-ID: <20080606142813.GA18621@lucon.org>
In-Reply-To: <20080606135026.GA14877@lucon.org>

On Fri, Jun 06, 2008 at 06:50:26AM -0700, H.J. Lu wrote:
> On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote:
> > > 
> > > ymm0 and xmm0 are the same register. xmm0 is the lower 128bit
> > > of ymm0. I am not sure if we need separate XMM registers from
> > > YMM registers.
> > 
> > 
> > Yes, I know that xmm0 is the lower part of ymm0.  I still think we
> > ought to be able to support varargs that save YMM registers only when
> > YMM values are passed, the same way we touch SSE only when SSE values
> > are passed, as indicated by the EAX hint.
> 
> Which register do you propose for hint? The current psABI uses RAX
> for XMM registers. We can't change it to AL and AH for YMM without
> breaking backward compatibility.
> 
> > This way we will be able to support e.g. a printf that has a YMM
> > printing % construct but doesn't need YMM-enabled hardware when those
> > are not used.
> > 
> > This is why I think extending EAX to contain the number of XMM values
> > to save and, in addition, the number of YMM values to save is sane.
> > Then old non-YMM-aware varargs prologues will crash when YMM values
> > are passed, but all other combinations will work.
> 
> I don't think it is necessary, since -mavx will enable AVX code
> generation for all SSE code. Unless the function only uses integers,
> it will crash on non-YMM-aware hardware.  That is, if even one SSE
> register is used, which is hinted in RAX, the varargs prologue will
> use AVX instructions to save it. We don't need another hint for AVX
> instructions.
> 
> > > 
> > > >
> > > > I personally don't have a strong preference between 1. and 2.
> > > > Option 1. seems relatively easy to implement too, or is packing two
> > > > 128bit values into a single 256bit value difficult in va_arg expansion?
> > > >
> > > 
> > > Access to 256bit register as lower and upper 128bits needs 2
> > > instructions. For store
> > > 
> > > vmovaps   %xmm7, -143(%rax)
> > > vextractf128 $1, %ymm7, -15(%rax)
> > > 
> > > For load
> > > 
> > > vmovaps  -143(%rax),%xmm7
> > > vinsertf128 $1, -15(%rax),%ymm7,%ymm7
> > > 
> > > If we go beyond 256bit, we need more instructions to access
> > > the full register. For 512bit, it will be split into lower 128bit,
> > > middle 128bit and upper 256bit. 1024bit will have 4 parts.
> > > 
> > > For #2, only one instruction will be needed for 256bit and
> > > beyond.
> > 
> > Yes, but we will still save half of the stack space.  Well, I don't
> > have a strong preference here.  If it seems saner to simply save the
> > whole thing, saving the lower part twice, I am fine with that.
> 
> I was told that it wasn't very easy to get decent performance with
> split access. I extended my proposal to include a 16bit bitmask to
> indicate which YMM registers should be saved. If the bit is 0,
> we should save only the lower 128bit in the original register
> save area. Otherwise, we should save only the whole YMM register.
> 

On second thought, how useful is such a bitmask? Do we really
need it? Is it acceptable to save the lower 128bit twice?

Thanks.


H.J.
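
The trade-off between the three schemes discussed in this message can be
made concrete with a little arithmetic. The following is only a sketch in
C under simplifying assumptions: it counts the bytes stored per vector
register and ignores alignment padding and the fixed layout of the psABI
register save area. The function names, the `nregs` parameter and the
byte-counting model are illustrative and are not part of any proposal in
this thread.

```c
#include <assert.h>
#include <stdint.h>

enum { XMM_BYTES = 16, YMM_BYTES = 32 };

/* Bitmask scheme: bit i == 0 stores only the lower 128 bits of register i
 * (in the original save area); bit i == 1 stores only the whole 256-bit
 * register.  nregs is a hypothetical count of registers being saved. */
static unsigned bitmask_bytes(uint16_t mask, unsigned nregs)
{
    assert(nregs <= 16);            /* a 16-bit mask covers at most 16 registers */
    unsigned total = 0;
    for (unsigned i = 0; i < nregs; i++)
        total += ((mask >> i) & 1u) ? YMM_BYTES : XMM_BYTES;
    return total;
}

/* Split scheme: lower and upper 128-bit halves are stored separately,
 * which costs two instructions per register. */
static unsigned split_bytes(unsigned nregs)
{
    return nregs * (XMM_BYTES + XMM_BYTES);
}

/* Save-twice scheme: the lower 128 bits go to the original area and the
 * whole register goes to a new area, so the lower half is stored twice. */
static unsigned save_twice_bytes(unsigned nregs)
{
    return nregs * (XMM_BYTES + YMM_BYTES);
}
```

With 8 registers, this model gives 256 bytes for the split scheme and 384
bytes for saving the lower half twice, while the bitmask scheme ranges from
128 bytes (mask 0) to 256 bytes (all 8 bits set).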
