public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
To: rguenther <rguenther@suse.de>,  "Robin Dapp" <rdapp.gcc@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	 richard.sandiford <richard.sandiford@arm.com>,
	 krebbel <krebbel@linux.ibm.com>,  uweigand <uweigand@de.ibm.com>,
	 linkw <linkw@linux.ibm.com>
Subject: Re: Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control
Date: Thu, 15 Jun 2023 18:08:45 +0800	[thread overview]
Message-ID: <CA0493E6B5090759+2023061518084459896088@rivai.ai> (raw)
In-Reply-To: <nycvar.YFH.7.77.849.2306150922480.4723@jbgna.fhfr.qr>

[-- Attachment #1: Type: text/plain, Size: 4183 bytes --]

Hi, Richi. I have sent the first splitted patch (only add ifn and optabs) as you suggested.
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621874.html 
Could you take a look at it?
After this patch is approved, I will send the second patch (Support them into vectorizer) next.

Thanks!


juzhe.zhong@rivai.ai
 
From: Richard Biener
Date: 2023-06-15 17:52
To: Robin Dapp
CC: juzhe.zhong@rivai.ai; gcc-patches; richard.sandiford; krebbel; uweigand; linkw
Subject: Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control
On Thu, 15 Jun 2023, Robin Dapp wrote:
 
> > Meh, PoP is now behind a paywall, trying to get through ... I wonder
> > if there's a nice online html documenting the s390 len_load/store
> > instructions to better understand the need for the bias.
> 
> https://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> 
> Look for vector load with length (store).  The length operand specifies
> the highest bytes to load instead of the actual length.
 
Hmm.  It indeed cannot represent len == 0, so you are making sure
that never happens?  Because when it is actually zero you are
going to get -1 here?  At least I don't see the bias operand used at
all:
 
; Implement len_load/len_store optabs with vll/vstl.
(define_expand "len_load_v16qi"
  [(match_operand:V16QI 0 "register_operand")
   (match_operand:V16QI 1 "memory_operand")
   (match_operand:QI 2 "register_operand")
   (match_operand:QI 3 "vll_bias_operand")
  ]
  "TARGET_VX && TARGET_64BIT"
{
  rtx mem = adjust_address (operands[1], BLKmode, 0);
 
  rtx len = gen_reg_rtx (SImode);
  emit_move_insn (len, gen_rtx_ZERO_EXTEND (SImode, operands[2]));
  emit_insn (gen_vllv16qi (operands[0], len, mem));
  DONE;
})
 
the docs of len_load say
 
"
@cindex @code{len_load_@var{m}} instruction pattern
@item @samp{len_load_@var{m}}
Load (operand 2 - operand 3) elements from vector memory operand 1
into vector register operand 0, setting the other elements of
operand 0 to undefined values.  Operands 0 and 1 have mode @var{m},
which must be a vector mode.  Operand 2 has whichever integer mode the
target prefers.  Operand 3 conceptually has mode @code{QI}. 
 
Operand 2 can be a variable or a constant amount.  Operand 3 specifies a
constant bias: it is either a constant 0 or a constant -1.  The predicate 
on
operand 3 must only accept the bias values that the target actually 
supports.
GCC handles a bias of 0 more efficiently than a bias of -1.
 
If (operand 2 - operand 3) exceeds the number of elements in mode
@var{m}, the behavior is undefined.
 
If the target prefers the length to be measured in bytes rather than
elements, it should only implement this pattern for vectors of @code{QI}
elements."
 
the minus in 'operand 2 - operand 3' should be a plus if the
bias is really zero or -1.  I suppose
 
'If (operand 2 - operand 3) exceeds the number of elements in mode
@var{m}, the behavior is undefined.'
 
means that the vectorizer has to make sure the biased element
count never underflows?
 
That is, for a loop like
 
void foo (double *x, float *y, int n)
{
  for (int i = 0; i < n; ++i)
    y[i] = x[i];
}
 
you should get
 
   x1 = len_load (...);
   x2 = len_load (...);
   y = VEC_PACK_TRUNC_EXPR <x1, x2>
   len_store (..., y);
 
but then the x2 load can end up with a len of zero and thus
trap (since you will load either a full vector or the first
byte of it).  I see you do
 
  /* If the backend requires a bias of -1 for LEN_LOAD, we must not emit
     len_loads with a length of zero.  In order to avoid that we prohibit
     more than one loop length here.  */
  if (partial_load_bias == -1
      && LOOP_VINFO_LENS (loop_vinfo).length () > 1)
    return false;
 
that's quite conservative.  I think you can do better when the
loads are aligned, reading an extra byte when ignoring the bias
is OK and you at least know the very first element is used.
For stores you would need to emit compare&jump for all but
the first store of a group though ...
 
That said, I'm still not seeing where you actually apply the bias.
 
Richard.
 

  reply	other threads:[~2023-06-15 10:08 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-12  4:14 juzhe.zhong
2023-06-15  8:06 ` Richard Biener
2023-06-15  8:47   ` juzhe.zhong
2023-06-15  8:58     ` Robin Dapp
2023-06-15  9:01       ` juzhe.zhong
2023-06-15  9:15       ` Richard Biener
2023-06-15  9:18         ` Robin Dapp
2023-06-15  9:20           ` Robin Dapp
2023-06-15  9:52           ` Richard Biener
2023-06-15 10:08             ` juzhe.zhong [this message]
2023-06-15 10:10             ` Robin Dapp
2023-06-15 11:12               ` Richard Biener
     [not found]   ` <2023061516473052116877@rivai.ai>
2023-06-15  8:56     ` juzhe.zhong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA0493E6B5090759+2023061518084459896088@rivai.ai \
    --to=juzhe.zhong@rivai.ai \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=krebbel@linux.ibm.com \
    --cc=linkw@linux.ibm.com \
    --cc=rdapp.gcc@gmail.com \
    --cc=rguenther@suse.de \
    --cc=richard.sandiford@arm.com \
    --cc=uweigand@de.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).