public inbox for gcc-patches@gcc.gnu.org
From: Kito Cheng <kito.cheng@gmail.com>
To: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
Cc: "richard.sandiford" <richard.sandiford@arm.com>,
	rguenther <rguenther@suse.de>,
	 jeffreyalaw <jeffreyalaw@gmail.com>,
	gcc-patches <gcc-patches@gcc.gnu.org>,
	 palmer <palmer@dabbelt.com>, jakub <jakub@redhat.com>
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit
Date: Tue, 11 Apr 2023 21:50:40 +0800	[thread overview]
Message-ID: <CA+yXCZCUcChS_GbuHETPy6R3rgJAAMRHGz1LXYcwnoS-EOFXZg@mail.gmail.com> (raw)
In-Reply-To: <0AEFD2378C3DF89B+202304111919556577872@rivai.ai>

Let me give more explanation of why the RISC-V vector extension needs so many more modes than AArch64.

The following will use "RVV" as an abbreviation for "RISC-V Vector"
instructions.

There are two key points here:

- RVV has a concept called LMUL - you can think of it as register
grouping: we can group up to 8 adjacent registers together and then
operate on them at once, e.g. one vadd can add two 8-register groups
in a single instruction.
- We have segment load/store instructions that require vector tuple
types; AArch64 has similar constructs in both Neon and SVE, e.g.
int32x2x2_t or svint32x2_t.

In order to model LMUL in the backend, we have to model the
combination of scalar type and LMUL; the possible LMUL values are 1,
2, 4, 8, 1/2, 1/4 and 1/8 - 7 different LMUL settings - and we have
QI, HI, SI, DI, HF, SF and DF, so basically we'll have 7 (LMUL) * 7
(scalar type) combinations here.
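To make the counting above concrete, here is a small sketch (purely illustrative helper names, not GCC code) that enumerates the 7 LMUL settings against the 7 scalar modes:

```python
from fractions import Fraction

# The 7 possible LMUL settings (register-group multipliers).
LMULS = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 2),
         Fraction(1), Fraction(2), Fraction(4), Fraction(8)]

# The 7 scalar element modes mentioned above.
SCALARS = ["QI", "HI", "SI", "DI", "HF", "SF", "DF"]

# In principle each (LMUL, scalar mode) pair needs its own
# vector machine mode.
combos = [(lmul, s) for lmul in LMULS for s in SCALARS]
print(len(combos))  # 49
```

The real backend drops the combinations the ISA doesn't support, so 49 is only an upper bound on the variable-length vector mode count.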

Okay, let's talk about tuple types. AArch64 also has tuple types, so
why doesn't it need such a huge number of modes? The difference is
mainly caused by LMUL; let's use a concrete example to show why LMUL
leads to a different machine mode design, using scalable vector modes
with SI-mode tuples here:

AArch64: svint32_t (VNx4SI), svint32x2_t (VNx8SI), svint32x3_t
(VNx12SI), svint32x4_t (VNx16SI)

AArch64 only has up to 4-tuples, but RISC-V can have up to 8-tuples,
so we already have 8 different types for each scalar mode even before
we count the LMUL concept.

RISC-V*: vint32m1_t (VNx4SI), vint32m1x2_t (VNx8SI), vint32m1x3_t
(VNx12SI), vint32m1x4_t (VNx16SI), vint32m1x5_t (VNx20SI),
vint32m1x6_t (VNx24SI), vint32m1x7_t (VNx28SI), vint32m1x8_t
(VNx32SI)

(* Using VLEN=128 as the base type system; you can ignore this if you
don't understand its meaning for now.)

Now let's consider LMUL and add the LMUL=2 case. RVV has a constraint
that LMUL * NF (for an NF-tuple) must be less than or equal to 8, so
we have only 3 extra tuple modes for LMUL=2:

RISC-V*: vint32m2_t (VNx8SI), vint32m2x2_t (VNx16SI), vint32m2x3_t
(VNx24SI), vint32m2x4_t (VNx32SI)
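The LMUL * NF <= 8 rule above can be sketched as a one-line predicate (hypothetical helper name, not backend code) to see how many tuple modes each LMUL contributes:

```python
from fractions import Fraction

def legal_nfs(lmul):
    """Legal tuple sizes NF for a given LMUL under the RVV rule
    LMUL * NF <= 8 (NF ranges from 2 to 8)."""
    return [nf for nf in range(2, 9) if Fraction(lmul) * nf <= 8]

print(legal_nfs(1))  # [2, 3, 4, 5, 6, 7, 8] -> 7 tuple modes, x2..x8
print(legal_nfs(2))  # [2, 3, 4]             -> the 3 extra LMUL=2 modes
print(legal_nfs(4))  # [2]                   -> only x2 fits for LMUL=4
```

Fractional LMUL values (1/2, 1/4, 1/8) allow all of x2..x8, which is how the tuple mode count grows well past what AArch64 needs.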

However, there is a big problem: RVV has different register
constraints for different LMUL types. Types with LMUL <= 1 can use any
register, LMUL=2 types require the register number to be a multiple of
2 (v0, v2, …), and LMUL=4 types require it to be a multiple of 4 (v0,
v4, …).

So vint32m1x2_t (LMUL=1, NF=2) and vint32m2_t (LMUL=2) have the same
size and NUNITS, but they have different register constraints:
vint32m1x2_t is LMUL 1, so it has no register constraint, while
vint32m2_t is LMUL 2, so its register number must be a multiple of 2.

For the above reason, those tuple types must have separate machine
modes even if they have the same size and NUNITS.

Why don't Neon and SVE have this issue? Because SVE and Neon have no
concept of LMUL, tuple types in SVE and Neon can never produce two
vector types with the same size but different register constraints or
alignment - one size is one type.
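The register constraint that forces the extra modes can be sketched as a tiny predicate (hypothetical; the real check lives in the RISC-V backend's hard-register hooks, not in this form):

```python
def regno_ok_for_lmul(regno, lmul):
    """Sketch of the RVV register constraint: a register group with
    LMUL > 1 must start at a register number that is a multiple of
    LMUL; LMUL <= 1 types can use any of v0..v31."""
    if lmul <= 1:
        return True
    return regno % lmul == 0

# vint32m2_t (LMUL=2) may start at v0, v2, ... but not v1:
print(regno_ok_for_lmul(2, 2), regno_ok_for_lmul(1, 2))  # True False
# vint32m1x2_t is two independent LMUL=1 registers, so any start works,
# which is why the two types need distinct machine modes despite
# having identical size and NUNITS.
print(regno_ok_for_lmul(1, 1))  # True
```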

So, given LMUL and the register constraint issue for tuple types, we
need 37 modes for vector tuples, plus 48 variable-length vector modes
and 42 scalar modes - about 140 modes so far. That still sounds like
less than 256, so what happened?


RVV has one more special property in our type system due to the ISA
design: the minimal vector length (MIN_VLEN) of RVV is 32 bits, unlike
SVE, which guarantees a minimum of 128 bits. So we play a trick in our
type system: we use a different mode depending on whether MIN_VLEN is
32, 64, or greater than or equal to 128. This design is friendlier for
the vectorizer and also models things precisely for better code
generation.

e.g.

vint32m1_t is VNx1SI in MIN_VLEN>=32

vint32m1_t is VNx2SI in MIN_VLEN>=64

vint32m1_t is VNx4SI in MIN_VLEN>=128

So we actually have 37 * 3 modes for vector tuple modes, giving ~210
modes in total (the result is a little different from JuZhe's number
because I ignore some modes that aren't used in C but are still
defined as machine modes, since current GCC always defines all
possible scalar modes for a vector mode).
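For bookkeeping, the arithmetic behind the two totals quoted in this mail looks like this (the mail's "~140" and "~210" are rounded figures, presumably covering a few modes not itemized here):

```python
TUPLE_MODES = 37        # vector tuple modes (LMUL x NF combinations)
VECTOR_MODES = 48       # variable-length vector modes
SCALAR_MODES = 42       # scalar modes GCC defines anyway
MIN_VLEN_VARIANTS = 3   # separate tuple modes for MIN_VLEN 32/64/>=128

# Before accounting for MIN_VLEN variants (the "~140" figure):
without_minvlen = TUPLE_MODES + VECTOR_MODES + SCALAR_MODES
# After tripling the tuple modes (the "~210" figure):
with_minvlen = TUPLE_MODES * MIN_VLEN_VARIANTS + VECTOR_MODES + SCALAR_MODES
print(without_minvlen, with_minvlen)  # 127 201
```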

We also plan to add some traditional fixed-length vector types like
V2SI in the future… and apparently 256 modes aren't enough for that
plan :(


Thread overview: 63+ messages
2023-04-10 14:48 juzhe.zhong
2023-04-10 14:54 ` Jeff Law
2023-04-10 15:02   ` juzhe.zhong
2023-04-10 15:14   ` juzhe.zhong
2023-04-11  9:16     ` Jakub Jelinek
2023-04-11  9:46       ` juzhe.zhong
2023-04-11 10:11         ` Jakub Jelinek
2023-04-11 10:25           ` juzhe.zhong
2023-04-11 10:52             ` Jakub Jelinek
2023-04-11  9:46     ` Richard Sandiford
2023-04-11  9:59       ` Jakub Jelinek
2023-04-11 10:11         ` juzhe.zhong
2023-04-11 10:05       ` Richard Earnshaw
2023-04-11 10:15         ` Richard Sandiford
2023-04-11 10:59       ` Richard Biener
2023-04-11 11:11         ` Richard Sandiford
2023-04-11 11:19           ` juzhe.zhong
2023-04-11 13:50             ` Kito Cheng [this message]
2023-04-12  7:53               ` Richard Biener
2023-04-12  9:06                 ` Kito Cheng
2023-04-12  9:21                   ` Richard Biener
2023-04-12  9:31                     ` Kito Cheng
2023-04-12 23:22                       ` 钟居哲
2023-04-13 13:06                         ` Richard Sandiford
2023-04-13 14:02                           ` Richard Biener
2023-04-15  2:58                             ` Hans-Peter Nilsson
2023-04-17  6:38                               ` Richard Biener
2023-04-20  5:37                                 ` Hans-Peter Nilsson
2023-05-05  1:43                         ` Li, Pan2
2023-05-05  6:25                           ` Richard Biener
2023-05-06  1:10                             ` Li, Pan2
2023-05-06  1:53                               ` Kito Cheng
2023-05-06  1:59                                 ` juzhe.zhong
2023-05-06  2:12                                   ` Li, Pan2
2023-05-06  2:18                                     ` Kito Cheng
2023-05-06  2:20                                       ` Li, Pan2
2023-05-06  2:48                                         ` Li, Pan2
2023-05-07  1:55                                           ` Li, Pan2
2023-05-07 15:23                                             ` Jeff Law
2023-05-08  1:07                                               ` Li, Pan2
2023-05-08  6:29                                               ` Richard Biener
2023-05-08  6:41                                                 ` Li, Pan2
2023-05-08  6:59                                                   ` Li, Pan2
2023-05-08  7:37                                                     ` Richard Biener
2023-05-08  8:05                                                       ` Li, Pan2
2023-05-09  6:13                                                         ` Li, Pan2
2023-05-09  7:04                                                           ` Richard Biener
2023-05-09 10:16                                                         ` Richard Sandiford
2023-05-09 10:26                                                           ` Richard Biener
2023-05-09 11:50                                                             ` Li, Pan2
2023-05-10  5:09                                                               ` Li, Pan2
2023-05-10  7:22                                                                 ` Li, Pan2
2023-05-08  1:35                                         ` Li, Pan2
2023-04-10 15:18   ` Jakub Jelinek
2023-04-10 15:22     ` juzhe.zhong
2023-04-10 20:42       ` Jeff Law
2023-04-10 23:03         ` juzhe.zhong
2023-04-11  1:36         ` juzhe.zhong
     [not found]     ` <20230410232205400970205@rivai.ai>
2023-04-10 15:33       ` juzhe.zhong
2023-04-10 20:39         ` Jeff Law
2023-04-10 20:36     ` Jeff Law
2023-04-10 22:53       ` juzhe.zhong
2023-04-10 15:10 ` Jakub Jelinek
