public inbox for gcc-patches@gcc.gnu.org
From: Kito Cheng <kito.cheng@gmail.com>
To: "Li, Pan2" <pan2.li@intel.com>
Cc: "Richard Biener" <rguenther@suse.de>, 钟居哲 <juzhe.zhong@rivai.ai>,
	"richard.sandiford" <richard.sandiford@arm.com>,
	"Jeff Law" <jeffreyalaw@gmail.com>,
	gcc-patches <gcc-patches@gcc.gnu.org>,
	palmer <palmer@dabbelt.com>, jakub <jakub@redhat.com>
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit
Date: Sat, 6 May 2023 09:53:00 +0800
Message-ID: <CA+yXCZAMTJT1iSenk7aCe=vtOcVQHJQg13p4KEFLnrxNa88AYg@mail.gmail.com>
In-Reply-To: <MW5PR11MB5908EE9D482C3D7038B84851A9739@MW5PR11MB5908.namprd11.prod.outlook.com>

Hi Pan:

Could you try to apply the following diff and measure again? This
makes tree_type_common size unchanged.


sizeof tree_type_common= 128 (mode = 8 bit)
sizeof tree_type_common= 136 (mode = 16 bit)
sizeof tree_type_common= 128 (mode = 8 bit w/ this diff)

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index af795aa81f98..b8ccfa407ed9 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1680,6 +1680,8 @@ struct GTY(()) tree_type_common {
  tree attributes;
  unsigned int uid;

+  ENUM_BITFIELD(machine_mode) mode : 16;
+
  unsigned int precision : 10;
  unsigned no_force_blk_flag : 1;
  unsigned needs_constructing_flag : 1;
@@ -1687,7 +1689,6 @@ struct GTY(()) tree_type_common {
  unsigned restrict_flag : 1;
  unsigned contains_placeholder_bits : 2;

-  ENUM_BITFIELD(machine_mode) mode : 16;

  /* TYPE_STRING_FLAG for INTEGER_TYPE and ARRAY_TYPE.
     TYPE_CXX_ODR_P for RECORD_TYPE and UNION_TYPE.  */
@@ -1712,7 +1713,7 @@ struct GTY(()) tree_type_common {
  unsigned empty_flag : 1;
  unsigned indivisible_p : 1;
  unsigned no_named_args_stdarg_p : 1;
-  unsigned spare : 15;
+  unsigned spare : 7;

  alias_set_type alias_set;
  tree pointer_to;
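
The size effect can be reproduced outside GCC with a minimal sketch. The
three structs below are simplified stand-ins, not the real
tree_type_common: the names mode8, mode16, mode16_repacked, flags_a,
flags_b and flags_c and all field widths are made up for illustration.
Bit-field layout is implementation-defined, so this only shows the
behaviour expected on common ABIs, where an unsigned bit-field may not
straddle a 32-bit storage unit.

```c
#include <stddef.h>

/* 8-bit mode: the bit-fields fill one 32-bit unit (10+6+8+8) and
   31 bits of a second one (16+15).  */
struct mode8 {
  void *attributes;
  unsigned int uid;
  unsigned precision : 10;
  unsigned flags_a : 6;
  unsigned mode : 8;
  unsigned flags_b : 8;
  unsigned flags_c : 16;
  unsigned spare : 15;
  int alias_set;
  void *pointer_to;
};

/* 16-bit mode with spare left alone: 10+6+16 fills the first unit,
   then 8+16+15 = 39 bits no longer fit in the second one, so spare
   is pushed into a third unit.  */
struct mode16 {
  void *attributes;
  unsigned int uid;
  unsigned precision : 10;
  unsigned flags_a : 6;
  unsigned mode : 16;
  unsigned flags_b : 8;
  unsigned flags_c : 16;
  unsigned spare : 15;
  int alias_set;
  void *pointer_to;
};

/* 16-bit mode moved in front of precision and spare cut by 8 bits:
   16+10+6 = 32 and 8+16+7 = 31, two units again.  */
struct mode16_repacked {
  void *attributes;
  unsigned int uid;
  unsigned mode : 16;
  unsigned precision : 10;
  unsigned flags_a : 6;
  unsigned flags_b : 8;
  unsigned flags_c : 16;
  unsigned spare : 7;
  int alias_set;
  void *pointer_to;
};

/* Small accessors so the three sizes can be printed or compared.  */
size_t size_mode8 (void) { return sizeof (struct mode8); }
size_t size_mode16 (void) { return sizeof (struct mode16); }
size_t size_mode16_repacked (void) { return sizeof (struct mode16_repacked); }
```

Printing the three values with %zu should show the middle struct larger
than the other two, which should match each other. The absolute numbers
depend on the target ABI.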

On Sat, May 6, 2023 at 9:10 AM Li, Pan2 via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Yes, I totally agree that the numbers can only be accurate up to a point. Here are the corresponding bytes allocated for the x86 target.
>
> Bytes allocated with O2:
> --------------------------------------------------------------
> Benchmark       | upstream      | with this PATCH
> --------------------------------------------------------------
> 400.perlbench   | 25286185160   | 25176544846 ~0.0%
> 401.bzip2       | 1429883731    | 1391040027 -2.7%
> 403.gcc         | 55023568981   | 54798890746 ~0.0%
> 429.mcf         | 1360975660    | 1321537710 -2.9%
> 445.gobmk       | 12791636502   | 12666523431 -1.0%
> 456.hmmer       | 9354433652    | 9279189174 ~0.0%
> 458.sjeng       | 1991260562    | 1944031904 -2.4%
> 462.libquantum  | 1725112078    | 1684213981 -2.4%
> 464.h264ref     | 8597673515    | 8528855778 ~0.0%
> 471.omnetpp     | 37613034778   | 37432278047 ~0.0%
> 473.astar       | 3817295518    | 3772460508 -1.2%
> 483.xalancbmk   | 149418776991  | 148545162207 ~0.0%
>
> Bytes allocated with Ofast + funroll-loops:
> --------------------------------------------------------------
> Benchmark       | upstream      | with this PATCH
> --------------------------------------------------------------
> 400.perlbench   | 30438407499   | 30574152897 ~0.0%
> 401.bzip2       | 2277114519    | 2319432664 +1.9%
> 403.gcc         | 64499664264   | 64781232731 ~0.0%
> 429.mcf         | 1361486758    | 1399942116 +2.8%
> 445.gobmk       | 15258056111   | 15396801542 +1.0%
> 456.hmmer       | 10896615649   | 10936223486 ~0.0%
> 458.sjeng       | 2592620709    | 2641687496 +1.9%
> 462.libquantum  | 1814487525    | 1854518500 +2.2%
> 464.h264ref     | 13528736878   | 13614517066 ~0.0%
> 471.omnetpp     | 38721066702   | 38910524667 ~0.0%
> 473.astar       | 3924015756    | 3968057027 +1.1%
> 483.xalancbmk   | 165897692838  | 166843885880 ~0.0%
>
> Pan
>
>
> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Friday, May 5, 2023 2:25 PM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: 钟居哲 <juzhe.zhong@rivai.ai>; kito.cheng <kito.cheng@gmail.com>; richard.sandiford <richard.sandiford@arm.com>; Jeff Law <jeffreyalaw@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; palmer <palmer@dabbelt.com>; jakub <jakub@redhat.com>
> Subject: RE: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit
>
> On Fri, 5 May 2023, Li, Pan2 wrote:
>
> > I tried memory profiling with valgrind --tool=memcheck --trace-children=yes for this change, targeting the SPEC 2006 INT part with rv64gcv. Note that we only count the bytes allocated from valgrind log lines like this: "==2832896==   total heap usage: 208 allocs, 165 frees, 123,204 bytes allocated".
> >
> > Considering some variance in valgrind, it looks like the impact on
> > bytes allocated may be limited. However, I am still running this for
> > x86; it will take more than 30 hours for each iteration...
>
> I'm not sure I'd call +- 7% on memory use "limited" - but I fear the numbers are off.  Note since various structures reside in GC memory there's also changes to GC overhead and fragmentation, so precise measurements are difficult.
>
> Richard.
>
> > RISC-V GCC Version:
> > >> ~/bin/test-gnu-8-bits/bin/riscv64-unknown-linux-gnu-gcc --version
> > riscv64-unknown-linux-gnu-gcc (gd7cb9720ed5) 14.0.0 20230503
> > (experimental) Copyright (C) 2023 Free Software Foundation, Inc.
> > This is free software; see the source for copying conditions.  There
> > is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> >
> > Bytes allocated with O2:
> > --------------------------------------------------------------
> > Benchmark       | upstream      | with this PATCH
> > --------------------------------------------------------------
> > 400.perlbench   | 29699642875   | 29949876269 ~0.0%
> > 401.bzip2       | 1641041659    | 1755563972 +6.95%
> > 403.gcc         | 68447500516   | 68900883291 ~0.0%
> > 429.mcf         | 1433156462    | 1433253373 ~0.0%
> > 445.gobmk       | 14239225210   | 14463438465 ~0.0%
> > 456.hmmer       | 9635955623    | 9808534948 +1.8%
> > 458.sjeng       | 2419478204    | 2545478940 +5.4%
> > 462.libquantum  | 1686404489    | 1800884197 +6.8%
> > 464.h264ref     | 10190413900   | 10351134161 +1.6%
> > 471.omnetpp     | 40814627684   | 41185864529 ~0.0%
> > 473.astar       | 3807097529    | 3928428183 +3.2%
> > 483.xalancbmk   | 152959418167  | 154201738843 ~0.0%
> >
> > Bytes allocated with Ofast + funroll-loops:
> > --------------------------------------------------------------
> > Benchmark       | upstream      | with this PATCH
> > --------------------------------------------------------------
> > 400.perlbench   | 39491184733   | 39223020267 ~0.0%
> > 401.bzip2       | 2843871517    | 2730383463 ~0%
> > 403.gcc         | 84195991898   | 83730632955 -4.0%
> > 429.mcf         | 1481381164    | 1367309565 -7.7%
> > 445.gobmk       | 20123943663   | 19886116394 -1.2%
> > 456.hmmer       | 12302445139   | 12121745383 -1.5%
> > 458.sjeng       | 3884712615    | 3755481930 -3.3%
> > 462.libquantum  | 1966619940    | 1852274342 -5.8%
> > 464.h264ref     | 19219365552   | 19050288201 ~0.0%
> > 471.omnetpp     | 45701008325   | 45327805079 ~0.0%
> > 473.astar       | 4118600354    | 3995943705 -3.0%
> > 483.xalancbmk   | 179481305182  | 178160306301 ~0.0%
> >
> > Pan
> >
> >
> > -----Original Message-----
> > From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of 钟居哲
> > Sent: Thursday, April 13, 2023 7:23 AM
> > To: kito.cheng <kito.cheng@gmail.com>; rguenther <rguenther@suse.de>
> > Cc: richard.sandiford <richard.sandiford@arm.com>; Jeff Law
> > <jeffreyalaw@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; palmer
> > <palmer@dabbelt.com>; jakub <jakub@redhat.com>
> > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> > 8-bit to 16-bit
> >
> > Yeah, as Kito said.
> > It turns out that the tuple type model in ARM SVE is the optimal solution for RVV.
> > And we like the ARM SVE style implementation.
> >
> > And now we see that swapping rtx_code and mode in rtx_def keeps rtx_def overall within 64 bits.
> > But it seems that there is still a problem in tree_type_common and tree_decl_common, is that right?
> >
> > After several tries (removing all redundant TI/TF vector modes and FP16 vector modes), there are now 252 modes in the RISC-V port. Basically, I can keep supporting the recent new RVV intrinsics features for now.
> > However, we can't support more in the future, for example FP16 vector, BF16 vector, matrix modes, VLS modes, etc.
> >
> > From the RVV side, I think extending the machine mode by 1 more bit should be enough for RVV (512 modes overall).
> > Is it possible to make that happen in tree_type_common and tree_decl_common, Richards?
> >
> > Thank you so much for all comments.
> >
> >
> > juzhe.zhong@rivai.ai
> >
> > From: Kito Cheng
> > Date: 2023-04-12 17:31
> > To: Richard Biener
> > CC: juzhe.zhong@rivai.ai; richard.sandiford; jeffreyalaw; gcc-patches;
> > palmer; jakub
> > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> > 8-bit to 16-bit
> > > > The concept of fractional LMUL is the same as the concept of
> > > > AArch64's partial SVE vectors, so they can only access the lowest
> > > > part, like SVE's partial vector.
> > > >
> > > > We want to spill/restore the exact size of those modes (1/2, 1/4,
> > > > 1/8), so adding dedicated modes for those partial vector modes
> > > > should be unavoidable IMO.
> > > >
> > > > And even if we use sub-vector, we still need to define those
> > > > partial vector types.
> > >
> > > Could you use integer modes for the fractional vectors?
> >
> > You mean using the scalar integer mode like using (subreg:SI
> > (reg:VNx4SI) 0) to represent
> > LMUL=1/4?
> > (Assume VNx4SI is mode for M1)
> >
> > If so I think it might not be able to model that right - it seems like we are using 32-bits but actually we are using poly_int16(1, 1) * 32 bits.
> >
> > > For computation you can always appropriately limit the LEN?
> >
> > RVV provides the zvl*b extensions, i.e. zvl<N>b (e.g. zvl128b or
> > zvl256b), to guarantee that the vector length is at least N bits, but
> > that only guarantees the minimal length, just as SVE guarantees that
> > the minimal vector length is 128 bits.
> >
> >
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)
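
The measurement procedure quoted above (take the "bytes allocated"
figure from each valgrind summary line, then compare the two builds) can
be sketched roughly as follows. This is only an illustration: the helper
names bytes_allocated and percent_change are made up here, and the
parser assumes the summary line has exactly the shape quoted in the
thread.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Extract N from a line of the form
   "==PID==   total heap usage: A allocs, F frees, N bytes allocated".
   N may contain thousands separators.  Returns 0 if the line does not
   look like a heap-usage summary.  */
unsigned long long
bytes_allocated (const char *line)
{
  static const char marker[] = " frees, ";
  const char *p = strstr (line, marker);

  if (p == NULL || strstr (p, " bytes allocated") == NULL)
    return 0;

  p += strlen (marker);

  unsigned long long n = 0;
  for (; isdigit ((unsigned char) *p) || *p == ','; p++)
    if (*p != ',')
      n = n * 10 + (unsigned long long) (*p - '0');
  return n;
}

/* Relative change from BEFORE to AFTER, in percent.  */
double
percent_change (unsigned long long before, unsigned long long after)
{
  return 100.0 * ((double) after - (double) before) / (double) before;
}
```

With --trace-children=yes every child process writes its own summary
line, so a driver would sum bytes_allocated over all lines of a log
before calling percent_change on the two totals.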
