From: Kito Cheng
Date: Sat, 6 May 2023 09:53:00 +0800
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit
To: "Li, Pan2"
Cc: Richard Biener, 钟居哲, "richard.sandiford", Jeff Law, gcc-patches, palmer, jakub

Hi Pan:

Could you try to apply the following diff and measure again? This makes
tree_type_common size unchanged.

sizeof tree_type_common = 128 (mode = 8 bit)
sizeof tree_type_common = 136 (mode = 16 bit)
sizeof tree_type_common = 128 (mode = 16 bit w/ this diff)

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index af795aa81f98..b8ccfa407ed9 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1680,6 +1680,8 @@ struct GTY(()) tree_type_common {
   tree attributes;
   unsigned int uid;
 
+  ENUM_BITFIELD(machine_mode) mode : 16;
+
   unsigned int precision : 10;
   unsigned no_force_blk_flag : 1;
   unsigned needs_constructing_flag : 1;
@@ -1687,7 +1689,6 @@ struct GTY(()) tree_type_common {
   unsigned restrict_flag : 1;
   unsigned contains_placeholder_bits : 2;
 
-  ENUM_BITFIELD(machine_mode) mode : 16;
 
   /* TYPE_STRING_FLAG for INTEGER_TYPE and ARRAY_TYPE.
      TYPE_CXX_ODR_P for RECORD_TYPE and UNION_TYPE.  */
@@ -1712,7 +1713,7 @@ struct GTY(()) tree_type_common {
   unsigned empty_flag : 1;
   unsigned indivisible_p : 1;
   unsigned no_named_args_stdarg_p : 1;
-  unsigned spare : 15;
+  unsigned spare : 7;
   alias_set_type alias_set;
   tree pointer_to;
 

On Sat, May 6, 2023 at 9:10 AM Li, Pan2 via Gcc-patches wrote:
>
> Yes, I totally agree that the numbers can only be accurate up to a point. Here are the corresponding bytes allocated for the x86 target.
>
> Bytes allocated with O2:
> --------------------------------------------------------
> Benchmark      | upstream     | with this PATCH
> --------------------------------------------------------
> 400.perlbench  | 25286185160  | 25176544846   ~0.0%
> 401.bzip2      | 1429883731   | 1391040027    -2.7%
> 403.gcc        | 55023568981  | 54798890746   ~0.0%
> 429.mcf        | 1360975660   | 1321537710    -2.9%
> 445.gobmk      | 12791636502  | 12666523431   -1.0%
> 456.hmmer      | 9354433652   | 9279189174    ~0.0%
> 458.sjeng      | 1991260562   | 1944031904    -2.4%
> 462.libquantum | 1725112078   | 1684213981    -2.4%
> 464.h264ref    | 8597673515   | 8528855778    ~0.0%
> 471.omnetpp    | 37613034778  | 37432278047   ~0.0%
> 473.astar      | 3817295518   | 3772460508    -1.2%
> 483.xalancbmk  | 149418776991 | 148545162207  ~0.0%
>
> Bytes allocated with Ofast + funroll-loops:
> --------------------------------------------------------
> Benchmark      | upstream     | with this PATCH
> --------------------------------------------------------
> 400.perlbench  | 30438407499  | 30574152897   ~0.0%
> 401.bzip2      | 2277114519   | 2319432664    +1.9%
> 403.gcc        | 64499664264  | 64781232731   ~0.0%
> 429.mcf        | 1361486758   | 1399942116    +2.8%
> 445.gobmk      | 15258056111  | 15396801542   +1.0%
> 456.hmmer      | 10896615649  | 10936223486   ~0.0%
> 458.sjeng      | 2592620709   | 2641687496    +1.9%
> 462.libquantum | 1814487525   | 1854518500    +2.2%
> 464.h264ref    | 13528736878  | 13614517066   ~0.0%
> 471.omnetpp    | 38721066702  | 38910524667   ~0.0%
> 473.astar      | 3924015756   | 3968057027    +1.1%
> 483.xalancbmk  | 165897692838 | 166843885880  ~0.0%
>
> Pan
>
>
> -----Original Message-----
> From: Richard Biener
> Sent: Friday, May 5, 2023 2:25 PM
> To: Li, Pan2
> Cc: 钟居哲; kito.cheng; richard.sandiford; Jeff Law <jeffreyalaw@gmail.com>; gcc-patches; palmer; jakub
> Subject: RE: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit
>
> On Fri, 5 May 2023, Li, Pan2 wrote:
>
> > I tried memory profiling with valgrind --tool=memcheck --trace-children=yes for this change, targeting the SPEC 2006 INT part with rv64gcv. Note that we only count the bytes allocated as reported in the valgrind log, e.g. "==2832896== total heap usage: 208 allocs, 165 frees, 123,204 bytes allocated".
> >
> > Allowing for some variance in valgrind, it looks like the impact on bytes
> > allocated may be limited. However, I am still running this for x86; it
> > will take more than 30 hours for each iteration...
>
> I'm not sure I'd call +- 7% on memory use "limited" - but I fear the numbers are off. Note that since various structures reside in GC memory there are also changes to GC overhead and fragmentation, so precise measurements are difficult.
>
> Richard.
>
> > RISC-V GCC Version:
> > >> ~/bin/test-gnu-8-bits/bin/riscv64-unknown-linux-gnu-gcc --version
> > riscv64-unknown-linux-gnu-gcc (gd7cb9720ed5) 14.0.0 20230503
> > (experimental) Copyright (C) 2023 Free Software Foundation, Inc.
> > This is free software; see the source for copying conditions. There
> > is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> >
> > Bytes allocated with O2:
> > --------------------------------------------------------
> > Benchmark      | upstream     | with this PATCH
> > --------------------------------------------------------
> > 400.perlbench  | 29699642875  | 29949876269   ~0.0%
> > 401.bzip2      | 1641041659   | 1755563972    +6.95%
> > 403.gcc        | 68447500516  | 68900883291   ~0.0%
> > 429.mcf        | 1433156462   | 1433253373    ~0.0%
> > 445.gobmk      | 14239225210  | 14463438465   ~0.0%
> > 456.hmmer      | 9635955623   | 9808534948    +1.8%
> > 458.sjeng      | 2419478204   | 2545478940    +5.4%
> > 462.libquantum | 1686404489   | 1800884197    +6.8%
> > 464.h264ref    | 10190413900  | 10351134161   +1.6%
> > 471.omnetpp    | 40814627684  | 41185864529   ~0.0%
> > 473.astar      | 3807097529   | 3928428183    +3.2%
> > 483.xalancbmk  | 152959418167 | 154201738843  ~0.0%
> >
> > Bytes allocated with Ofast + funroll-loops:
> > --------------------------------------------------------
> > Benchmark      | upstream     | with this PATCH
> > --------------------------------------------------------
> > 400.perlbench  | 39491184733  | 39223020267   ~0.0%
> > 401.bzip2      | 2843871517   | 2730383463    ~0%
> > 403.gcc        | 84195991898  | 83730632955   -4.0%
> > 429.mcf        | 1481381164   | 1367309565    -7.7%
> > 445.gobmk      | 20123943663  | 19886116394   -1.2%
> > 456.hmmer      | 12302445139  | 12121745383   -1.5%
> > 458.sjeng      | 3884712615   | 3755481930    -3.3%
> > 462.libquantum | 1966619940   | 1852274342    -5.8%
> > 464.h264ref    | 19219365552  | 19050288201   ~0.0%
> > 471.omnetpp    | 45701008325  | 45327805079   ~0.0%
> > 473.astar      | 4118600354   | 3995943705    -3.0%
> > 483.xalancbmk  | 179481305182 | 178160306301  ~0.0%
> >
> > Pan
> >
> >
> > -----Original Message-----
> > From: Gcc-patches On Behalf Of 钟居哲
> > Sent: Thursday, April 13, 2023 7:23 AM
> > To: kito.cheng; rguenther
> > Cc: richard.sandiford; Jeff Law; gcc-patches; palmer; jakub
> > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> > 8-bit to 16-bit
> >
> > Yeah, like Kito said.
> > It turns out the tuple type model in ARM SVE is the optimal solution for RVV.
> > And we like the ARM SVE style implementation.
> >
> > And now we see that swapping rtx_code and mode in rtx_def keeps rtx_def overall within 64 bits.
> > But it seems that there is still a problem in tree_type_common and tree_decl_common, is that right?
> >
> > After several tries (removing all redundant TI/TF vector modes and the FP16 vector mode), there are now 252 modes in the RISC-V port. Basically, I can keep supporting the recent new RVV intrinsics features.
> > However, we can't support more in the future, for example FP16 vector, BF16 vector, matrix modes, VLS modes, etc.
> >
> > From the RVV side, I think extending the machine mode by 1 more bit should be enough for RVV (512 modes overall).
> > Is it possible to make it happen in tree_type_common and tree_decl_common, Richards?
> >
> > Thank you so much for all comments.
> >
> >
> > juzhe.zhong@rivai.ai
> >
> > From: Kito Cheng
> > Date: 2023-04-12 17:31
> > To: Richard Biener
> > CC: juzhe.zhong@rivai.ai; richard.sandiford; jeffreyalaw; gcc-patches;
> > palmer; jakub
> > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> > 8-bit to 16-bit
> > > > The concept of fractional LMUL is the same as the concept of
> > > > AArch64's partial SVE vectors, so they can only access the lowest
> > > > part, like SVE's partial vector.
> > > >
> > > > We want to spill/restore the exact size of those modes (1/2, 1/4,
> > > > 1/8), so adding dedicated modes for those partial vector modes
> > > > should be unavoidable IMO.
> > > >
> > > > And even if we use sub-vector, we still need to define those
> > > > partial vector types.
> > >
> > > Could you use integer modes for the fractional vectors?
> >
> > You mean using a scalar integer mode, like using (subreg:SI
> > (reg:VNx4SI) 0) to represent
> > LMUL=1/4?
> > (Assume VNx4SI is the mode for M1)
> >
> > If so, I think it might not be able to model that right - it looks like we are using 32 bits, but actually we are using poly_int16(1, 1) * 32 bits.
> >
> > > For computation you can always appropriately limit the LEN?
> >
> > RVV provides the zvl*b extensions (e.g. zvl128b or zvl256b) to
> > guarantee that the vector length is at least N bits, but that only
> > guarantees the minimal length, like SVE guarantees that the minimal
> > vector length is 128 bits.
> >
> >
>
> --
> Richard Biener
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)