From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl1-x635.google.com (mail-pl1-x635.google.com [IPv6:2607:f8b0:4864:20::635]) by sourceware.org (Postfix) with ESMTPS id 0B1FE3858D1E for ; Tue, 25 Apr 2023 20:20:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0B1FE3858D1E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1a92369761cso50007135ad.3 for ; Tue, 25 Apr 2023 13:20:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1682454037; x=1685046037; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=LjmDkyMs8IdtrY5yaqxZaHjKt57/6Y25R1pPJX9pC8s=; b=dodEaFKisl0gOxaGkJWx1McfAqMrkuhe2fdk/l33HEsJO6RwIsBW6cofiqyMvzl6+g zFhFOzTlVwXuD/I8WUBfgh0IMco7JWI/XOszAFDv957uhsFj1Q1tKFndK7HTgRmPSvZ8 xUseUyYKkizkgizrdyUuPV8StqD1D+6Njl56y0NEi7CwF0v8b68Sn55X4uvTXZUAsWtD 97ebyssQraHa5x+maLAhAVIGwSf4q9DxAn6I6PzS61yHchTE3QbeBfuXOqUryy0jZezk SoB1wDHS3KgiBFNFbq1KqfcIqTPqW9Gtfz1aO/dOIPW6MSwcWkEjFA36vJ8cFEyzKsMj sN4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682454037; x=1685046037; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=LjmDkyMs8IdtrY5yaqxZaHjKt57/6Y25R1pPJX9pC8s=; b=Y9EVQFDVZJPw5okThu1zFw+xEfO929jdJpt5NyVgHPCaHatPoOXehG/R3PcY3VflUn TT4clQUf5vJ8ZIWKEavgwQD69dC54X0sM7VKx9JuWS0/2sOn0UU7Juw/ll5A7rm0tLp9 YlnpVU11WLJk+JQfWMV4CCUuiLPRZSaVuQ8Nhxx6hWK83STdD30BhC1JEvfvvP4ZS4lj z5HHi/yID3BA/x34+PfV2O9JVE5yo2oQ3jp21UFLFlrUX4UzKuu3D52xHOVw60M4s/Kv 9nCTu1i+yjhAkCR9PK6jA6WovzeorIKUm+7xqP5pifbbndTTg78bqwvjAu2XBFKg447K +lIA== X-Gm-Message-State: AAQBX9d7CzJsNmAWM8Hz6tX3nkkzYP0qFkn+r6JlGBG36NBSRossiqMX ARiIHODtITGyqZpqox9PSb185A== X-Google-Smtp-Source: AKy350bscqwUZ+CNqdXidkdD7FU4POded+QjrqhjSYkPZg7a5qZVjMDkRlhw2arI+KkuBovTvkhGJg== X-Received: by 2002:a17:902:ecc4:b0:1a9:51bd:50ac with SMTP id a4-20020a170902ecc400b001a951bd50acmr18913404plh.20.1682454036803; Tue, 25 Apr 2023 13:20:36 -0700 (PDT) Received: from [10.0.17.146] ([50.221.140.188]) by smtp.gmail.com with ESMTPSA id u11-20020a170902b28b00b001a1d41d1b8asm8634546plr.194.2023.04.25.13.20.35 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 25 Apr 2023 13:20:36 -0700 (PDT) Message-ID: Date: Tue, 25 Apr 2023 13:20:35 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0 Subject: Re: [Committed] riscv: relax splitter restrictions for creating pseudos Content-Language: en-US To: Jeff Law , gcc-patches@gcc.gnu.org Cc: kito.cheng@gmail.com, Palmer Dabbelt , Philipp Tomsich , Christoph Mullner , gnu-toolchain@rivosinc.co References: <20230418143635.980594-1-vineetg@rivosinc.com> <680c7bbe-5d6e-07cd-8468-247afc65e1dd@gmail.com> <3d4cda57-e0f0-9861-6969-f31f22ca9ef8@gmail.com> From: Vineet Gupta In-Reply-To: <3d4cda57-e0f0-9861-6969-f31f22ca9ef8@gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 4/25/23 13:03, Jeff Law wrote: > > > On 4/25/23 11:25, Vineet Gupta wrote: >> >> >> On 4/18/23 11:36, Jeff Law via Gcc-patches wrote: >>> >>> >>> On 4/18/23 08:36, Vineet Gupta wrote: >>>> [partial addressing of PR/109279] >>>> >>>> RISCV splitters have restrictions to not create pesudos due to a >>>> combine >>>> limitatation. And despite this being a split-during-combine >>>> limitation, >>>> all split passes take the hit due to way define*_split are used in >>>> gcc. >>>> >>>> With the original combine issue being fixed 61bee6aed2 ("combine: >>>> Don't >>>> record for UNDO_MODE pointers into regno_reg_rtx array [PR104985]") >>>> the RV splitters can now be relaxed. >>>> >>>> This improves the codegen in general. e.g. >>>> >>>>     long long f(void) { return 0x0101010101010101ull; } >>>> >>>> Before >>>> >>>>     li    a0,0x01010000 >>>>     addi    a0,0x0101 >>>>     slli    a0,a0,16 >>>>     addi    a0,a0,0x0101 >>>>     slli    a0,a0,16 >>>>     addi    a0,a0,0x0101 >>>>     ret >>>> >>>> With patch >>>> >>>>     li    a5,0x01010000 >>>>     addi    a5,a5,0x0101 >>>>     mv    a0,a5 >>>>     slli    a5,a5,32 >>>>     add    a0,a5,a0 >>>>     ret >>>> >>>> This is testsuite clean, no regression w/ patch. >>>> >>>>                 ========= Summary of gcc testsuite ========= >>>>                              | # of unexpected case / # of unique >>>> unexpected case >>>>                              |          gcc |          g++ | >>>> gfortran | >>>>   rv64imafdc/  lp64d/ medlow |    2 /     2 |    1 /     1 | 6 >>>> /     1 | >>>>     rv64imac/   lp64/ medlow |    3 /     3 |    1 /     1 | 43 / 8 | >>>>   rv32imafdc/ ilp32d/ medlow |    1 /     1 |    3 /     2 | 6 >>>> /     1 | >>>>     rv32imac/  ilp32/ medlow |    1 /     1 |    3 /     2 | 43 / 8 | >>>> >>>> This came up as part of IRC chat on PR/109279 and was suggested by >>>> Andrew Pinski. >>>> >>>> Signed-off-by: Vineet Gupta >>>> --- >>>>   gcc/config/riscv/riscv-protos.h |  4 +-- >>>>   gcc/config/riscv/riscv.cc       | 46 >>>> +++++++++++++-------------------- >>>>   gcc/config/riscv/riscv.md       |  8 +++--- >>>>   3 files changed, 24 insertions(+), 34 deletions(-) >>> This looks fine, except that you don't have a ChangeLog. >> >> Pushed with Changelog and some additional info about qemu icount >> reductions with this patch >> >>      500.perlbench_r 0       1235310737733   1231742384460 0.29% >>                      1       744489708820    743515759958 >>                      2       714072106766    712875768625 0.17% >>      502.gcc_r       0       197365353269    197178223030 >>                      1       235614445254    235465240341 >>                      2       226769189971    226604663947 >>                      3       188315686133    188123584015 >>                      4       289372107644    289187945424 >>      503.bwaves_r    0       326291538768    326291539697 >>                      1       515809487294    515809488863 >>                      2       401647004144    401647005463 >>                      3       488750661035    488750662484 >>      505.mcf_r       0       681926695281    681925418147 >>      507.cactuBSSN_r 0       3832240965352   3832226068734 >>      508.namd_r      0       1919838790866   1919832527292 >>      510.parest_r    0       3515999635520   3515878553435 >>      511.povray_r    0       3073889223775   3074758622749 >>      519.lbm_r       0       1194077464296   1194077464041 >>      520.omnetpp_r   0       1014144252460   1011530791131 0.26% >>      521.wrf_r       0       3966715533120   3966265425092 >>      523.xalancbmk_r 0       1064914296949   1064506711802 >>      525.x264_r      0       509290028335    509258131632 >>                      1       2001424246635   2001677767181 >>                      2       1914660798226   1914869407575 >>      526.blender_r   0       1726083839515   1725974286174 >>      527.cam4_r      0       2336526136415   2333656336419 >>      531.deepsjeng_r 0       1689007489539   1686541299243 0.15% >>      538.imagick_r   0       3247960667520   3247942048723 >>      541.leela_r     0       2072315300365   2070248271250 >>      544.nab_r       0       1527909091282   1527906483039 >>      548.exchange2_r 0       2086120304280   2086314757502 >>      549.fotonik3d_r 0       2261694058444   2261670330720 >>      554.roms_r      0       2640547903140   2640512733483 >>      557.xz_r        0       388736881767    386880875636 0.48% >>                      1       959356981818    959993132842 >>                      2       547643353034    546374038310 0.23% >>      997.specrand_fr 0       512881578       512599641 >>      999.specrand_ir 0       512881578       512599641 > Just a note, there are some regressions in this table.  For example xz > input #2.  So to be fair when making comparisons it's probably worth > noting the regressions as well as improvements. Fair point, but in my defense I dropped anything under 0.1 % (including improvements). Here's the list of what regressed due to this patch. 511.povray_r    0    3073889223775    3074758622749    -0.03% 525.x264_r    0    509290028335    509258131632     0.01%         1    2001424246635    2001677767181    -0.01%         2    1914660798226    1914869407575    -0.01% 548.exchange2_r    0    2086120304280    2086314757502    -0.01% 557.xz_r    0    388736881767    386880875636     0.48%         1    959356981818    959993132842    -0.07% > > Anyway, my results are in line with yours. Given the instruction > counts and known IPC, I think we're taking about a 2.5% hit on > deepsjeng and about a 1% hit on leela and x264#2 comparing our > internal gcc-12 vs gcc-13 trees. I think you are referring to the bigger deepsjeng regression due to commit 2e886eef7f2b5a which introduced the splitter movconst_internal. This patch is about a small improvement to all splitters which reduces that perf loss slightly (but does regress some tests too as pointed above). Thx, -Vineet