Message-ID: <3c1f0f8a-34ed-abb2-8a49-3083a2cc55d2@gmail.com>
Date: Sat, 15 Jul 2023 00:16:49 -0600
Subject: Re: RISC-V: Folding memory for FP + constant case
To: Jivan Hakobyan, gcc-patches@gcc.gnu.org
From: Jeff Law

On 7/12/23 14:59, Jivan Hakobyan via Gcc-patches wrote:
> Accessing local arrays element turned into load form (fp + (index <<
> C1)) + C2 address. In the case when access is in the loop we got loop
> invariant computation. For some reason, moving out that part cannot
> be done in loop-invariant passes. But we can handle that in
> target-specific hook (legitimize_address). That provides an
> opportunity to rewrite memory access more suitable for the target
> architecture.
>
> This patch solves the mentioned case by rewriting mentioned case to
> ((fp + C2) + (index << C1)) I have evaluated it on SPEC2017 and got
> an improvement on leela (over 7b instructions, .39% of the dynamic
> count) and dwarfs the regression for gcc (14m instructions, .0012% of
> the dynamic count).
>
>
> gcc/ChangeLog:
>         * config/riscv/riscv.cc (riscv_legitimize_address): Handle folding.
>         (mem_shadd_or_shadd_rtx_p): New predicate.

So I still need to give the new version a review. But a high level
question -- did you re-run the benchmarks with this version to verify
that we still saw the same nice improvement in leela?

The reason I ask is when I use this on Ventana's internal tree I don't
see any notable differences in the dynamic instruction counts. And
probably the most critical difference between the upstream tree and
Ventana's tree in this space is Ventana's internal tree has an earlier
version of the fold-mem-offsets work from Manolis.

It may ultimately be the case that this work and Manolis's f-m-o patch
have a lot of overlap in terms of their final effect on code generation.
Manolis's pass runs much later (after register allocation), so it's not
going to address the loop-invariant-code-motion issue that originally
got us looking into this space. But his pass is generic enough that it
helps other targets. So we may ultimately want both.

Anyway, just wanted to verify if this variant is still showing the nice
improvement on leela that the prior version did.

Jeff

ps. I know you're on PTO. No rush on responding -- enjoy the time off.
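
For readers following the thread, the reassociation under discussion can be sketched in plain C. This is only an illustration of the address arithmetic, not the patch itself and not GCC RTL: the constants C1 and C2, the helper names, and the byte buffer standing in for a stack frame are all hypothetical. It shows that (fp + (index << C1)) + C2 and (fp + C2) + (index << C1) name the same address, and that in the second form fp + C2 does not depend on the index, so it can be computed once outside the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values: C1 is the shift for 4-byte elements and C2 is the
   array's byte offset from the frame pointer.  */
enum { C1 = 2, C2 = 16 };

/* Original association: (fp + (index << C1)) + C2.
   Every sub-expression that includes C2 also depends on index.  */
static uintptr_t addr_original(uintptr_t fp, uintptr_t index)
{
    return (fp + (index << C1)) + C2;
}

/* Rewritten association: (fp + C2) + (index << C1).
   The (fp + C2) part no longer depends on index.  */
static uintptr_t addr_rewritten(uintptr_t fp, uintptr_t index)
{
    return (fp + C2) + (index << C1);
}

/* Sum n 32-bit elements stored at offset C2 inside a simulated frame.
   base = fp + C2 is loop invariant and is computed once, before the loop.  */
static int32_t sum_hoisted(const unsigned char *fp, size_t n)
{
    const unsigned char *base = fp + C2;
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t v;
        memcpy(&v, base + (i << C1), sizeof v); /* load from base + (i << C1) */
        sum += v;
    }
    return sum;
}
```

With these hypothetical constants, both address helpers return the same value for any fp and index, for example addr_original(1000, 3) and addr_rewritten(1000, 3) are both 1028. Whether a compiler actually hoists the invariant part depends on the passes involved, which is the question the thread is about.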