Date: Wed, 3 Aug 2022 11:23:43 -0600
Subject: Re: [PATCH] lower-subreg, expr: Mitigate inefficiencies derived from "(clobber (reg X))" followed by "(set (subreg (reg X)) (...))"
To: gcc-patches@gcc.gnu.org
From: Jeff Law

On 8/3/2022 1:52 AM, Richard Sandiford via Gcc-patches wrote:
> Takayuki 'January June' Suwa via Gcc-patches writes:
>> Emitting "(clobber (reg X))" before "(set (subreg (reg X)) (...))" keeps
>> data flow consistent, but it also increases register allocation pressure
>> and thus often creates many unwanted register-to-register moves that
>> cannot be optimized away.
> There are two things here:
>
> - If emit_move_complex_parts emits a clobber of a hard register,
>   then that's probably a bug/misfeature.  The point of the clobber is
>   to indicate that the register has no useful contents.  That's useful
>   for wide pseudos that are written to in parts, since it avoids the
>   need to track the liveness of each part of the pseudo individually.
>   But it shouldn't be necessary for hard registers, since subregs of
>   hard registers are simplified to hard registers wherever possible
>   (which on most targets is "always").
>
>   So I think the emit_move_complex_parts clobber should be restricted
>   to !HARD_REGISTER_P, like the lower-subreg clobber is.  If that helps
>   (if only partly) then it would be worth doing as its own patch.
Agreed.

>
> - I think it'd be worth looking into more detail why a clobber makes
>   a difference to register pressure.  A clobber of a pseudo register R
>   shouldn't make R conflict with things that are live at the point of
>   the clobber.
Also agreed.

>
>> It seems just analogous to partial register
>> stall which is a famous problem on processors that do register renaming.
>>
>> In my opinion, when the register to be clobbered is a composite of hard
>> ones, we should clobber the individual elements separately, otherwise
>> clear the entire to zero prior to use as the "init-regs" pass does (like
>> partial register stall workarounds on x86 CPUs).  Such redundant zero
>> constant assignments will be removed later in the "cprop_hardreg" pass.
> I don't think we should rely on the zero being optimised away later.
>
> Emitting the zero also makes it harder for the register allocator
> to elide the move.  For example, if we have:
>
>   (set (subreg:SI (reg:DI P) 0) (reg:SI R0))
>   (set (subreg:SI (reg:DI P) 4) (reg:SI R1))
>
> then there is at least a chance that the RA could assign hard registers
> R0:R1 to P, which would turn the moves into nops.  If we emit:
>
>   (set (reg:DI P) (const_int 0))
>
> beforehand then that becomes impossible, since R0 and R1 would then
> conflict with P.
>
> TBH I'm surprised we still run init_regs for LRA.  I thought there was
> a plan to stop doing that, but perhaps I misremember.

I have vague memories of dealing with some of this nonsense a few release
cycles ago.  I don't recall all the details, but init-regs + lower-subreg +
regcprop + splitting all conspired to generate poor code on the MIPS
targets.  See pr87761, though it doesn't include all my findings -- I can't
recall if I walked through the entire tortured sequence in the gcc-patches
discussion or not.

I ended up working around it in the MIPS backend, in conjunction with some
changes to regcprop IIRC.

Jeff
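
For illustration only, the conflict argument in the quoted R0:R1 example can be sketched with a toy model.  This is not GCC code and does not reflect how IRA/LRA actually build conflicts; the instruction encoding and the helper names word_conflicts and can_assign are hypothetical.  It assumes each SImode word of the DImode pseudo P is tracked separately, that a write conflicts with every hard register live after it except the source of a plain copy, and that R0 and R1 are dead after the sequence.

```python
# Toy model only: not GCC code.  Each insn is (dest_word, src), where
# dest_word is 0 or 4 (byte offset of an SImode word of DImode pseudo P),
# or None for a write to the whole of P, and src is a hard register name
# or an integer constant.

def word_conflicts(insns):
    """Map each word of P to the hard registers it conflicts with."""
    conflicts = {0: set(), 4: set()}
    live = set()  # hard registers live after the insn being visited
    for dest_word, src in reversed(insns):
        is_copy = isinstance(src, str)
        written = (0, 4) if dest_word is None else (dest_word,)
        for word in written:
            # A copy does not conflict with its own source register.
            conflicts[word] |= live - ({src} if is_copy else set())
        if is_copy:
            live.add(src)  # the source is live before this insn
    return conflicts

def can_assign(insns, low_reg, high_reg):
    """True if P could be given the hard register pair low_reg:high_reg."""
    c = word_conflicts(insns)
    return low_reg not in c[0] and high_reg not in c[4]

moves = [(0, "R0"), (4, "R1")]   # the two subreg sets
zero_first = [(None, 0)] + moves  # with (set (reg:DI P) (const_int 0)) first

print(can_assign(moves, "R0", "R1"))       # True
print(can_assign(zero_first, "R0", "R1"))  # False
```

Under these assumptions the two subreg moves alone leave R0:R1 available for P, while the preceding whole-register zero makes both words of P conflict with R0 and R1.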