From: Jeff Law
Date: Thu, 28 Sep 2023 06:37:17 -0600
Subject: Re: [PATCH] RFC: Add late-combine pass [PR106594]
To: gcc-patches@gcc.gnu.org, Robin Dapp, richard.sandiford@arm.com

On 9/26/23 10:21, Richard Sandiford wrote:
> This patch adds a combine pass that runs late in the pipeline.
> There are two instances: one between combine and split1, and one
> after postreload.
>
> The pass currently has a single objective: remove definitions by
> substituting into all uses. The pre-RA version tries to restrict
> itself to cases that are likely to have a neutral or beneficial
> effect on register pressure.
>
> The patch fixes PR106594.
> It also fixes a few FAILs and XFAILs
> in the aarch64 test results, mostly due to making proper use of
> MOVPRFX in cases where we didn't previously. I hope it would
> also help with Robin's vec_duplicate testcase, although the
> pressure heuristic might need tweaking for that case.
>
> This is just a first step. I'm hoping that the pass could be
> used for other combine-related optimisations in future. In particular,
> the post-RA version doesn't need to restrict itself to cases where all
> uses are substitutable, since it doesn't have to worry about register
> pressure. If we did that, and if we extended it to handle multi-register
> REGs, the pass might be a viable replacement for regcprop, which in
> turn might reduce the cost of having a post-RA instance of the new pass.
>
> I've run an assembly comparison with one target per CPU directory,
> and it seems to be a win for all targets except nvptx (which is hard
> to measure, being a higher-level asm). The biggest winner seemed
> to be AVR.
>
> However, if a version of the pass does go in, it might be better
> to enable it by default only on targets where the extra compile
> time seems to be worth it. IMO, fixing PR106594 and the MOVPRFX
> issues makes it worthwhile for AArch64.
>
> The patch contains various bug fixes and new helper routines.
> I'd submit those separately in the final version. Because of
> that, there's no GNU changelog yet.
>
> Bootstrapped & regression tested on aarch64-linux-gnu so far.

Very interesting. I would generally expect it to be a win on most
targets, and it might allow us to reduce the number of post-reload
hacks we do. So I'd lean towards enabling it everywhere.

With that in mind, I briefly threw it into my tester. The first thing
that popped out was that rl78-elf regresses on compile/20021008-1.c.
In the pre-RA version we've taken these insns:

> (insn 22 21 7 2 (set (reg/v/f:HI 44 [ buf ])
>         (const_int 0 [0])) "k.c":9:9 -1
>      (nil))
> (insn 7 22 8 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 0)
>         (mem:SI (plus:HI (reg/v/f:HI 44 [ buf ])
>                 (const_int 1 [0x1])) [1 MEM[(long double *)buf_4(D) + 1B]+0 S4 A16])) "k.c":9:9 2 {movsi}
>      (nil))
> (insn 8 7 9 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 4)
>         (mem:SI (plus:HI (reg/v/f:HI 44 [ buf ])
>                 (const_int 5 [0x5])) [1 MEM[(long double *)buf_4(D) + 1B]+4 S4 A16])) "k.c":9:9 2 {movsi}
>      (expr_list:REG_DEAD (reg/v/f:HI 44 [ buf ])
>         (nil)))

We combine insn 22 into insns 7 and 8, resulting in:

> (insn 7 22 8 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 0)
>         (mem:SI (const_int 1 [0x1]) [1 MEM[(long double *)buf_4(D) + 1B]+0 S4 A16])) "k.c":9:9 2 {movsi}
>      (nil))
> (insn 8 7 9 2 (set (subreg:SI (reg:DF 43 [ _1 ]) 4)
>         (mem:SI (const_int 5 [0x5]) [1 MEM[(long double *)buf_4(D) + 1B]+4 S4 A16])) "k.c":9:9 2 {movsi}
>      (nil))

Which ultimately triggers an assembler error:

> k.s: Assembler messages:
> k.s:41: Error: movw ax,!1
> k.s:41: Error: ^ Expression not word-aligned
> k.s:43: Error: movw ax,!3
> k.s:43: Error: ^ Expression not word-aligned
[ ... ]

This seems more likely than not to be a target issue. I suspect combine
didn't trip over this because of the multiple uses of (reg 44).

Jeff