From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x12b.google.com (mail-lf1-x12b.google.com [IPv6:2a00:1450:4864:20::12b]) by sourceware.org (Postfix) with ESMTPS id C1C4C3858D33 for ; Sat, 6 Apr 2024 12:50:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C1C4C3858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C1C4C3858D33 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::12b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712407821; cv=none; b=UDWKP6+kE79VIIxTWzWU5j+DhEV4BYF23zK7RowQBb4Pl3CdF/4u39/bGQ24XSOVo28rdFXbzcoqBDiddOjTN7hK/Xg5j+zuGT7bVFvmvI8BDOeS0bRSR5yJplHCkfHXsj6neBWox1hM2zhP9v7DBUyG6tuWTEKdCCmmPjuqdYw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712407821; c=relaxed/simple; bh=sTvtSfuqizA7MctgqtOKZB1XyEH+ov74udTE/kYwWQI=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=I1NyPii+WTkOcN0LBZH5FXTZ2VmlANFH69YjNhrFR5NEv3SW14KquCCflBkm1/xzUzmJYi57Zhls6N+Ozt3z7lfQETw4NI6x3keMKzwIHqm7zdVQcD0WGAgbVmrSDHk+0NNvLsrqripgXHW626kXOeyP0rWxxrXGhIg2VCFcClQ= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x12b.google.com with SMTP id 2adb3069b0e04-516ab4b3251so3501542e87.0 for ; Sat, 06 Apr 2024 05:50:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1712407817; x=1713012617; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=h80C/gGrA0IwpR0wweqfIVwudGoMxAJtqnl++Aoge4w=; b=jMAkQRE3olVjJLZNPnIqlBxxoBSAQgXRXLHtS+rI0us29XQB1CEoypjU+h7p5l+y9x YcITWsfDbP7Zm8X64PxgLOlUQtvCNLqTIXJL51IryqGekgQqs9XJzKyKjjaFotWFvs+s 4hIUszM6a1zV7JdK0Oj+uk+NwGP4EGp6wsLXktfFe0oGaRZGL7sVvo79CrqFoPN4v7p0 pbIPAYvfiIUjcJRCQMi+l+F6t6Cf1KSBsPWmM2y7eTMirxjz9gKc7u0Kt2QczPLrk87l WJHUNYwCofl/lKmCZHwe88rskZt2+9jlsvUVznbxkLhiNVscreUrDrfP+IxCfHfK/KeP irUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1712407817; x=1713012617; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=h80C/gGrA0IwpR0wweqfIVwudGoMxAJtqnl++Aoge4w=; b=SIPuV03zjYdB8Lnew7CvR6eQTSL032WaMN/1EMHFMj8ypb86dhdg1kq1G+pm+6nGGR fcx5syiPZ6v09ztx05+l0mLIGe75SU012p2xPVb/YYhy08Tt75+pYs4WlbZapnbMGqDR yeLR8BVmEvVyJNmBvUhwNmDAvb3n3zOU0Euv4KhlIf1gdvVrVtvd25FDIYG5F3W+ZA3P gPmoa+RvzdiKTrl7m73yZb5uj0NcqOOEobF5jfE3TeHa7pWHRccYn8wBrJPWDKQtFB2S m89q7rbMDdVlmDKVm9bLkAEGMVP+l1zJiCj13DXptxBFxAQosRlT2DXYS0XKinRdrJjt 8cjw== X-Forwarded-Encrypted: i=1; AJvYcCURccBhI5HBihEvnsKEUfPL5HRJRZaTbSxM/YPsX9E0Lsisb2Y8+2cdRxjdBojyXdeghTdJsjYRCUyMgTF0FrjSJbrqkzqRaw== X-Gm-Message-State: AOJu0Yybi37H4CGWgMXG53pSpSQOi077IPpDifbcKBIuZ6onI9NzqGXW 4nxcwU+4k/3xdo7dzqyDZsuUyaMrFpGI/e/LLyZgyPQJeU1nNCX5JGVIt/qjEusOI4QJlNW1dDx qFxL77OJXhoN6fA2JviVTKjR8oVs= X-Google-Smtp-Source: AGHT+IF7d9AxUwH9SG/SO0GFmOWc9mFlQu3eH0sSlJKJEsiUupC+wyfmpyN0kBEef0Oon+TRD+cd65npmK15ZxR5zro= X-Received: by 2002:ac2:4a6a:0:b0:516:d30c:7243 with SMTP id q10-20020ac24a6a000000b00516d30c7243mr2972729lfp.28.1712407816650; Sat, 06 Apr 2024 05:50:16 -0700 (PDT) MIME-Version: 1.0 References: <202404031107.433B7hCA019240@gate.crashing.org> <20240405212754.GL19790@gate.crashing.org> In-Reply-To: <20240405212754.GL19790@gate.crashing.org> From: Richard Biener Date: Sat, 6 Apr 2024 14:50:05 +0200 Message-ID: Subject: Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination To: Segher Boessenkool Cc: Richard Biener , gcc-patches@gcc.gnu.org, Jakub Jelinek , jeffreyalaw@gmail.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-1.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Apr 5, 2024 at 11:29=E2=80=AFPM Segher Boessenkool wrote: > > Hi! > > On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote: > > The following avoids re-walking and re-combining the instructions > > between i2 and i3 when the pattern of i2 doesn't change. > > > > Bootstrap and regtest running ontop of a reversal of > > r14-9692-g839bc42772ba7a. > > Please include that in the patch (or series, preferably). I'd like reversal to be considered independently of this patch which is why I didn't include the reversal. Of course without reversal this patch doesn= 't make sense. > > It brings down memory use frmo 9GB to 400MB and compile-time from > > 80s to 3.5s. r14-9692-g839bc42772ba7a does better in both metrics > > but has shown code generation regressions across acrchitectures. > > > > OK to revert r14-9692-g839bc42772ba7a? > > No. > > The patch solved a very real problem. How does your replacement handle > that? You don't say. It looks like it only battles symptoms a bit, > instead :-( My patch isn't a replacement for your solution. Reversal is to address the P1 regressions caused by the change. My change offers to address some of the symptoms shown with the testcase without disabling the offending 2->2 combinations. > We had this before: 3->2 combinations that leave an instruction > identical to what was there before. This was just a combination with > context as well. The only reason this wasn't a huge problem then > already was because this is a 3->2 combination, even if it really is a > 2->1 one it still is beneficial in all the same cases. But in the new > case it can iterate indefinitely -- well not quite, but some polynomial > number of times, for a polynomial at least of degree three, possibly > more :-( > > With this patch you need to show combine still is linear. I don't think > it is, but some deeper analysis might show it still is. We have come to different conclusions as to what the issue is that the testcase exposes in combine. While you spotted a transform that combine shouldn't have done (and I think I agree on that!) I identified the fact that while the number of successful combines is linear in the number of log-links (and thus linear in the size of a basic-block), the number of combination _attempts_ appears to be O(log-links) times O(successful combinations). The testcase hits hard on this because of those 2->2 combinations done but this problem persists with all N->2 combinations. This "quadraticness" (I know you don't like to call it that way) is because for each successful N->2 combination of insns {I2, .. .., I1} we try combining into all insns between (and including) I2 and I1. combine wants to retry combining into I2 rightfully so but there's no good reason to retry _all_ of the n intervening insns. Yes, we want to retry all insns that have their log-links point to I2 (and possibly more for second-level references, dependent on how combine exactly identifies I2, I3 and I4). Re-trying all of them obviously works, but there's your quadraticness. My patch makes this problem a little bit (in general) less often hit since it avoids retrying iff I2 did not change. I _think_ in that case the log-links/notes shouldn't have changed either. In any case I'm disabling less of combine than what your patch did. So again - is it OK to revert your patch? Or do you expect you can address the code generation regressions within the next week? Since the original issue is quite old postponing your solution to the next stage1 should be acceptable. Currently those regressions will block the release of GCC 14. Do you agree that the patch I propose, while it doesn't solve any actual issue (it doesn't fix the walking scheme nor does it avoid the combinations combine shouldn't do), it helps in some cases and shouldn't cause code generation regressions that your patch wouldn't have caused as well? So is that change OK until we get your real solution implemented in a way that doesn't cause regressions? Thanks, Richard.