From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/101523] Huge number of combine attempts
Date: Thu, 21 Mar 2024 08:15:43 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #36 from Richard Biener ---
(In reply to Segher Boessenkool from comment #35)
> (In reply to Richard Biener from comment #34)
> > The change itself looks reasonable given costs, though maybe 2 -> 2
> > combinations should not trigger when the cost remains the same? In
> > this case it definitely doesn't look profitable, does it? Well,
> > in theory it might hide latency and the 2nd instruction can issue
> > at the same time as the first.
>
> No, it definitely should be done. As I showed back then, it costs less
> than 1% extra compile time on *any platform* on average, and it reduced
> code size by 1%-2% everywhere.
>
> It also cannot get stuck, any combination is attempted only once, any
> combination that succeeds eats up a loglink. It is finite, (almost)
> linear in fact.

So the slowness for the testcase comes from failed attempts.

> Any backend is free to say certain insns shouldn't combine at all. This
> will lead to reduced performance though.
>
> - ~ - ~ -
>
> Something that is the real complaint here: it seems we do not GC often
> enough, only after processing a BB (or EBB)? That adds up for artificial
> code like this, sure.

For memory use, if you know combine doesn't have "dangling" links into GC
memory you can call ggc_collect at any point you like. Or, when you create
throw-away RTL, ggc_free it explicitly (yeah, that only frees the
"toplevel"). (Rough sketch at the end of this comment.)

> And the "param to give an upper limit to how many combination attempts
> are done (per BB)" offer is on the table still, too. I don't think it
> would ever be useful (if you want your code to compile faster just write
> better code!), but :-)

Well, while you say the number of successful combinations is linear, the
number of combine attempts apparently isn't (well, of course, if we ever
combine from multi-use defs). So yeah, a param might be useful here, but
instead of some constant limit on the number of combine attempts per
function or per BB it might make sense to instead base the limit on the
number of DEFs.
I understand we work on the uses, so it'll be a bit hard to apply this in a
way that, say, combines a DEF only with the N nearest uses (but not any
farther out), and maintaining such a count per DEF would have a cost. So it
might be more practical to limit the number of attempts to combine into an
(unchanged?) insn. Basically, with a hard limit in place I would hope we
don't just stop after the first half of a BB, leaving trivial combinations
in the second half unhandled, but instead somehow throttle the "expensive"
cases.
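
Something along these lines is what I have in mind; a minimal sketch only,
assuming a hypothetical --param. The helper, the param name, and the
hash_map bookkeeping are made up for illustration, not actual combine.cc
code:

/* Hypothetical throttle, not existing combine.cc code: cap how many times
   we try to combine into a given (unchanged) i3, instead of one hard limit
   for the whole BB.  The param name is made up.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "rtl.h"
#include "hash-map.h"

static hash_map<rtx_insn *, int> *combine_attempt_count;

/* Call once before combining a function (or BB).  */
static void
init_combine_attempt_counts ()
{
  combine_attempt_count = new hash_map<rtx_insn *, int> (64);
}

/* Return true if we may still try to combine into I3.  */
static bool
may_try_combine_into (rtx_insn *i3, int limit /* hypothetical --param */)
{
  int &n = combine_attempt_count->get_or_insert (i3);
  if (n >= limit)
    return false;  /* Throttle this "expensive" insn, keep going with the BB.  */
  ++n;
  return true;
}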
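
And to make the ggc point above concrete, a minimal sketch, assuming combine
holds no dangling links into GC memory at the point of the call. Only
ggc_free and ggc_collect are existing interfaces; the helper and where it
would be called from are assumptions:

/* Hypothetical helper, not existing combine.cc code: free throw-away RTL
   explicitly and let the collector run more often than once per (E)BB.
   Assumes combine has no dangling links into GC memory here.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "rtl.h"
#include "ggc.h"

static void
combine_release_scratch (rtx scratch)
{
  if (scratch)
    ggc_free (scratch);  /* Frees only this "toplevel" rtx, not what it points to.  */
  ggc_collect ();        /* Reclaim whatever is no longer reachable.  */
}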