From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/101523] Huge number of combine attempts
Date: Thu, 21 Mar 2024 08:15:43 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #36 from Richard Biener ---
(In reply to Segher Boessenkool from comment #35)
> (In reply to Richard Biener from comment #34)
> > The change itself looks reasonable given costs, though maybe 2 -> 2
> > combinations should not trigger when the cost remains the same? In
> > this case it definitely doesn't look profitable, does it? Well,
> > in theory it might hide latency and the 2nd instruction can issue
> > at the same time as the first.
>
> No, it definitely should be done. As I showed back then, it costs less
> than 1% extra compile time on *any platform* on average, and it reduced
> code size by 1%-2% everywhere.
>
> It also cannot get stuck, any combination is attempted only once, any
> combination that succeeds eats up a loglink. It is finite, (almost)
> linear in fact.

So the slowness for the testcase comes from failed attempts.

> Any backend is free to say certain insns shouldn't combine at all. This
> will lead to reduced performance though.
>
> - ~ - ~ -
>
> Something that is the real complaint here: it seems we do not GC often
> enough, only after processing a BB (or EBB)? That adds up for artificial
> code like this, sure.

For memory use, if you know combine doesn't have "dangling" links into GC
memory you can call ggc_collect at any point you like. Or, when you create
throw-away RTL, ggc_free it explicitly (yeah, that only frees the
"toplevel"). (Rough sketch at the end of this comment.)

> And the "param to give an upper limit to how many combination attempts
> are done (per BB)" offer is on the table still, too. I don't think it
> would ever be useful (if you want your code to compile faster just write
> better code!), but :-)

Well, while you say the number of successful combinations is linear, the
number of combine attempts apparently isn't (well, of course, if we ever
combine from multi-use defs). So yeah, a param might be useful here, but
instead of some constant limit on the number of combine attempts per
function or per BB it might make sense to instead base the limit on the
number of DEFs.
I understand we work on the uses, so it'll be a bit hard to apply this in a
way that, say, combines a DEF only with the N nearest uses (but not any
farther out), and maintaining such a count per DEF would have a cost. So it
might be more practical to limit the number of attempts to combine into an
(unchanged?) insn. Basically, with a hard limit in place I would hope we
don't just stop after the first half of a BB, leaving trivial combinations
in the second half unhandled, but instead somehow throttle the "expensive"
cases.
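
Something along these lines is what I have in mind; a minimal sketch only,
assuming a hypothetical --param. The helper, the param name, and the
hash_map bookkeeping are made up for illustration, not actual combine.cc
code:

/* Hypothetical throttle, not existing combine.cc code: cap how many times
   we try to combine into a given (unchanged) i3, instead of one hard limit
   for the whole BB.  The param name is made up.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "rtl.h"
#include "hash-map.h"

static hash_map<rtx_insn *, int> *combine_attempt_count;

/* Call once before combining a function (or BB).  */
static void
init_combine_attempt_counts ()
{
  combine_attempt_count = new hash_map<rtx_insn *, int> (64);
}

/* Return true if we may still try to combine into I3.  */
static bool
may_try_combine_into (rtx_insn *i3, int limit /* hypothetical --param */)
{
  int &n = combine_attempt_count->get_or_insert (i3);
  if (n >= limit)
    return false;  /* Throttle this "expensive" insn, keep going with the BB.  */
  ++n;
  return true;
}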
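
And to make the ggc point above concrete, a minimal sketch, assuming combine
holds no dangling links into GC memory at the point of the call. Only
ggc_free and ggc_collect are existing interfaces; the helper and where it
would be called from are assumptions:

/* Hypothetical helper, not existing combine.cc code: free throw-away RTL
   explicitly and let the collector run more often than once per (E)BB.
   Assumes combine has no dangling links into GC memory here.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "rtl.h"
#include "ggc.h"

static void
combine_release_scratch (rtx scratch)
{
  if (scratch)
    ggc_free (scratch);  /* Frees only this "toplevel" rtx, not what it points to.  */
  ggc_collect ();        /* Reclaim whatever is no longer reachable.  */
}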