From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-169424-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 28659 invoked by alias); 19 Jul 2011 22:24:38 -0000
Received: (qmail 28651 invoked by uid 22791); 19 Jul 2011 22:24:37 -0000
X-SWARE-Spam-Status: No, hits=-1.6 required=5.0	tests=AWL,BAYES_00,MIME_QP_LONG_LINE
X-Spam-Check-By: sourceware.org
Received: from c60.cesmail.net (HELO c60.cesmail.net) (216.154.195.49)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 19 Jul 2011 22:24:22 +0000
Received: from unknown (HELO epsilon2) ([192.168.1.60])  by c60.cesmail.net with ESMTP; 19 Jul 2011 18:24:21 -0400
Received: from e178018179.adsl.alicedsl.de (e178018179.adsl.alicedsl.de	[85.178.18.179]) by webmail.spamcop.net (Horde MIME library) with HTTP;	Tue, 19 Jul 2011 18:24:21 -0400
Message-ID: <20110719182421.e2375zp1g08s4osg-nzlynne@webmail.spamcop.net>
Date: Tue, 19 Jul 2011 22:28:00 -0000
From: Joern Rennecke <amylaar@spamcop.net>
To: Richard Henderson <rth@redhat.com>
Cc: "gcc@gcc.gnu.org" <gcc@gcc.gnu.org>
Subject: Re: [RFC] Remove -freorder-blocks-and-partition
References: <4E25F810.6050904@redhat.com>
In-Reply-To: <4E25F810.6050904@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain;	charset=UTF-8;	DelSp="Yes";	format="flowed"
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
User-Agent: Internet Messaging Program (IMP) H3 (4.1.4)
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
X-SW-Source: 2011-07/txt/msg00357.txt.bz2

Quoting Richard Henderson <rth@redhat.com>:

> Andrew Pinski points out that the feature could probably be
> equivalently implemented via outlining and function calls
> (I assume well back at the gimple level).

Function calls would mean that you'd have to deal with
call-clobbered registers: any working set size savings from outlining
could easily be drowned out by worse register allocation or by inserted
caller-save instructions.  And you can't easily set multiple values in
specific registers and stack slots, unless you want to add fancy custom
ABIs for the outlined functions.
And then there is the issue that functions tend to have a single
return address.  You might have a complex piece of error handling code
that makes a decision where it comes back into the hot code.  With a
function, you would need yet another return value, and then a tablejump
depending on that value.
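
To make the return-address point concrete, here is a rough source-level
sketch in C of what outlining the error handling could look like.  All
names (resume_point, handle_error, sum_items) and the 0..100 validity
rule are made up for illustration; this is not what any GCC pass emits,
just the shape of the problem: the cold function has one return address,
so its decision comes back as an extra value, and the hot code has to
dispatch on it.

```c
#include <stddef.h>

/* Hypothetical resume points: where the cold code wants to re-enter
   the hot path.  */
enum resume_point { RESUME_RETRY, RESUME_SKIP, RESUME_ABORT };

/* Outlined cold path (hypothetical).  Being a real function, it has a
   single return address, so the re-entry decision travels back as a
   value.  Its state is passed by pointer, i.e. through memory.  */
static enum resume_point
handle_error (int value, int *retries_left)
{
  if (value < 0)
    return RESUME_ABORT;
  if (*retries_left > 0)
    {
      --*retries_left;
      return RESUME_RETRY;
    }
  return RESUME_SKIP;
}

/* Hot loop: sums items in the range 0..100; anything else takes the
   cold path.  Returns -1 if the cold path decides to abort.  */
long
sum_items (const int *items, size_t n, int retries)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      int v = items[i];
    retry:
      if (v < 0 || v > 100)
        {
          /* The extra dispatch: in effect a small tablejump on the
             value returned by the outlined function.  */
          switch (handle_error (v, &retries))
            {
            case RESUME_RETRY:
              v = 100;          /* the hot code, not the callee, fixes up v */
              goto retry;
            case RESUME_SKIP:
              continue;
            case RESUME_ABORT:
              return -1;
            }
        }
      sum += v;
    }
  return sum;
}
```

With a direct jump into a cold section, each of the three cases could
simply be a jump back to a different label in the hot code, and v could
be adjusted in place in whatever register it lives in.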

> At which point we
> no longer have cross-segment jump_insns at the rtl level,
> which seems like a Really Big Win to me at this point.

I suppose the basic problem is that these jumps are so easily mistaken for
ordinary jump_insns.  If they were more obviously different, like a tablejump,
we'd leave them alone by default.  We don't do jump threading through
non-simplified tablejumps, either.
What would you think about putting the destination section in the
instruction pattern?

Of course, changing the rtl representation doesn't fix the problems with
passes like cfglayout.
These might indeed be better off with a different model for a jump into
a cold section; e.g. it could be thought of as an instruction that sets
a vector of registers and memory locations depending on another such vector,
and then does (optionally) a multi-way jump.  Indeed a bit like a call
instruction, but with more potential side effects that we want, and less
that we don't want.

> Not that I'm volunteering to actually do the work to implement
> any such scheme.

Same here...  just some thoughts.