From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (qmail 3193 invoked by alias); 18 Jan 2004 22:25:24 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: 
List-Post: 
List-Help: 
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 3186 invoked from network); 18 Jan 2004 22:25:23 -0000
Received: from unknown (HELO mail.codesourcery.com) (65.74.133.9)
  by sources.redhat.com with SMTP; 18 Jan 2004 22:25:23 -0000
Received: (qmail 9008 invoked from network); 18 Jan 2004 22:25:18 -0000
Received: from taltos.codesourcery.com (zack@66.92.218.83)
  by mail.codesourcery.com with DES-CBC3-SHA encrypted SMTP; 18 Jan 2004 22:25:18 -0000
Received: by taltos.codesourcery.com (sSMTP sendmail emulation);
  Sun, 18 Jan 2004 14:25:17 -0800
From: "Zack Weinberg"
To: Ian Lance Taylor
Cc: gcc@gcc.gnu.org
Subject: Re: Can we speed up the gcc_target structure?
References: <20040118083738.10772.qmail@gossamer.airs.com>
  <87smid8w9q.fsf@codesourcery.com>
Date: Sun, 18 Jan 2004 22:25:00 -0000
In-Reply-To: (Ian Lance Taylor's message of "18 Jan 2004 09:08:57 -0500")
Message-ID: <87ad4ksxma.fsf@egil.codesourcery.com>
User-Agent: Gnus/5.110002 (No Gnus v0.2) Emacs/21.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-SW-Source: 2004-01/txt/msg01199.txt.bz2

Ian Lance Taylor writes:

> "Zack Weinberg" writes:
>> Furthermore, while a 3% measured speed hit is a concern, I think that
>> trying to win it back by undoing the targetm transformation - in the
>> object files, if not in the source code - is barking up the wrong
>> tree.  Instead we should be looking for ways to avoid having targetm
>> hooks in critical paths in the first place.  It's been my experience
>> that that is a much more fruitful source of optimization
>> opportunities.
>
> I don't have anything against that goal, but it is in conflict with
> the goal of speeding up the compiler.  Simply moving targetm hooks
> obviously can not get you the full speedup.
> The full speedup comes
> when an optimizing compiler compiling gcc can see that certain values
> are constants, such as, in my example, the various promote_* functions
> in the target vector.  You can't pull those target hooks out of the
> critical path.  Function calls are on the critical path for a
> non-optimizing compilation of many types of C++ code, and a
> non-optimizing compilation is the case where compilation speed is the
> most important.

This - and further discussion downthread - misses the point I was
trying to make.  You're seeing a 3% speedup on some test case by
exposing that certain elements of targetm.calls are compile-time
constants.  Here are the existing elements of that structure:

  bool (*promote_function_args) (tree fntype);
  bool (*promote_function_return) (tree fntype);
  bool (*promote_prototypes) (tree fntype);
  rtx (*struct_value_rtx) (tree fndecl, int incoming);
  bool (*return_in_memory) (tree type, tree fndecl);
  bool (*return_in_msb) (tree type);
  rtx (*expand_builtin_saveregs) (void);
  /* Returns pretend_argument_size.  */
  void (*setup_incoming_varargs) (CUMULATIVE_ARGS *ca,
                                  enum machine_mode mode,
                                  tree type, int *pretend_arg_size,
                                  int second_time);
  bool (*strict_argument_naming) (CUMULATIVE_ARGS *ca);
  /* Returns true if we should use SETUP_INCOMING_VARARGS and/or
     STRICT_ARGUMENT_NAMING.  */
  bool (*pretend_outgoing_varargs_named) (CUMULATIVE_ARGS *ca);

Furthermore, skimming tm.texi, it looks like there are at least a
hundred more function-call-related target macros that haven't yet been
moved into the target vector, of which FUNCTION_ARG is an obvious
example; but there are also things like SPLIT_COMPLEX_ARGS,
PUSH_ARGS_REVERSED, FRAME_GROWS_DOWNWARD, ...

Of course inefficiencies are going to be introduced if we just convert
each macro to a target hook with the same semantics.  But that isn't
the only option on the table.  The right thing is to redesign this
interface so that it doesn't *need* 100+ macros and toggles.
If this is done properly, then not only should there be no
inefficiency introduced by going through the target vector, but we
would also have something that was straightforward to maintain, and
that made it straightforward to add support for new architectures.

As a data point, I am aware of another (proprietary) compiler that
completely isolates the back end from the optimizers, to the point
where the back-end module can be swapped out at runtime, and it
benchmarks competitively with or faster than GCC on similar input.
So I don't believe that this is impossible.

zw