From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (qmail 3193 invoked by alias); 18 Jan 2004 22:25:24 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: 
List-Post: 
List-Help: 
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 3186 invoked from network); 18 Jan 2004 22:25:23 -0000
Received: from unknown (HELO mail.codesourcery.com) (65.74.133.9)
  by sources.redhat.com with SMTP; 18 Jan 2004 22:25:23 -0000
Received: (qmail 9008 invoked from network); 18 Jan 2004 22:25:18 -0000
Received: from taltos.codesourcery.com (zack@66.92.218.83)
  by mail.codesourcery.com with DES-CBC3-SHA encrypted SMTP; 18 Jan 2004 22:25:18 -0000
Received: by taltos.codesourcery.com (sSMTP sendmail emulation);
  Sun, 18 Jan 2004 14:25:17 -0800
From: "Zack Weinberg"
To: Ian Lance Taylor
Cc: gcc@gcc.gnu.org
Subject: Re: Can we speed up the gcc_target structure?
References: <20040118083738.10772.qmail@gossamer.airs.com>
  <87smid8w9q.fsf@codesourcery.com>
Date: Sun, 18 Jan 2004 22:25:00 -0000
In-Reply-To: (Ian Lance Taylor's message of "18 Jan 2004 09:08:57 -0500")
Message-ID: <87ad4ksxma.fsf@egil.codesourcery.com>
User-Agent: Gnus/5.110002 (No Gnus v0.2) Emacs/21.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-SW-Source: 2004-01/txt/msg01199.txt.bz2

Ian Lance Taylor writes:

> "Zack Weinberg" writes:
>> Furthermore, while a 3% measured speed hit is a concern, I think that
>> trying to win it back by undoing the targetm transformation - in the
>> object files, if not in the source code - is barking up the wrong
>> tree.  Instead we should be looking for ways to avoid having targetm
>> hooks in critical paths in the first place.  It's been my experience
>> that that is a much more fruitful source of optimization
>> opportunities.
>
> I don't have anything against that goal, but it is in conflict with
> the goal of speeding up the compiler.  Simply moving targetm hooks
> obviously can not get you the full speedup.
> The full speedup comes
> when an optimizing compiler compiling gcc can see that certain values
> are constants, such as, in my example, the various promote_* functions
> in the target vector.  You can't pull those target hooks out of the
> critical path.  Function calls are on the critical path for a
> non-optimizing compilation of many types of C++ code, and a
> non-optimizing compilation is the case where compilation speed is the
> most important.

This - and further discussion downthread - misses the point I was
trying to make.  You're seeing a 3% speedup on some test case by
exposing that certain elements of targetm.calls are compile-time
constants.  Here are the existing elements of that structure:

  bool (*promote_function_args) (tree fntype);
  bool (*promote_function_return) (tree fntype);
  bool (*promote_prototypes) (tree fntype);
  rtx (*struct_value_rtx) (tree fndecl, int incoming);
  bool (*return_in_memory) (tree type, tree fndecl);
  bool (*return_in_msb) (tree type);
  rtx (*expand_builtin_saveregs) (void);
  /* Returns pretend_argument_size.  */
  void (*setup_incoming_varargs) (CUMULATIVE_ARGS *ca,
                                  enum machine_mode mode,
                                  tree type, int *pretend_arg_size,
                                  int second_time);
  bool (*strict_argument_naming) (CUMULATIVE_ARGS *ca);
  /* Returns true if we should use SETUP_INCOMING_VARARGS and/or
     STRICT_ARGUMENT_NAMING.  */
  bool (*pretend_outgoing_varargs_named) (CUMULATIVE_ARGS *ca);

Furthermore, skimming tm.texi, it looks like there are at least a
hundred more function-call-related target macros that haven't yet been
moved into the target vector, of which FUNCTION_ARG is an obvious
example; but there are also things like SPLIT_COMPLEX_ARGS,
PUSH_ARGS_REVERSED, FRAME_GROWS_DOWNWARD, ...

Of course inefficiencies are going to be introduced if we just convert
each macro to a target hook with the same semantics.  But that isn't
the only option on the table.  The right thing is to redesign this
interface so that it doesn't *need* 100+ macros and toggles.
If this is done properly, then not only should there be no
inefficiency introduced by going through the target vector, but we
would also have something that was straightforward to maintain, and
that made it straightforward to add support for new architectures.

As a data point, I am aware of another (proprietary) compiler that
completely isolates the back end from the optimizers, to the point
where the back-end module can be swapped out at runtime, and it
benchmarks competitively with or faster than GCC on similar input.
So I don't believe that this is impossible.

zw