From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 17558 invoked by alias); 18 Jan 2004 08:37:53 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 17540 invoked from network); 18 Jan 2004 08:37:46 -0000
Received: from unknown (HELO yosemite.airs.com) (209.128.65.135) by sources.redhat.com with SMTP; 18 Jan 2004 08:37:46 -0000
Received: (qmail 1569 invoked by uid 10); 18 Jan 2004 08:37:44 -0000
Received: (qmail 10773 invoked by uid 500); 18 Jan 2004 08:37:38 -0000
Message-ID: <20040118083738.10772.qmail@gossamer.airs.com>
Date: Sun, 18 Jan 2004 08:37:00 -0000
From: Ian Lance Taylor
To: gcc@gcc.gnu.org
Subject: Can we speed up the gcc_target structure?
X-SW-Source: 2004-01/txt/msg01155.txt.bz2

Back in the old days, gcc had a lot of code which was conditionally compiled with #ifdef. That was ugly, but the resulting code was fast.

Over time, a lot of the parameters checked with #ifdef were converted into macros which were checked at runtime using if. That was less ugly, and, since the macros normally had constant values, when gcc was compiled with an optimizing compiler, the code was just as fast in the normal case. When it was slower, it was generally because the compiler was doing something it couldn't do before.

More recently, some of those parameters have moved into the gcc_target structure. They are still checked at run time, but now the if condition never has a constant value. It always requires fetching a value from memory in the target vector, and often requires calling a function.

This results in cleaner, more comprehensible code. However, it also slows the compiler down.

Just for fun, I converted every instance of targetm.calls.xxxx to be TARGETM_CALLS_XXXX instead.
Then I added stuff like this to the end of target.h:

#ifndef TARGETM_CALLS_PROMOTE_FUNCTION_ARGS
#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) \
  targetm.calls.promote_function_args ((FNTYPE))
#endif

Then I added stuff like this to i386.h:

#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) false

Then I rebuilt the compiler and tried it on some reasonably small C++ example (with a native i386 GNU/Linux compiler). I saw compilation speedups of up to 3% when compiling without optimization. The resulting assembler output was, as expected, identical.

These tests were far from rigorous. However, compilation speed is a concern these days, and this suggests that the target vector is a measurable speed problem. Somebody must have noticed this before, but I couldn't find anything in the gcc mailing list.

It seems to me that we should try to find a way to regain the speed which was lost when we switched to the target vector, without losing the comprehensibility which was gained.

Here is a sketch of a possible approach which would require fairly minimal changes in the way the target vector works today:

1) Turn hooks.c and targhooks.c into .h files which define inline functions (with appropriate fallbacks to support older non-gcc compilers for bootstrapping, of course).

2) Move all definitions of target initializer macros from tm.c files into new CPU-target.h files.

3) Include CPU-target.h at the end of target-def.h, where it will redefine and undefine target initializer macros. For cases in which targetm is changed at run time, CPU-target.h must #undef the corresponding initializer macro (and CPU.c must #define it before initializing targetm) (alternatively, force targetm to be const, and adjust the relatively few cases in which it is changed at run time).

4) Change all uses of targetm.xxxx into code which uses TARGETM_XXXX macros, as above.

5) Define the TARGETM macros as either using the target vector or using the initializer macro from target-def.h.
The choice would be made based on whether the initializer macro was defined and probably based on some other control.

6) Now code which uses the new inline versions of hooks.c and targhooks.c, and which includes target-def.h and CPU-target.h, will automatically use the inlined versions of the functions when possible, and will see constant variable definitions when possible.

The main problem that I see with this approach is the requirement to #undef an initializer macro which is changed at run time. That's why I suggest the alternative of making targetm const.

We can convert to this approach over time if we require a particular macro to be defined in order to define the TARGETM macros as using the initializer macros rather than the target vector. Then a backend which has been converted to use CPU-target.h would define that macro.

If we eventually want to configure gcc to support multiple target vectors, that would still be possible. When more than one target vector was to be supported, the code would force the TARGETM macros to always use the target vector. This would be determined at configure time.

I considered more complex approaches, such as creating a target.def file which defined the target vector, but the basic problem boils down to detecting when the target does not use the default version of a target vector field. Inventing CPU-target.h seems as effective an approach as any to solving this particular problem.

Any thoughts? Does anybody think this would be a waste of time? Does anybody have a better approach to solving the general problem?

Ian
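
For readers who want something concrete, the macro dispatch described in the experiment and in steps 4) and 5) might be sketched roughly as below. This is only a toy under stated assumptions, not gcc source: the tree typedef, the two-field struct gcc_target, the hook_false and hook_true functions, the vector_calls counter, the TARGETM_CALLS_PROMOTE_PROTOTYPES macro and the promotion_flags function are all invented for illustration. Only the TARGETM_CALLS_PROMOTE_FUNCTION_ARGS definitions follow the text above.

```c
#include <stdbool.h>

/* Toy stand-ins for gcc's real declarations; everything here is
   simplified and hypothetical.  */
typedef void *tree;

struct gcc_target
{
  struct
  {
    bool (*promote_function_args) (tree fntype);
    bool (*promote_prototypes) (tree fntype);
  } calls;
};

/* Counts indirect calls made through the target vector, purely so
   that they are observable in this sketch.  */
int vector_calls = 0;

static bool
hook_false (tree fntype)
{
  (void) fntype;
  vector_calls++;
  return false;
}

static bool
hook_true (tree fntype)
{
  (void) fntype;
  vector_calls++;
  return true;
}

/* The target vector itself, initialized with the toy hooks.  */
struct gcc_target targetm = { { hook_false, hook_true } };

/* What a converted backend header (i386.h in the experiment) would
   supply: a compile-time constant instead of a hook.  */
#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) false

/* Fallbacks, as at the end of target.h: any hook the backend did not
   define as a macro still goes through the target vector.  */
#ifndef TARGETM_CALLS_PROMOTE_FUNCTION_ARGS
#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) \
  targetm.calls.promote_function_args ((FNTYPE))
#endif

#ifndef TARGETM_CALLS_PROMOTE_PROTOTYPES
#define TARGETM_CALLS_PROMOTE_PROTOTYPES(FNTYPE) \
  targetm.calls.promote_prototypes ((FNTYPE))
#endif

/* A caller written against the TARGETM macros.  The first test
   expands to a constant, so an optimizing compiler can delete it;
   the second still loads a pointer from targetm and calls it.  */
int
promotion_flags (tree fntype)
{
  int flags = 0;
  if (TARGETM_CALLS_PROMOTE_FUNCTION_ARGS (fntype))
    flags |= 1;
  if (TARGETM_CALLS_PROMOTE_PROTOTYPES (fntype))
    flags |= 2;
  return flags;
}
```

In this sketch, one call to promotion_flags makes exactly one indirect call through targetm, because the other check was resolved by the preprocessor. Removing the backend #define sends both checks back through the target vector without touching the caller, which is the property that would let backends be converted one at a time.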