From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 29946 invoked by alias); 23 Nov 2006 20:35:55 -0000
Received: (qmail 29890 invoked by uid 48); 23 Nov 2006 20:35:44 -0000
Date: Thu, 23 Nov 2006 20:35:00 -0000
Subject: [Bug target/29963] New: could speed up variable access with different object layout
X-Bugzilla-Reason: CC
Message-ID:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "amylaar at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2006-11/txt/msg02054.txt.bz2

Currently, variables are generally placed in the data or bss segment;
they cannot be addressed directly, hence their address needs to be
loaded into a register first.  Moreover, this address load is done with
a pc-relative load, which incurs extra latency.

If the variable were in the text segment within pc-relative load range,
integer loads could be done with pc-relative loads, while stores and
floating point loads could obtain their address using the mova
instruction.

Using a pc-relative load saves the extra latency of the address load
and/or the register that is used to hold the address.  Using mova saves
the address load latency, and by making the address cheaper to obtain,
makes it more feasible to recompute it in a loop so that
harder-to-reload values can be kept in a register.

For static variables, the compiler can determine whether it is possible
to place the variable within reach when the translation unit is
compiled.  (There are some issues with asms using directives like .rept
or .org throwing off the instruction length calculations, but these are
also a problem with other compiler activities requiring length
calculation, i.e. branch shortening and constant pool layout, and
programmers are well-advised to write their assembler templates in a
way that their size will not be underestimated, e.g. by padding with
extra lines.)

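To make "within reach" concrete, here is a minimal sketch of the range
check for a 32-bit pc-relative load.  It assumes the usual SH encoding
of mov.l @(disp,PC),Rn as I understand it: an unsigned 8-bit
displacement scaled by 4, added to the instruction address plus 4 with
the low two bits cleared (mova uses the same base and scaling).  The
function names are hypothetical and not part of GCC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Base address for mov.l @(disp,PC),Rn and mova: the instruction
   address plus 4, rounded down to a 4-byte boundary (assumed SH
   behaviour, see the note above).  */
static uint32_t sh_pcrel_base(uint32_t insn_addr)
{
    return (insn_addr + 4u) & ~UINT32_C(3);
}

/* Hypothetical helper: true if a 4-byte-aligned variable at var_addr
   can be reached from the instruction at insn_addr with an unsigned
   8-bit displacement scaled by 4, i.e. 0..1020 bytes past the base.  */
static bool sh_var_in_pcrel_range(uint32_t insn_addr, uint32_t var_addr)
{
    uint32_t base = sh_pcrel_base(insn_addr);

    /* The displacement is unsigned, so the variable must not precede
       the base, and it must be longword aligned.  */
    if (var_addr < base || (var_addr & 3u) != 0)
        return false;

    return (var_addr - base) / 4u <= 255u;
}
```

A compiler pass would have to evaluate a check of this kind for every
access to the variable, using conservative instruction length
estimates, before deciding to place it in the text segment.
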
For variables with global / namespace linkage, the best placement can
only be determined at link time.  This could either be done using the
-mrelax mechanism (but that would need fixing first, see binutils/3298
http://sourceware.org/bugzilla/show_bug.cgi?id=3298 ), or by
integrating this optimization in the LTO framework.

-- 
           Summary: could speed up variable access with different object
                    layout
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: amylaar at gcc dot gnu dot org
GCC target triplet: sh-*-*
OtherBugsDependingOnThis: 29842

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29963