To: Andy Walker
Cc: Diego Novillo, Chris Lattner, gcc@gcc.gnu.org
Subject: Re: An unusual Performance approach using Synthetic registers
References: <20021229200428.GB31615@tornado.toronto.redhat.com>
From: Zack Weinberg
Date: Mon, 30 Dec 2002 01:30:00 -0000
In-Reply-To: (Andy Walker's message of "Sun, 29 Dec 2002 16:16:41 -0600")
Message-ID: <87smwg2vxv.fsf@egil.codesourcery.com>

Andy Walker writes:

> On Sunday 29 December 2002 02:04 pm, Diego Novillo wrote:
>> On Sun, 29 Dec 2002, Andy Walker wrote:
>> > My approach may allow for an effectively infinite register set.  For each
>>
>> Eh?  GCC already uses that approach.  RTL is a register language
>> for an abstract machine with an infinite number of register.
>>
>>
>> Diego.
>
> I agree.  RTL is for an abstract machine.  Infinite register set.  Excellent
> approach.
>
> Synthetic registers are artificial registers for a real machine.  [...]

What I think Diego is trying to say is, creating synthetic registers
for the x86 isn't going to help much, possibly not at all, because the
optimizer passes that could benefit already have unlimited registers
to work with.  I agree with this assessment.

The problems which cause GCC to generate inferior code on
register-starved architectures are intrinsic to the register
allocator: lack of live-range splitting, the ad-hoc two-phase
allocation algorithm (which makes awful choices), and the reload
nightmare.  I suspect you will get more for your time by working with
Michael and Daniel to complete the new register allocator, targeted
to the real architectural register set.

There's another angle of potential improvement, which is, GCC is
totally incapable of optimizing stack frame layout.  Stack slots are
assigned in linear order and never recycled; the stack pointer is kept
aligned whether or not this is necessary; and so on.

In this domain I think the sane thing to do is, whenever we have to
put something on the stack, create a new pseudo register pointing
directly to it.  We're not bad at optimizing with operands of the form
(mem:MODE (reg:P pseudo)), particularly when they have nonoverlapping
alias sets.  These pseudos are specially marked and do not get
allocated to real registers; instead, after we know everything that's
going onto the stack, we replace them all with
(mem:MODE (plus:P (reg:P base) (const_int offset))).

Graph-coloring allocation should work pretty well to lay out the stack
frame.  There's trouble if we need scratch registers to calculate some
of these base+offset values, but I think the infrastructure already
exists to deal with that.

zw
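
To make the stack-layout idea above concrete, here is a rough sketch in
Python.  It is not GCC code and does not use any GCC interface: the Slot
record, the instruction-index live ranges, layout_frame and frame_ref
are all hypothetical names invented for illustration.  It assumes each
stack pseudo has a known size, alignment and live range, treats two
pseudos as interfering when their live ranges overlap, and greedily
gives each pseudo the lowest aligned offset that does not collide with
an interfering pseudo.  That is a simple greedy stand-in for a real
graph-coloring allocator, and it ignores the scratch-register problem
entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Hypothetical stack pseudo: not a GCC data structure."""
    name: str
    size: int    # bytes
    align: int   # required alignment in bytes
    start: int   # first instruction index at which the slot is live
    end: int     # last instruction index at which it is live (inclusive)


def interferes(a, b):
    """Two slots interfere when their live ranges overlap."""
    return a.start <= b.end and b.start <= a.end


def layout_frame(slots):
    """Greedily assign byte offsets so interfering slots never overlap.

    Returns (offsets by slot name, total frame size in bytes).
    """
    offsets = {}
    # Place the largest slots first; ties are broken by name for determinism.
    for slot in sorted(slots, key=lambda s: (-s.size, s.name)):
        # Byte ranges already taken by slots that are live at the same time.
        busy = sorted(
            (offsets[o.name], offsets[o.name] + o.size)
            for o in slots
            if o.name in offsets and interferes(slot, o)
        )
        off = 0
        for lo, hi in busy:
            if off + slot.size <= lo:
                break  # the slot fits in the gap before this range
            # Otherwise skip past the range, rounding up to the alignment.
            off = max(off, -(-hi // slot.align) * slot.align)
        offsets[slot.name] = off
    frame_size = max((offsets[s.name] + s.size for s in slots), default=0)
    return offsets, frame_size


def frame_ref(mode, offset):
    """Render the final operand in the base+offset form quoted in the mail."""
    return f"(mem:{mode} (plus:P (reg:P base) (const_int {offset})))"


# Made-up example: a/c and b/d are never live at the same time.
demo = [
    Slot("a", 8, 8, 0, 3),
    Slot("b", 4, 4, 2, 5),
    Slot("c", 8, 8, 4, 9),
    Slot("d", 4, 4, 6, 9),
]
offsets, frame_size = layout_frame(demo)
print(offsets, frame_size)
print(frame_ref("SI", offsets["b"]))
```

In this made-up example a and c end up sharing offset 0 and b and d
share offset 8, so the frame is 12 bytes, where assigning the four
slots in linear order with no recycling would take 24.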