From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-87361-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 7925 invoked by alias); 14 Dec 2003 03:29:14 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 7918 invoked from network); 14 Dec 2003 03:29:12 -0000
Received: from unknown (HELO nondot.org) (128.174.245.159)
  by sources.redhat.com with SMTP; 14 Dec 2003 03:29:12 -0000
Received: by nondot.org (Postfix, from userid 501)
	id 438E317C035; Sat, 13 Dec 2003 22:31:00 -0500 (EST)
Received: from localhost (localhost [127.0.0.1])
	by nondot.org (Postfix) with ESMTP
	id 35E9C24C2AD; Sat, 13 Dec 2003 21:31:00 -0600 (CST)
Date: Sun, 14 Dec 2003 03:39:00 -0000
From: Chris Lattner <sabre@nondot.org>
To: Andrew MacLeod <amacleod@redhat.com>
Cc: Jeff Law <law@redhat.com>,
	Zdenek Dvorak <rakdver@atrey.karlin.mff.cuni.cz>,
	gcc mailing list <gcc@gcc.gnu.org>
Subject: Re: [tree-ssa] Lazy updating of stmt operands
In-Reply-To: <1071323157.3257.138.camel@p4>
Message-ID: <Pine.LNX.4.44.0312132121220.23347-100000@nondot.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2003-12/txt/msg00772.txt.bz2

> Its not the SSA space thats the memory consumption problem. gcc is now
> unit at a time, so the entire program is read into memory as trees and
> held there. the tree-ssa branch also uses GIMPLE, which creates more
> trees. Then you add even a lightweight SSA and this simply magnifies any
> additional memory that is used. Once the machine starts swapping, its
> another order of magnitude on timing.

Why exactly do you need all of these representations?  In LLVM, we
routinely optimize _entire_ reasonably large C++ programs (like LLVM
itself), without even coming _near_ swap.  Can't you get rid of some of
the duplicate information you are holding?

> So we are trying to watch memory very carefully.  That plus we are still
> working within the existing tree data structures. I beleive our
> representation of SSA is quite lightweight. It only becomes less so when
> you try to add a bunch of stuff that isnt needed everywhere.

It seems to me that you are trying to save memory in the wrong places, but
my knowledge of tree-ssa is pretty limited, so I could certainly be wrong.
I think that you should be looking for the fundamental gains, especially
at this stage of the project.  If you try to make any big changes when you
have lots of transformations written, it will only be harder.  If you
build on a non-sustainable foundation, and never change it, ultimately the
project will become unwieldy.  Perhaps trees are not the right approach.

Saving memory by not having basic things like use-def or def-use chains is
not the right approach either in my mind.

> > If you're trying to work around the problem with caches and recomputation
> > of information, I suspect that you are barking up the wrong tree.  That
> > said, I don't know exactly what's going on in the tree-ssa world, so there
> > may be complicating factors.
>
> The Cache isnt something we're trying to work around. Thats there to
> prevent us from having to muck around in trees looking for things which
> are operands. The stmt is in the form of trees, our "cache" is the
> equivilent of your instruction. It consists of the operands plus looking
> at the tree code(s) of the stmt.

I'm not saying that you are trying to work around the cache: I'm saying
the cache is a workaround.  If "mucking around in the trees looking for
things" is that expensive, you have other problems which the cache is only
disguising.

> So we have 1 word for each operand. It points to the SSA_NAME in the
> stmt tree which represents that ssa version.

How big is each "SSA_NAME"?  What is the _total allocated size_?  Has
anyone looked into what the aggregate footprint is of a simple plus node
with two SSA operands?

> As far as Im concerned, we're not working around anything. Everything
> works just fine. IF someone wants/needs the def->use inforation, it can
> be built today. What we dont have is the ability to keep it up to date
> for some period of time, simply because we've never needed it. If the
> time comes that we do need it, it is not hard to add.

You have to remember that my opinion is not worth very much, and that my
knowledge of tree-ssa is quite limited.  That said, I am qualified to say
that def-use information can be put to good use by _almost every
transformation_, and that the information should be _automatically kept
up-to-date_ by the infrastructure.

Doing it any other way will cause tremendous problems down the road.

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/