Subject: Re: Register Allocation Graph Coloring algorithm and Others
From:
To: Leslie Zhai
CC: GCC Development, LLVM Developers Mailing List
Date: Mon, 18 Dec 2017 16:42:00 -0000

Leslie Zhai writes:

> * Memory (20 - 100 cycles) is more expensive than Register (1 cycle),
>   but spill code has to be emitted when no PhysReg is available

As Vladimir said, the cache makes this kind of analysis much more
tricky.  It's not necessarily the case that memory=bad and
register=good.  Since there are tradeoffs involved, one needs to
evaluate different strategies to determine *when* memory is worse than
a register.  It may very well be the case that leaving something in
memory frees up a register for something much more important.

All of the register allocation algorithms try to determine this kind
of thing through various heuristics.  Which heuristic is most
effective is highly target-dependent.  In my experience, changing
heuristics can swing performance 20% or more in some cases.

Today's processors and memory systems are so complex that 2nd- and
even 3rd-order effects become important.  It is very, very wrong on
today's machines to use # of spills as a metric to determine the
"goodness" of an allocation.  Determining *what* to spill is much more
important than the raw number of spills.  Many times I have seen code
generated with more spills perform much better than code generated
with fewer spills.

Almost all of the papers around the time of Chaitin-Briggs used # of
spills as the metric.  That may have been appropriate at the time, but
take those results with giant grains of salt today.  Of course they
are still very good papers for understanding the algorithms and
heuristics.

The best way I know to determine what's best for your target is to run
a whole bunch of *real* codes on it, trying different allocation
algorithms and heuristics.  It is a lot of work, still worthy of a
Ph.D. even today.  Register allocation is not a "solved" problem and
probably never will be, as architectures continue to change and become
ever more diverse.
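To make the point about choosing *what* to spill concrete, here is a
minimal, hypothetical sketch of a Chaitin/Briggs-style spill-cost
heuristic.  The names (LiveRange, spillCost, pickSpillCandidate) are
invented for illustration and are not GCC's or LLVM's actual code; the
weighting (references times 10^loop-depth, divided by interference
degree) is the classic textbook form, not any particular target's
tuned heuristic.

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    // Hypothetical per-candidate summary; illustrative only.
    struct LiveRange {
      const char *name;
      unsigned defsAndUses;   // static count of defs + uses
      unsigned maxLoopDepth;  // deepest loop nesting among those references
      unsigned degree;        // neighbors in the interference graph
    };

    // Chaitin/Briggs-style spill cost: references weighted by 10^depth
    // as a crude stand-in for execution frequency.  A range referenced
    // inside a hot loop is far more expensive to spill than one only
    // touched in cold straight-line code, even if the latter has more
    // references.
    double spillCost(const LiveRange &lr) {
      return lr.defsAndUses * std::pow(10.0, lr.maxLoopDepth);
    }

    // Prefer the candidate with the lowest cost/degree ratio: cheap to
    // spill and likely to make many neighbors colorable.  Assumes a
    // non-empty candidate list.
    std::size_t pickSpillCandidate(const std::vector<LiveRange> &cands) {
      std::size_t best = 0;
      for (std::size_t i = 1; i < cands.size(); ++i) {
        double a = spillCost(cands[i]) / std::max(1u, cands[i].degree);
        double b = spillCost(cands[best]) / std::max(1u, cands[best].degree);
        if (a < b) best = i;
      }
      return best;
    }

    int main() {
      std::vector<LiveRange> cands = {
        {"hot",  4, /*loop depth*/ 2, /*degree*/ 6},  // cost 400
        {"cold", 9, /*loop depth*/ 0, /*degree*/ 5},  // cost 9
      };
      // Spilling "cold" emits more spill instructions statically, but
      // it is the better choice: "hot" would reload on every iteration.
      std::printf("spill %s\n", cands[pickSpillCandidate(cands)].name);
      return 0;
    }

Swapping in profile-based frequencies instead of 10^depth, or changing
how degree enters the ratio, is exactly the kind of heuristic knob that
turns out to be highly target-dependent.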
> * Folding spill code into instructions, handling register
>   coalescing, splitting live ranges, doing rematerialization, doing
>   shrink wrapping are harder than RegAlloc

Again, as Vladimir said, they are all part of register allocation.
Sometimes they are implemented as separate passes, but logically they
all contribute work to the task of assigning registers as efficiently
as possible.  And all of them use heuristics.  Choosing when and when
not to coalesce, for example, can be important.  Splitting live ranges
is logically the opposite of coalescing.  Theoretically one can "undo"
a bad coalescing decision by re-splitting the live range, but it's
usually not that simple, as other decisions made in the interim can
make that tricky if not impossible.  It is a delicate balancing act.
(A rough sketch of one common conservative-coalescing test appears
after the signature.)

> * The papers by Briggs and Chaitin contradict[2] themselves when you
>   examine the text of the paper vs. the pseudocode provided?

As others have said, I don't recall any inconsistencies, but that
doesn't mean there aren't bugs and/or incomplete descriptions open to
interpretation.  One of my biggest issues with our computer-science
academic culture is that we do not value repeatability.  It is
virtually impossible to take a paper, implement the algorithm as
described (or as best as possible given the ambiguity) and obtain the
same results.  Heck, half my Ph.D. dissertation was dissecting a
published paper, examining the sensitivity of the described algorithm
to various parts of the heuristic that were ambiguous.  By
interpreting the heuristic description in different ways, I observed
very different results.

I read papers for the algorithms, heuristics and ideas in them.  I pay
no attention to the results, because in the real world we have to
implement the ideas and test them ourselves to see whether they will
help us.

Peter is right to point you to Preston.  He is very accessible,
friendly and helpful.  I had the good fortune to work with him for a
few years and he taught me a lot.  He has much real-world experience
with codes actually used in production.  That experience is gold.

Good luck to you!  You're entering a world that some computing
professionals think is a waste of time because "we already know how to
do that."  Those of us in the trenches know better.  :)

-David
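The conservative-coalescing sketch referenced above, under the usual
Briggs formulation: merge two copy-related live ranges only if the
combined node would have fewer than k neighbors of significant degree,
so the merge can never make the graph harder to color.  The Node type
and canCoalesceBriggs name are hypothetical, invented for illustration.

    #include <cstddef>
    #include <set>
    #include <vector>

    // Hypothetical interference-graph node; illustrative only.
    struct Node {
      std::set<int> neighbors;  // ids of interfering live ranges
    };

    // Briggs-style conservative test for k physical registers:
    // coalescing a and b is allowed only if the merged node would have
    // fewer than k neighbors of "significant" degree (>= k).  Such a
    // node is guaranteed to simplify later, so the merge cannot
    // introduce a new spill.
    bool canCoalesceBriggs(const std::vector<Node> &graph,
                           int a, int b, unsigned k) {
      std::set<int> merged = graph[a].neighbors;
      merged.insert(graph[b].neighbors.begin(), graph[b].neighbors.end());
      merged.erase(a);
      merged.erase(b);

      unsigned significant = 0;
      for (int n : merged) {
        std::size_t deg = graph[n].neighbors.size();
        // A neighbor of both a and b loses one edge when the two merge.
        if (graph[n].neighbors.count(a) && graph[n].neighbors.count(b))
          --deg;
        if (deg >= k)
          ++significant;
      }
      return significant < k;
    }

A test like this deliberately refuses some profitable merges; more
aggressive schemes coalesce first and rely on later live-range
splitting to recover, which is the balancing act described above.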