From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-194806-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 60008 invoked by alias); 18 Dec 2017 17:52:42 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 59989 invoked by uid 89); 18 Dec 2017 17:52:41 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: =?ISO-8859-1?Q?Yes, score=5.5 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA,INVALID_MSGID,KAM_LAZY_DOMAIN_SECURITY,TLD_CHINA autolearn=no version=3.3.2 spammy=19=e6, Images, H*f:sk:nngbmiw, H*i:sk:nngbmiw?=
X-HELO: eggs.gnu.org
Received: from eggs.gnu.org (HELO eggs.gnu.org) (208.118.235.92) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 18 Dec 2017 17:52:38 +0000
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)	(envelope-from <lesliezhai@llvm.org.cn>)	id 1eQzaS-0001F2-Fs	for gcc@gcc.gnu.org; Mon, 18 Dec 2017 12:52:37 -0500
Received: from smtpproxy19.qq.com ([184.105.206.84]:59525)	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)	(Exim 4.71)	(envelope-from <lesliezhai@llvm.org.cn>)	id 1eQzaS-00019J-5W	for gcc@gcc.gnu.org; Mon, 18 Dec 2017 12:52:24 -0500
X-QQ-mid: bizesmtp7t1513619535tz5alzzww
Received: from [192.168.1.10] (unknown [114.252.40.196])	by esmtp4.qq.com (ESMTP) with 	id ; Tue, 19 Dec 2017 01:52:14 +0800 (CST)
X-QQ-SSF: 01100000002000F0FI50000A0000000
X-QQ-FEAT: ojxDDUWr5wZ0RJRxNljMWoyHOwjANbZEdZSiJ7YdCvaHVTWPPxq5j2Hoy4yzf	jLt2l7Gqjmp52LK+OaZvPpUlA3Icg+Wn+nQaVQnv316pJIps33vjp2h5OnGY7D7u2HyCEY+	dwwhIT6IPLriy4qlOROu4IOM/PpQsk8Z8WniYJi1rhxSeYBhOAIMkc/HZdJNn/SfOsA7whN	gQd4TSHRD4MlmWCtv5AAeR7fHyxPnF58F/cwRUHwnrnolyP7wkfoYXgOd/0bMUdfa1UWdmf	LUq7oto+scC2x/ovZhFgRKx4w=
X-QQ-GoodBg: 0
Subject: Re: Register Allocation Graph Coloring algorithm and Others
To: dag@cray.com, vmakarov@redhat.com, LewisR9@cf.ac.uk, stoklund@2pi.dk
Cc: GCC Development <gcc@gcc.gnu.org>, LLVM Developers Mailing List <llvm-dev@lists.llvm.org>
References: <a679c6c3-fb97-3efe-130c-af307a79ea2e@llvm.org.cn> <nngbmiwc7je.fsf@lnx-dag.us.cray.com>
From: Leslie Zhai <lesliezhai@llvm.org.cn>
Message-ID: <a1e8dd77-b30d-4b69-6b45-bf68fad8556f@llvm.org.cn>+9CBFCADA170A6854
Date: Mon, 18 Dec 2017 17:52:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <nngbmiwc7je.fsf@lnx-dag.us.cray.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-QQ-SENDSIZE: 520
Feedback-ID: bizesmtp:llvm.org.cn:qybgforeign:qybgforeign4
X-QQ-Bgrelay: 1
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x [fuzzy]
X-Received-From: 184.105.206.84
X-SW-Source: 2017-12/txt/msg00108.txt.bz2

Hi David,

Thanks for your teaching!

I am a newbie in compiler area, I only learned Compiler Principle in 
2002 https://www.leetcode.cn/2017/12/ilove-compiler-principle.html

But I like to practice and learn :) 
https://github.com/xiangzhai/llvm/blob/avr/lib/CodeGen/RegAllocGraphColoring.cpp#L327 
because theory is not always correct, or misunderstood by people, so I 
want to compare solutionByHEA, IRA, Greedy, PBQP and other algorithms.

Thanks for your lessons to correct my mistakes, such as memory=bad 
register=good, and I need to find the answer *when* memory is worse than 
a register, I am maintaining AVR target, there are 32 general registers, 
32K flash, 2K sram 
http://www.atmel.com/Images/Atmel-42735-8-bit-AVR-Microcontroller-ATmega328-328P_Datasheet.pdf 
so perhaps to MCU, memory might be expensive than register? but what 
about AMDGPU or VLIW processors? I don't have experienced on them, 
please teach me.

I am reading LLVM's code SpillXXX, LiveRangeXXX, RegisterCoalescer, etc. 
to get the whole view of CodeGen.

I am reading Dr. Rhydian Lewis's book: A Guide to Graph Colouring: 
Algorithms and Applications 
http://www.springer.com/us/book/9783319257280Â Â  and other papers, even 
if HEA is not the best solution, I still want to practice and see the 
benchmark, I am not computing professionals, I am only 34 olds, perhaps 
I have enough time to waste :)


å¨ 2017å¹´12æ19æ¥ 00:41, dag@cray.com åé:
> Leslie Zhai <lesliezhai@llvm.org.cn> writes:
>
>> * Memory (20 - 100 cycles) is expensive than Register (1 cycle), but
>> it has to spill code when PhysReg is unavailable
> As Vladimir said, the cache makes this kind of analysis much more
> tricky.  It's not necessarily the case that memory=bad and
> register=good.  Since there are tradeoffs involved, one needs to
> evaluate different strategies to determine *when* memory is worse than a
> register.  It may very well be the case that leaving something in memory
> frees up a register for something much more important to use it.  All of
> the register allocation algorithms try to determine this kind of thing
> through various heuristics.  Which heuristic is most effective is highly
> target-dependent.
>
> In my experience, changing heuristics can swing performance 20% or more
> in some cases.  Today's processors and memory systems are so complex
> that 2nd- and even 3rd-order effects become important.
>
> It is very, very wrong on today's machines to use # of spills as a
> metric to determine the "goodness" of an allocation.  Determining *what*
> to spill is much more important than the raw number of spills.  Many
> times I have a seen codes generated with more spills perform much better
> than the code generated with fewer spills.  Almost all of the papers
> around the time of Chaiten-Briggs used # of spills as the metric.  That
> may have been appropriate at that time but take those results with giant
> grains of salt today.  Of course they are still very good papers for
> understanding algorithms and heuristics.
>
> The best way I know to determine what's best for your target is to run a
> whole bunch of *real* codes on them, trying different allocation
> algorithms and heuristics.  It is a lot of work, still worthy of a
> Ph.D. even today.  Register allocation is not a "solved" problem and
> probably never will be as architectures continue to change and become
> ever more diverse.
>
>> * Folding spill code into instructions, handling register coallescing,
>> splitting live ranges, doing rematerialization, doing shrink wrapping
>> are harder than RegAlloc
> Again, as Vladimir said, they are all part of register allocation.
> Sometimes they are implemented as separate passes but logically they all
> contribute work to the task of assigning registers as efficiently as
> possible.  And all of them use heuristics.  Choosing when and when not
> to, for example, coalesce can be important.  Splitting live ranges is
> logically the opposite of coalescing.  Theoretically one can "undo" a
> bad coalescing decision by re-splitting the live range but it's usually
> not that simple as other decisions in the interim can make that tricky
> if not impossible.  It is a delicate balancing act.
>
>> * The papers by Briggs and Chaiten contradict[2] themselves when
>> examine the text of the paper vs. the pseudocode provided?
> As others have said, I don't recall any inconsistencies but that doesn't
> mean there aren't bugs and/or incomplete descriptions open to
> interpretation.  One of my biggest issues with our computer academic
> culture is that we do not value repeatability.  It is virtually
> impossible to take a paper, implement the algorithm as described (or as
> best as possible given ambiguity) and obtain the same results.  Heck,
> half my Ph.D. dissertation was dissecting a published paper, examining
> the sensitivity of the described algorithm to various parts of the
> described heuristic that were ambiguous.  By interpreting the heuristic
> description in different ways I observed very different results.  I read
> papers for the algorithms, heuristics and ideas in them.  I pay no
> attention to results because in the real world we have to implement the
> ideas and test them ourselves to see if they will help us.
>
> Peter is right to point you to Preston.  He is very accessible, friendly
> and helpful.  I had the good fortune to work with him for a few years
> and he taught me a lot.  He has much real-world experience on codes
> actually used in production.  That experience is gold.
>
> Good luck to you!  You're entering a world that some computing
> professionals think is a waste of time because "we already know how to
> do that."  Those of us in the trenches know better.  :)
>
>                                 -David

-- 
Regards,
Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/