From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 60008 invoked by alias); 18 Dec 2017 17:52:42 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 59989 invoked by uid 89); 18 Dec 2017 17:52:41 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: =?ISO-8859-1?Q?Yes, score=5.5 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA,INVALID_MSGID,KAM_LAZY_DOMAIN_SECURITY,TLD_CHINA autolearn=no version=3.3.2 spammy=19=e6, Images, H*f:sk:nngbmiw, H*i:sk:nngbmiw?= X-HELO: eggs.gnu.org Received: from eggs.gnu.org (HELO eggs.gnu.org) (208.118.235.92) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 18 Dec 2017 17:52:38 +0000 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzaS-0001F2-Fs for gcc@gcc.gnu.org; Mon, 18 Dec 2017 12:52:37 -0500 Received: from smtpproxy19.qq.com ([184.105.206.84]:59525) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1eQzaS-00019J-5W for gcc@gcc.gnu.org; Mon, 18 Dec 2017 12:52:24 -0500 X-QQ-mid: bizesmtp7t1513619535tz5alzzww Received: from [192.168.1.10] (unknown [114.252.40.196]) by esmtp4.qq.com (ESMTP) with id ; Tue, 19 Dec 2017 01:52:14 +0800 (CST) X-QQ-SSF: 01100000002000F0FI50000A0000000 X-QQ-FEAT: ojxDDUWr5wZ0RJRxNljMWoyHOwjANbZEdZSiJ7YdCvaHVTWPPxq5j2Hoy4yzf jLt2l7Gqjmp52LK+OaZvPpUlA3Icg+Wn+nQaVQnv316pJIps33vjp2h5OnGY7D7u2HyCEY+ dwwhIT6IPLriy4qlOROu4IOM/PpQsk8Z8WniYJi1rhxSeYBhOAIMkc/HZdJNn/SfOsA7whN gQd4TSHRD4MlmWCtv5AAeR7fHyxPnF58F/cwRUHwnrnolyP7wkfoYXgOd/0bMUdfa1UWdmf LUq7oto+scC2x/ovZhFgRKx4w= X-QQ-GoodBg: 0 Subject: Re: Register Allocation Graph Coloring algorithm and Others To: dag@cray.com, vmakarov@redhat.com, LewisR9@cf.ac.uk, stoklund@2pi.dk Cc: GCC Development , LLVM Developers Mailing List References: From: Leslie Zhai Message-ID: +9CBFCADA170A6854 Date: Mon, 18 Dec 2017 17:52:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:llvm.org.cn:qybgforeign:qybgforeign4 X-QQ-Bgrelay: 1 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x [fuzzy] X-Received-From: 184.105.206.84 X-SW-Source: 2017-12/txt/msg00108.txt.bz2 Hi David, Thanks for your teaching! I am a newbie in compiler area, I only learned Compiler Principle in 2002 https://www.leetcode.cn/2017/12/ilove-compiler-principle.html But I like to practice and learn :) https://github.com/xiangzhai/llvm/blob/avr/lib/CodeGen/RegAllocGraphColoring.cpp#L327 because theory is not always correct, or misunderstood by people, so I want to compare solutionByHEA, IRA, Greedy, PBQP and other algorithms. Thanks for your lessons to correct my mistakes, such as memory=bad register=good, and I need to find the answer *when* memory is worse than a register, I am maintaining AVR target, there are 32 general registers, 32K flash, 2K sram http://www.atmel.com/Images/Atmel-42735-8-bit-AVR-Microcontroller-ATmega328-328P_Datasheet.pdf so perhaps to MCU, memory might be expensive than register? but what about AMDGPU or VLIW processors? I don't have experienced on them, please teach me. I am reading LLVM's code SpillXXX, LiveRangeXXX, RegisterCoalescer, etc. to get the whole view of CodeGen. I am reading Dr. Rhydian Lewis's book: A Guide to Graph Colouring: Algorithms and Applications http://www.springer.com/us/book/9783319257280   and other papers, even if HEA is not the best solution, I still want to practice and see the benchmark, I am not computing professionals, I am only 34 olds, perhaps I have enough time to waste :) 在 2017年12月19日 00:41, dag@cray.com 写道: > Leslie Zhai writes: > >> * Memory (20 - 100 cycles) is expensive than Register (1 cycle), but >> it has to spill code when PhysReg is unavailable > As Vladimir said, the cache makes this kind of analysis much more > tricky. It's not necessarily the case that memory=bad and > register=good. Since there are tradeoffs involved, one needs to > evaluate different strategies to determine *when* memory is worse than a > register. It may very well be the case that leaving something in memory > frees up a register for something much more important to use it. All of > the register allocation algorithms try to determine this kind of thing > through various heuristics. Which heuristic is most effective is highly > target-dependent. > > In my experience, changing heuristics can swing performance 20% or more > in some cases. Today's processors and memory systems are so complex > that 2nd- and even 3rd-order effects become important. > > It is very, very wrong on today's machines to use # of spills as a > metric to determine the "goodness" of an allocation. Determining *what* > to spill is much more important than the raw number of spills. Many > times I have a seen codes generated with more spills perform much better > than the code generated with fewer spills. Almost all of the papers > around the time of Chaiten-Briggs used # of spills as the metric. That > may have been appropriate at that time but take those results with giant > grains of salt today. Of course they are still very good papers for > understanding algorithms and heuristics. > > The best way I know to determine what's best for your target is to run a > whole bunch of *real* codes on them, trying different allocation > algorithms and heuristics. It is a lot of work, still worthy of a > Ph.D. even today. Register allocation is not a "solved" problem and > probably never will be as architectures continue to change and become > ever more diverse. > >> * Folding spill code into instructions, handling register coallescing, >> splitting live ranges, doing rematerialization, doing shrink wrapping >> are harder than RegAlloc > Again, as Vladimir said, they are all part of register allocation. > Sometimes they are implemented as separate passes but logically they all > contribute work to the task of assigning registers as efficiently as > possible. And all of them use heuristics. Choosing when and when not > to, for example, coalesce can be important. Splitting live ranges is > logically the opposite of coalescing. Theoretically one can "undo" a > bad coalescing decision by re-splitting the live range but it's usually > not that simple as other decisions in the interim can make that tricky > if not impossible. It is a delicate balancing act. > >> * The papers by Briggs and Chaiten contradict[2] themselves when >> examine the text of the paper vs. the pseudocode provided? > As others have said, I don't recall any inconsistencies but that doesn't > mean there aren't bugs and/or incomplete descriptions open to > interpretation. One of my biggest issues with our computer academic > culture is that we do not value repeatability. It is virtually > impossible to take a paper, implement the algorithm as described (or as > best as possible given ambiguity) and obtain the same results. Heck, > half my Ph.D. dissertation was dissecting a published paper, examining > the sensitivity of the described algorithm to various parts of the > described heuristic that were ambiguous. By interpreting the heuristic > description in different ways I observed very different results. I read > papers for the algorithms, heuristics and ideas in them. I pay no > attention to results because in the real world we have to implement the > ideas and test them ourselves to see if they will help us. > > Peter is right to point you to Preston. He is very accessible, friendly > and helpful. I had the good fortune to work with him for a few years > and he taught me a lot. He has much real-world experience on codes > actually used in production. That experience is gold. > > Good luck to you! You're entering a world that some computing > professionals think is a waste of time because "we already know how to > do that." Those of us in the trenches know better. :) > > -David -- Regards, Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/