From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 95271 invoked by alias); 19 Dec 2017 03:03:16 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 93452 invoked by uid 89); 19 Dec 2017 03:03:15 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: Yes, score=5.5 required=5.0 tests=BAYES_00,FORGED_MUA_MOZILLA,INVALID_MSGID,KAM_LAZY_DOMAIN_SECURITY,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,TLD_CHINA autolearn=no version=3.3.2 spammy=UD:leetcode.cn, UD:www.leetcode.cn, www.leetcode.cn, wwwleetcodecn X-HELO: smtpbg202.qq.com Received: from smtpbg202.qq.com (HELO smtpbg202.qq.com) (184.105.206.29) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 19 Dec 2017 03:03:08 +0000 X-QQ-mid: bizesmtp15t1513652582tbbh5sqc Received: from iSOFT.localdomain (unknown [114.247.223.226]) by esmtp4.qq.com (ESMTP) with id ; Tue, 19 Dec 2017 11:03:01 +0800 (CST) X-QQ-SSF: 00100000002000F0FI50B00A0000000 X-QQ-FEAT: bwgMLqpyqeLhwidMEouM6GGmwR+jvzsW5aGUFtkQLdwuR7Zeb9LyFnTm9qGiL 0MY30wrcWZf+KV9pcvUmNZ7XLwF64EHpmF9RtLP4e18rRperOavmuOBj0okZo3t9wd9TZzU 42Tz8pl5RpopQwbjn0nh/EAyuOgwGvQbcLTOw88bKucHBNMI1Cy2Gr7pvRSjtKUy+631s8/ XhOmTSkbCCuDcA43fIGMbkh8jCNgQe873Fo3vFJyW+gTTFXmNwumIdkOsawKerkfQmSym0N vVJMB0KimilJ5MCpipB+AGB+eHH/tbE8xYdw== X-QQ-GoodBg: 0 Subject: Re: [llvm-dev] Register Allocation Graph Coloring algorithm and Others To: Matthias Braun Cc: dag@cray.com, vmakarov@redhat.com, LewisR9@cf.ac.uk, stoklund@2pi.dk, LLVM Developers Mailing List , GCC Development References: <9CBFCADA170A6854@apple.com> <6708A731-203E-4096-8137-85D4104DA035@apple.com> From: Leslie Zhai Message-ID: <80f8f21c-7e07-903c-7557-09ef186ad3ef@llvm.org.cn>+F0540E70067D1E43 Date: Tue, 19 Dec 2017 03:03:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: <6708A731-203E-4096-8137-85D4104DA035@apple.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:llvm.org.cn:qybgforeign:qybgforeign2 X-QQ-Bgrelay: 1 X-SW-Source: 2017-12/txt/msg00118.txt.bz2 Hi Matthias, Thanks for your hint! It is just for learning and practicing for me, just like migrate DragonEgg http://lists.llvm.org/pipermail/llvm-dev/2017-September/117201.html the motivating is for learning from GCC and LLVM developers. 在 2017年12月19日 10:07, Matthias Braun 写道: > > >> On Dec 18, 2017, at 9:52 AM, Leslie Zhai via llvm-dev >> > wrote: >> >> Hi David, >> >> Thanks for your teaching! >> >> I am a newbie in compiler area, I only learned Compiler Principle in >> 2002https://www.leetcode.cn/2017/12/ilove-compiler-principle.html >> >> But I like to practice and learn >> :)https://github.com/xiangzhai/llvm/blob/avr/lib/CodeGen/RegAllocGraphColoring.cpp#L327because >> theory is not always correct, or misunderstood by people, so I want >> to compare solutionByHEA, IRA, Greedy, PBQP and other algorithms. > > Just as another tip: > - Indeed in my experience: Just implementing some algorithms yourself > and comparing them against what existing compilers produce and then > figuring out why is the best way to learn about allocators. > - Don't just put emphasis on all the different different coloring > techniques. In practice it is usually the way you deal register > constraints and other target specific coloring constraints, > rematerialization and how you get coalescing into the picture. Most > regalloc papers don't talk too much about that as it's highly finicky > and often target specific. But they do have a huge impact on > allocation quality and can make or break any of those algorithms... > > - Matthias > >> >> Thanks for your lessons to correct my mistakes, such as memory=bad >> register=good, and I need to find the answer *when* memory is worse >> than a register, I am maintaining AVR target, there are 32 general >> registers, 32K flash, 2K >> sramhttp://www.atmel.com/Images/Atmel-42735-8-bit-AVR-Microcontroller-ATmega328-328P_Datasheet.pdfso >> perhaps to MCU, memory might be expensive than register? but what >> about AMDGPU or VLIW processors? I don't have experienced on them, >> please teach me. >> >> I am reading LLVM's code SpillXXX, LiveRangeXXX, RegisterCoalescer, >> etc. to get the whole view of CodeGen. >> >> I am reading Dr. Rhydian Lewis's book: A Guide to Graph Colouring: >> Algorithms and >> Applicationshttp://www.springer.com/us/book/9783319257280   and other >> papers, even if HEA is not the best solution, I still want to >> practice and see the benchmark, I am not computing professionals, I >> am only 34 olds, perhaps I have enough time to waste :) >> >> >> 在 2017年12月19日 00:41,dag@cray.com 写道: >>> Leslie Zhai > >>> writes: >>> >>>> * Memory (20 - 100 cycles) is expensive than Register (1 cycle), but >>>> it has to spill code when PhysReg is unavailable >>> As Vladimir said, the cache makes this kind of analysis much more >>> tricky.  It's not necessarily the case that memory=bad and >>> register=good.  Since there are tradeoffs involved, one needs to >>> evaluate different strategies to determine *when* memory is worse than a >>> register.  It may very well be the case that leaving something in memory >>> frees up a register for something much more important to use it.  All of >>> the register allocation algorithms try to determine this kind of thing >>> through various heuristics.  Which heuristic is most effective is highly >>> target-dependent. >>> >>> In my experience, changing heuristics can swing performance 20% or more >>> in some cases.  Today's processors and memory systems are so complex >>> that 2nd- and even 3rd-order effects become important. >>> >>> It is very, very wrong on today's machines to use # of spills as a >>> metric to determine the "goodness" of an allocation.  Determining *what* >>> to spill is much more important than the raw number of spills.  Many >>> times I have a seen codes generated with more spills perform much better >>> than the code generated with fewer spills.  Almost all of the papers >>> around the time of Chaiten-Briggs used # of spills as the metric.  That >>> may have been appropriate at that time but take those results with giant >>> grains of salt today.  Of course they are still very good papers for >>> understanding algorithms and heuristics. >>> >>> The best way I know to determine what's best for your target is to run a >>> whole bunch of *real* codes on them, trying different allocation >>> algorithms and heuristics.  It is a lot of work, still worthy of a >>> Ph.D. even today.  Register allocation is not a "solved" problem and >>> probably never will be as architectures continue to change and become >>> ever more diverse. >>> >>>> * Folding spill code into instructions, handling register coallescing, >>>> splitting live ranges, doing rematerialization, doing shrink wrapping >>>> are harder than RegAlloc >>> Again, as Vladimir said, they are all part of register allocation. >>> Sometimes they are implemented as separate passes but logically they all >>> contribute work to the task of assigning registers as efficiently as >>> possible.  And all of them use heuristics.  Choosing when and when not >>> to, for example, coalesce can be important.  Splitting live ranges is >>> logically the opposite of coalescing.  Theoretically one can "undo" a >>> bad coalescing decision by re-splitting the live range but it's usually >>> not that simple as other decisions in the interim can make that tricky >>> if not impossible.  It is a delicate balancing act. >>> >>>> * The papers by Briggs and Chaiten contradict[2] themselves when >>>> examine the text of the paper vs. the pseudocode provided? >>> As others have said, I don't recall any inconsistencies but that doesn't >>> mean there aren't bugs and/or incomplete descriptions open to >>> interpretation.  One of my biggest issues with our computer academic >>> culture is that we do not value repeatability.  It is virtually >>> impossible to take a paper, implement the algorithm as described (or as >>> best as possible given ambiguity) and obtain the same results.  Heck, >>> half my Ph.D. dissertation was dissecting a published paper, examining >>> the sensitivity of the described algorithm to various parts of the >>> described heuristic that were ambiguous.  By interpreting the heuristic >>> description in different ways I observed very different results.  I read >>> papers for the algorithms, heuristics and ideas in them.  I pay no >>> attention to results because in the real world we have to implement the >>> ideas and test them ourselves to see if they will help us. >>> >>> Peter is right to point you to Preston.  He is very accessible, friendly >>> and helpful.  I had the good fortune to work with him for a few years >>> and he taught me a lot.  He has much real-world experience on codes >>> actually used in production.  That experience is gold. >>> >>> Good luck to you!  You're entering a world that some computing >>> professionals think is a waste of time because "we already know how to >>> do that."  Those of us in the trenches know better.  :) >>> >>>                                -David >> >> -- >> Regards, >> Leslie Zhai -https://reviews.llvm.org/p/xiangzhai/ >> >> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev@lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > -- Regards, Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/