From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-194815-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 69277 invoked by alias); 19 Dec 2017 02:10:43 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Subject: Re: Register Allocation Graph Coloring algorithm and Others
To: Michael Clark <michaeljclark@mac.com>
Cc: vmakarov@redhat.com, LewisR9@cf.ac.uk, dag@cray.com, stoklund@2pi.dk, GCC Development <gcc@gcc.gnu.org>, LLVM Developers Mailing List <llvm-dev@lists.llvm.org>
References: <615F0DCE4D5873A9@mac.com> <62E41783-1145-4036-9FD4-75D3B0C22DE6@mac.com>
From: Leslie Zhai <lesliezhai@llvm.org.cn>
Message-ID: <acfb293a-a03d-ff74-ef03-d70c3d6caa77@llvm.org.cn>+A63659168FB63981
Date: Tue, 19 Dec 2017 02:10:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
MIME-Version: 1.0
In-Reply-To: <62E41783-1145-4036-9FD4-75D3B0C22DE6@mac.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
X-SW-Source: 2017-12/txt/msg00117.txt.bz2

Hi Michael,

Thanks for sharing!

I will read the papers you suggested. I also asked Quantum Computing 
professionals about solving the NP-complete problem 
https://github.com/epiqc/ScaffCC/issues/14 but GCC's IRA and LRA proved 
that there is a solution in SSA form for classical computers.

Sorting MachineBasicBlocks by frequency first is a good idea. So far I 
only have experience with a few passes, such as LoopUnroll 
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118419.html but I 
will try your idea in my solutionByHEA 
https://github.com/xiangzhai/llvm/blob/avr/lib/CodeGen/RegAllocGraphColoring.cpp#L327

I have some experience with binary translation from X86 to MIPS, where 
QEMU is expensive: https://www.leetcode.cn/2017/09/tcg-ir-to-llvm-ir.html 
I will bookmark your JIT engine on my blog :)


On 2017-12-19 08:07, Michael Clark wrote:
> Hi Leslie,
>
> I suggest adding these 3 papers to your reading list.
>
> 	Register allocation for programs in SSA-form
> 	Sebastian Hack, Daniel Grund, and Gerhard Goos
> 	http://www.rw.cdl.uni-saarland.de/~grund/papers/cc06-ra_ssa.pdf
>
> 	Simple and Efficient Construction of Static Single Assignment Form
> 	Matthias Braun, Sebastian Buchwald, Sebastian Hack, Roland Leißa, Christoph Mallon, and Andreas Zwinkau
> 	https://www.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf
>
> 	Optimal register allocation for SSA-form programs in polynomial time
> 	Sebastian Hack, Gerhard Goos
> 	http://web.cs.ucla.edu/~palsberg/course/cs232/papers/HackGoos-ipl06.pdf
>
> A lot of the earlier literature on register allocation describes the general graph colouring problem as NP-complete. However, previous research on register allocation was in the context of programs that were not in SSA form; e.g. the Chaitin-Briggs paper states that register allocation is NP-complete and proposes an iterative algorithm.
>
> If one reads Hack and Goos, there is the discovery that programs in SSA form have chordal interference graphs, and chordal graphs can be coloured in polynomial time. After the cost of building the interference graph, it is possible to perform register allocation in a single pass. The key is in not modifying the graph.
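
A minimal sketch of the chordal colouring idea above, assuming the interference graph is already built and really is chordal. The function names and the example graph are hypothetical and not taken from the papers or from any compiler. It uses maximum cardinality search to get an ordering whose reverse is a perfect elimination ordering, then colours greedily in that order without modifying the graph:

```python
def mcs_order(graph):
    """Maximum cardinality search: repeatedly visit the unvisited vertex
    that has the most already-visited neighbours."""
    weight = {v: 0 for v in graph}
    order = []
    while weight:
        v = max(weight, key=weight.get)
        order.append(v)
        del weight[v]
        for n in graph[v]:
            if n in weight:
                weight[n] += 1
    return order


def color_chordal(graph):
    """Greedy colouring in MCS order; optimal when the graph is chordal."""
    color = {}
    for v in mcs_order(graph):
        used = {color[n] for n in graph[v] if n in color}
        c = 0
        while c in used:  # lowest colour not used by a coloured neighbour
            c += 1
        color[v] = c
    return color


# Hypothetical interference graph: two triangles sharing the edge b-c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
colors = color_chordal(graph)
print(colors)
```

For this graph the largest clique has three vertices, so three colours (registers) are needed and the greedy pass uses exactly three. On a non-chordal graph the same code still produces a valid colouring, but it is no longer guaranteed to be optimal.
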
>
> If one has a frequency for each basic block, then one can sort basic blocks by frequency, allocating the highest frequency blocks first, and where possible assigning the same physical register to all virtual registers in each phi node (to avoid copies). At program points where the live set is larger than k, the number of physical registers, one spills the register whose next use is furthest away. At least that is how I am thinking about this problem. I am also a novice with regards to register allocation.
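
The spill rule above (evict the register whose next use is furthest away) can be sketched as follows. This is only an illustration under strong assumptions: straight-line code in a single block, no phi nodes, no block frequencies, and k at least as large as the number of operands of any one instruction. The names `next_use`, `allocate` and `uses` are hypothetical:

```python
def next_use(uses, start, reg):
    """Index of the first instruction at or after `start` that uses `reg`,
    or infinity if it is never used again."""
    for i in range(start, len(uses)):
        if reg in uses[i]:
            return i
    return float("inf")


def allocate(uses, k):
    """Walk the instructions in order, keeping at most k virtual registers
    resident. Returns a list of (instruction index, spilled register)."""
    in_regs = set()
    spills = []
    for i, needed in enumerate(uses):
        for reg in needed:
            if reg in in_regs:
                continue
            if len(in_regs) >= k:
                # Never evict an operand of the current instruction.
                candidates = in_regs - set(needed)
                victim = max(sorted(candidates),
                             key=lambda r: next_use(uses, i + 1, r))
                in_regs.remove(victim)
                spills.append((i, victim))
            in_regs.add(reg)
    return spills


# Hypothetical instruction stream: each entry lists the virtual
# registers that the instruction reads or writes.
uses = [["a"], ["b"], ["c"], ["a"], ["b"]]
spills = allocate(uses, k=2)
print(spills)
```

At instruction 2 both registers are occupied; `a` is needed again at instruction 3 and `b` only at instruction 4, so `b` is the first register spilled.
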
>
> I intend to develop a register allocator for use in this JIT engine:
> 	
> 	rv8: a high performance RISC-V to x86 binary translator
> 	Michael Clark, Bruce Hoult
> 	https://carrv.github.io/papers/clark-rv8-carrv2017.pdf
>
> Our JIT already performs almost twice as fast as QEMU and we are using a static register allocation, and QEMU I believe has a register allocator. We are mapping a 31 register RISC architecture to a 16 register CISC architecture. I have statistics on the current slow-down, and it appears to mainly be L1 accesses to spilled registers. I believe after we have implemented a register allocator in our JIT we will be able to run RISC-V binaries at near native performance on x86-64. In fact given we are performing runtime profile guided optimisation, it may even be possible for some benchmarks to run faster than register allocations that are based on static estimates of loop frequencies.
>
> 	https://anarch128.org/~mclark/rv8-slides.pdf
> 	https://rv8.io/bench
>
> We've still got a long way to go to get to ~1:1 performance for RISC-V on x86-64, but I think it is possible, at least for some codes…
>
> Regards,
> Michael.
>
>> On 15/12/2017, at 4:18 PM, Leslie Zhai <lesliezhai@llvm.org.cn> wrote:
>>
>> Hi GCC and LLVM developers,
>>
>> I am learning Register Allocation algorithms and I am clear that:
>>
>> * Unlimited VirtReg (pseudo) -> limited or fixed or alias[1] PhysReg (hard)
>>
>> * Memory (20 - 100 cycles) is more expensive than a register (1 cycle), but spill code has to be generated when no PhysReg is available
>>
>> * Folding spill code into instructions, handling register coalescing, splitting live ranges, doing rematerialization, doing shrink wrapping are harder than RegAlloc
>>
>> * LRA and IRA are the default passes in RA for GCC:
>>
>> $ /opt/gcc-git/bin/gcc hello.c
>> DEBUG: ../../gcc/lra.c, lra_init_once, line 2441
>> DEBUG: ../../gcc/ira-build.c, ira_build, line 3409
>>
>> * Greedy is the default pass for LLVM
>>
>> But I have some questions. Please give me some hints, thanks a lot!
>>
>> * IRA is a regional register allocator performing graph coloring on a top-down traversal of nested regions. Is it global, compared with the local LRA?
>>
>> * Do the papers by Briggs and Chaitin contradict[2] themselves when one compares the text of the papers with the pseudocode provided?
>>
>> * Why is the interference graph expensive to build[3]?
>>
>> I am also practicing[4] using HEA, developed by Dr. Rhydian Lewis, starting with LLVM.
>>
>>
>> [1] https://reviews.llvm.org/D39712
>>
>> [2] http://lists.llvm.org/pipermail/llvm-dev/2008-March/012940.html
>>
>> [3] https://github.com/joaotavio/llvm-register-allocator
>>
>> [4] https://github.com/xiangzhai/llvm/tree/avr/include/llvm/CodeGen/GCol
>>
>> -- 
>> Regards,
>> Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/
>>
>>
>>

-- 
Regards,
Leslie Zhai - https://reviews.llvm.org/p/xiangzhai/