From: "Bin.Cheng"
Date: Thu, 13 Dec 2018 08:12:00 -0000
Subject: Re: Parallelize the compilation using Threads
To: giuliano.belinassi@usp.br
Cc: Richard Guenther, GCC Development, kernel-usp@googlegroups.com, gold@ime.usp.br, alfredo.goldman@gmail.com

On Wed, Dec 12, 2018 at 11:46
PM Giuliano Augusto Faulin Belinassi wrote:
>
> Hi, I have some news. :-)
>
> I replicated the Martin Liška experiment [1] on a 64-core machine for
> gcc [2] and Linux kernel [3] (Linux kernel was fully parallelized),
> and I am excited to dive into this problem. As a result, I want to
> propose a GSoC project on this issue, starting with something like:
> 1- Systematically create a benchmark for easy information
> gathering. Martin Liška already made the first version of it, but I
> need to improve it.
> 2- Find and document the global states (Try to reduce the gcc's
> global states as well).
> 3- Define the parallelization strategy.
> 4- First parallelization attempt.

Hi Giuliano,

Thanks very much for working on this. It could be very useful. For
example, one bottleneck we have is the slow compilation of a single big
source file once we already make intensive use of distributed
compilation. Of course, a good parallelization strategy is needed.

Thanks,
bin

>
> I also proposed this issue as a research project to my advisor and he
> supported me on this idea. So I can work for at least one year on
> this, and other things related to it.
>
> Would anyone be willing to mentor me on this?
>
> [1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
> [2] https://www.ime.usp.br/~belinass/64cores-experiment.svg
> [3] https://www.ime.usp.br/~belinass/64cores-kernel-experiment.svg
>
> On Mon, Nov 19, 2018 at 8:53 AM Richard Biener wrote:
> >
> > On Fri, Nov 16, 2018 at 8:00 PM Giuliano Augusto Faulin Belinassi
> > wrote:
> > >
> > > Hi! Sorry for the late reply again :P
> > >
> > > On Thu, Nov 15, 2018 at 8:29 AM Richard Biener wrote:
> > > >
> > > > On Wed, Nov 14, 2018 at 10:47 PM Giuliano Augusto Faulin Belinassi
> > > > wrote:
> > > > >
> > > > > As a brief introduction, I am a graduate student that got
> > > > > interested in the "Parallelize the compilation using threads"
> > > > > (GSoC 2018 [1]).
> > > > > I am a newcomer in GCC, but already have sent some patches,
> > > > > some of them have already been accepted [2].
> > > > >
> > > > > I brought this subject up in IRC, but maybe here is a proper place to
> > > > > discuss this topic.
> > > > >
> > > > > From my point of view, parallelizing GCC itself will only speed up the
> > > > > compilation of projects which have a big file that creates a
> > > > > bottleneck in the whole project compilation (note: by big, I mean the
> > > > > amount of code to generate).
> > > >
> > > > That's true. During GCC bootstrap there are some of those (see PR84402).
> > > >
> > >
> > > > One way to improve parallelism is to use link-time optimization where
> > > > even single source files can be split up into multiple link-time units. But
> > > > then there's the serial whole-program analysis part.
> > >
> > > Did you mean this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84402 ?
> > > That is a lot of data :-)
> > >
> > > It seems that 'phase opt and generate' is the most time-consuming
> > > part. Is that the 'GIMPLE optimization pipeline' you were talking
> > > about in this thread:
> > > https://gcc.gnu.org/ml/gcc/2018-03/msg00202.html
> >
> > It's everything that comes after the frontend parsing bits, thus this
> > includes in particular RTL optimization and early GIMPLE optimizations.
> >
> > > > > Additionally, I know that GCC must not
> > > > > change the project layout, but from the software engineering perspective,
> > > > > this may be a bad smell that indicates that the file should be broken
> > > > > into smaller files. Finally, the Makefiles will take care of the
> > > > > parallelization task.
> > > >
> > > > What do you mean by GCC must not change the project layout? GCC
> > > > happily re-orders functions and link-time optimization will reorder
> > > > TUs (well, linking may as well).
> > > >
> > >
> > > That was a response to a comment made on IRC:
> > >
> > > On Thu, Nov 15, 2018 at 9:44 AM Jonathan Wakely wrote:
> > > >I think this is in response to a comment I made on IRC. Giuliano said
> > > >that if a project has a very large file that dominates the total build
> > > >time, the file should be split up into smaller pieces. I said "GCC
> > > >can't restructure people's code. it can only try to compile it
> > > >faster". We weren't referring to code transformations in the compiler
> > > >like re-ordering functions, but physically refactoring the source
> > > >code.
> > >
> > > Yes. But from one of the attachments from PR84402, it seems that such
> > > files exist in GCC,
> > > https://gcc.gnu.org/bugzilla/attachment.cgi?id=43440
> > >
> > > > > My questions are:
> > > > >
> > > > > 1. Is there any project compilation that will be significantly improved
> > > > > if GCC runs in parallel? Does anyone have data about something related
> > > > > to that? How about the Linux Kernel? If not, I can try to bring some.
> > > >
> > > > We do not have any data about this apart from experiments with
> > > > splitting up source files for PR84402.
> > > >
> > > > > 2. Did I correctly understand the goal of the parallelization? Can
> > > > > anyone provide extra details to me?
> > > >
> > > > You may want to search the mailing list archives since we had a
> > > > student application (later revoked) for the task with some discussion.
> > > >
> > > > In my view (I proposed the thing) the most interesting parts are
> > > > getting GCC's global state documented and reduced. The parallelization
> > > > itself is an interesting experiment but whether there will be any
> > > > substantial improvement for builds that can already benefit from make
> > > > parallelism remains a question.
> > >
> > > While I agree that documenting GCC's global states is good for the
> > > community and the development of GCC, I really don't think this is a good
> > > motivation for parallelizing a compiler from a research standpoint.
> >
> > True ;) Note that my suggestions to the other GSoC student were
> > purely based on where it's easiest to experiment with parallelization
> > and not where it would be most beneficial.
> >
> > > There must be something or someone that could take advantage of the
> > > fine-grained parallelism. But that data from PR84402 seems to have the
> > > answer to it. :-)
> > >
> > > On Thu, Nov 15, 2018 at 4:07 PM Szabolcs Nagy wrote:
> > > >
> > > > On 15/11/18 10:29, Richard Biener wrote:
> > > > > In my view (I proposed the thing) the most interesting parts are
> > > > > getting GCC's global state documented and reduced. The parallelization
> > > > > itself is an interesting experiment but whether there will be any
> > > > > substantial improvement for builds that can already benefit from make
> > > > > parallelism remains a question.
> > > >
> > > > in the common case (project with many small files, much more than
> > > > core count) i'd expect a regression:
> > > >
> > > > if gcc itself tries to parallelize that introduces inter thread
> > > > synchronization and potential false sharing in gcc (e.g. malloc
> > > > locks) that does not exist with make parallelism (glibc can avoid
> > > > some atomic instructions when a process is single threaded).
> > >
> > > That is what I am mostly worried about. Or the most costly part is not
> > > parallelizable at all. Also, I would expect a regression on very small
> > > files, which probably could be avoided by implementing this feature as a
> > > flag?
> >
> > I think the issue should be avoided by avoiding fine-grained parallelism.
> > Which might be somewhat hard given there are core data structures that
> > are shared (the memory allocator for a start).
> >
> > The other issue I am more worried about is that we probably have to
> > interact with make somehow so that we do not end up with 64 threads
> > when one does -j8 on an 8-core machine. That's basically the same
> > issue we run into with -flto and its threaded WPA writeout or recursive
> > invocation of make.
> >
> > >
> > > On Fri, Nov 16, 2018 at 11:05 AM Martin Jambor wrote:
> > > >
> > > > Hi Giuliano,
> > > >
> > > > On Thu, Nov 15 2018, Richard Biener wrote:
> > > > > You may want to search the mailing list archives since we had a
> > > > > student application (later revoked) for the task with some discussion.
> > > >
> > > > Specifically, the whole thread beginning with
> > > > https://gcc.gnu.org/ml/gcc/2018-03/msg00179.html
> > > >
> > > > Martin
> > > >
> > >
> > > Yes, I will research this carefully ;-)
> > >
> > > Thank you
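
For illustration only, the "interact with make" concern quoted above
(not ending up with 64 threads under -j8) can be sketched against the
jobserver protocol documented in the GNU make manual: make advertises a
pipe (or, in newer versions, a named FIFO) through MAKEFLAGS, a client
reads one byte per extra job it wants to run and writes the same bytes
back when done, and every process owns one implicit slot. This is a
minimal sketch under those assumptions and not what GCC does; the
function names parse_jobserver_auth, acquire_extra_tokens and
release_tokens are hypothetical.

```python
import os
import re


def parse_jobserver_auth(makeflags):
    """Return ("pipe", read_fd, write_fd), ("fifo", path), or None.

    Looks for --jobserver-auth= (or the older --jobserver-fds=) in a
    MAKEFLAGS string. Only the last occurrence is used.
    """
    found = re.findall(r"--jobserver-(?:auth|fds)=(\S+)", makeflags)
    if not found:
        return None
    value = found[-1]
    if value.startswith("fifo:"):
        return ("fifo", value[len("fifo:"):])
    read_fd, write_fd = (int(part) for part in value.split(","))
    # Assumption of this sketch: negative descriptors mean that no
    # usable jobserver was handed to this process.
    if read_fd < 0 or write_fd < 0:
        return None
    return ("pipe", read_fd, write_fd)


def acquire_extra_tokens(read_fd, wanted):
    """Take up to `wanted` tokens without blocking; return the bytes read.

    Caveat: the non-blocking flag is shared with every process that
    holds this pipe, so a real client needs a more careful strategy.
    """
    tokens = []
    os.set_blocking(read_fd, False)
    try:
        while len(tokens) < wanted:
            try:
                token = os.read(read_fd, 1)
            except BlockingIOError:
                break  # no free job slot right now
            if not token:
                break  # write end closed
            tokens.append(token)
    finally:
        os.set_blocking(read_fd, True)
    return tokens


def release_tokens(write_fd, tokens):
    """Give every acquired token back by writing the same bytes."""
    for token in tokens:
        os.write(write_fd, token)
```

A compiler process following this scheme would run 1 + len(tokens)
worker threads (its own implicit slot plus whatever it acquired) and
call release_tokens before exiting, so the total across the whole build
stays within the -jN limit.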