Date: Sun, 07 Apr 2019 09:31:00 -0000
Subject: Re: GSOC Proposal
To: nick
CC: GCC Development
From: Richard Biener

On April 5, 2019 6:11:15 PM GMT+02:00, nick wrote:
>
>
>On 2019-04-05 6:25 a.m., Richard Biener wrote:
>> On Wed, 3 Apr 2019, nick wrote:
>>
>>>
>>>
>>> On 2019-04-03 7:30 a.m., Richard Biener wrote:
>>>> On Mon, 1 Apr 2019, nick wrote:
>>>>
>>>>>
>>>>>
>>>>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>>>>>> On Mon, 1 Apr 2019, nick wrote:
>>>>>>
>>>>>>> Well I'm talking about the shared roots of this garbage collector core state
>>>>>>> data structure or just struct ggc_root_tab.
>>>>>>>
>>>>>>> But also this seems that this to be no longer shared globally if I'm not mistaken
>>>>>>> or this:
>>>>>>> static vec extra_root_vec;
>>>>>>>
>>>>>>> Not sure after reading the code which is a bigger deal through so I wrote
>>>>>>> my proposal not just asking which is a better issue for not being thread
>>>>>>> safe. Sorry about that.
>>>>>>>
>>>>>>> As for the second question injection seems to not be the issue or outside
>>>>>>> callers but just internal so phase 3 or step 3 would now be:
>>>>>>> Find internal callers or users of x where x is one of the above rather
>>>>>>> than injecting outside callers. Which answers my second question about
>>>>>>> external callers being a issue still.
>>>>>>>
>>>>>>> Let me know which of the two is a better issue:
>>>>>>> 1. struct ggc_root_tabs being shared
>>>>>>> 2.static vec extra_root_vec; as a shared heap or
>>>>>>> vector of root nodes for each type of allocation
>>>>>>>
>>>>>>> and I will gladly rewrite my proposal sections for that
>>>>>>> as needs to be reedited.
>>>>>>
>>>>>> I don't think working on the garbage collector as a separate
>>>>>> GSoC project is useful at this point. Doing locking around
>>>>>> allocation seems like a good short-term solution and if that
>>>>>> turns out to be a performance issue for the threaded part
>>>>>> using per-thread freelists is likely an easy to deploy
>>>>>> solution.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>> I agree but we were discussing this:
>>>>> Or maybe a project to be more
>>>>> explicit about regions of the code that assume that the
>>>>> garbage-collector can't run within them?[3] (since the GC is state that would
>>>>> be shared by the threads).
>>>>
>>>> The process of collecting garbage is not the only issue (and that
>>>> very issue is easiest mitigated by collecting only at specific
>>>> points - which is what we do - and have those be serializing points).
>>>> The main issue is the underlying memory allocator (GCC uses memory
>>>> that is garbage collected plus regular heap memory).
>>>>
>>>>> In addition I moved my paper back to our discussion about garbage collector
>>>>> state with outside callers.Seems we really need to do something about
>>>>> my wording as the idea of my project in a nutshell was to figure
>>>>> out how to mark shared state by callers and inject it into the
>>>>> garbage collector letting it known that the state was not shared between
>>>>> threads or shared. Seems that was on the GSoc page and in our discussions the issue
>>>>> is marking outside code for shared state. If that's correct then my
>>>>> wording of outside callers is incorrect it should have been shared
>>>>> state between threads on outside callers to the garbage collector.
>>>>> If the state is that in your wording above then great as I understand
>>>>> where we are going and will gladly change my wording.
>>>>
>>>> I'm still not sure what you are shooting at, the above sentences do
>>>> not make any sense to me.
>>>>
>>>>> Also freelists don't work here as the state is shared at the caller's
>>>>> end which would need two major issues:
>>>>> 1. Locking on nodes of the
>>>>> freelists when two threads allocate at the same thing which can be a
>>>>> problem if the shared state is shared a lot
>>>>> 2. Locking allocation with
>>>>> large numbers of callers can starve threads
>>>>
>>>> First of all allocating memory from the GC pool is not the main
>>>> work of GIMPLE passes so simply serializing at allocation time might
>>>> work out. Second free lists of course do work. What you'd do is
>>>> have a fast path in allocation using a thread-local "free list"
>>>> which you can allocate from without taking any lock. Maybe I should
>>>> explain "free list" since that term doesn't make too much sense in
>>>> a garbage collector world.
>>>> What I'd do is when a client thread
>>>> asks for memory of size N allocate M objects of that size but put
>>>> M - 1 on the client thread local "free list" to be allocated lock-free
>>>> from for the next M - 1 calls. Note that garbage collected memory
>>>> objects are only handed out in fixed chunks (powers of two plus
>>>> a few special sizes) so you'd have one "free list" per chunk size
>>>> per thread.
>>>>
>>>> The collection itself (mark & sweep) would be fully serialized still
>>>> (and not return to any threads local "free list").
>>>>
>>>> ggc_free'd objects _might_ go to the threads "free list"s (yeah, we
>>>> _do_ have ggc_free ...).
>>>>
>>>> As said, I don't see GC or the memory allocator as sth interesting
>>>> to work on for parallelization until the basic setup works and it
>>>> proves to be a bottleneck.
>>>>
>>>>> Seems that working on the garbage collector itself isn't the issue but
>>>>> the callers as I just figured out as related to your state idea. Let me
>>>>> know if that's correct and if the wording change I mentioned is fine
>>>>> with you as that's the state it seems that needs to be changed.
>>>>> Nick
>>>>
>>>> Richard.
>>>>
>>>
>>> That's fine and it's my fault for not understanding you better. I was aware
>>> of the expand_functions_all being taken for passes.c. However it seems
>>> two other issues are these sets as related to threads:
>>> 1.finalize_compilation_unit
>>> 2.and the ipa set of pass functions
>>>
>>> If I'm understanding it correctly number 1 seems to be a early version of
>>> expand_all_functions for the GENERIC representation if that's the case
>>> it really should be fixed. Not sure which is a better issue as both
>>> seem to have issues either at the GENERIC level or GIMPLE level with shared
>>> state.
>>>
>>> Let me know if this is better as it seems now that I really think about
>>> it GIMPLE or GENERIC functions in passes.c are the main issue.
>>>
>>> Sorry for the misunderstanding and hopefully one of functions listed is better
>>> for moving forward with my proposal,
>>
>> Sorry, but guessing at useful projects by skimming through GCC code
>> at this point isn't the way to go forward - this new "idea" lacks
>> both detail and understanding. Please try to stick to one of the
>> suggested projects or do more thorough research in case you want
>> to work on a new project idea next year.
>>
>> Thanks,
>> Richard.
>>
>
>I was talking about cgraphunits.c and it seems that according to this:
>Parallelize compilation using threads. GCC currently has an awful lot
>of truly global state and even more per-pass global state which makes
>this somewhat hard. The idea is to avoid issues with global state as
>much as possible by partitioning the compilation pipeline in pieces
>that share as little global state as possible and ensure each thread
>only works in one of those partitions. The biggest roadblock will be
>the not thread-safe memory allocator of GCC garbage collector. The goal
>of this project is to have a compilation pipeline driven by a scheduler
>assigning functions to be optimized to the partitions in the pipeline.
>This project would be mentored by Richard Biener. Required skills
>include: C/C++, ability to analyze big complex code base,
>parallelization
>
>We are trying to create a rendering pipeline if I'm correct and it
>seems that the GENERIC level needs finalize_compilation_unit
>to be fixed like expand_all_functions at the GIMPLE. That's my point it
>still is within that project. Here is what I wrote
>as I figured out that it was shared state related to GENERIC passing to
>GIMPLE which is a bottleneck or would be in the
>threaded pipeline.
>
>https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit

The pre- post-IPA parts cannot be easily parallelized and that includes
GENERIC to GIMPLE translation. This is why the project should focus on
the much easier post-IPA and pre-RTL parts of the compilation pipeline
since there interaction between functions is minimal.

Richard.

>
>Nick
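
For readers following the thread, the thread-local "free list" scheme described in the quoted reply (ask for one object of size N, carve out M under the lock, keep the remaining M - 1 on a per-thread, per-size-class list) can be sketched roughly as below. This is only an illustration under stated assumptions and is not GCC's ggc code: the `sketch` namespace, `SharedPool`, `allocate`, the power-of-two size classes starting at 16 bytes, and the batch size of 16 are all hypothetical, and `malloc` stands in for the real page allocator.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical names throughout; none of this is GCC's ggc API.
namespace sketch {

constexpr std::size_t kNumClasses = 8;  // chunk sizes 16, 32, ..., 2048 bytes
constexpr std::size_t kBatch = 16;      // "M" in the description above

inline std::size_t class_bytes(std::size_t idx) {
  return std::size_t{16} << idx;
}

// Map a request of n bytes to the smallest power-of-two chunk that fits.
inline std::size_t size_class(std::size_t n) {
  std::size_t idx = 0;
  while (idx < kNumClasses && class_bytes(idx) < n) ++idx;
  assert(idx < kNumClasses && "request too large for this sketch");
  return idx;
}

// Stands in for the shared, serialized allocator behind the GC pool.
class SharedPool {
 public:
  ~SharedPool() {
    for (void* p : chunks_) std::free(p);
  }

  // Slow path: take the lock once and hand out `count` chunks.
  void refill(std::size_t idx, std::size_t count, std::vector<void*>& out) {
    std::lock_guard<std::mutex> guard(lock_);
    ++refills_;
    for (std::size_t i = 0; i < count; ++i) {
      void* p = std::malloc(class_bytes(idx));
      if (!p) throw std::bad_alloc();
      chunks_.push_back(p);  // the pool keeps ownership of every chunk
      out.push_back(p);
    }
  }

  // Number of times the lock was taken to refill a thread's list.
  std::size_t refills() {
    std::lock_guard<std::mutex> guard(lock_);
    return refills_;
  }

 private:
  std::mutex lock_;
  std::vector<void*> chunks_;
  std::size_t refills_ = 0;
};

inline SharedPool& pool() {
  static SharedPool p;
  return p;
}

// Fast path: pop from this thread's list for the size class, no lock taken.
inline void* allocate(std::size_t n) {
  thread_local std::array<std::vector<void*>, kNumClasses> free_lists;
  const std::size_t idx = size_class(n);
  std::vector<void*>& list = free_lists[idx];
  if (list.empty()) pool().refill(idx, kBatch, list);  // locked slow path
  void* p = list.back();
  list.pop_back();
  return p;
}

}  // namespace sketch
```

With these assumed parameters a thread takes the lock once per 16 allocations in a given size class. Collection (mark & sweep) is not modeled at all; per the description above it would stay fully serialized and would not return objects to any thread's list.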