public inbox for gcc@gcc.gnu.org
From: nick <xerofoify@gmail.com>
To: Richard Biener <rguenther@suse.de>
Cc: GCC Development <gcc@gcc.gnu.org>
Subject: Re: GSOC Proposal
Date: Fri, 05 Apr 2019 16:11:00 -0000
Message-ID: <ec115fb3-ee31-b697-e370-1405ee14cabf@gmail.com>
In-Reply-To: <alpine.LSU.2.20.1904051223450.27537@zhemvz.fhfr.qr>



On 2019-04-05 6:25 a.m., Richard Biener wrote:
> On Wed, 3 Apr 2019, nick wrote:
> 
>>
>>
>> On 2019-04-03 7:30 a.m., Richard Biener wrote:
>>> On Mon, 1 Apr 2019, nick wrote:
>>>
>>>>
>>>>
>>>> On 2019-04-01 9:47 a.m., Richard Biener wrote:
>>>>> On Mon, 1 Apr 2019, nick wrote:
>>>>>
>>>>>> Well, I'm talking about the shared roots in the garbage collector's core
>>>>>> state, i.e. struct ggc_root_tab.
>>>>>>
>>>>>> There is also this, which seems to be no longer shared globally, if I'm
>>>>>> not mistaken:
>>>>>> static vec<const_ggc_root_tab_t> extra_root_vec;
>>>>>>
>>>>>> After reading the code I'm not sure which is the bigger deal, though, so
>>>>>> I wrote my proposal without first asking which is the better issue to
>>>>>> tackle for thread safety. Sorry about that.
>>>>>>
>>>>>> As for the second question, neither injection nor outside callers seem to
>>>>>> be the issue, only internal ones, so phase 3 (or step 3) would now be:
>>>>>> find internal callers or users of x, where x is one of the above, rather
>>>>>> than injecting outside callers. That answers my second question about
>>>>>> whether external callers are still an issue.
>>>>>>
>>>>>> Let me know which of the two is the better issue to work on:
>>>>>> 1. struct ggc_root_tab being shared
>>>>>> 2. static vec<const_ggc_root_tab_t> extra_root_vec; as a shared heap or
>>>>>>    vector of root nodes for each type of allocation
>>>>>>
>>>>>> and I will gladly rewrite the sections of my proposal that need
>>>>>> to be re-edited.
>>>>>
>>>>> I don't think working on the garbage collector as a separate
>>>>> GSoC project is useful at this point.  Doing locking around
>>>>> allocation seems like a good short-term solution, and if that
>>>>> turns out to be a performance issue for the threaded part,
>>>>> using per-thread freelists is likely an easy-to-deploy
>>>>> solution.
>>>>>
>>>>> Richard.
>>>>>
>>>> I agree, but we were discussing this:
>>>> "Or maybe a project to be more explicit about regions of the code
>>>> that assume that the garbage-collector can't run within them?[3]
>>>> (since the GC is state that would be shared by the threads)."
>>>
>>> The process of collecting garbage is not the only issue (and that
>>> very issue is easiest mitigated by collecting only at specific
>>> points - which is what we do - and have those be serializing points).
>>> The main issue is the underlying memory allocator (GCC uses memory
>>> that is garbage collected plus regular heap memory).
>>>
>>>> In addition, I moved my paper back to our discussion about garbage collector
>>>> state with outside callers. It seems we really need to do something about
>>>> my wording. The idea of my project, in a nutshell, was to figure
>>>> out how callers can mark shared state and inject that into the
>>>> garbage collector, letting it know whether or not the state is shared
>>>> between threads. That seems to be what was on the GSoC page, and in our
>>>> discussions the issue is marking outside code for shared state. If that's
>>>> correct, then my wording "outside callers" is incorrect; it should have been
>>>> "state shared between threads by outside callers of the garbage collector".
>>>> If that is the state you mean in your wording above, then great: I understand
>>>> where we are going and will gladly change my wording.
>>>
>>> I'm still not sure what you are shooting at, the above sentences do
>>> not make any sense to me.
>>>
>>>> Also, freelists don't work here, as the state is shared at the caller's
>>>> end, which would raise two major issues:
>>>> 1. Locking on nodes of the freelists when two threads allocate at the
>>>>    same time, which can be a problem if the state is heavily shared
>>>> 2. Locking allocation with large numbers of callers can starve threads
>>>
>>> First of all, allocating memory from the GC pool is not the main
>>> work of GIMPLE passes, so simply serializing at allocation time might
>>> work out.  Second, free lists of course do work.  What you'd do is
>>> have a fast path in allocation using a thread-local "free list"
>>> which you can allocate from without taking any lock.  Maybe I should
>>> explain "free list" since that term doesn't make too much sense in
>>> a garbage collector world.  What I'd do is, when a client thread
>>> asks for memory of size N, allocate M objects of that size but put
>>> M - 1 on the client thread's local "free list", to be allocated from
>>> lock-free for the next M - 1 calls.  Note that garbage collected memory
>>> objects are only handed out in fixed chunks (powers of two plus
>>> a few special sizes) so you'd have one "free list" per chunk size
>>> per thread.
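
A minimal sketch of the batching scheme described above, in C++. All
names here (toy_alloc, kBatch, local_cache, slow_path_calls) are made up
for illustration and are not part of GCC's ggc interface. malloc stands
in for the shared GC pool, only power-of-two size classes are modeled,
and nothing is ever collected or freed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

// Hypothetical toy, not GCC code.
constexpr std::size_t kBatch = 8;       // "M": objects fetched per locked refill
constexpr std::size_t kNumOrders = 16;  // size classes 2^0 .. 2^15 bytes

static std::mutex pool_lock;            // guards the shared pool (malloc stands in for it)
static std::size_t slow_path_calls = 0; // locked refills so far; only touched under pool_lock

// Map a request of n bytes to its power-of-two size class.
static std::size_t size_order(std::size_t n) {
  std::size_t order = 0;
  while ((std::size_t{1} << order) < n) ++order;
  return order;
}

// One cache of ready-to-use objects per size class, per thread.
thread_local std::vector<void*> local_cache[kNumOrders];

void* toy_alloc(std::size_t n) {
  const std::size_t order = size_order(n);
  assert(order < kNumOrders);  // larger requests are out of scope for this toy
  std::vector<void*>& cache = local_cache[order];
  if (cache.empty()) {
    // Slow path: take the lock once and fetch M objects of this class.
    // (A real allocator would also handle allocation failure here.)
    std::lock_guard<std::mutex> guard(pool_lock);
    ++slow_path_calls;
    for (std::size_t i = 0; i < kBatch; ++i)
      cache.push_back(std::malloc(std::size_t{1} << order));
  }
  // Fast path: hand out one cached object without touching the lock.
  void* p = cache.back();
  cache.pop_back();
  return p;
}
```

With kBatch = 8, the first request for a given size class on a thread
takes the lock once, and the next seven requests of that class on the
same thread are served from the thread-local cache without locking.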
>>>
>>> The collection itself (mark & sweep) would be fully serialized still
>>> (and not return to any thread's local "free list").
>>>
>>> ggc_free'd objects _might_ go to the thread's "free list"s (yeah, we
>>> _do_ have ggc_free ...).
>>>
>>> As said, I don't see GC or the memory allocator as something interesting
>>> to work on for parallelization until the basic setup works and it
>>> proves to be a bottleneck.
>>>
>>>> It seems that the garbage collector itself isn't the issue but rather
>>>> its callers, as I just figured out in relation to your state idea. Let me
>>>> know if that's correct and if the wording change I mentioned is fine
>>>> with you, as that seems to be the state that needs to be changed.
>>>> Nick
>>>
>>> Richard.
>>>
>>
>> That's fine, and it's my fault for not understanding you better. I was aware
>> of expand_all_functions being taken for passes.c. However, it seems
>> there are two other sets of thread-related issues:
>> 1. finalize_compilation_unit
>> 2. the ipa set of pass functions
>>
>> If I'm understanding it correctly, number 1 seems to be an early version of
>> expand_all_functions for the GENERIC representation; if that's the case,
>> it really should be fixed. I'm not sure which is the better issue, as both
>> seem to have problems with shared state, at either the GENERIC level or the
>> GIMPLE level.
>>
>> Let me know if this is better, as now that I really think about
>> it, the GIMPLE or GENERIC functions in passes.c seem to be the main issue.
>>
>> Sorry for the misunderstanding, and hopefully one of the functions listed is
>> better for moving forward with my proposal.
> 
> Sorry, but guessing at useful projects by skimming through GCC code
> at this point isn't the way to go forward - this new "idea" lacks
> both detail and understanding.  Please try to stick to one of the
> suggested projects or do more thorough research in case you want
> to work on a new project idea next year.
> 
> Thanks,
> Richard.
> 

I was talking about cgraphunit.c, and I am going by this project description:
Parallelize compilation using threads. GCC currently has an awful lot of
truly global state and even more per-pass global state which makes this
somewhat hard. The idea is to avoid issues with global state as much as
possible by partitioning the compilation pipeline in pieces that share as
little global state as possible and ensure each thread only works in one
of those partitions. The biggest roadblock will be the not thread-safe
memory allocator of GCC garbage collector. The goal of this project is to
have a compilation pipeline driven by a scheduler assigning functions to
be optimized to the partitions in the pipeline. This project would be
mentored by Richard Biener. Required skills include: C/C++, ability to
analyze big complex code base, parallelization
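
To make the "scheduler assigning functions to the partitions" idea a bit
more concrete, here is a toy, single-threaded model of such a pipeline in
C++. It is only a sketch under assumptions: ToyPipeline, tick and the
integer function ids are hypothetical, each partition holds at most one
function at a time (standing in for "one thread per partition"), and no
real threads or GCC data structures are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical toy model, not GCC code.
constexpr int kEmpty = -1;  // marks a partition with no function in it

struct ToyPipeline {
  std::deque<int> pending;  // function ids waiting to enter the pipeline
  std::vector<int> stage;   // stage[i] = function currently in partition i
  std::vector<int> done;    // functions that left the last partition

  explicit ToyPipeline(std::size_t partitions) : stage(partitions, kEmpty) {
    assert(partitions > 0);
  }

  // One scheduler step. Partitions are visited from last to first, so each
  // function advances by exactly one partition per tick and never skips one.
  void tick() {
    for (std::size_t i = stage.size(); i-- > 0;) {
      if (stage[i] == kEmpty) continue;
      if (i + 1 == stage.size()) {
        done.push_back(stage[i]);       // leaving the last partition
        stage[i] = kEmpty;
      } else if (stage[i + 1] == kEmpty) {
        stage[i + 1] = stage[i];        // hand over to the next partition
        stage[i] = kEmpty;
      }
    }
    // Feed the first partition from the queue of pending functions.
    if (stage[0] == kEmpty && !pending.empty()) {
      stage[0] = pending.front();
      pending.pop_front();
    }
  }

  // True once nothing is pending and every partition is empty.
  bool idle() const {
    if (!pending.empty()) return false;
    for (int s : stage)
      if (s != kEmpty) return false;
    return true;
  }
};
```

In this model every partition only ever touches the one function it
currently holds, which is the property the project description is after:
state private to a partition needs no locking, and only the hand-over
between partitions has to be synchronized in a real threaded version.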

If I'm correct, we are trying to create a compilation pipeline, and it seems
that at the GENERIC level finalize_compilation_unit needs to be fixed in the
same way as expand_all_functions at the GIMPLE level. That's my point: it is
still within that project. Here is what I wrote after figuring out that the
shared state involved in passing GENERIC to GIMPLE is, or would be, a
bottleneck in the threaded pipeline:

https://docs.google.com/document/d/1BKVeh62IpigsQYf_fJqkdu_js0EeGdKtXInkWZ-DtU0/edit

Nick
