public inbox for gcc@gcc.gnu.org
From: Richard Biener <rguenther@suse.de>
To: Giuliano Belinassi <giuliano.belinassi@usp.br>
Cc: Richard Biener <richard.guenther@gmail.com>,
	    David Malcolm <dmalcolm@redhat.com>,
	GCC Development <gcc@gcc.gnu.org>
Subject: Re: GSOC
Date: Tue, 07 May 2019 13:18:00 -0000
Message-ID: <alpine.LSU.2.20.1905071514360.10704@zhemvz.fhfr.qr>
In-Reply-To: <20190506184735.xyfhrgqg6gdy7gsz@smtp.gmail.com>

On Mon, 6 May 2019, Giuliano Belinassi wrote:

> Hi,
> 
> On 03/29, Richard Biener wrote:
> > On Thu, 28 Mar 2019, Giuliano Belinassi wrote:
> > 
> > > Hi, Richard
> > > 
> > > On 03/28, Richard Biener wrote:
> > > > On Wed, Mar 27, 2019 at 2:55 PM Giuliano Belinassi
> > > > <giuliano.belinassi@usp.br> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On 03/26, Richard Biener wrote:
> > > > > > On Tue, 26 Mar 2019, David Malcolm wrote:
> > > > > >
> > > > > > > On Mon, 2019-03-25 at 19:51 -0400, nick wrote:
> > > > > > > > Greetings All,
> > > > > > > >
> > > > > > > > > I would like to take up "parallelize compilation using threads" or
> > > > > > > > > "make c++/c memory issues not automatically promote". I did ask about
> > > > > > > > > this before but did not get a reply. When someone replies, I'm just a
> > > > > > > > > little concerned as my writing for proposals has never been great, so
> > > > > > > > > if someone just reviews and double-checks it, that's fine.
> > > > > > > > >
> > > > > > > > > As for the other things, building gcc and running the testsuite is
> > > > > > > > > fine. Plus I am already working on gcc, so I'm pretty aware of most
> > > > > > > > > things, and this would be a great stepping stone into more serious gcc
> > > > > > > > > development work.
> > > > > > > > >
> > > > > > > > > If sample code that's in mainline gcc is required, I sent out a trial
> > > > > > > > > patch for this issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88395
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > Nick
> > > > > > >
> > > > > > > It's good to see that you've gotten as far as attaching a patch to BZ
> > > > > > > [1]
> > > > > > >
> > > > > > > I think someone was going to attempt the "parallelize compilation using
> > > > > > > threads" idea last year, but then pulled out before the summer; you may
> > > > > > > want to check the archives (or was that you?)
> > > > > >
> > > > > > There's also Giuliano Belinassi who is interested in the same project
> > > > > > (CCed).
> > > > >
> > > > > Yes, I will apply for this project, and I will submit the final version
> > > > > of my proposal by the end of the week.
> > > > >
> > > > > Currently, my target is the `expand_all_functions` routine, as most of
> > > > > the time is spent on it according to the experiments that I performed as
> > > > > part of my Master's research on compiler parallelization.
> > > > > (-O2, --disable-checking)
> > > > 
> > > > Yes, more specifically I think the realistic target is the GIMPLE part
> > > > of   execute_pass_list (cfun, g->get_passes ()->all_passes);  done in
> > > > cgraph_node::expand.  If you look at passes.def you'll see all_passes
> > > > also contains RTL expansion (pass_expand) and the RTL optimization
> > > > queue (pass_rest_of_compilation).  The RTL part isn't a realistic target.
> > > > Without changing the pass hierarchy the obvious part that can be
> > > > handled would be the pass_all_optimizations pass sub-queue of
> > > > all_passes since those are all passes that perform transforms on the
> > > > GIMPLE IL where we have all functions in this state at the same time
> > > > and where no interactions between the functions happen anymore
> > > > and thus functions can be processed in parallel (as much as make
> > > > processes individual translation units in parallel).
> > > > 
> > > 
> > > Great. So if I understood correctly, I will need to split
> > > cgraph_node::expand() into three parts: IPA, GIMPLE and RTL, and then
> > > refactor `expand_all_functions` so that the loop
> > > 
> > >      for (i = new_order_pos - 1; i >= 0; i--)
> > > 
> > >  uses these three functions, then partition
> > > 
> > >      g->get_passes()->all_passes
> > > 
> > > into get_passes()->gimple_passes and get_passes()->rtl_passes, so I
> > > can run RTL after GIMPLE is finished, to finally start the
> > > parallelization of per-function GIMPLE passes.
> > 
> > Yes, it involves refactoring of the loop - you may notice that
> > parts of the compilation pipeline are under control of the
> > pass manager (passes.c) but some is still manually driven
> > by symbol_table::compile.  Whether it's more convenient to
> > get more control stuffed to the pass manager and perform the
> > threading under its control (I'd say that would be the cleaner
> > design) or to try to do this in the current ad-hoc parts remains
> > to be seen.  You can see symbol_table::compile hands over
> > control to the pass manager multiple times, first ipa_passes ()
> > then all_late_ipa_passes and finally the expand_all_functions code.
> > 
> > I guess it would simplify things if you'd split all_passes
> > in passes.def at pass_expand like so:
> > 
> > diff --git a/gcc/passes.def b/gcc/passes.def
> > index 2fcd80e53a3..bb0453b36a7 100644
> > --- a/gcc/passes.def
> > +++ b/gcc/passes.def
> > @@ -403,11 +403,10 @@ along with GCC; see the file COPYING3.  If not see
> >    NEXT_PASS (pass_spectrev1);
> >    NEXT_PASS (pass_warn_function_noreturn);
> >    NEXT_PASS (pass_gen_hsail);
> > +  TERMINATE_PASS_LIST (all_passes)
> >  
> > -  NEXT_PASS (pass_expand);
> > -
> > -  NEXT_PASS (pass_rest_of_compilation);
> > -  PUSH_INSERT_PASSES_WITHIN (pass_rest_of_compilation)
> > +  INSERT_PASSES_AFTER (pass_rest_of_compilation)
> > +      NEXT_PASS (pass_expand);
> >        NEXT_PASS (pass_instantiate_virtual_regs);
> >        NEXT_PASS (pass_into_cfg_layout_mode);
> >        NEXT_PASS (pass_jump);
> > @@ -505,6 +504,5 @@ along with GCC; see the file COPYING3.  If not see
> >           NEXT_PASS (pass_final);
> >        POP_INSERT_PASSES ()
> >        NEXT_PASS (pass_df_finish);
> > -  POP_INSERT_PASSES ()
> >    NEXT_PASS (pass_clean_state);
> > -  TERMINATE_PASS_LIST (all_passes)
> > +  TERMINATE_PASS_LIST (pass_rest_of_compilation)
> > 
> > where to make things "work" again w/o threading you'd invoke
> > execute_pass_list (cfun, g->get_passes ()->pass_rest_of_compilation)
> > right after the all_passes invocation in cgraph_node::expand.
> > 
> > You then can refactor things so the loop over the 'order' array
> > is done twice, once over all_passes (the set you then parallelize)
> > and once over pass_rest_of_compilation (which you can't parallelize
> > because of being in RTL).
> >
> 
> I managed to get it working today. However, I found an issue with
> statistics_fini_pass() and pass_init_dump_file(): I had to comment out
> the former and force the latter to `return false` in every case. With
> that, I managed to compile some programs correctly with -O2. I have no
> idea why this is needed yet, but I will keep searching. I've attached my
> patch here.

It may be that you need to adjust the GCC_PASS_LISTS define in
pass_manager.h, changing pass_rest_of_compilation to a pass list
and also removing its "old" definition in passes.c.  Or it might
be simpler to not re-use pass_rest_of_compilation but to wrap
the tail in a new all_passes2 or so.  You'll also see
a call to register_dump_files (all_passes) in passes.c where
you probably need to do the same for the new tail.

As usual grep is your best friend when figuring out what to do
(doing that right now myself).

Richard.

> 
> 
> > The above patch needs more changes in pass manager code - a chance
> > to dive into it a little since that's where you'd change code.
> > 
> > > > To simplify the task further a useful constraint is to not have
> > > > a single optimization pass executed multiple times at the same time
> > > > (otherwise you have to look at pass specific global states as well),
> > > > thus the parallel part could be coded in a way keeping per function
> > > > the state of what pass to execute next and have a scheduler pick
> > > > a function whose next pass is "free", scheduling that to a fixed set of
> > > > worker threads.  There's no dependences between functions
> > > > for the scheduling but each pass has only one execution resource
> > > > in the pipeline.  You can start processing an arbitrarily large number
> > > > of functions but slow functions will keep others from advancing across
> > > > the pass it executes on.
> > > >
> > > 
> > > Something like a pipeline? That is certainly a start, but if one pass is
> > > very slow wouldn't it bottleneck everything?
> > 
> > Yes, something like a pipeline.  It's true a slow pass would
> > bottleneck things - as said, we can selectively make passes
> > thread safe in such cases.
> > 
> > > > Passes could of course be individually marked as thread-safe
> > > > (multiple instances execute concurrently).
> > > > 
> > > > Garbage collection is already in control of the pass manager which
> > > > would also be the thread scheduler.  For GC the remaining issue
> > > > is allocation which passes occasionally do.  Locking is the short
> > > > term solution for GSoC I guess, long-term per-thread GC pools
> > > > might be better (to not slow down non-threaded parts of the compiler).
> > > > 
> > > > Richard.
> > > > 
> > > > >
> > > > > Thank you,
> > > > > Giuliano.
> > > > >
> > > > >
> > > > > >
> > > > > > > IIRC Richard [CCed] was going to mentor, with me co-mentoring [2] - but
> > > > > > > I don't know if he's still interested/able to spare the cycles.
> > > > > >
> > > > > > I've offered mentoring to Giuliano, so yes.
> > > > > >
> > > > > > > That said, the parallel compilation one strikes me as very ambitious;
> > > > > > > it's not clear to me what could realistically be done as a GSoC
> > > > > > > project.  I think a good proposal on that would come up with some
> > > > > > > subset of the problem that's doable over a summer, whilst also being
> > > > > > > useful to the project.  The RTL infrastructure has a lot of global
> > > > > > > state, so maybe either focus on the gimple passes, or on fixing global
> > > > > > > state on the RTL side?  (I'm not sure)
> > > > > >
> > > > > > That was the original intent for the experiment.  There's also
> > > > > > the already somewhat parallel WPA stage in LTO compilation mode
> > > > > > (but it simply forks for the sake of simplicity...).
> > > > > >
> > > > > > > Or maybe a project to be more
> > > > > > > explicit about regions of the code that assume that the garbage-
> > > > > > > collector can't run within them?[3] (since the GC is state that would
> > > > > > > be shared by the threads).
> > > > > >
> > > > > > > The GC will be one obstacle.  The original idea was to drive
> > > > > > parallelization on the pass level by the pass manager for the
> > > > > > GIMPLE passes, so serialization points would be in it.
> > > > > >
> > > > > > Richard.
> > > > > >
> > > > > > > Hope this is constructive/helpful
> > > > > > > Dave
> > > > > > >
> > > > > > > [1] though typically our workflow involved sending patches to the gcc-
> > > > > > > patches mailing list
> > > > > > > [2] as libgccjit maintainer I have an interest in global state within
> > > > > > > the compiler
> > > > > > > > [3] I posted some ideas about this back in 2013 IIRC; probably
> > > > > > > > massively bit-rotted since then.  I also gave a talk at Cauldron 2013
> > > > > > > > about global state in the compiler (with a view to
> > > > > > > > gcc-as-a-shared-library); likewise I expect many of the ideas there to
> > > > > > > > be out of date; for libgccjit I went with a different approach
> > > 
> > > Thank you,
> > > Giuliano.
> > > 
> > 
> > -- 
> > Richard Biener <rguenther@suse.de>
> > SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
