From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 25915 invoked by alias); 23 Mar 2018 15:05:31 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 25906 invoked by uid 89); 23 Mar 2018 15:05:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=1.9 required=5.0 tests=BAYES_50,FORGED_HOTMAIL_RCVD2,FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,HTML_MESSAGE,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_PASS,SPF_PASS autolearn=no version=3.3.2 spammy=sebaspe97hotmailcom, sebaspe97@hotmail.com, HX-OriginatorOrg:hotmail.com, H*i:sk:CAFiYyc X-HELO: EUR01-DB5-obe.outbound.protection.outlook.com Received: from mail-oln040092064054.outbound.protection.outlook.com (HELO EUR01-DB5-obe.outbound.protection.outlook.com) (40.92.64.54) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 23 Mar 2018 15:05:26 +0000 Received: from DB5EUR01FT059.eop-EUR01.prod.protection.outlook.com (10.152.4.58) by DB5EUR01HT057.eop-EUR01.prod.protection.outlook.com (10.152.5.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.567.18; Fri, 23 Mar 2018 15:05:21 +0000 Received: from AM2PR05MB1186.eurprd05.prod.outlook.com (10.152.4.53) by DB5EUR01FT059.mail.protection.outlook.com (10.152.4.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.567.18 via Frontend Transport; Fri, 23 Mar 2018 15:05:21 +0000 Received: from AM2PR05MB1186.eurprd05.prod.outlook.com ([fe80::39c9:572c:f07b:cf74]) by AM2PR05MB1186.eurprd05.prod.outlook.com ([fe80::39c9:572c:f07b:cf74%8]) with mapi id 15.20.0609.012; Fri, 23 Mar 2018 15:05:21 +0000 From: Sebastiaan Peters To: Richard Biener CC: David Malcolm , Sebastiaan Peters , "gcc@gcc.gnu.org" Subject: Re: GSOC Question about the parallelization project Date: Fri, 23 Mar 2018 15:05:00 -0000 Message-ID: References: <2C17E153-DCDC-4E24-B9D3-64E0FF358758@gmail.com> <1521557360.5688.30.camel@redhat.com> , In-Reply-To: x-incomingtopheadermarker: OriginalChecksum:9914768413A61D8343C0058AB26E99676C71EA3BD027D09437A4FB8BB1459E91;UpperCasedChecksum:68DA08632B6E1AEE9A5849EC2E038C862C1801761C028EF70636341C052BC170;SizeAsReceived:8269;Count:47 x-ms-exchange-messagesentrepresentingtype: 1 x-tmn: [jwLQC0xdOA6YqpGPWmKfYHMLLCwea09i4/kU7R7YbVlQBn0U0bhPzUD5oTBnE70J0EBMkGqlETw=] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1;DB5EUR01HT057;6:JecWgrMbiE5dKB82LgAxgiEHVNQOjBQr9wt5PDTZBgOHv/AHXMYzVcAQvCfGe/7+9Dkhq31vO5H9YXH/KHtwGlk3IAxg1pRAtMhpY5BoliU9PgJ3RTGV1hxVgoiDpPrpb8jEk21PTwDomGt5QI9hmIiT4/yByH/k5KbHq1v9cGSpllfutIF2+LIf4sVYr8D0zTZB0ebEJB8SAlMRNGJdPBw0bnBui5IhJxtaNXVQk0wNeRoUCwYBEtwX+VoyZSymqXiYDfw8izEWm9yOELQHt5uGU+FpmY/OCKnl+3An80GZPcu5ffBCzbJ28pywiOvznCE3sbTozqUvyLnRtI9sNjUwz18yhSZdwtA4Z2I6OJk=;5:qpTm97uFWKX7xFx1XHh7RsfN3uYp/w/f3EqkYOdlZKTjudTwMt7dQAc70gjdQ8yjnfDiQCYYq/f6X7qd31cQubJNWXd2yh+2OC7zTpQb+yt5WLNTVzJFYQraSJ+pJvCTfFCKnW4+JGfRUTGans8XwCvD8GGu9o3oxaNpZ2XYco4=;24:wYQhuNJN113XJK14vc7VpvOYiHpwqEkm6+lCrPTqQAdb25YLFyG92DWj38a7BnYmZ/8mLSoHNqDBui0YtmsuZ4joDY5u2vANi1Hd9ZDjMIA=;7:BQUsuV0xnAnZJZ/QxL4TONUU4iM3DUo3lsmyDOWFd3yKEO4fVTfg6muXiJaGDWG+5EVPoLGD1QbySKBSdkVc8h9s8J2CBNKLzXjDRq8nBNUUw0b5NHVhXJWbAdL6aP/ne7qx1z3zVvqqK4/ppVY2CM0ssg3PQtZJc5gNlbCZOYt1/d2wOTDAgKkM0dDJnBqCw1b5LT/BMnM0OoHYpB0vKHygJGC+ETjSOgnLQ7t75nBUYIIm6sPa2sE/AOaIJydP x-incomingheadercount: 47 x-eopattributedmessage: 0 x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(7020095)(201702061078)(5061506573)(5061507331)(1603103135)(2017031320274)(2017031324274)(2017031323274)(201702181274)(2017031322404)(1601125374)(1603101448)(1701031045);SRVR:DB5EUR01HT057; x-ms-exchange-slblob-mailprops: =?iso-8859-1?Q?IkxdR8L5RzGsNV/XsyHItOBvvBxwx38amulsCvXDVrjToD2osjKSDxiW7i?= =?iso-8859-1?Q?c8WShR7rUaVtiefwvizOp2JEz2CXtXtAI2trHZP8IDweIMRxHC4SbV0T3r?= =?iso-8859-1?Q?LXawdrQTK3znVZzDgyl+WOpmYvwT1JzZ4HpMKhjKXwP8hbV+57STmsssEK?= =?iso-8859-1?Q?R1F8tMcy7krqfow4dtem7XbepsB6Zf7/wwapeX7jWa7U5N8cx/qLBwoCaA?= =?iso-8859-1?Q?ZwwL42ck0YkuMptsLEUvfNZKxnqGFdbsfvcHkyHfff4xFMvsE/8tnMJI1C?= =?iso-8859-1?Q?F7E37aK9QFvtNyqHtzpnouUrg+RJBlpUzniDnzQUfnRZAlcdqjnALY7kBo?= =?iso-8859-1?Q?5FkGnq4nDWa3BpTVZy5aNVX0+6GOb2oqbcEA5S/uO3lDweY5eJ3SkWNmW5?= =?iso-8859-1?Q?Sg6f3Htr2DrYsq2CWXDtIS7y23kZLtHW1G9obe0/vrcWa2f7AxUe7NhEo3?= =?iso-8859-1?Q?4FAcx+M03D2mmuiBtJ2yDRpBDnyS492qg/dEeh9ksQhznr41vhHaKuIqF6?= =?iso-8859-1?Q?AtLUM8W8Q6wlF5qMoGhSGPurt99UIaZZEgwlXCfs4oLF8T/cb25VuIA7pm?= =?iso-8859-1?Q?dJ6o13aZAT4vThI77jwVY4iafDW1NPq7NwBSEdPS+saXwCZcwLUbhA56cp?= =?iso-8859-1?Q?wQeOFH+6Rx0M2VMlaQGa+dHWw8/EhbLJSPXUksE79O4vMwmylZY4i5aELu?= =?iso-8859-1?Q?YhuNdOlVzcuA0MbCFc6suEUH0tieTqgQJK0oPGKC9EUuTdiGdPmsjPO9oM?= =?iso-8859-1?Q?yiDw5HUlofLSDeaZhehbn9qVCmsmP6al1i5o/fJhDDgumpJt7HrUDfNB2S?= =?iso-8859-1?Q?E+yCXY4VCJ22INykaogfKueEbKPFJ857JJMAjTYdd9uOCIJIVrXC4g4jTX?= =?iso-8859-1?Q?aT3/RAnvuLx8wzsX88yMIV3RdJuj0fOW8NGzFxDz3gmz3v8CDYXQgVjWc6?= =?iso-8859-1?Q?NGcxRA4IXHqA+FvZU7J5AEth5IYCfvjyLcrxU4CnM07boVFhKZImL3sEVv?= =?iso-8859-1?Q?LSdS/EbFxIjh0tz8E5EcAG/CB6+KMERoM2b3qWbMCRwZEpc1OgbFpox+Uo?= =?iso-8859-1?Q?Cw=3D=3D?= x-ms-traffictypediagnostic: DB5EUR01HT057: x-ms-office365-filtering-correlation-id: 2feb9953-ff4a-4453-3ee3-08d590cf7fb9 x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(444000031);SRVR:DB5EUR01HT057;BCL:0;PCL:0;RULEID:;SRVR:DB5EUR01HT057; x-forefront-prvs: 0620CADDF3 x-forefront-antispam-report: SFV:NSPM;SFS:(7070007)(98901004);DIR:OUT;SFP:1901;SCL:1;SRVR:DB5EUR01HT057;H:AM2PR05MB1186.eurprd05.prod.outlook.com;FPR:;SPF:None;LANG:; x-microsoft-antispam-message-info: MjnlvOQrWPmE4oO/Cmya0xJjaIpKnTcj/GdXxYl0XV4roWsdELx+zDaRmsU6uf5i8BnTDUvl1VYa/EL5bwbG9N9ml/MIxa2yyLd8BSvQ8BgXCpM4pXYr1+BUxRMWXFR5N3SDME5Ndu0QxjR9mVF1qyJ+Pq3IdsQwwL8isIhJM4EGoaB8pMSi3j7QcCPRuH1i spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM MIME-Version: 1.0 X-OriginatorOrg: hotmail.com X-MS-Exchange-CrossTenant-Network-Message-Id: 2feb9953-ff4a-4453-3ee3-08d590cf7fb9 X-MS-Exchange-CrossTenant-originalarrivaltime: 23 Mar 2018 15:05:21.7019 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Internet X-MS-Exchange-CrossTenant-id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB5EUR01HT057 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-SW-Source: 2018-03/txt/msg00229.txt.bz2 >On Wed, Mar 21, 2018 at 8:04 PM, Sebastiaan Peters > wrote: >>>On Wed, Mar 21, 2018 at 9:50 AM, Sebastiaan Peters >>> wrote: >>>>>On Tue, Mar 20, 2018 at 3:49 PM, David Malcolm w= rote: >>>>>> On Tue, 2018-03-20 at 14:02 +0100, Richard Biener wrote: >>>>>>> On Mon, Mar 19, 2018 at 9:55 PM, Richard Biener >>>>>>> wrote: >>>>>>> > On March 19, 2018 8:09:32 PM GMT+01:00, Sebastiaan Peters >>>>>> > 7@hotmail.com> wrote: >>>>>>> > > > The goal should be to extend TU wise parallelism via make to >>>>>>> > > > function >>>>>>> > > >>>>>>> > > wise parallelism within GCC. >>>>>>> > > >>>>>>> > > Could you please elaborate more on this? >>>>>>> > >>>>>>> > In the abstract sense you'd view the compilation process separated >>>>>>> > into N stages, each function being processed by each. You'd assign >>>>>>> > a thread to each stage and move the work items (the functions) >>>>>>> > across the set of threads honoring constraints such as an IPA sta= ge >>>>>>> > needing all functions completed the previous stage. That allows y= ou >>>>>>> > to easier model the constraints due to shared state (like no pass >>>>>>> > operating on two functions at the same time) compared to a model >>>>>>> > where you assign a thread to each function. >>>>>>> > >>>>>>> > You'll figure that the easiest point in the pipeline to try this >>>>>>> > 'pipelining' is after IPA has completed and until RTL is generate= d. >>>>>>> > >>>>>>> > Ideally the pipelining would start as early as the front ends >>>>>>> > finished parsing a function and ideally we'd have multiple >>>>>>> > functions in the RTL pipeline. >>>>>>> > >>>>>>> > The main obstacles will be the global state in the compiler of >>>>>>> > which there is the least during the GIMPLE passes (mostly cfun and >>>>>>> > current_function_decl plus globals in the individual passes which >>>>>>> > is easiest dealt with by not allowing a single pass to run at the >>>>>>> > same time in multiple threads). TLS can be used for some of the >>>>>>> > global state plus of course some global data structures need >>>>>>> > locking. >>>> >>>> This would mean that all the passes have to be individually analyzed f= or >>>> which global state >>>> they use, and lock/schedule them accordingly? >>> >>>Their private global state would be excempt by assuring that a pass never >>>runs twice at the same time. >>> >>>The global state that remains should be the same for all passes we are t= alking >>>about (during the late GIMPLE optimization phase which I'd tackle). >>> >>>> If this is the case, is there any documentation that describes the pre= -reqs >>>> for each pass? >>>> I have looked at the internal documentation, however it seems that all= of >>>> this still has to be created? >>> >>>The prereqs are actually the same and not very well documented (if at al= l). >>>There's the global GC memory pool where we allocate statements and >>>stuff like that from (and luckyly statements themselves are function pri= vate). >>>Then there's global trees like types ('int') where modification needs to= be >>>appropriately guarded. Note that "modification" means for example >>>building a new type for the address of 'int' given that all different >>>pointer types >>>to 'int' are chained in a list rooted in the tree for 'int'. That >>>means (a few?) >>>tree building helpers need to be guarded with locks. I don't have a gre= at >>>idea how to identify those apart from knowing them in advance or running >>>into races ... my gut feeling is that there's not a lot to guard but I m= ay >>>be wrong ;) >> >> What does it mean to be a node of a type tree? >> Does it describe information about that type, >> or does it keep a reference to where something >> of that type has been declared? > >On the GIMPLE level we have two main kinds of data objects, one is >'gimple' (and derived clases) for statements and 'tree' for operands, >types, declarations. 'tree's is also what GENERIC is represented in, the >IL that gets produced by the frontends which is lowered to GIMPLE by >the gimplifier. > >On GENERIC everything is a 'tree' (there's a class hierarchy of trees >implemented in C ways via unions), types are, and so are decls >and expressions (in GENERIC). You can look at tree.h (and tree-core.h >for implementation details) and for example see > >#define TYPE_POINTER_TO(NODE) (TYPE_CHECK (NODE)->type_common.pointer_to) > >where tree.c:build_pointer_type_for_mode does > >/* First, if we already have a type for pointers to TO_TYPE and it's >the proper mode, use it. */ >for (t =3D TYPE_POINTER_TO (to_type); t; t =3D TYPE_NEXT_PTR_TO (t)) >if (TYPE_MODE (t) =3D=3D mode && TYPE_REF_CAN_ALIAS_ALL (t) =3D=3D can= _alias_all) >return t; > >so if a pass builds a new pointer type that doesn't already exist we >modify this list. >This means an existing type tree is modified even if the type itself >isn't modified. > >>>> As to how this would be implemented, it seems the easiest way would be= to >>>> extend the macros to >>>> accept a pre-req pass. This would encourage more documentation since t= he >>>> dependencies >>>> become explicit instead of the current implicit ordering. >>> >>>Actually the order is quite explicit. Maybe I now understand your >>>question - no, >>>passes do not "communicate" between each other via global state, all such >>>state is per function and the execution order of passes on a given funct= ion >>>is hard-coded in passes.def. >>> >>>> Assuming the dependencies for the all the tree-ssa passes have to be >>>> individually analyzed. >>>> Currently I have this as my timeline: >>>> - Parallelize the execution of the post-IPA pre-RTL, and a few tre= e-ssa >>>> passes (mid-may - early june) >>>> - Test for possible reproducibility issues for the binary/debug in= fo >>>> (early june - late june) >>>> - Parallelize the rest of tree-ssa (late june - late july) >>>> - Update documentation and test again for reproducibility issues (= late >>>> july - early august) >>>> >>>> Would this be acceptable? >>> >>>Sounds ambitious ;) But yes, it sounds reasonable. I don't exactly >>>understand what "Parallelize the rest of tree-ssa" means though. If >>>you want to tackle a tiny bit more I'd rather include "RTL" by treating >>>the whole RTL part as a single "pass" (as said only one function can >>>be in RTL right now). >>> >> >> I was under the assumption that passes had to be indivdually analysed >> when I wrote that. The timeline is now updated to include some extra >> time in the beginning to familiarize myself a bit more with the project, >> and added the RTL as an optional deliverable:) >> then >> I added a draft of the proposal[0], any feedback would be highly appreci= ated. > >The poposal looks good to me! > >Thanks, >Richard. Awesome, I'll submit this is the coming days. Thanks for all your help, it has helped tremendously. I'm looking forward to (hopefully) work with you this summer. Kind regards, Sebastiaan Peters >>>>>>> Oh, and just to mention - there are a few things that may block >>>>>>> adoption in the end >>>>>>> like whether builds are still reproducible (we allocate things like >>>>>>> DECL_UID from >>>>>>> global pools and doing that somewhat randomly because of threading >>>>>>> might - but not >>>>>>> must - change code generation). Or that some diagnostics will appe= ar >>>>>>> in >>>>>>> non-deterministic order, or that dump files are messed up (both >>>>>>> issues could be >>>>>>> solved by code dealing with the issue, like buffering and doing a r= e- >>>>>>> play in >>>>>>> program order). I guess reproducability is important when it comes >>>>>>> down to >>>>>>> debugging code-generation issues - I'd prefer to debug gcc when it >>>>>>> doesn't run >>>>>>> threaded but if that doesn't reproduce an issue that's bad. >>>>>>> >>>>>>> So the most important "milestone" of this project is to identify su= ch >>>>>>> issues and >>>>>>> document them somewhere. >>>>>> >>>>>> One issue would be the garbage-collector: there are plenty of places= in >>>>>> GCC that have hidden assumptions that "a collection can't happen her= e" >>>>>> (where we have temporaries that reference GC-managed objects, but wh= ich >>>>>> aren't tracked by GC-roots). >>>>>> >>>>>> I had some patches for that back in 2014 that I think I managed to d= rop >>>>>> on the floor (sorry): >>>>>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01300.html >>>>>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01340.html >>>>>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01510.html >>>> >>>> Would there be a way to easily create a static analyzer to find these >>>> untracked temporaries? >>> >>>I don't think so. >>> >>>> A quick look at registered passes makes me count ~135 tree-ssa passes, >>>> So your code on analyzing what globals are referenced where might come= in >>>> handy while analyzing if passes are easily parallelized. >>> >>>I think the "solution" to the garbage collecting issue is to simply keep >>>that serialized. It's currently done on-demand anyway at certain >>>safe collection points so the work scheduler can simply hold off >>>scheduling more work when the collector would want to run, waiting for >>>running jobs to complete. >>> >>>Richard. >>> >>>>>> The GC's allocator is used almost everywhere, and is probably not >>>>>> thread-safe yet. >>>>>Yes. There's also global tree modification like chaining new >>>>>pointer types into TYPE_POINTER_TO and friends so some >>>>>helpers in tree.c need to be guarded as well. >>>>>> FWIW I gave a talk at Cauldron 2013 about global state in GCC. Bewa= re: >>>>>> it's five years out-of-date, but maybe is still relevant in places? >>>>>> https://dmalcolm.fedorapeople.org/gcc/global-state/ >>>>>> https://gcc.gnu.org/ml/gcc/2013-05/msg00015.html >>>>>> (I tackled this for libgccjit by instead introducing a mutex, a "big >>>>>> compiler lock", jit_mutex in gcc/jit/jit-playback.c, held by whichev= er >>>>>> thread is calling into the rest of the compiler sources). >>>>>> >>>>>> Hope this is helpful >>>>>> Dave >>>>>> >>>>>> >> >> [0] https://docs.google.com/document/d/1YkKYI3J-pKFfHLdxljWVoJ89l2oVGibx= slIW3ikluz8/edit?usp=3Dsharing