Subject: Re: GCC/JIT and precise garbage collection support?
From: David Malcolm
To: Basile Starynkevitch
Cc: jit@gcc.gnu.org, gcc@gcc.gnu.org
Date: Thu, 01 Jan 2015 00:00:00 -0000
Message-ID: <1436493224.31573.32.camel@surprise>
In-Reply-To: <559EF2F1.6000000@starynkevitch.net>

On Fri, 2015-07-10 at 00:17 +0200, Basile Starynkevitch wrote:
> Hello All,
>
> (this is triggered by a question on the OCaml mailing list asking about
> a SystemZ backend in OCaml; SystemZ is today a backend for GCC & probably
> GCCJIT)
>
> We might want to better support good garbage collection schemes in GCC,
> particularly in GCCJIT. This is a thing that LLVM is known to be weak at,
> and we might aim to do much better. If we did, good frontends for good
> functional languages (e.g. F#, OCaml, Haskell) might in the future
> profit from GCC technology.
> And even a JavaScript engine based on GCCJIT could profit.

FWIW PyPy (an implementation of Python) defaults to using true GC, and
could benefit from GC support in GCC; currently PyPy has a nasty hack
for locating on-stack GC roots, by compiling to assembler, then carving
up the assembler with regexes to build GC metadata.

(IIRC this is the --gcrootfinder=asmgcc option here:
http://pypy.readthedocs.org/en/latest/config/translation.gcrootfinder.html
)

> A good GC is very probably a precise (sometimes generational copying) GC
> with write barriers
> (read the http://gchandbook.org/ for more, or at least the wikipage
> about garbage collection). So a good GC changes pointers.
>
> So we need to know where pointer values are located in the call stack
> (of the GCCJIT generated code), with a mechanism for querying that, and
> probably provide some write barrier machinery.
>
> In my incomplete understanding, this requires cooperation between the GCC
> backend and middle-end; it perhaps means at the GIMPLE level that we mark
> some trees for local variables as being required to be spilled (by the
> backend) at some well defined location in the call frame, and be able to
> query that location (i.e. its offset).
>
> Perhaps a possible approach might be to add, at the C front-end level,
> an extra variable attribute telling that the variable should always be
> spilled at the same offset in the call frame, to have some machinery to
> query the value of that fixed offset, and to also have a GCC builtin
> which flushes all the registers into the call frame?
>
> This is just food for thought and still fuzzy in my head. Comments are
> welcome (including things like we should not care at all about GC).
>
> Notice that if we had such support for garbage collection, the (dying)
> Java front-end could be resurrected to provide a faster GC than Boehm
> GC. And GCC based compilers for languages like Go or D which have
> garbage collection could also profit.
> (even MELT might take advantage of that).

This all sounds like a lot of work.

I think a simpler first step might be to have some kind of option to
support tracking on-stack roots; presumably some kind of late RTL pass
that writes out a stack map: const data describing what GC-pointers are
live, at each %pc range, assuming we already have enough metadata to let
a collector walk the stack frames of a thread (presumably we already
have that for e.g. backtraces).

This assumes we have enough type information at the RTL phase to be able
to distinguish GC types at different places in the frame, or to punt it
and be imprecise. Though that doesn't solve GC ptrs in registers.

That said, fwiw I'm already fully tasked with things for GCC 6.
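
To make the stack-map idea concrete, here is a minimal sketch in C of
what such const data and its lookup might look like. It is not an
existing GCC or libgccjit interface: the struct layout, the 32-slot
bitmask, and the names stack_map_entry, stack_map_lookup,
scan_frame_roots and count_root are all hypothetical, chosen only for
illustration. It assumes the collector has already walked to a frame and
knows its %pc and the base of its spill slots, and, like the proposal
itself, it does nothing about GC pointers held in registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack-map entry: for code addresses in [pc_start, pc_end),
   bit i of live_slots is set when frame slot i holds a live GC pointer. */
struct stack_map_entry {
    uintptr_t pc_start;
    uintptr_t pc_end;
    uint32_t  live_slots;
};

/* Binary search for the entry covering pc; NULL when no range matches.
   Assumes entries are sorted by pc_start and do not overlap. */
static const struct stack_map_entry *
stack_map_lookup(const struct stack_map_entry *map, size_t n, uintptr_t pc)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pc < map[mid].pc_start)
            hi = mid;
        else if (pc >= map[mid].pc_end)
            lo = mid + 1;
        else
            return &map[mid];
    }
    return NULL;
}

/* Callback invoked with the address of each live root slot. */
typedef void (*root_visitor)(void **slot, void *closure);

/* Visit every slot of one frame that the map marks live at pc.
   Returns the number of roots visited (0 if pc has no map entry). */
static size_t
scan_frame_roots(const struct stack_map_entry *map, size_t n,
                 uintptr_t pc, void **frame_slots, size_t n_slots,
                 root_visitor visit, void *closure)
{
    assert(n_slots <= 32);  /* limit of the illustrative bitmask */
    const struct stack_map_entry *e = stack_map_lookup(map, n, pc);
    size_t visited = 0;
    if (e == NULL)
        return 0;
    for (size_t i = 0; i < n_slots; i++) {
        if (e->live_slots & (UINT32_C(1) << i)) {
            visit(&frame_slots[i], closure);
            visited++;
        }
    }
    return visited;
}

/* Example visitor: only counts roots. A copying collector would
   instead rewrite *slot to point at the object's new location. */
static void
count_root(void **slot, void *closure)
{
    (void)slot;
    ++*(size_t *)closure;
}
```

The visitor receives the address of the slot rather than its value, so
that a moving collector could update the pointer in place, which is the
"changing pointers" requirement from the quoted message. Frames whose
%pc has no entry are simply skipped here; a real design would have to
decide whether that means "no roots" or "scan conservatively".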