Date: Tue, 08 Mar 2016 22:13:00 -0000
Subject: Re: [gimplefe] [gsoc16] Gimple Front End Project
From: Diego Novillo
To: David Malcolm
Cc: Manuel López-Ibáñez, Richard Biener, Trevor Saunders,
    Prasad Ghangal, gcc Mailing List, sandeep@gcc.gnu.org

On Tue, Mar 8, 2016 at 4:59 PM, David Malcolm wrote:

> My goal for unit-testing passes is to be able to dump/reload the GIMPLE
> IR in a form that's:
> (A) readable by both humans and programs, and
> (B) editable by humans
> (C) roundtrippable for some subset of the IR
> (D) can support the input for every gimple pass (pre-CFG,
> CFG-before-SSA, CFG-with-SSA, with/without lowered switches etc)
> (E) can support the output of every gimple pass, apart from the final
> expansion to RTL.
>
> AIUI, Richard would also like:
> (F) the form to be parsable as C
> (presumably some subset)
>
> LLVM IR is likely similar to GIMPLE IR, but AFAIK, LLVM IR requires
> SSA, which would seem to preclude using it (goals (D) and (E) above).

LLVM IR is an SSA IR, yes. It also has two different representations,
a text-based one parseable with its front end, and a binary one
(bitcode) which is more efficient for purposes of LTO and such
(similar to the GIMPLE bytecode lto front end).

LLVM IR is actually lower in the IR abstraction spectrum. It's closer
to RTL than it is to GIMPLE. For instance, its type system is very
similar to RTL: words, pointers and offsets. A few machine features
creep in, but not many. Things like function calls look very much
like a C function call.

> Also, there may be other things we'd want to express in GIMPLE that
> might not be directly expressible in LLVM IR (Richard alluded to this
> earlier in this thread: the on-the-side data: range info, points-to
> info, etc).
> Though I suspect converters may be feasible for some
> common subset of SSA IR.
>
> Regarding goal (F) above, AFAIK, LLVM IR has a textual assembly form and
> a bitcode form; does LLVM IR have a subset-of-C form?

Well, in the sense that it kinda looks like a very low level version
of C. For instance,

int sum(int y, int x) { return x + y; }

becomes:

define i32 @sum(i32 %y, i32 %x) #0 {
entry:
  %y.addr = alloca i32, align 4
  %x.addr = alloca i32, align 4
  store i32 %y, i32* %y.addr, align 4
  store i32 %x, i32* %x.addr, align 4
  %0 = load i32, i32* %x.addr, align 4
  %1 = load i32, i32* %y.addr, align 4
  %add = add nsw i32 %0, %1
  ret i32 %add
}

Notice all the word operations and the SSA nature of the IL itself.

I'm hardly in a position to offer guidance here, but I'm not sure that
overloading the gimple FE on the c-family FE is desirable long term.
The two parsers will inevitably differ in what they want to accept,
in their error messages, etc.

For modularity purposes, having a separate module dealing with GIMPLE
itself might be better than piggybacking functionality on the C FE.
This way, implementing a library that supports dealing with GIMPLE
becomes much simpler. This provides a nice foundation for all kinds
of gimple-oriented tooling in the future.


Diego.
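
One way to read the unoptimized IR listing above is as a line-by-line
transliteration back into C. The sketch below is only a reading aid and
not anything LLVM or GCC produces: sum_lowered and its local names are
made up for illustration, and each comment names the IR line the
statement is meant to mimic. It assumes the listing is the unoptimized
output, where every argument goes through a stack slot first.

```c
/* Original function from the message. */
int sum(int y, int x) { return x + y; }

/* Hypothetical hand transliteration of the IR listing; not compiler output. */
int sum_lowered(int y, int x)
{
    int y_addr;              /* %y.addr = alloca i32, align 4        */
    int x_addr;              /* %x.addr = alloca i32, align 4        */

    y_addr = y;              /* store i32 %y, i32* %y.addr           */
    x_addr = x;              /* store i32 %x, i32* %x.addr           */

    /* Each temporary is assigned exactly once, as in SSA form. */
    const int t0 = x_addr;   /* %0 = load i32, i32* %x.addr          */
    const int t1 = y_addr;   /* %1 = load i32, i32* %y.addr          */
    const int add = t0 + t1; /* %add = add nsw i32 %0, %1            */

    return add;              /* ret i32 %add                         */
}
```

Calling sum(2, 3) and sum_lowered(2, 3) gives the same result for any
pair of arguments whose sum does not overflow. Signed overflow is
undefined in C, which is the property the nsw flag on the add records.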