From: Doug Evans
Date: Mon, 02 Dec 2002 10:27:00 -0000
To: "J. Grant"
Cc: binutils, cgen@sources.redhat.com
Subject: Decompilation
Message-ID: <15851.42465.790262.369771@casey.transmeta.com>
In-Reply-To: <3DE6AFF4.3010703@jguk.org>
References: <3DE6AFF4.3010703@jguk.org>

J. Grant writes:
> I am currently working on some decompilation methods/ideas. I have been
> looking at the suitability of implementing using the GNU tools as a
> base. I realise this is a very complex process, so would like to ask
> people's opinions before diving in and coding in all the wrong places.
>
> I would like to achieve something similar to the way that gcc is the
> front end for compiling. For each of the stages below I would welcome
> suggested areas of binutils/GCC that I should focus my work on.
> I have been modifying objdump to produce the intermediate code. Clearly
> a lot of new code needs to be written to complete this work. If anyone
> has suggestions for the direction I should take, they are welcome.
>
> Stage 1: Front end
>     Input machine code binary
>     Disassemble
>     Abstract intermediate code generation
>     Intermediate code output

What if you used cgen for stage 1?
I've always wanted to add the rtl to the opcodes files of cgen (*1),
but haven't had a reason or impetus to. With that (and some suitable
cover/utility fns) I believe you could easily go from binary to
intermediate code (*2). Only for the targets that cgen supports, of course.

(*1): At some point I've been expecting binutils to want to boot
cgen out of libopcodes. I dunno. But I've always wanted to create
libcgen too. There are a lot more ISA utilities that can be provided
with cgen and should be made available in the form of a library, but
whether they belong in libopcodes and should ship with binutils is
certainly debatable.

(*2): pedantic: insert all the usual caveats of determining what's
code and what's data.

> Stage 2: Universal decompilation machine (UDM)
>     CFG generation
>     CFG analysis
>     Data Format analysis
>
> Stage 3: Backend HLL target
>     HLL constructs identified
>     HLL output
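
To make the stage 1 suggestion a bit more concrete, here is a toy sketch
of going from binary to intermediate code when each instruction's
description carries its semantics. It is not cgen's interface: the 2-byte
ISA, INSN_TABLE, substitute() and lift() are all made up for illustration,
and the RTL-like tuples only loosely mimic the idea of rtl attached to an
opcode entry.

```python
# Toy illustration only: the ISA, the table layout, and every name here
# are hypothetical. Nothing below is part of cgen or libopcodes.

# opcode byte -> (mnemonic, RTL-like semantics template).
# "rd", "rs" and "imm" are placeholders filled in from the operand byte.
INSN_TABLE = {
    0x01: ("mov", ("set", "rd", "rs")),
    0x02: ("add", ("set", "rd", ("add", "rd", "rs"))),
    0x03: ("ldi", ("set", "rd", "imm")),
}


def substitute(template, fields):
    """Replace operand placeholders in a template with decoded values."""
    if isinstance(template, tuple):
        return tuple(substitute(part, fields) for part in template)
    return fields.get(template, template)


def lift(code):
    """Decode 2-byte instructions into (address, mnemonic, rtl) triples."""
    out = []
    for addr in range(0, len(code) - 1, 2):
        opcode, operand = code[addr], code[addr + 1]
        if opcode not in INSN_TABLE:
            # Might be data rather than code; a real tool has to decide.
            out.append((addr, "unknown", ("raw", opcode, operand)))
            continue
        mnemonic, template = INSN_TABLE[opcode]
        fields = {
            "rd": f"r{operand >> 4}",    # high nibble: destination register
            "rs": f"r{operand & 0xF}",   # low nibble: source register
            "imm": operand & 0xF,        # low nibble: 4-bit immediate
        }
        out.append((addr, mnemonic, substitute(template, fields)))
    return out


if __name__ == "__main__":
    for addr, mnemonic, rtl in lift(bytes([0x03, 0x15, 0x02, 0x12, 0xFF, 0x00])):
        print(f"{addr:04x}  {mnemonic:8s} {rtl}")
```

The point of the table-driven shape is that the decoder and the
intermediate code come from the same per-instruction description, so a
later stage (CFG generation, say) only ever sees the tuples and never the
raw encoding. The "unknown" entry is where the code-versus-data caveat
from (*2) shows up.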