From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 16953 invoked by alias); 5 Dec 2002 00:12:13 -0000
Mailing-List: contact cgen-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: cgen-owner@sources.redhat.com
Received: (qmail 16705 invoked from network); 5 Dec 2002 00:12:09 -0000
Received: from unknown (HELO colossus.systems.pipex.net) (62.241.160.73) by sources.redhat.com with SMTP; 5 Dec 2002 00:12:09 -0000
Received: from jguk.org (81-86-242-171.dsl.pipex.com [81.86.242.171]) by colossus.systems.pipex.net (Postfix) with ESMTP id ADCAC16000192; Thu, 5 Dec 2002 00:12:05 +0000 (GMT)
Message-ID: <3DEE9A45.5020709@jguk.org>
Date: Wed, 04 Dec 2002 16:12:00 -0000
From: "J. Grant"
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2b) Gecko/20021029
X-Accept-Language: en-gb, en
MIME-Version: 1.0
To: Doug Evans
Cc: binutils , cgen@sources.redhat.com
Subject: Re: Decompilation
References: <3DE6AFF4.3010703@jguk.org> <15851.42465.790262.369771@casey.transmeta.com>
In-Reply-To: <15851.42465.790262.369771@casey.transmeta.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2002-q4/txt/msg00051.txt.bz2

Doug Evans wrote:
> J. Grant writes:
> > I am currently working on some decompilation methods/ideas. I have been
> > looking at the suitability of implementing using the GNU tools as a
> > base. I realise this is a very complex process, so would like to ask
> > people's opinions before diving in and coding in all the wrong places.
> >
> > I would like to achieve something similar to the way that gcc is the
> > front end for compiling. For each of the stages below I would welcome
> > suggested areas of binutils/GCC that I should focus my work on. I have
> > been modifying objdump to produce the intermediate code. Clearly a lot
> > of new code needs to be written to complete this work.
> > If anyone has suggestions for the direction I should take, they are
> > welcome.
> >
> > Stage 1: Front end
> >   Input machine code binary
> >   Disassemble
> >   Abstract intermediate code generation
> >   Intermediate code output
>
> What if you used cgen for stage 1?
> I've always wanted to add the rtl to the opcodes files of cgen (*1),
> but haven't had a reason or impetus to.
> With that (and some suitable cover/utility fns) I believe you could easily
> go from binary to intermediate code (*2). Only for the targets that cgen
> supports of course.

Thank you for the reply. Having just checked cgen, I think it might be useful for this purpose. It is a very interesting project. Is the CPU description language complete? So far I have been working with C-- (http://cminusminus.org/) and modifying objdump. C-- seems more of an intermediate code language than the cgen one (from reading the docs). I am not sure of the exact organisation of GCC and binutils.

I notice the use of Scheme as the implementation language; I have been leaning towards C for my research so far. Certainly for the first phase it seems suitable IMO. I have not started on phase 2 or 3 yet (beyond considerations of techniques).

> (*1): At some point I've been expecting binutils to want to boot cgen
> out of libopcodes. I dunno. But I've always wanted to create libcgen too.
> There's a lot more ISA utilities that can be provided with cgen and should
> be made available in the form of a library, but whether they belong in
> libopcodes and shipped with binutils is certainly debatable.

Is there cgen Scheme source in binutils currently? I have not seen any as part of the opcodes dir or other areas. Is it only available separately from the home page currently?

> (*2): pedantic: insert all the usual caveats of determining what's code
> and what's data.

This should be feasible on architectures that separate data & code in different sections.
Targets such as SH, which I believe place const data inline in .text, would have to use some heuristics based on how the data is accessed, I expect.

Comments or suggestions welcome.

JG
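
To make the access-based heuristic for SH a little more concrete, here is a minimal sketch, not a tested tool. It assumes the SH encodings for the PC-relative loads are 0xDndd for mov.l @(disp,PC),Rn and 0x9ndd for mov.w @(disp,PC),Rn, with the target address computed from the instruction address plus 4; these should be checked against the SH manual. The function name find_inline_data is hypothetical. Because the displacement is unsigned, the referenced data always lies ahead of the load, so a single forward sweep can claim those bytes as data before it would otherwise try to decode them as instructions.

```python
def find_inline_data(text, base=0, big_endian=True):
    """Return the set of addresses in a .text image that a forward sweep
    believes are inline constants rather than instructions.

    Assumed SH encodings (verify against the manual before relying on them):
      0xDndd  mov.l @(disp,PC),Rn  -> 4 bytes at (addr & ~3) + 4 + disp*4
      0x9ndd  mov.w @(disp,PC),Rn  -> 2 bytes at addr + 4 + disp*2
    """
    order = "big" if big_endian else "little"
    end = base + len(text)
    data = set()
    for off in range(0, len(text) - 1, 2):
        addr = base + off
        if addr in data:
            # Already claimed by an earlier PC-relative load: do not decode it.
            continue
        word = int.from_bytes(text[off:off + 2], order)
        top, disp = word >> 12, word & 0xFF
        if top == 0xD:
            target, size = (addr & ~3) + 4 + disp * 4, 4
        elif top == 0x9:
            target, size = addr + 4 + disp * 2, 2
        else:
            continue
        if target + size <= end:  # ignore references outside this section
            data.update(range(target, target + size))
    return data


if __name__ == "__main__":
    # mov.l @(1,PC),R1 ; nop ; rts ; nop ; then a 4-byte literal whose
    # halfwords happen to look like PC-relative loads themselves.
    image = bytes.fromhex("D1010009000B0009D0009000")
    print(sorted(find_inline_data(image)))  # -> [8, 9, 10, 11]
```

This only catches constants reached through PC-relative loads; jump tables, data reached via computed addresses, and loads sitting in branch delay slots would need further caveats of the kind Doug mentions in (*2).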