From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 21580 invoked by alias); 6 Aug 2003 02:45:08 -0000
Mailing-List: contact cgen-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: cgen-owner@sources.redhat.com
Received: (qmail 21573 invoked from network); 6 Aug 2003 02:45:07 -0000
Received: from unknown (HELO tiktok.the-meissners.org) (66.205.90.83) by sources.redhat.com with SMTP; 6 Aug 2003 02:45:07 -0000
Received: from tiktok.the-meissners.org (localhost [127.0.0.1]) by tiktok.the-meissners.org (8.12.8/8.12.8) with ESMTP id h762j7rn012955 for ; Tue, 5 Aug 2003 22:45:07 -0400
Received: (from meissner@localhost) by tiktok.the-meissners.org (8.12.8/8.12.8/Submit) id h762j6iY012953 for cgen@sources.redhat.com; Tue, 5 Aug 2003 22:45:06 -0400
Date: Wed, 06 Aug 2003 17:27:00 -0000
From: Michael Meissner
To: cgen@sources.redhat.com
Subject: Types and other issues with cgen
Message-ID: <20030806024506.GA12937@tiktok.the-meissners.org>
Mail-Followup-To: Michael Meissner , cgen@sources.redhat.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
X-SW-Source: 2003-q3/txt/msg00023.txt.bz2

I've been looking at the internal types used within cgen, and I wanted to get some comments before I start making wholesale changes. Sorry for the length, but I thought it was important to talk about the issues (#1, #4, and #8 are minor issues).

1) Cgen uses the PARAMS macro to selectively hide prototypes. Given that both GCC and BINUTILS now require a C90 compiler with prototypes, would patches that go through and completely prototype things be accepted?

2) Cgen has a type mechanism (DI/SI/etc.) but it doesn't seem to be used in the actual code for at least the assembler and disassembler (I haven't gotten to sim/sid yet). All fields in the cgen_fields structure are signed long, no matter what type I declare in the .cpu file.
In part this seems to be because extract_normal and friends take the address of the field to fill, and return 0/1 for error and success. Wouldn't a better approach be to size and type the fields as the user specified, and make the extract functions return the extracted value and report error/success via a pointer? I could see either separate extractor functions for each type, or signed/unsigned extractor functions of the widest type, or just a single extract function being used.

3) Signed long is another problem in that the machine I'm targeting is a 64-bit machine, but I am doing development on an x86 machine. If we keep to a single type, it should be at least bfd_signed_vma, which will be the appropriate size to hold addresses in the target machine. This will mean having to rewrite the places that just call printf or the print functions, but that is not too difficult.

Another possibility is to use a cgen specific type (or two types for signed/unsigned) that is sized to be as large as the largest type used in the .cpu file. Ideally for 32-bit ports on 32-bit hosts, you would not slow things down by using 64 bit types blindly, but it would allow those of us developing for larger hosts to use cgen.

There are machines out there with 128 bit registers, such as the MIPS chip that is at the heart of the Sony Playstation, the SSE2 registers on the Pentium 4, and the Altivec registers on the newer PowerPCs. However, C compilers often don't provide 128 bit types. We might want to think about how to handle these machines as well. In terms of instruction size, I do have an 86 bit instruction, which pushes the problem as well. This may require using gmp if needed. Too bad we aren't coding in C++, where we could just define a class type to get the extra precision.

4) As a nit, we use unsigned int for the hash type, and I suspect it might be cleaner if we had a cgen specific type for holding hash values (i.e., cgen_hash_t).
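As an illustration of the calling convention suggested in item 2, here is a minimal sketch in which the extractor returns the value and reports success through a pointer. Everything in it is hypothetical: the names sketch_extract_unsigned and sketch_extract_signed, the cgen_uint_t/cgen_int_t typedefs (shown as fixed 64-bit stand-ins where a real port might use bfd_vma/bfd_signed_vma), and the bit numbering (bit 0 is the most significant bit of the word). It is not cgen's extract_normal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a cgen-specific value type.  A real port
   might use bfd_vma / bfd_signed_vma here instead.  */
typedef uint64_t cgen_uint_t;
typedef int64_t cgen_int_t;

/* Return the LENGTH-bit field that starts START bits below the most
   significant bit of the WORD_LENGTH-bit word INSN.  *OK is set to 1 on
   success and to 0 (with a return value of 0) for an invalid field.  */
static cgen_uint_t
sketch_extract_unsigned (cgen_uint_t insn, unsigned word_length,
                         unsigned start, unsigned length, int *ok)
{
  cgen_uint_t mask;

  assert (ok != NULL);
  if (word_length == 0 || word_length > 64
      || length == 0 || length > word_length
      || start > word_length - length)
    {
      *ok = 0;
      return 0;
    }
  /* A shift by 64 would be undefined, so handle the full-width mask
     separately.  */
  mask = length == 64 ? ~(cgen_uint_t) 0 : ((cgen_uint_t) 1 << length) - 1;
  *ok = 1;
  return (insn >> (word_length - start - length)) & mask;
}

/* Same as above, but sign-extend the field from its top bit.  */
static cgen_int_t
sketch_extract_signed (cgen_uint_t insn, unsigned word_length,
                       unsigned start, unsigned length, int *ok)
{
  cgen_uint_t raw
    = sketch_extract_unsigned (insn, word_length, start, length, ok);
  cgen_uint_t sign;

  if (!*ok)
    return 0;
  sign = (cgen_uint_t) 1 << (length - 1);
  /* (raw ^ sign) - sign propagates the sign bit; the final cast assumes
     the usual two's complement conversion.  */
  return (cgen_int_t) ((raw ^ sign) - sign);
}
```

A caller would write something like `value = sketch_extract_signed (insn, 16, 0, 4, &ok);` and test ok afterwards, so the field itself can be declared with whatever type the .cpu file asked for.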
5) As an experiment, I compiled cgen with -Wconversion, and it showed a lot of places where implicit signed<->unsigned conversions were going on. A lot of the places were using int to hold sizes like buffer lengths, and passing sizeof(...) as the value, where size_t would be more useful. Unfortunately it also flags other places where having a single type for the fields (such as long currently, or bfd_signed_vma/cgen_int_t possibly in the future) is what causes the conversions. One of my thoughts is to have a union of appropriate unsigned and signed types of the same size, and use the appropriate element in the expansion.

6) Using bfd_put_bits and bfd_get_bits to convert the bits into proper endian format only works for bit sizes of 8, 16, 32, and 64. For all other sizes, bfd aborts (my machine has mostly 43 bit instructions, and one 86 bit instruction before the encoding mentioned in #7). It might be better to open code this, rather than falling back to the bfd functions. Another idea is to always encode instructions as a series of bytes in big endian (or little endian) format, and then expect the final assembler encoding to do the appropriate copying. Otherwise, I see a lot of code that checks the endianness to get the correct byte.

7) As I have mentioned in the past, my machine uses three 43-bit instructions that are encoded into a 128 bit super instruction. Any ideas for the syntax for specifying the encode/decode operations?

8) The @arch@_cgen_hw_table uses (PTR) in initializing the asm_data field. This makes debugging harder. Would it be possible to have 2 fields so that each member is correctly typed, and you can print out pointers in the debugger?

So, suggestions on how you would like me to extend cgen to handle the problems my machine exposes? My initial thoughts are to use a cgen specific type for the types. The first round would use bfd_vma/bfd_signed_vma, but eventually size the type based on the maximum size used in the .cpu file.
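To make the "open code this" idea from item 6 concrete, here is a rough sketch of bit-field accessors that work on a byte buffer holding the instruction in big endian order, for any width from 1 to 64 bits (so 43 works). The function names sketch_get_bits_be and sketch_put_bits_be and the bit numbering (bit 0 is the most significant bit of buf[0]) are assumptions made for this example; they are not bfd or cgen interfaces, and the sketch says nothing about how the 128 bit encoding in item 7 is laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read NBITS (1..64) bits from BUF starting at bit offset BITPOS, where
   bit 0 is the most significant bit of buf[0].  The caller must make
   sure the buffer covers the requested range.  */
static uint64_t
sketch_get_bits_be (const unsigned char *buf, size_t bitpos, unsigned nbits)
{
  uint64_t value = 0;
  unsigned i;

  assert (buf != NULL && nbits >= 1 && nbits <= 64);
  for (i = 0; i < nbits; i++)
    {
      size_t bit = bitpos + i;
      unsigned b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;

      value = (value << 1) | b;
    }
  return value;
}

/* Store the low NBITS (1..64) bits of VALUE into BUF at bit offset
   BITPOS, most significant bit first, leaving all other bits alone.  */
static void
sketch_put_bits_be (unsigned char *buf, size_t bitpos, unsigned nbits,
                    uint64_t value)
{
  unsigned i;

  assert (buf != NULL && nbits >= 1 && nbits <= 64);
  for (i = 0; i < nbits; i++)
    {
      size_t bit = bitpos + i;
      unsigned char mask = (unsigned char) (1u << (7 - bit % 8));

      if ((value >> (nbits - 1 - i)) & 1u)
        buf[bit / 8] |= mask;
      else
        buf[bit / 8] &= (unsigned char) ~mask;
    }
}
```

Going one bit at a time is slow but has no endianness checks and no width restrictions; a byte-at-a-time version would be the obvious refinement if this turned out to matter.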
I'm thinking of using the union with signed and unsigned fields, to deal with many of the conversion issues.

-- 
Michael Meissner
email: gnu@the-meissners.org
http://www.the-meissners.org