From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <cgen-return-1447-listarch-cgen=sources.redhat.com@sources.redhat.com>
Received: (qmail 21580 invoked by alias); 6 Aug 2003 02:45:08 -0000
Mailing-List: contact cgen-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:cgen-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/cgen/>
List-Post: <mailto:cgen@sources.redhat.com>
List-Help: <mailto:cgen-help@sources.redhat.com>, <http://sources.redhat.com/lists.html#faqs>
Sender: cgen-owner@sources.redhat.com
Received: (qmail 21573 invoked from network); 6 Aug 2003 02:45:07 -0000
Received: from unknown (HELO tiktok.the-meissners.org) (66.205.90.83)
  by sources.redhat.com with SMTP; 6 Aug 2003 02:45:07 -0000
Received: from tiktok.the-meissners.org (localhost [127.0.0.1])
	by tiktok.the-meissners.org (8.12.8/8.12.8) with ESMTP id h762j7rn012955
	for <cgen@sources.redhat.com>; Tue, 5 Aug 2003 22:45:07 -0400
Received: (from meissner@localhost)
	by tiktok.the-meissners.org (8.12.8/8.12.8/Submit) id h762j6iY012953
	for cgen@sources.redhat.com; Tue, 5 Aug 2003 22:45:06 -0400
Date: Wed, 06 Aug 2003 17:27:00 -0000
From: Michael Meissner <cgen-mail@the-meissners.org>
To: cgen@sources.redhat.com
Subject: Types and other issues with cgen
Message-ID: <20030806024506.GA12937@tiktok.the-meissners.org>
Mail-Followup-To: Michael Meissner <cgen-mail@the-meissners.org>,
	cgen@sources.redhat.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
X-SW-Source: 2003-q3/txt/msg00023.txt.bz2

I've been looking at the internal types used within cgen, and I wanted to get
some comments before I start making wholesale changes.  Sorry for the length,
but I thought it was important to talk about the issues (#1, #4, and #8 are
minor issues).

1) Cgen uses the PARAMS macro to selectively hide prototypes.  Given that both GCC
   and BINUTILS now require a C90 compiler with prototypes, would patches that go
   through and completely prototype things be accepted?

2) Cgen has a type mechanism (DI/SI/etc.) but it doesn't seem to be used in the
   actual code for at least the assembler and disassembler (I haven't gotten to
   sim/sid yet).  All fields in the cgen_fields structure are signed long, no
   matter what the type that I declare in the .cpu file is.  In part this seems
   to be because extract_normal and friends take an address of the field to
   fill, and return 0/1 for error and success.  Wouldn't a better approach be
   to size & type the fields as the user specified, and make the extract
   functions return the extracted value and report error/success via a
   pointer?  I could see either separate extractor functions for each type, or
   signed/unsigned extractor functions of the widest type, or just a single
   extract function being used.
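
To make the "return the value, report status via a pointer" idea in #2
concrete, here is a minimal sketch in C.  It is not cgen's existing
extract_normal interface: the names extract_status, extract_unsigned and
extract_signed are hypothetical, the bit numbering (START counted from the
least significant bit) is an assumption, and it relies on the C99 <stdint.h>
types.  It shows the signed/unsigned pair of extractors of the widest type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; the current extractors return 0/1 instead.  */
typedef enum { EXTRACT_OK = 0, EXTRACT_BAD_RANGE = 1 } extract_status;

/* Return LENGTH bits of INSN_VALUE starting at bit START (bit 0 is the
   least significant bit).  Success or failure is stored in *STATUS.  */
static uint64_t
extract_unsigned (uint64_t insn_value, unsigned start, unsigned length,
                  extract_status *status)
{
  uint64_t mask;

  assert (status != NULL);
  if (length == 0 || length > 64 || start > 64 - length)
    {
      *status = EXTRACT_BAD_RANGE;
      return 0;
    }
  *status = EXTRACT_OK;
  mask = length == 64 ? ~(uint64_t) 0 : (((uint64_t) 1 << length) - 1);
  return (insn_value >> start) & mask;
}

/* Same as above, but sign-extend the field from its top bit.  */
static int64_t
extract_signed (uint64_t insn_value, unsigned start, unsigned length,
                extract_status *status)
{
  uint64_t raw = extract_unsigned (insn_value, start, length, status);
  uint64_t sign_bit;

  if (*status != EXTRACT_OK)
    return 0;
  sign_bit = (uint64_t) 1 << (length - 1);
  /* Sign extension done in unsigned arithmetic to avoid signed overflow;
     the final cast assumes a two's complement host.  */
  return (int64_t) ((raw ^ sign_bit) - sign_bit);
}
```

A caller would pick the signed or unsigned variant from the type declared in
the .cpu file and check *STATUS afterwards, rather than passing the address of
a signed long field.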

3) Signed long is another problem in that the machine I'm targeting is a 64-bit
   machine, but I am doing development on an x86 machine.  If we keep to a
   single type, it should be at least bfd_signed_vma which will be the
   appropriate size to hold addresses in the target machine.  This will mean
   having to rewrite the places that just call printf or the print functions,
   but that is not too difficult.  Another possibility is to use a cgen
   specific type (or two types for signed/unsigned) that is sized to be as
   large as the largest type used in the .cpu file.  Ideally for 32-bit ports
   on 32-bit hosts, you would not slow things down by using 64 bit types
   blindly, but it would allow those of us developing for larger hosts to
   use cgen.

   There are machines out there with 128 bit registers, such as the MIPS chip
   that is at the heart of the Sony PlayStation 2, the SSE2 registers on the
   Pentium IV, and the Altivec registers on the newer PowerPCs.  However, C
   compilers often do not provide 128 bit types.  We might want to think
   about how to handle these machines as well.  In terms of instruction size, I
   do have an 86 bit instruction, which raises the same problem.  This may
   require using gmp.  Too bad we aren't coding in C++, where we
   could just define a class type to get the extra precision.

4) As a nit, we use unsigned int for the hash type, and I suspect it might be
   cleaner if we had a cgen specific type for holding hash values (i.e.,
   cgen_hash_t).

5) As an experiment, I compiled cgen with -Wconversion, and it showed a lot of
   places where implicit signed<->unsigned conversions were going on.  Many of
   them use int to hold sizes such as buffer lengths and then assign
   sizeof(...) to it, where size_t would be more appropriate.  Unfortunately, it
   also shows other places where having a single type for the fields (such as
   long currently, or bfd_signed_vma/cgen_int_t possibly in the future) forces
   conversions.  One of my thoughts is to have a union of appropriately sized
   unsigned and signed types, and use the appropriate member in the expansion.
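
The union idea in #5 could look roughly like the sketch below.  Everything
here is an assumption for illustration: the names cgen_value, fits_signed and
fits_unsigned are hypothetical, and the 64 bit width stands in for whatever
size is finally chosen.  The point is that generated code picks the .s or .u
member according to the operand's declared signedness, so no implicit
signed<->unsigned conversion is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field container: same storage, two views.  */
typedef union
{
  int64_t s;    /* view used for operands declared signed */
  uint64_t u;   /* view used for operands declared unsigned */
} cgen_value;

static cgen_value
cgen_value_from_signed (int64_t v)
{
  cgen_value r;
  r.s = v;
  return r;
}

static cgen_value
cgen_value_from_unsigned (uint64_t v)
{
  cgen_value r;
  r.u = v;
  return r;
}

/* Nonzero if V, read as signed, fits in a BITS wide signed field.  */
static int
fits_signed (cgen_value v, unsigned bits)
{
  assert (bits >= 1 && bits <= 64);
  if (bits == 64)
    return 1;
  return v.s >= -((int64_t) 1 << (bits - 1))
         && v.s <= ((int64_t) 1 << (bits - 1)) - 1;
}

/* Nonzero if V, read as unsigned, fits in a BITS wide unsigned field.  */
static int
fits_unsigned (cgen_value v, unsigned bits)
{
  assert (bits >= 1 && bits <= 64);
  if (bits == 64)
    return 1;
  return v.u <= (((uint64_t) 1 << bits) - 1);
}
```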

6) Using bfd_put_bits and bfd_get_bits to convert the bits into proper endian
   format only works for bit sizes of 8, 16, 32, and 64.  For all other sizes,
   bfd aborts (my machine has mostly 43 bit instructions, and one 86 bit
   instruction, before the encoding mentioned in #7).  It might be better to
   open code this, rather than falling back to the bfd functions.

   Another idea is to always encode instructions expressed as a series of bytes
   in big endian (or little endian) format, and then expect the final assembler
   encoding to do the appropriate copying.  Otherwise, I see a lot of code that
   checks the endianness to get the correct byte.
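
As a rough illustration of open coding the bit access in #6, the sketch below
reads and writes a field of any width from 1 to 64 bits in a byte buffer that
is always treated as big endian.  The function names get_bits_be and
put_bits_be and the bit numbering (bit 0 is the most significant bit of
buf[0]) are assumptions, not existing bfd or cgen interfaces, and the
bit-at-a-time loop favors clarity over speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read NBITS (1..64) bits starting at BIT_OFFSET from BUF, treating BUF
   as a big-endian bit stream.  The caller guarantees BUF is long enough.  */
static uint64_t
get_bits_be (const unsigned char *buf, size_t bit_offset, unsigned nbits)
{
  uint64_t value = 0;
  unsigned i;

  assert (buf != NULL && nbits >= 1 && nbits <= 64);
  for (i = 0; i < nbits; i++)
    {
      size_t bit = bit_offset + i;
      unsigned b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;
      value = (value << 1) | b;
    }
  return value;
}

/* Store the low NBITS (1..64) bits of VALUE at BIT_OFFSET in BUF, most
   significant bit first, leaving all other bits of BUF unchanged.  */
static void
put_bits_be (unsigned char *buf, size_t bit_offset, unsigned nbits,
             uint64_t value)
{
  unsigned i;

  assert (buf != NULL && nbits >= 1 && nbits <= 64);
  for (i = 0; i < nbits; i++)
    {
      size_t bit = bit_offset + i;
      unsigned char mask = (unsigned char) (1u << (7 - bit % 8));

      if ((value >> (nbits - 1 - i)) & 1u)
        buf[bit / 8] |= mask;
      else
        buf[bit / 8] &= (unsigned char) ~mask;
    }
}
```

With helpers like these a 43 bit instruction can be placed at any bit offset,
and only the final copy into the output section would need to know the
target's byte order.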

7) As I have mentioned in the past, my machine uses three 43-bit instructions that
   are encoded into a 128 bit super instruction.  Any ideas for the syntax for
   specifying the encode/decode operations?

8) The @arch@_cgen_hw_table uses (PTR) in initializing the asm_data field.
   This makes debugging harder.  Would it be possible to have 2 fields so that
   each member is correctly typed, and you can print out pointers in the
   debugger?
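
A small sketch of what #8 is asking for, with entirely hypothetical type and
field names (this is not the real hardware table layout): instead of one
untyped pointer initialized through a cast, each kind of data gets its own
correctly typed pointer, and the unused one is NULL.  A debugger can then
print *entry->keywords directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload types that the table entries may point at.  */
struct example_keyword_table { const char *name; int count; };
struct example_other_data { int placeholder; };

/* Untyped form: the debugger only sees a void pointer.  */
struct hw_entry_untyped
{
  const char *name;
  void *asm_data;
};

/* Typed form: one pointer per kind of data, NULL when not applicable.  */
struct hw_entry_typed
{
  const char *name;
  const struct example_keyword_table *keywords;
  const struct example_other_data *other;
};

static const struct example_keyword_table gr_names = { "gr-names", 32 };

/* Hypothetical table contents, for illustration only.  */
static const struct hw_entry_typed hw_table[] =
{
  { "h-pc", NULL, NULL },
  { "h-gr", &gr_names, NULL },
};

/* Number of keywords attached to ENTRY, or 0 if it has none.  */
static int
hw_keyword_count (const struct hw_entry_typed *entry)
{
  assert (entry != NULL);
  return entry->keywords != NULL ? entry->keywords->count : 0;
}
```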

So, suggestions on how you would like me to extend cgen to handle the problems
my machine exposes?

My initial thoughts are to use a cgen specific type for the types.  The first
round would use bfd_vma/bfd_signed_vma, but eventually size the type based on
the maximum size used in the .cpu file.  I'm thinking of using the union with
signed and unsigned fields, to deal with many of the conversion issues.
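
One possible shape for the "size the type from the .cpu file" step, as a
sketch only: the macro EXAMPLE_MAX_FIELD_BITS is hypothetical and stands for
a value the generator would emit from the widest type used in the .cpu file;
cgen_uint_t and cgen_field_mask are likewise assumed names.  A 32-bit port
keeps 32 bit types, and a port with 43 bit fields gets 64 bit ones.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical: would be generated from the widest type in the .cpu file.  */
#ifndef EXAMPLE_MAX_FIELD_BITS
#define EXAMPLE_MAX_FIELD_BITS 43
#endif

#if EXAMPLE_MAX_FIELD_BITS <= 32
typedef int32_t cgen_int_t;
typedef uint32_t cgen_uint_t;
#elif EXAMPLE_MAX_FIELD_BITS <= 64
typedef int64_t cgen_int_t;
typedef uint64_t cgen_uint_t;
#else
#error "fields wider than 64 bits need a multi-word representation"
#endif

/* Mask with the low BITS bits set, computed in the selected type.  */
static cgen_uint_t
cgen_field_mask (unsigned bits)
{
  assert (bits >= 1 && bits <= sizeof (cgen_uint_t) * CHAR_BIT);
  if (bits == sizeof (cgen_uint_t) * CHAR_BIT)
    return ~(cgen_uint_t) 0;
  return (cgen_uint_t) (((cgen_uint_t) 1 << bits) - 1);
}
```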

-- 
Michael Meissner
email: gnu@the-meissners.org
http://www.the-meissners.org