Hi!

On Thu, 09 Aug 2012 20:18:27 +0200, I wrote:
> Thanks to the recent memory checking instrastructure (well, it's been
> some weeks already...), a memory corruption issue has been uncovered for
> configurations where the host double format is not equal to target double
> format.  I have seen this for SH, but don't believe it really is specific
> to SH, it's just that the very most of all GDB targets have host double
> format match the target double format.
> 
> As I'm not sufficiently experienced with GDB's expressions and type
> system, I'd like some help here.
> 
>     $ install/bin/*-gdb -q -ex 'file [...]/gdb.cp/misc' -ex 'show architecture' -ex 'print (bool)17.93'
>     Reading symbols from [...]/gdb.cp/misc...done.
>     The target architecture is set automatically (currently sh2a-or-sh3e)
>     $1 = true
>     memory clobbered past end of allocated block
>     Aborted
> 
> sh2a-or-sh3e configures for a 32-bit double format, as opposed to the
> "normal" 64-bit double format (which also is the x86_64 host's double
> format).
> 
> sh-tdep.c:sh_gdbarch_init:
> 
>         case bfd_mach_sh2a_or_sh3e:
>           /* doubles on sh2e and sh3e are actually 4 byte.  */
>           set_gdbarch_double_bit (gdbarch, 4 * TARGET_CHAR_BIT);

Turns out it -- somewhat -- is an issue in sh-tdep.c.  With the following
patch, everything is fine.

Index: sh-tdep.c
===================================================================
RCS file: /cvs/src/src/gdb/sh-tdep.c,v
retrieving revision 1.246
diff -u -p -r1.246 sh-tdep.c
--- sh-tdep.c	22 Jul 2012 16:52:41 -0000	1.246
+++ sh-tdep.c	10 Aug 2012 09:15:04 -0000
@@ -2299,6 +2299,7 @@ sh_gdbarch_init (struct gdbarch_info inf
     case bfd_mach_sh2e:
       /* doubles on sh2e and sh3e are actually 4 byte.  */
       set_gdbarch_double_bit (gdbarch, 4 * TARGET_CHAR_BIT);
+      set_gdbarch_double_format (gdbarch, floatformats_ieee_single);
 
       set_gdbarch_register_name (gdbarch, sh_sh2e_register_name);
       set_gdbarch_register_type (gdbarch, sh_sh3e_register_type);
@@ -2344,6 +2345,7 @@ sh_gdbarch_init (struct gdbarch_info inf
     case bfd_mach_sh2a_or_sh3e:
       /* doubles on sh2e and sh3e are actually 4 byte.  */
       set_gdbarch_double_bit (gdbarch, 4 * TARGET_CHAR_BIT);
+      set_gdbarch_double_format (gdbarch, floatformats_ieee_single);
 
       set_gdbarch_register_name (gdbarch, sh_sh3e_register_name);
       set_gdbarch_register_type (gdbarch, sh_sh3e_register_type);

The testresults speak for themselves:

                    === gdb Summary ===
     
    -# of expected passes           14953
    -# of unexpected failures       154
    +# of expected passes           15760
    +# of unexpected failures       100
     # of unexpected successes      2
     # of expected failures         42
     # of known failures            59
    -# of unresolved testcases      797
     # of untested testcases        45
     # of unsupported tests         172

However, it seems there are other targets where a similar patching would
be needed.  And, to begin with, why does GDB allow for letting this
mismatch happen?  Quoting my yesterday's analysis:

> First and foremost -- is my understanding correct that given this
> configuration the expression »(bool) 17.93« then indeed is to be
> evaluated in a 32-bit double format, and not in the host's 64-bit double
> format?
> 
> Our friend Valgrind is able to confirm this memory corruption issue, and
> -- as so often -- gives a clue where to begin looking:
> 
>     $ valgrind -v -- install/bin/*-gdb -q -ex 'file [...]/gdb.cp/misc' -ex 'print (bool)17.93'
>     [...]
>     ==6509== Invalid write of size 1
>     ==6509==    at 0x47FB974: memcpy (mc_replace_strmem.c:497)
>     ==6509==    by 0x824D487: floatformat_from_doublest (doublest.c:768)
>     ==6509==    by 0x824D78D: store_typed_floating (doublest.c:881)
>     ==6509==    by 0x81023DA: value_from_double (value.c:3170)
>     ==6509==    by 0x810424F: evaluate_subexp_standard (eval.c:846)
>     ==6509==    by 0x82020E3: evaluate_subexp_c (c-lang.c:719)
>     ==6509==    by 0x8102975: evaluate_subexp (eval.c:73)
>     ==6509==    by 0x810A157: evaluate_subexp_standard (eval.c:2713)
>     ==6509==    by 0x82020E3: evaluate_subexp_c (c-lang.c:719)
>     ==6509==    by 0x8102975: evaluate_subexp (eval.c:73)
>     ==6509==    by 0x8102B02: evaluate_expression (eval.c:148)
>     ==6509==    by 0x811EE31: print_command_1 (printcmd.c:966)
>     ==6509==  Address 0x8e83bbc is 0 bytes after a block of size 4 alloc'd
>     ==6509==    at 0x47F925F: calloc (vg_replace_malloc.c:467)
>     ==6509==    by 0x82725B4: xcalloc (common-utils.c:90)
>     ==6509==    by 0x82725EA: xzalloc (common-utils.c:100)
>     ==6509==    by 0x80FE7AB: allocate_value_contents (value.c:695)
>     ==6509==    by 0x80FE7D4: allocate_value (value.c:705)
>     ==6509==    by 0x8102383: value_from_double (value.c:3164)
>     ==6509==    by 0x810424F: evaluate_subexp_standard (eval.c:846)
>     ==6509==    by 0x82020E3: evaluate_subexp_c (c-lang.c:719)
>     ==6509==    by 0x8102975: evaluate_subexp (eval.c:73)
>     ==6509==    by 0x810A157: evaluate_subexp_standard (eval.c:2713)
>     ==6509==    by 0x82020E3: evaluate_subexp_c (c-lang.c:719)
>     ==6509==    by 0x8102975: evaluate_subexp (eval.c:73)
>     [...]
> 
> Here is what is happening, in value.c:value_from_double:
> 
>     struct value *
>     value_from_double (struct type *type, DOUBLEST num)
>     {
>       struct value *val = allocate_value (type);
>       struct type *base_type = check_typedef (type);
>       enum type_code code = TYPE_CODE (base_type);
>     
>       if (code == TYPE_CODE_FLT)
>         {
>           store_typed_floating (value_contents_raw (val), base_type, num);
>     [...]
> 
>     Breakpoint 1, value_from_double (type=0x852c388, num=17.93) at [...]/gdb/value.c:3164
>     3164      struct value *val = allocate_value (type);
>     (gdb) print *type
>     $2 = {pointer_type = 0x0, reference_type = 0x0, chain = 0x852c388, instance_flags = 0, length = 4, main_type = 0x852c3c0}
>     (gdb) print *type->main_type
>     $3 = {code = TYPE_CODE_FLT, flag_unsigned = 0, flag_nosign = 0, flag_stub = 0, flag_target_stub = 0, flag_static = 0, flag_prototyped = 0, flag_incomplete = 0, flag_varargs = 0, flag_vector = 0, flag_stub_supported = 0, 
>       flag_gnu_ifunc = 0, flag_fixed_instance = 0, flag_objfile_owned = 0, flag_declared_class = 0, flag_flag_enum = 0, type_specific_field = TYPE_SPECIFIC_NONE, nfields = 0, vptr_fieldno = -1, name = 0x852c408 "double", tag_name = 0x0, 
>       owner = {objfile = 0x8512f80, gdbarch = 0x8512f80}, target_type = 0x0, flds_bnds = {fields = 0x0, bounds = 0x0}, vptr_basetype = 0x0, type_specific = {cplus_stuff = 0x8476d8c, gnat_stuff = 0x8476d8c, floatformat = 0x8476d8c, 
>         func_stuff = 0x8476d8c}}
>     (gdb) print type->main_type->type_specific->floatformat[1]
>     $36 = (const struct floatformat *) 0x8461f40
>     (gdb) print host_float_format 
>     $31 = (const struct floatformat *) 0x8461e80
>     (gdb) print host_double_format 
>     $35 = (const struct floatformat *) 0x8461f40
> 
> Here we can already see that, TYPE's floatformat is host_double_format,
> which is 64-bit double format.  However, in my understanding, TYPE is
> meant to be a 32-bit double type, having a 32-bit double format.
> 
> Stepping further, allocate_value, allocate_value_contents:
> 
>     allocate_value (type=0x852c388) at [...]/gdb/value.c:705
>     705       allocate_value_contents (val);
>     (gdb) 
>     allocate_value_contents (val=0x8522508) at [...]/gdb/value.c:694
>     694       if (!val->contents)
>     (gdb) 
>     695         val->contents = (gdb_byte *) xzalloc (TYPE_LENGTH (val->enclosing_type));
>     (gdb) 
>     xzalloc (size=4) at [...]/gdb/common/common-utils.c:100
>     100       return xcalloc (1, size);
> 
> So indeed we allocate 32 bits.
> 
> Back in value_from_double, on to doublest.c:store_typed_floating:
> 
>     void
>     store_typed_floating (void *addr, const struct type *type, DOUBLEST val)
>     {
>       const struct floatformat *fmt = floatformat_from_type (type);
>     [...]
>       floatformat_from_doublest (fmt, &val, addr);
>     }
> 
>     store_typed_floating (addr=0x850a570, type=0x852c388, val=17.93) at [...]/gdb/doublest.c:859
>     859       const struct floatformat *fmt = floatformat_from_type (type);
>     [...]
>     881       floatformat_from_doublest (fmt, &val, addr);
>     (gdb) s
>     floatformat_from_doublest (fmt=0x8461f40, in=0xffffbfd8, out=0x850a570) at [...]/gdb/doublest.c:757
> 
> Here we indeed see floatformat fmt 0x8461f40 being passed, which is
> host_double_format, the 64-bit double format (see Bearkpoint 1 above).
> In the following, things break:
> 
>     (gdb) s
>     758       if (fmt == host_float_format)
>     (gdb) 
>     764       else if (fmt == host_double_format)
>     (gdb) 
>     766           double val = *in;
>     (gdb) 
>     768           memcpy (out, &val, sizeof (val));
> 
> We have host_double_format, »sizeof val« is 64 bits, but we only
> allocated 32 bits, thus memcpy corrupts the memory.
> 
> Looking at doublest:floatformat_from_type:
> 
>     const struct floatformat *
>     floatformat_from_type (const struct type *type)
>     {
>       struct gdbarch *gdbarch = get_type_arch (type);
>     
>       gdb_assert (TYPE_CODE (type) == TYPE_CODE_FLT);
>       if (TYPE_FLOATFORMAT (type) != NULL)
>         return TYPE_FLOATFORMAT (type)[gdbarch_byte_order (gdbarch)];
>       else
>         return floatformat_from_length (gdbarch, TYPE_LENGTH (type));
>     }
> 
> In our case, »TYPE_FLOATFORMAT (type)[gdbarch_byte_order (gdbarch)]« is
> host_double_format (see Breakpoint 1 above).  If I instead make that take
> the floatformat_from_length route, it correctly returns the 32-bit
> host_float_type, and everything works as expected.
> 
> So, my understanding is that »TYPE_FLOATFORMAT (type)« is set
> incorrectly -- but where and why is this happening?

It is here:

gdbarch.c:verify_gdbarch:

    [...]
      /* Skip verify of float_bit, invalid_p == 0 */
      if (gdbarch->float_format == 0)
        gdbarch->float_format = floatformats_ieee_single;
      /* Skip verify of double_bit, invalid_p == 0 */
      if (gdbarch->double_format == 0)
        gdbarch->double_format = floatformats_ieee_double;
      /* Skip verify of long_double_bit, invalid_p == 0 */
      if (gdbarch->long_double_format == 0)
        gdbarch->long_double_format = floatformats_ieee_double;
    [...]

That is, if set_gdbarch_double_format has not been called, it will
default to floatformats_ieee_double -- even though set_gdbarch_double_bit
may have been called setting it unequal to the 64-bit double format.
Hmm, and gdbarch.c:verify_gdbarch has the following comment on top of it:
»Ensure that all values in a GDBARCH are reasonable.«  ;-)


Grüße,
 Thomas