plan: VLA (Variable Length Arrays) and Fortran dynamic types

public inbox for gdb@sourceware.org

* plan: VLA (Variable Length Arrays) and Fortran dynamic types
@ 2012-11-29 14:49 Jan Kratochvil
  2012-11-29 21:51 ` Tom Tromey
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kratochvil @ 2012-11-29 14:49 UTC (permalink / raw)
  To: gdb

Hi all,

work on implementing the plan below has not started yet.  The plan describes
how to correctly upstream the existing branch archer-jankratochvil-vla from
	http://sourceware.org/gdb/wiki/ArcherBranchManagement

which primarily makes GDB usable with non-trivial Fortran programs.

Originally described briefly at:
	Re: fortran multidimensional arrays and pointers
	http://sourceware.org/ml/gdb/2011-03/msg00021.html
	http://sourceware.org/ml/gdb/2011-03/msg00037.html

Regards,
Jan

------------------------------------------------------------------------------
(1)
Resolve the check_typedef problem - the primary problem of this patchset:

Currently, for example, when one forgets to call check_typedef the code
sometimes works (and sometimes it does not).

Rename "struct type" to "struct abstract_type".  Make all macros that access
concrete sizes (TYPE_LENGTH, TYPE_ARRAY_UPPER_BOUND_VALUE, etc.) also require
a "struct value *".  Remove check_typedef, that is, hide it under the
TYPE_LENGTH etc. macros.

This would also include work to pass "struct value *" in most places where
"struct type *" is passed today, since one needs the inferior memory to find
out the concrete dimensions of an inferior type.
This in turn makes the current *-vla implementation of DW_AT_object_pointer
easy.
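
As a minimal sketch of the idea above (the struct layouts and function names
here are hypothetical stand-ins, not GDB's real definitions), a size accessor
that also takes a value might look like this.  The field dynamic_upper merely
stands in for a bound that real code would read from inferior memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed "struct abstract_type".  */
struct abstract_type
{
  size_t element_size;   /* size of one array element in bytes */
  int bound_is_dynamic;  /* nonzero if the upper bound lives in the inferior */
  long static_upper;     /* upper bound when known statically (lower is 0) */
};

/* Hypothetical stand-in for "struct value".  */
struct value
{
  const struct abstract_type *type;
  long dynamic_upper;    /* stands in for a bound read from inferior memory */
};

/* Upper bound of an array type; VAL is required for the dynamic case.  */
long
type_array_upper_bound (const struct abstract_type *type,
                        const struct value *val)
{
  if (!type->bound_is_dynamic)
    return type->static_upper;
  assert (val != NULL);  /* without a value there is no concrete bound */
  return val->dynamic_upper;
}

/* Concrete length in bytes, computed on demand from TYPE and VAL.  */
size_t
type_length (const struct abstract_type *type, const struct value *val)
{
  return (size_t) (type_array_upper_bound (type, val) + 1)
         * type->element_size;
}
```

In this sketch a static type such as char[5] still works with a NULL value,
while a dynamic array cannot report a length without one.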

This could be done incrementally, but it becomes useful only once it is
complete.

It is unclear whether it would cause a measurable performance regression.
If so, some caching could be added (cleared on each return to the main loop).

An alternative approach would be to "concretize" abstract types via
check_typedef (which would be kept).  In that case there would still be
"struct abstract_type", but there would also be "struct concrete_type", which
would automatically cache all the values for better performance.
Unlike today, check_typedef would still be impossible to forget, because
"struct abstract_type" and "struct concrete_type" are incompatible GDB types.
archer-jankratochvil-vla does it this way (but still keeps "struct type" as
both the input and the output GDB type of check_typedef).
I do not think this approach is worth the pain.
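
The concretizing alternative could be sketched as follows.  Again, every name
here (concretize, concrete_length, the struct fields) is an assumption made
for illustration; the parameter dynamic_upper stands in for a bound read from
the inferior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical abstract type: bounds may still be unresolved.  */
struct abstract_type
{
  size_t element_size;
  int bound_is_dynamic;
  long static_upper;
};

/* Hypothetical concrete type: all sizes resolved and cached.  */
struct concrete_type
{
  size_t element_size;
  long upper;
  size_t length;         /* cached total length in bytes */
};

/* Plays the role check_typedef would keep: resolve the bounds once and
   cache the results in a distinct, incompatible type.  */
struct concrete_type
concretize (const struct abstract_type *type, long dynamic_upper)
{
  struct concrete_type c;

  c.element_size = type->element_size;
  c.upper = type->bound_is_dynamic ? dynamic_upper : type->static_upper;
  c.length = (size_t) (c.upper + 1) * c.element_size;
  return c;
}

/* Accepts only a concrete type, so passing a "struct abstract_type *"
   by mistake draws a compiler diagnostic.  */
size_t
concrete_length (const struct concrete_type *type)
{
  return type->length;
}
```

The point of the two distinct struct types is only that the compiler, rather
than the programmer, notices a missing concretization step.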

Time estimation absolutely unknown, but in the range of a month or several
months.

------------------------------------------------------------------------------
(2)
Revive the inferior types reference counting vs. garbage collector.
	[patch 0/8] Types Garbage Collector (for VLA+Python)
	http://sourceware.org/ml/gdb-patches/2009-05/msg00543.html

Probably not much of a problem, but TBH I have not found it that important
compared to the other GDB problems.

I am just sorry that this types garbage collector / reference counting is
also needed by PythonGDB; I took over that part myself back then but never
checked it in.  Therefore, AFAIK, FSF GDB now leaks inferior types created by
Python (as archer-jankratochvil-vla does too, but *-vla is not upstreamed).
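
For illustration only, plain reference counting of dynamically created types
could be sketched as below.  The names counted_type, type_new, type_incref
and type_decref are hypothetical and are not taken from the referenced patch
series.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted type node.  */
struct counted_type
{
  int refcount;
  struct counted_type *target;  /* e.g. array element type; may be NULL */
};

/* Create a type holding one reference for the caller and one on TARGET.  */
struct counted_type *
type_new (struct counted_type *target)
{
  struct counted_type *t = malloc (sizeof *t);

  if (t == NULL)
    abort ();
  t->refcount = 1;
  t->target = target;
  if (target != NULL)
    target->refcount++;
  return t;
}

/* Take an additional reference on T.  */
void
type_incref (struct counted_type *t)
{
  assert (t->refcount > 0);
  t->refcount++;
}

/* Drop one reference; free T and walk down its target chain for as long
   as counts reach zero.  Returns the number of types freed.  */
int
type_decref (struct counted_type *t)
{
  int freed = 0;

  while (t != NULL)
    {
      struct counted_type *next;

      assert (t->refcount > 0);
      if (--t->refcount > 0)
        break;
      next = t->target;
      free (t);
      freed++;
      t = next;
    }
  return freed;
}
```

Plain counts like these cannot reclaim types that reference each other in a
cycle, which is one reason a collector may be needed in addition.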

Time estimation 2 weeks.  Partially not on the critical path.

------------------------------------------------------------------------------
(3) - very optional
Merge "struct type" and "struct main_type", as with types gc/refc the split
no longer has any benefit.

This is just a code cleanup, not on the critical path.  Also, at that point
one may rather redo the whole inferior type system as normal C++ inherited
types with accessors.

------------------------------------------------------------------------------
(4) - optional for critical path
value->contents should be discontiguous so that Fortran array splices (just
a row or just a column, that is, displaying only every 10th byte, for
example) can be stored into GDB $conveniencevariables.

Also, with the dynamic bounds calculation (with no explicit check_typedef)
proposed in (1), such discontiguous memory would store the few bytes of the
Fortran array descriptor where the array bounds are stored in inferior memory.
The array descriptor stores lower-bound, upper-bound and stride for each array
dimension (gcc/fortran/trans-array.c).

archer-jankratochvil-vla does not implement this, so if I display a one-column
splice of a 10MB array with 10 rows and 1000000 columns, GDB will allocate
10MB and not just the 10 bytes it needs (assuming 1-byte elements here).
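
A rough sketch of why a splice needs so little storage: with a per-dimension
descriptor holding lower bound, upper bound and stride, only the selected
elements have to be copied.  The struct below is a simplified, hypothetical
descriptor (zero-based offsets, stride counted in elements) and does not
reproduce the actual gfortran layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical descriptor for one array dimension.  */
struct dim_desc
{
  long lower;   /* offset of the first selected element */
  long upper;   /* offset of the last element that may be selected */
  long stride;  /* distance between selected elements, in elements */
};

/* Number of elements the descriptor selects.  */
size_t
dim_count (const struct dim_desc *d)
{
  assert (d->stride > 0 && d->lower >= 0);
  if (d->upper < d->lower)
    return 0;
  return (size_t) ((d->upper - d->lower) / d->stride) + 1;
}

/* Copy only the selected elements from BASE into OUT, which must have
   room for dim_count (D) * ELT_SIZE bytes.  Returns the bytes stored.  */
size_t
gather_slice (const unsigned char *base, size_t elt_size,
              const struct dim_desc *d, unsigned char *out)
{
  size_t n = dim_count (d);
  size_t i;

  for (i = 0; i < n; i++)
    {
      size_t index = (size_t) d->lower + i * (size_t) d->stride;

      memcpy (out + i * elt_size, base + index * elt_size, elt_size);
    }
  return n * elt_size;
}
```

Selecting every 10th byte of a 40-byte buffer with this sketch stores 4 bytes
rather than 40; in a debugger BASE would be replaced by reads from the
inferior.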

Time estimation 3 weeks.

------------------------------------------------------------------------------
(5)
Clean up LA_VAL_PRINT vs. LA_VALUE_PRINT, where too many assumptions are made
about the inferior memory layout of arrays.  Possibly more similar cleanups
around code currently hacked around in archer-jankratochvil-vla.

Some (all) of the problems were described by Pedro in:
	Re: [RFA] valprint.c / *-valprint.c: Don't lose `embedded_offset'
	http://sourceware.org/ml/gdb-patches/2010-10/msg00127.html

This is not done and is hacked around in an ugly way in
archer-jankratochvil-vla; this is also a reason why some more complicated
reference/splice/array cases do not work in *-vla.

Time estimation 2-3 weeks.

------------------------------------------------------------------------------
(6)
Clean up struct value; fields like OFFSET, ENCLOSING_TYPE, EMBEDDED_OFFSET
and POINTED_TO_OFFSET should no longer be needed.

I find this part very difficult to do; I have even tried once already.

The current design tries to make $conveniencevariables standalone enough that
they work even if the objfile gets deleted.  But in practice they do not, as
the C++ virtual method pointer is no longer accessible.

So TBH I would no longer falsely pretend that $conveniencevariables can work
for complicated inferior types, which nobody uses anyway, and then we can drop
those fields.

Otherwise, if we want to make $conveniencevariables really work, we should
store everything such as virtual method tables in the discontiguous memory
implemented by (4).  Then make the in-GDB copy shared between multiple struct
values and get rid of all those enclosing/embedded offsets.

Time estimation 4 weeks as long as one is able to cope with it.

------------------------------------------------------------------------------
(7) - optional for critical path
Make access to GDB inferior objects piecemeal, that is, no longer depend on
value->contents containing all the memory at once.

"print array_of_4gb" would then display the first page of elements asking for
--More-- instead of just "locking up" and/or erroring/crashing out of memory.

Somewhat related to how value->contents gets represented after (4).
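
One way to picture piecemeal access is a loop that fetches a bounded page at
a time through a reader callback, so that memory use does not grow with the
object size.  This is only a sketch: read_fn, fake_read, read_pages and
PAGE_BYTES are hypothetical names, and fake_read stands in for real inferior
memory reads.

```c
#include <assert.h>
#include <stddef.h>

enum { PAGE_BYTES = 4096 };

/* Hypothetical callback: read LEN bytes at OFFSET within the object into
   BUF; returns 0 on success.  */
typedef int (*read_fn) (void *ctx, size_t offset,
                        unsigned char *buf, size_t len);

/* Stand-in for inferior memory: the byte at offset N has value N % 251.  */
int
fake_read (void *ctx, size_t offset, unsigned char *buf, size_t len)
{
  size_t i;

  (void) ctx;
  for (i = 0; i < len; i++)
    buf[i] = (unsigned char) ((offset + i) % 251);
  return 0;
}

/* Process at most MAX_PAGES pages of an object of TOTAL bytes using one
   fixed-size buffer.  Stores a byte sum in *SUM as a placeholder for
   printing, and returns the number of bytes processed.  */
size_t
read_pages (read_fn reader, void *ctx, size_t total, size_t max_pages,
            unsigned long *sum)
{
  unsigned char page[PAGE_BYTES];
  size_t done = 0;
  size_t p, i;

  *sum = 0;
  for (p = 0; p < max_pages && done < total; p++)
    {
      size_t left = total - done;
      size_t len = left < PAGE_BYTES ? left : PAGE_BYTES;

      if (reader (ctx, done, page, len) != 0)
        break;               /* stop at the first unreadable page */
      for (i = 0; i < len; i++)
        *sum += page[i];
      done += len;
    }
  return done;
}
```

Asking for a single page of a very large object touches only PAGE_BYTES bytes
here, which is the behavior a --More-- prompt would rely on.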

Time estimation very unknown to me but it could be probably done in an
incremental way, it depends a lot on how (4) gets implemented.

Time estimation 3 weeks.

------------------------------------------------------------------------------
(8)
Port the Fortran / VLA DWARF parsing.

This could be reused from the existing archer-jankratochvil-vla code.

Time estimation 1 week, nothing difficult with all the infrastructure already
in place.

------------------------------------------------------------------------------

Total estimation is 16 weeks + the unknown estimate for (1).
Maybe I overestimated the weeks a bit.


* Re: plan: VLA (Variable Length Arrays) and Fortran dynamic types
  2012-11-29 14:49 plan: VLA (Variable Length Arrays) and Fortran dynamic types Jan Kratochvil
@ 2012-11-29 21:51 ` Tom Tromey
  2012-11-29 22:09   ` Joel Brobecker
  2012-11-30 15:50   ` Jan Kratochvil
  0 siblings, 2 replies; 4+ messages in thread
From: Tom Tromey @ 2012-11-29 21:51 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: gdb

>>>>> "Jan" == Jan Kratochvil <jan.kratochvil@redhat.com> writes:

Jan> Rename "struct type" to "struct abstract_type".  Make all
Jan> TYPE_LENGTH, TYPE_ARRAY_UPPER_BOUND_VALUE etc. macros accessing
Jan> concrete sizes requiring to provide also "struct value *".  Remove
Jan> check_typedef, that is hide it under the TYPE_LENGTH etc. macros.

Jan> This would also include work to pass "struct value *" mostly
Jan> everywhere instead of where current "struct type *" is passed as
Jan> one needs the inferior memory to find out the concrete dimensions
Jan> of inferior type.  This then makes the current *-vla implementation
Jan> of DW_AT_object_pointer easy.

[...]

Jan> Alternative approach would be to "concretize" abstract types by
Jan> check_typedef (which would be kept there).  In such case there
Jan> still would be "struct abstract_type" but there would be also
Jan> "struct concrete_type" which would automatically cache all the
Jan> values for better performance.  check_typedef would still be
Jan> impossible to forget like nowadays due to the non-compatible GDB
Jan> types "struct abstract_type" vs. "struct concrete_type".
Jan> archer-jankratochvil-vla does it this way (but still keeping
Jan> "struct type" being both the input and output GDB type of
Jan> check_typedef).  I do not think this approach is worth the pain.

Passing a struct value everywhere seems pretty awful though too,
especially considering that one may not generally have a value.
'ptype' and plenty of other things have to work on the abstract type.
But I think maybe I don't understand some details here.  Could you give
an example of where we pass a type now that we would have to pass a
value in the future?

I note that Ada already uses the concretizing approach.
Maybe Joel could share their experiences with this.

Tom


* Re: plan: VLA (Variable Length Arrays) and Fortran dynamic types
  2012-11-29 21:51 ` Tom Tromey
@ 2012-11-29 22:09   ` Joel Brobecker
  2012-11-30 15:50   ` Jan Kratochvil
  1 sibling, 0 replies; 4+ messages in thread
From: Joel Brobecker @ 2012-11-29 22:09 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Jan Kratochvil, gdb

> I note that Ada already uses the concretizing approach.
> Maybe Joel could share their experiences with this.

We have been using this approach out of necessity. There are more
elegant ways to do this for sure, but we all know how much effort
it would take to implement them (a complete revamp of our type
structure, for instance).

Aside from that, it is actually pretty tricky to implement right,
because it's very easy to lose information.  Jerome and I recently
spent at least a day of discussions and experimentation before we
figured out the correct way of dealing with Ada interface types. Things
also get tricky with respect to EVAL_AVOID_SIDE_EFFECTS vs EVAL_NORMAL,
because, in avoid-side-effect mode, you don't have the data in order to
concretize the type.  There is also the problem of freeing the new
types, which may be taken care of, but may not be, and I haven't looked
at that myself (before my time).

Aside from that, it has the merit of existing, and it works pretty
well. If I was starting from scratch and I had an infinite amount
of time, I'd design a type structure that allows us to express
the dynamic characteristics of a given object.

-- 
Joel


* Re: plan: VLA (Variable Length Arrays) and Fortran dynamic types
  2012-11-29 21:51 ` Tom Tromey
  2012-11-29 22:09   ` Joel Brobecker
@ 2012-11-30 15:50   ` Jan Kratochvil
  1 sibling, 0 replies; 4+ messages in thread
From: Jan Kratochvil @ 2012-11-30 15:50 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb

On Thu, 29 Nov 2012 22:51:07 +0100, Tom Tromey wrote:
> Passing a struct value everywhere seems pretty awful though too,
> especially considering that one may not generally have a value.
> 'ptype' and plenty of other things have to work on the abstract type.

A 'struct value' with some dummy content is needed in such a case.

It also means you cannot 'ptype' a VLA type (or a dynamic Fortran array); you
need an inferior variable instance for a VLA.

I have checked now that 'ptype char[5]' works even without VLA.  But that sets
TYPE_LENGTH to 5 and TYPE_TARGET_STUB == 0 so check_typedef does not overwrite
the value 5 (it also has no TYPE_CODE_RANGE anywhere).

The dummy 'struct value' is the disadvantage of a fully dynamic solution, but
it is nice that we share the opinion with Joel that a fully dynamic solution
(without concretization) should be better.

I did not use your words 'abstract type', as in my text I used
'struct abstract_type' also for inferior types not yet passed through
check_typedef.
For non-VLA types it means 'struct abstract_type' may be an opaque type.

> But I think maybe I don't understand some details here.  Could you give
> an example of where we pass a type now that we would have to pass a
> value in the future?

It may also depend a bit on the behavior currently in archer-jankratochvil-vla
as:
	(gdb) whatis temp1
	type = char [variable]
	(gdb) ptype temp1
	type = char [78]

Currently the difference depends on how many check_typedefs get called; with
fully dynamic types in GDB there will need to be a different way (a flag) to
access the array bound value.

For example, the LA_PRINT_TYPE parameter would change from type to value.  I
do not know of more offhand; just change type to value in TYPE_LENGTH and
similar accessors, and then change it in all the callers.

Jan
