* [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) @ 2016-07-21 16:53 Jakub Jelinek 2016-07-22 11:55 ` Richard Biener 2016-08-12 17:47 ` Jason Merrill 0 siblings, 2 replies; 14+ messages in thread From: Jakub Jelinek @ 2016-07-21 16:53 UTC (permalink / raw) To: Jason Merrill, Aldy Hernandez; +Cc: gcc-patches Hi! The early debug changes broke e.g. following testcase: program pr71906 character(len=8) :: vard character(len=:), allocatable :: vare type t character(len=:), allocatable :: f end type type(t) :: varf allocate(character(len=10) :: vare) allocate(character(len=9) :: varf%f) vare = 'foo' call foo (vard, vare, varf) contains subroutine foo (vara, varb, varc) character(len=*) :: vara character(len=:), allocatable :: varb type(t) :: varc vara = 'bar' varb = 'baz' varc%f = 'str' end subroutine end program pr71906 The issue is that unlike e.g. DW_AT_upper_bound, DW_AT_string_length doesn't allow a reference to some other DIE, so while for the former we just emit a reference to an artificial var holding the VLA sizes, for non-constant string length loc_list_from_tree used to work, but doesn't anymore. The following patch has 4 major parts: 1) Fortran FE change to emit the artificial vars holding string length before the string vars (something I broke recently with PR71687 fix) 2) for early_dwarf, loc_list_from_tree for the DW_AT_string_length var will most likely fail, the code in gen_array_type_die in that case adds DW_OP_call4 referencing the DIE of the artificial var or parameter; DW_OP_call4 is a rough match, in that it only works properly if the artificial var has DWARF expressions (rather than location descriptions); the patch also adds newly support for varb above, where the string length is INDIRECT_REF of artificial PARM_DECL; for early dwarf this adds DW_OP_call4; DW_OP_deref{,_size}; the reason to handle it this way is that IMHO it matches more the spirit and intention of the early dwarf eventually for LTO, where I presume we'd stream the early dwarf created debug info and read/adjust it afterwards; LTO doesn't know that something is a fortran character and what string length it has 3) unfortunately, for the PARM_DECL and INDIRECT_REF of PARM_DECL cases, usually we need to refer to a parameter whose DIE has not been created yet during early_dwarf; and trying to create it out of order has various issues, e.g. the debugger would show them up in different order. So, this is resolved using the string_types vector, adjust_string_types function and some code in gen_subprogram_die 4) finally, when finalizing the debug info, resolve_addr and its helper functions look at DW_AT_string_length with DW_OP_call4 optionally followed by DW_OP_deref{,_size} in its DWARF expression, look up the referenced DIE, consider its DW_AT_location and either: - keep it as is, if it is valid DWARF (i.e. known to be defined for all PCs to a DWARF expression) - copy the DW_AT_location attribute value/form to the DW_AT_string_length (in the non-deref case) - for the deref case, adjust what can be easily adjusted (DWARF expression with DW_OP_stack_value at the end drops the stack value and thus handles the dereference, regx/regN replaced by bregx/bregN 0, for DWARF expression only add dereference at the end, drop cases that can't be adjusted - drop the DW_AT_string_length attribute and DW_AT_byte_size if present too if the DIE doesn't have DW_AT_location, or it isn't usable etc. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-07-21 Jakub Jelinek <jakub@redhat.com> PR debug/71906 * dwarf2out.c (string_types): New variable. (gen_array_type_die): Change early_dwarf handling of DW_AT_string_length, create DW_OP_call4 referencing the length var temporarily. Handle parameters that are pointers to string length. (adjust_string_types): New function. (gen_subprogram_die): Temporarily set string_types to local var, call adjust_string_types if needed. (non_dwarf_expression, copy_deref_exprloc, optimize_string_length): New functions. (resolve_addr): Adjust DW_AT_string_length if it is DW_OP_call4. * trans-decl.c (gfc_get_symbol_decl): Call gfc_finish_var_decl for decl's character length before gfc_finish_var_decl on the decl itself. --- gcc/dwarf2out.c.jj 2016-07-21 08:59:47.101616662 +0200 +++ gcc/dwarf2out.c 2016-07-21 11:10:11.510137511 +0200 @@ -3123,6 +3123,10 @@ static bool frame_pointer_fb_offset_vali static vec<dw_die_ref> base_types; +/* Pointer to vector of DW_TAG_string_type DIEs that need finalization + once all arguments are parsed. */ +static vec<dw_die_ref> *string_types; + /* Flags to represent a set of attribute classes for attributes that represent a scalar value (bounds, pointers, ...). */ enum dw_scalar_form @@ -19201,18 +19205,70 @@ gen_array_type_die (tree type, dw_die_re if (size >= 0) add_AT_unsigned (array_die, DW_AT_byte_size, size); else if (TYPE_DOMAIN (type) != NULL_TREE - && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE - && DECL_P (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))) + && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE) { tree szdecl = TYPE_MAX_VALUE (TYPE_DOMAIN (type)); - dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL); + tree rszdecl = szdecl; + HOST_WIDE_INT rsize = 0; size = int_size_in_bytes (TREE_TYPE (szdecl)); - if (loc && size > 0) + if (!DECL_P (szdecl)) { - add_AT_location_description (array_die, DW_AT_string_length, loc); - if (size != DWARF2_ADDR_SIZE) - add_AT_unsigned (array_die, DW_AT_byte_size, size); + if (TREE_CODE (szdecl) == INDIRECT_REF + && DECL_P (TREE_OPERAND (szdecl, 0))) + { + rszdecl = TREE_OPERAND (szdecl, 0); + rsize = int_size_in_bytes (TREE_TYPE (rszdecl)); + if (rsize <= 0) + size = 0; + } + else + size = 0; + } + if (size > 0) + { + dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL); + if (loc == NULL + && early_dwarf + && current_function_decl + && DECL_CONTEXT (rszdecl) == current_function_decl) + { + dw_die_ref ref = lookup_decl_die (rszdecl); + dw_loc_descr_ref l = NULL; + if (ref) + { + l = new_loc_descr (DW_OP_call4, 0, 0); + l->dw_loc_oprnd1.val_class = dw_val_class_die_ref; + l->dw_loc_oprnd1.v.val_die_ref.die = ref; + l->dw_loc_oprnd1.v.val_die_ref.external = 0; + } + else if (TREE_CODE (rszdecl) == PARM_DECL + && string_types) + { + l = new_loc_descr (DW_OP_call4, 0, 0); + l->dw_loc_oprnd1.val_class = dw_val_class_decl_ref; + l->dw_loc_oprnd1.v.val_decl_ref = rszdecl; + string_types->safe_push (array_die); + } + if (l && rszdecl != szdecl) + { + if (rsize == DWARF2_ADDR_SIZE) + add_loc_descr (&l, new_loc_descr (DW_OP_deref, + 0, 0)); + else + add_loc_descr (&l, new_loc_descr (DW_OP_deref_size, + rsize, 0)); + } + if (l) + loc = new_loc_list (l, NULL, NULL, NULL); + } + if (loc) + { + add_AT_location_description (array_die, DW_AT_string_length, + loc); + if (size != DWARF2_ADDR_SIZE) + add_AT_unsigned (array_die, DW_AT_byte_size, size); + } } } return; @@ -19278,6 +19334,37 @@ gen_array_type_die (tree type, dw_die_re add_pubtype (type, array_die); } +/* After all arguments are created, adjust any DW_TAG_string_type + DIEs DW_AT_string_length attributes. */ + +static void +adjust_string_types (void) +{ + dw_die_ref array_die; + unsigned int i; + FOR_EACH_VEC_ELT (*string_types, i, array_die) + { + dw_attr_node *a = get_AT (array_die, DW_AT_string_length); + if (a == NULL) + continue; + dw_loc_descr_ref loc = AT_loc (a); + gcc_assert (loc->dw_loc_opc == DW_OP_call4 + && loc->dw_loc_oprnd1.val_class == dw_val_class_decl_ref); + dw_die_ref ref = lookup_decl_die (loc->dw_loc_oprnd1.v.val_decl_ref); + if (ref) + { + loc->dw_loc_oprnd1.val_class = dw_val_class_die_ref; + loc->dw_loc_oprnd1.v.val_die_ref.die = ref; + loc->dw_loc_oprnd1.v.val_die_ref.external = 0; + } + else + { + remove_AT (array_die, DW_AT_string_length); + remove_AT (array_die, DW_AT_byte_size); + } + } +} + /* This routine generates DIE for array with hidden descriptor, details are filled into *info by a langhook. */ @@ -20675,6 +20762,9 @@ gen_subprogram_die (tree decl, dw_die_re tree generic_decl_parm = generic_decl ? DECL_ARGUMENTS (generic_decl) : NULL; + auto_vec<dw_die_ref> string_types_vec; + if (string_types == NULL) + string_types = &string_types_vec; /* Now we want to walk the list of parameters of the function and emit their relevant DIEs. @@ -20737,6 +20827,14 @@ gen_subprogram_die (tree decl, dw_die_re else if (DECL_INITIAL (decl) == NULL_TREE) gen_unspecified_parameters_die (decl, subr_die); } + + /* Adjust DW_TAG_string_type DIEs if needed, now that all arguments + have DIEs. */ + if (string_types == &string_types_vec) + { + adjust_string_types (); + string_types = NULL; + } } if (subr_die != old_die) @@ -26583,6 +26681,175 @@ optimize_location_into_implicit_ptr (dw_ } } +/* Return NULL if l is a DWARF expression, or first op that is not + valid DWARF expression. */ + +static dw_loc_descr_ref +non_dwarf_expression (dw_loc_descr_ref l) +{ + while (l) + { + if (l->dw_loc_opc >= DW_OP_reg0 && l->dw_loc_opc <= DW_OP_reg31) + return l; + switch (l->dw_loc_opc) + { + case DW_OP_regx: + case DW_OP_implicit_value: + case DW_OP_stack_value: + case DW_OP_GNU_implicit_pointer: + case DW_OP_GNU_parameter_ref: + case DW_OP_piece: + case DW_OP_bit_piece: + return l; + default: + break; + } + l = l->dw_loc_next; + } + return NULL; +} + +/* Return adjusted copy of EXPR: + If it is empty DWARF expression, return it. + If it is valid non-empty DWARF expression, + return copy of EXPR with copy of DEREF appended to it. + If it is DWARF expression followed by DW_OP_reg{N,x}, return + copy of the DWARF expression with DW_OP_breg{N,x} <0> appended + and no DEREF. + If it is DWARF expression followed by DW_OP_stack_value, return + copy of the DWARF expression without anything appended. + Otherwise, return NULL. */ + +static dw_loc_descr_ref +copy_deref_exprloc (dw_loc_descr_ref expr, dw_loc_descr_ref deref) +{ + + if (expr == NULL) + return NULL; + + dw_loc_descr_ref l = non_dwarf_expression (expr); + if (l && l->dw_loc_next) + return NULL; + + if (l) + { + if (l->dw_loc_opc >= DW_OP_reg0 && l->dw_loc_opc <= DW_OP_reg31) + deref = new_loc_descr ((enum dwarf_location_atom) + (DW_OP_breg0 + (l->dw_loc_opc - DW_OP_reg0)), + 0, 0); + else + switch (l->dw_loc_opc) + { + case DW_OP_regx: + deref = new_loc_descr (DW_OP_bregx, + l->dw_loc_oprnd1.v.val_unsigned, 0); + break; + case DW_OP_stack_value: + deref = NULL; + break; + default: + return NULL; + } + } + else + deref = new_loc_descr (deref->dw_loc_opc, + deref->dw_loc_oprnd1.v.val_int, 0); + + dw_loc_descr_ref ret = NULL, *p = &ret; + while (expr != l) + { + *p = new_loc_descr (expr->dw_loc_opc, 0, 0); + (*p)->dw_loc_oprnd1 = expr->dw_loc_oprnd1; + (*p)->dw_loc_oprnd2 = expr->dw_loc_oprnd2; + p = &(*p)->dw_loc_next; + expr = expr->dw_loc_next; + } + *p = deref; + return ret; +} + +/* For DW_AT_string_length attribute with DW_OP_call4 reference to a variable + or argument, adjust it if needed and return: + -1 if the DW_AT_string_length attribute and DW_AT_byte_size attribute + if present should be removed + 0 keep the attribute as is if the referenced var or argument has + only DWARF expression that covers all ranges + 1 if the attribute has been successfully adjusted. */ + +static int +optimize_string_length (dw_attr_node *a) +{ + dw_loc_descr_ref l = AT_loc (a), lv; + dw_die_ref die = l->dw_loc_oprnd1.v.val_die_ref.die; + dw_attr_node *av = get_AT (die, DW_AT_location); + dw_loc_list_ref d; + bool non_dwarf_expr = false; + + if (av == NULL) + return -1; + switch (AT_class (av)) + { + case dw_val_class_loc_list: + for (d = AT_loc_list (av); d != NULL; d = d->dw_loc_next) + if (d->expr && non_dwarf_expression (d->expr)) + non_dwarf_expr = true; + break; + case dw_val_class_loc: + lv = AT_loc (av); + if (lv == NULL) + return -1; + if (non_dwarf_expression (lv)) + non_dwarf_expr = true; + break; + default: + return -1; + } + + /* If it is safe to keep DW_OP_call4 in, keep it. */ + if (!non_dwarf_expr + && (l->dw_loc_next == NULL || AT_class (av) == dw_val_class_loc)) + return 0; + + /* If not dereferencing the DW_OP_call4 afterwards, we can just + copy over the DW_AT_location attribute from die to a. */ + if (l->dw_loc_next == NULL) + { + a->dw_attr_val = av->dw_attr_val; + return 1; + } + + dw_loc_list_ref list, *p; + switch (AT_class (av)) + { + case dw_val_class_loc_list: + p = &list; + list = NULL; + for (d = AT_loc_list (av); d != NULL; d = d->dw_loc_next) + { + lv = copy_deref_exprloc (d->expr, l->dw_loc_next); + if (lv) + { + *p = new_loc_list (lv, d->begin, d->end, d->section); + p = &(*p)->dw_loc_next; + } + } + if (list == NULL) + return -1; + a->dw_attr_val.val_class = dw_val_class_loc_list; + gen_llsym (list); + *AT_loc_list_ptr (a) = list; + return 1; + case dw_val_class_loc: + lv = copy_deref_exprloc (AT_loc (av), l->dw_loc_next); + if (lv == NULL) + return -1; + a->dw_attr_val.v.val_loc = lv; + return 1; + default: + gcc_unreachable (); + } +} + /* Resolve DW_OP_addr and DW_AT_const_value CONST_STRING arguments to an address in .rodata section if the string literal is emitted there, or remove the containing location list or replace DW_AT_const_value @@ -26597,6 +26864,7 @@ resolve_addr (dw_die_ref die) dw_attr_node *a; dw_loc_list_ref *curr, *start, loc; unsigned ix; + bool remove_AT_byte_size = false; FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a) switch (AT_class (a)) @@ -26657,6 +26925,38 @@ resolve_addr (dw_die_ref die) case dw_val_class_loc: { dw_loc_descr_ref l = AT_loc (a); + /* Using DW_OP_call4 or DW_OP_call4 DW_OP_deref in + DW_AT_string_length is only a rough approximation; unfortunately + DW_AT_string_length can't be a reference to a DIE. DW_OP_call4 + needs a DWARF expression, while DW_AT_location of the referenced + variable or argument might be any location description. */ + if (a->dw_attr == DW_AT_string_length + && l + && l->dw_loc_opc == DW_OP_call4 + && l->dw_loc_oprnd1.val_class == dw_val_class_die_ref + && (l->dw_loc_next == NULL + || (l->dw_loc_next->dw_loc_next == NULL + && (l->dw_loc_next->dw_loc_opc == DW_OP_deref + || l->dw_loc_next->dw_loc_opc != DW_OP_deref_size)))) + { + switch (optimize_string_length (a)) + { + case -1: + remove_AT (die, a->dw_attr); + ix--; + /* For DWARF4 and earlier, if we drop DW_AT_string_length, + we need to drop also DW_AT_byte_size. */ + remove_AT_byte_size = true; + continue; + default: + break; + case 1: + /* Even if we keep the optimized DW_AT_string_length, + it might have changed AT_class, so process it again. */ + ix--; + continue; + } + } /* For -gdwarf-2 don't attempt to optimize DW_AT_data_member_location containing DW_OP_plus_uconst - older consumers might @@ -26741,6 +27041,9 @@ resolve_addr (dw_die_ref die) break; } + if (remove_AT_byte_size) + remove_AT (die, DW_AT_byte_size); + FOR_EACH_CHILD (die, c, resolve_addr (c)); } \f --- gcc/fortran/trans-decl.c.jj 2016-07-21 08:59:47.098616701 +0200 +++ gcc/fortran/trans-decl.c 2016-07-21 11:06:23.002023591 +0200 @@ -1676,26 +1676,23 @@ gfc_get_symbol_decl (gfc_symbol * sym) && !(sym->attr.use_assoc && !intrinsic_array_parameter))) gfc_defer_symbol_init (sym); + /* Associate names can use the hidden string length variable + of their associated target. */ + if (sym->ts.type == BT_CHARACTER + && TREE_CODE (length) != INTEGER_CST) + { + gfc_finish_var_decl (length, sym); + gcc_assert (!sym->value); + } + gfc_finish_var_decl (decl, sym); if (sym->ts.type == BT_CHARACTER) - { - /* Character variables need special handling. */ - gfc_allocate_lang_decl (decl); - - /* Associate names can use the hidden string length variable - of their associated target. */ - if (TREE_CODE (length) != INTEGER_CST) - { - gfc_finish_var_decl (length, sym); - gcc_assert (!sym->value); - } - } + /* Character variables need special handling. */ + gfc_allocate_lang_decl (decl); else if (sym->attr.subref_array_pointer) - { - /* We need the span for these beasts. */ - gfc_allocate_lang_decl (decl); - } + /* We need the span for these beasts. */ + gfc_allocate_lang_decl (decl); if (sym->attr.subref_array_pointer) { Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-07-21 16:53 [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) Jakub Jelinek @ 2016-07-22 11:55 ` Richard Biener 2016-07-22 12:08 ` Jakub Jelinek 2016-08-12 17:47 ` Jason Merrill 1 sibling, 1 reply; 14+ messages in thread From: Richard Biener @ 2016-07-22 11:55 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Thu, Jul 21, 2016 at 6:53 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Hi! > > The early debug changes broke e.g. following testcase: > program pr71906 > character(len=8) :: vard > character(len=:), allocatable :: vare > type t > character(len=:), allocatable :: f > end type > type(t) :: varf > allocate(character(len=10) :: vare) > allocate(character(len=9) :: varf%f) > vare = 'foo' > call foo (vard, vare, varf) > contains > subroutine foo (vara, varb, varc) > character(len=*) :: vara > character(len=:), allocatable :: varb > type(t) :: varc > vara = 'bar' > varb = 'baz' > varc%f = 'str' > end subroutine > end program pr71906 > > The issue is that unlike e.g. DW_AT_upper_bound, DW_AT_string_length > doesn't allow a reference to some other DIE, so while for the former > we just emit a reference to an artificial var holding the VLA sizes, > for non-constant string length loc_list_from_tree used to work, but > doesn't anymore. > > The following patch has 4 major parts: > 1) Fortran FE change to emit the artificial vars holding string length > before the string vars (something I broke recently with PR71687 > fix) > 2) for early_dwarf, loc_list_from_tree for the DW_AT_string_length > var will most likely fail, the code in gen_array_type_die > in that case adds DW_OP_call4 referencing the DIE of the artificial > var or parameter; DW_OP_call4 is a rough match, in that it only works > properly if the artificial var has DWARF expressions (rather than > location descriptions); the patch also adds newly support for > varb above, where the string length is INDIRECT_REF of artificial > PARM_DECL; for early dwarf this adds DW_OP_call4; DW_OP_deref{,_size}; > the reason to handle it this way is that IMHO it matches more the > spirit and intention of the early dwarf eventually for LTO, where > I presume we'd stream the early dwarf created debug info and read/adjust > it afterwards; LTO doesn't know that something is a fortran character > and what string length it has > 3) unfortunately, for the PARM_DECL and INDIRECT_REF of PARM_DECL > cases, usually we need to refer to a parameter whose DIE has not > been created yet during early_dwarf; and trying to create it > out of order has various issues, e.g. the debugger would show them > up in different order. So, this is resolved using the string_types > vector, adjust_string_types function and some code in gen_subprogram_die > 4) finally, when finalizing the debug info, resolve_addr and its helper > functions look at DW_AT_string_length with DW_OP_call4 optionally > followed by DW_OP_deref{,_size} in its DWARF expression, look up the > referenced DIE, consider its DW_AT_location and either: > - keep it as is, if it is valid DWARF (i.e. known to be defined for > all PCs to a DWARF expression) > - copy the DW_AT_location attribute value/form to the DW_AT_string_length > (in the non-deref case) > - for the deref case, adjust what can be easily adjusted > (DWARF expression with DW_OP_stack_value at the end drops the stack > value and thus handles the dereference, regx/regN replaced by > bregx/bregN 0, for DWARF expression only add dereference at the end, > drop cases that can't be adjusted > - drop the DW_AT_string_length attribute and DW_AT_byte_size if present > too if the DIE doesn't have DW_AT_location, or it isn't usable etc. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2016-07-21 Jakub Jelinek <jakub@redhat.com> > > PR debug/71906 > * dwarf2out.c (string_types): New variable. > (gen_array_type_die): Change early_dwarf handling of > DW_AT_string_length, create DW_OP_call4 referencing the > length var temporarily. Handle parameters that are pointers > to string length. > (adjust_string_types): New function. > (gen_subprogram_die): Temporarily set string_types to local var, > call adjust_string_types if needed. > (non_dwarf_expression, copy_deref_exprloc, optimize_string_length): > New functions. > (resolve_addr): Adjust DW_AT_string_length if it is DW_OP_call4. > > * trans-decl.c (gfc_get_symbol_decl): Call gfc_finish_var_decl > for decl's character length before gfc_finish_var_decl on the > decl itself. > > --- gcc/dwarf2out.c.jj 2016-07-21 08:59:47.101616662 +0200 > +++ gcc/dwarf2out.c 2016-07-21 11:10:11.510137511 +0200 > @@ -3123,6 +3123,10 @@ static bool frame_pointer_fb_offset_vali > > static vec<dw_die_ref> base_types; > > +/* Pointer to vector of DW_TAG_string_type DIEs that need finalization > + once all arguments are parsed. */ > +static vec<dw_die_ref> *string_types; > + > /* Flags to represent a set of attribute classes for attributes that represent > a scalar value (bounds, pointers, ...). */ > enum dw_scalar_form > @@ -19201,18 +19205,70 @@ gen_array_type_die (tree type, dw_die_re > if (size >= 0) > add_AT_unsigned (array_die, DW_AT_byte_size, size); > else if (TYPE_DOMAIN (type) != NULL_TREE > - && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE > - && DECL_P (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))) > + && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE) > { > tree szdecl = TYPE_MAX_VALUE (TYPE_DOMAIN (type)); > - dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL); > + tree rszdecl = szdecl; > + HOST_WIDE_INT rsize = 0; > > size = int_size_in_bytes (TREE_TYPE (szdecl)); > - if (loc && size > 0) > + if (!DECL_P (szdecl)) > { > - add_AT_location_description (array_die, DW_AT_string_length, loc); > - if (size != DWARF2_ADDR_SIZE) > - add_AT_unsigned (array_die, DW_AT_byte_size, size); > + if (TREE_CODE (szdecl) == INDIRECT_REF So I wonder how this can happen with variable-size type gimplification. Shouldn't this be on, say, DECL_VALUE_EXPR of the DECL_P TYPE_MAX_VALUE? Or is this just another case where the fortran FE misses to have a DECL_EXPR to force gimplification of the type size? LTO debug will pretty much rely on all info being in some decl it can annotate later, not sure how that plays with DW_AT_string_length though. My current working copy of LTO early-debug produces just the following for early debug: <2><1e9>: Abbrev Number: 12 (DW_TAG_subprogram) <1ea> DW_AT_name : foo <1ee> DW_AT_decl_file : 1 <1ef> DW_AT_decl_line : 13 <1f0> DW_AT_sibling : <0x22b> <3><1f4>: Abbrev Number: 10 (DW_TAG_formal_parameter) <1f5> DW_AT_name : (indirect string, offset: 0x16e): vara <1f9> DW_AT_decl_file : 1 <1fa> DW_AT_decl_line : 13 <1fb> DW_AT_type : <0x28d> <3><1ff>: Abbrev Number: 10 (DW_TAG_formal_parameter) <200> DW_AT_name : (indirect string, offset: 0x174): varb <204> DW_AT_decl_file : 1 <205> DW_AT_decl_line : 13 <206> DW_AT_type : <0x28f> <3><20a>: Abbrev Number: 10 (DW_TAG_formal_parameter) <20b> DW_AT_name : (indirect string, offset: 0x1a4): varc <20f> DW_AT_decl_file : 1 <210> DW_AT_decl_line : 13 <211> DW_AT_type : <0x175> <3><215>: Abbrev Number: 13 (DW_TAG_formal_parameter) <216> DW_AT_name : (indirect string, offset: 0x16d): _vara <21a> DW_AT_type : <0x1a5> <21e> DW_AT_artificial : 1 <3><21e>: Abbrev Number: 13 (DW_TAG_formal_parameter) <21f> DW_AT_name : (indirect string, offset: 0x173): _varb <223> DW_AT_type : <0x27f> <227> DW_AT_artificial : 1 ... <1><28d>: Abbrev Number: 19 (DW_TAG_string_type) <1><28e>: Abbrev Number: 19 (DW_TAG_string_type) <1><28f>: Abbrev Number: 6 (DW_TAG_pointer_type) so there is nothing to annotate with a location later. Note that even with GCC 5 'varb' didn't get a DW_AT_string_length, 'vara' did, though. Richard. > + && DECL_P (TREE_OPERAND (szdecl, 0))) > + { > + rszdecl = TREE_OPERAND (szdecl, 0); > + rsize = int_size_in_bytes (TREE_TYPE (rszdecl)); > + if (rsize <= 0) > + size = 0; > + } > + else > + size = 0; > + } > + if (size > 0) > + { > + dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL); > + if (loc == NULL > + && early_dwarf > + && current_function_decl > + && DECL_CONTEXT (rszdecl) == current_function_decl) > + { > + dw_die_ref ref = lookup_decl_die (rszdecl); > + dw_loc_descr_ref l = NULL; > + if (ref) > + { > + l = new_loc_descr (DW_OP_call4, 0, 0); > + l->dw_loc_oprnd1.val_class = dw_val_class_die_ref; > + l->dw_loc_oprnd1.v.val_die_ref.die = ref; > + l->dw_loc_oprnd1.v.val_die_ref.external = 0; > + } > + else if (TREE_CODE (rszdecl) == PARM_DECL > + && string_types) > + { > + l = new_loc_descr (DW_OP_call4, 0, 0); > + l->dw_loc_oprnd1.val_class = dw_val_class_decl_ref; > + l->dw_loc_oprnd1.v.val_decl_ref = rszdecl; > + string_types->safe_push (array_die); > + } > + if (l && rszdecl != szdecl) > + { > + if (rsize == DWARF2_ADDR_SIZE) > + add_loc_descr (&l, new_loc_descr (DW_OP_deref, > + 0, 0)); > + else > + add_loc_descr (&l, new_loc_descr (DW_OP_deref_size, > + rsize, 0)); > + } > + if (l) > + loc = new_loc_list (l, NULL, NULL, NULL); > + } > + if (loc) > + { > + add_AT_location_description (array_die, DW_AT_string_length, > + loc); > + if (size != DWARF2_ADDR_SIZE) > + add_AT_unsigned (array_die, DW_AT_byte_size, size); > + } > } > } > return; > @@ -19278,6 +19334,37 @@ gen_array_type_die (tree type, dw_die_re > add_pubtype (type, array_die); > } > > +/* After all arguments are created, adjust any DW_TAG_string_type > + DIEs DW_AT_string_length attributes. */ > + > +static void > +adjust_string_types (void) > +{ > + dw_die_ref array_die; > + unsigned int i; > + FOR_EACH_VEC_ELT (*string_types, i, array_die) > + { > + dw_attr_node *a = get_AT (array_die, DW_AT_string_length); > + if (a == NULL) > + continue; > + dw_loc_descr_ref loc = AT_loc (a); > + gcc_assert (loc->dw_loc_opc == DW_OP_call4 > + && loc->dw_loc_oprnd1.val_class == dw_val_class_decl_ref); > + dw_die_ref ref = lookup_decl_die (loc->dw_loc_oprnd1.v.val_decl_ref); > + if (ref) > + { > + loc->dw_loc_oprnd1.val_class = dw_val_class_die_ref; > + loc->dw_loc_oprnd1.v.val_die_ref.die = ref; > + loc->dw_loc_oprnd1.v.val_die_ref.external = 0; > + } > + else > + { > + remove_AT (array_die, DW_AT_string_length); > + remove_AT (array_die, DW_AT_byte_size); > + } > + } > +} > + > /* This routine generates DIE for array with hidden descriptor, details > are filled into *info by a langhook. */ > > @@ -20675,6 +20762,9 @@ gen_subprogram_die (tree decl, dw_die_re > tree generic_decl_parm = generic_decl > ? DECL_ARGUMENTS (generic_decl) > : NULL; > + auto_vec<dw_die_ref> string_types_vec; > + if (string_types == NULL) > + string_types = &string_types_vec; > > /* Now we want to walk the list of parameters of the function and > emit their relevant DIEs. > @@ -20737,6 +20827,14 @@ gen_subprogram_die (tree decl, dw_die_re > else if (DECL_INITIAL (decl) == NULL_TREE) > gen_unspecified_parameters_die (decl, subr_die); > } > + > + /* Adjust DW_TAG_string_type DIEs if needed, now that all arguments > + have DIEs. */ > + if (string_types == &string_types_vec) > + { > + adjust_string_types (); > + string_types = NULL; > + } > } > > if (subr_die != old_die) > @@ -26583,6 +26681,175 @@ optimize_location_into_implicit_ptr (dw_ > } > } > > +/* Return NULL if l is a DWARF expression, or first op that is not > + valid DWARF expression. */ > + > +static dw_loc_descr_ref > +non_dwarf_expression (dw_loc_descr_ref l) > +{ > + while (l) > + { > + if (l->dw_loc_opc >= DW_OP_reg0 && l->dw_loc_opc <= DW_OP_reg31) > + return l; > + switch (l->dw_loc_opc) > + { > + case DW_OP_regx: > + case DW_OP_implicit_value: > + case DW_OP_stack_value: > + case DW_OP_GNU_implicit_pointer: > + case DW_OP_GNU_parameter_ref: > + case DW_OP_piece: > + case DW_OP_bit_piece: > + return l; > + default: > + break; > + } > + l = l->dw_loc_next; > + } > + return NULL; > +} > + > +/* Return adjusted copy of EXPR: > + If it is empty DWARF expression, return it. > + If it is valid non-empty DWARF expression, > + return copy of EXPR with copy of DEREF appended to it. > + If it is DWARF expression followed by DW_OP_reg{N,x}, return > + copy of the DWARF expression with DW_OP_breg{N,x} <0> appended > + and no DEREF. > + If it is DWARF expression followed by DW_OP_stack_value, return > + copy of the DWARF expression without anything appended. > + Otherwise, return NULL. */ > + > +static dw_loc_descr_ref > +copy_deref_exprloc (dw_loc_descr_ref expr, dw_loc_descr_ref deref) > +{ > + > + if (expr == NULL) > + return NULL; > + > + dw_loc_descr_ref l = non_dwarf_expression (expr); > + if (l && l->dw_loc_next) > + return NULL; > + > + if (l) > + { > + if (l->dw_loc_opc >= DW_OP_reg0 && l->dw_loc_opc <= DW_OP_reg31) > + deref = new_loc_descr ((enum dwarf_location_atom) > + (DW_OP_breg0 + (l->dw_loc_opc - DW_OP_reg0)), > + 0, 0); > + else > + switch (l->dw_loc_opc) > + { > + case DW_OP_regx: > + deref = new_loc_descr (DW_OP_bregx, > + l->dw_loc_oprnd1.v.val_unsigned, 0); > + break; > + case DW_OP_stack_value: > + deref = NULL; > + break; > + default: > + return NULL; > + } > + } > + else > + deref = new_loc_descr (deref->dw_loc_opc, > + deref->dw_loc_oprnd1.v.val_int, 0); > + > + dw_loc_descr_ref ret = NULL, *p = &ret; > + while (expr != l) > + { > + *p = new_loc_descr (expr->dw_loc_opc, 0, 0); > + (*p)->dw_loc_oprnd1 = expr->dw_loc_oprnd1; > + (*p)->dw_loc_oprnd2 = expr->dw_loc_oprnd2; > + p = &(*p)->dw_loc_next; > + expr = expr->dw_loc_next; > + } > + *p = deref; > + return ret; > +} > + > +/* For DW_AT_string_length attribute with DW_OP_call4 reference to a variable > + or argument, adjust it if needed and return: > + -1 if the DW_AT_string_length attribute and DW_AT_byte_size attribute > + if present should be removed > + 0 keep the attribute as is if the referenced var or argument has > + only DWARF expression that covers all ranges > + 1 if the attribute has been successfully adjusted. */ > + > +static int > +optimize_string_length (dw_attr_node *a) > +{ > + dw_loc_descr_ref l = AT_loc (a), lv; > + dw_die_ref die = l->dw_loc_oprnd1.v.val_die_ref.die; > + dw_attr_node *av = get_AT (die, DW_AT_location); > + dw_loc_list_ref d; > + bool non_dwarf_expr = false; > + > + if (av == NULL) > + return -1; > + switch (AT_class (av)) > + { > + case dw_val_class_loc_list: > + for (d = AT_loc_list (av); d != NULL; d = d->dw_loc_next) > + if (d->expr && non_dwarf_expression (d->expr)) > + non_dwarf_expr = true; > + break; > + case dw_val_class_loc: > + lv = AT_loc (av); > + if (lv == NULL) > + return -1; > + if (non_dwarf_expression (lv)) > + non_dwarf_expr = true; > + break; > + default: > + return -1; > + } > + > + /* If it is safe to keep DW_OP_call4 in, keep it. */ > + if (!non_dwarf_expr > + && (l->dw_loc_next == NULL || AT_class (av) == dw_val_class_loc)) > + return 0; > + > + /* If not dereferencing the DW_OP_call4 afterwards, we can just > + copy over the DW_AT_location attribute from die to a. */ > + if (l->dw_loc_next == NULL) > + { > + a->dw_attr_val = av->dw_attr_val; > + return 1; > + } > + > + dw_loc_list_ref list, *p; > + switch (AT_class (av)) > + { > + case dw_val_class_loc_list: > + p = &list; > + list = NULL; > + for (d = AT_loc_list (av); d != NULL; d = d->dw_loc_next) > + { > + lv = copy_deref_exprloc (d->expr, l->dw_loc_next); > + if (lv) > + { > + *p = new_loc_list (lv, d->begin, d->end, d->section); > + p = &(*p)->dw_loc_next; > + } > + } > + if (list == NULL) > + return -1; > + a->dw_attr_val.val_class = dw_val_class_loc_list; > + gen_llsym (list); > + *AT_loc_list_ptr (a) = list; > + return 1; > + case dw_val_class_loc: > + lv = copy_deref_exprloc (AT_loc (av), l->dw_loc_next); > + if (lv == NULL) > + return -1; > + a->dw_attr_val.v.val_loc = lv; > + return 1; > + default: > + gcc_unreachable (); > + } > +} > + > /* Resolve DW_OP_addr and DW_AT_const_value CONST_STRING arguments to > an address in .rodata section if the string literal is emitted there, > or remove the containing location list or replace DW_AT_const_value > @@ -26597,6 +26864,7 @@ resolve_addr (dw_die_ref die) > dw_attr_node *a; > dw_loc_list_ref *curr, *start, loc; > unsigned ix; > + bool remove_AT_byte_size = false; > > FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a) > switch (AT_class (a)) > @@ -26657,6 +26925,38 @@ resolve_addr (dw_die_ref die) > case dw_val_class_loc: > { > dw_loc_descr_ref l = AT_loc (a); > + /* Using DW_OP_call4 or DW_OP_call4 DW_OP_deref in > + DW_AT_string_length is only a rough approximation; unfortunately > + DW_AT_string_length can't be a reference to a DIE. DW_OP_call4 > + needs a DWARF expression, while DW_AT_location of the referenced > + variable or argument might be any location description. */ > + if (a->dw_attr == DW_AT_string_length > + && l > + && l->dw_loc_opc == DW_OP_call4 > + && l->dw_loc_oprnd1.val_class == dw_val_class_die_ref > + && (l->dw_loc_next == NULL > + || (l->dw_loc_next->dw_loc_next == NULL > + && (l->dw_loc_next->dw_loc_opc == DW_OP_deref > + || l->dw_loc_next->dw_loc_opc != DW_OP_deref_size)))) > + { > + switch (optimize_string_length (a)) > + { > + case -1: > + remove_AT (die, a->dw_attr); > + ix--; > + /* For DWARF4 and earlier, if we drop DW_AT_string_length, > + we need to drop also DW_AT_byte_size. */ > + remove_AT_byte_size = true; > + continue; > + default: > + break; > + case 1: > + /* Even if we keep the optimized DW_AT_string_length, > + it might have changed AT_class, so process it again. */ > + ix--; > + continue; > + } > + } > /* For -gdwarf-2 don't attempt to optimize > DW_AT_data_member_location containing > DW_OP_plus_uconst - older consumers might > @@ -26741,6 +27041,9 @@ resolve_addr (dw_die_ref die) > break; > } > > + if (remove_AT_byte_size) > + remove_AT (die, DW_AT_byte_size); > + > FOR_EACH_CHILD (die, c, resolve_addr (c)); > } > > --- gcc/fortran/trans-decl.c.jj 2016-07-21 08:59:47.098616701 +0200 > +++ gcc/fortran/trans-decl.c 2016-07-21 11:06:23.002023591 +0200 > @@ -1676,26 +1676,23 @@ gfc_get_symbol_decl (gfc_symbol * sym) > && !(sym->attr.use_assoc && !intrinsic_array_parameter))) > gfc_defer_symbol_init (sym); > > + /* Associate names can use the hidden string length variable > + of their associated target. */ > + if (sym->ts.type == BT_CHARACTER > + && TREE_CODE (length) != INTEGER_CST) > + { > + gfc_finish_var_decl (length, sym); > + gcc_assert (!sym->value); > + } > + > gfc_finish_var_decl (decl, sym); > > if (sym->ts.type == BT_CHARACTER) > - { > - /* Character variables need special handling. */ > - gfc_allocate_lang_decl (decl); > - > - /* Associate names can use the hidden string length variable > - of their associated target. */ > - if (TREE_CODE (length) != INTEGER_CST) > - { > - gfc_finish_var_decl (length, sym); > - gcc_assert (!sym->value); > - } > - } > + /* Character variables need special handling. */ > + gfc_allocate_lang_decl (decl); > else if (sym->attr.subref_array_pointer) > - { > - /* We need the span for these beasts. */ > - gfc_allocate_lang_decl (decl); > - } > + /* We need the span for these beasts. */ > + gfc_allocate_lang_decl (decl); > > if (sym->attr.subref_array_pointer) > { > > Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-07-22 11:55 ` Richard Biener @ 2016-07-22 12:08 ` Jakub Jelinek 2016-07-22 12:39 ` Richard Biener 0 siblings, 1 reply; 14+ messages in thread From: Jakub Jelinek @ 2016-07-22 12:08 UTC (permalink / raw) To: Richard Biener; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Fri, Jul 22, 2016 at 01:55:22PM +0200, Richard Biener wrote: > > @@ -19201,18 +19205,70 @@ gen_array_type_die (tree type, dw_die_re > > if (size >= 0) > > add_AT_unsigned (array_die, DW_AT_byte_size, size); > > else if (TYPE_DOMAIN (type) != NULL_TREE > > - && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE > > - && DECL_P (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))) > > + && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE) > > { > > tree szdecl = TYPE_MAX_VALUE (TYPE_DOMAIN (type)); > > - dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL); > > + tree rszdecl = szdecl; > > + HOST_WIDE_INT rsize = 0; > > > > size = int_size_in_bytes (TREE_TYPE (szdecl)); > > - if (loc && size > 0) > > + if (!DECL_P (szdecl)) > > { > > - add_AT_location_description (array_die, DW_AT_string_length, loc); > > - if (size != DWARF2_ADDR_SIZE) > > - add_AT_unsigned (array_die, DW_AT_byte_size, size); > > + if (TREE_CODE (szdecl) == INDIRECT_REF > > So I wonder how this can happen with variable-size type > gimplification. Shouldn't > this be on, say, DECL_VALUE_EXPR of the DECL_P TYPE_MAX_VALUE? If you mean the INDIRECT_REF, that only happens with PARM_DECLs, and conceptually a dereference of the argument is the right spot where the length lives (if you reallocate the string with different character length, then that is where you store the value. If you add some artificial decl that will hold the value of *_varb, then the trouble is that the variable won't be assigned before the function prologue and most likely will be optimized away anyway. > <1><28d>: Abbrev Number: 19 (DW_TAG_string_type) > <1><28e>: Abbrev Number: 19 (DW_TAG_string_type) > <1><28f>: Abbrev Number: 6 (DW_TAG_pointer_type) > > so there is nothing to annotate with a location later. With the patch there will be DW_OP_call4 in 2 DW_AT_string_length attributes and one DW_OP_call4; DW_OP_deref. > Note that even with GCC 5 'varb' didn't get a DW_AT_string_length, > 'vara' did, though. Yeah, I've mentioned that in the mail. Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-07-22 12:08 ` Jakub Jelinek @ 2016-07-22 12:39 ` Richard Biener 2016-08-01 10:44 ` Richard Biener 0 siblings, 1 reply; 14+ messages in thread From: Richard Biener @ 2016-07-22 12:39 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Fri, Jul 22, 2016 at 2:08 PM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Jul 22, 2016 at 01:55:22PM +0200, Richard Biener wrote: >> > @@ -19201,18 +19205,70 @@ gen_array_type_die (tree type, dw_die_re >> > if (size >= 0) >> > add_AT_unsigned (array_die, DW_AT_byte_size, size); >> > else if (TYPE_DOMAIN (type) != NULL_TREE >> > - && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE >> > - && DECL_P (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))) >> > + && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE) >> > { >> > tree szdecl = TYPE_MAX_VALUE (TYPE_DOMAIN (type)); >> > - dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL); >> > + tree rszdecl = szdecl; >> > + HOST_WIDE_INT rsize = 0; >> > >> > size = int_size_in_bytes (TREE_TYPE (szdecl)); >> > - if (loc && size > 0) >> > + if (!DECL_P (szdecl)) >> > { >> > - add_AT_location_description (array_die, DW_AT_string_length, loc); >> > - if (size != DWARF2_ADDR_SIZE) >> > - add_AT_unsigned (array_die, DW_AT_byte_size, size); >> > + if (TREE_CODE (szdecl) == INDIRECT_REF >> >> So I wonder how this can happen with variable-size type >> gimplification. Shouldn't >> this be on, say, DECL_VALUE_EXPR of the DECL_P TYPE_MAX_VALUE? > > If you mean the INDIRECT_REF, that only happens with PARM_DECLs, and > conceptually a dereference of the argument is the right spot where the > length lives (if you reallocate the string with different character length, > then that is where you store the value. If you add some artificial > decl that will hold the value of *_varb, then the trouble is that the > variable won't be assigned before the function prologue and most likely will > be optimized away anyway. True. I wonder how other cases look like with the length not based on a parameter. >> <1><28d>: Abbrev Number: 19 (DW_TAG_string_type) >> <1><28e>: Abbrev Number: 19 (DW_TAG_string_type) >> <1><28f>: Abbrev Number: 6 (DW_TAG_pointer_type) >> >> so there is nothing to annotate with a location later. > > With the patch there will be DW_OP_call4 in 2 DW_AT_string_length > attributes and one DW_OP_call4; DW_OP_deref. > >> Note that even with GCC 5 'varb' didn't get a DW_AT_string_length, >> 'vara' did, though. > > Yeah, I've mentioned that in the mail. > > Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-07-22 12:39 ` Richard Biener @ 2016-08-01 10:44 ` Richard Biener 2016-08-09 14:36 ` Jakub Jelinek 0 siblings, 1 reply; 14+ messages in thread From: Richard Biener @ 2016-08-01 10:44 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Fri, Jul 22, 2016 at 2:39 PM, Richard Biener <richard.guenther@gmail.com> wrote: > On Fri, Jul 22, 2016 at 2:08 PM, Jakub Jelinek <jakub@redhat.com> wrote: >> On Fri, Jul 22, 2016 at 01:55:22PM +0200, Richard Biener wrote: >>> > @@ -19201,18 +19205,70 @@ gen_array_type_die (tree type, dw_die_re >>> > if (size >= 0) >>> > add_AT_unsigned (array_die, DW_AT_byte_size, size); >>> > else if (TYPE_DOMAIN (type) != NULL_TREE >>> > - && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE >>> > - && DECL_P (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))) >>> > + && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE) >>> > { >>> > tree szdecl = TYPE_MAX_VALUE (TYPE_DOMAIN (type)); >>> > - dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL); >>> > + tree rszdecl = szdecl; >>> > + HOST_WIDE_INT rsize = 0; >>> > >>> > size = int_size_in_bytes (TREE_TYPE (szdecl)); >>> > - if (loc && size > 0) >>> > + if (!DECL_P (szdecl)) >>> > { >>> > - add_AT_location_description (array_die, DW_AT_string_length, loc); >>> > - if (size != DWARF2_ADDR_SIZE) >>> > - add_AT_unsigned (array_die, DW_AT_byte_size, size); >>> > + if (TREE_CODE (szdecl) == INDIRECT_REF >>> >>> So I wonder how this can happen with variable-size type >>> gimplification. Shouldn't >>> this be on, say, DECL_VALUE_EXPR of the DECL_P TYPE_MAX_VALUE? >> >> If you mean the INDIRECT_REF, that only happens with PARM_DECLs, and >> conceptually a dereference of the argument is the right spot where the >> length lives (if you reallocate the string with different character length, >> then that is where you store the value. If you add some artificial >> decl that will hold the value of *_varb, then the trouble is that the >> variable won't be assigned before the function prologue and most likely will >> be optimized away anyway. > > True. I wonder how other cases look like with the length not based on a > parameter. Note that reading the dwarf standard, it looks like it accepts a location which means we could do an implicit location description using DW_OP_stack_value which gives us access to arbitrary dwarf expressions (and thus the possibility to handle it similar to VLAs). But maybe I am missing something? (now running into the issue with LTO debug and gfortran.dg/save_5.f90 where during early debug we emit a location that ends up refering to a symbol that might be optimized away later - early debug cannot sanitize referenced symbols via resolv_addr obviously). Annotating the DIE late is also not what I want to do as I'd need to pull in all type DIEs into the late CU that way (well, at least selected ones, but I'm really trying to avoid going down that route). Thanks, Richard. >>> <1><28d>: Abbrev Number: 19 (DW_TAG_string_type) >>> <1><28e>: Abbrev Number: 19 (DW_TAG_string_type) >>> <1><28f>: Abbrev Number: 6 (DW_TAG_pointer_type) >>> >>> so there is nothing to annotate with a location later. >> >> With the patch there will be DW_OP_call4 in 2 DW_AT_string_length >> attributes and one DW_OP_call4; DW_OP_deref. >> >>> Note that even with GCC 5 'varb' didn't get a DW_AT_string_length, >>> 'vara' did, though. >> >> Yeah, I've mentioned that in the mail. >> >> Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-01 10:44 ` Richard Biener @ 2016-08-09 14:36 ` Jakub Jelinek 2016-08-09 17:10 ` Richard Biener 0 siblings, 1 reply; 14+ messages in thread From: Jakub Jelinek @ 2016-08-09 14:36 UTC (permalink / raw) To: Richard Biener; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Mon, Aug 01, 2016 at 12:44:30PM +0200, Richard Biener wrote: > Note that reading the dwarf standard, it looks like it accepts a location > which means we could do an implicit location description using > DW_OP_stack_value which gives us access to arbitrary dwarf > expressions (and thus the possibility to handle it similar to VLAs). Sure. But for the DW_AT_string_length, the issue is that with early debug, we want the attribute early, but locations within a function obviously can be provided late, after the function is compiled. Which is why the patch uses DW_OP_call4 as the holder of the information that the string length is provided by the value of some variable. In certain cases it can stay that way even during late debug, if the var disappears or has no location info etc., then DW_AT_string_length is dropped, or in other cases the location of the var is copied into the attribute etc. This is different from the VLA bounds in DW_AT_upper_bound etc., where it is allowed to just refer to a DIE that holds the value (so we can stick it into the attribute early and just supply the var's location later), for DW_AT_string_length it is not allowed. > But maybe I am missing something? (now running into the issue > with LTO debug and gfortran.dg/save_5.f90 where during early debug > we emit a location that ends up refering to a symbol that might be > optimized away later - early debug cannot sanitize referenced symbols > via resolv_addr obviously). Annotating the DIE late is also not > what I want to do as I'd need to pull in all type DIEs into the late CU > that way (well, at least selected ones, but I'm really trying to avoid > going down that route). In some cases like location of file scope vars, or this DW_AT_string_length, you really need to adjust late the debug info created early, there is no workaround for that. Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-09 14:36 ` Jakub Jelinek @ 2016-08-09 17:10 ` Richard Biener 2016-08-10 8:17 ` Jakub Jelinek 0 siblings, 1 reply; 14+ messages in thread From: Richard Biener @ 2016-08-09 17:10 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On August 9, 2016 4:36:24 PM GMT+02:00, Jakub Jelinek <jakub@redhat.com> wrote: >On Mon, Aug 01, 2016 at 12:44:30PM +0200, Richard Biener wrote: >> Note that reading the dwarf standard, it looks like it accepts a >location >> which means we could do an implicit location description using >> DW_OP_stack_value which gives us access to arbitrary dwarf >> expressions (and thus the possibility to handle it similar to VLAs). > >Sure. But for the DW_AT_string_length, the issue is that with early >debug, >we want the attribute early, but locations within a function obviously >can be provided late, after the function is compiled. > >Which is why the patch uses DW_OP_call4 as the holder of the >information >that the string length is provided by the value of some variable. >In certain cases it can stay that way even during late debug, if the >var >disappears or has no location info etc., then DW_AT_string_length is >dropped, or in other cases the location of the var is copied into the >attribute etc. > >This is different from the VLA bounds in DW_AT_upper_bound etc., where >it is allowed to just refer to a DIE that holds the value (so we can >stick >it into the attribute early and just supply the var's location later), >for >DW_AT_string_length it is not allowed. > >> But maybe I am missing something? (now running into the issue >> with LTO debug and gfortran.dg/save_5.f90 where during early debug >> we emit a location that ends up refering to a symbol that might be >> optimized away later - early debug cannot sanitize referenced symbols >> via resolv_addr obviously). Annotating the DIE late is also not >> what I want to do as I'd need to pull in all type DIEs into the late >CU >> that way (well, at least selected ones, but I'm really trying to >avoid >> going down that route). > >In some cases like location of file scope vars, or this >DW_AT_string_length, >you really need to adjust late the debug info created early, there is >no >workaround for that. I suppose the workaround is to fix/extend DWARF... (DW_AT_GNU_string_length that allows us to refer to another DIE?) Richard. > Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-09 17:10 ` Richard Biener @ 2016-08-10 8:17 ` Jakub Jelinek 2016-08-10 10:23 ` Richard Biener 0 siblings, 1 reply; 14+ messages in thread From: Jakub Jelinek @ 2016-08-10 8:17 UTC (permalink / raw) To: Richard Biener; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Tue, Aug 09, 2016 at 07:10:34PM +0200, Richard Biener wrote: > >In some cases like location of file scope vars, or this > >DW_AT_string_length, > >you really need to adjust late the debug info created early, there is > >no > >workaround for that. > > I suppose the workaround is to fix/extend DWARF... (DW_AT_GNU_string_length that allows us to refer to another DIE?) Introducing another attribute that does the same thing as existing attribute would be way too ugly. In theory the reference class could be added to DW_AT_string_length, I can ask on DWARF workgroup (but it might be too late for DWARF 5), but that still doesn't solve the issue of the indirect params. How do you want to handle the debug info without ammending the early-generated DWARF though? Just by using it as abstract origins of everything and ammending in copies? That would be a total mess for all the consumers... Parsing DWARF and rewriting it isn't all that hard. Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-10 8:17 ` Jakub Jelinek @ 2016-08-10 10:23 ` Richard Biener 2016-08-11 12:36 ` Jakub Jelinek 0 siblings, 1 reply; 14+ messages in thread From: Richard Biener @ 2016-08-10 10:23 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Wed, Aug 10, 2016 at 10:16 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Tue, Aug 09, 2016 at 07:10:34PM +0200, Richard Biener wrote: >> >In some cases like location of file scope vars, or this >> >DW_AT_string_length, >> >you really need to adjust late the debug info created early, there is >> >no >> >workaround for that. >> >> I suppose the workaround is to fix/extend DWARF... (DW_AT_GNU_string_length that allows us to refer to another DIE?) > > Introducing another attribute that does the same thing as existing attribute > would be way too ugly. In theory the reference class could be added to > DW_AT_string_length, I can ask on DWARF workgroup (but it might be too late > for DWARF 5), but that still doesn't solve the issue of the indirect params. > > How do you want to handle the debug info without ammending the early-generated > DWARF though? Just by using it as abstract origins of everything and > ammending in copies? Yes. > That would be a total mess for all the consumers... > Parsing DWARF and rewriting it isn't all that hard. It may be not hard but it takes time and bloats debug info. You'd need to do it N times (each LTRANS unit would need to parse all early generated debug, pickle the "interesting" pieces, annotate it and then hopefully write out a lot less than parsed). Or you do what I do, annotate via abstract origins. A clever DWARF optimizer might then do what you suggest on the final executable (smash all copies with their abstract origin). I'm not objecting to the latter, but that's not my primary objective. My primary objective is to make C++ debugging reasonable even when you use LTO - with my prototype patches now all libstdc++ pretty-printer tests PASS while the currently just FAIL. Yes, I do have some DWARF consumer issues - VLAs don't work, but I got no answer from a mail to gdb last year or the bug I filed a few weeks ago. Yes, lldb immediately crashes when trying to parse the DWARF. Those are the only consumers I have access to. That said, I don't see a way to do scalable "better" debug for LTO. Richard. > > Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-10 10:23 ` Richard Biener @ 2016-08-11 12:36 ` Jakub Jelinek 2016-08-11 14:09 ` Richard Biener 0 siblings, 1 reply; 14+ messages in thread From: Jakub Jelinek @ 2016-08-11 12:36 UTC (permalink / raw) To: Richard Biener; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches Hi! On Wed, Aug 10, 2016 at 12:23:06PM +0200, Richard Biener wrote: > > Introducing another attribute that does the same thing as existing attribute > > would be way too ugly. In theory the reference class could be added to > > DW_AT_string_length, I can ask on DWARF workgroup (but it might be too late > > for DWARF 5), but that still doesn't solve the issue of the indirect params. > > > > How do you want to handle the debug info without ammending the early-generated > > DWARF though? Just by using it as abstract origins of everything and > > ammending in copies? > > Yes. I've filed an enhancement request and got one positive feedback, but it is going to be multiple months at best. So, at least for non-LTO I'd strongly prefer to go with what the current patch does, and for LTO we'd then have to ask GDB to implement the reference class for DW_AT_string_length and then just use that instead, depending on flag_lto or similar, or perhaps for dwarf4 and earlier don't emit for LTO variable string length and keep it only for dwarf5+ (if the change is approved). For the indirect parms and LTO, I guess we'd need to create some artificial VAR_DECL at function scope with DECL_VALUE_EXPR of *parm, DECL_ARTIFICIAL, DECL_NAMELESS and reference that instead of the parm itself. With this, do you object to the current patch? Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-11 12:36 ` Jakub Jelinek @ 2016-08-11 14:09 ` Richard Biener 0 siblings, 0 replies; 14+ messages in thread From: Richard Biener @ 2016-08-11 14:09 UTC (permalink / raw) To: Jakub Jelinek; +Cc: Jason Merrill, Aldy Hernandez, GCC Patches On Thu, Aug 11, 2016 at 2:36 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Hi! > > On Wed, Aug 10, 2016 at 12:23:06PM +0200, Richard Biener wrote: >> > Introducing another attribute that does the same thing as existing attribute >> > would be way too ugly. In theory the reference class could be added to >> > DW_AT_string_length, I can ask on DWARF workgroup (but it might be too late >> > for DWARF 5), but that still doesn't solve the issue of the indirect params. >> > >> > How do you want to handle the debug info without ammending the early-generated >> > DWARF though? Just by using it as abstract origins of everything and >> > ammending in copies? >> >> Yes. > > I've filed an enhancement request and got one positive feedback, but it is > going to be multiple months at best. > So, at least for non-LTO I'd strongly prefer to go with what the current > patch does, and for LTO we'd then have to ask GDB to implement the reference > class for DW_AT_string_length and then just use that instead, depending on > flag_lto or similar, or perhaps for dwarf4 and earlier don't emit for LTO > variable string length and keep it only for dwarf5+ (if the change is > approved). For the indirect parms and LTO, I guess we'd need to create some > artificial VAR_DECL at function scope with DECL_VALUE_EXPR of *parm, > DECL_ARTIFICIAL, DECL_NAMELESS and reference that instead of the parm > itself. > With this, do you object to the current patch? No, I still hope it avoids generating references to possibly optimized out stuff early - I didn't thorougly review it. I guess I'll find out soon ;) Richard. > Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-07-21 16:53 [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) Jakub Jelinek 2016-07-22 11:55 ` Richard Biener @ 2016-08-12 17:47 ` Jason Merrill 2016-08-12 17:57 ` Jakub Jelinek 1 sibling, 1 reply; 14+ messages in thread From: Jason Merrill @ 2016-08-12 17:47 UTC (permalink / raw) To: Jakub Jelinek, Aldy Hernandez; +Cc: gcc-patches On 07/21/2016 12:53 PM, Jakub Jelinek wrote: > size = int_size_in_bytes (TREE_TYPE (szdecl)); ... > + if (size != DWARF2_ADDR_SIZE) > + add_AT_unsigned (array_die, DW_AT_byte_size, size); For DWARF5, where DW_AT_byte_size is always the size of the array type, I think this should be DW_AT_string_length_byte_size. OK with that change. Jason ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-12 17:47 ` Jason Merrill @ 2016-08-12 17:57 ` Jakub Jelinek 2016-08-15 10:07 ` Jakub Jelinek 0 siblings, 1 reply; 14+ messages in thread From: Jakub Jelinek @ 2016-08-12 17:57 UTC (permalink / raw) To: Jason Merrill; +Cc: Aldy Hernandez, gcc-patches On Fri, Aug 12, 2016 at 01:47:14PM -0400, Jason Merrill wrote: > On 07/21/2016 12:53 PM, Jakub Jelinek wrote: > > size = int_size_in_bytes (TREE_TYPE (szdecl)); > ... > >+ if (size != DWARF2_ADDR_SIZE) > >+ add_AT_unsigned (array_die, DW_AT_byte_size, size); > > For DWARF5, where DW_AT_byte_size is always the size of the array type, I > think this should be DW_AT_string_length_byte_size. Sure, but this is just reindenting existing code, DW_AT_string_length_byte_size isn't yet in dwarf2.def nor anywhere else. When DWARF5 will make it into the public beta, I'll try to spend some time on implementing the 5 support, but I think it would be better done incrementally then, not part of this patch. Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) 2016-08-12 17:57 ` Jakub Jelinek @ 2016-08-15 10:07 ` Jakub Jelinek 0 siblings, 0 replies; 14+ messages in thread From: Jakub Jelinek @ 2016-08-15 10:07 UTC (permalink / raw) To: Jason Merrill; +Cc: Aldy Hernandez, gcc-patches On Fri, Aug 12, 2016 at 07:57:42PM +0200, Jakub Jelinek wrote: > On Fri, Aug 12, 2016 at 01:47:14PM -0400, Jason Merrill wrote: > > On 07/21/2016 12:53 PM, Jakub Jelinek wrote: > > > size = int_size_in_bytes (TREE_TYPE (szdecl)); > > ... > > >+ if (size != DWARF2_ADDR_SIZE) > > >+ add_AT_unsigned (array_die, DW_AT_byte_size, size); > > > > For DWARF5, where DW_AT_byte_size is always the size of the array type, I > > think this should be DW_AT_string_length_byte_size. > > Sure, but this is just reindenting existing code, > DW_AT_string_length_byte_size isn't yet in dwarf2.def nor anywhere else. > When DWARF5 will make it into the public beta, I'll try to spend some time > on implementing the 5 support, but I think it would be better done > incrementally then, not part of this patch. I've committed the patch with following incremental patch on top of it for the trunk (and for 6.2 just the original patch): 2016-08-15 Jakub Jelinek <jakub@redhat.com> * dwarf2.def (DW_AT_string_length_bit_size, DW_AT_string_length_byte_size): New attributes. * dwarf2out.c (struct checksum_attributes): Add at_string_length_bit_size and at_string_length_byte_size fields. (collect_checksum_attributes): Handle DW_AT_string_length_bit_size and DW_AT_string_length_byte_size. (die_checksum_ordered): Handle at_string_length_bit_size and at_string_length_byte_size. (gen_array_type_die): For dwarf_version >= 5 emit DW_AT_string_length_byte_size instead of DW_AT_byte_size. (adjust_string_types): For dwarf_version >= 5 remove DW_AT_string_length_byte_size instead of DW_AT_byte_size. (resolve_addr): Likewise. --- include/dwarf2.def.jj 2016-08-12 11:12:47.000000000 +0200 +++ include/dwarf2.def 2016-08-15 11:03:44.742465435 +0200 @@ -309,6 +309,8 @@ DW_AT (DW_AT_const_expr, 0x6c) DW_AT (DW_AT_enum_class, 0x6d) DW_AT (DW_AT_linkage_name, 0x6e) /* DWARF 5. */ +DW_AT (DW_AT_string_length_bit_size, 0x6f) +DW_AT (DW_AT_string_length_byte_size, 0x70) DW_AT (DW_AT_noreturn, 0x87) DW_AT (DW_AT_deleted, 0x8a) DW_AT (DW_AT_defaulted, 0x8b) --- gcc/dwarf2out.c.jj 2016-08-15 11:02:41.000000000 +0200 +++ gcc/dwarf2out.c 2016-08-15 11:11:40.023507518 +0200 @@ -6363,6 +6363,8 @@ struct checksum_attributes dw_attr_node *at_small; dw_attr_node *at_segment; dw_attr_node *at_string_length; + dw_attr_node *at_string_length_bit_size; + dw_attr_node *at_string_length_byte_size; dw_attr_node *at_threads_scaled; dw_attr_node *at_upper_bound; dw_attr_node *at_use_location; @@ -6502,6 +6504,12 @@ collect_checksum_attributes (struct chec case DW_AT_string_length: attrs->at_string_length = a; break; + case DW_AT_string_length_bit_size: + attrs->at_string_length_bit_size = a; + break; + case DW_AT_string_length_byte_size: + attrs->at_string_length_byte_size = a; + break; case DW_AT_threads_scaled: attrs->at_threads_scaled = a; break; @@ -6588,6 +6596,8 @@ die_checksum_ordered (dw_die_ref die, st CHECKSUM_ATTR (attrs.at_small); CHECKSUM_ATTR (attrs.at_segment); CHECKSUM_ATTR (attrs.at_string_length); + CHECKSUM_ATTR (attrs.at_string_length_bit_size); + CHECKSUM_ATTR (attrs.at_string_length_byte_size); CHECKSUM_ATTR (attrs.at_threads_scaled); CHECKSUM_ATTR (attrs.at_upper_bound); CHECKSUM_ATTR (attrs.at_use_location); @@ -19355,7 +19365,9 @@ gen_array_type_die (tree type, dw_die_re add_AT_location_description (array_die, DW_AT_string_length, loc); if (size != DWARF2_ADDR_SIZE) - add_AT_unsigned (array_die, DW_AT_byte_size, size); + add_AT_unsigned (array_die, dwarf_version >= 5 + ? DW_AT_string_length_byte_size + : DW_AT_byte_size, size); } } } @@ -19448,7 +19460,9 @@ adjust_string_types (void) else { remove_AT (array_die, DW_AT_string_length); - remove_AT (array_die, DW_AT_byte_size); + remove_AT (array_die, dwarf_version >= 5 + ? DW_AT_string_length_byte_size + : DW_AT_byte_size); } } } @@ -26909,8 +26923,8 @@ copy_deref_exprloc (dw_loc_descr_ref exp /* For DW_AT_string_length attribute with DW_OP_call4 reference to a variable or argument, adjust it if needed and return: - -1 if the DW_AT_string_length attribute and DW_AT_byte_size attribute - if present should be removed + -1 if the DW_AT_string_length attribute and DW_AT_{string_length_,}byte_size + attribute if present should be removed 0 keep the attribute as is if the referenced var or argument has only DWARF expression that covers all ranges 1 if the attribute has been successfully adjusted. */ @@ -27083,8 +27097,8 @@ resolve_addr (dw_die_ref die) case -1: remove_AT (die, a->dw_attr); ix--; - /* For DWARF4 and earlier, if we drop DW_AT_string_length, - we need to drop also DW_AT_byte_size. */ + /* If we drop DW_AT_string_length, we need to drop also + DW_AT_{string_length_,}byte_size. */ remove_AT_byte_size = true; continue; default: @@ -27181,7 +27195,9 @@ resolve_addr (dw_die_ref die) } if (remove_AT_byte_size) - remove_AT (die, DW_AT_byte_size); + remove_AT (die, dwarf_version >= 5 + ? DW_AT_string_length_byte_size + : DW_AT_byte_size); FOR_EACH_CHILD (die, c, resolve_addr (c)); } Jakub ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2016-08-15 10:07 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-07-21 16:53 [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906) Jakub Jelinek 2016-07-22 11:55 ` Richard Biener 2016-07-22 12:08 ` Jakub Jelinek 2016-07-22 12:39 ` Richard Biener 2016-08-01 10:44 ` Richard Biener 2016-08-09 14:36 ` Jakub Jelinek 2016-08-09 17:10 ` Richard Biener 2016-08-10 8:17 ` Jakub Jelinek 2016-08-10 10:23 ` Richard Biener 2016-08-11 12:36 ` Jakub Jelinek 2016-08-11 14:09 ` Richard Biener 2016-08-12 17:47 ` Jason Merrill 2016-08-12 17:57 ` Jakub Jelinek 2016-08-15 10:07 ` Jakub Jelinek
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).