Date: Thu, 21 Jul 2016 16:53:00 -0000
From: Jakub Jelinek
To: Jason Merrill, Aldy Hernandez
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906)
Message-ID: <20160721165324.GI7387@tucnak.redhat.com>

Hi!

The early debug changes broke e.g. the following testcase:

program pr71906
  character(len=8) :: vard
  character(len=:), allocatable :: vare
  type t
    character(len=:), allocatable :: f
  end type
  type(t) :: varf
  allocate(character(len=10) :: vare)
  allocate(character(len=9) :: varf%f)
  vare = 'foo'
  call foo (vard, vare, varf)
contains
  subroutine foo (vara, varb, varc)
    character(len=*) :: vara
    character(len=:), allocatable :: varb
    type(t) :: varc
    vara = 'bar'
    varb = 'baz'
    varc%f = 'str'
  end subroutine
end program pr71906

The issue is that unlike e.g. DW_AT_upper_bound, DW_AT_string_length
doesn't allow a reference to some other DIE.  So while for the former we
just emit a reference to an artificial var holding the VLA sizes, for
a non-constant string length we relied on loc_list_from_tree, which used
to work, but doesn't anymore.

The following patch has 4 major parts:

1) Fortran FE change to emit the artificial vars holding string length
   before the string vars (something I broke recently with the PR71687
   fix)

2) for early_dwarf, loc_list_from_tree for the DW_AT_string_length var
   will most likely fail; in that case the code in gen_array_type_die adds
   DW_OP_call4 referencing the DIE of the artificial var or parameter.
   DW_OP_call4 is only a rough match, in that it only works properly if
   the artificial var has DWARF expressions (rather than location
   descriptions).  The patch also newly adds support for varb above,
   where the string length is an INDIRECT_REF of an artificial PARM_DECL;
   for early dwarf this adds DW_OP_call4; DW_OP_deref{,_size}.  The
   reason to handle it this way is that IMHO it better matches the spirit
   and intention of early dwarf, eventually for LTO, where I presume we'd
   stream the early dwarf created debug info and read/adjust it
   afterwards; LTO doesn't know that something is a Fortran character
   and what string length it has

3) unfortunately, for the PARM_DECL and INDIRECT_REF of PARM_DECL cases,
   usually we need to refer to a parameter whose DIE has not been created
   yet during early_dwarf, and trying to create it out of order has
   various issues, e.g. the debugger would show the parameters in a
   different order.  So, this is resolved using the string_types vector,
   the adjust_string_types function and some code in gen_subprogram_die

4) finally, when finalizing the debug info, resolve_addr and its helper
   functions look at DW_AT_string_length with DW_OP_call4 optionally
   followed by DW_OP_deref{,_size} in its DWARF expression, look up the
   referenced DIE, consider its DW_AT_location and either:
   - keep it as is, if it is valid DWARF (i.e. known to be defined for
     all PCs as a DWARF expression)
   - copy the DW_AT_location attribute value/form to the
     DW_AT_string_length (in the non-deref case)
   - for the deref case, adjust what can be easily adjusted: a DWARF
     expression with DW_OP_stack_value at the end drops the
     DW_OP_stack_value and thus handles the dereference; regx/regN is
     replaced by bregx/bregN 0; for a plain DWARF expression only the
     dereference is added at the end; cases that can't be adjusted are
     dropped
   - drop the DW_AT_string_length attribute, and DW_AT_byte_size too if
     present, if the DIE doesn't have DW_AT_location, or it isn't
     usable etc.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-07-21  Jakub Jelinek

	PR debug/71906
	* dwarf2out.c (string_types): New variable.
	(gen_array_type_die): Change early_dwarf handling of
	DW_AT_string_length, create DW_OP_call4 referencing the
	length var temporarily.  Handle parameters that are pointers
	to string length.
	(adjust_string_types): New function.
	(gen_subprogram_die): Temporarily set string_types to local var,
	call adjust_string_types if needed.
	(non_dwarf_expression, copy_deref_exprloc, optimize_string_length):
	New functions.
	(resolve_addr): Adjust DW_AT_string_length if it is DW_OP_call4.
	* trans-decl.c (gfc_get_symbol_decl): Call gfc_finish_var_decl for
	decl's character length before gfc_finish_var_decl on the
	decl itself.

--- gcc/dwarf2out.c.jj	2016-07-21 08:59:47.101616662 +0200
+++ gcc/dwarf2out.c	2016-07-21 11:10:11.510137511 +0200
@@ -3123,6 +3123,10 @@ static bool frame_pointer_fb_offset_vali
 
 static vec<dw_die_ref> base_types;
 
+/* Pointer to vector of DW_TAG_string_type DIEs that need finalization
+   once all arguments are parsed.  */
+static vec<dw_die_ref> *string_types;
+
 /* Flags to represent a set of attribute classes for attributes that represent
    a scalar value (bounds, pointers, ...).  */
 enum dw_scalar_form
@@ -19201,18 +19205,70 @@ gen_array_type_die (tree type, dw_die_re
       if (size >= 0)
         add_AT_unsigned (array_die, DW_AT_byte_size, size);
       else if (TYPE_DOMAIN (type) != NULL_TREE
-               && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE
-               && DECL_P (TYPE_MAX_VALUE (TYPE_DOMAIN (type))))
+               && TYPE_MAX_VALUE (TYPE_DOMAIN (type)) != NULL_TREE)
         {
           tree szdecl = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
-          dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL);
+          tree rszdecl = szdecl;
+          HOST_WIDE_INT rsize = 0;
 
           size = int_size_in_bytes (TREE_TYPE (szdecl));
-          if (loc && size > 0)
+          if (!DECL_P (szdecl))
             {
-              add_AT_location_description (array_die, DW_AT_string_length, loc);
-              if (size != DWARF2_ADDR_SIZE)
-                add_AT_unsigned (array_die, DW_AT_byte_size, size);
+              if (TREE_CODE (szdecl) == INDIRECT_REF
+                  && DECL_P (TREE_OPERAND (szdecl, 0)))
+                {
+                  rszdecl = TREE_OPERAND (szdecl, 0);
+                  rsize = int_size_in_bytes (TREE_TYPE (rszdecl));
+                  if (rsize <= 0)
+                    size = 0;
+                }
+              else
+                size = 0;
+            }
+          if (size > 0)
+            {
+              dw_loc_list_ref loc = loc_list_from_tree (szdecl, 2, NULL);
+              if (loc == NULL
+                  && early_dwarf
+                  && current_function_decl
+                  && DECL_CONTEXT (rszdecl) == current_function_decl)
+                {
+                  dw_die_ref ref = lookup_decl_die (rszdecl);
+                  dw_loc_descr_ref l = NULL;
+                  if (ref)
+                    {
+                      l = new_loc_descr (DW_OP_call4, 0, 0);
+                      l->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
+                      l->dw_loc_oprnd1.v.val_die_ref.die = ref;
+                      l->dw_loc_oprnd1.v.val_die_ref.external = 0;
+                    }
+                  else if (TREE_CODE (rszdecl) == PARM_DECL
+                           && string_types)
+                    {
+                      l = new_loc_descr (DW_OP_call4, 0, 0);
+                      l->dw_loc_oprnd1.val_class = dw_val_class_decl_ref;
+                      l->dw_loc_oprnd1.v.val_decl_ref = rszdecl;
+                      string_types->safe_push (array_die);
+                    }
+                  if (l && rszdecl != szdecl)
+                    {
+                      if (rsize == DWARF2_ADDR_SIZE)
+                        add_loc_descr (&l, new_loc_descr (DW_OP_deref,
+                                                          0, 0));
+                      else
+                        add_loc_descr (&l, new_loc_descr (DW_OP_deref_size,
+                                                          rsize, 0));
+                    }
+                  if (l)
+                    loc = new_loc_list (l, NULL, NULL, NULL);
+                }
+              if (loc)
+                {
+                  add_AT_location_description (array_die, DW_AT_string_length,
+                                               loc);
+                  if (size != DWARF2_ADDR_SIZE)
+                    add_AT_unsigned (array_die, DW_AT_byte_size, size);
+                }
             }
         }
       return;
@@ -19278,6 +19334,37 @@ gen_array_type_die (tree type, dw_die_re
   add_pubtype (type, array_die);
 }
 
+/* After all arguments are created, adjust any DW_TAG_string_type
+   DIEs DW_AT_string_length attributes.  */
+
+static void
+adjust_string_types (void)
+{
+  dw_die_ref array_die;
+  unsigned int i;
+  FOR_EACH_VEC_ELT (*string_types, i, array_die)
+    {
+      dw_attr_node *a = get_AT (array_die, DW_AT_string_length);
+      if (a == NULL)
+        continue;
+      dw_loc_descr_ref loc = AT_loc (a);
+      gcc_assert (loc->dw_loc_opc == DW_OP_call4
+                  && loc->dw_loc_oprnd1.val_class == dw_val_class_decl_ref);
+      dw_die_ref ref = lookup_decl_die (loc->dw_loc_oprnd1.v.val_decl_ref);
+      if (ref)
+        {
+          loc->dw_loc_oprnd1.val_class = dw_val_class_die_ref;
+          loc->dw_loc_oprnd1.v.val_die_ref.die = ref;
+          loc->dw_loc_oprnd1.v.val_die_ref.external = 0;
+        }
+      else
+        {
+          remove_AT (array_die, DW_AT_string_length);
+          remove_AT (array_die, DW_AT_byte_size);
+        }
+    }
+}
+
 /* This routine generates DIE for array with hidden descriptor, details
    are filled into *info by a langhook.  */
 
@@ -20675,6 +20762,9 @@ gen_subprogram_die (tree decl, dw_die_re
       tree generic_decl_parm = generic_decl
                                 ? DECL_ARGUMENTS (generic_decl)
                                 : NULL;
+      auto_vec<dw_die_ref> string_types_vec;
+      if (string_types == NULL)
+        string_types = &string_types_vec;
 
       /* Now we want to walk the list of parameters of the function and
          emit their relevant DIEs.
@@ -20737,6 +20827,14 @@ gen_subprogram_die (tree decl, dw_die_re
           else if (DECL_INITIAL (decl) == NULL_TREE)
             gen_unspecified_parameters_die (decl, subr_die);
         }
+
+      /* Adjust DW_TAG_string_type DIEs if needed, now that all arguments
+         have DIEs.  */
+      if (string_types == &string_types_vec)
+        {
+          adjust_string_types ();
+          string_types = NULL;
+        }
     }
 
   if (subr_die != old_die)
@@ -26583,6 +26681,175 @@ optimize_location_into_implicit_ptr (dw_
     }
 }
 
+/* Return NULL if l is a DWARF expression, or first op that is not
+   valid DWARF expression.  */
+
+static dw_loc_descr_ref
+non_dwarf_expression (dw_loc_descr_ref l)
+{
+  while (l)
+    {
+      if (l->dw_loc_opc >= DW_OP_reg0 && l->dw_loc_opc <= DW_OP_reg31)
+        return l;
+      switch (l->dw_loc_opc)
+        {
+        case DW_OP_regx:
+        case DW_OP_implicit_value:
+        case DW_OP_stack_value:
+        case DW_OP_GNU_implicit_pointer:
+        case DW_OP_GNU_parameter_ref:
+        case DW_OP_piece:
+        case DW_OP_bit_piece:
+          return l;
+        default:
+          break;
+        }
+      l = l->dw_loc_next;
+    }
+  return NULL;
+}
+
+/* Return adjusted copy of EXPR:
+   If it is empty DWARF expression, return it.
+   If it is valid non-empty DWARF expression,
+   return copy of EXPR with copy of DEREF appended to it.
+   If it is DWARF expression followed by DW_OP_reg{N,x}, return
+   copy of the DWARF expression with DW_OP_breg{N,x} <0> appended
+   and no DEREF.
+   If it is DWARF expression followed by DW_OP_stack_value, return
+   copy of the DWARF expression without anything appended.
+   Otherwise, return NULL.  */
+
+static dw_loc_descr_ref
+copy_deref_exprloc (dw_loc_descr_ref expr, dw_loc_descr_ref deref)
+{
+
+  if (expr == NULL)
+    return NULL;
+
+  dw_loc_descr_ref l = non_dwarf_expression (expr);
+  if (l && l->dw_loc_next)
+    return NULL;
+
+  if (l)
+    {
+      if (l->dw_loc_opc >= DW_OP_reg0 && l->dw_loc_opc <= DW_OP_reg31)
+        deref = new_loc_descr ((enum dwarf_location_atom)
+                               (DW_OP_breg0 + (l->dw_loc_opc - DW_OP_reg0)),
+                               0, 0);
+      else
+        switch (l->dw_loc_opc)
+          {
+          case DW_OP_regx:
+            deref = new_loc_descr (DW_OP_bregx,
+                                   l->dw_loc_oprnd1.v.val_unsigned, 0);
+            break;
+          case DW_OP_stack_value:
+            deref = NULL;
+            break;
+          default:
+            return NULL;
+          }
+    }
+  else
+    deref = new_loc_descr (deref->dw_loc_opc,
+                           deref->dw_loc_oprnd1.v.val_int, 0);
+
+  dw_loc_descr_ref ret = NULL, *p = &ret;
+  while (expr != l)
+    {
+      *p = new_loc_descr (expr->dw_loc_opc, 0, 0);
+      (*p)->dw_loc_oprnd1 = expr->dw_loc_oprnd1;
+      (*p)->dw_loc_oprnd2 = expr->dw_loc_oprnd2;
+      p = &(*p)->dw_loc_next;
+      expr = expr->dw_loc_next;
+    }
+  *p = deref;
+  return ret;
+}
+
+/* For DW_AT_string_length attribute with DW_OP_call4 reference to a variable
+   or argument, adjust it if needed and return:
+   -1 if the DW_AT_string_length attribute and DW_AT_byte_size attribute
+      if present should be removed
+   0 keep the attribute as is if the referenced var or argument has
+     only DWARF expression that covers all ranges
+   1 if the attribute has been successfully adjusted.  */
+
+static int
+optimize_string_length (dw_attr_node *a)
+{
+  dw_loc_descr_ref l = AT_loc (a), lv;
+  dw_die_ref die = l->dw_loc_oprnd1.v.val_die_ref.die;
+  dw_attr_node *av = get_AT (die, DW_AT_location);
+  dw_loc_list_ref d;
+  bool non_dwarf_expr = false;
+
+  if (av == NULL)
+    return -1;
+  switch (AT_class (av))
+    {
+    case dw_val_class_loc_list:
+      for (d = AT_loc_list (av); d != NULL; d = d->dw_loc_next)
+        if (d->expr && non_dwarf_expression (d->expr))
+          non_dwarf_expr = true;
+      break;
+    case dw_val_class_loc:
+      lv = AT_loc (av);
+      if (lv == NULL)
+        return -1;
+      if (non_dwarf_expression (lv))
+        non_dwarf_expr = true;
+      break;
+    default:
+      return -1;
+    }
+
+  /* If it is safe to keep DW_OP_call4 in, keep it.  */
+  if (!non_dwarf_expr
+      && (l->dw_loc_next == NULL || AT_class (av) == dw_val_class_loc))
+    return 0;
+
+  /* If not dereferencing the DW_OP_call4 afterwards, we can just
+     copy over the DW_AT_location attribute from die to a.  */
+  if (l->dw_loc_next == NULL)
+    {
+      a->dw_attr_val = av->dw_attr_val;
+      return 1;
+    }
+
+  dw_loc_list_ref list, *p;
+  switch (AT_class (av))
+    {
+    case dw_val_class_loc_list:
+      p = &list;
+      list = NULL;
+      for (d = AT_loc_list (av); d != NULL; d = d->dw_loc_next)
+        {
+          lv = copy_deref_exprloc (d->expr, l->dw_loc_next);
+          if (lv)
+            {
+              *p = new_loc_list (lv, d->begin, d->end, d->section);
+              p = &(*p)->dw_loc_next;
+            }
+        }
+      if (list == NULL)
+        return -1;
+      a->dw_attr_val.val_class = dw_val_class_loc_list;
+      gen_llsym (list);
+      *AT_loc_list_ptr (a) = list;
+      return 1;
+    case dw_val_class_loc:
+      lv = copy_deref_exprloc (AT_loc (av), l->dw_loc_next);
+      if (lv == NULL)
+        return -1;
+      a->dw_attr_val.v.val_loc = lv;
+      return 1;
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Resolve DW_OP_addr and DW_AT_const_value CONST_STRING arguments to
    an address in .rodata section if the string literal is emitted there,
    or remove the containing location list or replace DW_AT_const_value
@@ -26597,6 +26864,7 @@ resolve_addr (dw_die_ref die)
   dw_attr_node *a;
   dw_loc_list_ref *curr, *start, loc;
   unsigned ix;
+  bool remove_AT_byte_size = false;
 
   FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a)
     switch (AT_class (a))
@@ -26657,6 +26925,38 @@ resolve_addr (dw_die_ref die)
       case dw_val_class_loc:
         {
           dw_loc_descr_ref l = AT_loc (a);
+          /* Using DW_OP_call4 or DW_OP_call4 DW_OP_deref in
+             DW_AT_string_length is only a rough approximation; unfortunately
+             DW_AT_string_length can't be a reference to a DIE.  DW_OP_call4
+             needs a DWARF expression, while DW_AT_location of the referenced
+             variable or argument might be any location description.  */
+          if (a->dw_attr == DW_AT_string_length
+              && l
+              && l->dw_loc_opc == DW_OP_call4
+              && l->dw_loc_oprnd1.val_class == dw_val_class_die_ref
+              && (l->dw_loc_next == NULL
+                  || (l->dw_loc_next->dw_loc_next == NULL
+                      && (l->dw_loc_next->dw_loc_opc == DW_OP_deref
+                          || l->dw_loc_next->dw_loc_opc == DW_OP_deref_size))))
+            {
+              switch (optimize_string_length (a))
+                {
+                case -1:
+                  remove_AT (die, a->dw_attr);
+                  ix--;
+                  /* For DWARF4 and earlier, if we drop DW_AT_string_length,
+                     we need to drop also DW_AT_byte_size.  */
+                  remove_AT_byte_size = true;
+                  continue;
+                default:
+                  break;
+                case 1:
+                  /* Even if we keep the optimized DW_AT_string_length,
+                     it might have changed AT_class, so process it again.  */
+                  ix--;
+                  continue;
+                }
+            }
           /* For -gdwarf-2 don't attempt to optimize
              DW_AT_data_member_location containing
              DW_OP_plus_uconst - older consumers might
@@ -26741,6 +27041,9 @@ resolve_addr (dw_die_ref die)
         break;
       }
 
+  if (remove_AT_byte_size)
+    remove_AT (die, DW_AT_byte_size);
+
   FOR_EACH_CHILD (die, c, resolve_addr (c));
 }
 
--- gcc/fortran/trans-decl.c.jj	2016-07-21 08:59:47.098616701 +0200
+++ gcc/fortran/trans-decl.c	2016-07-21 11:06:23.002023591 +0200
@@ -1676,26 +1676,23 @@ gfc_get_symbol_decl (gfc_symbol * sym)
           && !(sym->attr.use_assoc && !intrinsic_array_parameter)))
     gfc_defer_symbol_init (sym);
 
+  /* Associate names can use the hidden string length variable
+     of their associated target.  */
+  if (sym->ts.type == BT_CHARACTER
+      && TREE_CODE (length) != INTEGER_CST)
+    {
+      gfc_finish_var_decl (length, sym);
+      gcc_assert (!sym->value);
+    }
+
   gfc_finish_var_decl (decl, sym);
 
   if (sym->ts.type == BT_CHARACTER)
-    {
-      /* Character variables need special handling.  */
-      gfc_allocate_lang_decl (decl);
-
-      /* Associate names can use the hidden string length variable
-         of their associated target.  */
-      if (TREE_CODE (length) != INTEGER_CST)
-        {
-          gfc_finish_var_decl (length, sym);
-          gcc_assert (!sym->value);
-        }
-    }
+    /* Character variables need special handling.  */
+    gfc_allocate_lang_decl (decl);
   else if (sym->attr.subref_array_pointer)
-    {
-      /* We need the span for these beasts.  */
-      gfc_allocate_lang_decl (decl);
-    }
+    /* We need the span for these beasts.  */
+    gfc_allocate_lang_decl (decl);
 
   if (sym->attr.subref_array_pointer)
     {

	Jakub
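
For readers who want to try the deref adjustment from part 4 in isolation, the rules documented above copy_deref_exprloc can be sketched outside of GCC roughly as follows. This is only an illustration under stated assumptions: the toy_* names, the flat array representation and the reduced opcode set are hypothetical and are not part of the patch or of dwarf2out.c; only the opcode values are taken from the DWARF encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced opcode set; the numeric values follow the DWARF
   encoding of the corresponding DW_OP_* operations.  */
enum toy_opc
{
  TOY_DEREF = 0x06,
  TOY_REG0 = 0x50,
  TOY_REG31 = 0x6f,
  TOY_BREG0 = 0x70,
  TOY_REGX = 0x90,
  TOY_FBREG = 0x91,
  TOY_BREGX = 0x92,
  TOY_PIECE = 0x93,
  TOY_STACK_VALUE = 0x9f
};

/* One operation with up to two operands (illustrative layout only).  */
struct toy_op
{
  int opc;
  long a;
  long b;
};

/* Nonzero if OPC makes the sequence a location description rather
   than a plain DWARF expression (subset of the ops the patch checks).  */
int
toy_is_location_op (int opc)
{
  return ((opc >= TOY_REG0 && opc <= TOY_REG31)
          || opc == TOY_REGX
          || opc == TOY_STACK_VALUE
          || opc == TOY_PIECE);
}

/* Write to OUT (room for N + 1 ops) an adjusted copy of EXPR[0..N-1]
   that yields the dereferenced value.  Return the number of ops
   written, or -1 if the sequence can't be adjusted.  */
int
toy_copy_deref (const struct toy_op *expr, int n, struct toy_op deref,
                struct toy_op *out)
{
  int i, stop = n, len, have_tail = 1;
  struct toy_op tail = deref;

  assert (expr != NULL && out != NULL);
  if (n <= 0)
    return -1;

  /* Find the first op that is not valid in a DWARF expression.  */
  for (i = 0; i < n; i++)
    if (toy_is_location_op (expr[i].opc))
      {
        stop = i;
        break;
      }

  if (stop < n)
    {
      int opc = expr[stop].opc;

      /* Only a single trailing location op can be handled.  */
      if (stop != n - 1)
        return -1;
      if (opc >= TOY_REG0 && opc <= TOY_REG31)
        {
          /* regN -> bregN 0: read memory at the register's value.  */
          tail.opc = TOY_BREG0 + (opc - TOY_REG0);
          tail.a = 0;
          tail.b = 0;
        }
      else if (opc == TOY_REGX)
        {
          /* regx R -> bregx R, 0.  */
          tail.opc = TOY_BREGX;
          tail.a = expr[stop].a;
          tail.b = 0;
        }
      else if (opc == TOY_STACK_VALUE)
        /* Dropping stack_value turns the value into an address.  */
        have_tail = 0;
      else
        return -1;
    }

  /* Copy the plain-expression prefix, then append the tail if any.  */
  for (i = 0; i < stop; i++)
    out[i] = expr[i];
  len = stop;
  if (have_tail)
    out[len++] = tail;
  return len;
}
```

With this toy model, a lone fbreg gets the deref appended, a lone reg5 becomes breg5 0, a trailing stack_value is simply dropped, and something like reg3 followed by piece is rejected, which corresponds to the case where the patch drops DW_AT_string_length altogether.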