From: Dodji Seketeli
To: libabigail@sourceware.org
Cc: woodard@redhat.com
Subject: [PATCH, applied] Better support for golang programs
Organization: Red Hat / France
Date: Thu, 22 Sep 2022 16:49:52 +0200
Message-ID: <87y1ub4g6n.fsf@redhat.com>

Hello,

When analyzing the bettercap program written in Golang, the DWARF
reader goes into an infinite loop due to this recursive DWARF
construct:

 [ 8bbf9]    subroutine_type      abbrev: 40
             name                 (string) "gopkg.in/sourcemap%2ev1.fn"
             byte_size            (udata) 4
             lo_user+0x900        (data1) 19
             lo_user+0x904        (addr) +0000000000
 [ 8bc1b]      formal_parameter     abbrev: 41
               type                 (ref_addr) [ 8ba8b]
 [ 8bc20]      formal_parameter     abbrev: 41
               type                 (ref_addr) [ 8bc4b]
 [ 8bc25]      formal_parameter     abbrev: 41
               type                 (ref_addr) [ 6d43e]
 [ 8bc2b]    typedef              abbrev: 39
             name                 (string) "gopkg.in/sourcemap%2ev1.fn"
             type                 (ref_addr) [ 8bbf9]
 [ 8bc4b]    pointer_type         abbrev: 43
             name                 (string) "*gopkg.in/sourcemap%2ev1.fn"
             type                 (ref_addr) [ 8bc2b]
             lo_user+0x900        (data1) 0
             lo_user+0x904        (addr) +0000000000

Note how the typedef DIE at offset [ 8bc2b] references the function
type DIE at offset [ 8bbf9], whose second parameter DIE at offset
[ 8bc20] has a pointer type described by the DIE [ 8bc4b].  This last
pointer type is a pointer to the typedef type whose DIE has the offset
[ 8bc2b], which started this paragraph.  This is a recursive
construct.

First, die_qualified_type_name in the DWARF reader unnecessarily looks
into the underlying type of a typedef.  This makes that function end
up in an infinite loop.  That is especially unfortunate because we do
not need to do that to construct the name of the typedef.  This looks
like a relic of old, unrelated code that needs to go.  This patch
removes it.

Second, when building the IR for a function type, build_function_type
also ends up in an infinite loop because it's written naively.  To fix
that, this patch does what we do to handle recursively defined
classes.  The function type IR for that function type DIE is
"forward-declared" as being "Work In Progress", aka WIP; then when a
construct references that same DIE, the WIP IR is returned.  When we
are done constructing the function type IR for that DIE, the IR is no
longer marked WIP.  That way, the infinite recursion is avoided.

Now that all function types can be represented in the IR,
function_decl::get_pretty_representation_of_declarator crashes because
it wrongly assumes that a parameter cannot have a function type.  The
patch fixes that.
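
To illustrate the WIP scheme described above for build_function_type,
here is a minimal, self-contained sketch.  It is not libabigail code:
toy_die, toy_type, toy_context and make_recursive_example are
hypothetical names made up for this illustration, and the sketch
assumes that the WIP entry is registered before the parameters are
built and dropped once they are all built.  The DIE offsets are the
ones from the dump above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Toy stand-ins for DWARF DIEs and IR nodes.  None of these names
// come from libabigail; they only exist for this illustration.
enum class kind { base, pointer, typedef_, function };

struct toy_die
{
  kind k;
  std::vector<std::size_t> refs; // offsets of the referenced type DIEs
};

struct toy_type
{
  kind k;
  std::vector<toy_type*> subtypes;
};

struct toy_context
{
  std::map<std::size_t, toy_die> dies;            // DIE offset -> DIE
  std::map<std::size_t, toy_type*> wip_fn_types;  // DIE offset -> WIP IR
  std::vector<std::unique_ptr<toy_type>> storage; // owns every IR node
  std::size_t fn_types_built = 0;

  toy_type* make(kind k)
  {
    storage.push_back(std::make_unique<toy_type>(toy_type{k, {}}));
    return storage.back().get();
  }
};

toy_type* build_type(toy_context& ctxt, std::size_t off);

toy_type* build_function_type(toy_context& ctxt, std::size_t off)
{
  assert(ctxt.dies.at(off).k == kind::function);

  // If this DIE is already being built, hand back the WIP node
  // instead of recursing into it again.
  auto i = ctxt.wip_fn_types.find(off);
  if (i != ctxt.wip_fn_types.end())
    return i->second;

  toy_type* fn = ctxt.make(kind::function);
  ++ctxt.fn_types_built;
  ctxt.wip_fn_types[off] = fn;   // "forward-declare" as WIP
  for (std::size_t parm_off : ctxt.dies.at(off).refs)
    fn->subtypes.push_back(build_type(ctxt, parm_off));
  ctxt.wip_fn_types.erase(off);  // done: no longer WIP
  return fn;
}

toy_type* build_type(toy_context& ctxt, std::size_t off)
{
  const toy_die& die = ctxt.dies.at(off);
  if (die.k == kind::function)
    return build_function_type(ctxt, off);

  toy_type* t = ctxt.make(die.k);
  for (std::size_t ref : die.refs)
    t->subtypes.push_back(build_type(ctxt, ref));
  return t;
}

// The recursive construct from the dump: typedef -> function type ->
// second parameter (pointer) -> typedef.
toy_context make_recursive_example()
{
  toy_context ctxt;
  ctxt.dies[0x8ba8b] = {kind::base, {}};
  ctxt.dies[0x6d43e] = {kind::base, {}};
  ctxt.dies[0x8bbf9] = {kind::function, {0x8ba8b, 0x8bc4b, 0x6d43e}};
  ctxt.dies[0x8bc2b] = {kind::typedef_, {0x8bbf9}};
  ctxt.dies[0x8bc4b] = {kind::pointer, {0x8bc2b}};
  return ctxt;
}
```

Calling build_type(ctxt, 0x8bc2b) on the context returned by
make_recursive_example() terminates, and the pointer parameter ends up
referring back to the very function type node that was under
construction.  Without the lookup at the top of build_function_type,
the same call would recurse forever.
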

Last but not least, it appears that the names of ELF symbols and
functions can contain characters that need to be escaped (to respect
the lexical rules of XML) in the emitted ABIXML.  The patch fixes
that.

Together, these changes make running fedabipkgdiff to compare
packages against themselves succeed on the f36 distribution, for the
following Golang packages:

    $ fedabipkgdiff --self-compare --from fc36 {containerd, bettercap, apptainer, rclone, singularity}

	* src/abg-dwarf-reader.cc (die_qualified_type_name): Don't look at
	the underlying type unnecessarily.
	(build_function_type): Look for the WIP type first to avoid
	infinite recursion.
	* src/abg-ir.cc
	(function_decl::get_pretty_representation_of_declarator): A
	parameter can have a function type.
	* src/abg-writer.cc (write_elf_symbol_reference)
	(write_function_decl): Escape symbol names, function names and
	symbol references.

Signed-off-by: Dodji Seketeli
---
 src/abg-dwarf-reader.cc | 26 +++++++++++---------------
 src/abg-ir.cc           |  3 +--
 src/abg-writer.cc       |  8 +++++---
 3 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index c6ba838c..21d2e11d 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -9780,21 +9780,6 @@ die_qualified_type_name(const read_context& ctxt,
     case DW_TAG_class_type:
     case DW_TAG_union_type:
       {
-        if (tag == DW_TAG_typedef)
-          {
-            // If the underlying type of the typedef is unspecified,
-            // bail out as we don't support that yet.
-            Dwarf_Die underlying_type_die;
-            if (die_die_attribute(die, DW_AT_type, underlying_type_die))
-              {
-                string n = die_qualified_type_name(ctxt, &underlying_type_die,
-                                                   where_offset);
-                if (die_is_unspecified(&underlying_type_die)
-                    || n.empty())
-                  break;
-              }
-          }
-
         if (name.empty())
           // TODO: handle cases where there are more than one
           // anonymous type of the same kind in the same scope.  In
@@ -14566,6 +14551,17 @@ build_function_type(read_context& ctxt,
 
   const die_source source = ctxt.get_die_source(die);
 
+  {
+    size_t off = dwarf_dieoffset(die);
+    auto i = ctxt.die_wip_function_types_map(source).find(off);
+    if (i != ctxt.die_wip_function_types_map(source).end())
+      {
+        function_type_sptr fn_type = is_function_type(i->second);
+        ABG_ASSERT(fn_type);
+        return fn_type;
+      }
+  }
+
   decl_base_sptr type_decl;
 
   translation_unit_sptr tu = ctxt.cur_transl_unit();
diff --git a/src/abg-ir.cc b/src/abg-ir.cc
index 8ad870af..1cd2a219 100644
--- a/src/abg-ir.cc
+++ b/src/abg-ir.cc
@@ -20483,8 +20483,7 @@ function_decl::get_pretty_representation_of_declarator (bool internal) const
           type_base_sptr type = parm->get_type();
           if (internal)
             type = peel_typedef_type(type);
-          decl_base_sptr type_decl = get_type_declaration(type);
-          result += type_decl->get_qualified_name(internal);
+          result += get_type_name(type, /*qualified=*/true, internal);
         }
     }
   result += ")";
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index a6166f5a..dff8813a 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -1758,7 +1758,9 @@ write_elf_symbol_reference(const elf_symbol& sym, ostream& o)
   // If all aliases are suppressed, just stick with the main symbol.
   if (!found)
     alias = main;
-  o << " elf-symbol-id='" << alias->get_id_string() << "'";
+  o << " elf-symbol-id='"
+    << xml::escape_xml_string(alias->get_id_string())
+    << "'";
   return true;
 }
 
@@ -3101,7 +3103,7 @@ write_elf_symbol(const elf_symbol_sptr& sym,
   annotate(sym, ctxt, indent);
   do_indent(o, indent);
 
-  o << "<elf-symbol name='" << sym->get_name() << "'";
+  o << "<elf-symbol name='" << xml::escape_xml_string(sym->get_name()) << "'";
   if (sym->is_variable() && sym->get_size())
   o << " size='" << sym->get_size() << "'";
 
@@ -3400,7 +3402,7 @@ write_function_decl(const function_decl_sptr& decl, write_context& ctxt,
           ctxt.record_type_as_referenced(parm_type);
 
           if (ctxt.get_write_parameter_names() && !(*pi)->get_name().empty())
-            o << " name='" << (*pi)->get_name() << "'";
+            o << " name='" << xml::escape_xml_string((*pi)->get_name()) << "'";
         }
       write_is_artificial(*pi, o);
       write_location((*pi)->get_location(), ctxt);
-- 
2.37.2

-- 
		Dodji