From: Dodji Seketeli
To: libabigail@sourceware.org
Subject: [PATCH, applied] Bug 29934 - Handle buggy data members with empty names
Organization: Red Hat / France
Date: Sat, 24 Dec 2022 15:09:08 +0100
Message-ID: <871qoodhbv.fsf@redhat.com>

Hello,

When handling the changes between two ABI corpora, the diff graph
building pass chokes on a data member which seems to erroneously
have an empty name.

The steps to reproduce the issue are explained in the problem report
at https://sourceware.org/bugzilla/show_bug.cgi?id=29934.

The root cause of the problem is that the "struct mailbox" data
structure from the binary /usr/lib64/dovecot/libdovecot-storage.so.0
coming from the dovecot-2.2.36-3.el7.x86_64.rpm package contains a
data member that has an empty name.

The source code of that data structure can be browsed at
https://github.com/dovecot/core/blob/release-2.2.36/src/lib-storage/mail-storage-private.h

We see that the mailbox::storage data member at line 352 ends up in
the DWARF debug info with an empty name.

Let's look at the DWARF dump as emitted by
"eu-readelf --debug-dump=info" on the
/usr/lib/debug//usr/lib64/dovecot/libdovecot-storage.so.0.debug file
from the dovecot-debuginfo-2.2.36-3.el7.x86_64.rpm package.

A relevant DIE for the "struct mailbox" is the following:

     [  3e3e]    structure_type       abbrev: 9
                 name                 (strp) "mailbox"
                 byte_size            (data2) 768
                 decl_file            (data1) mail-storage-private.h (24)
                 decl_line            (data2) 348
                 sibling              (ref_udata) [  41f9]
     [  3e4a]      member               abbrev: 59
                   name                 (strp) "name"
                   decl_file            (data1) mail-storage-private.h (24)
                   decl_line            (data2) 349
                   type                 (ref_addr) [    84]
                   data_member_location (data1) 0
     [  3e57]      member               abbrev: 59
                   name                 (strp) "vname"
                   decl_file            (data1) mail-storage-private.h (24)
                   decl_line            (data2) 351
                   type                 (ref_addr) [    84]
                   data_member_location (data1) 8
     [  3e64]      member               abbrev: 47
                   name                 (strp) ""
                   decl_file            (data1) mail-storage-private.h (24)
                   decl_line            (data2) 352
                   type                 (ref_udata) [  3bee]
                   data_member_location (data1) 16
    [...]

You can see here that the DW_TAG_member DIE at offset 0x3e64 has an
empty name.  Its DW_AT_type attribute references the DIE at offset
0x3bee.

The DIE at offset 0x3bee is this one:

     [  3bee]    pointer_type         abbrev: 95
                 byte_size            (data1) 8
                 type                 (ref_udata) [  3a90]
    [...]

It's a pointer to the type whose DIE is at offset 0x3a90, which is:

     [  3a90]    structure_type       abbrev: 48
                 name                 (strp) "mail_storage"
                 byte_size            (data2) 352
                 decl_file            (data1) mail-storage-private.h (24)
                 decl_line            (data1) 132
                 sibling              (ref_udata) [  3bee]

So, the data member of "struct mailbox" which has an empty name has
the type "pointer to struct mail_storage", aka "struct mail_storage*".

That indeed corresponds to the "storage" data member that we see at
line 352 of the mail-storage-private.h file, browsable at
https://github.com/dovecot/core/blob/release-2.2.36/src/lib-storage/mail-storage-private.h.

The fact that this data member has an empty name seems to me to be a
bug in the DWARF emitter.

Libabigail ought to handle this bug gracefully instead of choking.

This patch assigns an artificial name to that empty-named data member
so that this kind of case is handled in the future.  The name looks
like "unnamed-@-<location>" where "location" is the location of the
data member.

Please note that there can be normal cases of anonymous data members
where the data member has an empty name.  In those cases, the data
member must be of type union or struct.  This describes the "unnamed
fields" C feature documented at
https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html.

The buggy case we are seeing here is different from the "unnamed
field" case because the type of the anonymous data member is neither
struct nor union.

	* src/abg-dwarf-reader.cc (die_is_anonymous_data_member): Define
	new static function.
	(die_member_offset): Move the declaration of this up so that it
	can be used more generally.
	(reader::build_name_for_buggy_anonymous_data_member): Define new
	member function.
	(add_or_update_class_type): Generate an artificial name for buggy
	data members with empty names.

Signed-off-by: Dodji Seketeli
---
 src/abg-dwarf-reader.cc | 100 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 37cd5583..5df1516d 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -382,6 +382,9 @@ get_scope_die(const reader& rdr,
 static bool
 die_is_anonymous(const Dwarf_Die* die);
 
+static bool
+die_is_anonymous_data_member(const Dwarf_Die* die);
+
 static bool
 die_is_type(const Dwarf_Die* die);
 
@@ -486,6 +489,11 @@ die_constant_attribute(const Dwarf_Die *die,
                        bool is_signed,
                        array_type_def::subrange_type::bound_value &value);
 
+static bool
+die_member_offset(const reader& rdr,
+                  const Dwarf_Die* die,
+                  int64_t& offset);
+
 static bool
 form_is_DW_FORM_strx(unsigned form);
 
@@ -3876,6 +3884,61 @@ public:
     return (i != die_wip_function_types_map(source).end());
   }
 
+  /// Sometimes, a data member die can erroneously have an empty name as
+  /// a result of a bug of the DWARF emitter.
+  ///
+  /// This is what happens in
+  /// https://sourceware.org/bugzilla/show_bug.cgi?id=29934.
+  ///
+  /// In that case, this function constructs an artificial name for that
+  /// data member.  The pattern of the name is as follows:
+  ///
+  ///          "unnamed-@-<location>".
+  ///
+  ///location is either the value of the data member location of the
+  ///data member if it has one or concatenation of its source location
+  ///if it has none.  If no location can be calculated then the function
+  ///returns the empty string.
+  string
+  build_name_for_buggy_anonymous_data_member(Dwarf_Die *die)
+  {
+    string result;
+    // Let's make sure we are looking at a data member with an empty
+    // name ...
+    if (!die
+        || dwarf_tag(die) != DW_TAG_member
+        || !die_name(die).empty())
+      return result;
+
+    // ... and yet, it's not an anonymous data member (aka unnamed
+    // field) as described in
+    // https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html.
+    if (die_is_anonymous_data_member(die))
+      return result;
+
+    // If we come this far, it means we are looking at a buggy data
+    // member with no name.  Let's build a name for it so that it can be
+    // addressed.
+    int64_t offset_in_bits = 0;
+    bool has_offset = die_member_offset(*this, die, offset_in_bits);
+    location loc;
+    if (!has_offset)
+      {
+        loc = die_location(*this, die);
+        if (!loc)
+          return result;
+      }
+
+    std::ostringstream o;
+    o << "unnamed-dm-@-";
+    if (has_offset)
+      o << "offset-" << offset_in_bits << "bits";
+    else
+      o << "loc-" << loc.expand();
+
+    return o.str();
+  }
+
   /// Getter for the map of declaration-only classes that are to be
   /// resolved to their definition classes by the end of the corpus
   /// loading.
@@ -5853,6 +5916,33 @@ die_is_anonymous(const Dwarf_Die* die)
   return false;
 }
 
+/// Test if a DIE is an anonymous data member, aka, "unnamed field".
+///
+/// Unnamed fields are specified at
+/// https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html.
+///
+/// @param die the DIE to consider.
+///
+/// @return true iff @p die is an anonymous data member.
+static bool
+die_is_anonymous_data_member(const Dwarf_Die* die)
+{
+  if (!die
+      || dwarf_tag(const_cast<Dwarf_Die*>(die)) != DW_TAG_member
+      || !die_name(die).empty())
+    return false;
+
+  Dwarf_Die type_die;
+  if (!die_die_attribute(die, DW_AT_type, type_die))
+    return false;
+
+  if (dwarf_tag(&type_die) != DW_TAG_structure_type
+      && dwarf_tag(&type_die) != DW_TAG_union_type)
+    return false;
+
+  return true;
+}
+
 /// Get the value of an attribute that is supposed to be a string, or
 /// an empty string if the attribute could not be found.
 ///
@@ -13009,6 +13099,16 @@ add_or_update_class_type(reader& rdr,
               if (!t)
                 continue;
 
+              if (n.empty() && !die_is_anonymous_data_member(&child))
+                {
+                  // We must be in a case where the data member has an
+                  // empty name because the DWARF emitter has a bug.
+                  // Let's generate an artificial name for that data
+                  // member.
+                  n = rdr.build_name_for_buggy_anonymous_data_member(&child);
+                  ABG_ASSERT(!n.empty());
+                }
+
               // The call to build_ir_node_from_die above could have
               // triggered the adding of a data member named 'n' into
               // result.  So let's check again if the variable is
-- 
2.31.1

-- 
		Dodji
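
For readers who want to try the naming rule without building
libabigail, here is a minimal standalone sketch of the decision the
patch makes.  It is not the libabigail code: member_info, type_kind,
is_anonymous_data_member and build_name_for_buggy_member are
hypothetical stand-ins for what the patch reads from the DWARF DIE, and
the source location is modelled as a plain string.  It assumes, as the
variable name offset_in_bits in the patch suggests, that the member
offset is expressed in bits.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical, simplified model of a DW_TAG_member DIE.  None of
// these names exist in libabigail; they are for illustration only.
enum class type_kind { structure, union_, pointer, other };

struct member_info
{
  std::string  name;        // DW_AT_name, possibly empty
  type_kind    type;        // tag of the DIE that DW_AT_type refers to
  bool         has_offset;  // whether a member offset could be computed
  std::int64_t offset_bits; // the offset in bits, valid if has_offset
  std::string  source_loc;  // expanded source location, possibly empty
};

// An empty name is legitimate only for an "unnamed field", that is, a
// member whose type is a struct or a union.
bool
is_anonymous_data_member(const member_info& m)
{
  return m.name.empty()
    && (m.type == type_kind::structure || m.type == type_kind::union_);
}

// Build an artificial name for a member whose empty name is not
// explained by the unnamed-field feature.  Return the empty string
// when the member is not buggy or when no location is available.
std::string
build_name_for_buggy_member(const member_info& m)
{
  if (!m.name.empty() || is_anonymous_data_member(m))
    return "";

  std::ostringstream o;
  o << "unnamed-dm-@-";
  if (m.has_offset)
    o << "offset-" << m.offset_bits << "bits";
  else if (!m.source_loc.empty())
    o << "loc-" << m.source_loc;
  else
    return "";

  return o.str();
}
```

With this model, the "storage" member from the bug report (empty name,
pointer type, byte offset 16, so 128 bits under the assumption above)
would be described as {"", type_kind::pointer, true, 128, ""}, whereas
a genuine unnamed union field gets no artificial name at all.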