From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-f45.google.com (mail-wr1-f45.google.com [209.85.221.45]) by sourceware.org (Postfix) with ESMTPS id 8E11E3858C39 for ; Wed, 6 Jul 2022 19:20:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 8E11E3858C39 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wr1-f45.google.com with SMTP id v16so11939900wrd.13 for ; Wed, 06 Jul 2022 12:20:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=OzlYRFhNbxloTkjqYU+hc9Z/5KxsNOf0Z/iolyDV6w8=; b=0vhUczSj26ssXefPutl3VQp5aEEexs749nYg28Ds3cL7GLmSA2cKL60oN2n5ieDS2G qj3gvac5i8jN16LWGzCPdv6Cv+j8M0DPIosXGC3IfSXAfahSfCLerBp8/NFwtaa59/kz IUgtjLQcrYx1gzXKaAWCtMaoe10sTkKaipoJayE2yYUo0p060T4n47GKiUH5zi1X7JcG wt+f0tBj+yctgt0h54mwGen4RALqPumovecM4j9QVwiO07qsYAf1Q7v+ScB4mP63iV+O RnvfbzE95gaUGd0LFwPABGkqwFQjlpfpTxUUFSVkT+7bjZtVRgKUAJbDE6U/E8snc25x 3LMg== X-Gm-Message-State: AJIora+i7FHl6ayqHv0j5JnO2lhLTKqKVjbcGnniN19qRkfIZd/dGux6 wh0gqAkC57ZaWBET//JglmA9KADFXbI= X-Google-Smtp-Source: AGRyM1t9gMxSdk3eo/eJf0xHxVkhkPr8BKN2+YjUsuQN1mRzwpLv8vYAoIRIPFGbeL3eG9KkljS0mA== X-Received: by 2002:a5d:5284:0:b0:21d:7f63:25c0 with SMTP id c4-20020a5d5284000000b0021d7f6325c0mr3346929wrv.674.1657135245224; Wed, 06 Jul 2022 12:20:45 -0700 (PDT) Received: from ?IPv6:2001:8a0:f924:2600:209d:85e2:409e:8726? ([2001:8a0:f924:2600:209d:85e2:409e:8726]) by smtp.gmail.com with ESMTPSA id k23-20020a7bc317000000b003976fbfbf00sm27545380wmj.30.2022.07.06.12.20.43 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 06 Jul 2022 12:20:44 -0700 (PDT) Subject: [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields To: Tom de Vries , Tom Tromey Cc: gdb-patches@sourceware.org References: <20220629152914.13149-1-tdevries@suse.de> <20220629152914.13149-3-tdevries@suse.de> <8735fgvie4.fsf@tromey.com> <8c6ba70f-305d-c3ed-591d-36f1f38d52a6@suse.de> From: Pedro Alves Message-ID: Date: Wed, 6 Jul 2022 20:20:42 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: <8c6ba70f-305d-c3ed-591d-36f1f38d52a6@suse.de> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-8.6 required=5.0 tests=BAYES_00, BODY_8BITS, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2022 19:20:50 -0000 On 2022-07-04 8:45 p.m., Tom de Vries via Gdb-patches wrote: > On 7/4/22 20:32, Tom Tromey wrote: >>>>>>> "Tom" == Tom de Vries writes: >> >> Tom>  /* The number of bits needed to represent all languages, with enough >> Tom>     padding to allow for reasonable growth.  */ >> Tom> -#define LANGUAGE_BITS 5 >> Tom> +#define LANGUAGE_BITS 8 >> >> This will negatively affect the size of symbols and so I think it should >> be avoided. >> > > Ack, Pedro suggested a way to avoid this: > ... > +  struct { > +    /* The language of this CU.  */ > +    ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS; > +  }; > ... > It actually doesn't avoid it in this case, as the following field will end up moved to the next byte, so if LANGUAGE_BITS is 5, we'll end up with 3 bits gap. Actually, it's worse than that -- it will align m_lang to ENUM_BITFIELD(language)'s natural alignment, so it can introduce byte padding before and after too. :-/ :-( We can see it with "ptype /o" after applying your patch using the struct{} trick. Note the "3-byte padding" below: ... /* 13: 5 | 1 */ bool m_header_read_in : 1; /* XXX 2-bit hole */ /* 14 | 1 */ struct { /* 14: 0 | 1 */ bool addresses_seen : 1; /* XXX 7-bit padding */ /* total size (bytes): 1 */ }; /* 15: 0 | 4 */ unsigned int mark : 1; /* 15: 1 | 1 */ bool files_read : 1; /* XXX 6-bit hole */ /* 16 | 4 */ struct { /* 16: 0 | 4 */ dwarf_unit_type unit_type : 8; /* XXX 3-byte padding */ <<<<<< 3 bytes /* total size (bytes): 4 */ }; /* 20 | 4 */ struct { /* 20: 0 | 4 */ language lang : 5; /* XXX 3-bit padding */ /* XXX 3-byte padding */ <<<<<< 3 bytes /* total size (bytes): 4 */ }; ... So, maybe we really want something else... How about this alternative patch below? I wrote the new struct packed template using the array for storage before I wrote the alternative to use __attribute__((packed)), so I left the array version in there, as a pure standards-conforming implementation. We can just remove that array implementation completely, and use the ATTRIBUTE_PACKED macro if you don't think it's worth it. All compilers worth their salt support attribute packed, or something like it, I believe. The resulting gdbsupport/pack.h would be quite smaller. I tested this on x86-64 Ubuntu 20.04, both attribute packed no attribute versions, saw no regressions. Also smoke-tested with Clang (which uses the attribute packed implementation too, as it defines __GNUC__). I have not actually tested this with -fsanitize=thread, though. Would you be up for testing that, Tom, if this approach looks reasonable? Note that with this, struct dwarf2_per_cu_data is still 120 bytes, there's no growth: (top-gdb) ptype /o dwarf2_per_cu_data /* offset | size */ type = struct dwarf2_per_cu_data { public: /* 0 | 8 */ sect_offset sect_off; /* 8 | 4 */ unsigned int length; private: /* 12 | 1 */ unsigned char m_dwarf_version; public: /* 13: 0 | 4 */ unsigned int queued : 1; /* 13: 1 | 4 */ unsigned int is_debug_types : 1; /* 13: 2 | 4 */ unsigned int is_dwz : 1; /* 13: 3 | 4 */ unsigned int reading_dwo_directly : 1; /* 13: 4 | 4 */ unsigned int tu_read : 1; /* 13: 5 | 1 */ bool m_header_read_in : 1; /* 13: 6 | 4 */ unsigned int mark : 1; /* 13: 7 | 1 */ bool files_read : 1; /* 14 | 1 */ struct packed [with T = bool] { private: /* 14 | 1 */ struct { /* 14: 0 | 1 */ T m_val : 8; /* total size (bytes): 1 */ }; /* total size (bytes): 1 */ } addresses_seen; private: /* 15 | 1 */ struct packed [with T = dwarf_unit_type] { private: /* 15 | 1 */ struct { /* 15: 0 | 4 */ T m_val : 8; /* total size (bytes): 1 */ }; /* total size (bytes): 1 */ } m_unit_type; /* 16 | 1 */ struct packed [with T = language] { private: /* 16 | 1 */ struct { /* 16: 0 | 4 */ T m_val : 8; /* total size (bytes): 1 */ }; /* total size (bytes): 1 */ } m_lang; public: /* 17 | 1 */ struct std::atomic [with _Tp = bool] { private: /* 17 | 1 */ struct std::__atomic_base<_Tp> [with _IntTp = _Tp] { private: static const int _S_alignment; /* 17 | 1 */ __int_type _M_i; /* total size (bytes): 1 */ } _M_base; /* total size (bytes): 1 */ } scanned; /* XXX 2-byte hole */ /* 20 | 4 */ unsigned int index; /* 24 | 8 */ dwarf2_section_info *section; /* 32 | 8 */ dwarf2_per_bfd *per_bfd; /* 40 | 56 */ struct comp_unit_head { /* 40 | 4 */ unsigned int length; /* 44 | 1 */ unsigned char version; /* 45 | 1 */ unsigned char addr_size; /* 46 | 1 */ unsigned char signed_addr_p; /* XXX 1-byte hole */ /* 48 | 8 */ sect_offset abbrev_sect_off; /* 56 | 4 */ unsigned int offset_size; /* 60 | 4 */ unsigned int initial_length_size; /* 64 | 4 */ dwarf_unit_type unit_type; /* 68 | 4 */ cu_offset first_die_cu_offset; /* 72 | 8 */ sect_offset sect_off; /* 80 | 4 */ cu_offset type_cu_offset_in_tu; /* XXX 4-byte hole */ /* 88 | 8 */ ULONGEST signature; /* total size (bytes): 56 */ } m_header; /* 96 | 8 */ std::unique_ptr fnd; /* 104 | 8 */ During symbol reading: Child DIE 0x1d6f34b and its abstract origin 0x1d6f380 have different parents quick_file_names *file_names; /* 112 | 8 */ std::vector *imported_symtabs; /* total size (bytes): 120 */ } (top-gdb) Patch follows: >From c65b4a71a08835dd46c16443f84793e982efdc8a Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Wed, 6 Jul 2022 14:46:49 +0100 Subject: [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields Change-Id: Ifa94f0a2cebfae5e8f6ddc73265f05e7fd9e1532 --- gdb/defs.h | 3 ++ gdb/dwarf2/read.h | 20 +++++----- gdbsupport/packed.h | 92 +++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 106 insertions(+), 9 deletions(-) create mode 100644 gdbsupport/packed.h diff --git a/gdb/defs.h b/gdb/defs.h index 99bfdd526ff..19f379d6588 100644 --- a/gdb/defs.h +++ b/gdb/defs.h @@ -232,6 +232,9 @@ enum language #define LANGUAGE_BITS 5 gdb_static_assert (nr_languages <= (1 << LANGUAGE_BITS)); +/* The number of bytes needed to represent all languages. */ +#define LANGUAGE_BYTES ((LANGUAGE_BITS + HOST_CHAR_BIT - 1) / HOST_CHAR_BIT) + enum precision_type { single_precision, diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h index 1d9c66aafad..f98d8b27649 100644 --- a/gdb/dwarf2/read.h +++ b/gdb/dwarf2/read.h @@ -33,6 +33,7 @@ #include "gdbsupport/gdb_obstack.h" #include "gdbsupport/hash_enum.h" #include "gdbsupport/function-view.h" +#include "gdbsupport/packed.h" /* Hold 'maintenance (set|show) dwarf' commands. */ extern struct cmd_list_element *set_dwarf_cmdlist; @@ -105,11 +106,8 @@ struct dwarf2_per_cu_data reading_dwo_directly (false), tu_read (false), m_header_read_in (false), - addresses_seen (false), mark (false), files_read (false), - m_unit_type {}, - m_lang (language_unknown), scanned (false) { } @@ -161,10 +159,6 @@ struct dwarf2_per_cu_data it private at the moment. */ mutable bool m_header_read_in : 1; - /* If addresses have been read for this CU (usually from - .debug_aranges), then this flag is set. */ - bool addresses_seen : 1; - /* A temporary mark bit used when iterating over all CUs in expand_symtabs_matching. */ unsigned int mark : 1; @@ -173,12 +167,20 @@ struct dwarf2_per_cu_data point in trying to read it again next time. */ bool files_read : 1; + /* Wrap the following in struct packed instead of bitfields to avoid + data races when the bitfields end up on the same memory location + (per C++ memory model). */ + + /* If addresses have been read for this CU (usually from + .debug_aranges), then this flag is set. */ + packed addresses_seen = false; + private: /* The unit type of this CU. */ - ENUM_BITFIELD (dwarf_unit_type) m_unit_type : 8; + packed m_unit_type = (dwarf_unit_type) 0; /* The language of this CU. */ - ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS; + packed m_lang = language_unknown; public: /* True if this CU has been scanned by the indexer; false if diff --git a/gdbsupport/packed.h b/gdbsupport/packed.h new file mode 100644 index 00000000000..a41c408a44d --- /dev/null +++ b/gdbsupport/packed.h @@ -0,0 +1,92 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifndef PACKED_H +#define PACKED_H + +/* Each instantiation and full specialization of the packed template + defines a type that behaves like a given scalar type, but that has + byte alignment, and, may optionally have a smaller size than the + given scalar type. This is typically used as alternative to + bit-fields (and ENUM_BITFIELD), when the fields must have separate + memory locations to avoid data races. */ + +/* Use __attribute__((packed)) when possible to make it more + convenient to print the field when debugging GDB. */ +#ifdef __GNUC__ +# define PACKED_USE_ATTRIBUTE_PACKED 1 +#else +# define PACKED_USE_ATTRIBUTE_PACKED 0 +#endif + +#if !PACKED_USE_ATTRIBUTE_PACKED + +template struct bytes_to_type; + +template<> struct bytes_to_type<1> +{ + typedef uint8_t type; +}; + +template<> struct bytes_to_type<2> +{ + typedef uint16_t type; +}; + +template<> struct bytes_to_type<4> +{ + typedef uint32_t type; +}; +#endif + +template +struct packed +{ +public: + packed (T val) + { +#if PACKED_USE_ATTRIBUTE_PACKED + m_val = val; +#else + typename bytes_to_type::type tmp; + tmp = val; + memcpy (m_val, &tmp, Bytes); +#endif + } + + operator T () const + { +#if PACKED_USE_ATTRIBUTE_PACKED + return (T) m_val; +#else + typename bytes_to_type::type tmp; + memcpy (&tmp, m_val, Bytes); + return (T) tmp; +#endif + } + +private: +#if PACKED_USE_ATTRIBUTE_PACKED + T m_val : Bytes * 8 __attribute__ ((packed)); +#else + unsigned char m_val[Bytes]; + static_assert (sizeof (typename bytes_to_type::type) == sizeof (m_val), + "sizes must match"); +#endif +}; + +#endif base-commit: 8bd915b770e12eff67f6218f2d727069d04d5752 -- 2.36.0