From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 2205) id C6F36385829F; Tue, 12 Jul 2022 10:20:22 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C6F36385829F Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable From: Tom de Vries To: gdb-cvs@sourceware.org Subject: [binutils-gdb] Fix -fsanitize=thread for per_cu fields X-Act-Checkin: binutils-gdb X-Git-Author: Tom de Vries X-Git-Refname: refs/heads/master X-Git-Oldrev: 68c0faca76a9b179af1a098946d6020091f43e06 X-Git-Newrev: ac3972d81f1065a156daa0da97320262845c2469 Message-Id: <20220712102022.C6F36385829F@sourceware.org> Date: Tue, 12 Jul 2022 10:20:22 +0000 (GMT) X-BeenThere: gdb-cvs@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-cvs mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Jul 2022 10:20:22 -0000 https://sourceware.org/git/gitweb.cgi?p=3Dbinutils-gdb.git;h=3Dac3972d81f10= 65a156daa0da97320262845c2469 commit ac3972d81f1065a156daa0da97320262845c2469 Author: Tom de Vries Date: Tue Jul 12 12:20:13 2022 +0200 Fix -fsanitize=3Dthread for per_cu fields =20 When building gdb with -fsanitize=3Dthread and gcc 12, and running test= -case gdb.dwarf2/dwz.exp, we run into a data race between: ... Read of size 1 at 0x7b200000300d by thread T2:^M #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile= *, \ abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6= 164 \ (gdb+0x82ec95)^M ... and: ... Previous write of size 1 at 0x7b200000300d by main thread:^M #0 prepare_one_comp_unit gdb/dwarf2/read.c:23588 (gdb+0x86f973)^M ... =20 In other words, between: ... if (this_cu->reading_dwo_directly) ... and: ... cu->per_cu->lang =3D pretend_language; ... =20 Likewise, we run into a data race between: ... Write of size 1 at 0x7b200000300e by thread T4: #0 process_psymtab_comp_unit gdb/dwarf2/read.c:6789 (gdb+0x830720) ... and: ... Previous read of size 1 at 0x7b200000300e by main thread: #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile= *, \ abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6= 164 \ (gdb+0x82edab) ... =20 In other words, between: ... this_cu->unit_type =3D DW_UT_partial; ... and: ... if (this_cu->reading_dwo_directly) ... =20 Likewise for the write to addresses_seen in cooked_indexer::check_bound= s and a read from is_dwz in dwarf2_find_containing_comp_unit for test-case gdb.dwarf2/dw2-dir-file-name.exp and target board cc-with-dwz-m. =20 The problem is that the written fields are part of the same memory loca= tion as the read fields, so executing a read and write in different threads is undefined behavour. =20 Making the written fields separate memory locations, using the new struct packed template fixes this. =20 The set of fields has been established experimentally to be the minimal set to get rid of this type of -fsanitize=3Dthread errors, but more fields might require the same treatment. =20 Looking at the properties of the lang field, unlike dwarf_version it's not available in the unit header, so it will be set the first time during the parallel cooked index reading. The same holds for unit_type, and likewise for addresses_seen. =20 dwarf2_per_cu_data::addresses_seen is moved so that the bitfields that currently follow it can be merged in the same memory location as the bitfields that currently precede it, for better packing. =20 Tested on x86_64-linux. =20 Co-Authored-By: Pedro Alves =20 Change-Id: Ifa94f0a2cebfae5e8f6ddc73265f05e7fd9e1532 Diff: --- gdb/defs.h | 3 +++ gdb/dwarf2/read.h | 20 +++++++++++--------- 2 files changed, 14 insertions(+), 9 deletions(-) diff --git a/gdb/defs.h b/gdb/defs.h index 99bfdd526ff..19f379d6588 100644 --- a/gdb/defs.h +++ b/gdb/defs.h @@ -232,6 +232,9 @@ enum language #define LANGUAGE_BITS 5 gdb_static_assert (nr_languages <=3D (1 << LANGUAGE_BITS)); =20 +/* The number of bytes needed to represent all languages. */ +#define LANGUAGE_BYTES ((LANGUAGE_BITS + HOST_CHAR_BIT - 1) / HOST_CHAR_BI= T) + enum precision_type { single_precision, diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h index 5e0ce5e91a2..707dca9ee57 100644 --- a/gdb/dwarf2/read.h +++ b/gdb/dwarf2/read.h @@ -33,6 +33,7 @@ #include "gdbsupport/gdb_obstack.h" #include "gdbsupport/hash_enum.h" #include "gdbsupport/function-view.h" +#include "gdbsupport/packed.h" =20 /* Hold 'maintenance (set|show) dwarf' commands. */ extern struct cmd_list_element *set_dwarf_cmdlist; @@ -105,11 +106,8 @@ struct dwarf2_per_cu_data reading_dwo_directly (false), tu_read (false), m_header_read_in (false), - addresses_seen (false), mark (false), files_read (false), - m_unit_type {}, - m_lang (language_unknown), scanned (false) { } @@ -162,10 +160,6 @@ public: it private at the moment. */ mutable bool m_header_read_in : 1; =20 - /* If addresses have been read for this CU (usually from - .debug_aranges), then this flag is set. */ - bool addresses_seen : 1; - /* A temporary mark bit used when iterating over all CUs in expand_symtabs_matching. */ unsigned int mark : 1; @@ -174,12 +168,20 @@ public: point in trying to read it again next time. */ bool files_read : 1; =20 + /* Wrap the following in struct packed instead of bitfields to avoid + data races when the bitfields end up on the same memory location + (per C++ memory model). */ + + /* If addresses have been read for this CU (usually from + .debug_aranges), then this flag is set. */ + packed addresses_seen =3D false; + private: /* The unit type of this CU. */ - ENUM_BITFIELD (dwarf_unit_type) m_unit_type : 8; + packed m_unit_type =3D (dwarf_unit_type) 0; =20 /* The language of this CU. */ - ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS; + packed m_lang =3D language_unknown; =20 public: /* True if this CU has been scanned by the indexer; false if