[gdb/symtab] Fix -fsanitize=thread errors for per_cu fields

When building gdb with -fsanitize=thread and gcc 12, and running test-case
gdb.dwarf2/dwz.exp, we run into a data race between:
...
  Read of size 1 at 0x7b200000300d by thread T2:
    #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
    abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
    (gdb+0x82ec95)
...
and:
...
  Previous write of size 1 at 0x7b200000300d by main thread:
    #0 prepare_one_comp_unit gdb/dwarf2/read.c:23588 (gdb+0x86f973)
...

In other words, between:
...
  if (this_cu->reading_dwo_directly)
...
and:
...
    cu->per_cu->lang = pretend_language;
...

Likewise, we run into a data race between:
...
  Write of size 1 at 0x7b200000300e by thread T4:
    #0 process_psymtab_comp_unit gdb/dwarf2/read.c:6789 (gdb+0x830720)
...
and:
...
  Previous read of size 1 at 0x7b200000300e by main thread:
    #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
    abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
    (gdb+0x82edab)
...

In other words, between:
...
      this_cu->unit_type = DW_UT_partial;
...
and:
...
  if (this_cu->reading_dwo_directly)
...

Likewise for the write to addresses_seen in cooked_indexer::check_bounds and
a read from is_dwz in dwarf2_find_containing_comp_unit for test-case
gdb.dwarf2/dw2-dir-file-name.exp and target board cc-with-dwz-m.

The problem is that the written fields are part of the same memory location
as the read fields, so executing a read and a write in different threads is
undefined behaviour.

Making the written fields separate memory locations fixes it:
...
  struct {
    ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
  };
  struct {
    ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
  };
  struct {
    bool addresses_seen : 1;
  };
...

This increases the size of struct dwarf2_per_cu_data from 120 to 128 (for
-m64).
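
As a rough illustration of the memory-location point above, here is a minimal
sketch using hypothetical stand-in types (packed_flags, split_flags and the
*_field wrappers are not gdb code).  In the C++ memory model, a run of
adjacent non-zero-width bit-fields forms one memory location, while a
bit-field inside a nested struct is a separate one.  The sketch uses named
wrapper members to stay within standard C++; the patch itself uses anonymous
structs, which GCC and Clang accept as an extension, so that member access
stays unchanged.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the flag area before the change: these four
// adjacent bit-fields together form a single memory location, so a write
// to one and a read of another from different threads is a data race.
struct packed_flags
{
  bool reading_directly : 1;
  bool seen : 1;
  unsigned char kind : 8;
  unsigned char lang : 5;
};

// Wrapping a bit-field in its own struct makes it a separate memory
// location from the neighbouring bit-fields.
struct seen_field { bool value : 1; };
struct kind_field { unsigned char value : 8; };
struct lang_field { unsigned char value : 5; };

// Hypothetical stand-in for the flag area after the change.
struct split_flags
{
  bool reading_directly : 1;
  seen_field seen;
  kind_field kind;
  lang_field lang;
};

// Each wrapper is a distinct member object with its own storage, so none
// of them shares a byte with reading_directly or with each other.
static_assert (offsetof (split_flags, seen) >= 1, "seen has own storage");
static_assert (offsetof (split_flags, kind) > offsetof (split_flags, seen),
	       "kind has own storage");
static_assert (offsetof (split_flags, lang) > offsetof (split_flags, kind),
	       "lang has own storage");

// Single-threaded sanity check: writes to the wrapped fields do not
// disturb the remaining plain bit-field or each other.
bool split_fields_hold_values ()
{
  split_flags f {};
  f.seen.value = true;
  f.kind.value = 3;
  f.lang.value = 7;
  return (!f.reading_directly && f.seen.value
	  && f.kind.value == 3 && f.lang.value == 7);
}
```

The separation costs padding: each wrapper occupies at least a byte of its
own, which is the same kind of growth as the size increase reported above.
The exact sizes depend on the ABI, so the sketch does not assert them.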

The set of fields has been established experimentally to be the minimal set
needed to get rid of this type of -fsanitize=thread error, but more fields
might require the same treatment.

Looking at the properties of the lang field, unlike dwarf_version it's not
available in the unit header, so it will be set the first time during the
parallel cooked index reading.  The same holds for unit_type, and likewise
for addresses_seen.

Tested on x86_64-linux.
---
 gdb/dwarf2/read.h | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index b7a03933aa5..c4b007d064d 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -163,7 +163,9 @@ struct dwarf2_per_cu_data
 
   /* If addresses have been read for this CU (usually from
      .debug_aranges), then this flag is set.  */
-  bool addresses_seen : 1;
+  struct {
+    bool addresses_seen : 1;
+  };
 
   /* A temporary mark bit used when iterating over all CUs in
      expand_symtabs_matching.  */
@@ -173,11 +175,16 @@ struct dwarf2_per_cu_data
      point in trying to read it again next time.  */
   bool files_read : 1;
 
-  /* The unit type of this CU.  */
-  ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
+  /* Put this in a struct to ensure a separate memory location.  */
+  struct {
+    /* The unit type of this CU.  */
+    ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
+  };
 
-  /* The language of this CU.  */
-  ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
+  struct {
+    /* The language of this CU.  */
+    ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
+  };
 
   /* True if this CU has been scanned by the indexer; false if
      not.  */