On 6/29/22 20:28, Pedro Alves wrote:
> On 2022-06-29 19:25, Pedro Alves wrote:
>> On 2022-06-29 18:38, Pedro Alves wrote:
>>> On 2022-06-29 16:29, Tom de Vries via Gdb-patches wrote:
>>>> When building gdb with -fsanitize=thread and gcc 12, and running test-case
>>>> gdb.dwarf2/dwz.exp, we run into a data race between:
>>>> ...
>>>>    Read of size 1 at 0x7b200000300d by thread T2:^M
>>>>      #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
>>>>      abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
>>>>      (gdb+0x82ec95)^M
>>>> ...
>>>> and:
>>>> ...
>>>>    Previous write of size 1 at 0x7b200000300d by main thread:^M
>>>>      #0 prepare_one_comp_unit gdb/dwarf2/read.c:23588 (gdb+0x86f973)^M
>>>> ...
>>>>
>>>> In other words, between:
>>>> ...
>>>>    if (this_cu->reading_dwo_directly)
>>>> ...
>>>> and:
>>>> ...
>>>>      cu->per_cu->lang = pretend_language;
>>>> ...
>>>>
>>>> Both fields are part of the same bitfield, and writing to one field while
>>>> reading from another is not a problem, so this is a false positive.
>>>
>>> I don't understand this sentence.  Particularly "same bitfield", or
>>> really "Both fields are part of the same bitfield,".  How can two bitfields
>>> be part of the same bitfield?
>>>
>>> Anyhow, both bitfields are part of a sequence of contiguous bitfields, here
>>> stripped of comments:
>>>
>>>    unsigned int queued : 1;
>>>    unsigned int is_debug_types : 1;
>>>    unsigned int is_dwz : 1;
>>>    unsigned int reading_dwo_directly : 1;
>>>    unsigned int tu_read : 1;
>>>    mutable bool m_header_read_in : 1;
>>>    bool addresses_seen : 1;
>>>    unsigned int mark : 1;
>>>    bool files_read : 1;
>>>    ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
>>>    ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
>>>
>>> Per C++11, they're all part of the same _memory location_.  From N3253 (C++11), intro.memory:
>>>
>>>   "A memory location is either an object of scalar type or a maximal sequence of adjacent bit-fields all having
>>>   non-zero width. (...)  Two threads of execution (1.10) can update and access separate memory locations
>>>    without interfering with each other.
>>>   (...)
>>>   [ Note: Thus a bit-field and an adjacent non-bit-field are in separate memory locations, and therefore can be
>>>   concurrently updated by two threads of execution without interference. The same applies to two bit-fields,
>>>   if one is declared inside a nested struct declaration and the other is not, or if the two are separated by
>>>   a zero-length bit-field declaration, or if they are separated by a non-bit-field declaration. It is not safe to
>>>   concurrently update two bit-fields in the same struct if all fields between them are also bit-fields of non-zero
>>>   width. — end note ]"
>>>
>>> And while it is true that in practice writing to one bit-field from one thread and reading from another,
>>> if they reside on the same location, is OK in practice, it is still undefined behavior.

Ack, thanks for pointing that out, I was not aware of this.

I've reformulated things in terms of "memory location".

>>>
>>> Note the escape hatch mentioned above:
>>>
>>>     "if the two are separated by a zero-length bit-field declaration"
>>>
>>> Thus, a change like this:
>>>
>>>    unsigned int queued : 1;
>>>    unsigned int is_debug_types : 1;
>>>    unsigned int is_dwz : 1;
>>>    unsigned int reading_dwo_directly : 1;
>>>    unsigned int tu_read : 1;
>>>    mutable bool m_header_read_in : 1;
>>>    bool addresses_seen : 1;
>>>    unsigned int mark : 1;
>>>    bool files_read : 1;
>>>    ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
>>> +
>>> +  /* Ensure lang is a separate memory location, so we can update
>>> +     it concurrently with other bitfields.  */
>>> +  char :0;
>>> +
>>>    ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
>>>
>>>
>>> ... should work.
>>
>> The "if one is declared inside a nested struct declaration and the other
>> is not" escape hatch may be interesting too, as in, we'd write:
>>
>>     struct {
>>       ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
>>     };
>>
>> ... and since the struct is anonymous, nothing else needs to change.
>>
>> In patch #4, you'd just do this too:
>>
>>    struct {
>>       ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
>>    };
>>
>> The "wrapping" syntax seems to read a bit better, particularly since this
>> way you don't have to worry about putting a :0 bitfield before and
>> another after.
> 

Done.

> I keep coming back, sorry...  :-P
> 
> Another thought is that in both patches #3 and #4, it's reading_dwo_directly
> that is racing with two other bitfields.  So I wonder whether it's _that_ one
> that should be moved to a separate memory location.
	
I've also tried that, but I got similar errors back, with the same 
writes but different reads.

Also added field addresses_seen, which I found using testing with board 
cc-with-dwz-m.

Thanks,
- Tom