From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <alves.ped@gmail.com>
Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com
 [209.85.128.49])
 by sourceware.org (Postfix) with ESMTPS id E2AE138460B3
 for <gdb-patches@sourceware.org>; Wed, 29 Jun 2022 18:25:48 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E2AE138460B3
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=palves.net
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-wm1-f49.google.com with SMTP id
 g39-20020a05600c4ca700b003a03ac7d540so141258wmp.3
 for <gdb-patches@sourceware.org>; Wed, 29 Jun 2022 11:25:48 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:message-id:date:mime-version:user-agent:subject
 :content-language:from:to:cc:references:in-reply-to
 :content-transfer-encoding;
 bh=gFIu0mJo4UKXUAlJcH9WBg6TmEKH+tzshm48b5gx4E0=;
 b=yU6ep4YemJnhqyxisR1l8rrnIPXbxxqjP7qucSNsUriDVmdqTAI9tuMTJhYncgJWmN
 uLwlWbxprSEiyu07VJwwwXPS1kl8IGb/q0OEljtvan0CFcbtA3ypyQ82+nsUUgkczW6r
 R0Ar6WWjbxHg+krriy+mm3zrGwiaFBU6tdtINQzTJ66d6a7i7nfRB690+ByIZvLblw1e
 xBKQLUfra2oj7ThMRIhy6oGRLK4gRIbKq1LX8Y4XxkFN5yK+NXSX+aEn1M1B8th8O25X
 KmOhCKzWgWtpCt3RqsmyHJgxG/kVeWHLnevjzy6d5s5LRiL3ik9ehyL6Dw+mbhGFUtU2
 y8dg==
X-Gm-Message-State: AJIora+q3YldyuRg+WJ+DEVs0R+r0bY4tDDyV08WZvFmeTIu6954/nCo
 Z3s+2YZCmjULAjSCqkIBQeZeycgctfQ=
X-Google-Smtp-Source: AGRyM1uCOV9pTmSneCmMFGMsop74kjtwQifNu6G9b7MvB1uaK3PsDcBOhew72nGVDD7hC8Ic2qAUkA==
X-Received: by 2002:a05:600c:509:b0:3a0:45d9:43e7 with SMTP id
 i9-20020a05600c050900b003a045d943e7mr5165016wmc.176.1656527147886; 
 Wed, 29 Jun 2022 11:25:47 -0700 (PDT)
Received: from ?IPV6:2001:8a0:f924:2600:209d:85e2:409e:8726?
 ([2001:8a0:f924:2600:209d:85e2:409e:8726])
 by smtp.gmail.com with ESMTPSA id
 ay29-20020a05600c1e1d00b003a03be171b1sm697217wmb.43.2022.06.29.11.25.46
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Wed, 29 Jun 2022 11:25:47 -0700 (PDT)
Message-ID: <1cdbd4f9-4da7-0dd3-ea91-496797f2ad72@palves.net>
Date: Wed, 29 Jun 2022 19:25:46 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.10.0
Subject: Re: [PATCH 3/5] [gdb/symtab] Work around fsanitize=address false
 positive for per_cu->lang
Content-Language: en-US
From: Pedro Alves <pedro@palves.net>
To: Tom de Vries <tdevries@suse.de>, gdb-patches@sourceware.org
Cc: Tom Tromey <tom@tromey.com>
References: <20220629152914.13149-1-tdevries@suse.de>
 <20220629152914.13149-3-tdevries@suse.de>
 <e6928845-63bc-50df-969a-1d1775004bda@palves.net>
In-Reply-To: <e6928845-63bc-50df-969a-1d1775004bda@palves.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00,
 FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,
 KAM_DMARC_STATUS, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,
 SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gdb-patches@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/gdb-patches>,
 <mailto:gdb-patches-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Jun 2022 18:25:50 -0000

On 2022-06-29 18:38, Pedro Alves wrote:
> On 2022-06-29 16:29, Tom de Vries via Gdb-patches wrote:
>> When building gdb with -fsanitize=thread and gcc 12, and running test-case
>> gdb.dwarf2/dwz.exp, we run into a data race between:
>> ...
>>   Read of size 1 at 0x7b200000300d by thread T2:^M
>>     #0 cutu_reader::cutu_reader(dwarf2_per_cu_data*, dwarf2_per_objfile*, \
>>     abbrev_table*, dwarf2_cu*, bool, abbrev_cache*) gdb/dwarf2/read.c:6164 \
>>     (gdb+0x82ec95)^M
>> ...
>> and:
>> ...
>>   Previous write of size 1 at 0x7b200000300d by main thread:^M
>>     #0 prepare_one_comp_unit gdb/dwarf2/read.c:23588 (gdb+0x86f973)^M
>> ...
>>
>> In other words, between:
>> ...
>>   if (this_cu->reading_dwo_directly)
>> ...
>> and:
>> ...
>>     cu->per_cu->lang = pretend_language;
>> ...
>>
>> Both fields are part of the same bitfield, and writing to one field while
>> reading from another is not a problem, so this is a false positive.
> 
> I don't understand this sentence.  Particularly "same bitfield", or
> really "Both fields are part of the same bitfield,".  How can two bitfields
> be part of the same bitfield?
> 
> Anyhow, both bitfields are part of a sequence of contiguous bitfields, here
> stripped of comments:
> 
>   unsigned int queued : 1;
>   unsigned int is_debug_types : 1;
>   unsigned int is_dwz : 1;
>   unsigned int reading_dwo_directly : 1;
>   unsigned int tu_read : 1;
>   mutable bool m_header_read_in : 1;
>   bool addresses_seen : 1;
>   unsigned int mark : 1;
>   bool files_read : 1;
>   ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
>   ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
> 
> Per C++11, they're all part of the same _memory location_.  From N3253 (C++11), intro.memory:
> 
>  "A memory location is either an object of scalar type or a maximal sequence of adjacent bit-fields all having
>  non-zero width. (...)  Two threads of execution (1.10) can update and access separate memory locations
>   without interfering with each other.
>  (...)
>  [ Note: Thus a bit-field and an adjacent non-bit-field are in separate memory locations, and therefore can be
>  concurrently updated by two threads of execution without interference. The same applies to two bit-fields,
>  if one is declared inside a nested struct declaration and the other is not, or if the two are separated by
>  a zero-length bit-field declaration, or if they are separated by a non-bit-field declaration. It is not safe to
>  concurrently update two bit-fields in the same struct if all fields between them are also bit-fields of non-zero
>  width. — end note ]"
> 
> And while it is true that in practice writing to one bit-field from one thread and reading from another,
> if they reside on the same location, is OK in practice, it is still undefined behavior.
> 
> Note the escape hatch mentioned above:
> 
>    "if the two are separated by a zero-length bit-field declaration"
> 
> Thus, a change like this:
> 
>   unsigned int queued : 1;
>   unsigned int is_debug_types : 1;
>   unsigned int is_dwz : 1;
>   unsigned int reading_dwo_directly : 1;
>   unsigned int tu_read : 1;
>   mutable bool m_header_read_in : 1;
>   bool addresses_seen : 1;
>   unsigned int mark : 1;
>   bool files_read : 1;
>   ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
> +
> +  /* Ensure lang is a separate memory location, so we can update
> +     it concurrently with other bitfields.  */
> +  char :0;
> +
>   ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
> 
> 
> ... should work.

The "if one is declared inside a nested struct declaration and the other
is not" escape hatch may be interesting too, as in, we'd write:

   struct {
     ENUM_BITFIELD (language) lang : LANGUAGE_BITS;
   };

... and since the struct is anonymous, nothing else needs to change.

In patch #4, you'd just do this too:

  struct {
     ENUM_BITFIELD (dwarf_unit_type) unit_type : 8;
  };

The "wrapping" syntax seems to read a bit better, particularly since this
way you don't have to worry about putting a :0 bitfield before and
another after.