From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id BAAA33857361 for ; Tue, 12 Jul 2022 10:22:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BAAA33857361 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id ED75E2242F; Tue, 12 Jul 2022 10:22:46 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id D35CE13638; Tue, 12 Jul 2022 10:22:46 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 9UWBMnZLzWLMaQAAMHmgww (envelope-from ); Tue, 12 Jul 2022 10:22:46 +0000 Message-ID: Date: Tue, 12 Jul 2022 12:22:46 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields Content-Language: en-US From: Tom de Vries To: Pedro Alves , Tom Tromey Cc: gdb-patches@sourceware.org References: <20220629152914.13149-1-tdevries@suse.de> <20220629152914.13149-3-tdevries@suse.de> <8735fgvie4.fsf@tromey.com> <8c6ba70f-305d-c3ed-591d-36f1f38d52a6@suse.de> <4af2061e-9583-93fe-f33b-dcf6828ccee3@suse.de> <40ba7002-69a0-7a8c-018f-f82c5698bfbb@palves.net> <5ef32bf1-067c-c43d-b786-ab4077d8dedd@suse.de> In-Reply-To: <5ef32bf1-067c-c43d-b786-ab4077d8dedd@suse.de> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Jul 2022 10:22:50 -0000 On 7/8/22 16:54, Tom de Vries wrote: > On 7/7/22 17:26, Pedro Alves wrote: >> On 2022-07-07 11:18 a.m., Tom de Vries wrote: >>> On 7/6/22 21:20, Pedro Alves wrote: >>>> On 2022-07-04 8:45 p.m., Tom de Vries via Gdb-patches wrote: >>>>> On 7/4/22 20:32, Tom Tromey wrote: >>>>>>>>>>> "Tom" == Tom de Vries writes: >>>>>> >>>>>> Tom>  /* The number of bits needed to represent all languages, >>>>>> with enough >>>>>> Tom>     padding to allow for reasonable growth.  */ >>>>>> Tom> -#define LANGUAGE_BITS 5 >>>>>> Tom> +#define LANGUAGE_BITS 8 >>>>>> >>>>>> This will negatively affect the size of symbols and so I think it >>>>>> should >>>>>> be avoided. >>>>>> >>>>> >>>>> Ack, Pedro suggested a way to avoid this: >>>>> ... >>>>> +  struct { >>>>> +    /* The language of this CU.  */ >>>>> +    ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS; >>>>> +  }; >>>>> ... >>>>> >>>> >>>> It actually doesn't avoid it in this case, >>> >>> We were merely discussing the usage of LANGUAGE_BITS for >>> general_symbol_info::m_language, and indeed using the "struct { ... >>> };" approach avoids changing the LANGUAGE_BITS and introducing a >>> penalty on symbol size (which is a more numerous entity than CUs). >>> >> >> Yeah, sorry, I realized it after sending and decided I'd deserve the >> incoming cluebat.  :-) >> > > Heh :) > >>> Still, of course it's also good to keep the dwarf2_per_cu_data struct >>> as small as possible, so thanks for looking into this. >> >> It's that, but also the desire to settle on some infrastructure or >> approach that we can reuse >> going forward. >> > > Sure. > >> >>>> I have not actually tested this with -fsanitize=thread, though. >>>> Would you >>>> be up for testing that, Tom, if this approach looks reasonable? >>>> >>> >>> Yes, of course. >>> >>> I've applied the patch and then started with my latest approach which >>> avoid locks and uses atomics: >> >> Thanks. >> >>> ... >>> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h >>> index f98d8b27649..bc1af0ec2d3 100644 >>> --- a/gdb/dwarf2/read.h >>> +++ b/gdb/dwarf2/read.h >>> @@ -108,6 +108,7 @@ struct dwarf2_per_cu_data >>>         m_header_read_in (false), >>>         mark (false), >>>         files_read (false), >>> +      m_lang (language_unknown), >>>         scanned (false) >>>     { >>>     } >>> @@ -180,7 +181,7 @@ struct dwarf2_per_cu_data >>>     packed m_unit_type = (dwarf_unit_type) 0; >>> >>>     /* The language of this CU.  */ >>> -  packed m_lang = language_unknown; >>> +  std::atomic m_lang __attribute__((packed)); >>> >>>   public: >>>     /* True if this CU has been scanned by the indexer; false if >>> @@ -332,11 +333,13 @@ struct dwarf2_per_cu_data >>> >>>     void set_lang (enum language lang) >>>     { >>> -    /* We'd like to be more strict here, similar to what is done in >>> -       set_unit_type,  but currently a partial unit can go from >>> unknown to >>> -       minimal to ada to c.  */ >>> -    if (m_lang != lang) >>> -      m_lang = lang; >>> +    enum language nope = language_unknown; >>> +    if (m_lang.compare_exchange_strong (nope, lang)) >>> +      return; >>> +    nope = lang; >>> +    if (m_lang.compare_exchange_strong (nope, lang)) >>> +      return; >>> +    gdb_assert_not_reached (); >>>     } >>> >>>     /* Free any cached file names.  */ >>> ... >>> >>> I've tried both: >>> ... >>>    packed, LANGUAGE_BYTES> m_lang >>>      = language_unknown; >>> ... >>> and: >>> ... >>>    std::atomic> m_lang >>>      = language_unknown; >>> ... >>> and both give compilation errors: >>> ... >>> src/gdb/dwarf2/read.h:184:58: error: could not convert >>> ‘language_unknown’ from ‘language’ to ‘std::atomic>> 1> >’ >>>     std::atomic> m_lang = >>> language_unknown; >>> >>> ^~~~~~~~~~~~~~~~ >>> ... >>> and: >>> ... >>> src/gdb/../gdbsupport/packed.h:84:47: error: bit-field >>> ‘std::atomic packed, 1>::m_val’ with >>> non-integral type >>> ... >>> >>> Maybe one of the two should work and the pack template needs further >>> changes, I'm not sure. >> >> Yes, I think std::atomic> should >> work.  We need to write >> the initialized using {}, like this: >> >>    std::atomic> m_lang >> {language_unknown}; >> >> and then we run into errors when comparing m_lang with enum language. >> That is because >> the preexisting operator==/operator!= would require converting from >> enum language to >> packed, and then from packed> LANGUAGE_BYTES> to >> std::atomic>.  That is two implicit >> conversions, but >> C++ only does one automatically.  We can fix that by adding some >> operator==/operator!= >> implementations. >> >> I've done that in patch #1 attached.  I've also ditched the >> non-attribute-packed implementation. >> > > Ack. > >>> >>> Note btw that the attribute packed works here: >>> ... >>> +  std::atomic m_lang __attribute__((packed)); >>> ... >>> in the sense that it's got alignment 1: >>> ... >>>          struct atomic    m_lang \ >>>            __attribute__((__aligned__(1))); /*    16     4 */ >>> ... >>> but given that there's no LANGUAGE_BITS/BYTES, we're back to size 4 >>> for the m_lang field, and size 128 overall. >>> >>> So for now I've settled for: >>> ... >>> +  std::atomic m_lang; >>> ... >>> which does get me back to size 120. >>> >>> WIP patch attached. >> >> Please find attached 3 patches: >> >> #1 - Introduce struct packed template >> #2 - your original patch, but using struct packed, split to a separate >> patch. commit log updated. >> #3 - a version of your std::atomic WIP patch that uses >> std::atomic >> >> Patches #1 and #2 pass the testsuite cleanly for me.  Patch #3 >> compiles, but >> runs into a couple regressions due to the gdb_assert_not_reached in >> set_lang >> being reached.  I am not surprised since that set_lang code in your patch >> looked WIP and I just blindly converted to the new approach to show >> the code >> compiles. >> > > I've rebased #1 and #2 on master, and then applied my current WIP set, > copying the style that I found in #3. > > I'm currently testing it, I've pushed to > https://github.com/vries/gdb/commits/sanitize-thread-7 if you're > interested. > > #1 and #2 LGTM. > So, pushed. Thanks, - Tom > Thanks a lot for all the help :) > > - Tom