Re: [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields


From: Tom de Vries <tdevries@suse.de>
To: Pedro Alves <pedro@palves.net>, Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields
Date: Fri, 8 Jul 2022 16:54:48 +0200
Message-ID: <5ef32bf1-067c-c43d-b786-ab4077d8dedd@suse.de>
In-Reply-To: <40ba7002-69a0-7a8c-018f-f82c5698bfbb@palves.net>

On 7/7/22 17:26, Pedro Alves wrote:
> On 2022-07-07 11:18 a.m., Tom de Vries wrote:
>> On 7/6/22 21:20, Pedro Alves wrote:
>>> On 2022-07-04 8:45 p.m., Tom de Vries via Gdb-patches wrote:
>>>> On 7/4/22 20:32, Tom Tromey wrote:
>>>>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>>>>>
>>>>> Tom>  /* The number of bits needed to represent all languages, with enough
>>>>> Tom>     padding to allow for reasonable growth.  */
>>>>> Tom> -#define LANGUAGE_BITS 5
>>>>> Tom> +#define LANGUAGE_BITS 8
>>>>>
>>>>> This will negatively affect the size of symbols and so I think it should
>>>>> be avoided.
>>>>>
>>>>
>>>> Ack, Pedro suggested a way to avoid this:
>>>> ...
>>>> +  struct {
>>>> +    /* The language of this CU.  */
>>>> +    ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS;
>>>> +  };
>>>> ...
>>>>
>>>
>>> It actually doesn't avoid it in this case,
>>
>> We were merely discussing the usage of LANGUAGE_BITS for general_symbol_info::m_language, and indeed using the "struct { ... };" approach avoids changing LANGUAGE_BITS and so avoids a penalty on symbol size (symbols being far more numerous than CUs).
>>
> 
> Yeah, sorry, I realized it after sending and decided I'd deserve the incoming cluebat.  :-)
> 

Heh :)

>> Still, of course it's also good to keep the dwarf2_per_cu_data struct as small as possible, so thanks for looking into this.
> 
> It's that, but also the desire to settle on some infrastructure or approach that we can reuse
> going forward.
> 

Sure.

> 
>>> I have not actually tested this with -fsanitize=thread, though.  Would you
>>> be up for testing that, Tom, if this approach looks reasonable?
>>>
>>
>> Yes, of course.
>>
>> I've applied the patch and then started with my latest approach, which avoids locks and uses atomics:
> 
> Thanks.
> 
>> ...
>> diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
>> index f98d8b27649..bc1af0ec2d3 100644
>> --- a/gdb/dwarf2/read.h
>> +++ b/gdb/dwarf2/read.h
>> @@ -108,6 +108,7 @@ struct dwarf2_per_cu_data
>>         m_header_read_in (false),
>>         mark (false),
>>         files_read (false),
>> +      m_lang (language_unknown),
>>         scanned (false)
>>     {
>>     }
>> @@ -180,7 +181,7 @@ struct dwarf2_per_cu_data
>>     packed<dwarf_unit_type, 1> m_unit_type = (dwarf_unit_type) 0;
>>
>>     /* The language of this CU.  */
>> -  packed<language, LANGUAGE_BYTES> m_lang = language_unknown;
>> +  std::atomic<language> m_lang __attribute__((packed));
>>
>>   public:
>>     /* True if this CU has been scanned by the indexer; false if
>> @@ -332,11 +333,13 @@ struct dwarf2_per_cu_data
>>
>>     void set_lang (enum language lang)
>>     {
>> -    /* We'd like to be more strict here, similar to what is done in
>> -       set_unit_type,  but currently a partial unit can go from unknown to
>> -       minimal to ada to c.  */
>> -    if (m_lang != lang)
>> -      m_lang = lang;
>> +    enum language nope = language_unknown;
>> +    if (m_lang.compare_exchange_strong (nope, lang))
>> +      return;
>> +    nope = lang;
>> +    if (m_lang.compare_exchange_strong (nope, lang))
>> +      return;
>> +    gdb_assert_not_reached ();
>>     }
>>
>>     /* Free any cached file names.  */
>> ...
>>
>> I've tried both:
>> ...
>>    packed<std::atomic<language>, LANGUAGE_BYTES> m_lang
>>      = language_unknown;
>> ...
>> and:
>> ...
>>    std::atomic<packed<language, LANGUAGE_BYTES>> m_lang
>>      = language_unknown;
>> ...
>> and both give compilation errors:
>> ...
>> src/gdb/dwarf2/read.h:184:58: error: could not convert ‘language_unknown’ from ‘language’ to ‘std::atomic<packed<language, 1> >’
>>     std::atomic<packed<language, LANGUAGE_BYTES>> m_lang = language_unknown;
>>                                                            ^~~~~~~~~~~~~~~~
>> ...
>> and:
>> ...
>> src/gdb/../gdbsupport/packed.h:84:47: error: bit-field ‘std::atomic<language> packed<std::atomic<language>, 1>::m_val’ with non-integral type
>> ...
>>
>> Maybe one of the two should work and the packed template needs further changes, I'm not sure.
> 
> Yes, I think std::atomic<packed<language, LANGUAGE_BYTES>> should work.  We need to write
> the initializer using {}, like this:
> 
>    std::atomic<packed<language, LANGUAGE_BYTES>> m_lang {language_unknown};
> 
> and then we run into errors when comparing m_lang with enum language.  That is because
> the preexisting operator==/operator!= would require converting from enum language to
> packed<language, LANGUAGE_BYTES>, and then from packed<language, LANGUAGE_BYTES> to
> std::atomic<packed<language, LANGUAGE_BYTES>>.  That is two implicit conversions, but
> C++ only does one automatically.  We can fix that by adding some operator==/operator!=
> implementations.
> 
> I've done that in patch #1 attached.  I've also ditched the non-attribute-packed implementation.
> 

Ack.
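
For the archives, the two-conversion issue described above can be
reproduced in isolation.  What follows is only a minimal sketch under
stated assumptions: packed_lang is a hypothetical one-byte stand-in for
packed<language, LANGUAGE_BYTES>, the enum is a toy one, and the
operator==/operator!= overloads are illustrative rather than the ones in
patch #1.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>

// Toy enum, not gdb's real "enum language".
enum language : unsigned char { language_unknown, language_c, language_ada };

// Hypothetical one-byte wrapper standing in for packed<language, 1>.
struct packed_lang
{
  packed_lang () noexcept = default;
  constexpr packed_lang (language l) noexcept : m_val (l) {}
  constexpr operator language () const noexcept
  { return static_cast<language> (m_val); }

  unsigned char m_val;
};

// std::atomic requires a trivially copyable value type.
static_assert (std::is_trivially_copyable<packed_lang>::value,
               "packed_lang must be trivially copyable");

// atomic -> packed_lang -> language is two user-defined conversions, so
// "atomic == enum" needs explicit overloads like these to compile.
inline bool
operator== (const std::atomic<packed_lang> &lhs, language rhs)
{ return static_cast<language> (lhs.load ()) == rhs; }

inline bool
operator!= (const std::atomic<packed_lang> &lhs, language rhs)
{ return !(lhs == rhs); }

struct per_cu_sketch
{
  // Brace initialization needs only language -> packed_lang; writing
  // "= language_unknown" would also need packed_lang -> atomic.
  std::atomic<packed_lang> m_lang {language_unknown};
};

// Exercises initialization, store and both comparison overloads.
inline void
demo_atomic_packed ()
{
  per_cu_sketch cu;
  assert (cu.m_lang == language_unknown);
  cu.m_lang.store (language_c);
  assert (cu.m_lang == language_c);
  assert (cu.m_lang != language_ada);
}
```

Removing the two overloads should make the comparisons in
demo_atomic_packed fail to compile, which is the situation the quoted
paragraph describes.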

>>
>> Note btw that the attribute packed works here:
>> ...
>> +  std::atomic<language> m_lang __attribute__((packed));
>> ...
>> in the sense that it's got alignment 1:
>> ...
>>          struct atomic<language>    m_lang \
>>            __attribute__((__aligned__(1))); /*    16     4 */
>> ...
>> but given that there's no LANGUAGE_BITS/BYTES, we're back to size 4 for the m_lang field, and size 128 overall.
>>
>> So for now I've settled for:
>> ...
>> +  std::atomic<LANGUAGE_CONTAINER> m_lang;
>> ...
>> which does get me back to size 120.
>>
>> WIP patch attached.
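
To spell out the size argument above: an atomic over the enum itself
normally takes the enum's full width, while an atomic over a one-byte
container can fit into existing padding.  A rough sketch follows; the
struct layouts, the language_container alias and the field names are all
made up for illustration and are not the real dwarf2_per_cu_data layout,
and the exact sizes depend on the ABI.

```cpp
#include <atomic>
#include <cassert>

// Toy enum with the default underlying type (typically 4 bytes).
enum language { language_unknown, language_c };

// Hypothetical one-byte container, in the spirit of LANGUAGE_CONTAINER.
using language_container = unsigned char;

// Language stored as an atomic over the enum itself.
struct wide_cu
{
  void *ptr = nullptr;
  bool flags[7] = {};
  std::atomic<language> m_lang {language_unknown};
};

// Language stored as an atomic over the one-byte container.
struct narrow_cu
{
  void *ptr = nullptr;
  bool flags[7] = {};
  std::atomic<language_container> m_lang
    {static_cast<language_container> (language_unknown)};

  language lang () const
  { return static_cast<language> (m_lang.load ()); }

  void set_lang (language l)
  { m_lang.store (static_cast<language_container> (l)); }
};

// Checks that the narrow variant is no larger and still round-trips.
inline void
demo_sizes ()
{
  assert (sizeof (std::atomic<language_container>)
          <= sizeof (std::atomic<language>));
  assert (sizeof (narrow_cu) <= sizeof (wide_cu));

  narrow_cu cu;
  cu.set_lang (language_c);
  assert (cu.lang () == language_c);
}
```

The cost of the container approach is the explicit casts in the
accessors, which is what std::atomic<packed<...>> is meant to hide.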
> 
> Please find attached 3 patches:
> 
> #1 - Introduce struct packed template
> #2 - your original patch, but using struct packed, split to a separate patch. commit log updated.
> #3 - a version of your std::atomic WIP patch that uses std::atomic<packed>
> 
> Patches #1 and #2 pass the testsuite cleanly for me.  Patch #3 compiles, but
> runs into a couple regressions due to the gdb_assert_not_reached in set_lang
> being reached.  I am not surprised since that set_lang code in your patch
> looked WIP and I just blindly converted to the new approach to show the code
> compiles.
> 

I've rebased #1 and #2 on master, and then applied my current WIP set, 
copying the style that I found in #3.

I'm currently testing it; I've pushed it to
https://github.com/vries/gdb/commits/sanitize-thread-7 if you're interested.
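
As for the gdb_assert_not_reached in set_lang being reached: that follows
from the comment the WIP patch removed, which notes that a partial unit
can go from unknown to minimal to ada to c.  A compare-and-swap that only
accepts "unset" or "already this value" rejects the second distinct
language.  Below is a minimal sketch of that behaviour; set_lang_once is a
hypothetical helper, the enum is a toy one, and it reads the value
reported by the failed exchange instead of issuing a second exchange as
the WIP patch does.

```cpp
#include <atomic>
#include <cassert>

// Toy enum, not gdb's real "enum language".
enum language : unsigned char
{
  language_unknown,
  language_minimal,
  language_ada,
  language_c
};

// Returns true if SLOT holds LANG afterwards, either because it was
// still unset or because it already held LANG.  Returns false in the
// case where the WIP set_lang would hit gdb_assert_not_reached.
inline bool
set_lang_once (std::atomic<language> &slot, language lang)
{
  language expected = language_unknown;
  if (slot.compare_exchange_strong (expected, lang))
    return true;

  // On failure, EXPECTED has been updated to the value actually stored.
  return expected == lang;
}

// Replays the unknown -> minimal -> ada sequence from the old comment.
inline void
demo_set_lang_once ()
{
  std::atomic<language> slot {language_unknown};

  assert (set_lang_once (slot, language_minimal));  // first write wins
  assert (set_lang_once (slot, language_minimal));  // same value is fine
  assert (!set_lang_once (slot, language_ada));     // different value
  assert (slot.load () == language_minimal);        // slot is unchanged
}
```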

#1 and #2 LGTM.

Thanks a lot for all the help :)

- Tom
