From: Pedro Alves <pedro@palves.net>
To: Tom de Vries <tdevries@suse.de>, Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields
Date: Wed, 6 Jul 2022 20:20:42 +0100 [thread overview]
Message-ID: <bbf37493-9bc4-c88f-bddc-4b22bd1d1382@palves.net> (raw)
In-Reply-To: <8c6ba70f-305d-c3ed-591d-36f1f38d52a6@suse.de>
On 2022-07-04 8:45 p.m., Tom de Vries via Gdb-patches wrote:
> On 7/4/22 20:32, Tom Tromey wrote:
>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>>
>> Tom> /* The number of bits needed to represent all languages, with enough
>> Tom> padding to allow for reasonable growth. */
>> Tom> -#define LANGUAGE_BITS 5
>> Tom> +#define LANGUAGE_BITS 8
>>
>> This will negatively affect the size of symbols and so I think it should
>> be avoided.
>>
>
> Ack, Pedro suggested a way to avoid this:
> ...
> + struct {
> + /* The language of this CU. */
> + ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS;
> + };
> ...
>
It actually doesn't avoid it in this case, as the following field will end up
moved to the next byte, so if LANGUAGE_BITS is 5, we'll end up with 3 bits gap.
Actually, it's worse than that -- it will align m_lang to ENUM_BITFIELD(language)'s
natural alignment, so it can introduce byte padding before and after too. :-/ :-(
We can see it with "ptype /o" after applying your patch using the struct{}
trick. Note the "3-byte padding" below:
...
/* 13: 5 | 1 */ bool m_header_read_in : 1;
/* XXX 2-bit hole */
/* 14 | 1 */ struct {
/* 14: 0 | 1 */ bool addresses_seen : 1;
/* XXX 7-bit padding */
/* total size (bytes): 1 */
};
/* 15: 0 | 4 */ unsigned int mark : 1;
/* 15: 1 | 1 */ bool files_read : 1;
/* XXX 6-bit hole */
/* 16 | 4 */ struct {
/* 16: 0 | 4 */ dwarf_unit_type unit_type : 8;
/* XXX 3-byte padding */ <<<<<< 3 bytes
/* total size (bytes): 4 */
};
/* 20 | 4 */ struct {
/* 20: 0 | 4 */ language lang : 5;
/* XXX 3-bit padding */
/* XXX 3-byte padding */ <<<<<< 3 bytes
/* total size (bytes): 4 */
};
...
So, maybe we really want something else... How about this alternative patch below?
I wrote the new struct packed template using the array for storage before I
wrote the alternative to use __attribute__((packed)), so I left the array
version in there, as a pure standards-conforming implementation. We can just
remove that array implementation completely, and use the ATTRIBUTE_PACKED macro
if you don't think it's worth it. All compilers worth their salt support attribute
packed, or something like it, I believe. The resulting gdbsupport/pack.h would
be quite smaller.
I tested this on x86-64 Ubuntu 20.04, both attribute packed no attribute
versions, saw no regressions. Also smoke-tested with Clang (which uses the
attribute packed implementation too, as it defines __GNUC__).
I have not actually tested this with -fsanitize=thread, though. Would you
be up for testing that, Tom, if this approach looks reasonable?
Note that with this, struct dwarf2_per_cu_data is still 120 bytes, there's no growth:
(top-gdb) ptype /o dwarf2_per_cu_data
/* offset | size */ type = struct dwarf2_per_cu_data {
public:
/* 0 | 8 */ sect_offset sect_off;
/* 8 | 4 */ unsigned int length;
private:
/* 12 | 1 */ unsigned char m_dwarf_version;
public:
/* 13: 0 | 4 */ unsigned int queued : 1;
/* 13: 1 | 4 */ unsigned int is_debug_types : 1;
/* 13: 2 | 4 */ unsigned int is_dwz : 1;
/* 13: 3 | 4 */ unsigned int reading_dwo_directly : 1;
/* 13: 4 | 4 */ unsigned int tu_read : 1;
/* 13: 5 | 1 */ bool m_header_read_in : 1;
/* 13: 6 | 4 */ unsigned int mark : 1;
/* 13: 7 | 1 */ bool files_read : 1;
/* 14 | 1 */ struct packed<bool, 1> [with T = bool] {
private:
/* 14 | 1 */ struct {
/* 14: 0 | 1 */ T m_val : 8;
/* total size (bytes): 1 */
};
/* total size (bytes): 1 */
} addresses_seen;
private:
/* 15 | 1 */ struct packed<dwarf_unit_type, 1> [with T = dwarf_unit_type] {
private:
/* 15 | 1 */ struct {
/* 15: 0 | 4 */ T m_val : 8;
/* total size (bytes): 1 */
};
/* total size (bytes): 1 */
} m_unit_type;
/* 16 | 1 */ struct packed<language, 1> [with T = language] {
private:
/* 16 | 1 */ struct {
/* 16: 0 | 4 */ T m_val : 8;
/* total size (bytes): 1 */
};
/* total size (bytes): 1 */
} m_lang;
public:
/* 17 | 1 */ struct std::atomic<bool> [with _Tp = bool] {
private:
/* 17 | 1 */ struct std::__atomic_base<_Tp> [with _IntTp = _Tp] {
private:
static const int _S_alignment;
/* 17 | 1 */ __int_type _M_i;
/* total size (bytes): 1 */
} _M_base;
/* total size (bytes): 1 */
} scanned;
/* XXX 2-byte hole */
/* 20 | 4 */ unsigned int index;
/* 24 | 8 */ dwarf2_section_info *section;
/* 32 | 8 */ dwarf2_per_bfd *per_bfd;
/* 40 | 56 */ struct comp_unit_head {
/* 40 | 4 */ unsigned int length;
/* 44 | 1 */ unsigned char version;
/* 45 | 1 */ unsigned char addr_size;
/* 46 | 1 */ unsigned char signed_addr_p;
/* XXX 1-byte hole */
/* 48 | 8 */ sect_offset abbrev_sect_off;
/* 56 | 4 */ unsigned int offset_size;
/* 60 | 4 */ unsigned int initial_length_size;
/* 64 | 4 */ dwarf_unit_type unit_type;
/* 68 | 4 */ cu_offset first_die_cu_offset;
/* 72 | 8 */ sect_offset sect_off;
/* 80 | 4 */ cu_offset type_cu_offset_in_tu;
/* XXX 4-byte hole */
/* 88 | 8 */ ULONGEST signature;
/* total size (bytes): 56 */
} m_header;
/* 96 | 8 */ std::unique_ptr<file_and_directory> fnd;
/* 104 | 8 */ During symbol reading: Child DIE 0x1d6f34b and its abstract origin 0x1d6f380 have different parents
quick_file_names *file_names;
/* 112 | 8 */ std::vector<dwarf2_per_cu_data*> *imported_symtabs;
/* total size (bytes): 120 */
}
(top-gdb)
Patch follows:
From c65b4a71a08835dd46c16443f84793e982efdc8a Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Wed, 6 Jul 2022 14:46:49 +0100
Subject: [PATCH] Introduce struct packed template, fix -fsanitize=thread for
per_cu fields
Change-Id: Ifa94f0a2cebfae5e8f6ddc73265f05e7fd9e1532
---
gdb/defs.h | 3 ++
gdb/dwarf2/read.h | 20 +++++-----
gdbsupport/packed.h | 92 +++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 106 insertions(+), 9 deletions(-)
create mode 100644 gdbsupport/packed.h
diff --git a/gdb/defs.h b/gdb/defs.h
index 99bfdd526ff..19f379d6588 100644
--- a/gdb/defs.h
+++ b/gdb/defs.h
@@ -232,6 +232,9 @@ enum language
#define LANGUAGE_BITS 5
gdb_static_assert (nr_languages <= (1 << LANGUAGE_BITS));
+/* The number of bytes needed to represent all languages. */
+#define LANGUAGE_BYTES ((LANGUAGE_BITS + HOST_CHAR_BIT - 1) / HOST_CHAR_BIT)
+
enum precision_type
{
single_precision,
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 1d9c66aafad..f98d8b27649 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -33,6 +33,7 @@
#include "gdbsupport/gdb_obstack.h"
#include "gdbsupport/hash_enum.h"
#include "gdbsupport/function-view.h"
+#include "gdbsupport/packed.h"
/* Hold 'maintenance (set|show) dwarf' commands. */
extern struct cmd_list_element *set_dwarf_cmdlist;
@@ -105,11 +106,8 @@ struct dwarf2_per_cu_data
reading_dwo_directly (false),
tu_read (false),
m_header_read_in (false),
- addresses_seen (false),
mark (false),
files_read (false),
- m_unit_type {},
- m_lang (language_unknown),
scanned (false)
{
}
@@ -161,10 +159,6 @@ struct dwarf2_per_cu_data
it private at the moment. */
mutable bool m_header_read_in : 1;
- /* If addresses have been read for this CU (usually from
- .debug_aranges), then this flag is set. */
- bool addresses_seen : 1;
-
/* A temporary mark bit used when iterating over all CUs in
expand_symtabs_matching. */
unsigned int mark : 1;
@@ -173,12 +167,20 @@ struct dwarf2_per_cu_data
point in trying to read it again next time. */
bool files_read : 1;
+ /* Wrap the following in struct packed instead of bitfields to avoid
+ data races when the bitfields end up on the same memory location
+ (per C++ memory model). */
+
+ /* If addresses have been read for this CU (usually from
+ .debug_aranges), then this flag is set. */
+ packed<bool, 1> addresses_seen = false;
+
private:
/* The unit type of this CU. */
- ENUM_BITFIELD (dwarf_unit_type) m_unit_type : 8;
+ packed<dwarf_unit_type, 1> m_unit_type = (dwarf_unit_type) 0;
/* The language of this CU. */
- ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS;
+ packed<language, LANGUAGE_BYTES> m_lang = language_unknown;
public:
/* True if this CU has been scanned by the indexer; false if
diff --git a/gdbsupport/packed.h b/gdbsupport/packed.h
new file mode 100644
index 00000000000..a41c408a44d
--- /dev/null
+++ b/gdbsupport/packed.h
@@ -0,0 +1,92 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef PACKED_H
+#define PACKED_H
+
+/* Each instantiation and full specialization of the packed template
+ defines a type that behaves like a given scalar type, but that has
+ byte alignment, and, may optionally have a smaller size than the
+ given scalar type. This is typically used as alternative to
+ bit-fields (and ENUM_BITFIELD), when the fields must have separate
+ memory locations to avoid data races. */
+
+/* Use __attribute__((packed)) when possible to make it more
+ convenient to print the field when debugging GDB. */
+#ifdef __GNUC__
+# define PACKED_USE_ATTRIBUTE_PACKED 1
+#else
+# define PACKED_USE_ATTRIBUTE_PACKED 0
+#endif
+
+#if !PACKED_USE_ATTRIBUTE_PACKED
+
+template<size_t bytes> struct bytes_to_type;
+
+template<> struct bytes_to_type<1>
+{
+ typedef uint8_t type;
+};
+
+template<> struct bytes_to_type<2>
+{
+ typedef uint16_t type;
+};
+
+template<> struct bytes_to_type<4>
+{
+ typedef uint32_t type;
+};
+#endif
+
+template<typename T, size_t Bytes = sizeof (T)>
+struct packed
+{
+public:
+ packed (T val)
+ {
+#if PACKED_USE_ATTRIBUTE_PACKED
+ m_val = val;
+#else
+ typename bytes_to_type<Bytes>::type tmp;
+ tmp = val;
+ memcpy (m_val, &tmp, Bytes);
+#endif
+ }
+
+ operator T () const
+ {
+#if PACKED_USE_ATTRIBUTE_PACKED
+ return (T) m_val;
+#else
+ typename bytes_to_type<Bytes>::type tmp;
+ memcpy (&tmp, m_val, Bytes);
+ return (T) tmp;
+#endif
+ }
+
+private:
+#if PACKED_USE_ATTRIBUTE_PACKED
+ T m_val : Bytes * 8 __attribute__ ((packed));
+#else
+ unsigned char m_val[Bytes];
+ static_assert (sizeof (typename bytes_to_type<Bytes>::type) == sizeof (m_val),
+ "sizes must match");
+#endif
+};
+
+#endif
base-commit: 8bd915b770e12eff67f6218f2d727069d04d5752
--
2.36.0
next prev parent reply other threads:[~2022-07-06 19:20 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-29 15:29 [PATCH 1/5] [COVER-LETTER, RFC] Fix some fsanitize=thread issues in gdb's cooked index Tom de Vries
2022-06-29 15:29 ` [PATCH 2/5] [gdb/symtab] Fix data race on per_cu->dwarf_version Tom de Vries
2022-07-01 11:16 ` Tom de Vries
2022-07-02 11:07 ` Tom de Vries
2022-07-04 18:51 ` Tom Tromey
2022-07-04 19:43 ` Tom de Vries
2022-07-04 19:53 ` Tom Tromey
2022-06-29 15:29 ` [PATCH 3/5] [gdb/symtab] Work around fsanitize=address false positive for per_cu->lang Tom de Vries
2022-06-29 17:38 ` Pedro Alves
2022-06-29 18:25 ` Pedro Alves
2022-06-29 18:28 ` Pedro Alves
2022-07-04 7:04 ` [PATCH 3/5] [gdb/symtab] Work around fsanitize=address false positive for per_ cu->lang Tom de Vries
2022-07-04 18:32 ` [PATCH 3/5] [gdb/symtab] Work around fsanitize=address false positive for per_cu->lang Tom Tromey
2022-07-04 19:45 ` Tom de Vries
2022-07-06 19:20 ` Pedro Alves [this message]
2022-07-07 10:18 ` [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields Tom de Vries
2022-07-07 15:26 ` Pedro Alves
2022-07-08 14:54 ` Tom de Vries
2022-07-12 10:22 ` Tom de Vries
2022-06-29 15:29 ` [PATCH 4/5] [gdb/symtab] Work around fsanitize=address false positive for per_cu->unit_type Tom de Vries
2022-06-29 15:29 ` [PATCH 5/5] [gdb/symtab] Fix data race on per_cu->lang Tom de Vries
2022-07-04 18:30 ` Tom Tromey
2022-07-05 8:17 ` Tom de Vries
2022-07-05 15:19 ` Tom de Vries
2022-07-06 15:42 ` Tom de Vries
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bbf37493-9bc4-c88f-bddc-4b22bd1d1382@palves.net \
--to=pedro@palves.net \
--cc=gdb-patches@sourceware.org \
--cc=tdevries@suse.de \
--cc=tom@tromey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).