public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Pedro Alves <pedro@palves.net>
To: Tom de Vries <tdevries@suse.de>, Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields
Date: Wed, 6 Jul 2022 20:20:42 +0100	[thread overview]
Message-ID: <bbf37493-9bc4-c88f-bddc-4b22bd1d1382@palves.net> (raw)
In-Reply-To: <8c6ba70f-305d-c3ed-591d-36f1f38d52a6@suse.de>

On 2022-07-04 8:45 p.m., Tom de Vries via Gdb-patches wrote:
> On 7/4/22 20:32, Tom Tromey wrote:
>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>>
>> Tom>  /* The number of bits needed to represent all languages, with enough
>> Tom>     padding to allow for reasonable growth.  */
>> Tom> -#define LANGUAGE_BITS 5
>> Tom> +#define LANGUAGE_BITS 8
>>
>> This will negatively affect the size of symbols and so I think it should
>> be avoided.
>>
> 
> Ack, Pedro suggested a way to avoid this:
> ...
> +  struct {
> +    /* The language of this CU.  */
> +    ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS;
> +  };
> ...
> 

It actually doesn't avoid it in this case, as the following field will end up
moved to the next byte, so if LANGUAGE_BITS is 5, we'll end up with 3 bits gap.

Actually, it's worse than that -- it will align m_lang to ENUM_BITFIELD(language)'s
natural alignment, so it can introduce byte padding before and after too.  :-/  :-(

We can see it with "ptype /o" after applying your patch using the struct{}
trick.  Note the "3-byte padding" below:

 ...
 /*     13: 5   |       1 */    bool m_header_read_in : 1;
 /* XXX  2-bit hole       */
 /*     14      |       1 */    struct {
 /*     14: 0   |       1 */        bool addresses_seen : 1;
 /* XXX  7-bit padding    */
 
                                    /* total size (bytes):    1 */
                                };
 /*     15: 0   |       4 */    unsigned int mark : 1;
 /*     15: 1   |       1 */    bool files_read : 1;
 /* XXX  6-bit hole       */
 /*     16      |       4 */    struct {
 /*     16: 0   |       4 */        dwarf_unit_type unit_type : 8;
 /* XXX  3-byte padding   */                                               <<<<<< 3 bytes
 
                                    /* total size (bytes):    4 */
                                };
 /*     20      |       4 */    struct {
 /*     20: 0   |       4 */        language lang : 5;
 /* XXX  3-bit padding    */
 /* XXX  3-byte padding   */                                               <<<<<< 3 bytes

                                    /* total size (bytes):    4 */
                                };
 ...



So, maybe we really want something else...  How about this alternative patch below?

I wrote the new struct packed template using the array for storage before I
wrote the alternative to use __attribute__((packed)), so I left the array
version in there, as a pure standards-conforming implementation.  We can just
remove that array implementation completely, and use the ATTRIBUTE_PACKED macro
if you don't think it's worth it.  All compilers worth their salt support attribute
packed, or something like it, I believe.  The resulting gdbsupport/pack.h would
be quite smaller.

I tested this on x86-64 Ubuntu 20.04, both attribute packed no attribute
versions, saw no regressions.  Also smoke-tested with Clang (which uses the
attribute packed implementation too, as it defines __GNUC__).

I have not actually tested this with -fsanitize=thread, though.  Would you
be up for testing that, Tom, if this approach looks reasonable?

Note that with this, struct dwarf2_per_cu_data is still 120 bytes, there's no growth:

(top-gdb) ptype /o dwarf2_per_cu_data
/* offset      |    size */  type = struct dwarf2_per_cu_data {
                             public:
/*      0      |       8 */    sect_offset sect_off;
/*      8      |       4 */    unsigned int length;
                             private:
/*     12      |       1 */    unsigned char m_dwarf_version;
                             public:
/*     13: 0   |       4 */    unsigned int queued : 1;
/*     13: 1   |       4 */    unsigned int is_debug_types : 1;
/*     13: 2   |       4 */    unsigned int is_dwz : 1;
/*     13: 3   |       4 */    unsigned int reading_dwo_directly : 1;
/*     13: 4   |       4 */    unsigned int tu_read : 1;
/*     13: 5   |       1 */    bool m_header_read_in : 1;
/*     13: 6   |       4 */    unsigned int mark : 1;
/*     13: 7   |       1 */    bool files_read : 1;
/*     14      |       1 */    struct packed<bool, 1> [with T = bool] {
                                 private:
/*     14      |       1 */        struct {
/*     14: 0   |       1 */            T m_val : 8;

                                       /* total size (bytes):    1 */
                                   };

                                   /* total size (bytes):    1 */
                               } addresses_seen;
                             private:
/*     15      |       1 */    struct packed<dwarf_unit_type, 1> [with T = dwarf_unit_type] {
                                 private:
/*     15      |       1 */        struct {
/*     15: 0   |       4 */            T m_val : 8;

                                       /* total size (bytes):    1 */
                                   };

                                   /* total size (bytes):    1 */
                               } m_unit_type;
/*     16      |       1 */    struct packed<language, 1> [with T = language] {
                                 private:
/*     16      |       1 */        struct {
/*     16: 0   |       4 */            T m_val : 8;

                                       /* total size (bytes):    1 */
                                   };

                                   /* total size (bytes):    1 */
                               } m_lang;
                             public:
/*     17      |       1 */    struct std::atomic<bool> [with _Tp = bool] {
                                 private:
/*     17      |       1 */        struct std::__atomic_base<_Tp> [with _IntTp = _Tp] {
                                     private:
                                       static const int _S_alignment;
/*     17      |       1 */            __int_type _M_i;

                                       /* total size (bytes):    1 */
                                   } _M_base;

                                   /* total size (bytes):    1 */
                               } scanned;
/* XXX  2-byte hole      */
/*     20      |       4 */    unsigned int index;
/*     24      |       8 */    dwarf2_section_info *section;
/*     32      |       8 */    dwarf2_per_bfd *per_bfd;
/*     40      |      56 */    struct comp_unit_head {
/*     40      |       4 */        unsigned int length;
/*     44      |       1 */        unsigned char version;
/*     45      |       1 */        unsigned char addr_size;
/*     46      |       1 */        unsigned char signed_addr_p;
/* XXX  1-byte hole      */
/*     48      |       8 */        sect_offset abbrev_sect_off;
/*     56      |       4 */        unsigned int offset_size;
/*     60      |       4 */        unsigned int initial_length_size;
/*     64      |       4 */        dwarf_unit_type unit_type;
/*     68      |       4 */        cu_offset first_die_cu_offset;
/*     72      |       8 */        sect_offset sect_off;
/*     80      |       4 */        cu_offset type_cu_offset_in_tu;
/* XXX  4-byte hole      */
/*     88      |       8 */        ULONGEST signature;

                                   /* total size (bytes):   56 */
                               } m_header;
/*     96      |       8 */    std::unique_ptr<file_and_directory> fnd;
/*    104      |       8 */    During symbol reading: Child DIE 0x1d6f34b and its abstract origin 0x1d6f380 have different parents
quick_file_names *file_names;
/*    112      |       8 */    std::vector<dwarf2_per_cu_data*> *imported_symtabs;

                               /* total size (bytes):  120 */
                             }
(top-gdb) 


Patch follows:

From c65b4a71a08835dd46c16443f84793e982efdc8a Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Wed, 6 Jul 2022 14:46:49 +0100
Subject: [PATCH] Introduce struct packed template, fix -fsanitize=thread for
 per_cu fields

Change-Id: Ifa94f0a2cebfae5e8f6ddc73265f05e7fd9e1532
---
 gdb/defs.h          |  3 ++
 gdb/dwarf2/read.h   | 20 +++++-----
 gdbsupport/packed.h | 92 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 106 insertions(+), 9 deletions(-)
 create mode 100644 gdbsupport/packed.h

diff --git a/gdb/defs.h b/gdb/defs.h
index 99bfdd526ff..19f379d6588 100644
--- a/gdb/defs.h
+++ b/gdb/defs.h
@@ -232,6 +232,9 @@ enum language
 #define LANGUAGE_BITS 5
 gdb_static_assert (nr_languages <= (1 << LANGUAGE_BITS));
 
+/* The number of bytes needed to represent all languages.  */
+#define LANGUAGE_BYTES ((LANGUAGE_BITS + HOST_CHAR_BIT - 1) / HOST_CHAR_BIT)
+
 enum precision_type
   {
     single_precision,
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index 1d9c66aafad..f98d8b27649 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -33,6 +33,7 @@
 #include "gdbsupport/gdb_obstack.h"
 #include "gdbsupport/hash_enum.h"
 #include "gdbsupport/function-view.h"
+#include "gdbsupport/packed.h"
 
 /* Hold 'maintenance (set|show) dwarf' commands.  */
 extern struct cmd_list_element *set_dwarf_cmdlist;
@@ -105,11 +106,8 @@ struct dwarf2_per_cu_data
       reading_dwo_directly (false),
       tu_read (false),
       m_header_read_in (false),
-      addresses_seen (false),
       mark (false),
       files_read (false),
-      m_unit_type {},
-      m_lang (language_unknown),
       scanned (false)
   {
   }
@@ -161,10 +159,6 @@ struct dwarf2_per_cu_data
      it private at the moment.  */
   mutable bool m_header_read_in : 1;
 
-  /* If addresses have been read for this CU (usually from
-     .debug_aranges), then this flag is set.  */
-  bool addresses_seen : 1;
-
   /* A temporary mark bit used when iterating over all CUs in
      expand_symtabs_matching.  */
   unsigned int mark : 1;
@@ -173,12 +167,20 @@ struct dwarf2_per_cu_data
      point in trying to read it again next time.  */
   bool files_read : 1;
 
+  /* Wrap the following in struct packed instead of bitfields to avoid
+     data races when the bitfields end up on the same memory location
+     (per C++ memory model).  */
+
+  /* If addresses have been read for this CU (usually from
+     .debug_aranges), then this flag is set.  */
+  packed<bool, 1> addresses_seen = false;
+
 private:
   /* The unit type of this CU.  */
-  ENUM_BITFIELD (dwarf_unit_type) m_unit_type : 8;
+  packed<dwarf_unit_type, 1> m_unit_type = (dwarf_unit_type) 0;
 
   /* The language of this CU.  */
-  ENUM_BITFIELD (language) m_lang : LANGUAGE_BITS;
+  packed<language, LANGUAGE_BYTES> m_lang = language_unknown;
 
 public:
   /* True if this CU has been scanned by the indexer; false if
diff --git a/gdbsupport/packed.h b/gdbsupport/packed.h
new file mode 100644
index 00000000000..a41c408a44d
--- /dev/null
+++ b/gdbsupport/packed.h
@@ -0,0 +1,92 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef PACKED_H
+#define PACKED_H
+
+/* Each instantiation and full specialization of the packed template
+   defines a type that behaves like a given scalar type, but that has
+   byte alignment, and, may optionally have a smaller size than the
+   given scalar type.  This is typically used as alternative to
+   bit-fields (and ENUM_BITFIELD), when the fields must have separate
+   memory locations to avoid data races.  */
+
+/* Use __attribute__((packed)) when possible to make it more
+   convenient to print the field when debugging GDB.  */
+#ifdef __GNUC__
+# define PACKED_USE_ATTRIBUTE_PACKED 1
+#else
+# define PACKED_USE_ATTRIBUTE_PACKED 0
+#endif
+
+#if !PACKED_USE_ATTRIBUTE_PACKED
+
+template<size_t bytes> struct bytes_to_type;
+
+template<> struct bytes_to_type<1>
+{
+  typedef uint8_t type;
+};
+
+template<> struct bytes_to_type<2>
+{
+  typedef uint16_t type;
+};
+
+template<> struct bytes_to_type<4>
+{
+  typedef uint32_t type;
+};
+#endif
+
+template<typename T, size_t Bytes = sizeof (T)>
+struct packed
+{
+public:
+  packed (T val)
+  {
+#if PACKED_USE_ATTRIBUTE_PACKED
+    m_val = val;
+#else
+    typename bytes_to_type<Bytes>::type tmp;
+    tmp = val;
+    memcpy (m_val, &tmp, Bytes);
+#endif
+  }
+
+  operator T () const
+  {
+#if PACKED_USE_ATTRIBUTE_PACKED
+    return (T) m_val;
+#else
+    typename bytes_to_type<Bytes>::type tmp;
+    memcpy (&tmp, m_val, Bytes);
+    return (T) tmp;
+#endif
+  }
+
+private:
+#if PACKED_USE_ATTRIBUTE_PACKED
+  T m_val : Bytes * 8  __attribute__ ((packed));
+#else
+  unsigned char m_val[Bytes];
+  static_assert (sizeof (typename bytes_to_type<Bytes>::type) == sizeof (m_val),
+		 "sizes must match");
+#endif
+};
+
+#endif

base-commit: 8bd915b770e12eff67f6218f2d727069d04d5752
-- 
2.36.0


  reply	other threads:[~2022-07-06 19:20 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-29 15:29 [PATCH 1/5] [COVER-LETTER, RFC] Fix some fsanitize=thread issues in gdb's cooked index Tom de Vries
2022-06-29 15:29 ` [PATCH 2/5] [gdb/symtab] Fix data race on per_cu->dwarf_version Tom de Vries
2022-07-01 11:16   ` Tom de Vries
2022-07-02 11:07     ` Tom de Vries
2022-07-04 18:51       ` Tom Tromey
2022-07-04 19:43         ` Tom de Vries
2022-07-04 19:53           ` Tom Tromey
2022-06-29 15:29 ` [PATCH 3/5] [gdb/symtab] Work around fsanitize=address false positive for per_cu->lang Tom de Vries
2022-06-29 17:38   ` Pedro Alves
2022-06-29 18:25     ` Pedro Alves
2022-06-29 18:28       ` Pedro Alves
2022-07-04  7:04         ` [PATCH 3/5] [gdb/symtab] Work around fsanitize=address false positive for per_ cu->lang Tom de Vries
2022-07-04 18:32   ` [PATCH 3/5] [gdb/symtab] Work around fsanitize=address false positive for per_cu->lang Tom Tromey
2022-07-04 19:45     ` Tom de Vries
2022-07-06 19:20       ` Pedro Alves [this message]
2022-07-07 10:18         ` [PATCH] Introduce struct packed template, fix -fsanitize=thread for per_cu fields Tom de Vries
2022-07-07 15:26           ` Pedro Alves
2022-07-08 14:54             ` Tom de Vries
2022-07-12 10:22               ` Tom de Vries
2022-06-29 15:29 ` [PATCH 4/5] [gdb/symtab] Work around fsanitize=address false positive for per_cu->unit_type Tom de Vries
2022-06-29 15:29 ` [PATCH 5/5] [gdb/symtab] Fix data race on per_cu->lang Tom de Vries
2022-07-04 18:30   ` Tom Tromey
2022-07-05  8:17     ` Tom de Vries
2022-07-05 15:19     ` Tom de Vries
2022-07-06 15:42       ` Tom de Vries

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bbf37493-9bc4-c88f-bddc-4b22bd1d1382@palves.net \
    --to=pedro@palves.net \
    --cc=gdb-patches@sourceware.org \
    --cc=tdevries@suse.de \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).