public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: Giuliano Procida <gprocida@google.com>
To: David Marchand <david.marchand@redhat.com>
Cc: libabigail@sourceware.org, Dodji Seketeli <dodji@seketeli.org>,
	kernel-team@android.com, maennich@google.com
Subject: Re: [PATCH 1/1] DWARF reader: fix bitfield offset calculations
Date: Sat, 10 Jul 2021 17:51:38 +0100	[thread overview]
Message-ID: <CAGvU0H=X9DP44JF-atNVYQw552Zqfvi_PRn1AoE6VHr3dKGDxQ@mail.gmail.com> (raw)
In-Reply-To: <CAJFAV8zHjBuBbvoqWjd6-S5ryqaGj9yhJPOD8A+qPSL7mexprg@mail.gmail.com>

Hi.

On Sat, 10 Jul 2021 at 10:18, David Marchand <david.marchand@redhat.com> wrote:
>
> Hello,
>
> On Sat, Jul 10, 2021 at 1:17 AM Giuliano Procida <gprocida@google.com> wrote:
> >
> > PR28060 - Invalid offset for bitfields
> >
> > Bitfield and other member offsets can be specified in DWARF using:
> >
> > - DW_AT_bit_offset, or
> > - DW_AT_data_member_location and optionally DW_AT_data_bit_offset.
>
> I understood the standard (and the code below) differently:
>
> - DW_AT_*data*_bit_offset,
> - DW_AT_data_member_location associated with an optional
> DW_AT_bit_offset (DW_AT_bit_offset maintained for compat with Dwarf <=
> 4),
>

Sorry, I just typo'd the commit message.

> Maybe this is a ENOTENOUGHCOFFEE from me.
>
>
> >
> > The code would only use the value DW_AT_data_member_location if there
> > was no DW_AT_data_bit_offset. This commit fixes this and adjusts
> > documentation and affected tests.

Same typo.

> >
> >         * src/abg-dwarf-reader.cc (read_and_convert_DW_at_bit_offset):
> >         Update documentation.
> >         (die_member_offset): Treat DW_AT_data_bit_offset as an
> >         optional adjustment to DW_AT_data_member_location.

And again.

> >         * tests/data/test-annotate/test13-pr18894.so.abi: Update.
> >         * tests/data/test-annotate/test15-pr18892.so.abi: Update.
> >         * tests/data/test-annotate/test17-pr19027.so.abi: Update.
> >         * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
> >         Update.
> >         * tests/data/test-annotate/test21-pr19092.so.abi: Update.
> >         * tests/data/test-diff-dwarf-abixml/PR25409-librte_bus_dpaa.so.20.0.abi:
> >         Regenerate.
> >         * tests/data/test-diff-pkg/libcdio-0.94-1.fc26.x86_64--libcdio-0.94-2.fc26.x86_64-report.1.txt:
> >         Report now empty.
> >         * tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: Update.
> >         * tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi:
> >         Update.
> >         * tests/data/test-read-dwarf/test13-pr18894.so.abi: Update.
> >         * tests/data/test-read-dwarf/test15-pr18892.so.abi: Update.
> >         * tests/data/test-read-dwarf/test17-pr19027.so.abi: Update.
> >         * tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
> >         Update.
> >         * tests/data/test-read-dwarf/test21-pr19092.so.abi: Update.
> >         * tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
> >         Update.
> >
> > Signed-off-by: Giuliano Procida <gprocida@google.com>
> > ---
> >  src/abg-dwarf-reader.cc                       |   60 +-
> >  .../data/test-annotate/test13-pr18894.so.abi  |   54 +-
> >  .../data/test-annotate/test15-pr18892.so.abi  |    4 +-
> >  .../data/test-annotate/test17-pr19027.so.abi  |   12 +-
> >  ...19-pr19023-libtcmalloc_and_profiler.so.abi |    8 +-
> >  .../data/test-annotate/test21-pr19092.so.abi  |   44 +-
> >  .../PR25409-librte_bus_dpaa.so.20.0.abi       | 5258 +++++++++--------
> >  ...4--libcdio-0.94-2.fc26.x86_64-report.1.txt |   42 -
> >  .../data/test-read-dwarf/PR25007-sdhci.ko.abi |  168 +-
> >  .../PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi |   24 +-
> >  .../test-read-dwarf/test13-pr18894.so.abi     |   54 +-
> >  .../test-read-dwarf/test15-pr18892.so.abi     |    4 +-
> >  .../test-read-dwarf/test17-pr19027.so.abi     |   12 +-
> >  ...19-pr19023-libtcmalloc_and_profiler.so.abi |    8 +-
> >  .../test-read-dwarf/test21-pr19092.so.abi     |   44 +-
> >  .../test22-pr19097-libstdc++.so.6.0.17.so.abi |   14 +-
> >  16 files changed, 2886 insertions(+), 2924 deletions(-)
> >
> > diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
> > index c8cd5170..0f016b66 100644
> > --- a/src/abg-dwarf-reader.cc
> > +++ b/src/abg-dwarf-reader.cc
> > @@ -8434,7 +8434,8 @@ eval_last_constant_dwarf_sub_expr(Dwarf_Op*       expr,
> >  // -----------------------------------
> >
> >  /// Convert the value of the DW_AT_bit_offset attribute into the value
> > -/// of the DW_AT_data_bit_offset attribute.
> > +/// of the DW_AT_data_bit_offset attribute, ignoring the byte offset
> > +/// contribution of the DW_AT_data_member_location attribute.
> >  ///
> >  /// On big endian machines, the value of the DW_AT_bit_offset
> >  /// attribute is the same as the value of the DW_AT_data_bit_offset
> > @@ -8605,8 +8606,9 @@ eval_last_constant_dwarf_sub_expr(Dwarf_Op*       expr,
> >  ///
> >  /// @param offset this is the output parameter into which the value of
> >  /// the DW_AT_bit_offset is put, converted as if it was the value of
> > -/// the DW_AT_data_bit_offset parameter.  This parameter is set iff
> > -/// the function returns true.
> > +/// the DW_AT_data_bit_offset parameter, less the contribution of
> > +/// DW_AT_data_member_location.  This parameter is set iff the
>
> Nit: *if
>
> > +/// function returns true.
> >  ///
> >  /// @return true if DW_AT_bit_offset was found on @p die.
> >  static bool
> > @@ -8657,10 +8659,11 @@ read_and_convert_DW_at_bit_offset(const Dwarf_Die* die,
> >  /// DW_AT_data_member_location is not necessarily a constant that one
> >  /// would just read and be done with it.  Rather, it can be a DWARF
> >  /// expression that one has to interpret.  In general, the offset can
> > -/// be given by the DW_AT_bit_offset or DW_AT_data_bit_offset
> > -/// attribute.  In that case the offset is a constant.  But it can
> > -/// also be given by the DW_AT_data_member_location attribute.  In
> > -/// that case it's a DWARF location expression.
> > +/// be given by the DW_AT_bit_offset or by the
> > +/// DW_AT_data_member_location attribute and optionally the
> > +/// DW_AT_data_bit_offset attribute.  The bit offset attributes are
>
> The "x or y and optional z" is not that clear without paren.
> To enhance readability, I would rephrase as:
> """
> In general, the offset can be given by the DW_AT_data_bit_offset attribute
> or by the DW_AT_data_member_location attribute associated with an optional
> DW_AT_bit_offset attribute.
> """
>
>
> > +/// always simple constants, but the DW_AT_data_member_location
> > +/// attribute is a DWARF location expression.
> >  ///
> >  /// When the it's the DW_AT_data_member_location that is present,
> >  /// there are three cases to possibly take into account:
> > @@ -8704,39 +8707,24 @@ die_member_offset(const read_context& ctxt,
> >  {
> >    Dwarf_Op* expr = NULL;
> >    uint64_t expr_len = 0;
> > -  uint64_t off = 0;
> > +  uint64_t bit_offset = 0;
> >
> >    // First let's see if the DW_AT_data_bit_offset attribute is
> >    // present.
> > -  if (die_unsigned_constant_attribute(die, DW_AT_data_bit_offset, off))
> > -    {
> > -      offset = off;
> > -      return true;
> > -    }
> > -
> > -  // Otherwise, let's see if the DW_AT_bit_offset attribute is
> > -  // present.  On little endian machines, we need to convert this
> > -  // attribute into what it would have been if the
> > -  // DW_AT_data_bit_offset was used instead.  In other words,
> > -  // DW_AT_bit_offset needs to be converted into a
> > -  // human-understandable form that represents the offset of the
> > -  // bitfield data member it describes.  For details about the
> > -  // conversion, please read the extensive comments of
> > -  // read_and_convert_DW_at_bit_offset.
> > -  bool is_big_endian = architecture_is_big_endian(ctxt.elf_handle());
> > -  if (read_and_convert_DW_at_bit_offset(die, is_big_endian, off))
> > +  if (die_unsigned_constant_attribute(die, DW_AT_data_bit_offset, bit_offset))
> >      {
> > -      offset = off;
> > +      offset = bit_offset;
> >        return true;
> >      }
> >
> > +  // Otherwise, let's see if the DW_AT_data_member_location attribute and,
> > +  // optionally, the DW_AT_bit_offset attributes are present.
> >    if (!die_location_expr(die, DW_AT_data_member_location, &expr, &expr_len))
> >      return false;
> >
> > -  // Otherwise, the DW_AT_data_member_location attribute is present.
> > -  // In that case, let's evaluate it and get its constant
> > +  // The DW_AT_data_member_location attribute is present.
> > +  // Let's evaluate it and get its constant
> >    // sub-expression and return that one.
> > -
> >    if (!eval_quickly(expr, expr_len, offset))
> >      {
> >        bool is_tls_address = false;
> > @@ -8745,8 +8733,20 @@ die_member_offset(const read_context& ctxt,
> >                                              ctxt.dwarf_expr_eval_ctxt()))
> >         return false;
> >      }
> > -
> >    offset *= 8;
> > +
> > +  // On little endian machines, we need to convert the DW_AT_bit_offset
> > +  // attribute into what it would have been if the DW_AT_data_bit_offset were
> > +  // used instead.
>
> Again, maybe my lack of coffee, correct me if I am wrong.
>
> I understood DW_AT_data_bit_offset specify an "absolute" offset while
> DW_AT_bit_offset specify an offset "relative" to
> DW_AT_data_member_location.
> If this is the case, I find this comment misleading and I would remove it.
> Else, well, disregard this comment :-).
>

Agreed, I should do more to clean up the comments.

>
> > +  //
> > +  // In other words, DW_AT_bit_offset needs to be converted into a
> > +  // human-understandable form that represents the offset of the bitfield data
> > +  // member it describes.  For details about the conversion, please read the
> > +  // extensive comments of read_and_convert_DW_at_bit_offset.
> > +  bool is_big_endian = architecture_is_big_endian(ctxt.elf_handle());
> > +  if (read_and_convert_DW_at_bit_offset(die, is_big_endian, bit_offset))
> > +    offset += bit_offset;
> > +
> >    return true;
> >  }
> >
>
>
> - I have tested this patch against DPDK.
> I have dwarf4 blobs.
> Before the patch:
>           type of 'rte_eth_dev_data* data' changed:
>             in pointed to type 'struct rte_eth_dev_data' at
> rte_ethdev_core.h:131:1:
>               type size hasn't changed
>               1 data member insertion:
>                 'uint8_t dev_configured', at offset 5 (in bits) at
> rte_ethdev_core.h:171:1
> After the patch:
>           type of 'rte_eth_dev_data* data' changed:
>             in pointed to type 'struct rte_eth_dev_data' at
> rte_ethdev_core.h:131:1:
>               type size hasn't changed
>               1 data member insertion:
>                 'uint8_t dev_configured', at offset 34005 (in bits) at
> rte_ethdev_core.h:171:1
>
> Which seems all good on this side.
>

Great!

>
> But now, I face an issue trying to waive this.
> The addition of the "dev_configured" field in this bitfield is right
> after a 1 bit wide "lro" field and before a "rx_queue_state" field.
> https://git.dpdk.org/dpdk/tree/lib/ethdev/rte_ethdev_core.h?id=f0b97fc4fe4cd16dd492cf30ece4296451e10e10#n165
>
> The rule below does not work:
> [suppress_type]
>         name = rte_eth_dev_data
>         has_data_member_inserted_between = {offset_after(lro),
> offset_of(rx_queue_state)}
>
> While on the other hand, this one works:
> [suppress_type]
>         name = rte_eth_dev_data
>         has_data_member_inserted_between = {34005, offset_of(rx_queue_state)}
>
> Not sure what the issue is, but supposing the offsets are now correct,
> then my bet is on the "lro" field size (here, 1 bit).
>

I'd open a separate bug for that.

>
> Back to my reproducer, that I updated:
>
> struct bigstruct {
>     char name[128];
>     uint8_t bitfield0:1
> #ifndef BEFORE
>             ,bitfield1:1
> #endif
>     ;
>     uint8_t other;
> };
>
> I wrote a suppression rule:
> [suppress_type]
>     name = bigstruct
>     has_data_member_inserted_between = {offset_after(bitfield0),
> offset_of(other)}
>
> And then I tried to debug, but I am lost in libabigail internals.
> I attached a gdb to catch all get_size_in_bits() calls, I see either
> 1024 (sizeof name), 8 (sizeof other?), 0 (this is suspicious) or 64
> (?).
>

I would advise the following: separate ABI extraction (abidw) from ABI
comparison (abidiff), if you haven't already.

You can at least carefully check the XML before comparing XML files.

While there have been bugs in XML sometimes not meaning the same thing
when read back, they are much less likely than bugs in the ABI
extraction or ABI comparison logic.

Regards,
Giuliano.

>
> --
> David Marchand
>

  reply	other threads:[~2021-07-10 16:52 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-09 23:16 [PATCH 0/1] Fix Bug 28060 - Invalid offset for bitfields Giuliano Procida
2021-07-09 23:16 ` [PATCH 1/1] DWARF reader: fix bitfield offset calculations Giuliano Procida
2021-07-10  9:18   ` David Marchand
2021-07-10 16:51     ` Giuliano Procida [this message]
2021-07-10 19:15       ` David Marchand
2021-07-10 16:52   ` [PATCH v2] " Giuliano Procida
2021-07-15 16:21     ` David Marchand
2021-07-19 11:59     ` [PATCH v2, applied] " Dodji Seketeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGvU0H=X9DP44JF-atNVYQw552Zqfvi_PRn1AoE6VHr3dKGDxQ@mail.gmail.com' \
    --to=gprocida@google.com \
    --cc=david.marchand@redhat.com \
    --cc=dodji@seketeli.org \
    --cc=kernel-team@android.com \
    --cc=libabigail@sourceware.org \
    --cc=maennich@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).