public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <richard.guenther@gmail.com>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: David Faust <david.faust@oracle.com>,
	gcc-patches@gcc.gnu.org, yhs@meta.com,
	 Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [PATCH 0/9] Add btf_decl_tag C attribute
Date: Wed, 12 Jul 2023 15:21:14 +0200
Message-ID: <CAFiYyc3+iWJWLBCq5sTpQpAfm+VJKOfS=66sN5ZtGYMd5jLKQQ@mail.gmail.com>
In-Reply-To: <87y1jlz4e4.fsf@oracle.com>

On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
>
> [Added Eduard Zingerman in CC, who is implementing this same feature in
>  clang/llvm and also the consumer component in the kernel (pahole).]
>
> Hi Richard.
>
> > On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> Hello,
> >>
> >> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> >> The same attribute is already supported in clang, and is used by various
> >> components of the BPF ecosystem.
> >>
> > >> The purpose of the attribute is to associate (to "tag")
> >> declarations with arbitrary string annotations, which are emitted into
> >> debugging information (DWARF and/or BTF) to facilitate post-compilation
> >> analysis (the motivating use case being the Linux kernel BPF verifier).
> >> Multiple tags are allowed on the same declaration.
> >>
> >> These strings are not interpreted by the compiler, and the attribute
> >> itself has no effect on generated code, other than to produce additional
> >> DWARF DIEs and/or BTF records conveying the annotations.
> >>
> >> This entails:
> >>
> > >> - A new C-language-level attribute which associates (to "tag")
> >>   particular declarations with arbitrary strings.
> >>
> >> - The conveyance of that information in DWARF in the form of a new DIE,
> >>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
> >>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
> >>   the same purpose. These DIEs are already supported by BPF tooling,
> >>   such as pahole.
> >>
> >> - The conveyance of that information in BTF debug info in the form of
> >>   BTF_KIND_DECL_TAG records. These records are already supported by
> >>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
> >>   eBPF verifier.
> >>
> >>
> >> Background
> >> ==========
> >>
> >> The purpose of these tags is to convey additional semantic information
> >> to post-compilation consumers, in particular the Linux kernel eBPF
> >> verifier. The verifier can make use of that information while analyzing
> >> a BPF program to aid in determining whether to allow or reject the
> >> program to be run. More background on these tags can be found in the
> >> early support for them in the kernel here [1] and [2].
> >>
> >> The "btf_decl_tag" attribute is half the story; the other half is a
> >> sibling attribute "btf_type_tag" which serves the same purpose but
> >> applies to types. Support for btf_type_tag will come in a separate
> > >> patch series, since it is impacted by GCC bug 110439 which needs to be
> >> addressed first.
> >>
> >> I submitted an initial version of this work (including btf_type_tag)
> >> last spring [3], however at the time there were some open questions
> >> about the behavior of the btf_type_tag attribute and issues with its
> >> implementation. Since then we have clarified these details and agreed
> >> to solutions with the BPF community and LLVM BPF folks.
> >>
> >> The main motivation for emitting the tags in DWARF is that the Linux
> >> kernel generates its BTF information via pahole, using DWARF as a source:
> >>
> >>     +--------+  BTF                  BTF   +----------+
> >>     | pahole |-------> vmlinux.btf ------->| verifier |
> >>     +--------+                             +----------+
> >>         ^                                        ^
> >>         |                                        |
> >>   DWARF |                                    BTF |
> >>         |                                        |
> >>       vmlinux                              +-------------+
> >>       module1.ko                           | BPF program |
> >>       module2.ko                           +-------------+
> >>         ...
> >>
> >> This is because:
> >>
> >> a)  pahole adds additional kernel-specific information into the
> >>     produced BTF based on additional analysis of kernel objects.
> >>
> >> b)  Unlike GCC, LLVM will only generate BTF for BPF programs.
> >>
> > >> c)  GCC can generate BTF for whatever target with -gbtf, but there is no
> >>     support for linking/deduplicating BTF in the linker.
> >>
> >> In the scenario above, the verifier needs access to the pointer tags of
> >> both the kernel types/declarations (conveyed in the DWARF and translated
> >> to BTF by pahole) and those of the BPF program (available directly in BTF).
> >>
> >>
> >> DWARF Representation
> >> ====================
> >>
> >> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> >> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> >> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
> >> the following format:
> >>
> >>   DW_TAG_GNU_annotation (0x6000)
> >>     DW_AT_name: "btf_decl_tag"
> >>     DW_AT_const_value: <string argument>
> >>
> >> These DIEs are placed in the DWARF tree as children of the DIE for the
> >> appropriate declaration, and one such DIE is created for each occurrence
> >> of the btf_decl_tag attribute on a declaration.
> >>
> >> For example:
> >>
> >>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
> >>
> >> This declaration produces the following DWARF:
> >>
> >>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
> >>     <1f>   DW_AT_name        : c
> >>     <24>   DW_AT_type        : <0x49>
> >>     ...
> >>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
> >>     <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
> >>     <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
> >>  <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
> >>     <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
> >>     <44>   DW_AT_const_value : __c
> >>  <2><48>: Abbrev Number: 0
> >>  <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
> >>  ...
> >>
> >> The DIEs for btf_decl_tag are placed as children of the DIE for
> >> variable "c".
> >
> > It looks like a bit of overkill, and inefficient as well.  Why aren't
> > the tags referenced via the existing DW_AT_description?
>
> The DWARF spec ("Entity Descriptions") seems to imply that the
> DW_AT_description attribute is intended to be used to hold alternative
> ways to denote the same "debugging information" (object, type, ...),
> i.e. alternative aliases to refer to the same entity as the
> DW_AT_name.  For example, for a type name='foo' we could have
> description='aka. long int'.  We don't think this is the case of the btf
> tags, which are more like properties partially characterizing the tagged
> "debugging information", but couldn't be used as an alias to the name.
>
> Also, repurposing the DW_AT_description attribute to hold btf tag
> information would require introducing a mini-language and subsequent
> parsing by the clients: how to denote several tags, how to encode the
> embedded string contents, etc.  You kick the complexity out the door and
> it comes back in through the window :)
>
> Finally, for all we know, the existing attribute may already be used by
> some language and handled by some debugger the way it is recommended in
> the spec.  That would be incompatible with having btf tags encoded
> there.
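
To make the mini-language concern concrete, here is a minimal sketch of
what a consumer would need if several tags were packed into a single
DW_AT_description string.  The encoding itself (comma-separated
btf_decl_tag ("...") items, with \" and \\ as escapes) is an assumption
invented for illustration, as are the names parse_decl_tags and
TAG_MAX_LEN; none of this is an existing GCC, LLVM or pahole interface.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TAG_MAX_LEN 64  /* hypothetical limit, including the NUL */

/* Skip leading white space.  */
static const char *
skip_ws (const char *p)
{
  while (isspace ((unsigned char) *p))
    p++;
  return p;
}

/* Parse a hypothetical description string of the form
     btf_decl_tag ("a"), btf_decl_tag ("b")
   into TAGS.  Returns the number of tags found, or -1 if the string
   is malformed, a tag is too long, or there are more than MAX_TAGS.  */
static int
parse_decl_tags (const char *desc, char tags[][TAG_MAX_LEN], int max_tags)
{
  static const char keyword[] = "btf_decl_tag";
  const size_t klen = sizeof keyword - 1;
  const char *p;
  int n = 0;

  assert (desc != NULL && tags != NULL);
  p = skip_ws (desc);
  if (*p == '\0')
    return 0;

  for (;;)
    {
      size_t len = 0;

      if (n == max_tags || strncmp (p, keyword, klen) != 0)
        return -1;
      p = skip_ws (p + klen);
      if (*p++ != '(')
        return -1;
      p = skip_ws (p);
      if (*p++ != '"')
        return -1;

      /* Copy the string body, resolving \" and \\ escapes.  */
      while (*p != '"')
        {
          if (*p == '\\' && (p[1] == '"' || p[1] == '\\'))
            p++;
          if (*p == '\0' || len + 1 >= TAG_MAX_LEN)
            return -1;
          tags[n][len++] = *p++;
        }
      tags[n][len] = '\0';

      p = skip_ws (p + 1);  /* step over the closing quote */
      if (*p++ != ')')
        return -1;
      n++;

      p = skip_ws (p);
      if (*p == '\0')
        return n;
      if (*p++ != ',')
        return -1;
      p = skip_ws (p);
    }
}
```

A description used the way the spec suggests, such as "aka. long int",
is rejected by this parser, which is one way of seeing the compatibility
problem: every consumer would have to tell the two uses apart.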

How are the C/C++ standard attributes proposed to be encoded in DWARF?
I think adding a special encoding just for BTF tags looks wrong.

> > Iff you want new TAGs why require them as children for each DIE rather
> > than referencing (and sharing!) them via a DIE reference from a new
> > attribute?
>
> Hmm, that's a very good question.  The Linux kernel sources use both
> declaration tags and type tags and not sharing the DIEs may result in
> serious bloating, since the tags are brought in to declarations and type
> specifiers via macros...
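
The sharing idea can be sketched as a small intern table: each distinct
tag string gets one entry, and declarations refer to it by index, which
here stands in for a DIE reference from a new attribute.  This is only
a toy model under that assumption; annotation_pool, intern_annotation
and MAX_ANNOTATIONS are hypothetical names and not part of dwarf2out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ANNOTATIONS 32  /* hypothetical fixed capacity */

/* Toy stand-in for a set of shared annotation DIEs.  The pool stores
   the caller's pointers, so the strings must outlive the pool.  */
struct annotation_pool
{
  const char *values[MAX_ANNOTATIONS];
  int count;
};

/* Return the index of the shared entry for VALUE, creating it on
   first use.  Returns -1 if the pool is full.  */
static int
intern_annotation (struct annotation_pool *pool, const char *value)
{
  assert (pool != NULL && value != NULL);

  for (int i = 0; i < pool->count; i++)
    if (strcmp (pool->values[i], value) == 0)
      return i;

  if (pool->count == MAX_ANNOTATIONS)
    return -1;

  pool->values[pool->count] = value;
  return pool->count++;
}
```

With tags applied through macros, many declarations carry the same
string, so the number of entries grows with the number of distinct tags
rather than with the number of tagged declarations.
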
>
> > That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
> >
> > But well ...
> >
> > Richard.
> >
> >>
> >> BTF Representation
> >> ==================
> >>
> >> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
> >> to the annotated object by BTF type ID, as well as a component index which is
> >> used for btf_decl_tags placed on struct/union members or function arguments.
> >>
> >> For example, the BTF for the above declaration is:
> >>
> >>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> >>   [2] CONST '(anon)' type_id=1
> >>   [3] PTR '(anon)' type_id=2
> >>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
> >>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
> >>   [6] VAR 'c' type_id=3, linkage=global
> >>   ...
> >>
> >> The BTF format is documented here [4].
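
For readers unfamiliar with the record layout, here is a minimal sketch
of reading one BTF_KIND_DECL_TAG record.  The sketch_ structures mirror
my reading of the layout described in [4] (a common type header followed
by a single component_idx word, kind value 17); they are local copies
for illustration, not the kernel header itself, and read_decl_tag is a
hypothetical helper.  It assumes the record is in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BTF_KIND_DECL_TAG 17
#define SKETCH_BTF_INFO_KIND(info) (((info) >> 24) & 0x1f)

/* Common header shared by all BTF type records.  */
struct sketch_btf_type
{
  uint32_t name_off;  /* string table offset of the tag string */
  uint32_t info;      /* kind lives in bits 24-28 */
  uint32_t type;      /* for DECL_TAG: type ID of the tagged object */
};

/* Trailing data specific to DECL_TAG records.  */
struct sketch_btf_decl_tag
{
  int32_t component_idx;  /* -1: the declaration itself; otherwise a
                             member or argument index */
};

/* REC points at LEN bytes holding a type header followed by its
   trailing data.  Returns 1 and fills the outputs if the record is a
   DECL_TAG, 0 otherwise.  */
static int
read_decl_tag (const unsigned char *rec, size_t len,
               uint32_t *target_id, int32_t *component_idx)
{
  struct sketch_btf_type hdr;
  struct sketch_btf_decl_tag tag;

  assert (rec != NULL && target_id != NULL && component_idx != NULL);
  if (len < sizeof hdr + sizeof tag)
    return 0;

  /* memcpy avoids alignment assumptions about REC.  */
  memcpy (&hdr, rec, sizeof hdr);
  if (SKETCH_BTF_INFO_KIND (hdr.info) != SKETCH_BTF_KIND_DECL_TAG)
    return 0;

  memcpy (&tag, rec + sizeof hdr, sizeof tag);
  *target_id = hdr.type;
  *component_idx = tag.component_idx;
  return 1;
}
```

For record [4] in the dump above, this would yield a target type ID of
6 (the VAR 'c') and a component index of -1.
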
> >>
> >>
> >> References
> >> ==========
> >>
> >> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
> >> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/
> >> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
> >> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
> >>
> >>
> >> David Faust (9):
> >>   c-family: add btf_decl_tag attribute
> >>   include: add BTF decl tag defines
> >>   dwarf: create annotation DIEs for decl tags
> >>   dwarf: expose get_die_parent
> >>   ctf: add support to pass through BTF tags
> >>   dwarf2ctf: convert annotation DIEs to CTF types
> >>   btf: create and output BTF_KIND_DECL_TAG types
> >>   testsuite: add tests for BTF decl tags
> >>   doc: document btf_decl_tag attribute
> >>
> >>  gcc/btfout.cc                                 | 81 ++++++++++++++++++-
> >>  gcc/c-family/c-attribs.cc                     | 23 ++++++
> >>  gcc/ctf-int.h                                 | 28 +++++++
> >>  gcc/ctfc.cc                                   | 10 ++-
> >>  gcc/ctfc.h                                    | 17 +++-
> >>  gcc/doc/extend.texi                           | 47 +++++++++++
> >>  gcc/dwarf2ctf.cc                              | 73 ++++++++++++++++-
> >>  gcc/dwarf2out.cc                              | 37 ++++++++-
> >>  gcc/dwarf2out.h                               |  1 +
> >>  .../gcc.dg/debug/btf/btf-decltag-func.c       | 21 +++++
> >>  .../gcc.dg/debug/btf/btf-decltag-sou.c        | 33 ++++++++
> >>  .../gcc.dg/debug/btf/btf-decltag-var.c        | 19 +++++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-1.c   |  9 +++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-2.c   | 18 +++++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-3.c   | 17 ++++
> >>  include/btf.h                                 | 14 +++-
> >>  include/dwarf2.def                            |  4 +
> >>  17 files changed, 437 insertions(+), 15 deletions(-)
> >>  create mode 100644 gcc/ctf-int.h
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
> >>
> >> --
> >> 2.40.1
> >>
