From: Richard Biener <richard.guenther@gmail.com>
To: David Faust <david.faust@oracle.com>
Cc: gcc-patches@gcc.gnu.org, jose.marchesi@oracle.com, yhs@meta.com
Subject: Re: [PATCH 0/9] Add btf_decl_tag C attribute
Date: Wed, 12 Jul 2023 09:38:22 +0200
Message-ID: <CAFiYyc2gp-HUdc5ZQRGr0ATiOF3AzpeC2+Sy=chFe744qN-DSg@mail.gmail.com>
In-Reply-To: <20230711215716.12980-1-david.faust@oracle.com>
On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hello,
>
> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> The same attribute is already supported in clang, and is used by various
> components of the BPF ecosystem.
>
> The purpose of the attribute is to associate (to "tag")
> declarations with arbitrary string annotations, which are emitted into
> debugging information (DWARF and/or BTF) to facilitate post-compilation
> analysis (the motivating use case being the Linux kernel BPF verifier).
> Multiple tags are allowed on the same declaration.
>
> These strings are not interpreted by the compiler, and the attribute
> itself has no effect on generated code, other than to produce additional
> DWARF DIEs and/or BTF records conveying the annotations.
>
> This entails:
>
> - A new C-language-level attribute which associates (to "tag")
> particular declarations with arbitrary strings.
>
> - The conveyance of that information in DWARF in the form of a new DIE,
> DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
> that of the DW_TAG_LLVM_annotation extension supported in LLVM for
> the same purpose. These DIEs are already supported by BPF tooling,
> such as pahole.
>
> - The conveyance of that information in BTF debug info in the form of
> BTF_KIND_DECL_TAG records. These records are already supported by
> LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
> eBPF verifier.
>
>
> Background
> ==========
>
> The purpose of these tags is to convey additional semantic information
> to post-compilation consumers, in particular the Linux kernel eBPF
> verifier. The verifier can use that information while analyzing a BPF
> program to help decide whether to allow the program to run or reject
> it. More background on these tags can be found in the early kernel
> support for them, [1] and [2].
>
> The "btf_decl_tag" attribute is half the story; the other half is a
> sibling attribute "btf_type_tag" which serves the same purpose but
> applies to types. Support for btf_type_tag will come in a separate
> patch series, since it is impacted by GCC bug 110439 which needs to be
> addressed first.
>
> I submitted an initial version of this work (including btf_type_tag)
> last spring [3], however at the time there were some open questions
> about the behavior of the btf_type_tag attribute and issues with its
> implementation. Since then we have clarified these details and agreed
> to solutions with the BPF community and LLVM BPF folks.
>
> The main motivation for emitting the tags in DWARF is that the Linux
> kernel generates its BTF information via pahole, using DWARF as a source:
>
>   +--------+  BTF                  BTF   +----------+
>   | pahole |-------> vmlinux.btf ------->| verifier |
>   +--------+                             +----------+
>       ^                                       ^
>       |                                       |
> DWARF |                                   BTF |
>       |                                       |
>    vmlinux                             +-------------+
>    module1.ko                          | BPF program |
>    module2.ko                          +-------------+
>      ...
>
> This is because:
>
> a) pahole adds additional kernel-specific information into the
> produced BTF based on additional analysis of kernel objects.
>
> b) Unlike GCC, LLVM will only generate BTF for BPF programs.
>
> c) GCC can generate BTF for whatever target with -gbtf, but there is no
> support for linking/deduplicating BTF in the linker.
>
> In the scenario above, the verifier needs access to the pointer tags of
> both the kernel types/declarations (conveyed in the DWARF and translated
> to BTF by pahole) and those of the BPF program (available directly in BTF).
>
>
> DWARF Representation
> ====================
>
> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
> the following format:
>
> DW_TAG_GNU_annotation (0x6000)
> DW_AT_name: "btf_decl_tag"
> DW_AT_const_value: <string argument>
>
> These DIEs are placed in the DWARF tree as children of the DIE for the
> appropriate declaration, and one such DIE is created for each occurrence
> of the btf_decl_tag attribute on a declaration.
>
> For example:
>
> const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
>
> This declaration produces the following DWARF:
>
> <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
> <1f> DW_AT_name : c
> <24> DW_AT_type : <0x49>
> ...
> <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
> <37> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag
> <3b> DW_AT_const_value : (indirect string, offset: 0): devicemem
> <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
> <40> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag
> <44> DW_AT_const_value : __c
> <2><48>: Abbrev Number: 0
> <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
> ...
>
> The DIEs for btf_decl_tag are placed as children of the DIE for
> variable "c".
This looks like a bit of overkill, and inefficient as well. Why aren't
the tags referenced via the existing DW_AT_description? If you want new
TAGs, why require them as children of each DIE rather than referencing
(and sharing!) them via a DIE reference from a new attribute?

That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
But well ...

Richard.
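
To make the size concern concrete, here is a toy model (a sketch only, not GCC code and not part of the patch). It counts how many annotation DIEs each layout would need for a set of tagged declarations: one child DIE per tag occurrence, as in the series, versus one DIE per distinct tag string, as in the shared-reference idea. The struct and function names are hypothetical, and the model ignores the per-occurrence reference attribute that a shared layout would still have to emit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TAGS 4  /* arbitrary limit for this toy model */

/* Hypothetical model of a declaration and its btf_decl_tag strings. */
struct tagged_decl {
    const char *name;
    const char *tags[MAX_TAGS];
    size_t n_tags;
};

/* Layout in the series: one child annotation DIE per tag occurrence. */
size_t count_child_dies(const struct tagged_decl *decls, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        assert(decls[i].n_tags <= MAX_TAGS);
        total += decls[i].n_tags;
    }
    return total;
}

/* Return 1 if the tag at (i, j) already occurs at an earlier position. */
static int seen_before(const struct tagged_decl *decls, size_t i, size_t j)
{
    const char *tag = decls[i].tags[j];
    for (size_t p = 0; p <= i; p++) {
        size_t end = (p == i) ? j : decls[p].n_tags;
        for (size_t q = 0; q < end; q++)
            if (strcmp(decls[p].tags[q], tag) == 0)
                return 1;
    }
    return 0;
}

/* Shared layout: one annotation DIE per distinct tag string. */
size_t count_shared_dies(const struct tagged_decl *decls, size_t n)
{
    size_t unique = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < decls[i].n_tags; j++)
            if (!seen_before(decls, i, j))
                unique++;
    return unique;
}
```

For three declarations tagged {devicemem, __c}, {devicemem} and {devicemem, __c}, count_child_dies() returns 5 and count_shared_dies() returns 2. The gap grows with the number of declarations that reuse the same few tag strings.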
>
> BTF Representation
> ==================
>
> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
> to the annotated object by BTF type ID, as well as a component index which is
> used for btf_decl_tags placed on struct/union members or function arguments.
>
> For example, the BTF for the above declaration is:
>
> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> [2] CONST '(anon)' type_id=1
> [3] PTR '(anon)' type_id=2
> [4] DECL_TAG '__c' type_id=6 component_idx=-1
> [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
> [6] VAR 'c' type_id=3, linkage=global
> ...
>
> The BTF format is documented here [4].
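
As an illustration of the component index described above, the following sketch tags a struct member, a function argument and a function. The DECL_TAG macro, the struct and the function are hypothetical names made up for this example. The component_idx values in the comments are what the BTF documentation [4] prescribes for such placements; they are not output captured from a compiler. The macro expands to nothing on compilers that do not know the attribute, so the code builds either way.

```c
#include <assert.h>
#include <stddef.h>

/* Use the attribute only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(btf_decl_tag)
#  define DECL_TAG(s) __attribute__((btf_decl_tag(s)))
# endif
#endif
#ifndef DECL_TAG
# define DECL_TAG(s)
#endif

struct packet {
    int len;                          /* member 0, untagged */
    void *data DECL_TAG("devicemem"); /* member 1: DECL_TAG refers to the
                                         struct, component_idx=1 */
};

/* Tag on the function itself: component_idx=-1.
   Tag on the second argument: component_idx=1. */
DECL_TAG("entry")
int packet_fits(const struct packet *p, int limit DECL_TAG("bound"))
{
    assert(p != NULL);
    return p->len <= limit;
}
```

The tags do not change the generated code, so packet_fits() behaves the same with or without them; only the emitted DWARF and/or BTF differs.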
>
>
> References
> ==========
>
> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/
> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
>
>
> David Faust (9):
> c-family: add btf_decl_tag attribute
> include: add BTF decl tag defines
> dwarf: create annotation DIEs for decl tags
> dwarf: expose get_die_parent
> ctf: add support to pass through BTF tags
> dwarf2ctf: convert annotation DIEs to CTF types
> btf: create and output BTF_KIND_DECL_TAG types
> testsuite: add tests for BTF decl tags
> doc: document btf_decl_tag attribute
>
> gcc/btfout.cc | 81 ++++++++++++++++++-
> gcc/c-family/c-attribs.cc | 23 ++++++
> gcc/ctf-int.h | 28 +++++++
> gcc/ctfc.cc | 10 ++-
> gcc/ctfc.h | 17 +++-
> gcc/doc/extend.texi | 47 +++++++++++
> gcc/dwarf2ctf.cc | 73 ++++++++++++++++-
> gcc/dwarf2out.cc | 37 ++++++++-
> gcc/dwarf2out.h | 1 +
> .../gcc.dg/debug/btf/btf-decltag-func.c | 21 +++++
> .../gcc.dg/debug/btf/btf-decltag-sou.c | 33 ++++++++
> .../gcc.dg/debug/btf/btf-decltag-var.c | 19 +++++
> .../gcc.dg/debug/dwarf2/annotation-decl-1.c | 9 +++
> .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++
> .../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++
> include/btf.h | 14 +++-
> include/dwarf2.def | 4 +
> 17 files changed, 437 insertions(+), 15 deletions(-)
> create mode 100644 gcc/ctf-int.h
> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
>
> --
> 2.40.1
>