public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/9] Add debug_annotate attributes
@ 2022-06-07 21:43 David Faust
  2022-06-07 21:43 ` [PATCH 1/9] dwarf: add dw_get_die_parent function David Faust
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

Hello,

This patch series adds support for:

- Two new C-language-level attributes that allow associating ("annotating" or
  "tagging") particular declarations and types with arbitrary strings. As
  explained below, this is intended to be used to, for example, characterize
  certain pointer types.

- The conveyance of that information in the DWARF output in the form of a new
  DIE: DW_TAG_GNU_annotation.

- The conveyance of that information in the BTF output in the form of two new
  kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.

All of these facilities are being added to the eBPF ecosystem, and support for
them exists in some form in LLVM.

Purpose
=======

1)  Addition of C-family language constructs (attributes) to specify free-text
    tags on certain language elements, such as struct fields.

    The purpose of these annotations is to provide additional information about
    types, variables, and function parameters of interest to the kernel. A
    driving use case is to tag pointer types within the Linux kernel and eBPF
    programs with additional semantic information, such as '__user' or '__rcu'.

    For example, consider the Linux kernel function do_execve with the
    following declaration:

      static int do_execve(struct filename *filename,
         const char __user *const __user *__argv,
         const char __user *const __user *__envp);

    Here, __user could be defined with these annotations to record semantic
    information about the pointer parameters (e.g., they are user-provided) in
    DWARF and BTF information. Other kernel facilities such as the eBPF verifier
    can read the tags and make use of the information.
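
    As a rough sketch of how such a definition might look (it is not part of
    this series), the fragment below defines __user in terms of the proposed
    debug_annotate_type attribute when the compiler reports support for it,
    and as nothing otherwise. The tag string "user" and the helper
    count_strings are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical definition of __user: use the attribute proposed in this
   series when the compiler reports support for it, else expand to nothing.
   The tag string "user" is an assumption for illustration.  */
#if defined(__has_attribute)
# if __has_attribute(debug_annotate_type)
#  define __user __attribute__((debug_annotate_type("user")))
# endif
#endif
#ifndef __user
# define __user
#endif

/* Count the entries of a NULL-terminated vector declared in the same
   style as the __argv parameter of do_execve.  The annotation changes
   only the debug info, so the body is ordinary C.  */
size_t
count_strings (const char __user *const __user *vec)
{
  size_t n = 0;

  assert (vec != NULL);
  while (vec[n] != NULL)
    n++;
  return n;
}
```

    Because the macro falls back to an empty expansion, the same source
    builds with compilers that do not know the attribute; only the emitted
    debug information differs.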

2)  Conveying the tags in the generated DWARF debug info.

    The main motivation for emitting the tags in DWARF is that the Linux kernel
    generates its BTF information via pahole, using DWARF as a source:

        +--------+  BTF                  BTF   +----------+
        | pahole |-------> vmlinux.btf ------->| verifier |
        +--------+                             +----------+
            ^                                        ^
            |                                        |
      DWARF |                                    BTF |
            |                                        |
         vmlinux                              +-------------+
         module1.ko                           | BPF program |
         module2.ko                           +-------------+
           ...

    This is because:

    a)  Unlike GCC, LLVM will only generate BTF for BPF programs.

    b)  GCC can generate BTF for whatever target with -gbtf, but there is no
        support for linking/deduplicating BTF in the linker.

    In the scenario above, the verifier needs access to the pointer tags of
    both the kernel types/declarations (conveyed in the DWARF and translated
    to BTF by pahole) and those of the BPF program (available directly in BTF).

    Another motivation for having the tag information in DWARF, unrelated to
    BPF and BTF, is that the drgn project (another DWARF consumer) also wants
    to benefit from these tags in order to differentiate between different
    kinds of pointers in the kernel.

3)  Conveying the tags in the generated BTF debug info.

    This is easy: the main purpose of having this info in BTF is for the
    compiled eBPF programs. The kernel verifier can then access the tags
    of pointers used by the eBPF programs.


For more information about these tags and the motivation behind them, please
refer to the following Linux kernel discussions:

  https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
  https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
  https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/


Implementation Overview
=======================

To enable these annotations, two new C language attributes are added:
__attribute__((debug_annotate_decl("foo"))) and
__attribute__((debug_annotate_type("bar"))). Both attributes accept a single
arbitrary string constant argument, which will be recorded in the generated
DWARF and/or BTF debug information. They have no effect on code generation.

Note that we are not using the same attribute names as LLVM (btf_decl_tag and
btf_type_tag, respectively). While these attributes are functionally very
similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
in the attribute name seems misleading.

DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
declarations and types will be checked for the corresponding attributes. If
present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
the annotated type or declaration, one for each tag. These DIEs link the
arbitrary tag value to the item they annotate.

For example, the following variable declaration:

  #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))

  #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
  #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))

  int * __typetag1 x __decltag1 __decltag2;

Produces the following DWARF information:

 <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
    <1f>   DW_AT_name        : x
    <21>   DW_AT_decl_file   : 1
    <22>   DW_AT_decl_line   : 7
    <23>   DW_AT_decl_column : 18
    <24>   DW_AT_type        : <0x49>
    <28>   DW_AT_external    : 1
    <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
    <32>   DW_AT_sibling     : <0x49>
 <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
    <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
    <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
 <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
    <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
    <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
 <2><48>: Abbrev Number: 0
 <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
    <4a>   DW_AT_byte_size   : 8
    <4b>   DW_AT_type        : <0x5d>
    <4f>   DW_AT_sibling     : <0x5d>
 <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
    <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
    <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
 <2><5c>: Abbrev Number: 0
 <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
    <5e>   DW_AT_byte_size   : 4
    <5f>   DW_AT_encoding    : 5	(signed)
    <60>   DW_AT_name        : int
 <1><64>: Abbrev Number: 0

In the case of BTF, the annotations are recorded in two type kinds recently
added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
The above example declaration produces the following BTF information:

[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[2] PTR '(anon)' type_id=3
[3] TYPE_TAG 'typetag1' type_id=1
[4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
[5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
[6] VAR 'x' type_id=2, linkage=global
[7] DATASEC '.bss' size=0 vlen=1
	type_id=6 offset=0 size=8 (VAR 'x')
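
To make the links between these records concrete, here is a minimal sketch of
how a consumer might follow them. It models the dump above as a plain C table
(the DATASEC record is omitted) and is not how GCC, pahole, or the kernel
store BTF; the names struct rec, pointee_type_tag, and decl_tags_of are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum kind { K_VOID, K_INT, K_PTR, K_TYPE_TAG, K_DECL_TAG, K_VAR };

struct rec
{
  enum kind kind;
  const char *name;
  unsigned type_id;   /* referenced BTF type ID, 0 if none */
};

/* Array indices match the type IDs printed in the dump; ID 0 is void.  */
static const struct rec table[] = {
  { K_VOID,     "void",     0 },
  { K_INT,      "int",      0 },
  { K_PTR,      "(anon)",   3 },
  { K_TYPE_TAG, "typetag1", 1 },
  { K_DECL_TAG, "decltag1", 6 },
  { K_DECL_TAG, "decltag2", 6 },
  { K_VAR,      "x",        2 },
};

#define N_RECS (sizeof table / sizeof table[0])

/* Follow VAR -> PTR -> TYPE_TAG and return the tag name, or NULL if the
   variable is not a pointer to a tagged type.  */
const char *
pointee_type_tag (unsigned var_id)
{
  if (var_id == 0 || var_id >= N_RECS || table[var_id].kind != K_VAR)
    return NULL;
  const struct rec *ptr = &table[table[var_id].type_id];
  if (ptr->kind != K_PTR)
    return NULL;
  const struct rec *tag = &table[ptr->type_id];
  return tag->kind == K_TYPE_TAG ? tag->name : NULL;
}

/* Collect the names of all DECL_TAG records that refer to ID.  */
size_t
decl_tags_of (unsigned id, const char **out, size_t max)
{
  size_t n = 0;

  for (size_t i = 1; i < N_RECS && n < max; i++)
    if (table[i].kind == K_DECL_TAG && table[i].type_id == id)
      out[n++] = table[i].name;
  return n;
}
```

Note the asymmetry the dump shows: a TYPE_TAG sits inside the type chain
between the pointer and its pointee, while DECL_TAG records stand apart and
point back at the declaration they annotate.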


David Faust (9):
  dwarf: add dw_get_die_parent function
  include: Add new definitions
  c-family: Add debug_annotate attribute handlers
  dwarf: generate annotation DIEs
  ctfc: pass through debug annotations to BTF
  dwarf2ctf: convert annotation DIEs to CTF types
  btf: output decl_tag and type_tag records
  doc: document new attributes
  testsuite: add debug annotation tests

 gcc/btfout.cc                                 |  28 +++++
 gcc/c-family/c-attribs.cc                     |  43 +++++++
 gcc/ctf-int.h                                 |  29 +++++
 gcc/ctfc.cc                                   |  11 +-
 gcc/ctfc.h                                    |  17 ++-
 gcc/doc/extend.texi                           | 106 ++++++++++++++++
 gcc/dwarf2ctf.cc                              | 114 +++++++++++++++++-
 gcc/dwarf2out.cc                              | 102 ++++++++++++++++
 gcc/dwarf2out.h                               |   1 +
 .../gcc.dg/debug/btf/btf-decltag-func.c       |  18 +++
 .../gcc.dg/debug/btf/btf-decltag-sou.c        |  34 ++++++
 .../gcc.dg/debug/btf/btf-decltag-typedef.c    |  15 +++
 .../gcc.dg/debug/btf/btf-typetag-1.c          |  20 +++
 .../gcc.dg/debug/dwarf2/annotation-1.c        |  20 +++
 .../gcc.dg/debug/dwarf2/annotation-2.c        |  17 +++
 .../gcc.dg/debug/dwarf2/annotation-3.c        |  20 +++
 .../gcc.dg/debug/dwarf2/annotation-4.c        |  34 ++++++
 include/btf.h                                 |  17 ++-
 include/dwarf2.def                            |   4 +
 19 files changed, 639 insertions(+), 11 deletions(-)
 create mode 100644 gcc/ctf-int.h
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-4.c

-- 
2.36.1


* [PATCH 0/9] Add btf_decl_tag C attribute
@ 2023-07-11 21:57 David Faust
  2023-07-11 21:57 ` [PATCH 6/9] dwarf2ctf: convert annotation DIEs to CTF types David Faust
  0 siblings, 1 reply; 26+ messages in thread
From: David Faust @ 2023-07-11 21:57 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

Hello,

This series adds support for a new attribute, "btf_decl_tag" in GCC.
The same attribute is already supported in clang, and is used by various
components of the BPF ecosystem.

The purpose of the attribute is to associate ("tag") declarations with
arbitrary string annotations, which are emitted into debugging
information (DWARF and/or BTF) to facilitate post-compilation analysis
(the motivating use case being the Linux kernel BPF verifier).
Multiple tags are allowed on the same declaration.

These strings are not interpreted by the compiler, and the attribute
itself has no effect on generated code, other than to produce additional
DWARF DIEs and/or BTF records conveying the annotations.

This entails:

- A new C-language-level attribute which allows associating ("tagging")
  particular declarations with arbitrary strings.

- The conveyance of that information in DWARF in the form of a new DIE,
  DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
  that of the DW_TAG_LLVM_annotation extension supported in LLVM for
  the same purpose. These DIEs are already supported by BPF tooling,
  such as pahole.

- The conveyance of that information in BTF debug info in the form of
  BTF_KIND_DECL_TAG records. These records are already supported by
  LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
  eBPF verifier.


Background
==========

The purpose of these tags is to convey additional semantic information
to post-compilation consumers, in particular the Linux kernel eBPF
verifier. The verifier can make use of that information while analyzing
a BPF program to aid in determining whether to allow the program to run
or to reject it. More background on these tags can be found in the
early kernel support for them, [1] and [2].

The "btf_decl_tag" attribute is half the story; the other half is a
sibling attribute "btf_type_tag" which serves the same purpose but
applies to types. Support for btf_type_tag will come in a separate
patch series, since it is impacted by GCC bug 110439 which needs to be
addressed first.

I submitted an initial version of this work (including btf_type_tag)
last spring [3]; however, at the time there were some open questions
about the behavior of the btf_type_tag attribute and issues with its
implementation. Since then we have clarified these details and agreed
to solutions with the BPF community and LLVM BPF folks.

The main motivation for emitting the tags in DWARF is that the Linux
kernel generates its BTF information via pahole, using DWARF as a source:

    +--------+  BTF                  BTF   +----------+
    | pahole |-------> vmlinux.btf ------->| verifier |
    +--------+                             +----------+
        ^                                        ^
        |                                        |
  DWARF |                                    BTF |
        |                                        |
      vmlinux                              +-------------+
      module1.ko                           | BPF program |
      module2.ko                           +-------------+
        ...

This is because:

a)  pahole adds additional kernel-specific information into the
    produced BTF based on additional analysis of kernel objects.

b)  Unlike GCC, LLVM will only generate BTF for BPF programs.

c)  GCC can generate BTF for whatever target with -gbtf, but there is no
    support for linking/deduplicating BTF in the linker.

In the scenario above, the verifier needs access to the pointer tags of
both the kernel types/declarations (conveyed in the DWARF and translated
to BTF by pahole) and those of the BPF program (available directly in BTF).


DWARF Representation
====================

As noted above, btf_decl_tag is represented in DWARF via a new DIE
DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
the following format:

  DW_TAG_GNU_annotation (0x6000)
    DW_AT_name: "btf_decl_tag"
    DW_AT_const_value: <string argument>

These DIEs are placed in the DWARF tree as children of the DIE for the
appropriate declaration, and one such DIE is created for each occurrence
of the btf_decl_tag attribute on a declaration.

For example:

  const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));

This declaration produces the following DWARF:

 <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
    <1f>   DW_AT_name        : c
    <24>   DW_AT_type        : <0x49>
    ...
 <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
    <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
    <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
 <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
    <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
    <44>   DW_AT_const_value : __c
 <2><48>: Abbrev Number: 0
 <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
 ...

The DIEs for btf_decl_tag are placed as children of the DIE for
variable "c".
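
As an illustration of how a DWARF consumer might use this layout, the sketch
below models a DIE as a small C struct and gathers the tag strings from the
annotation children of a declaration. The struct die type and
collect_decl_tags are hypothetical and unrelated to GCC's internal DIE
representation or to any DWARF library; only the tag value 0x6000 and the
attribute names come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DW_TAG_variable        0x34
#define DW_TAG_GNU_annotation  0x6000  /* value given in this cover letter */

/* Simplified, hypothetical model of a DIE.  */
struct die
{
  unsigned tag;
  const char *name;          /* DW_AT_name, NULL if absent */
  const char *const_value;   /* DW_AT_const_value, NULL if absent */
  const struct die *child;   /* first child */
  const struct die *sibling; /* next sibling */
};

/* Store in OUT the string of every btf_decl_tag annotation that is a
   direct child of DECL, in DIE order, and return how many were found
   (at most MAX).  */
size_t
collect_decl_tags (const struct die *decl, const char **out, size_t max)
{
  size_t n = 0;

  assert (decl != NULL);
  for (const struct die *c = decl->child; c != NULL && n < max; c = c->sibling)
    if (c->tag == DW_TAG_GNU_annotation
        && c->name != NULL
        && strcmp (c->name, "btf_decl_tag") == 0)
      out[n++] = c->const_value;
  return n;
}
```

Only direct children are examined, which matches the placement described
above: each annotation DIE is a child of the declaration it tags.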

BTF Representation
==================

In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
to the annotated object by BTF type ID, as well as a component index which is
used for btf_decl_tags placed on struct/union members or function arguments.

For example, the BTF for the above declaration is:

  [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
  [2] CONST '(anon)' type_id=1
  [3] PTR '(anon)' type_id=2
  [4] DECL_TAG '__c' type_id=6 component_idx=-1
  [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
  [6] VAR 'c' type_id=3, linkage=global
  ...

The BTF format is documented here [4].
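
For readers unfamiliar with the binary layout, the following sketch builds
one BTF_KIND_DECL_TAG record in memory. The structures are local, simplified
mirrors of the ones described in the kernel BTF documentation (the real
struct btf_type has a size/type union where this sketch has a plain type
field), and struct decl_tag_rec, make_decl_tag, and btf_kind are hypothetical
helper names. A real producer or consumer would use the definitions from the
kernel UAPI header or from include/btf.h instead.

```c
#include <assert.h>
#include <stdint.h>

#define BTF_KIND_DECL_TAG 17

struct btf_type
{
  uint32_t name_off;  /* offset of the tag string in the BTF string table */
  uint32_t info;      /* bits 0-15 vlen, bits 24-28 kind, bit 31 kind_flag */
  uint32_t type;      /* BTF type ID of the annotated declaration */
};

struct btf_decl_tag
{
  /* -1 when the tag applies to the declaration itself, otherwise the
     0-based index of the tagged struct/union member or function argument.  */
  int32_t component_idx;
};

/* A DECL_TAG record is a btf_type immediately followed by a btf_decl_tag.  */
struct decl_tag_rec
{
  struct btf_type t;
  struct btf_decl_tag d;
};

struct decl_tag_rec
make_decl_tag (uint32_t name_off, uint32_t target_id, int32_t component_idx)
{
  struct decl_tag_rec r;

  assert (component_idx >= -1);
  r.t.name_off = name_off;
  r.t.info = (uint32_t) BTF_KIND_DECL_TAG << 24;  /* vlen 0, kind_flag 0 */
  r.t.type = target_id;
  r.d.component_idx = component_idx;
  return r;
}

/* Extract the kind field from an info word.  */
unsigned
btf_kind (uint32_t info)
{
  return (info >> 24) & 0x1f;
}
```

Under these assumptions, record [4] in the example above would correspond to
make_decl_tag (off, 6, -1), where off is the string table offset of "__c"; a
tag on the second member of a struct would instead carry component_idx 1.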


References
==========

[1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
[2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/
[3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
[4] https://www.kernel.org/doc/Documentation/bpf/btf.rst


David Faust (9):
  c-family: add btf_decl_tag attribute
  include: add BTF decl tag defines
  dwarf: create annotation DIEs for decl tags
  dwarf: expose get_die_parent
  ctf: add support to pass through BTF tags
  dwarf2ctf: convert annotation DIEs to CTF types
  btf: create and output BTF_KIND_DECL_TAG types
  testsuite: add tests for BTF decl tags
  doc: document btf_decl_tag attribute

 gcc/btfout.cc                                 | 81 ++++++++++++++++++-
 gcc/c-family/c-attribs.cc                     | 23 ++++++
 gcc/ctf-int.h                                 | 28 +++++++
 gcc/ctfc.cc                                   | 10 ++-
 gcc/ctfc.h                                    | 17 +++-
 gcc/doc/extend.texi                           | 47 +++++++++++
 gcc/dwarf2ctf.cc                              | 73 ++++++++++++++++-
 gcc/dwarf2out.cc                              | 37 ++++++++-
 gcc/dwarf2out.h                               |  1 +
 .../gcc.dg/debug/btf/btf-decltag-func.c       | 21 +++++
 .../gcc.dg/debug/btf/btf-decltag-sou.c        | 33 ++++++++
 .../gcc.dg/debug/btf/btf-decltag-var.c        | 19 +++++
 .../gcc.dg/debug/dwarf2/annotation-decl-1.c   |  9 +++
 .../gcc.dg/debug/dwarf2/annotation-decl-2.c   | 18 +++++
 .../gcc.dg/debug/dwarf2/annotation-decl-3.c   | 17 ++++
 include/btf.h                                 | 14 +++-
 include/dwarf2.def                            |  4 +
 17 files changed, 437 insertions(+), 15 deletions(-)
 create mode 100644 gcc/ctf-int.h
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c

-- 
2.40.1


Thread overview: 26+ messages
2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
2022-06-07 21:43 ` [PATCH 1/9] dwarf: add dw_get_die_parent function David Faust
2022-06-13 10:13   ` Richard Biener
2022-06-07 21:43 ` [PATCH 2/9] include: Add new definitions David Faust
2022-06-07 21:43 ` [PATCH 3/9] c-family: Add debug_annotate attribute handlers David Faust
2022-06-07 21:43 ` [PATCH 4/9] dwarf: generate annotation DIEs David Faust
2022-06-07 21:43 ` [PATCH 5/9] ctfc: pass through debug annotations to BTF David Faust
2022-06-07 21:43 ` [PATCH 6/9] dwarf2ctf: convert annotation DIEs to CTF types David Faust
2022-06-07 21:43 ` [PATCH 7/9] btf: output decl_tag and type_tag records David Faust
2022-06-07 21:43 ` [PATCH 8/9] doc: document new attributes David Faust
2022-06-07 21:43 ` [PATCH 9/9] testsuite: add debug annotation tests David Faust
2022-06-15  5:53 ` [PATCH 0/9] Add debug_annotate attributes Yonghong Song
2022-06-15 20:57   ` David Faust
2022-06-15 22:56     ` Yonghong Song
2022-06-17 17:18       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: " Jose E. Marchesi
2022-06-20 17:06         ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-06-21 16:12           ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-06-24 18:01             ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-07-07 20:24               ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-07-13  4:23                 ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-07-14 15:09                   ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-07-15  1:20                     ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-07-15 14:17                       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-07-15 16:48                         ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-11-01 22:29       ` Yonghong Song
2023-07-11 21:57 [PATCH 0/9] Add btf_decl_tag C attribute David Faust
2023-07-11 21:57 ` [PATCH 6/9] dwarf2ctf: convert annotation DIEs to CTF types David Faust
