public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/9] Add debug_annotate attributes
@ 2022-06-07 21:43 David Faust
  2022-06-07 21:43 ` [PATCH 1/9] dwarf: add dw_get_die_parent function David Faust
                   ` (9 more replies)
  0 siblings, 10 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

Hello,

This patch series adds support for:

- Two new C-language-level attributes that associate particular declarations
  and types with arbitrary strings (that is, "annotate" or "tag" them). As
  explained below, this is intended to be used to, for example, characterize
  certain pointer types.

- The conveyance of that information in the DWARF output in the form of a new
  DIE: DW_TAG_GNU_annotation.

- The conveyance of that information in the BTF output in the form of two new
  kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.

All of these facilities are being added to the eBPF ecosystem, and support for
them exists in some form in LLVM.

Purpose
=======

1)  Addition of C-family language constructs (attributes) to specify free-text
    tags on certain language elements, such as struct fields.

    The purpose of these annotations is to provide additional information about
    types, variables, and function parameters of interest to the kernel. A
    driving use case is to tag pointer types within the Linux kernel and eBPF
    programs with additional semantic information, such as '__user' or '__rcu'.

    For example, consider the Linux kernel function do_execve with the
    following declaration:

      static int do_execve(struct filename *filename,
         const char __user *const __user *__argv,
         const char __user *const __user *__envp);

    Here, __user could be defined with these annotations to record semantic
    information about the pointer parameters (e.g., they are user-provided) in
    DWARF and BTF information. Other kernel facilities such as the eBPF verifier
    can read the tags and make use of the information.
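
    As a rough sketch of how such a definition might look (this is not taken
    from the kernel sources), the macro below expands to the type attribute
    proposed in this series when the compiler reports support for it, and to
    nothing otherwise. The tag string "user" and the helper
    count_user_strings are hypothetical and exist only for illustration.

```c
#include <stddef.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Assumption: the attribute name is the one proposed in this series.  On
   compilers that do not provide it, the macro expands to nothing.  */
#if __has_attribute (debug_annotate_type)
# define __user __attribute__ ((debug_annotate_type ("user")))
#else
# define __user
#endif

/* Hypothetical helper: count the entries of a NULL-terminated string
   vector.  The tags do not change code generation; they only describe
   the pointers in the debug info.  */
size_t
count_user_strings (const char __user *const __user *vec)
{
  size_t n = 0;

  while (vec != NULL && vec[n] != NULL)
    n++;
  return n;
}
```

    Because the attribute has no effect on code generation, the function
    behaves the same whether or not the macro expands to anything.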

2)  Conveying the tags in the generated DWARF debug info.

    The main motivation for emitting the tags in DWARF is that the Linux kernel
    generates its BTF information via pahole, using DWARF as a source:

        +--------+  BTF                  BTF   +----------+
        | pahole |-------> vmlinux.btf ------->| verifier |
        +--------+                             +----------+
            ^                                        ^
            |                                        |
      DWARF |                                    BTF |
            |                                        |
         vmlinux                              +-------------+
         module1.ko                           | BPF program |
         module2.ko                           +-------------+
           ...

    This is because:

    a)  Unlike GCC, LLVM will only generate BTF for BPF programs.

    b)  GCC can generate BTF for any target with -gbtf, but there is no
        support for linking/deduplicating BTF in the linker.

    In the scenario above, the verifier needs access to the pointer tags of
    both the kernel types/declarations (conveyed in the DWARF and translated
    to BTF by pahole) and those of the BPF program (available directly in BTF).

    Another motivation for having the tag information in DWARF, unrelated to
    BPF and BTF, is that the drgn project (another DWARF consumer) also wants
    to benefit from these tags in order to differentiate between different
    kinds of pointers in the kernel.

3)  Conveying the tags in the generated BTF debug info.

    This is easy: the main purpose of having this info in BTF is for the
    compiled eBPF programs. The kernel verifier can then access the tags
    of pointers used by the eBPF programs.


For more information about these tags and the motivation behind them, please
refer to the following Linux kernel discussions:

  https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
  https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
  https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/


Implementation Overview
=======================

To enable these annotations, two new C language attributes are added:
__attribute__((debug_annotate_decl("foo"))) and
__attribute__((debug_annotate_type("bar"))). Both attributes accept a single
arbitrary string constant argument, which will be recorded in the generated
DWARF and/or BTF debug information. They have no effect on code generation.

Note that we are not using the same attribute names as LLVM (btf_decl_tag and
btf_type_tag, respectively). While these attributes are functionally very
similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
in the attribute name seems misleading.

DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
declarations and types will be checked for the corresponding attributes. If
present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
the annotated type or declaration, one for each tag. These DIEs link the
arbitrary tag value to the item they annotate.

For example, the following variable declaration:

  #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))

  #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
  #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))

  int * __typetag1 x __decltag1 __decltag2;

produces the following DWARF information:

 <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
    <1f>   DW_AT_name        : x
    <21>   DW_AT_decl_file   : 1
    <22>   DW_AT_decl_line   : 7
    <23>   DW_AT_decl_column : 18
    <24>   DW_AT_type        : <0x49>
    <28>   DW_AT_external    : 1
    <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
    <32>   DW_AT_sibling     : <0x49>
 <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
    <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
    <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
 <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
    <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
    <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
 <2><48>: Abbrev Number: 0
 <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
    <4a>   DW_AT_byte_size   : 8
    <4b>   DW_AT_type        : <0x5d>
    <4f>   DW_AT_sibling     : <0x5d>
 <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
    <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
    <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
 <2><5c>: Abbrev Number: 0
 <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
    <5e>   DW_AT_byte_size   : 4
    <5f>   DW_AT_encoding    : 5	(signed)
    <60>   DW_AT_name        : int
 <1><64>: Abbrev Number: 0
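
To illustrate how a DWARF consumer might read these DIEs, here is a minimal
sketch over a toy in-memory tree. struct toy_die and toy_collect_annotations
are hypothetical stand-ins, not libdw or GCC types; only the tag value 0x6000
and the use of DW_AT_name and DW_AT_const_value come from the dump above.

```c
#include <stddef.h>
#include <string.h>

/* Tag value taken from the dump above (User TAG value: 0x6000).  */
#define TOY_TAG_GNU_ANNOTATION 0x6000

/* Toy stand-in for a parsed DIE; not a libdw or GCC type.  */
struct toy_die
{
  unsigned tag;
  const char *name;               /* DW_AT_name */
  const char *const_value;        /* DW_AT_const_value, or NULL */
  const struct toy_die *child;    /* first child, or NULL */
  const struct toy_die *sibling;  /* next sibling, or NULL */
};

/* Store up to MAX values of the annotations of kind KIND (for example
   "debug_annotate_decl") that are direct children of DIE.  Return the
   number of matching annotations found.  */
size_t
toy_collect_annotations (const struct toy_die *die, const char *kind,
                         const char **out, size_t max)
{
  size_t n = 0;

  for (const struct toy_die *c = die->child; c != NULL; c = c->sibling)
    if (c->tag == TOY_TAG_GNU_ANNOTATION && strcmp (c->name, kind) == 0)
      {
        if (n < max)
          out[n] = c->const_value;
        n++;
      }
  return n;
}
```

Applied to a tree shaped like the DW_TAG_variable DIE for x above, a lookup
for "debug_annotate_decl" would yield decltag2 and decltag1, in DIE order.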

In the case of BTF, the annotations are recorded in two type kinds recently
added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
The above example declaration produces the following BTF information:

[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[2] PTR '(anon)' type_id=3
[3] TYPE_TAG 'typetag1' type_id=1
[4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
[5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
[6] VAR 'x' type_id=2, linkage=global
[7] DATASEC '.bss' size=0 vlen=1
	type_id=6 offset=0 size=8 (VAR 'x')
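
In the listing above the type tag sits between the pointer and its pointee
(PTR -> TYPE_TAG -> INT). A consumer could recover the tags with a walk like
the one sketched below. The table layout, enum toy_kind and
toy_pointer_type_tags are hypothetical and far simpler than real BTF
parsing; only the shape of the chain is taken from the listing.

```c
#include <stddef.h>

enum toy_kind { TOY_VOID, TOY_INT, TOY_PTR, TOY_TYPE_TAG };

/* Toy BTF record; the array index is the type ID, and ID 0 is void.  */
struct toy_btf_type
{
  enum toy_kind kind;
  const char *name;
  unsigned type_id;   /* referenced type ID, 0 if none */
};

/* Starting at the pointer PTR_ID, follow the chain of TYPE_TAG records
   that precede the pointee.  Store up to MAX tag names in OUT and return
   the number of tags found.  */
size_t
toy_pointer_type_tags (const struct toy_btf_type *types, size_t ntypes,
                       unsigned ptr_id, const char **out, size_t max)
{
  size_t n = 0;
  unsigned id;

  if (ptr_id == 0 || ptr_id >= ntypes || types[ptr_id].kind != TOY_PTR)
    return 0;

  id = types[ptr_id].type_id;
  /* The bound on N guards against cycles in malformed input.  */
  while (id != 0 && id < ntypes && n < ntypes
         && types[id].kind == TOY_TYPE_TAG)
    {
      if (n < max)
        out[n] = types[id].name;
      n++;
      id = types[id].type_id;
    }
  return n;
}
```

For a table mirroring records [1] to [3] above, asking for the tags of
pointer ID 2 would return the single name typetag1.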


David Faust (9):
  dwarf: add dw_get_die_parent function
  include: Add new definitions
  c-family: Add debug_annotate attribute handlers
  dwarf: generate annotation DIEs
  ctfc: pass through debug annotations to BTF
  dwarf2ctf: convert annotation DIEs to CTF types
  btf: output decl_tag and type_tag records
  doc: document new attributes
  testsuite: add debug annotation tests

 gcc/btfout.cc                                 |  28 +++++
 gcc/c-family/c-attribs.cc                     |  43 +++++++
 gcc/ctf-int.h                                 |  29 +++++
 gcc/ctfc.cc                                   |  11 +-
 gcc/ctfc.h                                    |  17 ++-
 gcc/doc/extend.texi                           | 106 ++++++++++++++++
 gcc/dwarf2ctf.cc                              | 114 +++++++++++++++++-
 gcc/dwarf2out.cc                              | 102 ++++++++++++++++
 gcc/dwarf2out.h                               |   1 +
 .../gcc.dg/debug/btf/btf-decltag-func.c       |  18 +++
 .../gcc.dg/debug/btf/btf-decltag-sou.c        |  34 ++++++
 .../gcc.dg/debug/btf/btf-decltag-typedef.c    |  15 +++
 .../gcc.dg/debug/btf/btf-typetag-1.c          |  20 +++
 .../gcc.dg/debug/dwarf2/annotation-1.c        |  20 +++
 .../gcc.dg/debug/dwarf2/annotation-2.c        |  17 +++
 .../gcc.dg/debug/dwarf2/annotation-3.c        |  20 +++
 .../gcc.dg/debug/dwarf2/annotation-4.c        |  34 ++++++
 include/btf.h                                 |  17 ++-
 include/dwarf2.def                            |   4 +
 19 files changed, 639 insertions(+), 11 deletions(-)
 create mode 100644 gcc/ctf-int.h
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-4.c

-- 
2.36.1



* [PATCH 1/9] dwarf: add dw_get_die_parent function
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-13 10:13   ` Richard Biener
  2022-06-07 21:43 ` [PATCH 2/9] include: Add new definitions David Faust
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

gcc/

	* dwarf2out.cc (dw_get_die_parent): New function.
	* dwarf2out.h (dw_get_die_parent): Declare it here.
---
 gcc/dwarf2out.cc | 8 ++++++++
 gcc/dwarf2out.h  | 1 +
 2 files changed, 9 insertions(+)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 29f32ec6939..9c61026bb34 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -5235,6 +5235,14 @@ dw_get_die_sib (dw_die_ref die)
   return die->die_sib;
 }
 
+/* Return a reference to the parent of a given DIE.  */
+
+dw_die_ref
+dw_get_die_parent (dw_die_ref die)
+{
+  return die->die_parent;
+}
+
 /* Add an address constant attribute value to a DIE.  When using
    dwarf_split_debug_info, address attributes in dies destined for the
    final executable should be direct references--setting the parameter
diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h
index 656ef94afde..e6962fb4848 100644
--- a/gcc/dwarf2out.h
+++ b/gcc/dwarf2out.h
@@ -455,6 +455,7 @@ extern dw_die_ref lookup_type_die (tree);
 
 extern dw_die_ref dw_get_die_child (dw_die_ref);
 extern dw_die_ref dw_get_die_sib (dw_die_ref);
+extern dw_die_ref dw_get_die_parent (dw_die_ref);
 extern enum dwarf_tag dw_get_die_tag (dw_die_ref);
 
 /* Data about a single source file.  */
-- 
2.36.1



* [PATCH 2/9] include: Add new definitions
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
  2022-06-07 21:43 ` [PATCH 1/9] dwarf: add dw_get_die_parent function David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-07 21:43 ` [PATCH 3/9] c-family: Add debug_annotate attribute handlers David Faust
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

include/

	* btf.h: Add BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG defines. Update
	comments.
	(struct btf_decl_tag): New.
	* dwarf2.def: Add new DWARF extension DW_TAG_GNU_annotation.
---
 include/btf.h      | 17 +++++++++++++++--
 include/dwarf2.def |  4 ++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/btf.h b/include/btf.h
index 78b551ced23..37deaef8b48 100644
--- a/include/btf.h
+++ b/include/btf.h
@@ -69,7 +69,7 @@ struct btf_type
 
   /* SIZE is used by INT, ENUM, STRUCT, UNION, DATASEC kinds.
      TYPE is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT, FUNC,
-     FUNC_PROTO and VAR kinds.  */
+     FUNC_PROTO, VAR and DECL_TAG kinds.  */
   union
   {
     uint32_t size;	/* Size of the entire type, in bytes.  */
@@ -109,7 +109,9 @@ struct btf_type
 #define BTF_KIND_VAR		14	/* Variable.  */
 #define BTF_KIND_DATASEC	15	/* Section such as .bss or .data.  */
 #define BTF_KIND_FLOAT		16	/* Floating point.  */
-#define BTF_KIND_MAX		BTF_KIND_FLOAT
+#define BTF_KIND_DECL_TAG	17	/* Decl Tag.  */
+#define BTF_KIND_TYPE_TAG	18	/* Type Tag.  */
+#define BTF_KIND_MAX		BTF_KIND_TYPE_TAG
 #define NR_BTF_KINDS		(BTF_KIND_MAX + 1)
 
 /* For some BTF_KINDs, struct btf_type is immediately followed by
@@ -190,6 +192,17 @@ struct btf_var_secinfo
   uint32_t size;	/* Size (in bytes) of variable.  */
 };
 
+/* BTF_KIND_DECL_TAG is followed by a single struct btf_decl_tag, which
+   describes the tag location:
+   - If component_idx == -1, then the tag is applied to a struct, union,
+     variable or function.
+   - Otherwise it is applied to a struct/union member or function argument
+     with the given index numbered 0..vlen-1.  */
+struct btf_decl_tag
+{
+  int32_t component_idx;
+};
+
 #ifdef	__cplusplus
 }
 #endif
diff --git a/include/dwarf2.def b/include/dwarf2.def
index 530c6f849f9..a1f7a47a036 100644
--- a/include/dwarf2.def
+++ b/include/dwarf2.def
@@ -174,6 +174,10 @@ DW_TAG (DW_TAG_GNU_formal_parameter_pack, 0x4108)
    are properly part of DWARF 5.  */
 DW_TAG (DW_TAG_GNU_call_site, 0x4109)
 DW_TAG (DW_TAG_GNU_call_site_parameter, 0x410a)
+
+/* Extension for BTF annotations.  */
+DW_TAG (DW_TAG_GNU_annotation, 0x6000)
+
 /* Extensions for UPC.  See: http://dwarfstd.org/doc/DWARF4.pdf.  */
 DW_TAG (DW_TAG_upc_shared_type, 0x8765)
 DW_TAG (DW_TAG_upc_strict_type, 0x8766)
-- 
2.36.1
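
For readers unfamiliar with the component_idx convention described in the
new btf.h comment, the sketch below shows one way a consumer might interpret
it. The struct mirrors the one added by this patch; toy_describe_decl_tag
and its output strings are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Mirrors the struct added to include/btf.h by this patch.  */
struct btf_decl_tag
{
  int32_t component_idx;
};

/* Hypothetical helper: describe what a decl tag applies to.  Writes the
   description into BUF (of size LEN) and returns BUF.  */
const char *
toy_describe_decl_tag (const struct btf_decl_tag *tag, char *buf, size_t len)
{
  if (tag->component_idx == -1)
    /* The tag applies to the struct, union, variable or function itself.  */
    snprintf (buf, len, "whole declaration");
  else
    /* The tag applies to a member or argument, numbered from 0.  */
    snprintf (buf, len, "component %d", (int) tag->component_idx);
  return buf;
}
```

A tag with component_idx == 1 on a struct would therefore refer to its
second member, while -1 refers to the struct as a whole.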



* [PATCH 3/9] c-family: Add debug_annotate attribute handlers
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
  2022-06-07 21:43 ` [PATCH 1/9] dwarf: add dw_get_die_parent function David Faust
  2022-06-07 21:43 ` [PATCH 2/9] include: Add new definitions David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-07 21:43 ` [PATCH 4/9] dwarf: generate annotation DIEs David Faust
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

This patch adds attribute handlers for two new attributes:
"debug_annotate_decl" and "debug_annotate_type". Both attributes accept
a single string argument, and are used to add arbitrary annotations to
debug information generated for the decls or types to which they apply.

gcc/c-family/

	* c-attribs.cc (c_common_attribute_table): Add new attributes
	debug_annotate_decl and debug_annotate_type.
	(handle_debug_annotate_decl_attribute): New.
	(handle_debug_annotate_type_attribute): Likewise.
---
 gcc/c-family/c-attribs.cc | 43 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index c8d96723f4c..50e8fc1b695 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -174,6 +174,9 @@ static tree handle_signed_bool_precision_attribute (tree *, tree, tree, int,
 						    bool *);
 static tree handle_retain_attribute (tree *, tree, tree, int, bool *);
 
+static tree handle_debug_annotate_decl_attribute (tree *, tree, tree, int, bool *);
+static tree handle_debug_annotate_type_attribute (tree *, tree, tree, int, bool *);
+
 /* Helper to define attribute exclusions.  */
 #define ATTR_EXCL(name, function, type, variable)	\
   { name, function, type, variable }
@@ -555,6 +558,10 @@ const struct attribute_spec c_common_attribute_table[] =
 			      handle_dealloc_attribute, NULL },
   { "tainted_args",	      0, 0, true,  false, false, false,
 			      handle_tainted_args_attribute, NULL },
+  { "debug_annotate_decl",    1, 1, false, false, false, false,
+			      handle_debug_annotate_decl_attribute, NULL },
+  { "debug_annotate_type",    1, 1, false, true, false, false,
+			      handle_debug_annotate_type_attribute, NULL },
   { NULL,                     0, 0, false, false, false, false, NULL, NULL }
 };
 
@@ -5868,6 +5875,42 @@ handle_tainted_args_attribute (tree *node, tree name, tree, int,
   return NULL_TREE;
 }
 
+/* Handle a "debug_annotate_decl" attribute; arguments as in
+   struct attribute_spec.handler.   */
+
+static tree
+handle_debug_annotate_decl_attribute (tree *, tree name, tree args, int,
+				      bool *no_add_attrs)
+{
+  if (!args)
+    *no_add_attrs = true;
+  else if (TREE_CODE (TREE_VALUE (args)) != STRING_CST)
+    {
+      error ("%qE attribute requires a string", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
+/* Handle a "debug_annotate_type" attribute; arguments as in
+   struct attribute_spec.handler.   */
+
+static tree
+handle_debug_annotate_type_attribute (tree *, tree name, tree args, int,
+				      bool *no_add_attrs)
+{
+  if (!args)
+    *no_add_attrs = true;
+  else if (TREE_CODE (TREE_VALUE (args)) != STRING_CST)
+    {
+      error ("%qE attribute requires a string", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Attempt to partially validate a single attribute ATTR as if
    it were to be applied to an entity OPER.  */
 
-- 
2.36.1
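
As a usage sketch for the decl attribute, the fragment below tags a struct
member, the struct itself and a function declaration. The __decltag macro,
the tag strings and the toy_request names are hypothetical; the macro
expands to nothing on compilers that do not report support for the
attribute proposed here.

```c
#include <stddef.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Assumption: the attribute name is the one proposed in this series.  */
#if __has_attribute (debug_annotate_decl)
# define __decltag(s) __attribute__ ((debug_annotate_decl (s)))
#else
# define __decltag(s)
#endif

struct toy_request
{
  int id;
  const char *path __decltag ("from_user");   /* Tag on a member.  */
} __decltag ("audited");                       /* Tag on the struct.  */

/* Tag on a function declaration.  */
int toy_request_valid (const struct toy_request *req) __decltag ("hook");

int
toy_request_valid (const struct toy_request *req)
{
  return req != NULL && req->id > 0 && req->path != NULL;
}
```

Going by the handlers and table entries in the patch, the attribute takes
exactly one argument, and a non-string argument such as __decltag (42)
would be rejected with the "attribute requires a string" error.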



* [PATCH 4/9] dwarf: generate annotation DIEs
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
                   ` (2 preceding siblings ...)
  2022-06-07 21:43 ` [PATCH 3/9] c-family: Add debug_annotate attribute handlers David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-07 21:43 ` [PATCH 5/9] ctfc: pass through debug annotations to BTF David Faust
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

The "debug_annotate_decl" and "debug_annotate_type" attributes are
handled by constructing DW_TAG_GNU_annotation DIEs. These DIEs are
children of the declarations or types which they annotate, and convey
the information via a string constant.

gcc/

	* dwarf2out.cc (gen_decl_annotation_dies): New function.
	(gen_type_annotation_dies): Likewise.
	(modified_type_die): Call them here, if appropriate.
	(gen_formal_parameter_die): Likewise.
	(gen_typedef_die): Likewise.
	(gen_type_die): Likewise.
	(gen_decl_die): Likewise.
---
 gcc/dwarf2out.cc | 94 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 9c61026bb34..aff9f72bd55 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -13611,6 +13611,78 @@ long_double_as_float128 (tree type)
   return NULL_TREE;
 }
 
+/* Given a tree T, which may be a decl or a type, process any
+   "debug_annotate_decl" attributes on T.  Construct
+   DW_TAG_GNU_annotation DIEs appropriately as children of TARGET, usually
+   the DIE for T.  */
+
+static void
+gen_decl_annotation_dies (tree t, dw_die_ref target)
+{
+  dw_die_ref die;
+  tree attr;
+
+  if (t == NULL_TREE || !target)
+    return;
+
+  if (TYPE_P (t))
+    attr = lookup_attribute ("debug_annotate_decl", TYPE_ATTRIBUTES (t));
+  else if (DECL_P (t))
+    attr = lookup_attribute ("debug_annotate_decl", DECL_ATTRIBUTES (t));
+  else
+    /* This is an error.  */
+    gcc_unreachable ();
+
+  while (attr != NULL_TREE)
+    {
+      die = new_die (DW_TAG_GNU_annotation, target, t);
+      add_name_attribute (die, IDENTIFIER_POINTER (get_attribute_name (attr)));
+      add_AT_string (die, DW_AT_const_value,
+		     TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))));
+      attr = lookup_attribute ("debug_annotate_decl", TREE_CHAIN (attr));
+    }
+
+  /* Strip the decl tag attribute to avoid creating multiple copies if we hit
+     this tree node again in some recursive call.  */
+  if (TYPE_P (t))
+    TYPE_ATTRIBUTES (t) =
+      remove_attribute ("debug_annotate_decl", TYPE_ATTRIBUTES (t));
+  else if (DECL_P (t))
+    DECL_ATTRIBUTES (t) =
+      remove_attribute ("debug_annotate_decl", DECL_ATTRIBUTES (t));
+}
+
+/* Given a tree TYPE, process any "debug_annotate_type" attributes on
+   TYPE. Construct DW_TAG_GNU_annotation DIEs appropriately as children of
+   TARGET, usually the DIE for TYPE.  */
+
+static void
+gen_type_annotation_dies (tree type, dw_die_ref target)
+{
+  dw_die_ref die;
+  tree attr;
+
+  if (type == NULL_TREE || !target)
+    return;
+
+  gcc_assert (TYPE_P (type));
+
+  attr = lookup_attribute ("debug_annotate_type", TYPE_ATTRIBUTES (type));
+  while (attr != NULL_TREE)
+    {
+      die = new_die (DW_TAG_GNU_annotation, target, type);
+      add_name_attribute (die, IDENTIFIER_POINTER (get_attribute_name (attr)));
+      add_AT_string (die, DW_AT_const_value,
+		     TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))));
+      attr = lookup_attribute ("debug_annotate_type", TREE_CHAIN (attr));
+    }
+
+  /* Strip the type tag attribute to avoid creating multiple copies if we hit
+     this type again in some recursive call.  */
+  TYPE_ATTRIBUTES (type) =
+    remove_attribute ("debug_annotate_type", TYPE_ATTRIBUTES (type));
+}
+
 /* Given a pointer to an arbitrary ..._TYPE tree node, return a debugging
    entry that chains the modifiers specified by CV_QUALS in front of the
    given type.  REVERSE is true if the type is to be interpreted in the
@@ -14009,6 +14081,9 @@ modified_type_die (tree type, int cv_quals, bool reverse,
   if (TYPE_ARTIFICIAL (type))
     add_AT_flag (mod_type_die, DW_AT_artificial, 1);
 
+  /* Generate any annotation DIEs on this type.  */
+  gen_type_annotation_dies (type, mod_type_die);
+
   return mod_type_die;
 }
 
@@ -23002,6 +23077,9 @@ gen_formal_parameter_die (tree node, tree origin, bool emit_name_p,
       gcc_unreachable ();
     }
 
+  /* Generate any annotation DIEs for this decl.  */
+  gen_decl_annotation_dies (node, parm_die);
+
   return parm_die;
 }
 
@@ -26076,6 +26154,9 @@ gen_typedef_die (tree decl, dw_die_ref context_die)
 
   if (get_AT (type_die, DW_AT_name))
     add_pubtype (decl, type_die);
+
+  /* Generate any annotation DIEs for the typedef.  */
+  gen_decl_annotation_dies (decl, type_die);
 }
 
 /* Generate a DIE for a struct, class, enum or union type.  */
@@ -26389,6 +26470,16 @@ gen_type_die (tree type, dw_die_ref context_die)
 	  if (die)
 	    check_die (die);
 	}
+
+      /* Generate any annotation DIEs on the type.  */
+      dw_die_ref die = lookup_type_die (type);
+      if (die)
+	{
+	  gen_type_annotation_dies (type, die);
+
+	  /* "decl" annotations may also be attached to a type.  */
+	  gen_decl_annotation_dies (type, die);
+	}
     }
 }
 
@@ -27145,6 +27236,9 @@ gen_decl_die (tree decl, tree origin, struct vlr_context *ctx,
       break;
     }
 
+  /* Generate any annotation DIEs for the decl.  */
+  gen_decl_annotation_dies (decl, lookup_decl_die (decl_or_origin));
+
   return NULL;
 }
 \f
-- 
2.36.1
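
The loops in gen_decl_annotation_dies and gen_type_annotation_dies need to
visit every attribute of one name on a node. As far as I recall,
lookup_attribute returns only the first match in a chain, and the usual way
to reach later matches is to pass the TREE_CHAIN of the result back in. The
toy model below shows that pattern with plain C structs; struct toy_attr,
toy_lookup and toy_count_named are hypothetical and are not GCC code.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for GCC's attribute TREE_LIST; not a GCC type.  */
struct toy_attr
{
  const char *name;
  const char *value;
  const struct toy_attr *next;
};

/* Return the first attribute named NAME in LIST, or NULL.  */
const struct toy_attr *
toy_lookup (const char *name, const struct toy_attr *list)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Count the attributes named NAME.  The lookup is re-run on the rest of
   the chain after each match, so unrelated attributes that sit between
   two matches are skipped rather than visited.  */
size_t
toy_count_named (const char *name, const struct toy_attr *list)
{
  size_t n = 0;

  for (const struct toy_attr *a = toy_lookup (name, list); a != NULL;
       a = toy_lookup (name, a->next))
    n++;
  return n;
}
```

With a chain such as decl("a"), unused, decl("b"), type("t"), counting
"debug_annotate_decl" visits only the two decl entries and never touches
the unrelated attribute between them.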



* [PATCH 5/9] ctfc: pass through debug annotations to BTF
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
                   ` (3 preceding siblings ...)
  2022-06-07 21:43 ` [PATCH 4/9] dwarf: generate annotation DIEs David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-07 21:43 ` [PATCH 6/9] dwarf2ctf: convert annotation DIEs to CTF types David Faust
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

BTF generation currently relies on the internal CTF representation to
convert debug info from DWARF DIEs. This patch adds a new internal
header, "ctf-int.h", which defines CTF kinds to be used internally to
represent BTF tags which must pass through the CTF container. It also
adds a new type for representing information specific to those tags, and
a member for that type in ctf_dtdef.

This patch also updates ctf_add_reftype to accept a const char * name,
which it passes on to ctf_add_generic for use by the newly added kinds.

gcc/

	* ctf-int.h: New file.
	* ctfc.cc (ctf_add_reftype): Add NAME parameter. Pass it to
	ctf_add_generic call.
	(ctf_add_pointer): Update ctf_add_reftype call accordingly.
	* ctfc.h (ctf_add_reftype): Analogous change.
	(ctf_btf_annotation): New.
	(ctf_dtdef): Add member for it.
	(enum ctf_dtu_d_union_enum): Likewise.
	* dwarf2ctf.cc (gen_ctf_modifier_type): Update call to
	ctf_add_reftype accordingly.
---
 gcc/ctf-int.h    | 29 +++++++++++++++++++++++++++++
 gcc/ctfc.cc      | 11 +++++++----
 gcc/ctfc.h       | 17 ++++++++++++++---
 gcc/dwarf2ctf.cc |  2 +-
 4 files changed, 51 insertions(+), 8 deletions(-)
 create mode 100644 gcc/ctf-int.h

diff --git a/gcc/ctf-int.h b/gcc/ctf-int.h
new file mode 100644
index 00000000000..fb5f4aacad6
--- /dev/null
+++ b/gcc/ctf-int.h
@@ -0,0 +1,29 @@
+/* ctf-int.h - GCC internal definitions used for CTF debug info.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_CTF_INT_H
+#define GCC_CTF_INT_H 1
+
+/* These CTF kinds only exist as a bridge to generating BTF types for
+   BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG. They do not correspond to any
+   representable type kind in CTF.  */
+#define CTF_K_DECL_TAG  62
+#define CTF_K_TYPE_TAG  63
+
+#endif /* GCC_CTF_INT_H */
diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc
index f24e7bff948..a0404520b2a 100644
--- a/gcc/ctfc.cc
+++ b/gcc/ctfc.cc
@@ -107,6 +107,9 @@ ctf_dtu_d_union_selector (ctf_dtdef_ref ctftype)
       return CTF_DTU_D_ARGUMENTS;
     case CTF_K_SLICE:
       return CTF_DTU_D_SLICE;
+    case CTF_K_DECL_TAG:
+    case CTF_K_TYPE_TAG:
+      return CTF_DTU_D_BTFNOTE;
     default:
       /* The largest member as default.  */
       return CTF_DTU_D_ARRAY;
@@ -428,15 +431,15 @@ ctf_add_encoded (ctf_container_ref ctfc, uint32_t flag, const char * name,
 }
 
 ctf_id_t
-ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, ctf_id_t ref,
-		 uint32_t kind, dw_die_ref die)
+ctf_add_reftype (ctf_container_ref ctfc, uint32_t flag, const char * name,
+		 ctf_id_t ref, uint32_t kind, dw_die_ref die)
 {
   ctf_dtdef_ref dtd;
   ctf_id_t type;
 
   gcc_assert (ref <= CTF_MAX_TYPE);
 
-  type = ctf_add_generic (ctfc, flag, NULL, &dtd, die);
+  type = ctf_add_generic (ctfc, flag, name, &dtd, die);
   dtd->dtd_data.ctti_info = CTF_TYPE_INFO (kind, flag, 0);
   /* Caller of this API must guarantee that a CTF type with id = ref already
      exists.  This will also be validated for us at link-time.  */
@@ -548,7 +551,7 @@ ctf_id_t
 ctf_add_pointer (ctf_container_ref ctfc, uint32_t flag, ctf_id_t ref,
 		 dw_die_ref die)
 {
-  return (ctf_add_reftype (ctfc, flag, ref, CTF_K_POINTER, die));
+  return (ctf_add_reftype (ctfc, flag, NULL, ref, CTF_K_POINTER, die));
 }
 
 ctf_id_t
diff --git a/gcc/ctfc.h b/gcc/ctfc.h
index 001e544ef08..fab18f024d7 100644
--- a/gcc/ctfc.h
+++ b/gcc/ctfc.h
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dwarf2ctf.h"
 #include "ctf.h"
 #include "btf.h"
+#include "ctf-int.h"
 
 /* Invalid CTF type ID definition.  */
 
@@ -151,6 +152,13 @@ typedef struct GTY (()) ctf_func_arg
 
 #define ctf_farg_list_next(elem) ((ctf_func_arg_t *)((elem)->farg_next))
 
+/* BTF support: a BTF type tag or decl tag.  */
+
+typedef struct GTY (()) ctf_btf_annotation
+{
+  uint32_t component_idx;
+} ctf_btf_annotation_t;
+
 /* Type definition for CTF generation.  */
 
 struct GTY ((for_user)) ctf_dtdef
@@ -173,6 +181,8 @@ struct GTY ((for_user)) ctf_dtdef
     ctf_func_arg_t * GTY ((tag ("CTF_DTU_D_ARGUMENTS"))) dtu_argv;
     /* slice.  */
     ctf_sliceinfo_t GTY ((tag ("CTF_DTU_D_SLICE"))) dtu_slice;
+    /* btf annotation.  */
+    ctf_btf_annotation_t GTY ((tag ("CTF_DTU_D_BTFNOTE"))) dtu_btfnote;
   } dtd_u;
 };
 
@@ -212,7 +222,8 @@ enum ctf_dtu_d_union_enum {
   CTF_DTU_D_ARRAY,
   CTF_DTU_D_ENCODING,
   CTF_DTU_D_ARGUMENTS,
-  CTF_DTU_D_SLICE
+  CTF_DTU_D_SLICE,
+  CTF_DTU_D_BTFNOTE
 };
 
 enum ctf_dtu_d_union_enum
@@ -402,8 +413,8 @@ extern bool ctf_dvd_ignore_lookup (const ctf_container_ref ctfc,
 extern const char * ctf_add_string (ctf_container_ref, const char *,
 				    uint32_t *, int);
 
-extern ctf_id_t ctf_add_reftype (ctf_container_ref, uint32_t, ctf_id_t,
-				 uint32_t, dw_die_ref);
+extern ctf_id_t ctf_add_reftype (ctf_container_ref, uint32_t, const char *,
+				 ctf_id_t, uint32_t, dw_die_ref);
 extern ctf_id_t ctf_add_enum (ctf_container_ref, uint32_t, const char *,
 			      HOST_WIDE_INT, dw_die_ref);
 extern ctf_id_t ctf_add_slice (ctf_container_ref, uint32_t, ctf_id_t,
diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
index a6329ab6ee4..393aa92d71d 100644
--- a/gcc/dwarf2ctf.cc
+++ b/gcc/dwarf2ctf.cc
@@ -511,7 +511,7 @@ gen_ctf_modifier_type (ctf_container_ref ctfc, dw_die_ref modifier)
   gcc_assert (kind != CTF_K_MAX);
   /* Now register the modifier itself.  */
   if (!ctf_type_exists (ctfc, modifier, &modifier_type_id))
-    modifier_type_id = ctf_add_reftype (ctfc, CTF_ADD_ROOT,
+    modifier_type_id = ctf_add_reftype (ctfc, CTF_ADD_ROOT, NULL,
 					qual_type_id, kind,
 					modifier);
 
-- 
2.36.1
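
The selector change in ctfc.cc can be pictured with the reduced sketch
below: both internal tag kinds map to the same union member. The two
CTF_K_* values are copied from the new ctf-int.h; TOY_K_SLICE, enum
toy_union_member and toy_union_selector are hypothetical stand-ins for the
real CTF kinds and for ctf_dtu_d_union_selector.

```c
/* Values copied from the ctf-int.h added by this patch.  */
#define CTF_K_DECL_TAG 62
#define CTF_K_TYPE_TAG 63

/* Assumption: stand-in value for an ordinary CTF kind, used only to
   exercise the non-tag path of the toy selector.  */
#define TOY_K_SLICE 14

/* Toy subset of enum ctf_dtu_d_union_enum.  */
enum toy_union_member
{
  TOY_DTU_D_ARRAY,
  TOY_DTU_D_SLICE,
  TOY_DTU_D_BTFNOTE
};

/* Pick the union member that is live for a type of kind KIND.  */
enum toy_union_member
toy_union_selector (unsigned kind)
{
  switch (kind)
    {
    case TOY_K_SLICE:
      return TOY_DTU_D_SLICE;
    case CTF_K_DECL_TAG:
    case CTF_K_TYPE_TAG:
      return TOY_DTU_D_BTFNOTE;
    default:
      /* The largest member as default, as in the patched function.  */
      return TOY_DTU_D_ARRAY;
    }
}
```

Sharing one member keeps the union small, since decl tags and type tags
carry the same extra data (a component index) in this representation.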



* [PATCH 6/9] dwarf2ctf: convert annotation DIEs to CTF types
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
                   ` (4 preceding siblings ...)
  2022-06-07 21:43 ` [PATCH 5/9] ctfc: pass through debug annotations to BTF David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-07 21:43 ` [PATCH 7/9] btf: output decl_tag and type_tag records David Faust
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC
  To: gcc-patches; +Cc: jose.marchesi, yhs

This patch makes the DWARF-to-CTF conversion process aware of the new
DW_TAG_GNU_annotation DIEs. The DIEs are converted to an internal-only
CTF representation as appropriate and added to the compilation unit CTF
container.

gcc/

	* dwarf2ctf.cc (handle_debug_annotations): New function.
	(gen_ctf_sou_type): Call it here, if appropriate. Don't try to
	create member types for children that are not DW_TAG_member.
	(gen_ctf_function_type): Call handle_debug_annotations if
	appropriate.
	(gen_ctf_variable): Likewise.
	(gen_ctf_function): Likewise.
	(gen_ctf_type): Likewise.
---
 gcc/dwarf2ctf.cc | 112 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 111 insertions(+), 1 deletion(-)

diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
index 393aa92d71d..65714e5d3b9 100644
--- a/gcc/dwarf2ctf.cc
+++ b/gcc/dwarf2ctf.cc
@@ -32,6 +32,12 @@ along with GCC; see the file COPYING3.  If not see
 static ctf_id_t
 gen_ctf_type (ctf_container_ref, dw_die_ref);
 
+static void
+gen_ctf_variable (ctf_container_ref, dw_die_ref);
+
+static void
+handle_debug_annotations (ctf_container_ref, dw_die_ref, ctf_id_t, int);
+
 /* All the DIE structures we handle come from the DWARF information
    generated by GCC.  However, there are three situations where we need
    to create our own created DIE structures because GCC doesn't
@@ -547,6 +553,7 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref sou, uint32_t kind)
   /* Now process the struct members.  */
   {
     dw_die_ref c;
+    int idx = 0;
 
     c = dw_get_die_child (sou);
     if (c)
@@ -559,6 +566,12 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref sou, uint32_t kind)
 
 	  c = dw_get_die_sib (c);
 
+	  if (dw_get_die_tag (c) != DW_TAG_member)
+	    continue;
+
+	  if (c == dw_get_die_child (sou))
+	    idx = 0;
+
 	  field_name = get_AT_string (c, DW_AT_name);
 	  field_type = ctf_get_AT_type (c);
 	  field_location = ctf_get_AT_data_member_location (c);
@@ -626,6 +639,12 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref sou, uint32_t kind)
 				 field_name,
 				 field_type_id,
 				 field_location);
+
+	  /* Handle BTF tags on the member.  */
+	  if (btf_debuginfo_p ())
+	    handle_debug_annotations (ctfc, c, sou_type_id, idx);
+
+	  idx++;
 	}
       while (c != dw_get_die_child (sou));
   }
@@ -716,6 +735,9 @@ gen_ctf_function_type (ctf_container_ref ctfc, dw_die_ref function,
 	      arg_type = gen_ctf_type (ctfc, ctf_get_AT_type (c));
 	      /* Add the argument to the existing CTF function type.  */
 	      ctf_add_function_arg (ctfc, function, arg_name, arg_type);
+
+	      if (btf_debuginfo_p ())
+		handle_debug_annotations (ctfc, c, function_type_id, i - 1);
 	    }
 	  else
 	    /* This is a local variable.  Ignore.  */
@@ -828,6 +850,10 @@ gen_ctf_variable (ctf_container_ref ctfc, dw_die_ref die)
   /* Skip updating the number of global objects at this time.  This is updated
      later after pre-processing as some CTF variable records although
      generated now, will not be emitted later.  [PR105089].  */
+
+  /* Handle any BTF tags on the variable.  */
+  if (btf_debuginfo_p ())
+    handle_debug_annotations (ctfc, die, CTF_NULL_TYPEID, -1);
 }
 
 /* Add a CTF function record for the given input DWARF DIE.  */
@@ -845,8 +871,12 @@ gen_ctf_function (ctf_container_ref ctfc, dw_die_ref die)
      counter.  Note that DWARF encodes function types in both
      DW_TAG_subroutine_type and DW_TAG_subprogram in exactly the same
      way.  */
-  (void) gen_ctf_function_type (ctfc, die, true /* from_global_func */);
+  function_type_id = gen_ctf_function_type (ctfc, die, true /* from_global_func */);
   ctfc->ctfc_num_global_funcs += 1;
+
+  /* Handle any BTF tags on the function itself.  */
+  if (btf_debuginfo_p ())
+    handle_debug_annotations (ctfc, die, function_type_id, -1);
 }
 
 /* Add CTF type record(s) for the given input DWARF DIE and return its type id.
@@ -923,6 +953,10 @@ gen_ctf_type (ctf_container_ref ctfc, dw_die_ref die)
       break;
     }
 
+  /* Handle any BTF tags on the type.  */
+  if (btf_debuginfo_p () && !unrecog_die)
+    handle_debug_annotations (ctfc, die, type_id, -1);
+
   /* For all types unrepresented in CTF, use an explicit CTF type of kind
      CTF_K_UNKNOWN.  */
   if ((type_id == CTF_NULL_TYPEID) && (!unrecog_die))
@@ -931,6 +965,82 @@ gen_ctf_type (ctf_container_ref ctfc, dw_die_ref die)
   return type_id;
 }
 
+/* BTF support. Handle any annotations attached to a given DIE, and generate
+   intermediate CTF types for them. BTF tags are inserted into the type chain
+   at this point: the CTF type for DIE is updated in place to refer to the
+   last type tag created, and is left unchanged if there are no type tags.
+   COMPONENT_IDX is recorded in each decl tag created.
+   Note that despite the name, the BTF spec seems to allow decl tags on types
+   as well as declarations.  */
+
+static void
+handle_debug_annotations (ctf_container_ref ctfc, dw_die_ref die,
+			  ctf_id_t type_id, int component_idx)
+{
+  dw_die_ref c;
+  const char * name = NULL;
+  const char * value = NULL;
+  ctf_dtdef_ref dtd = ctf_dtd_lookup (ctfc, die);
+  ctf_id_t target_id, tag_id;
+
+  if (dtd)
+    target_id = dtd->dtd_data.ctti_type;
+  else
+    target_id = CTF_NULL_TYPEID;
+
+  c = dw_get_die_child (die);
+  if (c)
+    do
+      {
+	if (dw_get_die_tag (c) != DW_TAG_GNU_annotation)
+	  {
+	    c = dw_get_die_sib (c);
+	    continue;
+	  }
+
+	name = get_AT_string (c, DW_AT_name);
+
+	/* BTF decl tags add an arbitrary annotation to the thing they
+	   annotate. The annotated thing could be a variable or a type.  */
+	if (strcmp (name, "debug_annotate_decl") == 0)
+	  {
+	    value = get_AT_string (c, DW_AT_const_value);
+	    if (!ctf_type_exists (ctfc, c, &tag_id))
+	      (void) ctf_add_reftype (ctfc, CTF_ADD_ROOT, value,
+				      type_id, CTF_K_DECL_TAG, c);
+	    ctf_dtdef_ref dtd = ctf_dtd_lookup (ctfc, c);
+	    dtd->dtd_u.dtu_btfnote.component_idx = component_idx;
+	  }
+
+	/* BTF type tags are part of the type chain similar to cvr quals.
+	   But the type tag DIEs are children of the DIEs they annotate.
+
+	   For each type tag on this type, create a CTF type for it and
+	   insert it into the type chain:
+	   - The first tag refers to the type referred to by the parent.
+	   - Each subsequent tag refers to the prior tag.
+	   - The parent type is updated to refer to the last tag.
+
+	   Given this type chain requirement, the representation of type
+	   tags in BTF only makes sense for pointer types. Should this be
+	   enforced here?  */
+	else if (strcmp (name, "debug_annotate_type") == 0)
+	  {
+	    gcc_assert (dtd);
+	    value = get_AT_string (c, DW_AT_const_value);
+
+	    if (!ctf_type_exists (ctfc, c, &tag_id))
+	      tag_id = ctf_add_reftype (ctfc, CTF_ADD_ROOT, value,
+					target_id, CTF_K_TYPE_TAG, c);
+
+	    dtd->dtd_data.ctti_type = tag_id;
+	    target_id = tag_id;
+	  }
+	c = dw_get_die_sib (c);
+      }
+    while (c != dw_get_die_child (die));
+}
+
 /* Prepare for output and write out the CTF debug information.  */
 
 static void
-- 
2.36.1



* [PATCH 7/9] btf: output decl_tag and type_tag records
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
                   ` (5 preceding siblings ...)
  2022-06-07 21:43 ` [PATCH 6/9] dwarf2ctf: convert annotation DIEs to CTF types David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-07 21:43 ` [PATCH 8/9] doc: document new attributes David Faust
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: jose.marchesi, yhs

This patch updates btfout.cc to be aware of debug annotations, convert
them to BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG records, and output them
appropriately.
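
For anyone checking the emitted values by hand, here is a small sketch
of how the btt_info word and the decl tag trailer are encoded. It
assumes the kind numbers 17 (BTF_KIND_DECL_TAG) and 18
(BTF_KIND_TYPE_TAG) and the info layout from the kernel's BTF format;
the EX_/ex_ names are illustrative and do not appear in btfout.cc.

```c
#include <stdint.h>

/* Kind numbers as assumed from the Linux BTF format; the names are local
   to this sketch.  */
#define EX_BTF_KIND_DECL_TAG 17
#define EX_BTF_KIND_TYPE_TAG 18

/* Pack a btt_info word: vlen in bits 0-15, kind in bits 24-28 and
   kind_flag in bit 31.  KIND_FLAG must be 0 or 1.  */
static uint32_t
ex_btf_info (uint32_t kind, uint32_t vlen, uint32_t kind_flag)
{
  return (kind_flag << 31) | ((kind & 0x1f) << 24) | (vlen & 0xffff);
}

/* The 4-byte decl tag trailer holds component_idx.  -1 means the tag
   applies to the annotated object itself rather than to one of its
   members or arguments, and is emitted as 0xffffffff.  */
static uint32_t
ex_component_idx (int idx)
{
  return (uint32_t) idx;
}
```

With vlen 0 and kind_flag clear this gives 0x11000000 for a decl tag and
0x12000000 for a type tag, the values the tests in patch 9/9 scan for.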

gcc/

	* btfout.cc (get_btf_kind): Handle TYPE_TAG and DECL_TAG kinds.
	(btf_calc_num_vbytes): Likewise.
	(btf_asm_type): Likewise.
	(output_asm_btf_vlen_bytes): Likewise.
---
 gcc/btfout.cc | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 31af50521da..f291cd925be 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -136,6 +136,8 @@ get_btf_kind (uint32_t ctf_kind)
     case CTF_K_VOLATILE: return BTF_KIND_VOLATILE;
     case CTF_K_CONST:    return BTF_KIND_CONST;
     case CTF_K_RESTRICT: return BTF_KIND_RESTRICT;
+    case CTF_K_TYPE_TAG: return BTF_KIND_TYPE_TAG;
+    case CTF_K_DECL_TAG: return BTF_KIND_DECL_TAG;
     default:;
     }
   return BTF_KIND_UNKN;
@@ -201,6 +203,7 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
     case BTF_KIND_CONST:
     case BTF_KIND_RESTRICT:
     case BTF_KIND_FUNC:
+    case BTF_KIND_TYPE_TAG:
     /* These kinds have no vlen data.  */
       break;
 
@@ -238,6 +241,10 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
       vlen_bytes += vlen * sizeof (struct btf_var_secinfo);
       break;
 
+    case BTF_KIND_DECL_TAG:
+      vlen_bytes += sizeof (struct btf_decl_tag);
+      break;
+
     default:
       break;
     }
@@ -636,6 +643,22 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
       dw2_asm_output_data (4, dtd->dtd_data.ctti_size, "btt_size: %uB",
 			   dtd->dtd_data.ctti_size);
       return;
+    case BTF_KIND_DECL_TAG:
+      {
+	/* A decl tag might refer to (be the child DIE of) a variable. Try to
+	   lookup the parent DIE's CTF variable, and if it exists point to the
+	   corresponding BTF variable. This is an odd construction - we have a
+	   'type' which refers to a variable, rather than the reverse.  */
+	dw_die_ref parent = dw_get_die_parent (dtd->dtd_key);
+	ctf_dvdef_ref dvd = ctf_dvd_lookup (ctfc, parent);
+	if (dvd)
+	  {
+	    unsigned int var_id =
+	      *(btf_var_ids->get (dvd)) + num_types_added + 1;
+	    dw2_asm_output_data (4, var_id, "btt_type");
+	    return;
+	  }
+      }
     default:
       break;
     }
@@ -949,6 +972,11 @@ output_asm_btf_vlen_bytes (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
 	 at this point.  */
       gcc_unreachable ();
 
+    case BTF_KIND_DECL_TAG:
+      dw2_asm_output_data (4, dtd->dtd_u.dtu_btfnote.component_idx,
+			   "decltag_compidx");
+      break;
+
     default:
       /* All other BTF type kinds have no variable length data.  */
       break;
-- 
2.36.1



* [PATCH 8/9] doc: document new attributes
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
                   ` (6 preceding siblings ...)
  2022-06-07 21:43 ` [PATCH 7/9] btf: output decl_tag and type_tag records David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-07 21:43 ` [PATCH 9/9] testsuite: add debug annotation tests David Faust
  2022-06-15  5:53 ` [PATCH 0/9] Add debug_annotate attributes Yonghong Song
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: jose.marchesi, yhs

gcc/

	* doc/extend.texi (Common Function Attributes): Document
	debug_annotate_decl attribute.
	(Common Variable Attributes): Likewise.
	(Common Type Attributes): Likewise. Also document
	debug_annotate_type attribute.
---
 gcc/doc/extend.texi | 106 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index a2e2a303ff1..a4c114f0e81 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2931,6 +2931,30 @@ extern __attribute__ ((alloc_size (1), malloc, nothrow))
 StrongAlias (allocate, alloc);
 @end smallexample
 
+@item debug_annotate_decl (@var{annotation})
+@cindex @code{debug_annotate_decl} function attribute
+The @code{debug_annotate_decl} attribute is used to add arbitrary
+string annotations to the debugging information produced for a given
+declaration. The attribute accepts a single string argument, and may be
+specified multiple times for a single declaration. The behavior is
+to record the string argument in debug information generated for the
+declaration. Currently, DWARF and BTF debug information are supported.
+There is no effect on code generation; the attribute has no effect at
+all if neither DWARF nor BTF are output.
+
+@smallexample
+int foo (int a, int b) __attribute__((debug_annotate_decl ("my_tag")));
+@end smallexample
+
+@noindent
+results in a DW_TAG_GNU_annotation DIE associating the string ``my_tag''
+to the function ``foo'', and/or a BTF_KIND_DECL_TAG BTF record to the
+same effect.
+
+The @code{debug_annotate_decl} attribute can also be used for
+variables and types (@pxref{Common Variable Attributes},
+@pxref{Common Type Attributes}.)
+
 @item deprecated
 @itemx deprecated (@var{msg})
 @cindex @code{deprecated} function attribute
@@ -7510,6 +7534,42 @@ but not attributes that affect a symbol's linkage or visibility such as
 attribute is also not copied.  @xref{Common Function Attributes}.
 @xref{Common Type Attributes}.
 
+@item debug_annotate_decl (@var{annotation})
+@cindex @code{debug_annotate_decl} variable attribute
+The @code{debug_annotate_decl} attribute is used to add arbitrary
+string annotations to the debugging information produced for a given
+declaration. The attribute accepts a single string argument, and may be
+specified multiple times for a single declaration. The behavior is
+to record the string argument in debug information generated for the
+declaration. Currently, DWARF and BTF debug information are supported.
+There is no effect on code generation; the attribute has no effect at
+all if neither DWARF nor BTF are output.
+
+@smallexample
+int my_var __attribute__((debug_annotate_decl ("my_tag")));
+@end smallexample
+
+@noindent
+results in a DW_TAG_GNU_annotation DIE associating the string ``my_tag''
+to the variable ``my_var'', and/or a BTF_KIND_DECL_TAG BTF record to the same
+effect.
+
+Annotations can be specified for declarations other than variables,
+such as struct fields. For example:
+
+@smallexample
+struct foo @{
+  int * x __attribute__ ((debug_annotate_decl ("my_tag")));
+@};
+@end smallexample
+@noindent
+has similar results, producing debug info which associates the string
+``my_tag'' to the struct field ``x''.
+
+The @code{debug_annotate_decl} attribute can also be used for
+functions and types (@pxref{Common Function Attributes},
+@pxref{Common Type Attributes}.)
+
 @item deprecated
 @itemx deprecated (@var{msg})
 @cindex @code{deprecated} variable attribute
@@ -8593,6 +8653,52 @@ A @{ /* @r{@dots{}} */ @};
 struct __attribute__ ((copy ( (struct A *)0)) B @{ /* @r{@dots{}} */ @};
 @end smallexample
 
+@item debug_annotate_decl (@var{annotation})
+@cindex @code{debug_annotate_decl} type attribute
+The @code{debug_annotate_decl} attribute is used to add arbitrary
+string annotations to the debugging information produced for a given
+type declaration. The attribute accepts a single string argument, and
+may be specified multiple times for a type declaration. The behavior
+is to record the string argument in the debug information generated
+for the declaration. Currently, DWARF and BTF debug information are
+supported. There is no effect on code generation; the attribute has no
+effect at all if neither DWARF nor BTF are output.
+
+@smallexample
+struct t @{
+/* @r{@dots{}} */
+@} __attribute__((debug_annotate_decl ("my_tag")));
+@end smallexample
+
+@noindent
+results in a DW_TAG_GNU_annotation DIE associating the string
+``my_tag'' to the ``struct t'', and/or a BTF_KIND_DECL_TAG BTF record
+to the same effect.
+
+The @code{debug_annotate_decl} attribute can also be used for
+variables and functions (@pxref{Common Variable Attributes},
+@pxref{Common Function Attributes}.)
+
+@item debug_annotate_type (@var{annotation})
+@cindex @code{debug_annotate_type} type attribute
+The @code{debug_annotate_type} attribute is used to add arbitrary
+string annotations to the debugging information produced for a given
+type. The attribute accepts a single string argument, and may be
+specified multiple times for a type declaration. The behavior is to
+record the string argument in the debug information generated for the
+type. Currently, DWARF and BTF debug information are supported. There
+is no effect on code generation; the attribute has no effect at all if
+neither DWARF nor BTF are output.
+
+@smallexample
+int * __attribute__ ((debug_annotate_type ("foo"))) x;
+@end smallexample
+
+@noindent
+results in a DW_TAG_GNU_annotation DIE associating the string ``foo''
+to the pointer type of ``x'' in the case of DWARF, and/or a
+BTF_KIND_TYPE_TAG entry to the same effect in case of BTF debug info.
+
 @item deprecated
 @itemx deprecated (@var{msg})
 @cindex @code{deprecated} type attribute
-- 
2.36.1



* [PATCH 9/9] testsuite: add debug annotation tests
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
                   ` (7 preceding siblings ...)
  2022-06-07 21:43 ` [PATCH 8/9] doc: document new attributes David Faust
@ 2022-06-07 21:43 ` David Faust
  2022-06-15  5:53 ` [PATCH 0/9] Add debug_annotate attributes Yonghong Song
  9 siblings, 0 replies; 25+ messages in thread
From: David Faust @ 2022-06-07 21:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: jose.marchesi, yhs

This patch adds tests for debug annotations, in BTF and in DWARF.
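
As a cross-check of the counts expected by btf-decltag-sou.c, the sketch
below tallies the component index each decl tag in that file should
carry. It assumes one BTF_KIND_DECL_TAG record per attribute use, a
member's zero-based position as its index, and -1 for tags on the
struct, union or variable itself. The ex_* names are hypothetical.

```c
#include <stddef.h>

/* One entry per decl tag in btf-decltag-sou.c, in source order.  */
static const int ex_expected_idx[] = {
  /* struct t: b __tag3 (1), c __tag2 __tag3 (2, 2), then the struct
     itself with __tag1 __tag2.  */
  1, 2, 2, -1, -1,
  /* my_t __tag1 __tag3.  */
  -1, -1,
  /* union u: one (0, 0), three (2), four (3, 3, 3), five (4), then the
     union itself with __tag3.  */
  0, 0, 2, 3, 3, 3, 4, -1,
  /* my_u __tag2.  */
  -1,
};

/* Total number of decl tags in the file.  */
static size_t
ex_total (void)
{
  return sizeof ex_expected_idx / sizeof ex_expected_idx[0];
}

/* Number of decl tags expected to carry component index IDX.  */
static size_t
ex_count (int idx)
{
  size_t n = 0;

  for (size_t i = 0; i < ex_total (); i++)
    if (ex_expected_idx[i] == idx)
      n++;
  return n;
}
```

The totals match the scan-assembler-times directives in that test: 16
decl tags overall, 6 of them with index -1 (emitted as 0xffffffff).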

gcc/testsuite/

	* gcc.dg/debug/btf/btf-decltag-func.c: New test.
	* gcc.dg/debug/btf/btf-decltag-sou.c: Likewise.
	* gcc.dg/debug/btf/btf-decltag-typedef.c: Likewise.
	* gcc.dg/debug/btf/btf-typetag-1.c: Likewise.
	* gcc.dg/debug/dwarf2/annotation-1.c: Likewise.
	* gcc.dg/debug/dwarf2/annotation-2.c: Likewise.
	* gcc.dg/debug/dwarf2/annotation-3.c: Likewise.
	* gcc.dg/debug/dwarf2/annotation-4.c: Likewise.
---
 .../gcc.dg/debug/btf/btf-decltag-func.c       | 18 ++++++++++
 .../gcc.dg/debug/btf/btf-decltag-sou.c        | 34 +++++++++++++++++++
 .../gcc.dg/debug/btf/btf-decltag-typedef.c    | 15 ++++++++
 .../gcc.dg/debug/btf/btf-typetag-1.c          | 20 +++++++++++
 .../gcc.dg/debug/dwarf2/annotation-1.c        | 20 +++++++++++
 .../gcc.dg/debug/dwarf2/annotation-2.c        | 17 ++++++++++
 .../gcc.dg/debug/dwarf2/annotation-3.c        | 20 +++++++++++
 .../gcc.dg/debug/dwarf2/annotation-4.c        | 34 +++++++++++++++++++
 8 files changed, 178 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-4.c

diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
new file mode 100644
index 00000000000..b2d6820cf23
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
@@ -0,0 +1,18 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x11000000\[\t \]+\[^\n\]*btt_info" 4 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0xffffffff\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*decltag_compidx" 1 } } */
+
+#define __tag1 __attribute__((debug_annotate_decl("decl-tag-1")))
+#define __tag2 __attribute__((debug_annotate_decl("decl-tag-2")))
+#define __tag3 __attribute__((debug_annotate_decl("decl-tag-3")))
+
+extern int bar (int __tag1, int __tag2) __tag3;
+
+int __tag1 __tag2 foo (int arg1, int *arg2 __tag2)
+  {
+    return bar (arg1 + 1, *arg2 + 2);
+  }
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
new file mode 100644
index 00000000000..bb125b53ce7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
@@ -0,0 +1,34 @@
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x11000000\[\t \]+\[^\n\]*btt_info" 16 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0\[\t \]+\[^\n\]*decltag_compidx" 2 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*decltag_compidx" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x2\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x3\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x4\[\t \]+\[^\n\]*decltag_compidx" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0xffffffff\[\t \]+\[^\n\]*decltag_compidx" 6 } } */
+
+#define __tag1 __attribute__((debug_annotate_decl("decl-tag-1")))
+#define __tag2 __attribute__((debug_annotate_decl("decl-tag-2")))
+#define __tag3 __attribute__((debug_annotate_decl("decl-tag-3")))
+
+struct t {
+  int a;
+  long b __tag3;
+  char c __tag2 __tag3;
+} __tag1 __tag2;
+
+struct t my_t __tag1 __tag3;
+
+
+union u {
+  char one __tag1 __tag2;
+  short two;
+  int three __tag1;
+  long four __tag1 __tag2 __tag3;
+  long long five __tag2;
+} __tag3;
+
+union u my_u __tag2;
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
new file mode 100644
index 00000000000..6a44aaf9623
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x11000000\[\t \]+\[^\n\]*btt_info" 3 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0xffffffff\[\t \]+\[^\n\]*decltag_compidx" 3 } } */
+
+#define __tag1 __attribute__((debug_annotate_decl("decl-tag-1")))
+#define __tag2 __attribute__((debug_annotate_decl("decl-tag-2")))
+#define __tag3 __attribute__((debug_annotate_decl("decl-tag-3")))
+
+struct s { int a; } __tag1;
+
+typedef struct s * sptr __tag2;
+
+sptr my_sptr __tag3;
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c b/gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
new file mode 100644
index 00000000000..0d046265b7a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* { dg-final { scan-assembler-times "\[\t \]0x12000000\[\t \]+\[^\n\]*btt_info" 4 } } */
+
+#define __tag1 __attribute__((debug_annotate_type("tag1")))
+#define __tag2 __attribute__((debug_annotate_type("tag2")))
+#define __tag3 __attribute__((debug_annotate_type("tag3")))
+
+int __tag1 * x;
+const int __tag2 * y;
+
+struct a;
+
+struct b
+{
+  struct a __tag2 __tag3 * inner_a;
+};
+
+struct b my_b;
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c
new file mode 100644
index 00000000000..f6f305c2739
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-gdwarf -dA" } */
+#define __decltag1 __attribute__((debug_annotate_decl("decl-tag-1")))
+#define __decltag2 __attribute__((debug_annotate_decl("decl-tag-2")))
+#define __decltag3 __attribute__((debug_annotate_decl("decl-tag-3")))
+
+struct S {
+  int a __decltag2 __decltag3;
+  int b __decltag1;
+} __decltag1 __decltag2;
+
+struct S my_S __decltag3;
+
+/* Verify that we get the expected DW_TAG_GNU_annotation DIEs for each tag.
+   Note: one more TAG in debug abbrev.  */
+/* { dg-final { scan-assembler-times " DW_TAG_GNU_annotation" 7 } } */
+/* { dg-final { scan-assembler-times " DW_AT_name: \"debug_annotate_decl\"" 6 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-1\"" 2 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-2\"" 2 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-3\"" 2 } } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-2.c b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-2.c
new file mode 100644
index 00000000000..04628fb1b81
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-gdwarf -dA" } */
+#define __typetag1 __attribute__((debug_annotate_type("type-tag-1")))
+#define __typetag2 __attribute__((debug_annotate_type("type-tag-2")))
+#define __typetag3 __attribute__((debug_annotate_type("type-tag-3")))
+int __typetag1 * x;
+
+char * __typetag1 buf;
+
+int * __typetag1 * __typetag2 __typetag3 g;
+
+/* Verify we get the expected annotation dies. Note +1 TAG in debug abbrev.  */
+/* { dg-final { scan-assembler-times " DW_TAG_GNU_annotation" 6 } } */
+/* { dg-final { scan-assembler-times " DW_AT_name: \"debug_annotate_type\"" 5 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"type-tag-1\"" 3 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"type-tag-2\"" 1 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"type-tag-3\"" 1 } } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-3.c b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-3.c
new file mode 100644
index 00000000000..0548cd52d6d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-gdwarf -dA" } */
+#define __decltag1 __attribute__((debug_annotate_decl("decl-tag-1")))
+#define __decltag2 __attribute__((debug_annotate_decl("decl-tag-2")))
+#define __decltag3 __attribute__((debug_annotate_decl("decl-tag-3")))
+
+extern int foo (char a, int b) __decltag1 __decltag3;
+
+int bar (int x)
+{
+  return foo ('a', x);
+}
+
+/* Verify that we get the expected DW_TAG_GNU_annotation DIEs for each tag.
+   Note: one more TAG in debug abbrev.  */
+/* { dg-final { scan-assembler-times " DW_TAG_GNU_annotation" 3 } } */
+/* { dg-final { scan-assembler-times " DW_AT_name: \"debug_annotate_decl\"" 2 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-1\"" 1 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-2\"" 0 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-3\"" 1 } } */
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-4.c b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-4.c
new file mode 100644
index 00000000000..9d2b3ad5c00
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/annotation-4.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-gdwarf -dA" } */
+#define __decltag1 __attribute__((debug_annotate_decl("decl-tag-1")))
+#define __decltag2 __attribute__((debug_annotate_decl("decl-tag-2")))
+#define __decltag3 __attribute__((debug_annotate_decl("decl-tag-3")))
+
+#define __typetag1 __attribute__((debug_annotate_type("type-tag-1")))
+#define __typetag2 __attribute__((debug_annotate_type("type-tag-2")))
+#define __typetag3 __attribute__((debug_annotate_type("type-tag-3")))
+
+/* Note the decl tags on these parameters will not be recorded... */
+
+extern int foo (int * __typetag1 x __decltag1,
+		char * __typetag3 c __decltag1 __decltag2);
+
+
+/* ... but here they will be.  */
+
+int bar (int * x __decltag1 __decltag3, char **buf __decltag2 __decltag3)
+{
+  return foo (x, buf[0]);
+}
+
+/* Verify that we get the expected DW_TAG_GNU_annotation DIEs for each tag.
+   Note: one more TAG in debug abbrev.  */
+/* { dg-final { scan-assembler-times " DW_TAG_GNU_annotation" 7 } } */
+/* { dg-final { scan-assembler-times " DW_AT_name: \"debug_annotate_decl\"" 4 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-1\"" 1 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-2\"" 1 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"decl-tag-3\"" 2 } } */
+/* { dg-final { scan-assembler-times " DW_AT_name: \"debug_annotate_type\"" 2 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"type-tag-1\"" 1 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"type-tag-2\"" 0 } } */
+/* { dg-final { scan-assembler-times " DW_AT_const_value: \"type-tag-3\"" 1 } } */
-- 
2.36.1



* Re: [PATCH 1/9] dwarf: add dw_get_die_parent function
  2022-06-07 21:43 ` [PATCH 1/9] dwarf: add dw_get_die_parent function David Faust
@ 2022-06-13 10:13   ` Richard Biener
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Biener @ 2022-06-13 10:13 UTC (permalink / raw)
  To: David Faust; +Cc: GCC Patches, yhs

On Tue, Jun 7, 2022 at 11:44 PM David Faust via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:

OK

> gcc/
>
>         * dwarf2out.cc (dw_get_die_parent): New function.
>         * dwarf2out.h (dw_get_die_parent): Declare it here.
> ---
>  gcc/dwarf2out.cc | 8 ++++++++
>  gcc/dwarf2out.h  | 1 +
>  2 files changed, 9 insertions(+)
>
> diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
> index 29f32ec6939..9c61026bb34 100644
> --- a/gcc/dwarf2out.cc
> +++ b/gcc/dwarf2out.cc
> @@ -5235,6 +5235,14 @@ dw_get_die_sib (dw_die_ref die)
>    return die->die_sib;
>  }
>
> +/* Return a reference to the parent of a given DIE.  */
> +
> +dw_die_ref
> +dw_get_die_parent (dw_die_ref die)
> +{
> +  return die->die_parent;
> +}
> +
>  /* Add an address constant attribute value to a DIE.  When using
>     dwarf_split_debug_info, address attributes in dies destined for the
>     final executable should be direct references--setting the parameter
> diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h
> index 656ef94afde..e6962fb4848 100644
> --- a/gcc/dwarf2out.h
> +++ b/gcc/dwarf2out.h
> @@ -455,6 +455,7 @@ extern dw_die_ref lookup_type_die (tree);
>
>  extern dw_die_ref dw_get_die_child (dw_die_ref);
>  extern dw_die_ref dw_get_die_sib (dw_die_ref);
> +extern dw_die_ref dw_get_die_parent (dw_die_ref);
>  extern enum dwarf_tag dw_get_die_tag (dw_die_ref);
>
>  /* Data about a single source file.  */
> --
> 2.36.1
>


* Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
                   ` (8 preceding siblings ...)
  2022-06-07 21:43 ` [PATCH 9/9] testsuite: add debug annotation tests David Faust
@ 2022-06-15  5:53 ` Yonghong Song
  2022-06-15 20:57   ` David Faust
  9 siblings, 1 reply; 25+ messages in thread
From: Yonghong Song @ 2022-06-15  5:53 UTC (permalink / raw)
  To: David Faust, gcc-patches



On 6/7/22 2:43 PM, David Faust wrote:
> Hello,
> 
> This patch series adds support for:
> 
> - Two new C-language-level attributes that allow to associate (to "annotate" or
>    to "tag") particular declarations and types with arbitrary strings. As
>    explained below, this is intended to be used to, for example, characterize
>    certain pointer types.
> 
> - The conveyance of that information in the DWARF output in the form of a new
>    DIE: DW_TAG_GNU_annotation.
> 
> - The conveyance of that information in the BTF output in the form of two new
>    kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
> 
> All of these facilities are being added to the eBPF ecosystem, and support for
> them exists in some form in LLVM.
> 
> Purpose
> =======
> 
> 1)  Addition of C-family language constructs (attributes) to specify free-text
>      tags on certain language elements, such as struct fields.
> 
>      The purpose of these annotations is to provide additional information about
>      types, variables, and function parameters of interest to the kernel. A
>      driving use case is to tag pointer types within the linux kernel and eBPF
>      programs with additional semantic information, such as '__user' or '__rcu'.
> 
>      For example, consider the linux kernel function do_execve with the
>      following declaration:
> 
>        static int do_execve(struct filename *filename,
>           const char __user *const __user *__argv,
>           const char __user *const __user *__envp);
> 
>      Here, __user could be defined with these annotations to record semantic
>      information about the pointer parameters (e.g., they are user-provided) in
>      DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>      can read the tags and make use of the information.
> 
> 2)  Conveying the tags in the generated DWARF debug info.
> 
>      The main motivation for emitting the tags in DWARF is that the Linux kernel
>      generates its BTF information via pahole, using DWARF as a source:
> 
>          +--------+  BTF                  BTF   +----------+
>          | pahole |-------> vmlinux.btf ------->| verifier |
>          +--------+                             +----------+
>              ^                                        ^
>              |                                        |
>        DWARF |                                    BTF |
>              |                                        |
>           vmlinux                              +-------------+
>           module1.ko                           | BPF program |
>           module2.ko                           +-------------+
>             ...
> 
>      This is because:
> 
>      a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
> 
>      b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>          support for linking/deduplicating BTF in the linker.
> 
>      In the scenario above, the verifier needs access to the pointer tags of
>      both the kernel types/declarations (conveyed in the DWARF and translated
>      to BTF by pahole) and those of the BPF program (available directly in BTF).
> 
>      Another motivation for having the tag information in DWARF, unrelated to
>      BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>      to benefit from these tags in order to differentiate between different
>      kinds of pointers in the kernel.
> 
> 3)  Conveying the tags in the generated BTF debug info.
> 
>      This is easy: the main purpose of having this info in BTF is for the
>      compiled eBPF programs. The kernel verifier can then access the tags
>      of pointers used by the eBPF programs.
> 
> 
> For more information about these tags and the motivation behind them, please
> refer to the following linux kernel discussions:
> 
>    https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>    https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>    https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
> 
> 
> Implementation Overview
> =======================
> 
> To enable these annotations, two new C language attributes are added:
> __attribute__((debug_annotate_decl("foo"))) and
> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
> arbitrary string constant argument, which will be recorded in the generated
> DWARF and/or BTF debug information. They have no effect on code generation.
> 
> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
> btf_type_tag, respectively). While these attributes are functionally very
> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
> in the attribute name seems misleading.
> 
> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
> declarations and types will be checked for the corresponding attributes. If
> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
> the annotated type or declaration, one for each tag. These DIEs link the
> arbitrary tag value to the item they annotate.
> 
> For example, the following variable declaration:
> 
>    #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
> 
>    #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>    #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
> 
>    int * __typetag1 x __decltag1 __decltag2;

Based on the above example
         static int do_execve(struct filename *filename,
           const char __user *const __user *__argv,
           const char __user *const __user *__envp);

Should the above example be the below instead?
     int __typetag1 * x __decltag1 __decltag2

> 
> Produces the following DWARF information:
> 
>   <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>      <1f>   DW_AT_name        : x
>      <21>   DW_AT_decl_file   : 1
>      <22>   DW_AT_decl_line   : 7
>      <23>   DW_AT_decl_column : 18
>      <24>   DW_AT_type        : <0x49>
>      <28>   DW_AT_external    : 1
>      <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>      <32>   DW_AT_sibling     : <0x49>
>   <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>      <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>      <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>   <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>      <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>      <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>   <2><48>: Abbrev Number: 0
>   <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>      <4a>   DW_AT_byte_size   : 8
>      <4b>   DW_AT_type        : <0x5d>
>      <4f>   DW_AT_sibling     : <0x5d>
>   <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>      <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>      <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>   <2><5c>: Abbrev Number: 0
>   <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>      <5e>   DW_AT_byte_size   : 4
>      <5f>   DW_AT_encoding    : 5	(signed)
>      <60>   DW_AT_name        : int
>   <1><64>: Abbrev Number: 0

Maybe you can also show what dwarf debug_info looks like?

> 
> In the case of BTF, the annotations are recorded in two type kinds recently
> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
> The above example declaration produces the following BTF information:
> 
> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> [2] PTR '(anon)' type_id=3
> [3] TYPE_TAG 'typetag1' type_id=1
> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
> [6] VAR 'x' type_id=2, linkage=global
> [7] DATASEC '.bss' size=0 vlen=1
> 	type_id=6 offset=0 size=8 (VAR 'x')
> 
> 
[...]


* Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-15  5:53 ` [PATCH 0/9] Add debug_annotate attributes Yonghong Song
@ 2022-06-15 20:57   ` David Faust
  2022-06-15 22:56     ` Yonghong Song
  0 siblings, 1 reply; 25+ messages in thread
From: David Faust @ 2022-06-15 20:57 UTC (permalink / raw)
  To: Yonghong Song; +Cc: jose.marchesi, gcc-patches



On 6/14/22 22:53, Yonghong Song wrote:
> 
> 
> On 6/7/22 2:43 PM, David Faust wrote:
>> Hello,
>>
>> This patch series adds support for:
>>
>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>    to "tag") particular declarations and types with arbitrary strings. As
>>    explained below, this is intended to be used to, for example, characterize
>>    certain pointer types.
>>
>> - The conveyance of that information in the DWARF output in the form of a new
>>    DIE: DW_TAG_GNU_annotation.
>>
>> - The conveyance of that information in the BTF output in the form of two new
>>    kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>
>> All of these facilities are being added to the eBPF ecosystem, and support for
>> them exists in some form in LLVM.
>>
>> Purpose
>> =======
>>
>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>      tags on certain language elements, such as struct fields.
>>
>>      The purpose of these annotations is to provide additional information about
>>      types, variables, and function parameters of interest to the kernel. A
>>      driving use case is to tag pointer types within the linux kernel and eBPF
>>      programs with additional semantic information, such as '__user' or '__rcu'.
>>
>>      For example, consider the linux kernel function do_execve with the
>>      following declaration:
>>
>>        static int do_execve(struct filename *filename,
>>           const char __user *const __user *__argv,
>>           const char __user *const __user *__envp);
>>
>>      Here, __user could be defined with these annotations to record semantic
>>      information about the pointer parameters (e.g., they are user-provided) in
>>      DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>      can read the tags and make use of the information.
>>
>> 2)  Conveying the tags in the generated DWARF debug info.
>>
>>      The main motivation for emitting the tags in DWARF is that the Linux kernel
>>      generates its BTF information via pahole, using DWARF as a source:
>>
>>          +--------+  BTF                  BTF   +----------+
>>          | pahole |-------> vmlinux.btf ------->| verifier |
>>          +--------+                             +----------+
>>              ^                                        ^
>>              |                                        |
>>        DWARF |                                    BTF |
>>              |                                        |
>>           vmlinux                              +-------------+
>>           module1.ko                           | BPF program |
>>           module2.ko                           +-------------+
>>             ...
>>
>>      This is because:
>>
>>      a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>
>>      b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>          support for linking/deduplicating BTF in the linker.
>>
>>      In the scenario above, the verifier needs access to the pointer tags of
>>      both the kernel types/declarations (conveyed in the DWARF and translated
>>      to BTF by pahole) and those of the BPF program (available directly in BTF).
>>
>>      Another motivation for having the tag information in DWARF, unrelated to
>>      BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>      to benefit from these tags in order to differentiate between different
>>      kinds of pointers in the kernel.
>>
>> 3)  Conveying the tags in the generated BTF debug info.
>>
>>      This is easy: the main purpose of having this info in BTF is for the
>>      compiled eBPF programs. The kernel verifier can then access the tags
>>      of pointers used by the eBPF programs.
>>
>>
>> For more information about these tags and the motivation behind them, please
>> refer to the following linux kernel discussions:
>>
>>    https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>    https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>    https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>
>>
>> Implementation Overview
>> =======================
>>
>> To enable these annotations, two new C language attributes are added:
>> __attribute__((debug_annotate_decl("foo"))) and
>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>> arbitrary string constant argument, which will be recorded in the generated
>> DWARF and/or BTF debug information. They have no effect on code generation.
>>
>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>> btf_type_tag, respectively). While these attributes are functionally very
>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>> in the attribute name seems misleading.
>>
>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>> declarations and types will be checked for the corresponding attributes. If
>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>> the annotated type or declaration, one for each tag. These DIEs link the
>> arbitrary tag value to the item they annotate.
>>
>> For example, the following variable declaration:
>>
>>    #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>
>>    #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>    #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>
>>    int * __typetag1 x __decltag1 __decltag2;
> 
> Based on the above example
>          static int do_execve(struct filename *filename,
>            const char __user *const __user *__argv,
>            const char __user *const __user *__envp);
> 
> Should the above example be the below instead?
>      int __typetag1 * x __decltag1 __decltag2
> 

This example is not related to the one above. It is just meant to
show the behavior of both attributes. My apologies for not making
that clear.

>>
>> Produces the following DWARF information:
>>
>>   <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>      <1f>   DW_AT_name        : x
>>      <21>   DW_AT_decl_file   : 1
>>      <22>   DW_AT_decl_line   : 7
>>      <23>   DW_AT_decl_column : 18
>>      <24>   DW_AT_type        : <0x49>
>>      <28>   DW_AT_external    : 1
>>      <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>      <32>   DW_AT_sibling     : <0x49>
>>   <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>      <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>      <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>   <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>      <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>      <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>   <2><48>: Abbrev Number: 0
>>   <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>      <4a>   DW_AT_byte_size   : 8
>>      <4b>   DW_AT_type        : <0x5d>
>>      <4f>   DW_AT_sibling     : <0x5d>
>>   <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>      <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>      <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>   <2><5c>: Abbrev Number: 0
>>   <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>      <5e>   DW_AT_byte_size   : 4
>>      <5f>   DW_AT_encoding    : 5	(signed)
>>      <60>   DW_AT_name        : int
>>   <1><64>: Abbrev Number: 0
> 
> Maybe you can also show what dwarf debug_info looks like

I am not sure what you mean. This is the .debug_info section as output
by readelf -w. I did trim some information not relevant to the discussion
such as the DW_TAG_compile_unit DIE, for brevity.

> 
>>
>> In the case of BTF, the annotations are recorded in two type kinds recently
>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>> The above example declaration produces the following BTF information:
>>
>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>> [2] PTR '(anon)' type_id=3
>> [3] TYPE_TAG 'typetag1' type_id=1
>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>> [6] VAR 'x' type_id=2, linkage=global
>> [7] DATASEC '.bss' size=0 vlen=1
>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>
>>
> [...]


* Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-15 20:57   ` David Faust
@ 2022-06-15 22:56     ` Yonghong Song
  2022-06-17 17:18       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: " Jose E. Marchesi
  2022-11-01 22:29       ` Yonghong Song
  0 siblings, 2 replies; 25+ messages in thread
From: Yonghong Song @ 2022-06-15 22:56 UTC (permalink / raw)
  To: David Faust; +Cc: jose.marchesi, gcc-patches



On 6/15/22 1:57 PM, David Faust wrote:
> 
> 
> On 6/14/22 22:53, Yonghong Song wrote:
>>
>>
>> On 6/7/22 2:43 PM, David Faust wrote:
>>> Hello,
>>>
>>> This patch series adds support for:
>>>
>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>     to "tag") particular declarations and types with arbitrary strings. As
>>>     explained below, this is intended to be used to, for example, characterize
>>>     certain pointer types.
>>>
>>> - The conveyance of that information in the DWARF output in the form of a new
>>>     DIE: DW_TAG_GNU_annotation.
>>>
>>> - The conveyance of that information in the BTF output in the form of two new
>>>     kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>
>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>> them exists in some form in LLVM.
>>>
>>> Purpose
>>> =======
>>>
>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>       tags on certain language elements, such as struct fields.
>>>
>>>       The purpose of these annotations is to provide additional information about
>>>       types, variables, and function parameters of interest to the kernel. A
>>>       driving use case is to tag pointer types within the linux kernel and eBPF
>>>       programs with additional semantic information, such as '__user' or '__rcu'.
>>>
>>>       For example, consider the linux kernel function do_execve with the
>>>       following declaration:
>>>
>>>         static int do_execve(struct filename *filename,
>>>            const char __user *const __user *__argv,
>>>            const char __user *const __user *__envp);
>>>
>>>       Here, __user could be defined with these annotations to record semantic
>>>       information about the pointer parameters (e.g., they are user-provided) in
>>>       DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>       can read the tags and make use of the information.
>>>
>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>
>>>       The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>       generates its BTF information via pahole, using DWARF as a source:
>>>
>>>           +--------+  BTF                  BTF   +----------+
>>>           | pahole |-------> vmlinux.btf ------->| verifier |
>>>           +--------+                             +----------+
>>>               ^                                        ^
>>>               |                                        |
>>>         DWARF |                                    BTF |
>>>               |                                        |
>>>            vmlinux                              +-------------+
>>>            module1.ko                           | BPF program |
>>>            module2.ko                           +-------------+
>>>              ...
>>>
>>>       This is because:
>>>
>>>       a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>
>>>       b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>           support for linking/deduplicating BTF in the linker.
>>>
>>>       In the scenario above, the verifier needs access to the pointer tags of
>>>       both the kernel types/declarations (conveyed in the DWARF and translated
>>>       to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>
>>>       Another motivation for having the tag information in DWARF, unrelated to
>>>       BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>       to benefit from these tags in order to differentiate between different
>>>       kinds of pointers in the kernel.
>>>
>>> 3)  Conveying the tags in the generated BTF debug info.
>>>
>>>       This is easy: the main purpose of having this info in BTF is for the
>>>       compiled eBPF programs. The kernel verifier can then access the tags
>>>       of pointers used by the eBPF programs.
>>>
>>>
>>> For more information about these tags and the motivation behind them, please
>>> refer to the following linux kernel discussions:
>>>
>>>     https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>     https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>     https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>
>>>
>>> Implementation Overview
>>> =======================
>>>
>>> To enable these annotations, two new C language attributes are added:
>>> __attribute__((debug_annotate_decl("foo"))) and
>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>> arbitrary string constant argument, which will be recorded in the generated
>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>
>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>> btf_type_tag, respectively). While these attributes are functionally very
>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>> in the attribute name seems misleading.
>>>
>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>> declarations and types will be checked for the corresponding attributes. If
>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>> the annotated type or declaration, one for each tag. These DIEs link the
>>> arbitrary tag value to the item they annotate.
>>>
>>> For example, the following variable declaration:
>>>
>>>     #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>
>>>     #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>     #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>
>>>     int * __typetag1 x __decltag1 __decltag2;
>>
>> Based on the above example
>>           static int do_execve(struct filename *filename,
>>             const char __user *const __user *__argv,
>>             const char __user *const __user *__envp);
>>
>> Should the above example be the below instead?
>>       int __typetag1 * x __decltag1 __decltag2
>>
> 
> This example is not related to the one above. It is just meant to
> show the behavior of both attributes. My apologies for not making
> that clear.

Okay, it should be fine if the dwarf debug_info is shown.

> 
>>>
>>> Produces the following DWARF information:
>>>
>>>    <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>       <1f>   DW_AT_name        : x
>>>       <21>   DW_AT_decl_file   : 1
>>>       <22>   DW_AT_decl_line   : 7
>>>       <23>   DW_AT_decl_column : 18
>>>       <24>   DW_AT_type        : <0x49>
>>>       <28>   DW_AT_external    : 1
>>>       <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>       <32>   DW_AT_sibling     : <0x49>
>>>    <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>       <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>       <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>    <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>       <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>       <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>    <2><48>: Abbrev Number: 0
>>>    <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>       <4a>   DW_AT_byte_size   : 8
>>>       <4b>   DW_AT_type        : <0x5d>
>>>       <4f>   DW_AT_sibling     : <0x5d>
>>>    <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>       <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>       <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>    <2><5c>: Abbrev Number: 0
>>>    <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>       <5e>   DW_AT_byte_size   : 4
>>>       <5f>   DW_AT_encoding    : 5	(signed)
>>>       <60>   DW_AT_name        : int
>>>    <1><64>: Abbrev Number: 0

This shows the info in .debug_abbrev. What I meant was to
show the related info in the .debug_info section, which seems more useful for
understanding the relationships between the different tags. Maybe this is
because I do not fully understand what <1>/<2> means in <1><49> and
<2><53> etc.

>>
>> Maybe you can also show what dwarf debug_info looks like
> I am not sure what you mean. This is the .debug_info section as output
> by readelf -w. I did trim some information not relevant to the discussion
> such as the DW_TAG_compile_unit DIE, for brevity.
> 
>>
>>>
>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>> The above example declaration produces the following BTF information:
>>>
>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>> [2] PTR '(anon)' type_id=3
>>> [3] TYPE_TAG 'typetag1' type_id=1
>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>> [6] VAR 'x' type_id=2, linkage=global
>>> [7] DATASEC '.bss' size=0 vlen=1
>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>
>>>
>> [...]


* kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-15 22:56     ` Yonghong Song
@ 2022-06-17 17:18       ` Jose E. Marchesi
  2022-06-20 17:06         ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
  2022-11-01 22:29       ` Yonghong Song
  1 sibling, 1 reply; 25+ messages in thread
From: Jose E. Marchesi @ 2022-06-17 17:18 UTC (permalink / raw)
  To: Yonghong Song; +Cc: David Faust, gcc-patches


Hi Yonghong.

> On 6/15/22 1:57 PM, David Faust wrote:
>> 
>> On 6/14/22 22:53, Yonghong Song wrote:
>>>
>>>
>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>> Hello,
>>>>
>>>> This patch series adds support for:
>>>>
>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>     to "tag") particular declarations and types with arbitrary strings. As
>>>>     explained below, this is intended to be used to, for example, characterize
>>>>     certain pointer types.
>>>>
>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>     DIE: DW_TAG_GNU_annotation.
>>>>
>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>     kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>
>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>> them exists in some form in LLVM.
>>>>
>>>> Purpose
>>>> =======
>>>>
>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>       tags on certain language elements, such as struct fields.
>>>>
>>>>       The purpose of these annotations is to provide additional information about
>>>>       types, variables, and function parameters of interest to the kernel. A
>>>>       driving use case is to tag pointer types within the linux kernel and eBPF
>>>>       programs with additional semantic information, such as '__user' or '__rcu'.
>>>>
>>>>       For example, consider the linux kernel function do_execve with the
>>>>       following declaration:
>>>>
>>>>         static int do_execve(struct filename *filename,
>>>>            const char __user *const __user *__argv,
>>>>            const char __user *const __user *__envp);
>>>>
>>>>       Here, __user could be defined with these annotations to record semantic
>>>>       information about the pointer parameters (e.g., they are user-provided) in
>>>>       DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>       can read the tags and make use of the information.
>>>>
>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>
>>>>       The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>       generates its BTF information via pahole, using DWARF as a source:
>>>>
>>>>           +--------+  BTF                  BTF   +----------+
>>>>           | pahole |-------> vmlinux.btf ------->| verifier |
>>>>           +--------+                             +----------+
>>>>               ^                                        ^
>>>>               |                                        |
>>>>         DWARF |                                    BTF |
>>>>               |                                        |
>>>>            vmlinux                              +-------------+
>>>>            module1.ko                           | BPF program |
>>>>            module2.ko                           +-------------+
>>>>              ...
>>>>
>>>>       This is because:
>>>>
>>>>       a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>
>>>>       b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>           support for linking/deduplicating BTF in the linker.
>>>>
>>>>       In the scenario above, the verifier needs access to the pointer tags of
>>>>       both the kernel types/declarations (conveyed in the DWARF and translated
>>>>       to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>
>>>>       Another motivation for having the tag information in DWARF, unrelated to
>>>>       BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>       to benefit from these tags in order to differentiate between different
>>>>       kinds of pointers in the kernel.
>>>>
>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>
>>>>       This is easy: the main purpose of having this info in BTF is for the
>>>>       compiled eBPF programs. The kernel verifier can then access the tags
>>>>       of pointers used by the eBPF programs.
>>>>
>>>>
>>>> For more information about these tags and the motivation behind them, please
>>>> refer to the following linux kernel discussions:
>>>>
>>>>     https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>     https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>     https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>
>>>>
>>>> Implementation Overview
>>>> =======================
>>>>
>>>> To enable these annotations, two new C language attributes are added:
>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>> arbitrary string constant argument, which will be recorded in the generated
>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>
>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>> in the attribute name seems misleading.
>>>>
>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>> declarations and types will be checked for the corresponding attributes. If
>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>> arbitrary tag value to the item they annotate.
>>>>
>>>> For example, the following variable declaration:
>>>>
>>>>     #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>
>>>>     #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>     #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>
>>>>     int * __typetag1 x __decltag1 __decltag2;
>>>
>>> Based on the above example
>>>           static int do_execve(struct filename *filename,
>>>             const char __user *const __user *__argv,
>>>             const char __user *const __user *__envp);
>>>
>>> Should the above example be the below instead?
>>>       int __typetag1 * x __decltag1 __decltag2
>>>
>> This example is not related to the one above. It is just meant to
>> show the behavior of both attributes. My apologies for not making
>> that clear.
>
> Okay, it should be fine if the dwarf debug_info is shown.
>
>> 
>>>>
>>>> Produces the following DWARF information:
>>>>
>>>>    <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>       <1f>   DW_AT_name        : x
>>>>       <21>   DW_AT_decl_file   : 1
>>>>       <22>   DW_AT_decl_line   : 7
>>>>       <23>   DW_AT_decl_column : 18
>>>>       <24>   DW_AT_type        : <0x49>
>>>>       <28>   DW_AT_external    : 1
>>>>       <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>       <32>   DW_AT_sibling     : <0x49>
>>>>    <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>       <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>    <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>       <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>    <2><48>: Abbrev Number: 0
>>>>    <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>       <4a>   DW_AT_byte_size   : 8
>>>>       <4b>   DW_AT_type        : <0x5d>
>>>>       <4f>   DW_AT_sibling     : <0x5d>
>>>>    <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>       <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>    <2><5c>: Abbrev Number: 0
>>>>    <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>       <5e>   DW_AT_byte_size   : 4
>>>>       <5f>   DW_AT_encoding    : 5	(signed)
>>>>       <60>   DW_AT_name        : int
>>>>    <1><64>: Abbrev Number: 0
>
> This shows the info in .debug_abbrev. What I mean is to
> show the related info in the .debug_info section, which seems more useful for
> understanding the relationships between different tags. Maybe this is because
> I do not fully understand what <1>/<2> means in <1><49> and
> <2><53> etc.

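I think that dump actually shows .debug_info, with the abbrevs
expanded...  In that readelf output the first number in angle brackets
is the nesting depth of the DIE and the second is its offset, so each
<2> entry is a child of the <1> DIE that precedes it.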

Anyway, it seems to us that the root of this problem is the fact that the
kernel sparse annotations, such as address_space(__user), are:

1) To be processed by an external kernel-specific tool (
   https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
   C compiler, and therefore,

2) Not quite the same as compiler attributes (despite the way they
   look.)  In particular, they seem to assume an ordering different from
   that of GNU attributes: in some cases, given the same written order, they
   refer to different things!  Which is quite unfortunate :(

Now, if I understood properly, you plan to change the definition of
__user and __kernel in the kernel sources in order to generate the tag
compiler attributes, correct?

Is that the reason why LLVM implements what we assume to be the sparse
ordering, and not the correct GNU attributes ordering, for the tag
attributes?

If that is so, we have quite a problem here: I don't think we can change
the way GCC handles GNU-like attributes just because the kernel sources
want to hook on these __user/__kernel sparse annotations to generate the
compiler tags, even if we could mayhaps get GCC to handle
debug_annotate_type and debug_annotate_decl differently.  Some would say
doing so would perpetuate the mistake instead of fixing it...

Is my understanding correct?

>>>
>>> Maybe you can also show what dwarf debug_info looks like
>> I am not sure what you mean. This is the .debug_info section as output
>> by readelf -w. I did trim some information not relevant to the discussion
>> such as the DW_TAG_compile_unit DIE, for brevity.
>> 
>>>
>>>>
>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>> The above example declaration produces the following BTF information:
>>>>
>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>> [2] PTR '(anon)' type_id=3
>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>> [6] VAR 'x' type_id=2, linkage=global
>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>
>>>>
>>> [...]


* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-17 17:18       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: " Jose E. Marchesi
@ 2022-06-20 17:06         ` Yonghong Song
  2022-06-21 16:12           ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
  0 siblings, 1 reply; 25+ messages in thread
From: Yonghong Song @ 2022-06-20 17:06 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: David Faust, gcc-patches



On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
> 
> Hi Yonghong.
> 
>> On 6/15/22 1:57 PM, David Faust wrote:
>>>
>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>
>>>>
>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>> Hello,
>>>>>
>>>>> This patch series adds support for:
>>>>>
>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>      to "tag") particular declarations and types with arbitrary strings. As
>>>>>      explained below, this is intended to be used to, for example, characterize
>>>>>      certain pointer types.
>>>>>
>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>      DIE: DW_TAG_GNU_annotation.
>>>>>
>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>      kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>
>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>> them exists in some form in LLVM.
>>>>>
>>>>> Purpose
>>>>> =======
>>>>>
>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>        tags on certain language elements, such as struct fields.
>>>>>
>>>>>        The purpose of these annotations is to provide additional information about
>>>>>        types, variables, and function parameters of interest to the kernel. A
>>>>>        driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>        programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>
>>>>>        For example, consider the linux kernel function do_execve with the
>>>>>        following declaration:
>>>>>
>>>>>          static int do_execve(struct filename *filename,
>>>>>             const char __user *const __user *__argv,
>>>>>             const char __user *const __user *__envp);
>>>>>
>>>>>        Here, __user could be defined with these annotations to record semantic
>>>>>        information about the pointer parameters (e.g., they are user-provided) in
>>>>>        DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>        can read the tags and make use of the information.
>>>>>
>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>
>>>>>        The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>        generates its BTF information via pahole, using DWARF as a source:
>>>>>
>>>>>            +--------+  BTF                  BTF   +----------+
>>>>>            | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>            +--------+                             +----------+
>>>>>                ^                                        ^
>>>>>                |                                        |
>>>>>          DWARF |                                    BTF |
>>>>>                |                                        |
>>>>>             vmlinux                              +-------------+
>>>>>             module1.ko                           | BPF program |
>>>>>             module2.ko                           +-------------+
>>>>>               ...
>>>>>
>>>>>        This is because:
>>>>>
>>>>>        a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>
>>>>>        b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>            support for linking/deduplicating BTF in the linker.
>>>>>
>>>>>        In the scenario above, the verifier needs access to the pointer tags of
>>>>>        both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>        to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>
>>>>>        Another motivation for having the tag information in DWARF, unrelated to
>>>>>        BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>        to benefit from these tags in order to differentiate between different
>>>>>        kinds of pointers in the kernel.
>>>>>
>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>
>>>>>        This is easy: the main purpose of having this info in BTF is for the
>>>>>        compiled eBPF programs. The kernel verifier can then access the tags
>>>>>        of pointers used by the eBPF programs.
>>>>>
>>>>>
>>>>> For more information about these tags and the motivation behind them, please
>>>>> refer to the following linux kernel discussions:
>>>>>
>>>>>      https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>      https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>      https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>
>>>>>
>>>>> Implementation Overview
>>>>> =======================
>>>>>
>>>>> To enable these annotations, two new C language attributes are added:
>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>
>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>> in the attribute name seems misleading.
>>>>>
>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>> arbitrary tag value to the item they annotate.
>>>>>
>>>>> For example, the following variable declaration:
>>>>>
>>>>>      #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>
>>>>>      #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>      #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>
>>>>>      int * __typetag1 x __decltag1 __decltag2;
>>>>
>>>> Based on the above example
>>>>            static int do_execve(struct filename *filename,
>>>>              const char __user *const __user *__argv,
>>>>              const char __user *const __user *__envp);
>>>>
>>>> Should the above example be the below?
>>>>        int __typetag1 * x __decltag1 __decltag2
>>>>
>>> This example is not related to the one above. It is just meant to
>>> show the behavior of both attributes. My apologies for not making
>>> that clear.
>>
>> Okay, it should be fine if the dwarf debug_info is shown.
>>
>>>
>>>>>
>>>>> Produces the following DWARF information:
>>>>>
>>>>>     <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>        <1f>   DW_AT_name        : x
>>>>>        <21>   DW_AT_decl_file   : 1
>>>>>        <22>   DW_AT_decl_line   : 7
>>>>>        <23>   DW_AT_decl_column : 18
>>>>>        <24>   DW_AT_type        : <0x49>
>>>>>        <28>   DW_AT_external    : 1
>>>>>        <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>        <32>   DW_AT_sibling     : <0x49>
>>>>>     <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>        <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>        <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>     <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>        <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>        <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>     <2><48>: Abbrev Number: 0
>>>>>     <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>        <4a>   DW_AT_byte_size   : 8
>>>>>        <4b>   DW_AT_type        : <0x5d>
>>>>>        <4f>   DW_AT_sibling     : <0x5d>
>>>>>     <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>        <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>        <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>     <2><5c>: Abbrev Number: 0
>>>>>     <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>        <5e>   DW_AT_byte_size   : 4
>>>>>        <5f>   DW_AT_encoding    : 5	(signed)
>>>>>        <60>   DW_AT_name        : int
>>>>>     <1><64>: Abbrev Number: 0
>>
>> This shows the info in .debug_abbrev. What I mean is to
>> show the related info in the .debug_info section, which seems more useful for
>> understanding the relationships between different tags. Maybe this is because
>> I do not fully understand what <1>/<2> means in <1><49> and
>> <2><53> etc.
> 
> I think that dump actually shows .debug_info, with the abbrevs
> expanded...
> 
> Anyway, it seems to us that the root of this problem is the fact that the
> kernel sparse annotations, such as address_space(__user), are:
> 
> 1) To be processed by an external kernel-specific tool (
>     https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>     C compiler, and therefore,
> 
> 2) Not quite the same as compiler attributes (despite the way they
>     look.)  In particular, they seem to assume an ordering different from
>     that of GNU attributes: in some cases, given the same written order, they
>     refer to different things!  Which is quite unfortunate :(

Yes, currently __user/__kernel macros (implemented with address_space
attribute) are processed by macros.

> 
> Now, if I understood properly, you plan to change the definition of
> __user and __kernel in the kernel sources in order to generate the tag
> compiler attributes, correct?

Right. The original __user definition looks like:
   # define __user         __attribute__((noderef, address_space(__user)))

The new attribute looks like
   # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
   #  define __user        BTF_TYPE_TAG(user)
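To make the macro change concrete, here is a minimal sketch of how the
new definition expands. It assumes a compiler that may or may not know
btf_type_tag, so the attribute is guarded with __has_attribute and
falls back to nothing. TAG_STRING, tag_is_user and read_tagged are
hypothetical names used only for illustration; they are not in the
kernel.

```c
#include <string.h>

/* Fallback so the sketch also builds where __has_attribute is missing. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Same shape as the macro quoted above: #value turns the bare token
   `user` into the string literal "user". */
#if __has_attribute(btf_type_tag)
# define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
#else
# define BTF_TYPE_TAG(value) /* attribute unavailable: expands to nothing */
#endif
#define __user BTF_TYPE_TAG(user)

/* Hypothetical helper, only to make the stringified tag observable. */
#define TAG_STRING(value) #value

/* Returns 1 if stringifying the token `user` yields "user". */
int tag_is_user(void)
{
    return strcmp(TAG_STRING(user), "user") == 0;
}

/* Hypothetical function with a tagged pointee. The tag is meant for
   debug info only, so the load behaves like a plain dereference. */
int read_tagged(const int __user *p)
{
    return *p;
}
```

Calling read_tagged(&v) on an ordinary int behaves exactly like *(&v);
only the emitted debug info is expected to differ when the attribute is
available.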

> 
> Is that the reason why LLVM implements what we assume to be the sparse
> ordering, and not the correct GNU attributes ordering, for the tag
> attributes?

Note that __user attributes apply to pointees and not pointers.
Just like
    const int *p;
the 'const' is not applied to pointer 'p', but the pointee of 'p'.

What current llvm dwarf generation does with
    pointer
      <--- btf_type_tag
is just ONE implementation. As I said earlier, I am okay with
a dwarf implementation like
    p->btf_type_tag->const->int.
If you can propose an implementation like this in dwarf, I can propose
changing the implementation in llvm.

> 
> If that is so, we have quite a problem here: I don't think we can change
> the way GCC handles GNU-like attributes just because the kernel sources
> want to hook on these __user/__kernel sparse annotations to generate the
> compiler tags, even if we could mayhaps get GCC to handle
> debug_annotate_type and debug_annotate_decl differently.  Some would say
> doing so would perpetuate the mistake instead of fixing it...
> 
> Is my understanding correct?

Let us just say that the btf_type_tag attribute applies to pointees.
Does this help?

> 
>>>>
>>>> Maybe you can also show what dwarf debug_info looks like
>>> I am not sure what you mean. This is the .debug_info section as output
>>> by readelf -w. I did trim some information not relevant to the discussion
>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>
>>>>
>>>>>
>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>> The above example declaration produces the following BTF information:
>>>>>
>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>> [2] PTR '(anon)' type_id=3
>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>
>>>>>
>>>> [...]


* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-20 17:06         ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
@ 2022-06-21 16:12           ` Jose E. Marchesi
  2022-06-24 18:01             ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
  0 siblings, 1 reply; 25+ messages in thread
From: Jose E. Marchesi @ 2022-06-21 16:12 UTC (permalink / raw)
  To: Yonghong Song; +Cc: David Faust, gcc-patches


> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>> Hi Yonghong.
>> 
>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>
>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>
>>>>>
>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>> Hello,
>>>>>>
>>>>>> This patch series adds support for:
>>>>>>
>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>      to "tag") particular declarations and types with arbitrary strings. As
>>>>>>      explained below, this is intended to be used to, for example, characterize
>>>>>>      certain pointer types.
>>>>>>
>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>      DIE: DW_TAG_GNU_annotation.
>>>>>>
>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>      kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>
>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>> them exists in some form in LLVM.
>>>>>>
>>>>>> Purpose
>>>>>> =======
>>>>>>
>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>        tags on certain language elements, such as struct fields.
>>>>>>
>>>>>>        The purpose of these annotations is to provide additional information about
>>>>>>        types, variables, and function parameters of interest to the kernel. A
>>>>>>        driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>        programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>
>>>>>>        For example, consider the linux kernel function do_execve with the
>>>>>>        following declaration:
>>>>>>
>>>>>>          static int do_execve(struct filename *filename,
>>>>>>             const char __user *const __user *__argv,
>>>>>>             const char __user *const __user *__envp);
>>>>>>
>>>>>>        Here, __user could be defined with these annotations to record semantic
>>>>>>        information about the pointer parameters (e.g., they are user-provided) in
>>>>>>        DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>        can read the tags and make use of the information.
>>>>>>
>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>
>>>>>>        The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>        generates its BTF information via pahole, using DWARF as a source:
>>>>>>
>>>>>>            +--------+  BTF                  BTF   +----------+
>>>>>>            | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>            +--------+                             +----------+
>>>>>>                ^                                        ^
>>>>>>                |                                        |
>>>>>>          DWARF |                                    BTF |
>>>>>>                |                                        |
>>>>>>             vmlinux                              +-------------+
>>>>>>             module1.ko                           | BPF program |
>>>>>>             module2.ko                           +-------------+
>>>>>>               ...
>>>>>>
>>>>>>        This is because:
>>>>>>
>>>>>>        a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>
>>>>>>        b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>            support for linking/deduplicating BTF in the linker.
>>>>>>
>>>>>>        In the scenario above, the verifier needs access to the pointer tags of
>>>>>>        both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>        to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>
>>>>>>        Another motivation for having the tag information in DWARF, unrelated to
>>>>>>        BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>        to benefit from these tags in order to differentiate between different
>>>>>>        kinds of pointers in the kernel.
>>>>>>
>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>
>>>>>>        This is easy: the main purpose of having this info in BTF is for the
>>>>>>        compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>        of pointers used by the eBPF programs.
>>>>>>
>>>>>>
>>>>>> For more information about these tags and the motivation behind them, please
>>>>>> refer to the following linux kernel discussions:
>>>>>>
>>>>>>      https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>      https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>      https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>
>>>>>>
>>>>>> Implementation Overview
>>>>>> =======================
>>>>>>
>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>
>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>> in the attribute name seems misleading.
>>>>>>
>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>> arbitrary tag value to the item they annotate.
>>>>>>
>>>>>> For example, the following variable declaration:
>>>>>>
>>>>>>      #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>
>>>>>>      #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>      #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>
>>>>>>      int * __typetag1 x __decltag1 __decltag2;
>>>>>
>>>>> Based on the above example
>>>>>            static int do_execve(struct filename *filename,
>>>>>              const char __user *const __user *__argv,
>>>>>              const char __user *const __user *__envp);
>>>>>
>>>>> Should the above example be the below?
>>>>>        int __typetag1 * x __decltag1 __decltag2
>>>>>
>>>> This example is not related to the one above. It is just meant to
>>>> show the behavior of both attributes. My apologies for not making
>>>> that clear.
>>>
>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>
>>>>
>>>>>>
>>>>>> Produces the following DWARF information:
>>>>>>
>>>>>>     <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>        <1f>   DW_AT_name        : x
>>>>>>        <21>   DW_AT_decl_file   : 1
>>>>>>        <22>   DW_AT_decl_line   : 7
>>>>>>        <23>   DW_AT_decl_column : 18
>>>>>>        <24>   DW_AT_type        : <0x49>
>>>>>>        <28>   DW_AT_external    : 1
>>>>>>        <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>        <32>   DW_AT_sibling     : <0x49>
>>>>>>     <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>        <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>        <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>     <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>        <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>        <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>     <2><48>: Abbrev Number: 0
>>>>>>     <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>        <4a>   DW_AT_byte_size   : 8
>>>>>>        <4b>   DW_AT_type        : <0x5d>
>>>>>>        <4f>   DW_AT_sibling     : <0x5d>
>>>>>>     <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>        <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>        <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>     <2><5c>: Abbrev Number: 0
>>>>>>     <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>        <5e>   DW_AT_byte_size   : 4
>>>>>>        <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>        <60>   DW_AT_name        : int
>>>>>>     <1><64>: Abbrev Number: 0
>>>
>>> This shows the info in .debug_abbrev. What I mean is to
>>> show the related info in the .debug_info section, which seems more useful for
>>> understanding the relationships between different tags. Maybe this is because
>>> I do not fully understand what <1>/<2> means in <1><49> and
>>> <2><53> etc.
>> I think that dump actually shows .debug_info, with the abbrevs
>> expanded...
>> Anyway, it seems to us that the root of this problem is the fact that the
>> kernel sparse annotations, such as address_space(__user), are:
>> 1) To be processed by an external kernel-specific tool (
>>     https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>     C compiler, and therefore,
>> 2) Not quite the same as compiler attributes (despite the way they
>>     look.)  In particular, they seem to assume an ordering different from
>>     that of GNU attributes: in some cases, given the same written order, they
>>     refer to different things!  Which is quite unfortunate :(
>
> Yes, currently __user/__kernel macros (implemented with address_space
> attribute) are processed by macros.
>
>> Now, if I understood properly, you plan to change the definition of
>> __user and __kernel in the kernel sources in order to generate the tag
>> compiler attributes, correct?
>
> Right. The original __user definition looks like:
>   # define __user         __attribute__((noderef, address_space(__user)))
>
> The new attribute looks like
>   # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>   #  define __user        BTF_TYPE_TAG(user)

Ok I see.  So the kernel will stop using sparse attributes to implement
__user and __kernel and start using compiler attributes for tags
instead.

>> Is that the reason why LLVM implements what we assume to be the
>> sparse
>> ordering, and not the correct GNU attributes ordering, for the tag
>> attributes?
>
> Note that __user attributes apply to pointees and not pointers.
> Just like
>    const int *p;
> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>
> What current llvm dwarf generation does with
>    pointer
>      <--- btf_type_tag
> is just ONE implementation. As I said earlier, I am okay with
> a dwarf implementation like
>    p->btf_type_tag->const->int.
> If you can propose an implementation like this in dwarf, I can propose
> changing the implementation in llvm.

I think we are miscommunicating.

Looks like there is a divergence on what attributes apply to what
language entities between the sparse compiler and GCC/LLVM.  How to
represent that in DWARF is a different matter.

For this example:

  int __typetag1 * __typetag2 __typetag3 * g;

a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
b) LLVM associates __typetag1 with pointer-to-int.

Where:

a) Is the expected behavior of a compiler attribute, as documented in
   the GCC manual.

b) Is presumably what the sparse compiler expects, but _not_ the
   ordering expected for a GNU compiler attribute.

So, if the kernel source __user and __kernel annotations (which
currently expand to sparse attributes) follow the sparse ordering, and
you want to implement __user and __kernel in terms of compiler
attributes instead (the annotation attributes) then you will have to:

1) Fix LLVM to implement the usual ordering for these attributes and
2) fix the kernel sources to use that ordering

[Incidentally, the same applies to another "ex-sparse" attribute you
 have in the kernel and also implemented in LLVM with a weird ordering:
 the address_space attribute.]

For 2), it may be possible to write a Coccinelle script to generate the
patch...

Does this make sense?

>> If that is so, we have quite a problem here: I don't think we can
>> change
>> the way GCC handles GNU-like attributes just because the kernel sources
>> want to hook on these __user/__kernel sparse annotations to generate the
>> compiler tags, even if we could mayhaps get GCC to handle
>> debug_annotate_type and debug_annotate_decl differently.  Some would say
>> doing so would perpetuate the mistake instead of fixing it...
>> Is my understanding correct?
>
> Let us just say that the btf_type_tag attribute applies to pointees.
> Does this help?
>
>> 
>>>>>
>>>>> Maybe you can also show what dwarf debug_info looks like
>>>> I am not sure what you mean. This is the .debug_info section as output
>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>
>>>>>
>>>>>>
>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>> The above example declaration produces the following BTF information:
>>>>>>
>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>> [2] PTR '(anon)' type_id=3
>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>
>>>>>>
>>>>> [...]


* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-21 16:12           ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
@ 2022-06-24 18:01             ` Yonghong Song
  2022-07-07 20:24               ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
  0 siblings, 1 reply; 25+ messages in thread
From: Yonghong Song @ 2022-06-24 18:01 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: David Faust, gcc-patches



On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
> 
>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>> Hi Yonghong.
>>>
>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>
>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>
>>>>>>
>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> This patch series adds support for:
>>>>>>>
>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>       to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>       explained below, this is intended to be used to, for example, characterize
>>>>>>>       certain pointer types.
>>>>>>>
>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>       DIE: DW_TAG_GNU_annotation.
>>>>>>>
>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>       kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>
>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>> them exists in some form in LLVM.
>>>>>>>
>>>>>>> Purpose
>>>>>>> =======
>>>>>>>
>>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>         tags on certain language elements, such as struct fields.
>>>>>>>
>>>>>>>         The purpose of these annotations is to provide additional information about
>>>>>>>         types, variables, and function parameters of interest to the kernel. A
>>>>>>>         driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>         programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>
>>>>>>>         For example, consider the linux kernel function do_execve with the
>>>>>>>         following declaration:
>>>>>>>
>>>>>>>           static int do_execve(struct filename *filename,
>>>>>>>              const char __user *const __user *__argv,
>>>>>>>              const char __user *const __user *__envp);
>>>>>>>
>>>>>>>         Here, __user could be defined with these annotations to record semantic
>>>>>>>         information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>         DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>         can read the tags and make use of the information.
>>>>>>>
>>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>>
>>>>>>>         The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>         generates its BTF information via pahole, using DWARF as a source:
>>>>>>>
>>>>>>>             +--------+  BTF                  BTF   +----------+
>>>>>>>             | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>             +--------+                             +----------+
>>>>>>>                 ^                                        ^
>>>>>>>                 |                                        |
>>>>>>>           DWARF |                                    BTF |
>>>>>>>                 |                                        |
>>>>>>>              vmlinux                              +-------------+
>>>>>>>              module1.ko                           | BPF program |
>>>>>>>              module2.ko                           +-------------+
>>>>>>>                ...
>>>>>>>
>>>>>>>         This is because:
>>>>>>>
>>>>>>>         a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>
>>>>>>>         b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>             support for linking/deduplicating BTF in the linker.
>>>>>>>
>>>>>>>         In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>         both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>         to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>
>>>>>>>         Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>         BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>         to benefit from these tags in order to differentiate between different
>>>>>>>         kinds of pointers in the kernel.
>>>>>>>
>>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>>
>>>>>>>         This is easy: the main purpose of having this info in BTF is for the
>>>>>>>         compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>         of pointers used by the eBPF programs.
>>>>>>>
>>>>>>>
>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>> refer to the following linux kernel discussions:
>>>>>>>
>>>>>>>       https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>       https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>       https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>
>>>>>>>
>>>>>>> Implementation Overview
>>>>>>> =======================
>>>>>>>
>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>
>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>> in the attribute name seems misleading.
>>>>>>>
>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>
>>>>>>> For example, the following variable declaration:
>>>>>>>
>>>>>>>       #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>
>>>>>>>       #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>       #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>
>>>>>>>       int * __typetag1 x __decltag1 __decltag2;
>>>>>>
>>>>>> Based on the above example
>>>>>>             static int do_execve(struct filename *filename,
>>>>>>               const char __user *const __user *__argv,
>>>>>>               const char __user *const __user *__envp);
>>>>>>
>>>>>> Should the above example be the below?
>>>>>>         int __typetag1 * x __decltag1 __decltag2
>>>>>>
>>>>> This example is not related to the one above. It is just meant to
>>>>> show the behavior of both attributes. My apologies for not making
>>>>> that clear.
>>>>
>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>
>>>>>
>>>>>>>
>>>>>>> Produces the following DWARF information:
>>>>>>>
>>>>>>>      <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>         <1f>   DW_AT_name        : x
>>>>>>>         <21>   DW_AT_decl_file   : 1
>>>>>>>         <22>   DW_AT_decl_line   : 7
>>>>>>>         <23>   DW_AT_decl_column : 18
>>>>>>>         <24>   DW_AT_type        : <0x49>
>>>>>>>         <28>   DW_AT_external    : 1
>>>>>>>         <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>>         <32>   DW_AT_sibling     : <0x49>
>>>>>>>      <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>         <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>         <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>      <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>         <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>         <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>      <2><48>: Abbrev Number: 0
>>>>>>>      <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>         <4a>   DW_AT_byte_size   : 8
>>>>>>>         <4b>   DW_AT_type        : <0x5d>
>>>>>>>         <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>      <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>         <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>         <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>      <2><5c>: Abbrev Number: 0
>>>>>>>      <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>         <5e>   DW_AT_byte_size   : 4
>>>>>>>         <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>>         <60>   DW_AT_name        : int
>>>>>>>      <1><64>: Abbrev Number: 0
>>>>
>>>> This shows the info in .debug_abbrev. What I mean is to
>>>> show the related info in .debug_info section which seems more useful to
>>>> understand the relationships between different tags. Maybe this is due
>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>> <2><53> etc.
>>> I think that dump actually shows .debug_info, with the abbrevs
>>> expanded...
>>> Anyway, it seems to us that the root of this problem is the fact the
>>> kernel sparse annotations, such as address_space(__user), are:
>>> 1) To be processed by an external kernel-specific tool (
>>>      https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>      C compiler, and therefore,
>>> 2) Not quite the same as compiler attributes (despite the way they
>>>      look.)  In particular, they seem to assume an ordering different from
>>>      that of GNU attributes: in some cases, given the same written order,
>>>      they refer to different things!  Which is quite unfortunate :(
>>
>> Yes, currently __user/__kernel macros (implemented with address_space
>> attribute) are processed by macros.
>>
>>> Now, if I understood properly, you plan to change the definition of
>>> __user and __kernel in the kernel sources in order to generate the tag
>>> compiler attributes, correct?
>>
>> Right. The original __user definition looks like:
>>    # define __user         __attribute__((noderef, address_space(__user)))
>>
>> The new attribute looks like
>>    # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>    #  define __user        BTF_TYPE_TAG(user)
> 
> Ok I see.  So the kernel will stop using sparse attributes to implement
> __user and __kernel and start using compiler attributes for tags
> instead.
> 
>>> Is that the reason why LLVM implements what we assume to be the
>>> sparse
>>> ordering, and not the correct GNU attributes ordering, for the tag
>>> attributes?
>>
>> Note that __user attributes apply to pointee's and not pointers.
>> Just like
>>     const int *p;
>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>
>> What current llvm dwarf generation with
>>     pointer
>>       <--- btf_type_tag
>> is just ONE implementation. As I said earlier, I am okay to
>> have dwarf implementation like
>>     p->btf_type_tag->const->int.
>> If you can propose an implementation like this in dwarf. I can propose
>> to change implementation in llvm.
> 
> I think we are miscommunicating.
> 
> Looks like there is a divergence on what attributes apply to what
> language entities between the sparse compiler and GCC/LLVM.  How to
> represent that in DWARF is a different matter.
> 
> For this example:
> 
>    int __typetag1 * __typetag2 __typetag3 * g;
> 
> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
> b) LLVM associates __typetag1 to pointer-to-int.
> 
> Where:
> 
> a) Is the expected behavior of compiler attributes, as documented in
>     the GCC manual.
> 
> b) Is presumably what the sparse compiler expects, but _not_ the
>     ordering expected for a compiler GNU attribute.
> 
> So, if the kernel source __user and __kernel annotations (which
> currently expand to sparse attributes) follow the sparse ordering, and
> you want to implement __user and __kernel in terms of compiler
> attributes instead (the annotation attributes) then you will have to:
> 
> 1) Fix LLVM to implement the usual ordering for these attributes and
> 2) fix the kernel sources to use that ordering
> 
> [Incidentally, the same applies to another "ex-sparse" attribute you
>   have in the kernel and also implemented in LLVM with a weird ordering:
>   the address_space attribute.]
> 
> For 2), it may be possible to write a coccinelle script to generate the
> patch...

I don't think (2) (changing the kernel sources to use a different
attribute ordering) will work. So, apart from the macro replacement in
the kernel, the only thing we can do is in the compiler/pahole.
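
To make "macro replacement" concrete, here is a minimal sketch based on
the BTF_TYPE_TAG definition quoted above. The __has_attribute guard, the
empty fallback and the read_flag helper are assumptions added for
illustration only; they are not taken from the kernel sources.

```c
/* Sketch only: the guard, the fallback and read_flag() are hypothetical. */
#ifndef __has_attribute
# define __has_attribute(x) 0   /* compilers without __has_attribute */
#endif

#if __has_attribute(btf_type_tag)
# define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
#else
# define BTF_TYPE_TAG(value) /* expands to nothing */
#endif

/* Only the macro body changes; every use of __user in the sources
   stays exactly where it is written today. */
#define __user BTF_TYPE_TAG(user)

/* Hypothetical helper that takes a tagged pointer parameter. */
int read_flag(const int __user *flag)
{
    return *flag;
}
```

The point of the sketch is that the position of __user in each
declaration is left untouched, which is why the ordering question ends
up in the compiler/pahole rather than in the kernel sources.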

> 
> Does this make sense?
> 
>>> If that is so, we have quite a problem here: I don't think we can
>>> change
>>> the way GCC handles GNU-like attributes just because the kernel sources
>>> want to hook on these __user/__kernel sparse annotations to generate the
>>> compiler tags, even if we could mayhaps get GCC to handle
>>> debug_annotate_type and debug_annotate_decl differently.  Some would say
>>> doing so would perpetuate the mistake instead of fixing it...
>>> Is my understanding correct?
>>
>> Let us just say that the btf_type_tag attribute applies to pointees.
>> Does this help?
>>
>>>
>>>>>>
>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>
>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>
>>>>>>>
>>>>>> [...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-24 18:01             ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
@ 2022-07-07 20:24               ` Jose E. Marchesi
  2022-07-13  4:23                 ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
  0 siblings, 1 reply; 25+ messages in thread
From: Jose E. Marchesi @ 2022-07-07 20:24 UTC (permalink / raw)
  To: Yonghong Song; +Cc: David Faust, gcc-patches


Hi Yonghong.

> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>> 
>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>> Hi Yonghong.
>>>>
>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>
>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> This patch series adds support for:
>>>>>>>>
>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>       to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>       explained below, this is intended to be used to, for example, characterize
>>>>>>>>       certain pointer types.
>>>>>>>>
>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>       DIE: DW_TAG_GNU_annotation.
>>>>>>>>
>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>       kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>
>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>> them exists in some form in LLVM.
>>>>>>>>
>>>>>>>> Purpose
>>>>>>>> =======
>>>>>>>>
>>>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>         tags on certain language elements, such as struct fields.
>>>>>>>>
>>>>>>>>         The purpose of these annotations is to provide additional information about
>>>>>>>>         types, variables, and function parameters of interest to the kernel. A
>>>>>>>>         driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>         programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>
>>>>>>>>         For example, consider the linux kernel function do_execve with the
>>>>>>>>         following declaration:
>>>>>>>>
>>>>>>>>           static int do_execve(struct filename *filename,
>>>>>>>>              const char __user *const __user *__argv,
>>>>>>>>              const char __user *const __user *__envp);
>>>>>>>>
>>>>>>>>         Here, __user could be defined with these annotations to record semantic
>>>>>>>>         information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>         DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>         can read the tags and make use of the information.
>>>>>>>>
>>>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>>>
>>>>>>>>         The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>         generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>
>>>>>>>>             +--------+  BTF                  BTF   +----------+
>>>>>>>>             | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>             +--------+                             +----------+
>>>>>>>>                 ^                                        ^
>>>>>>>>                 |                                        |
>>>>>>>>           DWARF |                                    BTF |
>>>>>>>>                 |                                        |
>>>>>>>>              vmlinux                              +-------------+
>>>>>>>>              module1.ko                           | BPF program |
>>>>>>>>              module2.ko                           +-------------+
>>>>>>>>                ...
>>>>>>>>
>>>>>>>>         This is because:
>>>>>>>>
>>>>>>>>         a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>
>>>>>>>>         b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>             support for linking/deduplicating BTF in the linker.
>>>>>>>>
>>>>>>>>         In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>         both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>         to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>
>>>>>>>>         Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>         BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>         to benefit from these tags in order to differentiate between different
>>>>>>>>         kinds of pointers in the kernel.
>>>>>>>>
>>>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>>>
>>>>>>>>         This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>         compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>         of pointers used by the eBPF programs.
>>>>>>>>
>>>>>>>>
>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>
>>>>>>>>       https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>       https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>       https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>
>>>>>>>>
>>>>>>>> Implementation Overview
>>>>>>>> =======================
>>>>>>>>
>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>
>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>> in the attribute name seems misleading.
>>>>>>>>
>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>
>>>>>>>> For example, the following variable declaration:
>>>>>>>>
>>>>>>>>       #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>
>>>>>>>>       #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>       #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>
>>>>>>>>       int * __typetag1 x __decltag1 __decltag2;
>>>>>>>
>>>>>>> Based on the above example
>>>>>>>             static int do_execve(struct filename *filename,
>>>>>>>               const char __user *const __user *__argv,
>>>>>>>               const char __user *const __user *__envp);
>>>>>>>
>>>>>>> Should the above example be the below?
>>>>>>>         int __typetag1 * x __decltag1 __decltag2
>>>>>>>
>>>>>> This example is not related to the one above. It is just meant to
>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>> that clear.
>>>>>
>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> Produces the following DWARF information:
>>>>>>>>
>>>>>>>>      <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>         <1f>   DW_AT_name        : x
>>>>>>>>         <21>   DW_AT_decl_file   : 1
>>>>>>>>         <22>   DW_AT_decl_line   : 7
>>>>>>>>         <23>   DW_AT_decl_column : 18
>>>>>>>>         <24>   DW_AT_type        : <0x49>
>>>>>>>>         <28>   DW_AT_external    : 1
>>>>>>>>         <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>>>         <32>   DW_AT_sibling     : <0x49>
>>>>>>>>      <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>         <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>         <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>      <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>         <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>         <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>      <2><48>: Abbrev Number: 0
>>>>>>>>      <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>         <4a>   DW_AT_byte_size   : 8
>>>>>>>>         <4b>   DW_AT_type        : <0x5d>
>>>>>>>>         <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>      <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>         <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>         <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>      <2><5c>: Abbrev Number: 0
>>>>>>>>      <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>         <5e>   DW_AT_byte_size   : 4
>>>>>>>>         <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>>>         <60>   DW_AT_name        : int
>>>>>>>>      <1><64>: Abbrev Number: 0
>>>>>
>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>> show the related info in .debug_info section which seems more useful to
>>>>> understand the relationships between different tags. Maybe this is due
>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>> <2><53> etc.
>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>> expanded...
>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>> kernel sparse annotations, such as address_space(__user), are:
>>>> 1) To be processed by an external kernel-specific tool (
>>>>      https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>      C compiler, and therefore,
>>>> 2) Not quite the same as compiler attributes (despite the way they
>>>>      look.)  In particular, they seem to assume an ordering different from
>>>>      that of GNU attributes: in some cases, given the same written order,
>>>>      they refer to different things!  Which is quite unfortunate :(
>>>
>>> Yes, currently __user/__kernel macros (implemented with address_space
>>> attribute) are processed by macros.
>>>
>>>> Now, if I understood properly, you plan to change the definition of
>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>> compiler attributes, correct?
>>>
>>> Right. The original __user definition looks like:
>>>    # define __user         __attribute__((noderef, address_space(__user)))
>>>
>>> The new attribute looks like
>>>    # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>    #  define __user        BTF_TYPE_TAG(user)
>> Ok I see.  So the kernel will stop using sparse attributes to
>> implement
>> __user and __kernel and start using compiler attributes for tags
>> instead.
>> 
>>>> Is that the reason why LLVM implements what we assume to be the
>>>> sparse
>>>> ordering, and not the correct GNU attributes ordering, for the tag
>>>> attributes?
>>>
>>> Note that __user attributes apply to pointee's and not pointers.
>>> Just like
>>>     const int *p;
>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>
>>> What current llvm dwarf generation with
>>>     pointer
>>>       <--- btf_type_tag
>>> is just ONE implementation. As I said earlier, I am okay to
>>> have dwarf implementation like
>>>     p->btf_type_tag->const->int.
>>> If you can propose an implementation like this in dwarf. I can propose
>>> to change implementation in llvm.
>> I think we are miscommunicating.
>> Looks like there is a divergence on what attributes apply to what
>> language entities between the sparse compiler and GCC/LLVM.  How to
>> represent that in DWARF is a different matter.
>> For this example:
>>    int __typetag1 * __typetag2 __typetag3 * g;
>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>> b) LLVM associates __typetag1 to pointer-to-int.
>> Where:
>> a) Is the expected behavior of compiler attributes, as documented
>> in
>>     the GCC manual.
>> b) Is presumably what the sparse compiler expects, but _not_ the
>>     ordering expected for a compiler GNU attribute.
>> So, if the kernel source __user and __kernel annotations (which
>> currently expand to sparse attributes) follow the sparse ordering, and
>> you want to implement __user and __kernel in terms of compiler
>> attributes instead (the annotation attributes) then you will have to:
>> 1) Fix LLVM to implement the usual ordering for these attributes and
>> 2) fix the kernel sources to use that ordering
>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>   have in the kernel and also implemented in LLVM with a weird ordering:
>>   the address_space attribute.]
>> For 2), it may be possible to write a coccinelle script to generate
>> the
>> patch...
>
> I don't think (2) (to change kernel source for different attr ordering)
> will work. So the only thing we can do is in compiler/pahole except
> macro replacement in kernel.

I looked at sparse and its parser.  I wanted to be sure that the ordering
it uses to interpret sparse annotations (such as address_space, alignment,
etc) is definitely _not_ the same ordering used by __attribute__ in C
compilers.

It is very different indeed and the same can be said about how sparse
interprets other modifiers like `const': in sparse both `int const *foo'
and `int *const foo' parse to a constant pointer to int, for example.
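
For contrast, this is how a standard C compiler treats those two
spellings.  A minimal sketch; the names x, y, p1, p2, p3 and demo are
made up for illustration, and it only shows the C compiler side, not the
sparse behavior described above.

```c
/* Standard C: the position of const relative to '*' decides what is const. */
static int x = 1, y = 2;

/* Pointer to const int: the pointee is read-only, the pointer is not. */
static const int *p1 = &x;
static int const *p2 = &x;      /* same type as p1 */

/* Const pointer to int: the pointer is read-only, the pointee is not. */
static int *const p3 = &x;

/* _Generic drops top-level qualifiers of its controlling expression,
   so compare the types of the addresses of the objects instead. */
_Static_assert(_Generic(&p1, const int **: 1, default: 0),
               "p1 is a pointer to const int");
_Static_assert(_Generic(&p2, const int **: 1, default: 0),
               "p2 has the same type as p1");
_Static_assert(_Generic(&p3, int *const *: 1, default: 0),
               "p3 is a const pointer to int");

int demo(void)
{
    p2 = &y;            /* allowed: p2 itself is not const */
    *p3 = 10;           /* allowed: the int p3 points to is not const */
    return *p2 + *p3;   /* reads y through p2 and x through p3 */
}
```

In a C compiler, `p2 = &y' is accepted and `*p2 = 0' is rejected, while
for p3 it is the other way around.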

I am not one to judge how sparse handles its annotations.  It may very
well be pertinent for its particular purpose.

But I am not sure it is reasonable to expect C compilers to parse
certain type attributes differently, just because these attributes
happen to be reused from sparse annotations in a particular program (in
this case the kernel.)  The debug_annotate_decl and debug_annotate_type
attributes are not even intended to be kernel-specific.

So, if changing the kernel sources is not an option (why btw, other than
being a PITA?) at this point I really don't know what else to suggest :/

Any suggestion from the front-end people?

>> Does this make sense?
>> 
>>>> If that is so, we have quite a problem here: I don't think we can
>>>> change
>>>> the way GCC handles GNU-like attributes just because the kernel sources
>>>> want to hook on these __user/__kernel sparse annotations to generate the
>>>> compiler tags, even if we could mayhaps get GCC to handle
>>>> debug_annotate_type and debug_annotate_decl differently.  Some would say
>>>> doing so would perpetuate the mistake instead of fixing it...
>>>> Is my understanding correct?
>>>
>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>> Does this help?
>>>
>>>>
>>>>>>>
>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>
>>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>
>>>>>>>>
>>>>>>> [...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-07-07 20:24               ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
@ 2022-07-13  4:23                 ` Yonghong Song
  2022-07-14 15:09                   ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
  0 siblings, 1 reply; 25+ messages in thread
From: Yonghong Song @ 2022-07-13  4:23 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: David Faust, gcc-patches



On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
> 
> Hi Yonghong.
> 
>> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>>>
>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>> Hi Yonghong.
>>>>>
>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>
>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> This patch series adds support for:
>>>>>>>>>
>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>        to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>        explained below, this is intended to be used to, for example, characterize
>>>>>>>>>        certain pointer types.
>>>>>>>>>
>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>        DIE: DW_TAG_GNU_annotation.
>>>>>>>>>
>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>        kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>
>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>
>>>>>>>>> Purpose
>>>>>>>>> =======
>>>>>>>>>
>>>>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>          tags on certain language elements, such as struct fields.
>>>>>>>>>
>>>>>>>>>          The purpose of these annotations is to provide additional information about
>>>>>>>>>          types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>          driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>          programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>
>>>>>>>>>          For example, consider the linux kernel function do_execve with the
>>>>>>>>>          following declaration:
>>>>>>>>>
>>>>>>>>>            static int do_execve(struct filename *filename,
>>>>>>>>>               const char __user *const __user *__argv,
>>>>>>>>>               const char __user *const __user *__envp);
>>>>>>>>>
>>>>>>>>>          Here, __user could be defined with these annotations to record semantic
>>>>>>>>>          information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>          DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>          can read the tags and make use of the information.
>>>>>>>>>
>>>>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>>>>
>>>>>>>>>          The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>          generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>
>>>>>>>>>              +--------+  BTF                  BTF   +----------+
>>>>>>>>>              | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>              +--------+                             +----------+
>>>>>>>>>                  ^                                        ^
>>>>>>>>>                  |                                        |
>>>>>>>>>            DWARF |                                    BTF |
>>>>>>>>>                  |                                        |
>>>>>>>>>               vmlinux                              +-------------+
>>>>>>>>>               module1.ko                           | BPF program |
>>>>>>>>>               module2.ko                           +-------------+
>>>>>>>>>                 ...
>>>>>>>>>
>>>>>>>>>          This is because:
>>>>>>>>>
>>>>>>>>>          a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>
>>>>>>>>>          b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>              support for linking/deduplicating BTF in the linker.
>>>>>>>>>
>>>>>>>>>          In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>          both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>          to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>
>>>>>>>>>          Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>          BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>          to benefit from these tags in order to differentiate between different
>>>>>>>>>          kinds of pointers in the kernel.
>>>>>>>>>
>>>>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>>>>
>>>>>>>>>          This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>          compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>          of pointers used by the eBPF programs.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>
>>>>>>>>>        https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>        https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>        https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Implementation Overview
>>>>>>>>> =======================
>>>>>>>>>
>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>
>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>
>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>
>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>
>>>>>>>>>        #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>
>>>>>>>>>        #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>        #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>
>>>>>>>>>        int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>
>>>>>>>> Based on the above example
>>>>>>>>              static int do_execve(struct filename *filename,
>>>>>>>>                const char __user *const __user *__argv,
>>>>>>>>                const char __user *const __user *__envp);
>>>>>>>>
>>>>>>>> Should the above example be the below?
>>>>>>>>          int __typetag1 * x __decltag1 __decltag2
>>>>>>>>
>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>> that clear.
>>>>>>
>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>
>>>>>>>>>       <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>          <1f>   DW_AT_name        : x
>>>>>>>>>          <21>   DW_AT_decl_file   : 1
>>>>>>>>>          <22>   DW_AT_decl_line   : 7
>>>>>>>>>          <23>   DW_AT_decl_column : 18
>>>>>>>>>          <24>   DW_AT_type        : <0x49>
>>>>>>>>>          <28>   DW_AT_external    : 1
>>>>>>>>>          <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>>>>          <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>       <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>          <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>          <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>       <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>          <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>          <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>       <2><48>: Abbrev Number: 0
>>>>>>>>>       <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>          <4a>   DW_AT_byte_size   : 8
>>>>>>>>>          <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>          <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>       <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>          <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>          <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>       <2><5c>: Abbrev Number: 0
>>>>>>>>>       <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>          <5e>   DW_AT_byte_size   : 4
>>>>>>>>>          <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>>>>          <60>   DW_AT_name        : int
>>>>>>>>>       <1><64>: Abbrev Number: 0
>>>>>>
>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>> show the related info in .debug_info section which seems more useful to
>>>>>> understand the relationships between different tags. Maybe this is due
>>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>>> <2><53> etc.
>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>> expanded...
>>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>       https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>       C compiler, and therefore,
>>>>> 2) Not quite the same than compiler attributes (despite the way they
>>>>>       look.)  In particular, they seem to assume an ordering different than
>>>>>       of GNU attributes: in some cases given the same written order, they
>>>>>       refer to different things!.  Which is quite unfortunate :(
>>>>
>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>> attribute) are processed by macros.
>>>>
>>>>> Now, if I understood properly, you plan to change the definition of
>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>> compiler attributes, correct?
>>>>
>>>> Right. The original __user definition looks like:
>>>>     # define __user         __attribute__((noderef, address_space(__user)))
>>>>
>>>> The new attribute looks like
>>>>     # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>     #  define __user        BTF_TYPE_TAG(user)
>>> Ok I see.  So the kernel will stop using sparse attributes to implement
>>> __user and __kernel and start using compiler attributes for tags
>>> instead.
>>>
>>>>> Is that the reason why LLVM implements what we assume to be the sparse
>>>>> ordering, and not the correct GNU attributes ordering, for the tag
>>>>> attributes?
>>>>
>>>> Note that __user attributes apply to pointee's and not pointers.
>>>> Just like
>>>>      const int *p;
>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>
>>>> What current llvm dwarf generation with
>>>>      pointer
>>>>        <--- btf_type_tag
>>>> is just ONE implementation. As I said earlier, I am okay to
>>>> have dwarf implementation like
>>>>      p->btf_type_tag->const->int.
>>>> If you can propose an implementation like this in dwarf. I can propose
>>>> to change implementation in llvm.
>>> I think we are miscommunicating.
>>> Looks like there is a divergence on what attributes apply to what
>>> language entities between the sparse compiler and GCC/LLVM.  How to
>>> represent that in DWARF is a different matter.
>>> For this example:
>>>     int __typetag1 * __typetag2 __typetag3 * g;
>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>> b) LLVM associates __typetag1 to pointer-to-int.
>>> Where:
>>> a) Is the expected behavior of a compiler attribute, as documented in
>>>      the GCC manual.
>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>      ordering expected for a compiler GNU attribute.
>>> So, if the kernel source __user and __kernel annotations (which
>>> currently expand to sparse attributes) follow the sparse ordering, and
>>> you want to implement __user and __kernel in terms of compiler
>>> attributes instead (the annotation attributes) then you will have to:
>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>> 2) fix the kernel sources to use that ordering
>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>>    have in the kernel and also implemented in LLVM with a weird ordering:
>>>    the address_space attribute.]
>>> For 2), it may be possible to write a coccinelle script to generate the
>>> patch...
>>
>> I don't think (2) (to change kernel source for different attr ordering)
>> will work. So the only thing we can do is in compiler/pahole except
>> macro replacement in kernel.
> 
> I looked at sparse and its parser.  Wanted to be sure the ordering it
> uses to interpret sparse annotations (such as address_space, alignment,
> etc) is definitely _not_ the same ordering used by __attribute__ in C
> compilers.
> 
> It is very different indeed and the same can be said about how sparse
> interprets other modifiers like `const': in sparse both `int const *foo'
> and `int *const foo' parse to a constant pointer to int, for example.
> 
> I am not to judge how sparse handles its annotations.  It may be very
> well and pertinent for its particular purpose.
> 
> But I am not sure if it is reasonable to expect C compilers to implement
> certain type __attributes__ to parse differently, just because it
> happens these attributes are reused from sparse annotations in a
> particular program (in this case the kernel.)  The debug_annotate_decl
> and debug_annotate_type attributes are not even intended to be
> kernel-specific.
> 
> So, if changing the kernel sources is not an option (why btw, other than
> being a PITA?) at this point I really don't know what else to suggest :/
> 
> Any suggestion from the front-end people?

Just want to understand the overall picture. So gcc can still emit
BTF properly with btf_type_tag right? The issue we are talking about
here is about the dwarf, right? If this is the case, we might have
a partial solution here.
   - gcc emits BTF for vmlinux
   - gcc emits dwarf for vmlinux ignoring btf_type_tag
   - in pahole, vmlinux BTF is amended with some additional misc things.
There are some use cases for having btf_type_tag in DWARF, but
those can be worked around with BTF + DWARF, both of which are generated
by the compiler. Not elegant, but it probably works.

> 
>>> Does this make sense?
>>>
>>>>> If that is so, we have quite a problem here: I don't think we can change
>>>>> the way GCC handles GNU-like attributes just because the kernel sources
>>>>> want to hook on these __user/__kernel sparse annotations to generate the
>>>>> compiler tags, even if we could mayhaps get GCC to handle
>>>>> debug_annotate_type and debug_annotate_decl differently.  Some would say
>>>>> doing so would perpetuate the mistake instead of fixing it...
>>>>> Is my understanding correct?
>>>>
>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>> Does this help?
>>>>
>>>>>
>>>>>>>>
>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>
>>>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>
>>>>>>>>>
>>>>>>>> [...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-07-13  4:23                 ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
@ 2022-07-14 15:09                   ` Jose E. Marchesi
  2022-07-15  1:20                     ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
  0 siblings, 1 reply; 25+ messages in thread
From: Jose E. Marchesi @ 2022-07-14 15:09 UTC (permalink / raw)
  To: Yonghong Song; +Cc: David Faust, gcc-patches


Hi Yonghong.

> On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
>> Hi Yonghong.
>> 
>>> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>>>>
>>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>>> Hi Yonghong.
>>>>>>
>>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>>
>>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> This patch series adds support for:
>>>>>>>>>>
>>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>>        to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>>        explained below, this is intended to be used to, for example, characterize
>>>>>>>>>>        certain pointer types.
>>>>>>>>>>
>>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>>        DIE: DW_TAG_GNU_annotation.
>>>>>>>>>>
>>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>>        kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>
>>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>>
>>>>>>>>>> Purpose
>>>>>>>>>> =======
>>>>>>>>>>
>>>>>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>>          tags on certain language elements, such as struct fields.
>>>>>>>>>>
>>>>>>>>>>          The purpose of these annotations is to provide additional information about
>>>>>>>>>>          types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>>          driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>>          programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>>
>>>>>>>>>>          For example, consider the linux kernel function do_execve with the
>>>>>>>>>>          following declaration:
>>>>>>>>>>
>>>>>>>>>>            static int do_execve(struct filename *filename,
>>>>>>>>>>               const char __user *const __user *__argv,
>>>>>>>>>>               const char __user *const __user *__envp);
>>>>>>>>>>
>>>>>>>>>>          Here, __user could be defined with these annotations to record semantic
>>>>>>>>>>          information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>>          DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>>          can read the tags and make use of the information.
>>>>>>>>>>
>>>>>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>>>>>
>>>>>>>>>>          The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>>          generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>>
>>>>>>>>>>              +--------+  BTF                  BTF   +----------+
>>>>>>>>>>              | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>>              +--------+                             +----------+
>>>>>>>>>>                  ^                                        ^
>>>>>>>>>>                  |                                        |
>>>>>>>>>>            DWARF |                                    BTF |
>>>>>>>>>>                  |                                        |
>>>>>>>>>>               vmlinux                              +-------------+
>>>>>>>>>>               module1.ko                           | BPF program |
>>>>>>>>>>               module2.ko                           +-------------+
>>>>>>>>>>                 ...
>>>>>>>>>>
>>>>>>>>>>          This is because:
>>>>>>>>>>
>>>>>>>>>>          a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>>
>>>>>>>>>>          b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>>              support for linking/deduplicating BTF in the linker.
>>>>>>>>>>
>>>>>>>>>>          In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>>          both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>>          to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>>
>>>>>>>>>>          Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>>          BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>>          to benefit from these tags in order to differentiate between different
>>>>>>>>>>          kinds of pointers in the kernel.
>>>>>>>>>>
>>>>>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>>>>>
>>>>>>>>>>          This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>>          compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>>          of pointers used by the eBPF programs.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>>
>>>>>>>>>>        https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>>        https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>>        https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Implementation Overview
>>>>>>>>>> =======================
>>>>>>>>>>
>>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>>
>>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>>
>>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>>
>>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>>
>>>>>>>>>>        #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>>
>>>>>>>>>>        #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>>        #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>>
>>>>>>>>>>        int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>>
>>>>>>>>> Based on the above example
>>>>>>>>>              static int do_execve(struct filename *filename,
>>>>>>>>>                const char __user *const __user *__argv,
>>>>>>>>>                const char __user *const __user *__envp);
>>>>>>>>>
>>>>>>>>> Should the above example be the below?
>>>>>>>>>          int __typetag1 * x __decltag1 __decltag2
>>>>>>>>>
>>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>>> that clear.
>>>>>>>
>>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>>
>>>>>>>>>>       <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>>          <1f>   DW_AT_name        : x
>>>>>>>>>>          <21>   DW_AT_decl_file   : 1
>>>>>>>>>>          <22>   DW_AT_decl_line   : 7
>>>>>>>>>>          <23>   DW_AT_decl_column : 18
>>>>>>>>>>          <24>   DW_AT_type        : <0x49>
>>>>>>>>>>          <28>   DW_AT_external    : 1
>>>>>>>>>>          <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>>>>>          <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>>       <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>          <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>          <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>>       <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>          <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>          <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>>       <2><48>: Abbrev Number: 0
>>>>>>>>>>       <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>>          <4a>   DW_AT_byte_size   : 8
>>>>>>>>>>          <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>>          <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>>       <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>          <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>>          <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>>       <2><5c>: Abbrev Number: 0
>>>>>>>>>>       <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>>          <5e>   DW_AT_byte_size   : 4
>>>>>>>>>>          <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>>>>>          <60>   DW_AT_name        : int
>>>>>>>>>>       <1><64>: Abbrev Number: 0
>>>>>>>
>>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>>> show the related info in .debug_info section which seems more useful to
>>>>>>> understand the relationships between different tags. Maybe this is due
>>>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>>>> <2><53> etc.
>>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>>> expanded...
>>>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>>       https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>>       C compiler, and therefore,
>>>>>> 2) Not quite the same than compiler attributes (despite the way they
>>>>>>       look.)  In particular, they seem to assume an ordering different than
>>>>>>       of GNU attributes: in some cases given the same written order, they
>>>>>>       refer to different things!.  Which is quite unfortunate :(
>>>>>
>>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>>> attribute) are processed by macros.
>>>>>
>>>>>> Now, if I understood properly, you plan to change the definition of
>>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>>> compiler attributes, correct?
>>>>>
>>>>> Right. The original __user definition looks like:
>>>>>     # define __user         __attribute__((noderef, address_space(__user)))
>>>>>
>>>>> The new attribute looks like
>>>>>     # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>>     #  define __user        BTF_TYPE_TAG(user)
>>>> Ok I see.  So the kernel will stop using sparse attributes to implement
>>>> __user and __kernel and start using compiler attributes for tags
>>>> instead.
>>>>
>>>>>> Is that the reason why LLVM implements what we assume to be the sparse
>>>>>> ordering, and not the correct GNU attributes ordering, for the tag
>>>>>> attributes?
>>>>>
>>>>> Note that __user attributes apply to pointee's and not pointers.
>>>>> Just like
>>>>>      const int *p;
>>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>>
>>>>> What current llvm dwarf generation with
>>>>>      pointer
>>>>>        <--- btf_type_tag
>>>>> is just ONE implementation. As I said earlier, I am okay to
>>>>> have dwarf implementation like
>>>>>      p->btf_type_tag->const->int.
>>>>> If you can propose an implementation like this in dwarf. I can propose
>>>>> to change implementation in llvm.
>>>> I think we are miscommunicating.
>>>> Looks like there is a divergence on what attributes apply to what
>>>> language entities between the sparse compiler and GCC/LLVM.  How to
>>>> represent that in DWARF is a different matter.
>>>> For this example:
>>>>     int __typetag1 * __typetag2 __typetag3 * g;
>>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>>> b) LLVM associates __typetag1 to pointer-to-int.
>>>> Where:
>>>> a) Is the expected behavior of a compiler attribute, as documented in
>>>>      the GCC manual.
>>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>>      ordering expected for a compiler GNU attribute.
>>>> So, if the kernel source __user and __kernel annotations (which
>>>> currently expand to sparse attributes) follow the sparse ordering, and
>>>> you want to implement __user and __kernel in terms of compiler
>>>> attributes instead (the annotation attributes) then you will have to:
>>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>>> 2) fix the kernel sources to use that ordering
>>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>>>    have in the kernel and also implemented in LLVM with a weird ordering:
>>>>    the address_space attribute.]
>>>> For 2), it may be possible to write a coccinelle script to generate the
>>>> patch...
>>>
>>> I don't think (2) (to change kernel source for different attr ordering)
>>> will work. So the only thing we can do is in compiler/pahole except
>>> macro replacement in kernel.
>> I looked at sparse and its parser.  Wanted to be sure the ordering it
>> uses to interpret sparse annotations (such as address_space, alignment,
>> etc) is definitely _not_ the same ordering used by __attribute__ in C
>> compilers.
>>
>> It is very different indeed and the same can be said about how sparse
>> interprets other modifiers like `const': in sparse both `int const *foo'
>> and `int *const foo' parse to a constant pointer to int, for example.
>>
>> I am not to judge how sparse handles its annotations.  It may be very
>> well and pertinent for its particular purpose.
>>
>> But I am not sure if it is reasonable to expect C compilers to implement
>> certain type __attributes__ to parse differently, just because it
>> happens these attributes are reused from sparse annotations in a
>> particular program (in this case the kernel.)  The debug_annotate_decl
>> and debug_annotate_type attributes are not even intended to be
>> kernel-specific.
>>
>> So, if changing the kernel sources is not an option (why btw, other than
>> being a PITA?) at this point I really don't know what else to suggest :/
>>
>> Any suggestion from the front-end people?
>
> Just want to understand the overall picture. So gcc can still emit
> BTF properly with btf_type_tag right? The issue we are talking about
> here is about the dwarf, right?

If by "properly" you mean how sparse handles its annotations, then not
really.

The issue we are talking about is rather a language-level one: to what
entity/type the compiler attribute applies.

So, for:

  int __attribute__((debug_annotate_type("user"))) *foo;

GCC will apply the attribute to the int type, following the rules for
type attributes (sparse would apply the annotation to the *int type
instead).  The emitted debug info (be it DWARF or BTF) will reflect
that, no more no less :/
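
As a point of comparison, the `const' analogy used earlier in this
thread can be checked at compile time.  The following is only a rough
sketch of how ISO C decides which level of a pointer type a qualifier
binds to.  It does not use the debug_annotate attributes at all, and the
names in it (CONST_LEVEL, plain, west_const, east_const, const_pointer)
are made up for the illustration:

```c
#include <assert.h>  /* static_assert macro (C11) */

/* Which level of an `int *` object carries the const qualifier? */
enum const_level { CONST_NONE, CONST_ON_POINTEE, CONST_ON_POINTER };

/* Hypothetical helper.  It takes the ADDRESS of a pointer object, because
   _Generic drops top-level qualifiers from its controlling expression;
   going through one more pointer level keeps them visible. */
#define CONST_LEVEL(addr_of_ptr)              \
  _Generic((addr_of_ptr),                     \
           int **: CONST_NONE,                \
           const int **: CONST_ON_POINTEE,    \
           int *const *: CONST_ON_POINTER)

int *plain;                    /* nothing is const                      */
const int *west_const;         /* const written before the type name    */
int const *east_const;         /* const written after the type name     */
int *const const_pointer = 0;  /* const written after the '*'           */

/* In ISO C, both spellings before the '*' qualify the pointee ... */
static_assert(CONST_LEVEL(&west_const) == CONST_ON_POINTEE,
              "const int * qualifies the pointee");
static_assert(CONST_LEVEL(&east_const) == CONST_ON_POINTEE,
              "int const * qualifies the pointee");
/* ... and only a qualifier after the '*' applies to the pointer itself. */
static_assert(CONST_LEVEL(&const_pointer) == CONST_ON_POINTER,
              "int *const qualifies the pointer");
static_assert(CONST_LEVEL(&plain) == CONST_NONE,
              "int * has no const at either level");
```

Compiling this as a C11 translation unit (for example with -std=c11 -c)
is enough to run the checks.  Where GNU type attributes bind in the same
written positions is governed by the attribute syntax rules in the GCC
manual, which the sketch does not try to reproduce.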

> If this is the case, we might have
> a partial solution here.
>   - gcc emits BTF for vmlinux

Note that for emitting BTF for vmlinux we would need support in the
linker to merge and deduplicate BTF, which at the moment we don't have.

>   - gcc emits dwarf for vmlinux ignoring btf_type_tag
>   - in pahole, vmlinux BTF is amended with some additional misc things.
> There are some use cases for having btf_type_tag in dwarf, but those
> can be worked around with BTF + dwarf, both of which are generated
> by the compiler. Not elegant, but it probably works.
>> 
>>>> Does this make sense?
>>>>
>>>>>> If that is so, we have quite a problem here: I don't think we can change
>>>>>> the way GCC handles GNU-like attributes just because the kernel sources
>>>>>> want to hook on these __user/__kernel sparse annotations to generate the
>>>>>> compiler tags, even if we could mayhaps get GCC to handle
>>>>>> debug_annotate_type and debug_annotate_decl differently.  Some would say
>>>>>> doing so would perpetuate the mistake instead of fixing it...
>>>>>> Is my understanding correct?
>>>>>
>>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>>> Does this help?
>>>>>
>>>>>>
>>>>>>>>>
>>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>>
>>>>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> [...]


* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-07-14 15:09                   ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
@ 2022-07-15  1:20                     ` Yonghong Song
  2022-07-15 14:17                       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
  0 siblings, 1 reply; 25+ messages in thread
From: Yonghong Song @ 2022-07-15  1:20 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: David Faust, gcc-patches



On 7/14/22 8:09 AM, Jose E. Marchesi wrote:
> 
> Hi Yonghong.
> 
>> On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
>>> Hi Yonghong.
>>>
>>>> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>>>>>
>>>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>>>> Hi Yonghong.
>>>>>>>
>>>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>>>
>>>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> This patch series adds support for:
>>>>>>>>>>>
>>>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>>>         to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>>>         explained below, this is intended to be used to, for example, characterize
>>>>>>>>>>>         certain pointer types.
>>>>>>>>>>>
>>>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>>>         DIE: DW_TAG_GNU_annotation.
>>>>>>>>>>>
>>>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>>>         kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>
>>>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>>>
>>>>>>>>>>> Purpose
>>>>>>>>>>> =======
>>>>>>>>>>>
>>>>>>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>>>           tags on certain language elements, such as struct fields.
>>>>>>>>>>>
>>>>>>>>>>>           The purpose of these annotations is to provide additional information about
>>>>>>>>>>>           types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>>>           driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>>>           programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>>>
>>>>>>>>>>>           For example, consider the linux kernel function do_execve with the
>>>>>>>>>>>           following declaration:
>>>>>>>>>>>
>>>>>>>>>>>             static int do_execve(struct filename *filename,
>>>>>>>>>>>                const char __user *const __user *__argv,
>>>>>>>>>>>                const char __user *const __user *__envp);
>>>>>>>>>>>
>>>>>>>>>>>           Here, __user could be defined with these annotations to record semantic
>>>>>>>>>>>           information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>>>           DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>>>           can read the tags and make use of the information.
>>>>>>>>>>>
>>>>>>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>>>>>>
>>>>>>>>>>>           The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>>>           generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>>>
>>>>>>>>>>>               +--------+  BTF                  BTF   +----------+
>>>>>>>>>>>               | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>>>               +--------+                             +----------+
>>>>>>>>>>>                   ^                                        ^
>>>>>>>>>>>                   |                                        |
>>>>>>>>>>>             DWARF |                                    BTF |
>>>>>>>>>>>                   |                                        |
>>>>>>>>>>>                vmlinux                              +-------------+
>>>>>>>>>>>                module1.ko                           | BPF program |
>>>>>>>>>>>                module2.ko                           +-------------+
>>>>>>>>>>>                  ...
>>>>>>>>>>>
>>>>>>>>>>>           This is because:
>>>>>>>>>>>
>>>>>>>>>>>           a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>>>
>>>>>>>>>>>           b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>>>               support for linking/deduplicating BTF in the linker.
>>>>>>>>>>>
>>>>>>>>>>>           In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>>>           both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>>>           to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>>>
>>>>>>>>>>>           Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>>>           BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>>>           to benefit from these tags in order to differentiate between different
>>>>>>>>>>>           kinds of pointers in the kernel.
>>>>>>>>>>>
>>>>>>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>>>>>>
>>>>>>>>>>>           This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>>>           compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>>>           of pointers used by the eBPF programs.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>>>
>>>>>>>>>>>         https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>>>         https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>>>         https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Implementation Overview
>>>>>>>>>>> =======================
>>>>>>>>>>>
>>>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>>>
>>>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>>>
>>>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>>>
>>>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>>>
>>>>>>>>>>>         #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>>>
>>>>>>>>>>>         #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>>>         #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>>>
>>>>>>>>>>>         int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>>>
>>>>>>>>>> Based on the above example
>>>>>>>>>>               static int do_execve(struct filename *filename,
>>>>>>>>>>                 const char __user *const __user *__argv,
>>>>>>>>>>                 const char __user *const __user *__envp);
>>>>>>>>>>
>>>>>>>>>> Should the above example be the below?
>>>>>>>>>>           int __typetag1 * x __decltag1 __decltag2
>>>>>>>>>>
>>>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>>>> that clear.
>>>>>>>>
>>>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>>>
>>>>>>>>>>>        <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>>>           <1f>   DW_AT_name        : x
>>>>>>>>>>>           <21>   DW_AT_decl_file   : 1
>>>>>>>>>>>           <22>   DW_AT_decl_line   : 7
>>>>>>>>>>>           <23>   DW_AT_decl_column : 18
>>>>>>>>>>>           <24>   DW_AT_type        : <0x49>
>>>>>>>>>>>           <28>   DW_AT_external    : 1
>>>>>>>>>>>           <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>>>>>>           <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>>>        <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>           <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>           <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>>>        <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>           <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>           <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>>>        <2><48>: Abbrev Number: 0
>>>>>>>>>>>        <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>>>           <4a>   DW_AT_byte_size   : 8
>>>>>>>>>>>           <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>>>           <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>>>        <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>           <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>>>           <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>>>        <2><5c>: Abbrev Number: 0
>>>>>>>>>>>        <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>>>           <5e>   DW_AT_byte_size   : 4
>>>>>>>>>>>           <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>>>>>>           <60>   DW_AT_name        : int
>>>>>>>>>>>        <1><64>: Abbrev Number: 0
>>>>>>>>
>>>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>>>> show the related info in the .debug_info section, which seems more
>>>>>>>> useful for understanding the relationships between different tags.
>>>>>>>> Maybe this is because I do not fully understand what <1>/<2> means
>>>>>>>> in <1><49> and <2><53> etc.
>>>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>>>> expanded...
>>>>>>>
>>>>>>> Anyway, it seems to us that the root of this problem is the fact that the
>>>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>>>>
>>>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>>>        https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>>>        C compiler, and therefore,
>>>>>>>
>>>>>>> 2) Not quite the same as compiler attributes (despite the way they
>>>>>>>        look.)  In particular, they seem to assume an ordering different from
>>>>>>>        that of GNU attributes: in some cases given the same written order, they
>>>>>>>        refer to different things!  Which is quite unfortunate :(
>>>>>>
>>>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>>>> attribute) are processed by macros.
>>>>>>
>>>>>>> Now, if I understood properly, you plan to change the definition of
>>>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>>>> compiler attributes, correct?
>>>>>>
>>>>>> Right. The original __user definition looks like:
>>>>>>      # define __user         __attribute__((noderef, address_space(__user)))
>>>>>>
>>>>>> The new attribute looks like
>>>>>>      # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>>>      #  define __user        BTF_TYPE_TAG(user)
>>>>> Ok I see.  So the kernel will stop using sparse attributes to implement
>>>>> __user and __kernel and start using compiler attributes for tags
>>>>> instead.
>>>>>
>>>>>>> Is that the reason why LLVM implements what we assume to be the sparse
>>>>>>> ordering, and not the correct GNU attributes ordering, for the tag
>>>>>>> attributes?
>>>>>>
>>>>>> Note that __user attributes apply to pointees and not pointers.
>>>>>> Just like
>>>>>>       const int *p;
>>>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>>>
>>>>>> What current llvm dwarf generation with
>>>>>>       pointer
>>>>>>         <--- btf_type_tag
>>>>>> is just ONE implementation. As I said earlier, I am okay to
>>>>>> have dwarf implementation like
>>>>>>       p->btf_type_tag->const->int.
>>>>>> If you can propose an implementation like this in dwarf. I can propose
>>>>>> to change implementation in llvm.
>>>>> I think we are miscommunicating.
>>>>>
>>>>> Looks like there is a divergence on what attributes apply to what
>>>>> language entities between the sparse compiler and GCC/LLVM.  How to
>>>>> represent that in DWARF is a different matter.
>>>>>
>>>>> For this example:
>>>>>
>>>>>      int __typetag1 * __typetag2 __typetag3 * g;
>>>>>
>>>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>>>> b) LLVM associates __typetag1 with pointer-to-int.
>>>>>
>>>>> Where:
>>>>>
>>>>> a) Is the expected behavior of a compiler attribute, as documented in
>>>>>       the GCC manual.
>>>>>
>>>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>>>       ordering expected for a compiler GNU attribute.
>>>>>
>>>>> So, if the kernel source __user and __kernel annotations (which
>>>>> currently expand to sparse attributes) follow the sparse ordering, and
>>>>> you want to implement __user and __kernel in terms of compiler
>>>>> attributes instead (the annotation attributes) then you will have to:
>>>>>
>>>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>>>> 2) fix the kernel sources to use that ordering
>>>>>
>>>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>>>>     have in the kernel and also implemented in LLVM with a weird ordering:
>>>>>     the address_space attribute.]
>>>>>
>>>>> For 2), it may be possible to write a coccinelle script to generate the
>>>>> patch...
>>>>
>>>> I don't think (2) (to change kernel source for different attr ordering)
>>>> will work. So the only thing we can do is in compiler/pahole except
>>>> macro replacement in kernel.
>>> I looked at sparse and its parser.  Wanted to be sure the ordering it
>>> uses to interpret sparse annotations (such as address_space, alignment,
>>> etc) is definitely _not_ the same ordering used by __attribute__ in C
>>> compilers.
>>>
>>> It is very different indeed and the same can be said about how sparse
>>> interprets other modifiers like `const': in sparse both `int const *foo'
>>> and `int *const foo' parse to a constant pointer to int, for example.
>>>
>>> I am not to judge how sparse handles its annotations.  It may be very
>>> well and pertinent for its particular purpose.
>>>
>>> But I am not sure if it is reasonable to expect C compilers to implement
>>> certain type __attributes__ to parse differently, just because it
>>> happens these attributes are reused from sparse annotations in a
>>> particular program (in this case the kernel.)  The debug_annotate_decl
>>> and debug_annotate_type attributes are not even intended to be
>>> kernel-specific.
>>>
>>> So, if changing the kernel sources is not an option (why btw, other than
>>> being a PITA?) at this point I really don't know what else to suggest :/
>>>
>>> Any suggestion from the front-end people?
>>
>> Just want to understand the overall picture. So gcc can still emit
>> BTF properly with btf_type_tag right? The issue we are talking about
>> here is about the dwarf, right?
> 
> If by "properly" you mean how sparse handles its annotations, then not
> really.
> 
> The issue we are talking about is rather a language-level one: to what
> entity/type the compiler attribute applies.
> 
> So, for:
> 
>    int __attribute__((debug_annotate_type("user"))) *foo;
> 
> GCC will apply the attribute to the int type, following the rules for
> type attributes (sparse would apply the annotation to the pointer-to-int
> type instead).  The emitted debug info (be it DWARF or BTF) will reflect
> that, no more, no less :/

I don't know what 'apply the attribute to the int' means.
In the current clang implementation, it means the following dwarf chain,
read from right to left:
   variable 'foo'
     type: ptr
       base type: attr_type: attr
                     underlying type: int

So the type chain is foo -> ptr -> attr -> int
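
To make the shape of that chain concrete, here is a minimal sketch that
models it as a linked list and walks it.  The node kinds, struct and
function names are hypothetical and exist only for this illustration;
they are not DWARF or BTF identifiers, and this is not how clang stores
the information internally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node kinds, for illustration only. */
enum node_kind { NODE_VAR, NODE_PTR, NODE_ATTR, NODE_BASE };

struct type_node {
    enum node_kind kind;
    const char *label;
    const struct type_node *next;  /* type this node refers to; NULL at the end */
};

/* The chain from the message above: foo -> ptr -> attr -> int */
static const struct type_node node_int  = { NODE_BASE, "int",  NULL };
static const struct type_node node_attr = { NODE_ATTR, "attr", &node_int };
static const struct type_node node_ptr  = { NODE_PTR,  "ptr",  &node_attr };
static const struct type_node node_foo  = { NODE_VAR,  "foo",  &node_ptr };

/* Render the chain as "a -> b -> c" into buf and return buf. */
static char *format_chain(const struct type_node *node, char *buf, size_t size)
{
    size_t used = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (; node != NULL; node = node->next) {
        int n = snprintf(buf + used, size - used, "%s%s",
                         node->label, node->next ? " -> " : "");
        if (n < 0 || (size_t)n >= size - used)
            break;  /* output truncated; stop walking */
        used += (size_t)n;
    }
    return buf;
}

/* Return 1 if some pointer node refers directly to an attr node,
   i.e. the tag sits on the pointee rather than on the pointer. */
static int tag_is_on_pointee(const struct type_node *node)
{
    for (; node != NULL && node->next != NULL; node = node->next)
        if (node->kind == NODE_PTR && node->next->kind == NODE_ATTR)
            return 1;
    return 0;
}
```

In this model the attr node sits between the pointer and the base type,
which is the sense in which the tag applies to the pointee.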

> 
>> If this is the case, we might have
>> a partial solution here.
>>    - gcc emits BTF for vmlinux
> 
> Note that for emitting BTF for vmlinux we would need support in the
> linker to merge and deduplicate BTF, which at the moment we don't have.

This should be okay. pahole will merge and deduplicate btf. In pahole's
'-j' mode, each thread converts the dwarf of a .o file to btf, and
pahole then merges and deduplicates the btf.
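
As a rough picture of the merge-and-deduplicate step, the sketch below
merges per-object type tables into one table and drops repeated entries.
It is a deliberately simplified assumption: types are flat strings
compared by name, and every identifier here is made up for the example.
Real BTF deduplication has to compare type graphs, and this is not
pahole's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TYPES 32

/* Hypothetical flat type table: one entry per type description. */
struct type_table {
    const char *names[MAX_TYPES];
    size_t count;
};

/* Return the index of name in t, adding it if absent; -1 if t is full. */
static int table_intern(struct type_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return (int)i;          /* already present: deduplicated */
    if (t->count == MAX_TYPES)
        return -1;
    t->names[t->count] = name;
    return (int)t->count++;
}

/* Merge src into dst, skipping entries dst already has.
   Returns the number of entries newly added, or -1 if dst overflowed. */
static int table_merge(struct type_table *dst, const struct type_table *src)
{
    int added = 0;

    for (size_t i = 0; i < src->count; i++) {
        size_t before = dst->count;
        if (table_intern(dst, src->names[i]) < 0)
            return -1;
        if (dst->count > before)
            added++;
    }
    return added;
}
```

Each per-object table would stand for the btf produced from one .o
file; merging them one after another leaves a single table in which
shared types such as "int" appear once.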

> 
>>    - gcc emits dwarf for vmlinux ignoring btf_type_tag
>>    - in pahole, vmlinux BTF is amended with some additional misc things.
>> There are some use cases for having btf_type_tag in dwarf, but those
>> can be worked around with BTF + dwarf, both of which are generated
>> by the compiler. Not elegant, but it probably works.
>>>
>>>>> Does this make sense?
>>>>>
>>>>>>> If that is so, we have quite a problem here: I don't think we can change
>>>>>>> the way GCC handles GNU-like attributes just because the kernel sources
>>>>>>> want to hook on these __user/__kernel sparse annotations to generate the
>>>>>>> compiler tags, even if we could mayhaps get GCC to handle
>>>>>>> debug_annotate_type and debug_annotate_decl differently.  Some would say
>>>>>>> doing so would perpetuate the mistake instead of fixing it...
>>>>>>> Is my understanding correct?
>>>>>>
>>>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>>>> Does this help?
>>>>>>
>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>>>
>>>>>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> [...]


* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-07-15  1:20                     ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
@ 2022-07-15 14:17                       ` Jose E. Marchesi
  2022-07-15 16:48                         ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
  0 siblings, 1 reply; 25+ messages in thread
From: Jose E. Marchesi @ 2022-07-15 14:17 UTC (permalink / raw)
  To: Yonghong Song; +Cc: David Faust, gcc-patches


> On 7/14/22 8:09 AM, Jose E. Marchesi wrote:
>> Hi Yonghong.
>> 
>>> On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
>>>> Hi Yonghong.
>>>>
>>>>> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>>>>>>
>>>>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>>>>> Hi Yonghong.
>>>>>>>>
>>>>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>>>>
>>>>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> This patch series adds support for:
>>>>>>>>>>>>
>>>>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>>>>         to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>>>>         explained below, this is intended to be used to, for example, characterize
>>>>>>>>>>>>         certain pointer types.
>>>>>>>>>>>>
>>>>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>>>>         DIE: DW_TAG_GNU_annotation.
>>>>>>>>>>>>
>>>>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>>>>         kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>>
>>>>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>>>>
>>>>>>>>>>>> Purpose
>>>>>>>>>>>> =======
>>>>>>>>>>>>
>>>>>>>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>>>>           tags on certain language elements, such as struct fields.
>>>>>>>>>>>>
>>>>>>>>>>>>           The purpose of these annotations is to provide additional information about
>>>>>>>>>>>>           types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>>>>           driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>>>>           programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>>>>
>>>>>>>>>>>>           For example, consider the linux kernel function do_execve with the
>>>>>>>>>>>>           following declaration:
>>>>>>>>>>>>
>>>>>>>>>>>>             static int do_execve(struct filename *filename,
>>>>>>>>>>>>                const char __user *const __user *__argv,
>>>>>>>>>>>>                const char __user *const __user *__envp);
>>>>>>>>>>>>
>>>>>>>>>>>>           Here, __user could be defined with these annotations to record semantic
>>>>>>>>>>>>           information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>>>>           DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>>>>           can read the tags and make use of the information.
>>>>>>>>>>>>
>>>>>>>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>>>>>>>
>>>>>>>>>>>>           The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>>>>           generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>>>>
>>>>>>>>>>>>               +--------+  BTF                  BTF   +----------+
>>>>>>>>>>>>               | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>>>>               +--------+                             +----------+
>>>>>>>>>>>>                   ^                                        ^
>>>>>>>>>>>>                   |                                        |
>>>>>>>>>>>>             DWARF |                                    BTF |
>>>>>>>>>>>>                   |                                        |
>>>>>>>>>>>>                vmlinux                              +-------------+
>>>>>>>>>>>>                module1.ko                           | BPF program |
>>>>>>>>>>>>                module2.ko                           +-------------+
>>>>>>>>>>>>                  ...
>>>>>>>>>>>>
>>>>>>>>>>>>           This is because:
>>>>>>>>>>>>
>>>>>>>>>>>>           a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>>>>
>>>>>>>>>>>>           b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>>>>               support for linking/deduplicating BTF in the linker.
>>>>>>>>>>>>
>>>>>>>>>>>>           In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>>>>           both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>>>>           to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>>>>
>>>>>>>>>>>>           Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>>>>           BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>>>>           to benefit from these tags in order to differentiate between different
>>>>>>>>>>>>           kinds of pointers in the kernel.
>>>>>>>>>>>>
>>>>>>>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>>>>>>>
>>>>>>>>>>>>           This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>>>>           compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>>>>           of pointers used by the eBPF programs.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>>>>
>>>>>>>>>>>>         https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>>>>         https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>>>>         https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Implementation Overview
>>>>>>>>>>>> =======================
>>>>>>>>>>>>
>>>>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>>>>
>>>>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>>>>
>>>>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>>>>
>>>>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>>>>
>>>>>>>>>>>>         #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>>>>
>>>>>>>>>>>>         #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>>>>         #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>>>>
>>>>>>>>>>>>         int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>>>>
>>>>>>>>>>> Based on the above example
>>>>>>>>>>>               static int do_execve(struct filename *filename,
>>>>>>>>>>>                 const char __user *const __user *__argv,
>>>>>>>>>>>                 const char __user *const __user *__envp);
>>>>>>>>>>>
>>>>>>>>>>> Should the above example be the below?
>>>>>>>>>>>           int __typetag1 * x __decltag1 __decltag2
>>>>>>>>>>>
>>>>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>>>>> that clear.
>>>>>>>>>
>>>>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>>>>
>>>>>>>>>>>>        <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>>>>           <1f>   DW_AT_name        : x
>>>>>>>>>>>>           <21>   DW_AT_decl_file   : 1
>>>>>>>>>>>>           <22>   DW_AT_decl_line   : 7
>>>>>>>>>>>>           <23>   DW_AT_decl_column : 18
>>>>>>>>>>>>           <24>   DW_AT_type        : <0x49>
>>>>>>>>>>>>           <28>   DW_AT_external    : 1
>>>>>>>>>>>>           <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>>>>>>>           <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>>>>        <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>           <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>>           <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>>>>        <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>           <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>>           <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>>>>        <2><48>: Abbrev Number: 0
>>>>>>>>>>>>        <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>>>>           <4a>   DW_AT_byte_size   : 8
>>>>>>>>>>>>           <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>>>>           <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>>>>        <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>           <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>>>>           <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>>>>        <2><5c>: Abbrev Number: 0
>>>>>>>>>>>>        <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>>>>           <5e>   DW_AT_byte_size   : 4
>>>>>>>>>>>>           <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>>>>>>>           <60>   DW_AT_name        : int
>>>>>>>>>>>>        <1><64>: Abbrev Number: 0
>>>>>>>>>
>>>>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>>>>> show the related info in .debug_info section which seems more useful to
>>>>>>>>> understand the relationships between different tags. Maybe this is due
>>>>>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>>>>>> <2><53> etc.
>>>>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>>>>> expanded...
>>>>>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>>>>        https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>>>>        C compiler, and therefore,
>>>>>>>> 2) Not quite the same than compiler attributes (despite the way they
>>>>>>>>        look.)  In particular, they seem to assume an ordering different than
>>>>>>>>        of GNU attributes: in some cases given the same written order, they
>>>>>>>>        refer to different things!.  Which is quite unfortunate :(
>>>>>>>
>>>>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>>>>> attribute) are processed by macros.
>>>>>>>
>>>>>>>> Now, if I understood properly, you plan to change the definition of
>>>>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>>>>> compiler attributes, correct?
>>>>>>>
>>>>>>> Right. The original __user definition looks like:
>>>>>>>      # define __user         __attribute__((noderef, address_space(__user)))
>>>>>>>
>>>>>>> The new attribute looks like
>>>>>>>      # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>>>>      #  define __user        BTF_TYPE_TAG(user)
>>>>>> Ok I see.  So the kernel will stop using sparse attributes to
>>>>>> implement __user and __kernel and start using compiler attributes
>>>>>> for tags instead.
>>>>>>
>>>>>>>> Is that the reason why LLVM implements what we assume to be the
>>>>>>>> sparse ordering, and not the correct GNU attributes ordering, for
>>>>>>>> the tag attributes?
>>>>>>>
>>>>>>> Note that __user attributes apply to pointee's and not pointers.
>>>>>>> Just like
>>>>>>>       const int *p;
>>>>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>>>>
>>>>>>> What current llvm dwarf generation with
>>>>>>>       pointer
>>>>>>>         <--- btf_type_tag
>>>>>>> is just ONE implementation. As I said earlier, I am okay to
>>>>>>> have dwarf implementation like
>>>>>>>       p->btf_type_tag->const->int.
>>>>>>> If you can propose an implementation like this in dwarf. I can propose
>>>>>>> to change implementation in llvm.
>>>>>> I think we are miscommunicating.
>>>>>> Looks like there is a divergence on what attributes apply to what
>>>>>> language entities between the sparse compiler and GCC/LLVM.  How to
>>>>>> represent that in DWARF is a different matter.
>>>>>> For this example:
>>>>>>      int __typetag1 * __typetag2 __typetag3 * g;
>>>>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>>>>> b) LLVM associates __typetag1 to pointer-to-int.
>>>>>> Where:
>>>>>> a) Is the expected behavior of compiler attributes, as documented in
>>>>>>       the GCC manual.
>>>>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>>>>       ordering expected for a compiler GNU attribute.
>>>>>> So, if the kernel source __user and __kernel annotations (which
>>>>>> currently expand to sparse attributes) follow the sparse ordering, and
>>>>>> you want to implement __user and __kernel in terms of compiler
>>>>>> attributes instead (the annotation attributes) then you will have to:
>>>>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>>>>> 2) fix the kernel sources to use that ordering
>>>>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>>>>>     have in the kernel and also implemented in LLVM with a weird ordering:
>>>>>>     the address_space attribute.]
>>>>>> For 2), it may be possible to write a Coccinelle script to generate
>>>>>> the patch...
>>>>>
>>>>> I don't think (2) (to change kernel source for different attr ordering)
>>>>> will work. So the only thing we can do is in compiler/pahole except
>>>>> macro replacement in kernel.
>>>> I looked at sparse and its parser.  Wanted to be sure the ordering it
>>>> uses to interpret sparse annotations (such as address_space, alignment,
>>>> etc) is definitely _not_ the same ordering used by __attribute__ in C
>>>> compilers.
>>>> It is very different indeed and the same can be said about how sparse
>>>> interprets other modifiers like `const': in sparse both `int const *foo'
>>>> and `int *const foo' parse to a constant pointer to int, for example.
>>>> I am not to judge how sparse handles its annotations.  It may be very
>>>> well and pertinent for its particular purpose.
>>>> But I am not sure if it is reasonable to expect C compilers to implement
>>>> certain type __attributes__ to parse differently, just because it
>>>> happens these attributes are reused from sparse annotations in a
>>>> particular program (in this case the kernel.)  The debug_annotate_decl
>>>> and debug_annotate_type attributes are not even intended to be
>>>> kernel-specific.
>>>> So, if changing the kernel sources is not an option (why btw, other than
>>>> being a PITA?) at this point I really don't know what else to suggest :/
>>>> Any suggestion from the front-end people?
>>>
>>> Just want to understand the overall picture. So gcc can still emit
>>> BTF properly with btf_type_tag right? The issue we are talking about
>>> here is about the dwarf, right?
>> If by "properly" you mean how sparse handles its annotations, then not
>> really.
>> The issue we are talking about is rather a language-level one: to what
>> entity/type the compiler attribute applies.
>> So, for:
>>    int __attribute__((debug_annotate_decl("user"))) *foo;
>> GCC will apply the attribute to the int type, following the rules for
>> type attributes (sparse would apply the annotation to the *int type
>> instead).  The emitted debug info (be it DWARF or BTF) will reflect
>> that, no more no less :/
>
> I don't know what does this 'apply the attribute to the int' mean.
> In current clang implementation it means the following dwarf chains
> from right to left
>   variable 'foo'
>     type: ptr
>       base type: attr_type: attr
>                     underlying type: int
>
> So the type chain is foo -> ptr -> attr -> int

Urgh sorry Yonghong, that was a bad example where there is no
divergence.  At this point I find myself confused regarding the sparse,
clang and GCC attribute issue (I have so many dumps around from all
three tools in several formats) so I had better recap on this before
creating further confusion.

Will be back to you soon.
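
As a neutral reference point for that recap, here is a minimal sketch of how
plain C binds a qualifier according to its position in the declarator, which
is the `const int *p' analogy used earlier in this thread.  TYPE_TAG is a
hypothetical macro introduced only for this sketch: it expands to clang's
btf_type_tag where the compiler reports support for it and to nothing
otherwise, since debug_annotate_type exists only in this patch series.  The
sketch checks only standard C typing; it says nothing about where sparse, GCC
or clang attach the tag.

```c
#include <assert.h>

/* Hypothetical helper for this sketch: use btf_type_tag if the compiler
   knows it, otherwise expand to nothing so the file still compiles.  */
#if defined(__has_attribute)
# if __has_attribute(btf_type_tag)
#  define TYPE_TAG(s) __attribute__((btf_type_tag(s)))
# endif
#endif
#ifndef TYPE_TAG
# define TYPE_TAG(s)
#endif

/* Same shape as the example in this thread: the tag is written between
   the base type and the '*'.  */
int TYPE_TAG("user") *foo;

const int *p;      /* pointer to const int: const qualifies the pointee */
int *const q = 0;  /* const pointer to int: const qualifies the pointer */

/* Evaluates to 1 if the (lvalue-converted) expression has type
   'const int *', 0 otherwise.  Requires C11 _Generic.  */
#define IS_PTR_TO_CONST_INT(e) _Generic((e), const int *: 1, default: 0)

int p_points_to_const(void) { return IS_PTR_TO_CONST_INT(p); }
int q_points_to_const(void) { return IS_PTR_TO_CONST_INT(q); }

/* Sanity check of the two standard-C cases above.  */
void check_declarators(void)
{
    assert(p_points_to_const() == 1);
    assert(q_points_to_const() == 0);
}
```

The open question in this thread is which of these two positions a type
attribute written as in `foo' is taken to occupy by each tool.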

>>> If this is the case, we might have
>>> a partial solution here.
>>>    - gcc emits BTF for vmlinux
>> Note that for emitting BTF for vmlinux we would need support in the
>> linker to merge and deduplicate BTF, which at the moment we don't have.
>
> This should be okay. pahole will merge and deduplicate btf. In pahole
> '-j' mode, each thread will convert each .o file dwarf to btf, and
> then pahole will merge and deduplicate btf.

That's nice.  If LLVM supported generating BTF for any target (I believe
you got patches for that) you could even skip the dwarf->BTF translation
step altogether with both LLVM and GCC kernel builds :)

>
>> 
>>>    - gcc emits dwarf for vmlinux ignoring btf_type_tag
>>>    - in pahole, vmlinux BTF is amended with some additional misc things.
>>> Although there are some use cases for having btf_type_tag in dwarf,
>>> that can be worked around with BTF + dwarf, both of which are generated
>>> by the compiler. Not elegant, but it probably works.
>>>>
>>>>>> Does this make sense?
>>>>>>
>>>>>>>> If that is so, we have quite a problem here: I don't think we can
>>>>>>>> change the way GCC handles GNU-like attributes just because the
>>>>>>>> kernel sources want to hook on these __user/__kernel sparse
>>>>>>>> annotations to generate the compiler tags, even if we could mayhaps
>>>>>>>> get GCC to handle debug_annotate_type and debug_annotate_decl
>>>>>>>> differently.  Some would say doing so would perpetuate the mistake
>>>>>>>> instead of fixing it...
>>>>>>>> Is my understanding correct?
>>>>>>>
>>>>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>>>>> Does this help?
>>>>>>>
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>>>>
>>>>>>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> [...]


* Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
  2022-07-15 14:17                       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
@ 2022-07-15 16:48                         ` Yonghong Song
  0 siblings, 0 replies; 25+ messages in thread
From: Yonghong Song @ 2022-07-15 16:48 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: David Faust, gcc-patches



On 7/15/22 7:17 AM, Jose E. Marchesi wrote:
> 
>> On 7/14/22 8:09 AM, Jose E. Marchesi wrote:
>>> Hi Yonghong.
>>>
>>>> On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
>>>>> Hi Yonghong.
>>>>>
>>>>>> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>>>>>>>
>>>>>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>>>>>> Hi Yonghong.
>>>>>>>>>
>>>>>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch series adds support for:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>>>>>          to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>>>>>          explained below, this is intended to be used to, for example, characterize
>>>>>>>>>>>>>          certain pointer types.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>>>>>          DIE: DW_TAG_GNU_annotation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>>>>>          kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Purpose
>>>>>>>>>>>>> =======
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>>>>>            tags on certain language elements, such as struct fields.
>>>>>>>>>>>>>
>>>>>>>>>>>>>            The purpose of these annotations is to provide additional information about
>>>>>>>>>>>>>            types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>>>>>            driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>>>>>            programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>>>>>
>>>>>>>>>>>>>            For example, consider the linux kernel function do_execve with the
>>>>>>>>>>>>>            following declaration:
>>>>>>>>>>>>>
>>>>>>>>>>>>>              static int do_execve(struct filename *filename,
>>>>>>>>>>>>>                 const char __user *const __user *__argv,
>>>>>>>>>>>>>                 const char __user *const __user *__envp);
>>>>>>>>>>>>>
>>>>>>>>>>>>>            Here, __user could be defined with these annotations to record semantic
>>>>>>>>>>>>>            information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>>>>>            DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>>>>>            can read the tags and make use of the information.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>>>>>>>>>>
>>>>>>>>>>>>>            The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>>>>>            generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>>>>>
>>>>>>>>>>>>>                +--------+  BTF                  BTF   +----------+
>>>>>>>>>>>>>                | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>>>>>                +--------+                             +----------+
>>>>>>>>>>>>>                    ^                                        ^
>>>>>>>>>>>>>                    |                                        |
>>>>>>>>>>>>>              DWARF |                                    BTF |
>>>>>>>>>>>>>                    |                                        |
>>>>>>>>>>>>>                 vmlinux                              +-------------+
>>>>>>>>>>>>>                 module1.ko                           | BPF program |
>>>>>>>>>>>>>                 module2.ko                           +-------------+
>>>>>>>>>>>>>                   ...
>>>>>>>>>>>>>
>>>>>>>>>>>>>            This is because:
>>>>>>>>>>>>>
>>>>>>>>>>>>>            a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>>>>>
>>>>>>>>>>>>>            b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>>>>>                support for linking/deduplicating BTF in the linker.
>>>>>>>>>>>>>
>>>>>>>>>>>>>            In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>>>>>            both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>>>>>            to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>>>>>
>>>>>>>>>>>>>            Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>>>>>            BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>>>>>            to benefit from these tags in order to differentiate between different
>>>>>>>>>>>>>            kinds of pointers in the kernel.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>>>>>>>>>>
>>>>>>>>>>>>>            This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>>>>>            compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>>>>>            of pointers used by the eBPF programs.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>>>>>
>>>>>>>>>>>>>          https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>>>>>          https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>>>>>          https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Implementation Overview
>>>>>>>>>>>>> =======================
>>>>>>>>>>>>>
>>>>>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>>>>>
>>>>>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>>>>>
>>>>>>>>>>>>>          #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>>>>>
>>>>>>>>>>>>>          #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>>>>>          #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>>>>>
>>>>>>>>>>>>>          int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>>>>>
>>>>>>>>>>>> Based on the above example
>>>>>>>>>>>>                static int do_execve(struct filename *filename,
>>>>>>>>>>>>                  const char __user *const __user *__argv,
>>>>>>>>>>>>                  const char __user *const __user *__envp);
>>>>>>>>>>>>
>>>>>>>>>>>> Should the above example be the below?
>>>>>>>>>>>>            int __typetag1 * x __decltag1 __decltag2
>>>>>>>>>>>>
>>>>>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>>>>>> that clear.
>>>>>>>>>>
>>>>>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>>>>>
>>>>>>>>>>>>>         <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>>>>>            <1f>   DW_AT_name        : x
>>>>>>>>>>>>>            <21>   DW_AT_decl_file   : 1
>>>>>>>>>>>>>            <22>   DW_AT_decl_line   : 7
>>>>>>>>>>>>>            <23>   DW_AT_decl_column : 18
>>>>>>>>>>>>>            <24>   DW_AT_type        : <0x49>
>>>>>>>>>>>>>            <28>   DW_AT_external    : 1
>>>>>>>>>>>>>            <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 	(DW_OP_addr: 0)
>>>>>>>>>>>>>            <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>>>>>         <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>>            <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>>>            <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>>>>>         <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>>            <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>>>            <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>>>>>         <2><48>: Abbrev Number: 0
>>>>>>>>>>>>>         <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>>>>>            <4a>   DW_AT_byte_size   : 8
>>>>>>>>>>>>>            <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>>>>>            <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>>>>>         <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>>            <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>>>>>            <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>>>>>         <2><5c>: Abbrev Number: 0
>>>>>>>>>>>>>         <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>>>>>            <5e>   DW_AT_byte_size   : 4
>>>>>>>>>>>>>            <5f>   DW_AT_encoding    : 5	(signed)
>>>>>>>>>>>>>            <60>   DW_AT_name        : int
>>>>>>>>>>>>>         <1><64>: Abbrev Number: 0
>>>>>>>>>>
>>>>>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>>>>>> show the related info in .debug_info section which seems more useful to
>>>>>>>>>> understand the relationships between different tags. Maybe this is due
>>>>>>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>>>>>>> <2><53> etc.
>>>>>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>>>>>> expanded...
>>>>>>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>>>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>>>>>         https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>>>>>         C compiler, and therefore,
>>>>>>>>> 2) Not quite the same than compiler attributes (despite the way they
>>>>>>>>>         look.)  In particular, they seem to assume an ordering different than
>>>>>>>>>         of GNU attributes: in some cases given the same written order, they
>>>>>>>>>         refer to different things!.  Which is quite unfortunate :(
>>>>>>>>
>>>>>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>>>>>> attribute) are processed by macros.
>>>>>>>>
>>>>>>>>> Now, if I understood properly, you plan to change the definition of
>>>>>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>>>>>> compiler attributes, correct?
>>>>>>>>
>>>>>>>> Right. The original __user definition looks like:
>>>>>>>>       # define __user         __attribute__((noderef, address_space(__user)))
>>>>>>>>
>>>>>>>> The new attribute looks like
>>>>>>>>       # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>>>>>       #  define __user        BTF_TYPE_TAG(user)
>>>>>>> Ok I see.  So the kernel will stop using sparse attributes to
>>>>>>> implement __user and __kernel and start using compiler attributes
>>>>>>> for tags instead.
>>>>>>>
>>>>>>>>> Is that the reason why LLVM implements what we assume to be the
>>>>>>>>> sparse ordering, and not the correct GNU attributes ordering, for
>>>>>>>>> the tag attributes?
>>>>>>>>
>>>>>>>> Note that __user attributes apply to pointee's and not pointers.
>>>>>>>> Just like
>>>>>>>>        const int *p;
>>>>>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>>>>>
>>>>>>>> What current llvm dwarf generation with
>>>>>>>>        pointer
>>>>>>>>          <--- btf_type_tag
>>>>>>>> is just ONE implementation. As I said earlier, I am okay to
>>>>>>>> have dwarf implementation like
>>>>>>>>        p->btf_type_tag->const->int.
>>>>>>>> If you can propose an implementation like this in dwarf. I can propose
>>>>>>>> to change implementation in llvm.
>>>>>>> I think we are miscommunicating.
>>>>>>> Looks like there is a divergence on what attributes apply to what
>>>>>>> language entities between the sparse compiler and GCC/LLVM.  How to
>>>>>>> represent that in DWARF is a different matter.
>>>>>>> For this example:
>>>>>>>       int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>>>>>> b) LLVM associates __typetag1 to pointer-to-int.
>>>>>>> Where:
>>>>>>> a) Is the expected behavior of compiler attributes, as documented in
>>>>>>>        the GCC manual.
>>>>>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>>>>>        ordering expected for a compiler GNU attribute.
>>>>>>> So, if the kernel source __user and __kernel annotations (which
>>>>>>> currently expand to sparse attributes) follow the sparse ordering, and
>>>>>>> you want to implement __user and __kernel in terms of compiler
>>>>>>> attributes instead (the annotation attributes) then you will have to:
>>>>>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>>>>>> 2) fix the kernel sources to use that ordering
>>>>>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>>>>>>      have in the kernel and also implemented in LLVM with a weird ordering:
>>>>>>>      the address_space attribute.]
>>>>>>> For 2), it may be possible to write a Coccinelle script to generate
>>>>>>> the patch...
>>>>>>
>>>>>> I don't think (2) (to change kernel source for different attr ordering)
>>>>>> will work. So the only thing we can do is in compiler/pahole except
>>>>>> macro replacement in kernel.
>>>>> I looked at sparse and its parser.  Wanted to be sure the ordering it
>>>>> uses to interpret sparse annotations (such as address_space, alignment,
>>>>> etc) is definitely _not_ the same ordering used by __attribute__ in C
>>>>> compilers.
>>>>> It is very different indeed and the same can be said about how sparse
>>>>> interprets other modifiers like `const': in sparse both `int const *foo'
>>>>> and `int *const foo' parse to a constant pointer to int, for example.
>>>>> I am not to judge how sparse handles its annotations.  It may be very
>>>>> well and pertinent for its particular purpose.
>>>>> But I am not sure if it is reasonable to expect C compilers to implement
>>>>> certain type __attributes__ to parse differently, just because it
>>>>> happens these attributes are reused from sparse annotations in a
>>>>> particular program (in this case the kernel.)  The debug_annotate_decl
>>>>> and debug_annotate_type attributes are not even intended to be
>>>>> kernel-specific.
>>>>> So, if changing the kernel sources is not an option (why btw, other than
>>>>> being a PITA?) at this point I really don't know what else to suggest :/
>>>>> Any suggestion from the front-end people?
>>>>
>>>> Just want to understand the overall picture. So gcc can still emit
>>>> BTF properly with btf_type_tag right? The issue we are talking about
>>>> here is about the dwarf, right?
>>> If by "properly" you mean how sparse handles its annotations, then not
>>> really.
>>> The issue we are talking about is rather a language-level one: to what
>>> entity/type the compiler attribute applies.
>>> So, for:
>>>     int __attribute__((debug_annotate_decl("user"))) *foo;
>>> GCC will apply the attribute to the int type, following the rules for
>>> type attributes (sparse would apply the annotation to the *int type
>>> instead).  The emitted debug info (be it DWARF or BTF) will reflect
>>> that, no more no less :/
>>
>> I don't know what does this 'apply the attribute to the int' mean.
>> In current clang implementation it means the following dwarf chains
>> from right to left
>>    variable 'foo'
>>      type: ptr
>>        base type: attr_type: attr
>>                      underlying type: int
>>
>> So the type chain is foo -> ptr -> attr -> int
> 
> Urgh sorry Yonghong, that was a bad example where there is no
> divergence.  At this point I find myself confused regarding the sparse,
> clang and GCC attribute issue (I have so many dumps around from all
> three tools in several formats) so I better recap on this before
> creating further confusion.
> 
> Will be back to you soon.
> 
>>>> If this is the case, we might have
>>>> a partial solution here.
>>>>     - gcc emits BTF for vmlinux
>>> Note that for emitting BTF for vmlinux we would need support in the
>>> linker to merge and deduplicate BTF, which at the moment we don't have.
>>
>> This should be okay. pahole will merge and deduplicate btf. In pahole
>> '-j' mode, each thread will convert each .o file dwarf to btf, and
>> then pahole will merge and deduplicate btf.
> 
> Thats nice.  If LLVM supported generating BTF for any target (I believe
> you got patches for that) you could even skip the dwarf->BTF translation
> step alltogether with both LLVM and GCC kernel builds :)

Yes. Generating BTF for all targets (at least the targets supported by
the linux kernel) with clang is potential future work.
I did indeed do some experiments in this area before...

But we need pahole regardless, as pahole will do some further
processing based on the vmlinux.o file. So for the time being, if gcc
can generate both BTF (with debug_annotate_decl) and dwarf (without
debug_annotate_decl), pahole should be able to work it out,
so we do have a path forward if necessary.
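
To make concrete what a BTF consumer would read in that case, below is a
minimal sketch that models the BTF listing quoted in this thread (records
[1] to [6]) and walks the type chain of VAR 'x'.  The enum, struct and
function names are hypothetical and exist only for this sketch; this is not
the libbpf, pahole or kernel representation, and the DATASEC record is left
out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record kinds for this sketch only.  */
enum rec_kind { K_NONE, K_INT, K_PTR, K_TYPE_TAG, K_DECL_TAG, K_VAR };

struct btf_rec {
    enum rec_kind kind;
    const char *name;
    int type_id;            /* id of the referenced record, 0 if none */
};

/* Ids mirror the quoted listing; index 0 is left unused.  */
static const struct btf_rec recs[] = {
    [1] = { K_INT,      "int",      0 },
    [2] = { K_PTR,      "(anon)",   3 },
    [3] = { K_TYPE_TAG, "typetag1", 1 },
    [4] = { K_DECL_TAG, "decltag1", 6 },
    [5] = { K_DECL_TAG, "decltag2", 6 },
    [6] = { K_VAR,      "x",        2 },
};

#define NRECS ((int)(sizeof recs / sizeof recs[0]))

/* Follow type_id links starting at `id' and return the name of the first
   TYPE_TAG record on the chain, or NULL if the chain ends without one.  */
const char *first_type_tag(int id)
{
    while (id > 0 && id < NRECS) {
        if (recs[id].kind == K_TYPE_TAG)
            return recs[id].name;
        id = recs[id].type_id;   /* INT has type_id 0, which ends the walk */
    }
    return NULL;
}

/* Count DECL_TAG records that point at record `target'.  */
int count_decl_tags(int target)
{
    int n = 0;
    for (int id = 1; id < NRECS; id++)
        if (recs[id].kind == K_DECL_TAG && recs[id].type_id == target)
            n++;
    return n;
}

/* Checks against the quoted listing: x -> PTR -> TYPE_TAG -> INT.  */
void check_listing(void)
{
    assert(strcmp(first_type_tag(6), "typetag1") == 0);
    assert(first_type_tag(1) == NULL);
    assert(count_decl_tags(6) == 2);
}
```

In this model the type tag sits between the pointer and its pointee
(VAR 'x' -> PTR -> TYPE_TAG 'typetag1' -> INT), while the decl tags are
separate records that point back at the variable.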

> 
>>
>>>
>>>>     - gcc emits dwarf for vmlinux ignoring btf_type_tag
>>>>     - in pahole, vmlinux BTF is amended with some additional misc things.
>>>> Although there are some use cases for having btf_type_tag in dwarf,
>>>> that can be worked around with BTF + dwarf, both of which are generated
>>>> by the compiler. Not elegant, but it probably works.
>>>>>
>>>>>>> Does this make sense?
>>>>>>>
>>>>>>>>> If that is so, we have quite a problem here: I don't think we can
>>>>>>>>> change the way GCC handles GNU-like attributes just because the
>>>>>>>>> kernel sources want to hook on these __user/__kernel sparse
>>>>>>>>> annotations to generate the compiler tags, even if we could mayhaps
>>>>>>>>> get GCC to handle debug_annotate_type and debug_annotate_decl
>>>>>>>>> differently.  Some would say doing so would perpetuate the mistake
>>>>>>>>> instead of fixing it...
>>>>>>>>> Is my understanding correct?
>>>>>>>>
>>>>>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>>>>>> Does this help?
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>>>>>> 	type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> [...]


* Re: [PATCH 0/9] Add debug_annotate attributes
  2022-06-15 22:56     ` Yonghong Song
  2022-06-17 17:18       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: " Jose E. Marchesi
@ 2022-11-01 22:29       ` Yonghong Song
  1 sibling, 0 replies; 25+ messages in thread
From: Yonghong Song @ 2022-11-01 22:29 UTC (permalink / raw)
  To: David Faust, jose.marchesi; +Cc: gcc-patches

Hi, Jose and David,

Any progress on implementing the debug_annotate attributes in GCC?

Thanks,

Yonghong


On 6/15/22 3:56 PM, Yonghong Song wrote:
> 
> 
> On 6/15/22 1:57 PM, David Faust wrote:
>>
>>
>> On 6/14/22 22:53, Yonghong Song wrote:
>>>
>>>
>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>> Hello,
>>>>
>>>> This patch series adds support for:
>>>>
>>>> - Two new C-language-level attributes that allow associating (to
>>>>     "annotate" or "tag") particular declarations and types with
>>>>     arbitrary strings. As explained below, this is intended to be
>>>>     used to, for example, characterize certain pointer types.
>>>>
>>>> - The conveyance of that information in the DWARF output in the form 
>>>> of a new
>>>>     DIE: DW_TAG_GNU_annotation.
>>>>
>>>> - The conveyance of that information in the BTF output in the form 
>>>> of two new
>>>>     kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>
>>>> All of these facilities are being added to the eBPF ecosystem, and 
>>>> support for
>>>> them exists in some form in LLVM.
>>>>
>>>> Purpose
>>>> =======
>>>>
>>>> 1)  Addition of C-family language constructs (attributes) to specify 
>>>> free-text
>>>>       tags on certain language elements, such as struct fields.
>>>>
>>>>       The purpose of these annotations is to provide additional 
>>>> information about
>>>>       types, variables, and function parameters of interest to the 
>>>> kernel. A
>>>>       driving use case is to tag pointer types within the linux 
>>>> kernel and eBPF
>>>>       programs with additional semantic information, such as 
>>>> '__user' or '__rcu'.
>>>>
>>>>       For example, consider the linux kernel function do_execve with 
>>>> the
>>>>       following declaration:
>>>>
>>>>         static int do_execve(struct filename *filename,
>>>>            const char __user *const __user *__argv,
>>>>            const char __user *const __user *__envp);
>>>>
>>>>       Here, __user could be defined with these annotations to record 
>>>> semantic
>>>>       information about the pointer parameters (e.g., they are 
>>>> user-provided) in
>>>>       DWARF and BTF information. Other kernel facilities such as the 
>>>> eBPF verifier
>>>>       can read the tags and make use of the information.
>>>>
>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>
>>>>       The main motivation for emitting the tags in DWARF is that the 
>>>> Linux kernel
>>>>       generates its BTF information via pahole, using DWARF as a 
>>>> source:
>>>>
>>>>           +--------+  BTF                  BTF   +----------+
>>>>           | pahole |-------> vmlinux.btf ------->| verifier |
>>>>           +--------+                             +----------+
>>>>               ^                                        ^
>>>>               |                                        |
>>>>         DWARF |                                    BTF |
>>>>               |                                        |
>>>>            vmlinux                              +-------------+
>>>>            module1.ko                           | BPF program |
>>>>            module2.ko                           +-------------+
>>>>              ...
>>>>
>>>>       This is because:
>>>>
>>>>       a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>
>>>>       b)  GCC can generate BTF for whatever target with -gbtf, but 
>>>> there is no
>>>>           support for linking/deduplicating BTF in the linker.
>>>>
>>>>       In the scenario above, the verifier needs access to the 
>>>> pointer tags of
>>>>       both the kernel types/declarations (conveyed in the DWARF and 
>>>> translated
>>>>       to BTF by pahole) and those of the BPF program (available 
>>>> directly in BTF).
>>>>
>>>>       Another motivation for having the tag information in DWARF, 
>>>> unrelated to
>>>>       BPF and BTF, is that the drgn project (another DWARF consumer) 
>>>> also wants
>>>>       to benefit from these tags in order to differentiate between 
>>>> different
>>>>       kinds of pointers in the kernel.
>>>>
>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>
>>>>       This is easy: the main purpose of having this info in BTF is 
>>>> for the
>>>>       compiled eBPF programs. The kernel verifier can then access 
>>>> the tags
>>>>       of pointers used by the eBPF programs.
>>>>
>>>>
>>>> For more information about these tags and the motivation behind 
>>>> them, please
>>>> refer to the following linux kernel discussions:
>>>>
>>>>     https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>     https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>     https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>
>>>>
>>>> Implementation Overview
>>>> =======================
>>>>
>>>> To enable these annotations, two new C language attributes are added:
>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept 
>>>> a single
>>>> arbitrary string constant argument, which will be recorded in the 
>>>> generated
>>>> DWARF and/or BTF debug information. They have no effect on code 
>>>> generation.
>>>>
>>>> Note that we are not using the same attribute names as LLVM 
>>>> (btf_decl_tag and
>>>> btf_type_tag, respectively). While these attributes are functionally 
>>>> very
>>>> similar, they have grown beyond purely BTF-specific uses, so 
>>>> inclusion of "btf"
>>>> in the attribute name seems misleading.
>>>>
>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When 
>>>> generating DWARF,
>>>> declarations and types will be checked for the corresponding 
>>>> attributes. If
>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of 
>>>> the DIE for
>>>> the annotated type or declaration, one for each tag. These DIEs link 
>>>> the
>>>> arbitrary tag value to the item they annotate.
>>>>
>>>> For example, the following variable declaration:
>>>>
>>>>     #define __typetag1 __attribute__((debug_annotate_type 
>>>> ("typetag1")))
>>>>
>>>>     #define __decltag1 __attribute__((debug_annotate_decl 
>>>> ("decltag1")))
>>>>     #define __decltag2 __attribute__((debug_annotate_decl 
>>>> ("decltag2")))
>>>>
>>>>     int * __typetag1 x __decltag1 __decltag2;
>>>
>>> Based on the above example
>>>           static int do_execve(struct filename *filename,
>>>             const char __user *const __user *__argv,
>>>             const char __user *const __user *__envp);
>>>
>>> Should the above example instead be the one below?
>>>       int __typetag1 * x __decltag1 __decltag2
>>>
>>
>> This example is not related to the one above. It is just meant to
>> show the behavior of both attributes. My apologies for not making
>> that clear.
> 
> Okay, it should be fine if the dwarf debug_info is shown.
> 
>>
>>>>
>>>> Produces the following DWARF information:
>>>>
>>>>    <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>       <1f>   DW_AT_name        : x
>>>>       <21>   DW_AT_decl_file   : 1
>>>>       <22>   DW_AT_decl_line   : 7
>>>>       <23>   DW_AT_decl_column : 18
>>>>       <24>   DW_AT_type        : <0x49>
>>>>       <28>   DW_AT_external    : 1
>>>>       <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0     
>>>> (DW_OP_addr: 0)
>>>>       <32>   DW_AT_sibling     : <0x49>
>>>>    <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <37>   DW_AT_name        : (indirect string, offset: 0xd6): 
>>>> debug_annotate_decl
>>>>       <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): 
>>>> decltag2
>>>>    <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <40>   DW_AT_name        : (indirect string, offset: 0xd6): 
>>>> debug_annotate_decl
>>>>       <44>   DW_AT_const_value : (indirect string, offset: 0x0): 
>>>> decltag1
>>>>    <2><48>: Abbrev Number: 0
>>>>    <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>       <4a>   DW_AT_byte_size   : 8
>>>>       <4b>   DW_AT_type        : <0x5d>
>>>>       <4f>   DW_AT_sibling     : <0x5d>
>>>>    <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <54>   DW_AT_name        : (indirect string, offset: 0x9): 
>>>> debug_annotate_type
>>>>       <58>   DW_AT_const_value : (indirect string, offset: 0x1d): 
>>>> typetag1
>>>>    <2><5c>: Abbrev Number: 0
>>>>    <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>       <5e>   DW_AT_byte_size   : 4
>>>>       <5f>   DW_AT_encoding    : 5    (signed)
>>>>       <60>   DW_AT_name        : int
>>>>    <1><64>: Abbrev Number: 0
> 
> This shows the info in .debug_abbrev. What I mean is to
> show the related info in .debug_info section which seems more useful to
> understand the relationships between different tags. Maybe this is
> because I do not fully understand what <1>/<2> means in <1><49> and
> <2><53> etc.
> 
>>>
>>> Maybe you can also show what dwarf debug_info looks like
>> I am not sure what you mean. This is the .debug_info section as output
>> by readelf -w. I did trim some information not relevant to the discussion
>> such as the DW_TAG_compile_unit DIE, for brevity.
>>
>>>
>>>>
>>>> In the case of BTF, the annotations are recorded in two type kinds 
>>>> recently
>>>> added to the BTF specification: BTF_KIND_DECL_TAG and 
>>>> BTF_KIND_TYPE_TAG.
>>>> The above example declaration produces the following BTF information:
>>>>
>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>> [2] PTR '(anon)' type_id=3
>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>> [6] VAR 'x' type_id=2, linkage=global
>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>     type_id=6 offset=0 size=8 (VAR 'x')
>>>>
>>>>
>>> [...]
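
On the question of what <1>/<2> means in the readelf -w output quoted above: in a DIE header line such as "<2><53>: Abbrev Number: 1", the first bracketed number is the nesting depth of the DIE and the second is its offset within .debug_info, in hexadecimal. So <2><53> (the debug_annotate_type annotation) is a child of the nearest preceding depth-1 DIE, <1><49> (the DW_TAG_pointer_type). A minimal sketch of extracting the two numbers is given below; parse_die_header is a hypothetical helper written for illustration and is not part of binutils.

```c
#include <assert.h>
#include <stdio.h>

/* Parses a DIE header line as printed by readelf -w, for example
   " <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)".
   On success stores the nesting depth and the section offset and
   returns 1. Returns 0, leaving the outputs untouched, when the line
   is not a DIE header (for example an attribute line like "<4a> ..."). */
int parse_die_header(const char *line, unsigned *depth, unsigned long *offset)
{
    unsigned d;
    unsigned long off;

    assert(line != NULL && depth != NULL && offset != NULL);

    /* Leading space in the format skips indentation; %lx reads the hex offset. */
    if (sscanf(line, " <%u><%lx>", &d, &off) != 2)
        return 0;

    *depth = d;
    *offset = off;
    return 1;
}
```

A consumer can rebuild the parent/child relationships by remembering the most recent DIE seen at depth N-1 whenever it meets a header at depth N.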

end of thread, other threads:[~2022-11-01 22:29 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-07 21:43 [PATCH 0/9] Add debug_annotate attributes David Faust
2022-06-07 21:43 ` [PATCH 1/9] dwarf: add dw_get_die_parent function David Faust
2022-06-13 10:13   ` Richard Biener
2022-06-07 21:43 ` [PATCH 2/9] include: Add new definitions David Faust
2022-06-07 21:43 ` [PATCH 3/9] c-family: Add debug_annotate attribute handlers David Faust
2022-06-07 21:43 ` [PATCH 4/9] dwarf: generate annotation DIEs David Faust
2022-06-07 21:43 ` [PATCH 5/9] ctfc: pass through debug annotations to BTF David Faust
2022-06-07 21:43 ` [PATCH 6/9] dwarf2ctf: convert annotation DIEs to CTF types David Faust
2022-06-07 21:43 ` [PATCH 7/9] btf: output decl_tag and type_tag records David Faust
2022-06-07 21:43 ` [PATCH 8/9] doc: document new attributes David Faust
2022-06-07 21:43 ` [PATCH 9/9] testsuite: add debug annotation tests David Faust
2022-06-15  5:53 ` [PATCH 0/9] Add debug_annotate attributes Yonghong Song
2022-06-15 20:57   ` David Faust
2022-06-15 22:56     ` Yonghong Song
2022-06-17 17:18       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: " Jose E. Marchesi
2022-06-20 17:06         ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-06-21 16:12           ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-06-24 18:01             ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-07-07 20:24               ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-07-13  4:23                 ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-07-14 15:09                   ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-07-15  1:20                     ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-07-15 14:17                       ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} " Jose E. Marchesi
2022-07-15 16:48                         ` kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} " Yonghong Song
2022-11-01 22:29       ` Yonghong Song
