public inbox for gcc-patches@gcc.gnu.org
From: Yonghong Song <yhs@meta.com>
To: David Faust <david.faust@oracle.com>, jose.marchesi@oracle.com
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 0/9] Add debug_annotate attributes
Date: Tue, 1 Nov 2022 15:29:34 -0700
Message-ID: <043ed53f-5030-1fe5-ddce-0854e8f9801b@meta.com>
In-Reply-To: <52dcfdb6-f1b9-1986-5d10-8d6ac8c6d256@fb.com>

Hi, Jose and David,

Any progress on implementing the debug_annotate attributes in GCC?

Thanks,

Yonghong


On 6/15/22 3:56 PM, Yonghong Song wrote:
> 
> 
> On 6/15/22 1:57 PM, David Faust wrote:
>>
>>
>> On 6/14/22 22:53, Yonghong Song wrote:
>>>
>>>
>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>> Hello,
>>>>
>>>> This patch series adds support for:
>>>>
>>>> - Two new C-language-level attributes that allow associating (i.e.
>>>>     "annotating" or "tagging") particular declarations and types with
>>>>     arbitrary strings. As explained below, this is intended to be used
>>>>     to, for example, characterize certain pointer types.
>>>>
>>>> - The conveyance of that information in the DWARF output in the form
>>>>     of a new DIE: DW_TAG_GNU_annotation.
>>>>
>>>> - The conveyance of that information in the BTF output in the form of
>>>>     two new kinds of BTF objects: BTF_KIND_DECL_TAG and
>>>>     BTF_KIND_TYPE_TAG.
>>>>
>>>> All of these facilities are being added to the eBPF ecosystem, and
>>>> support for them exists in some form in LLVM.
>>>>
>>>> Purpose
>>>> =======
>>>>
>>>> 1)  Addition of C-family language constructs (attributes) to specify
>>>>       free-text tags on certain language elements, such as struct
>>>>       fields.
>>>>
>>>>       The purpose of these annotations is to provide additional
>>>>       information about types, variables, and function parameters of
>>>>       interest to the kernel. A driving use case is to tag pointer
>>>>       types within the Linux kernel and eBPF programs with additional
>>>>       semantic information, such as '__user' or '__rcu'.
>>>>
>>>>       For example, consider the Linux kernel function do_execve with
>>>>       the following declaration:
>>>>
>>>>         static int do_execve(struct filename *filename,
>>>>            const char __user *const __user *__argv,
>>>>            const char __user *const __user *__envp);
>>>>
>>>>       Here, __user could be defined with these annotations to record
>>>>       semantic information about the pointer parameters (e.g., they
>>>>       are user-provided) in DWARF and BTF information. Other kernel
>>>>       facilities such as the eBPF verifier can read the tags and make
>>>>       use of the information.
>>>>
>>>> 2)  Conveying the tags in the generated DWARF debug info.
>>>>
>>>>       The main motivation for emitting the tags in DWARF is that the
>>>>       Linux kernel generates its BTF information via pahole, using
>>>>       DWARF as a source:
>>>>
>>>>           +--------+  BTF                  BTF   +----------+
>>>>           | pahole |-------> vmlinux.btf ------->| verifier |
>>>>           +--------+                             +----------+
>>>>               ^                                        ^
>>>>               |                                        |
>>>>         DWARF |                                    BTF |
>>>>               |                                        |
>>>>            vmlinux                              +-------------+
>>>>            module1.ko                           | BPF program |
>>>>            module2.ko                           +-------------+
>>>>              ...
>>>>
>>>>       This is because:
>>>>
>>>>       a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>
>>>>       b)  GCC can generate BTF for whatever target with -gbtf, but
>>>>           there is no support for linking/deduplicating BTF in the
>>>>           linker.
>>>>
>>>>       In the scenario above, the verifier needs access to the pointer
>>>>       tags of both the kernel types/declarations (conveyed in the
>>>>       DWARF and translated to BTF by pahole) and those of the BPF
>>>>       program (available directly in BTF).
>>>>
>>>>       Another motivation for having the tag information in DWARF,
>>>>       unrelated to BPF and BTF, is that the drgn project (another
>>>>       DWARF consumer) also wants to benefit from these tags in order
>>>>       to differentiate between different kinds of pointers in the
>>>>       kernel.
>>>>
>>>> 3)  Conveying the tags in the generated BTF debug info.
>>>>
>>>>       This is easy: the main purpose of having this info in BTF is for
>>>>       the compiled eBPF programs. The kernel verifier can then access
>>>>       the tags of pointers used by the eBPF programs.
>>>>
>>>>
>>>> For more information about these tags and the motivation behind them,
>>>> please refer to the following Linux kernel discussions:
>>>>
>>>>     https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>     https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>     https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>
>>>>
>>>> Implementation Overview
>>>> =======================
>>>>
>>>> To enable these annotations, two new C language attributes are added:
>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a
>>>> single arbitrary string constant argument, which will be recorded in
>>>> the generated DWARF and/or BTF debug information. They have no effect
>>>> on code generation.
>>>>
>>>> Note that we are not using the same attribute names as LLVM
>>>> (btf_decl_tag and btf_type_tag, respectively). While these attributes
>>>> are functionally very similar, they have grown beyond purely
>>>> BTF-specific uses, so inclusion of "btf" in the attribute name seems
>>>> misleading.
>>>>
>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When
>>>> generating DWARF, declarations and types will be checked for the
>>>> corresponding attributes. If present, a DW_TAG_GNU_annotation DIE will
>>>> be created as a child of the DIE for the annotated type or
>>>> declaration, one for each tag. These DIEs link the arbitrary tag value
>>>> to the item they annotate.
>>>>
>>>> For example, the following variable declaration:
>>>>
>>>>     #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>
>>>>     #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>     #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>
>>>>     int * __typetag1 x __decltag1 __decltag2;
>>>
>>> Based on the above example
>>>           static int do_execve(struct filename *filename,
>>>             const char __user *const __user *__argv,
>>>             const char __user *const __user *__envp);
>>>
>>> Should the above example be the below?
>>>       int __typetag1 * x __decltag1 __decltag2
>>>
>>
>> This example is not related to the one above. It is just meant to
>> show the behavior of both attributes. My apologies for not making
>> that clear.
> 
> Okay, it should be fine if the dwarf debug_info is shown.
> 
>>
>>>>
>>>> Produces the following DWARF information:
>>>>
>>>>    <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>       <1f>   DW_AT_name        : x
>>>>       <21>   DW_AT_decl_file   : 1
>>>>       <22>   DW_AT_decl_line   : 7
>>>>       <23>   DW_AT_decl_column : 18
>>>>       <24>   DW_AT_type        : <0x49>
>>>>       <28>   DW_AT_external    : 1
>>>>       <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>>>       <32>   DW_AT_sibling     : <0x49>
>>>>    <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>       <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>    <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>       <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>    <2><48>: Abbrev Number: 0
>>>>    <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>       <4a>   DW_AT_byte_size   : 8
>>>>       <4b>   DW_AT_type        : <0x5d>
>>>>       <4f>   DW_AT_sibling     : <0x5d>
>>>>    <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>       <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>       <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>    <2><5c>: Abbrev Number: 0
>>>>    <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>       <5e>   DW_AT_byte_size   : 4
>>>>       <5f>   DW_AT_encoding    : 5    (signed)
>>>>       <60>   DW_AT_name        : int
>>>>    <1><64>: Abbrev Number: 0
> 
> This shows the info in .debug_abbrev. What I mean is to show the
> related info in the .debug_info section, which seems more useful for
> understanding the relationships between the different tags. Maybe this
> is because I do not fully understand what <1>/<2> means in <1><49>,
> <2><53>, etc.
> 
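On the <1>/<2> question: in a readelf dump of .debug_info, the first
bracketed number on a DIE header line is the nesting depth and the second
is the DIE's offset in the section. So <2><53> is a child of the nearest
preceding depth-1 DIE, <1><49>. A minimal Python sketch of that reading,
assuming header lines shaped like the ones quoted above (parse_die_tree
and SAMPLE are hypothetical names for illustration, not part of readelf
or any other tool):

```python
import re

# Matches header lines such as " <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)".
HEADER = re.compile(r"<(\d+)><([0-9a-f]+)>: Abbrev Number: (\d+)(?: \((.*)\))?")

# DIE header lines copied from the dump quoted in this thread.
SAMPLE = """\
 <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
 <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
 <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
 <2><48>: Abbrev Number: 0
 <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
 <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
 <2><5c>: Abbrev Number: 0
 <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
 <1><64>: Abbrev Number: 0
"""


def parse_die_tree(dump):
    """Return {offset: (depth, tag, parent_offset)} for each DIE header line."""
    dies = {}
    stack = []  # (depth, offset) of the DIEs currently "open" above this line
    for line in dump.splitlines():
        m = HEADER.search(line)
        if not m:
            continue  # attribute lines and anything else are ignored
        depth = int(m.group(1))
        offset = int(m.group(2), 16)
        if int(m.group(3)) == 0:
            continue  # abbrev 0 is a null entry that ends a list of siblings
        # Drop every open DIE that is not shallower than the current one.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1] if stack else None
        stack.append((depth, offset))
        dies[offset] = (depth, m.group(4), parent)
    return dies


if __name__ == "__main__":
    for offset, (depth, tag, parent) in parse_die_tree(SAMPLE).items():
        where = "top of this excerpt" if parent is None else f"child of <{parent:x}>"
        print(f"<{depth}><{offset:x}> {tag}: {where}")
```

Read this way, the two annotation DIEs at <36> and <3f> belong to the
variable at <1e>, and the one at <53> belongs to the pointer type at <49>.
The depth-1 DIEs have no parent here only because the compile unit DIE
was trimmed from the excerpt.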
>>>
>>> Maybe you can also show what dwarf debug_info looks like
>> I am not sure what you mean. This is the .debug_info section as output
>> by readelf -w. I did trim some information not relevant to the discussion
>> such as the DW_TAG_compile_unit DIE, for brevity.
>>
>>>
>>>>
>>>> In the case of BTF, the annotations are recorded in two type kinds
>>>> recently added to the BTF specification: BTF_KIND_DECL_TAG and
>>>> BTF_KIND_TYPE_TAG.
>>>> The above example declaration produces the following BTF information:
>>>>
>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>> [2] PTR '(anon)' type_id=3
>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>> [6] VAR 'x' type_id=2, linkage=global
>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>     type_id=6 offset=0 size=8 (VAR 'x')
>>>>
>>>>
>>> [...]
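
To make the relationships in the BTF listing above easier to follow, here
is a small Python sketch that walks the type_id references. The records
are transcribed by hand from that listing. The tuple layout and the
helper names (type_chain, decl_tags) are assumptions made for this
illustration and are not libbpf, bpftool or pahole API.

```python
# id -> (kind, name, referenced type_id), copied from the quoted BTF dump.
# The DATASEC record is left out because it does not take part in the chain.
BTF = {
    1: ("INT", "int", None),
    2: ("PTR", None, 3),
    3: ("TYPE_TAG", "typetag1", 1),
    4: ("DECL_TAG", "decltag1", 6),
    5: ("DECL_TAG", "decltag2", 6),
    6: ("VAR", "x", 2),
}


def type_chain(btf, type_id):
    """Follow type_id references from a record until a record with no reference."""
    chain = []
    while type_id is not None:
        kind, name, ref = btf[type_id]
        chain.append(kind if name is None else f"{kind} '{name}'")
        type_id = ref
    return chain


def decl_tags(btf, target_id):
    """Collect the names of all DECL_TAG records that point at target_id."""
    return [
        name
        for kind, name, ref in btf.values()
        if kind == "DECL_TAG" and ref == target_id
    ]


if __name__ == "__main__":
    print(" -> ".join(type_chain(BTF, 6)))
    print("decl tags on x:", decl_tags(BTF, 6))
```

In this listing the TYPE_TAG record sits in the chain between the PTR and
the INT it points to, whereas the DECL_TAG records stand outside the chain
and refer back to the VAR by its id.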

