From: Richard Biener
Date: Wed, 12 Jul 2023 15:21:14 +0200
Subject: Re: [PATCH 0/9] Add btf_decl_tag C attribute
To: "Jose E. Marchesi"
Cc: David Faust, gcc-patches@gcc.gnu.org, yhs@meta.com, Eduard Zingerman
References: <20230711215716.12980-1-david.faust@oracle.com> <87y1jlz4e4.fsf@oracle.com>
In-Reply-To: <87y1jlz4e4.fsf@oracle.com>

On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi wrote:
>
>
> [Added Eduard Zingerman in CC, who is implementing this same feature in
> clang/llvm and also the consumer component in the kernel (pahole).]
>
> Hi Richard.
>
> > On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
> > wrote:
> >>
> >> Hello,
> >>
> >> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> >> The same attribute is already supported in clang, and is used by various
> >> components of the BPF ecosystem.
> >>
> >> The purpose of the attribute is to associate (to "tag")
> >> declarations with arbitrary string annotations, which are emitted into
> >> debugging information (DWARF and/or BTF) to facilitate post-compilation
> >> analysis (the motivating use case being the Linux kernel BPF verifier).
> >> Multiple tags are allowed on the same declaration.
> >>
> >> These strings are not interpreted by the compiler, and the attribute
> >> itself has no effect on generated code, other than to produce additional
> >> DWARF DIEs and/or BTF records conveying the annotations.
> >>
> >> This entails:
> >>
> >> - A new C-language-level attribute which associates ("tags")
> >>   particular declarations with arbitrary strings.
> >>
> >> - The conveyance of that information in DWARF in the form of a new DIE,
> >>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
> >>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
> >>   the same purpose.  These DIEs are already supported by BPF tooling,
> >>   such as pahole.
> >>
> >> - The conveyance of that information in BTF debug info in the form of
> >>   BTF_KIND_DECL_TAG records.  These records are already supported by
> >>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
> >>   eBPF verifier.
> >>
> >>
> >> Background
> >> ==========
> >>
> >> The purpose of these tags is to convey additional semantic information
> >> to post-compilation consumers, in particular the Linux kernel eBPF
> >> verifier.  The verifier can make use of that information while analyzing
> >> a BPF program to aid in determining whether to allow the program to run
> >> or to reject it.  More background on these tags can be found in the
> >> early support for them in the kernel here [1] and [2].
> >>
> >> The "btf_decl_tag" attribute is half the story; the other half is a
> >> sibling attribute "btf_type_tag" which serves the same purpose but
> >> applies to types.  Support for btf_type_tag will come in a separate
> >> patch series, since it is impacted by GCC bug 110439 which needs to be
> >> addressed first.
> >>
> >> I submitted an initial version of this work (including btf_type_tag)
> >> last spring [3]; however, at the time there were some open questions
> >> about the behavior of the btf_type_tag attribute and issues with its
> >> implementation.  Since then we have clarified these details and agreed
> >> to solutions with the BPF community and LLVM BPF folks.
> >>
> >> The main motivation for emitting the tags in DWARF is that the Linux
> >> kernel generates its BTF information via pahole, using DWARF as a source:
> >>
> >>     +--------+  BTF                  BTF   +----------+
> >>     | pahole |-------> vmlinux.btf ------->| verifier |
> >>     +--------+                             +----------+
> >>         ^                                        ^
> >>         |                                        |
> >>   DWARF |                                    BTF |
> >>         |                                        |
> >>      vmlinux                              +-------------+
> >>      module1.ko                           | BPF program |
> >>      module2.ko                           +-------------+
> >>        ...
> >>
> >> This is because:
> >>
> >> a) pahole adds additional kernel-specific information into the
> >>    produced BTF based on additional analysis of kernel objects.
> >>
> >> b) Unlike GCC, LLVM will only generate BTF for BPF programs.
> >>
> >> c) GCC can generate BTF for whatever target with -gbtf, but there is no
> >>    support for linking/deduplicating BTF in the linker.
> >>
> >> In the scenario above, the verifier needs access to the pointer tags of
> >> both the kernel types/declarations (conveyed in the DWARF and translated
> >> to BTF by pahole) and those of the BPF program (available directly in BTF).
> >>
> >>
> >> DWARF Representation
> >> ====================
> >>
> >> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> >> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> >> extension DW_TAG_LLVM_annotation serving the same purpose.  The DIE has
> >> the following format:
> >>
> >>   DW_TAG_GNU_annotation (0x6000)
> >>     DW_AT_name: "btf_decl_tag"
> >>     DW_AT_const_value:
> >>
> >> These DIEs are placed in the DWARF tree as children of the DIE for the
> >> appropriate declaration, and one such DIE is created for each occurrence
> >> of the btf_decl_tag attribute on a declaration.
> >>
> >> For example:
> >>
> >>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
> >>
> >> This declaration produces the following DWARF:
> >>
> >>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
> >>     <1f>   DW_AT_name        : c
> >>     <24>   DW_AT_type        : <0x49>
> >>     ...
> >>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
> >>     <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
> >>     <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
> >>  <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
> >>     <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
> >>     <44>   DW_AT_const_value : __c
> >>  <2><48>: Abbrev Number: 0
> >>  <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
> >>     ...
> >>
> >> The DIEs for btf_decl_tag are placed as children of the DIE for
> >> variable "c".
> >
> > It looks like a bit of overkill, and inefficient as well.  Why aren't
> > the tags referenced via the existing DW_AT_description?
>
> The DWARF spec ("Entity Descriptions") seems to imply that the
> DW_AT_description attribute is intended to be used to hold alternative
> ways to denote the same "debugging information" (object, type, ...),
> i.e. alternative aliases to refer to the same entity as the
> DW_AT_name.  For example, for a type name='foo' we could have
> description='aka. long int'.  We don't think this is the case for the btf
> tags, which are more like properties partially characterizing the tagged
> "debugging information", but couldn't be used as an alias to the name.
>
> Also, repurposing the DW_AT_description attribute to hold btf tag
> information would require introducing a mini-language and subsequent
> parsing by the clients: how to denote several tags, how to encode the
> embedded string contents, etc.  You kick the complexity out the door and
> it comes back in through the window :)
>
> Finally, as far as we know, the existing attribute may already be used by
> some language and handled by some debugger the way it is recommended in
> the spec.  That would be incompatible with having btf tags encoded
> there.

How are the C/C++ standard attributes proposed to be encoded in DWARF?
I think adding special encoding just for BTF tags looks wrong.

> > Iff you want new TAGs why require them as children for each DIE rather
> > than referencing (and sharing!) them via a DIE reference from a new
> > attribute?
>
> Hmm, that's a very good question.  The Linux kernel sources use both
> declaration tags and type tags and not sharing the DIEs may result in
> serious bloating, since the tags are brought in to declarations and type
> specifiers via macros...
>
> > That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
> >
> > But well ...
> >
> > Richard.
> >
> >>
> >> BTF Representation
> >> ==================
> >>
> >> In BTF, BTF_KIND_DECL_TAG records convey the annotations.  These records refer
> >> to the annotated object by BTF type ID, as well as a component index which is
> >> used for btf_decl_tags placed on struct/union members or function arguments.
> >>
> >> For example, the BTF for the above declaration is:
> >>
> >>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> >>   [2] CONST '(anon)' type_id=1
> >>   [3] PTR '(anon)' type_id=2
> >>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
> >>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
> >>   [6] VAR 'c' type_id=3, linkage=global
> >>   ...
> >>
> >> The BTF format is documented here [4].
> >>
> >>
> >> References
> >> ==========
> >>
> >> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
> >> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/
> >> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
> >> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
> >>
> >>
> >> David Faust (9):
> >>   c-family: add btf_decl_tag attribute
> >>   include: add BTF decl tag defines
> >>   dwarf: create annotation DIEs for decl tags
> >>   dwarf: expose get_die_parent
> >>   ctf: add support to pass through BTF tags
> >>   dwarf2ctf: convert annotation DIEs to CTF types
> >>   btf: create and output BTF_KIND_DECL_TAG types
> >>   testsuite: add tests for BTF decl tags
> >>   doc: document btf_decl_tag attribute
> >>
> >>  gcc/btfout.cc                                 | 81 ++++++++++++++++++-
> >>  gcc/c-family/c-attribs.cc                     | 23 ++++++
> >>  gcc/ctf-int.h                                 | 28 +++++++
> >>  gcc/ctfc.cc                                   | 10 ++-
> >>  gcc/ctfc.h                                    | 17 +++-
> >>  gcc/doc/extend.texi                           | 47 +++++++++++
> >>  gcc/dwarf2ctf.cc                              | 73 ++++++++++++++++-
> >>  gcc/dwarf2out.cc                              | 37 ++++++++-
> >>  gcc/dwarf2out.h                               |  1 +
> >>  .../gcc.dg/debug/btf/btf-decltag-func.c       | 21 +++++
> >>  .../gcc.dg/debug/btf/btf-decltag-sou.c        | 33 ++++++++
> >>  .../gcc.dg/debug/btf/btf-decltag-var.c        | 19 +++++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-1.c   |  9 +++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-2.c   | 18 +++++
> >>  .../gcc.dg/debug/dwarf2/annotation-decl-3.c   | 17 ++++
> >>  include/btf.h                                 | 14 +++-
> >>  include/dwarf2.def                            |  4 +
> >>  17 files changed, 437 insertions(+), 15 deletions(-)
> >>  create mode 100644 gcc/ctf-int.h
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
> >>
> >> --
> >> 2.40.1
> >>
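
For reference while reading the thread, here is a minimal sketch of the three
placements the cover letter mentions: a variable, a struct member, and a
function argument.  The BTF_DECL_TAG macro, struct device_regs and
read_status are hypothetical names made up for this illustration; they are
not part of the series.  The macro expands to nothing when the compiler does
not know the attribute, so the file should build either way, and since the
attribute is described as having no effect on generated code the function
behaves the same in both cases.

```c
/* Some older compilers lack __has_attribute; treat every attribute as
   unsupported there.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* BTF_DECL_TAG is a hypothetical convenience macro for this sketch.
   It falls back to a no-op when btf_decl_tag is not supported.  */
#if __has_attribute(btf_decl_tag)
# define BTF_DECL_TAG(s) __attribute__((btf_decl_tag (s)))
#else
# define BTF_DECL_TAG(s)
#endif

/* Variable carrying two tags, mirroring the cover letter's example.  */
const int *c BTF_DECL_TAG ("__c") BTF_DECL_TAG ("devicemem");

/* Tag on a struct member (hypothetical type).  */
struct device_regs
{
  int status BTF_DECL_TAG ("devicemem");
  int control;
};

/* Tag on a function argument (hypothetical function).  */
int
read_status (const struct device_regs *regs BTF_DECL_TAG ("devicemem"))
{
  return regs->status;
}
```

Assuming a compiler that implements the attribute, building this with debug
info enabled (-g, or -gbtf for BTF with GCC) and dumping the result should
show the annotation records described above; the exact output will depend on
the compiler version.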