From: Yonghong Song <yhs@fb.com>
To: David Faust <david.faust@oracle.com>
Cc: gcc-patches@gcc.gnu.org, Joseph Myers <joseph@codesourcery.com>
Subject: Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
Date: Tue, 10 May 2022 20:43:56 -0700 [thread overview]
Message-ID: <7125844e-f538-faca-1cdf-5322492c00d9@fb.com> (raw)
In-Reply-To: <b0626323-1bdc-4f99-0496-3007e2752ac4@oracle.com>
On 5/6/22 2:18 PM, David Faust wrote:
>
>
> On 5/5/22 16:00, Yonghong Song wrote:
>>
>>
>> On 5/4/22 10:03 AM, David Faust wrote:
>>>
>>>
>>> On 5/3/22 15:32, Joseph Myers wrote:
>>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>>
>>>>> Consider the following example:
>>>>>
>>>>> #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>> #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>> #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>>
>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>
>>>>> The expected behavior is that 'g' is "a pointer with tags 'tag2' and
>>>>> 'tag3',
>>>>> to a pointer with tag 'tag1' to an int". i.e.:
>>>>
>>>> That's not a correct expectation for either GNU __attribute__ or C2x
>>>> [[]]
>>>> attribute syntax. In either syntax, __typetag2 __typetag3 should
>>>> apply to
>>>> the type to which g points, not to g or its type, just as if you had a
>>>> type qualifier there. You'd need to put the attributes (or qualifier)
>>>> after the *, not before, to make them apply to the pointer type. See
>>>> "Attribute Syntax" in the GCC manual for how the syntax is defined for
>>>> GNU
>>>> attributes and deduce in turn, for each subsequence of the tokens
>>>> matching
>>>> the syntax for some kind of declarator, what the type for "T D1"
>>>> would be
>>>> as defined there and in the C standard, as deduced from the type for
>>>> "T D"
>>>> for a sub-declarator D.
>>>> >> But GCC's attribute parsing produces a variable 'g' which is "a
>>> pointer with
>>>>> tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.
>>>>
>>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>>> syntax it applies to int. Again, if you wanted it to apply to the
>>>> pointer
>>>> type it would need to go after the * not before.
>>>>
>>>> If you are concerned with the fine details of what construct an
>>>> attribute
>>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>>
>>>
>>> Joseph, thank you! This is very helpful. My understanding of the syntax
>>> was not correct.
>>>
>>> (Actually, I made a bad mistake in paraphrasing this example from the
>>> discussion of it in the series cover letter. But, the reason why it is
>>> incorrect is the same.)
>>>
>>>
>>> Yonghong, is the specific ordering an expectation in BPF programs or
>>> other users of the tags?
>>
>> This is probably a language writing issue. We are saying tags only
>> apply to pointer. We probably should say it only apply to pointee.
>>
>> $ cat t.c
>> int const *ptr;
>>
>> the llvm ir debuginfo:
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>
>> We could replace 'const' with a tag like below:
>>
>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>> annotations: !7)
>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>> !7 = !{!8}
>> !8 = !{!"btf_type_tag", !"tag"}
>>
>> In the above IR, we generate annotations to pointer_type because
>> we didn't invent a new DI type for encode btf_type_tag. But it is
>> totally okay to have IR looks like
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>
> OK, thanks.
>
> There is still the question of why the DWARF generated for this case
> that I have been concerned about:
>
> int __typetag1 * __typetag2 __typetag3 * g;
>
> differs between GCC (with this series) and clang. After studying it, GCC
> is doing with the attributes exactly as is described in the Attribute
> Syntax portion of the GCC manual where the GNU syntax is described. I do
> not think there is any problem here.
>
> So the difference in DWARF suggests to me that clang is not handling the
> GNU attribute syntax in this particular case correctly, since it seems
> to be associating __typetag2 and __typetag3 to g's type rather than the
> type to which it points.
>
> I am not sure whether for the use purposes of the tags this difference
> is very important, but it is worth noting.
>
>
> As Joseph suggested, it may be better to encourage users of these tags
> to use the C2x attribute syntax if they are concerned with precisely
> which construct the tag applies.
>
> This would also be a way around any issues in handling the attributes
> due to the GNU syntax.
>
> I tried a few test cases using C2x syntax BTF type tags with a clang-15
> build, but ran into some issues (in particular, some of the tag
> attributes being ignored altogether). I couldn't find confirmation
> whether C2x attribute syntax is fully supported in clang yet, so maybe
> this isn't expected to work. Do you know whether the C2x syntax is fully
> supported in clang yet?
Actually, I don't know either. But since the btf decl_tag and type_tag
are also used to compile linux kernel and the minimum compiler version
to compile kernel is gcc5.1 and clang11. I am not sure whether gcc5.1
supports c2x or not, I guess probably not. So I think we most likely
cannot use c2x syntax.
>
>>
>>>
>>> This example comes from my testing against clang to check that the BTF
>>> generated by both toolchains is compatible. In this case we get
>>> different results when using the GNU attribute syntax.
>>>
>>>
>>> To avoid confusion, here is the full example (from the cover letter).
>>> The difference in the results is clear in the DWARF.
>>>
>>>> Consider the following example:
>>>>
>>>> #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>>> #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>>> #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>>
>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>
>>>> <var_decl 0x7ffff7547090 g
>>>> type <pointer_type 0x7ffff7548000
>>>> type <pointer_type 0x7ffff75097e0 type <integer_type
>>>> 0x7ffff74495e8 int>
>>>> asm_written unsigned DI
>>>> size <integer_cst 0x7ffff743c450 constant 64>
>>>> unit-size <integer_cst 0x7ffff743c468 constant 8>
>>>> align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>>> canonical-type 0x7ffff7450888
>>>> attributes <tree_list 0x7ffff75275c8
>>>> purpose <identifier_node 0x7ffff753a1e0 btf_type_tag>
>>>> value <tree_list 0x7ffff7527550
>>>> value <string_cst 0x7ffff75292e0 type <array_type
>>>> 0x7ffff7509738>
>>>> readonly constant static "type-tag-3\000">>
>>>> chain <tree_list 0x7ffff75275a0 purpose
>>>> <identifier_node 0x7ffff753a1e0 btf_type_tag>
>>>> value <tree_list 0x7ffff75274d8
>>>> value <string_cst 0x7ffff75292c0 type
>>>> <array_type 0x7ffff7509738>
>>>> readonly constant static
>>>> "type-tag-2\000">>>>
>>>> pointer_to_this <pointer_type 0x7ffff7509888>>
>>>> asm_written unsigned DI size <integer_cst 0x7ffff743c450 64>
>>>> unit-size <integer_cst 0x7ffff743c468 8>
>>>> align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>>> canonical-type 0x7ffff7509930
>>>> attributes <tree_list 0x7ffff75275f0 purpose <identifier_node
>>>> 0x7ffff753a1e0 btf_type_tag>
>>>> value <tree_list 0x7ffff7527438
>>>> value <string_cst 0x7ffff75292a0 type <array_type
>>>> 0x7ffff7509738>
>>>> readonly constant static "type-tag-1\000">>>>
>>>> public static unsigned DI defer-output
>>>> /home/dfaust/playpen/btf/annotate.c:29:42 size <integer_cst
>>>> 0x7ffff743c450 64> unit-size <integer_cst 0x7ffff743c468 8>
>>>> align:64 warn_if_not_align:0>
>>>
>>>>
>>>> The current implementation produces the following DWARF:
>>>>
>>>> <1><1e>: Abbrev Number: 4 (DW_TAG_variable)
>>>> <1f> DW_AT_name : g
>>>> <21> DW_AT_decl_file : 1
>>>> <22> DW_AT_decl_line : 6
>>>> <23> DW_AT_decl_column : 42
>>>> <24> DW_AT_type : <0x32>
>>>> <28> DW_AT_external : 1
>>>> <28> DW_AT_location : 9 byte block: 3 0 0 0 0 0 0 0 0
>>>> (DW_OP_addr: 0)
>>>> <1><32>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>>> <33> DW_AT_byte_size : 8
>>>> <33> DW_AT_type : <0x45>
>>>> <37> DW_AT_sibling : <0x45>
>>>> <2><3b>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>> <3c> DW_AT_name : (indirect string, offset: 0x18):
>>>> btf_type_tag
>>>> <40> DW_AT_const_value : (indirect string, offset: 0xc7):
>>>> type-tag-1
>>>> <2><44>: Abbrev Number: 0
>>>> <1><45>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>>> <46> DW_AT_byte_size : 8
>>>> <46> DW_AT_type : <0x61>
>>>> <4a> DW_AT_sibling : <0x61>
>>>> <2><4e>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>> <4f> DW_AT_name : (indirect string, offset: 0x18):
>>>> btf_type_tag
>>>> <53> DW_AT_const_value : (indirect string, offset: 0xd):
>>>> type-tag-3
>>>> <2><57>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>> <58> DW_AT_name : (indirect string, offset: 0x18):
>>>> btf_type_tag
>>>> <5c> DW_AT_const_value : (indirect string, offset: 0xd2):
>>>> type-tag-2
>>>> <2><60>: Abbrev Number: 0
>>>> <1><61>: Abbrev Number: 5 (DW_TAG_base_type)
>>>> <62> DW_AT_byte_size : 4
>>>> <63> DW_AT_encoding : 5 (signed)
>>>> <64> DW_AT_name : int
>>>> <1><68>: Abbrev Number: 0
>>>>
>>>> This does not agree with the DWARF produced by LLVM/clang for the same
>>>> case:
>>>> (clang 15.0.0 git 142501117a78080d2615074d3986fa42aa6a0734)
>>>>
>>>> <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>>>> <1f> DW_AT_name : (indexed string: 0x3): g
>>>> <20> DW_AT_type : <0x29>
>>>> <24> DW_AT_external : 1
>>>> <24> DW_AT_decl_file : 0
>>>> <25> DW_AT_decl_line : 6
>>>> <26> DW_AT_location : 2 byte block: a1 0 ((Unknown
>>>> location op 0xa1))
>>>> <1><29>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>>> <2a> DW_AT_type : <0x35>
>>>> <2><2e>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>> <2f> DW_AT_name : (indexed string: 0x5): btf_type_tag
>>>> <30> DW_AT_const_value : (indexed string: 0x7): type-tag-2
>>>> <2><31>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>> <32> DW_AT_name : (indexed string: 0x5): btf_type_tag
>>>> <33> DW_AT_const_value : (indexed string: 0x8): type-tag-3
>>>> <2><34>: Abbrev Number: 0
>>>> <1><35>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>>> <36> DW_AT_type : <0x3e>
>>>> <2><3a>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>> <3b> DW_AT_name : (indexed string: 0x5): btf_type_tag
>>>> <3c> DW_AT_const_value : (indexed string: 0x6): type-tag-1
>>>> <2><3d>: Abbrev Number: 0
>>>> <1><3e>: Abbrev Number: 5 (DW_TAG_base_type)
>>>> <3f> DW_AT_name : (indexed string: 0x4): int
>>>> <40> DW_AT_encoding : 5 (signed)
>>>> <41> DW_AT_byte_size : 4
>>>> <1><42>: Abbrev Number: 0
>>>>
>>>>
>>>> Because GCC produces BTF from the internal DWARF DIE tree, the BTF
>>>> also differs.
>>>> This can be seen most obviously in the BTF type reference chains:
>>>>
>>>> GCC
>>>> VAR (g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>>>
>>>> LLVM
>>>> VAR (g) -> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>>
>>>
next prev parent reply other threads:[~2022-05-11 3:44 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-01 19:42 [PATCH " David Faust
2022-04-01 19:42 ` [PATCH 1/8] dwarf: Add dw_get_die_parent function David Faust
2022-04-01 19:42 ` [PATCH 2/8] include: Add BTF tag defines to dwarf2 and btf David Faust
2022-04-01 19:42 ` [PATCH 3/8] c-family: Add BTF tag attribute handlers David Faust
2022-04-01 19:42 ` [PATCH 4/8] dwarf: create BTF decl and type tag DIEs David Faust
2022-04-01 19:42 ` [PATCH 5/8] ctfc: Add support to pass through BTF annotations David Faust
2022-04-01 19:42 ` [PATCH 6/8] dwarf2ctf: convert tag DIEs to CTF types David Faust
2022-04-01 19:42 ` [PATCH 7/8] Output BTF DECL_TAG and TYPE_TAG types David Faust
2022-04-01 19:42 ` [PATCH 8/8] testsuite: Add tests for BTF tags David Faust
2022-04-04 22:13 ` [PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations Yonghong Song
2022-04-05 16:26 ` David Faust
2022-04-18 19:36 ` [ping][PATCH " David Faust
2022-05-02 16:57 ` [ping2][PATCH " David Faust
2022-05-03 22:32 ` Joseph Myers
2022-05-04 17:03 ` David Faust
2022-05-05 23:00 ` Yonghong Song
2022-05-06 21:18 ` David Faust
2022-05-11 3:43 ` Yonghong Song [this message]
2022-05-11 5:05 ` Yonghong Song
2022-05-11 18:44 ` David Faust
2022-05-24 6:33 ` Yonghong Song
2022-05-24 11:07 ` Jose E. Marchesi
2022-05-24 15:52 ` Yonghong Song
2022-05-24 15:53 ` David Faust
2022-05-24 16:03 ` Yonghong Song
2022-05-24 17:04 ` David Faust
2022-05-26 7:29 ` Yonghong Song
2022-05-27 19:56 ` David Faust
2022-06-03 2:04 ` Yonghong Song
2022-06-07 21:42 ` David Faust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7125844e-f538-faca-1cdf-5322492c00d9@fb.com \
--to=yhs@fb.com \
--cc=david.faust@oracle.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=joseph@codesourcery.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).