From: Yonghong Song <yhs@fb.com>
To: David Faust <david.faust@oracle.com>
Cc: "Jose E. Marchesi" <jose.marchesi@oracle.com>,
Joseph Myers <joseph@codesourcery.com>,
Yonghong Song via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
Date: Thu, 2 Jun 2022 19:04:13 -0700 [thread overview]
Message-ID: <3be9e5df-e74a-bca4-2fa8-f865c4da525b@fb.com> (raw)
In-Reply-To: <0cb066a7-ba9c-123b-d55a-58667073543f@oracle.com>
On 5/27/22 12:56 PM, David Faust wrote:
>
>
> On 5/26/22 00:29, Yonghong Song wrote:
>>
>>
>> On 5/24/22 10:04 AM, David Faust wrote:
>>>
>>>
>>> On 5/24/22 09:03, Yonghong Song wrote:
>>>>
>>>>
>>>> On 5/24/22 8:53 AM, David Faust wrote:
>>>>>
>>>>>
>>>>> On 5/24/22 04:07, Jose E. Marchesi wrote:
>>>>>>
>>>>>>> On 5/11/22 11:44 AM, David Faust wrote:
>>>>>>>>
>>>>>>>> On 5/10/22 22:05, Yonghong Song wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5/10/22 8:43 PM, Yonghong Song wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5/6/22 2:18 PM, David Faust wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/5/22 16:00, Yonghong Song wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/4/22 10:03 AM, David Faust wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 5/3/22 15:32, Joseph Myers wrote:
>>>>>>>>>>>>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Consider the following example:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>>>>>>>>>>>> #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>>>>>>>>>>>> #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The expected behavior is that 'g' is "a pointer with tags
>>>>>>>>>>>>>>> 'tag2' and
>>>>>>>>>>>>>>> 'tag3',
>>>>>>>>>>>>>>> to a pointer with tag 'tag1' to an int". i.e.:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's not a correct expectation for either GNU __attribute__ or
>>>>>>>>>>>>>> C2x [[]]
>>>>>>>>>>>>>> attribute syntax. In either syntax, __typetag2 __typetag3 should
>>>>>>>>>>>>>> apply to
>>>>>>>>>>>>>> the type to which g points, not to g or its type, just as if
>>>>>>>>>>>>>> you had a
>>>>>>>>>>>>>> type qualifier there. You'd need to put the attributes (or
>>>>>>>>>>>>>> qualifier)
>>>>>>>>>>>>>> after the *, not before, to make them apply to the pointer
>>>>>>>>>>>>>> type. See
>>>>>>>>>>>>>> "Attribute Syntax" in the GCC manual for how the syntax is
>>>>>>>>>>>>>> defined for
>>>>>>>>>>>>>> GNU
>>>>>>>>>>>>>> attributes and deduce in turn, for each subsequence of the tokens
>>>>>>>>>>>>>> matching
>>>>>>>>>>>>>> the syntax for some kind of declarator, what the type for "T D1"
>>>>>>>>>>>>>> would be
>>>>>>>>>>>>>> as defined there and in the C standard, as deduced from the type for
>>>>>>>>>>>>>> "T D"
>>>>>>>>>>>>>> for a sub-declarator D.
>>>>>>>>>>>>>> >> But GCC's attribute parsing produces a variable 'g'
>>>>>>>>>>>>>> which is "a
>>>>>>>>>>>>> pointer with
>>>>>>>>>>>>>>> tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an
>>>>>>>>>>>>>>> int", i.e.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>>>>>>>>>>>>> syntax it applies to int. Again, if you wanted it to apply to the
>>>>>>>>>>>>>> pointer
>>>>>>>>>>>>>> type it would need to go after the * not before.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you are concerned with the fine details of what construct an
>>>>>>>>>>>>>> attribute
>>>>>>>>>>>>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Joseph, thank you! This is very helpful. My understanding of
>>>>>>>>>>>>> the syntax
>>>>>>>>>>>>> was not correct.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (Actually, I made a bad mistake in paraphrasing this example from the
>>>>>>>>>>>>> discussion of it in the series cover letter. But, the reason
>>>>>>>>>>>>> why it is
>>>>>>>>>>>>> incorrect is the same.)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yonghong, is the specific ordering an expectation in BPF programs or
>>>>>>>>>>>>> other users of the tags?
>>>>>>>>>>>>
>>>>>>>>>>>> This is probably a language writing issue. We are saying tags only
>>>>>>>>>>>> apply to pointer. We probably should say it only apply to pointee.
>>>>>>>>>>>>
>>>>>>>>>>>> $ cat t.c
>>>>>>>>>>>> int const *ptr;
>>>>>>>>>>>>
>>>>>>>>>>>> the llvm ir debuginfo:
>>>>>>>>>>>>
>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>>>>>>>>>>>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>>>>>>>>>>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>
>>>>>>>>>>>> We could replace 'const' with a tag like below:
>>>>>>>>>>>>
>>>>>>>>>>>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>>>>>>>>>>>
>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>>>>>>>>>>>> annotations: !7)
>>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>> !7 = !{!8}
>>>>>>>>>>>> !8 = !{!"btf_type_tag", !"tag"}
>>>>>>>>>>>>
>>>>>>>>>>>> In the above IR, we generate annotations to pointer_type because
>>>>>>>>>>>> we didn't invent a new DI type for encode btf_type_tag. But it is
>>>>>>>>>>>> totally okay to have IR looks like
>>>>>>>>>>>>
>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
>>>>>>>>>>>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>
>>>>>>>>>>> OK, thanks.
>>>>>>>>>>>
>>>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>>>> that I have been concerned about:
>>>>>>>>>>>
>>>>>>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>
>>>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>>>
>>>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>>>> than the type to which it points.
>>>>>>>>>>>
>>>>>>>>>>> I am not sure whether for the use purposes of the tags this difference
>>>>>>>>>>> is very important, but it is worth noting.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> As Joseph suggested, it may be better to encourage users of these tags
>>>>>>>>>>> to use the C2x attribute syntax if they are concerned with precisely
>>>>>>>>>>> which construct the tag applies.
>>>>>>>>>>>
>>>>>>>>>>> This would also be a way around any issues in handling the attributes
>>>>>>>>>>> due to the GNU syntax.
>>>>>>>>>>>
>>>>>>>>>>> I tried a few test cases using C2x syntax BTF type tags with a
>>>>>>>>>>> clang-15 build, but ran into some issues (in particular, some of the
>>>>>>>>>>> tag attributes being ignored altogether). I couldn't find confirmation
>>>>>>>>>>> whether C2x attribute syntax is fully supported in clang yet, so maybe
>>>>>>>>>>> this isn't expected to work. Do you know whether the C2x syntax is
>>>>>>>>>>> fully supported in clang yet?
>>>>>>>>>>
>>>>>>>>>> Actually, I don't know either. But since the btf decl_tag and type_tag
>>>>>>>>>> are also used to compile linux kernel and the minimum compiler version
>>>>>>>>>> to compile kernel is gcc5.1 and clang11. I am not sure whether gcc5.1
>>>>>>>>>> supports c2x or not, I guess probably not. So I think we most likely
>>>>>>>>>> cannot use c2x syntax.
>>>>>>>>>
>>>>>>>>> Okay, I think we can guard btf_tag's with newer compiler versions.
>>>>>>>>> What kind of c2x syntax you intend to use? I can help compile kernel
>>>>>>>>> with that syntax and llvm15 to see what is the issue and may help
>>>>>>>>> fix it in clang if possible.
>>>>>>>>
>>>>>>>> I am thinking to use the [[]] C2x standard attribute syntax. The
>>>>>>>> syntax makes it quite clear to which entity each attribute applies,
>>>>>>>> and in my opinion is a little more intuitive/less surprising too.
>>>>>>>> It's documented here (PDF):
>>>>>>>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2731.pdf
>>>>>>>> See sections 6.7.11 for the syntax and 6.7.6 for
>>>>>>>> declarations. Section 6.7.6.1 specifically describes using the
>>>>>>>> attribute syntax with pointer declarators.
>>>>>>>> The attribute syntax itself for BTF tags is:
>>>>>>>> [[clang::btf_type_tag("tag1")]]
>>>>>>>> or
>>>>>>>> [[gnu::btf_type_tag("tag1")]]
>>>>>>>>
>>>>>>>> I am also looking into whether, with the C2x syntax, we really need two
>>>>>>>> separate attributes (type_tag and decl_tag) at the language
>>>>>>>> level. It might be possible with C2x syntax to use just one language
>>>>>>>> attribute (e.g. just btf_tag).
>>>>>>>>
>>>>>>>> A simple declaration for a tagged pointer to an int:
>>>>>>>> int * [[gnu::btf_type_tag("tag1")]] x;
>>>>>>>> And for the example from this thread:
>>>>>>>> #define __typetag1 [[gnu::btf_type_tag("type-tag-1")]]
>>>>>>>> #define __typetag2 [[gnu::btf_type_tag("type-tag-2")]]
>>>>>>>> #define __typetag3 [[gnu::btf_type_tag("type-tag-3")]]
>>>>>>>> int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>> Here each tag applies to the preceding pointer, so the result is
>>>>>>>> unsurprising.
>>>>>>>> Actually, this is where I found something that looks like an issue
>>>>>>>> with the C2x attribute syntax in clang. The tags 2 and 3 go missing,
>>>>>>>> but with no warning nor other indication.
>>>>>>>> Compiling this example with gcc:
>>>>>>>> $ ~/toolchains/bpf/bin/bpf-unknown-none-gcc -c -gbtf -gdwarf c2x.c
>>>>>>>> -o c2x.o --std=c2x
>>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o
>>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>> DW_AT_producer ("GNU C2X 12.0.1 20220401
>>>>>>>> (experimental) -gbtf -gdwarf -std=c2x")
>>>>>>>> DW_AT_language (DW_LANG_C11)
>>>>>>>> DW_AT_name ("c2x.c")
>>>>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>>>>> DW_AT_stmt_list (0x00000000)
>>>>>>>> 0x0000001e: DW_TAG_variable
>>>>>>>> DW_AT_name ("g")
>>>>>>>> DW_AT_decl_file ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>> DW_AT_decl_line (16)
>>>>>>>> DW_AT_decl_column (0x2a)
>>>>>>>> DW_AT_type (0x00000032 "int **")
>>>>>>>> DW_AT_external (true)
>>>>>>>> DW_AT_location (DW_OP_addr 0x0)
>>>>>>>> 0x00000032: DW_TAG_pointer_type
>>>>>>>> DW_AT_byte_size (8)
>>>>>>>> DW_AT_type (0x0000004e "int *")
>>>>>>>> DW_AT_sibling (0x0000004e)
>>>>>>>> 0x0000003b: DW_TAG_LLVM_annotation
>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>> DW_AT_const_value ("type-tag-3")
>>>>>>>> 0x00000044: DW_TAG_LLVM_annotation
>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>> DW_AT_const_value ("type-tag-2")
>>>>>>>> 0x0000004d: NULL
>>>>>>>> 0x0000004e: DW_TAG_pointer_type
>>>>>>>> DW_AT_byte_size (8)
>>>>>>>> DW_AT_type (0x00000061 "int")
>>>>>>>> DW_AT_sibling (0x00000061)
>>>>>>>> 0x00000057: DW_TAG_LLVM_annotation
>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>> DW_AT_const_value ("type-tag-1")
>>>>>>>> 0x00000060: NULL
>>>>>>>> 0x00000061: DW_TAG_base_type
>>>>>>>> DW_AT_byte_size (0x04)
>>>>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>>>>> DW_AT_name ("int")
>>>>>>>> 0x00000068: NULL
>>>>>>>>
>>>>>>>> and with clang (changing the attribute prefix to clang:: appropriately):
>>>>>>>> $ ~/toolchains/llvm/bin/clang -target bpf -g -c c2x.c -o c2x.o.ll
>>>>>>>> --std=c2x
>>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o.ll
>>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>> DW_AT_producer ("clang version 15.0.0
>>>>>>>> (https://github.com/llvm/llvm-project.git
>>>>>>>> f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>>>>> DW_AT_language (DW_LANG_C99)
>>>>>>>> DW_AT_name ("c2x.c")
>>>>>>>> DW_AT_str_offsets_base (0x00000008)
>>>>>>>> DW_AT_stmt_list (0x00000000)
>>>>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>>>>> DW_AT_addr_base (0x00000008)
>>>>>>>> 0x0000001e: DW_TAG_variable
>>>>>>>> DW_AT_name ("g")
>>>>>>>> DW_AT_type (0x00000029 "int **")
>>>>>>>> DW_AT_external (true)
>>>>>>>> DW_AT_decl_file ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>> DW_AT_decl_line (12)
>>>>>>>> DW_AT_location (DW_OP_addrx 0x0)
>>>>>>>> 0x00000029: DW_TAG_pointer_type
>>>>>>>> DW_AT_type (0x00000032 "int *")
>>>>>>>> 0x0000002e: DW_TAG_LLVM_annotation
>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>> DW_AT_const_value ("type-tag-1")
>>>>>>>> 0x00000031: NULL
>>>>>>>> 0x00000032: DW_TAG_pointer_type
>>>>>>>> DW_AT_type (0x00000037 "int")
>>>>>>>> 0x00000037: DW_TAG_base_type
>>>>>>>> DW_AT_name ("int")
>>>>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>>>>> DW_AT_byte_size (0x04)
>>>>>>>> 0x0000003b: NULL
>>>>>>>
>>>>>>> Thanks. I checked with current clang. The generated code looks
>>>>>>> like above. Basically, for code like below
>>>>>>>
>>>>>>> #define __typetag1 [[clang::btf_type_tag("type-tag-1")]]
>>>>>>> #define __typetag2 [[clang::btf_type_tag("type-tag-2")]]
>>>>>>> #define __typetag3 [[clang::btf_type_tag("type-tag-3")]]
>>>>>>>
>>>>>>> int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>
>>>>>>> The IR type looks like
>>>>>>> __typetag3 -> __typetag2 -> * (ptr1) -> __typetag1 -> * (ptr2) -> int
>>>>>>>
>>>>>>> The IR is similar to what we did if using
>>>>>>> __attribute__((btf_type_tag(""))), but their
>>>>>>> semantic interpretation is quite different.
>>>>>>> For example, with c2x format,
>>>>>>> __typetag1 applies to ptr2
>>>>>>> with __attribute__ format, it applies pointee of ptr1.
>>>>>>>
>>>>>>> But more importantly, c2x format is incompatible with
>>>>>>> the usage of linux kernel. The following are a bunch of kernel
>>>>>>> __user usages. Here, __user intends to be replaced with a btf_type_tag.
>>>>>>>
>>>>>>> vfio_pci_core.h: ssize_t (*rw)(struct vfio_pci_core_device
>>>>>>> *vdev, char __user *buf,
>>>>>>> vfio_pci_core.h: char __user *buf,
>>>>>>> size_t count,
>>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_bar_rw(struct
>>>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_vga_rw(struct
>>>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>>>> vfio_pci_core.h: char __user
>>>>>>> *buf, size_t count,
>>>>>>> vfio_pci_core.h: void __user *arg,
>>>>>>> size_t argsz);
>>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_read(struct vfio_device
>>>>>>> *core_vdev, char __user *buf,
>>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_write(struct vfio_device
>>>>>>> *core_vdev, const char __user *buf,
>>>>>>> vringh.h: vring_desc_t __user *desc,
>>>>>>> vringh.h: vring_avail_t __user *avail,
>>>>>>> vringh.h: vring_used_t __user *used);
>>>>>>> vt_kern.h:int con_set_cmap(unsigned char __user *cmap);
>>>>>>> vt_kern.h:int con_get_cmap(unsigned char __user *cmap);
>>>>>>> vt_kern.h:int con_set_trans_old(unsigned char __user * table);
>>>>>>> vt_kern.h:int con_get_trans_old(unsigned char __user * table);
>>>>>>> vt_kern.h:int con_set_trans_new(unsigned short __user * table);
>>>>>>> vt_kern.h:int con_get_trans_new(unsigned short __user * table);
>>>>>>>
>>>>>>> You can see, we will not able to simply replace __user
>>>>>>> with [[clang::btf_type_tag("user")]] because it won't work
>>>>>>> according to c2x expectations.
>>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks for checking this. I see that we probably cannot use the c2x
>>>>> syntax in the kernel, since it will not work as a drop-in replacement
>>>>> for the current uses.
>>>>>
>>>>>>
>>>>>> Hi Yongsong.
>>>>>>
>>>>>> I am a bit confused regarding the GNU attributes problem: our patch
>>>>>> supports it, but as David already noted:
>>>>>>
>>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>>> that I have been concerned about:
>>>>>>>>>>
>>>>>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>
>>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>>
>>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>>> than the type to which it points.
>>>>>>
>>>>>> Note the example he uses is:
>>>>>>
>>>>>> (a) int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>
>>>>>> Not
>>>>>>
>>>>>> (b) int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>
>>>>>> Apparently for (a) clang is generating DWARF that associates __typetag2
>>>>>> and__typetag3 to g's type (the pointer to pointer) instead of the
>>>>>> pointer to int, which contravenes the GNU syntax rules.
>>>>>>
>>>>>> AFAIK thats is where the DWARF we generate differs, and what is blocking
>>>>>> us. David will correct me in the likely case I'm wrong :)
>>>>>
>>>>> Right. This is what I hoped maybe the C2x syntax could resolve.
>>>>>
>>>>> The issue I saw is that in the case (a) above, when using the GNU
>>>>> attribute syntax, GCC and clang produce different results. I think that
>>>>> the underlying cause is some subtle difference in how clang is handling
>>>>> the GNU attribute syntax in the case compared to GCC.
>>>>>
>>>>>
>>>>> To remind ourselves, here is the full example. Notice the significant
>>>>> difference in which objects the tags are associated with in DWARF.
>>>>>
>>>>>
>>>>> #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>>>> #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>>>> #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>>>
>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>
>>>>>
>>>>> GCC: bpf-unknown-none-gcc -c -gdwarf -gbtf annotate.c
>>>>>
>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>> DW_AT_producer ("GNU C17 12.0.1 20220401 (experimental) -gdwarf -gbtf")
>>>>> DW_AT_language (DW_LANG_C11)
>>>>> DW_AT_name ("annotate.c")
>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>> DW_AT_stmt_list (0x00000000)
>>>>>
>>>>> 0x0000001e: DW_TAG_variable
>>>>> DW_AT_name ("g")
>>>>> DW_AT_decl_file ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>> DW_AT_decl_line (11)
>>>>> DW_AT_decl_column (0x2a)
>>>>> DW_AT_type (0x00000032 "int **")
>>>>> DW_AT_external (true)
>>>>> DW_AT_location (DW_OP_addr 0x0)
>>>>>
>>>>> 0x00000032: DW_TAG_pointer_type
>>>>> DW_AT_byte_size (8)
>>>>> DW_AT_type (0x00000045 "int *")
>>>>> DW_AT_sibling (0x00000045)
>>>>>
>>>>> 0x0000003b: DW_TAG_LLVM_annotation
>>>>> DW_AT_name ("btf_type_tag")
>>>>> DW_AT_const_value ("type-tag-1")
>>>>>
>>>>> 0x00000044: NULL
>>>>>
>>>>> 0x00000045: DW_TAG_pointer_type
>>>>> DW_AT_byte_size (8)
>>>>> DW_AT_type (0x00000061 "int")
>>>>> DW_AT_sibling (0x00000061)
>>>>>
>>>>> 0x0000004e: DW_TAG_LLVM_annotation
>>>>> DW_AT_name ("btf_type_tag")
>>>>> DW_AT_const_value ("type-tag-3")
>>>>>
>>>>> 0x00000057: DW_TAG_LLVM_annotation
>>>>> DW_AT_name ("btf_type_tag")
>>>>> DW_AT_const_value ("type-tag-2")
>>>>>
>>>>> 0x00000060: NULL
>>>>>
>>>>> 0x00000061: DW_TAG_base_type
>>>>> DW_AT_byte_size (0x04)
>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>> DW_AT_name ("int")
>>>>>
>>>>> 0x00000068: NULL
>>>>
>>>> do you have documentation to show why gnu generates attribute this way?
>>>> If dwarf generates
>>>> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>>> does this help?
>>>
>>> Okay, I think I see the problem. The internal representations between clang
>>> and GCC attach the attributes to different nodes, and as a result they
>>> produce different DWARF:
>>>
>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>>> annotations: !10)
>>> !6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64,
>>> annotations: !8)
>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>> !8 = !{!9}
>>> !9 = !{!"btf_type_tag", !"tag1"}
>>> !10 = !{!11, !12}
>>> !11 = !{!"btf_type_tag", !"tag2"}
>>> !12 = !{!"btf_type_tag", !"tag3"}
>>>
>>> If I am reading this IR right, then the tags "tag2" and "tag3" are being
>>> applied to the int**, and "tag1" is applied to the int*
>>>
>>> But I don't think this lines up with how the attribute syntax is defined.
>>> See
>>> https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>>> In particular the "All other attributes" section. (It's a bit dense).
>>> Or, as Joseph summed it up nicely earlier in the thread:
>>>> In either syntax, __typetag2 __typetag3 should apply to
>>>> the type to which g points, not to g or its type, just as if you had a
>>>> type qualifier there. You'd need to put the attributes (or qualifier)
>>>> after the *, not before, to make them apply to the pointer type.
>>>
>>>
>>> Compare that to GCC's internal representation, from which DWARF is generated:
>>>
>>> <var_decl 0x7ffff7535090 g
>>> type <pointer_type 0x7ffff74f8888
>>> type <pointer_type 0x7ffff74f8b28 type <integer_type 0x7ffff74385e8 int>
>>> unsigned DI
>>> size <integer_cst 0x7ffff742b450 constant 64>
>>> unit-size <integer_cst 0x7ffff742b468 constant 8>
>>> align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff743f888
>>> attributes <tree_list 0x7ffff75165c8
>>> purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>>> value <tree_list 0x7ffff7516550
>>> value <string_cst 0x7ffff75182e0 type <array_type 0x7ffff74f8738>
>>> readonly constant static "type-tag-3\000">>
>>> chain <tree_list 0x7ffff75165a0 purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>>> value <tree_list 0x7ffff75164d8
>>> value <string_cst 0x7ffff75182c0 type <array_type 0x7ffff74f8738>
>>> readonly constant static "type-tag-2\000">>>>
>>> pointer_to_this <pointer_type 0x7ffff74f8bd0>>
>>> unsigned DI size <integer_cst 0x7ffff742b450 64> unit-size <integer_cst 0x7ffff742b468 8>
>>> align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff74f87e0
>>> attributes <tree_list 0x7ffff75165f0 purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>>> value <tree_list 0x7ffff7516438
>>> value <string_cst 0x7ffff75182a0 type <array_type 0x7ffff74f8738>
>>> readonly constant static "type-tag-1\000">>>>
>>> public static unsigned DI defer-output /home/dfaust/playpen/btf/tags/annotate.c:10:42 size <integer_cst 0x7ffff742b450 64> unit-size <integer_cst 0x7ffff742b468 8>
>>> align:64 warn_if_not_align:0>
>>>
>>> See how tags "tag2" and "tag3" are associated with the pointer_type 0x7ffff74f8b28,
>>> that is, "the type to which g points"
>>>
>>> From GCC's DWARF the BTF we get currently looks like:
>>> VAR(g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>> which is obviously quite different and why this case caught my attention.
>>>
>>> I think this difference is the root of our problems. It might not be
>>> specifically related to the BTF tag attributes but they do reveal some
>>> discrepency between how clang and GCC handle the attribute syntax.
>>
>> The btf_type attribute is very similar to address_space attribute.
>> For example,
>> $ cat t1.c
>> int __attribute__((address_space(1))) * p;
>> $ clang -g -S -emit-llvm t1.c
>>
>> In IR, we will have
>> @p = dso_local global ptr addrspace(1) null, align 8, !dbg !0
>> ...
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>
>> Replacing address_space with btf_type_tag, we will get
>> ptr->type_tag->int in debuginfo.
>>
>> But it looks like gcc doesn't support address_space attribute
>>
>> $ gcc -g -S t1.c
>> t1.c:1:1: warning: ‘address_space’ attribute directive ignored
>> [-Wattributes]
>> int __attribute__((address_space(1))) * p;
>> ^~~
>>
>> Is it possible for gcc to go with address_space attribute
>> semantics for btf_type_tag attribute?
>
> In cases like this the behavior is the same.
> $ cat foo.c
> int __attribute__((btf_type_tag("tag1"))) * p;
> $ gcc -c -gdwarf -gbtf foo.c
>
> Internally:
> <var_decl 0x7ffff743abd0 p
> type <pointer_type 0x7ffff7590150
> type <integer_type 0x7ffff74475e8 int public SI
> size <integer_cst 0x7ffff742bf90 constant 32>
> unit-size <integer_cst 0x7ffff742bfa8 constant 4>
> align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff74475e8 precision:32 min <integer_cst 0x7ffff742bf48 -2147483648> max <integer_cst 0x7ffff742bf60 2147483647>
> pointer_to_this <pointer_type 0x7ffff744fa80>>
> unsigned DI
> size <integer_cst 0x7ffff742bd50 constant 64>
> unit-size <integer_cst 0x7ffff742bd68 constant 8>
> align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff744fa80
> attributes <tree_list 0x7ffff7564d70
> purpose <identifier_node 0x7ffff757f2d0 btf_type_tag>
> value <tree_list 0x7ffff7564cf8
> value <string_cst 0x7ffff757c220 type <array_type 0x7ffff75900a8>
> readonly constant static "tag1\000">>>>
> public static unsigned DI defer-output /home/dfaust/playpen/btf/tags/foo.c:1:45 size <integer_cst 0x7ffff742bd50 64> unit-size <integer_cst 0x7ffff742bd68 8>
> align:64 warn_if_not_align:0>
>
> And the resulting BTF:
>
> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
> [2] PTR '(anon)' type_id=3
> [3] TYPE_TAG 'tag1' type_id=1
> [4] VAR 'p' type_id=2, linkage=global
> [5] DATASEC '.bss' size=0 vlen=1
> type_id=4 offset=0 size=8 (VAR 'p')
>
> var(p) -> ptr -> type_tag -> int
It would be good if we can generate similar encoding in dwarf.
Currently in clang, we generate
var(p) -> ptr (type_tag) -> int
but I am open to generate
var(p) -> ptr -> type_tag -> int
in dwarf as well if it is possible.
>
>
>>
>>>
>>>>
>>>>>
>>>>>
>>>>> clang: clang -target bpf -c -g annotate.c
>>>>>
>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>> DW_AT_producer ("clang version 15.0.0 (https://github.com/llvm/llvm-project.git f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>> DW_AT_language (DW_LANG_C99)
>>>>> DW_AT_name ("annotate.c")
>>>>> DW_AT_str_offsets_base (0x00000008)
>>>>> DW_AT_stmt_list (0x00000000)
>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>> DW_AT_addr_base (0x00000008)
>>>>>
>>>>> 0x0000001e: DW_TAG_variable
>>>>> DW_AT_name ("g")
>>>>> DW_AT_type (0x00000029 "int **")
>>>>> DW_AT_external (true)
>>>>> DW_AT_decl_file ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>> DW_AT_decl_line (11)
>>>>> DW_AT_location (DW_OP_addrx 0x0)
>>>>>
>>>>> 0x00000029: DW_TAG_pointer_type
>>>>> DW_AT_type (0x00000035 "int *")
>>>>>
>>>>> 0x0000002e: DW_TAG_LLVM_annotation
>>>>> DW_AT_name ("btf_type_tag")
>>>>> DW_AT_const_value ("type-tag-2")
>>>>>
>>>>> 0x00000031: DW_TAG_LLVM_annotation
>>>>> DW_AT_name ("btf_type_tag")
>>>>> DW_AT_const_value ("type-tag-3")
>>>>>
>>>>> 0x00000034: NULL
>>>>>
>>>>> 0x00000035: DW_TAG_pointer_type
>>>>> DW_AT_type (0x0000003e "int")
>>>>>
>>>>> 0x0000003a: DW_TAG_LLVM_annotation
>>>>> DW_AT_name ("btf_type_tag")
>>>>> DW_AT_const_value ("type-tag-1")
>>>>>
>>>>> 0x0000003d: NULL
>>>>>
>>>>> 0x0000003e: DW_TAG_base_type
>>>>> DW_AT_name ("int")
>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>> DW_AT_byte_size (0x04)
>>>>>
>>>>> 0x00000042: NULL
>>>>>
>>>>>
next prev parent reply other threads:[~2022-06-03 2:04 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-01 19:42 [PATCH " David Faust
2022-04-01 19:42 ` [PATCH 1/8] dwarf: Add dw_get_die_parent function David Faust
2022-04-01 19:42 ` [PATCH 2/8] include: Add BTF tag defines to dwarf2 and btf David Faust
2022-04-01 19:42 ` [PATCH 3/8] c-family: Add BTF tag attribute handlers David Faust
2022-04-01 19:42 ` [PATCH 4/8] dwarf: create BTF decl and type tag DIEs David Faust
2022-04-01 19:42 ` [PATCH 5/8] ctfc: Add support to pass through BTF annotations David Faust
2022-04-01 19:42 ` [PATCH 6/8] dwarf2ctf: convert tag DIEs to CTF types David Faust
2022-04-01 19:42 ` [PATCH 7/8] Output BTF DECL_TAG and TYPE_TAG types David Faust
2022-04-01 19:42 ` [PATCH 8/8] testsuite: Add tests for BTF tags David Faust
2022-04-04 22:13 ` [PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations Yonghong Song
2022-04-05 16:26 ` David Faust
2022-04-18 19:36 ` [ping][PATCH " David Faust
2022-05-02 16:57 ` [ping2][PATCH " David Faust
2022-05-03 22:32 ` Joseph Myers
2022-05-04 17:03 ` David Faust
2022-05-05 23:00 ` Yonghong Song
2022-05-06 21:18 ` David Faust
2022-05-11 3:43 ` Yonghong Song
2022-05-11 5:05 ` Yonghong Song
2022-05-11 18:44 ` David Faust
2022-05-24 6:33 ` Yonghong Song
2022-05-24 11:07 ` Jose E. Marchesi
2022-05-24 15:52 ` Yonghong Song
2022-05-24 15:53 ` David Faust
2022-05-24 16:03 ` Yonghong Song
2022-05-24 17:04 ` David Faust
2022-05-26 7:29 ` Yonghong Song
2022-05-27 19:56 ` David Faust
2022-06-03 2:04 ` Yonghong Song [this message]
2022-06-07 21:42 ` David Faust
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3be9e5df-e74a-bca4-2fa8-f865c4da525b@fb.com \
--to=yhs@fb.com \
--cc=david.faust@oracle.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jose.marchesi@oracle.com \
--cc=joseph@codesourcery.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).