public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: David Faust <david.faust@oracle.com>,
	"Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: Joseph Myers <joseph@codesourcery.com>,
	Yonghong Song via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
Date: Thu, 26 May 2022 00:29:09 -0700	[thread overview]
Message-ID: <a4db600e-e92c-0092-3786-70c46398bc79@fb.com> (raw)
In-Reply-To: <ecb932b5-a3d8-cd0b-57bb-2291ae1b28be@oracle.com>



On 5/24/22 10:04 AM, David Faust wrote:
> 
> 
> On 5/24/22 09:03, Yonghong Song wrote:
>>
>>
>> On 5/24/22 8:53 AM, David Faust wrote:
>>>
>>>
>>> On 5/24/22 04:07, Jose E. Marchesi wrote:
>>>>
>>>>> On 5/11/22 11:44 AM, David Faust wrote:
>>>>>>
>>>>>> On 5/10/22 22:05, Yonghong Song wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 5/10/22 8:43 PM, Yonghong Song wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 5/6/22 2:18 PM, David Faust wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5/5/22 16:00, Yonghong Song wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5/4/22 10:03 AM, David Faust wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/3/22 15:32, Joseph Myers wrote:
>>>>>>>>>>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Consider the following example:
>>>>>>>>>>>>>
>>>>>>>>>>>>>         #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>>>>>>>>>>         #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>>>>>>>>>>         #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>>>>>>>>>>
>>>>>>>>>>>>>         int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>>>
>>>>>>>>>>>>> The expected behavior is that 'g' is "a pointer with tags
>>>>>>>>>>>>> 'tag2' and
>>>>>>>>>>>>> 'tag3',
>>>>>>>>>>>>> to a pointer with tag 'tag1' to an int". i.e.:
>>>>>>>>>>>>
>>>>>>>>>>>> That's not a correct expectation for either GNU __attribute__ or
>>>>>>>>>>>> C2x [[]]
>>>>>>>>>>>> attribute syntax.  In either syntax, __typetag2 __typetag3 should
>>>>>>>>>>>> apply to
>>>>>>>>>>>> the type to which g points, not to g or its type, just as if
>>>>>>>>>>>> you had a
>>>>>>>>>>>> type qualifier there.  You'd need to put the attributes (or
>>>>>>>>>>>> qualifier)
>>>>>>>>>>>> after the *, not before, to make them apply to the pointer
>>>>>>>>>>>> type.  See
>>>>>>>>>>>> "Attribute Syntax" in the GCC manual for how the syntax is
>>>>>>>>>>>> defined for
>>>>>>>>>>>> GNU
>>>>>>>>>>>> attributes and deduce in turn, for each subsequence of the tokens
>>>>>>>>>>>> matching
>>>>>>>>>>>> the syntax for some kind of declarator, what the type for "T D1"
>>>>>>>>>>>> would be
>>>>>>>>>>>> as defined there and in the C standard, as deduced from the type for
>>>>>>>>>>>> "T D"
>>>>>>>>>>>> for a sub-declarator D.
>>>>>>>>>>>>      >> But GCC's attribute parsing produces a variable 'g'
>>>>>>>>>>>> which is "a
>>>>>>>>>>> pointer with
>>>>>>>>>>>>> tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an
>>>>>>>>>>>>> int", i.e.
>>>>>>>>>>>>
>>>>>>>>>>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>>>>>>>>>>> syntax it applies to int.  Again, if you wanted it to apply to the
>>>>>>>>>>>> pointer
>>>>>>>>>>>> type it would need to go after the * not before.
>>>>>>>>>>>>
>>>>>>>>>>>> If you are concerned with the fine details of what construct an
>>>>>>>>>>>> attribute
>>>>>>>>>>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Joseph, thank you! This is very helpful. My understanding of
>>>>>>>>>>> the syntax
>>>>>>>>>>> was not correct.
>>>>>>>>>>>
>>>>>>>>>>> (Actually, I made a bad mistake in paraphrasing this example from the
>>>>>>>>>>> discussion of it in the series cover letter. But, the reason
>>>>>>>>>>> why it is
>>>>>>>>>>> incorrect is the same.)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yonghong, is the specific ordering an expectation in BPF programs or
>>>>>>>>>>> other users of the tags?
>>>>>>>>>>
>>>>>>>>>> This is probably a language writing issue. We are saying tags only
>>>>>>>>>> apply to pointer. We probably should say it only apply to pointee.
>>>>>>>>>>
>>>>>>>>>> $ cat t.c
>>>>>>>>>> int const *ptr;
>>>>>>>>>>
>>>>>>>>>> the llvm ir debuginfo:
>>>>>>>>>>
>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>>>>>>>>>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>>>>>>>>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>
>>>>>>>>>> We could replace 'const' with a tag like below:
>>>>>>>>>>
>>>>>>>>>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>>>>>>>>>
>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>>>>>>>>>> annotations: !7)
>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>> !7 = !{!8}
>>>>>>>>>> !8 = !{!"btf_type_tag", !"tag"}
>>>>>>>>>>
>>>>>>>>>> In the above IR, we generate annotations to pointer_type because
>>>>>>>>>> we didn't invent a new DI type for encode btf_type_tag. But it is
>>>>>>>>>> totally okay to have IR looks like
>>>>>>>>>>
>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
>>>>>>>>>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>
>>>>>>>>> OK, thanks.
>>>>>>>>>
>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>> that I have been concerned about:
>>>>>>>>>
>>>>>>>>>       int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>
>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>
>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>> than the type to which it points.
>>>>>>>>>
>>>>>>>>> I am not sure whether for the use purposes of the tags this difference
>>>>>>>>> is very important, but it is worth noting.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> As Joseph suggested, it may be better to encourage users of these tags
>>>>>>>>> to use the C2x attribute syntax if they are concerned with precisely
>>>>>>>>> which construct the tag applies.
>>>>>>>>>
>>>>>>>>> This would also be a way around any issues in handling the attributes
>>>>>>>>> due to the GNU syntax.
>>>>>>>>>
>>>>>>>>> I tried a few test cases using C2x syntax BTF type tags with a
>>>>>>>>> clang-15 build, but ran into some issues (in particular, some of the
>>>>>>>>> tag attributes being ignored altogether). I couldn't find confirmation
>>>>>>>>> whether C2x attribute syntax is fully supported in clang yet, so maybe
>>>>>>>>> this isn't expected to work. Do you know whether the C2x syntax is
>>>>>>>>> fully supported in clang yet?
>>>>>>>>
>>>>>>>> Actually, I don't know either. But since the btf decl_tag and type_tag
>>>>>>>> are also used to compile linux kernel and the minimum compiler version
>>>>>>>> to compile kernel is gcc5.1 and clang11. I am not sure whether gcc5.1
>>>>>>>> supports c2x or not, I guess probably not. So I think we most likely
>>>>>>>> cannot use c2x syntax.
>>>>>>>
>>>>>>> Okay, I think we can guard btf_tag's with newer compiler versions.
>>>>>>> What kind of c2x syntax you intend to use? I can help compile kernel
>>>>>>> with that syntax and llvm15 to see what is the issue and may help
>>>>>>> fix it in clang if possible.
>>>>>>
>>>>>> I am thinking to use the [[]] C2x standard attribute syntax. The
>>>>>> syntax makes it quite clear to which entity each attribute applies,
>>>>>> and in my opinion is a little more intuitive/less surprising too.
>>>>>> It's documented here (PDF):
>>>>>>      https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2731.pdf
>>>>>> See sections 6.7.11 for the syntax and 6.7.6 for
>>>>>> declarations. Section 6.7.6.1 specifically describes using the
>>>>>> attribute syntax with pointer declarators.
>>>>>> The attribute syntax itself for BTF tags is:
>>>>>>      [[clang::btf_type_tag("tag1")]]
>>>>>> or
>>>>>>      [[gnu::btf_type_tag("tag1")]]
>>>>>>
>>>>>> I am also looking into whether, with the C2x syntax, we really need two
>>>>>> separate attributes (type_tag and decl_tag) at the language
>>>>>> level. It might be possible with C2x syntax to use just one language
>>>>>> attribute (e.g. just btf_tag).
>>>>>>
>>>>>> A simple declaration for a tagged pointer to an int:
>>>>>>      int * [[gnu::btf_type_tag("tag1")]] x;
>>>>>> And for the example from this thread:
>>>>>>      #define __typetag1 [[gnu::btf_type_tag("type-tag-1")]]
>>>>>>      #define __typetag2 [[gnu::btf_type_tag("type-tag-2")]]
>>>>>>      #define __typetag3 [[gnu::btf_type_tag("type-tag-3")]]
>>>>>>      int * __typetag1 * __typetag2 __typetag3 g;
>>>>>> Here each tag applies to the preceding pointer, so the result is
>>>>>> unsurprising.
>>>>>> Actually, this is where I found something that looks like an issue
>>>>>> with the C2x attribute syntax in clang. The tags 2 and 3 go missing,
>>>>>> but with no warning nor other indication.
>>>>>> Compiling this example with gcc:
>>>>>> $ ~/toolchains/bpf/bin/bpf-unknown-none-gcc -c -gbtf -gdwarf c2x.c
>>>>>> -o c2x.o --std=c2x
>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o
>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>                  DW_AT_producer    ("GNU C2X 12.0.1 20220401
>>>>>> (experimental) -gbtf -gdwarf -std=c2x")
>>>>>>                  DW_AT_language    (DW_LANG_C11)
>>>>>>                  DW_AT_name    ("c2x.c")
>>>>>>                  DW_AT_comp_dir    ("/home/dfaust/playpen/btf/tags")
>>>>>>                  DW_AT_stmt_list    (0x00000000)
>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>                    DW_AT_name    ("g")
>>>>>>                    DW_AT_decl_file    ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>                    DW_AT_decl_line    (16)
>>>>>>                    DW_AT_decl_column    (0x2a)
>>>>>>                    DW_AT_type    (0x00000032 "int **")
>>>>>>                    DW_AT_external    (true)
>>>>>>                    DW_AT_location    (DW_OP_addr 0x0)
>>>>>> 0x00000032:   DW_TAG_pointer_type
>>>>>>                    DW_AT_byte_size    (8)
>>>>>>                    DW_AT_type    (0x0000004e "int *")
>>>>>>                    DW_AT_sibling    (0x0000004e)
>>>>>> 0x0000003b:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>                      DW_AT_const_value    ("type-tag-3")
>>>>>> 0x00000044:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>                      DW_AT_const_value    ("type-tag-2")
>>>>>> 0x0000004d:     NULL
>>>>>> 0x0000004e:   DW_TAG_pointer_type
>>>>>>                    DW_AT_byte_size    (8)
>>>>>>                    DW_AT_type    (0x00000061 "int")
>>>>>>                    DW_AT_sibling    (0x00000061)
>>>>>> 0x00000057:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>                      DW_AT_const_value    ("type-tag-1")
>>>>>> 0x00000060:     NULL
>>>>>> 0x00000061:   DW_TAG_base_type
>>>>>>                    DW_AT_byte_size    (0x04)
>>>>>>                    DW_AT_encoding    (DW_ATE_signed)
>>>>>>                    DW_AT_name    ("int")
>>>>>> 0x00000068:   NULL
>>>>>>
>>>>>> and with clang (changing the attribute prefix to clang:: appropriately):
>>>>>> $ ~/toolchains/llvm/bin/clang -target bpf -g -c c2x.c -o c2x.o.ll
>>>>>> --std=c2x
>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o.ll
>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>                  DW_AT_producer    ("clang version 15.0.0
>>>>>> (https://github.com/llvm/llvm-project.git
>>>>>> f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>>>                  DW_AT_language    (DW_LANG_C99)
>>>>>>                  DW_AT_name    ("c2x.c")
>>>>>>                  DW_AT_str_offsets_base    (0x00000008)
>>>>>>                  DW_AT_stmt_list    (0x00000000)
>>>>>>                  DW_AT_comp_dir    ("/home/dfaust/playpen/btf/tags")
>>>>>>                  DW_AT_addr_base    (0x00000008)
>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>                    DW_AT_name    ("g")
>>>>>>                    DW_AT_type    (0x00000029 "int **")
>>>>>>                    DW_AT_external    (true)
>>>>>>                    DW_AT_decl_file    ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>                    DW_AT_decl_line    (12)
>>>>>>                    DW_AT_location    (DW_OP_addrx 0x0)
>>>>>> 0x00000029:   DW_TAG_pointer_type
>>>>>>                    DW_AT_type    (0x00000032 "int *")
>>>>>> 0x0000002e:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name    ("btf_type_tag")
>>>>>>                      DW_AT_const_value    ("type-tag-1")
>>>>>> 0x00000031:     NULL
>>>>>> 0x00000032:   DW_TAG_pointer_type
>>>>>>                    DW_AT_type    (0x00000037 "int")
>>>>>> 0x00000037:   DW_TAG_base_type
>>>>>>                    DW_AT_name    ("int")
>>>>>>                    DW_AT_encoding    (DW_ATE_signed)
>>>>>>                    DW_AT_byte_size    (0x04)
>>>>>> 0x0000003b:   NULL
>>>>>
>>>>> Thanks. I checked with current clang. The generated code looks
>>>>> like above. Basically, for code like below
>>>>>
>>>>>      #define __typetag1 [[clang::btf_type_tag("type-tag-1")]]
>>>>>      #define __typetag2 [[clang::btf_type_tag("type-tag-2")]]
>>>>>      #define __typetag3 [[clang::btf_type_tag("type-tag-3")]]
>>>>>
>>>>>      int * __typetag1 * __typetag2 __typetag3 g;
>>>>>
>>>>> The IR type looks like
>>>>>      __typetag3 -> __typetag2 -> * (ptr1) -> __typetag1 -> * (ptr2) -> int
>>>>>
>>>>> The IR is similar to what we did if using
>>>>> __attribute__((btf_type_tag(""))), but their
>>>>> semantic interpretation is quite different.
>>>>> For example, with c2x format,
>>>>>      __typetag1 applies to ptr2
>>>>> with __attribute__ format, it applies pointee of ptr1.
>>>>>
>>>>> But more importantly, c2x format is incompatible with
>>>>> the usage of linux kernel. The following are a bunch of kernel
>>>>> __user usages. Here, __user intends to be replaced with a btf_type_tag.
>>>>>
>>>>> vfio_pci_core.h:        ssize_t (*rw)(struct vfio_pci_core_device
>>>>> *vdev, char __user *buf,
>>>>> vfio_pci_core.h:                                  char __user *buf,
>>>>> size_t count,
>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_bar_rw(struct
>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_vga_rw(struct
>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>> vfio_pci_core.h:                                      char __user
>>>>> *buf, size_t count,
>>>>> vfio_pci_core.h:                                void __user *arg,
>>>>> size_t argsz);
>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_read(struct vfio_device
>>>>> *core_vdev, char __user *buf,
>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_write(struct vfio_device
>>>>> *core_vdev, const char __user *buf,
>>>>> vringh.h:                    vring_desc_t __user *desc,
>>>>> vringh.h:                    vring_avail_t __user *avail,
>>>>> vringh.h:                    vring_used_t __user *used);
>>>>> vt_kern.h:int con_set_cmap(unsigned char __user *cmap);
>>>>> vt_kern.h:int con_get_cmap(unsigned char __user *cmap);
>>>>> vt_kern.h:int con_set_trans_old(unsigned char __user * table);
>>>>> vt_kern.h:int con_get_trans_old(unsigned char __user * table);
>>>>> vt_kern.h:int con_set_trans_new(unsigned short __user * table);
>>>>> vt_kern.h:int con_get_trans_new(unsigned short __user * table);
>>>>>
>>>>> You can see, we will not able to simply replace __user
>>>>> with [[clang::btf_type_tag("user")]] because it won't work
>>>>> according to c2x expectations.
>>>
>>> Hi,
>>>
>>> Thanks for checking this. I see that we probably cannot use the c2x
>>> syntax in the kernel, since it will not work as a drop-in replacement
>>> for the current uses.
>>>
>>>>
>>>> Hi Yongsong.
>>>>
>>>> I am a bit confused regarding the GNU attributes problem: our patch
>>>> supports it, but as David already noted:
>>>>
>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>> that I have been concerned about:
>>>>>>>>
>>>>>>>>      int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>
>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>
>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>> than the type to which it points.
>>>>
>>>> Note the example he uses is:
>>>>
>>>>     (a) int __typetag1 * __typetag2 __typetag3 * g;
>>>>
>>>> Not
>>>>
>>>>     (b) int * __typetag1 * __typetag2 __typetag3 g;
>>>>
>>>> Apparently for (a) clang is generating DWARF that associates __typetag2
>>>> and__typetag3 to g's type (the pointer to pointer) instead of the
>>>> pointer to int, which contravenes the GNU syntax rules.
>>>>
>>>> AFAIK thats is where the DWARF we generate differs, and what is blocking
>>>> us.  David will correct me in the likely case I'm wrong :)
>>>
>>> Right. This is what I hoped maybe the C2x syntax could resolve.
>>>
>>> The issue I saw is that in the case (a) above, when using the GNU
>>> attribute syntax, GCC and clang produce different results. I think that
>>> the underlying cause is some subtle difference in how clang is handling
>>> the GNU attribute syntax in the case compared to GCC.
>>>
>>>
>>> To remind ourselves, here is the full example. Notice the significant
>>> difference in which objects the tags are associated with in DWARF.
>>>
>>>
>>> #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>> #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>> #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>
>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>
>>>
>>> GCC: bpf-unknown-none-gcc -c -gdwarf -gbtf annotate.c
>>>
>>> 0x0000000c: DW_TAG_compile_unit
>>>                 DW_AT_producer	("GNU C17 12.0.1 20220401 (experimental) -gdwarf -gbtf")
>>>                 DW_AT_language	(DW_LANG_C11)
>>>                 DW_AT_name	("annotate.c")
>>>                 DW_AT_comp_dir	("/home/dfaust/playpen/btf/tags")
>>>                 DW_AT_stmt_list	(0x00000000)
>>>
>>> 0x0000001e:   DW_TAG_variable
>>>                   DW_AT_name	("g")
>>>                   DW_AT_decl_file	("/home/dfaust/playpen/btf/tags/annotate.c")
>>>                   DW_AT_decl_line	(11)
>>>                   DW_AT_decl_column	(0x2a)
>>>                   DW_AT_type	(0x00000032 "int **")
>>>                   DW_AT_external	(true)
>>>                   DW_AT_location	(DW_OP_addr 0x0)
>>>
>>> 0x00000032:   DW_TAG_pointer_type
>>>                   DW_AT_byte_size	(8)
>>>                   DW_AT_type	(0x00000045 "int *")
>>>                   DW_AT_sibling	(0x00000045)
>>>
>>> 0x0000003b:     DW_TAG_LLVM_annotation
>>>                     DW_AT_name	("btf_type_tag")
>>>                     DW_AT_const_value	("type-tag-1")
>>>
>>> 0x00000044:     NULL
>>>
>>> 0x00000045:   DW_TAG_pointer_type
>>>                   DW_AT_byte_size	(8)
>>>                   DW_AT_type	(0x00000061 "int")
>>>                   DW_AT_sibling	(0x00000061)
>>>
>>> 0x0000004e:     DW_TAG_LLVM_annotation
>>>                     DW_AT_name	("btf_type_tag")
>>>                     DW_AT_const_value	("type-tag-3")
>>>
>>> 0x00000057:     DW_TAG_LLVM_annotation
>>>                     DW_AT_name	("btf_type_tag")
>>>                     DW_AT_const_value	("type-tag-2")
>>>
>>> 0x00000060:     NULL
>>>
>>> 0x00000061:   DW_TAG_base_type
>>>                   DW_AT_byte_size	(0x04)
>>>                   DW_AT_encoding	(DW_ATE_signed)
>>>                   DW_AT_name	("int")
>>>
>>> 0x00000068:   NULL
>>
>> do you have documentation to show why gnu generates attribute this way?
>> If dwarf generates
>>       ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>> does this help?
> 
> Okay, I think I see the problem. The internal representations between clang
> and GCC attach the attributes to different nodes, and as a result they
> produce different DWARF:
> 
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
> annotations: !10)
> !6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64,
> annotations: !8)
> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
> !8 = !{!9}
> !9 = !{!"btf_type_tag", !"tag1"}
> !10 = !{!11, !12}
> !11 = !{!"btf_type_tag", !"tag2"}
> !12 = !{!"btf_type_tag", !"tag3"}
> 
> If I am reading this IR right, then the tags "tag2" and "tag3" are being
> applied to the int**, and "tag1" is applied to the int*
> 
> But I don't think this lines up with how the attribute syntax is defined.
> See
>    https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
> In particular the "All other attributes" section. (It's a bit dense).
> Or, as Joseph summed it up nicely earlier in the thread:
>> In either syntax, __typetag2 __typetag3 should apply to
>> the type to which g points, not to g or its type, just as if you had a
>> type qualifier there.  You'd need to put the attributes (or qualifier)
>> after the *, not before, to make them apply to the pointer type.
> 
> 
> Compare that to GCC's internal representation, from which DWARF is generated:
> 
>   <var_decl 0x7ffff7535090 g
>      type <pointer_type 0x7ffff74f8888
>          type <pointer_type 0x7ffff74f8b28 type <integer_type 0x7ffff74385e8 int>
>              unsigned DI
>              size <integer_cst 0x7ffff742b450 constant 64>
>              unit-size <integer_cst 0x7ffff742b468 constant 8>
>              align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff743f888
>              attributes <tree_list 0x7ffff75165c8
>                  purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>                  value <tree_list 0x7ffff7516550
>                      value <string_cst 0x7ffff75182e0 type <array_type 0x7ffff74f8738>
>                          readonly constant static "type-tag-3\000">>
>                  chain <tree_list 0x7ffff75165a0 purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>                      value <tree_list 0x7ffff75164d8
>                          value <string_cst 0x7ffff75182c0 type <array_type 0x7ffff74f8738>
>                              readonly constant static "type-tag-2\000">>>>
>              pointer_to_this <pointer_type 0x7ffff74f8bd0>>
>          unsigned DI size <integer_cst 0x7ffff742b450 64> unit-size <integer_cst 0x7ffff742b468 8>
>          align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff74f87e0
>          attributes <tree_list 0x7ffff75165f0 purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>              value <tree_list 0x7ffff7516438
>                  value <string_cst 0x7ffff75182a0 type <array_type 0x7ffff74f8738>
>                      readonly constant static "type-tag-1\000">>>>
>      public static unsigned DI defer-output /home/dfaust/playpen/btf/tags/annotate.c:10:42 size <integer_cst 0x7ffff742b450 64> unit-size <integer_cst 0x7ffff742b468 8>
>      align:64 warn_if_not_align:0>
> 
> See how tags "tag2" and "tag3" are associated with the pointer_type 0x7ffff74f8b28,
> that is, "the type to which g points"
> 
>  From GCC's DWARF the BTF we get currently looks like:
>    VAR(g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
> which is obviously quite different and why this case caught my attention.
> 
> I think this difference is the root of our problems. It might not be
> specifically related to the BTF tag attributes but they do reveal some
> discrepency between how clang and GCC handle the attribute syntax.

The btf_type attribute is very similar to address_space attribute.
For example,
$ cat t1.c
int __attribute__((address_space(1))) * p;
$ clang -g -S -emit-llvm t1.c

In IR, we will have
@p = dso_local global ptr addrspace(1) null, align 8, !dbg !0
...
!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

Replacing address_space with btf_type_tag, we will get
ptr->type_tag->int in debuginfo.

But it looks like gcc doesn't support address_space attribute

$ gcc -g -S t1.c
t1.c:1:1: warning: ‘address_space’ attribute directive ignored 
[-Wattributes]
  int __attribute__((address_space(1))) * p;
  ^~~

Is it possible for gcc to go with address_space attribute
semantics for btf_type_tag attribute?

> 
>>
>>>
>>>
>>> clang: clang -target bpf -c -g annotate.c
>>>
>>> 0x0000000c: DW_TAG_compile_unit
>>>                 DW_AT_producer	("clang version 15.0.0 (https://github.com/llvm/llvm-project.git f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>                 DW_AT_language	(DW_LANG_C99)
>>>                 DW_AT_name	("annotate.c")
>>>                 DW_AT_str_offsets_base	(0x00000008)
>>>                 DW_AT_stmt_list	(0x00000000)
>>>                 DW_AT_comp_dir	("/home/dfaust/playpen/btf/tags")
>>>                 DW_AT_addr_base	(0x00000008)
>>>
>>> 0x0000001e:   DW_TAG_variable
>>>                   DW_AT_name	("g")
>>>                   DW_AT_type	(0x00000029 "int **")
>>>                   DW_AT_external	(true)
>>>                   DW_AT_decl_file	("/home/dfaust/playpen/btf/tags/annotate.c")
>>>                   DW_AT_decl_line	(11)
>>>                   DW_AT_location	(DW_OP_addrx 0x0)
>>>
>>> 0x00000029:   DW_TAG_pointer_type
>>>                   DW_AT_type	(0x00000035 "int *")
>>>
>>> 0x0000002e:     DW_TAG_LLVM_annotation
>>>                     DW_AT_name	("btf_type_tag")
>>>                     DW_AT_const_value	("type-tag-2")
>>>
>>> 0x00000031:     DW_TAG_LLVM_annotation
>>>                     DW_AT_name	("btf_type_tag")
>>>                     DW_AT_const_value	("type-tag-3")
>>>
>>> 0x00000034:     NULL
>>>
>>> 0x00000035:   DW_TAG_pointer_type
>>>                   DW_AT_type	(0x0000003e "int")
>>>
>>> 0x0000003a:     DW_TAG_LLVM_annotation
>>>                     DW_AT_name	("btf_type_tag")
>>>                     DW_AT_const_value	("type-tag-1")
>>>
>>> 0x0000003d:     NULL
>>>
>>> 0x0000003e:   DW_TAG_base_type
>>>                   DW_AT_name	("int")
>>>                   DW_AT_encoding	(DW_ATE_signed)
>>>                   DW_AT_byte_size	(0x04)
>>>
>>> 0x00000042:   NULL
>>>
>>>

  reply	other threads:[~2022-05-26  7:29 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-01 19:42 [PATCH " David Faust
2022-04-01 19:42 ` [PATCH 1/8] dwarf: Add dw_get_die_parent function David Faust
2022-04-01 19:42 ` [PATCH 2/8] include: Add BTF tag defines to dwarf2 and btf David Faust
2022-04-01 19:42 ` [PATCH 3/8] c-family: Add BTF tag attribute handlers David Faust
2022-04-01 19:42 ` [PATCH 4/8] dwarf: create BTF decl and type tag DIEs David Faust
2022-04-01 19:42 ` [PATCH 5/8] ctfc: Add support to pass through BTF annotations David Faust
2022-04-01 19:42 ` [PATCH 6/8] dwarf2ctf: convert tag DIEs to CTF types David Faust
2022-04-01 19:42 ` [PATCH 7/8] Output BTF DECL_TAG and TYPE_TAG types David Faust
2022-04-01 19:42 ` [PATCH 8/8] testsuite: Add tests for BTF tags David Faust
2022-04-04 22:13 ` [PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations Yonghong Song
2022-04-05 16:26   ` David Faust
2022-04-18 19:36 ` [ping][PATCH " David Faust
2022-05-02 16:57   ` [ping2][PATCH " David Faust
2022-05-03 22:32     ` Joseph Myers
2022-05-04 17:03       ` David Faust
2022-05-05 23:00         ` Yonghong Song
2022-05-06 21:18           ` David Faust
2022-05-11  3:43             ` Yonghong Song
2022-05-11  5:05               ` Yonghong Song
2022-05-11 18:44                 ` David Faust
2022-05-24  6:33                   ` Yonghong Song
2022-05-24 11:07                     ` Jose E. Marchesi
2022-05-24 15:52                       ` Yonghong Song
2022-05-24 15:53                       ` David Faust
2022-05-24 16:03                         ` Yonghong Song
2022-05-24 17:04                           ` David Faust
2022-05-26  7:29                             ` Yonghong Song [this message]
2022-05-27 19:56                               ` David Faust
2022-06-03  2:04                                 ` Yonghong Song
2022-06-07 21:42                                   ` David Faust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a4db600e-e92c-0092-3786-70c46398bc79@fb.com \
    --to=yhs@fb.com \
    --cc=david.faust@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jose.marchesi@oracle.com \
    --cc=joseph@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).