public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: David Faust <david.faust@oracle.com>
Cc: gcc-patches@gcc.gnu.org, Joseph Myers <joseph@codesourcery.com>
Subject: Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
Date: Tue, 10 May 2022 20:43:56 -0700	[thread overview]
Message-ID: <7125844e-f538-faca-1cdf-5322492c00d9@fb.com> (raw)
In-Reply-To: <b0626323-1bdc-4f99-0496-3007e2752ac4@oracle.com>



On 5/6/22 2:18 PM, David Faust wrote:
> 
> 
> On 5/5/22 16:00, Yonghong Song wrote:
>>
>>
>> On 5/4/22 10:03 AM, David Faust wrote:
>>>
>>>
>>> On 5/3/22 15:32, Joseph Myers wrote:
>>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>>
>>>>> Consider the following example:
>>>>>
>>>>>      #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>>      #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>>      #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>>
>>>>>      int __typetag1 * __typetag2 __typetag3 * g;
>>>>>
>>>>> The expected behavior is that 'g' is "a pointer with tags 'tag2' and
>>>>> 'tag3',
>>>>> to a pointer with tag 'tag1' to an int". i.e.:
>>>>
>>>> That's not a correct expectation for either GNU __attribute__ or C2x 
>>>> [[]]
>>>> attribute syntax.  In either syntax, __typetag2 __typetag3 should
>>>> apply to
>>>> the type to which g points, not to g or its type, just as if you had a
>>>> type qualifier there.  You'd need to put the attributes (or qualifier)
>>>> after the *, not before, to make them apply to the pointer type.  See
>>>> "Attribute Syntax" in the GCC manual for how the syntax is defined for
>>>> GNU
>>>> attributes and deduce in turn, for each subsequence of the tokens
>>>> matching
>>>> the syntax for some kind of declarator, what the type for "T D1" 
>>>> would be
>>>> as defined there and in the C standard, as deduced from the type for
>>>> "T D"
>>>> for a sub-declarator D.
>>>>   >> But GCC's attribute parsing produces a variable 'g' which is "a
>>> pointer with
>>>>> tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.
>>>>
>>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>>> syntax it applies to int.  Again, if you wanted it to apply to the
>>>> pointer
>>>> type it would need to go after the * not before.
>>>>
>>>> If you are concerned with the fine details of what construct an 
>>>> attribute
>>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>>
>>>
>>> Joseph, thank you! This is very helpful. My understanding of the syntax
>>> was not correct.
>>>
>>> (Actually, I made a bad mistake in paraphrasing this example from the
>>> discussion of it in the series cover letter. But, the reason why it is
>>> incorrect is the same.)
>>>
>>>
>>> Yonghong, is the specific ordering an expectation in BPF programs or
>>> other users of the tags?
>>
>> This is probably a language writing issue. We are saying tags only
>> apply to pointer. We probably should say it only apply to pointee.
>>
>> $ cat t.c
>> int const *ptr;
>>
>> the llvm ir debuginfo:
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>
>> We could replace 'const' with a tag like below:
>>
>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>> annotations: !7)
>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>> !7 = !{!8}
>> !8 = !{!"btf_type_tag", !"tag"}
>>
>> In the above IR, we generate annotations to pointer_type because
>> we didn't invent a new DI type for encode btf_type_tag. But it is
>> totally okay to have IR looks like
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>
> OK, thanks.
> 
> There is still the question of why the DWARF generated for this case 
> that I have been concerned about:
> 
>    int __typetag1 * __typetag2 __typetag3 * g;
> 
> differs between GCC (with this series) and clang. After studying it, GCC 
> is doing with the attributes exactly as is described in the Attribute 
> Syntax portion of the GCC manual where the GNU syntax is described. I do 
> not think there is any problem here.
> 
> So the difference in DWARF suggests to me that clang is not handling the 
> GNU attribute syntax in this particular case correctly, since it seems 
> to be associating __typetag2 and __typetag3 to g's type rather than the 
> type to which it points.
> 
> I am not sure whether for the use purposes of the tags this difference 
> is very important, but it is worth noting.
> 
> 
> As Joseph suggested, it may be better to encourage users of these tags 
> to use the C2x attribute syntax if they are concerned with precisely 
> which construct the tag applies.
> 
> This would also be a way around any issues in handling the attributes 
> due to the GNU syntax.
> 
> I tried a few test cases using C2x syntax BTF type tags with a clang-15 
> build, but ran into some issues (in particular, some of the tag 
> attributes being ignored altogether). I couldn't find confirmation 
> whether C2x attribute syntax is fully supported in clang yet, so maybe 
> this isn't expected to work. Do you know whether the C2x syntax is fully 
> supported in clang yet?

Actually, I don't know either. But since the btf decl_tag and type_tag
are also used to compile linux kernel and the minimum compiler version
to compile kernel is gcc5.1 and clang11. I am not sure whether gcc5.1
supports c2x or not, I guess probably not. So I think we most likely
cannot use c2x syntax.

> 
>>
>>>
>>> This example comes from my testing against clang to check that the BTF
>>> generated by both toolchains is compatible. In this case we get
>>> different results when using the GNU attribute syntax.
>>>
>>>
>>> To avoid confusion, here is the full example (from the cover letter).
>>> The difference in the results is clear in the DWARF.
>>>
>>>> Consider the following example:
>>>>
>>>>    #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>>>    #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>>>    #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>>
>>>>    int __typetag1 * __typetag2 __typetag3 * g;
>>>>
>>>>   <var_decl 0x7ffff7547090 g
>>>>      type <pointer_type 0x7ffff7548000
>>>>          type <pointer_type 0x7ffff75097e0 type <integer_type
>>>> 0x7ffff74495e8 int>
>>>>              asm_written unsigned DI
>>>>              size <integer_cst 0x7ffff743c450 constant 64>
>>>>              unit-size <integer_cst 0x7ffff743c468 constant 8>
>>>>              align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>>> canonical-type 0x7ffff7450888
>>>>              attributes <tree_list 0x7ffff75275c8
>>>>                  purpose <identifier_node 0x7ffff753a1e0 btf_type_tag>
>>>>                  value <tree_list 0x7ffff7527550
>>>>                      value <string_cst 0x7ffff75292e0 type <array_type
>>>> 0x7ffff7509738>
>>>>                          readonly constant static "type-tag-3\000">>
>>>>                  chain <tree_list 0x7ffff75275a0 purpose
>>>> <identifier_node 0x7ffff753a1e0 btf_type_tag>
>>>>                      value <tree_list 0x7ffff75274d8
>>>>                          value <string_cst 0x7ffff75292c0 type
>>>> <array_type 0x7ffff7509738>
>>>>                              readonly constant static 
>>>> "type-tag-2\000">>>>
>>>>              pointer_to_this <pointer_type 0x7ffff7509888>>
>>>>          asm_written unsigned DI size <integer_cst 0x7ffff743c450 64>
>>>> unit-size <integer_cst 0x7ffff743c468 8>
>>>>          align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>>> canonical-type 0x7ffff7509930
>>>>          attributes <tree_list 0x7ffff75275f0 purpose <identifier_node
>>>> 0x7ffff753a1e0 btf_type_tag>
>>>>              value <tree_list 0x7ffff7527438
>>>>                  value <string_cst 0x7ffff75292a0 type <array_type
>>>> 0x7ffff7509738>
>>>>                      readonly constant static "type-tag-1\000">>>>
>>>>      public static unsigned DI defer-output
>>>> /home/dfaust/playpen/btf/annotate.c:29:42 size <integer_cst
>>>> 0x7ffff743c450 64> unit-size <integer_cst 0x7ffff743c468 8>
>>>>      align:64 warn_if_not_align:0>
>>>
>>>>
>>>> The current implementation produces the following DWARF:
>>>>
>>>>   <1><1e>: Abbrev Number: 4 (DW_TAG_variable)
>>>>      <1f>   DW_AT_name        : g
>>>>      <21>   DW_AT_decl_file   : 1
>>>>      <22>   DW_AT_decl_line   : 6
>>>>      <23>   DW_AT_decl_column : 42
>>>>      <24>   DW_AT_type        : <0x32>
>>>>      <28>   DW_AT_external    : 1
>>>>      <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0
>>>> (DW_OP_addr: 0)
>>>>   <1><32>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>>>      <33>   DW_AT_byte_size   : 8
>>>>      <33>   DW_AT_type        : <0x45>
>>>>      <37>   DW_AT_sibling     : <0x45>
>>>>   <2><3b>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>      <3c>   DW_AT_name        : (indirect string, offset: 0x18):
>>>> btf_type_tag
>>>>      <40>   DW_AT_const_value : (indirect string, offset: 0xc7):
>>>> type-tag-1
>>>>   <2><44>: Abbrev Number: 0
>>>>   <1><45>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>>>      <46>   DW_AT_byte_size   : 8
>>>>      <46>   DW_AT_type        : <0x61>
>>>>      <4a>   DW_AT_sibling     : <0x61>
>>>>   <2><4e>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>      <4f>   DW_AT_name        : (indirect string, offset: 0x18):
>>>> btf_type_tag
>>>>      <53>   DW_AT_const_value : (indirect string, offset: 0xd): 
>>>> type-tag-3
>>>>   <2><57>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>      <58>   DW_AT_name        : (indirect string, offset: 0x18):
>>>> btf_type_tag
>>>>      <5c>   DW_AT_const_value : (indirect string, offset: 0xd2):
>>>> type-tag-2
>>>>   <2><60>: Abbrev Number: 0
>>>>   <1><61>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>      <62>   DW_AT_byte_size   : 4
>>>>      <63>   DW_AT_encoding    : 5    (signed)
>>>>      <64>   DW_AT_name        : int
>>>>   <1><68>: Abbrev Number: 0
>>>>
>>>> This does not agree with the DWARF produced by LLVM/clang for the same
>>>> case:
>>>> (clang 15.0.0 git 142501117a78080d2615074d3986fa42aa6a0734)
>>>>
>>>> <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>>>>      <1f>   DW_AT_name        : (indexed string: 0x3): g
>>>>      <20>   DW_AT_type        : <0x29>
>>>>      <24>   DW_AT_external    : 1
>>>>      <24>   DW_AT_decl_file   : 0
>>>>      <25>   DW_AT_decl_line   : 6
>>>>      <26>   DW_AT_location    : 2 byte block: a1 0     ((Unknown
>>>> location op 0xa1))
>>>>   <1><29>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>>>      <2a>   DW_AT_type        : <0x35>
>>>>   <2><2e>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>>      <2f>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>>>      <30>   DW_AT_const_value : (indexed string: 0x7): type-tag-2
>>>>   <2><31>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>>      <32>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>>>      <33>   DW_AT_const_value : (indexed string: 0x8): type-tag-3
>>>>   <2><34>: Abbrev Number: 0
>>>>   <1><35>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>>>      <36>   DW_AT_type        : <0x3e>
>>>>   <2><3a>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>>      <3b>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>>>      <3c>   DW_AT_const_value : (indexed string: 0x6): type-tag-1
>>>>   <2><3d>: Abbrev Number: 0
>>>>   <1><3e>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>      <3f>   DW_AT_name        : (indexed string: 0x4): int
>>>>      <40>   DW_AT_encoding    : 5    (signed)
>>>>      <41>   DW_AT_byte_size   : 4
>>>>   <1><42>: Abbrev Number: 0
>>>>
>>>>
>>>> Because GCC produces BTF from the internal DWARF DIE tree, the BTF
>>>> also differs.
>>>> This can be seen most obviously in the BTF type reference chains:
>>>>
>>>>    GCC
>>>>      VAR (g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>>>
>>>>    LLVM
>>>>      VAR (g) -> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>>
>>>

  reply	other threads:[~2022-05-11  3:44 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-01 19:42 [PATCH " David Faust
2022-04-01 19:42 ` [PATCH 1/8] dwarf: Add dw_get_die_parent function David Faust
2022-04-01 19:42 ` [PATCH 2/8] include: Add BTF tag defines to dwarf2 and btf David Faust
2022-04-01 19:42 ` [PATCH 3/8] c-family: Add BTF tag attribute handlers David Faust
2022-04-01 19:42 ` [PATCH 4/8] dwarf: create BTF decl and type tag DIEs David Faust
2022-04-01 19:42 ` [PATCH 5/8] ctfc: Add support to pass through BTF annotations David Faust
2022-04-01 19:42 ` [PATCH 6/8] dwarf2ctf: convert tag DIEs to CTF types David Faust
2022-04-01 19:42 ` [PATCH 7/8] Output BTF DECL_TAG and TYPE_TAG types David Faust
2022-04-01 19:42 ` [PATCH 8/8] testsuite: Add tests for BTF tags David Faust
2022-04-04 22:13 ` [PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations Yonghong Song
2022-04-05 16:26   ` David Faust
2022-04-18 19:36 ` [ping][PATCH " David Faust
2022-05-02 16:57   ` [ping2][PATCH " David Faust
2022-05-03 22:32     ` Joseph Myers
2022-05-04 17:03       ` David Faust
2022-05-05 23:00         ` Yonghong Song
2022-05-06 21:18           ` David Faust
2022-05-11  3:43             ` Yonghong Song [this message]
2022-05-11  5:05               ` Yonghong Song
2022-05-11 18:44                 ` David Faust
2022-05-24  6:33                   ` Yonghong Song
2022-05-24 11:07                     ` Jose E. Marchesi
2022-05-24 15:52                       ` Yonghong Song
2022-05-24 15:53                       ` David Faust
2022-05-24 16:03                         ` Yonghong Song
2022-05-24 17:04                           ` David Faust
2022-05-26  7:29                             ` Yonghong Song
2022-05-27 19:56                               ` David Faust
2022-06-03  2:04                                 ` Yonghong Song
2022-06-07 21:42                                   ` David Faust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7125844e-f538-faca-1cdf-5322492c00d9@fb.com \
    --to=yhs@fb.com \
    --cc=david.faust@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=joseph@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).