public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Faust <david.faust@oracle.com>
To: Yonghong Song <yhs@fb.com>
Cc: gcc-patches@gcc.gnu.org, Joseph Myers <joseph@codesourcery.com>
Subject: Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
Date: Fri, 6 May 2022 14:18:17 -0700	[thread overview]
Message-ID: <b0626323-1bdc-4f99-0496-3007e2752ac4@oracle.com> (raw)
In-Reply-To: <f9698210-e567-59c9-fdbc-85686ad78470@fb.com>



On 5/5/22 16:00, Yonghong Song wrote:
> 
> 
> On 5/4/22 10:03 AM, David Faust wrote:
>>
>>
>> On 5/3/22 15:32, Joseph Myers wrote:
>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>
>>>> Consider the following example:
>>>>
>>>>      #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>      #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>      #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>
>>>>      int __typetag1 * __typetag2 __typetag3 * g;
>>>>
>>>> The expected behavior is that 'g' is "a pointer with tags 'tag2' and
>>>> 'tag3',
>>>> to a pointer with tag 'tag1' to an int". i.e.:
>>>
>>> That's not a correct expectation for either GNU __attribute__ or C2x [[]]
>>> attribute syntax.  In either syntax, __typetag2 __typetag3 should
>>> apply to
>>> the type to which g points, not to g or its type, just as if you had a
>>> type qualifier there.  You'd need to put the attributes (or qualifier)
>>> after the *, not before, to make them apply to the pointer type.  See
>>> "Attribute Syntax" in the GCC manual for how the syntax is defined for
>>> GNU
>>> attributes and deduce in turn, for each subsequence of the tokens
>>> matching
>>> the syntax for some kind of declarator, what the type for "T D1" would be
>>> as defined there and in the C standard, as deduced from the type for
>>> "T D"
>>> for a sub-declarator D.
>>>   >> But GCC's attribute parsing produces a variable 'g' which is "a
>> pointer with
>>>> tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.
>>>
>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>> syntax it applies to int.  Again, if you wanted it to apply to the
>>> pointer
>>> type it would need to go after the * not before.
>>>
>>> If you are concerned with the fine details of what construct an attribute
>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>
>>
>> Joseph, thank you! This is very helpful. My understanding of the syntax
>> was not correct.
>>
>> (Actually, I made a bad mistake in paraphrasing this example from the
>> discussion of it in the series cover letter. But, the reason why it is
>> incorrect is the same.)
>>
>>
>> Yonghong, is the specific ordering an expectation in BPF programs or
>> other users of the tags?
> 
> This is probably a language writing issue. We are saying tags only
> apply to pointer. We probably should say it only apply to pointee.
> 
> $ cat t.c
> int const *ptr;
> 
> the llvm ir debuginfo:
> 
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
> 
> We could replace 'const' with a tag like below:
> 
> int __attribute__((btf_type_tag("tag"))) *ptr;
> 
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
> annotations: !7)
> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
> !7 = !{!8}
> !8 = !{!"btf_type_tag", !"tag"}
> 
> In the above IR, we generate annotations to pointer_type because
> we didn't invent a new DI type for encode btf_type_tag. But it is
> totally okay to have IR looks like
> 
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
> 
OK, thanks.

There is still the question of why the DWARF generated for this case 
that I have been concerned about:

   int __typetag1 * __typetag2 __typetag3 * g;

differs between GCC (with this series) and clang. After studying it, GCC 
is doing with the attributes exactly as is described in the Attribute 
Syntax portion of the GCC manual where the GNU syntax is described. I do 
not think there is any problem here.

So the difference in DWARF suggests to me that clang is not handling the 
GNU attribute syntax in this particular case correctly, since it seems 
to be associating __typetag2 and __typetag3 to g's type rather than the 
type to which it points.

I am not sure whether for the use purposes of the tags this difference 
is very important, but it is worth noting.


As Joseph suggested, it may be better to encourage users of these tags 
to use the C2x attribute syntax if they are concerned with precisely 
which construct the tag applies.

This would also be a way around any issues in handling the attributes 
due to the GNU syntax.

I tried a few test cases using C2x syntax BTF type tags with a clang-15 
build, but ran into some issues (in particular, some of the tag 
attributes being ignored altogether). I couldn't find confirmation 
whether C2x attribute syntax is fully supported in clang yet, so maybe 
this isn't expected to work. Do you know whether the C2x syntax is fully 
supported in clang yet?

> 
>>
>> This example comes from my testing against clang to check that the BTF
>> generated by both toolchains is compatible. In this case we get
>> different results when using the GNU attribute syntax.
>>
>>
>> To avoid confusion, here is the full example (from the cover letter).
>> The difference in the results is clear in the DWARF.
>>
>>> Consider the following example:
>>>
>>>    #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>>    #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>>    #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>
>>>    int __typetag1 * __typetag2 __typetag3 * g;
>>>
>>>   <var_decl 0x7ffff7547090 g
>>>      type <pointer_type 0x7ffff7548000
>>>          type <pointer_type 0x7ffff75097e0 type <integer_type
>>> 0x7ffff74495e8 int>
>>>              asm_written unsigned DI
>>>              size <integer_cst 0x7ffff743c450 constant 64>
>>>              unit-size <integer_cst 0x7ffff743c468 constant 8>
>>>              align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>> canonical-type 0x7ffff7450888
>>>              attributes <tree_list 0x7ffff75275c8
>>>                  purpose <identifier_node 0x7ffff753a1e0 btf_type_tag>
>>>                  value <tree_list 0x7ffff7527550
>>>                      value <string_cst 0x7ffff75292e0 type <array_type
>>> 0x7ffff7509738>
>>>                          readonly constant static "type-tag-3\000">>
>>>                  chain <tree_list 0x7ffff75275a0 purpose
>>> <identifier_node 0x7ffff753a1e0 btf_type_tag>
>>>                      value <tree_list 0x7ffff75274d8
>>>                          value <string_cst 0x7ffff75292c0 type
>>> <array_type 0x7ffff7509738>
>>>                              readonly constant static "type-tag-2\000">>>>
>>>              pointer_to_this <pointer_type 0x7ffff7509888>>
>>>          asm_written unsigned DI size <integer_cst 0x7ffff743c450 64>
>>> unit-size <integer_cst 0x7ffff743c468 8>
>>>          align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>> canonical-type 0x7ffff7509930
>>>          attributes <tree_list 0x7ffff75275f0 purpose <identifier_node
>>> 0x7ffff753a1e0 btf_type_tag>
>>>              value <tree_list 0x7ffff7527438
>>>                  value <string_cst 0x7ffff75292a0 type <array_type
>>> 0x7ffff7509738>
>>>                      readonly constant static "type-tag-1\000">>>>
>>>      public static unsigned DI defer-output
>>> /home/dfaust/playpen/btf/annotate.c:29:42 size <integer_cst
>>> 0x7ffff743c450 64> unit-size <integer_cst 0x7ffff743c468 8>
>>>      align:64 warn_if_not_align:0>
>>
>>>
>>> The current implementation produces the following DWARF:
>>>
>>>   <1><1e>: Abbrev Number: 4 (DW_TAG_variable)
>>>      <1f>   DW_AT_name        : g
>>>      <21>   DW_AT_decl_file   : 1
>>>      <22>   DW_AT_decl_line   : 6
>>>      <23>   DW_AT_decl_column : 42
>>>      <24>   DW_AT_type        : <0x32>
>>>      <28>   DW_AT_external    : 1
>>>      <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0
>>> (DW_OP_addr: 0)
>>>   <1><32>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>>      <33>   DW_AT_byte_size   : 8
>>>      <33>   DW_AT_type        : <0x45>
>>>      <37>   DW_AT_sibling     : <0x45>
>>>   <2><3b>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>      <3c>   DW_AT_name        : (indirect string, offset: 0x18):
>>> btf_type_tag
>>>      <40>   DW_AT_const_value : (indirect string, offset: 0xc7):
>>> type-tag-1
>>>   <2><44>: Abbrev Number: 0
>>>   <1><45>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>>      <46>   DW_AT_byte_size   : 8
>>>      <46>   DW_AT_type        : <0x61>
>>>      <4a>   DW_AT_sibling     : <0x61>
>>>   <2><4e>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>      <4f>   DW_AT_name        : (indirect string, offset: 0x18):
>>> btf_type_tag
>>>      <53>   DW_AT_const_value : (indirect string, offset: 0xd): type-tag-3
>>>   <2><57>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>      <58>   DW_AT_name        : (indirect string, offset: 0x18):
>>> btf_type_tag
>>>      <5c>   DW_AT_const_value : (indirect string, offset: 0xd2):
>>> type-tag-2
>>>   <2><60>: Abbrev Number: 0
>>>   <1><61>: Abbrev Number: 5 (DW_TAG_base_type)
>>>      <62>   DW_AT_byte_size   : 4
>>>      <63>   DW_AT_encoding    : 5    (signed)
>>>      <64>   DW_AT_name        : int
>>>   <1><68>: Abbrev Number: 0
>>>
>>> This does not agree with the DWARF produced by LLVM/clang for the same
>>> case:
>>> (clang 15.0.0 git 142501117a78080d2615074d3986fa42aa6a0734)
>>>
>>> <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>>>      <1f>   DW_AT_name        : (indexed string: 0x3): g
>>>      <20>   DW_AT_type        : <0x29>
>>>      <24>   DW_AT_external    : 1
>>>      <24>   DW_AT_decl_file   : 0
>>>      <25>   DW_AT_decl_line   : 6
>>>      <26>   DW_AT_location    : 2 byte block: a1 0     ((Unknown
>>> location op 0xa1))
>>>   <1><29>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>>      <2a>   DW_AT_type        : <0x35>
>>>   <2><2e>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>      <2f>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>>      <30>   DW_AT_const_value : (indexed string: 0x7): type-tag-2
>>>   <2><31>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>      <32>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>>      <33>   DW_AT_const_value : (indexed string: 0x8): type-tag-3
>>>   <2><34>: Abbrev Number: 0
>>>   <1><35>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>>      <36>   DW_AT_type        : <0x3e>
>>>   <2><3a>: Abbrev Number: 4 (User TAG value: 0x6000)
>>>      <3b>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>>      <3c>   DW_AT_const_value : (indexed string: 0x6): type-tag-1
>>>   <2><3d>: Abbrev Number: 0
>>>   <1><3e>: Abbrev Number: 5 (DW_TAG_base_type)
>>>      <3f>   DW_AT_name        : (indexed string: 0x4): int
>>>      <40>   DW_AT_encoding    : 5    (signed)
>>>      <41>   DW_AT_byte_size   : 4
>>>   <1><42>: Abbrev Number: 0
>>>
>>>
>>> Because GCC produces BTF from the internal DWARF DIE tree, the BTF
>>> also differs.
>>> This can be seen most obviously in the BTF type reference chains:
>>>
>>>    GCC
>>>      VAR (g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>>
>>>    LLVM
>>>      VAR (g) -> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>
>>

  reply	other threads:[~2022-05-06 21:18 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-01 19:42 [PATCH " David Faust
2022-04-01 19:42 ` [PATCH 1/8] dwarf: Add dw_get_die_parent function David Faust
2022-04-01 19:42 ` [PATCH 2/8] include: Add BTF tag defines to dwarf2 and btf David Faust
2022-04-01 19:42 ` [PATCH 3/8] c-family: Add BTF tag attribute handlers David Faust
2022-04-01 19:42 ` [PATCH 4/8] dwarf: create BTF decl and type tag DIEs David Faust
2022-04-01 19:42 ` [PATCH 5/8] ctfc: Add support to pass through BTF annotations David Faust
2022-04-01 19:42 ` [PATCH 6/8] dwarf2ctf: convert tag DIEs to CTF types David Faust
2022-04-01 19:42 ` [PATCH 7/8] Output BTF DECL_TAG and TYPE_TAG types David Faust
2022-04-01 19:42 ` [PATCH 8/8] testsuite: Add tests for BTF tags David Faust
2022-04-04 22:13 ` [PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations Yonghong Song
2022-04-05 16:26   ` David Faust
2022-04-18 19:36 ` [ping][PATCH " David Faust
2022-05-02 16:57   ` [ping2][PATCH " David Faust
2022-05-03 22:32     ` Joseph Myers
2022-05-04 17:03       ` David Faust
2022-05-05 23:00         ` Yonghong Song
2022-05-06 21:18           ` David Faust [this message]
2022-05-11  3:43             ` Yonghong Song
2022-05-11  5:05               ` Yonghong Song
2022-05-11 18:44                 ` David Faust
2022-05-24  6:33                   ` Yonghong Song
2022-05-24 11:07                     ` Jose E. Marchesi
2022-05-24 15:52                       ` Yonghong Song
2022-05-24 15:53                       ` David Faust
2022-05-24 16:03                         ` Yonghong Song
2022-05-24 17:04                           ` David Faust
2022-05-26  7:29                             ` Yonghong Song
2022-05-27 19:56                               ` David Faust
2022-06-03  2:04                                 ` Yonghong Song
2022-06-07 21:42                                   ` David Faust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b0626323-1bdc-4f99-0496-3007e2752ac4@oracle.com \
    --to=david.faust@oracle.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=joseph@codesourcery.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).