From: David Faust
To: Yonghong Song, gcc-patches@gcc.gnu.org
Date: Tue, 5 Apr 2022 09:26:01 -0700
Subject: Re: [PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
References: <20220401194216.16469-1-david.faust@oracle.com> <17035e49-2233-bd65-51de-9f7a013be511@fb.com>
In-Reply-To: <17035e49-2233-bd65-51de-9f7a013be511@fb.com>

On 4/4/22 15:13, Yonghong Song wrote:
>
>
> On 4/1/22 12:42 PM, David Faust wrote:
>> Hello,
>>
>> This patch series is a first attempt at adding support for:
>>
>> - Two new C-language-level attributes that associate (i.e. "tag")
>>   particular declarations and types with arbitrary strings. As explained below,
>>   this is intended, for example, to characterize certain pointer
>>   types.
>>
>> - The conveyance of that information in the DWARF output in the form of a new
>>   DIE: DW_TAG_GNU_annotation.
>>
>> - The conveyance of that information in the BTF output in the form of two new
>>   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>
>> All of these facilities are being added to the eBPF ecosystem, and support for
>> them exists in some form in LLVM. However, as we shall see, we have found some
>> problems implementing them so some discussion is in order.
>>
>> Purpose
>> =======
>>
>> 1)  Addition of C-family language constructs (attributes) to specify free-text
>>     tags on certain language elements, such as struct fields.
>>
>>     The purpose of these annotations is to provide additional information about
>>     types, variables, and function parameters of interest to the kernel. A
>>     driving use case is to tag pointer types within the Linux kernel and eBPF
>>     programs with additional semantic information, such as '__user' or '__rcu'.
>>
>>     For example, consider the Linux kernel function do_execve with the
>>     following declaration:
>>
>>       static int do_execve(struct filename *filename,
>>                            const char __user *const __user *__argv,
>>                            const char __user *const __user *__envp);
>>
>>     Here, __user could be defined with these annotations to record semantic
>>     information about the pointer parameters (e.g., they are user-provided) in
>>     DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>     can read the tags and make use of the information.
>>
>> 2)  Conveying the tags in the generated DWARF debug info.
>>
>>     The main motivation for emitting the tags in DWARF is that the Linux kernel
>>     generates its BTF information via pahole, using DWARF as a source:
>>
>>         +--------+  BTF                 BTF    +----------+
>>         | pahole |-------> vmlinux.btf ------->| verifier |
>>         +--------+                             +----------+
>>             ^                                       ^
>>             |                                       |
>>       DWARF |                                   BTF |
>>             |                                       |
>>          vmlinux                             +-------------+
>>          module1.ko                          | BPF program |
>>          module2.ko                          +-------------+
>>          ...
>>
>>     This is because:
>>
>>     a)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>
>>     b)  GCC can generate BTF for whatever target with -gbtf, but there is no
>>         support for linking/deduplicating BTF in the linker.
>>
>>     In the scenario above, the verifier needs access to the pointer tags of
>>     both the kernel types/declarations (conveyed in the DWARF and translated
>>     to BTF by pahole) and those of the BPF program (available directly in BTF).
>>
>>     Another motivation for having the tag information in DWARF, unrelated to
>>     BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>     to benefit from these tags in order to differentiate between different
>>     kinds of pointers in the kernel.
>>
>> 3)  Conveying the tags in the generated BTF debug info.
>>
>>     This is easy: the main purpose of having this info in BTF is for the
>>     compiled eBPF programs. The kernel verifier can then access the tags
>>     of pointers used by the eBPF programs.
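
To make the __user case above concrete, here is a minimal sketch of how such a
macro might sit on top of the proposed attribute. It is an illustration only,
not code from the series: the names __user_tag and count_args are made up, and
the __has_attribute guard is there so the snippet also builds with compilers
that do not know btf_type_tag (in which case the macro expands to nothing).

```c
#include <stddef.h>

/* Older compilers may lack __has_attribute; treat every attribute as absent. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Hypothetical macro name. Only the attribute name btf_type_tag comes from
   the discussion; the tag string "user" is an assumption. */
#if __has_attribute(btf_type_tag)
# define __user_tag __attribute__((btf_type_tag("user")))
#else
# define __user_tag
#endif

/* Count the entries of a NULL-terminated argument vector. The tags are meant
   to affect debug info only, so the generated code is the same either way. */
size_t count_args(const char __user_tag *const __user_tag *argv)
{
    size_t n = 0;

    while (argv[n] != NULL)
        n++;
    return n;
}
```

The tag placement mirrors the do_execve declaration: one tag on the char
pointee and one on the inner pointer type.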
>>
>>
>> For more information about these tags and the motivation behind them, please
>> refer to the following Linux kernel discussions:
>>
>>   https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>   https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>   https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>
>>
>> What is in this patch series
>> ============================
>>
>> This patch series adds support for these annotations in GCC. The implementation
>> is largely complete. However, in some cases the produced debug info (both DWARF
>> and BTF) differs significantly from that produced by LLVM. This issue is
>> discussed in detail below, along with a few specific questions for both GCC and
>> LLVM. Any input would be much appreciated.
>
> Hi, David, Thanks for the RFC implementation! I will answer your
> questions related to llvm and kernel.
>

Hi Yonghong, thanks for the answers!

>>
>>
>> Implementation Overview
>> =======================
>>
>> To enable these annotations, two new C language attributes are added:
>> __attribute__((btf_decl_tag("foo"))) and __attribute__((btf_type_tag("bar"))).
>> Both attributes accept a single arbitrary string constant argument, which will
>> be recorded in the generated DWARF and/or BTF debugging information. They have
>> no effect on code generation.
>>
>> Note that we are using the same attribute names as LLVM, which include "btf"
>> in the name. This may be controversial, as these tags are not really
>> BTF-specific. A different name may be more appropriate. There was much
>> discussion about naming in the proposal for the functionality in LLVM; the
>> full thread can be found here:
>>
>>   https://lists.llvm.org/pipermail/llvm-dev/2021-June/151023.html
>>
>> The name debug_info_annotate, suggested here, might better suit the attribute:
>>
>>   https://lists.llvm.org/pipermail/llvm-dev/2021-June/151042.html
>>
>> DWARF support is enabled via a new DW_TAG_GNU_annotation.
>> When generating DWARF,
>> declarations and types will be checked for the corresponding attributes. If
>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>> the annotated type or declaration, one for each tag. These DIEs link the
>> arbitrary tag value to the item they annotate.
>>
>> For example, the following variable declaration:
>>
>>   #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>   #define __decltag1 __attribute__((btf_decl_tag("decl-tag-1")))
>>   #define __decltag2 __attribute__((btf_decl_tag("decl-tag-2")))
>>
>>   int __typetag1 * x __decltag1 __decltag2;
>>
>> produces the following DIEs:
>>
>>  <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : x
>>     <21>   DW_AT_decl_file   : 1
>>     <22>   DW_AT_decl_line   : 6
>>     <23>   DW_AT_decl_column : 18
>>     <24>   DW_AT_type        : <0x49>
>>     <28>   DW_AT_external    : 1
>>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>     <32>   DW_AT_sibling     : <0x49>
>>  <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <37>   DW_AT_name        : (indirect string, offset: 0x10): btf_decl_tag
>>     <3b>   DW_AT_const_value : (indirect string, offset: 0x0): decl-tag-2
>>  <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <40>   DW_AT_name        : (indirect string, offset: 0x10): btf_decl_tag
>>     <44>   DW_AT_const_value : (indirect string, offset: 0x1d): decl-tag-1
>>  <2><48>: Abbrev Number: 0
>>  <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>     <4a>   DW_AT_byte_size   : 8
>>     <4b>   DW_AT_type        : <0x5d>
>>     <4f>   DW_AT_sibling     : <0x5d>
>>  <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <54>   DW_AT_name        : (indirect string, offset: 0x28): btf_type_tag
>>     <58>   DW_AT_const_value : (indirect string, offset: 0xd7): type-tag-1
>>  <2><5c>: Abbrev Number: 0
>>  <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>     <5e>   DW_AT_byte_size   : 4
>>     <5f>   DW_AT_encoding    : 5 (signed)
>>     <60>   DW_AT_name        : int
>>  <1><64>: Abbrev Number: 0
>>
>> Please note that currently, the annotation DWARF DIEs will be generated only if
>> BTF debug information is requested (via -gbtf). Therefore, the annotation DIEs
>> will only be output if both BTF and DWARF are requested (e.g. -gbtf -gdwarf).
>> This will change, since these tags are needed even when not generating BTF,
>> for example in a GCC-built Linux kernel.
>>
>> In the case of BTF, the annotations are recorded in two type kinds recently
>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>> The above example declaration produces the following BTF information:
>>
>>   [1] int 'int'(1U#B) size=4U#B offset=0UB#b bits=32UB#b SIGNED
>>   [2] ptr type=3
>>   [3] type_tag 'type-tag-1'(5U#B) type=1
>>   [4] decl_tag 'decl-tag-1'(18U#B) type=6 component_idx=-1
>>   [5] decl_tag 'decl-tag-2'(29U#B) type=6 component_idx=-1
>>   [6] var 'x'(16U#B) type=2 linkage=1 (global)
>>
>>
>> Current issues in the implementation
>> ====================================
>>
>> The __attribute__((btf_type_tag ("foo"))) syntax does not work correctly for
>> types involving multiple pointers.
>>
>> Consider the following example:
>>
>>   #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>   #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>   #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>
>>   int __typetag1 * __typetag2 __typetag3 * g;
>>
>> The current implementation produces the following DWARF:
>>
>>  <1><1e>: Abbrev Number: 4 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : g
>>     <21>   DW_AT_decl_file   : 1
>>     <22>   DW_AT_decl_line   : 6
>>     <23>   DW_AT_decl_column : 42
>>     <24>   DW_AT_type        : <0x32>
>>     <28>   DW_AT_external    : 1
>>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>  <1><32>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>     <33>   DW_AT_byte_size   : 8
>>     <33>   DW_AT_type        : <0x45>
>>     <37>   DW_AT_sibling     : <0x45>
>>  <2><3b>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <3c>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>>     <40>   DW_AT_const_value : (indirect string, offset: 0xc7): type-tag-1
>>  <2><44>: Abbrev Number: 0
>>  <1><45>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>     <46>   DW_AT_byte_size   : 8
>>     <46>   DW_AT_type        : <0x61>
>>     <4a>   DW_AT_sibling     : <0x61>
>>  <2><4e>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <4f>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>>     <53>   DW_AT_const_value : (indirect string, offset: 0xd): type-tag-3
>>  <2><57>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <58>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>>     <5c>   DW_AT_const_value : (indirect string, offset: 0xd2): type-tag-2
>>  <2><60>: Abbrev Number: 0
>>  <1><61>: Abbrev Number: 5 (DW_TAG_base_type)
>>     <62>   DW_AT_byte_size   : 4
>>     <63>   DW_AT_encoding    : 5 (signed)
>>     <64>   DW_AT_name        : int
>>  <1><68>: Abbrev Number: 0
>>
>> This does not agree with the DWARF produced by LLVM/clang for the same case:
>> (clang 15.0.0 git 142501117a78080d2615074d3986fa42aa6a0734)
>>
>>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : (indexed string: 0x3): g
>>     <20>   DW_AT_type        : <0x29>
>>     <24>   DW_AT_external    : 1
>>     <24>   DW_AT_decl_file   : 0
>>     <25>   DW_AT_decl_line   : 6
>>     <26>   DW_AT_location    : 2 byte block: a1 0 ((Unknown location op 0xa1))
>>  <1><29>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>     <2a>   DW_AT_type        : <0x35>
>>  <2><2e>: Abbrev Number: 4 (User TAG value: 0x6000)
>>     <2f>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>     <30>   DW_AT_const_value : (indexed string: 0x7): type-tag-2
>>  <2><31>: Abbrev Number: 4 (User TAG value: 0x6000)
>>     <32>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>     <33>   DW_AT_const_value : (indexed string: 0x8): type-tag-3
>>  <2><34>: Abbrev Number: 0
>>  <1><35>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>     <36>   DW_AT_type        : <0x3e>
>>  <2><3a>: Abbrev Number: 4 (User TAG value: 0x6000)
>>     <3b>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>     <3c>   DW_AT_const_value : (indexed string: 0x6): type-tag-1
>>  <2><3d>: Abbrev Number: 0
>>  <1><3e>: Abbrev Number: 5 (DW_TAG_base_type)
>>     <3f>   DW_AT_name        : (indexed string: 0x4): int
>>     <40>   DW_AT_encoding    : 5 (signed)
>>     <41>   DW_AT_byte_size   : 4
>>  <1><42>: Abbrev Number: 0
>>
>> Notice the structural difference. From the DWARF produced by GCC (i.e. this
>> patch series), variable 'g' is a pointer with tag 'type-tag-1' to a pointer
>> with tags 'type-tag-2' and 'type-tag-3' to an int. But from the LLVM DWARF,
>> variable 'g' is a pointer with tags 'type-tag-2' and 'type-tag-3' to a pointer
>> with tag 'type-tag-1' to an int.
>>
>> Because GCC produces BTF from the internal DWARF DIE tree, the BTF also differs.
>> This can be seen most obviously in the BTF type reference chains:
>>
>>   GCC
>>     VAR (g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>
>>   LLVM
>>     VAR (g) -> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>
>> It seems that the ultimate cause for this is the structure of the TREE
>> produced by the C frontend parsing and attribute handling. I believe this may
>> be due to differences in __attribute__ syntax parsing between GCC and LLVM.
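
As a small illustration of the difference between the two chains above, the
following sketch stores each chain as a list of labels and extracts the tags
that directly follow a given pointer level. The representation and the name
tags_at_level are invented for this example and have nothing to do with the
real BTF encoding; only the two chains themselves are taken from the text.

```c
#include <stdio.h>
#include <string.h>

/* The two chains from the text, outermost record first (VAR omitted). */
static const char *const gcc_chain[]  = { "ptr", "tag1", "ptr", "tag3", "tag2", "int" };
static const char *const llvm_chain[] = { "ptr", "tag3", "tag2", "ptr", "tag1", "int" };

/* Write the tags that directly follow the level-th "ptr" (0 = outermost)
   into buf as a space-separated list. Returns the number of tags found. */
int tags_at_level(const char *const *chain, size_t n, int level,
                  char *buf, size_t len)
{
    int seen = -1, count = 0;
    size_t used = 0;

    if (len > 0)
        buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (strcmp(chain[i], "ptr") == 0) {
            if (++seen > level)
                break;                  /* past the requested pointer level */
            continue;
        }
        if (seen == level && strncmp(chain[i], "tag", 3) == 0) {
            int w = snprintf(buf + used, len - used, "%s%s",
                             count ? " " : "", chain[i]);
            if (w < 0 || (size_t)w >= len - used)
                break;                  /* buffer too small; stop early */
            used += (size_t)w;
            count++;
        }
    }
    return count;
}
```

Asking for level 0 yields "tag1" for the GCC chain and "tag3 tag2" for the
LLVM chain, which is the mismatch described above.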
>>
>> This is the TREE for variable 'g':
>>   int __typetag1 * __typetag2 __typetag3 * g;
>>
>>   > type > type
>>   asm_written unsigned DI
>>   size
>>   unit-size
>>   align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff7450888
>>   attributes > purpose
>>   value > value
>>   readonly constant static "type-tag-3\000">>
>>   chain
>>   value > value
>>   readonly constant static "type-tag-2\000">>>>
>>   pointer_to_this >
>>   asm_written unsigned DI size unit-size
>>   align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff7509930
>>   attributes
>>   value > value
>>   readonly constant static "type-tag-1\000">>>>
>>   public static unsigned DI defer-output /home/dfaust/playpen/btf/annotate.c:29:42 size unit-size
>>   align:64 warn_if_not_align:0>
>>
>> To me this is surprising. I would have expected the int** type of "g" to have
>> the tags 'type-tag-2' and 'type-tag-3', and the inner (int*) pointer type to
>> have the 'type-tag-1' tag. So far my attempts at resolving this difference in
>> the new attribute handlers for the tag attributes have not been successful.
>>
>> I do not understand exactly why the attributes are attached in this way. I think
>> that it may be related to the pointer cases discussed in the "All other
>> attributes" section here:
>>
>>   https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>>
>> In particular it seems similar to this example:
>>
>>   char *__attribute__((aligned(8))) *f;
>>
>>   specifies the type “pointer to 8-byte-aligned pointer to char”. Note again
>>   that this does not work with most attributes; for example, the usage of
>>   ‘aligned’ and ‘noreturn’ attributes given above is not yet supported.
>>
>> I am not sure if this section of the documentation is outdated, if scenarios
>> like this one have not been an issue before now, or if there is a way to
>> resolve this within the attribute handler. I am by no means an expert in the C
>> frontend nor attribute handling; if someone with more knowledge could help me
>> understand this case I would be very grateful. :)
>>
>> Questions for GCC
>> =================
>>
>> 1)  How can this issue with the type tags be resolved? Is this a bug or
>>     limitation in the attribute parsing that hasn't been an issue until now?
>>     Or is it that the above case is somehow a "weird" usage of attributes?
>>
>> 2)  Are attributes the right tool for this? Is there some other mechanism that
>>     would better fit the design of these tags? In some ways the type tags seem
>>     more similar to const/volatile/restrict qualifiers than to most other
>>     attributes.
>>
>>
>> Questions for LLVM / kernel BPF
>> ===============================
>>
>> 1)  What special handling does the LLVM frontend/clang do for these attributes?
>>     Is there anything specific? Or does it simply follow whatever is default?
>
> the llvm frontend/clang only processed these attributes and encoded them
> in AST, and then only these attributes are encoded in debuginfo.
> For btf_type_tag, only tags to pointee (like int __tag * __tag * var)
> are encoded in debuginfo.

OK. So the btf_type_tag attribute can be applied to non-pointer types, but it
will be effectively ignored.

>
>>
>> 2)  What is the correct BTF representation for type tags? The documentation for
>>     BTF_KIND_TYPE_TAG in linux/Documentation/bpf/btf.rst seems to conflict with
>>     the output of clang, and the format change that was discussed here:
>>     https://reviews.llvm.org/D113496
>>     I assume the kernel btf.rst might simply be outdated, but I want to be sure.
>
> Yes, they should be the same.
> The document in linux/Documentation/bpf/btf.rst:
>
>    ptr -> [type_tag]*
>        -> [const | volatile | restrict | typedef]*
>        -> base_type
>
> is related to BTF format, which is correct.
>
> What is not specified is how the following format is
> converted to C code, which is also specified in
> https://reviews.llvm.org/D113496.
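
For reference, the chain shape quoted from btf.rst can be written down as a
small checker. This is only a sketch of that one documented shape: the enum
and the function name are hypothetical (they are not the kernel's BTF_KIND_*
constants or any libbpf API), and allowing a further pointer where base_type
would stand is an assumption made here to cover multi-level pointers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kinds, for illustration only. */
enum kind { K_PTR, K_TYPE_TAG, K_CONST, K_VOLATILE, K_RESTRICT,
            K_TYPEDEF, K_BASE };

static bool is_modifier(enum kind k)
{
    return k == K_CONST || k == K_VOLATILE || k == K_RESTRICT ||
           k == K_TYPEDEF;
}

/* Return true if chain[0..n) has the shape
     ptr -> [type_tag]* -> [modifier]* -> base_type
   where the pointee may itself be another such pointer chain. */
bool chain_matches_doc(const enum kind *chain, size_t n)
{
    size_t i = 0;

    while (i < n && chain[i] == K_PTR) {
        i++;                                      /* the pointer itself */
        while (i < n && chain[i] == K_TYPE_TAG)   /* zero or more type tags */
            i++;
        while (i < n && is_modifier(chain[i]))    /* then zero or more modifiers */
            i++;
    }
    /* The chain must start with a pointer and end in exactly one base type. */
    return n >= 2 && chain[0] == K_PTR && i == n - 1 && chain[i] == K_BASE;
}
```

With this check, ptr -> type_tag -> const -> base is accepted, while
ptr -> const -> type_tag -> base is rejected because a tag follows a modifier.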
>
>>
>> 3)  Is the ordering of multiple type tags on the same type important?
>>     e.g. for this variable:
>>       int __tag1 __tag2 __tag3 * b;
>>
>>     would it be "correct" (or at least, acceptable) to produce:
>>       VAR(b) -> ptr -> tag2 -> tag3 -> tag1 -> int
>>
>>     or _must_ it be:
>>       VAR(b) -> ptr -> tag3 -> tag2 -> tag1 -> int
>>
>>     In the DWARF representation, all tags are equal sibling children of the type
>>     they annotate, so this 'ordering' problem seems like it only arises because of
>>     the BTF format for type tags.
>
> No. They are all independent modifiers to the pointee. So any ordering
> in the above should be correct.

OK, thanks.

>
>>
>> 4)  Are types with the same tags in different orders considered distinct types?
>>     I think the answer is "no", but given that the format of the tags in BTF
>>     yields distinct chains for the types, I am curious.
>>     e.g.
>>       int __tag1 __tag2 * x;
>>       int __tag2 __tag1 * y;
>>
>>     produces
>>       VAR(x) -> ptr -> tag2 -> tag1 -> int
>>       VAR(y) -> ptr -> tag1 -> tag2 -> int
>>
>>     but would
>>       VAR(y) -> ptr -> tag2 -> tag1 -> int
>>
>>     be just as correct?
>
> Yes,
>    VAR(y) -> ptr -> tag2 -> tag1 should be correct
> although the original order is preferred since
> when we generate vmlinux.h we would like the
> type definition to be as close to the original type
> definition as possible.

I see. Different orderings are technically correct but there is one preferred
format. Thanks for the clarification.

>
>>
>> 5)  According to the clang docs, type tags are currently ignored for non-pointer
>>     types. Is pointer tagging e.g. '__user' the only use case so far?
>>
>>     This GCC implementation allows type tags on non-pointer types. Such tags
>>     can be represented in the DWARF but don't make much sense in BTF output,
>>     e.g.
>>
>>       struct __typetag1 S {
>>         int a;
>>         int b;
>>       } __decltag1;
>>
>>       struct S my_s;
>>
>>     This will produce a type tag child DIE of S. In the current implementation,
>>     it will also produce a BTF type tag type, which refers to the __decltag1 BTF
>>     decl tag, which in turn refers to the struct type. But nothing refers to
>>     the type tag type; currently, variable my_s in BTF refers to the struct type
>>     directly.
>>
>>     In my opinion, the DWARF here is useful but the BTF seems odd. What would be
>>     "correct" BTF in such a case?
>
> Currently in llvm, __typetag1 will not be encoded in dwarf.

OK. Related to 1, type tags on non-pointer types have no effect on debug info
generation. Got it.

>
>>
>> 6)  Would LLVM be open to changing the name of the attribute, for example to
>>     'debug_info_annotate' (or any other suggestion)? The use cases for these
>>     tags have grown (e.g. drgn) since they were originally proposed, and the
>>     scope is no longer limited to BTF.
>>
>>     The kernel eBPF developers have said they can accommodate whatever name we
>>     would like to use. So although we in GCC are not tied to the name LLVM
>>     uses, it would be ideal for everyone to use the same attribute name.
>
> The attribute name, esp. 'btf_type_tag' has been used in the linux
> kernel. So it would be great if we can use the same name.
> Not sure whether gcc supports it or not, maybe it has attribute aliases?
> clang doesn't support it though.
>

OK. Will certainly keep this in mind. As I understand it, the kernel could
accommodate (via compiler.h) if we settle on a different name for GCC, but I
agree using the same name between both compilers would be ideal.

Thanks

>
> [...]