From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <83fef618-c18c-aeb7-ada9-503deff9aa95@fb.com>
Date: Thu, 14 Jul 2022 18:20:37 -0700
Subject: Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
To: "Jose E. Marchesi"
Cc: David Faust, gcc-patches@gcc.gnu.org
References: <20220607214342.19463-1-david.faust@oracle.com>
 <2ab1d9a1-0077-a1e7-f212-556fcf8c8883@fb.com>
 <9bd41e20-5c39-0d35-bd6e-c10c65280da7@oracle.com>
 <52dcfdb6-f1b9-1986-5d10-8d6ac8c6d256@fb.com>
 <874k0jfbu0.fsf_-_@oracle.com>
 <87edziknc1.fsf@oracle.com>
 <0c104e7b-1873-c141-37b9-71444f585793@fb.com>
 <87let4isc8.fsf@oracle.com>
 <94288e98-b6b8-b4b7-27a4-572f6150c691@fb.com>
 <87r12ng293.fsf@oracle.com>
From: Yonghong Song
In-Reply-To: <87r12ng293.fsf@oracle.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
List-Id: Gcc-patches mailing list

On 7/14/22 8:09 AM, Jose E.
Marchesi wrote:
>
> Hi Yonghong.
>
>> On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
>>> Hi Yonghong.
>>>
>>>> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>>>>>
>>>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>>>> Hi Yonghong.
>>>>>>>
>>>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>>>
>>>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> This patch series adds support for:
>>>>>>>>>>>
>>>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>>>   to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>>>   explained below, this is intended to be used to, for example, characterize
>>>>>>>>>>>   certain pointer types.
>>>>>>>>>>>
>>>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>>>   DIE: DW_TAG_GNU_annotation.
>>>>>>>>>>>
>>>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>>>   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>
>>>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>>>
>>>>>>>>>>> Purpose
>>>>>>>>>>> =======
>>>>>>>>>>>
>>>>>>>>>>> 1) Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>>>    tags on certain language elements, such as struct fields.
>>>>>>>>>>>
>>>>>>>>>>>    The purpose of these annotations is to provide additional information about
>>>>>>>>>>>    types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>>>    driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>>>    programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>>>
>>>>>>>>>>>    For example, consider the linux kernel function do_execve with the
>>>>>>>>>>>    following declaration:
>>>>>>>>>>>
>>>>>>>>>>>      static int do_execve(struct filename *filename,
>>>>>>>>>>>         const char __user *const __user *__argv,
>>>>>>>>>>>         const char __user *const __user *__envp);
>>>>>>>>>>>
>>>>>>>>>>>    Here, __user could be defined with these annotations to record semantic
>>>>>>>>>>>    information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>>>    DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>>>    can read the tags and make use of the information.
>>>>>>>>>>>
>>>>>>>>>>> 2) Conveying the tags in the generated DWARF debug info.
>>>>>>>>>>>
>>>>>>>>>>>    The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>>>    generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>>>
>>>>>>>>>>>        +--------+  BTF                  BTF   +----------+
>>>>>>>>>>>        | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>>>        +--------+                             +----------+
>>>>>>>>>>>            ^                                        ^
>>>>>>>>>>>            |                                        |
>>>>>>>>>>>      DWARF |                                    BTF |
>>>>>>>>>>>            |                                        |
>>>>>>>>>>>         vmlinux                              +-------------+
>>>>>>>>>>>         module1.ko                           | BPF program |
>>>>>>>>>>>         module2.ko                           +-------------+
>>>>>>>>>>>           ...
>>>>>>>>>>>
>>>>>>>>>>>    This is because:
>>>>>>>>>>>
>>>>>>>>>>>    a) Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>>>
>>>>>>>>>>>    b) GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>>>       support for linking/deduplicating BTF in the linker.
>>>>>>>>>>>
>>>>>>>>>>>    In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>>>    both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>>>    to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>>>
>>>>>>>>>>>    Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>>>    BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>>>    to benefit from these tags in order to differentiate between different
>>>>>>>>>>>    kinds of pointers in the kernel.
>>>>>>>>>>>
>>>>>>>>>>> 3) Conveying the tags in the generated BTF debug info.
>>>>>>>>>>>
>>>>>>>>>>>    This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>>>    compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>>>    of pointers used by the eBPF programs.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>>>
>>>>>>>>>>>   https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>>>   https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>>>   https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Implementation Overview
>>>>>>>>>>> =======================
>>>>>>>>>>>
>>>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>>>
>>>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>>>
>>>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>>>
>>>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>>>
>>>>>>>>>>>   #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>>>
>>>>>>>>>>>   #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>>>   #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>>>
>>>>>>>>>>>   int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>>>
>>>>>>>>>> Based on the above example
>>>>>>>>>>        static int do_execve(struct filename *filename,
>>>>>>>>>>          const char __user *const __user *__argv,
>>>>>>>>>>          const char __user *const __user *__envp);
>>>>>>>>>>
>>>>>>>>>> Should the above example be the below?
>>>>>>>>>>     int __typetag1 * x __decltag1 __decltag2
>>>>>>>>>>
>>>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>>>> that clear.
>>>>>>>>
>>>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>>>
>>>>>>>>>>>  <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>>>     <1f>   DW_AT_name        : x
>>>>>>>>>>>     <21>   DW_AT_decl_file   : 1
>>>>>>>>>>>     <22>   DW_AT_decl_line   : 7
>>>>>>>>>>>     <23>   DW_AT_decl_column : 18
>>>>>>>>>>>     <24>   DW_AT_type        : <0x49>
>>>>>>>>>>>     <28>   DW_AT_external    : 1
>>>>>>>>>>>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>>>>>>>>>>     <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>>>  <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>     <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>     <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>>>  <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>     <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>     <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>>>  <2><48>: Abbrev Number: 0
>>>>>>>>>>>  <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>>>     <4a>   DW_AT_byte_size   : 8
>>>>>>>>>>>     <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>>>     <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>>>  <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>     <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>>>     <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>>>  <2><5c>: Abbrev Number: 0
>>>>>>>>>>>  <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>>>     <5e>   DW_AT_byte_size   : 4
>>>>>>>>>>>     <5f>   DW_AT_encoding    : 5 (signed)
>>>>>>>>>>>     <60>   DW_AT_name        : int
>>>>>>>>>>>  <1><64>: Abbrev Number: 0
>>>>>>>>
>>>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>>>> show the related info in .debug_info section which seems more useful to
>>>>>>>> understand the relationships between different tags. Maybe this is due
>>>>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>>>>> <2><53> etc.
>>>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>>>> expanded...
>>>>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>>>    https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>>>    C compiler, and therefore,
>>>>>>> 2) Not quite the same than compiler attributes (despite the way they
>>>>>>>    look.) In particular, they seem to assume an ordering different than
>>>>>>>    of GNU attributes: in some cases given the same written order, they
>>>>>>>    refer to different things!. Which is quite unfortunate :(
>>>>>>
>>>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>>>> attribute) are processed by macros.
>>>>>>
>>>>>>> Now, if I understood properly, you plan to change the definition of
>>>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>>>> compiler attributes, correct?
>>>>>>
>>>>>> Right. The original __user definition looks like:
>>>>>>     # define __user          __attribute__((noderef, address_space(__user)))
>>>>>>
>>>>>> The new attribute looks like
>>>>>>     # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>>>     # define __user  BTF_TYPE_TAG(user)
>>>>> Ok I see. So the kernel will stop using sparse attributes to implement
>>>>> __user and __kernel and start using compiler attributes for tags
>>>>> instead.
>>>>>
>>>>>>> Is that the reason why LLVM implements what we assume to be the sparse
>>>>>>> ordering, and not the correct GNU attributes ordering, for the tag
>>>>>>> attributes?
>>>>>>
>>>>>> Note that __user attributes apply to pointees and not pointers.
>>>>>> Just like
>>>>>>     const int *p;
>>>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>>>
>>>>>> What current llvm dwarf generation with
>>>>>>     pointer
>>>>>>      <--- btf_type_tag
>>>>>> is just ONE implementation. As I said earlier, I am okay to
>>>>>> have dwarf implementation like
>>>>>>     p->btf_type_tag->const->int.
>>>>>> If you can propose an implementation like this in dwarf. I can propose
>>>>>> to change implementation in llvm.
>>>>> I think we are miscommunicating.
>>>>> Looks like there is a divergence on what attributes apply to what
>>>>> language entities between the sparse compiler and GCC/LLVM. How to
>>>>> represent that in DWARF is a different matter.
>>>>> For this example:
>>>>>     int __typetag1 * __typetag2 __typetag3 * g;
>>>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>>>> b) LLVM associates __typetag1 to pointer-to-int.
>>>>> Where:
>>>>> a) Is the expected behavior of a compiler attributes, as documented in
>>>>>    the GCC manual.
>>>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>>>    ordering expected for a compiler GNU attribute.
>>>>> So, if the kernel source __user and __kernel annotations (which
>>>>> currently expand to sparse attributes) follow the sparse ordering, and
>>>>> you want to implement __user and __kernel in terms of compiler
>>>>> attributes instead (the annotation attributes) then you will have to:
>>>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>>>> 2) fix the kernel sources to use that ordering
>>>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>>>>  have in the kernel and also implemented in LLVM with a weird ordering:
>>>>>  the address_space attribute.]
>>>>> For 2), it may be possible to write a coccinelle script to generate the
>>>>> patch...
>>>>
>>>> I don't think (2) (to change kernel source for different attr ordering)
>>>> will work. So the only thing we can do is in compiler/pahole except
>>>> macro replacement in kernel.
>>> I looked at sparse and its parser. Wanted to be sure the ordering it
>>> uses to interpret sparse annotations (such as address_space, alignment,
>>> etc) is definitely _not_ the same ordering used by __attribute__ in C
>>> compilers.
>>> It is very different indeed and the same can be said about how sparse
>>> interprets other modifiers like `const': in sparse both `int const *foo'
>>> and `int *const foo' parse to a constant pointer to int, for example.
>>> I am not to judge how sparse handles its annotations. It may be very
>>> well and pertinent for its particular purpose.
>>> But I am not sure if it is reasonable to expect C compilers to implement
>>> certain type __attributes__ to parse differently, just because it
>>> happens these attributes are reused from sparse annotations in a
>>> particular program (in this case the kernel.) The debug_annotate_decl
>>> and debug_annotate_type attributes are not even intended to be
>>> kernel-specific.
>>> So, if changing the kernel sources is not an option (why btw, other than
>>> being a PITA?) at this point I really don't know what else to suggest :/
>>> Any suggestion from the front-end people?
>>
>> Just want to understand the overall picture. So gcc can still emit
>> BTF properly with btf_type_tag right? The issue we are talking about
>> here is about the dwarf, right?
>
> If by "properly" you mean how sparse handles its annotations, then not
> really.
>
> The issue we are talking about is rather a language-level one: to what
> entity/type the compiler attribute applies.
>
> So, for:
>
>   int __attribute__((debug_annotate_decl("user"))) *foo;
>
> GCC will apply the attribute to the int type, following the rules for
> type attributes (sparse would apply the annotation to the *int type
> instead). The emitted debug info (be it DWARF or BTF) will reflect
> that, no more no less :/

I don't know what 'apply the attribute to the int' means.
In the current clang implementation it means the following dwarf chain,
from right to left:
    variable 'foo'
       type: ptr
          base type: attr_type: attr
             underlying type: int

So the type chain is
    foo -> ptr -> attr -> int

>
>> If this is the case, we might have
>> a partial solution here.
>>   - gcc emits BTF for vmlinux
>
> Note that for emitting BTF for vmlinux we would need support in the
> linker to merge and deduplicate BTF, which at the moment we don't have.

This should be okay. pahole will merge and deduplicate btf.
In pahole '-j' mode, each thread will convert each .o file dwarf to btf,
and then pahole will merge and deduplicate btf.

>
>>   - gcc emits dwarf for vmlinux ignoring btf_type_tag
>>   - in pahole, vmlinux BTF is amended with some additional misc things.
>> Although there are some use cases to have btf_type_tag in dwarf, but
>> that can be worked around with BTF + dwarf both of which are generated
>> by the compiler. Not elegant, but probably works.
>>>
>>>>> Does this make sense?
>>>>>
>>>>>>> If that is so, we have quite a problem here: I don't think we can change
>>>>>>> the way GCC handles GNU-like attributes just because the kernel sources
>>>>>>> want to hook on these __user/__kernel sparse annotations to generate the
>>>>>>> compiler tags, even if we could mayhaps get GCC to handle
>>>>>>> debug_annotate_type and debug_annotate_decl differently. Some would say
>>>>>>> doing so would perpetuate the mistake instead of fixing it...
>>>>>>> Is my understanding correct?
>>>>>>
>>>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>>>> Does this help?
>>>>>>
>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>>>
>>>>>>>>>>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>>>> [2] PTR '(anon)' type_id=3
>>>>>>>>>>> [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>>>> [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>>>> [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>>>> [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>>>> [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>>>>         type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> [...]
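
As a side note on the 'const' analogy quoted earlier in this thread
('const int *p' qualifies the pointee, not 'p'), here is a minimal sketch
that makes the two placements visible to a C11 compiler. CONST_LANDS_ON,
const_analogy_holds and check_const_analogy are made-up names for this
illustration only; they are not part of the patch series, the kernel, or
any compiler. The sketch covers only the standard 'const' qualifier and
does not claim anything about where GCC or clang attach
debug_annotate_type or btf_type_tag.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper for this sketch: reports which object a 'const'
 * qualifier landed on.  The address of the variable is taken because
 * _Generic drops top-level qualifiers from its controlling expression. */
#define CONST_LANDS_ON(var) _Generic(&(var), \
        const int **: "pointee",             \
        int *const *: "pointer",             \
        default:      "neither")

const int *p;        /* 'const' written before the '*' */
int *const q = 0;    /* 'const' written after the '*'  */
int *r;              /* no qualifier at all            */

/* Returns 1 when the three declarations classify as the comments say. */
int const_analogy_holds(void)
{
    return strcmp(CONST_LANDS_ON(p), "pointee") == 0
        && strcmp(CONST_LANDS_ON(q), "pointer") == 0
        && strcmp(CONST_LANDS_ON(r), "neither") == 0;
}

/* Aborts via assert() if the classification above does not hold. */
void check_const_analogy(void)
{
    assert(const_analogy_holds());
}
```

Calling check_const_analogy() from a small test program should return
silently with a C11 compiler. Whether a tag attribute written in the same
position follows the same rule is exactly the point under discussion, so
this is only the baseline for the comparison.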