From: Yonghong Song
To: "Jose E. Marchesi"
Cc: David Faust, gcc-patches@gcc.gnu.org
Date: Tue, 12 Jul 2022 21:23:36 -0700
Subject: Re: kernel sparse annotations vs. compiler attributes and debug_annotate_{type, decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
Message-ID: <94288e98-b6b8-b4b7-27a4-572f6150c691@fb.com>
In-Reply-To: <87let4isc8.fsf@oracle.com>
References: <20220607214342.19463-1-david.faust@oracle.com> <2ab1d9a1-0077-a1e7-f212-556fcf8c8883@fb.com> <9bd41e20-5c39-0d35-bd6e-c10c65280da7@oracle.com> <52dcfdb6-f1b9-1986-5d10-8d6ac8c6d256@fb.com> <874k0jfbu0.fsf_-_@oracle.com> <87edziknc1.fsf@oracle.com> <0c104e7b-1873-c141-37b9-71444f585793@fb.com> <87let4isc8.fsf@oracle.com>
List-Id: Gcc-patches mailing list

On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
>
> Hi Yonghong.
>
>> On 6/21/22 9:12 AM, Jose E.
Marchesi wrote:
>>>
>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>> Hi Yonghong.
>>>>>
>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>
>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> This patch series adds support for:
>>>>>>>>>
>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>   to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>   explained below, this is intended to be used to, for example, characterize
>>>>>>>>>   certain pointer types.
>>>>>>>>>
>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>   DIE: DW_TAG_GNU_annotation.
>>>>>>>>>
>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>
>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>
>>>>>>>>> Purpose
>>>>>>>>> =======
>>>>>>>>>
>>>>>>>>> 1) Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>    tags on certain language elements, such as struct fields.
>>>>>>>>>
>>>>>>>>>    The purpose of these annotations is to provide additional information about
>>>>>>>>>    types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>    driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>    programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>
>>>>>>>>>    For example, consider the linux kernel function do_execve with the
>>>>>>>>>    following declaration:
>>>>>>>>>
>>>>>>>>>      static int do_execve(struct filename *filename,
>>>>>>>>>         const char __user *const __user *__argv,
>>>>>>>>>         const char __user *const __user *__envp);
>>>>>>>>>
>>>>>>>>>    Here, __user could be defined with these annotations to record semantic
>>>>>>>>>    information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>    DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>    can read the tags and make use of the information.
>>>>>>>>>
>>>>>>>>> 2) Conveying the tags in the generated DWARF debug info.
>>>>>>>>>
>>>>>>>>>    The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>    generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>
>>>>>>>>>        +--------+  BTF                  BTF   +----------+
>>>>>>>>>        | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>        +--------+                             +----------+
>>>>>>>>>            ^                                        ^
>>>>>>>>>            |                                        |
>>>>>>>>>      DWARF |                                    BTF |
>>>>>>>>>            |                                        |
>>>>>>>>>         vmlinux                              +-------------+
>>>>>>>>>         module1.ko                           | BPF program |
>>>>>>>>>         module2.ko                           +-------------+
>>>>>>>>>           ...
>>>>>>>>>
>>>>>>>>>    This is because:
>>>>>>>>>
>>>>>>>>>    a) Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>
>>>>>>>>>    b) GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>       support for linking/deduplicating BTF in the linker.
>>>>>>>>>
>>>>>>>>>    In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>    both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>    to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>
>>>>>>>>>    Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>    BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>    to benefit from these tags in order to differentiate between different
>>>>>>>>>    kinds of pointers in the kernel.
>>>>>>>>>
>>>>>>>>> 3) Conveying the tags in the generated BTF debug info.
>>>>>>>>>
>>>>>>>>>    This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>    compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>    of pointers used by the eBPF programs.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>
>>>>>>>>>   https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>   https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>   https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Implementation Overview
>>>>>>>>> =======================
>>>>>>>>>
>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>
>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>
>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>
>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>
>>>>>>>>>   #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>
>>>>>>>>>   #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>   #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>
>>>>>>>>>   int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>
>>>>>>>> Based on the above example
>>>>>>>>   static int do_execve(struct filename *filename,
>>>>>>>>      const char __user *const __user *__argv,
>>>>>>>>      const char __user *const __user *__envp);
>>>>>>>>
>>>>>>>> Should the above example be the below?
>>>>>>>>   int __typetag1 * x __decltag1 __decltag2
>>>>>>>>
>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>> that clear.
>>>>>>
>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>
>>>>>>>>>  <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>     <1f>   DW_AT_name        : x
>>>>>>>>>     <21>   DW_AT_decl_file   : 1
>>>>>>>>>     <22>   DW_AT_decl_line   : 7
>>>>>>>>>     <23>   DW_AT_decl_column : 18
>>>>>>>>>     <24>   DW_AT_type        : <0x49>
>>>>>>>>>     <28>   DW_AT_external    : 1
>>>>>>>>>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>>>>>>>>     <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>  <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>     <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>     <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>  <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>     <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>     <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>  <2><48>: Abbrev Number: 0
>>>>>>>>>  <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>     <4a>   DW_AT_byte_size   : 8
>>>>>>>>>     <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>     <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>  <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>     <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>     <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>  <2><5c>: Abbrev Number: 0
>>>>>>>>>  <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>     <5e>   DW_AT_byte_size   : 4
>>>>>>>>>     <5f>   DW_AT_encoding    : 5 (signed)
>>>>>>>>>     <60>   DW_AT_name        : int
>>>>>>>>>  <1><64>: Abbrev Number: 0
>>>>>>
>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>> show the related info in .debug_info section which seems more useful to
>>>>>> understand the relationships between different tags. Maybe this is due
>>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>>> <2><53> etc.
>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>> expanded...
>>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>    https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>    C compiler, and therefore,
>>>>> 2) Not quite the same as compiler attributes (despite the way they
>>>>>    look.) In particular, they seem to assume an ordering different than
>>>>>    that of GNU attributes: in some cases given the same written order, they
>>>>>    refer to different things! Which is quite unfortunate :(
>>>>
>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>> attribute) are processed by macros.
>>>>
>>>>> Now, if I understood properly, you plan to change the definition of
>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>> compiler attributes, correct?
>>>>
>>>> Right. The original __user definition looks like:
>>>>    # define __user __attribute__((noderef, address_space(__user)))
>>>>
>>>> The new attribute looks like
>>>>    # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>    # define __user BTF_TYPE_TAG(user)
>>> Ok I see. So the kernel will stop using sparse attributes to implement
>>> __user and __kernel and start using compiler attributes for tags
>>> instead.
>>>
>>>>> Is that the reason why LLVM implements what we assume to be the sparse
>>>>> ordering, and not the correct GNU attributes ordering, for the tag
>>>>> attributes?
>>>>
>>>> Note that __user attributes apply to pointees and not pointers.
>>>> Just like
>>>>    const int *p;
>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>
>>>> What current llvm dwarf generation with
>>>>    pointer
>>>>      <--- btf_type_tag
>>>> is just ONE implementation. As I said earlier, I am okay to
>>>> have dwarf implementation like
>>>>    p->btf_type_tag->const->int.
>>>> If you can propose an implementation like this in dwarf, I can propose
>>>> to change implementation in llvm.
>>> I think we are miscommunicating.
>>> Looks like there is a divergence on what attributes apply to what
>>> language entities between the sparse compiler and GCC/LLVM. How to
>>> represent that in DWARF is a different matter.
>>> For this example:
>>>    int __typetag1 * __typetag2 __typetag3 * g;
>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>> b) LLVM associates __typetag1 to pointer-to-int.
>>> Where:
>>> a) Is the expected behavior of compiler attributes, as documented in
>>>    the GCC manual.
>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>    ordering expected for a compiler GNU attribute.
>>> So, if the kernel source __user and __kernel annotations (which
>>> currently expand to sparse attributes) follow the sparse ordering, and
>>> you want to implement __user and __kernel in terms of compiler
>>> attributes instead (the annotation attributes) then you will have to:
>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>> 2) fix the kernel sources to use that ordering
>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>> have in the kernel and also implemented in LLVM with a weird ordering:
>>> the address_space attribute.]
>>> For 2), it may be possible to write a coccinelle script to generate the
>>> patch...
>>
>> I don't think (2) (to change kernel source for different attr ordering)
>> will work. So the only thing we can do is in compiler/pahole except
>> macro replacement in kernel.
>
> I looked at sparse and its parser. Wanted to be sure the ordering it
> uses to interpret sparse annotations (such as address_space, alignment,
> etc) is definitely _not_ the same ordering used by __attribute__ in C
> compilers.
>
> It is very different indeed and the same can be said about how sparse
> interprets other modifiers like `const': in sparse both `int const *foo'
> and `int *const foo' parse to a constant pointer to int, for example.
>
> I am not to judge how sparse handles its annotations. It may be very
> well and pertinent for its particular purpose.
>
> But I am not sure if it is reasonable to expect C compilers to implement
> certain type __attributes__ to parse differently, just because it
> happens these attributes are reused from sparse annotations in a
> particular program (in this case the kernel.) The debug_annotate_decl
> and debug_annotate_type attributes are not even intended to be
> kernel-specific.
>
> So, if changing the kernel sources is not an option (why btw, other than
> being a PITA?) at this point I really don't know what else to suggest :/
>
> Any suggestion from the front-end people?

Just want to understand the overall picture. So gcc can still emit
BTF properly with btf_type_tag, right? The issue we are talking about
here is only about the dwarf, right? If this is the case, we might
have a partial solution here:
  - gcc emits BTF for vmlinux
  - gcc emits dwarf for vmlinux, ignoring btf_type_tag
  - in pahole, vmlinux BTF is amended with some additional misc things.

There are some use cases for having btf_type_tag in dwarf, but those can
be worked around with BTF + dwarf, both of which are generated by the
compiler. Not elegant, but it probably works.

>
>>> Does this make sense?
>>>
>>>>> If that is so, we have quite a problem here: I don't think we can change
>>>>> the way GCC handles GNU-like attributes just because the kernel sources
>>>>> want to hook on these __user/__kernel sparse annotations to generate the
>>>>> compiler tags, even if we could mayhaps get GCC to handle
>>>>> debug_annotate_type and debug_annotate_decl differently. Some would say
>>>>> doing so would perpetuate the mistake instead of fixing it...
>>>>> Is my understanding correct?
>>>>
>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>> Does this help?
>>>>
>>>>>
>>>>>>>>
>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>
>>>>>>>>>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>>   [2] PTR '(anon)' type_id=3
>>>>>>>>>   [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>>   [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>>   [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>>   [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>>   [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>>       type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>
>>>>>>>>>
>>>>>>>> [...]
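
For reference, the macro replacement quoted above (__user defined through
btf_type_tag) and the pointee placement can be sketched roughly as below.
This is only an illustration under stated assumptions, not the kernel's
actual header: TYPE_TAG, TAG_USER and first_byte are made-up names, and the
attribute is used only when the compiler reports support for it, so with
other compilers the tag simply expands to nothing.

```c
/* Sketch only: TYPE_TAG, TAG_USER and first_byte are hypothetical names. */
#include <stddef.h>

/* Use btf_type_tag only where the compiler says it knows the attribute;
   otherwise the macro expands to nothing and the code still builds. */
#if defined(__has_attribute)
# if __has_attribute(btf_type_tag)
#  define TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
# endif
#endif
#ifndef TYPE_TAG
# define TYPE_TAG(value)
#endif

/* Stand-in for the kernel's __user (a different name is used here to stay
   out of the reserved identifier space). */
#define TAG_USER TYPE_TAG(user)

/* The tag is written before the '*', the same position where the kernel
   writes __user in 'const char __user *p'. It only affects debug info. */
int first_byte(const char TAG_USER *p)
{
    return p ? (unsigned char)p[0] : -1;
}
```

The tag has no effect on code generation, so first_byte() behaves the same
whether or not the attribute is available; the only difference would be in
the emitted debug info.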