From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jose E. Marchesi"
To: Yonghong Song
Cc: David Faust, gcc-patches@gcc.gnu.org
Subject: Re: kernel sparse annotations vs. compiler attributes and
 debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes
References: <20220607214342.19463-1-david.faust@oracle.com>
 <2ab1d9a1-0077-a1e7-f212-556fcf8c8883@fb.com>
 <9bd41e20-5c39-0d35-bd6e-c10c65280da7@oracle.com>
 <52dcfdb6-f1b9-1986-5d10-8d6ac8c6d256@fb.com>
 <874k0jfbu0.fsf_-_@oracle.com>
 <87edziknc1.fsf@oracle.com>
 <0c104e7b-1873-c141-37b9-71444f585793@fb.com>
 <87let4isc8.fsf@oracle.com>
 <94288e98-b6b8-b4b7-27a4-572f6150c691@fb.com>
 <87r12ng293.fsf@oracle.com>
 <83fef618-c18c-aeb7-ada9-503deff9aa95@fb.com>
Date: Fri, 15 Jul 2022 16:17:29 +0200
In-Reply-To: <83fef618-c18c-aeb7-ada9-503deff9aa95@fb.com> (Yonghong Song's
 message of "Thu, 14 Jul 2022 18:20:37 -0700")
Message-ID: <875yjya29y.fsf@oracle.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Content-Type: text/plain
MIME-Version: 1.0
List-Id: Gcc-patches mailing list

> On 7/14/22 8:09 AM, Jose E. Marchesi wrote:
>> Hi Yonghong.
>>
>>> On 7/7/22 1:24 PM, Jose E. Marchesi wrote:
>>>> Hi Yonghong.
>>>>
>>>>> On 6/21/22 9:12 AM, Jose E. Marchesi wrote:
>>>>>>
>>>>>>> On 6/17/22 10:18 AM, Jose E. Marchesi wrote:
>>>>>>>> Hi Yonghong.
>>>>>>>>
>>>>>>>>> On 6/15/22 1:57 PM, David Faust wrote:
>>>>>>>>>>
>>>>>>>>>> On 6/14/22 22:53, Yonghong Song wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/7/22 2:43 PM, David Faust wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> This patch series adds support for:
>>>>>>>>>>>>
>>>>>>>>>>>> - Two new C-language-level attributes that allow to associate (to "annotate" or
>>>>>>>>>>>>   to "tag") particular declarations and types with arbitrary strings. As
>>>>>>>>>>>>   explained below, this is intended to be used to, for example, characterize
>>>>>>>>>>>>   certain pointer types.
>>>>>>>>>>>>
>>>>>>>>>>>> - The conveyance of that information in the DWARF output in the form of a new
>>>>>>>>>>>>   DIE: DW_TAG_GNU_annotation.
>>>>>>>>>>>>
>>>>>>>>>>>> - The conveyance of that information in the BTF output in the form of two new
>>>>>>>>>>>>   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>>
>>>>>>>>>>>> All of these facilities are being added to the eBPF ecosystem, and support for
>>>>>>>>>>>> them exists in some form in LLVM.
>>>>>>>>>>>>
>>>>>>>>>>>> Purpose
>>>>>>>>>>>> =======
>>>>>>>>>>>>
>>>>>>>>>>>> 1) Addition of C-family language constructs (attributes) to specify free-text
>>>>>>>>>>>>    tags on certain language elements, such as struct fields.
>>>>>>>>>>>>
>>>>>>>>>>>>    The purpose of these annotations is to provide additional information about
>>>>>>>>>>>>    types, variables, and function parameters of interest to the kernel. A
>>>>>>>>>>>>    driving use case is to tag pointer types within the linux kernel and eBPF
>>>>>>>>>>>>    programs with additional semantic information, such as '__user' or '__rcu'.
>>>>>>>>>>>>
>>>>>>>>>>>>    For example, consider the linux kernel function do_execve with the
>>>>>>>>>>>>    following declaration:
>>>>>>>>>>>>
>>>>>>>>>>>>      static int do_execve(struct filename *filename,
>>>>>>>>>>>>         const char __user *const __user *__argv,
>>>>>>>>>>>>         const char __user *const __user *__envp);
>>>>>>>>>>>>
>>>>>>>>>>>>    Here, __user could be defined with these annotations to record semantic
>>>>>>>>>>>>    information about the pointer parameters (e.g., they are user-provided) in
>>>>>>>>>>>>    DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>>>>>>>>>>>    can read the tags and make use of the information.
>>>>>>>>>>>>
>>>>>>>>>>>> 2) Conveying the tags in the generated DWARF debug info.
>>>>>>>>>>>>
>>>>>>>>>>>>    The main motivation for emitting the tags in DWARF is that the Linux kernel
>>>>>>>>>>>>    generates its BTF information via pahole, using DWARF as a source:
>>>>>>>>>>>>
>>>>>>>>>>>>        +--------+  BTF                  BTF   +----------+
>>>>>>>>>>>>        | pahole |-------> vmlinux.btf ------->| verifier |
>>>>>>>>>>>>        +--------+                             +----------+
>>>>>>>>>>>>            ^                                        ^
>>>>>>>>>>>>            |                                        |
>>>>>>>>>>>>      DWARF |                                    BTF |
>>>>>>>>>>>>            |                                        |
>>>>>>>>>>>>         vmlinux                              +-------------+
>>>>>>>>>>>>         module1.ko                           | BPF program |
>>>>>>>>>>>>         module2.ko                           +-------------+
>>>>>>>>>>>>           ...
>>>>>>>>>>>>
>>>>>>>>>>>>    This is because:
>>>>>>>>>>>>
>>>>>>>>>>>>    a) Unlike GCC, LLVM will only generate BTF for BPF programs.
>>>>>>>>>>>>
>>>>>>>>>>>>    b) GCC can generate BTF for whatever target with -gbtf, but there is no
>>>>>>>>>>>>       support for linking/deduplicating BTF in the linker.
>>>>>>>>>>>>
>>>>>>>>>>>>    In the scenario above, the verifier needs access to the pointer tags of
>>>>>>>>>>>>    both the kernel types/declarations (conveyed in the DWARF and translated
>>>>>>>>>>>>    to BTF by pahole) and those of the BPF program (available directly in BTF).
>>>>>>>>>>>>
>>>>>>>>>>>>    Another motivation for having the tag information in DWARF, unrelated to
>>>>>>>>>>>>    BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>>>>>>>>>>>    to benefit from these tags in order to differentiate between different
>>>>>>>>>>>>    kinds of pointers in the kernel.
>>>>>>>>>>>>
>>>>>>>>>>>> 3) Conveying the tags in the generated BTF debug info.
>>>>>>>>>>>>
>>>>>>>>>>>>    This is easy: the main purpose of having this info in BTF is for the
>>>>>>>>>>>>    compiled eBPF programs. The kernel verifier can then access the tags
>>>>>>>>>>>>    of pointers used by the eBPF programs.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> For more information about these tags and the motivation behind them, please
>>>>>>>>>>>> refer to the following linux kernel discussions:
>>>>>>>>>>>>
>>>>>>>>>>>>   https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>>>>>>>>>   https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>>>>>>>>>>>   https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Implementation Overview
>>>>>>>>>>>> =======================
>>>>>>>>>>>>
>>>>>>>>>>>> To enable these annotations, two new C language attributes are added:
>>>>>>>>>>>> __attribute__((debug_annotate_decl("foo"))) and
>>>>>>>>>>>> __attribute__((debug_annotate_type("bar"))). Both attributes accept a single
>>>>>>>>>>>> arbitrary string constant argument, which will be recorded in the generated
>>>>>>>>>>>> DWARF and/or BTF debug information. They have no effect on code generation.
>>>>>>>>>>>>
>>>>>>>>>>>> Note that we are not using the same attribute names as LLVM (btf_decl_tag and
>>>>>>>>>>>> btf_type_tag, respectively). While these attributes are functionally very
>>>>>>>>>>>> similar, they have grown beyond purely BTF-specific uses, so inclusion of "btf"
>>>>>>>>>>>> in the attribute name seems misleading.
>>>>>>>>>>>>
>>>>>>>>>>>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>>>>>>>>>>>> declarations and types will be checked for the corresponding attributes. If
>>>>>>>>>>>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>>>>>>>>>>>> the annotated type or declaration, one for each tag. These DIEs link the
>>>>>>>>>>>> arbitrary tag value to the item they annotate.
>>>>>>>>>>>>
>>>>>>>>>>>> For example, the following variable declaration:
>>>>>>>>>>>>
>>>>>>>>>>>>   #define __typetag1 __attribute__((debug_annotate_type ("typetag1")))
>>>>>>>>>>>>
>>>>>>>>>>>>   #define __decltag1 __attribute__((debug_annotate_decl ("decltag1")))
>>>>>>>>>>>>   #define __decltag2 __attribute__((debug_annotate_decl ("decltag2")))
>>>>>>>>>>>>
>>>>>>>>>>>>   int * __typetag1 x __decltag1 __decltag2;
>>>>>>>>>>>
>>>>>>>>>>> Based on the above example
>>>>>>>>>>>    static int do_execve(struct filename *filename,
>>>>>>>>>>>       const char __user *const __user *__argv,
>>>>>>>>>>>       const char __user *const __user *__envp);
>>>>>>>>>>>
>>>>>>>>>>> Should the above example be the below?
>>>>>>>>>>>    int __typetag1 * x __decltag1 __decltag2
>>>>>>>>>>>
>>>>>>>>>> This example is not related to the one above. It is just meant to
>>>>>>>>>> show the behavior of both attributes. My apologies for not making
>>>>>>>>>> that clear.
>>>>>>>>>
>>>>>>>>> Okay, it should be fine if the dwarf debug_info is shown.
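
[Editorial sketch] The example declaration quoted above can be tried on a compiler that may or may not implement the proposed attributes. The following is a minimal sketch, assuming only that `debug_annotate_type` and `debug_annotate_decl` are the names proposed in this patch series; the `HAVE_TYPE_TAGS` and `HAVE_DECL_TAGS` macros and the `annotations_enabled` helper are hypothetical names introduced here for illustration. Where the attributes are unavailable the macros expand to nothing, so the declaration degrades to plain C.

```c
#include <stddef.h>

/* Older compilers may not provide __has_attribute at all. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Attribute names as proposed in the patch series; a given compiler may
   not implement them, in which case the tags expand to nothing. */
#if __has_attribute(debug_annotate_type)
# define __typetag1 __attribute__((debug_annotate_type("typetag1")))
# define HAVE_TYPE_TAGS 1   /* hypothetical helper macro */
#else
# define __typetag1
# define HAVE_TYPE_TAGS 0
#endif

#if __has_attribute(debug_annotate_decl)
# define __decltag1 __attribute__((debug_annotate_decl("decltag1")))
# define __decltag2 __attribute__((debug_annotate_decl("decltag2")))
# define HAVE_DECL_TAGS 1   /* hypothetical helper macro */
#else
# define __decltag1
# define __decltag2
# define HAVE_DECL_TAGS 0
#endif

/* Same declaration as in the cover letter: one type tag written after
   the '*', and two declaration tags written after the declarator. */
int * __typetag1 x __decltag1 __decltag2;

/* Reports whether both attribute kinds were available at compile time. */
int annotations_enabled(void)
{
    return HAVE_TYPE_TAGS && HAVE_DECL_TAGS;
}
```

Either way the tags are documented as having no effect on code generation, so `x` behaves as an ordinary `int *` object.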
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Produces the following DWARF information:
>>>>>>>>>>>>
>>>>>>>>>>>>  <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>>>>>>>>>>>     <1f>   DW_AT_name        : x
>>>>>>>>>>>>     <21>   DW_AT_decl_file   : 1
>>>>>>>>>>>>     <22>   DW_AT_decl_line   : 7
>>>>>>>>>>>>     <23>   DW_AT_decl_column : 18
>>>>>>>>>>>>     <24>   DW_AT_type        : <0x49>
>>>>>>>>>>>>     <28>   DW_AT_external    : 1
>>>>>>>>>>>>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>>>>>>>>>>>     <32>   DW_AT_sibling     : <0x49>
>>>>>>>>>>>>  <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>     <37>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>>     <3b>   DW_AT_const_value : (indirect string, offset: 0xcd): decltag2
>>>>>>>>>>>>  <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>     <40>   DW_AT_name        : (indirect string, offset: 0xd6): debug_annotate_decl
>>>>>>>>>>>>     <44>   DW_AT_const_value : (indirect string, offset: 0x0): decltag1
>>>>>>>>>>>>  <2><48>: Abbrev Number: 0
>>>>>>>>>>>>  <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>>>>>>>>>>>     <4a>   DW_AT_byte_size   : 8
>>>>>>>>>>>>     <4b>   DW_AT_type        : <0x5d>
>>>>>>>>>>>>     <4f>   DW_AT_sibling     : <0x5d>
>>>>>>>>>>>>  <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>>>>>>>>>>>     <54>   DW_AT_name        : (indirect string, offset: 0x9): debug_annotate_type
>>>>>>>>>>>>     <58>   DW_AT_const_value : (indirect string, offset: 0x1d): typetag1
>>>>>>>>>>>>  <2><5c>: Abbrev Number: 0
>>>>>>>>>>>>  <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>>>>>>>>>>>     <5e>   DW_AT_byte_size   : 4
>>>>>>>>>>>>     <5f>   DW_AT_encoding    : 5 (signed)
>>>>>>>>>>>>     <60>   DW_AT_name        : int
>>>>>>>>>>>>  <1><64>: Abbrev Number: 0
>>>>>>>>>
>>>>>>>>> This shows the info in .debug_abbrev. What I mean is to
>>>>>>>>> show the related info in .debug_info section which seems more useful to
>>>>>>>>> understand the relationships between different tags. Maybe this is due
>>>>>>>>> to that I am not fully understanding what <1>/<2> means in <1><49> and
>>>>>>>>> <2><53> etc.
>>>>>>>>
>>>>>>>> I think that dump actually shows .debug_info, with the abbrevs
>>>>>>>> expanded...
>>>>>>>>
>>>>>>>> Anyway, it seems to us that the root of this problem is the fact the
>>>>>>>> kernel sparse annotations, such as address_space(__user), are:
>>>>>>>>
>>>>>>>> 1) To be processed by an external kernel-specific tool (
>>>>>>>>    https://sparse.docs.kernel.org/en/latest/annotations.html) and not a
>>>>>>>>    C compiler, and therefore,
>>>>>>>>
>>>>>>>> 2) Not quite the same as compiler attributes (despite the way they
>>>>>>>>    look.) In particular, they seem to assume an ordering different from
>>>>>>>>    that of GNU attributes: in some cases given the same written order, they
>>>>>>>>    refer to different things! Which is quite unfortunate :(
>>>>>>>
>>>>>>> Yes, currently __user/__kernel macros (implemented with address_space
>>>>>>> attribute) are processed by macros.
>>>>>>>
>>>>>>>> Now, if I understood properly, you plan to change the definition of
>>>>>>>> __user and __kernel in the kernel sources in order to generate the tag
>>>>>>>> compiler attributes, correct?
>>>>>>>
>>>>>>> Right. The original __user definition looks like:
>>>>>>>    # define __user __attribute__((noderef, address_space(__user)))
>>>>>>>
>>>>>>> The new attribute looks like
>>>>>>>    # define BTF_TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
>>>>>>>    # define __user BTF_TYPE_TAG(user)
>>>>>>
>>>>>> Ok I see. So the kernel will stop using sparse attributes to implement
>>>>>> __user and __kernel and start using compiler attributes for tags
>>>>>> instead.
>>>>>>
>>>>>>>> Is that the reason why LLVM implements what we assume to be the sparse
>>>>>>>> ordering, and not the correct GNU attributes ordering, for the tag
>>>>>>>> attributes?
>>>>>>>
>>>>>>> Note that __user attributes apply to pointees and not pointers.
>>>>>>> Just like
>>>>>>>    const int *p;
>>>>>>> the 'const' is not applied to pointer 'p', but the pointee of 'p'.
>>>>>>>
>>>>>>> What current llvm dwarf generation with
>>>>>>>    pointer
>>>>>>>      <--- btf_type_tag
>>>>>>> is just ONE implementation. As I said earlier, I am okay to
>>>>>>> have dwarf implementation like
>>>>>>>    p->btf_type_tag->const->int.
>>>>>>> If you can propose an implementation like this in dwarf, I can propose
>>>>>>> to change implementation in llvm.
>>>>>>
>>>>>> I think we are miscommunicating.
>>>>>>
>>>>>> Looks like there is a divergence on what attributes apply to what
>>>>>> language entities between the sparse compiler and GCC/LLVM. How to
>>>>>> represent that in DWARF is a different matter.
>>>>>>
>>>>>> For this example:
>>>>>>
>>>>>>   int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>
>>>>>> a) GCC associates __typetag1 with the pointer-to-pointer-to-int.
>>>>>> b) LLVM associates __typetag1 to pointer-to-int.
>>>>>>
>>>>>> Where:
>>>>>>
>>>>>> a) Is the expected behavior of a compiler attribute, as documented in
>>>>>>    the GCC manual.
>>>>>> b) Is presumably what the sparse compiler expects, but _not_ the
>>>>>>    ordering expected for a compiler GNU attribute.
>>>>>>
>>>>>> So, if the kernel source __user and __kernel annotations (which
>>>>>> currently expand to sparse attributes) follow the sparse ordering, and
>>>>>> you want to implement __user and __kernel in terms of compiler
>>>>>> attributes instead (the annotation attributes) then you will have to:
>>>>>>
>>>>>> 1) Fix LLVM to implement the usual ordering for these attributes and
>>>>>> 2) fix the kernel sources to use that ordering
>>>>>>
>>>>>> [Incidentally, the same applies to another "ex-sparse" attribute you
>>>>>> have in the kernel and also implemented in LLVM with a weird ordering:
>>>>>> the address_space attribute.]
>>>>>>
>>>>>> For 2), it may be possible to write a coccinelle script to generate the
>>>>>> patch...
>>>>>
>>>>> I don't think (2) (to change kernel source for different attr ordering)
>>>>> will work. So the only thing we can do is in compiler/pahole except
>>>>> macro replacement in kernel.
>>>>
>>>> I looked at sparse and its parser. Wanted to be sure the ordering it
>>>> uses to interpret sparse annotations (such as address_space, alignment,
>>>> etc) is definitely _not_ the same ordering used by __attribute__ in C
>>>> compilers.
>>>>
>>>> It is very different indeed and the same can be said about how sparse
>>>> interprets other modifiers like `const': in sparse both `int const *foo'
>>>> and `int *const foo' parse to a constant pointer to int, for example.
>>>>
>>>> I am not to judge how sparse handles its annotations. It may be very
>>>> well and pertinent for its particular purpose.
>>>>
>>>> But I am not sure if it is reasonable to expect C compilers to implement
>>>> certain type __attributes__ to parse differently, just because it
>>>> happens these attributes are reused from sparse annotations in a
>>>> particular program (in this case the kernel.) The debug_annotate_decl
>>>> and debug_annotate_type attributes are not even intended to be
>>>> kernel-specific.
>>>>
>>>> So, if changing the kernel sources is not an option (why btw, other than
>>>> being a PITA?) at this point I really don't know what else to suggest :/
>>>>
>>>> Any suggestion from the front-end people?
>>>
>>> Just want to understand the overall picture. So gcc can still emit
>>> BTF properly with btf_type_tag right? The issue we are talking about
>>> here is about the dwarf, right?
>>
>> If by "properly" you mean how sparse handles its annotations, then not
>> really.
>>
>> The issue we are talking about is rather a language-level one: to what
>> entity/type the compiler attribute applies.
>>
>> So, for:
>>
>>   int __attribute__((debug_annotate_decl("user"))) *foo;
>>
>> GCC will apply the attribute to the int type, following the rules for
>> type attributes (sparse would apply the annotation to the *int type
>> instead). The emitted debug info (be it DWARF or BTF) will reflect
>> that, no more no less :/
>
> I don't know what this 'apply the attribute to the int' means.
> In current clang implementation it means the following dwarf chains
> from right to left
>    variable 'foo'
>      type: ptr
>        base type: attr_type: attr
>           underlying type: int
>
> So the type chain is foo -> ptr -> attr -> int

Urgh sorry Yonghong, that was a bad example where there is no
divergence.

At this point I find myself confused regarding the sparse, clang and GCC
attribute issue (I have so many dumps around from all three tools in
several formats) so I had better recap on this before creating further
confusion. Will be back to you soon.

>>> If this is the case, we might have
>>> a partial solution here.
>>>   - gcc emits BTF for vmlinux
>>
>> Note that for emitting BTF for vmlinux we would need support in the
>> linker to merge and deduplicate BTF, which at the moment we don't have.
>
> This should be okay. pahole will merge and deduplicate btf. In pahole
> '-j' mode, each thread will convert each .o file dwarf to btf, and
> then pahole will merge and deduplicate btf.

That's nice. If LLVM supported generating BTF for any target (I believe
you got patches for that) you could even skip the dwarf->BTF translation
step altogether with both LLVM and GCC kernel builds :)

>
>>
>>>   - gcc emits dwarf for vmlinux ignoring btf_type_tag
>>>   - in pahole, vmlinux BTF is amended with some additional misc things.
>>> Although there are some use cases to have btf_type_tag in dwarf, but
>>> that can be worked around with BTF + dwarf both of which are generated
>>> by the compiler. Not elegant, but probably works.
>>>>
>>>>>> Does this make sense?
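
[Editorial sketch] Earlier in the quoted thread, `const int *p` is offered as an analogy for a tag that qualifies the pointee and not the pointer. The sketch below only illustrates how standard C treats the two spellings of `const`; it says nothing about how sparse, GCC or clang place the tag attributes. It assumes a C11 compiler for `_Generic`, and `POINTEE_IS_CONST`, `pointee_const_p` and `pointee_const_q` are hypothetical names introduced for illustration.

```c
/* Classify where `const` sits for an int pointer expression.
   Adding 0 yields an rvalue, so a top-level const on the pointer object
   itself is dropped and only the pointee's qualification remains. */
#define POINTEE_IS_CONST(ptr) \
    _Generic((ptr) + 0, const int *: 1, int *: 0)

static int value = 42;

const int *p = &value;   /* pointer to const int: the pointee is const */
int *const q = &value;   /* const pointer to int: the pointer is const */

/* Returns 1: in `const int *p`, const qualifies what p points to. */
int pointee_const_p(void) { return POINTEE_IS_CONST(p); }

/* Returns 0: in `int *const q`, const qualifies q itself. */
int pointee_const_q(void) { return POINTEE_IS_CONST(q); }
```

In standard C the two declarations therefore have distinct types, which is the distinction the thread says sparse collapses when it parses `int const *foo` and `int *const foo`.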
>>>>>>
>>>>>>>> If that is so, we have quite a problem here: I don't think we can change
>>>>>>>> the way GCC handles GNU-like attributes just because the kernel sources
>>>>>>>> want to hook on these __user/__kernel sparse annotations to generate the
>>>>>>>> compiler tags, even if we could mayhaps get GCC to handle
>>>>>>>> debug_annotate_type and debug_annotate_decl differently. Some would say
>>>>>>>> doing so would perpetuate the mistake instead of fixing it...
>>>>>>>>
>>>>>>>> Is my understanding correct?
>>>>>>>
>>>>>>> Let us just say that the btf_type_tag attribute applies to pointees.
>>>>>>> Does this help?
>>>>>>>
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Maybe you can also show what dwarf debug_info looks like
>>>>>>>>>> I am not sure what you mean. This is the .debug_info section as output
>>>>>>>>>> by readelf -w. I did trim some information not relevant to the discussion
>>>>>>>>>> such as the DW_TAG_compile_unit DIE, for brevity.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In the case of BTF, the annotations are recorded in two type kinds recently
>>>>>>>>>>>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>>>>>>>>>>> The above example declaration produces the following BTF information:
>>>>>>>>>>>>
>>>>>>>>>>>>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>>>>>>>>   [2] PTR '(anon)' type_id=3
>>>>>>>>>>>>   [3] TYPE_TAG 'typetag1' type_id=1
>>>>>>>>>>>>   [4] DECL_TAG 'decltag1' type_id=6 component_idx=-1
>>>>>>>>>>>>   [5] DECL_TAG 'decltag2' type_id=6 component_idx=-1
>>>>>>>>>>>>   [6] VAR 'x' type_id=2, linkage=global
>>>>>>>>>>>>   [7] DATASEC '.bss' size=0 vlen=1
>>>>>>>>>>>>       type_id=6 offset=0 size=8 (VAR 'x')
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> [...]
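
[Editorial sketch] The BTF listing quoted above can be read as a small graph: VAR 'x' refers to PTR, PTR refers to TYPE_TAG 'typetag1', the tag refers to INT, and each DECL_TAG points back at the variable. The following minimal sketch models records [1] to [6] with a simplified, hypothetical `struct rec` (this is not the kernel's `struct btf_type`, and `first_type_tag` and `count_decl_tags` are illustrative names) and walks the `type_id` links the way a consumer might when looking for tags. The DATASEC record is left out for brevity.

```c
#include <stddef.h>

/* Hypothetical, simplified record; NOT the kernel's struct btf_type. */
enum kind { K_INT, K_PTR, K_TYPE_TAG, K_DECL_TAG, K_VAR };

struct rec {
    enum kind kind;
    const char *name;
    int type_id;            /* id of the referenced record, 0 if none */
};

/* Index 0 is unused so that array indices match the ids in the listing. */
static const struct rec recs[] = {
    [1] = { K_INT,      "int",      0 },
    [2] = { K_PTR,      "(anon)",   3 },
    [3] = { K_TYPE_TAG, "typetag1", 1 },
    [4] = { K_DECL_TAG, "decltag1", 6 },
    [5] = { K_DECL_TAG, "decltag2", 6 },
    [6] = { K_VAR,      "x",        2 },
};
#define NRECS (sizeof recs / sizeof recs[0])

/* Follow type_id links starting at `id`; return the name of the first
   TYPE_TAG record reached, or NULL if the chain ends without one. */
const char *first_type_tag(int id)
{
    while (id > 0 && (size_t)id < NRECS) {
        if (recs[id].kind == K_TYPE_TAG)
            return recs[id].name;
        id = recs[id].type_id;
    }
    return NULL;
}

/* Count the DECL_TAG records whose type_id refers to record `id`. */
int count_decl_tags(int id)
{
    int n = 0;
    for (size_t i = 1; i < NRECS; i++)
        if (recs[i].kind == K_DECL_TAG && recs[i].type_id == id)
            n++;
    return n;
}
```

Starting from the variable (id 6), the walk reaches 'typetag1' through the pointer record, and two declaration tags refer to id 6, matching the listing.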