Date: Mon, 2 May 2022 09:57:57 -0700
From: David Faust
Subject: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
To: gcc-patches@gcc.gnu.org
Cc: "Jose E. Marchesi", Yonghong Song
References: <20220401194216.16469-1-david.faust@oracle.com>

Pinging this series again.

Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592685.html

This series adds new C-family frontend attributes for recording string "tags"
in DWARF and BTF debug info to support kernel use cases.

There remains one issue in the implementation which has not been resolved,
which I hope someone in the GCC community may be able to shed some light on.

Specifically, it is related to how GCC parses the attributes: In cases where
the new btf_type_tag attribute (which applies to types) is specified multiple
times on different intermediate pointer types of a declaration, GCC seems to
attach the attributes to the TREEs in an unexpected order. As a result the
behavior of the attribute in GCC cannot be reconciled with its definition in
BTF or behavior in the clang compiler.

Consider the following example:

  #define __typetag1 __attribute__((btf_type_tag("tag1")))
  #define __typetag2 __attribute__((btf_type_tag("tag2")))
  #define __typetag3 __attribute__((btf_type_tag("tag3")))

  int __typetag1 * __typetag2 __typetag3 * g;

The expected behavior is that 'g' is "a pointer with tags 'tag2' and 'tag3',
to a pointer with tag 'tag1' to an int".

But GCC's attribute parsing produces a variable 'g' which is "a pointer with
tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int".

As a result, the DWARF and BTF generated for cases like this agree with
neither the BTF type tag specification nor the output of the clang compiler,
which already supports this feature.

(Please refer to the "Current issues in implementation" section in the series
cover letter for the full details of this example.)

So far I have been unable to resolve this issue in the btf_type_tag attribute
handler. It seems to me that the cause must be "higher up" in the C frontend
attribute parsing but I am not familiar with this area of GCC.

Any insight into understanding this issue or comments elsewhere in the series
would be most welcome.

Thanks,
David

On 4/18/22 12:36, David Faust via Gcc-patches wrote:
> Gentle ping :)
>
> Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592685.html
>
> The series adds support for new attributes btf_type_tag and btf_decl_tag,
> for recording arbitrary string tags in DWARF and BTF debug info. The
> feature is to support kernel use cases.
>
> Thanks,
> David
>
> On 4/1/22 12:42, David Faust via Gcc-patches wrote:
>> Hello,
>>
>> This patch series is a first attempt at adding support for:
>>
>> - Two new C-language-level attributes that make it possible to associate
>>   (to "tag") particular declarations and types with arbitrary strings. As
>>   explained below, this is intended to be used to, for example, characterize
>>   certain pointer types.
>>
>> - The conveyance of that information in the DWARF output in the form of a new
>>   DIE: DW_TAG_GNU_annotation.
>>
>> - The conveyance of that information in the BTF output in the form of two new
>>   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>>
>> All of these facilities are being added to the eBPF ecosystem, and support for
>> them exists in some form in LLVM. However, as we shall see, we have found some
>> problems implementing them so some discussion is in order.
>>
>> Purpose
>> =======
>>
>> 1) Addition of C-family language constructs (attributes) to specify free-text
>>    tags on certain language elements, such as struct fields.
>>
>>    The purpose of these annotations is to provide additional information about
>>    types, variables, and function parameters of interest to the kernel. A
>>    driving use case is to tag pointer types within the Linux kernel and eBPF
>>    programs with additional semantic information, such as '__user' or '__rcu'.
>>
>>    For example, consider the Linux kernel function do_execve with the
>>    following declaration:
>>
>>      static int do_execve(struct filename *filename,
>>                           const char __user *const __user *__argv,
>>                           const char __user *const __user *__envp);
>>
>>    Here, __user could be defined with these annotations to record semantic
>>    information about the pointer parameters (e.g., they are user-provided) in
>>    DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>>    can read the tags and make use of the information.
>>
>> 2) Conveying the tags in the generated DWARF debug info.
>>
>>    The main motivation for emitting the tags in DWARF is that the Linux kernel
>>    generates its BTF information via pahole, using DWARF as a source:
>>
>>        +--------+  BTF                  BTF   +----------+
>>        | pahole |-------> vmlinux.btf ------->| verifier |
>>        +--------+                             +----------+
>>            ^                                        ^
>>            |                                        |
>>      DWARF |                                    BTF |
>>            |                                        |
>>         vmlinux                              +-------------+
>>         module1.ko                           | BPF program |
>>         module2.ko                           +-------------+
>>           ...
>>
>>    This is because:
>>
>>    a) Unlike GCC, LLVM will only generate BTF for BPF programs.
>>
>>    b) GCC can generate BTF for any target with -gbtf, but there is no
>>       support for linking/deduplicating BTF in the linker.
>>
>>    In the scenario above, the verifier needs access to the pointer tags of
>>    both the kernel types/declarations (conveyed in the DWARF and translated
>>    to BTF by pahole) and those of the BPF program (available directly in BTF).
>>
>>    Another motivation for having the tag information in DWARF, unrelated to
>>    BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>>    to benefit from these tags in order to differentiate between different
>>    kinds of pointers in the kernel.
>>
>> 3) Conveying the tags in the generated BTF debug info.
>>
>>    This is easy: the main purpose of having this info in BTF is for the
>>    compiled eBPF programs. The kernel verifier can then access the tags
>>    of pointers used by the eBPF programs.
>>
>>
>> For more information about these tags and the motivation behind them, please
>> refer to the following Linux kernel discussions:
>>
>>   https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>   https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>>   https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>>
>>
>> What is in this patch series
>> ============================
>>
>> This patch series adds support for these annotations in GCC. The implementation
>> is largely complete. However, in some cases the produced debug info (both DWARF
>> and BTF) differs significantly from that produced by LLVM. This issue is
>> discussed in detail below, along with a few specific questions for both GCC and
>> LLVM. Any input would be much appreciated.
>>
>>
>> Implementation Overview
>> =======================
>>
>> To enable these annotations, two new C language attributes are added:
>> __attribute__((btf_decl_tag("foo"))) and __attribute__((btf_type_tag("bar"))).
>> Both attributes accept a single arbitrary string constant argument, which will
>> be recorded in the generated DWARF and/or BTF debugging information. They have
>> no effect on code generation.
>>
>> Note that we are using the same attribute names as LLVM, which include "btf"
>> in the name. This may be controversial, as these tags are not really
>> BTF-specific. A different name may be more appropriate. There was much
>> discussion about naming in the proposal for the functionality in LLVM; the
>> full thread can be found here:
>>
>>   https://lists.llvm.org/pipermail/llvm-dev/2021-June/151023.html
>>
>> The name debug_info_annotate, suggested here, might better suit the attribute:
>>
>>   https://lists.llvm.org/pipermail/llvm-dev/2021-June/151042.html
>>
>> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
>> declarations and types will be checked for the corresponding attributes. If
>> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
>> the annotated type or declaration, one for each tag. These DIEs link the
>> arbitrary tag value to the item they annotate.
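
Since both attributes only affect debug info and not code generation, one way
to use them in code that must also build with compilers lacking them is to hide
them behind macros that expand to nothing. The following is a minimal sketch
under that assumption, not part of the patch series. It assumes a compiler
that provides __has_attribute (with a fallback otherwise); the names TYPE_TAG,
DECL_TAG, USER_TAG, count_bytes and tagged_counter are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback for compilers that do not provide __has_attribute at all. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* TYPE_TAG and DECL_TAG are hypothetical helper macros for this sketch.
   They expand to the attribute when the compiler reports support for it,
   and to nothing otherwise. */
#if __has_attribute(btf_type_tag)
# define TYPE_TAG(s) __attribute__((btf_type_tag(s)))
#else
# define TYPE_TAG(s)
#endif

#if __has_attribute(btf_decl_tag)
# define DECL_TAG(s) __attribute__((btf_decl_tag(s)))
#else
# define DECL_TAG(s)
#endif

/* Hypothetical stand-in for a kernel-style '__user' pointer tag. */
#define USER_TAG TYPE_TAG("user")

/* A variable declaration carrying a decl tag. */
int tagged_counter DECL_TAG("decl-tag-1") = 0;

/* A function taking a tagged pointer. The tag is only meant to be recorded
   in debug info; the function behaves the same with or without it. */
size_t count_bytes(const char USER_TAG *src, size_t max)
{
    size_t n = 0;

    assert(src != NULL);
    while (n < max && src[n] != '\0')
        n++;
    return n;
}
```

With or without attribute support, count_bytes("abc", 8) returns 3; only the
emitted debug info would be expected to differ.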
>>
>> For example, the following variable declaration:
>>
>>   #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>   #define __decltag1 __attribute__((btf_decl_tag("decl-tag-1")))
>>   #define __decltag2 __attribute__((btf_decl_tag("decl-tag-2")))
>>
>>   int __typetag1 * x __decltag1 __decltag2;
>>
>> Produces the following DIEs:
>>
>>  <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : x
>>     <21>   DW_AT_decl_file   : 1
>>     <22>   DW_AT_decl_line   : 6
>>     <23>   DW_AT_decl_column : 18
>>     <24>   DW_AT_type        : <0x49>
>>     <28>   DW_AT_external    : 1
>>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>     <32>   DW_AT_sibling     : <0x49>
>>  <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <37>   DW_AT_name        : (indirect string, offset: 0x10): btf_decl_tag
>>     <3b>   DW_AT_const_value : (indirect string, offset: 0x0): decl-tag-2
>>  <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <40>   DW_AT_name        : (indirect string, offset: 0x10): btf_decl_tag
>>     <44>   DW_AT_const_value : (indirect string, offset: 0x1d): decl-tag-1
>>  <2><48>: Abbrev Number: 0
>>  <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>>     <4a>   DW_AT_byte_size   : 8
>>     <4b>   DW_AT_type        : <0x5d>
>>     <4f>   DW_AT_sibling     : <0x5d>
>>  <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <54>   DW_AT_name        : (indirect string, offset: 0x28): btf_type_tag
>>     <58>   DW_AT_const_value : (indirect string, offset: 0xd7): type-tag-1
>>  <2><5c>: Abbrev Number: 0
>>  <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>>     <5e>   DW_AT_byte_size   : 4
>>     <5f>   DW_AT_encoding    : 5 (signed)
>>     <60>   DW_AT_name        : int
>>  <1><64>: Abbrev Number: 0
>>
>> Please note that currently, the annotation DWARF DIEs will be generated only if
>> BTF debug information is requested (via -gbtf). Therefore, the annotation DIEs
>> will only be output if both BTF and DWARF are requested (e.g. -gbtf -gdwarf).
>> This will change, since these tags are needed even when not generating BTF,
>> for example in a GCC-built Linux kernel.
>>
>> In the case of BTF, the annotations are recorded in two type kinds recently
>> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>> The above example declaration produces the following BTF information:
>>
>>   [1] int 'int'(1U#B) size=4U#B offset=0UB#b bits=32UB#b SIGNED
>>   [2] ptr type=3
>>   [3] type_tag 'type-tag-1'(5U#B) type=1
>>   [4] decl_tag 'decl-tag-1'(18U#B) type=6 component_idx=-1
>>   [5] decl_tag 'decl-tag-2'(29U#B) type=6 component_idx=-1
>>   [6] var 'x'(16U#B) type=2 linkage=1 (global)
>>
>>
>> Current issues in the implementation
>> ====================================
>>
>> The __attribute__((btf_type_tag ("foo"))) syntax does not work correctly for
>> types involving multiple pointers.
>>
>> Consider the following example:
>>
>>   #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>   #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>   #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>
>>   int __typetag1 * __typetag2 __typetag3 * g;
>>
>> The current implementation produces the following DWARF:
>>
>>  <1><1e>: Abbrev Number: 4 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : g
>>     <21>   DW_AT_decl_file   : 1
>>     <22>   DW_AT_decl_line   : 6
>>     <23>   DW_AT_decl_column : 42
>>     <24>   DW_AT_type        : <0x32>
>>     <28>   DW_AT_external    : 1
>>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>>  <1><32>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>     <33>   DW_AT_byte_size   : 8
>>     <33>   DW_AT_type        : <0x45>
>>     <37>   DW_AT_sibling     : <0x45>
>>  <2><3b>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <3c>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>>     <40>   DW_AT_const_value : (indirect string, offset: 0xc7): type-tag-1
>>  <2><44>: Abbrev Number: 0
>>  <1><45>: Abbrev Number: 2 (DW_TAG_pointer_type)
>>     <46>   DW_AT_byte_size   : 8
>>     <46>   DW_AT_type        : <0x61>
>>     <4a>   DW_AT_sibling     : <0x61>
>>  <2><4e>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <4f>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>>     <53>   DW_AT_const_value : (indirect string, offset: 0xd): type-tag-3
>>  <2><57>: Abbrev Number: 1 (User TAG value: 0x6000)
>>     <58>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>>     <5c>   DW_AT_const_value : (indirect string, offset: 0xd2): type-tag-2
>>  <2><60>: Abbrev Number: 0
>>  <1><61>: Abbrev Number: 5 (DW_TAG_base_type)
>>     <62>   DW_AT_byte_size   : 4
>>     <63>   DW_AT_encoding    : 5 (signed)
>>     <64>   DW_AT_name        : int
>>  <1><68>: Abbrev Number: 0
>>
>> This does not agree with the DWARF produced by LLVM/clang for the same case:
>> (clang 15.0.0 git 142501117a78080d2615074d3986fa42aa6a0734)
>>
>>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : (indexed string: 0x3): g
>>     <20>   DW_AT_type        : <0x29>
>>     <24>   DW_AT_external    : 1
>>     <24>   DW_AT_decl_file   : 0
>>     <25>   DW_AT_decl_line   : 6
>>     <26>   DW_AT_location    : 2 byte block: a1 0 ((Unknown location op 0xa1))
>>  <1><29>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>     <2a>   DW_AT_type        : <0x35>
>>  <2><2e>: Abbrev Number: 4 (User TAG value: 0x6000)
>>     <2f>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>     <30>   DW_AT_const_value : (indexed string: 0x7): type-tag-2
>>  <2><31>: Abbrev Number: 4 (User TAG value: 0x6000)
>>     <32>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>     <33>   DW_AT_const_value : (indexed string: 0x8): type-tag-3
>>  <2><34>: Abbrev Number: 0
>>  <1><35>: Abbrev Number: 3 (DW_TAG_pointer_type)
>>     <36>   DW_AT_type        : <0x3e>
>>  <2><3a>: Abbrev Number: 4 (User TAG value: 0x6000)
>>     <3b>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>>     <3c>   DW_AT_const_value : (indexed string: 0x6): type-tag-1
>>  <2><3d>: Abbrev Number: 0
>>  <1><3e>: Abbrev Number: 5 (DW_TAG_base_type)
>>     <3f>   DW_AT_name        : (indexed string: 0x4): int
>>     <40>   DW_AT_encoding    : 5 (signed)
>>     <41>   DW_AT_byte_size   : 4
>>  <1><42>: Abbrev Number: 0
>>
>> Notice the structural difference. From the DWARF produced by GCC (i.e.
>> this patch series), variable 'g' is a pointer with tag 'type-tag-1' to a
>> pointer with tags 'type-tag-2' and 'type-tag-3' to an int. But from the LLVM
>> DWARF, variable 'g' is a pointer with tags 'type-tag-2' and 'type-tag-3' to a
>> pointer with tag 'type-tag-1' to an int.
>>
>> Because GCC produces BTF from the internal DWARF DIE tree, the BTF also differs.
>> This can be seen most obviously in the BTF type reference chains:
>>
>>   GCC
>>     VAR (g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>
>>   LLVM
>>     VAR (g) -> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>
>> It seems that the ultimate cause for this is the structure of the TREE
>> produced by the C frontend parsing and attribute handling. I believe this may
>> be due to differences in __attribute__ syntax parsing between GCC and LLVM.
>>
>> This is the TREE for variable 'g':
>>   int __typetag1 * __typetag2 __typetag3 * g;
>>
>>   type
>>     type
>>       asm_written unsigned DI
>>       size
>>       unit-size
>>       align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff7450888
>>       attributes
>>         purpose
>>         value
>>           value
>>             readonly constant static "type-tag-3\000"
>>         chain
>>           value
>>             value
>>               readonly constant static "type-tag-2\000"
>>       pointer_to_this
>>     asm_written unsigned DI size unit-size
>>     align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff7509930
>>     attributes
>>       value
>>         value
>>           readonly constant static "type-tag-1\000"
>>   public static unsigned DI defer-output /home/dfaust/playpen/btf/annotate.c:29:42 size unit-size
>>   align:64 warn_if_not_align:0
>>
>> To me this is surprising. I would have expected the int** type of "g" to have
>> the tags 'type-tag-2' and 'type-tag-3', and the inner (int*) pointer type to
>> have the 'type-tag-1' tag. So far my attempts at resolving this difference in
>> the new attribute handlers for the tag attributes have not been successful.
>>
>> I do not understand why exactly the attributes are attached in this way.
>> I think that it may be related to the pointer cases discussed in the "All
>> other attributes" section here:
>>
>>   https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>>
>> In particular it seems similar to this example:
>>
>>   char *__attribute__((aligned(8))) *f;
>>
>>   specifies the type “pointer to 8-byte-aligned pointer to char”. Note again
>>   that this does not work with most attributes; for example, the usage of
>>   ‘aligned’ and ‘noreturn’ attributes given above is not yet supported.
>>
>> I am not sure if this section of the documentation is outdated, if scenarios
>> like this one have not been an issue before now, or if there is a way to
>> resolve this within the attribute handler. I am by no means an expert in the C
>> frontend or attribute handling; if someone with more knowledge could help me
>> understand this case I would be very grateful. :)
>>
>> Questions for GCC
>> =================
>>
>> 1) How can this issue with the type tags be resolved? Is this a bug or
>>    limitation in the attribute parsing that hasn't been an issue until now?
>>    Or is it that the above case is somehow a "weird" usage of attributes?
>>
>> 2) Are attributes the right tool for this? Is there some other mechanism that
>>    would better fit the design of these tags? In some ways the type tags seem
>>    more similar to const/volatile/restrict qualifiers than to most other
>>    attributes.
>>
>>
>> Questions for LLVM / kernel BPF
>> ===============================
>>
>> 1) What special handling does the LLVM frontend/clang do for these attributes?
>>    Is there anything specific? Or does it simply follow whatever is default?
>>
>> 2) What is the correct BTF representation for type tags? The documentation for
>>    BTF_KIND_TYPE_TAG in linux/Documentation/bpf/btf.rst seems to conflict with
>>    the output of clang, and the format change that was discussed here:
>>    https://reviews.llvm.org/D113496
>>    I assume the kernel btf.rst might simply be outdated, but I want to be sure.
>>
>> 3) Is the ordering of multiple type tags on the same type important?
>>    e.g. for this variable:
>>      int __tag1 __tag2 __tag3 * b;
>>
>>    would it be "correct" (or at least, acceptable) to produce:
>>      VAR(b) -> ptr -> tag2 -> tag3 -> tag1 -> int
>>
>>    or _must_ it be:
>>      VAR(b) -> ptr -> tag3 -> tag2 -> tag1 -> int
>>
>>    In the DWARF representation, all tags are equal sibling children of the type
>>    they annotate, so this 'ordering' problem seems like it only arises because of
>>    the BTF format for type tags.
>>
>> 4) Are types with the same tags in different orders considered distinct types?
>>    I think the answer is "no", but given the format of the tags in BTF, we get
>>    distinct chains for the types, so I am curious.
>>    e.g.
>>      int __tag1 __tag2 * x;
>>      int __tag2 __tag1 * y;
>>
>>    produces
>>      VAR(x) -> ptr -> tag2 -> tag1 -> int
>>      VAR(y) -> ptr -> tag1 -> tag2 -> int
>>
>>    but would
>>      VAR(y) -> ptr -> tag2 -> tag1 -> int
>>
>>    be just as correct?
>>
>> 5) According to the clang docs, type tags are currently ignored for non-pointer
>>    types. Is pointer tagging e.g. '__user' the only use case so far?
>>
>>    This GCC implementation allows type tags on non-pointer types. Such tags
>>    can be represented in the DWARF but don't make much sense in BTF output,
>>    e.g.
>>
>>      struct __typetag1 S {
>>        int a;
>>        int b;
>>      } __decltag1;
>>
>>      struct S my_s;
>>
>>    This will produce a type tag child DIE of S. In the current implementation,
>>    it will also produce a BTF type tag type, which refers to the __decltag1 BTF
>>    decl tag, which in turn refers to the struct type. But nothing refers to
>>    the type tag type; currently, variable my_s in BTF refers to the struct type
>>    directly.
>>
>>    In my opinion, the DWARF here is useful but the BTF seems odd. What would be
>>    "correct" BTF in such a case?
>>
>> 6) Would LLVM be open to changing the name of the attribute, for example to
>>    'debug_info_annotate' (or any other suggestion)? The use cases for these
>>    tags have grown (e.g.
>>    drgn) since they were originally proposed, and the
>>    scope is no longer limited to BTF.
>>
>>    The kernel eBPF developers have said they can accommodate whatever name we
>>    would like to use. So although we in GCC are not tied to the name LLVM
>>    uses, it would be ideal for everyone to use the same attribute name.
>>
>> Thanks!
>>
>> David
>>
>> David Faust (8):
>>   dwarf: Add dw_get_die_parent function
>>   include: Add BTF tag defines to dwarf2 and btf
>>   c-family: Add BTF tag attribute handlers
>>   dwarf: create BTF decl and type tag DIEs
>>   ctfc: Add support to pass through BTF annotations
>>   dwarf2ctf: convert tag DIEs to CTF types
>>   Output BTF DECL_TAG and TYPE_TAG types
>>   testsuite: Add tests for BTF tags
>>
>>  gcc/btfout.cc                                 |  28 +++++
>>  gcc/c-family/c-attribs.cc                     |  45 +++++++
>>  gcc/ctf-int.h                                 |  29 +++++
>>  gcc/ctfc.cc                                   |  11 +-
>>  gcc/ctfc.h                                    |  17 ++-
>>  gcc/dwarf2ctf.cc                              | 115 +++++++++++++++++-
>>  gcc/dwarf2out.cc                              | 110 +++++++++++++++++
>>  gcc/dwarf2out.h                               |   1 +
>>  .../gcc.dg/debug/btf/btf-decltag-func.c       |  18 +++
>>  .../gcc.dg/debug/btf/btf-decltag-sou.c        |  34 ++++++
>>  .../gcc.dg/debug/btf/btf-decltag-typedef.c    |  15 +++
>>  .../gcc.dg/debug/btf/btf-typetag-1.c          |  20 +++
>>  .../gcc.dg/debug/dwarf2/annotation-1.c        |  29 +++++
>>  include/btf.h                                 |  17 ++-
>>  include/dwarf2.def                            |   4 +
>>  15 files changed, 482 insertions(+), 11 deletions(-)
>>  create mode 100644 gcc/ctf-int.h
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c
>>
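
To make the "type reference chain" notation used in the cover letter concrete,
the following is a minimal sketch that encodes the six BTF records listed
there for variable 'x' in a simplified table and follows the type references.
The record layout and all names (struct rec, example_btf, format_chain) are
assumptions made for illustration; this is not the kernel's struct btf_type,
nor any libbpf or GCC interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for BTF kinds; hypothetical, for illustration only. */
enum kind { K_INT, K_PTR, K_TYPE_TAG, K_DECL_TAG, K_VAR };

struct rec {
    enum kind kind;
    const char *name;   /* NULL for anonymous records such as ptr */
    unsigned type;      /* ID of the referenced record, 0 if none */
};

/* The records from the BTF listing for 'x'. IDs are 1-based as in the
   listing, so slot 0 is left unused. */
const struct rec example_btf[] = {
    [1] = { K_INT,      "int",        0 },
    [2] = { K_PTR,      NULL,         3 },
    [3] = { K_TYPE_TAG, "type-tag-1", 1 },
    [4] = { K_DECL_TAG, "decl-tag-1", 6 },
    [5] = { K_DECL_TAG, "decl-tag-2", 6 },
    [6] = { K_VAR,      "x",          2 },
};

/* Append s to buf if at least one more character fits; return the new
   length of the string in buf. */
static size_t append(char *buf, size_t len, size_t used, const char *s)
{
    if (used + 1 < len) {
        snprintf(buf + used, len - used, "%s", s);
        used += strlen(buf + used);
    }
    return used;
}

/* Follow the 'type' references starting at record 'id' and write the chain
   into buf in the "VAR(x) -> ptr -> ... -> int" notation. Assumes the table
   contains no reference cycles. */
const char *format_chain(const struct rec *t, unsigned id,
                         char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    while (id != 0) {
        const struct rec *r = &t[id];

        if (r->kind == K_VAR) {
            used = append(buf, len, used, "VAR(");
            used = append(buf, len, used, r->name);
            used = append(buf, len, used, ")");
        } else {
            used = append(buf, len, used, r->name ? r->name : "ptr");
        }
        id = r->type;
        if (id != 0)
            used = append(buf, len, used, " -> ");
    }
    return buf;
}
```

Starting at record 6, format_chain produces the chain from the variable down
to the base type, with the type tag sitting between the pointer and the int it
points to. Starting at record 4 or 5 shows that the decl tags refer to the
variable rather than the other way around.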