Message-ID: <0947bd71-f727-ec7d-5759-84a2183d3c69@efficios.com>
Date: Sat, 17 Sep 2022 17:25:11 +0200
Subject: Re: [libc-coord] Re: RSEQ symbols: __rseq_size, __rseq_flags vs __rseq_feature_size
To: Florian Weimer
Cc: Chris Kennelly, libc-coord@lists.openwall.com, "carlos@redhat.com", libc-alpha, szabolcs.nagy@arm.com
References: <87y1uj49t4.fsf@mid.deneb.enyo.de> <87fsgryphl.fsf@mid.deneb.enyo.de> <87mtayvz39.fsf@mid.deneb.enyo.de>
From: Mathieu Desnoyers
In-Reply-To: <87mtayvz39.fsf@mid.deneb.enyo.de>

On 2022-09-17 16:45, Florian Weimer wrote:
> * Mathieu Desnoyers:
>
>> On 2022-09-16 23:32, Florian Weimer wrote:
>>> * Chris Kennelly:
>>>
>>>>> If the kernel does not currently overwrite the padding, glibc can do
>>>>> its own per-thread initialization there to support its malloc
>>>>> implementation (because the padding is undefined today from an
>>>>> application perspective). That is, we would initialize these
>>>>> invisible vCPU IDs the same way we assign arenas today.
>>>>> That would cover this specific malloc use case only, of course.
>>>
>>>> If a user program updates to a new kernel before glibc does, would it be
>>>> able to easily take advantage of it?
>>>
>>> No, as far as I understand it, there is presently no signaling from
>>> kernel to applications that bypasses the rseq area registration. So
>>> the only thing you could do is to unregister and re-register with a
>>> compatible value. And that is of course quite undefined and assumes
>>> that you can do this early enough during the life-time of each thread.
>>>
>>> But if we have the extension handshake, I'll expect us to backport it
>>> quite widely, after some time to verify that it works with CRIU etc.
>>
>> I don't think this is what Chris is asking here.
>>
>> I think the requirement here is to make sure that the extensibility
>> scheme we come up with will allow to extend struct rseq simply by
>> upgrading the kernel, without any need to upgrade glibc. (that's indeed
>> a requirement of mine). So a new application and a new kernel can use a
>> newly available extended field, even with an old glibc.
>
> I took it for granted that we'd like to get libc out of the picture
> for future changes, so I interpreted Chris' question in the context of
> the initial switch (i.e., enabling rseq features that need
> extensibility on currently released glibc, without upgrading glibc).

I think it makes sense to require that glibc needs to be upgraded to a
new version before applications can use the feature fields beyond the
initial 32 bytes.

What I care about is that after we have an extensible scheme in glibc,
then there is no need to upgrade glibc afterwards when additional
features appear.

>
>> If we want to keep the kernel ABI as simple as we can, then we just
>> expose the rseq feature size (and required alignment), and don't expose
>> any rseq feature flags. This in turn means that glibc would have to
>> somehow expose the rseq feature size in its ABI.
>> If glibc decides against exposing this rseq feature size symbol, then
>> it would be up to the application to combine information about
>> __rseq_size and getauxval(rseq feature size) to figure out which fields
>> are actually populated. It would "work", but chances are that some users
>> will get it wrong. It seems simpler for a user to simply do:
>>
>>   if (__rseq_feature_size >= offsetofend(struct rseq, vm_vcpu_id))
>>
>> to validate whether a given field is indeed populated.
>>
>> The rseq feature size approach would scale to very large feature
>> numbers. It would *not* allow deprecation of fields after they are
>> published, but I see this as a gain in simplicity for users of the ABI,
>> even though we lose a knob as kernel developers.
>
> I think glibc can register rseq with a new flag once it sees
> AT_RSEQ_FEATURE_SIZE in the auxiliary vector (even if it's 32).

I understand that you propose adding an rseq registration flag to be
passed to the rseq system call when glibc supports extended rseq. I
wonder whether it's useful at all.

We implicitly know the relevant information through the rseq_len
parameter of the rseq syscall.

If rseq_len == 32, this means glibc allocated a 32-byte struct rseq area
(either because it was the original structure size, or because
getauxval() actually reported that the supported feature size is <= 32).
The kernel is therefore free to use all fields below 32 bytes. This
basically means that we get three 4-byte word fields available for
user-space use right away without a glibc upgrade. The application just
needs to use getauxval() to know whether the kernel populates those
fields or not.

If rseq_len > 32, this means glibc used the getauxval() information to
know the area size. Then the kernel can validate that the size is large
enough to contain all features it supports upon rseq registration.

> That flag would naturally end up in __rseq_flags.

I'm still unsure that we need to pass a flag to the kernel at all.
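
For illustration, the offsetofend() check quoted above could look roughly
like the sketch below. The struct layout is a hypothetical stand-in (the
original 20 bytes of fields followed by vm_vcpu_id as an assumed first
extension field), the names struct rseq_sketch and rseq_field_available()
are made up for this example, and the feature size is passed in as a
parameter because no __rseq_feature_size symbol exists at this point in
the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past a member (kernel-style helper). */
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/*
 * Hypothetical layout, for illustration only: the original fields
 * (20 bytes) followed by one assumed extension field.
 */
struct rseq_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t vm_vcpu_id;	/* assumed first extension field */
};

/*
 * Returns nonzero when a field whose end offset is field_end lies
 * entirely within the feature_size bytes populated by the kernel.
 */
int rseq_field_available(uint32_t feature_size, size_t field_end)
{
	assert(field_end > 0);
	return feature_size >= field_end;
}
```

With this helper, a caller would test
rseq_field_available(size, offsetofend(struct rseq_sketch, vm_vcpu_id))
before reading the field, where size comes from whichever source ends up
exposing the feature size.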
> For future extensions __rseq_size should work directly.

So for extensions in the last three 4-byte words of struct rseq
(currently padding), applications would have to check both __rseq_flags
and compare the offset of the end of the field with __rseq_size. Then
for fields beyond the first 32 bytes, just checking __rseq_size would
suffice.

However, we should be careful: the semantics of __rseq_size should then
*not* include padding anymore, but rather end exactly after the last
supported feature field.

>
> But as I said, we better use all the padding at once during the first
> step. Or we could add even more stuff to move past the current 32,
> then we wouldn't need the flag dance. 8-)

Adding semi-useful information in those words means we put a hard
requirement on users to upgrade their glibc before they can access the
more important feature fields we would rather make available first.

So with your proposal, the application-level users would be expected to
do something like this:

static uint32_t local_rseq_feature_size;

if (__rseq_flags & RSEQ_FLAG_EXTENDED)
	local_rseq_feature_size = __rseq_size;
else
	local_rseq_feature_size = 20;	/* after last orig. field */

and then use:

if (local_rseq_feature_size >= offsetofend(struct rseq, field))

as a check for feature availability.

I just find it more likely that users may get it wrong compared to
having the rseq feature size already available in a new libc
__rseq_feature_size symbol. On the upside there is then no need to
export an additional symbol.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
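
As a self-contained variant of the flag-based fallback discussed in this
message, the sketch below wraps the logic in a function so it can be
compiled in isolation. The flag value and the names
RSEQ_FLAG_EXTENDED_SKETCH, RSEQ_ORIG_FEATURE_SIZE and
rseq_effective_feature_size() are assumptions made for this example; the
thread does not define a value for the flag, and the flags and size are
passed as parameters rather than read from the libc symbols.

```c
#include <stdint.h>

/* Hypothetical flag bit; the thread does not assign a value. */
#define RSEQ_FLAG_EXTENDED_SKETCH	(1U << 0)

/* End offset of the last original field of struct rseq (flags). */
#define RSEQ_ORIG_FEATURE_SIZE		20U

/*
 * Effective feature size under the flag-based proposal: trust
 * rseq_size only when the extended flag is set, otherwise fall back
 * to the end of the original fields.
 */
uint32_t rseq_effective_feature_size(uint32_t rseq_flags, uint32_t rseq_size)
{
	if (rseq_flags & RSEQ_FLAG_EXTENDED_SKETCH)
		return rseq_size;
	return RSEQ_ORIG_FEATURE_SIZE;
}
```

The two-step nature of this computation (flag test, then size
comparison) is the part that the message above considers error-prone for
application authors.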