From: Dmitry Vyukov
Date: Thu, 28 Sep 2023 07:44:21 -0700
Subject: Re: [RFC PATCH v2 1/4] rseq: Add sched_state field to struct rseq
To: Florian Weimer
Cc: mathieu.desnoyers@efficios.com, David.Laight@aculab.com,
    alexander@mihalicyn.com, andrealmeid@igalia.com, boqun.feng@gmail.com,
    brauner@kernel.org, carlos@redhat.com, ckennelly@google.com,
    corbet@lwn.net, dave@stgolabs.net, dvhart@infradead.org,
    goldstein.w.n@gmail.com, hpa@zytor.com, libc-alpha@sourceware.org,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
    longman@redhat.com, mingo@redhat.com, paulmck@kernel.org,
    peterz@infradead.org, pjt@google.com, posk@posk.io, rostedt@goodmis.org,
    tglx@linutronix.de
In-Reply-To: <87bkdmznl6.fsf@oldenburg.str.redhat.com>
References: <2c421e36-a749-7dc3-3562-7a8cf256df3c@efficios.com>
    <20230926205215.472650-1-dvyukov@google.com>
    <875y3wp6au.fsf@oldenburg.str.redhat.com>
    <87bkdmznl6.fsf@oldenburg.str.redhat.com>

On Thu, 28 Sept 2023 at 01:52, Florian Weimer wrote:
>
> * Dmitry Vyukov:
>
> > On Tue, 26 Sept 2023 at 21:51, Florian Weimer wrote:
> >>
> >> * Dmitry Vyukov:
> >>
> >> > In reality it's a bit more involved since the field is actually 8
> >> > bytes and only partially overlaps with rseq.cpu_id_start (it's an
> >> > 8-byte pointer with high 4 bytes overlap rseq.cpu_id_start):
> >> >
> >> > https://github.com/google/tcmalloc/blob/229908285e216cca8b844c1781bf16b838128d1b/tcmalloc/internal/percpu.h#L101-L165
> >>
> >> This does not compose with other rseq users, as noted in the sources:
> >>
> >>   // Note: this makes __rseq_abi.cpu_id_start unusable for its original purpose.
> >>
> >> For a core library such a malloc replacement, that is a very bad trap.
> >
> > I agree. I wouldn't do this if there were other options. That's why I
> > am looking for official kernel support for this.
> > If we would have a separate 8 bytes that are overwritten with 0 when a
> > thread is descheduled, that would be perfect.
>
> That only solves part of the problem because these fields would still
> have to be locked to tcmalloc. I think you'd need a rescheduling
> counter, then every library could keep their reference values in
> library-private thread-local storage.

This unfortunately won't work for tcmalloc. This data is accessed on the
very hot path of malloc/free. We need a ready-to-use pointer in TLS
that the kernel resets to 0 (or to some user-space specified value).
Doing two separate loads for counters in different cache lines would be
too expensive.

It may be possible to make several libraries use this feature with an
array of notifications (see rseq_desched_notif_t in my previous email).
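
To make the cost difference concrete, here is a rough user-space sketch
of the two fast paths being compared. Nothing in it is real kernel ABI:
struct sim_rseq, its fields, sim_deschedule() and the helper names are
all hypothetical, and the "kernel" is simulated by an ordinary function
call. It also ignores the race between the check and the use of the
pointer, which real code would have to handle (for example inside an
rseq critical section).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-thread state the kernel would maintain.
 * Neither field exists in the real struct rseq. */
struct sim_rseq {
    uint64_t resched_count; /* counter scheme: bumped on every deschedule */
    void *cached_ptr;       /* pointer scheme: reset to NULL on every deschedule */
};

static _Thread_local struct sim_rseq sim_rseq_area;

/* Library-private TLS, used only by the counter scheme. */
static _Thread_local uint64_t lib_seen_count = UINT64_MAX;
static _Thread_local void *lib_cached_ptr;

static char fake_slab[64];  /* placeholder for a per-CPU slab */
static int slow_path_calls; /* counts refills, for the demo only */

/* Simulates the kernel descheduling the current thread. */
void sim_deschedule(void)
{
    sim_rseq_area.resched_count++;
    sim_rseq_area.cached_ptr = NULL;
}

/* Placeholder for the expensive lookup of the current CPU's slab. */
static void *slow_path_lookup(void)
{
    slow_path_calls++;
    return fake_slab;
}

/* Counter scheme: the shared counter and the private reference value
 * are both loaded and compared before the cached pointer is trusted. */
void *get_ptr_counter(void)
{
    uint64_t now = sim_rseq_area.resched_count;
    if (lib_seen_count != now) {
        lib_cached_ptr = slow_path_lookup();
        lib_seen_count = now;
    }
    return lib_cached_ptr;
}

/* Pointer scheme: a single load; NULL means the thread was descheduled
 * since the last refill. */
void *get_ptr_reset(void)
{
    void *p = sim_rseq_area.cached_ptr;
    if (p == NULL) {
        p = slow_path_lookup();
        sim_rseq_area.cached_ptr = p;
    }
    return p;
}

/* Runs a fixed sequence of events and returns how many refills it took. */
int demo_slow_path_calls(void)
{
    slow_path_calls = 0;
    sim_rseq_area.cached_ptr = NULL;
    lib_seen_count = UINT64_MAX;

    get_ptr_counter();  /* first use: refill */
    get_ptr_counter();  /* fast path */
    get_ptr_reset();    /* first use: refill */
    get_ptr_reset();    /* fast path */
    sim_deschedule();   /* invalidates both caches */
    get_ptr_counter();  /* refill */
    get_ptr_reset();    /* refill */

    assert(get_ptr_counter() == get_ptr_reset());
    return slow_path_calls;
}
```

Both schemes refill equally often; the difference is only in the fast
path. get_ptr_reset() touches one TLS location, while get_ptr_counter()
has to read the shared counter and the private reference value, which in
a real layout would likely sit in different cache lines, before it can
use the cached pointer.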