Date: Mon, 7 Feb 2022 14:51:13 +0300
From: "Dmitry V. Levin"
To: Adhemerval Zanella
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH] linux: fix accuracy of get_nprocs and get_nprocs_conf [BZ #28865]
Message-ID: <20220207115113.GA29197@altlinux.org>
References: <20220205212402.GA5233@altlinux.org> <2f8633c5-6335-b7aa-e735-65dc36322d7f@linaro.org>
In-Reply-To: <2f8633c5-6335-b7aa-e735-65dc36322d7f@linaro.org>

Hi,

On Mon, Feb 07, 2022 at 08:25:11AM -0300, Adhemerval Zanella via Libc-alpha wrote:
> On 05/02/2022 18:24, Dmitry V. Levin wrote:
> > get_nprocs() and get_nprocs_conf() use various methods to obtain an
> > accurate number of processors.
> > Re-introduce __get_nprocs_sched() as
> > a source of information, and fix the order in which these methods are
> > used to return the most accurate information.  The primary source of
> > information used in both functions remains unchanged.
> >
> > This also changes __get_nprocs_sched() error return value from 2 to 0,
> > but all its users are already prepared to handle that.
> >
> > Old behavior:
> >   get_nprocs:
> >     /sys/devices/system/cpu/online -> /proc/stat -> 2
> >   get_nprocs_conf:
> >     /sys/devices/system/cpu/ -> /proc/stat -> 2
> >
> > New behavior:
> >   get_nprocs:
> >     /sys/devices/system/cpu/online -> sched_getaffinity -> /proc/stat -> 2
> >   get_nprocs_conf:
> >     /sys/devices/system/cpu/ -> /proc/stat -> sched_getaffinity -> 2
> >
> > Fixes: 342298278e ("linux: Revert the use of sched_getaffinity on get_nproc")
> > Closes: BZ #28865
>
> I think we are circling back on this, on BZ#27645 [1] we changed get_nprocs
> to use sched_getaffinity and then we have to revert it with BZ#28310 [2] because
> it introduced regression on some monitoring tools [3].
>
> In fact from BZ#27645 and BZ#28624 [4] discussion I think we can't reliable use
> sched_getaffinity because since some container environment returns a synthetic
> mask that might break some programs.  Also, sched_getaffinity returns a
> 'per-process' mask instead of system-wide as we discussed in previous threads.
> It should be ok to get adjusting internal tuning (as for malloc).
>
> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=27645
> [2] https://sourceware.org/bugzilla/show_bug.cgi?id=28310
> [3] https://sourceware.org/bugzilla/show_bug.cgi?id=27645#c5
> [4] https://sourceware.org/bugzilla/show_bug.cgi?id=28624

Is there any realistic case where the hardcoded fallback value 2 is a more
accurate estimate of the number of processors than sched_getaffinity?
I suppose there is none.  Also, /sys is consulted first anyway.

I wish I had seen commit 342298278e earlier, so that I could have raised
objections before it was committed.
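
For illustration only, the proposed get_nprocs order could be sketched
roughly as below.  This is not the patch and not the glibc implementation:
all function names here (count_cpu_list, nprocs_from_sysfs,
nprocs_from_affinity, nprocs_from_proc_stat, sketch_get_nprocs) are made up
for this example, and it uses stdio and a fixed-size cpu_set_t for brevity.
Every helper returns 0 on failure so the caller can fall through to the next
source, matching the "error return value 0" convention described above.

```c
#define _GNU_SOURCE
#include <ctype.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Count CPUs in a kernel CPU-list string such as "0-3,8-11\n".
   Returns 0 if the string is empty or cannot be parsed.  */
static int count_cpu_list(const char *s)
{
    int count = 0;
    while (*s != '\0' && *s != '\n') {
        char *end;
        if (!isdigit((unsigned char) *s))
            return 0;
        unsigned long lo = strtoul(s, &end, 10);
        unsigned long hi = lo;
        if (*end == '-') {
            s = end + 1;
            if (!isdigit((unsigned char) *s))
                return 0;
            hi = strtoul(s, &end, 10);
            if (hi < lo)
                return 0;
        }
        count += (int) (hi - lo + 1);
        s = end;
        if (*s == ',')
            s++;
    }
    return count;
}

/* Source 1: the list of online CPUs exported by sysfs.  */
static int nprocs_from_sysfs(void)
{
    char buf[4096];
    FILE *f = fopen("/sys/devices/system/cpu/online", "r");
    if (f == NULL)
        return 0;
    int n = fgets(buf, sizeof buf, f) != NULL ? count_cpu_list(buf) : 0;
    fclose(f);
    return n;
}

/* Source 2: the affinity mask of the calling thread.  A fixed-size
   cpu_set_t is an assumption of this sketch; the call can fail on
   machines with more CPUs than the set can hold.  */
static int nprocs_from_affinity(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return 0;
    return CPU_COUNT(&set);
}

/* Source 3: count the per-CPU "cpuN" lines in /proc/stat.  */
static int nprocs_from_proc_stat(void)
{
    char line[4096];
    int n = 0;
    FILE *f = fopen("/proc/stat", "r");
    if (f == NULL)
        return 0;
    while (fgets(line, sizeof line, f) != NULL)
        if (strncmp(line, "cpu", 3) == 0 && isdigit((unsigned char) line[3]))
            n++;
    fclose(f);
    return n;
}

/* Try the sources in the proposed order; 2 is the last-resort guess.  */
int sketch_get_nprocs(void)
{
    int n = nprocs_from_sysfs();
    if (n > 0)
        return n;
    n = nprocs_from_affinity();
    if (n > 0)
        return n;
    n = nprocs_from_proc_stat();
    if (n > 0)
        return n;
    return 2;
}
```

With this ordering, the constant 2 is returned only when /sys, the affinity
mask, and /proc/stat have all failed to produce a positive count.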

Please note that BZ #28865 is a real regression that we had to patch;
this means glibc must behave properly in that environment without any
additional tuning.

I suggest installing this fix and seeing what can be done later in the
unlikely case that anything else breaks.


-- 
ldv