From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi1-x22c.google.com (mail-oi1-x22c.google.com [IPv6:2607:f8b0:4864:20::22c]) by sourceware.org (Postfix) with ESMTPS id F24903858014 for ; Tue, 8 Feb 2022 19:34:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org F24903858014 Received: by mail-oi1-x22c.google.com with SMTP id u3so100264oiv.12 for ; Tue, 08 Feb 2022 11:34:45 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:references:from:in-reply-to :content-transfer-encoding; bh=lJW4GjLS6f5vocQws4BvC7qAtjfFrd7fYn008fP2u7A=; b=PvNT4X7/m9pDYbIsBlCUMciV8Jw8HjZTn4fNUChVzKi/7vcFTlb3gRj3xbOwkZR8FM rN0KX5CBHBVz2mpyHY4vZEFMcFeRfUwA95KhmACG2SCUCWSPTC81J2mTgGmgSLb62x5T +bSngIkPZ0uN35/jpiDUVC0OOolHyizfBnMlExEG4TqlQ6dXYF2vDSx4ZaGhVXszQpr8 ch78E9fhQ5F9ibLEixfjW/GIXQUcldSgrBvfApfsvOcPLvbXT7LDz7bbQa6Qnmsk5llf j1cXsT1wwiWDQWF95cqRqTiREIYYe6evYEdR33dMtZkRXg8e5CEy5pmxvCJ27/RIlrJB VM3g== X-Gm-Message-State: AOAM532Lm5exTBp/4o88CMwD8ksbqw/ADgvUZxMWVUlQi+SaIwdID3Ta vexzSU5DFkFfEpE4zCXSYCRlWA== X-Google-Smtp-Source: ABdhPJxbww0FU2XBg7S76xLXmZIAUNUzIxUmXEpPBwcYYcgSj/nncvtyg5R5qLV4gDL2Xk5xzqTzdw== X-Received: by 2002:a05:6808:3:: with SMTP id u3mr1262347oic.267.1644348885267; Tue, 08 Feb 2022 11:34:45 -0800 (PST) Received: from ?IPV6:2804:431:c7ca:733:86a9:ad5:adef:2f3e? ([2804:431:c7ca:733:86a9:ad5:adef:2f3e]) by smtp.gmail.com with ESMTPSA id v10sm5574810oto.53.2022.02.08.11.34.44 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 08 Feb 2022 11:34:45 -0800 (PST) Message-ID: Date: Tue, 8 Feb 2022 16:34:42 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.6.0 Subject: Re: [PATCH v2] linux: fix accuracy of get_nprocs and get_nprocs_conf [BZ #28865] Content-Language: en-US To: "Dmitry V. Levin" , libc-alpha@sourceware.org References: <20220205212402.GA5233@altlinux.org> <20220207135736.GA31310@altlinux.org> From: Adhemerval Zanella In-Reply-To: <20220207135736.GA31310@altlinux.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-12.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Feb 2022 19:34:48 -0000 On 07/02/2022 10:57, Dmitry V. Levin wrote: > get_nprocs() and get_nprocs_conf() use various methods to obtain an > accurate number of processors. Re-introduce __get_nprocs_sched() as > a source of information, and fix the order in which these methods are > used to return the most accurate information. The primary source of > information used in both functions remains unchanged. > > This also changes __get_nprocs_sched() error return value from 2 to 0, > but all its users are already prepared to handle that. > > Old behavior: > get_nprocs: > /sys/devices/system/cpu/online -> /proc/stat -> 2 > get_nprocs_conf: > /sys/devices/system/cpu/ -> /proc/stat -> 2 > > New behavior: > get_nprocs: > /sys/devices/system/cpu/online -> /proc/stat -> sched_getaffinity -> 2 > get_nprocs_conf: > /sys/devices/system/cpu/ -> /proc/stat -> sched_getaffinity -> 2 > > Fixes: 342298278e ("linux: Revert the use of sched_getaffinity on get_nproc") > Closes: BZ #28865 I am still not fully sure if sched_getaffinity is a correct fallback to an API that should return a system overview, the issue that lead sched_getaffinity removal was that it is subject to per-process filtering (either by seccomp, cgroup, etc.) and it might trigger some wrong behavior in some programs (such as monitoring tools and jvms). However if the environment does not provide a way to actually obtain such information I guess sched_getaffinity should not make things worse (it does not make sense to assume multiprocessor if the process is not allowed more than one CPU, and monitoring tools should not work if sysfs/procfs are not present). LGTM with some nits below. I think you might use brackets instead of square bracket in title (to trigger the bugzilla scripts). Reviewed-by: Adhemerval Zanella > --- > > v2: prioritize /proc/stat over sched_getaffinity in get_nprocs(). > > sysdeps/unix/sysv/linux/getsysstats.c | 94 ++++++++++++++++++--------- > 1 file changed, 63 insertions(+), 31 deletions(-) > > diff --git a/sysdeps/unix/sysv/linux/getsysstats.c b/sysdeps/unix/sysv/linux/getsysstats.c > index c98c8ce3d4..ef77ed193c 100644 > --- a/sysdeps/unix/sysv/linux/getsysstats.c > +++ b/sysdeps/unix/sysv/linux/getsysstats.c > @@ -50,9 +50,8 @@ __get_nprocs_sched (void) > is an arbitrary values assuming such systems should be rare and there > is no offline cpus. */ > return max_num_cpus; > - /* Some other error. 2 is conservative (not a uniprocessor system, so > - atomics are needed). */ > - return 2; > + /* Some other error. */ > + return 0; > } > > static char * > @@ -108,22 +107,19 @@ next_line (int fd, char *const buffer, char **cp, char **re, > } > > static int > -get_nproc_stat (char *buffer, size_t buffer_size) > +get_nproc_stat (void) > { > + enum { buffer_size = 1024 }; > + char buffer[buffer_size]; > char *buffer_end = buffer + buffer_size; > char *cp = buffer_end; > char *re = buffer_end; > - > - /* Default to an SMP system in case we cannot obtain an accurate > - number. */ > - int result = 2; > + int result = 0; > > const int flags = O_RDONLY | O_CLOEXEC; > int fd = __open_nocancel ("/proc/stat", flags); > if (fd != -1) > { > - result = 0; > - > char *l; > while ((l = next_line (fd, buffer, &cp, &re, buffer_end)) != NULL) > /* The current format of /proc/stat has all the cpu* entries > @@ -139,8 +135,8 @@ get_nproc_stat (char *buffer, size_t buffer_size) > return result; > } > > -int > -__get_nprocs (void) > +static int > +get_nprocs_cpu_online (void) > { > enum { buffer_size = 1024 }; > char buffer[buffer_size]; > @@ -179,7 +175,8 @@ __get_nprocs (void) > } > } > > - result += m - n + 1; > + if (m >= n) > + result += m - n + 1; > > l = endp; > if (l < re && *l == ',') > @@ -188,28 +185,18 @@ __get_nprocs (void) > while (l < re && *l != '\n'); > > __close_nocancel_nostatus (fd); > - > - if (result > 0) > - return result; > } > > - return get_nproc_stat (buffer, buffer_size); > + return result; > } > -libc_hidden_def (__get_nprocs) > -weak_alias (__get_nprocs, get_nprocs) > - > > -/* On some architectures it is possible to distinguish between configured > - and active cpus. */ > -int > -__get_nprocs_conf (void) > +static int > +get_nprocs_cpu (void) > { > - /* Try to use the sysfs filesystem. It has actual information about > - online processors. */ > + int count = 0; > DIR *dir = __opendir ("/sys/devices/system/cpu"); > if (dir != NULL) > { > - int count = 0; > struct dirent64 *d; > > while ((d = __readdir64 (dir)) != NULL) > @@ -224,12 +211,57 @@ __get_nprocs_conf (void) > > __closedir (dir); > > - return count; > } > + return count; > +} > > - enum { buffer_size = 1024 }; > - char buffer[buffer_size]; > - return get_nproc_stat (buffer, buffer_size); > +static int > +get_nprocs_fallback (void) > +{ > + int result; > + > + /* Try /proc/stat first. */ > + result = get_nproc_stat (); > + if (result) No implicit check. > + return result; > + > + /* Try sched_getaffinity. */ > + result = __get_nprocs_sched (); > + if (result) Ditto. > + return result; > + > + /* We failed to obtain an accurate number. Be conservative: return > + the smallest number meaning that this is not a uniprocessor system, > + so atomics are needed. */ > + return 2; > +} > + > +int > +__get_nprocs (void) > +{ > + /* Try /sys/devices/system/cpu/online first. */ > + int result = get_nprocs_cpu_online (); > + if (result) Ditto. > + return result; > + > + /* Fall back to /proc/stat and sched_getaffinity. */ > + return get_nprocs_fallback (); > +} > +libc_hidden_def (__get_nprocs) > +weak_alias (__get_nprocs, get_nprocs) > + > +/* On some architectures it is possible to distinguish between configured > + and active cpus. */ > +int > +__get_nprocs_conf (void) > +{ > + /* Try /sys/devices/system/cpu/ first. */ > + int result = get_nprocs_cpu (); > + if (result) Ditto. > + return result; > + > + /* Fall back to /proc/stat and sched_getaffinity. */ > + return get_nprocs_fallback (); > } > libc_hidden_def (__get_nprocs_conf) > weak_alias (__get_nprocs_conf, get_nprocs_conf) >