Re: [PATCH v4 1/1] Created tunable to force small pages on stack allocation.

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

From: Cupertino Miranda <cupertino.miranda@oracle.com>
To: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v4 1/1] Created tunable to force small pages on stack allocation.
Date: Tue, 28 Mar 2023 16:05:07 +0100	[thread overview]
Message-ID: <87bkkcj4rg.fsf@oracle.com> (raw)
In-Reply-To: <0324d4f3-6e0e-5cc3-90a3-ca042d42f000@linaro.org>


Adhemerval Zanella Netto writes:

> On 28/03/23 05:55, Cupertino Miranda via Libc-alpha wrote:
>> Created tunable glibc.pthread.stack_hugetlb to control when hugepages
>> can be used for stack allocation.
>> In case THP are enabled and glibc.pthread.stack_hugetlb is set to
>> 0, glibc will madvise the kernel not to use allow hugepages for stack
>> allocations.
>>
>> Changed from v1:
>>  - removed the __malloc_thp_mode calls to check if hugetlb is
>>    enabled.
>>
>> Changed from v2:
>>  - Added entry in manual/tunables.texi
>>  - Fixed tunable default to description
>>  - Code style corrections.
>>
>> Changes from v3:
>>  - Improve tunables.texi.
>> ---
>>  manual/tunables.texi          | 14 ++++++++++++++
>>  nptl/allocatestack.c          |  6 ++++++
>>  nptl/nptl-stack.c             |  1 +
>>  nptl/nptl-stack.h             |  3 +++
>>  nptl/pthread_mutex_conf.c     |  8 ++++++++
>>  sysdeps/nptl/dl-tunables.list |  6 ++++++
>>  6 files changed, 38 insertions(+)
>>
>> diff --git a/manual/tunables.texi b/manual/tunables.texi
>> index 70dd2264c5..182f644851 100644
>> --- a/manual/tunables.texi
>> +++ b/manual/tunables.texi
>> @@ -459,6 +459,20 @@ registration on behalf of the application.
>>  Restartable sequences are a Linux-specific extension.
>>  @end deftp
>>
>> +@deftp Tunable glibc.pthread.stack_hugetlb
>> +This tunable allows to configure pthread creation stack allocation never to use
>> +hugetlbs.
>> +This is more of an RSS optimization, for example in scenarios where many
>> +threads get created, keeping RSS to a minimum, but also allowing hugestlbs to
>> +be used for malloc allocations.
>> +This tunable only affects stacks allocated through thread creation process,
>> +i.e. it has no effect on stacks assigned with pthread_attr_setstack.
>> +
>> +The default is @samp{1} and preservs glibc default behaviour.
>> +When set to @samp{0}, it advises the kernel not to use hugetlbs for stack
>> +allocation.
>> +@end deftp
>> +
>
> We use 'Huge Pages' on glibc.malloc.hugetlb (hugestlbs sounds strange), RSS is not
> defined in any place (even it is a common Linux VM term), pthread_attr_setstack
> needs to be within a @code.
>
> What about the following:
>
>   This tunable controls whether to use Huge Pages in the stacks created by
>   @code{pthread_create}.  This tunable only affects the stacks created by
>   @theglibc{}, it has not effect on stack assigned with @code{pthread_attr_setstack}.
>
>   The default is @samp{1} where the system default value is used.  Setting
>   its value to @code{0} enables the use of @code{madvise} with @code{MADV_NOHUGEPAGE}
>   after stack creation with @code{mmap}.
>
>   This is a memory utilization optimization, since internal glibc setup of either
>   the thread descriptor and the guard page might force the kernel to move the
>   thread stack originally backup by Huge Pages to default pages.
>
> (maybe expand a bit the problem this tunable aims to improve)
>

It looks good to me. Apologies for the documentation problems and thanks
for the correction. Next time I will be more careful.

Cupertino


>>  @node Hardware Capability Tunables
>>  @section Hardware Capability Tunables
>>  @cindex hardware capability tunables
>> diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
>> index c7adbccd6f..f9d8cdfd08 100644
>> --- a/nptl/allocatestack.c
>> +++ b/nptl/allocatestack.c
>> @@ -369,6 +369,12 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
>>  	  if (__glibc_unlikely (mem == MAP_FAILED))
>>  	    return errno;
>>
>> +	  /* Do madvise in case the tunable glibc.pthread.stack_hugetlb is
>> +	     set to 0, disabling hugetlb.  */
>> +	  if (__glibc_unlikely (__nptl_stack_hugetlb == 0)
>> +	      && __madvise (mem, size, MADV_NOHUGEPAGE) != 0)
>> +	    return errno;
>> +
>>  	  /* SIZE is guaranteed to be greater than zero.
>>  	     So we can never get a null pointer back from mmap.  */
>>  	  assert (mem != NULL);
>> diff --git a/nptl/nptl-stack.c b/nptl/nptl-stack.c
>> index 5eb7773575..e829711cb5 100644
>> --- a/nptl/nptl-stack.c
>> +++ b/nptl/nptl-stack.c
>> @@ -21,6 +21,7 @@
>>  #include <pthreadP.h>
>>
>>  size_t __nptl_stack_cache_maxsize = 40 * 1024 * 1024;
>> +int32_t __nptl_stack_hugetlb = 1;
>>
>>  void
>>  __nptl_stack_list_del (list_t *elem)
>> diff --git a/nptl/nptl-stack.h b/nptl/nptl-stack.h
>> index 34f8bbb15e..cf90b27c2b 100644
>> --- a/nptl/nptl-stack.h
>> +++ b/nptl/nptl-stack.h
>> @@ -27,6 +27,9 @@
>>  /* Maximum size of the cache, in bytes.  40 MiB by default.  */
>>  extern size_t __nptl_stack_cache_maxsize attribute_hidden;
>>
>> +/* Should allow stacks to use hugetlb. (1) is default.  */
>> +extern int32_t __nptl_stack_hugetlb;
>> +
>>  /* Check whether the stack is still used or not.  */
>>  static inline bool
>>  __nptl_stack_in_use (struct pthread *pd)
>> diff --git a/nptl/pthread_mutex_conf.c b/nptl/pthread_mutex_conf.c
>> index 329c4cbb8f..60ef9095aa 100644
>> --- a/nptl/pthread_mutex_conf.c
>> +++ b/nptl/pthread_mutex_conf.c
>> @@ -45,6 +45,12 @@ TUNABLE_CALLBACK (set_stack_cache_size) (tunable_val_t *valp)
>>    __nptl_stack_cache_maxsize = valp->numval;
>>  }
>>
>> +static void
>> +TUNABLE_CALLBACK (set_stack_hugetlb) (tunable_val_t *valp)
>> +{
>> +  __nptl_stack_hugetlb = (int32_t) valp->numval;
>> +}
>> +
>>  void
>>  __pthread_tunables_init (void)
>>  {
>> @@ -52,5 +58,7 @@ __pthread_tunables_init (void)
>>                 TUNABLE_CALLBACK (set_mutex_spin_count));
>>    TUNABLE_GET (stack_cache_size, size_t,
>>                 TUNABLE_CALLBACK (set_stack_cache_size));
>> +  TUNABLE_GET (stack_hugetlb, int32_t,
>> +	       TUNABLE_CALLBACK (set_stack_hugetlb));
>>  }
>>  #endif
>
> Different than glibc.malloc.hugetlb, we always enable it even if the system does not
> support THP.  I think it should be ok (although not in sync with glibc.malloc.hugetlb
> support).
>
>> diff --git a/sysdeps/nptl/dl-tunables.list b/sysdeps/nptl/dl-tunables.list
>> index bd1ddb121d..4cde9500b6 100644
>> --- a/sysdeps/nptl/dl-tunables.list
>> +++ b/sysdeps/nptl/dl-tunables.list
>> @@ -33,5 +33,11 @@ glibc {
>>        maxval: 1
>>        default: 1
>>      }
>> +    stack_hugetlb {
>> +      type: INT_32
>> +      minval: 0
>> +      maxval: 1
>> +      default: 1
>> +    }
>>    }
>>  }

     prev parent reply	other threads:[~2023-03-28 15:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-28  8:55 [PATCH v4 0/1] *** " Cupertino Miranda
2023-03-28  8:55 ` [PATCH v4 1/1] " Cupertino Miranda
2023-03-28 13:57   ` Adhemerval Zanella Netto
2023-03-28 15:05     ` Cupertino Miranda [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkkcj4rg.fsf@oracle.com \
    --to=cupertino.miranda@oracle.com \
    --cc=adhemerval.zanella@linaro.org \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).