public inbox for libc-alpha@sourceware.org
From: Siddhesh Poyarekar <siddhesh@sourceware.org>
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>,
	libc-alpha@sourceware.org
Cc: Norbert Manthey <nmanthey@conp-solutions.com>,
	Guillaume Morin <guillaume@morinfr.org>
Subject: Re: [PATCH v2 0/4] malloc: Improve Huge Page support
Date: Wed, 18 Aug 2021 23:41:00 +0530
Message-ID: <e175af42-0bbc-c39a-1692-ba11a08f04de@sourceware.org>
In-Reply-To: <20210818142000.128752-1-adhemerval.zanella@linaro.org>

On 8/18/21 7:49 PM, Adhemerval Zanella wrote:
> Linux currently supports two ways to use Huge Pages: either by using
> specific flags directly with the syscall (MAP_HUGETLB for mmap(), or
> SHM_HUGETLB for shmget()), or by using Transparent Huge Pages (THP)
> where the kernel will try to move allocated anonymous pages to Huge
> Page blocks transparently to the application.
> 
> Also, THP currently supports three different modes [1]: 'never', 'madvise',
> and 'always'.  The 'never' mode is self-explanatory and 'always' will enable
> THP for all anonymous memory.  However, 'madvise' is still the default
> for some systems and for such cases THP will only be used if the memory
> range is explicitly advertised by the program through a
> madvise(MADV_HUGEPAGE) call.
> 
> This patchset adds two new tunables to improve malloc() support for
> Huge Pages:

I wonder if this could be done with just the one tunable, 
glibc.malloc.hugepages where:

0: Disabled (default)
1: Transparent, where we emulate "always" behaviour of THP
2: HugeTLB enabled with default hugepage size
<size>: HugeTLB enabled with the specified page size

When using HugeTLB, we don't really need to bother with THP so they seem 
mutually exclusive.
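
To make the value mapping above concrete, here is a minimal sketch of how
such a single tunable could be decoded.  Everything in it is an assumption
for illustration: HugePageMode, HugePageConfig and decode_hugepages_tunable
are hypothetical names, not existing glibc code, and the real tunable
plumbing would look different.

```cpp
#include <cstddef>

// Hypothetical decoding of the proposed glibc.malloc.hugepages tunable.
// None of these names exist in glibc; they only illustrate the mapping
// 0 / 1 / 2 / <size> described above.
enum class HugePageMode { Disabled, Transparent, HugeTLB };

struct HugePageConfig
{
  HugePageMode mode;
  std::size_t pagesize;   // 0 means "use the default huge page size".
};

constexpr HugePageConfig
decode_hugepages_tunable (std::size_t value)
{
  switch (value)
    {
    case 0:
      return { HugePageMode::Disabled, 0 };
    case 1:
      return { HugePageMode::Transparent, 0 };
    case 2:
      return { HugePageMode::HugeTLB, 0 };
    default:
      // Any other value is taken as an explicit huge page size in bytes.
      // Matching it against the sizes the system supports is left out.
      return { HugePageMode::HugeTLB, value };
    }
}
```

With this layout THP and HugeTLB cannot be selected at the same time, which
is the mutual exclusion mentioned above.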

> 
>    - glibc.malloc.thp_madvise: instruct the system allocator to issue
>      a madvise(MADV_HUGEPAGE) call after a mmap() one for sizes larger
>      than the default huge page size.  The default behavior is to
>      disable it and if the system does not support THP the tunable also
>      does not enable the madvise() call.
> 
>    - glibc.malloc.mmap_hugetlb: instruct the system allocator to round
>      allocation to huge page sizes along with the required flags
>      (MAP_HUGETLB for Linux).  If the memory allocation fails, the
>      default system page size is used instead.  The default behavior is
>      to disable it, and a value of 1 uses the default system huge page
>      size.  A positive value larger than 1 means to use a specific huge
>      page size, which is matched against the ones supported by the system.
> 
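
For readers following along, the two behaviours described in the list above
can be sketched roughly as below.  This is not the code from the patches:
map_with_thp_hint and map_with_hugetlb are hypothetical names, the huge page
size is passed in by the caller rather than queried from the system, and
only the default huge page size is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Round SIZE up to a multiple of ALIGN, which must be a power of two.
std::size_t
round_up (std::size_t size, std::size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}

// thp_madvise-like path (hypothetical helper): a normal anonymous mapping
// followed by a madvise hint for mappings larger than one huge page.
// Returns MAP_FAILED on error.  A failing madvise is ignored because it
// is only a hint.
void *
map_with_thp_hint (std::size_t size, std::size_t hugepage_size)
{
  void *p = mmap (nullptr, size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
#ifdef MADV_HUGEPAGE
  if (p != MAP_FAILED && size > hugepage_size)
    madvise (p, size, MADV_HUGEPAGE);
#endif
  return p;
}

// mmap_hugetlb-like path (hypothetical helper): try MAP_HUGETLB with the
// size rounded to the huge page size, and fall back to a normal mapping
// if that fails.  *MAPPED receives the length that was actually mapped.
void *
map_with_hugetlb (std::size_t size, std::size_t hugepage_size,
                  std::size_t *mapped)
{
#ifdef MAP_HUGETLB
  std::size_t hsize = round_up (size, hugepage_size);
  void *hp = mmap (nullptr, hsize, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
  if (hp != MAP_FAILED)
    {
      *mapped = hsize;
      return hp;
    }
#endif
  // Fallback: default system page size.
  *mapped = size;
  return mmap (nullptr, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

A caller would pass something like 2 MiB as hugepage_size on x86_64 (an
assumption, the real value has to come from the system) and must keep the
returned length around for the later munmap.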
> The 'thp_madvise' tunable also changes the sbrk() usage by malloc
> on main arenas, where the increment is now aligned to the huge page
> size, instead of default page size.
> 
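
The sbrk() change described above boils down to a different rounding
granularity.  A small worked sketch, assuming power-of-two sizes;
sbrk_increment is a hypothetical name and the patch itself may compute the
increment differently:

```cpp
#include <cstddef>

// Round VALUE up to the next multiple of ALIGN (a power of two).
constexpr std::size_t
align_up (std::size_t value, std::size_t align)
{
  return (value + align - 1) & ~(align - 1);
}

// Hypothetical helper, not a glibc function: the amount to request from
// sbrk() for REQUEST bytes, aligned to the huge page size when the
// thp_madvise tunable is set and to the normal page size otherwise.
constexpr std::size_t
sbrk_increment (std::size_t request, std::size_t pagesize,
                std::size_t hugepage_size, bool thp_madvise)
{
  return align_up (request, thp_madvise ? hugepage_size : pagesize);
}
```

For example, with 4 KiB pages and 2 MiB huge pages (assumed values), a
132 KiB request stays at 132 KiB without the tunable and grows to 2 MiB
with it.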
> The 'mmap_hugetlb' tunable aims to replace the 'morecore' callback
> removed in 2.34 for libhugetlbfs (where the library tries to leverage
> huge pages instead of providing its own system allocator).  By
> implementing the support directly on the mmap() code path there is
> no need to try to emulate the morecore()/sbrk() semantics, which
> simplifies the code and makes the memory shrink logic more
> straightforward.
> 
> The performance improvements are really dependent on the workload
> and the platform; however, a simple testcase might show the possible
> improvements:

A simple test like below in benchtests would be very useful to at least 
get an initial understanding of the behaviour differences with different 
tunable values.  Later those who care can add more relevant workloads.

> 
> $ cat hugepages.cc
> #include <unordered_map>
> 
> int
> main (int argc, char *argv[])
> {
>    std::size_t iters = 10000000;
>    std::unordered_map <std::size_t, std::size_t> ht;
>    ht.reserve (iters);
>    for (std::size_t i = 0; i < iters; ++i)
>      ht.try_emplace (i, i);
> 
>    return 0;
> }
> $ g++ -std=c++17 -O2 hugepages.cc -o hugepages
> 
> On an x86_64 (Ryzen 9 5900X):
> 
>   Performance counter stats for 'env GLIBC_TUNABLES=glibc.malloc.thp_madvise=0 ./testrun.sh ./hugepages':
> 
>              98,874      faults
>             717,059      dTLB-loads
>             411,701      dTLB-load-misses          #   57.42% of all dTLB cache accesses
>           3,754,927      cache-misses              #    8.479 % of all cache refs
>          44,287,580      cache-references
> 
>         0.315278378 seconds time elapsed
> 
>         0.238635000 seconds user
>         0.076714000 seconds sys
> 
>   Performance counter stats for 'env GLIBC_TUNABLES=glibc.malloc.thp_madvise=1 ./testrun.sh ./hugepages':
> 
>               1,871      faults
>             120,035      dTLB-loads
>              19,882      dTLB-load-misses          #   16.56% of all dTLB cache accesses
>           4,182,942      cache-misses              #    7.452 % of all cache refs
>          56,128,995      cache-references
> 
>         0.262620733 seconds time elapsed
> 
>         0.222233000 seconds user
>         0.040333000 seconds sys
> 
> 
> On an AArch64 (Cortex-A72):
> 
>   Performance counter stats for 'env GLIBC_TUNABLES=glibc.malloc.thp_madvise=0 ./testrun.sh ./hugepages':
> 
>               98835      faults
>          2007234756      dTLB-loads
>             4613669      dTLB-load-misses          #    0.23% of all dTLB cache accesses
>             8831801      cache-misses              #    0.504 % of all cache refs
>          1751391405      cache-references
> 
>         0.616782575 seconds time elapsed
> 
>         0.460946000 seconds user
>         0.154309000 seconds sys
> 
>   Performance counter stats for 'env GLIBC_TUNABLES=glibc.malloc.thp_madvise=1 ./testrun.sh ./hugepages':
> 
>                 955      faults
>          1787401880      dTLB-loads
>              224034      dTLB-load-misses          #    0.01% of all dTLB cache accesses
>             5480917      cache-misses              #    0.337 % of all cache refs
>          1625937858      cache-references
> 
>         0.487773443 seconds time elapsed
> 
>         0.440894000 seconds user
>         0.046465000 seconds sys
> 
> 
> And on a powerpc64 (POWER8):
> 
>   Performance counter stats for 'env GLIBC_TUNABLES=glibc.malloc.thp_madvise=0 ./testrun.sh ./hugepages':
> 
>                5453      faults
>                9940      dTLB-load-misses
>             1338152      cache-misses              #    0.101 % of all cache refs
>          1326037487      cache-references
> 
>         1.056355887 seconds time elapsed
> 
>         1.014633000 seconds user
>         0.041805000 seconds sys
> 
>   Performance counter stats for 'env GLIBC_TUNABLES=glibc.malloc.thp_madvise=1 ./testrun.sh ./hugepages':
> 
>                1016      faults
>                1746      dTLB-load-misses
>              399052      cache-misses              #    0.030 % of all cache refs
>          1316059877      cache-references
> 
>         1.057810501 seconds time elapsed
> 
>         1.012175000 seconds user
>         0.045624000 seconds sys
> 
> It is worth noting that the powerpc64 machine has 'always' set
> in '/sys/kernel/mm/transparent_hugepage/enabled'.
> 
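
Since the results depend on the system-wide THP mode, it may help to record
it alongside the numbers.  The sysfs file lists all modes and marks the
active one with brackets, e.g. "always [madvise] never".  A minimal sketch
for extracting it; active_thp_mode is a hypothetical helper, not something
from the patchset:

```cpp
#include <string_view>

// Return the bracketed (active) entry of a line in the format used by
// /sys/kernel/mm/transparent_hugepage/enabled, for example
// "always [madvise] never".  Returns an empty view if no complete
// bracket pair is found.
constexpr std::string_view
active_thp_mode (std::string_view line)
{
  const auto open = line.find ('[');
  if (open == std::string_view::npos)
    return {};
  const auto close = line.find (']', open + 1);
  if (close == std::string_view::npos)
    return {};
  return line.substr (open + 1, close - open - 1);
}
```

In practice the input would be the first line of the sysfs file, read for
instance with std::ifstream and std::getline.  On the powerpc64 machine
above the result would be "always".
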
> Norbert Manthey's paper has more information with a more thorough
> performance analysis.
> 
> For testing, run make check on x86_64-linux-gnu with thp_madvise=1
> (directly on ptmalloc_init() after tunable initialization) and
> with mmap_hugetlb=1 (also directly on ptmalloc_init()) with about
> 10 large pages (so the fallback mmap() call is used) and with
> 1024 large pages (so all mmap(MAP_HUGETLB) are successful).

You could add tests similar to mcheck and malloc-check, i.e. add 
$(tests-hugepages) to run all malloc tests again with the various 
tunable values.  See tests-mcheck for example.

> --
> 
> Changes from previous version:
> 
>    - Renamed thp_pagesize to thp_madvise and made it a boolean state.
>    - Added MAP_HUGETLB support for mmap().
>    - Removed system specific hooks for THP huge page size in favor of
>      Linux generic implementation.
>    - Initial program segments need to be page aligned for the
>      first madvise call.
> 
> Adhemerval Zanella (4):
>    malloc: Add madvise support for Transparent Huge Pages
>    malloc: Add THP/madvise support for sbrk
>    malloc: Move mmap logic to its own function
>    malloc: Add Huge Page support for sysmalloc
> 
>   NEWS                                       |   9 +-
>   elf/dl-tunables.list                       |   9 +
>   elf/tst-rtld-list-tunables.exp             |   2 +
>   include/libc-pointer-arith.h               |  10 +
>   malloc/arena.c                             |   7 +
>   malloc/malloc-internal.h                   |   1 +
>   malloc/malloc.c                            | 263 +++++++++++++++------
>   manual/tunables.texi                       |  23 ++
>   sysdeps/generic/Makefile                   |   8 +
>   sysdeps/generic/malloc-hugepages.c         |  37 +++
>   sysdeps/generic/malloc-hugepages.h         |  49 ++++
>   sysdeps/unix/sysv/linux/malloc-hugepages.c | 201 ++++++++++++++++
>   12 files changed, 542 insertions(+), 77 deletions(-)
>   create mode 100644 sysdeps/generic/malloc-hugepages.c
>   create mode 100644 sysdeps/generic/malloc-hugepages.h
>   create mode 100644 sysdeps/unix/sysv/linux/malloc-hugepages.c
> 

