public inbox for glibc-bugs@sourceware.org
From: "sascha.zorn at sap dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug malloc/30769] New: malloc_trim is not working correctly for arenas other than arena 0
Date: Wed, 16 Aug 2023 11:55:22 +0000	[thread overview]
Message-ID: <bug-30769-131@http.sourceware.org/bugzilla/> (raw)

https://sourceware.org/bugzilla/show_bug.cgi?id=30769

            Bug ID: 30769
           Summary: malloc_trim is not working correctly for arenas other
                    than arena 0
           Product: glibc
           Version: 2.35
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: malloc
          Assignee: unassigned at sourceware dot org
          Reporter: sascha.zorn at sap dot com
  Target Milestone: ---

Created attachment 15068
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15068&action=edit
mem.cpp

This bug is a bit harder to explain. We stumbled upon it via several long-running multi-threaded applications in a sidecar docker container that has a hard limit of 300MB, after which the container is killed. We periodically reached this limit even though we have no memory leaks in our code (verified with multiple tools and our own allocator frameworks on top), and even manually trimming the malloc arenas with malloc_trim() did not help. We need to free the reserved memory because other processes might need the scarce 300MB. While researching I stumbled over a few articles that MIGHT describe similar issues, and this led me to experiment with malloc_stats/malloc_info and with limiting the number of arenas (to 1), which finally worked around the issue.

Examples being:
https://www.algolia.com/blog/engineering/when-allocators-are-hoarding-your-precious-memory/
https://thehftguy.com/2020/05/21/major-bug-in-glibc-is-killing-applications-with-a-memory-limit/

Although the second article mentions that there are already plans to:
"Throttle number of arenas based on process rlimit on process startup and/or
everytime RLIMIT_AS is modified"
   - https://sourceware.org/glibc/wiki/Development_Todo/Enhancing_malloc

I don't think this is being worked on; it would not take cgroups (docker
container limits) into account, and it would not completely solve our issue
anyway, because malloc_trim() only reliably trims arena 0.

Unfortunately I'm not really good at writing multi-threaded, synchronized C
code and only have a C++20 sample that reproduces this issue reliably (I
compiled it with 'g++ --std=c++20 -o mem -O3 -g mem.cpp'). Basically it spawns
a configurable number of threads that all wait on a barrier until every thread
has been started, and then tries to call malloc() as concurrently as possible.
In my experience this triggers the creation of the largest number of arenas as
quickly as possible. It then repeats this concurrent malloc() a few more times,
each time with slightly larger allocation sizes, to fragment and populate the
arenas.

I hope you can use this example to reproduce the issue. I tried it with
glibc 2.35 and see the following behaviour. The first number is the thread id.

First iteration looks good:
11: memory allocated (0x7f2681bfe010, size=18.000MB, rss=217.875MB)... 
 1: memory allocated (0x7f2682dff010, size=18.000MB, rss=226.875MB)... 
 3: memory allocated (0x7f266adff010, size=18.000MB, rss=232.875MB)... 
 9: memory allocated (0x7f26903ee010, size=18.000MB, rss=271.500MB)... 
 5: memory allocated (0x7f2689bfe010, size=18.000MB, rss=276.750MB)... 
15: memory allocated (0x7f2672dff010, size=18.000MB, rss=280.688MB)... 
10: memory allocated (0x7f26889fd010, size=18.000MB, rss=287.250MB)... 
13: memory allocated (0x7f26789fd010, size=18.000MB, rss=288.188MB)... 
 6: memory allocated (0x7f26915ef010, size=18.000MB, rss=288.750MB)... 
 4: memory allocated (0x7f2679bfe010, size=18.000MB, rss=289.125MB)... 
14: memory allocated (0x7f2669bfe010, size=18.000MB, rss=289.500MB)... 
 8: memory allocated (0x7f267adff010, size=18.000MB, rss=289.688MB)... 
 2: memory allocated (0x7f268adff010, size=18.000MB, rss=290.250MB)... 
16: memory allocated (0x7f26709fd010, size=18.000MB, rss=290.438MB)... 
 7: memory allocated (0x7f2671bfe010, size=18.000MB, rss=290.625MB)... 
12: memory allocated (0x7f26809fd010, size=18.000MB, rss=290.625MB)... 
Arena 0:
system bytes     =     135168
in use bytes     =      80624
Arena 1:
system bytes     =     135168
in use bytes     =       3392
Arena 2:
system bytes     =     135168
in use bytes     =       3392
Arena 3:
system bytes     =     135168
in use bytes     =       3392
Arena 4:
system bytes     =     135168
in use bytes     =       3392
Arena 5:
system bytes     =     135168
in use bytes     =       3392
Arena 6:
system bytes     =     135168
in use bytes     =       3392
Arena 7:
system bytes     =     135168
in use bytes     =       3392
Arena 8:
system bytes     =     135168
in use bytes     =       3392
Arena 9:
system bytes     =     135168
in use bytes     =       3392
Arena 10:
system bytes     =     135168
in use bytes     =       3392
Arena 11:
system bytes     =     135168
in use bytes     =       3392
Arena 12:
system bytes     =     135168
in use bytes     =       3392
Arena 13:
system bytes     =     135168
in use bytes     =       3392
Arena 14:
system bytes     =     135168
in use bytes     =       3392
Arena 15:
system bytes     =     135168
in use bytes     =       3392
Arena 16:
system bytes     =     135168
in use bytes     =       3392
Total (incl. mmap):
system bytes     =  304353280
in use bytes     =  302190320
max mmap regions =         16
max mmap bytes   =  302055424
 3: deallocated (rss=272.805MB)... trimmed (rss=272.805MB)
14: deallocated (rss=254.816MB)... trimmed (rss=254.816MB)
12: deallocated (rss=236.824MB)... trimmed (rss=236.824MB)
 8: deallocated (rss=218.832MB)... trimmed (rss=218.832MB)
 7: deallocated (rss=201.000MB)... trimmed (rss=201.000MB)
 1: deallocated (rss=183.043MB)... trimmed (rss=183.043MB)
 2: deallocated (rss=165.062MB)... trimmed (rss=165.062MB)
16: deallocated (rss=147.094MB)... trimmed (rss=147.094MB)
 5: deallocated (rss=129.207MB)... trimmed (rss=129.207MB)
 9: deallocated (rss=111.250MB)... trimmed (rss=111.250MB)
 6: deallocated (rss=93.352MB)... trimmed (rss=93.352MB)
 4: deallocated (rss=75.449MB)... trimmed (rss=75.449MB)
10: deallocated (rss=57.457MB)... trimmed (rss=57.457MB)
15: deallocated (rss=39.605MB)... trimmed (rss=39.605MB)
13: deallocated (rss=21.746MB)... trimmed (rss=21.746MB)
11: deallocated (rss=3.746MB)... trimmed (rss=3.746MB)
Arena 0:
system bytes     =      81920
in use bytes     =      80624
Arena 1:
system bytes     =     135168
in use bytes     =       3392
Arena 2:
system bytes     =     135168
in use bytes     =       3392
Arena 3:
system bytes     =     135168
in use bytes     =       3392
Arena 4:
system bytes     =     135168
in use bytes     =       3392
Arena 5:
system bytes     =     135168
in use bytes     =       3392
Arena 6:
system bytes     =     135168
in use bytes     =       3392
Arena 7:
system bytes     =     135168
in use bytes     =       3392
Arena 8:
system bytes     =     135168
in use bytes     =       3392
Arena 9:
system bytes     =     135168
in use bytes     =       3392
Arena 10:
system bytes     =     135168
in use bytes     =       3392
Arena 11:
system bytes     =     135168
in use bytes     =       3392
Arena 12:
system bytes     =     135168
in use bytes     =       3392
Arena 13:
system bytes     =     135168
in use bytes     =       3392
Arena 14:
system bytes     =     135168
in use bytes     =       3392
Arena 15:
system bytes     =     135168
in use bytes     =       3392
Arena 16:
system bytes     =     135168
in use bytes     =       3392
Total (incl. mmap):
system bytes     =    2244608
in use bytes     =     134896
max mmap regions =         16
max mmap bytes   =  302055424




I don't really understand why the arenas only show 3392 bytes of memory "in
use", even after the allocations took place, but I guess that all the
allocations were served via mmap, and not from the arenas.

After the last deallocation, RSS falls back down to 3.746MB, which is the
expected behaviour.

The fun starts after the second iteration, where the arenas other than arena 0
are "resized" to 18890752 bytes:
15: memory allocated (0x7f2624000d50, size=18.001MB, rss=234.371MB)... 
11: memory allocated (0x7f265c000d50, size=18.001MB, rss=244.684MB)... 
14: memory allocated (0x7f267c000d50, size=18.001MB, rss=276.559MB)... 
 1: memory allocated (0x7f2664000d50, size=18.001MB, rss=279.559MB)... 
 9: memory allocated (0x7f2634000d50, size=18.001MB, rss=281.621MB)... 
 5: memory allocated (0x7f2684000d50, size=18.001MB, rss=285.184MB)... 
 6: memory allocated (0x7f268c000d50, size=18.001MB, rss=285.934MB)... 
 8: memory allocated (0x7f2644000d50, size=18.001MB, rss=288.746MB)... 
 4: memory allocated (0x7f263c000d50, size=18.001MB, rss=289.496MB)... 
 3: memory allocated (0x7f2614000d50, size=18.001MB, rss=290.246MB)... 
 2: memory allocated (0x7f2674000d50, size=18.001MB, rss=290.621MB)... 
16: memory allocated (0x7f2654000d50, size=18.001MB, rss=290.996MB)... 
10: memory allocated (0x7f266c000d50, size=18.001MB, rss=291.371MB)... 
 7: memory allocated (0x7f261c000d50, size=18.001MB, rss=291.559MB)... 
13: memory allocated (0x7f262c000d50, size=18.001MB, rss=291.559MB)... 
12: memory allocated (0x7f264c000d50, size=18.001MB, rss=291.559MB)... 
Arena 0:
system bytes     =      81920
in use bytes     =      80624
Arena 1:
system bytes     =   18890752
in use bytes     =   18878800
Arena 2:
system bytes     =   18890752
in use bytes     =   18878800
Arena 3:
system bytes     =   18890752
in use bytes     =   18878800
Arena 4:
system bytes     =   18890752
in use bytes     =   18878800
Arena 5:
system bytes     =   18890752
in use bytes     =   18878800
Arena 6:
system bytes     =   18890752
in use bytes     =   18878800
Arena 7:
system bytes     =   18890752
in use bytes     =   18878800
Arena 8:
system bytes     =   18890752
in use bytes     =   18878800
Arena 9:
system bytes     =   18890752
in use bytes     =   18878800
Arena 10:
system bytes     =   18890752
in use bytes     =   18878800
Arena 11:
system bytes     =   18890752
in use bytes     =   18878800
Arena 12:
system bytes     =   18890752
in use bytes     =   18878800
Arena 13:
system bytes     =   18890752
in use bytes     =   18878800
Arena 14:
system bytes     =   18890752
in use bytes     =   18878800
Arena 15:
system bytes     =   18890752
in use bytes     =   18878800
Arena 16:
system bytes     =   18890752
in use bytes     =   18878800
Total (incl. mmap):
system bytes     =  302333952
in use bytes     =  302141424
max mmap regions =         16
max mmap bytes   =  302055424
12: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 9: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
15: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 5: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
11: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 6: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 8: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 4: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
13: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 1: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 3: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 2: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
16: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
10: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
 7: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
14: deallocated (rss=291.559MB)... trimmed (rss=291.559MB)
Arena 0:
system bytes     =      81920
in use bytes     =      80624
Arena 1:
system bytes     =   18890752
in use bytes     =       3392
Arena 2:
system bytes     =   18890752
in use bytes     =       3392
Arena 3:
system bytes     =   18890752
in use bytes     =       3392
Arena 4:
system bytes     =   18890752
in use bytes     =       3392
Arena 5:
system bytes     =   18890752
in use bytes     =       3392
Arena 6:
system bytes     =   18890752
in use bytes     =       3392
Arena 7:
system bytes     =   18890752
in use bytes     =       3392
Arena 8:
system bytes     =   18890752
in use bytes     =       3392
Arena 9:
system bytes     =   18890752
in use bytes     =       3392
Arena 10:
system bytes     =   18890752
in use bytes     =       3392
Arena 11:
system bytes     =   18890752
in use bytes     =       3392
Arena 12:
system bytes     =   18890752
in use bytes     =       3392
Arena 13:
system bytes     =   18890752
in use bytes     =       3392
Arena 14:
system bytes     =   18890752
in use bytes     =       3392
Arena 15:
system bytes     =   18890752
in use bytes     =       3392
Arena 16:
system bytes     =   18890752
in use bytes     =       3392
Total (incl. mmap):
system bytes     =  302333952
in use bytes     =     134896
max mmap regions =         16
max mmap bytes   =  302055424

Now RSS does not fall back, but stays at rss=291.559MB.

If you change ALLOC_INCREMENT from 1024 to 16*1024, however, the arenas are
properly trimmed:
 7: deallocated (rss=296.152MB)... trimmed (rss=296.152MB)
 8: deallocated (rss=276.652MB)... trimmed (rss=276.652MB)
 3: deallocated (rss=257.152MB)... trimmed (rss=257.152MB)
10: deallocated (rss=237.652MB)... trimmed (rss=237.652MB)
 4: deallocated (rss=218.184MB)... trimmed (rss=218.184MB)
11: deallocated (rss=198.684MB)... trimmed (rss=198.684MB)
12: deallocated (rss=179.184MB)... trimmed (rss=179.184MB)
 1: deallocated (rss=159.684MB)... trimmed (rss=159.684MB)
 2: deallocated (rss=140.184MB)... trimmed (rss=140.184MB)
 5: deallocated (rss=120.684MB)... trimmed (rss=120.684MB)
13: deallocated (rss=101.184MB)... trimmed (rss=101.184MB)
15: deallocated (rss=81.684MB)... trimmed (rss=81.684MB)
14: deallocated (rss=62.184MB)... trimmed (rss=62.184MB)
 6: deallocated (rss=42.684MB)... trimmed (rss=42.684MB)
16: deallocated (rss=23.137MB)... trimmed (rss=23.137MB)
 9: deallocated (rss=3.590MB)... trimmed (rss=3.590MB)
dealloc stat:
Arena 0:
system bytes     =      81920
in use bytes     =      80624
Arena 1:
system bytes     =     135168
in use bytes     =       3392
Arena 2:
system bytes     =     135168
in use bytes     =       3392
Arena 3:
system bytes     =     135168
in use bytes     =       3392
Arena 4:
system bytes     =     135168
in use bytes     =       3392
Arena 5:
system bytes     =     135168
in use bytes     =       3392
Arena 6:
system bytes     =     135168
in use bytes     =       3392
Arena 7:
system bytes     =     135168
in use bytes     =       3392
Arena 8:
system bytes     =     135168
in use bytes     =       3392
Arena 9:
system bytes     =     135168
in use bytes     =       3392
Arena 10:
system bytes     =     135168
in use bytes     =       3392
Arena 11:
system bytes     =     135168
in use bytes     =       3392
Arena 12:
system bytes     =     135168
in use bytes     =       3392
Arena 13:
system bytes     =     135168
in use bytes     =       3392
Arena 14:
system bytes     =     135168
in use bytes     =       3392
Arena 15:
system bytes     =     135168
in use bytes     =       3392
Arena 16:
system bytes     =     135168
in use bytes     =       3392
Total (incl. mmap):
system bytes     =    2244608
in use bytes     =     134896
max mmap regions =         16
max mmap bytes   =  328007680

Setting MALLOC_TOP_PAD_=0 also helps.

Or setting MALLOC_ARENA_MAX=1:
 9: deallocated (rss=291.422MB)... trimmed (rss=273.426MB)
16: deallocated (rss=273.426MB)... trimmed (rss=255.426MB)
 7: deallocated (rss=255.426MB)... trimmed (rss=237.426MB)
14: deallocated (rss=237.426MB)... trimmed (rss=219.430MB)
 8: deallocated (rss=219.430MB)... trimmed (rss=201.430MB)
 4: deallocated (rss=201.430MB)... trimmed (rss=183.438MB)
 1: deallocated (rss=183.438MB)... trimmed (rss=165.438MB)
 5: deallocated (rss=165.438MB)... trimmed (rss=147.344MB)
10: deallocated (rss=147.344MB)... trimmed (rss=129.340MB)
11: deallocated (rss=129.340MB)... trimmed (rss=111.348MB)
 6: deallocated (rss=93.348MB)... trimmed (rss=93.348MB)
12: deallocated (rss=93.348MB)... trimmed (rss=75.355MB)
15: deallocated (rss=75.355MB)... trimmed (rss=57.352MB)
 2: deallocated (rss=57.352MB)... trimmed (rss=39.410MB)
 3: deallocated (rss=39.410MB)... trimmed (rss=21.309MB)
13: deallocated (rss=3.375MB)... trimmed (rss=3.375MB)
dealloc stat:
Arena 0:
system bytes     =     241664
in use bytes     =      98800
Total (incl. mmap):
system bytes     =     241664
in use bytes     =      98800
max mmap regions =         16
max mmap bytes   =  303628288


Thread overview: 4+ messages
2023-08-16 11:55 sascha.zorn at sap dot com [this message]
2023-09-21 14:36 ` [Bug malloc/30769] " sascha.zorn at sap dot com
2024-01-11  9:41 ` fweimer at redhat dot com
2024-03-18  5:13 ` shubhankargambhir013 at gmail dot com
