public inbox for glibc-bugs@sourceware.org
From: "keyid.w at qq dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug malloc/26969] A common malloc pattern can make memory not given back to OS
Date: Tue, 01 Dec 2020 08:43:30 +0000 [thread overview]
Message-ID: <bug-26969-131-3sIrEluRom@http.sourceware.org/bugzilla/> (raw)
In-Reply-To: <bug-26969-131@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=26969
keyid.w at qq dot com changed:
What           |Removed      |Added
----------------------------------------------------------------------------
Status         |RESOLVED     |UNCONFIRMED
Resolution     |NOTABUG      |---
--- Comment #2 from keyid.w at qq dot com ---
(In reply to Carlos O'Donell from comment #1)
> The glibc implementation of malloc is a heap-based allocator, and in that
> design the heap must be logically freed back down in the order it was
> originally allocated, or the heap will continue to grow to keep a maximum
> working set of chunks for the application.
>
> If you want to free back down to zero at the last deallocation you must tune
> the allocator by disabling fastbins and tcache.
>
> For example:
> - Allocate A
> - Allocate B
> - Allocate C
> - Free A
> - Free B
> - Free C
>
> Consider A, B and C are all the same size.
>
> Until "Free C" happens the entire stack is held at 3 objects deep.
>
> This can happen because tcache or fastbins holds the most recently freed
> chunk for re-use. There is nothing wrong with this strategy because the C
> library, a priori, doesn't know if you'll carry out this entire workload
> again.
>
> The worst-case degenerate situation for tcache is a sequence of allocations
> which causes tcache to always hold the top-of-heap chunks as in-use. In a
> real program those chunks are refilled into the tcache much more randomly
> via malloc from the unsorted bin or small bin refill strategy. Thus tcache
> should not keep the top-of-heap from freeing down in those cases. It's only
> in synthetic test cases like this where I think you see tcache being the
> blocker to freeing down from the top of heap.
>
> If you need to free pages between workloads and while idle you can call
> malloc_trim() to release page-sized consolidated parts of the heaps.
>
> If you need a minimal working set, then you need to turn off fastbins and
> tcache.
>
> One possible enhancement we can make is to split the heaps by pool sizes,
> and that's something I've talked about a bit with DJ Delorie. As it stands
> though that would be a distinct enhancement.
>
> I'm marking this as RESOLVED/NOTABUG since the algorithm is working as
> intended but doesn't meet your specific synthetic workload. If you have a
> real non-synthetic workload that exhibits problems please open a bug and we
> can talk about it and review performance and capture an API trace.
Thanks for your reply! I did hit this problem in a real workload. I tried to
simplify my code, but the result is still somewhat complex and is in C++. The
code is attached at the end.
There is a thread queue that executes n tasks with m worker threads. Each task
stores some calculated (field, value) data into a map. In my real workload I
calculate some double values from loaded data and then store them in the map;
the calculation itself is very complex, so I simplified it here. I think
creating the map's key (a short string) is similar to malloc-ing small pieces,
and creating the map's value (a large vector) is similar to malloc-ing large
pieces. However, if I don't use the thread queue, the memory is released, so I
guess some malloc done by the STL inside the thread queue compounds the
result. In fact, I used gdb to inspect the tcache/fast bin chunks near the top
of the heap and found they were probably allocated by something in the STL.
Also, if I change CreateDP to delete dp and return nullptr (the two
commented-out lines in the code) instead of returning dp, the memory is
released. I don't know why.
You can compile the program with "g++ test.cpp -o test -lpthread" and run it
with "./test task_number thread_number".
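To try the tuning Carlos suggests on this reproducer, the run can also be done with tcache and fastbins disabled through glibc tunables (tunable names as documented in the glibc manual; glibc.malloc.tcache_count requires glibc 2.26+ and glibc.malloc.mxfast requires 2.29+; "./test 100 4" is just an example invocation of the binary built above):

```shell
# Disable the per-thread cache (tcache_count=0) and fastbins (mxfast=0),
# then run the test with 100 tasks on 4 worker threads.
GLIBC_TUNABLES=glibc.malloc.tcache_count=0:glibc.malloc.mxfast=0 \
  ./test 100 4
```

With both caches off, freed chunks consolidate immediately, so the heap can shrink as soon as the top-of-heap chunk is freed.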
#include <cstdio>
#include <cstdlib>
#include <atomic>
#include <condition_variable>
#include <functional>
#include <future>
#include <list>
#include <map>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>
using namespace std;

class TestClass {
 public:
  void DoSomething() {
    const int Count = 10000;
    map<string, vector<double>> values;
    for (int i = 0; i < Count; ++i) {
      vector<double> v(10000);
      values[std::to_string(i)] = v;
    }
  }
};

class MultiThreadWorkQueue {
 public:
  // Constructor.
  // cache_size is the maximum capacity of the result cache.
  // n_threads is the number of worker threads. If it is 0, it will be set to
  // a reasonable value based on the number of cores.
  // If cache_size is 0, it will be set to n_threads.
  MultiThreadWorkQueue(int cache_size, int n_threads)
      : cache_size_(cache_size),
        n_threads_(n_threads) {
    for (int i = 0; i < n_threads_; ++i) {
      workers_.push_back(
          std::thread(&MultiThreadWorkQueue::ProcessTasks, this));
    }
  }

  ~MultiThreadWorkQueue() {
    Abort();
  }

  void Enqueue(std::function<TestClass*()>&& func) {
    {
      std::unique_lock<std::mutex> ul(tasks_mutex_);
      tasks_.emplace(std::forward<std::function<TestClass*()>>(func));
    }
    worker_cv_.notify_one();
  }

  // Gets the result of the next task in the queue. If it is still pending,
  // blocks the current thread and waits until the result is available.
  //
  // Note that calling this after Abort() will crash.
  TestClass* Dequeue() {
    std::unique_lock<std::mutex> ul(tasks_mutex_);
    dequeue_cv_.wait(ul, [this] {
      return aborted_ || returns_.size() > 0;
    });
    std::future<TestClass*> future = std::move(returns_.front());
    returns_.pop();
    ul.unlock();
    worker_cv_.notify_one();
    return future.get();
  }

  // Stops executing any new tasks and joins all the worker threads.
  void Abort() {
    {
      std::unique_lock<std::mutex> ul(tasks_mutex_);
      if (aborted_) {
        return;
      }
      aborted_ = true;
    }
    worker_cv_.notify_all();
    dequeue_cv_.notify_all();
    for (auto& thread : workers_) {
      thread.join();
    }
  }

  // Size = N(Enqueue) - N(Dequeue).
  size_t Size() {
    std::unique_lock<std::mutex> ul(tasks_mutex_);
    return returns_.size() + tasks_.size();
  }

 private:
  void ProcessTasks() {
    std::unique_lock<std::mutex> ul(tasks_mutex_);
    while (!aborted_) {
      worker_cv_.wait(ul, [this] {
        return aborted_ || (tasks_.size() > 0 &&
                            returns_.size() <
                                static_cast<size_t>(cache_size_));
      });
      if (aborted_) {
        break;
      }
      std::packaged_task<TestClass*()> t;
      t.swap(tasks_.front());
      tasks_.pop();
      returns_.emplace(t.get_future());
      ul.unlock();
      dequeue_cv_.notify_one();
      t();
      ul.lock();
    }
  }

  std::mutex tasks_mutex_;
  std::atomic<bool> aborted_ { false };
  int cache_size_;
  int n_threads_;
  std::condition_variable worker_cv_;
  std::condition_variable dequeue_cv_;
  std::queue<std::packaged_task<TestClass*()>> tasks_;
  std::queue<std::future<TestClass*>> returns_;
  std::list<std::thread> workers_;
};

int main(int argc, char** argv) {
  if (argc < 3) {
    fprintf(stderr, "usage: %s task_number thread_number\n", argv[0]);
    return 1;
  }
  int n = atoi(argv[1]);
  int thread_num = atoi(argv[2]);
  auto CreateDP = [] {
    TestClass* dp = new TestClass;
    dp->DoSomething();
    // delete dp;
    // return nullptr;
    return dp;
  };
  printf("* before run, press enter to continue");
  fflush(stdout);
  std::getchar();
  if (thread_num > 0) {
    printf("Multi-thread\n");
    MultiThreadWorkQueue work_queue(10, thread_num);
    for (int i = 0; i < n; ++i) {
      work_queue.Enqueue(CreateDP);
    }
    for (int i = 0; i < n; ++i) {
      std::unique_ptr<TestClass> dp(work_queue.Dequeue());
    }
  } else {
    printf("Single-thread\n");
    for (int i = 0; i < n; ++i) {
      std::unique_ptr<TestClass> dp(CreateDP());
    }
  }
  printf("* after run, press enter to continue");
  fflush(stdout);
  std::getchar();
  return 0;
}
--
You are receiving this mail because:
You are on the CC list for the bug.
Thread overview: 9+ messages
2020-11-28 13:58 [Bug malloc/26969] New: " keyid.w at qq dot com
2020-11-28 13:59 ` [Bug malloc/26969] " keyid.w at qq dot com
2020-11-29 3:24 ` keyid.w at qq dot com
2020-12-01 1:03 ` uwydoc at gmail dot com
2020-12-01 2:51 ` carlos at redhat dot com
2020-12-01 8:43 ` keyid.w at qq dot com [this message]
2021-01-29 16:08 ` dimahabr at gmail dot com
2021-02-01 8:52 ` keyid.w at qq dot com
2022-06-29 16:34 ` romash at rbbn dot com