From: Paulo César Pereira de Andrade
Date: Tue, 7 Feb 2023 13:16:10 -0300
Subject: Re: GLIBC malloc behavior question
To: Nikolay.Shustov@gmail.com
Cc: libc-alpha@sourceware.org

On Tue, Feb 7, 2023 at 12:07, Nikolay Shustov via Libc-alpha wrote:
>
> Hi,
> I have a question about the malloc() behavior which I observe.
> The synopsis is that during a stress load, the application
> aggressively allocates virtual memory without any upper limit.
> Just to note, after the application passes the peak of activity
> and goes idle, its virtual memory does not scale back (I do not
> expect much of that, though - should I?).

There is no garbage-collector thread or anything similar running in a
worker thread. But something like that could be implemented in your
own code.

> The application is heavily multithreaded; at the peak of its activity
> it creates new threads and destroys them at a pace of approx.
> 100/second.
> After a long and tedious investigation I dare to say that there are
> no memory leaks involved.
> (Well, there were memory leaks and I first went after those; found
> and fixed them - but the result did not change much.)

You might experiment with the tradeoff of speed vs. memory usage.
The minimum memory usage should be achieved with MALLOC_ARENA_MAX=1;
see 'man mallopt' for other options.

> The application is cross-platform and runs on Windows and some other
> platforms too.
> There is an OS abstraction layer that provides a unified thread and
> memory allocation API for the business logic, but the business logic
> that triggers memory allocations is platform-independent.
> There are no progressive memory allocations in the OS abstraction
> layer which could be blamed for the memory growth.
>
> The thing is, on Windows there is no such application memory growth
> at all for the same activity.
> It allocates memory moderately and scales back after the peak of
> activity.
> This makes me think the business logic is not to blame (to the
> extent that it does not leak memory).
>
> I used valgrind to profile for memory leaks and heap usage.
> Please see the massif outputs attached (some callstacks had to be
> trimmed out).
> I am also attaching the memory map for the application (run without
> valgrind); the snapshot was taken after all threads but main were
> destroyed and the application was idle.
>
> The pace of the virtual memory growth is not quite linear.

Most likely there are long-lived objects causing contention, and
probably also memory fragmentation, preventing memory from being
returned to the system after a free call.

> From my observation, it allocates a big chunk at the beginning of
> the peak load, then after some time starts to grow in steps of
> ~80Mb / 10 seconds, then after some time starts to grow steadily at
> a pace of ~2Mb/second.
>
> Some stats from the host:
>
> OS: Red Hat Enterprise Linux Server release 7.9 (Maipo)
>
> ldd --version
>
> ldd (GNU libc) 2.17
> Copyright (C) 2012 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There
> is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
> PARTICULAR PURPOSE.
> Written by Roland McGrath and Ulrich Drepper.
>
> uname -a
>
> Linux 3.10.0-1160.53.1.el7.x86_64 #1 SMP Thu Dec 16
> 10:19:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
>
>
> At peak load, the number of application threads is ~180.
> If the application is left running, I did not observe it hit any
> maximum virtual memory threshold; it eventually ends up hitting the
> ulimit.
>
> My questions are:
>
> - Is this memory growth an expected behavior?

It should eventually stabilize. But it is possible that some
allocation pattern is causing both fragmentation and long-lived
objects that prevent consolidation of free memory chunks.

> - What can be done to prevent it from happening?

The first approach is MALLOC_ARENA_MAX. After that, some coding
patterns might help; for example, have large, long-lived objects
allocated from the same thread, preferably at startup. You can also
attempt to cache some memory, but note that caching is an easy way to
get contention. To avoid that, you could use buffers obtained
directly from mmap. Depending on your code, you can also experiment
with jemalloc or tcmalloc. I would suggest tcmalloc, as its main
feature is to work well in multithreaded environments:

https://gperftools.github.io/gperftools/tcmalloc.html

Glibc newer than 2.17 has a per-thread cache, but the issue you are
experiencing is not malloc being slow, but memory usage. AFAIK
tcmalloc has a kind of garbage collector, but it should not be much
different from glibc's consolidation logic; it only runs during free,
and if there is contention, it might not be able to release memory.

> Thanks in advance,
> - Nikolay

Thanks!

Paulo