Subject: Re: [RFC/PoC] malloc: use wfcqueue to speed up remote frees
To: Eric Wong, libc-alpha@sourceware.org
References: <20180731084936.g4yw6wnvt677miti@dcvr>
From: Carlos O'Donell
Message-ID: <0cfdccea-d173-486c-85f4-27e285a30a1a@redhat.com>
Date: Tue, 31 Jul 2018 12:16:00 -0000
In-Reply-To: <20180731084936.g4yw6wnvt677miti@dcvr>

On 07/31/2018 04:49 AM, Eric Wong wrote:
> The goal is to reduce contention and improve locality of cross-thread
> malloc/free traffic common to IPC systems (including Userspace-RCU) and
> some garbage-collected runtimes.

Eric,

This looks like a really interesting contribution!

For anyone reviewing this patch I just want to point out that Eric *does*
have FSF copyright assignment for glibc, so review can proceed normally
for this patch. Thank you!

I would like to see urcu used within glibc to provide better data
structures for key thread, dynamic loader, and malloc algorithms. So if
anything I think this is a move in the right direction.

It would be interesting to roll your RFC into Fedora Rawhide for 6 months
and see if we hit any problems.

I have a few high-level questions:

- Can you explain the RSS reduction given this patch? You might think that
  just adding the frees to a queue wouldn't result in any RSS gains.
  However, you are calling _int_free many times in a row, and that
  deinterleaving may help (you really want a vector free API here so you
  don't walk all the lists so many times; tcache had the same problem, but
  in reverse, for finding chunks).

- Adding urcu as a build-time dependency is not acceptable for bootstrap;
  instead we would bundle a copy of urcu and keep it in sync with upstream.
  Would that make your work easier?

- What problems are you having with `make -j4 check'? Try master and report
  back. We are about to release 2.28, so it should build and pass.

Thank you again for testing this out.

-- 
Cheers,
Carlos.
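
Illustration of the "vector free API" idea raised in the first question. This is only a toy sketch under stated assumptions, not glibc code: `struct toy_arena`, `toy_free_one` and `toy_free_batch` are hypothetical names, and a counter stands in for the per-call cost (taking the arena lock, walking the lists) that a real batched free inside the allocator might amortize.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in glibc.  */
struct toy_arena
{
  size_t lock_acquisitions;   /* times the per-call cost was paid */
  size_t chunks_freed;        /* non-NULL pointers released */
};

/* One-at-a-time free: pays the per-call cost for every chunk.  */
static void
toy_free_one (struct toy_arena *a, void *p)
{
  a->lock_acquisitions++;     /* stands in for taking the arena lock */
  if (p != NULL)
    {
      free (p);
      a->chunks_freed++;
    }
}

/* Hypothetical vector free: pays the per-call cost once per batch.  */
static void
toy_free_batch (struct toy_arena *a, void **ptrs, size_t n)
{
  a->lock_acquisitions++;     /* one acquisition for the whole batch */
  for (size_t i = 0; i < n; i++)
    {
      if (ptrs[i] == NULL)
        continue;
      free (ptrs[i]);
      ptrs[i] = NULL;         /* guard against accidental reuse */
      a->chunks_freed++;
    }
}
```

Freeing four chunks through `toy_free_one` bumps the counter four times, while a single `toy_free_batch` call bumps it once. Whether a real implementation would see a comparable saving depends on allocator internals that this sketch does not model.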