Subject: Re: [PATCH v2 0/4] malloc: Improve Huge Page support
To: Siddhesh Poyarekar, libc-alpha@sourceware.org
Cc: Norbert Manthey, Guillaume Morin
From: Adhemerval Zanella
Date: Thu, 19 Aug 2021 09:04:43 -0300

On 19/08/2021 08:48, Siddhesh Poyarekar wrote:
> On 8/19/21 4:56 PM, Adhemerval Zanella wrote:
>> I thought about it, and decided to use two tunables because,
>> although for mmap() system allocation both tunables are mutually
>> exclusive (it does not make sense to madvise() a mmap(MAP_HUGETLB)
>> mapping), we still use sbrk() on the main arena.
>> The way I did it for sbrk() is to align the increment to the THP
>> page size advertised by the kernel, so using the tunable does
>> change the behavior slightly (it is not as 'transparent' as the
>> madvise() call).
>>
>> So using only one tunable would require either dropping the sbrk()
>> madvise() when MAP_HUGETLB is used, moving it to another tunable
>> (say '3: HugeTLB enabled with default hugepage size and madvise()
>> on sbrk()'), or assuming it whenever huge pages should be used.
>>
>> (And how do we handle sbrk() with an explicit size?)
>>
>> If one tunable is preferable, I think it would be something like:
>>
>> 0:  Disabled (default)
>> 1:  Transparent, where we emulate the "always" behaviour of THP;
>>     sbrk() is also aligned to the huge page size and madvise() is
>>     issued
>> 2:  HugeTLB enabled with the default hugepage size, with sbrk()
>>     handled as in 1
>> >2: HugeTLB enabled with the specified page size, with sbrk()
>>     handled as in 1
>>
>> Forcing the sbrk() and madvise() on all tunable values sets the
>> expectation that huge pages are used in all possible occasions.
>
> What do you think about using mmap instead of sbrk for (2) and if
> hugetlb is requested?  It kinda emulates what libhugetlbfs does and
> makes the behaviour more consistent with what is advertised by the
> tunables.

I think this would be an additional tunable; we still need to handle
the case where mmap() fails, either on the default path (due to the
maximum number of memory mappings per process allowed by the kernel)
or when the pool is exhausted for MAP_HUGETLB.  So for the sbrk()
call, should we align the increment to the huge page size and issue
the madvise() if the tunable is set to use huge pages?

>
>>> A simple test like below in benchtests would be very useful to at
>>> least get an initial understanding of the behaviour differences
>>> with different tunable values.  Later those who care can add more
>>> relevant workloads.
>>
>> Yeah, I am open to suggestions on how to properly test it.
>> The issue is that we need a specific system configuration, either
>> proper kernel support (THP) or reserved large pages, to actually
>> test it.
>>
>> For THP the issue is that it is really 'transparent' to the user,
>> which means we will need to poke at specific Linux sysfs
>> information to check whether huge pages are being used.  And we
>> might not get the expected answer depending on the system load and
>> memory utilization (the advised pages might not be moved to huge
>> pages if there is insufficient memory).
>
> For benchmarking we can make a minimal assumption that the user will
> set the system up to appropriately isolate the benchmarks.  As for
> the sysfs setup, we can always test and bail if unsupported.
>
>>> You could add tests similar to mcheck and malloc-check, i.e. add
>>> $(tests-hugepages) to run all malloc tests again with the various
>>> tunable values.  See tests-mcheck for example.
>>
>> Ok, I can work with this.  This might not add much if the system is
>> not configured with either THP or some huge page pool, but at least
>> it adds some coverage.
>
> Yeah, the main intent is to simply ensure that there are no
> differences in behaviour with hugepages.

Alright, I will add some tunable usage then.