Date: Wed, 17 May 2023 13:50:35 -0300
Subject: Re: [PATCH] nptl: Disable THP on thread stack if it incurs in large RSS usage
To: Wilco Dijkstra, "libc-alpha@sourceware.org", Cupertino Miranda
References: <20230420172436.2013698-1-adhemerval.zanella@linaro.org> <4115d7fd-d7a7-cdb1-3833-daf45186480f@linaro.org> <967b94b4-d819-278f-1782-6b758d0841b6@linaro.org>
From: Adhemerval Zanella Netto
Organization: Linaro

On 17/05/23 11:22, Wilco Dijkstra wrote:
> Hi Adhemerval,
>
>> AFAIU the issue is after the stack is allocated with huge pages, the
>> kernel needs to fallback to standard pages because the guard 'page'
>> will be also within the same huge page
>> allocated for the stack.
>
> The stack allocation explicitly never overlaps with the guard page, ie. there
> is no such fallback. All that matters is the mapped address range of the
> stack - if this fits huge pages, you'll get them.
>
>> My understanding is, once kernel needs to fallback to use default pages,
>> it allocates *all* the large page range.  This is what the RSS increase
>> makes me believe, I am not sure if there is technical limitation to just
>> making the range COW (since at the time of guard protection setup, no
>> page has been touched yet).
>
> That's not what happens. The RSS size increases because you actually get
> a huge page (as requested). There is no fallback to standard pages.

But the threads themselves do not end up using all of the VMA region
allocated for them.  Using the test program you can see it:

$ cat /proc/meminfo | grep AnonHugePages
AnonHugePages:     43008 kB
$ ./tststackalloc &
[...]
[statm] RSS: 1049 pages (4296704 bytes = 4 MB)
[smaps] RSS: 5033984 bytes = 4 MB
[...]
$ cat /proc/meminfo | grep AnonHugePages
AnonHugePages:     45056 kB

So even if the stack is not aligned to the default large page size, THP
will still back the thread allocation.  The issue is that, if the mmap is
also aligned to the THP size, the guard setup triggers the behavior that
increases RSS.

This seems to be the same conclusion that OpenJDK and some kernel
discussions have reached as well [1] [2].

>
>>> So the real question is when do huge pages make sense for stacks?
>>
>> But that's not what the patch is trying to do, it only tries to mitigate
>> a specific corner case where THP will be ineffective.  I agree with
>
> So far there is no evidence this corner case exists, but even ignoring that,
> the expression used is incorrect.
>
>> Cupertino that this question is really hard to answer and it will be
>> really dependent on the workload and/or runtime characteristics that we will
>> need to plug in kernel feedback to have some answer.
>
> It should be feasible to run benchmarks to get an idea whether huge stack pages
> help or not. And similarly whether the RSS increase is worth it or not.

Another option, hinted at in both discussions and brought up by Florian as
well, is to add a pthread extension to force huge pages to be disabled
(something like pthread_attr_setflags, to make it extensible).

[1] https://bugs.openjdk.org/browse/JDK-8303215?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&showAll=true
[2] https://lore.kernel.org/linux-mm/278ec047-4c5d-ab71-de36-094dbed4067c@redhat.com/T/
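
For reference, on the pthread extension idea: an application that provides
its own stack can already get a similar effect today without any new
interface.  The sketch below is only an illustration and is not the patch
under discussion; run_on_nohuge_stack and example_thread are hypothetical
names, and it assumes a Linux system where madvise accepts
MADV_NOHUGEPAGE.  It maps a stack, advises the kernel not to use THP on
it, and runs a thread on it through pthread_attr_setstack.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical thread function used only for illustration.  */
void *
example_thread (void *arg)
{
  return arg;
}

/* Hypothetical helper: run FN (ARG) on a caller-allocated stack of SIZE
   bytes that has been advised with MADV_NOHUGEPAGE.  Returns 0 on
   success or an errno-style error code.  */
int
run_on_nohuge_stack (void *(*fn) (void *), void *arg, size_t size)
{
  void *stack = mmap (NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
  if (stack == MAP_FAILED)
    return errno;

  /* Ask the kernel not to back this range with transparent huge pages.
     This is treated as a hint: a failure (for example EINVAL on a kernel
     built without THP support) is ignored.  */
  (void) madvise (stack, size, MADV_NOHUGEPAGE);

  pthread_attr_t attr;
  int ret = pthread_attr_init (&attr);
  if (ret == 0)
    {
      pthread_t thr;
      ret = pthread_attr_setstack (&attr, stack, size);
      if (ret == 0)
        ret = pthread_create (&thr, &attr, fn, arg);
      pthread_attr_destroy (&attr);
      /* The stack must stay mapped until the thread has finished.  */
      if (ret == 0)
        ret = pthread_join (thr, NULL);
    }

  munmap (stack, size);
  return ret;
}
```

A call such as run_on_nohuge_stack (example_thread, NULL, 1 << 20) runs
the thread on a 1 MiB stack.  Note that a caller-provided stack gets no
guard page from libc, so this does not exercise the guard setup discussed
above; it only shows the per-stack opt-out that an attribute flag could
expose for libc-allocated stacks.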