public inbox for libc-alpha@sourceware.org
From: Cupertino Miranda <cupertino.miranda@oracle.com>
To: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
	'GNU C Library' <libc-alpha@sourceware.org>
Subject: Re: [PATCH v5 1/1] Created tunable to force small pages on stack allocation.
Date: Fri, 14 Apr 2023 15:49:04 +0100	[thread overview]
Message-ID: <87y1mubjtr.fsf@oracle.com> (raw)
In-Reply-To: <PAWPR08MB8982960AE447CCDB257E922283999@PAWPR08MB8982.eurprd08.prod.outlook.com>


Hi Wilco,

To give context: the tunable was proposed to reduce stack memory usage while
still keeping THP enabled in the kernel for all other allocation types.
We identified a scenario in which THP is initially used for stack
allocation, but once protection is set for the guard region, the kernel
splits the huge page into 4k pages. Since the 2M page is already marked
dirty at that point, we get none of the THP benefits but still bloat RSS,
as all the resulting small pages are counted as touched.

I think you are suggesting the opposite, i.e. that it would increase RSS.

Also, the intention of Adhemerval's patch is not to detect when THP for
stack allocation makes sense, but rather the opposite: to detect when THP
will not bring any benefit.

Regards,
Cupertino


Wilco Dijkstra writes:

> Hi Cupertino,
>
>> If my memory does not betray me, I believe the reason why the hugepages
>> would not be relevant was because the mprotect to the guard region would
>> enforce different protection on part of the huge page.
>
> Yes so with a default stack size of 8MB and a 2MB huge page, this implies that
> it splits the bottom page of the stack and wastes 2 MB of memory that most
> programs will never use (since it is the bottom of the stack).
>
>> In that sense you could test if the guard region is a part of the
>> of the first page. If it is it would mean that the page would be broken
>> in smaller parts and it should advise for smaller pages.
>
> Yes that's the test I suggested.
>
>> While debugging I also realized that you were checking the alignment of
>> "mem"; however, for the common case STACK_GROWS_DOWN, the relevant
>> alignment to check is the start of the stack (the end of the memory allocated).
>
> This shouldn't matter given the size must also be a multiple of huge page
> size, and so start/end alignment are the same.
>
>> I have included your patch with some changes which I think make a
>> reasonable estimation.
>
> +  if (thpmode != malloc_thp_mode_always)
> +    return 0;
> +  if ((((uintptr_t) stack_start_addr) % thpsize) != 0
>
> I think checking mem and size is better.
>
> +      || (((uintptr_t) mem / thpsize) != ((uintptr_t) guardpos / thpsize)))
>
> I'm not sure what that is trying to test. The issue is not the position of the
> guard page, but the guard page *size* being smaller than a huge page.
> We could test that guardpos % thpsize == 0 as well since the guard may
> be in different places within the mmap, but testing guardsize % thpsize == 0
> is essential.
>
> +    return 0;
> +
> +  return __madvise (mem, size, MADV_NOHUGEPAGE);
>
>> In any case, I would like to keep the tunable, as I think it is harmless
>> and makes sense.
>
> Having a tunable to override the default behaviour is perfectly fine.
> However I believe the default behaviour should be to block huge pages
> unless a heuristic says it is safe.
>
> We could automatically increase guard size to be a huge page to avoid any
> splitting (a much larger guard doesn't cost anything). The heuristic could
> allow huge pages if the stacksize is much larger than the default size.
> This would help Fortran code built with -Ofast where it allocates huge arrays
> on the stack to avoid slow malloc/free.
>
> This should do the right thing for the majority of cases without needing
> to set a tunable.
>
> Cheers,
> Wilco


Thread overview: 20+ messages
2023-04-13 15:43 Wilco Dijkstra
2023-04-13 16:23 ` Cupertino Miranda
2023-04-13 17:48   ` Adhemerval Zanella Netto
2023-04-14 11:28     ` Cupertino Miranda
2023-04-14 13:24       ` Wilco Dijkstra
2023-04-14 14:49         ` Cupertino Miranda [this message]
2023-04-14 15:32           ` Wilco Dijkstra
2023-04-14 16:03             ` Wilco Dijkstra
2023-04-14 16:35               ` Cupertino Miranda
2023-04-14 23:10                 ` Wilco Dijkstra
2023-04-14 16:27             ` Cupertino Miranda
  -- strict thread matches above, loose matches on Subject: below --
2023-03-28 15:22 [PATCH v5 0/1] *** " Cupertino Miranda
2023-03-28 15:22 ` [PATCH v5 1/1] " Cupertino Miranda
2023-04-11 19:56   ` Adhemerval Zanella Netto
2023-04-12  8:53     ` Cupertino Miranda
2023-04-12 14:10       ` Adhemerval Zanella Netto
2023-04-13 16:13         ` Cupertino Miranda
2023-04-14 11:41       ` Adhemerval Zanella Netto
2023-04-14 12:27         ` Cupertino Miranda
2023-04-14 13:06           ` Adhemerval Zanella Netto
2023-04-14 14:33             ` Cupertino Miranda
